Framework for automatic information extraction from research papers on nanocrystal devices

Beilstein journal of nanotechnology, 2015-09, Vol.6 (1), p.1872-1882

2015

Details

Author(s) / Contributors

Title

Framework for automatic information extraction from research papers on nanocrystal devices

Is part of

Beilstein journal of nanotechnology, 2015-09, Vol.6 (1), p.1872-1882

Place / Publisher

Germany: Beilstein-Institut

Year of publication

2015

Source

EZB Electronic Journals Library

Descriptions/Notes

To support nanocrystal device development, we have been working on a computational framework that makes use of the information in research papers on nanocrystal devices. For this purpose, we developed an annotated corpus called "NaDev" (Nanocrystal Device Development) and proposed an automatic information extraction system called "NaDevEx" (Nanocrystal Device Automatic Information Extraction Framework). NaDevEx aims to extract information from research papers on nanocrystal devices using the NaDev corpus and machine-learning techniques. However, the characteristics of NaDevEx had not been examined in detail. In this paper, we conduct system evaluation experiments for NaDevEx using the NaDev corpus. We discuss three main issues: system performance compared with human annotators; the effect of paper type (synthesis or characterization) on system performance; and the effects of domain knowledge features (e.g., a chemical named entity recognition system and a list of names of physical quantities) on system performance. We found that overall system performance was 89% in precision and 69% in recall. Under loose agreement, in which a term that intersects with a correct term of the same information category counts as correctly identified (in many cases, appropriate head nouns such as temperature or pressure match between the two terms), overall performance is 95% in precision and 74% in recall. System performance is almost comparable with that of human annotators for information categories with rich domain knowledge information (source material). For other information categories, the relatively large number of terms that occur in only one paper keeps recall low (39-73%), although precision is better (75-97%). Average performance is better for synthesis papers than for characterization papers because of the lack of training examples for characterization papers. Based on these results, we discuss future research plans for improving the performance of the system.
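The difference between exact and loose agreement described in the abstract can be illustrated with a small calculation. The following is only a minimal sketch: it assumes that terms are represented as character spans with an information category, and that "intersect" means the spans overlap. The `Term` type, the category names, and the toy annotations are hypothetical and are not taken from the NaDev corpus or the NaDevEx system.

```python
from typing import NamedTuple


class Term(NamedTuple):
    """Hypothetical annotation: a character span plus an information category."""
    start: int      # inclusive character offset
    end: int        # exclusive character offset
    category: str


def overlaps(a: Term, b: Term) -> bool:
    """Loose agreement (assumed): same category and intersecting spans."""
    return a.category == b.category and a.start < b.end and b.start < a.end


def precision_recall(gold, predicted, loose=False):
    """Return (precision, recall) under exact or loose agreement."""
    match = overlaps if loose else (lambda a, b: a == b)
    # Predicted terms that agree with at least one gold term.
    hits_pred = sum(any(match(p, g) for g in gold) for p in predicted)
    # Gold terms that are found by at least one predicted term.
    hits_gold = sum(any(match(g, p) for p in predicted) for g in gold)
    precision = hits_pred / len(predicted) if predicted else 0.0
    recall = hits_gold / len(gold) if gold else 0.0
    return precision, recall


# Toy data, invented for illustration only.
gold = [
    Term(0, 12, "source_material"),
    Term(20, 38, "parameter"),   # e.g. a long phrase ending in "temperature"
    Term(50, 60, "parameter"),
]
predicted = [
    Term(0, 12, "source_material"),
    Term(27, 38, "parameter"),   # only the head noun was extracted
]

strict = precision_recall(gold, predicted)
relaxed = precision_recall(gold, predicted, loose=True)
print("exact:", strict)
print("loose:", relaxed)
```

In this toy case the second predicted term covers only part of a gold term, so it counts as an error under exact agreement and as a hit under loose agreement, which raises both precision and recall. The missed third gold term lowers recall in both settings.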

Language: English
Identifiers: ISSN: 2190-4286
eISSN: 2190-4286
DOI: 10.3762/bjnano.6.190
Title ID: cdi_doaj_primary_oai_doaj_org_article_7decfe282d004e8987879d5eaeaa1f16

Format: –
Keywords: annotated corpus, automatic information extraction, Full Research Paper, nanocrystal device development, nanoinformatics, Nanoscience, Nanotechnology, text mining
