Proceedings of the 17th International Conference on Pattern Recognition (ICPR 2004), 2004, Vol. 1, pp. 384-387
2004

Details

Author(s) / Contributors
Title
Identification of embedded mathematical expressions in scanned documents
Is part of
  • Proceedings of the 17th International Conference on Pattern Recognition (ICPR 2004), 2004, Vol. 1, pp. 384-387
Place / Publisher
IEEE
Year of publication
2004
Source
IEEE Xplore
Descriptions / Notes
  • Efficient extraction of mathematical expressions is considered an important pre-processing step for applying existing OCR systems to convert scientific papers into electronic format. In this correspondence, a technique for extracting embedded (in-line) expressions is presented. The proposed method first invokes an existing OCR system to recognize the input document. Several features are then computed at the sentence level to spot sentences containing expressions. These include word n-grams: a statistical analysis of a corpus of scientific documents reveals that the word-level n-gram profile of sentences containing embedded expressions differs markedly from that of sentences without any expression. Expression zones are pinpointed by exploiting the OCR system's inability to handle expressions and by using common typographical conventions followed in typesetting mathematical expressions. Experimental results on a sizable dataset show high efficiency of the proposed technique.
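  • Illustration (not part of the record): the sentence-level n-gram idea in the abstract can be sketched as a comparison of two word-bigram profiles. This record does not state which n-gram order, features, or classifier the paper actually uses, so everything below is an assumption: the bigram order, the add-one smoothing, the function names, and the toy training sentences are all hypothetical and serve only to show the general shape of such a test.

```python
import math
from collections import Counter


def bigrams(sentence):
    """Return the word-level bigrams of a sentence, with boundary markers."""
    tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
    return list(zip(tokens, tokens[1:]))


def build_profile(sentences):
    """Count word bigrams over a list of sentences (the n-gram 'profile')."""
    profile = Counter()
    for sentence in sentences:
        profile.update(bigrams(sentence))
    return profile


def log_score(sentence, profile, vocab_size):
    """Add-one smoothed log-score of a sentence's bigrams under a profile."""
    total = sum(profile.values())
    return sum(
        math.log((profile[bg] + 1) / (total + vocab_size))
        for bg in bigrams(sentence)
    )


def likely_contains_expression(sentence, with_expr, without_expr):
    """True if the sentence fits the 'with expression' profile better."""
    vocab_size = len(set(with_expr) | set(without_expr))
    return (log_score(sentence, with_expr, vocab_size)
            > log_score(sentence, without_expr, vocab_size))


# Toy training data, invented for illustration only (assumption).
WITH_EXPR = build_profile([
    "let x denote the input",
    "where x is the threshold",
    "let n denote the number of words",
    "where n is the length",
])
WITHOUT_EXPR = build_profile([
    "the method is evaluated on scanned pages",
    "the results are reported in the table",
    "the method is fast",
    "the results are good",
])

flag_a = likely_contains_expression(
    "where k is the number of pages", WITH_EXPR, WITHOUT_EXPR)
flag_b = likely_contains_expression(
    "the method is good", WITH_EXPR, WITHOUT_EXPR)
```

    In a full pipeline of the kind the abstract outlines, a sentence flagged this way would then be examined more closely (for example, where the OCR output is unreliable) to locate the expression zone itself; that step is not sketched here.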
Language
English
Identifiers
ISBN: 0769521282, 9780769521282
ISSN: 1051-4651
eISSN: 2831-7475
DOI: 10.1109/ICPR.2004.1334132
Title ID: cdi_ieee_primary_1334132
