2019 International Conference on Document Analysis and Recognition (ICDAR), 2019, p.363-368

2019

Author(s) / Contributors

Title

Integrating Coordinates with Context for Information Extraction in Document Images

Is part of

2019 International Conference on Document Analysis and Recognition (ICDAR), 2019, p.363-368

Place / Publisher

IEEE

Year of publication

2019

Source

IEEE Xplore

Descriptions/Notes

Information extraction from document collections is a fundamental and important step in understanding, structuring, and analyzing data. Many rule-based and deep-learning-based approaches have been applied to plain text. For document images, the same demand exists, but the task becomes quite challenging without linguistic knowledge. In this paper, we propose an approach to extract the required named entities (NEs) from document images by integrating the coordinate information from the detection and recognition stage into the contextual information of a BiLSTM-CRF model with an attention mechanism. We test this method on two real-world datasets: a Contract Dataset of Listed Companies and our own Insurance Policy Dataset. The results show that combining coordinates and context with attention improves extraction in document images, opening up potential applications for such tasks.

Language: English
Identifiers: eISSN: 2379-2140
DOI: 10.1109/ICDAR.2019.00065
Title ID: cdi_ieee_primary_8978016

Format: –
Keywords: BiLSTM-CRF, Contracts, Coordinates and Context, Data mining, Feature extraction, Image recognition, Information Extraction, Insurance, Named Entities, Scene Text Detection and Recognition, Task analysis, Text recognition