Multi-Scale Multi-Task FCN for Semantic Page Segmentation and Table Detection

2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), 2017, Vol.1, p.254-261

2017

Details

Author(s) / Contributors

Title

Multi-Scale Multi-Task FCN for Semantic Page Segmentation and Table Detection

Is part of

2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), 2017, Vol.1, p.254-261

Place / Publisher

IEEE

Year of publication

2017

Source

IEEE Xplore

Descriptions/Notes

Page segmentation and table detection play an important role in understanding the structure of documents. We present a page segmentation algorithm that incorporates state-of-the-art deep learning methods for segmenting three types of document elements: text blocks, tables, and figures. We propose a multi-scale, multi-task fully convolutional neural network (FCN) for the tasks of semantic page segmentation and element contour detection. The semantic segmentation network accurately predicts the probability at each pixel of the three element classes. The contour detection network accurately predicts instance-level "edges" around each element occurrence. We propose a conditional random field (CRF) that uses features output from the semantic segmentation and contour networks to improve upon the semantic segmentation network output. Given the semantic segmentation output, we also extract individual table instances from the page using some heuristic rules and a verification network to remove false positives. We show that although we only consider a page image as input, we produce results comparable with other methods that rely on PDF file information, heuristics, and hand-crafted features tailored to specific types of documents. Our approach learns the representative features for page segmentation from real and synthetic training data. The learning-based property makes it a more general method than existing methods in terms of document types and element appearances. For example, our method reliably detects sparsely lined tables, which are hard for rule-based or heuristic methods.

Language: English
Identifiers: eISSN: 2379-2140
DOI: 10.1109/ICDAR.2017.50
Title ID: cdi_ieee_primary_8269981

Format: –
Keywords: Feature extraction, Fully Convolutional Neural Network, Image edge detection, Image segmentation, Page Segmentation, Semantics, Table Detection, Task analysis, Training, Training data
