Conceptual Modeling for Novel Application Domains, 2003, p.390-401
2003

Details

Author(s) / Contributors
Title
Extracting Relations from XML Documents
Is part of
  • Conceptual Modeling for Novel Application Domains, 2003, p.390-401
Place / Publisher
Berlin, Heidelberg: Springer Berlin Heidelberg
Year of publication
2003
Source
Alma/SFX Local Collection
Descriptions / Notes
  • XML is becoming a prevalent format for data exchange. Many XML documents have complex schemas that are not always known, and can vary widely between information sources and applications. In contrast, database applications rely mainly on the flat relational model. We propose a novel, partially supervised approach for extracting user-defined relations from XML documents with unknown schema. The extracted relations can be directly used by an RDBMS, or utilized for information integration or data mining tasks. Our method attempts to automatically capture the lexical and structural features that indicate the relevant portions of the input document, based on a few user-annotated examples. This information can then be used to extract the relation of interest from documents with schemas potentially different from the training examples. We present preliminary experiments showing that our method could be capable of extracting the target relation from XML documents even in the presence of significant variations in the document schemas.
Language
English
Identifiers
ISBN: 3540202579, 9783540202578
ISSN: 0302-9743
eISSN: 1611-3349
DOI: 10.1007/978-3-540-39597-3_38
Title ID: cdi_pascalfrancis_primary_15758574
