2010 Seventh International Conference on Fuzzy Systems and Knowledge Discovery, 2010, Vol.6, p.2791-2795
2010

Details

Author(s) / Contributors
Title
Combining topic models and string kernel for deep web categorization
Is part of
  • 2010 Seventh International Conference on Fuzzy Systems and Knowledge Discovery, 2010, Vol.6, p.2791-2795
Place / Publisher
IEEE
Year of publication
2010
Link to full text
Source
IEEE Xplore
Descriptions/Notes
  • Online databases maintain collections of structured, domain-specific documents that are generated dynamically in response to users' queries rather than accessed through static URLs. Categorizing deep web sources according to their object domains is a critical step in integrating such sources. While existing methods focus on supervised or post-query methodologies, we propose a more practical pre-query algorithm that operates in an unsupervised manner. Given the number of domains, our two-phase approach first uses topic models to investigate the hidden domain distribution of each query form, so that each query form's object domain can be identified preliminarily. In this phase, we also construct our training set from the query forms deemed to have been categorized correctly, and we select the deep web sources that need to be reclassified. In the second phase, we train a classifier with String Kernel methods to reclassify the uncertain deep web sources and improve the overall performance. The advantage of our algorithm over previous ones is that we capture the semantic structure of each query form. Based on the two-phase architecture, our framework works in an unsupervised manner and achieves satisfactory results. Experiments on the TEL-8 dataset from the UIUC Web integration repository show the effectiveness and efficiency of our algorithm.
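The two-phase idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which topic model, string kernel, or classifier was used. The sketch assumes that per-form topic distributions are already available from some topic model, uses a hypothetical confidence threshold of 0.8 to separate confident from uncertain forms, and swaps in a p-spectrum string kernel with a simple mean-similarity rule in place of a trained kernel classifier. All function names and the sample query forms are invented for illustration.

```python
from collections import Counter
from math import sqrt


def spectrum_kernel(s, t, p=3):
    """Normalized p-spectrum string kernel (cosine similarity of p-gram counts)."""
    a = Counter(s[i:i + p] for i in range(len(s) - p + 1))
    b = Counter(t[i:i + p] for i in range(len(t) - p + 1))
    dot = sum(a[g] * b[g] for g in a if g in b)
    norm = sqrt(sum(v * v for v in a.values()) * sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def split_by_confidence(topic_dists, threshold=0.8):
    """Phase 1 (after topic modelling): keep forms whose top topic is dominant.

    topic_dists maps a form id to its topic distribution; the threshold is an
    assumption of this sketch, not a value taken from the paper.
    """
    confident, uncertain = {}, []
    for form_id, dist in topic_dists.items():
        best = max(range(len(dist)), key=dist.__getitem__)
        if dist[best] >= threshold:
            confident[form_id] = best
        else:
            uncertain.append(form_id)
    return confident, uncertain


def reclassify(texts, confident, uncertain, p=3):
    """Phase 2: assign each uncertain form to the domain whose confident
    forms have the highest mean kernel similarity to it."""
    if not confident:
        raise ValueError("no confidently labelled forms to learn from")
    labels = dict(confident)
    for u in uncertain:
        scores = {}
        for form_id, domain in confident.items():
            sim = spectrum_kernel(texts[u], texts[form_id], p)
            scores.setdefault(domain, []).append(sim)
        labels[u] = max(scores, key=lambda d: sum(scores[d]) / len(scores[d]))
    return labels


# Hypothetical query-form texts and topic distributions (two domains).
texts = {
    "f1": "title author isbn publisher",
    "f2": "departure city arrival city airline",
    "f3": "author title keyword isbn",
}
topic_dists = {"f1": [0.95, 0.05], "f2": [0.10, 0.90], "f3": [0.55, 0.45]}

confident, uncertain = split_by_confidence(topic_dists)
print(reclassify(texts, confident, uncertain))
```

In practice the mean-similarity rule would likely be replaced by a kernel classifier such as a support vector machine trained on the same kernel matrix; the split into a confident training set and an uncertain set to be reclassified is the part that mirrors the abstract.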
Language
English
Identifiers
ISBN: 1424459311, 9781424459315
DOI: 10.1109/FSKD.2010.5569236
Title ID: cdi_ieee_primary_5569236
