2013 IEEE 29th International Conference on Data Engineering (ICDE), 2013, p.925-936
2013

Details

Author(s) / Contributors
Title
Top-k string similarity search with edit-distance constraints
Is part of
  • 2013 IEEE 29th International Conference on Data Engineering (ICDE), 2013, p.925-936
Place / Publisher
IEEE
Year of publication
2013
Source
IEL
Descriptions/Notes
  • String similarity search is a fundamental operation in many areas, such as data cleaning, information retrieval, and bioinformatics. In this paper, we study the problem of top-k string similarity search with edit-distance constraints, which, given a collection of strings and a query string, returns the k strings with the smallest edit distances to the query string. Existing methods usually try different edit-distance thresholds and select an appropriate threshold to find the top-k answers. However, selecting an appropriate threshold is rather expensive. To address this problem, we propose a progressive framework that improves the traditional dynamic-programming algorithm for computing edit distance. We prune unnecessary entries in the dynamic-programming matrix and compute only the pivotal entries. We extend our techniques to support top-k similarity search. We develop a range-based method that groups the pivotal entries to avoid duplicated computations. Experimental results show that our method achieves high performance and significantly outperforms state-of-the-art approaches on real-world datasets.
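To make the problem statement in the abstract concrete, the following is a minimal brute-force sketch in Python. It is not the progressive framework or the range-based method proposed in the paper: it computes the full traditional dynamic-programming edit distance for every string and keeps the k smallest, which is the kind of baseline the paper aims to improve on. The function names `edit_distance` and `top_k_similar` are illustrative assumptions, not taken from the paper.

```python
import heapq


def edit_distance(a: str, b: str) -> int:
    """Traditional dynamic-programming (Levenshtein) edit distance, two-row variant."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def top_k_similar(strings, query, k):
    """Brute-force baseline (an assumption for illustration): score every string,
    then return the k (distance, string) pairs with the smallest distances."""
    scored = ((edit_distance(s, query), s) for s in strings)
    return heapq.nsmallest(k, scored)


if __name__ == "__main__":
    words = ["kitten", "sitting", "mitten", "bitten", "written"]
    print(top_k_similar(words, "kitten", 2))
```

This baseline fills every entry of the dynamic-programming matrix for every candidate string; the abstract's contribution is to avoid most of that work by pruning unnecessary entries and computing only the pivotal ones.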
Language
English
Identifiers
ISBN: 9781467349093, 1467349097
ISSN: 1063-6382
eISSN: 2375-026X
DOI: 10.1109/ICDE.2013.6544886
Title ID: cdi_ieee_primary_6544886
