UB Paderborn / Katalog / Suche / Details

Zur Ergebnisliste

Ergebnis 22 von 1628

Fast LDP-MST: An Efficient Density-Peak-Based Clustering Method for Large-Size Datasets

IEEE transactions on knowledge and data engineering, 2023-05, Vol.35 (5), p.4767-4780

2023

Details

Autor(en) / Beteiligte

Titel

Fast LDP-MST: An Efficient Density-Peak-Based Clustering Method for Large-Size Datasets

Ist Teil von

IEEE transactions on knowledge and data engineering, 2023-05, Vol.35 (5), p.4767-4780

Ort / Verlag

IEEE

Erscheinungsjahr

2023

Link zum Volltext

IEEE_Xplore

Quelle

IEEE Xplore

Beschreibungen/Notizen

Recently, a new density-peak-based clustering method, called clustering with local density peaks-based minimum spanning tree (LDP-MST), was proposed, which has several attractive merits, e.g., being able to detect arbitrarily shaped clusters and not very sensitive to noise and parameters. Nevertheless, we also found the limitation of LDP-MST in efficiency. Specifically, LDP-MST has <inline-formula><tex-math notation="LaTeX">O(N\log N+M^{2})</tex-math> <mml:math><mml:mrow><mml:mi>O</mml:mi><mml:mo>(</mml:mo><mml:mi>N</mml:mi><mml:mo form="prefix">log</mml:mo><mml:mi>N</mml:mi><mml:mo>+</mml:mo><mml:msup><mml:mi>M</mml:mi><mml:mn>2</mml:mn></mml:msup><mml:mo>)</mml:mo></mml:mrow></mml:math><inline-graphic xlink:href="qiu-ieq1-3150403.gif"/> </inline-formula> time, where <inline-formula><tex-math notation="LaTeX">N</tex-math> <mml:math><mml:mi>N</mml:mi></mml:math><inline-graphic xlink:href="qiu-ieq2-3150403.gif"/> </inline-formula> denotes the dataset size and <inline-formula><tex-math notation="LaTeX">M</tex-math> <mml:math><mml:mi>M</mml:mi></mml:math><inline-graphic xlink:href="qiu-ieq3-3150403.gif"/> </inline-formula> is an intermediate variable denoting the number of local density peaks. As our experimental results reveal, when processing large-size datasets, the value of <inline-formula><tex-math notation="LaTeX">M</tex-math> <mml:math><mml:mi>M</mml:mi></mml:math><inline-graphic xlink:href="qiu-ieq4-3150403.gif"/> </inline-formula> could be very large and consequently those steps of LDP-MST involving <inline-formula><tex-math notation="LaTeX">O(M^{2})</tex-math> <mml:math><mml:mrow><mml:mi>O</mml:mi><mml:mo>(</mml:mo><mml:msup><mml:mi>M</mml:mi><mml:mn>2</mml:mn></mml:msup><mml:mo>)</mml:mo></mml:mrow></mml:math><inline-graphic xlink:href="qiu-ieq5-3150403.gif"/> </inline-formula> time term would be time-consuming. And in the worst case, the value of <inline-formula><tex-math notation="LaTeX">M</tex-math> <mml:math><mml:mi>M</mml:mi></mml:math><inline-graphic xlink:href="qiu-ieq6-3150403.gif"/> </inline-formula> could be very close to that of <inline-formula><tex-math notation="LaTeX">N</tex-math> <mml:math><mml:mi>N</mml:mi></mml:math><inline-graphic xlink:href="qiu-ieq7-3150403.gif"/> </inline-formula>, which means that the time complexity of LDP-MST could be <inline-formula><tex-math notation="LaTeX">O(N^{2})</tex-math> <mml:math><mml:mrow><mml:mi>O</mml:mi><mml:mo>(</mml:mo><mml:msup><mml:mi>N</mml:mi><mml:mn>2</mml:mn></mml:msup><mml:mo>)</mml:mo></mml:mrow></mml:math><inline-graphic xlink:href="qiu-ieq8-3150403.gif"/> </inline-formula> in the worst case of <inline-formula><tex-math notation="LaTeX">M</tex-math> <mml:math><mml:mi>M</mml:mi></mml:math><inline-graphic xlink:href="qiu-ieq9-3150403.gif"/> </inline-formula>. In this study, we use more efficient algorithms to implement those steps of LDP-MST that involve the <inline-formula><tex-math notation="LaTeX">O(M^{2})</tex-math> <mml:math><mml:mrow><mml:mi>O</mml:mi><mml:mo>(</mml:mo><mml:msup><mml:mi>M</mml:mi><mml:mn>2</mml:mn></mml:msup><mml:mo>)</mml:mo></mml:mrow></mml:math><inline-graphic xlink:href="qiu-ieq10-3150403.gif"/> </inline-formula> time term such that the proposed method, Fast LDP-MST, has <inline-formula><tex-math notation="LaTeX">O(N\log N)</tex-math> <mml:math><mml:mrow><mml:mi>O</mml:mi><mml:mo>(</mml:mo><mml:mi>N</mml:mi><mml:mo form="prefix">log</mml:mo><mml:mi>N</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math><inline-graphic xlink:href="qiu-ieq11-3150403.gif"/> </inline-formula> time complexity even if <inline-formula><tex-math notation="LaTeX">M\approx N</tex-math> <mml:math><mml:mrow><mml:mi>M</mml:mi><mml:mo>≈</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:math><inline-graphic xlink:href="qiu-ieq12-3150403.gif"/> </inline-formula>. Our experiments demonstrate that Fast LDP-MST is overall more efficient than LDP-MST on large-size datasets, without sacrificing the merits of LDP-MST in effectiveness, robustness, and user-friendliness.

Sprache: Englisch
Identifikatoren: ISSN: 1041-4347
eISSN: 1558-2191
DOI: 10.1109/TKDE.2022.3150403
Titel-ID: cdi_ieee_primary_9712197

Format: –
Schlagworte: Clustering algorithms, Clustering methods, Density-peak-based clustering, Directed acyclic graph, in-tree, large-size datasets, minimal spanning tree, Robustness, Sensitivity, Time complexity, Visualization

Weiterführende Literatur

Empfehlungen zum selben Thema automatisch vorgeschlagen von bX

Menü

Weitere Dienste

Einstellungen

Fast LDP-MST: An Efficient Density-Peak-Based Clustering Method for Large-Size Datasets

Details

Weiterführende Literatur