Journal of bioinformatics and computational biology, 2022-02, Vol.20 (1), p.2150036

Details

Author(s) / Contributors
Title
EdClust: A heuristic sequence clustering method with higher sensitivity
Is part of
  • Journal of bioinformatics and computational biology, 2022-02, Vol.20 (1), p.2150036
Place / Publisher
Singapore
Year of publication
2022
Source
MEDLINE
Descriptions/Notes
  • The development of high-throughput technologies has produced increasing amounts of sequence data and an increasing need for efficient clustering algorithms that can process massive volumes of sequencing data for downstream analysis. Heuristic clustering methods are widely applied for sequence clustering because of their low computational complexity. Although numerous heuristic clustering methods have been developed, they suffer from two limitations: overestimation of inferred clusters and low clustering sensitivity. To address these issues, we present a new sequence clustering method (edClust) that groups similar sequences using Edlib, a C/C++ library for fast, exact semi-global sequence alignment. edClust was tested on three large-scale sequence databases and compared with several classic heuristic clustering methods, such as UCLUST, CD-HIT, and VSEARCH. Evaluations based on the metrics of cluster number and seed sensitivity (SS) demonstrate that edClust produces fewer clusters than the other methods and that its SS is higher. The source code of edClust is available from https://github.com/zhang134/EdClust.git under the GNU GPL license.
Language
English
Identifiers
ISSN: 0219-7200
eISSN: 1757-6334
DOI: 10.1142/S0219720021500360
Title ID: cdi_pubmed_primary_34939905
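
Illustrative sketch
  • The abstract above refers to heuristic sequence clustering driven by pairwise alignment. The following is a minimal sketch of the general greedy, seed-based idea behind such tools, not the edClust algorithm itself, whose details this record does not give. The function names, the `max_edits` threshold, and the longest-first ordering are assumptions made for illustration, and a plain global Levenshtein distance stands in for Edlib's semi-global alignment so that the example needs no external library.

```python
from typing import List, Tuple


def edit_distance(a: str, b: str) -> int:
    """Global Levenshtein distance (illustrative stand-in, not Edlib)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion from a
                            curr[j - 1] + 1,      # insertion into a
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def greedy_cluster(seqs: List[str], max_edits: int) -> List[Tuple[str, List[str]]]:
    """Assign each sequence to the first seed within max_edits, else open a new cluster.

    max_edits is a hypothetical absolute threshold chosen for this sketch;
    real tools usually work with a percent-identity cutoff instead.
    """
    clusters: List[Tuple[str, List[str]]] = []
    # Process longer sequences first so that they become the seeds.
    for seq in sorted(seqs, key=len, reverse=True):
        for seed, members in clusters:
            if edit_distance(seq, seed) <= max_edits:
                members.append(seq)
                break
        else:
            # No existing seed is close enough: seq founds a new cluster.
            clusters.append((seq, [seq]))
    return clusters


if __name__ == "__main__":
    reads = ["ACGTACGTAC", "ACGTACGTAA", "TTTTTTTTTT", "ACGTACGT"]
    for seed, members in greedy_cluster(reads, max_edits=2):
        print(seed, members)
```

    Because each sequence is compared only against the current seeds, the number of alignments stays far below an all-against-all comparison, which is the usual reason heuristic methods are described as cheap. The price is that the result depends on the processing order and on the threshold.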
