
Details

Author(s) / Contributors
Title
StatsDB: platform-agnostic storage and understanding of next generation sequencing run metrics [version 1; peer review: 2 approved, 1 approved with reservations]
Is part of
  • F1000 research, 2013, Vol.2, p.248
Place / Publisher
England: F1000 Research Ltd
Year of publication
2013
Source
EZB Free E-Journals
Descriptions/Notes
  • Modern sequencing platforms generate enormous quantities of data in ever-decreasing amounts of time. Additionally, techniques such as multiplex sequencing allow one run to contain hundreds of different samples. With such data comes a significant challenge to understand its quality and to understand how the quality and yield are changing across instruments and over time. As well as the desire to understand historical data, sequencing centres often have a duty to provide clear summaries of individual run performance to collaborators or customers. We present StatsDB, an open-source software package for storage and analysis of next generation sequencing run metrics. The system has been designed for incorporation into a primary analysis pipeline, either at the programmatic level or via integration into existing user interfaces. Statistics are stored in an SQL database and APIs provide the ability to store and access the data while abstracting the underlying database design. This abstraction allows simpler, wider querying across multiple fields than is possible by the manual steps and calculation required to dissect individual reports, e.g. "provide metrics about nucleotide bias in libraries using adaptor barcode X, across all runs on sequencer A, within the last month". The software is supplied with modules for storage of statistics from FastQC, a commonly used tool for analysis of sequence reads, but the open nature of the database schema means it can be easily adapted to other tools. Currently at The Genome Analysis Centre (TGAC), reports are accessed through our LIMS system or through a standalone GUI tool, but the API and supplied examples make it easy to develop custom reports and to interface with other packages.
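The example query quoted in the abstract can be illustrated with a minimal sketch. This is not the StatsDB schema or API: the `run` and `metric` tables, their columns, the `gc_percent` metric name (used here as a rough stand-in for a nucleotide bias measure), and the sample values are all hypothetical, chosen only to show how storing run metrics in SQL lets one query filter by barcode, instrument, and date at once.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical two-table layout."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE run (
            id         INTEGER PRIMARY KEY,
            instrument TEXT NOT NULL,
            barcode    TEXT NOT NULL,
            run_date   TEXT NOT NULL   -- ISO 8601 date, so string order = date order
        );
        CREATE TABLE metric (
            run_id INTEGER NOT NULL REFERENCES run(id),
            name   TEXT NOT NULL,
            value  REAL NOT NULL
        );
    """)
    # Invented sample rows, for illustration only.
    conn.executemany("INSERT INTO run VALUES (?, ?, ?, ?)", [
        (1, "A", "X", "2013-09-20"),
        (2, "A", "X", "2013-09-28"),
        (3, "B", "X", "2013-09-25"),   # different sequencer
        (4, "A", "Y", "2013-09-27"),   # different barcode
        (5, "A", "X", "2013-07-01"),   # outside the date window
    ])
    conn.executemany("INSERT INTO metric VALUES (?, ?, ?)", [
        (1, "gc_percent", 40.0),
        (2, "gc_percent", 44.0),
        (3, "gc_percent", 60.0),
        (4, "gc_percent", 55.0),
        (5, "gc_percent", 30.0),
        (1, "total_reads", 1000000.0),
    ])
    return conn


def mean_metric(conn, name, instrument, barcode, since):
    """Return (mean value, number of rows) for one metric across matching runs."""
    return conn.execute(
        """
        SELECT AVG(m.value), COUNT(*)
        FROM metric AS m
        JOIN run AS r ON r.id = m.run_id
        WHERE m.name = ?
          AND r.instrument = ?
          AND r.barcode = ?
          AND r.run_date >= ?
        """,
        (name, instrument, barcode, since),
    ).fetchone()


if __name__ == "__main__":
    conn = build_demo_db()
    # "Barcode X, sequencer A, since 1 September 2013"
    print(mean_metric(conn, "gc_percent", "A", "X", "2013-09-01"))
```

With the invented rows above, only runs 1 and 2 match all three filters, so the function averages their two values. A real deployment would go through the StatsDB API rather than hand-written SQL, since the abstract states that the API abstracts the underlying database design.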
Language
English
Identifiers
ISSN: 2046-1402
eISSN: 2046-1402
DOI: 10.12688/f1000research.2-248.v1
Title ID: cdi_doaj_primary_oai_doaj_org_article_aad5403074d847c48a842c74e5622dda
Format
Keywords
Bioinformatics
