Lightweight Cardinality Estimation in LSM-based Systems

Proceedings of the 2018 International Conference on Management of Data, 2018, p.841-855

2018

Details

Author(s) / Contributors

Title

Lightweight Cardinality Estimation in LSM-based Systems

Is part of

Proceedings of the 2018 International Conference on Management of Data, 2018, p.841-855

Place / Publisher

New York, NY, USA: ACM

Year of publication

2018

Source

ACM Digital Library Complete

Descriptions/Notes

Data sources such as social media, mobile apps, and IoT sensors generate billions of records each day. Keeping up with this influx of data while providing useful analytics to users is a major challenge for today's data-intensive systems. A popular solution that allows such systems to handle rapidly incoming data is to rely on log-structured merge (LSM) storage models. LSM-based systems provide a tunable trade-off between ingesting vast amounts of data at a high rate and running efficient analytical queries on top of that data. For queries, it is well known that query processing performance largely depends on the ability to generate efficient execution plans. Previous research showed that OLAP query workloads rely on having small, yet precise, statistical summaries of the underlying data, which can drive cost-based query optimization. In this paper, we address the problem of computing data statistics for workloads with rapid data ingestion and propose a lightweight statistics-collection framework that exploits the properties of LSM storage. Our approach is designed to piggyback on the events (flush and merge) of the LSM lifecycle. This allows us to easily create initial statistics and then keep them in sync with rapidly changing data while minimizing the overhead to the existing system. We have implemented and adapted well-known algorithms to produce various types of statistical synopses, including equi-width histograms, equi-height histograms, and wavelets. We performed an in-depth empirical evaluation that considers both the cardinality estimation accuracy and the runtime overheads of collecting and using statistics. The experiments were conducted by prototyping our approach on top of Apache AsterixDB, an open-source Big Data management system that has an entirely LSM-based storage backend.
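
The idea of piggybacking statistics collection on LSM flush and merge events can be illustrated with a minimal sketch. This is not the paper's implementation or AsterixDB code: `EquiWidthHistogram`, `ToyLSMIndex`, and all parameters are hypothetical names chosen for illustration. The sketch assumes integer keys from a known, fixed domain, ignores deletes and updates, and combines histograms on merge by summing bucket counts, which is a simplification that may differ from what the authors do.

```python
from typing import List, Optional, Tuple


class EquiWidthHistogram:
    """Hypothetical equi-width histogram over a fixed key domain [lo, hi)."""

    def __init__(self, lo: int, hi: int, num_buckets: int,
                 counts: Optional[List[int]] = None):
        self.lo, self.hi, self.num_buckets = lo, hi, num_buckets
        self.width = (hi - lo) / num_buckets
        self.counts = counts if counts is not None else [0] * num_buckets

    def add(self, key: int) -> None:
        # Clamp so that out-of-domain keys land in the first or last bucket.
        idx = int((key - self.lo) / self.width)
        self.counts[min(max(idx, 0), self.num_buckets - 1)] += 1

    def merged_with(self, other: "EquiWidthHistogram") -> "EquiWidthHistogram":
        # Histograms with identical bucket boundaries combine by summing counts.
        assert (self.lo, self.hi, self.num_buckets) == \
               (other.lo, other.hi, other.num_buckets)
        summed = [a + b for a, b in zip(self.counts, other.counts)]
        return EquiWidthHistogram(self.lo, self.hi, self.num_buckets, summed)

    def estimate_range(self, start: float, end: float) -> float:
        # Estimate keys in [start, end), assuming uniformity within a bucket.
        total = 0.0
        for i, count in enumerate(self.counts):
            b_lo = self.lo + i * self.width
            b_hi = b_lo + self.width
            overlap = max(0.0, min(end, b_hi) - max(start, b_lo))
            total += count * overlap / self.width
        return total


class ToyLSMIndex:
    """Toy LSM index that builds statistics during flush and merge events."""

    def __init__(self, lo: int, hi: int, num_buckets: int, memtable_limit: int):
        self.lo, self.hi, self.num_buckets = lo, hi, num_buckets
        self.memtable_limit = memtable_limit
        self.memtable: List[int] = []
        # Each on-disk component is modeled as (sorted keys, its histogram).
        self.components: List[Tuple[List[int], EquiWidthHistogram]] = []

    def _empty_histogram(self) -> EquiWidthHistogram:
        return EquiWidthHistogram(self.lo, self.hi, self.num_buckets)

    def insert(self, key: int) -> None:
        self.memtable.append(key)
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self) -> None:
        # Flush event: the records are scanned anyway, so build the synopsis here.
        hist = self._empty_histogram()
        for key in self.memtable:
            hist.add(key)
        self.components.append((sorted(self.memtable), hist))
        self.memtable = []

    def merge_all(self) -> None:
        # Merge event: replace the old components and their synopses with one.
        if len(self.components) < 2:
            return
        keys: List[int] = []
        hist = self._empty_histogram()
        for comp_keys, comp_hist in self.components:
            keys.extend(comp_keys)
            hist = hist.merged_with(comp_hist)
        self.components = [(sorted(keys), hist)]

    def estimate_range(self, start: float, end: float) -> float:
        # Cardinality estimate: sum the per-component estimates.
        return sum(h.estimate_range(start, end) for _, h in self.components)


if __name__ == "__main__":
    index = ToyLSMIndex(lo=0, hi=100, num_buckets=10, memtable_limit=25)
    for k in range(100):
        index.insert(k)
    print(len(index.components), index.estimate_range(0, 50))
    index.merge_all()
    print(len(index.components), index.estimate_range(0, 50))
```

In this sketch the statistics never require a separate pass over the data: each synopsis is produced while a component is being written, and the estimate stays in sync with the stored components as they are flushed and merged.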

Language: English
Identifiers: ISBN: 1450347037, 9781450347037
DOI: 10.1145/3183713.3183761
Title ID: cdi_acm_books_10_1145_3183713_3183761_brief

Format: –
Subject headings: Information systems -- Data management systems -- Database management system engines -- Database query processing -- Query optimization
