SOOM: Sort-Based Optimizer for Big Data Multi-Query

Big data, 2020-02, Vol.8 (1), p.38-61

2020

Details

Author(s) / Contributors

Title

SOOM: Sort-Based Optimizer for Big Data Multi-Query

Is part of

Big data, 2020-02, Vol.8 (1), p.38-61

Place / Publisher

United States

Year of publication

2020

Source

MEDLINE

Descriptions/Notes

Sorting is a common operation in many applications; it consumes resources and adds computation overhead. In Big Data multi-query workloads, the shared sort operations are fairly large and incur high-cost I/O, whether the sorts are explicit or implicit. In particular, Big Data multi-queries that include aggregation and sort operations take a long time to execute because similar tasks reshuffle the same data multiple times. Exploiting the shared data and shared sort opportunities of similar tasks therefore makes it possible to reuse previous results to optimize multi-query execution. For shared data, our previous work, the Multi-Query Optimization Using Tuple Size and Histogram (MOTH) system, considers the granularity of data-sharing opportunities among multiple queries. However, it does not consider the time overhead of redundant in-network data movement (i.e., the shuffling time needed to transfer intermediate data for sort operations). The MOTH system has therefore been extended to the SOOM (Sort-Based Optimizer over MOTH) system to exploit shared sort opportunities, including the explicit sorts of sort queries and the implicit sorts of aggregation queries. SOOM adds two modules for this purpose, the query explorer and the sort exploiter, which build on the existing MOTH system to optimize multiple aggregation and sort queries. The experimental evaluation shows that, on Hadoop-like infrastructures, SOOM outperforms the naive and the state-of-the-art techniques in query execution time by 45% and 30%, respectively, while reducing the maximal intermediate data size by 67% and 61% on average, respectively.

Language: English
Identifiers: ISSN: 2167-6461
eISSN: 2167-647X
DOI: 10.1089/big.2019.0023
Title ID: cdi_pubmed_primary_31999479

Format: –
Keywords: Big Data, Computer Systems, Electronic Data Processing, Information Storage and Retrieval
