Explaining outputs in modern data analytics

Proceedings of the VLDB Endowment, 2016-08, Vol.9 (12), p.1137-1148

2016

Details

Source

ACM Digital Library

Description/Notes

We report on the design and implementation of a general framework for interactively explaining the outputs of modern data-parallel computations, including iterative data analytics. To produce explanations, existing works adopt a naive backward tracing approach which runs into known issues; naive backward tracing may identify: (i) too much information that is difficult to process, and (ii) not enough information to reproduce the output, which hinders the logical debugging of the program. The contribution of this work is twofold. First, we provide methods to effectively reduce the size of explanations based on the first occurrence of a record in an iterative computation. Second, we provide a general method for identifying explanations that are sufficient to reproduce the target output in arbitrary computations -- a problem for which no viable solution existed until now. We implement our approach on differential dataflow, a modern high-throughput, low-latency dataflow platform. We add a small (but extensible) set of rules to explain each of its data-parallel operators, and we implement these rules as differential dataflow operators themselves. This choice allows our implementation to inherit the performance characteristics of differential dataflow, and results in a system that efficiently computes and updates explanatory inputs even as the inputs of the reference computation change. We evaluate our system with various analytic tasks on real datasets, and we show that it produces concise explanations in tens of milliseconds, while remaining faster -- up to two orders of magnitude -- than even the best implementations that do not support explanations.

Language: English
Identifiers: ISSN: 2150-8097
eISSN: 2150-8097
DOI: 10.14778/2994509.2994530
Title ID: cdi_crossref_primary_10_14778_2994509_2994530
