Proceedings of the VLDB Endowment, 2016-08, Vol.9 (12), p.1137-1148
2016

Details

Author(s) / Contributors
Title
Explaining outputs in modern data analytics
Is part of
  • Proceedings of the VLDB Endowment, 2016-08, Vol.9 (12), p.1137-1148
Year of publication
2016
Source
ACM Digital Library
Descriptions/Notes
  • We report on the design and implementation of a general framework for interactively explaining the outputs of modern data-parallel computations, including iterative data analytics. To produce explanations, existing works adopt a naive backward tracing approach which runs into known issues; naive backward tracing may identify: (i) too much information that is difficult to process, and (ii) not enough information to reproduce the output, which hinders the logical debugging of the program. The contribution of this work is twofold. First, we provide methods to effectively reduce the size of explanations based on the first occurrence of a record in an iterative computation. Second, we provide a general method for identifying explanations that are sufficient to reproduce the target output in arbitrary computations, a problem for which no viable solution existed until now. We implement our approach on differential dataflow, a modern high-throughput, low-latency dataflow platform. We add a small (but extensible) set of rules to explain each of its data-parallel operators, and we implement these rules as differential dataflow operators themselves. This choice allows our implementation to inherit the performance characteristics of differential dataflow, and results in a system that efficiently computes and updates explanatory inputs even as the inputs of the reference computation change. We evaluate our system with various analytic tasks on real datasets, and we show that it produces concise explanations in tens of milliseconds, while remaining faster, by up to two orders of magnitude, than even the best implementations that do not support explanations.
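Issue (i) in the abstract, naive backward tracing returning more input than is needed, can be illustrated with a toy example. The sketch below is not the paper's method and does not use differential dataflow; the functions `min_per_key`, `naive_backward_trace`, and `minimal_explanation` are hypothetical names introduced only for illustration, and the computation is assumed to be a simple per-key minimum.

```python
from collections import defaultdict


def min_per_key(records):
    """Toy data-parallel reduce: keep the smallest value for each key."""
    groups = defaultdict(list)
    for key, value in records:
        groups[key].append(value)
    return {key: min(values) for key, values in groups.items()}


def naive_backward_trace(records, key):
    """Naive lineage (illustrative): every input record routed to the output's group."""
    return [(k, v) for k, v in records if k == key]


def minimal_explanation(records, key):
    """A smaller explanation for this toy case: one record that reproduces the output."""
    target = min_per_key(records)[key]
    return [(key, target)]


records = [("a", 5), ("a", 3), ("a", 9), ("b", 4)]
output = min_per_key(records)

traced = naive_backward_trace(records, "a")  # all three "a" records
small = minimal_explanation(records, "a")    # only the record holding the minimum

# Re-running the computation on the small explanation yields the same output for "a".
reproduced = min_per_key(small)["a"]
```

Here the naive trace returns every record in the group, while a single record is enough to reproduce the target output. For other operators the opposite problem, issue (ii), can arise, which this toy example does not cover.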
Language
English
Identifiers
ISSN: 2150-8097
eISSN: 2150-8097
DOI: 10.14778/2994509.2994530
Title ID: cdi_crossref_primary_10_14778_2994509_2994530