Interactive Visualization for Data Science Scripts

Rebecca Faust, Carlos Scheidegger, Katherine Isaacs, William Z. Bernstein, Michael Sharp, Chris North

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

As the field of data science continues to grow, so does the need for adequate tools to understand and debug data science scripts. Current debugging practices fall short when applied to a data science setting, due to the exploratory and iterative nature of analysis scripts. Computational notebooks, the preferred scripting environment of many data scientists, present additional challenges to understanding and debugging workflows, including the non-linear execution of code snippets. This paper presents Anteater, a trace-based visual debugging method for data science scripts. Anteater automatically traces and visualizes execution data with minimal analyst input. The visualizations illustrate execution and value behaviors that aid in understanding the results of analysis scripts. To maximize the number of workflows supported, we present prototype implementations in both Python and Jupyter. Finally, to demonstrate Anteater's support for analysis understanding tasks, we provide two usage scenarios on real-world analysis scripts.

Original language: English (US)
Title of host publication: Proceedings - 2022 IEEE Visualization in Data Science, VDS 2022
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 37-45
Number of pages: 9
ISBN (Electronic): 9781665457217
State: Published - 2022
Event: 2022 IEEE Visualization in Data Science, VDS 2022 -
Duration: Jan 1 2022 → …

Publication series

Name: Proceedings - 2022 IEEE Visualization in Data Science, VDS 2022

Conference

Conference: 2022 IEEE Visualization in Data Science, VDS 2022
Period: 1/1/22 → …

Keywords

  • Debugging
  • Interactive Visualization
  • Jupyter
  • Program Traces

ASJC Scopus subject areas

  • Information Systems
  • Media Technology
