Aardvark: Comparative Visualization of Data Analysis Scripts

Rebecca Faust, Carlos Scheidegger, Chris North

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Scopus citation

Abstract

Debugging programs is one of the most challenging and time-consuming parts of programming. Data science scripts present additional challenges, as debugging often centers on more exploratory tasks, such as understanding how results differ under different parameter settings. In fact, a common exploratory debugging practice is to run, modify, and re-run a script to observe the effects of the modification. Analysts perform this process frequently as they explore different settings and algorithms in their analysis. However, traditional debugging methods are not well suited to comparing multiple executions of a script. They often require maintaining two instances of the debugging method and making manual, serial comparisons of program values. To address this gap, we present Aardvark, a comparative trace-based debugging method for identifying and visualizing the differences between two executions of data analysis scripts. Aardvark traces two consecutive instances of an analysis script, identifies the differences between them, and presents them through comparative visualizations. We present a prototype implementation in Python as well as an extension to support scripts in Jupyter notebooks. Finally, to demonstrate Aardvark, we provide two usage scenarios on real-world analysis scripts.
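To make the idea of comparing two traced executions concrete, the following is a minimal sketch and not Aardvark's actual implementation, which this record does not describe. It assumes a script can be run twice under Python's standard `sys.settrace` hook with different parameters, and that comparing snapshots of simple variable values step by step is enough to locate where the runs diverge. The names `trace_script`, `first_differences`, and `SCRIPT` are hypothetical and exist only for illustration.

```python
import copy
import sys

# Hypothetical three-line analysis script used only for this illustration.
SCRIPT = """\
scaled = value * factor
total = scaled + offset
label = "high" if total > 10 else "low"
"""

SIMPLE_TYPES = (int, float, str, bool, list, tuple, dict)


def trace_script(source, params):
    """Run `source` with `params` as its initial variables and return a list
    of (line_number, snapshot) pairs, one per traced step."""
    events = []
    code = compile(source, "<script>", "exec")

    def tracer(frame, event, arg):
        # Ignore frames that do not belong to the traced script.
        if frame.f_code.co_filename != "<script>":
            return None
        # "line" fires before a line runs; "return" captures the final state.
        if event in ("line", "return"):
            snapshot = {
                name: copy.deepcopy(value)
                for name, value in frame.f_locals.items()
                if not name.startswith("__") and isinstance(value, SIMPLE_TYPES)
            }
            events.append((frame.f_lineno, snapshot))
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        exec(code, dict(params))
    finally:
        sys.settrace(previous)  # restore whatever tracer was active before
    return events


def first_differences(trace_a, trace_b):
    """Map each variable to (line, value_a, value_b) at the first step where
    its value differs between the two runs. A missing variable reads as None."""
    found = {}
    for (line_a, snap_a), (_, snap_b) in zip(trace_a, trace_b):
        for name in sorted(snap_a.keys() | snap_b.keys()):
            if name not in found and snap_a.get(name) != snap_b.get(name):
                found[name] = (line_a, snap_a.get(name), snap_b.get(name))
    return found


# Two runs of the same script that differ only in one parameter setting.
run_a = trace_script(SCRIPT, {"value": 3, "factor": 2, "offset": 1})
run_b = trace_script(SCRIPT, {"value": 3, "factor": 4, "offset": 1})
differences = first_differences(run_a, run_b)

for name, (line, a, b) in differences.items():
    print(f"{name}: first differs at line {line}: {a!r} vs {b!r}")
```

The sketch aligns the two traces step by step with `zip`, which only works while both runs follow the same control flow; a real comparative debugger would need a more robust alignment and, as the abstract notes, a visual presentation of the differences rather than printed text.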

Original language: English (US)
Title of host publication: Proceedings - 2023 IEEE Visualization in Data Science, VDS 2023
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 30-38
Number of pages: 9
ISBN (Electronic): 9798350330205
State: Published - 2023
Event: 2023 IEEE Visualization in Data Science, VDS 2023 - Hybrid, Melbourne, Australia
Duration: Oct 23 2023 → …

Publication series

Name: Proceedings - 2023 IEEE Visualization in Data Science, VDS 2023

Conference

Conference: 2023 IEEE Visualization in Data Science, VDS 2023
Country/Territory: Australia
City: Hybrid, Melbourne
Period: 10/23/23 → …

Keywords

  • Comparison
  • Debugging
  • Interactive Visualization
  • Jupyter
  • Program Traces

ASJC Scopus subject areas

  • Computer Graphics and Computer-Aided Design
  • Modeling and Simulation
