Convenient access to copious multifaceted data has encouraged machine learning researchers to reconsider correlation-based learning and embrace the opportunity of causality-based learning, i.e., causal machine learning (causal learning). Recent years have therefore witnessed great effort in developing causal learning algorithms that aim to help artificial intelligence (AI) achieve human-level intelligence. Owing to the lack of ground-truth data, one of the biggest challenges in current causal learning research is algorithm evaluation. This largely impedes the cross-pollination of AI and causal inference and hinders each field from benefiting from the advances of the other. To bridge conventional causal inference (i.e., based on statistical methods) and causal learning with Big Data (i.e., the intersection of causal inference and machine learning), in this survey we review commonly used datasets, evaluation methods, and measures for causal learning, using an evaluation pipeline similar to that of conventional machine learning. We focus on the two fundamental causal inference tasks and on causality-aware machine learning tasks. Limitations of current evaluation procedures are also discussed. We then examine popular causal inference tools and packages and conclude with the primary challenges and opportunities for benchmarking causal learning algorithms in the era of Big Data. The survey seeks to bring to the forefront the urgency of developing publicly available benchmarks and consensus-building standards for causal learning evaluation with observational data. In doing so, we hope to broaden the discussion and facilitate collaboration to advance the innovation and application of causal learning.

Original language: English (US)
Pages (from-to): 924-943
Number of pages: 20
Journal: IEEE Transactions on Artificial Intelligence
Issue number: 6
State: Published - Dec 1 2022

Keywords

  • Benchmarking
  • Big Data
  • causal inference
  • causal learning
  • evaluation

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Science Applications


Title: Evaluation Methods and Measures for Causal Learning Algorithms