Rescorla–Wagner Models with Sparse Dynamic Attention

Joel Nishimura, Amy L. Cochran

Research output: Contribution to journal › Article › peer-review

1 Scopus citation


The Rescorla–Wagner (R–W) model describes human associative learning by proposing that an agent updates associations between stimuli, such as events in their environment or predictive cues, in proportion to a prediction error. While this model has proven informative in experiments, it has been posited that humans selectively attend to certain cues to overcome a problem the R–W model has in scaling to large cue dimensions. We formally characterize this scaling problem and provide a solution that involves limiting attention in an R–W model to a sparse set of cues. Given the universal difficulty in selecting features for prediction, sparse attention faces challenges beyond those faced by the R–W model. We demonstrate several ways in which a naive attention model can fail, explain those failures, and leverage that understanding to produce a Sparse Attention R–W with Inference framework (SAR-WI). The SAR-WI framework not only satisfies a constraint on the number of attended cues, it also performs as well as the R–W model on a number of natural learning tasks, can correctly infer associative strengths, and focuses attention on predictive cues while ignoring uninformative cues. Given the simplicity of the proposed alterations, we hope this work informs future development and empirical validation of associative learning models that seek to incorporate sparse attention.
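For readers unfamiliar with the update the abstract refers to, the following is a minimal sketch of the textbook R–W rule, in which each cue weight moves in proportion to the prediction error. It is not the SAR-WI framework from the article. The function names, the learning rate `alpha`, the synthetic data, and the optional `attended` argument (a fixed, hand-picked set of cue indices standing in for "a sparse set of cues") are all assumptions made for illustration.

```python
import numpy as np

def rw_update(w, x, r, alpha=0.1, attended=None):
    """One textbook Rescorla-Wagner step: w += alpha * (r - w.x) * x."""
    x = np.asarray(x, dtype=float)
    if attended is not None:
        # Hypothetical fixed attention set: cues outside it are ignored.
        # This is a naive illustration, not the article's SAR-WI method.
        mask = np.zeros_like(x)
        mask[list(attended)] = 1.0
        x = x * mask
    error = r - w @ x  # prediction error on this trial
    return w + alpha * error * x

def train(cues, outcomes, alpha=0.1, attended=None):
    """Run the update over a sequence of trials, starting from zero weights."""
    w = np.zeros(cues.shape[1])
    for x, r in zip(cues, outcomes):
        w = rw_update(w, x, r, alpha, attended)
    return w

# Synthetic, noise-free task (assumed for illustration): 5 binary cues,
# of which only cues 0 and 2 predict the outcome.
rng = np.random.default_rng(0)
cues = rng.integers(0, 2, size=(2000, 5)).astype(float)
true_w = np.array([1.0, 0.0, -0.5, 0.0, 0.0])
outcomes = cues @ true_w

w_full = train(cues, outcomes)                     # attends to all cues
w_sparse = train(cues, outcomes, attended=[0, 2])  # attends to two cues
```

In this toy setting the attended set happens to contain exactly the predictive cues, so both runs recover the same associative strengths. Choosing that set when the predictive cues are unknown is the harder problem the abstract describes.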

Original language: English (US)
Article number: 69
Journal: Bulletin of Mathematical Biology
Issue number: 6
State: Published - Jun 1 2020


Keywords

  • Attention
  • Learning
  • Rescorla–Wagner

ASJC Scopus subject areas

  • Neuroscience(all)
  • Immunology
  • Mathematics(all)
  • Biochemistry, Genetics and Molecular Biology(all)
  • Environmental Science(all)
  • Pharmacology
  • Agricultural and Biological Sciences(all)
  • Computational Theory and Mathematics

