TY - GEN
T1 - Why? Why not? When? Visual Explanations of Agent Behaviour in Reinforcement Learning
AU - Mishra, Aditi
AU - Soni, Utkarsh
AU - Huang, Jinbin
AU - Bryan, Chris
N1 - Funding Information: This research was supported by the U.S. National Science Foundation through grant OAC-1934766. Publisher Copyright: © 2022 IEEE.
PY - 2022
Y1 - 2022
AB - Reinforcement learning (RL) is used in many domains, including autonomous driving, robotics, stock trading, and video games. Unfortunately, the black-box nature of RL agents, combined with legal and ethical considerations, makes it increasingly important that humans (including those who are not experts in RL) understand the reasoning behind the actions taken by an RL agent, particularly in safety-critical domains. To help address this challenge, we introduce PolicyExplainer, a visual analytics interface that lets the user directly query an autonomous agent. PolicyExplainer visualizes the states, policy, and expected future rewards for an agent, and supports asking and answering questions such as: 'Why take this action? Why not take this other action? When is this action taken?' PolicyExplainer is designed based upon a domain analysis with RL researchers, and is evaluated via qualitative and quantitative assessments on a trio of domains: taxi navigation, a stack bot domain, and drug recommendation for HIV patients. We find that PolicyExplainer's visual approach promotes trust and understanding of agent decisions better than a state-of-the-art text-based explanation approach. Interviews with domain practitioners provide further validation for PolicyExplainer as applied to safety-critical domains. Our results help demonstrate how visualization-based approaches can be leveraged to decode the behavior of autonomous RL agents, particularly for RL non-experts.
KW - Human-centered computing → Visualization → Visualization design and evaluation methods
KW - Human-centered computing → Visualization → Visualization techniques → Treemaps
UR - http://www.scopus.com/inward/record.url?scp=85132446785&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85132446785&partnerID=8YFLogxK
U2 - 10.1109/PacificVis53943.2022.00020
DO - 10.1109/PacificVis53943.2022.00020
M3 - Conference contribution
T3 - IEEE Pacific Visualization Symposium
SP - 111
EP - 120
BT - Proceedings - 2022 IEEE 15th Pacific Visualization Symposium, PacificVis 2022
PB - IEEE Computer Society
T2 - 15th IEEE Pacific Visualization Symposium, PacificVis 2022
Y2 - 11 April 2022 through 14 April 2022
ER -