In this work, we investigate whether the transparency of a robot's behaviour improves when human preferences about the robot's actions are taken into account during the learning process. For this purpose, a shielding mechanism called Preference Shielding is proposed and incorporated into a reinforcement learning algorithm to account for human preferences. We also use this shielding to decide when to provide explanations of the robot's actions. We carried out a within-subjects study involving 26 participants to evaluate the robot's transparency. Results indicate that considering human preferences during learning improves legibility compared with providing only explanations. In addition, combining human preferences and explanations further amplifies transparency. Results also confirm that increased transparency leads to an increase in people's perception of the robot's safety, comfort, and reliability. These findings show the importance of transparency during learning and suggest a paradigm for robotic applications in which a robot has to learn a task in the presence of, or in collaboration with, a human.

What is behind the curtain? Increasing transparency in reinforcement learning with human preferences and explanations / Angelopoulos, G.; Mangiacapra, L.; Rossi, A.; Di Napoli, C.; Rossi, S. - In: ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE. - ISSN 0952-1976. - 149:(2025). [10.1016/j.engappai.2025.110520]

What is behind the curtain? Increasing transparency in reinforcement learning with human preferences and explanations

Angelopoulos G.; Rossi A.; Rossi S.
2025

Files in this record:
No files are associated with this record.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11588/999220
Citations
  • Scopus: 0