Reducing risk for assistive reinforcement learning policies with diffusion models

Authors

  • Andrii Tytarenko Educational and Research Institute for Applied System Analysis of the National Technical University of Ukraine "Igor Sikorsky Kyiv Polytechnic Institute", Kyiv, Ukraine https://orcid.org/0000-0002-8265-642X

DOI:

https://doi.org/10.20535/SRIT.2308-8893.2024.3.09

Keywords:

assistive robotics, reinforcement learning, diffusion models, imitation learning

Abstract

Caregiving and assistive robotics, driven by advances in AI, offer promising solutions to the growing demand for care as the number of individuals requiring assistance rises. This growth, heightened by war-related injuries, creates a pressing need for efficient and safe assistive devices. While cost has been a barrier to accessibility, technological progress can democratize these solutions. Safety remains a paramount concern, especially given the intricate physical interactions between assistive robots and humans. This study explores the application of reinforcement learning (RL) and imitation learning to improve policy design for assistive robots. The proposed approach makes risky policies safer without additional environment interactions. Experiments in simulated environments demonstrate the improvement over conventional RL approaches on tasks related to assistive robotics.
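The abstract does not spell out the mechanism, but the keywords point to a diffusion-policy-style design. The PyTorch sketch below is purely illustrative and hypothetical, not the paper's implementation: a state-conditioned diffusion model is trained on safe demonstration actions, and at deployment it partially noises and then denoises the action proposed by a pretrained RL policy, pulling it toward the demonstration distribution with no further environment interactions. All names (ActionDenoiser, refine_action, the schedule constants) are assumptions.

# Hypothetical sketch only: the paper's exact method is not given in the abstract.
# Assumption: a state-conditioned DDPM over demonstration actions is used as an
# offline "safety filter" for a pretrained RL policy.
import torch
import torch.nn as nn

T = 50  # number of diffusion steps
betas = torch.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bars = torch.cumprod(alphas, dim=0)

class ActionDenoiser(nn.Module):
    """Predicts the noise added to a demonstration action, given state and step t."""
    def __init__(self, state_dim: int, action_dim: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim + 1, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, action_dim),
        )

    def forward(self, state, noisy_action, t):
        t_feat = t.float().unsqueeze(-1) / T  # normalized timestep feature
        return self.net(torch.cat([state, noisy_action, t_feat], dim=-1))

def diffusion_loss(model, states, actions):
    """Standard DDPM noise-prediction loss on (state, action) demonstration pairs."""
    t = torch.randint(0, T, (states.shape[0],))
    noise = torch.randn_like(actions)
    ab = alpha_bars[t].unsqueeze(-1)
    noisy = ab.sqrt() * actions + (1 - ab).sqrt() * noise
    return ((model(states, noisy, t) - noise) ** 2).mean()

@torch.no_grad()
def refine_action(model, state, rl_action, k: int = 10):
    """Partially noise the RL policy's action, then run k reverse-diffusion steps,
    projecting it toward the safe demonstration distribution (no env rollouts)."""
    ab_k = alpha_bars[k - 1]
    a = ab_k.sqrt() * rl_action + (1 - ab_k).sqrt() * torch.randn_like(rl_action)
    for t in reversed(range(k)):
        t_b = torch.full((state.shape[0],), t, dtype=torch.long)
        eps = model(state, a, t_b)
        a = (a - betas[t] / (1 - alpha_bars[t]).sqrt() * eps) / alphas[t].sqrt()
        if t > 0:
            a = a + betas[t].sqrt() * torch.randn_like(a)
    return a

In this sketch, ActionDenoiser would be fit on (state, action) pairs from safe demonstrations via diffusion_loss; at deployment, refine_action filters each action from the pretrained RL policy, with k trading off fidelity to the RL action against proximity to the demonstrations.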

Author Biography

Andrii Tytarenko, Educational and Research Institute for Applied System Analysis of the National Technical University of Ukraine "Igor Sikorsky Kyiv Polytechnic Institute", Kyiv

Ph.D. student at the Educational and Research Institute for Applied System Analysis of the National Technical University of Ukraine "Igor Sikorsky Kyiv Polytechnic Institute", Kyiv, Ukraine.


Published

2024-09-28

Section

Methods, models, and technologies of artificial intelligence in system analysis and control