Paper
27 June 2022
Improved the sample efficiency of episodic reinforcement learning by forcing state representations
Ruiyuan Zhang, William Zhu
Proceedings Volume 12253, International Conference on Automation Control, Algorithm, and Intelligent Bionics (ACAIB 2022); 122531I (2022) https://doi.org/10.1117/12.2639467
Event: Second International Conference on Automation Control, Algorithm, and Intelligent Bionics (ACAIB 2022), 2022, Qingdao, China
Abstract
Episodic reinforcement learning (ERL) is a class of algorithms that use episodic memory to improve the performance and sample efficiency of reinforcement learning (RL). Although ERL has achieved some success, existing algorithms must still interact with the environment for many rounds to reach satisfactory performance. In this paper, we propose episodic memory by forcing state representations (EMSR), an algorithm that improves the performance and sample efficiency of ERL. Specifically, EMSR uses a transition model to predict the agent’s hidden state representations multiple steps into the future, augmenting reward maximization and helping the agent learn quickly. In this way, our method achieves better performance and higher sample efficiency than previous state-of-the-art algorithms. Experimental results demonstrate the superiority of our method.
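The core idea above can be illustrated with a minimal sketch. This is not the authors' EMSR implementation; it is a toy example, under assumed linear dynamics, of the general technique the abstract describes: learning a transition model that predicts a hidden state representation several steps ahead, trained with an auxiliary mean-squared-error objective. All names and hyperparameters (`dim`, `horizon`, `lr`, `A_true`) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
dim, horizon, lr = 4, 3, 0.05

# Ground-truth one-step dynamics of the hidden representation
# (unknown to the learner; stands in for the environment).
A_true = rng.normal(scale=0.3, size=(dim, dim))

# Learned one-step transition matrix, applied repeatedly to
# predict the representation multiple steps ahead.
A_hat = np.zeros((dim, dim))

def predict(z, steps):
    """Roll the learned transition model forward `steps` times."""
    for _ in range(steps):
        z = A_hat @ z
    return z

# Train the one-step model with stochastic gradient descent on an
# auxiliary MSE loss between predicted and actual next representations.
for _ in range(2000):
    z0 = rng.normal(size=dim)
    z1 = A_true @ z0              # actual next-step representation
    err = A_hat @ z0 - z1         # prediction error
    A_hat -= lr * np.outer(err, z0)  # MSE gradient step

# After training, the learned model also tracks multi-step futures.
z = rng.normal(size=dim)
target = z.copy()
for _ in range(horizon):
    target = A_true @ target
multi_step_error = float(np.max(np.abs(predict(z, horizon) - target)))
```

In a full ERL agent this auxiliary prediction loss would be added to the usual reward-driven objective, shaping the state representation; here the toy shows only the predictive-model component in isolation.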
© (2022) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Ruiyuan Zhang and William Zhu "Improved the sample efficiency of episodic reinforcement learning by forcing state representations", Proc. SPIE 12253, International Conference on Automation Control, Algorithm, and Intelligent Bionics (ACAIB 2022), 122531I (27 June 2022); https://doi.org/10.1117/12.2639467
KEYWORDS
Computer programming, Neural networks, Stochastic processes, Machine learning