With the outstanding accomplishments achieved in recent years, Reinforcement Learning (RL) has become an area to which researchers flock in search of innovative ideas and solutions for some of the most difficult tasks. While the bulk of the research has focused on learning algorithms such as SARSA, Q-Learning, and genetic algorithms, comparatively little attention has been paid to the tools that support these algorithms (e.g., the experience replay buffer). This paper reviews what is believed to be the most accurate taxonomy of the AI field and briefly covers the Q-Learning algorithm, as it is the base algorithm for this study. Most importantly, it proposes a new experience replay technique, the Round Robin Prioritized Experience Replay Buffer (RRPERB), which aims to help RL agents learn faster and generalize better to rarely seen states by not completely depriving them of experiences ranked as lower priority.
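The abstract does not specify the mechanics of RRPERB, but the core idea it describes, cycling through priority ranks so that low-priority experiences are never entirely starved, can be sketched as follows. All class and parameter names here are illustrative assumptions, not the authors' implementation:

```python
import random
from collections import deque

class RoundRobinPrioritizedReplayBuffer:
    """Sketch of a round-robin prioritized replay buffer.

    Experiences are binned into priority ranks; sampling cycles
    through the ranks in round-robin order, so experiences in
    low-priority ranks still get replayed occasionally instead
    of being starved out entirely.
    """

    def __init__(self, num_ranks=4, capacity_per_rank=10_000):
        self.buckets = [deque(maxlen=capacity_per_rank)
                        for _ in range(num_ranks)]
        self.num_ranks = num_ranks
        self._cursor = 0  # next rank to draw from

    def add(self, experience, priority):
        # Map a priority in [0, 1) to a rank; higher priority -> higher rank.
        rank = min(int(priority * self.num_ranks), self.num_ranks - 1)
        self.buckets[rank].append(experience)

    def sample(self, batch_size):
        if not any(self.buckets):
            raise ValueError("buffer is empty")
        batch = []
        while len(batch) < batch_size:
            bucket = self.buckets[self._cursor]
            if bucket:  # skip empty ranks
                batch.append(random.choice(bucket))
            self._cursor = (self._cursor + 1) % self.num_ranks
        return batch
```

Because the cursor advances on every draw, a rank holding rarely seen states is visited as often as the highest-priority rank, which is one plausible way to realize the generalization benefit the abstract claims.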
This article aims to provide the reader with a clear understanding of a subdiscipline of artificial intelligence: Deep Neural Networks. In addition, we cover a set of proposed Domain Specific Architectures, known as accelerators, that are optimized for these types of computations. Optimizing these computations reduces data transfers by keeping data in register files local to each processing unit, thereby increasing energy efficiency per computation.