This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Continuous control with deep reinforcement learning
Citations: 6,769
Authors: 8
Year: 2016
Abstract
We adapt the ideas underlying the success of Deep Q-Learning to the continuous action domain. We present an actor-critic, model-free algorithm based on the deterministic policy gradient that can operate over continuous action spaces. Using the same learning algorithm, network architecture and hyper-parameters, our algorithm robustly solves more than 20 simulated physics tasks, including classic problems such as cartpole swing-up, dexterous manipulation, legged locomotion and car driving. Our algorithm is able to find policies whose performance is competitive with those found by a planning algorithm with full access to the dynamics of the domain and its derivatives. We further demonstrate that for many of the tasks the algorithm can learn policies end-to-end: directly from raw pixel inputs.
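The core idea the abstract refers to, the deterministic policy gradient, can be sketched in a few lines. The following is a minimal illustrative example, not the paper's algorithm: it uses linear function approximators instead of deep networks and omits the replay buffer and target networks that DDPG adds. The toy task, feature choice, and learning rates are assumptions made for the sake of the sketch.

```python
import numpy as np

# Toy continuous bandit: state s ~ U(-1, 1), continuous action a,
# reward r = -(a - 2s)^2, so the optimal deterministic policy is mu(s) = 2s.
rng = np.random.default_rng(0)

w = 0.0              # actor parameter: mu(s) = w * s
theta = np.zeros(3)  # critic weights over quadratic features

def features(s, a):
    # Q(s, a) = theta . [a^2, a*s, s^2]; rich enough to represent the
    # true (quadratic) reward exactly.
    return np.array([a * a, a * s, s * s])

alpha_actor = 0.05
for _ in range(5000):
    s = rng.uniform(-1.0, 1.0)
    a = w * s + 0.3 * rng.standard_normal()   # exploration noise
    r = -(a - 2.0 * s) ** 2
    # Critic: normalized least-squares step toward the observed reward
    # (one-step task, so no bootstrapped target is needed).
    phi = features(s, a)
    theta += 0.5 * (r - theta @ phi) * phi / (1.0 + phi @ phi)
    # Actor: deterministic policy gradient,
    # grad_w J = dQ/da |_{a=mu(s)} * dmu/dw, with dmu/dw = s.
    a_mu = w * s
    dq_da = 2.0 * theta[0] * a_mu + theta[1] * s
    w += alpha_actor * dq_da * s

# w should approach the optimal policy slope 2.0
```

The actor is improved by following the critic's gradient with respect to the action, evaluated at the actor's own output; DDPG applies the same chain rule through deep networks, with experience replay and slowly-updated target networks to stabilize learning.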
Similar works
Adaptation in Natural and Artificial Systems
1992 · 35,518 citations
Reinforcement Learning: An Introduction
1998 · 26,802 citations
Reinforcement Learning: An Introduction
2005 · 25,701 citations
Deep learning in neural networks: An overview
2014 · 17,759 citations
Diagnosing Non-Intermittent Anomalies in Reinforcement Learning Policy Executions (Short Paper)
2017 · 11,250 citations