From MurrayWiki
End-to-End Safe Reinforcement Learning through Barrier Functions for Safety-Critical Continuous Control Tasks
Abstract Reinforcement Learning (RL) algorithms have found limited success beyond simulated applications, and one main reason is the absence of safety guarantees during the learning process. Real-world systems would realistically fail or break before an optimal controller can be learned. To address this issue, we propose a controller architecture that combines (1) a model-free RL-based controller with (2) model-based controllers utilizing control barrier functions (CBFs) and (3) online learning of the unknown system dynamics, in order to ensure safety during learning. Our general framework leverages the success of RL algorithms to learn high-performance controllers, while the CBF-based controllers both guarantee safety and guide the learning process by constraining the set of explorable policies. We utilize Gaussian Processes (GPs) to model the system dynamics and its uncertainties. Our novel controller synthesis algorithm, RL-CBF, guarantees safety with high probability during the learning process, regardless of the RL algorithm used, and demonstrates greater policy exploration efficiency. We test our algorithm on (1) control of an inverted pendulum and (2) autonomous car-following with wireless vehicle-to-vehicle communication, and show that our algorithm attains much greater sample efficiency in learning than other state-of-the-art algorithms and maintains safety during the entire learning process.
Authors Richard Cheng, Gabor Orosz, Richard M. Murray, Joel W. Burdick
ID 2018e
Source To appear, 2019 AAAI Conference on Artificial Intelligence
Tag comb19-aiaa
Title End-to-End Safe Reinforcement Learning through Barrier Functions for Safety-Critical Continuous Control Tasks
Type Conference paper
Categories Papers
Modification date 27 December 2018 05:18:35
URL http://www.cds.caltech.edu/~murray/preprints/comb19-aiaa.pdf