A decentralized reinforcement learning controller for collaborative driving
Multivehicle Systems, Volume 1, Part 1
Authors
Luke Ng; Chris Clark; Jan Huissoon; Gabriele D'Eleuterio
Digital Object Identifier (DOI)
10.3182/20061002-2-BR-4906.00015
Page Numbers
83-88
Index Terms
autonomous vehicles, co-ordination, decentralized control, machine learning, control
Abstract
Research in collaborative driving aims to create control systems that coordinate the motion of multiple vehicles so that they navigate traffic both efficiently and safely. This paper introduces a novel individual-vehicle controller based on reinforcement learning. The controller handles both lateral and longitudinal control while driving in a multi-vehicle platoon. The design and development of the controller are discussed in detail, and simulation results showing learning progress and performance are presented.
