NNRG Projects - Adaptive Packet Routing: The Confidence-Based Dual Reinforcement Q-Learning Algorithm

Active from 1998 to 2000

Standard reinforcement learning (TD or Q-learning) is based on forward exploration: later estimates are used to update earlier ones. Dual Reinforcement Learning adds backward exploration, in which earlier estimates are also used to update later ones. The quality of the estimates can be improved further by tracking how recently each one was updated, which gives a measure of confidence in it. In this project, these ideas are applied to the Q-routing algorithm for adaptive packet routing in communication networks, where they improve both the speed of learning and the quality of the final routing policy.
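The three ideas above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the project's implementation: the `Router` class, the `send` function, the `DECAY` constant, the zero initial estimates, and the learning-rate rule `max(c_est, 1 - c_old)` are all hypothetical choices made for this example, and none of them is taken from this page. Each router keeps an estimated delivery time and a confidence value per (destination, neighbor) pair. When a packet moves one hop, the sender learns about the destination (forward exploration) and the receiver learns about the source (backward exploration).

```python
DECAY = 0.9  # assumed decay factor applied to the confidence of existing entries


class Router:
    """Hypothetical router holding Q estimates and confidences per (dest, neighbor)."""

    def __init__(self, name, neighbors):
        self.name = name
        self.neighbors = list(neighbors)
        self.q = {}  # (dest, neighbor) -> estimated delivery time
        self.c = {}  # (dest, neighbor) -> confidence in [0, 1]

    def best(self, dest):
        """Return (estimate, confidence) for the neighbor with the lowest estimate."""
        if dest == self.name:
            return 0.0, 1.0  # a node reaches itself in zero time, known exactly
        pairs = [(self.q.get((dest, y), 0.0), self.c.get((dest, y), 0.0))
                 for y in self.neighbors]
        return min(pairs, key=lambda pair: pair[0])

    def update(self, dest, via, target, c_est):
        """Move one estimate toward target; the step size depends on confidence."""
        key = (dest, via)
        q_old = self.q.get(key, 0.0)
        c_old = self.c.get(key, 0.0)
        # Assumed rule: learn fast when the new estimate is trusted
        # or when the old one is not.
        rate = max(c_est, 1.0 - c_old)
        self.q[key] = q_old + rate * (target - q_old)
        self.c[key] = c_old + rate * (c_est - c_old)

    def decay(self):
        """Lower confidence in existing entries, so stale estimates count for less."""
        for key in self.c:
            self.c[key] *= DECAY


def send(x, y, source, dest, hop_delay):
    """Router x forwards a packet (source -> dest) to its neighbor y."""
    back_est, back_conf = x.best(source)  # carried by the packet to y
    fwd_est, fwd_conf = y.best(dest)      # reported back from y to x
    x.decay()
    y.decay()
    # Forward exploration: x refines its estimate for reaching dest via y.
    x.update(dest, y.name, hop_delay + fwd_est, fwd_conf)
    # Backward exploration: y refines its estimate for reaching source via x.
    y.update(source, x.name, hop_delay + back_est, back_conf)


# Usage on a three-node line network A - B - C, with a packet from A to C.
a = Router("A", ["B"])
b = Router("B", ["A", "C"])
c = Router("C", ["B"])
send(a, b, "A", "C", hop_delay=1.0)
send(b, c, "A", "C", hop_delay=2.0)
```

In this sketch a single hop produces two updates instead of one, which is the sense in which backward exploration can speed up learning. The confidence values keep an estimate built from unreliable information (such as an untouched initial value) from being trusted as much as one derived from an exact value at the source or destination.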

People

Shailesh Kumar

Masters Alumni

Publications

Topographic Receptive Fields and Patterned Lateral Interaction in a Self-Organizing Model of the Primary Visual Cortex. Joseph Sirosh and Risto Miikkulainen. Neural Computation, 9:577-594, 1996.

Related Areas

Reinforcement Learning