Reinforcement Learning Based Routing in EH-WSNs with Dual Alternative Batteries

This paper considers an Energy Harvesting Wireless Sensor Network (EH-WSN) in which nodes have a dual alternative battery system. We propose a stateless, distributed, reinforcement learning based routing algorithm, named QLRA, in which each node learns the best next hop(s) for forwarding its data based on the battery and data information of its neighbors. We study how the number of sources and the path exploration probability affect the performance of QLRA. Numerical results show that, after learning, QLRA achieves minimal end-to-end delay in all tested scenarios, about 18% lower than the average end-to-end delay of a competing routing algorithm.
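
To make the idea of stateless next-hop learning with an exploration probability concrete, the following is a minimal illustrative sketch and not the QLRA algorithm itself. It assumes a single Q-value per neighbor, epsilon-greedy selection, and a hypothetical reward that grows with a neighbor's battery level and shrinks with its data queue length. The names `NextHopLearner`, `reward`, `alpha`, `epsilon`, and `max_queue` are all assumptions introduced here for illustration.

```python
import random


class NextHopLearner:
    """Stateless Q-learning sketch: one Q-value per neighbor (hypothetical)."""

    def __init__(self, neighbors, alpha=0.5, epsilon=0.1, rng=None):
        self.q = {n: 0.0 for n in neighbors}  # Q-value of forwarding to each neighbor
        self.alpha = alpha                    # learning rate (assumed value)
        self.epsilon = epsilon                # path exploration probability
        self.rng = rng or random.Random()

    def choose(self):
        # Explore a random neighbor with probability epsilon, else exploit.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(list(self.q))
        return max(self.q, key=self.q.get)

    def update(self, neighbor, r):
        # Stateless update: move Q toward the observed reward.
        self.q[neighbor] += self.alpha * (r - self.q[neighbor])


def reward(battery_level, queue_length, max_queue=10):
    # Assumed reward shape: battery_level in [0, 1], penalized by queue occupancy.
    return battery_level - queue_length / max_queue


learner = NextHopLearner(["a", "b"], epsilon=0.0)
learner.update("b", reward(battery_level=1.0, queue_length=0))
next_hop = learner.choose()  # greedy choice, since epsilon is 0
```

In this sketch, a larger `epsilon` makes a node try alternative next hops more often, at the cost of sometimes forwarding to a neighbor that currently looks worse.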