ACK-less rate adaptation using distributional reinforcement learning for reliable IEEE 802.11bc broadcast WLANs
Kanda, Takamochi; Koda, Yusuke; Kihira, Yuto; Yamamoto, Koji; Nishio, Takayuki (2022-06-01)
T. Kanda, Y. Koda, Y. Kihira, K. Yamamoto and T. Nishio, "ACK-Less Rate Adaptation Using Distributional Reinforcement Learning for Reliable IEEE 802.11bc Broadcast WLANs," in IEEE Access, vol. 10, pp. 58858-58868, 2022, doi: 10.1109/ACCESS.2022.3179581
© The Author(s) 2022. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/.
As a step towards establishing reliable broadcast wireless local area networks (WLANs), this paper proposes acknowledgement (ACK)-less rate adaptation that uses distributional reinforcement learning (RL) to alleviate reception failures at broadcast recipient stations (STAs). The key point of this study is that the algorithms for learning the ACK-less rate adaptation strategy are evaluated in terms of the broadcast performance, which is composed of the data rate of the broadcast access point (AP) and the reception success rate at the recipient STAs. An ACK-less rate adaptation framework was previously realized using the received signal strength (RSS) of the uplink frames transmitted by the non-broadcast STAs to the non-broadcast APs, which correlates with the broadcast performance with a confounding effect from the deployment of the broadcast recipient STAs. However, that framework incurs reception failures at some of the broadcast recipient STAs, because the deep Q-learning it uses cannot deal with the wide distribution of the broadcast performance. To address this challenge, this paper further discusses rate adaptation using distributional RL, which approximates the entire distribution of the broadcast performance. The simulations confirmed the following: 1) Using the expected broadcast performance learned by deep Q-learning improved the performance in terms of Pareto efficiency. 2) Learning the entire distribution of the broadcast performance enabled the broadcast AP to evaluate the tail of the distribution using a risk measure, and to alleviate reception failures while implementing the rate adaptation in the same way as the method that learns only the expected broadcast performance.
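The difference between selecting a rate from the expected broadcast performance and selecting it from the tail of the learned distribution can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say which risk measure is used, so conditional value at risk (CVaR) over equally weighted quantile estimates is assumed here as one common choice. The function names, the `alpha` parameter, and the quantile values are all hypothetical.

```python
import math


def cvar_lower_tail(quantiles, alpha):
    """Mean of the worst alpha-fraction of equally weighted quantile estimates.

    Assumption: the learned return distribution for one data rate is
    represented by a list of equally probable quantile values.
    """
    ordered = sorted(quantiles)
    # Always keep at least one sample so the tail is never empty.
    k = max(1, math.ceil(alpha * len(ordered)))
    return sum(ordered[:k]) / k


def select_rate(quantiles_per_rate, alpha=None):
    """Return the index of the data rate with the highest score.

    alpha=None scores each rate by its mean (expectation-based selection);
    otherwise each rate is scored by its lower-tail CVaR (risk-averse selection).
    """
    if alpha is None:
        scores = [sum(q) / len(q) for q in quantiles_per_rate]
    else:
        scores = [cvar_lower_tail(q, alpha) for q in quantiles_per_rate]
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical quantile estimates of broadcast performance for two data rates.
quantiles_per_rate = [
    [0.50, 0.55, 0.60, 0.65],  # lower rate: narrow distribution
    [0.10, 0.70, 0.90, 1.00],  # higher rate: better mean, heavy lower tail
]

risk_neutral = select_rate(quantiles_per_rate)              # uses the mean
risk_averse = select_rate(quantiles_per_rate, alpha=0.25)   # uses the worst 25%
```

With these illustrative numbers, the mean favors the higher rate, while the tail-based score favors the lower rate because it penalizes the poor worst-case outcome, which corresponds to recipient STAs that fail to receive the broadcast.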
- Open access