TY - GEN
T1 - Dynamic Spectrum Access in Non-stationary Environments
T2 - 2023 International Conference on Computing, Networking and Communications, ICNC 2023
AU - Feng, Mingjie
AU - Zhang, Wenhan
AU - Krunz, Marwan
N1 - Funding Information: This research was supported in part by NSF (grants CNS-1910348, CNS-1731164, CNS-1813401, and IIP-1822071), by the U.S. Army Small Business Innovation Research Program Office, and by the Army Research Office under Contract No. W911NF-21-C-0016. Any opinions, findings, conclusions, or recommendations expressed in this paper are those of the author(s) and do not necessarily reflect the views of the NSF or the Army. Publisher Copyright: © 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - In this paper, we investigate the problem of dynamic spectrum access (DSA) in non-stationary environments, where secondary users (SUs) and primary users (PUs) operate over a shared set of orthogonal channels. The non-stationarity is caused by the time-varying PU activity and the coupled channel access strategies of different SUs. Considering such non-stationarity and the channel dynamics, the DSA problem is formulated as a hidden-mode Markov Decision Process (HMMDP), which can be decomposed into multiple MDPs under different modes. At each time, exactly one mode is active, and each mode corresponds to a unique MDP. The HMMDP is solved by determining the active mode and then solving the MDP under that mode. We first propose a deep reinforcement learning (DRL) framework for solving the MDP under a given mode. We then propose a long short-term memory (LSTM)-based approach to predict the active mode at each time slot. Simulation results show that the proposed scheme outperforms benchmark schemes by achieving significantly fewer collisions and improved spectrum utilization.
AB - In this paper, we investigate the problem of dynamic spectrum access (DSA) in non-stationary environments, where secondary users (SUs) and primary users (PUs) operate over a shared set of orthogonal channels. The non-stationarity is caused by the time-varying PU activity and the coupled channel access strategies of different SUs. Considering such non-stationarity and the channel dynamics, the DSA problem is formulated as a hidden-mode Markov Decision Process (HMMDP), which can be decomposed into multiple MDPs under different modes. At each time, exactly one mode is active, and each mode corresponds to a unique MDP. The HMMDP is solved by determining the active mode and then solving the MDP under that mode. We first propose a deep reinforcement learning (DRL) framework for solving the MDP under a given mode. We then propose a long short-term memory (LSTM)-based approach to predict the active mode at each time slot. Simulation results show that the proposed scheme outperforms benchmark schemes by achieving significantly fewer collisions and improved spectrum utilization.
KW - Dynamic spectrum access
KW - deep reinforcement learning
KW - hidden-mode Markov Decision Process
KW - long short-term memory
KW - non-stationary environment
UR - http://www.scopus.com/inward/record.url?scp=85152011363&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85152011363&partnerID=8YFLogxK
U2 - 10.1109/ICNC57223.2023.10074454
DO - 10.1109/ICNC57223.2023.10074454
M3 - Conference contribution
T3 - 2023 International Conference on Computing, Networking and Communications, ICNC 2023
SP - 159
EP - 164
BT - 2023 International Conference on Computing, Networking and Communications, ICNC 2023
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 20 February 2023 through 22 February 2023
ER -