The internet of robotic things (IoRT) is an emerging technology that connects user equipment (UE), allowing devices to communicate with each other and transmit data over existing communication and network protocols. However, because the number of connected UEs is growing rapidly, current IoRT network topologies and resources are insufficient to handle the resulting massive data flow and meet quality of service (QoS) requirements. Hence, the most crucial challenge is radio resource management, specifically power allocation (PA), which controls the emitting power of each antenna, in the interfering multiple access channel (IMAC). In this paper, we propose a data-driven, model-free twin delayed deep deterministic policy gradient (TD3) algorithm that controls the continuous power level for PA. TD3 is a modification of the deep deterministic policy gradient (DDPG) algorithm and consists of six networks: two actor networks (one model and one target) and four critic networks (two models and two targets). Results show that the proposed TD3 algorithm outperforms model-based methods such as fractional programming (FP) and weighted MMSE (WMMSE), as well as model-free algorithms such as deep Q network (DQN) and DDPG, in sum-rate performance, and that it generalizes well.
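As a rough illustration of the six-network structure mentioned in the abstract, the following minimal Python sketch builds one actor and two critics together with their target copies, and computes a TD3 learning target for a continuous power level. It follows the commonly published TD3 formulation (target policy smoothing, the minimum of two target critics, soft target updates) and is not the authors' implementation. The linear "networks", the function names, the state layout, and all hyperparameter values (`gamma`, `p_max`, `noise_std`, `noise_clip`, `tau`) are assumptions chosen only for illustration.

```python
import copy
import random


def make_linear(n_inputs, rng):
    """Hypothetical stand-in for a neural network: one linear unit."""
    return {"w": [rng.uniform(-0.1, 0.1) for _ in range(n_inputs)], "b": 0.0}


def forward(net, inputs):
    """Evaluate the linear stand-in on a list of input features."""
    return sum(w * x for w, x in zip(net["w"], inputs)) + net["b"]


def clip(x, low, high):
    return max(low, min(high, x))


def build_td3_networks(state_dim, rng):
    """Create the six networks: actor, two critics, and a target copy of each."""
    actor = make_linear(state_dim, rng)
    critic_1 = make_linear(state_dim + 1, rng)  # input: state plus power level
    critic_2 = make_linear(state_dim + 1, rng)
    return {
        "actor": actor,
        "actor_target": copy.deepcopy(actor),
        "critic_1": critic_1,
        "critic_1_target": copy.deepcopy(critic_1),
        "critic_2": critic_2,
        "critic_2_target": copy.deepcopy(critic_2),
    }


def td3_target(nets, reward, next_state, rng,
               gamma=0.99, p_max=1.0, noise_std=0.2, noise_clip=0.5):
    """Learning target: reward + gamma * min of the two target critics."""
    # Target policy smoothing: add clipped Gaussian noise to the target action.
    noise = clip(rng.gauss(0.0, noise_std), -noise_clip, noise_clip)
    # The action is a continuous transmit power, kept inside [0, p_max].
    next_power = clip(forward(nets["actor_target"], next_state) + noise, 0.0, p_max)
    critic_input = list(next_state) + [next_power]
    q1 = forward(nets["critic_1_target"], critic_input)
    q2 = forward(nets["critic_2_target"], critic_input)
    # Taking the smaller estimate counters overestimation of the value.
    return reward + gamma * min(q1, q2)


def soft_update(model, target, tau=0.005):
    """Move the target parameters a small step toward the model parameters."""
    target["w"] = [tau * w + (1.0 - tau) * wt
                   for w, wt in zip(model["w"], target["w"])]
    target["b"] = tau * model["b"] + (1.0 - tau) * target["b"]


rng = random.Random(0)
nets = build_td3_networks(state_dim=3, rng=rng)
y = td3_target(nets, reward=1.0, next_state=[0.2, 0.5, 0.1], rng=rng)
soft_update(nets["critic_1"], nets["critic_1_target"])
```

In a full training loop, both critics would be regressed toward `y`, while the actor and all three target networks would be updated less often than the critics, which is the "delayed" part of the algorithm's name. The reward here is a placeholder; in the setting of the abstract it would be derived from the sum rate.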