Details

- Original language: English
- Publication status: Published - 2024
- Research output: Thesis › Doctoral thesis
Abstract

This dissertation presents methods for modelling large queuing systems, which are integral to our daily lives and underpin many everyday processes and computational systems. Load balancing is a vital component of queuing systems: it distributes incoming jobs among the available resources to achieve objectives such as minimizing waiting times or job drops. Although static load balancing algorithms exist, this thesis argues that the growing complexity of computational systems necessitates dynamic approaches. Large queuing systems also face challenges such as partial observability and network delays, which underscores the need for scalable policies, the primary focus of this thesis. The thesis starts with a partially observable single-agent system with a large number of queues, which is modelled as a partially observable Markov decision process and solved using the Monte Carlo tree search algorithm. The queuing system and the proposed load balancing policy are analyzed using various inter-arrival and service time distributions as well as different reward functions. Network traffic data from Kaggle was also used to infer the underlying arrival and service distributions of a real system, on which the performance of the proposed policy was then analyzed. Next, this single-agent model was extended to the multi-agent queuing system, where the challenges of scalability and non-stationarity were tackled by modelling the large queuing system as a mean-field control problem with arbitrary synchronization delays. The load balancing policy for the resulting single-agent Markov decision process was learned using the proximal policy optimization algorithm from reinforcement learning. This contribution highlights the need for a learned policy when the synchronization delay is neither very low, in which case join-the-shortest-queue is optimal, nor very high, in which case random allocation is the best load balancing policy. It also provides theoretical guarantees and empirically shows that the policy learned in the mean-field system performs well in large finite queuing systems. The mean-field control framework for modelling a large queuing system was then further extended to include the locality of interactions between agents, instead of assuming a fully connected network in which every agent has access to every other queue. To model these decentralized interactions, the recently developed sparse mean-field theory was used and extended to obtain a mean-field control framework. The proximal policy optimization algorithm was then applied to the resulting sparse mean-field control Markov decision process to learn a scalable and decentralized load balancing policy. In all of the above contributions, the proposed learned load balancing policies performed well compared with the existing state of the art, motivating future work in this direction.
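The delay trade-off described in the abstract (join-the-shortest-queue when observations are fresh, random allocation when they are very stale) can be illustrated with a toy simulation. The sketch below is a hypothetical, self-contained Python example, not the thesis's actual model: the discrete-time dynamics, all parameter values, and the `simulate` helper are illustrative assumptions, whereas the thesis formulates the problem as a (mean-field control) Markov decision process and learns the policy.

```python
import random

def simulate(policy, sync_delay, num_queues=10, steps=10_000,
             arrivals_per_step=8, seed=0):
    """Toy discrete-time load balancer (illustrative only).

    The dispatcher sees queue lengths that are `sync_delay` steps stale,
    plus its own dispatches within the current step; each queue serves
    one job per step. Returns the average backlog per arriving job,
    a rough proxy for waiting time.
    """
    rng = random.Random(seed)
    queues = [0] * num_queues
    snapshots = [list(queues)]                 # buffer of delayed observations
    total_backlog, jobs = 0, 0
    for _ in range(steps):
        believed = list(snapshots[0])          # stale view of the queues
        for _ in range(arrivals_per_step):
            if policy == "jsq":                # join-the-shortest-queue on stale info
                target = believed.index(min(believed))
            else:                              # random allocation ignores observations
                target = rng.randrange(num_queues)
            queues[target] += 1
            believed[target] += 1              # dispatcher tracks its own sends
        total_backlog += sum(queues)
        jobs += arrivals_per_step
        queues = [max(0, q - 1) for q in queues]   # one departure per queue per step
        snapshots.append(list(queues))
        if len(snapshots) > sync_delay + 1:        # keep only the needed history
            snapshots.pop(0)
    return total_backlog / jobs

for delay in (0, 5, 50):
    print(f"delay={delay:3d}  "
          f"jsq={simulate('jsq', delay):7.2f}  "
          f"random={simulate('random', delay):7.2f}")
```

In such a toy model, join-the-shortest-queue is expected to dominate at zero delay, while its advantage erodes as the dispatcher's snapshot grows stale; the intermediate regime is the one in which the thesis argues a learned policy is needed.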
Cite this
TY - BOOK
T1 - Scalable Planning in Large Multi-Agent Queuing Systems
AU - Tahir, Anam
PY - 2024
Y1 - 2024
N2 - This dissertation presents methods for modelling large queuing systems, which are integral to our daily lives and underpin many everyday processes and computational systems. Load balancing is a vital component of queuing systems: it distributes incoming jobs among the available resources to achieve objectives such as minimizing waiting times or job drops. Although static load balancing algorithms exist, this thesis argues that the growing complexity of computational systems necessitates dynamic approaches. Large queuing systems also face challenges such as partial observability and network delays, which underscores the need for scalable policies, the primary focus of this thesis. The thesis starts with a partially observable single-agent system with a large number of queues, which is modelled as a partially observable Markov decision process and solved using the Monte Carlo tree search algorithm. The queuing system and the proposed load balancing policy are analyzed using various inter-arrival and service time distributions as well as different reward functions. Network traffic data from Kaggle was also used to infer the underlying arrival and service distributions of a real system, on which the performance of the proposed policy was then analyzed. Next, this single-agent model was extended to the multi-agent queuing system, where the challenges of scalability and non-stationarity were tackled by modelling the large queuing system as a mean-field control problem with arbitrary synchronization delays. The load balancing policy for the resulting single-agent Markov decision process was learned using the proximal policy optimization algorithm from reinforcement learning. This contribution highlights the need for a learned policy when the synchronization delay is neither very low, in which case join-the-shortest-queue is optimal, nor very high, in which case random allocation is the best load balancing policy. It also provides theoretical guarantees and empirically shows that the policy learned in the mean-field system performs well in large finite queuing systems. The mean-field control framework for modelling a large queuing system was then further extended to include the locality of interactions between agents, instead of assuming a fully connected network in which every agent has access to every other queue. To model these decentralized interactions, the recently developed sparse mean-field theory was used and extended to obtain a mean-field control framework. The proximal policy optimization algorithm was then applied to the resulting sparse mean-field control Markov decision process to learn a scalable and decentralized load balancing policy. In all of the above contributions, the proposed learned load balancing policies performed well compared with the existing state of the art, motivating future work in this direction.
AB - This dissertation presents methods for modelling large queuing systems, which are integral to our daily lives and underpin many everyday processes and computational systems. Load balancing is a vital component of queuing systems: it distributes incoming jobs among the available resources to achieve objectives such as minimizing waiting times or job drops. Although static load balancing algorithms exist, this thesis argues that the growing complexity of computational systems necessitates dynamic approaches. Large queuing systems also face challenges such as partial observability and network delays, which underscores the need for scalable policies, the primary focus of this thesis. The thesis starts with a partially observable single-agent system with a large number of queues, which is modelled as a partially observable Markov decision process and solved using the Monte Carlo tree search algorithm. The queuing system and the proposed load balancing policy are analyzed using various inter-arrival and service time distributions as well as different reward functions. Network traffic data from Kaggle was also used to infer the underlying arrival and service distributions of a real system, on which the performance of the proposed policy was then analyzed. Next, this single-agent model was extended to the multi-agent queuing system, where the challenges of scalability and non-stationarity were tackled by modelling the large queuing system as a mean-field control problem with arbitrary synchronization delays. The load balancing policy for the resulting single-agent Markov decision process was learned using the proximal policy optimization algorithm from reinforcement learning. This contribution highlights the need for a learned policy when the synchronization delay is neither very low, in which case join-the-shortest-queue is optimal, nor very high, in which case random allocation is the best load balancing policy. It also provides theoretical guarantees and empirically shows that the policy learned in the mean-field system performs well in large finite queuing systems. The mean-field control framework for modelling a large queuing system was then further extended to include the locality of interactions between agents, instead of assuming a fully connected network in which every agent has access to every other queue. To model these decentralized interactions, the recently developed sparse mean-field theory was used and extended to obtain a mean-field control framework. The proximal policy optimization algorithm was then applied to the resulting sparse mean-field control Markov decision process to learn a scalable and decentralized load balancing policy. In all of the above contributions, the proposed learned load balancing policies performed well compared with the existing state of the art, motivating future work in this direction.
UR - https://tuprints.ulb.tu-darmstadt.de/id/eprint/27525
U2 - 10.26083/TUPRINTS-00027525
DO - 10.26083/TUPRINTS-00027525
M3 - Doctoral thesis
ER -