Scalable Planning in Large Multi-Agent Queuing Systems

Research output: Thesis › Doctoral thesis

Authors

  • Anam Tahir

Details

Original language: English
Publication status: Published - 2024

Abstract

This dissertation presents methods for modelling large queuing systems, which are integral to our daily lives and significantly impact many processes and systems. Load balancing, a vital component of queuing systems, distributes incoming jobs among available resources to achieve objectives such as minimizing waiting times or job drops. Although static load balancing algorithms exist, this thesis argues that the growing complexity of computational systems necessitates dynamic approaches. Large queuing systems also face challenges such as partial observability and network delays, which emphasize the need for scalable policies, the primary focus of this thesis.

The thesis starts with a partially observable single-agent system with a large number of queues, which is modelled as a partially observable Markov decision process and solved using a Monte Carlo tree search algorithm. The queuing system and the proposed load balancing policy are analyzed under various inter-arrival and service time distributions as well as different reward functions. Network traffic data from Kaggle was also used to infer the underlying arrival and service distributions of a real system, on which the performance of the proposed policy was then analyzed.

Next, this single-agent model was extended to a multi-agent queuing system, where the challenges of scalability and non-stationarity were tackled by modelling the large queuing system as a mean-field control problem with arbitrary synchronization delays. The load balancing policy for the resulting single-agent Markov decision process was learned using the proximal policy optimization algorithm from reinforcement learning. This contribution highlights the need for a learned policy when the synchronization delay is neither very low, where join-the-shortest-queue is optimal, nor very high, where random allocation is the best load balancing policy. It also provides theoretical guarantees and shows empirically that the policy learned in the mean-field system performs well in large finite queuing systems.

The mean-field control framework for modelling a large queuing system was then extended to capture the locality of interactions between agents, instead of assuming a fully connected network in which every agent has access to every other queue. To model these decentralized interactions, the recently developed sparse mean-field theory was used and extended to obtain a mean-field control framework. The proximal policy optimization algorithm was then applied to the resulting sparse mean-field control Markov decision process to learn a scalable and decentralized load balancing policy.

In all of these contributions, the proposed learned load balancing policies performed well compared to the existing state of the art, motivating future work in this direction.
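The single-agent contribution pairs a partially observable queue model with Monte Carlo tree search. As a rough illustration, the sketch below implements a generic UCT planner over a generative simulator step_fn(state, action, rng) -> (next_state, reward). It is a simplified stand-in rather than the thesis implementation: it plans on a fully observed simulator state instead of on beliefs, and uct_plan and all parameters here are invented for the example.

import math, random

def uct_plan(state, step_fn, n_actions, horizon=20, iters=500, c=1.4, seed=0):
    # Choose an action by UCT search on a generative model
    # step_fn(state, action, rng) -> (next_state, reward).
    rng = random.Random(seed)
    stats = {}  # action history from the root -> [visits, per-action [n, q]]

    def rollout(s, depth):
        # Random-policy rollout used to estimate the value of a new node.
        total = 0.0
        for _ in range(depth):
            s, r = step_fn(s, rng.randrange(n_actions), rng)
            total += r
        return total

    def search(s, depth, key):
        if depth == 0:
            return 0.0
        node = stats.setdefault(key, [0, {a: [0, 0.0] for a in range(n_actions)}])
        node[0] += 1
        untried = [a for a, (n, _) in node[1].items() if n == 0]
        if untried:
            a = rng.choice(untried)
            s2, r = step_fn(s, a, rng)
            ret = r + rollout(s2, depth - 1)
        else:
            # UCB1: exploit high value estimates, explore rarely tried actions.
            a = max(node[1], key=lambda b: node[1][b][1]
                    + c * math.sqrt(math.log(node[0]) / node[1][b][0]))
            s2, r = step_fn(s, a, rng)
            ret = r + search(s2, depth - 1, key + (a,))
        n, q = node[1][a]
        node[1][a] = [n + 1, q + (ret - q) / (n + 1)]  # incremental mean update
        return ret

    for _ in range(iters):
        search(state, horizon, ())
    return max(stats[()][1], key=lambda a: stats[()][1][a][0])  # most visited

Plugged into a queue simulator, this yields a receding-horizon load balancer: search, apply the chosen allocation, observe the system, and search again.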
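The role of the synchronization delay can be reproduced in a toy finite simulation; all rates, the buffer size, and the delay model below are illustrative assumptions. With a fresh view of the queues, join-the-shortest-queue (JSQ) keeps drops low; under a stale view, JSQ keeps sending jobs to queues that were short many steps ago and random allocation becomes competitive. The learned policy targets the regime in between.

import random

def simulate(policy, n_queues=50, capacity=10, steps=5000,
             arrival_p=0.9, service_p=0.02, delay=0, seed=0):
    # Simulate a dispatcher that routes each arriving job using a view of the
    # queue lengths that is `delay` steps old; returns the number of dropped jobs.
    rng = random.Random(seed)
    queues = [0] * n_queues
    history = [list(queues)]              # past states, for the delayed view
    drops = 0
    for _ in range(steps):
        snapshot = history[max(0, len(history) - 1 - delay)]
        if rng.random() < arrival_p:      # at most one job arrives per step
            if policy == "jsq":           # join the (apparently) shortest queue
                target = min(range(n_queues), key=lambda i: snapshot[i])
            else:                         # uniformly random allocation
                target = rng.randrange(n_queues)
            if queues[target] < capacity:
                queues[target] += 1
            else:
                drops += 1                # job lost at a full buffer
        for i in range(n_queues):         # each busy server may finish a job
            if queues[i] > 0 and rng.random() < service_p:
                queues[i] -= 1
        history.append(list(queues))
    return drops

for delay in (0, 50, 500):
    print(delay, simulate("jsq", delay=delay), simulate("random", delay=delay))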
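The mean-field reformulation itself can be sketched as a compact single-agent environment: the state is the empirical distribution of queue fill levels, whose dimension is fixed no matter how many queues the finite system contains, which is what makes a single learned policy scalable. The discrete-time update below (an arrival phase followed by a service phase) and all rates are simplifying assumptions for illustration, not the exact dynamics of the thesis.

import numpy as np

class MeanFieldLoadBalanceEnv:
    # State: x[k] = fraction of queues holding k jobs, k = 0..B (sums to 1).
    # Action: routing weights over fill levels for the arriving load.

    def __init__(self, buffer=5, arrival=0.7, service=0.8, horizon=200):
        self.B, self.lam, self.mu, self.T = buffer, arrival, service, horizon

    def reset(self):
        self.t = 0
        self.x = np.zeros(self.B + 1)
        self.x[0] = 1.0                       # every queue starts empty
        return self.x.copy()

    def step(self, action):
        a = np.clip(np.asarray(action, dtype=float), 0.0, None)
        a = a / a.sum() if a.sum() > 0 else np.full(self.B + 1, 1 / (self.B + 1))
        # Arrival phase: job mass lam * a[k] is routed to level-k queues; mass
        # routed to full queues (level B), or in excess of the queues actually
        # at a level, is dropped.
        up = np.minimum(self.x[:-1], self.lam * a[:-1])    # k -> k+1 transitions
        drops = self.lam * a[-1] + float(np.sum(self.lam * a[:-1] - up))
        y = self.x.copy()
        y[:-1] -= up
        y[1:] += up
        # Service phase: a mu-fraction of the queues at each busy level
        # finish one job and move down a level.
        down = self.mu * y[1:]                             # k -> k-1 transitions
        y[1:] -= down
        y[:-1] += down
        self.x, self.t = y, self.t + 1
        mean_fill = float(np.dot(np.arange(self.B + 1), self.x))
        reward = -drops - 0.1 * mean_fill     # penalize losses and long queues
        return self.x.copy(), reward, self.t >= self.T, {}

Since the interface has the usual reset/step shape, an off-the-shelf PPO implementation can be trained on it directly; the action is simply a routing distribution over fill levels.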
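Finally, the sparse mean-field contribution drops the fully connected topology in favour of local neighborhoods. Below is a toy version of that locality (a ring topology, with parameters again invented for the example): each dispatcher observes only a few adjacent queues and applies JSQ among them, so decisions are decentralized and the per-agent observation stays constant-sized as the system grows.

import random

def local_jsq_step(queues, degree=2, arrival_p=0.9, service_p=0.02,
                   capacity=10, rng=random):
    # One time step: dispatcher i handles arrivals at its own position and
    # routes each job to the shortest of the `degree + 1` queues on a ring.
    n = len(queues)
    drops = 0
    for i in range(n):
        if rng.random() < arrival_p / n:      # thinned per-dispatcher arrivals
            neighborhood = [(i + j) % n for j in range(degree + 1)]
            target = min(neighborhood, key=lambda q: queues[q])
            if queues[target] < capacity:
                queues[target] += 1
            else:
                drops += 1                    # whole local neighborhood is full
    for q in range(n):                        # each busy server may finish a job
        if queues[q] > 0 and rng.random() < service_p:
            queues[q] -= 1
    return drops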

Cite this

Scalable Planning in Large Multi-Agent Queuing Systems. / Tahir, Anam. 2024.

Research output: Thesis › Doctoral thesis

Tahir A. Scalable Planning in Large Multi-Agent Queuing Systems. Doctoral thesis. 2024. doi: 10.26083/TUPRINTS-00027525
@phdthesis{27ba35f5a4214a8e8c0f07cae231d314,
title = "Scalable Planning in Large Multi-Agent Queuing Systems",
abstract = "This dissertation presents methods for modelling large queuing systems which are integral to our daily lives, impacting processes and systems significantly. Load balancing, is a vital component of queuing systems, which involves distributing incoming jobs among available resources to achieve specific objectives like minimizing waiting times or job drops. Despite existing static load balancing algorithms, the growing complexity of computational systems necessitates dynamic approaches, as argued in this thesis. Large queuing systems face challenges like partial observability and network delays, emphasizing the need for scalable policies, which is the primary focus of this thesis. The thesis starts with a partially observable single-agent system with large number of queues which is modelled as a partially observable Markov decision process and then solved using Monte Carlo tree search algorithm. The queuing system and the proposed load balancing is analyzed using various inter-arrival and service time distributions as well different reward functions. Network traffic data from Kaggle was also used to infer the underlying distributions for the arrival and service processes for a real system, and then analyzed the performance of our proposed policy on it. Next, this single-agent model was extended to the multi-agent queuing system, where the challenges of scalability and non-stationary were tackled by modelling this large queuing system as a mean-field control problem with arbitrary synchronization delays. The load balancing policy for the resulting single-agent Markov decision process was learned using the proximal policy optimization algorithm from reinforcement learning. This contribution highlights the need for learning a policy for when the synchronization delays is not too low, when join-the-shortest-queue is optimal, or not too high, when random allocation is the best load balancing policy. It also provides theoretical guarantees and empirically proves that the policy learned in the mean-field system performs well in large finite queuing systems as well. The successful framework of mean-field control for modelling a large queuing system was then further extended to include the locality of interactions between agents, instead of assuming a fully connected network where every agent can have access to every other queue. To model these decentralized interactions, the recently developed sparse mean-field theory was used and extended to obtain a mean-field control framework. The proximal policy optimization algorithm was then used on the resulting sparse mean-field control Markov decision process to learn a scalable and decentralized load balancing policy. In all the above-mentioned contributions, our proposed learned load balancing policies were able to perform well when compared to the existing state-of-the-art work, thus motivating future works in this direction.",
author = "Anam Tahir",
year = "2024",
doi = "10.26083/TUPRINTS-00027525",
language = "English",
series = "UNSPECIFIED",

}

TY - THES

T1 - Scalable Planning in Large Multi-Agent Queuing Systems

AU - Tahir, Anam

PY - 2024

Y1 - 2024

UR - https://tuprints.ulb.tu-darmstadt.de/id/eprint/27525

U2 - 10.26083/TUPRINTS-00027525

DO - 10.26083/TUPRINTS-00027525

M3 - Doctoral thesis

ER -