Details
Original language | English |
---|---|
Pages (from-to) | 1610-1622 |
Number of pages | 13 |
Journal | IEEE transactions on computers |
Volume | 72 |
Issue number | 6 |
Publication status | Published - 1 Jun 2023 |
Abstract
Load balancing arises as a fundamental problem, underlying the dimensioning and operation of many computing and communication systems, such as job routing in data center clusters, multipath communication, Big Data and queueing systems. In essence, the decision-making agent maps each arriving job to one of the possibly heterogeneous servers while aiming at an optimization goal such as load balancing, low average delay or low loss rate. One main difficulty in finding optimal load balancing policies here is that the agent only partially observes the impact of its decisions, e.g., through the delayed acknowledgements of the served jobs. In this paper, we provide a partially observable (PO) model that captures the load balancing decisions in parallel buffered systems under limited information of delayed acknowledgements. We present a simulation model for this PO system to find a load balancing policy in real-time using a scalable Monte Carlo tree search algorithm. We numerically show that the resulting policy outperforms other limited information load balancing strategies such as variants of Join-the-Most-Observations and has comparable performance to full information strategies like: Join-the-Shortest-Queue, Join-the-Shortest-Queue(d) and Shortest-Expected-Delay. Finally, we show that our approach can optimise the real-time parallel processing by using network data provided by Kaggle.
Keywords
- Parallel systems, load balancing, partial observability
ASJC Scopus subject areas
- Computer Science(all)
- Software
- Mathematics(all)
- Theoretical Computer Science
- Computer Science(all)
- Hardware and Architecture
- Computer Science(all)
- Computational Theory and Mathematics
Cite this
- Standard
- Harvard
- Apa
- Vancouver
- BibTeX
- RIS
In: IEEE transactions on computers, Vol. 72, No. 6, 01.06.2023, p. 1610-1622.
Research output: Contribution to journal › Article › Research › peer review
}
TY - JOUR
T1 - Load Balancing in Compute Clusters With Delayed Feedback
AU - Tahir, Anam
AU - Alt, Bastian
AU - Rizk, Amr
AU - Koeppl, Heinz
N1 - Publisher Copyright: © 1968-2012 IEEE.
PY - 2023/6/1
Y1 - 2023/6/1
N2 - Load balancing arises as a fundamental problem, underlying the dimensioning and operation of many computing and communication systems, such as job routing in data center clusters, multipath communication, Big Data and queueing systems. In essence, the decision-making agent maps each arriving job to one of the possibly heterogeneous servers while aiming at an optimization goal such as load balancing, low average delay or low loss rate. One main difficulty in finding optimal load balancing policies here is that the agent only partially observes the impact of its decisions, e.g., through the delayed acknowledgements of the served jobs. In this paper, we provide a partially observable (PO) model that captures the load balancing decisions in parallel buffered systems under limited information of delayed acknowledgements. We present a simulation model for this PO system to find a load balancing policy in real-time using a scalable Monte Carlo tree search algorithm. We numerically show that the resulting policy outperforms other limited information load balancing strategies such as variants of Join-the-Most-Observations and has comparable performance to full information strategies like: Join-the-Shortest-Queue, Join-the-Shortest-Queue(d) and Shortest-Expected-Delay. Finally, we show that our approach can optimise the real-time parallel processing by using network data provided by Kaggle.
AB - Load balancing arises as a fundamental problem, underlying the dimensioning and operation of many computing and communication systems, such as job routing in data center clusters, multipath communication, Big Data and queueing systems. In essence, the decision-making agent maps each arriving job to one of the possibly heterogeneous servers while aiming at an optimization goal such as load balancing, low average delay or low loss rate. One main difficulty in finding optimal load balancing policies here is that the agent only partially observes the impact of its decisions, e.g., through the delayed acknowledgements of the served jobs. In this paper, we provide a partially observable (PO) model that captures the load balancing decisions in parallel buffered systems under limited information of delayed acknowledgements. We present a simulation model for this PO system to find a load balancing policy in real-time using a scalable Monte Carlo tree search algorithm. We numerically show that the resulting policy outperforms other limited information load balancing strategies such as variants of Join-the-Most-Observations and has comparable performance to full information strategies like: Join-the-Shortest-Queue, Join-the-Shortest-Queue(d) and Shortest-Expected-Delay. Finally, we show that our approach can optimise the real-time parallel processing by using network data provided by Kaggle.
KW - Parallel systems
KW - load balancing
KW - partial observability
UR - http://www.scopus.com/inward/record.url?scp=85140789519&partnerID=8YFLogxK
U2 - 10.1109/TC.2022.3215907
DO - 10.1109/TC.2022.3215907
M3 - Article
VL - 72
SP - 1610
EP - 1622
JO - IEEE transactions on computers
JF - IEEE transactions on computers
SN - 0018-9340
IS - 6
ER -