Details
Original language | English |
---|---|
Title of host publication | Middleware 2019 - Proceedings of the 2019 20th International Middleware Conference |
Publisher | Association for Computing Machinery (ACM) |
Pages | 176-188 |
Number of pages | 13 |
ISBN (electronic) | 9781450370097 |
ISBN (print) | 9781450370097 |
Publication status | Published - 13 Sept 2019 |
Externally published | Yes |
Event | ACM/IFIP 20th International Middleware Conference - UC Davis, United States; Duration: 9 Dec 2019 → 13 Dec 2019 |
Publication series
Name | Proceedings of the 20th International Middleware Conference |
---|
Abstract
Keywords
- Apache Spark, Big Data, Self-Adaptive Executors
ASJC Scopus subject areas
- Computer Science(all)
- Software
Cite this
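Self-adaptive Executors for Big Data Processing. / Omranian Khorasani, Sobhan; Rellermeyer, Jan; Epema, Dick. Middleware 2019 - Proceedings of the 2019 20th International Middleware Conference. Association for Computing Machinery (ACM), 2019. p. 176-188 (Proceedings of the 20th International Middleware Conference).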
Research output: Chapter in book/report/conference proceeding › Conference contribution › Research › peer review
TY - GEN
T1 - Self-adaptive Executors for Big Data Processing
AU - Omranian Khorasani, Sobhan
AU - Rellermeyer, Jan
AU - Epema, Dick
N1 - Publisher Copyright: © 2019 Association for Computing Machinery.
PY - 2019/9/13
Y1 - 2019/9/13
N2 - The demand for additional performance due to the rapid increase in the size and importance of data-intensive applications has considerably elevated the complexity of computer architecture. In response, systems offer pre-determined behaviors based on heuristics and then expose a large number of configuration parameters for operators to adjust them to their particular infrastructure. Unfortunately, in practice this leads to a substantial manual tuning effort. In this work, we focus on one of the most impactful tuning decisions in big data systems: the number of executor threads. We first show the impact of I/O contention on the runtime of workloads and a simple static solution to reduce the number of threads for I/O-bound phases. We then present a more elaborate solution in the form of self-adaptive executors, which are able to continuously monitor the underlying system resources and detect contention. This enables the executors to tune their thread pool size dynamically at runtime in order to achieve the best performance. Our experimental results show that being adaptive can significantly reduce the execution time, especially in I/O-intensive applications such as Terasort and PageRank, which see a 34% and 54% reduction in runtime, respectively.
KW - Apache Spark
KW - Big Data
KW - Self-Adaptive Executors
UR - http://www.scopus.com/inward/record.url?scp=85078060450&partnerID=8YFLogxK
U2 - 10.1145/3361525.3361545
DO - 10.1145/3361525.3361545
M3 - Conference contribution
SN - 9781450370097
T3 - Proceedings of the 20th International Middleware Conference
SP - 176
EP - 188
BT - Middleware 2019 - Proceedings of the 2019 20th International Middleware Conference
PB - Association for Computing Machinery (ACM)
T2 - ACM/IFIP 20th International Middleware Conference
Y2 - 9 December 2019 through 13 December 2019
ER -
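
The abstract describes executors that monitor system resources, detect I/O contention, and resize their thread pool at runtime. The record does not give the algorithm itself, so the following is only a minimal sketch of that general idea, not the authors' method. The class name `AdaptivePoolSizer`, the `io_wait_threshold` cutoff, and the halve-on-contention / add-one-otherwise policy are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class AdaptivePoolSizer:
    """Hypothetical controller for illustration; NOT the algorithm from the paper."""

    min_threads: int = 1
    max_threads: int = 8
    io_wait_threshold: float = 0.3  # assumed cutoff above which I/O counts as contended

    def next_size(self, current: int, io_wait_fraction: float) -> int:
        """Return the pool size to use for the next monitoring interval."""
        if io_wait_fraction > self.io_wait_threshold:
            # Contention detected: back off by halving the pool (assumed policy).
            proposed = current // 2
        else:
            # No contention: probe upward one thread at a time (assumed policy).
            proposed = current + 1
        # Keep the result inside the configured bounds.
        return max(self.min_threads, min(self.max_threads, proposed))


# Feed the controller a made-up sequence of I/O-wait measurements.
sizer = AdaptivePoolSizer()
size = 8
history = []
for io_wait in [0.1, 0.5, 0.6, 0.2, 0.1]:
    size = sizer.next_size(size, io_wait)
    history.append(size)
print(history)
```

In a real executor, the I/O-wait measurements would come from the operating system and the returned size would be applied to the actual thread pool; both steps are left out here because the record does not describe them.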