Self-adaptive Executors for Big Data Processing

Research output: Chapter in book/report/conference proceeding; Conference contribution; Research; peer review

Authors

  • Sobhan Omranian Khorasani
  • Jan Rellermeyer
  • Dick Epema

External Research Organisations

  • Delft University of Technology

Details

Original language: English
Title of host publication: Middleware 2019 - Proceedings of the 2019 20th International Middleware Conference
Publisher: Association for Computing Machinery (ACM)
Pages: 176-188
Number of pages: 13
ISBN (electronic): 9781450370097
ISBN (print): 9781450370097
Publication status: Published - 13 Sept 2019
Externally published: Yes
Event: ACM/IFIP 20th International Middleware Conference - UC Davis, United States
Duration: 9 Dec 2019 to 13 Dec 2019

Publication series

Name: Proceedings of the 20th International Middleware Conference

Abstract

The demand for additional performance due to the rapid increase in the size and importance of data-intensive applications has considerably elevated the complexity of computer architecture. In response, systems offer pre-determined behaviors based on heuristics and then expose a large number of configuration parameters for operators to adjust them to their particular infrastructure. Unfortunately, in practice this leads to a substantial manual tuning effort. In this work, we focus on one of the most impactful tuning decisions in big data systems: the number of executor threads. We first show the impact of I/O contention on the runtime of workloads and a simple static solution to reduce the number of threads for I/O-bound phases. We then present a more elaborate solution in the form of self-adaptive executors which are able to continuously monitor the underlying system resources and detect contentions. This enables the executors to tune their thread pool size dynamically at runtime in order to achieve the best performance. Our experimental results show that being adaptive can significantly reduce the execution time especially in I/O intensive applications such as Terasort and PageRank which see a 34% and 54% reduction in runtime.

Keywords

    Apache Spark, Big Data, Self-Adaptive Executors

Cite this

Self-adaptive Executors for Big Data Processing. / Omranian Khorasani, Sobhan; Rellermeyer, Jan; Epema, Dick.
Middleware 2019 - Proceedings of the 2019 20th International Middleware Conference. Association for Computing Machinery (ACM), 2019. p. 176-188 (Proceedings of the 20th International Middleware Conference).

Omranian Khorasani, S, Rellermeyer, J & Epema, D 2019, Self-adaptive Executors for Big Data Processing. in Middleware 2019 - Proceedings of the 2019 20th International Middleware Conference. Proceedings of the 20th International Middleware Conference, Association for Computing Machinery (ACM), pp. 176-188, ACM/IFIP 20th International Middleware Conference, United States, 9 Dec 2019. https://doi.org/10.1145/3361525.3361545
Omranian Khorasani, S., Rellermeyer, J., & Epema, D. (2019). Self-adaptive Executors for Big Data Processing. In Middleware 2019 - Proceedings of the 2019 20th International Middleware Conference (pp. 176-188). (Proceedings of the 20th International Middleware Conference). Association for Computing Machinery (ACM). https://doi.org/10.1145/3361525.3361545
Omranian Khorasani S, Rellermeyer J, Epema D. Self-adaptive Executors for Big Data Processing. In Middleware 2019 - Proceedings of the 2019 20th International Middleware Conference. Association for Computing Machinery (ACM). 2019. p. 176-188. (Proceedings of the 20th International Middleware Conference). doi: 10.1145/3361525.3361545
Omranian Khorasani, Sobhan ; Rellermeyer, Jan ; Epema, Dick. / Self-adaptive Executors for Big Data Processing. Middleware 2019 - Proceedings of the 2019 20th International Middleware Conference. Association for Computing Machinery (ACM), 2019. pp. 176-188 (Proceedings of the 20th International Middleware Conference).
@inproceedings{987b5ac65d58481e9bc34bb267b8fbe1,
title = "Self-adaptive Executors for Big Data Processing",
abstract = "The demand for additional performance due to the rapid increase in the size and importance of data-intensive applications has considerably elevated the complexity of computer architecture. In response, systems offer pre-determined behaviors based on heuristics and then expose a large number of configuration parameters for operators to adjust them to their particular infrastructure. Unfortunately, in practice this leads to a substantial manual tuning effort. In this work, we focus on one of the most impactful tuning decisions in big data systems: the number of executor threads. We first show the impact of I/O contention on the runtime of workloads and a simple static solution to reduce the number of threads for I/O-bound phases. We then present a more elaborate solution in the form of self-adaptive executors which are able to continuously monitor the underlying system resources and detect contentions. This enables the executors to tune their thread pool size dynamically at runtime in order to achieve the best performance. Our experimental results show that being adaptive can significantly reduce the execution time especially in I/O intensive applications such as Terasort and PageRank which see a 34\% and 54\% reduction in runtime.",
keywords = "Apache Spark, Big Data, Self-Adaptive Executors",
author = "{Omranian Khorasani}, Sobhan and Jan Rellermeyer and Dick Epema",
note = "Publisher Copyright: {\textcopyright} 2019 Association for Computing Machinery.; ACM/IFIP 20th International Middleware Conference ; Conference date: 09-12-2019 Through 13-12-2019",
year = "2019",
month = sep,
day = "13",
doi = "10.1145/3361525.3361545",
language = "English",
isbn = "9781450370097",
series = "Proceedings of the 20th International Middleware Conference",
publisher = "Association for Computing Machinery (ACM)",
pages = "176--188",
booktitle = "Middleware 2019 - Proceedings of the 2019 20th International Middleware Conference",
address = "United States",

}

TY - GEN

T1 - Self-adaptive Executors for Big Data Processing

AU - Omranian Khorasani, Sobhan

AU - Rellermeyer, Jan

AU - Epema, Dick

N1 - Publisher Copyright: © 2019 Association for Computing Machinery.

PY - 2019/9/13

Y1 - 2019/9/13

N2 - The demand for additional performance due to the rapid increase in the size and importance of data-intensive applications has considerably elevated the complexity of computer architecture. In response, systems offer pre-determined behaviors based on heuristics and then expose a large number of configuration parameters for operators to adjust them to their particular infrastructure. Unfortunately, in practice this leads to a substantial manual tuning effort. In this work, we focus on one of the most impactful tuning decisions in big data systems: the number of executor threads. We first show the impact of I/O contention on the runtime of workloads and a simple static solution to reduce the number of threads for I/O-bound phases. We then present a more elaborate solution in the form of self-adaptive executors which are able to continuously monitor the underlying system resources and detect contentions. This enables the executors to tune their thread pool size dynamically at runtime in order to achieve the best performance. Our experimental results show that being adaptive can significantly reduce the execution time especially in I/O intensive applications such as Terasort and PageRank which see a 34% and 54% reduction in runtime.

AB - The demand for additional performance due to the rapid increase in the size and importance of data-intensive applications has considerably elevated the complexity of computer architecture. In response, systems offer pre-determined behaviors based on heuristics and then expose a large number of configuration parameters for operators to adjust them to their particular infrastructure. Unfortunately, in practice this leads to a substantial manual tuning effort. In this work, we focus on one of the most impactful tuning decisions in big data systems: the number of executor threads. We first show the impact of I/O contention on the runtime of workloads and a simple static solution to reduce the number of threads for I/O-bound phases. We then present a more elaborate solution in the form of self-adaptive executors which are able to continuously monitor the underlying system resources and detect contentions. This enables the executors to tune their thread pool size dynamically at runtime in order to achieve the best performance. Our experimental results show that being adaptive can significantly reduce the execution time especially in I/O intensive applications such as Terasort and PageRank which see a 34% and 54% reduction in runtime.

KW - Apache Spark

KW - Big Data

KW - Self-Adaptive Executors

UR - http://www.scopus.com/inward/record.url?scp=85078060450&partnerID=8YFLogxK

U2 - 10.1145/3361525.3361545

DO - 10.1145/3361525.3361545

M3 - Conference contribution

SN - 9781450370097

T3 - Proceedings of the 20th International Middleware Conference

SP - 176

EP - 188

BT - Middleware 2019 - Proceedings of the 2019 20th International Middleware Conference

PB - Association for Computing Machinery (ACM)

T2 - ACM/IFIP 20th International Middleware Conference

Y2 - 9 December 2019 through 13 December 2019

ER -
