Self-adaptive Executors for Big Data Processing

Sobhan Omranian Khorasani; Jan Rellermeyer; Dick Epema

doi:10.1145/3361525.3361545

Details

Original language	English
Title of host publication	Middleware 2019 - Proceedings of the 2019 20th International Middleware Conference
Publisher	Association for Computing Machinery (ACM)
Pages	176-188
Number of pages	13
ISBN (electronic)	9781450370097
ISBN (print)	9781450370097
Publication status	Published - 13 Sept 2019
Externally published	Yes
Event	ACM/IFIP 20th International Middleware Conference - UC Davis, United States
Duration	9 Dec 2019 → 13 Dec 2019

Publication series

Name	Proceedings of the 20th International Middleware Conference

Abstract

The demand for additional performance due to the rapid increase in the size and importance of data-intensive applications has considerably elevated the complexity of computer architecture. In response, systems offer pre-determined behaviors based on heuristics and then expose a large number of configuration parameters for operators to adjust them to their particular infrastructure. Unfortunately, in practice this leads to a substantial manual tuning effort. In this work, we focus on one of the most impactful tuning decisions in big data systems: the number of executor threads. We first show the impact of I/O contention on the runtime of workloads and a simple static solution to reduce the number of threads for I/O-bound phases. We then present a more elaborate solution in the form of self-adaptive executors which are able to continuously monitor the underlying system resources and detect contentions. This enables the executors to tune their thread pool size dynamically at runtime in order to achieve the best performance. Our experimental results show that being adaptive can significantly reduce the execution time especially in I/O intensive applications such as Terasort and PageRank which see a 34% and 54% reduction in runtime.
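The idea of an executor that shrinks or grows its thread pool in response to observed I/O contention can be illustrated with a minimal sketch. This is not the algorithm from the paper: the function name `next_pool_size`, the I/O-wait thresholds, the halving and increment steps, and the size bounds are all assumptions chosen only for illustration.

```python
def next_pool_size(current, io_wait, low=0.1, high=0.3, min_size=1, max_size=32):
    """Pick the thread pool size for the next monitoring interval.

    current  -- pool size used during the last interval
    io_wait  -- fraction of time (0.0 to 1.0) threads spent waiting on I/O
    low/high -- hypothetical thresholds, not values taken from the paper
    """
    if io_wait > high:
        # Heavy I/O waiting hints at contention: back off by halving the pool.
        return max(min_size, current // 2)
    if io_wait < low:
        # Little I/O waiting: probe upward by one thread.
        return min(max_size, current + 1)
    # Inside the band: keep the current size.
    return current


def simulate(start, io_wait_samples):
    """Apply the rule to a sequence of I/O-wait samples and record each size."""
    sizes = [start]
    for sample in io_wait_samples:
        sizes.append(next_pool_size(sizes[-1], sample))
    return sizes
```

Calling `simulate(16, [0.5, 0.5, 0.2, 0.05])` walks the pool from 16 threads down through two contended intervals, holds it steady for one, and then probes upward by one thread. A real executor would replace the samples with measurements from the operating system and apply the result to its actual thread pool.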

Keywords

Apache Spark, Big Data, Self-Adaptive Executors

ASJC Scopus subject areas

Computer Science(all)
Software

Cite this

Self-adaptive Executors for Big Data Processing. / Omranian Khorasani, Sobhan; Rellermeyer, Jan; Epema, Dick.
Middleware 2019 - Proceedings of the 2019 20th International Middleware Conference. Association for Computing Machinery (ACM), 2019. p. 176-188 (Proceedings of the 20th International Middleware Conference).

Research output: Chapter in book/report/conference proceeding › Conference contribution › Research › peer review

Omranian Khorasani, S, Rellermeyer, J & Epema, D 2019, Self-adaptive Executors for Big Data Processing. in Middleware 2019 - Proceedings of the 2019 20th International Middleware Conference. Proceedings of the 20th International Middleware Conference, Association for Computing Machinery (ACM), pp. 176-188, ACM/IFIP 20th International Middleware Conference, United States, 9 Dec 2019. https://doi.org/10.1145/3361525.3361545

Omranian Khorasani, S., Rellermeyer, J., & Epema, D. (2019). Self-adaptive Executors for Big Data Processing. In Middleware 2019 - Proceedings of the 2019 20th International Middleware Conference (pp. 176-188). (Proceedings of the 20th International Middleware Conference). Association for Computing Machinery (ACM). https://doi.org/10.1145/3361525.3361545

Omranian Khorasani S, Rellermeyer J, Epema D. Self-adaptive Executors for Big Data Processing. In Middleware 2019 - Proceedings of the 2019 20th International Middleware Conference. Association for Computing Machinery (ACM). 2019. p. 176-188. (Proceedings of the 20th International Middleware Conference). doi: 10.1145/3361525.3361545

Omranian Khorasani, Sobhan ; Rellermeyer, Jan ; Epema, Dick. / Self-adaptive Executors for Big Data Processing. Middleware 2019 - Proceedings of the 2019 20th International Middleware Conference. Association for Computing Machinery (ACM), 2019. pp. 176-188 (Proceedings of the 20th International Middleware Conference).

@inproceedings{987b5ac65d58481e9bc34bb267b8fbe1,

title = "Self-adaptive Executors for Big Data Processing",

abstract = "The demand for additional performance due to the rapid increase in the size and importance of data-intensive applications has considerably elevated the complexity of computer architecture. In response, systems offer pre-determined behaviors based on heuristics and then expose a large number of configuration parameters for operators to adjust them to their particular infrastructure. Unfortunately, in practice this leads to a substantial manual tuning effort. In this work, we focus on one of the most impactful tuning decisions in big data systems: the number of executor threads. We first show the impact of I/O contention on the runtime of workloads and a simple static solution to reduce the number of threads for I/O-bound phases. We then present a more elaborate solution in the form of self-adaptive executors which are able to continuously monitor the underlying system resources and detect contentions. This enables the executors to tune their thread pool size dynamically at runtime in order to achieve the best performance. Our experimental results show that being adaptive can significantly reduce the execution time especially in I/O intensive applications such as Terasort and PageRank which see a 34% and 54% reduction in runtime.",

keywords = "Apache Spark, Big Data, Self-Adaptive Executors",

author = "{Omranian Khorasani}, Sobhan and Jan Rellermeyer and Dick Epema",

note = "Publisher Copyright: {\textcopyright} 2019 Association for Computing Machinery.; ACM/IFIP 20th International Middleware Conference ; Conference date: 09-12-2019 Through 13-12-2019",

year = "2019",

month = sep,

day = "13",

doi = "10.1145/3361525.3361545",

language = "English",

isbn = "9781450370097",

series = "Proceedings of the 20th International Middleware Conference",

publisher = "Association for Computing Machinery (ACM)",

pages = "176--188",

booktitle = "Middleware 2019 - Proceedings of the 2019 20th International Middleware Conference",

address = "United States",

}

TY - GEN

T1 - Self-adaptive Executors for Big Data Processing

AU - Omranian Khorasani, Sobhan

AU - Rellermeyer, Jan

AU - Epema, Dick

PY - 2019/9/13

Y1 - 2019/9/13

AB - The demand for additional performance due to the rapid increase in the size and importance of data-intensive applications has considerably elevated the complexity of computer architecture. In response, systems offer pre-determined behaviors based on heuristics and then expose a large number of configuration parameters for operators to adjust them to their particular infrastructure. Unfortunately, in practice this leads to a substantial manual tuning effort. In this work, we focus on one of the most impactful tuning decisions in big data systems: the number of executor threads. We first show the impact of I/O contention on the runtime of workloads and a simple static solution to reduce the number of threads for I/O-bound phases. We then present a more elaborate solution in the form of self-adaptive executors which are able to continuously monitor the underlying system resources and detect contentions. This enables the executors to tune their thread pool size dynamically at runtime in order to achieve the best performance. Our experimental results show that being adaptive can significantly reduce the execution time especially in I/O intensive applications such as Terasort and PageRank which see a 34% and 54% reduction in runtime.

KW - Apache Spark

KW - Big Data

KW - Self-Adaptive Executors

UR - http://www.scopus.com/inward/record.url?scp=85078060450&partnerID=8YFLogxK

U2 - 10.1145/3361525.3361545

DO - 10.1145/3361525.3361545

M3 - Conference contribution

SN - 9781450370097

T3 - Proceedings of the 20th International Middleware Conference

SP - 176

EP - 188

BT - Middleware 2019 - Proceedings of the 2019 20th International Middleware Conference

PB - Association for Computing Machinery (ACM)

T2 - ACM/IFIP 20th International Middleware Conference

Y2 - 9 December 2019 through 13 December 2019

ER -

By the same author(s)

Toward Competitive Serverless Deep Learning

The Performance of Distributed Applications: A Traffic Shaping Perspective

Log Parsing Evaluation in the Era of Modern Software Systems

Brug: An Adaptive Memory (Re-)Allocator

Is Your Anomaly Detector Ready for Change? Adapting AIOps Solutions to the Real World
