Details
| Original language | English |
| --- | --- |
| Title of host publication | Advances in Knowledge Discovery and Data Mining - 25th Pacific-Asia Conference, PAKDD 2021, Proceedings |
| Editors | Kamal Karlapalem, Hong Cheng, Naren Ramakrishnan, R. K. Agrawal, P. Krishna Reddy, Jaideep Srivastava, Tanmoy Chakraborty |
| Publisher | Springer Science and Business Media Deutschland GmbH |
| Pages | 152-163 |
| Number of pages | 12 |
| ISBN (print) | 9783030757618 |
| Publication status | Published - 2021 |
| Externally published | Yes |
| Event | 25th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2021, Virtual/Online, 11 May 2021 → 14 May 2021 |
Publication series
| Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
| --- | --- |
| Volume | 12712 LNAI |
| ISSN (print) | 0302-9743 |
| ISSN (electronic) | 1611-3349 |
Abstract
Algorithm selection refers to the task of automatically selecting the most suitable algorithm for solving an instance of a computational problem from a set of candidate algorithms. Here, suitability is typically measured in terms of the algorithms’ runtimes. To allow the selection of algorithms on new problem instances, machine learning models are trained on previously observed performance data and then used to predict the algorithms’ performances. Due to the computational effort, the execution of such algorithms is often prematurely terminated, which leads to right-censored observations representing a lower bound on the actual runtime. While simply neglecting these censored samples leads to overly optimistic models, imputing them with precise though hypothetical values, such as the commonly used penalized average runtime, is a rather arbitrary and biased approach. In this paper, we propose a simple regression method based on so-called superset learning, in which right-censored runtime data are explicitly incorporated in terms of interval-valued observations, offering an intuitive and efficient approach to handling censored data. Benchmarking on publicly available algorithm performance data, we demonstrate that it outperforms the aforementioned naïve ways of dealing with censored samples and is competitive with established methods for censored regression in the field of algorithm selection.
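The core idea of treating a right-censored runtime as the interval-valued observation [cutoff, ∞) can be illustrated with a minimal loss-function sketch. This is an illustrative instantiation of interval-aware regression, not the paper's exact formulation: uncensored runs contribute the usual squared error, while a censored run is only penalized when the prediction falls below its cutoff, i.e., outside the set of values the true runtime could take.

```python
import numpy as np

def superset_loss(pred, y, censored):
    """Interval-aware squared loss (illustrative sketch).

    For an uncensored sample, y is the observed runtime and the loss is
    the ordinary squared error. For a right-censored sample, y is the
    cutoff and the true runtime lies somewhere in [y, inf); the prediction
    is penalized only if it falls below that lower bound.
    """
    pred = np.asarray(pred, dtype=float)
    y = np.asarray(y, dtype=float)
    censored = np.asarray(censored, dtype=bool)
    # Censored: hinge-style residual max(y - pred, 0); uncensored: pred - y.
    residual = np.where(censored, np.maximum(y - pred, 0.0), pred - y)
    return float(np.mean(residual ** 2))

# Toy data: runtimes in seconds; the third run hit a 10 s cutoff.
runtimes = [3.0, 7.5, 10.0]
censored = [False, False, True]
print(superset_loss([3.0, 7.5, 12.0], runtimes, censored))  # 0.0: prediction lies inside [10, inf)
print(superset_loss([3.0, 7.5, 8.0], runtimes, censored))   # > 0: prediction below the cutoff
```

Minimizing such a loss lets a regression model exploit the partial information in censored samples without committing to an arbitrary imputed value such as the penalized average runtime.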
Keywords
- Algorithm selection
- Censored data
- Superset learning
ASJC Scopus subject areas
- Mathematics (all)
- Theoretical Computer Science
- Computer Science (all)
- General Computer Science
Cite this
Advances in Knowledge Discovery and Data Mining - 25th Pacific-Asia Conference, PAKDD 2021, Proceedings. ed. / Kamal Karlapalem; Hong Cheng; Naren Ramakrishnan; R. K. Agrawal; P. Krishna Reddy; Jaideep Srivastava; Tanmoy Chakraborty. Springer Science and Business Media Deutschland GmbH, 2021. p. 152-163 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 12712 LNAI).
Research output: Chapter in book/report/conference proceeding › Conference contribution › Research › peer review
TY - GEN
T1 - Algorithm Selection as Superset Learning
T2 - 25th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2021
AU - Hanselle, Jonas
AU - Tornede, Alexander
AU - Wever, Marcel
AU - Hüllermeier, Eyke
N1 - Publisher Copyright: © 2021, Springer Nature Switzerland AG.
PY - 2021
Y1 - 2021
N2 - Algorithm selection refers to the task of automatically selecting the most suitable algorithm for solving an instance of a computational problem from a set of candidate algorithms. Here, suitability is typically measured in terms of the algorithms’ runtimes. To allow the selection of algorithms on new problem instances, machine learning models are trained on previously observed performance data and then used to predict the algorithms’ performances. Due to the computational effort, the execution of such algorithms is often prematurely terminated, which leads to right-censored observations representing a lower bound on the actual runtime. While simply neglecting these censored samples leads to overly optimistic models, imputing them with precise though hypothetical values, such as the commonly used penalized average runtime, is a rather arbitrary and biased approach. In this paper, we propose a simple regression method based on so-called superset learning, in which right-censored runtime data are explicitly incorporated in terms of interval-valued observations, offering an intuitive and efficient approach to handling censored data. Benchmarking on publicly available algorithm performance data, we demonstrate that it outperforms the aforementioned naïve ways of dealing with censored samples and is competitive with established methods for censored regression in the field of algorithm selection.
AB - Algorithm selection refers to the task of automatically selecting the most suitable algorithm for solving an instance of a computational problem from a set of candidate algorithms. Here, suitability is typically measured in terms of the algorithms’ runtimes. To allow the selection of algorithms on new problem instances, machine learning models are trained on previously observed performance data and then used to predict the algorithms’ performances. Due to the computational effort, the execution of such algorithms is often prematurely terminated, which leads to right-censored observations representing a lower bound on the actual runtime. While simply neglecting these censored samples leads to overly optimistic models, imputing them with precise though hypothetical values, such as the commonly used penalized average runtime, is a rather arbitrary and biased approach. In this paper, we propose a simple regression method based on so-called superset learning, in which right-censored runtime data are explicitly incorporated in terms of interval-valued observations, offering an intuitive and efficient approach to handling censored data. Benchmarking on publicly available algorithm performance data, we demonstrate that it outperforms the aforementioned naïve ways of dealing with censored samples and is competitive with established methods for censored regression in the field of algorithm selection.
KW - Algorithm selection
KW - Censored data
KW - Superset learning
UR - http://www.scopus.com/inward/record.url?scp=85111141266&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-75762-5_13
DO - 10.1007/978-3-030-75762-5_13
M3 - Conference contribution
SN - 9783030757618
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 152
EP - 163
BT - Advances in Knowledge Discovery and Data Mining - 25th Pacific-Asia Conference, PAKDD 2021, Proceedings
A2 - Karlapalem, Kamal
A2 - Cheng, Hong
A2 - Ramakrishnan, Naren
A2 - Agrawal, R. K.
A2 - Reddy, P. Krishna
A2 - Srivastava, Jaideep
A2 - Chakraborty, Tanmoy
PB - Springer Science and Business Media Deutschland GmbH
Y2 - 11 May 2021 through 14 May 2021
ER -