Details
Original language | English |
---|---|
Title of host publication | Machine Learning and Knowledge Discovery in Databases |
Subtitle | Research Track - European Conference, ECML PKDD 2024, Proceedings |
Editors | Albert Bifet, Jesse Davis, Tomas Krilavičius, Meelis Kull, Eirini Ntoutsi, Indrė Žliobaitė |
Publisher | Springer Science and Business Media Deutschland GmbH |
Pages | 434-449 |
Number of pages | 16 |
ISBN (electronic) | 978-3-031-70368-3 |
ISBN (print) | 9783031703676 |
Publication status | Published - 2024 |
Event | European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML PKDD 2024 - Vilnius, Lithuania. Duration: 9 Sept 2024 → 13 Sept 2024 |
Publication series
Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
---|---|
Volume | 14947 LNAI |
ISSN (print) | 0302-9743 |
ISSN (electronic) | 1611-3349 |
Abstract
Clinical Decision Support Systems (CDSS) have become ubiquitous in healthcare facilities, leveraging the increasing presence of Electronic Health Records (EHR). Predicting clinical outcomes from clinical text, such as identifying diagnoses based on the admission state of patients, is among the core tasks that a CDSS must address. The state-of-the-art for this task has been set by transformer encoder models, recently superseded by encoders enhanced with a prototypical network. This task remains a significant challenge due to the substantial imbalance of the outcome labels, which is characterized by a long-tailed distribution where the majority of diagnoses are under-represented. Motivated by recent biologically inspired findings in deep learning, we propose S-Proto, a novel, efficient, and sparse prototypical layer. Our method achieves state-of-the-art performance in outcome diagnosis prediction, without compromising on the explainability characteristics of prototypical encoders. Quantitative results demonstrate that our approach is robust to the challenges presented by clinical notes, and transfers successfully to a second, unseen dataset. Qualitative evaluation with medical doctors shows that S-Proto is capable of disaggregating the representations of a disease that manifests differently in patient cohorts.
ASJC Scopus subject areas
- Mathematics (all)
- Theoretical Computer Science
- Computer Science (all)
- General Computer Science
Cite this
Machine Learning and Knowledge Discovery in Databases: Research Track - European Conference, ECML PKDD 2024, Proceedings. Ed. / Albert Bifet; Jesse Davis; Tomas Krilavičius; Meelis Kull; Eirini Ntoutsi; Indrė Žliobaitė. Springer Science and Business Media Deutschland GmbH, 2024. pp. 434-449 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 14947 LNAI).
Research output: Chapter in book/report/conference proceedings › Conference contribution › Research › Peer-reviewed
TY - GEN
T1 - Boosting Long-Tail Data Classification with Sparse Prototypical Networks
AU - Figueroa, Alexei
AU - Papaioannou, Jens Michalis
AU - Fallon, Conor
AU - Bekiaridou, Alexandra
AU - Bressem, Keno
AU - Zanos, Stavros
AU - Gers, Felix
AU - Nejdl, Wolfgang
AU - Löser, Alexander
N1 - Publisher Copyright: © The Author(s), under exclusive license to Springer Nature Switzerland AG 2024.
PY - 2024
Y1 - 2024
AB - Clinical Decision Support Systems (CDSS) have become ubiquitous in healthcare facilities, leveraging the increasing presence of Electronic Health Records (EHR). Predicting clinical outcomes from clinical text, such as identifying diagnoses based on the admission state of patients, is among the core tasks that a CDSS must address. The state-of-the-art for this task has been set by transformer encoder models, recently superseded by encoders enhanced with a prototypical network. This task remains a significant challenge due to the substantial imbalance of the outcome labels, which is characterized by a long-tailed distribution where the majority of diagnoses are under-represented. Motivated by recent biologically inspired findings in deep learning, we propose S-Proto, a novel, efficient, and sparse prototypical layer. Our method achieves state-of-the-art performance in outcome diagnosis prediction, without compromising on the explainability characteristics of prototypical encoders. Quantitative results demonstrate that our approach is robust to the challenges presented by clinical notes, and transfers successfully to a second, unseen dataset. Qualitative evaluation with medical doctors shows that S-Proto is capable of disaggregating the representations of a disease that manifests differently in patient cohorts.
KW - Long-Tail
KW - NLP
KW - Prototypical Networks
KW - Sparsity
UR - http://www.scopus.com/inward/record.url?scp=85204376875&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-70368-3_26
DO - 10.1007/978-3-031-70368-3_26
M3 - Conference contribution
AN - SCOPUS:85204376875
SN - 9783031703676
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 434
EP - 449
BT - Machine Learning and Knowledge Discovery in Databases
A2 - Bifet, Albert
A2 - Davis, Jesse
A2 - Krilavičius, Tomas
A2 - Kull, Meelis
A2 - Ntoutsi, Eirini
A2 - Žliobaitė, Indrė
PB - Springer Science and Business Media Deutschland GmbH
T2 - European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML PKDD 2024
Y2 - 9 September 2024 through 13 September 2024
ER -