Details
Original language | English |
---|---|
Title of host publication | WSDM 2025 - Proceedings of the 18th ACM International Conference on Web Search and Data Mining |
Pages | 1112-1113 |
Number of pages | 2 |
ISBN (electronic) | 9798400713293 |
Publication status | Published - 10 Mar 2025 |
Event | 18th ACM International Conference on Web Search and Data Mining, WSDM 2025 - Hannover, Germany. Duration: 10 Mar 2025 → 14 Mar 2025 |
Abstract
This study presents a comprehensive benchmark of three state-of-the-art single-cell foundation models, scGPT, Geneformer, and scFoundation, on cell-type classification tasks. We evaluate the models on three datasets (myeloid, human pancreas, and multiple sclerosis), examining both standard fine-tuning and few-shot learning scenarios. Our work reveals that scFoundation consistently achieves the best performance, while Geneformer performs poorly, sometimes yielding results even worse than those of the baseline models. Additionally, we demonstrate that a good foundation model can generalize well even when fine-tuned with out-of-distribution data, a capability that the baseline models lack. Our work highlights the potential of foundation models for addressing challenging biomedical questions, particularly in contexts where models are trained on one population but deployed on another.
Keywords
- cell-type classification, few-shot learning, foundation models, out-of-distribution data
ASJC Scopus subject areas
- Computer Science(all)
- Computer Networks and Communications
- Computer Science Applications
- Software
Cite this
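A Systematic Evaluation of Single-Cell Foundation Models on Cell-Type Classification Task. / Steiner, Nicolas; Li, Ziteng; Vosoughi, Omid; Schrader, Johanna; Roy, Soumyadeep; Nejdl, Wolfgang; Tang, Ming. WSDM 2025 - Proceedings of the 18th ACM International Conference on Web Search and Data Mining. 2025. p. 1112-1113.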
Research output: Chapter in book/report/conference proceeding › Conference contribution › Research › peer review
TY - GEN
T1 - A Systematic Evaluation of Single-Cell Foundation Models on Cell-Type Classification Task
AU - Steiner, Nicolas
AU - Li, Ziteng
AU - Vosoughi, Omid
AU - Schrader, Johanna
AU - Roy, Soumyadeep
AU - Nejdl, Wolfgang
AU - Tang, Ming
N1 - Publisher Copyright: © 2025 Copyright held by the owner/author(s).
PY - 2025/3/10
Y1 - 2025/3/10
N2 - This study presents a comprehensive benchmark of three state-of-the-art single-cell foundation models, scGPT, Geneformer, and scFoundation, on cell-type classification tasks. We evaluate the models on three datasets (myeloid, human pancreas, and multiple sclerosis), examining both standard fine-tuning and few-shot learning scenarios. Our work reveals that scFoundation consistently achieves the best performance, while Geneformer performs poorly, sometimes yielding results even worse than those of the baseline models. Additionally, we demonstrate that a good foundation model can generalize well even when fine-tuned with out-of-distribution data, a capability that the baseline models lack. Our work highlights the potential of foundation models for addressing challenging biomedical questions, particularly in contexts where models are trained on one population but deployed on another.
KW - cell-type classification
KW - few-shot learning
KW - foundation models
KW - out-of-distribution data
UR - http://www.scopus.com/inward/record.url?scp=105001669179&partnerID=8YFLogxK
U2 - 10.1145/3701551.3708811
DO - 10.1145/3701551.3708811
M3 - Conference contribution
AN - SCOPUS:105001669179
SP - 1112
EP - 1113
BT - WSDM 2025 - Proceedings of the 18th ACM International Conference on Web Search and Data Mining
T2 - 18th ACM International Conference on Web Search and Data Mining, WSDM 2025
Y2 - 10 March 2025 through 14 March 2025
ER -