A Systematic Evaluation of Single-Cell Foundation Models on Cell-Type Classification Task

Research output: Chapter in book/report/conference proceeding > Conference contribution > Research > peer review

Authors

  • Nicolas Steiner
  • Ziteng Li
  • Omid Vosoughi
  • Johanna Schrader
  • Soumyadeep Roy
  • Wolfgang Nejdl
  • Ming Tang

Research Organisations

External Research Organisations

  • Indian Institute of Technology Kharagpur (IITKGP)

Details

Original language: English
Title of host publication: WSDM 2025 - Proceedings of the 18th ACM International Conference on Web Search and Data Mining
Pages: 1112-1113
Number of pages: 2
ISBN (electronic): 9798400713293
Publication status: Published - 10 Mar 2025
Event: 18th ACM International Conference on Web Search and Data Mining, WSDM 2025 - Hannover, Germany
Duration: 10 Mar 2025 to 14 Mar 2025

Abstract

This study presents a comprehensive benchmarking of three state-of-the-art single-cell foundation models, scGPT, Geneformer, and scFoundation, on cell-type classification tasks. We evaluate the models on three datasets: myeloid, human pancreas, and multiple sclerosis, examining both standard fine-tuning and few-shot learning scenarios. Our work reveals that scFoundation consistently achieves the best performance while Geneformer performs poorly, yielding results sometimes even worse than those of the baseline models. Additionally, we demonstrate that a good foundation model can generalize well even when fine-tuned with out-of-distribution data, a capability that the baseline models lack. Our work highlights the potential of foundation models for addressing challenging biomedical questions, particularly in contexts where models are trained on one population but deployed on another.

Keywords

    cell-type classification, few-shot learning, foundation models, out-of-distribution data

Cite this

A Systematic Evaluation of Single-Cell Foundation Models on Cell-Type Classification Task. / Steiner, Nicolas; Li, Ziteng; Vosoughi, Omid et al.
WSDM 2025 - Proceedings of the 18th ACM International Conference on Web Search and Data Mining. 2025. p. 1112-1113.

Steiner, N, Li, Z, Vosoughi, O, Schrader, J, Roy, S, Nejdl, W & Tang, M 2025, A Systematic Evaluation of Single-Cell Foundation Models on Cell-Type Classification Task. in WSDM 2025 - Proceedings of the 18th ACM International Conference on Web Search and Data Mining. pp. 1112-1113, 18th ACM International Conference on Web Search and Data Mining, WSDM 2025, Hannover, Lower Saxony, Germany, 10 Mar 2025. https://doi.org/10.1145/3701551.3708811
Steiner, N., Li, Z., Vosoughi, O., Schrader, J., Roy, S., Nejdl, W., & Tang, M. (2025). A Systematic Evaluation of Single-Cell Foundation Models on Cell-Type Classification Task. In WSDM 2025 - Proceedings of the 18th ACM International Conference on Web Search and Data Mining (pp. 1112-1113) https://doi.org/10.1145/3701551.3708811
Steiner N, Li Z, Vosoughi O, Schrader J, Roy S, Nejdl W et al. A Systematic Evaluation of Single-Cell Foundation Models on Cell-Type Classification Task. In WSDM 2025 - Proceedings of the 18th ACM International Conference on Web Search and Data Mining. 2025. p. 1112-1113 doi: 10.1145/3701551.3708811
Steiner, Nicolas ; Li, Ziteng ; Vosoughi, Omid et al. / A Systematic Evaluation of Single-Cell Foundation Models on Cell-Type Classification Task. WSDM 2025 - Proceedings of the 18th ACM International Conference on Web Search and Data Mining. 2025. pp. 1112-1113
@inproceedings{903cef16bb1e4f49a8829429742fac45,
title = "A Systematic Evaluation of Single-Cell Foundation Models on Cell-Type Classification Task",
abstract = "This study presents a comprehensive benchmarking of three state-of-the-art single-cell foundation models scGPT, Geneformer, and scFoundation, on cell-type classification tasks. We evaluate the models on three datasets: myeloid, human pancreas, and multiple sclerosis, examining both standard fine-tuning and few-shot learning scenarios. Our work reveals that scFoundation consistently achieves the best performance while Geneformer performs poorly, yielding results sometimes even worse than those of the baseline models. Additionally, we demonstrate that a good foundation model can generalize well even when fine-tuned with out-of-distribution data, a capability that the baseline models lack. Our work highlights the potential of foundation models for addressing challenging biomedical questions, particularly in contexts where models are trained on one population but deployed on another.",
keywords = "cell-type classification, few-shot learning, foundation models, out-of-distribution data",
author = "Nicolas Steiner and Ziteng Li and Omid Vosoughi and Johanna Schrader and Soumyadeep Roy and Wolfgang Nejdl and Ming Tang",
note = "Publisher Copyright: {\textcopyright} 2025 Copyright held by the owner/author(s).; 18th ACM International Conference on Web Search and Data Mining, WSDM 2025; Conference date: 10-03-2025 through 14-03-2025",
year = "2025",
month = mar,
day = "10",
doi = "10.1145/3701551.3708811",
language = "English",
pages = "1112--1113",
booktitle = "WSDM 2025 - Proceedings of the 18th ACM International Conference on Web Search and Data Mining",

}

TY - GEN

T1 - A Systematic Evaluation of Single-Cell Foundation Models on Cell-Type Classification Task

AU - Steiner, Nicolas

AU - Li, Ziteng

AU - Vosoughi, Omid

AU - Schrader, Johanna

AU - Roy, Soumyadeep

AU - Nejdl, Wolfgang

AU - Tang, Ming

N1 - Publisher Copyright: © 2025 Copyright held by the owner/author(s).

PY - 2025/3/10

Y1 - 2025/3/10

N2 - This study presents a comprehensive benchmarking of three state-of-the-art single-cell foundation models scGPT, Geneformer, and scFoundation, on cell-type classification tasks. We evaluate the models on three datasets: myeloid, human pancreas, and multiple sclerosis, examining both standard fine-tuning and few-shot learning scenarios. Our work reveals that scFoundation consistently achieves the best performance while Geneformer performs poorly, yielding results sometimes even worse than those of the baseline models. Additionally, we demonstrate that a good foundation model can generalize well even when fine-tuned with out-of-distribution data, a capability that the baseline models lack. Our work highlights the potential of foundation models for addressing challenging biomedical questions, particularly in contexts where models are trained on one population but deployed on another.

AB - This study presents a comprehensive benchmarking of three state-of-the-art single-cell foundation models scGPT, Geneformer, and scFoundation, on cell-type classification tasks. We evaluate the models on three datasets: myeloid, human pancreas, and multiple sclerosis, examining both standard fine-tuning and few-shot learning scenarios. Our work reveals that scFoundation consistently achieves the best performance while Geneformer performs poorly, yielding results sometimes even worse than those of the baseline models. Additionally, we demonstrate that a good foundation model can generalize well even when fine-tuned with out-of-distribution data, a capability that the baseline models lack. Our work highlights the potential of foundation models for addressing challenging biomedical questions, particularly in contexts where models are trained on one population but deployed on another.

KW - cell-type classification

KW - few-shot learning

KW - foundation models

KW - out-of-distribution data

UR - http://www.scopus.com/inward/record.url?scp=105001669179&partnerID=8YFLogxK

U2 - 10.1145/3701551.3708811

DO - 10.1145/3701551.3708811

M3 - Conference contribution

AN - SCOPUS:105001669179

SP - 1112

EP - 1113

BT - WSDM 2025 - Proceedings of the 18th ACM International Conference on Web Search and Data Mining

T2 - 18th ACM International Conference on Web Search and Data Mining, WSDM 2025

Y2 - 10 March 2025 through 14 March 2025

ER -
