LLMs4SchemaDiscovery: A Human-in-the-Loop Workflow for Scientific Schema Mining with Large Language Models

Publication: Contribution to book/report/anthology/conference proceedings; conference paper; research; peer-reviewed

Authors

  • Sameer Sadruddin
  • Jennifer D’Souza
  • Eleni Poupaki
  • Alex Watkins
  • Hamed Babaei Giglou
  • Anisa Rula
  • Bora Karasulu
  • Sören Auer
  • Adrie Mackus
  • Erwin Kessels

Organizational units

External organizations

  • Technische Informationsbibliothek (TIB) Leibniz-Informationszentrum Technik und Naturwissenschaften und Universitätsbibliothek
  • Eindhoven University of Technology (TU/e)
  • University of Warwick
  • University of Brescia

Details

Original language: English
Title of host publication: The Semantic Web
Subtitle: 22nd European Semantic Web Conference, ESWC 2025, Proceedings
Editors: Edward Curry, Maribel Acosta, Maria Poveda-Villalón, Marieke van Erp, Adegboyega Ojo, Katja Hose, Cogan Shimizu, Pasquale Lisena
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 244-261
Number of pages: 18
ISBN (electronic): 978-3-031-94578-6
ISBN (print): 9783031945779
Publication status: Published - 31 May 2025
Event: 22nd European Semantic Web Conference, ESWC 2025 - Portoroz, Slovenia
Duration: 1 June 2025 to 5 June 2025

Publication series

Name: Lecture Notes in Computer Science
Volume: 15719 LNCS
ISSN (print): 0302-9743
ISSN (electronic): 1611-3349

Abstract

Extracting structured information from unstructured text is crucial for modeling real-world processes, but traditional schema mining relies on semi-structured data, limiting scalability. This paper introduces schema-miner, a novel tool that combines large language models with human feedback to automate and refine schema extraction. Through an iterative workflow, it organizes properties from text, incorporates expert input, and integrates domain-specific ontologies for semantic depth. Applied to materials science—specifically atomic layer deposition—schema-miner demonstrates that expert-guided LLMs generate semantically rich schemas suitable for diverse real-world applications.
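
The iterative workflow described in the abstract can be pictured with a small sketch. This is only an illustration under stated assumptions, not schema-miner's actual code or API: `propose_properties` is a hypothetical stand-in for the LLM step (here it just reads "name: description" lines), `review` is a hypothetical callback representing the expert's accept/reject decision, and `ontology` is a made-up lookup table standing in for a domain ontology.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Property:
    description: str
    ontology_term: Optional[str] = None  # hypothetical ontology identifier


@dataclass
class Schema:
    properties: Dict[str, Property] = field(default_factory=dict)


def propose_properties(text: str) -> Dict[str, str]:
    """Placeholder for the LLM step: reads 'name: description' lines from text."""
    found: Dict[str, str] = {}
    for line in text.splitlines():
        name, sep, desc = line.partition(":")
        if sep and name.strip():
            key = name.strip().lower().replace(" ", "_")
            found[key] = desc.strip()
    return found


def refine(
    schema: Schema,
    documents: List[str],
    review: Callable[[str, str], bool],
    ontology: Dict[str, str],
) -> Schema:
    """Grow the schema one document at a time, keeping only expert-approved properties."""
    for doc in documents:
        for name, desc in propose_properties(doc).items():
            if name in schema.properties:
                continue  # already part of the schema from an earlier iteration
            if review(name, desc):  # human-in-the-loop decision
                # Attach an ontology term when the lookup table has one.
                schema.properties[name] = Property(desc, ontology.get(name))
    return schema


if __name__ == "__main__":
    docs = [
        "Precursor: chemical compound supplied to the reactor",
        "Temperature: substrate temperature during deposition\nAuthor mood: not relevant",
    ]
    result = refine(
        Schema(),
        docs,
        review=lambda name, desc: name != "author_mood",  # stand-in for an expert
        ontology={"temperature": "ex:Temperature"},  # made-up identifier
    )
    for name, prop in result.properties.items():
        print(name, prop.ontology_term)
```

In a real setting the placeholder extraction would be an LLM call and the review step an interactive expert session; the sketch only shows how the three ingredients (extraction, expert input, ontology grounding) could fit into one loop.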

Cite this

LLMs4SchemaDiscovery: A Human-in-the-Loop Workflow for Scientific Schema Mining with Large Language Models. / Sadruddin, Sameer; D’Souza, Jennifer; Poupaki, Eleni et al.
The Semantic Web : 22nd European Semantic Web Conference, ESWC 2025, Proceedings. Ed. / Edward Curry; Maribel Acosta; Maria Poveda-Villalón; Marieke van Erp; Adegboyega Ojo; Katja Hose; Cogan Shimizu; Pasquale Lisena. Springer Science and Business Media Deutschland GmbH, 2025. pp. 244-261 (Lecture Notes in Computer Science; Vol. 15719 LNCS).

Sadruddin, S, D’Souza, J, Poupaki, E, Watkins, A, Babaei Giglou, H, Rula, A, Karasulu, B, Auer, S, Mackus, A & Kessels, E 2025, LLMs4SchemaDiscovery: A Human-in-the-Loop Workflow for Scientific Schema Mining with Large Language Models. in E Curry, M Acosta, M Poveda-Villalón, M van Erp, A Ojo, K Hose, C Shimizu & P Lisena (eds), The Semantic Web : 22nd European Semantic Web Conference, ESWC 2025, Proceedings. Lecture Notes in Computer Science, vol. 15719 LNCS, Springer Science and Business Media Deutschland GmbH, pp. 244-261, 22nd European Semantic Web Conference, ESWC 2025, Portoroz, Slovenia, 1 June 2025. https://doi.org/10.1007/978-3-031-94578-6_14, https://doi.org/10.48550/arXiv.2504.00752
Sadruddin, S., D’Souza, J., Poupaki, E., Watkins, A., Babaei Giglou, H., Rula, A., Karasulu, B., Auer, S., Mackus, A., & Kessels, E. (2025). LLMs4SchemaDiscovery: A Human-in-the-Loop Workflow for Scientific Schema Mining with Large Language Models. In E. Curry, M. Acosta, M. Poveda-Villalón, M. van Erp, A. Ojo, K. Hose, C. Shimizu, & P. Lisena (Eds.), The Semantic Web : 22nd European Semantic Web Conference, ESWC 2025, Proceedings (pp. 244-261). (Lecture Notes in Computer Science; Vol. 15719 LNCS). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-94578-6_14, https://doi.org/10.48550/arXiv.2504.00752
Sadruddin S, D’Souza J, Poupaki E, Watkins A, Babaei Giglou H, Rula A et al. LLMs4SchemaDiscovery: A Human-in-the-Loop Workflow for Scientific Schema Mining with Large Language Models. in Curry E, Acosta M, Poveda-Villalón M, van Erp M, Ojo A, Hose K, Shimizu C, Lisena P, editors, The Semantic Web : 22nd European Semantic Web Conference, ESWC 2025, Proceedings. Springer Science and Business Media Deutschland GmbH. 2025. p. 244-261. (Lecture Notes in Computer Science). doi: 10.1007/978-3-031-94578-6_14, 10.48550/arXiv.2504.00752
Sadruddin, Sameer ; D’Souza, Jennifer ; Poupaki, Eleni et al. / LLMs4SchemaDiscovery : A Human-in-the-Loop Workflow for Scientific Schema Mining with Large Language Models. The Semantic Web : 22nd European Semantic Web Conference, ESWC 2025, Proceedings. Ed. / Edward Curry ; Maribel Acosta ; Maria Poveda-Villalón ; Marieke van Erp ; Adegboyega Ojo ; Katja Hose ; Cogan Shimizu ; Pasquale Lisena. Springer Science and Business Media Deutschland GmbH, 2025. pp. 244-261 (Lecture Notes in Computer Science).
@inproceedings{d74c94dc5ffd4eaf9183d210ab54ea68,
title = "LLMs4SchemaDiscovery: A Human-in-the-Loop Workflow for Scientific Schema Mining with Large Language Models",
abstract = "Extracting structured information from unstructured text is crucial for modeling real-world processes, but traditional schema mining relies on semi-structured data, limiting scalability. This paper introduces schema-miner, a novel tool that combines large language models with human feedback to automate and refine schema extraction. Through an iterative workflow, it organizes properties from text, incorporates expert input, and integrates domain-specific ontologies for semantic depth. Applied to materials science—specifically atomic layer deposition—schema-miner demonstrates that expert-guided LLMs generate semantically rich schemas suitable for diverse real-world applications.",
keywords = "Human-in-the-loop Workflow, Large Language Models, Schema Discovery, Schema Mining, Scientific Schemas",
author = "Sameer Sadruddin and Jennifer D{\textquoteright}Souza and Eleni Poupaki and Alex Watkins and {Babaei Giglou}, Hamed and Anisa Rula and Bora Karasulu and S{\"o}ren Auer and Adrie Mackus and Erwin Kessels",
note = "Publisher Copyright: {\textcopyright} The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.; 22nd European Semantic Web Conference, ESWC 2025, ESWC 2025 ; Conference date: 01-06-2025 Through 05-06-2025",
year = "2025",
month = may,
day = "31",
doi = "10.1007/978-3-031-94578-6_14",
language = "English",
isbn = "9783031945779",
series = "Lecture Notes in Computer Science",
publisher = "Springer Science and Business Media Deutschland GmbH",
pages = "244--261",
editor = "Edward Curry and Maribel Acosta and Maria Poveda-Villal{\'o}n and {van Erp}, Marieke and Adegboyega Ojo and Katja Hose and Cogan Shimizu and Pasquale Lisena",
booktitle = "The Semantic Web",
address = "Germany",

}

TY - GEN
T1 - LLMs4SchemaDiscovery
T2 - 22nd European Semantic Web Conference, ESWC 2025
AU - Sadruddin, Sameer
AU - D’Souza, Jennifer
AU - Poupaki, Eleni
AU - Watkins, Alex
AU - Babaei Giglou, Hamed
AU - Rula, Anisa
AU - Karasulu, Bora
AU - Auer, Sören
AU - Mackus, Adrie
AU - Kessels, Erwin
N1 - Publisher Copyright: © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.
PY - 2025/5/31
Y1 - 2025/5/31
N2 - Extracting structured information from unstructured text is crucial for modeling real-world processes, but traditional schema mining relies on semi-structured data, limiting scalability. This paper introduces schema-miner, a novel tool that combines large language models with human feedback to automate and refine schema extraction. Through an iterative workflow, it organizes properties from text, incorporates expert input, and integrates domain-specific ontologies for semantic depth. Applied to materials science—specifically atomic layer deposition—schema-miner demonstrates that expert-guided LLMs generate semantically rich schemas suitable for diverse real-world applications.
AB - Extracting structured information from unstructured text is crucial for modeling real-world processes, but traditional schema mining relies on semi-structured data, limiting scalability. This paper introduces schema-miner, a novel tool that combines large language models with human feedback to automate and refine schema extraction. Through an iterative workflow, it organizes properties from text, incorporates expert input, and integrates domain-specific ontologies for semantic depth. Applied to materials science—specifically atomic layer deposition—schema-miner demonstrates that expert-guided LLMs generate semantically rich schemas suitable for diverse real-world applications.
KW - Human-in-the-loop Workflow
KW - Large Language Models
KW - Schema Discovery
KW - Schema Mining
KW - Scientific Schemas
UR - http://www.scopus.com/inward/record.url?scp=105007760723&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-94578-6_14
DO - 10.1007/978-3-031-94578-6_14
M3 - Conference contribution
AN - SCOPUS:105007760723
SN - 9783031945779
T3 - Lecture Notes in Computer Science
SP - 244
EP - 261
BT - The Semantic Web
A2 - Curry, Edward
A2 - Acosta, Maribel
A2 - Poveda-Villalón, Maria
A2 - van Erp, Marieke
A2 - Ojo, Adegboyega
A2 - Hose, Katja
A2 - Shimizu, Cogan
A2 - Lisena, Pasquale
PB - Springer Science and Business Media Deutschland GmbH
Y2 - 1 June 2025 through 5 June 2025
ER -
