Details
Original language | English |
---|---|
Pages (from - to) | 1–26 |
Number of pages | 26 |
Journal | International Journal of Science Education |
Publication status | E-pub ahead of print - 12 May 2025 |
Abstract
ASJC Scopus subject areas
- Social Sciences (all)
- Education
Cite this
In: International Journal of Science Education, 12.05.2025, p. 1–26.
Publication: Contribution to journal › Article › Research › Peer review
TY - JOUR
T1 - Automatic feedback on physics tasks using open-source generative artificial intelligence
AU - Meyer, André
AU - Bleckmann, Tom
AU - Friege, Gunnar
N1 - Publisher Copyright: © 2025 The Author(s). Published by Informa UK Limited, trading as Taylor & Francis Group.
PY - 2025/5/12
Y1 - 2025/5/12
N2 - This study explores the feasibility of using open-source large language models (LLMs) to generate automatic feedback on physics problem-solving tasks in educational settings. A quantised version of the open-source LLM OpenChat 3.6 was employed to generate German-language feedback for high school students on standard school hardware. The study procedure involved five stages: data preparation, model selection, prompt design, response evaluation, and quality analysis of feedback. OpenChat 3.6 achieved an accuracy of 0.84 in classifying student answers. In comparison, GPT-4o achieved an accuracy of 0.85. The open-source LLM provided accurate and suitable feedback in 69% of cases, with substantial interrater agreement (κ = 0.89) on feedback quality. However, performance varied across task types, highlighting areas for improvement in prompt specificity, especially in handling physics terminology. These findings suggest that, with optimisation, open-source LLMs can offer a locally controlled and effective solution for formative assessment in physics education, enabling real-time, targeted feedback to support student learning.
AB - This study explores the feasibility of using open-source large language models (LLMs) to generate automatic feedback on physics problem-solving tasks in educational settings. A quantised version of the open-source LLM OpenChat 3.6 was employed to generate German-language feedback for high school students on standard school hardware. The study procedure involved five stages: data preparation, model selection, prompt design, response evaluation, and quality analysis of feedback. OpenChat 3.6 achieved an accuracy of 0.84 in classifying student answers. In comparison, GPT-4o achieved an accuracy of 0.85. The open-source LLM provided accurate and suitable feedback in 69% of cases, with substantial interrater agreement (κ = 0.89) on feedback quality. However, performance varied across task types, highlighting areas for improvement in prompt specificity, especially in handling physics terminology. These findings suggest that, with optimisation, open-source LLMs can offer a locally controlled and effective solution for formative assessment in physics education, enabling real-time, targeted feedback to support student learning.
KW - LLM
KW - automated feedback
KW - science education
UR - http://www.scopus.com/inward/record.url?scp=105004836149&partnerID=8YFLogxK
U2 - 10.1080/09500693.2025.2499220
DO - 10.1080/09500693.2025.2499220
M3 - Article
SP - 1
EP - 26
JO - International Journal of Science Education
JF - International Journal of Science Education
SN - 0950-0693
ER -
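The abstract reports two standard evaluation metrics: classification accuracy (0.84 vs. 0.85) and Cohen's κ for interrater agreement (0.89). As an illustrative sketch only — the function names and sample labels below are hypothetical, not taken from the authors' evaluation code — these metrics can be computed from paired label lists as follows:

```python
from collections import Counter

def accuracy(y_true, y_pred):
    """Fraction of predicted labels that match the reference labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: observed agreement between two raters,
    corrected for the agreement expected by chance."""
    n = len(rater_a)
    # Observed proportion of items on which the raters agree
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal label distribution
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical example: two raters judging feedback quality
ratings_a = ["ok", "ok", "bad", "ok"]
ratings_b = ["ok", "bad", "bad", "ok"]
print(cohens_kappa(ratings_a, ratings_b))  # 0.5: observed 0.75, chance 0.5
```

Libraries such as scikit-learn provide equivalent routines (`cohen_kappa_score`, `accuracy_score`); the stdlib version here only shows the arithmetic behind the reported figures.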