Towards Modeling and Evaluating Instructional Explanations in Teacher-Student Dialogues

Publication: Contribution to book/report/anthology/conference proceedings > Conference paper > Research > Peer-reviewed

Authors

  • Nils Feldhus
  • Aliki Anagnostopoulou
  • Qianli Wang
  • Milad Alshomary
  • Henning Wachsmuth

External organizations

  • Deutsches Forschungszentrum für Künstliche Intelligenz GmbH (DFKI)
  • Columbia University

Details

Original language: English
Title of host publication: Proceedings of the 2024 International Conference on Information Technology for Social Good
Place of publication: New York, NY, USA
Publisher: Association for Computing Machinery
Pages: 225–230
Number of pages: 6
ISBN (electronic): 9798400710940
ISBN (print): 9798400710940
Publication status: Published - 4 Sept. 2024

Abstract

For dialogues in which teachers explain difficult concepts to students, didactics research often debates which teaching strategies lead to the best learning outcome. In this paper, we test whether LLMs can reliably annotate such explanation dialogues, so that they could assist in lesson planning and tutoring systems. We first create a new annotation scheme of teaching acts aligned with contemporary teaching models and re-annotate a dataset of conversational explanations about communicating scientific understanding in teacher-student settings on five levels of the explainee’s expertise: ReWIRED contains three layers of acts (Teaching, Explanation, Dialogue) with increased granularity (span-level). We then evaluate language models on the labeling of such acts and find that the broad range and structure of the proposed labels are hard for LLMs such as GPT-3.5/-4 to model via prompting, but a fine-tuned BERT can perform both act classification and span labeling well. Finally, we operationalize a series of quality metrics for instructional explanations in the form of a test suite, finding that they match the five expertise levels well.

Cite

Towards Modeling and Evaluating Instructional Explanations in Teacher-Student Dialogues. / Feldhus, Nils; Anagnostopoulou, Aliki; Wang, Qianli et al.
Proceedings of the 2024 International Conference on Information Technology for Social Good. New York, NY, USA: Association for Computing Machinery, 2024. pp. 225–230.

Feldhus, N, Anagnostopoulou, A, Wang, Q, Alshomary, M, Wachsmuth, H, Sonntag, D & Möller, S 2024, Towards Modeling and Evaluating Instructional Explanations in Teacher-Student Dialogues. in Proceedings of the 2024 International Conference on Information Technology for Social Good. Association for Computing Machinery, New York, NY, USA, pp. 225–230. https://doi.org/10.1145/3677525.3678665
Feldhus, N., Anagnostopoulou, A., Wang, Q., Alshomary, M., Wachsmuth, H., Sonntag, D., & Möller, S. (2024). Towards Modeling and Evaluating Instructional Explanations in Teacher-Student Dialogues. In Proceedings of the 2024 International Conference on Information Technology for Social Good (pp. 225–230). Association for Computing Machinery. https://doi.org/10.1145/3677525.3678665
Feldhus N, Anagnostopoulou A, Wang Q, Alshomary M, Wachsmuth H, Sonntag D et al. Towards Modeling and Evaluating Instructional Explanations in Teacher-Student Dialogues. In Proceedings of the 2024 International Conference on Information Technology for Social Good. New York, NY, USA: Association for Computing Machinery. 2024. pp. 225–230. doi: 10.1145/3677525.3678665
Feldhus, Nils ; Anagnostopoulou, Aliki ; Wang, Qianli et al. / Towards Modeling and Evaluating Instructional Explanations in Teacher-Student Dialogues. Proceedings of the 2024 International Conference on Information Technology for Social Good. New York, NY, USA : Association for Computing Machinery, 2024. pp. 225–230
@inproceedings{bc8295271aaf4c9288b08ab778d89afc,
title = "Towards Modeling and Evaluating Instructional Explanations in Teacher-Student Dialogues",
abstract = "For dialogues in which teachers explain difficult concepts to students, didactics research often debates which teaching strategies lead to the best learning outcome. In this paper, we test whether LLMs can reliably annotate such explanation dialogues, so that they could assist in lesson planning and tutoring systems. We first create a new annotation scheme of teaching acts aligned with contemporary teaching models and re-annotate a dataset of conversational explanations about communicating scientific understanding in teacher-student settings on five levels of the explainee{\textquoteright}s expertise: ReWIRED contains three layers of acts (Teaching, Explanation, Dialogue) with increased granularity (span-level). We then evaluate language models on the labeling of such acts and find that the broad range and structure of the proposed labels are hard for LLMs such as GPT-3.5/-4 to model via prompting, but a fine-tuned BERT can perform both act classification and span labeling well. Finally, we operationalize a series of quality metrics for instructional explanations in the form of a test suite, finding that they match the five expertise levels well.",
keywords = "Dialogue, Discourse Analysis, Evaluation, Explanations",
author = "Nils Feldhus and Aliki Anagnostopoulou and Qianli Wang and Milad Alshomary and Henning Wachsmuth and Daniel Sonntag and Sebastian M{\"o}ller",
note = "Publisher Copyright: {\textcopyright} 2024 owner/author.",
year = "2024",
month = sep,
day = "4",
doi = "10.1145/3677525.3678665",
language = "English",
isbn = "9798400710940",
pages = "225--230",
booktitle = "Proceedings of the 2024 International Conference on Information Technology for Social Good",
publisher = "Association for Computing Machinery",
address = "United States",

}

TY - GEN
T1 - Towards Modeling and Evaluating Instructional Explanations in Teacher-Student Dialogues
AU - Feldhus, Nils
AU - Anagnostopoulou, Aliki
AU - Wang, Qianli
AU - Alshomary, Milad
AU - Wachsmuth, Henning
AU - Sonntag, Daniel
AU - Möller, Sebastian
N1 - Publisher Copyright: © 2024 owner/author.
PY - 2024/9/4
Y1 - 2024/9/4
AB - For dialogues in which teachers explain difficult concepts to students, didactics research often debates which teaching strategies lead to the best learning outcome. In this paper, we test whether LLMs can reliably annotate such explanation dialogues, so that they could assist in lesson planning and tutoring systems. We first create a new annotation scheme of teaching acts aligned with contemporary teaching models and re-annotate a dataset of conversational explanations about communicating scientific understanding in teacher-student settings on five levels of the explainee’s expertise: ReWIRED contains three layers of acts (Teaching, Explanation, Dialogue) with increased granularity (span-level). We then evaluate language models on the labeling of such acts and find that the broad range and structure of the proposed labels are hard for LLMs such as GPT-3.5/-4 to model via prompting, but a fine-tuned BERT can perform both act classification and span labeling well. Finally, we operationalize a series of quality metrics for instructional explanations in the form of a test suite, finding that they match the five expertise levels well.
KW - Dialogue
KW - Discourse Analysis
KW - Evaluation
KW - Explanations
UR - http://www.scopus.com/inward/record.url?scp=105005394251&partnerID=8YFLogxK
U2 - 10.1145/3677525.3678665
DO - 10.1145/3677525.3678665
M3 - Conference contribution
SN - 9798400710940
SP - 225
EP - 230
BT - Proceedings of the 2024 International Conference on Information Technology for Social Good
PB - Association for Computing Machinery
CY - New York, NY, USA
ER -

By the same authors