Towards Modeling and Evaluating Instructional Explanations in Teacher-Student Dialogues

Nils Feldhus; Aliki Anagnostopoulou; Qianli Wang; Milad Alshomary; Henning Wachsmuth; Daniel Sonntag; Sebastian Möller

doi:10.1145/3677525.3678665

Details

Original language	English
Title of host publication	Proceedings of the 2024 International Conference on Information Technology for Social Good
Place of publication	New York, NY, USA
Publisher	Association for Computing Machinery
Pages	225–230
Number of pages	6
ISBN (electronic)	9798400710940
ISBN (print)	9798400710940
Publication status	Published - 4 Sept 2024

Abstract

For dialogues in which teachers explain difficult concepts to students, didactics research often debates which teaching strategies lead to the best learning outcome. In this paper, we test if LLMs can reliably annotate such explanation dialogues, such that they could assist in lesson planning and tutoring systems. We first create a new annotation scheme of teaching acts aligned with contemporary teaching models and re-annotate a dataset of conversational explanations about communicating scientific understanding in teacher-student settings on five levels of the explainee’s expertise: ReWIRED contains three layers of acts (Teaching, Explanation, Dialogue) with increased granularity (span-level). We then evaluate language models on the labeling of such acts and find that the broad range and structure of the proposed labels is hard to model for LLMs such as GPT-3.5/-4 via prompting, but a fine-tuned BERT can perform both act classification and span labeling well. Finally, we operationalize a series of quality metrics for instructional explanations in the form of a test suite, finding that they match the five expertise levels well.

ASJC Scopus subject areas

Computer Science (all)
Computer Networks and Communications
Computer Science (all)
Information systems

Cite this

Towards Modeling and Evaluating Instructional Explanations in Teacher-Student Dialogues. / Feldhus, Nils; Anagnostopoulou, Aliki; Wang, Qianli et al.
Proceedings of the 2024 International Conference on Information Technology for Social Good. New York, NY, USA: Association for Computing Machinery, 2024. pp. 225–230.

Publication: Chapter in book/report/conference proceeding › Conference contribution › Research › Peer-reviewed

Feldhus, N, Anagnostopoulou, A, Wang, Q, Alshomary, M, Wachsmuth, H, Sonntag, D & Möller, S 2024, Towards Modeling and Evaluating Instructional Explanations in Teacher-Student Dialogues. in Proceedings of the 2024 International Conference on Information Technology for Social Good. Association for Computing Machinery, New York, NY, USA, pp. 225–230. https://doi.org/10.1145/3677525.3678665

Feldhus, N., Anagnostopoulou, A., Wang, Q., Alshomary, M., Wachsmuth, H., Sonntag, D., & Möller, S. (2024). Towards Modeling and Evaluating Instructional Explanations in Teacher-Student Dialogues. In Proceedings of the 2024 International Conference on Information Technology for Social Good (pp. 225–230). Association for Computing Machinery. https://doi.org/10.1145/3677525.3678665

Feldhus N, Anagnostopoulou A, Wang Q, Alshomary M, Wachsmuth H, Sonntag D et al. Towards Modeling and Evaluating Instructional Explanations in Teacher-Student Dialogues. In Proceedings of the 2024 International Conference on Information Technology for Social Good. New York, NY, USA: Association for Computing Machinery. 2024. p. 225–230. doi: 10.1145/3677525.3678665

Feldhus, Nils ; Anagnostopoulou, Aliki ; Wang, Qianli et al. / Towards Modeling and Evaluating Instructional Explanations in Teacher-Student Dialogues. Proceedings of the 2024 International Conference on Information Technology for Social Good. New York, NY, USA : Association for Computing Machinery, 2024. pp. 225–230


@inproceedings{bc8295271aaf4c9288b08ab778d89afc,
title = "Towards Modeling and Evaluating Instructional Explanations in Teacher-Student Dialogues",
abstract = "For dialogues in which teachers explain difficult concepts to students, didactics research often debates which teaching strategies lead to the best learning outcome. In this paper, we test if LLMs can reliably annotate such explanation dialogues, such that they could assist in lesson planning and tutoring systems. We first create a new annotation scheme of teaching acts aligned with contemporary teaching models and re-annotate a dataset of conversational explanations about communicating scientific understanding in teacher-student settings on five levels of the explainee{\textquoteright}s expertise: ReWIRED contains three layers of acts (Teaching, Explanation, Dialogue) with increased granularity (span-level). We then evaluate language models on the labeling of such acts and find that the broad range and structure of the proposed labels is hard to model for LLMs such as GPT-3.5/-4 via prompting, but a fine-tuned BERT can perform both act classification and span labeling well. Finally, we operationalize a series of quality metrics for instructional explanations in the form of a test suite, finding that they match the five expertise levels well.",
keywords = "Dialogue, Discourse Analysis, Evaluation, Explanations",
author = "Nils Feldhus and Aliki Anagnostopoulou and Qianli Wang and Milad Alshomary and Henning Wachsmuth and Daniel Sonntag and Sebastian M{\"o}ller",
note = "Publisher Copyright: {\textcopyright} 2024 owner/author.",
year = "2024",
month = sep,
day = "4",
doi = "10.1145/3677525.3678665",
language = "English",
isbn = "9798400710940",
pages = "225--230",
booktitle = "Proceedings of the 2024 International Conference on Information Technology for Social Good",
publisher = "Association for Computing Machinery",
address = "New York, NY, USA",
}


TY  - GEN
T1  - Towards Modeling and Evaluating Instructional Explanations in Teacher-Student Dialogues
AU  - Feldhus, Nils
AU  - Anagnostopoulou, Aliki
AU  - Wang, Qianli
AU  - Alshomary, Milad
AU  - Wachsmuth, Henning
AU  - Sonntag, Daniel
AU  - Möller, Sebastian
PY  - 2024/9/4
Y1  - 2024/9/4
AB  - For dialogues in which teachers explain difficult concepts to students, didactics research often debates which teaching strategies lead to the best learning outcome. In this paper, we test if LLMs can reliably annotate such explanation dialogues, such that they could assist in lesson planning and tutoring systems. We first create a new annotation scheme of teaching acts aligned with contemporary teaching models and re-annotate a dataset of conversational explanations about communicating scientific understanding in teacher-student settings on five levels of the explainee’s expertise: ReWIRED contains three layers of acts (Teaching, Explanation, Dialogue) with increased granularity (span-level). We then evaluate language models on the labeling of such acts and find that the broad range and structure of the proposed labels is hard to model for LLMs such as GPT-3.5/-4 via prompting, but a fine-tuned BERT can perform both act classification and span labeling well. Finally, we operationalize a series of quality metrics for instructional explanations in the form of a test suite, finding that they match the five expertise levels well.
KW  - Dialogue
KW  - Discourse Analysis
KW  - Evaluation
KW  - Explanations
UR  - http://www.scopus.com/inward/record.url?scp=105005394251&partnerID=8YFLogxK
U2  - 10.1145/3677525.3678665
DO  - 10.1145/3677525.3678665
M3  - Conference contribution
SN  - 9798400710940
SP  - 225
EP  - 230
BT  - Proceedings of the 2024 International Conference on Information Technology for Social Good
PB  - Association for Computing Machinery
CY  - New York, NY, USA
ER  -

By the same authors

When to use a metaphor: Metaphors in dialogical explanations with addressees of different expertise

Improving Argument Effectiveness Across Ideologies using Instruction-tuned Large Language Models

Disentangling Dialect from Social Bias via Multitask Learning to Improve Fairness

Adaptive Prompting: Ad-hoc Prompt Composition for Social Bias Detection

Mehrebenenannotation argumentativer Lerner∗innentexte für die automatische Textauswertung
