Ranking facts for explaining answers to elementary science questions

Publication: Contribution to journal › Article › Research › Peer-reviewed

Authors

External organisations

  • Technische Informationsbibliothek (TIB) Leibniz-Informationszentrum Technik und Naturwissenschaften und Universitätsbibliothek
  • Rheinische Friedrich-Wilhelms-Universität Bonn

Details

Original language: English
Pages (from - to): 228-253
Number of pages: 26
Journal: Natural language engineering
Volume: 29
Issue number: 2
Early online date: 24 Jan. 2022
Publication status: Published - 2023
Published externally: Yes

Abstract

In multiple-choice exams, students select one answer from among typically four choices and can explain why they made that particular choice. Students are good at understanding natural language questions and based on their domain knowledge can easily infer the question's answer by "connecting the dots" across various pertinent facts. Considering automated reasoning for elementary science question answering, we address the novel task of generating explanations for answers from human-authored facts. For this, we examine the practically scalable framework of feature-rich support vector machines leveraging domain-targeted, hand-crafted features. Explanations are created from a human-annotated set of nearly 5000 candidate facts in the WorldTree corpus. Our aim is to obtain better matches for valid facts of an explanation for the correct answer of a question over the available fact candidates. To this end, our features offer a comprehensive linguistic and semantic unification paradigm. The machine learning problem is the preference ordering of facts, for which we test pointwise regression versus pairwise learning-to-rank. Our contributions, originating from comprehensive evaluations against nine existing systems, are (1) a case study in which two preference ordering approaches are systematically compared, and where the pointwise approach is shown to outperform the pairwise approach, thus adding to the existing survey of observations on this topic; (2) since our system outperforms a highly effective TF-IDF-based IR technique by 3.5 and 4.9 points on the development and test sets, respectively, it demonstrates some of the further task improvement possibilities (e.g., in terms of an efficient learning algorithm, semantic features) on this task; (3) it is a practically competent approach that can outperform some variants of BERT-based reranking models; and (4) the human-engineered features make it an interpretable machine learning model for the task.
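The abstract frames the task as preference ordering over candidate explanation facts and compares pointwise regression with pairwise learning-to-rank. The sketch below is a minimal, hypothetical illustration of that contrast, not the paper's implementation: scikit-learn's SVR and LinearSVC stand in for the feature-rich SVM ranker, and the toy feature vectors and relevance labels are invented placeholders for the hand-crafted question-fact features and WorldTree annotations.

# Hypothetical sketch (not the authors' code): pointwise regression vs.
# pairwise (RankSVM-style) learning-to-rank for ordering candidate facts.
import numpy as np
from itertools import combinations
from sklearn.svm import SVR, LinearSVC

rng = np.random.default_rng(0)

# Toy data: 6 candidate facts for one question, each with a 4-dimensional
# feature vector and a gold relevance label (1 = fact is part of the
# explanation, 0 = it is not).
X = rng.normal(size=(6, 4))
y = np.array([1, 0, 1, 0, 0, 1], dtype=float)

# Pointwise: regress a relevance score per fact, then sort by score.
pointwise = SVR(kernel="linear").fit(X, y)
pointwise_order = np.argsort(-pointwise.predict(X))

# Pairwise: classify the sign of the score difference between fact pairs
# with differing relevance, then rank facts by the learned weight vector.
pairs, signs = [], []
for i, j in combinations(range(len(y)), 2):
    if y[i] == y[j]:
        continue  # only pairs with a preference carry a training signal
    pairs.append(X[i] - X[j])
    signs.append(1 if y[i] > y[j] else -1)

pairwise = LinearSVC(C=1.0).fit(np.array(pairs), np.array(signs))
pairwise_order = np.argsort(-(X @ pairwise.coef_.ravel()))

print("pointwise ranking:", pointwise_order)
print("pairwise  ranking:", pairwise_order)

The pairwise variant uses the difference transformation to turn ranking into binary classification over fact pairs; the pointwise variant treats each fact independently, which is the setting the paper reports as the stronger of the two.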


Cite

Ranking facts for explaining answers to elementary science questions. / D'Souza, Jennifer; Mulang, Isaiah Onando; Auer, Sören.
In: Natural language engineering, Vol. 29, No. 2, 2023, p. 228-253.


D'Souza J, Mulang IO, Auer S. Ranking facts for explaining answers to elementary science questions. Natural language engineering. 2023;29(2):228-253. Epub 2022 Jan 24. doi: 10.1017/S1351324921000358, 10.48550/arXiv.2110.09036
D'Souza, Jennifer ; Mulang, Isaiah Onando ; Auer, Sören. / Ranking facts for explaining answers to elementary science questions. In: Natural language engineering. 2023 ; Vol. 29, No. 2. pp. 228-253.
BibTeX
@article{46bac06d64dd42e5b8920d67f8395aa3,
title = "Ranking facts for explaining answers to elementary science questions",
abstract = "In multiple-choice exams, students select one answer from among typically four choices and can explain why they made that particular choice. Students are good at understanding natural language questions and based on their domain knowledge can easily infer the question's answer by {"}connecting the dots{"} across various pertinent facts. Considering automated reasoning for elementary science question answering, we address the novel task of generating explanations for answers from human-authored facts. For this, we examine the practically scalable framework of feature-rich support vector machines leveraging domain-targeted, hand-crafted features. Explanations are created from a human-annotated set of nearly 5000 candidate facts in the WorldTree corpus. Our aim is to obtain better matches for valid facts of an explanation for the correct answer of a question over the available fact candidates. To this end, our features offer a comprehensive linguistic and semantic unification paradigm. The machine learning problem is the preference ordering of facts, for which we test pointwise regression versus pairwise learning-to-rank. Our contributions, originating from comprehensive evaluations against nine existing systems, are (1) a case study in which two preference ordering approaches are systematically compared, and where the pointwise approach is shown to outperform the pairwise approach, thus adding to the existing survey of observations on this topic; (2) since our system outperforms a highly effective TF-IDF-based IR technique by 3.5 and 4.9 points on the development and test sets, respectively, it demonstrates some of the further task improvement possibilities (e.g., in terms of an efficient learning algorithm, semantic features) on this task; (3) it is a practically competent approach that can outperform some variants of BERT-based reranking models; and (4) the human-engineered features make it an interpretable machine learning model for the task.",
keywords = "Explanation generation, Information extraction, Machine learning, Semantics, Statistical methods",
author = "Jennifer D'Souza and Mulang, {Isaiah Onando} and S{\"o}ren Auer",
year = "2023",
doi = "10.1017/S1351324921000358",
language = "English",
volume = "29",
pages = "228--253",
journal = "Natural language engineering",
issn = "1351-3249",
publisher = "Cambridge University Press",
number = "2",

}

RIS

TY - JOUR

T1 - Ranking facts for explaining answers to elementary science questions

AU - D'Souza, Jennifer

AU - Mulang, Isaiah Onando

AU - Auer, Sören

PY - 2023

Y1 - 2023

N2 - In multiple-choice exams, students select one answer from among typically four choices and can explain why they made that particular choice. Students are good at understanding natural language questions and based on their domain knowledge can easily infer the question's answer by "connecting the dots" across various pertinent facts. Considering automated reasoning for elementary science question answering, we address the novel task of generating explanations for answers from human-authored facts. For this, we examine the practically scalable framework of feature-rich support vector machines leveraging domain-targeted, hand-crafted features. Explanations are created from a human-annotated set of nearly 5000 candidate facts in the WorldTree corpus. Our aim is to obtain better matches for valid facts of an explanation for the correct answer of a question over the available fact candidates. To this end, our features offer a comprehensive linguistic and semantic unification paradigm. The machine learning problem is the preference ordering of facts, for which we test pointwise regression versus pairwise learning-to-rank. Our contributions, originating from comprehensive evaluations against nine existing systems, are (1) a case study in which two preference ordering approaches are systematically compared, and where the pointwise approach is shown to outperform the pairwise approach, thus adding to the existing survey of observations on this topic; (2) since our system outperforms a highly effective TF-IDF-based IR technique by 3.5 and 4.9 points on the development and test sets, respectively, it demonstrates some of the further task improvement possibilities (e.g., in terms of an efficient learning algorithm, semantic features) on this task; (3) it is a practically competent approach that can outperform some variants of BERT-based reranking models; and (4) the human-engineered features make it an interpretable machine learning model for the task.

AB - In multiple-choice exams, students select one answer from among typically four choices and can explain why they made that particular choice. Students are good at understanding natural language questions and based on their domain knowledge can easily infer the question's answer by "connecting the dots" across various pertinent facts. Considering automated reasoning for elementary science question answering, we address the novel task of generating explanations for answers from human-authored facts. For this, we examine the practically scalable framework of feature-rich support vector machines leveraging domain-targeted, hand-crafted features. Explanations are created from a human-annotated set of nearly 5000 candidate facts in the WorldTree corpus. Our aim is to obtain better matches for valid facts of an explanation for the correct answer of a question over the available fact candidates. To this end, our features offer a comprehensive linguistic and semantic unification paradigm. The machine learning problem is the preference ordering of facts, for which we test pointwise regression versus pairwise learning-to-rank. Our contributions, originating from comprehensive evaluations against nine existing systems, are (1) a case study in which two preference ordering approaches are systematically compared, and where the pointwise approach is shown to outperform the pairwise approach, thus adding to the existing survey of observations on this topic; (2) since our system outperforms a highly effective TF-IDF-based IR technique by 3.5 and 4.9 points on the development and test sets, respectively, it demonstrates some of the further task improvement possibilities (e.g., in terms of an efficient learning algorithm, semantic features) on this task; (3) it is a practically competent approach that can outperform some variants of BERT-based reranking models; and (4) the human-engineered features make it an interpretable machine learning model for the task.

KW - Explanation generation

KW - Information extraction

KW - Machine learning

KW - Semantics

KW - Statistical methods

UR - http://www.scopus.com/inward/record.url?scp=85124044090&partnerID=8YFLogxK

U2 - 10.1017/S1351324921000358

DO - 10.1017/S1351324921000358

M3 - Article

AN - SCOPUS:85124044090

VL - 29

SP - 228

EP - 253

JO - Natural language engineering

JF - Natural language engineering

SN - 1351-3249

IS - 2

ER -
