The SciQA Scientific Question Answering Benchmark for Scholarly Knowledge

Publication: Contribution to journal › Article › Research › Peer reviewed

Authors

  • Sören Auer
  • Dante A.C. Barone
  • Cassiano Bartz
  • Eduardo G. Cortes
  • Mohamad Yaser Jaradeh
  • Oliver Karras
  • Manolis Koubarakis
  • Dmitry Mouromtsev
  • Dmitrii Pliukhin
  • Daniil Radyush
  • Ivan Shilin
  • Markus Stocker
  • Eleni Tsalapati

Organisational units

External organisations

  • Technische Informationsbibliothek (TIB) Leibniz-Informationszentrum Technik und Naturwissenschaften und Universitätsbibliothek
  • Universidade Federal do Rio Grande do Sul
  • University of Athens
  • St. Petersburg National Research University of Information Technologies, Mechanics and Optics (ITMO)

Details

Original language: English
Article number: 7240
Journal: Scientific reports
Volume: 13
Publication status: Published - 4 May 2023

Abstract

Knowledge graphs have gained increasing popularity in the last decade in science and technology. However, knowledge graphs are currently relatively simple-to-moderate semantic structures that are mainly collections of factual statements. Question answering (QA) benchmarks and systems have so far mainly been geared towards encyclopedic knowledge graphs such as DBpedia and Wikidata. We present SciQA, a scientific QA benchmark for scholarly knowledge. The benchmark leverages the Open Research Knowledge Graph (ORKG), which includes almost 170,000 resources describing research contributions of almost 15,000 scholarly articles from 709 research fields. Following a bottom-up methodology, we first manually developed a set of 100 complex questions that can be answered using this knowledge graph. Furthermore, we devised eight question templates with which we automatically generated a further 2,465 questions that can also be answered using the ORKG. The questions cover a range of research fields and question types and are translated into corresponding SPARQL queries over the ORKG. Based on two preliminary evaluations, we show that the resulting SciQA benchmark represents a challenging task for next-generation QA systems. This task is part of the open competitions at the 22nd International Semantic Web Conference 2023 as the Scholarly Question Answering over Linked Data (QALD) Challenge.
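
To make the benchmark format concrete, below is a minimal sketch of what a SciQA-style entry looks like: a natural-language question paired with a SPARQL query over the ORKG. The prefixes, predicate names (HAS_BENCHMARK, HAS_MODEL, HAS_ACCURACY), and the dataset label are hypothetical placeholders chosen for illustration, not actual ORKG identifiers; the real benchmark questions are mapped to queries over the concrete ORKG vocabulary.

# Hypothetical SciQA-style question/query pair. All IRIs and predicate
# names below are illustrative placeholders, not actual ORKG identifiers.
#
# Question: "Which models have been evaluated on dataset D, and what
#            accuracy do they achieve?"

PREFIX orkgp: <http://orkg.org/orkg/predicate/>
PREFIX rdfs:  <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?model ?modelLabel ?accuracy
WHERE {
  ?contribution orkgp:HAS_BENCHMARK ?benchmark ;   # contribution evaluated on a benchmark
                orkgp:HAS_MODEL     ?model ;       # model used in the contribution
                orkgp:HAS_ACCURACY  ?accuracy .    # reported accuracy value
  ?benchmark rdfs:label "Dataset D"@en .           # placeholder dataset label
  ?model     rdfs:label ?modelLabel .
}
ORDER BY DESC(?accuracy)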

ASJC Scopus subject areas

Cite

The SciQA Scientific Question Answering Benchmark for Scholarly Knowledge. / Auer, Sören; Barone, Dante A.C.; Bartz, Cassiano et al.
In: Scientific reports, Vol. 13, 7240, 04.05.2023.

Auer, S, Barone, DAC, Bartz, C, Cortes, EG, Jaradeh, MY, Karras, O, Koubarakis, M, Mouromtsev, D, Pliukhin, D, Radyush, D, Shilin, I, Stocker, M & Tsalapati, E 2023, 'The SciQA Scientific Question Answering Benchmark for Scholarly Knowledge', Scientific reports, vol. 13, 7240. https://doi.org/10.1038/s41598-023-33607-z
Auer, S., Barone, D. A. C., Bartz, C., Cortes, E. G., Jaradeh, M. Y., Karras, O., Koubarakis, M., Mouromtsev, D., Pliukhin, D., Radyush, D., Shilin, I., Stocker, M., & Tsalapati, E. (2023). The SciQA Scientific Question Answering Benchmark for Scholarly Knowledge. Scientific reports, 13, Article 7240. https://doi.org/10.1038/s41598-023-33607-z
Auer S, Barone DAC, Bartz C, Cortes EG, Jaradeh MY, Karras O et al. The SciQA Scientific Question Answering Benchmark for Scholarly Knowledge. Scientific reports. 2023 May 4;13:7240. doi: 10.1038/s41598-023-33607-z
Auer, Sören ; Barone, Dante A.C. ; Bartz, Cassiano et al. / The SciQA Scientific Question Answering Benchmark for Scholarly Knowledge. In: Scientific reports. 2023 ; Vol. 13.
BibTeX
@article{5a1f92d6681d4ca8946b6b5226a9859e,
title = "The SciQA Scientific Question Answering Benchmark for Scholarly Knowledge",
abstract = "Knowledge graphs have gained increasing popularity in the last decade in science and technology. However, knowledge graphs are currently relatively simple-to-moderate semantic structures that are mainly collections of factual statements. Question answering (QA) benchmarks and systems have so far mainly been geared towards encyclopedic knowledge graphs such as DBpedia and Wikidata. We present SciQA, a scientific QA benchmark for scholarly knowledge. The benchmark leverages the Open Research Knowledge Graph (ORKG), which includes almost 170,000 resources describing research contributions of almost 15,000 scholarly articles from 709 research fields. Following a bottom-up methodology, we first manually developed a set of 100 complex questions that can be answered using this knowledge graph. Furthermore, we devised eight question templates with which we automatically generated a further 2,465 questions that can also be answered using the ORKG. The questions cover a range of research fields and question types and are translated into corresponding SPARQL queries over the ORKG. Based on two preliminary evaluations, we show that the resulting SciQA benchmark represents a challenging task for next-generation QA systems. This task is part of the open competitions at the 22nd International Semantic Web Conference 2023 as the Scholarly Question Answering over Linked Data (QALD) Challenge.",
author = "S{\"o}ren Auer and Barone, {Dante A.C.} and Cassiano Bartz and Cortes, {Eduardo G.} and Jaradeh, {Mohamad Yaser} and Oliver Karras and Manolis Koubarakis and Dmitry Mouromtsev and Dmitrii Pliukhin and Daniil Radyush and Ivan Shilin and Markus Stocker and Eleni Tsalapati",
note = "Funding Information: This work was co-funded by the European Research Council for the project ScienceGRAPH (Grant agreement ID: 819536), by the German Federal Ministry of Education and Research (BMBF) under the project LeibnizKILabor (Grant no. 01DD20003), and by the German Research Foundation (DFG) for NFDI4Ing (No. 442146713) and NFDI4DataScience (No. 460234259). It has also received funding from the European Union{\textquoteright}s Horizon 2020 research and innovation programme under the Marie Sk{\l}odowska-Curie Grant agreement No. 101032307. It is also financed in part by the Coordena{\c c}{\~a}o de Aperfei{\c c}oamento de Pessoal de N{\'i}vel Superior-Brasil (CAPES)-Finance Code 001.",
year = "2023",
month = may,
day = "4",
doi = "10.1038/s41598-023-33607-z",
language = "English",
volume = "13",
journal = "Scientific reports",
issn = "2045-2322",
publisher = "Nature Publishing Group",
}

RIS

TY - JOUR

T1 - The SciQA Scientific Question Answering Benchmark for Scholarly Knowledge

AU - Auer, Sören

AU - Barone, Dante A.C.

AU - Bartz, Cassiano

AU - Cortes, Eduardo G.

AU - Jaradeh, Mohamad Yaser

AU - Karras, Oliver

AU - Koubarakis, Manolis

AU - Mouromtsev, Dmitry

AU - Pliukhin, Dmitrii

AU - Radyush, Daniil

AU - Shilin, Ivan

AU - Stocker, Markus

AU - Tsalapati, Eleni

N1 - Funding Information: This work was co-funded by the European Research Council for the project ScienceGRAPH (Grant agreement ID: 819536), by the German Federal Ministry of Education and Research (BMBF) under the project LeibnizKILabor (Grant no. 01DD20003), and by the German Research Foundation (DFG) for NFDI4Ing (No. 442146713) and NFDI4DataScience (No. 460234259). It has also received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie Grant agreement No. 101032307. It is also financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior-Brasil (CAPES)-Finance Code 001.

PY - 2023/5/4

Y1 - 2023/5/4

N2 - Knowledge graphs have gained increasing popularity in the last decade in science and technology. However, knowledge graphs are currently relatively simple-to-moderate semantic structures that are mainly collections of factual statements. Question answering (QA) benchmarks and systems have so far mainly been geared towards encyclopedic knowledge graphs such as DBpedia and Wikidata. We present SciQA, a scientific QA benchmark for scholarly knowledge. The benchmark leverages the Open Research Knowledge Graph (ORKG), which includes almost 170,000 resources describing research contributions of almost 15,000 scholarly articles from 709 research fields. Following a bottom-up methodology, we first manually developed a set of 100 complex questions that can be answered using this knowledge graph. Furthermore, we devised eight question templates with which we automatically generated a further 2,465 questions that can also be answered using the ORKG. The questions cover a range of research fields and question types and are translated into corresponding SPARQL queries over the ORKG. Based on two preliminary evaluations, we show that the resulting SciQA benchmark represents a challenging task for next-generation QA systems. This task is part of the open competitions at the 22nd International Semantic Web Conference 2023 as the Scholarly Question Answering over Linked Data (QALD) Challenge.

AB - Knowledge graphs have gained increasing popularity in the last decade in science and technology. However, knowledge graphs are currently relatively simple-to-moderate semantic structures that are mainly collections of factual statements. Question answering (QA) benchmarks and systems have so far mainly been geared towards encyclopedic knowledge graphs such as DBpedia and Wikidata. We present SciQA, a scientific QA benchmark for scholarly knowledge. The benchmark leverages the Open Research Knowledge Graph (ORKG), which includes almost 170,000 resources describing research contributions of almost 15,000 scholarly articles from 709 research fields. Following a bottom-up methodology, we first manually developed a set of 100 complex questions that can be answered using this knowledge graph. Furthermore, we devised eight question templates with which we automatically generated a further 2,465 questions that can also be answered using the ORKG. The questions cover a range of research fields and question types and are translated into corresponding SPARQL queries over the ORKG. Based on two preliminary evaluations, we show that the resulting SciQA benchmark represents a challenging task for next-generation QA systems. This task is part of the open competitions at the 22nd International Semantic Web Conference 2023 as the Scholarly Question Answering over Linked Data (QALD) Challenge.

UR - http://www.scopus.com/inward/record.url?scp=85157959553&partnerID=8YFLogxK

U2 - 10.1038/s41598-023-33607-z

DO - 10.1038/s41598-023-33607-z

M3 - Article

C2 - 37142627

AN - SCOPUS:85157959553

VL - 13

JO - Scientific reports

JF - Scientific reports

SN - 2045-2322

M1 - 7240

ER -
