Details
Original language | English |
---|---|
Pages (from-to) | 1989-2016 |
Number of pages | 28 |
Journal | Knowledge and information systems |
Volume | 65 |
Issue number | 5 |
Early online date | 7 Jan 2023 |
Publication status | Published - May 2023 |
Abstract
In the last decade, a large number of knowledge graph (KG) completion approaches were proposed. Albeit effective, these efforts are disjoint, and their collective strengths and weaknesses in effective KG completion have not been studied in the literature. We extend Plumber, a framework that brings together the research community’s disjoint efforts on KG completion. We include more components into the architecture of Plumber to comprise 40 reusable components for various KG completion subtasks, such as coreference resolution, entity linking, and relation extraction. Using these components, Plumber dynamically generates suitable knowledge extraction pipelines and offers overall 432 distinct pipelines. We study the optimization problem of choosing optimal pipelines based on input sentences. To do so, we train a transformer-based classification model that extracts contextual embeddings from the input and finds an appropriate pipeline. We study the efficacy of Plumber for extracting the KG triples using standard datasets over three KGs: DBpedia, Wikidata, and Open Research Knowledge Graph. Our results demonstrate the effectiveness of Plumber in dynamically generating KG completion pipelines, outperforming all baselines agnostic of the underlying KG. Furthermore, we provide an analysis of collective failure cases, study the similarities and synergies among integrated components and discuss their limitations.
Keywords
- Information extraction, NLP pipelines, Semantic search, Semantic web, Software reusability
ASJC Scopus subject areas
- Computer Science(all)
- Software
- Computer Science(all)
- Information Systems
- Computer Science(all)
- Human-Computer Interaction
- Computer Science(all)
- Hardware and Architecture
- Computer Science(all)
- Artificial Intelligence
Cite this
- Standard
- Harvard
- Apa
- Vancouver
- BibTeX
- RIS
In: Knowledge and information systems, Vol. 65, No. 5, 05.2023, p. 1989-2016.
Research output: Contribution to journal › Article › Research › peer review
}
TY - JOUR
T1 - Information extraction pipelines for knowledge graphs
AU - Jaradeh, Mohamad Yaser
AU - Singh, Kuldeep
AU - Stocker, Markus
AU - Both, Andreas
AU - Auer, Sören
N1 - Funding Information: We thank anonymous reviewers for their very useful comments and suggestions. This work was co-funded by the European Research Council for the project ScienceGRAPH (Grant agreement ID: 819536) and the TIB Leibniz Information Centre for Science and Technology. We also thank Allard Oelen and Vitalis Wiens for their valuable feedback.
PY - 2023/5
Y1 - 2023/5
N2 - In the last decade, a large number of knowledge graph (KG) completion approaches were proposed. Albeit effective, these efforts are disjoint, and their collective strengths and weaknesses in effective KG completion have not been studied in the literature. We extend Plumber, a framework that brings together the research community’s disjoint efforts on KG completion. We include more components into the architecture of Plumber to comprise 40 reusable components for various KG completion subtasks, such as coreference resolution, entity linking, and relation extraction. Using these components, Plumber dynamically generates suitable knowledge extraction pipelines and offers overall 432 distinct pipelines. We study the optimization problem of choosing optimal pipelines based on input sentences. To do so, we train a transformer-based classification model that extracts contextual embeddings from the input and finds an appropriate pipeline. We study the efficacy of Plumber for extracting the KG triples using standard datasets over three KGs: DBpedia, Wikidata, and Open Research Knowledge Graph. Our results demonstrate the effectiveness of Plumber in dynamically generating KG completion pipelines, outperforming all baselines agnostic of the underlying KG. Furthermore, we provide an analysis of collective failure cases, study the similarities and synergies among integrated components and discuss their limitations.
AB - In the last decade, a large number of knowledge graph (KG) completion approaches were proposed. Albeit effective, these efforts are disjoint, and their collective strengths and weaknesses in effective KG completion have not been studied in the literature. We extend Plumber, a framework that brings together the research community’s disjoint efforts on KG completion. We include more components into the architecture of Plumber to comprise 40 reusable components for various KG completion subtasks, such as coreference resolution, entity linking, and relation extraction. Using these components, Plumber dynamically generates suitable knowledge extraction pipelines and offers overall 432 distinct pipelines. We study the optimization problem of choosing optimal pipelines based on input sentences. To do so, we train a transformer-based classification model that extracts contextual embeddings from the input and finds an appropriate pipeline. We study the efficacy of Plumber for extracting the KG triples using standard datasets over three KGs: DBpedia, Wikidata, and Open Research Knowledge Graph. Our results demonstrate the effectiveness of Plumber in dynamically generating KG completion pipelines, outperforming all baselines agnostic of the underlying KG. Furthermore, we provide an analysis of collective failure cases, study the similarities and synergies among integrated components and discuss their limitations.
KW - Information extraction
KW - NLP pipelines
KW - Semantic search
KW - Semantic web
KW - Software reusability
UR - http://www.scopus.com/inward/record.url?scp=85145703568&partnerID=8YFLogxK
U2 - 10.1007/s10115-022-01826-x
DO - 10.1007/s10115-022-01826-x
M3 - Article
AN - SCOPUS:85145703568
VL - 65
SP - 1989
EP - 2016
JO - Knowledge and information systems
JF - Knowledge and information systems
SN - 0219-1377
IS - 5
ER -