Better Call the Plumber: Orchestrating Dynamic Information Extraction Pipelines

Research output: Chapter in book/report/conference proceeding › Conference contribution › Research › peer review

Authors

  • Mohamad Yaser Jaradeh
  • Kuldeep Singh
  • Markus Stocker
  • Andreas Both
  • Sören Auer

Research Organisations

External Research Organisations

  • German National Library of Science and Technology (TIB)
  • Anhalt University of Applied Sciences
  • Zerotha-Research and Cerence GmbH

Details

Original language: English
Title of host publication: Web Engineering
Subtitle of host publication: 21st International Conference, ICWE 2021, Biarritz, France, May 18–21, 2021, Proceedings
Editors: Marco Brambilla, Richard Chbeir, Flavius Frasincar, Ioana Manolescu
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 240-254
Number of pages: 15
ISBN (print): 9783030742959
Publication status: Published - 2021
Event: 21st International Conference on Web Engineering, ICWE 2021 - Virtual, Online
Duration: 18 May 2021 to 21 May 2021

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 12706 LNCS
ISSN (Print): 0302-9743
ISSN (electronic): 1611-3349

Abstract

We propose Plumber, the first framework that brings together the research community’s disjoint information extraction (IE) efforts. The Plumber architecture comprises 33 reusable components for various Knowledge Graph (KG) information extraction subtasks, such as coreference resolution, entity linking, and relation extraction. Using these components, Plumber dynamically generates suitable information extraction pipelines and offers 264 distinct pipelines overall. We study the optimization problem of choosing suitable pipelines based on input sentences. To do so, we train a transformer-based classification model that extracts contextual embeddings from the input and finds an appropriate pipeline. We study the efficacy of Plumber for extracting KG triples using standard datasets over two KGs: DBpedia and the Open Research Knowledge Graph (ORKG). Our results demonstrate the effectiveness of Plumber in dynamically generating KG information extraction pipelines, outperforming all baselines agnostic of the underlying KG. Furthermore, we provide an analysis of collective failure cases, study the similarities and synergies among integrated components, and discuss their limitations.
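
As a rough illustration of how one component per subtask can be combined into distinct pipelines, the minimal Python sketch below enumerates every combination for three stages. The component names and counts are hypothetical placeholders, not the 33 components or 264 pipelines reported in the abstract, and the toy components only record their names rather than performing any extraction. The transformer-based pipeline selection described above is not reproduced; the sketch simply picks the first enumerated pipeline.

```python
from itertools import product

# Hypothetical stage -> component names; NOT the actual Plumber components.
STAGES = {
    "coreference_resolution": ["coref_a", "coref_b"],
    "entity_linking": ["linker_a", "linker_b", "linker_c"],
    "relation_extraction": ["rel_a", "rel_b"],
}


def enumerate_pipelines(stages):
    """Return every pipeline as a tuple holding one component per stage."""
    return list(product(*stages.values()))


def run_pipeline(pipeline, sentence):
    """Apply the components in order; each toy component only logs its name."""
    state = {"text": sentence, "trace": []}
    for name in pipeline:
        state["trace"].append(name)
    return state


pipelines = enumerate_pipelines(STAGES)
# Stand-in for the learned selector: take the first pipeline.
result = run_pipeline(pipelines[0], "Plumber extracts triples.")
print(len(pipelines), result["trace"])
```

With 2, 3, and 2 choices per stage, the sketch yields 2 × 3 × 2 = 12 pipelines; a real selector would replace the fixed index with a model prediction for each input sentence.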

Keywords

    Information extraction, NLP pipelines, Semantic search, Semantic Web, Software reusability

Cite this

Better Call the Plumber: Orchestrating Dynamic Information Extraction Pipelines. / Jaradeh, Mohamad Yaser; Singh, Kuldeep; Stocker, Markus et al.
Web Engineering: 21st International Conference, ICWE 2021, Biarritz, France, May 18–21, 2021, Proceedings. ed. / Marco Brambilla; Richard Chbeir; Flavius Frasincar; Ioana Manolescu. Springer Science and Business Media Deutschland GmbH, 2021. p. 240-254 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 12706 LNCS).

Jaradeh, MY, Singh, K, Stocker, M, Both, A & Auer, S 2021, Better Call the Plumber: Orchestrating Dynamic Information Extraction Pipelines. in M Brambilla, R Chbeir, F Frasincar & I Manolescu (eds), Web Engineering: 21st International Conference, ICWE 2021, Biarritz, France, May 18–21, 2021, Proceedings. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 12706 LNCS, Springer Science and Business Media Deutschland GmbH, pp. 240-254, 21st International Conference on Web Engineering, ICWE 2021, Virtual, Online, 18 May 2021. https://doi.org/10.1007/978-3-030-74296-6_19
Jaradeh, M. Y., Singh, K., Stocker, M., Both, A., & Auer, S. (2021). Better Call the Plumber: Orchestrating Dynamic Information Extraction Pipelines. In M. Brambilla, R. Chbeir, F. Frasincar, & I. Manolescu (Eds.), Web Engineering: 21st International Conference, ICWE 2021, Biarritz, France, May 18–21, 2021, Proceedings (pp. 240-254). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 12706 LNCS). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-030-74296-6_19
Jaradeh MY, Singh K, Stocker M, Both A, Auer S. Better Call the Plumber: Orchestrating Dynamic Information Extraction Pipelines. In Brambilla M, Chbeir R, Frasincar F, Manolescu I, editors, Web Engineering: 21st International Conference, ICWE 2021, Biarritz, France, May 18–21, 2021, Proceedings. Springer Science and Business Media Deutschland GmbH. 2021. p. 240-254. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)). Epub 2021 May 11. doi: 10.1007/978-3-030-74296-6_19
Jaradeh, Mohamad Yaser; Singh, Kuldeep; Stocker, Markus et al. / Better Call the Plumber: Orchestrating Dynamic Information Extraction Pipelines. Web Engineering: 21st International Conference, ICWE 2021, Biarritz, France, May 18–21, 2021, Proceedings. editor / Marco Brambilla; Richard Chbeir; Flavius Frasincar; Ioana Manolescu. Springer Science and Business Media Deutschland GmbH, 2021. pp. 240-254 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).
@inproceedings{6df4b5e334ab48aaabb0bf8323059def,
title = "Better Call the Plumber: Orchestrating Dynamic Information Extraction Pipelines",
abstract = "We propose Plumber, the first framework that brings together the research community{\textquoteright}s disjoint information extraction (IE) efforts. The Plumber architecture comprises 33 reusable components for various Knowledge Graph (KG) information extraction subtasks, such as coreference resolution, entity linking, and relation extraction. Using these components, Plumber dynamically generates suitable information extraction pipelines and offers 264 distinct pipelines overall. We study the optimization problem of choosing suitable pipelines based on input sentences. To do so, we train a transformer-based classification model that extracts contextual embeddings from the input and finds an appropriate pipeline. We study the efficacy of Plumber for extracting KG triples using standard datasets over two KGs: DBpedia and the Open Research Knowledge Graph (ORKG). Our results demonstrate the effectiveness of Plumber in dynamically generating KG information extraction pipelines, outperforming all baselines agnostic of the underlying KG. Furthermore, we provide an analysis of collective failure cases, study the similarities and synergies among integrated components, and discuss their limitations.",
keywords = "Information extraction, NLP pipelines, Semantic search, Semantic Web, Software reusability",
author = "Jaradeh, {Mohamad Yaser} and Kuldeep Singh and Markus Stocker and Andreas Both and S{\"o}ren Auer",
note = "Funding Information: Acknowledgements. This work was co-funded by the European Research Council for the project ScienceGRAPH (Grant agreement ID: 819536) and the TIB Leibniz Information Centre for Science and Technology. ; 21st International Conference on Web Engineering, ICWE 2021 ; Conference date: 18-05-2021 Through 21-05-2021",
year = "2021",
doi = "10.1007/978-3-030-74296-6_19",
language = "English",
isbn = "9783030742959",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
publisher = "Springer Science and Business Media Deutschland GmbH",
pages = "240--254",
editor = "Marco Brambilla and Richard Chbeir and Flavius Frasincar and Ioana Manolescu",
booktitle = "Web Engineering",
address = "Germany",

}

TY - GEN

T1 - Better Call the Plumber

T2 - 21st International Conference on Web Engineering, ICWE 2021

AU - Jaradeh, Mohamad Yaser

AU - Singh, Kuldeep

AU - Stocker, Markus

AU - Both, Andreas

AU - Auer, Sören

N1 - Funding Information: Acknowledgements. This work was co-funded by the European Research Council for the project ScienceGRAPH (Grant agreement ID: 819536) and the TIB Leibniz Information Centre for Science and Technology.

PY - 2021

Y1 - 2021

AB - We propose Plumber, the first framework that brings together the research community’s disjoint information extraction (IE) efforts. The Plumber architecture comprises 33 reusable components for various Knowledge Graph (KG) information extraction subtasks, such as coreference resolution, entity linking, and relation extraction. Using these components, Plumber dynamically generates suitable information extraction pipelines and offers 264 distinct pipelines overall. We study the optimization problem of choosing suitable pipelines based on input sentences. To do so, we train a transformer-based classification model that extracts contextual embeddings from the input and finds an appropriate pipeline. We study the efficacy of Plumber for extracting KG triples using standard datasets over two KGs: DBpedia and the Open Research Knowledge Graph (ORKG). Our results demonstrate the effectiveness of Plumber in dynamically generating KG information extraction pipelines, outperforming all baselines agnostic of the underlying KG. Furthermore, we provide an analysis of collective failure cases, study the similarities and synergies among integrated components, and discuss their limitations.

KW - Information extraction

KW - NLP pipelines

KW - Semantic search

KW - Semantic Web

KW - Software reusability

UR - http://www.scopus.com/inward/record.url?scp=85111157954&partnerID=8YFLogxK

U2 - 10.1007/978-3-030-74296-6_19

DO - 10.1007/978-3-030-74296-6_19

M3 - Conference contribution

AN - SCOPUS:85111157954

SN - 9783030742959

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 240

EP - 254

BT - Web Engineering

A2 - Brambilla, Marco

A2 - Chbeir, Richard

A2 - Frasincar, Flavius

A2 - Manolescu, Ioana

PB - Springer Science and Business Media Deutschland GmbH

Y2 - 18 May 2021 through 21 May 2021

ER -
