Scholarly knowledge reuse leveraging knowledge graphs

Research output: Thesis › Doctoral thesis

Authors

  • Muhammad Haris

Research Organisations

Details

Original language: English
Qualification: Doctor rerum naturalium
Awarding Institution: Leibniz University Hannover
Supervised by
Date of Award: 29 Oct 2024
Place of Publication: Hannover
Publication status: Published - 5 Nov 2024

Abstract

The invention of the World Wide Web (WWW) has enabled the widespread publication of scholarly knowledge, primarily in the form of research articles. Despite improved access to these publications, scholarly communication remains largely document-based. This leads to inefficient access and marginal utilization of scholarly knowledge, underscoring the need for more efficient methods of disseminating and retrieving such knowledge. Automating the structured representation of scholarly knowledge presented in research articles is challenging because the content is unstructured. Consequently, traditional information retrieval systems are inadequate for machine-based exploration and reuse of scholarly knowledge. It is also crucial to address data integration and interoperability challenges, as they significantly affect the reusability of scholarly knowledge. Leveraging the Open Research Knowledge Graph (ORKG), a scholarly infrastructure supporting the production, curation, and reuse of FAIR (Findable, Accessible, Interoperable, and Reusable) scholarly knowledge, we present different approaches for systematically extracting, enriching, and querying scholarly knowledge. First, we propose an approach for automatically extracting scholarly knowledge from published software packages by static analysis of their metadata and contents (scripts and data) and populating a scholarly knowledge graph with the extracted knowledge. Our approach mines scientific software packages linked to article publications by extracting metadata and analyzing the Abstract Syntax Tree (AST) of the source code to obtain information about the data used and produced as well as the operations performed on the data. The resulting structured knowledge interrelates articles, software package metadata, and the computational techniques applied to the input data used as materials in research work.
Second, we propose an approach for representing, publishing, and using information about instruments and associated scholarly artefacts, extracted from various data sources. Our approach extracts heterogeneous information about instruments from different data sources and retrieves the artefacts produced by these instruments. The resulting structured knowledge serves as a foundation for exploring and gaining a deeper understanding of the use and role of instruments in research. Third, we propose the DOI-based persistent identification of ORKG artefacts (Papers and Comparisons). This makes ORKG data citable and discoverable in global scholarly infrastructures (e.g., DataCite, OpenAIRE, ORCID). Fourth, we propose a generic approach for linking ORKG content to third-party semantic resources (e.g., taxonomies, thesauri, ontologies). Such linking increases interoperability and facilitates the reuse of scholarly knowledge, primarily by removing ambiguity. Finally, we present a GraphQL-based federated query service for executing distributed queries on multiple scholarly infrastructures (specifically, ORKG, DataCite, OpenAIRE, and Wikidata), thus enabling the integrated retrieval of scholarly content from these infrastructures. In summary, our proposed approaches for populating, enriching, and querying scholarly knowledge graphs amount to an important and impactful contribution towards FAIR scholarly knowledge in 21st-century scholarly infrastructures.
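
The AST-based static analysis mentioned in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names, the READ_CALLS and WRITE_CALLS sets, and the sample script are all hypothetical and chosen only for illustration. The sketch uses Python's standard ast module to list the files a script reads, the files it writes, and the other operations it calls, without executing the script.

```python
import ast

# Hypothetical catalogues of call names treated as data input/output.
# A real analysis would need a larger, library-aware list and would have
# to inspect arguments such as the mode passed to open().
READ_CALLS = {"read_csv", "read_excel", "load"}
WRITE_CALLS = {"to_csv", "to_excel", "save", "savefig"}


def call_name(node: ast.Call) -> str:
    """Return the trailing name of a call, e.g. 'read_csv' for pd.read_csv(...)."""
    func = node.func
    if isinstance(func, ast.Attribute):
        return func.attr
    if isinstance(func, ast.Name):
        return func.id
    return ""


def first_string_arg(node: ast.Call):
    """Return the first positional argument if it is a string literal, else None."""
    if node.args:
        arg = node.args[0]
        if isinstance(arg, ast.Constant) and isinstance(arg.value, str):
            return arg.value
    return None


def extract_data_usage(source: str) -> dict:
    """Statically collect files read, files written, and other calls made."""
    used, produced, operations = [], [], set()
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        name = call_name(node)
        path = first_string_arg(node)
        if name in READ_CALLS and path is not None:
            used.append(path)
        elif name in WRITE_CALLS and path is not None:
            produced.append(path)
        elif name:
            operations.add(name)
    return {"used": used, "produced": produced, "operations": sorted(operations)}


# Hypothetical analysis script; it is only parsed, never executed.
SAMPLE = '''
import pandas as pd
df = pd.read_csv("measurements.csv")
summary = df.groupby("site").mean()
summary.to_csv("summary.csv")
'''

result = extract_data_usage(SAMPLE)
print(result)
```

Because the script is parsed rather than run, the approach works even when the script's dependencies or data are unavailable. The trade-off is that dynamically computed file names are not captured by this sketch.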

Cite this

Scholarly knowledge reuse leveraging knowledge graphs. / Haris, Muhammad.
Hannover, 2024. 142 p.

Haris, M 2024, 'Scholarly knowledge reuse leveraging knowledge graphs', Doctor rerum naturalium, Leibniz University Hannover, Hannover. https://doi.org/10.15488/18101
Haris, M. (2024). Scholarly knowledge reuse leveraging knowledge graphs. [Doctoral thesis, Leibniz University Hannover]. https://doi.org/10.15488/18101
Haris M. Scholarly knowledge reuse leveraging knowledge graphs. Hannover, 2024. 142 p. doi: 10.15488/18101
Haris, Muhammad. / Scholarly knowledge reuse leveraging knowledge graphs. Hannover, 2024. 142 p.
@phdthesis{968db5733a844ded97ee62843f89dc28,
title = "Scholarly knowledge reuse leveraging knowledge graphs",
abstract = "The invention of the World Wide Web (WWW) has enabled the widespread publication of scholarly knowledge, primarily in the form of research articles. Despite improved access to these publications, scholarly communication remains largely document-based. This trend leads to inefficient access and marginal utilization of scholarly knowledge, underscoring the need for more efficient methodologies in the dissemination and retrieval of such knowledge. Automating the structured representation of scholarly knowledge presented in research articles is challenging due to the unstructured nature of the content. Consequently, traditional information retrieval systems have become inadequate for machine-based exploration and reuse of scholarly knowledge. It is also crucial to address data integration, and interoperability challenges, as they are significant factors that affect the reusability of scholarly knowledge. Leveraging the Open Research Knowledge Graph (ORKG)--- a scholarly infrastructure supporting the production, curation and reuse of FAIR (Findable, Accessible, Interoperable, and Reusable) scholarly knowledge--- we present different approaches for systematically extracting, enriching, and querying scholarly knowledge. First, we propose an approach for automatically extracting scholarly knowledge from published software packages by static analysis of their metadata and contents (scripts and data) and populating a scholarly knowledge graph with the extracted knowledge. Our approach is based on mining scientific software packages linked to article publications by extracting metadata and analyzing the Abstract Syntax Tree (AST) of the source code to obtain information about the used and produced data as well as operations performed on data. The resulting structured knowledge interrelates articles, software packages metadata, and computational techniques applied to input data utilized as materials in research work. 
Second, we propose an approach for representing, publishing, and using information, extracted from various data sources, about instruments and associated scholarly artefacts. Our approach extracts heterogeneous information about instruments from different data sources as well as retrieves the artefacts that have been produced by these instruments. The resulting structured knowledge serves as a foundation for exploring and gaining a deeper understanding of the use and role of instruments in research. Third, we propose the DOI-based persistent identification of ORKG artefacts (Papers and Comparisons). This enables ORKG data citability and discovery in global scholarly infrastructures (e.g., DataCite, OpenAIRE, ORCID). Fourth, we propose a generic approach for linking ORKG content to third-party semantic resources (e.g., taxonomies, thesauri, ontologies). Such linking increases the interoperability and facilitates the reuse of scholarly knowledge, primarily by removing ambiguity. Finally, we present a GraphQL-based federated query service for executing distributed queries on multiple scholarly infrastructures (specifically, ORKG, DataCite, OpenAIRE and Wikidata), thus enabling the integrated retrieval of scholarly content from these infrastructures. In summary, our proposed approaches for populating, enriching, and querying scholarly knowledge graphs amount to an important and impactful contribution towards FAIR scholarly knowledge in 21st century scholarly infrastructures.",
author = "Muhammad Haris",
year = "2024",
month = nov,
day = "5",
doi = "10.15488/18101",
language = "English",
school = "Leibniz University Hannover",

}

TY - BOOK

T1 - Scholarly knowledge reuse leveraging knowledge graphs

AU - Haris, Muhammad

PY - 2024/11/5

Y1 - 2024/11/5

N2 - The invention of the World Wide Web (WWW) has enabled the widespread publication of scholarly knowledge, primarily in the form of research articles. Despite improved access to these publications, scholarly communication remains largely document-based. This trend leads to inefficient access and marginal utilization of scholarly knowledge, underscoring the need for more efficient methodologies in the dissemination and retrieval of such knowledge. Automating the structured representation of scholarly knowledge presented in research articles is challenging due to the unstructured nature of the content. Consequently, traditional information retrieval systems have become inadequate for machine-based exploration and reuse of scholarly knowledge. It is also crucial to address data integration, and interoperability challenges, as they are significant factors that affect the reusability of scholarly knowledge. Leveraging the Open Research Knowledge Graph (ORKG)--- a scholarly infrastructure supporting the production, curation and reuse of FAIR (Findable, Accessible, Interoperable, and Reusable) scholarly knowledge--- we present different approaches for systematically extracting, enriching, and querying scholarly knowledge. First, we propose an approach for automatically extracting scholarly knowledge from published software packages by static analysis of their metadata and contents (scripts and data) and populating a scholarly knowledge graph with the extracted knowledge. Our approach is based on mining scientific software packages linked to article publications by extracting metadata and analyzing the Abstract Syntax Tree (AST) of the source code to obtain information about the used and produced data as well as operations performed on data. The resulting structured knowledge interrelates articles, software packages metadata, and computational techniques applied to input data utilized as materials in research work. 
Second, we propose an approach for representing, publishing, and using information, extracted from various data sources, about instruments and associated scholarly artefacts. Our approach extracts heterogeneous information about instruments from different data sources as well as retrieves the artefacts that have been produced by these instruments. The resulting structured knowledge serves as a foundation for exploring and gaining a deeper understanding of the use and role of instruments in research. Third, we propose the DOI-based persistent identification of ORKG artefacts (Papers and Comparisons). This enables ORKG data citability and discovery in global scholarly infrastructures (e.g., DataCite, OpenAIRE, ORCID). Fourth, we propose a generic approach for linking ORKG content to third-party semantic resources (e.g., taxonomies, thesauri, ontologies). Such linking increases the interoperability and facilitates the reuse of scholarly knowledge, primarily by removing ambiguity. Finally, we present a GraphQL-based federated query service for executing distributed queries on multiple scholarly infrastructures (specifically, ORKG, DataCite, OpenAIRE and Wikidata), thus enabling the integrated retrieval of scholarly content from these infrastructures. In summary, our proposed approaches for populating, enriching, and querying scholarly knowledge graphs amount to an important and impactful contribution towards FAIR scholarly knowledge in 21st century scholarly infrastructures.

U2 - 10.15488/18101

DO - 10.15488/18101

M3 - Doctoral thesis

CY - Hannover

ER -
