Scholarly Knowledge Graph Construction from Published Software Packages

Research output: Chapter in book/report/conference proceeding; Conference contribution; Research; peer review

Authors

  • Muhammad Haris
  • Sören Auer
  • Markus Stocker

Research Organisations

External Research Organisations

  • German National Library of Science and Technology (TIB)

Details

Original language: English
Title of host publication: Leveraging Generative Intelligence in Digital Libraries
Subtitle of host publication: Towards Human-Machine Collaboration
Editors: Dion H. Goh, Shu-Jiun Chen, Suppawong Tuarob
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 170-179
Number of pages: 10
ISBN (electronic): 978-981-99-8088-8
ISBN (print): 9789819980871
Publication status: Published - 30 Nov 2023
Event: 25th International Conference on Asia-Pacific Digital Libraries, ICADL 2023 - Taipei, Taiwan
Duration: 4 Dec 2023 to 7 Dec 2023

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 14458 LNNS
ISSN (Print): 0302-9743
ISSN (electronic): 1611-3349

Abstract

The value of structured scholarly knowledge for research and society at large is well understood, but producing scholarly knowledge (i.e., knowledge traditionally published in articles) in structured form remains a challenge. We propose an approach for automatically extracting scholarly knowledge from published software packages by static analysis of their metadata and contents (scripts and data) and populating a scholarly knowledge graph with the extracted knowledge. Our approach is based on mining scientific software packages linked to article publications by extracting metadata and analyzing the Abstract Syntax Tree (AST) of the source code to obtain information about the used and produced data as well as operations performed on data. The resulting knowledge graph includes articles, software packages metadata, and computational techniques applied to input data utilized as materials in research work. The knowledge graph also includes the results reported as scholarly knowledge in articles. Our code is available on GitHub at the following link: https://github.com/mharis111/parse-software-scripts.
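
The static-analysis step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation (that is in the linked repository). It assumes the analyzed script is Python and uses the standard-library ast module. The READERS and WRITERS tables, the predicate names (usesData, producesData, appliesOperation), and the sample script are hypothetical and chosen only for illustration.

```python
import ast

# Hypothetical lookup tables: call names treated as reading or writing data.
READERS = {"read_csv", "read_excel", "loadtxt"}
WRITERS = {"to_csv", "savetxt", "savefig"}


def call_name(node: ast.Call) -> str:
    """Return the rightmost name of a call, e.g. 'read_csv' for pd.read_csv(...)."""
    func = node.func
    if isinstance(func, ast.Attribute):
        return func.attr
    if isinstance(func, ast.Name):
        return func.id
    return ""


def first_string_arg(node: ast.Call):
    """Return the first positional argument if it is a string literal, else None."""
    if node.args:
        arg = node.args[0]
        if isinstance(arg, ast.Constant) and isinstance(arg.value, str):
            return arg.value
    return None


def analyze(source: str) -> dict:
    """Collect input files, output files, and other calls without running the script."""
    found = {"inputs": [], "outputs": [], "operations": []}
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        name = call_name(node)
        path = first_string_arg(node)
        if name in READERS and path is not None:
            found["inputs"].append(path)
        elif name in WRITERS and path is not None:
            found["outputs"].append(path)
        elif name:
            found["operations"].append(name)
    return found


def to_triples(script: str, found: dict) -> list:
    """Turn the analysis result into (subject, predicate, object) triples.

    The predicate names are placeholders, not a real graph vocabulary.
    """
    triples = [(script, "usesData", p) for p in found["inputs"]]
    triples += [(script, "producesData", p) for p in found["outputs"]]
    triples += [(script, "appliesOperation", op) for op in found["operations"]]
    return triples


# Hypothetical script standing in for a file from a published software package.
SCRIPT = '''
import pandas as pd
df = pd.read_csv("measurements.csv")
summary = df.groupby("site").mean()
summary.to_csv("summary.csv")
'''

found = analyze(SCRIPT)
triples = to_triples("analysis.py", found)
for triple in triples:
    print(triple)
```

Because the script is only parsed and never executed, this kind of analysis can only recover file names that appear as string literals. Paths built at run time would need additional handling.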

Keywords

    Abstract Syntax Tree, Analyzing Software Packages, Code Analysis, Machine Actionability, Open Research Knowledge Graph, Scholarly Communication

Cite this

Scholarly Knowledge Graph Construction from Published Software Packages. / Haris, Muhammad; Auer, Sören; Stocker, Markus.
Leveraging Generative Intelligence in Digital Libraries: Towards Human-Machine Collaboration. ed. / Dion H. Goh; Shu-Jiun Chen; Suppawong Tuarob. Springer Science and Business Media Deutschland GmbH, 2023. p. 170-179 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 14458 LNNS).

Haris, M, Auer, S & Stocker, M 2023, Scholarly Knowledge Graph Construction from Published Software Packages. in DH Goh, S-J Chen & S Tuarob (eds), Leveraging Generative Intelligence in Digital Libraries: Towards Human-Machine Collaboration. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 14458 LNNS, Springer Science and Business Media Deutschland GmbH, pp. 170-179, 25th International Conference on Asia-Pacific Digital Libraries, ICADL 2023, Taipei, Taiwan, 4 Dec 2023. https://doi.org/10.48550/arXiv.2312.01065, https://doi.org/10.1007/978-981-99-8088-8_15
Haris, M., Auer, S., & Stocker, M. (2023). Scholarly Knowledge Graph Construction from Published Software Packages. In D. H. Goh, S.-J. Chen, & S. Tuarob (Eds.), Leveraging Generative Intelligence in Digital Libraries: Towards Human-Machine Collaboration (pp. 170-179). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 14458 LNNS). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.48550/arXiv.2312.01065, https://doi.org/10.1007/978-981-99-8088-8_15
Haris M, Auer S, Stocker M. Scholarly Knowledge Graph Construction from Published Software Packages. In Goh DH, Chen SJ, Tuarob S, editors, Leveraging Generative Intelligence in Digital Libraries: Towards Human-Machine Collaboration. Springer Science and Business Media Deutschland GmbH. 2023. p. 170-179. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)). Epub 2023 Nov 29. doi: 10.48550/arXiv.2312.01065, 10.1007/978-981-99-8088-8_15
Haris, Muhammad ; Auer, Sören ; Stocker, Markus. / Scholarly Knowledge Graph Construction from Published Software Packages. Leveraging Generative Intelligence in Digital Libraries: Towards Human-Machine Collaboration. editor / Dion H. Goh ; Shu-Jiun Chen ; Suppawong Tuarob. Springer Science and Business Media Deutschland GmbH, 2023. pp. 170-179 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).
@inproceedings{4780bfa3943a4ef69b07b17e3c632cf5,
title = "Scholarly Knowledge Graph Construction from Published Software Packages",
abstract = "The value of structured scholarly knowledge for research and society at large is well understood, but producing scholarly knowledge (i.e., knowledge traditionally published in articles) in structured form remains a challenge. We propose an approach for automatically extracting scholarly knowledge from published software packages by static analysis of their metadata and contents (scripts and data) and populating a scholarly knowledge graph with the extracted knowledge. Our approach is based on mining scientific software packages linked to article publications by extracting metadata and analyzing the Abstract Syntax Tree (AST) of the source code to obtain information about the used and produced data as well as operations performed on data. The resulting knowledge graph includes articles, software packages metadata, and computational techniques applied to input data utilized as materials in research work. The knowledge graph also includes the results reported as scholarly knowledge in articles. Our code is available on GitHub at the following link: https://github.com/mharis111/parse-software-scripts.",
keywords = "Abstract Syntax Tree, Analyzing Software Packages, Code Analysis, Machine Actionability, Open Research Knowledge Graph, Scholarly Communication",
author = "Muhammad Haris and S{\"o}ren Auer and Markus Stocker",
note = "Funding Information: This work was co-funded by the European Research Council for the project ScienceGRAPH (Grant agreement ID: 819536) and TIB–Leibniz Information Centre for Science and Technology. ; 25th International Conference on Asia-Pacific Digital Libraries, ICADL 2023 ; Conference date: 04-12-2023 Through 07-12-2023",
year = "2023",
month = nov,
day = "30",
doi = "10.48550/arXiv.2312.01065",
language = "English",
isbn = "9789819980871",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
publisher = "Springer Science and Business Media Deutschland GmbH",
pages = "170--179",
editor = "Goh, {Dion H.} and Shu-Jiun Chen and Suppawong Tuarob",
booktitle = "Leveraging Generative Intelligence in Digital Libraries",
address = "Germany",

}

TY - GEN

T1 - Scholarly Knowledge Graph Construction from Published Software Packages

AU - Haris, Muhammad

AU - Auer, Sören

AU - Stocker, Markus

N1 - Funding Information: This work was co-funded by the European Research Council for the project ScienceGRAPH (Grant agreement ID: 819536) and TIB–Leibniz Information Centre for Science and Technology.

PY - 2023/11/30

Y1 - 2023/11/30

AB - The value of structured scholarly knowledge for research and society at large is well understood, but producing scholarly knowledge (i.e., knowledge traditionally published in articles) in structured form remains a challenge. We propose an approach for automatically extracting scholarly knowledge from published software packages by static analysis of their metadata and contents (scripts and data) and populating a scholarly knowledge graph with the extracted knowledge. Our approach is based on mining scientific software packages linked to article publications by extracting metadata and analyzing the Abstract Syntax Tree (AST) of the source code to obtain information about the used and produced data as well as operations performed on data. The resulting knowledge graph includes articles, software packages metadata, and computational techniques applied to input data utilized as materials in research work. The knowledge graph also includes the results reported as scholarly knowledge in articles. Our code is available on GitHub at the following link: https://github.com/mharis111/parse-software-scripts.

KW - Abstract Syntax Tree

KW - Analyzing Software Packages

KW - Code Analysis

KW - Machine Actionability

KW - Open Research Knowledge Graph

KW - Scholarly Communication

UR - http://www.scopus.com/inward/record.url?scp=85180152166&partnerID=8YFLogxK

U2 - 10.48550/arXiv.2312.01065

DO - 10.48550/arXiv.2312.01065

M3 - Conference contribution

AN - SCOPUS:85180152166

SN - 9789819980871

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 170

EP - 179

BT - Leveraging Generative Intelligence in Digital Libraries

A2 - Goh, Dion H.

A2 - Chen, Shu-Jiun

A2 - Tuarob, Suppawong

PB - Springer Science and Business Media Deutschland GmbH

T2 - 25th International Conference on Asia-Pacific Digital Libraries, ICADL 2023

Y2 - 4 December 2023 through 7 December 2023

ER -
