Details
Original language | English |
---|---|
Title of host publication | Web Information Systems Engineering |
Subtitle | WISE 2024 - 25th International Conference, Proceedings |
Editors | Mahmoud Barhamgi, Hua Wang, Xin Wang |
Publisher | Springer Science and Business Media Deutschland GmbH |
Pages | 467-483 |
Number of pages | 17 |
ISBN (electronic) | 978-981-96-0567-5 |
ISBN (print) | 9789819605668 |
Publication status | Published - 2025 |
Event | 25th International Conference on Web Information Systems Engineering, WISE 2024 - Doha, Qatar. Duration: 2 Dec 2024 → 5 Dec 2024 |
Publication series
Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
---|---|
Volume | 15437 LNCS |
ISSN (print) | 0302-9743 |
ISSN (electronic) | 1611-3349 |
Abstract
Causal inference is used in various domains such as healthcare, economics, and political science to infer causal effects from observational data where each unit (entity) has different properties. Existing approaches often assume data completeness, and thus exclude all units with incomplete data when performing causal inference, which can lead to inaccurate causal estimates. In addition, existing approaches follow the Closed World Assumption, where facts not present in the database are assumed to be false, limiting the ability to reason under a data incompleteness assumption. Knowledge graphs (KGs) are data structures that represent data in semi-structured formats and model the meaning of data via ontologies. We propose a method, SemMatch, based on KGs to enhance causal inference under a data incompleteness assumption. SemMatch relies on a semantic reasoning process, specified by a set of logical rules over KGs, to infer implicit facts and partially address data incompleteness. Then, SemMatch applies machine learning methods to estimate the importance of properties. Finally, SemMatch employs causal estimation methods that consider property importance, facilitating causal reasoning across units with incomplete data to determine the causal effect. We evaluate SemMatch on synthetic datasets and demonstrate that it achieves a lower mean absolute error (MAE) and a lower square root of the precision in estimation of heterogeneous effect (PEHE) in causal effect estimation than existing state-of-the-art methods. The observed results suggest that accounting for semantic reasoning and including units with incomplete data improves causal estimation accuracy.
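The abstract reports results as MAE and the square root of PEHE. The following is a minimal sketch of how such metrics are commonly computed on synthetic data, where the true per-unit causal effects are known. The function names and toy values are hypothetical. Averaging MAE over per-unit effects is an assumption made for illustration and is not taken from the paper, which may define MAE differently (for example, on the average effect).

```python
import math


def mean_absolute_error(true_effects, estimated_effects):
    """Mean of |estimated - true| over per-unit causal effects (assumed definition)."""
    if len(true_effects) != len(estimated_effects) or not true_effects:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(e - t) for t, e in zip(true_effects, estimated_effects))
    return total / len(true_effects)


def sqrt_pehe(true_effects, estimated_effects):
    """Square root of the mean squared error between per-unit effects."""
    if len(true_effects) != len(estimated_effects) or not true_effects:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum((e - t) ** 2 for t, e in zip(true_effects, estimated_effects))
    return math.sqrt(total / len(true_effects))


# Hypothetical toy values: true and estimated effects for four units.
true_effects = [1.0, 2.0, 3.0, 4.0]
estimated_effects = [1.5, 2.0, 2.0, 4.5]

mae = mean_absolute_error(true_effects, estimated_effects)
root_pehe = sqrt_pehe(true_effects, estimated_effects)
print(f"MAE = {mae:.4f}, sqrt(PEHE) = {root_pehe:.4f}")
```

Lower values of both metrics indicate estimates closer to the true effects. Because √PEHE squares the errors before averaging, it penalises large per-unit errors more heavily than MAE does.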
ASJC Scopus subject areas
- Mathematics (all)
- Theoretical Computer Science
- Computer Science (all)
- General Computer Science
Cite
Web Information Systems Engineering: WISE 2024 - 25th International Conference, Proceedings. Ed. / Mahmoud Barhamgi; Hua Wang; Xin Wang. Springer Science and Business Media Deutschland GmbH, 2025. pp. 467-483 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 15437 LNCS).
Publication: Chapter in book/report/conference proceedings › Conference contribution › Research › Peer-reviewed
TY - GEN
T1 - SemMatch
T2 - 25th International Conference on Web Information Systems Engineering, WISE 2024
AU - Huang, Hao
AU - Vidal, Maria Esther
N1 - Publisher Copyright: © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2025.
PY - 2025
Y1 - 2025
N2 - Causal inference is used in various domains such as healthcare, economics, and political science to infer causal effects from observational data where each unit (entity) has different properties. Existing approaches often assume data completeness, and thus exclude all units with incomplete data when performing causal inference, which can lead to inaccurate causal estimates. In addition, existing approaches follow the Closed World Assumption, where facts not present in the database are assumed to be false, limiting the ability to reason under a data incompleteness assumption. Knowledge graphs (KGs) are data structures that represent data in semi-structured formats and model the meaning of data via ontologies. We propose a method, SemMatch, based on KGs to enhance causal inference under a data incompleteness assumption. SemMatch relies on a semantic reasoning process, specified by a set of logical rules over KGs, to infer implicit facts and partially address data incompleteness. Then, SemMatch applies machine learning methods to estimate the importance of properties. Finally, SemMatch employs causal estimation methods that consider property importance, facilitating causal reasoning across units with incomplete data to determine the causal effect. We evaluate SemMatch on synthetic datasets and demonstrate that it achieves a lower mean absolute error (MAE) and a lower square root of the precision in estimation of heterogeneous effect (PEHE) in causal effect estimation than existing state-of-the-art methods. The observed results suggest that accounting for semantic reasoning and including units with incomplete data improves causal estimation accuracy.
KW - Causal Inference
KW - Knowledge Graph
KW - Matching
KW - Semantics
UR - http://www.scopus.com/inward/record.url?scp=85211921518&partnerID=8YFLogxK
U2 - 10.1007/978-981-96-0567-5_33
DO - 10.1007/978-981-96-0567-5_33
M3 - Conference contribution
AN - SCOPUS:85211921518
SN - 9789819605668
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 467
EP - 483
BT - Web Information Systems Engineering
A2 - Barhamgi, Mahmoud
A2 - Wang, Hua
A2 - Wang, Xin
PB - Springer Science and Business Media Deutschland GmbH
Y2 - 2 December 2024 through 5 December 2024
ER -