Details
| Original language | English |
|---|---|
| Article number | 122419 |
| Journal | Information Sciences |
| Volume | 719 |
| Early online date | 13 Jun 2025 |
| Publication status | Published - Nov 2025 |
Abstract
In today's data-driven world, the proliferation of publicly available information raises security concerns due to the information leakage (IL) problem. IL involves unintentionally exposing sensitive information to unauthorized parties via observable system information. Conventional statistical approaches, which rely on estimating mutual information (MI) between observable and secret information to detect ILs, face challenges from the curse of dimensionality, convergence, computational complexity, and MI misestimation. Though effective, emerging supervised machine learning based approaches to detecting ILs are limited to binary system-sensitive information and lack a comprehensive framework. To address these limitations, we establish a theoretical framework using statistical learning theory and information theory to quantify and detect IL accurately. Using automated machine learning, we demonstrate that MI can be accurately estimated by approximating the typically unknown Bayes predictor's log-loss and accuracy. Based on this, we show how MI can effectively be estimated to detect ILs. Our method outperforms state-of-the-art baselines in an empirical study considering synthetic and real-world OpenSSL TLS server datasets.
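As a rough illustration of the log-loss route mentioned in the abstract, the sketch below uses the standard identity I(X;Y) = H(Y) - H(Y|X) and substitutes a classifier's log-loss for the conditional entropy H(Y|X). It is a minimal sketch under stated assumptions, not the authors' estimator: the function names (`entropy`, `log_loss`, `mi_from_log_loss`) and the toy inputs are hypothetical, values are in nats, and the predicted probabilities stand in for an approximation of the Bayes predictor.

```python
import math
from collections import Counter


def entropy(labels):
    """Empirical entropy H(Y) of the secret labels, in nats."""
    n = len(labels)
    counts = Counter(labels)
    return -sum((c / n) * math.log(c / n) for c in counts.values())


def log_loss(labels, probs, eps=1e-12):
    """Average negative log-likelihood of the true labels.

    probs[i] is a dict mapping each class to its predicted probability
    for sample i (hypothetical input format chosen for this sketch).
    """
    total = sum(math.log(max(p[y], eps)) for y, p in zip(labels, probs))
    return -total / len(labels)


def mi_from_log_loss(labels, probs):
    """Plug-in MI estimate: H(Y) minus log-loss, clipped at zero.

    The log-loss of any probabilistic predictor upper-bounds H(Y|X) in
    expectation, so a better predictor gives a tighter estimate.
    """
    return max(0.0, entropy(labels) - log_loss(labels, probs))


# Toy data (assumed): a balanced binary secret.
labels = [0, 1, 0, 1]

# A predictor that learns nothing from the observations outputs the marginal.
uninformed = [{0: 0.5, 1: 0.5} for _ in labels]

# A predictor that recovers the secret exactly from the observations.
informed = [{0: 1.0, 1: 0.0} if y == 0 else {0: 0.0, 1: 1.0} for y in labels]

mi_no_leak = mi_from_log_loss(labels, uninformed)
mi_full_leak = mi_from_log_loss(labels, informed)
```

In this toy setting, an estimate near zero suggests that the observations carry no information about the secret, while an estimate near H(Y) suggests that the secret leaks completely. In practice, a decision about leakage would rest on a statistical test over held-out estimates rather than on a single number.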
Keywords
- AutoML, Bayes-optimal predictor, Information leakage detection, Mutual information, Privacy, Statistical tests
ASJC Scopus subject areas
- Computer Science(all)
  - Software
  - Computer Science Applications
  - Artificial Intelligence
- Engineering(all)
  - Control and Systems Engineering
- Mathematics(all)
  - Theoretical Computer Science
- Decision Sciences(all)
  - Information Systems and Management
Cite this
In: Information Sciences, Vol. 719, 122419, 11.2025.
Research output: Contribution to journal › Article › Research › peer review
TY - JOUR
T1 - Information Leakage Detection through Approximate Bayes-optimal Prediction
AU - Gupta, Pritha
AU - Wever, Marcel Dominik
AU - Hüllermeier, Eyke
N1 - Publisher Copyright: © 2025 The Author(s)
PY - 2025/11
Y1 - 2025/11
KW - AutoML
KW - Bayes-optimal predictor
KW - Information leakage detection
KW - Mutual information
KW - Privacy
KW - Statistical tests
UR - http://www.scopus.com/inward/record.url?scp=105008448972&partnerID=8YFLogxK
U2 - 10.48550/arXiv.2401.14283
DO - 10.48550/arXiv.2401.14283
M3 - Article
VL - 719
JO - Information Sciences
JF - Information Sciences
SN - 0020-0255
M1 - 122419
ER -