Information Leakage Detection through Approximate Bayes-optimal Prediction

Research output: Contribution to journal > Article > Research > peer review

Authors

  • Pritha Gupta
  • Marcel Dominik Wever
  • Eyke Hüllermeier

Research Organisations

External Research Organisations

  • Ruhr-Universität Bochum
  • Ludwig-Maximilians-Universität München (LMU)
  • Munich Center for Machine Learning (MCML)
  • Universidad de la Sabana

Details

Original language: English
Article number: 122419
Journal: Information Sciences
Volume: 719
Early online date: 13 Jun 2025
Publication status: Published - Nov 2025

Abstract

In today's data-driven world, the proliferation of publicly available information raises security concerns due to the information leakage (IL) problem. IL involves unintentionally exposing sensitive information to unauthorized parties via observable system information. Conventional statistical approaches, which rely on estimating mutual information (MI) between observable and secret information to detect ILs, face challenges from the curse of dimensionality, convergence, computational complexity, and MI misestimation. Though effective, emerging supervised machine learning based approaches to detecting ILs are limited to binary system sensitive information and lack a comprehensive framework. To address these limitations, we establish a theoretical framework using statistical learning theory and information theory to quantify and detect IL accurately. Using automated machine learning, we demonstrate that MI can be accurately estimated by approximating the typically unknown Bayes predictor's log-loss and accuracy. Based on this, we show how MI can effectively be estimated to detect ILs. Our method outperforms state-of-the-art baselines in an empirical study considering synthetic and real-world OpenSSL TLS server datasets.
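
The link between log-loss and MI mentioned in the abstract can be illustrated with a small sketch. It relies on the standard identity I(X;Y) = H(Y) - H(Y|X) and on the fact that the expected log-loss of the Bayes predictor equals H(Y|X). This is not the authors' implementation: the smoothed frequency table stands in for an AutoML-trained model, and the function names, the noisy-bit toy data, and the flip probabilities are all assumptions made for illustration.

```python
import math
import random
from collections import Counter


def entropy_bits(labels):
    """Empirical Shannon entropy H(Y) of the labels, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def fit_conditional_table(xs, ys, classes, alpha=1.0):
    """Laplace-smoothed estimate of P(y | x) for discrete x.

    Stand-in for a learned probabilistic classifier (hypothetical helper).
    """
    counts = Counter(zip(xs, ys))
    totals = Counter(xs)
    k = len(classes)

    def predict_proba(x, y):
        return (counts[(x, y)] + alpha) / (totals[x] + alpha * k)

    return predict_proba


def log_loss_bits(predict_proba, xs, ys):
    """Average held-out log-loss in bits; approximates H(Y | X)."""
    return -sum(math.log2(predict_proba(x, y)) for x, y in zip(xs, ys)) / len(ys)


def estimate_mi_bits(train, test, classes):
    """MI estimate: H(Y) minus held-out log-loss, clipped at zero."""
    xs_tr, ys_tr = zip(*train)
    xs_te, ys_te = zip(*test)
    model = fit_conditional_table(xs_tr, ys_tr, classes)
    return max(0.0, entropy_bits(ys_te) - log_loss_bits(model, xs_te, ys_te))


def make_data(n, flip_prob, rng):
    """Toy system: secret bit y, observable x is y flipped with flip_prob."""
    data = []
    for _ in range(n):
        y = rng.randint(0, 1)
        x = y ^ 1 if rng.random() < flip_prob else y
        data.append((x, y))
    return data


rng = random.Random(0)
leaky = make_data(4000, 0.1, rng)  # observable strongly depends on the secret
safe = make_data(4000, 0.5, rng)   # observable is independent of the secret

mi_leaky = estimate_mi_bits(leaky[:2000], leaky[2000:], classes=(0, 1))
mi_safe = estimate_mi_bits(safe[:2000], safe[2000:], classes=(0, 1))
print(f"leaky system: {mi_leaky:.3f} bits, safe system: {mi_safe:.3f} bits")
```

In this toy setting, an estimate well above zero suggests a leak, while an estimate near zero is consistent with no leakage. A real detector would still need a statistical test to decide when an estimate counts as significant.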

Keywords

    AutoML, Bayes-optimal predictor, Information leakage detection, Mutual information, Privacy, Statistical tests

Cite this

Information Leakage Detection through Approximate Bayes-optimal Prediction. / Gupta, Pritha; Wever, Marcel Dominik; Hüllermeier, Eyke.
In: Information Sciences, Vol. 719, 122419, 11.2025.

Gupta P, Wever MD, Hüllermeier E. Information Leakage Detection through Approximate Bayes-optimal Prediction. Information Sciences. 2025 Nov;719:122419. Epub 2025 Jun 13. doi: 10.48550/arXiv.2401.14283, 10.1016/j.ins.2025.122419
@article{f84624748b9f4e4ab8581cd6c368537b,
title = "Information Leakage Detection through Approximate Bayes-optimal Prediction",
keywords = "AutoML, Bayes-optimal predictor, Information leakage detection, Mutual information, Privacy, Statistical tests",
author = "Pritha Gupta and Wever, {Marcel Dominik} and Eyke H{\"u}llermeier",
note = "Publisher Copyright: {\textcopyright} 2025 The Author(s)",
year = "2025",
month = nov,
doi = "10.48550/arXiv.2401.14283",
language = "English",
volume = "719",
journal = "Information Sciences",
issn = "0020-0255",
publisher = "Elsevier Inc.",

}

TY - JOUR

T1 - Information Leakage Detection through Approximate Bayes-optimal Prediction

AU - Gupta, Pritha

AU - Wever, Marcel Dominik

AU - Hüllermeier, Eyke

N1 - Publisher Copyright: © 2025 The Author(s)

PY - 2025/11

Y1 - 2025/11

KW - AutoML

KW - Bayes-optimal predictor

KW - Information leakage detection

KW - Mutual information

KW - Privacy

KW - Statistical tests

UR - http://www.scopus.com/inward/record.url?scp=105008448972&partnerID=8YFLogxK

U2 - 10.48550/arXiv.2401.14283

DO - 10.48550/arXiv.2401.14283

M3 - Article

VL - 719

JO - Information Sciences

JF - Information Sciences

SN - 0020-0255

M1 - 122419

ER -
