HyperSparse Neural Networks: Shifting Exploration to Exploitation through Adaptive Regularization

Research output: Chapter in book/report/conference proceeding › Conference contribution › Research › peer review

Authors

  • Patrick Glandorf
  • Timo Kaiser
  • Bodo Rosenhahn


Details

Original language: English
Title of host publication: 2023 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW)
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1226-1235
Number of pages: 10
ISBN (electronic): 9798350307443
ISBN (print): 9798350307450
Publication status: Published - 2023
Event: 2023 IEEE/CVF International Conference on Computer Vision Workshops, ICCVW 2023 - Paris, France
Duration: 2 Oct 2023 - 6 Oct 2023

Abstract

Sparse neural networks are a key factor in developing resource-efficient machine learning applications. We propose the novel and powerful sparse learning method Adaptive Regularized Training (ART) to compress dense into sparse networks. Instead of the commonly used binary mask during training to reduce the number of model weights, we inherently shrink weights close to zero in an iterative manner with increasing weight regularization. Our method compresses the pre-trained model "knowledge" into the weights of highest magnitude. Therefore, we introduce a novel regularization loss named HyperSparse that exploits the highest weights while conserving the ability of weight exploration. Extensive experiments on CIFAR and TinyImageNet show that our method leads to notable performance gains compared to other sparsification methods, especially in extremely high sparsity regimes up to 99.8% model sparsity. Additional investigations provide new insights into the patterns that are encoded in weights with high magnitudes.
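To illustrate the general idea described in the abstract, the following is a minimal PyTorch sketch of magnitude-aware adaptive regularization: only weights below the current top-magnitude threshold are penalized, and the regularization strength is increased over training so that small weights shrink towards zero without a binary mask. This is an assumption-based sketch, not the authors' HyperSparse implementation; the exact penalty form, the keep_ratio parameter, and the schedule for lam are hypothetical.

import torch
import torch.nn as nn

def small_weight_penalty(model: nn.Module, keep_ratio: float) -> torch.Tensor:
    # L1 penalty on all weights whose magnitude falls below the global
    # top-`keep_ratio` threshold. Weights that would survive magnitude
    # pruning are left unpenalized, so training can still move weights
    # across the threshold (exploration), while the rest are pushed to zero.
    all_mags = torch.cat([p.detach().abs().flatten() for p in model.parameters()])
    k = max(1, int(keep_ratio * all_mags.numel()))
    threshold = torch.topk(all_mags, k, largest=True).values.min()
    penalty = sum(
        p.abs()[p.detach().abs() < threshold].sum() for p in model.parameters()
    )
    return penalty

def train_step(model, batch, target, criterion, optimizer, lam, keep_ratio=0.002):
    # One training step: task loss plus the magnitude-aware regularizer,
    # weighted by `lam`. In an ART-style loop, `lam` would be increased
    # iteratively (e.g. multiplied each epoch) until the weights below the
    # threshold are negligible and can be pruned after training.
    optimizer.zero_grad()
    loss = criterion(model(batch), target) + lam * small_weight_penalty(model, keep_ratio)
    loss.backward()
    optimizer.step()
    return loss.item()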

Keywords

    Neural Networks, Pruning, Sparsity, Unstructured Pruning


Cite this

HyperSparse Neural Networks: Shifting Exploration to Exploitation through Adaptive Regularization. / Glandorf, Patrick; Kaiser, Timo; Rosenhahn, Bodo.
2023 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW). Institute of Electrical and Electronics Engineers Inc., 2023. p. 1226-1235.


Glandorf, P, Kaiser, T & Rosenhahn, B 2023, HyperSparse Neural Networks: Shifting Exploration to Exploitation through Adaptive Regularization. in 2023 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW). Institute of Electrical and Electronics Engineers Inc., pp. 1226-1235, 2023 IEEE/CVF International Conference on Computer Vision Workshops, ICCVW 2023, Paris, France, 2 Oct 2023. https://doi.org/10.48550/arXiv.2308.07163, https://doi.org/10.1109/ICCVW60793.2023.00133
Glandorf, P., Kaiser, T., & Rosenhahn, B. (2023). HyperSparse Neural Networks: Shifting Exploration to Exploitation through Adaptive Regularization. In 2023 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW) (pp. 1226-1235). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.48550/arXiv.2308.07163, https://doi.org/10.1109/ICCVW60793.2023.00133
Glandorf P, Kaiser T, Rosenhahn B. HyperSparse Neural Networks: Shifting Exploration to Exploitation through Adaptive Regularization. In 2023 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW). Institute of Electrical and Electronics Engineers Inc. 2023. p. 1226-1235 doi: 10.48550/arXiv.2308.07163, 10.1109/ICCVW60793.2023.00133
Glandorf, Patrick ; Kaiser, Timo ; Rosenhahn, Bodo. / HyperSparse Neural Networks : Shifting Exploration to Exploitation through Adaptive Regularization. 2023 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW). Institute of Electrical and Electronics Engineers Inc., 2023. pp. 1226-1235
BibTeX
@inproceedings{02264f81a4854e268c6daafe8fb8ba90,
title = "HyperSparse Neural Networks: Shifting Exploration to Exploitation through Adaptive Regularization",
abstract = "Sparse neural networks are a key factor in developing resource-efficient machine learning applications. We propose the novel and powerful sparse learning method Adaptive Regularized Training (ART) to compress dense into sparse networks. Instead of the commonly used binary mask during training to reduce the number of model weights, we inherently shrink weights close to zero in an iterative manner with increasing weight regularization. Our method compresses the pre-trained model {"}knowledge{"} into the weights of highest magnitude. Therefore, we introduce a novel regularization loss named HyperSparse that exploits the highest weights while conserving the ability of weight exploration. Extensive experiments on CIFAR and TinyImageNet show that our method leads to notable performance gains compared to other sparsification methods, especially in extremely high sparsity regimes up to 99.8% model sparsity. Additional investigations provide new insights into the patterns that are encoded in weights with high magnitudes.",
keywords = "Neural Networks, Pruning, Sparsity, Unstructured Pruning",
author = "Patrick Glandorf and Timo Kaiser and Bodo Rosenhahn",
note = "Funding Information: This work was supported by the Federal Ministry of Education and Research (BMBF), Germany under the project AI service center KISSKI (grant no. 01IS22093C), the Deutsche Forschungsgemeinschaft (DFG) under Germany's Excellence Strategy within the Cluster of Excellence PhoenixD (EXC 2122), and by the Federal Ministry of the Environment, Nature Conservation, Nuclear Safety and Consumer Protection, Germany under the project GreenAutoML4FAS (grant no. 67KI32007A). ; 2023 IEEE/CVF International Conference on Computer Vision Workshops, ICCVW 2023 ; Conference date: 02-10-2023 Through 06-10-2023",
year = "2023",
doi = "10.48550/arXiv.2308.07163",
language = "English",
isbn = "9798350307450",
pages = "1226--1235",
booktitle = "2023 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW)",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
address = "United States",

}

RIS

TY - GEN

T1 - HyperSparse Neural Networks

T2 - 2023 IEEE/CVF International Conference on Computer Vision Workshops, ICCVW 2023

AU - Glandorf, Patrick

AU - Kaiser, Timo

AU - Rosenhahn, Bodo

N1 - Funding Information: This work was supported by the Federal Ministry of Education and Research (BMBF), Germany under the project AI service center KISSKI (grant no. 01IS22093C), the Deutsche Forschungsgemeinschaft (DFG) under Germany's Excellence Strategy within the Cluster of Excellence PhoenixD (EXC 2122), and by the Federal Ministry of the Environment, Nature Conservation, Nuclear Safety and Consumer Protection, Germany under the project GreenAutoML4FAS (grant no. 67KI32007A).

PY - 2023

Y1 - 2023

N2 - Sparse neural networks are a key factor in developing resource-efficient machine learning applications. We propose the novel and powerful sparse learning method Adaptive Regularized Training (ART) to compress dense into sparse networks. Instead of the commonly used binary mask during training to reduce the number of model weights, we inherently shrink weights close to zero in an iterative manner with increasing weight regularization. Our method compresses the pre-trained model "knowledge" into the weights of highest magnitude. Therefore, we introduce a novel regularization loss named HyperSparse that exploits the highest weights while conserving the ability of weight exploration. Extensive experiments on CIFAR and TinyImageNet show that our method leads to notable performance gains compared to other sparsification methods, especially in extremely high sparsity regimes up to 99.8% model sparsity. Additional investigations provide new insights into the patterns that are encoded in weights with high magnitudes.

AB - Sparse neural networks are a key factor in developing resource-efficient machine learning applications. We propose the novel and powerful sparse learning method Adaptive Regularized Training (ART) to compress dense into sparse networks. Instead of the commonly used binary mask during training to reduce the number of model weights, we inherently shrink weights close to zero in an iterative manner with increasing weight regularization. Our method compresses the pre-trained model "knowledge" into the weights of highest magnitude. Therefore, we introduce a novel regularization loss named HyperSparse that exploits the highest weights while conserving the ability of weight exploration. Extensive experiments on CIFAR and TinyImageNet show that our method leads to notable performance gains compared to other sparsification methods, especially in extremely high sparsity regimes up to 99.8% model sparsity. Additional investigations provide new insights into the patterns that are encoded in weights with high magnitudes.

KW - Neural Networks

KW - Pruning

KW - Sparsity

KW - Unstructured Pruning

UR - http://www.scopus.com/inward/record.url?scp=85180564637&partnerID=8YFLogxK

U2 - 10.48550/arXiv.2308.07163

DO - 10.48550/arXiv.2308.07163

M3 - Conference contribution

AN - SCOPUS:85180564637

SN - 9798350307450

SP - 1226

EP - 1235

BT - 2023 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW)

PB - Institute of Electrical and Electronics Engineers Inc.

Y2 - 2 October 2023 through 6 October 2023

ER -