SGD_Tucker: A Novel Stochastic Optimization Strategy for Parallel Sparse Tucker Decomposition

Research output: Contribution to journal › Article › Research › Peer reviewed

Authors

  • Hao Li
  • Zixuan Li
  • Kenli Li
  • Jan Rellermeyer
  • Lydia Chen
  • Keqin Li

External Research Organisations

  • National Supercomputing Center, Changsha
  • State University of New York (SUNY)
  • Delft University of Technology

Details

Original language: English
Article number: 9309187
Pages (from-to): 1828-1841
Number of pages: 14
Journal: IEEE Transactions on Parallel and Distributed Systems
Volume: 32
Issue number: 7
Publication status: Published - 1 Jan 2021
Externally published: Yes

Abstract

Sparse Tucker Decomposition (STD) algorithms learn a core tensor and a group of factor matrices to obtain an optimal low-rank representation feature for the High-Order, High-Dimension, and Sparse Tensor (HOHDST). However, existing STD algorithms face the problem of intermediate-variable explosion, which results from the fact that forming these variables, i.e., matrix Khatri-Rao products, Kronecker products, and matrix-matrix multiplications, involves all the elements of the sparse tensor. This problem prevents the deep fusion of efficient computation with big data platforms. To overcome this bottleneck, a novel stochastic optimization strategy (SGD_Tucker) is proposed for STD, which automatically divides the high-dimensional intermediate variables into small batches of intermediate matrices. Specifically, SGD_Tucker operates only on small, randomly selected samples rather than on all elements, while maintaining the overall accuracy and convergence rate. In practice, SGD_Tucker offers two distinct advancements over the state of the art. First, SGD_Tucker can prune the communication overhead for the core tensor in distributed settings. Second, the low data dependence of SGD_Tucker enables fine-grained parallelization, which allows SGD_Tucker to achieve lower computational overhead at the same accuracy. Experimental results show that SGD_Tucker runs at least 2X faster than the state of the art.
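
The central idea, updating only the factor rows and core entries touched by a randomly sampled observed entry instead of materializing full Khatri-Rao or Kronecker intermediates, can be illustrated with a short sketch. The following NumPy snippet is a hypothetical, simplified illustration of that idea (per-entry squared-loss SGD with L2 regularization); it is not the paper's implementation, and all function names and parameters are assumptions.

import numpy as np

def partial_contract(core, rows, skip):
    # Contract the core tensor with every factor row except mode `skip`;
    # the result is a vector of length ranks[skip].
    t = np.moveaxis(core, skip, 0)          # bring the skipped mode to the front
    for m in range(len(rows)):
        if m == skip:
            continue
        t = np.tensordot(t, rows[m], axes=([1], [0]))  # contract the next remaining mode
    return t

def sgd_tucker_sketch(idx, vals, shape, ranks, lr=0.01, lam=1e-3, epochs=20, seed=0):
    # Hypothetical sketch of SGD over sampled observed entries of a sparse tensor.
    # idx: (M, N) integer indices of observed entries; vals: (M,) observed values.
    rng = np.random.default_rng(seed)
    N = len(shape)
    A = [0.1 * rng.standard_normal((shape[n], ranks[n])) for n in range(N)]  # factor matrices
    G = 0.1 * rng.standard_normal(ranks)                                     # core tensor
    M = len(vals)
    for _ in range(epochs):
        for s in rng.permutation(M):                    # randomly ordered samples
            rows = [A[n][idx[s, n]] for n in range(N)]  # one factor row per mode
            t = [partial_contract(G, rows, n) for n in range(N)]
            err = float(t[0] @ rows[0]) - vals[s]       # residual of the predicted entry
            # Core update: gradient is the outer product of the selected factor rows.
            outer = rows[0]
            for r in rows[1:]:
                outer = np.multiply.outer(outer, r)
            G -= lr * (err * outer + lam * G)
            # Factor updates: only the rows touched by this sample are modified.
            for n in range(N):
                A[n][idx[s, n]] -= lr * (err * t[n] + lam * rows[n])
    return G, A

# Tiny synthetic usage example (random indices and values).
if __name__ == "__main__":
    rng = np.random.default_rng(1)
    shape, ranks, M = (30, 40, 50), (3, 3, 3), 500
    idx = np.stack([rng.integers(0, d, M) for d in shape], axis=1)
    vals = rng.standard_normal(M)
    G, A = sgd_tucker_sketch(idx, vals, shape, ranks)

Because each update reads and writes only one factor row per mode and the small core tensor, the per-sample cost does not depend on the total number of nonzeros, which is consistent with the low data dependence and fine-grained parallelization the abstract attributes to SGD_Tucker.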

Keywords

    High-order, high-dimension and sparse tensor, low-rank representation learning, machine learning algorithm, parallel strategy, sparse Tucker decomposition, stochastic optimization


Cite this

SGD_Tucker: A Novel Stochastic Optimization Strategy for Parallel Sparse Tucker Decomposition. / Li, Hao; Li, Zixuan; Li, Kenli et al.
In: IEEE Transactions on Parallel and Distributed Systems, Vol. 32, No. 7, 9309187, 01.01.2021, p. 1828-1841.

Research output: Contribution to journal › Article › Research › Peer reviewed

BibTeX
@article{71245e95977e44a4bf6d9711ec800b81,
title = "SGD_Tucker: A Novel Stochastic Optimization Strategy for Parallel Sparse Tucker Decomposition",
abstract = "Sparse Tucker Decomposition (STD) algorithms learn a core tensor and a group of factor matrices to obtain an optimal low-rank representation feature for the High-Order, High-Dimension, and Sparse Tensor (HOHDST). However, existing STD algorithms face the problem of intermediate-variable explosion, which results from the fact that forming these variables, i.e., matrix Khatri-Rao products, Kronecker products, and matrix-matrix multiplications, involves all the elements of the sparse tensor. This problem prevents the deep fusion of efficient computation with big data platforms. To overcome this bottleneck, a novel stochastic optimization strategy (SGD_Tucker) is proposed for STD, which automatically divides the high-dimensional intermediate variables into small batches of intermediate matrices. Specifically, SGD_Tucker operates only on small, randomly selected samples rather than on all elements, while maintaining the overall accuracy and convergence rate. In practice, SGD_Tucker offers two distinct advancements over the state of the art. First, SGD_Tucker can prune the communication overhead for the core tensor in distributed settings. Second, the low data dependence of SGD_Tucker enables fine-grained parallelization, which allows SGD_Tucker to achieve lower computational overhead at the same accuracy. Experimental results show that SGD_Tucker runs at least 2X faster than the state of the art.",
keywords = "High-order, high-dimension and sparse tensor, low-rank representation learning, machine learning algorithm, parallel strategy, sparse Tucker decomposition, stochastic optimization",
author = "Hao Li and Zixuan Li and Kenli Li and Jan Rellermeyer and Lydia Chen and Keqin Li",
note = "Publisher Copyright: {\textcopyright} 1990-2012 IEEE.",
year = "2021",
month = jan,
day = "1",
doi = "10.1109/TPDS.2020.3047460",
language = "English",
volume = "32",
pages = "1828--1841",
journal = "IEEE Transactions on Parallel and Distributed Systems",
issn = "1045-9219",
publisher = "IEEE Computer Society",
number = "7",

}

RIS

TY - JOUR

T1 - SGD_Tucker

T2 - A Novel Stochastic Optimization Strategy for Parallel Sparse Tucker Decomposition

AU - Li, Hao

AU - Li, Zixuan

AU - Li, Kenli

AU - Rellermeyer, Jan

AU - Chen, Lydia

AU - Li, Keqin

N1 - Publisher Copyright: © 1990-2012 IEEE.

PY - 2021/1/1

Y1 - 2021/1/1

N2 - Sparse Tucker Decomposition (STD) algorithms learn a core tensor and a group of factor matrices to obtain an optimal low-rank representation feature for the High-Order, High-Dimension, and Sparse Tensor (HOHDST). However, existing STD algorithms face the problem of intermediate-variable explosion, which results from the fact that forming these variables, i.e., matrix Khatri-Rao products, Kronecker products, and matrix-matrix multiplications, involves all the elements of the sparse tensor. This problem prevents the deep fusion of efficient computation with big data platforms. To overcome this bottleneck, a novel stochastic optimization strategy (SGD_Tucker) is proposed for STD, which automatically divides the high-dimensional intermediate variables into small batches of intermediate matrices. Specifically, SGD_Tucker operates only on small, randomly selected samples rather than on all elements, while maintaining the overall accuracy and convergence rate. In practice, SGD_Tucker offers two distinct advancements over the state of the art. First, SGD_Tucker can prune the communication overhead for the core tensor in distributed settings. Second, the low data dependence of SGD_Tucker enables fine-grained parallelization, which allows SGD_Tucker to achieve lower computational overhead at the same accuracy. Experimental results show that SGD_Tucker runs at least 2X faster than the state of the art.

AB - Sparse Tucker Decomposition (STD) algorithms learn a core tensor and a group of factor matrices to obtain an optimal low-rank representation feature for the High-Order, High-Dimension, and Sparse Tensor (HOHDST). However, existing STD algorithms face the problem of intermediate-variable explosion, which results from the fact that forming these variables, i.e., matrix Khatri-Rao products, Kronecker products, and matrix-matrix multiplications, involves all the elements of the sparse tensor. This problem prevents the deep fusion of efficient computation with big data platforms. To overcome this bottleneck, a novel stochastic optimization strategy (SGD_Tucker) is proposed for STD, which automatically divides the high-dimensional intermediate variables into small batches of intermediate matrices. Specifically, SGD_Tucker operates only on small, randomly selected samples rather than on all elements, while maintaining the overall accuracy and convergence rate. In practice, SGD_Tucker offers two distinct advancements over the state of the art. First, SGD_Tucker can prune the communication overhead for the core tensor in distributed settings. Second, the low data dependence of SGD_Tucker enables fine-grained parallelization, which allows SGD_Tucker to achieve lower computational overhead at the same accuracy. Experimental results show that SGD_Tucker runs at least 2X faster than the state of the art.

KW - High-order, high-dimension and sparse tensor

KW - low-rank representation learning

KW - machine learning algorithm

KW - parallel strategy

KW - sparse Tucker decomposition

KW - stochastic optimization

UR - http://www.scopus.com/inward/record.url?scp=85098783291&partnerID=8YFLogxK

U2 - 10.1109/TPDS.2020.3047460

DO - 10.1109/TPDS.2020.3047460

M3 - Article

VL - 32

SP - 1828

EP - 1841

JO - IEEE Transactions on Parallel and Distributed Systems

JF - IEEE Transactions on Parallel and Distributed Systems

SN - 1045-9219

IS - 7

M1 - 9309187

ER -
