TY - JOUR
T1 - SGD_Tucker
T2 - A Novel Stochastic Optimization Strategy for Parallel Sparse Tucker Decomposition
AU - Li, Hao
AU - Li, Zixuan
AU - Li, Kenli
AU - Rellermeyer, Jan
AU - Chen, Lydia
AU - Li, Keqin
N1 - Publisher Copyright:
© 1990-2012 IEEE.
PY - 2021/1/1
Y1 - 2021/1/1
N2 - Sparse Tucker Decomposition (STD) algorithms learn a core tensor and a group of factor matrices to obtain an optimal low-rank representation feature for the High-Order, High-Dimension, and Sparse Tensor (HOHDST). However, existing STD algorithms face the problem of intermediate variables explosion which results from the fact that the formation of those variables, i.e., matrices Khatri-Rao product, Kronecker product, and matrix-matrix multiplication, follows the whole elements in sparse tensor. The above problems prevent deep fusion of efficient computation and big data platforms. To overcome the bottleneck, a novel stochastic optimization strategy (SGD__Tucker) is proposed for STD which can automatically divide the high-dimension intermediate variables into small batches of intermediate matrices. Specifically, SGD__Tucker only follows the randomly selected small samples rather than the whole elements, while maintaining the overall accuracy and convergence rate. In practice, SGD__Tucker features the two distinct advancements over the state of the art. First, SGD__Tucker can prune the communication overhead for the core tensor in distributed settings. Second, the low data-dependence of SGD__Tucker enables fine-grained parallelization, which makes SGD__Tucker obtaining lower computational overheads with the same accuracy. Experimental results show that SGD__Tucker runs at least 2XX faster than the state of the art.
AB - Sparse Tucker Decomposition (STD) algorithms learn a core tensor and a group of factor matrices to obtain an optimal low-rank representation feature for the High-Order, High-Dimension, and Sparse Tensor (HOHDST). However, existing STD algorithms face the problem of intermediate variables explosion which results from the fact that the formation of those variables, i.e., matrices Khatri-Rao product, Kronecker product, and matrix-matrix multiplication, follows the whole elements in sparse tensor. The above problems prevent deep fusion of efficient computation and big data platforms. To overcome the bottleneck, a novel stochastic optimization strategy (SGD__Tucker) is proposed for STD which can automatically divide the high-dimension intermediate variables into small batches of intermediate matrices. Specifically, SGD__Tucker only follows the randomly selected small samples rather than the whole elements, while maintaining the overall accuracy and convergence rate. In practice, SGD__Tucker features the two distinct advancements over the state of the art. First, SGD__Tucker can prune the communication overhead for the core tensor in distributed settings. Second, the low data-dependence of SGD__Tucker enables fine-grained parallelization, which makes SGD__Tucker obtaining lower computational overheads with the same accuracy. Experimental results show that SGD__Tucker runs at least 2XX faster than the state of the art.
KW - High-order, high-dimension and sparse tensor
KW - low-rank representation learning
KW - machine learning algorithm
KW - parallel strategy
KW - sparse tucker decomposition
KW - stochastic optimization
UR - http://www.scopus.com/inward/record.url?scp=85098783291&partnerID=8YFLogxK
U2 - 10.1109/TPDS.2020.3047460
DO - 10.1109/TPDS.2020.3047460
M3 - Article
VL - 32
SP - 1828
EP - 1841
JO - IEEE Transactions on Parallel and Distributed Systems
JF - IEEE Transactions on Parallel and Distributed Systems
SN - 1045-9219
IS - 7
M1 - 9309187
ER -