MULocDeep: A deep-learning framework for protein subcellular and suborganellar localization prediction with residue-level interpretation

Research output: Contribution to journal › Article › Research › peer review

Authors

  • Yuexu Jiang
  • Duolin Wang
  • Yifu Yao
  • Holger Eubel
  • Patrick Künzler
  • Ian Max Møller
  • Dong Xu

Research Organisations

External Research Organisations

  • MU Bond Life Sciences Center
  • Aarhus University

Details

Original language: English
Pages (from-to): 4825-4839
Number of pages: 15
Journal: Computational and structural biotechnology journal
Volume: 19
Early online date: 18 Aug 2021
Publication status: Published - 2021

Abstract

Prediction of protein localization plays an important role in understanding protein function and mechanisms. In this paper, we propose a general deep learning-based localization prediction framework, MULocDeep, which can predict multiple localizations of a protein at both subcellular and suborganellar levels. We collected a dataset with 44 suborganellar localization annotations in 10 major subcellular compartments, the most comprehensive suborganelle localization dataset to date. We also experimentally generated an independent dataset of mitochondrial proteins in Arabidopsis thaliana cell cultures, Solanum tuberosum tubers, and Vicia faba roots and made this dataset publicly available. Evaluations using the above datasets show that overall, MULocDeep outperforms other major methods at both subcellular and suborganellar levels. Furthermore, MULocDeep assesses each amino acid's contribution to localization, which provides insights into the mechanism of protein sorting and localization motifs. A web server can be accessed at http://mu-loc.org.

Keywords

    Deep learning, Experimental benchmark datasets, Mechanism study, Protein localization, Web server

Cite this

MULocDeep: A deep-learning framework for protein subcellular and suborganellar localization prediction with residue-level interpretation. / Jiang, Yuexu; Wang, Duolin; Yao, Yifu et al.
In: Computational and structural biotechnology journal, Vol. 19, 2021, p. 4825-4839.

Jiang, Y, Wang, D, Yao, Y, Eubel, H, Künzler, P, Møller, IM & Xu, D 2021, 'MULocDeep: A deep-learning framework for protein subcellular and suborganellar localization prediction with residue-level interpretation', Computational and structural biotechnology journal, vol. 19, pp. 4825-4839. https://doi.org/10.1016/j.csbj.2021.08.027
Jiang, Y., Wang, D., Yao, Y., Eubel, H., Künzler, P., Møller, I. M., & Xu, D. (2021). MULocDeep: A deep-learning framework for protein subcellular and suborganellar localization prediction with residue-level interpretation. Computational and structural biotechnology journal, 19, 4825-4839. https://doi.org/10.1016/j.csbj.2021.08.027
Jiang Y, Wang D, Yao Y, Eubel H, Künzler P, Møller IM et al. MULocDeep: A deep-learning framework for protein subcellular and suborganellar localization prediction with residue-level interpretation. Computational and structural biotechnology journal. 2021;19:4825-4839. Epub 2021 Aug 18. doi: 10.1016/j.csbj.2021.08.027
Jiang, Yuexu; Wang, Duolin; Yao, Yifu et al. / MULocDeep: A deep-learning framework for protein subcellular and suborganellar localization prediction with residue-level interpretation. In: Computational and structural biotechnology journal. 2021; Vol. 19. pp. 4825-4839.
@article{9d08e48e085645aaa8a8305449b11b3d,
title = "MULocDeep: A deep-learning framework for protein subcellular and suborganellar localization prediction with residue-level interpretation",
abstract = "Prediction of protein localization plays an important role in understanding protein function and mechanisms. In this paper, we propose a general deep learning-based localization prediction framework, MULocDeep, which can predict multiple localizations of a protein at both subcellular and suborganellar levels. We collected a dataset with 44 suborganellar localization annotations in 10 major subcellular compartments-the most comprehensive suborganelle localization dataset to date. We also experimentally generated an independent dataset of mitochondrial proteins in Arabidopsis thaliana cell cultures, Solanum tuberosum tubers, and Vicia faba roots and made this dataset publicly available. Evaluations using the above datasets show that overall, MULocDeep outperforms other major methods at both subcellular and suborganellar levels. Furthermore, MULocDeep assesses each amino acid's contribution to localization, which provides insights into the mechanism of protein sorting and localization motifs. A web server can be accessed at http://mu-loc.org.",
keywords = "Deep learning, Experimental benchmark datasets, Mechanism study, Protein localization, Web server",
author = "Yuexu Jiang and Duolin Wang and Yifu Yao and Holger Eubel and Patrick K{\"u}nzler and M{\o}ller, {Ian Max} and Dong Xu",
note = "Funding Information: This work was supported by the US National Institutes of Health grants R21-LM012790 and R35-GM126985. We would like to thank Dr. Hao Lin for providing suggestions in defining subcellular and suborganellar categories, and the anonymous reviewers for the helpful advice. We would like to thank Dr. Ning Zhang for providing the evaluation results by the MU-LOC method. This work used the high-performance computing infrastructure provided by Research Computing Support Services at the University of Missouri, as well as the Extreme Science and Engineering Discovery Environment (XSEDE), which is supported by National Science Foundation grant number ACI-1548562.",
year = "2021",
doi = "10.1016/j.csbj.2021.08.027",
language = "English",
volume = "19",
pages = "4825--4839",
journal = "Computational and structural biotechnology journal",
issn = "2001-0370",
}

TY - JOUR

T1 - MULocDeep

T2 - A deep-learning framework for protein subcellular and suborganellar localization prediction with residue-level interpretation

AU - Jiang, Yuexu

AU - Wang, Duolin

AU - Yao, Yifu

AU - Eubel, Holger

AU - Künzler, Patrick

AU - Møller, Ian Max

AU - Xu, Dong

N1 - Funding Information: This work was supported by the US National Institutes of Health grants R21-LM012790 and R35-GM126985. We would like to thank Dr. Hao Lin for providing suggestions in defining subcellular and suborganellar categories, and the anonymous reviewers for the helpful advice. We would like to thank Dr. Ning Zhang for providing the evaluation results by the MU-LOC method. This work used the high-performance computing infrastructure provided by Research Computing Support Services at the University of Missouri, as well as the Extreme Science and Engineering Discovery Environment (XSEDE), which is supported by National Science Foundation grant number ACI-1548562.

PY - 2021

Y1 - 2021

AB - Prediction of protein localization plays an important role in understanding protein function and mechanisms. In this paper, we propose a general deep learning-based localization prediction framework, MULocDeep, which can predict multiple localizations of a protein at both subcellular and suborganellar levels. We collected a dataset with 44 suborganellar localization annotations in 10 major subcellular compartments-the most comprehensive suborganelle localization dataset to date. We also experimentally generated an independent dataset of mitochondrial proteins in Arabidopsis thaliana cell cultures, Solanum tuberosum tubers, and Vicia faba roots and made this dataset publicly available. Evaluations using the above datasets show that overall, MULocDeep outperforms other major methods at both subcellular and suborganellar levels. Furthermore, MULocDeep assesses each amino acid's contribution to localization, which provides insights into the mechanism of protein sorting and localization motifs. A web server can be accessed at http://mu-loc.org.

KW - Deep learning

KW - Experimental benchmark datasets

KW - Mechanism study

KW - Protein localization

KW - Web server

UR - http://www.scopus.com/inward/record.url?scp=85114129650&partnerID=8YFLogxK

U2 - 10.1016/j.csbj.2021.08.027

DO - 10.1016/j.csbj.2021.08.027

M3 - Article

C2 - 34522290

VL - 19

SP - 4825

EP - 4839

JO - Computational and structural biotechnology journal

JF - Computational and structural biotechnology journal

SN - 2001-0370

ER -
