The Viscum album Gene Space database

Research output: Contribution to journal › Article › Research › peer review

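Authors

  • Lucie Schröder
  • Oliver Rupp
  • Michael Senkler
  • Nils Rugen
  • Natalija Hohnjec
  • Alexander Goesmann
  • Helge Küster
  • Hans-Peter Braun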

External Research Organisations

  • Justus Liebig University Giessen

Details

Original language: English
Article number: 1193122
Journal: Frontiers in Plant Science
Volume: 14
Publication status: Published - 26 Jun 2023

Abstract

The hemiparasitic flowering plant Viscum album (European mistletoe) is known for its very special life cycle, extraordinary biochemical properties, and extremely large genome. The size of its genome is estimated to be 30 times larger than the human genome and 600 times larger than the genome of the model plant Arabidopsis thaliana. To achieve insights into the Gene Space of the genome, which is defined as the space including and surrounding protein-coding regions, a transcriptome project based on PacBio sequencing has recently been conducted. A database resulting from this project contains sequences of 39,092 different open reading frames encoding 32,064 distinct proteins. Based on ‘Benchmarking Universal Single-Copy Orthologs’ (BUSCO) analysis, the completeness of the database was estimated to be in the range of 78%. To further develop this database, we performed a transcriptome project of V. album organs harvested in summer and winter based on Illumina sequencing. Data from both sequencing strategies were combined. The new V. album Gene Space database II (VaGs II) contains 90,039 sequences and has a completeness of 93% as revealed by BUSCO analysis. Sequences from other organisms, particularly fungi, which are known to colonize mistletoe leaves, have been removed. To evaluate the quality of the new database, proteome data of a mitochondrial fraction of V. album were re-analyzed. Compared to the original evaluation published five years ago, nearly 1000 additional proteins could be identified in the mitochondrial fraction, providing new insights into the Oxidative Phosphorylation System of V. album. The VaGs II database is available at https://viscumalbum.pflanzenproteomik.de/. Furthermore, all V. album sequences have been uploaded at the European Nucleotide Archive (ENA).

Keywords

    complex I, Complexome profiling, database development, Illumina sequencing, mitochondria, oxidative phosphorylation (OXPHOS), PacBio sequencing, supercomplex

Cite this

The Viscum album Gene Space database. / Schröder, Lucie; Rupp, Oliver; Senkler, Michael et al.
In: Frontiers in Plant Science, Vol. 14, 1193122, 26.06.2023.

Schröder L, Rupp O, Senkler M, Rugen N, Hohnjec N, Goesmann A et al. The Viscum album Gene Space database. Frontiers in Plant Science. 2023 Jun 26;14:1193122. doi: 10.3389/fpls.2023.1193122
Schröder, Lucie ; Rupp, Oliver ; Senkler, Michael et al. / The Viscum album Gene Space database. In: Frontiers in Plant Science. 2023 ; Vol. 14.
@article{73460b0398704051809c9421daf7e62c,
title = "The Viscum album Gene Space database",
abstract = "The hemiparasitic flowering plant Viscum album (European mistletoe) is known for its very special life cycle, extraordinary biochemical properties, and extremely large genome. The size of its genome is estimated to be 30 times larger than the human genome and 600 times larger than the genome of the model plant Arabidopsis thaliana. To achieve insights into the Gene Space of the genome, which is defined as the space including and surrounding protein-coding regions, a transcriptome project based on PacBio sequencing has recently been conducted. A database resulting from this project contains sequences of 39,092 different open reading frames encoding 32,064 distinct proteins. Based on {\textquoteleft}Benchmarking Universal Single-Copy Orthologs{\textquoteright} (BUSCO) analysis, the completeness of the database was estimated to be in the range of 78%. To further develop this database, we performed a transcriptome project of V. album organs harvested in summer and winter based on Illumina sequencing. Data from both sequencing strategies were combined. The new V. album Gene Space database II (VaGs II) contains 90,039 sequences and has a completeness of 93% as revealed by BUSCO analysis. Sequences from other organisms, particularly fungi, which are known to colonize mistletoe leaves, have been removed. To evaluate the quality of the new database, proteome data of a mitochondrial fraction of V. album were re-analyzed. Compared to the original evaluation published five years ago, nearly 1000 additional proteins could be identified in the mitochondrial fraction, providing new insights into the Oxidative Phosphorylation System of V. album. The VaGs II database is available at https://viscumalbum.pflanzenproteomik.de/. Furthermore, all V. album sequences have been uploaded at the European Nucleotide Archive (ENA).",
keywords = "complex I, Complexome profiling, database development, Illumina sequencing, mitochondria, oxidative phosphorylation (OXPHOS), PacBio sequencing, supercomplex",
author = "Lucie Schr{\"o}der and Oliver Rupp and Michael Senkler and Nils Rugen and Natalija Hohnjec and Alexander Goesmann and Helge K{\"u}ster and Hans-Peter Braun",
note = "This research has been supported by the Deutsche Forschungsgemeinschaft, grant BR 1829/16-1, to HPB. The publication of this article was funded by the Open Access Fund of Leibniz Universit{\"a}t Hannover.",
year = "2023",
month = jun,
day = "26",
doi = "10.3389/fpls.2023.1193122",
language = "English",
volume = "14",
journal = "Frontiers in Plant Science",
issn = "1664-462X",
publisher = "Frontiers Media S.A.",

}

TY - JOUR

T1 - The Viscum album Gene Space database

AU - Schröder, Lucie

AU - Rupp, Oliver

AU - Senkler, Michael

AU - Rugen, Nils

AU - Hohnjec, Natalija

AU - Goesmann, Alexander

AU - Küster, Helge

AU - Braun, Hans-Peter

N1 - This research has been supported by the Deutsche Forschungsgemeinschaft, grant BR 1829/16-1, to HPB. The publication of this article was funded by the Open Access Fund of Leibniz Universität Hannover.

PY - 2023/6/26

Y1 - 2023/6/26

N2 - The hemiparasitic flowering plant Viscum album (European mistletoe) is known for its very special life cycle, extraordinary biochemical properties, and extremely large genome. The size of its genome is estimated to be 30 times larger than the human genome and 600 times larger than the genome of the model plant Arabidopsis thaliana. To achieve insights into the Gene Space of the genome, which is defined as the space including and surrounding protein-coding regions, a transcriptome project based on PacBio sequencing has recently been conducted. A database resulting from this project contains sequences of 39,092 different open reading frames encoding 32,064 distinct proteins. Based on ‘Benchmarking Universal Single-Copy Orthologs’ (BUSCO) analysis, the completeness of the database was estimated to be in the range of 78%. To further develop this database, we performed a transcriptome project of V. album organs harvested in summer and winter based on Illumina sequencing. Data from both sequencing strategies were combined. The new V. album Gene Space database II (VaGs II) contains 90,039 sequences and has a completeness of 93% as revealed by BUSCO analysis. Sequences from other organisms, particularly fungi, which are known to colonize mistletoe leaves, have been removed. To evaluate the quality of the new database, proteome data of a mitochondrial fraction of V. album were re-analyzed. Compared to the original evaluation published five years ago, nearly 1000 additional proteins could be identified in the mitochondrial fraction, providing new insights into the Oxidative Phosphorylation System of V. album. The VaGs II database is available at https://viscumalbum.pflanzenproteomik.de/. Furthermore, all V. album sequences have been uploaded at the European Nucleotide Archive (ENA).

KW - complex I

KW - Complexome profiling

KW - database development

KW - Illumina sequencing

KW - mitochondria

KW - oxidative phosphorylation (OXPHOS)

KW - PacBio sequencing

KW - supercomplex

UR - http://www.scopus.com/inward/record.url?scp=85165306873&partnerID=8YFLogxK

U2 - 10.3389/fpls.2023.1193122

DO - 10.3389/fpls.2023.1193122

M3 - Article

VL - 14

JO - Frontiers in Plant Science

JF - Frontiers in Plant Science

SN - 1664-462X

M1 - 1193122

ER -