Details
Original language | English |
---|---|
Pages (from-to) | 1005-1008 |
Number of pages | 4 |
Journal | Nature methods |
Volume | 13 |
Publication status | Published - 24 Oct 2016 |
Abstract
High-throughput sequencing (HTS) data are commonly stored as raw sequencing reads in FASTQ format or as reads mapped to a reference, in SAM format, both with large memory footprints. Worldwide growth of HTS data has prompted the development of compression methods that aim to significantly reduce HTS data size. Here we report on a benchmarking study of available compression methods on a comprehensive set of HTS data using an automated framework.
ASJC Scopus subject areas
- Biochemistry, Genetics and Molecular Biology (all)
- Biotechnology
- Biochemistry
- Molecular Biology
- Cell Biology
In: Nature methods, Vol. 13, 24.10.2016, p. 1005-1008.
Research output: Contribution to journal › Article › Research
TY - JOUR
T1 - Comparison of high-throughput sequencing data compression tools
AU - Numanagić, Ibrahim
AU - Bonfield, James K.
AU - Hach, Faraz
AU - Voges, Jan
AU - Ostermann, Jörn
AU - Alberti, Claudio
AU - Mattavelli, Marco
AU - Sahinalp, S. Cenk
N1 - Funding information: This research was supported by Natural Sciences and Engineering Research Council of Canada (NSERC) Discovery Frontiers program 'Cancer Genome Collaboratory' project (S.C.S., F.H., I.N.); the Vanier Canada Graduate Scholarships program (I.N.); National Institutes of Health (NIH) (R01GM108348 to S.C.S.); National Science Foundation (NSF) (1619081 to S.C.S.); Indiana University Grand Challenges Program Precision Health Initiative (S.C.S.); Wellcome Trust (098051 to J.K.B.); Leibniz Universität Hannover eNIFE grant (J.V. and J.O.); Swiss Platform for Advanced Scientific Computing (PASC) PoSeNoGap project (C.A. and M.M.). We would also like to thank the authors of evaluated compression tools for providing support for their tools and replying to our bug reports.
PY - 2016/10/24
Y1 - 2016/10/24
AB - High-throughput sequencing (HTS) data are commonly stored as raw sequencing reads in FASTQ format or as reads mapped to a reference, in SAM format, both with large memory footprints. Worldwide growth of HTS data has prompted the development of compression methods that aim to significantly reduce HTS data size. Here we report on a benchmarking study of available compression methods on a comprehensive set of HTS data using an automated framework.
UR - http://www.scopus.com/inward/record.url?scp=84992390491&partnerID=8YFLogxK
U2 - 10.1038/nmeth.4037
DO - 10.1038/nmeth.4037
M3 - Article
C2 - 27776113
AN - SCOPUS:84992390491
VL - 13
SP - 1005
EP - 1008
JO - Nature methods
JF - Nature methods
SN - 1548-7091
ER -