About Anaconda Help Download Anaconda

pwwang / packages / bioconductor-seqarray 1.44.0

Data Management of Large-scale Whole-genome Sequence Variant Calls

Installers

  • linux-64 v1.44.0

conda install

To install this package run one of the following:
conda install pwwang::bioconductor-seqarray

Description

Data management of whole-genome sequence variant calls with hundreds of thousands of individuals: genotypic data (e.g., SNVs, indels and structural variation calls) and annotations in SeqArray GDS files are stored in an array-oriented and compressed manner, with efficient data access using the R programming language. The SeqArray package is built on top of Genomic Data Structure (GDS) data format, and defines required data structure for a SeqArray file. GDS is a flexible and portable data container with hierarchical structure to store multiple scalable array-oriented data sets. It is suited for large-scale datasets, especially for data which are much larger than the available random-access memory. It also offers the efficient operations specifically designed for integers of less than 8 bits, since a diploid genotype usually occupies fewer bits than a byte. Data compression and decompression are available with relatively efficient random access. A high-level R interface to GDS files is available in the package gdsfmt.


© 2024 Anaconda, Inc. All Rights Reserved. (v4.0.1) Legal | Privacy Policy