CMD + K

bioconductor-seqarray

Community

Data Management of Large-scale Whole-genome Sequence Variant Calls

Installation

To install this package, run one of the following:

Conda
$conda install pwwang::bioconductor-seqarray

Usage Tracking

1.44.0
1.42.4
2 / 8 versions selected
Total downloads: 0

Description

Data management of whole-genome sequence variant calls with hundreds of thousands of individuals: genotypic data (e.g., SNVs, indels and structural variation calls) and annotations in SeqArray GDS files are stored in an array-oriented and compressed manner, with efficient data access using the R programming language. The SeqArray package is built on top of Genomic Data Structure (GDS) data format, and defines required data structure for a SeqArray file. GDS is a flexible and portable data container with hierarchical structure to store multiple scalable array-oriented data sets. It is suited for large-scale datasets, especially for data which are much larger than the available random-access memory. It also offers the efficient operations specifically designed for integers of less than 8 bits, since a diploid genotype usually occupies fewer bits than a byte. Data compression and decompression are available with relatively efficient random access. A high-level R interface to GDS files is available in the package gdsfmt.

About

Summary

Data Management of Large-scale Whole-genome Sequence Variant Calls

Information Last Updated

Mar 25, 2025 at 16:24

License

GPL-3

Total Downloads

329

Platforms

Linux 64 Version: 1.44.0