rust-ncbitaxonomy public A Rust crate for working with a local copy of the NCBI Taxonomy database, which provides utilities for taxonomic filtering. 2024-08-09
parafly public Given a file containing a list of Unix commands, ParaFly uses multithreading to process the commands in parallel on a single server. Success/failure is captured, and failed commands are retained and reported. 2024-08-09
cgpbigwig public BigWig manipulation tools using libBigWig and htslib 2024-08-08
picrust2 public PICRUSt: Phylogenetic Investigation of Communities by Reconstruction of Unobserved States 2024-08-08
sonicparanoid public SonicParanoid: fast, accurate, and comprehensive orthology inference with machine learning and language models 2024-08-07
chewbbaca public A complete suite for gene-by-gene schema creation and strain identification. 2024-08-06
irfinder public Intron Retention Finder 2024-08-06
fastool public A simple and quick tool to read huge FastQ and FastA files (both normal and gzipped) and manipulate them. 2024-08-06
umi_tools public Tools for dealing with Unique Molecular Identifiers (UMIs) / Random Molecular Tags (RMTs) 2024-08-05
bam-readcount public bam-readcount generates metrics at single nucleotide positions. 2024-08-05
iqtree public Efficient phylogenomic software by maximum likelihood. 2024-08-04
quicksect public A cythonized, extended version of the interval search tree in bx 2024-08-02
filtlong public Filtlong is a tool for filtering long reads. It can take a set of long reads and produce a smaller, better subset. It uses both read length (longer is better) and read identity (higher is better) when choosing which reads pass the filter. 2024-08-01
r-ichorcna public Estimating tumor fraction in cell-free DNA from ultra-low-pass whole genome sequencing. 2024-08-01
sequencetools public Tools for population genetics on sequencing data 2024-08-01
sourmash public Compute and compare MinHash signatures for DNA data sets. 2024-08-01
rnashapes public RNAshape abstraction maps structures to a tree-like domain of shapes, retaining adjacency and nesting of structural features, but disregarding helix lengths. Shape abstraction integrates well with dynamic programming algorithms, and hence it can be applied during structure prediction rather than afterwards. This avoids exponential explosion and can still give us a non-heuristic and complete account of properties of the molecule's folding space. 2024-07-31
perl-json public JSON (JavaScript Object Notation) encoder/decoder 2024-07-31
perl-json-xs public JSON serialising/deserialising, done correctly and fast 2024-07-31
ntcard public Estimating k-mer coverage histogram of genomics data 2024-07-25
links public Long Interval Nucleotide K-mer Scaffolder 2024-07-25
morpheus public Mass spectrometry–based proteomics database search algorithm 2024-07-25
seqtk public Seqtk is a fast and lightweight tool for processing sequences in the FASTA or FASTQ format 2024-07-24
entrez-direct public Entrez Direct (EDirect) is an advanced method for accessing the NCBI's set of interconnected databases (publication, sequence, structure, gene, variation, expression, etc.) from a UNIX terminal window. Functions take search terms from command-line arguments. Individual operations are combined to build multi-step queries. Record retrieval and formatting normally complete the process. 2024-07-23
kma public KMA is a mapping method designed to map raw reads directly against redundant databases, in an ultra-fast manner using seed and extend. 2024-07-23
pydna public Representing double stranded DNA and functions for simulating cloning and homologous recombination between DNA molecules. 2024-07-19
pybedtools public Wraps BEDTools for use in Python and adds many additional features. 2024-07-19
nonpareil public Estimate average coverage and create curves for metagenomic datasets 2024-07-19
r-ic10 public Implementation of the classifier described in the paper 'Genome-driven integrated classification of breast cancer validated in over 7,500 samples' (Ali HR et al., Genome Biology 2014). It uses copy number and/or expression from breast cancer data, trains a pamr classifier (Tibshirani et al.) with the features available and predicts the iC10 group. 2024-07-19
primer3-py public Python bindings for Primer3 2024-07-19
gffcompare public GffCompare by Geo Pertea 2024-07-18
fido public No Summary 2024-07-17
cancerit-allelecount public Support code for NGS copy number algorithms 2024-07-17
bowtie public An ultrafast memory-efficient short read aligner 2024-07-17
bwa public The BWA read mapper. 2024-07-17
libbigwig public A C library for handling bigWig files 2024-07-16
graftm public GraftM is a pipeline used for identifying and classifying marker gene reads from metagenomic datasets 2024-07-16
mappy public Minimap2 Python binding 2024-07-15
bellmans-gapc public A language and compiler for algebraic dynamic programming. 2024-07-15
illumina-interop public The Illumina InterOp libraries are a set of common routines used for reading and writing InterOp metric files. These metric files are binary files produced during a run providing detailed statistics about a run. In a few cases, the metric files are produced after a run during secondary analysis (index metrics) or for faster display of a subset of the original data (collapsed quality scores). 2024-07-15
dropseq_tools public Package for the analysis of Drop-seq data developed by Jim Nemesh in the McCarroll Lab 2024-07-15
r-rrbgen public A lightweight limited functionality R bgen read/write library 2024-07-15
pairix public 2D indexing on bgzipped text files of paired genomic coordinates 2024-07-12
perl-digest-sha1 public Perl interface to the SHA-1 algorithm 2024-07-12
skesa public Strategic Kmer Extension for Scrupulous Assemblies & Sequence Assembly Using Target Enrichment 2024-07-12
gap2seq public Gap2Seq is a tool for filling gaps between contigs in genome assemblies. 2024-07-12
vcftools public A set of tools written in Perl and C++ for working with VCF files. This package only contains the C++ libraries, whereas the package perl-vcftools-vcf contains the Perl libraries. 2024-07-11
raxml public Phylogenetics - Randomized Axelerated Maximum Likelihood. 2024-07-11
xtandem public No Summary 2024-07-11
kalign2 public Kalign is a fast and accurate multiple sequence alignment algorithm designed to align large numbers of protein sequences. 2024-07-11

