panphlan
|
public |
PanPhlAn is a strain-level metagenomic profiling tool for identifying the gene composition and *in-vivo* transcriptional activity of individual strains in metagenomic samples.
|
2023-06-16 |
clairvoyante
|
public |
Identifying the variants of DNA sequences sensitively and accurately is an important but challenging task in the field of genomics. This task is particularly difficult when dealing with Single Molecule Sequencing, the error rate of which is still tens to hundreds of times higher than that of Next Generation Sequencing. With the increasing prevalence of Single Molecule Sequencing, an efficient variant caller will not only expedite basic research but also enable various downstream applications. To meet this demand, we developed Clairvoyante, a multi-task five-layer convolutional neural network model for predicting variant type, zygosity, alternative allele and indel length. On NA12878, Clairvoyante achieved 99.73%, 97.68% and 95.36% accuracy on known variants, and 98.65%, 92.57% and 77.89% F1 score on the whole genome, in Illumina, PacBio, and Oxford Nanopore data, respectively. Training Clairvoyante on one sample and calling variants on another shows that Clairvoyante is sample agnostic and general for variant calling. A slim version of Clairvoyante with reduced model parameters produced a much lower F1, suggesting the full model's power in disentangling subtle details in read alignment. Clairvoyante is the first method for Single Molecule Sequencing to finish whole-genome variant calling in two hours on a 28 CPU-core machine, with top-tier accuracy and sensitivity. A toolset was developed to train, utilize and visualize the Clairvoyante model easily, and is publicly available in this repo.
|
2023-06-16 |
sc3-scripts
|
public |
A set of wrappers for individual components of the SC3 package. Functions in R packages are hard to call when building workflows outside of R, so this package adds a set of simple wrappers with robust argument parsing. Intermediate steps are currently mainly serialized R objects, but the ultimate objective is to have language-agnostic intermediate formats allowing composite workflows using a variety of software packages.
|
2023-06-16 |
pytest-workflow
|
public |
A pytest plugin for configuring workflow/pipeline tests using YAML files
|
2023-06-16 |
spectrum_utils
|
public |
Mass spectrometry utility functions
|
2023-06-16 |
r-transformer
|
public |
Additional S3 and S4 coercion methods for easy interconversion between Bioconductor and tidyverse data classes.
|
2023-06-16 |
r-ldrtools
|
public |
Linear dimension reduction subspaces can be uniquely defined using orthogonal projection matrices. This package provides tools to compute distances between such subspaces and to compute the average subspace. For details see Liski E., Nordhausen K., Oja H., Ruiz-Gazen A. (2016) Combining Linear Dimension Reduction Subspaces <doi:10.1007/978-81-322-3643-6_7>.
|
2023-06-16 |
bioconductor-sc3-scripts
|
public |
A set of wrappers for individual components of the SC3 package. Functions in R packages are hard to call when building workflows outside of R, so this package adds a set of simple wrappers with robust argument parsing. Intermediate steps are currently mainly serialized R objects, but the ultimate objective is to have language-agnostic intermediate formats allowing composite workflows using a variety of software packages.
|
2023-06-16 |
gff3sort
|
public |
A Perl script to sort GFF3 files and produce suitable results for tabix tools
|
2023-06-16 |
bioconductor-idmappinganalysis
|
public |
ID Mapping Analysis
|
2023-06-16 |
bioconductor-msgfgui
|
public |
A shiny GUI for MSGFplus
|
2023-06-16 |
bioconductor-envisionquery
|
public |
Retrieval from the ENVISION bioinformatics data portal into R
|
2023-06-16 |
bioconductor-compgo
|
public |
An R pipeline for .bed file annotation, comparing GO term enrichment between gene sets and data visualisation
|
2023-06-16 |
ucsc-hggoldgapgl
|
public |
Put chromosome .agp and .gl files into browser database.
|
2023-06-16 |
barcode_splitter
|
public |
Split multiple fastq files by matching barcodes in one or more of the sequence files. Barcodes in the tab-delimited barcodes.txt file are matched against the beginning (or end) of the specified index read(s). By default, barcodes must match exactly, but --mismatches can be set higher if desired. Compressed input is read (from all files) if the first input file name ends in .gz. Reading of compressed input can be forced with the gzipin option.
|
2023-06-16 |
pylprotpredictor
|
public |
A tool to predict PYL proteins
|
2023-06-16 |
shapeshifter-cli
|
public |
A command-line tool for transforming large data sets
|
2023-06-16 |
bioconductor-msgfplus
|
public |
An interface between R and MS-GF+
|
2023-06-16 |
abeona
|
public |
A simple transcriptome assembler based on kallisto and Cortex graphs.
|
2023-06-16 |
bioconductor-flowqb
|
public |
flowQB is a fully automated R Bioconductor package to calculate the detector efficiency (Q), optical background (B) and intrinsic CV of the beads.
|
2023-06-16 |
bioconductor-idmappingretrieval
|
public |
ID Mapping Data Retrieval
|
2023-06-16 |
perl-data-match
|
public |
Complex data structure pattern matching
|
2023-06-16 |
mtb-snp-it
|
public |
SNP-IT: Whole genome SNP based identification of members of the Mycobacterium tuberculosis complex.
|
2023-06-16 |
r-ampliconduo
|
public |
Increasingly powerful techniques for high-throughput sequencing open the possibility to comprehensively characterize microbial communities, including rare species. However, a still unresolved issue is the substantial error rate in the experimental process generating these sequences. To overcome this limitation we propose an approach in which each sample is split and the same amplification and sequencing protocol is applied to both halves. This procedure should allow detection of likely PCR and sequencing artifacts, and of true rare species, by comparing the results of both parts. The AmpliconDuo package, where "amplicon duo" from here on refers to the two amplicon data sets of a split sample, is intended to help interpret the obtained read frequency distribution across split samples, and to filter out false positive reads.
|
2023-06-16 |
bioconductor-rdavidwebservice
|
public |
An R Package for retrieving data from DAVID into R objects using Web Services API.
|
2023-06-16 |
perl-string-escape
|
public |
Backslash escapes, quoted phrase, word elision, etc.
|
2023-06-16 |
perl-data-compare
|
public |
Compare Perl data structures
|
2023-06-16 |
ucsc-hgvstovcf
|
public |
Convert HGVS terms to VCF tab-separated output
|
2023-06-16 |
spydrpick
|
public |
Mutual information based detection of pairs of genomic loci co-evolving under a shared selective pressure
|
2023-06-16 |
apollo
|
public |
WebApollo API library
|
2023-06-16 |
bioconductor-camthc
|
public |
Convex Analysis of Mixtures for Tissue Heterogeneity Characterization
|
2023-06-16 |
tango
|
public |
Assign taxonomy to metagenomic contigs
|
2023-06-16 |
ucsc-chainbridge
|
public |
Attempt to extend alignments through double-sided gaps of similar size
|
2023-06-16 |
ucsc-clustergenes
|
public |
Cluster genes from genePred tracks
|
2023-06-16 |
ucsc-bedjointaboffset
|
public |
Given a bed file and a tab file that each have a column with matching values: first get the value of column0, the offset and line length from inTabFile. Then go over the bed file, use the name field and append its offset and length to the bed file as two separate fields. Write the new bed file to outBed.
|
2023-06-16 |
pbccs
|
public |
pbccs - Generate Highly Accurate Single-Molecule Consensus Reads (HiFi Reads)
|
2023-06-16 |
upload-test-53255ef1
|
public |
No Summary
|
2023-06-16 |
shapeshifter
|
public |
A tool for managing large datasets
|
2023-06-16 |
compare-reads
|
public |
Cythonized function to compare reads by name.
|
2023-06-16 |
socru
|
public |
Order and orientation of complete bacterial genomes
|
2023-06-16 |
igor_vdj
|
public |
IGoR is a C++ software designed to infer V(D)J recombination related processes from sequencing data.
|
2023-06-16 |
scanpy
|
public |
Single-Cell Analysis in Python. Scales to >1M cells.
|
2023-06-16 |
r-momr
|
public |
The 'MetaOMineR' suite is a set of R packages that offers many functions and modules needed for the analysis of quantitative metagenomics data. 'momr' is the core package and contains routines for biomarker identification and exploration. Developed since the beginning of the field, 'momr' has evolved and is structured around different modules such as preprocessing, analysis, visualisation, etc. See package help for more information.
|
2023-06-16 |
wg-blimp
|
public |
wg-blimp (Whole Genome BisuLfIte sequencing Methylation analysis Pipeline)
|
2023-06-16 |
msp2db
|
public |
Python package to create an SQLite database from a collection of MSP mass spectrometry spectra files. Currently works with MSP files formatted as MassBank records or as MoNA records. The resulting SQLite database can be used for spectral matching with the msPurity Bioconductor R package.
|
2023-06-16 |
tximport-scripts
|
public |
A set of wrappers for individual components of the tximport package. Functions in R packages are hard to call when building workflows outside of R, so this package adds a set of simple wrappers with robust argument parsing. Intermediate steps are currently mainly serialized R objects, but the ultimate objective is to have language-agnostic intermediate formats allowing composite workflows using a variety of software packages.
|
2023-06-16 |
r-brio
|
public |
Biological R input/output.
|
2023-06-16 |
r-bioverbs
|
public |
S4 generic functions for bioinformatics.
|
2023-06-16 |
perl-text-levenshtein
|
public |
Calculate the Levenshtein edit distance between two strings
|
2023-06-16 |
dammet
|
public |
Software to reconstruct methylomes from HTS data from ancient specimens
|
2023-06-16 |