r-qs2
|
public |
Streamlines and accelerates the process of saving and loading R objects, improving speed and compression compared to other methods. The package provides two compression formats: the 'qs2' format, which uses R serialization via the C API while optimizing compression and disk I/O, and the 'qdata' format, featuring custom serialization for slightly faster performance and better compression. Additionally, the 'qs2' format can be directly converted to the standard 'RDS' format, ensuring long-term compatibility with future versions of R.
|
2024-12-20 |
r-optimotu.pipeline
|
public |
Pipeline for environmental DNA metabarcoding
|
2024-12-18 |
r-optimotu
|
public |
OptimOTU calculates single-linkage clusters at multiple thresholds given a distance matrix, which may be sparse, using a fast, multithreaded C++ implementation. In particular, the memory requirements are small and fixed, and it can accept the distance matrix from a file or text connection, meaning that it can operate on very large matrices which do not fit in RAM. Routines for quickly counting the measure of the intersection of sorted sets, and for calculating the multiclass F-measure, are also included.
|
2024-12-15 |
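The multi-threshold single-linkage clustering described for r-optimotu can be illustrated with a small union-find sketch. This is not the package's implementation: the function name `single_linkage`, its arguments, and the choice to sort all pairs in memory are assumptions made for illustration, whereas the package itself is described as multithreaded C++ with small, fixed memory requirements.

```python
from typing import Dict, Iterable, List, Tuple


def single_linkage(
    n: int,
    distances: Iterable[Tuple[int, int, float]],
    thresholds: List[float],
) -> Dict[float, List[int]]:
    """Cluster items 0..n-1 at several thresholds (hypothetical helper).

    `distances` is a sparse list of (i, j, distance) pairs; pairs that are
    absent are treated as too distant to link. Each cluster is labelled by
    its smallest member index.
    """
    parent = list(range(n))

    def find(x: int) -> int:
        # Follow parent links to the root, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Assumption: the pairs fit in memory and can be sorted by distance.
    edges = sorted(distances, key=lambda e: e[2])
    result: Dict[float, List[int]] = {}
    i = 0
    for t in sorted(thresholds):
        # Merge every pair whose distance is within the current threshold.
        while i < len(edges) and edges[i][2] <= t:
            a, b, _ = edges[i]
            ra, rb = find(a), find(b)
            if ra != rb:
                # Keep the smaller index as root so labels are deterministic.
                parent[max(ra, rb)] = min(ra, rb)
            i += 1
        result[t] = [find(x) for x in range(n)]
    return result


labels = single_linkage(4, [(0, 1, 0.1), (1, 2, 0.3), (2, 3, 0.5)], [0.2, 0.4])
```

Because thresholds are processed in increasing order, the clusters at each threshold are built on top of the merges already made at smaller thresholds, so the pair list is traversed only once.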
r-inferrnal
|
public |
Search for matches to a covariance model in RNA sequences, or align RNA sequences to a covariance model. Covariance models use a combination of primary structure (nucleotide sequence) and secondary structure (base pairing) to model non-protein-coding RNA families. This package is a thin wrapper around the Infernal package, which is not included and needs to be installed independently. Functions to read, write, and manipulate multiple sequence alignments in Stockholm format are also included.
|
2024-12-15 |
r-lsux
|
public |
LSUx is a tool to extract domains from rDNA using covariance models as implemented in the Infernal package.
|
2024-09-30 |
r-funguildr
|
public |
This is a simple reimplementation of FUNGuild_v1.1.py. It queries the FUNGuild or NEMAGuild databases (http://www.stbates.org/guilds/app.php) and assigns trait information based on matching to a taxonomic classification. It does not include a copy of the FUNGuild or NEMAGuild databases, because they are continually updated as new information is submitted, but it does have methods to download them and store them as R objects to speed up repeated queries or to allow local queries without internet access.
|
2024-09-29 |
r-nanonext
|
public |
R binding for NNG (Nanomsg Next Gen), a successor to ZeroMQ. NNG is a socket library implementing 'Scalability Protocols', a reliable, high-performance standard for common communications patterns including publish/subscribe, request/reply and service discovery, over in-process, IPC, TCP, WebSocket and secure TLS transports. As its own threaded concurrency framework, provides a toolkit for asynchronous programming and distributed computing, with intuitive 'aio' objects which resolve automatically upon completion of asynchronous operations, and synchronisation primitives allowing R to wait upon events signalled by concurrent threads.
|
2024-06-06 |
r-ritsx
|
public |
This package is an interface to the ITSx command-line utility (http://microbiology.se/software/itsx/) from R.
|
2024-05-06 |
r-tikzdevice
|
public |
Provides a graphics output device for R that records plots in a LaTeX-friendly format. The device transforms plotting commands issued by R functions into LaTeX code blocks. When included in a LaTeX document, these blocks are interpreted with the help of 'TikZ'---a graphics package for TeX and friends written by Till Tantau. Using the 'tikzDevice', the text of R plots can contain LaTeX commands such as mathematical formulas. The device also allows arbitrary LaTeX code to be inserted into the output stream.
|
2024-05-06 |
r-monetdblite
|
public |
In-Process Version of 'MonetDB'.
|
2024-05-06 |
r-restez
|
public |
Download large sections of 'GenBank' <https://www.ncbi.nlm.nih.gov/genbank/> and generate a local SQL-based database. A user can then query this database using 'restez' functions or through 'rentrez' <https://CRAN.R-project.org/package=rentrez> wrappers.
|
2024-05-06 |
r-dada2
|
public |
The dada2 package infers exact amplicon sequence variants (ASVs) from high-throughput amplicon sequencing data, replacing the coarser and less accurate OTU clustering approach. The dada2 pipeline takes as input demultiplexed fastq files, and outputs the sequence variants and their sample-wise abundances after removing substitution and chimera errors. Taxonomic classification is available via a native implementation of the RDP naive Bayesian classifier, and species-level assignment to 16S rRNA gene fragments by exact matching.
|
2024-05-06 |
r-ipc
|
public |
Provides tools for passing messages between R processes. Shiny Examples are provided showing how to perform useful tasks such as: updating reactive values from within a future, progress bars for long running async tasks, and interrupting async tasks based on user input.
|
2024-05-06 |
bptp
|
public |
bPTP: a Bayesian implementation of the PTP model for species delimitation.
|
2024-05-06 |
r-disk.frame
|
public |
A disk-based data manipulation tool for working with larger-than-RAM datasets. Aims to lower the barrier-to-entry for manipulating large datasets by adhering closely to popular and familiar data manipulation paradigms like dplyr verbs and data.table syntax.
|
2024-05-06 |
r-ggnomics
|
public |
This package contains some extensions to ggplot2 and convenience functions.
|
2024-05-06 |
r-fst
|
public |
Multithreaded serialization of compressed data frames using the 'fst' format. The 'fst' format allows for random access of stored data and compression with the LZ4 and ZSTD compressors created by Yann Collet. The ZSTD compression library is owned by Facebook Inc.
|
2024-05-06 |
r-ggh4x
|
public |
A 'ggplot2' extension that does a variety of little helpful things. The package extends 'ggplot2' facets through customisation, by setting individual scales per panel, resizing panels and providing nested facets. Also allows multiple colour and fill scales per plot. Also hosts a smaller collection of stats, geoms and axis guides.
|
2024-05-06 |
r-greta
|
public |
Write statistical models in R and fit them by MCMC and optimisation on CPUs and GPUs, using Google 'TensorFlow'. greta lets you write your own model like in BUGS, JAGS and Stan, except that you write models right in R, it scales well to massive datasets, and it’s easy to extend and build on. See the website for more information, including tutorials, examples, package documentation, and the greta forum.
|
2024-05-06 |
r-nimble
|
public |
A system for writing hierarchical statistical models largely compatible with 'BUGS' and 'JAGS', writing nimbleFunctions to operate models and do basic R-style math, and compiling both models and nimbleFunctions via custom-generated C++. 'NIMBLE' includes default methods for MCMC, particle filtering, Monte Carlo Expectation Maximization, and some other tools. The nimbleFunction system makes it easy to do things like implement new MCMC samplers from R, customize the assignment of samplers to different parts of a model from R, and compile the new samplers automatically via C++ alongside the samplers 'NIMBLE' provides. 'NIMBLE' extends the 'BUGS'/'JAGS' language by making it extensible: New distributions and functions can be added, including as calls to external compiled code. Although most people think of MCMC as the main goal of the 'BUGS'/'JAGS' language for writing models, one can use 'NIMBLE' for writing arbitrary other kinds of model-generic algorithms as well. A full User Manual is available at <https://r-nimble.org>.
|
2024-05-06 |
r-gender
|
public |
Infers state-recorded gender categories from first names and dates of birth using historical datasets. By using these datasets instead of lists of male and female names, this package is able to more accurately infer the gender of a name, and it is able to report the probability that a name was male or female. GUIDELINES: This method must be used cautiously and responsibly. Please be sure to see the guidelines and warnings about usage in the 'README' or the package documentation. See Blevins and Mullen (2015) <http://www.digitalhumanities.org/dhq/vol/9/3/000223/000223.html>.
|
2024-05-06 |
r-opennlpdata
|
public |
Apache OpenNLP jars and basic English language models.
|
2024-05-06 |
r-opennlp
|
public |
An interface to the Apache OpenNLP tools (version 1.5.3). The Apache OpenNLP library is a machine learning based toolkit for the processing of natural language text written in Java. It supports the most common NLP tasks, such as tokenization, sentence segmentation, part-of-speech tagging, named entity extraction, chunking, parsing, and coreference resolution. See <https://opennlp.apache.org/> for more information.
|
2024-05-06 |
r-qdapdictionaries
|
public |
A collection of text analysis dictionaries and word lists for use with the 'qdap' package.
|
2024-05-06 |
r-venneuler
|
public |
Calculates and displays Venn and Euler Diagrams
|
2024-05-06 |
r-qdap
|
public |
Automates many of the tasks associated with quantitative discourse analysis of transcripts containing discourse, including frequency counts of sentence types, words, sentences, turns of talk, syllables and other assorted analysis tasks. The package provides parsing tools for preparing transcript data. Many functions enable the user to aggregate data by any number of grouping variables, providing analysis and seamless integration with other R packages that undertake higher level analysis and visualization of text. This affords the user a more efficient and targeted analysis. 'qdap' is designed for transcript analysis; however, many functions are applicable to other areas of Text Mining/Natural Language Processing.
|
2024-05-06 |
r-deming
|
public |
Generalized Deming regression, Theil-Sen regression and Passing-Bablok regression functions.
|
2024-05-06 |
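As a rough illustration of one of the methods r-deming lists, Theil-Sen regression in its simplest textbook form takes the median of all pairwise slopes. The sketch below assumes that basic form; the helper name `theil_sen` is hypothetical, and the R package's own functions may use different options and conventions.

```python
from itertools import combinations
from statistics import median
from typing import Sequence, Tuple


def theil_sen(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Return (slope, intercept) from the basic Theil-Sen estimator.

    Hypothetical helper for illustration; not the r-deming API.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    # Slope between every pair of points with distinct x values.
    slopes = [
        (y[j] - y[i]) / (x[j] - x[i])
        for i, j in combinations(range(len(x)), 2)
        if x[i] != x[j]
    ]
    if not slopes:
        raise ValueError("need at least two points with distinct x values")
    slope = median(slopes)
    # Intercept as the median residual offset, one common convention.
    intercept = median(yi - slope * xi for xi, yi in zip(x, y))
    return slope, intercept


# Points on y = 2x + 1, with the last observation replaced by an outlier.
slope, intercept = theil_sen([0, 1, 2, 3, 4], [1, 3, 5, 7, 100])
```

Because both estimates are medians, the single outlying point in the usage example does not pull the fitted line away from the other four points.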
r-metacoder
|
public |
A set of tools for parsing, manipulating, and graphing data classified by a hierarchy (e.g. a taxonomy).
|
2024-05-06 |
r-phylotax
|
public |
Combines taxonomic assignments made with different methods or databases using a phylogenetic tree. Clades within the tree are assigned to a taxon if the assignment is consistent with at least one result for all of the members of the clade, except the members that do not have any assignments at all. This resolves conflict between different methods, while also adding assignments to unknown sequences which are nested in an identified clade. The package also includes helper functions to assign taxonomy using several methods, and compile the results into a uniform format.
|
2024-05-06 |
r-tarchetypes
|
public |
Function-oriented Make-like declarative workflows for Statistics and data science are supported in the 'targets' R package. As an extension to 'targets', the 'tarchetypes' package provides convenient user-side functions to make 'targets' easier to use. By establishing reusable archetypes for common kinds of targets and pipelines, these functions help express complicated reproducible workflows concisely and compactly. The methods in this package were influenced by the 'drake' R package by Will Landau (2018) <doi:10.21105/joss.00550>.
|
2024-05-06 |
gefast
|
public |
Clustering tool using Swarm's clustering strategy and Pass-Join's segment filter.
|
2024-05-06 |
r-hues
|
public |
Creating effective colour palettes for figures is challenging. This package generates and plots palettes of optimally distinct colours in perceptually uniform colour space, based on 'iwanthue' <http://tools.medialab.sciences-po.fr/iwanthue/>. This is done through k-means clustering of CIE Lab colour space, according to user-selected constraints on hue, chroma, and lightness.
|
2024-05-06 |
r-trackdown
|
public |
Collaborative writing and editing of R Markdown (or Sweave) documents. The local .Rmd (or .Rnw) is uploaded as a plain-text file to Google Drive. By taking advantage of the easily readable Markdown (or LaTeX) syntax and the well-known online interface offered by Google Docs, collaborators can easily contribute to the writing and editing process. After integrating all authors’ contributions, the final document can be downloaded and rendered locally.
|
2024-05-06 |
pifcosm
|
public |
PifCoSm is a pipeline to construct supermatrix trees from GenBank data.
|
2024-05-06 |
r-getip
|
public |
A micro-package for getting your 'IP' address, either the local/internal or the public/external one. Currently only 'IPv4' addresses are supported.
|
2024-05-06 |
r-tzara
|
public |
To reduce computational complexity, dada2 only uses non-singletons as seeds for denoising. For this strategy to work, each true sequence must be represented by at least two identical reads. Especially with long amplicons, the probability of two reads having exactly the same errors is much lower than the probability of being error-free, so in practice this means that each true sequence must have two error-free reads. This becomes problematic for rare sequences in long amplicon libraries. An alternative is to use hidden Markov models to cut out the most variable section of the targeted region and use dada2 to create denoised sequences using only that sequence, and then find a consensus sequence for all sequences that match the index region. Tzara (named after Tristan Tzara, a central figure in the Dada art movement) applies this method to rDNA sequences by cutting out the variable ITS2 region using rITSx.
|
2024-05-06 |
r-ecmwfr
|
public |
Programmatic interface to the European Centre for Medium-Range Weather Forecasts dataset web services (ECMWF; <https://www.ecmwf.int/>) and Copernicus's Climate Data Store (CDS; <https://cds.climate.copernicus.eu>). Allows for easy downloads of weather forecasts and climate reanalysis data in R.
|
2024-05-06 |
nng
|
public |
nanomsg-next-generation -- light-weight brokerless messaging
|
2024-05-06 |
r-mirai
|
public |
Lightweight parallel code execution and distributed computing. Designed for simplicity, a 'mirai' evaluates an R expression asynchronously, on local or network resources, resolving automatically upon completion. Efficient scheduling over fast inter-process communications or secure TLS connections over TCP/IP, built on 'nanonext' and 'NNG' (Nanomsg Next Gen).
|
2024-05-06 |
r-crew
|
public |
In computationally demanding analysis projects, statisticians and data scientists asynchronously deploy long-running tasks to distributed systems, ranging from traditional clusters to cloud services. The 'NNG'-powered 'mirai' R package by Gao (2023) <https://CRAN.R-project.org/package=mirai> is a sleek and sophisticated scheduler that efficiently processes these intense workloads. The 'crew' package extends 'mirai' with a unifying interface for third-party worker launchers. Inspiration also comes from packages 'future' by Bengtsson (2021) <doi:10.32614/RJ-2021-048>, 'rrq' by FitzJohn and Ashton (2023) <https://github.com/mrc-ide/rrq>, 'clustermq' by Schubert (2019) <doi:10.1093/bioinformatics/btz284>, and 'batchtools' by Lang, Bischl, and Surmann (2017) <doi:10.21105/joss.00135>.
|
2024-05-06 |
r-crew.cluster
|
public |
In computationally demanding analysis projects, statisticians and data scientists asynchronously deploy long-running tasks to distributed systems, ranging from traditional clusters to cloud services. The 'crew.cluster' package extends the 'mirai'-powered 'crew' package with worker launcher plugins for traditional high-performance computing systems. Inspiration also comes from packages 'mirai' by Gao (2023) <https://github.com/shikokuchuo/mirai>, 'future' by Bengtsson (2021) <doi:10.32614/RJ-2021-048>, 'rrq' by FitzJohn and Ashton (2023) <https://github.com/mrc-ide/rrq>, 'clustermq' by Schubert (2019) <doi:10.1093/bioinformatics/btz284>, and 'batchtools' by Lang, Bischl, and Surmann (2017) <doi:10.21105/joss.00135>.
|
2024-05-06 |
r-secretbase
|
public |
SHA-3 cryptographic hash and SHAKE256 extendable-output functions (XOF). The SHA-3 Secure Hash Standard was published by the National Institute of Standards and Technology (NIST) in 2015 at <doi:10.6028/NIST.FIPS.202>. Fast and memory-efficient implementation using the core algorithm from 'Mbed TLS' under the Trusted Firmware Project <https://www.trustedfirmware.org/projects/mbed-tls/>.
|
2024-05-06 |
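To make the distinction in the r-secretbase summary concrete, the sketch below contrasts a fixed-length SHA-3 digest with the extendable output of SHAKE256. It uses Python's standard `hashlib` rather than the R package; the wrapper names `sha3_hex` and `shake256_hex` are hypothetical and say nothing about the package's own interface.

```python
import hashlib


def sha3_hex(data: bytes, bits: int = 256) -> str:
    """Fixed-length SHA-3 digest as a hex string (hypothetical wrapper)."""
    algorithms = {
        224: hashlib.sha3_224,
        256: hashlib.sha3_256,
        384: hashlib.sha3_384,
        512: hashlib.sha3_512,
    }
    if bits not in algorithms:
        raise ValueError("bits must be one of 224, 256, 384, 512")
    return algorithms[bits](data).hexdigest()


def shake256_hex(data: bytes, out_bytes: int) -> str:
    """SHAKE256 output of a caller-chosen length (hypothetical wrapper)."""
    return hashlib.shake_256(data).hexdigest(out_bytes)


digest = sha3_hex(b"secret base")          # always 256 bits for bits=256
short = shake256_hex(b"secret base", 16)   # 16 bytes of XOF output
longer = shake256_hex(b"secret base", 32)  # 32 bytes; begins with `short`
```

The SHA-3 variants always return a digest of the named size, while SHAKE256 lets the caller request any output length, with a shorter output being a prefix of a longer one for the same input.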
r-igraph
|
public |
Routines for simple graphs and network analysis. It can handle large graphs very well and provides functions for generating random and regular graphs, graph visualization, centrality methods and much more.
|
2024-05-06 |
r-targets
|
public |
As a pipeline toolkit for Statistics and data science in R, the 'targets' package brings together function-oriented programming and 'Make'-like declarative workflows. It analyzes the dependency relationships among the tasks of a workflow, skips steps that are already up to date, runs the necessary computation with optional parallel workers, abstracts files as R objects, and provides tangible evidence that the results match the underlying code and data. The methodology in this package borrows from GNU 'Make' (2015, ISBN:978-9881443519) and 'drake' (2018, <doi:10.21105/joss.00550>).
|
2024-05-06 |