r-qdap
|
public |
Automates many of the tasks associated with quantitative discourse analysis of transcripts containing discourse including frequency counts of sentence types, words, sentences, turns of talk, syllables and other assorted analysis tasks. The package provides parsing tools for preparing transcript data. Many functions enable the user to aggregate data by any number of grouping variables, providing analysis and seamless integration with other R packages that undertake higher level analysis and visualization of text. This affords the user a more efficient and targeted analysis. 'qdap' is designed for transcript analysis, however, many functions are applicable to other areas of Text Mining/ Natural Language Processing.
|
2025-03-25 |
r-venneuler
|
public |
Calculates and displays Venn and Euler Diagrams
|
2025-03-25 |
r-qdapdictionaries
|
public |
A collection of text analysis dictionaries and word lists for use with the 'qdap' package.
|
2025-03-25 |
r-opennlp
|
public |
An interface to the Apache OpenNLP tools (version 1.5.3). The Apache OpenNLP library is a machine learning based toolkit for the processing of natural language text written in Java. It supports the most common NLP tasks, such as tokenization, sentence segmentation, part-of-speech tagging, named entity extraction, chunking, parsing, and coreference resolution. See <https://opennlp.apache.org/> for more information.
|
2025-03-25 |
r-opennlpdata
|
public |
Apache OpenNLP jars and basic English language models.
|
2025-03-25 |
r-gender
|
public |
Infers state-recorded gender categories from first names and dates of birth using historical datasets. By using these datasets instead of lists of male and female names, this package is able to more accurately infer the gender of a name, and it is able to report the probability that a name was male or female. GUIDELINES: This method must be used cautiously and responsibly. Please be sure to see the guidelines and warnings about usage in the 'README' or the package documentation. See Blevins and Mullen (2015) <http://www.digitalhumanities.org/dhq/vol/9/3/000223/000223.html>.
|
2025-03-25 |
r-nimble
|
public |
A system for writing hierarchical statistical models largely compatible with 'BUGS' and 'JAGS', writing nimbleFunctions to operate models and do basic R-style math, and compiling both models and nimbleFunctions via custom- generated C++. 'NIMBLE' includes default methods for MCMC, particle filtering, Monte Carlo Expectation Maximization, and some other tools. The nimbleFunction system makes it easy to do things like implement new MCMC samplers from R, customize the assignment of samplers to different parts of a model from R, and compile the new samplers automatically via C++ alongside the samplers 'NIMBLE' provides. 'NIMBLE' extends the 'BUGS'/'JAGS' language by making it extensible: New distributions and functions can be added, including as calls to external compiled code. Although most people think of MCMC as the main goal of the 'BUGS'/'JAGS' language for writing models, one can use 'NIMBLE' for writing arbitrary other kinds of model-generic algorithms as well. A full User Manual is available at <https://r-nimble.org>.
|
2025-03-25 |
r-greta
|
public |
Write statistical models in R and fit them by MCMC and optimisation on CPUs and GPUs, using Google 'TensorFlow'. greta lets you write your own model like in BUGS, JAGS and Stan, except that you write models right in R, it scales well to massive datasets, and it’s easy to extend and build on. See the website for more information, including tutorials, examples, package documentation, and the greta forum.
|
2025-03-25 |
r-ggh4x
|
public |
A 'ggplot2' extension that does a variety of little helpful things. The package extends 'ggplot2' facets through customisation, by setting individual scales per panel, resizing panels and providing nested facets. Also allows multiple colour and fill scales per plot. Also hosts a smaller collection of stats, geoms and axis guides.
|
2025-03-25 |
r-fst
|
public |
Multithreaded serialization of compressed data frames using the 'fst' format. The 'fst' format allows for random access of stored data and compression with the LZ4 and ZSTD compressors created by Yann Collet. The ZSTD compression library is owned by Facebook Inc.
|
2025-03-25 |
r-lsux
|
public |
LSUx is a tool to extract domains from rDNA using covariance models as implemented in the Infernal package.
|
2025-03-25 |
r-inferrnal
|
public |
Search for matches to a covariance model in RNA sequences, or align RNA sequences to a covariance model. Covariance models use a combination of primary structure (nucleotide sequence) and secondary structure (base pairing) to model non-protein-coding RNA families. This package is a thin wrapper around the Infernal package, which is not included and needs to be installed independently. Functions to read, write, and manipulate multiple sequence alignments in Stockholm format are also included.
|
2025-03-25 |
r-ggnomics
|
public |
This package contains some extensions to ggplot2 and convenience functions.
|
2025-03-25 |
r-disk.frame
|
public |
A disk-based data manipulation tool for working with large-than-RAM datasets. Aims to lower the barrier-to-entry for manipulating large datasets by adhering closely to popular and familiar data manipulation paradigms like dplyr verbs and data.table syntax.
|
2025-03-25 |
bptp
|
public |
bPTP: a Bayesian implementation of the PTP model for species delimitation.
|
2025-03-25 |
r-ipc
|
public |
Provides tools for passing messages between R processes. Shiny Examples are provided showing how to perform useful tasks such as: updating reactive values from within a future, progress bars for long running async tasks, and interrupting async tasks based on user input.
|
2025-03-25 |
r-tzara
|
public |
To reduce computational complexity, dada2 only uses non-singletons as seeds for denoising. For this strategy to work, each true sequence must be represented by at least two identical reads. Especially with long amplicons, the probability of two reads having exactly the same errors is much lower than the probability of being error-free, so in practice this means that each true sequence must have two error-free reads. This becomes problematic for rare sequences in long amplicon libraries. An alternative is to use hidden Markov models to cut out the most variable section of the targeted region and use dada2 to create denoised sequences using only that sequence, and then find a consensus sequence for all sequences that match the index region. Tzara (named after Tristan Tzara, a central figure in the Dada art movement) applies this method to rDNA sequences by cutting out the variable ITS2 region using rITSx.
|
2025-03-25 |
r-dada2
|
public |
The dada2 package infers exact amplicon sequence variants (ASVs) from high-throughput amplicon sequencing data, replacing the coarser and less accurate OTU clustering approach. The dada2 pipeline takes as input demultiplexed fastq files, and outputs the sequence variants and their sample-wise abundances after removing substitution and chimera errors. Taxonomic classification is available via a native implementation of the RDP naive Bayesian classifier, and species-level assignment to 16S rRNA gene fragments by exact matching.
|
2025-03-25 |
r-restez
|
public |
Download large sections of 'GenBank' <https://www.ncbi.nlm.nih.gov/genbank/> and generate a local SQL-based database. A user can then query this database using 'restez' functions or through 'rentrez' <https://CRAN.R-project.org/package=rentrez> wrappers.
|
2025-03-25 |
r-monetdblite
|
public |
In-Process Version of 'MonetDB'.
|
2025-03-25 |
r-tikzdevice
|
public |
Provides a graphics output device for R that records plots in a LaTeX-friendly format. The device transforms plotting commands issued by R functions into LaTeX code blocks. When included in a LaTeX document, these blocks are interpreted with the help of 'TikZ'---a graphics package for TeX and friends written by Till Tantau. Using the 'tikzDevice', the text of R plots can contain LaTeX commands such as mathematical formula. The device also allows arbitrary LaTeX code to be inserted into the output stream.
|
2025-03-25 |
r-ritsx
|
public |
This package is an interface to the ITSx command-line utility (http://microbiology.se/software/itsx/) from R.
|
2025-03-25 |