r-tokenizers
|
public |
Convert natural language text into tokens. Includes tokenizers for shingled n-grams, skip n-grams, words, word stems, sentences, paragraphs, characters, shingled characters, lines, tweets, Penn Treebank, regular expressions, as well as functions for counting characters, words, and sentences, and a function for splitting longer texts into separate documents, each with the same number of words. The tokenizers have a consistent interface, and the package is built on the 'stringi' and 'Rcpp' packages for fast yet correct tokenization in 'UTF-8'.
|
2025-03-25 |
r-tmb
|
public |
With this tool, a user should be able to quickly implement complex random effect models through simple C++ templates. The package combines 'CppAD' (C++ automatic differentiation), 'Eigen' (templated matrix-vector library) and 'CHOLMOD' (sparse matrix routines available from R) to obtain an efficient implementation of the applied Laplace approximation with exact derivatives. Key features are: Automatic sparseness detection, parallelism through 'BLAS' and parallel user templates.
|
2025-03-25 |
r-tinytex
|
public |
Helper functions to install and maintain the 'LaTeX' distribution named 'TinyTeX' (<https://yihui.name/tinytex/>), a lightweight, cross-platform, portable, and easy-to-maintain version of 'TeX Live'. This package also contains helper functions to compile 'LaTeX' documents, and install missing 'LaTeX' packages automatically.
|
2025-03-25 |
r-threejs
|
public |
Create interactive 3D scatter plots, network plots, and globes using the 'three.js' visualization library (<https://threejs.org>).
|
2025-03-25 |
r-tfdatasets
|
public |
Interface to 'TensorFlow' Datasets, a high-level library for building complex input pipelines from simple, re-usable pieces. See <https://www.tensorflow.org/programmers_guide/datasets> for additional details.
|
2025-03-25 |
r-textshape
|
public |
Tools that can be used to reshape and restructure text data.
|
2025-03-25 |
r-text2vec
|
public |
Fast and memory-friendly tools for text vectorization, topic modeling (LDA, LSA), word embeddings (GloVe), similarities. This package provides a source-agnostic streaming API, which allows researchers to perform analysis of collections of documents which are larger than available RAM. All core functions are parallelized to benefit from multicore machines.
|
2025-03-25 |
r-teachingdemos
|
public |
Demonstration functions that can be used in a classroom to demonstrate statistical concepts, or on your own to better understand the concepts or the programming.
|
2025-03-25 |
r-tcltk2
|
public |
A series of additional Tcl commands and Tk widgets with style and various functions (under Windows: DDE exchange, access to the registry and icon manipulation) to supplement the tcltk package
|
2025-03-25 |
r-tau
|
public |
Utilities for text analysis.
|
2025-03-25 |
r-syuzhet
|
public |
Extracts sentiment and sentiment-derived plot arcs from text using a variety of sentiment dictionaries conveniently packaged for consumption by R users. Implemented dictionaries include "syuzhet" (default) developed in the Nebraska Literary Lab "afinn" developed by Finn {\AA}rup Nielsen, "bing" developed by Minqing Hu and Bing Liu, and "nrc" developed by Mohammad, Saif M. and Turney, Peter D. Applicable references are available in README.md and in the documentation for the "get_sentiment" function. The package also provides a hack for implementing Stanford's coreNLP sentiment parser. The package provides several methods for plot arc normalization.
|
2025-03-25 |
r-survmisc
|
public |
A collection of functions to help in the analysis of right-censored survival data. These extend the methods available in package:survival.
|
2025-03-25 |
r-survminer
|
public |
Contains the function 'ggsurvplot()' for drawing easily beautiful and 'ready-to-publish' survival curves with the 'number at risk' table and 'censoring count plot'. Other functions are also available to plot adjusted curves for `Cox` model and to visually examine 'Cox' model assumptions.
|
2025-03-25 |
r-survey
|
public |
Summary statistics, two-sample tests, rank tests, generalised linear models, cumulative link models, Cox models, loglinear models, and general maximum pseudolikelihood estimation for multistage stratified, cluster-sampled, unequally weighted survey samples. Variances by Taylor series linearisation or replicate weights. Post-stratification, calibration, and raking. Two-phase subsampling designs. Graphics. PPS sampling without replacement. Principal components, factor analysis.
|
2025-03-25 |
r-strucchange
|
public |
Testing, monitoring and dating structural changes in (linear) regression models. strucchange features tests/methods from the generalized fluctuation test framework as well as from the F test (Chow test) framework. This includes methods to fit, plot and test fluctuation processes (e.g., CUSUM, MOSUM, recursive/moving estimates) and F statistics, respectively. It is possible to monitor incoming data online using fluctuation processes. Finally, the breakpoints in regression models with structural changes can be estimated together with confidence intervals. Emphasis is always given to methods for visualizing the data.
|
2025-03-25 |
r-stargazer
|
public |
Produces LaTeX code, HTML/CSS code and ASCII text for well-formatted tables that hold regression analysis results from several models side-by-side, as well as summary statistics.
|
2025-03-25 |
r-stabs
|
public |
Resampling procedures to assess the stability of selected variables with additional finite sample error control for high-dimensional variable selection procedures such as Lasso or boosting. Both, standard stability selection (Meinshausen & Buhlmann, 2010, <doi:10.1111/j.1467-9868.2010.00740.x>) and complementary pairs stability selection with improved error bounds (Shah & Samworth, 2013, <doi:10.1111/j.1467-9868.2011.01034.x>) are implemented. The package can be combined with arbitrary user specified variable selection approaches.
|
2025-03-25 |
r-spikeslab
|
public |
Spike and slab for prediction and variable selection in linear regression models. Uses a generalized elastic net for variable selection.
|
2025-03-25 |
r-sparsepp
|
public |
Provides interface to 'sparsepp' - fast, memory efficient hash map. It is derived from Google's excellent 'sparsehash' implementation. We believe 'sparsepp' provides an unparalleled combination of performance and memory usage, and will outperform your compiler's unordered_map on both counts. Only Google's 'dense_hash_map' is consistently faster, at the cost of much greater memory usage (especially when the final size of the map is not known in advance).
|
2025-03-25 |
r-sparsem
|
public |
Some basic linear algebra functionality for sparse matrices is provided: including Cholesky decomposition and backsolving as well as standard R subsetting and Kronecker products.
|
2025-03-25 |
r-snowballc
|
public |
An R interface to the C libstemmer library that implements Porter's word stemming algorithm for collapsing words to a common root to aid comparison of vocabulary. Currently supported languages are Danish, Dutch, English, Finnish, French, German, Hungarian, Italian, Norwegian, Portuguese, Romanian, Russian, Spanish, Swedish and Turkish.
|
2025-03-25 |
r-snow
|
public |
Support for simple parallel computing in R.
|
2025-03-25 |
r-snakecase
|
public |
A consistent, flexible and easy to use tool to parse and convert strings into cases like snake or camel among others.
|
2025-03-25 |
r-sjmisc
|
public |
Collection of miscellaneous utility functions, supporting data transformation tasks like recoding, dichotomizing or grouping variables, setting and replacing missing values. The data transformation functions also support labelled data, and all integrate seamlessly into a 'tidyverse'-workflow.
|
2025-03-25 |
r-sjlabelled
|
public |
Collection of functions dealing with labelled data, like reading and writing data between R and other statistical software packages like 'SPSS', 'SAS' or 'Stata', and working with labelled data. This includes easy ways to get, set or change value and variable label attributes, to convert labelled vectors into factors or numeric (and vice versa), or to deal with multiple declared missing values.
|
2025-03-25 |