public |
The trapezoid package provides 'dtrapezoid', 'ptrapezoid', 'qtrapezoid', and 'rtrapezoid' functions for the trapezoidal distribution.
2024-01-16 |
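As an illustration of what density and distribution functions for a trapezoidal distribution compute, here is a minimal Python sketch. It assumes the plain four-parameter trapezoid with support [a, d] and plateau [b, c]; the function names `trapezoid_pdf` and `trapezoid_cdf` and the parameter names are hypothetical and are not the package's API.

```python
def trapezoid_pdf(x, a, b, c, d):
    """Density of a plain trapezoid: rises on [a, b), flat on [b, c], falls on (c, d]."""
    if not (a < b <= c < d):
        raise ValueError("expected a < b <= c < d")
    h = 2.0 / (d + c - b - a)  # plateau height that makes the total area equal 1
    if x < a or x > d:
        return 0.0
    if x < b:
        return h * (x - a) / (b - a)
    if x <= c:
        return h
    return h * (d - x) / (d - c)


def trapezoid_cdf(x, a, b, c, d):
    """Cumulative distribution function obtained by integrating trapezoid_pdf."""
    if not (a < b <= c < d):
        raise ValueError("expected a < b <= c < d")
    h = 2.0 / (d + c - b - a)
    if x <= a:
        return 0.0
    if x >= d:
        return 1.0
    if x < b:
        return h * (x - a) ** 2 / (2.0 * (b - a))
    if x <= c:
        return h * ((b - a) / 2.0 + (x - b))
    return 1.0 - h * (d - x) ** 2 / (2.0 * (d - c))
```

Quantiles could then be obtained by inverting `trapezoid_cdf`, and random draws by applying that inverse to uniform random numbers.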
public |
A latent, quasi-independent truncation time is assumed to be linked with the observed dependent truncation time, the event time, and an unknown transformation parameter via a structural transformation model. The transformation parameter is chosen to minimize the conditional Kendall's tau (Martin and Betensky, 2005) <doi:10.1198/016214504000001538> or the regression coefficient estimates (Jones and Crowley, 1992) <doi:10.2307/2336782>. The marginal distributions of the truncation time and the event time are left completely unspecified. The methodology is applied to survival curve estimation and regression analysis.
2024-01-16 |
public |
Solve optimal transport problems. Compute Wasserstein distances (a.k.a. Kantorovitch, Fortet--Mourier, Mallows, Earth Mover's, or minimal L_p distances), return the corresponding transference plans, and display them graphically. Objects that can be compared include grey-scale images, (weighted) point patterns, and mass vectors.
2024-01-16 |
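For the simplest special case of the Wasserstein distance, two one-dimensional samples of equal size with equal weights, the optimal transference plan matches sorted values in order. The Python sketch below covers only that case; the function name `wasserstein_1d` is hypothetical, and general optimal transport problems (images, weighted point patterns) need a proper solver.

```python
def wasserstein_1d(xs, ys, p=1):
    """p-Wasserstein distance between two equal-size, equally weighted 1-D samples.

    In one dimension the optimal plan pairs the i-th smallest value of xs
    with the i-th smallest value of ys.
    """
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("samples must be non-empty and of equal size")
    if p < 1:
        raise ValueError("p must be at least 1")
    total = sum(abs(x - y) ** p for x, y in zip(sorted(xs), sorted(ys)))
    return (total / len(xs)) ** (1.0 / p)
```

For example, shifting every point of a sample by one unit gives a distance of one for any `p`.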
public |
Estimation of transition probabilities for the illness-death model and/or the three-state progressive model.
2024-01-16 |
public |
'BEAST2' (<https://www.beast2.org>) is a widely used Bayesian phylogenetic tool that uses DNA/RNA/protein data and many model priors to create a posterior of jointly estimated phylogenies and parameters. 'Tracer' (<https://github.com/beast-dev/tracer/>) is a GUI tool to parse and analyze the files generated by 'BEAST2'. This package provides a way to parse and analyze 'BEAST2' input files without active user input, using R function calls instead.
2024-01-16 |
public |
Estimation of transition probabilities for the illness-death model. Both the Aalen-Johansen estimator for a Markov model and a novel non-Markovian estimator by de Una-Alvarez and Meira-Machado (2015) <doi:10.1111/biom.12288>, see also Balboa and de Una-Alvarez (2018) <doi:10.18637/jss.v083.i10>, are included.
2024-01-16 |
public |
R implementation of the software tools developed in the H-CUP (Healthcare Cost and Utilization Project) <https://www.hcup-us.ahrq.gov> and AHRQ (Agency for Healthcare Research and Quality) <https://www.ahrq.gov>. It currently contains functions for mapping ICD-9 codes to the AHRQ comorbidity measures and translating ICD-9 (resp. ICD-10) codes to ICD-10 (resp. ICD-9) codes based on GEM (General Equivalence Mappings) from CMS (Centers for Medicare and Medicaid Services).
2024-01-16 |
public |
Random number generation for the truncated multivariate normal and Student t distribution. Computes probabilities, quantiles and densities, including one-dimensional and bivariate marginal densities. Computes first and second moments (i.e. mean and covariance matrix) for the double-truncated multinormal case.
2024-01-16 |
public |
Set of hydrological functions including an R implementation of the hydrological model TOPMODEL, which is based on the 1995 FORTRAN version by Keith Beven. From version 0.7.0, the package is put into maintenance mode.
2024-01-16 |
public |
The universal storage engine 'TileDB' introduces a powerful on-disk format for multi-dimensional arrays. It supports dense and sparse arrays, dataframes and key-value stores, cloud storage ('S3', 'GCS', 'Azure'), chunked arrays, multiple compression, encryption and checksum filters, uses a fully multi-threaded implementation, supports parallel I/O, data versioning ('time travel'), metadata and groups. It is implemented as an embeddable cross-platform C++ library with APIs from several languages, and integrations.
2024-01-16 |
public |
Provides an interface to the C code for Latent Dirichlet Allocation (LDA) models and Correlated Topics Models (CTM) by David M. Blei and co-authors and the C++ code for fitting LDA models using Gibbs sampling by Xuan-Hieu Phan and co-authors.
2024-01-16 |
public |
Unsupervised text tokenizer focused on computational efficiency. Wraps the 'YouTokenToMe' library <https://github.com/VKCOM/YouTokenToMe> which is an implementation of fast Byte Pair Encoding (BPE) <https://aclanthology.org/P16-1162/>.
2024-01-16 |
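The core idea of Byte Pair Encoding, repeatedly merging the most frequent adjacent pair of symbols, can be sketched in a few lines of Python. This is a didactic, character-level version and not the optimized algorithm of 'YouTokenToMe'; the names `merge_pair`, `learn_bpe`, and `encode` are hypothetical, and ties between equally frequent pairs are broken lexicographically as an arbitrary choice.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace each non-overlapping occurrence of `pair` in `symbols` by the joined symbol."""
    out = []
    i = 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def learn_bpe(words, num_merges):
    """Learn up to `num_merges` merge rules from a list of words."""
    vocab = Counter(tuple(word) for word in words)  # word as symbol tuple -> frequency
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for symbols, freq in vocab.items():
            for left, right in zip(symbols, symbols[1:]):
                pairs[(left, right)] += freq
        if not pairs:
            break  # every word is already a single symbol
        # Most frequent pair; ties resolved by the pair itself for determinism.
        best = max(pairs, key=lambda pr: (pairs[pr], pr))
        merges.append(best)
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            new_vocab[merge_pair(symbols, best)] += freq
        vocab = new_vocab
    return merges


def encode(word, merges):
    """Tokenize a word by applying the learned merges in order."""
    symbols = tuple(word)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return symbols
```

A word made entirely of learned merges collapses to a single token, while unseen material falls back to smaller pieces.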
public |
Convert natural language text into tokens. Includes tokenizers for shingled n-grams, skip n-grams, words, word stems, sentences, paragraphs, characters, shingled characters, lines, Penn Treebank, regular expressions, as well as functions for counting characters, words, and sentences, and a function for splitting longer texts into separate documents, each with the same number of words. The tokenizers have a consistent interface, and the package is built on the 'stringi' and 'Rcpp' packages for fast yet correct tokenization in 'UTF-8'.
2024-01-16 |
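To make the idea of shingled n-grams concrete, here is a small Python sketch: words are extracted with a simple regular expression, and every contiguous run of `n_min` to `n` words is emitted. The function names and the lowercasing choice are assumptions for illustration and do not reproduce the package's interface or its 'stringi'-based word boundary rules.

```python
import re


def tokenize_words(text):
    """Very rough word tokenizer: lowercase, then keep runs of word characters."""
    return re.findall(r"\w+", text.lower())


def shingled_ngrams(tokens, n, n_min=None):
    """All contiguous n-grams with sizes from n_min to n (n_min defaults to n)."""
    n_min = n if n_min is None else n_min
    if n_min < 1 or n_min > n:
        raise ValueError("expected 1 <= n_min <= n")
    out = []
    for start in range(len(tokens)):
        for size in range(n_min, n + 1):
            if start + size <= len(tokens):
                out.append(" ".join(tokens[start:start + size]))
    return out
```

With `n_min` equal to `n` this yields ordinary n-grams; a smaller `n_min` adds the shorter shingles as well.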
public |
Implementation of the classic Genz algorithm and a novel tile-low-rank algorithm for computing relatively high-dimensional multivariate normal (MVN) and Student-t (MVT) probabilities. References used for this package: Foley, James, Andries van Dam, Steven Feiner, and John Hughes. "Computer Graphics: Principles and Practice". Addison-Wesley Publishing Company. Reading, Massachusetts (1987, ISBN:0-201-84840-6 1); Genz, A., "Numerical computation of multivariate normal probabilities," Journal of Computational and Graphical Statistics, 1, 141-149 (1992) <doi:10.1080/10618600.1992.10477010>; Cao, J., Genton, M. G., Keyes, D. E., & Turkiyyah, G. M. "Exploiting Low Rank Covariance Structures for Computing High-Dimensional Normal and Student-t Probabilities," Statistics and Computing, 31.1, 1-16 (2021) <doi:10.1007/s11222-020-09978-y>; Cao, J., Genton, M. G., Keyes, D. E., & Turkiyyah, G. M. "tlrmvnmvt: Computing High-Dimensional Multivariate Normal and Student-t Probabilities with Low-Rank Methods in R," Journal of Statistical Software, 101.4, 1-25 (2022) <doi:10.18637/jss.v101.i04>.
2024-01-16 |
public |
Create browsers for reading full texts from a token list format. Information obtained from text analyses (e.g., topic modeling, word scaling) can be used to annotate the texts.
2024-01-16 |
public |
Calculates empirical TL-moments (trimmed L-moments) of arbitrary order and trimming, and converts them to distribution parameters.
2024-01-16 |
public |
Importance sampling from the truncated multivariate normal using the GHK (Geweke-Hajivassiliou-Keane) simulator. Unlike Gibbs sampling, which can get stuck in one truncation sub-region depending on initial values, this package allows truncation based on disjoint regions that are created by truncation of absolute values. The GHK algorithm uses a simple Cholesky transformation followed by recursive simulation of univariate truncated normals, so there are also no convergence issues. An importance sample is returned along with sampling weights, from which one can calculate integrals over truncated regions for multivariate normals.
2024-01-16 |
public |
A text mining toolkit for Chinese, which includes facilities for Chinese string processing, Chinese NLP support, and encoding detection and conversion. It also provides some functions to support the 'tm' package in Chinese.
2024-01-16 |
public |
With this tool, a user should be able to quickly implement complex random effect models through simple C++ templates. The package combines 'CppAD' (C++ automatic differentiation), 'Eigen' (templated matrix-vector library) and 'CHOLMOD' (sparse matrix routines available from R) to obtain an efficient implementation of the applied Laplace approximation with exact derivatives. Key features are: Automatic sparseness detection, parallelism through 'BLAS' and parallel user templates.
2024-01-16 |
None |
A framework for text mining applications within R.
2024-01-16 |
public |
A graph, while not "tidy" in itself, can be thought of as two tidy data frames describing node and edge data respectively. 'tidygraph' provides an approach to manipulate these two virtual data frames using the API defined in the 'dplyr' package, as well as provides tidy interfaces to a lot of common graph algorithms.
2024-01-16 |
public |
Simple mechanism for placing R graphics in a Tk widget.
2024-01-16 |
public |
Functions and S3 classes for time indexes and time indexed series, which are compatible with FAME frequencies.
2024-01-16 |
public |
Functions for statistical analysis, prediction and control of time series based mainly on Akaike and Nakagawa (1988) <ISBN 978-90-277-2786-2>.
2024-01-16 |
public |
A universal non-uniform random number generator for quite arbitrary distributions with piecewise twice differentiable densities.
2024-01-16 |
public |
Programs for Martinussen and Scheike (2006), `Dynamic Regression Models for Survival Data', Springer Verlag. Plus more recent developments. Additive survival model, semiparametric proportional odds model, fast cumulative residuals, excess risk models and more. Flexible competing risks regression including GOF-tests. Two-stage frailty modelling. PLS for the additive risk model. Lasso in the 'ahaz' package.
2024-01-16 |
public |
Objects to manipulate sequential and seasonal time series. Sequential time series based on time instants and time durations are handled. Both can be regularly or unevenly spaced (overlapping durations are allowed). Only POSIX* formats are used for dates and times. The following classes are provided: 'POSIXcti', 'POSIXctp', 'TimeIntervalDataFrame', 'TimeInstantDataFrame', 'SubtimeDataFrame'; methods to switch from one class to another and to modify the time support of series (hourly time series to daily time series, for instance) are also defined. The tools provided can be used, for instance, to handle environmental monitoring data (not always produced on a regular time base).
2024-01-16 |
public |
Provides a graphics output device for R that records plots in a LaTeX-friendly format. The device transforms plotting commands issued by R functions into LaTeX code blocks. When included in a LaTeX document, these blocks are interpreted with the help of 'TikZ', a graphics package for TeX and friends written by Till Tantau. Using the 'tikzDevice', the text of R plots can contain LaTeX commands such as mathematical formulas. The device also allows arbitrary LaTeX code to be inserted into the output stream.
2024-01-16 |
public |
Efficient routines for manipulation of date-time objects while accounting for time-zones and daylight saving times. The package includes utilities for updating of date-time components (year, month, day etc.), modification of time-zones, rounding of date-times, period addition and subtraction etc. Parts of the 'CCTZ' source code, released under the Apache 2.0 License, are included in this package. See <https://github.com/google/cctz> for more details.
2024-01-16 |
public |
Imports non-tabular data from Excel files into R. Exposes cell content, position and formatting in a tidy structure for further manipulation. Tokenizes Excel formulas. Supports '.xlsx' and '.xlsm' via the embedded 'RapidXML' C++ library <https://rapidxml.sourceforge.net>. Does not support '.xlsb' or '.xls'.
2024-01-16 |
public |
Built on top of the 'tibble' package, 'tibbletime' is an extension that allows for the creation of time aware tibbles. Some immediate advantages of this include: the ability to perform time-based subsetting on tibbles, quickly summarising and aggregating results by time periods, and creating columns that can be used as 'dplyr' time-based groups.
2024-01-16 |
public |
Functions to read, write and display bitmap images stored in the TIFF format. It can read and write both files and in-memory raw vectors, including native image representation.
2024-01-16 |
None |
Tools to help to create tidy data, where each column is a variable, each row is an observation, and each cell contains a single value. 'tidyr' contains tools for changing the shape (pivoting) and hierarchy (nesting and 'unnesting') of a dataset, turning deeply nested lists into rectangular data frames ('rectangling'), and extracting values out of string columns. It also includes tools for working with missing values (both implicit and explicit).
2024-01-16 |
public |
Bayesian nonstationary, semiparametric nonlinear regression and design by treed Gaussian processes (GPs) with jumps to the limiting linear model (LLM). Special cases also implemented include Bayesian linear models, CART, treed linear models, stationary separable and isotropic GPs, and GP single-index models. Provides 1-d and 2-d plotting functions (with projection and slice capabilities) and tree drawing, designed for visualization of tgp-class output. Sensitivity analysis and multi-resolution models are supported. Sequential experimental design and adaptive sampling functions are also provided, including ALM, ALC, and expected improvement. The latter supports derivative-free optimization of noisy black-box functions. For details and tutorials, see Gramacy (2007) <doi:10.18637/jss.v019.i09> and Gramacy & Taddy (2010) <doi:10.18637/jss.v033.i06>.
2024-01-16 |
public |
Converting text to numerical features requires specifically created procedures, which are implemented as steps according to the 'recipes' package. These steps allow for tokenization, filtering, counting (tf and tfidf) and feature hashing.
2024-01-16 |
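As a rough illustration of the tf and tfidf counting mentioned above, the following Python sketch computes one common variant: term frequency as a within-document proportion, multiplied by the natural log of the inverse document frequency. Several weighting conventions exist, so this particular formula and the function name `tf_idf` are assumptions rather than the package's definition.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    tf  = count of term in document / number of tokens in document
    idf = ln(number of documents / number of documents containing the term)
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))  # count each term once per document
    weights = []
    for tokens in docs:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights
```

Under this convention a term that appears in every document gets weight zero, since it does not help distinguish documents.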
public |
Determine the path of the executing script. Compatible with a few popular GUIs: 'Rgui', 'RStudio', 'VSCode', 'Jupyter', and 'Rscript' (shell). Compatible with several functions and packages: 'source()', 'sys.source()', 'debugSource()' in 'RStudio', 'compiler::loadcmp()', 'box::use()', 'knitr::knit()', 'plumber::plumb()', 'shiny::runApp()', 'package:targets', and 'testthat::source_file()'.
2024-01-16 |
public |
Provides a 'tbl_df' class (the 'tibble') with stricter checking and better formatting than the traditional data frame.
2024-01-16 |
public |
Design and analyze three-arm non-inferiority or superiority trials which follow a gold-standard design, i.e. trials with an experimental treatment, an active control, and a placebo control. Methods for the following distributions are implemented: Poisson (Mielke and Munk (2009) <arXiv:0912.4169>), negative binomial (Muetze et al. (2016) <doi:10.1002/sim.6738>), normal (Pigeot et al. (2003) <doi:10.1002/sim.1450>; Hasler et al. (2009) <doi:10.1002/sim.3052>), binary (Friede and Kieser (2007) <doi:10.1002/sim.2543>), nonparametric (Muetze et al. (2017) <doi:10.1002/sim.7176>), exponential (Mielke and Munk (2009) <arXiv:0912.4169>).
2024-01-16 |
public |
A step-up test for genetic rare variants in a gene or in a pathway. The method determines an optimal grouping of rare variants analytically. The method has been described in Hoffmann TJ, Marini NJ, and Witte JS (2010) <doi:10.1371/journal.pone.0013584>.
2024-01-16 |
public |
Fast and memory-friendly tools for text vectorization, topic modeling (LDA, LSA), word embeddings (GloVe), similarities. This package provides a source-agnostic streaming API, which allows researchers to perform analysis of collections of documents which are larger than available RAM. All core functions are parallelized to benefit from multicore machines.
2024-01-16 |
public |
Deconvolving thermoluminescence glow curves according to various kinetic models (first-order, second-order, general-order, and mixed-order) using a modified Levenberg-Marquardt algorithm (More, 1978) <DOI:10.1007/BFb0067700>. It provides the possibility of setting constraints on or fixing any of the parameters. It offers an interactive way to initialize parameters by clicking with a mouse on a plot at positions where peak maxima should be located. The optimal estimate is obtained by "trial-and-error". It also provides routines for simulating first-order, second-order, and general-order glow peaks.
2024-01-16 |
public |
When identifying putative Transcription Factor Binding Sites (TFBSs) from sequences or alignments, we are interested in the significance of a given match score. TFMPvalue provides accurate calculation of the P-value for a given score threshold for Position Weight Matrices, or of the score for a given P-value. It is an interface to code originally made available by Helene Touzet and Jean-Stephane Varre, 2007, Algorithms Mol Biol:2, 15. <doi:10.1186/1748-7188-2-15>.
2024-01-16 |
public |
In Cox's proportional hazards model, covariates are modeled as linear functions and may not be flexible. This package implements the additive trend filtering Cox proportional hazards model as proposed in Jiacheng Wu & Daniela Witten (2019) "Flexible and Interpretable Models for Survival Data", Journal of Computational and Graphical Statistics, <DOI:10.1080/10618600.2019.1592758>. The fitted functions are piecewise polynomial with adaptively chosen knots.
2024-01-16 |
public |
It offers functions for splitting, parsing, tokenizing and creating a vocabulary for big text data files. Moreover, it includes functions for building a document-term matrix and extracting information from those (term-associations, most frequent terms). It also embodies functions for calculating token statistics (collocations, look-up tables, string dissimilarities) and functions to work with sparse matrices. Lastly, it includes functions for Word Vector Representations (i.e. 'GloVe', 'fasttext') and incorporates functions for the calculation of (pairwise) text document dissimilarities. The source code is based on 'C++11' and exported in R through the 'Rcpp', 'RcppArmadillo' and 'BH' packages.
2024-01-16 |
public |
Provides access to the text shaping functionality in the 'HarfBuzz' library and the bidirectional algorithm in the 'Fribidi' library. 'textshaping' is a low-level utility package mainly for graphic devices that expands upon the font tool-set provided by the 'systemfonts' package.
2024-01-16 |
public |
An aid for text mining in R, with a syntax that should be familiar to experienced R users. Provides a wrapper for several topic models that take similarly-formatted input and give similarly-formatted output. Also provides functionality for analyzing and diagnosing topic models.
2024-01-16 |
public |
Function for sparse regression on raw text, regressing a labeling vector onto a feature space consisting of all possible phrases.
2024-01-16 |
public |
Provides triangulations of regular height fields, based on the methods described in "Fast Polygonal Approximation of Terrains and Height Fields" Michael Garland and Paul S. Heckbert (1995) <https://www.mgarland.org/files/papers/scape.pdf> using code from the 'hmm' library written by Michael Fogleman <https://www.github.com/fogleman/hmm>.
2024-01-16 |
public |
An integrated set of extensions to the 'ergm' package to analyze and simulate network evolution based on exponential-family random graph models (ERGM). 'tergm' is a part of the 'statnet' suite of packages for network analysis. See Krivitsky and Handcock (2014) <doi:10.1111/rssb.12014> and Carnegie, Krivitsky, Hunter, and Goodreau (2015) <doi:10.1080/10618600.2014.903087>.
2024-01-16 |
public |
Randomizing exams with 'LaTeX'. If you can compile your main document with 'LaTeX', the program should be able to compile the randomized versions without much extra effort when creating the document.
2024-01-16 |