r-ddalpha
|
public |
Contains procedures for depth-based supervised learning, which are entirely non-parametric, in particular the DDalpha-procedure (Lange, Mosler and Mozharovskyi, 2014 <doi:10.1007/s00362-012-0488-4>). The training data sample is transformed by a statistical depth function to a compact low-dimensional space, where the final classification is done. It also offers an extension to functional data and routines for calculating certain notions of statistical depth functions. 50 multivariate and 5 functional classification problems are included (Pokotylo, Mozharovskyi and Dyckerhoff, 2019 <doi:10.18637/jss.v091.i05>).
|
2024-01-16 |
r-dcov
|
public |
Efficient methods for computing distance covariance and relevant statistics. See Székely et al. (2007) <doi:10.1214/009053607000000505>; Székely and Rizzo (2013) <doi:10.1016/j.jmva.2013.02.012>; Székely and Rizzo (2014) <doi:10.1214/14-AOS1255>; Huo and Székely (2016) <doi:10.1080/00401706.2015.1054435>.
|
2024-01-16 |
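To make the r-dcov entry above concrete, here is a minimal sketch of the sample distance covariance for two univariate samples, using the double-centered distance matrices of Székely et al. (2007). It is written in Python for illustration and is a naive O(n^2) computation, not the efficient method the package implements; the function name `distance_covariance` is hypothetical and is not part of the package's API.

```python
def distance_covariance(x, y):
    """Naive sample distance covariance (V-statistic form) of two 1-D samples.

    Illustrative only: O(n^2) time and memory, hypothetical function name.
    """
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("x and y must be non-empty and of equal length")

    def centered(v):
        # Pairwise absolute distances, then double centering:
        # subtract row mean and column mean, add back the grand mean.
        d = [[abs(a - b) for b in v] for a in v]
        row = [sum(r) / n for r in d]  # equals column means (d is symmetric)
        grand = sum(row) / n
        return [[d[i][j] - row[i] - row[j] + grand for j in range(n)]
                for i in range(n)]

    a, b = centered(x), centered(y)
    v2 = sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / (n * n)
    # Guard against tiny negative values caused by rounding.
    return max(v2, 0.0) ** 0.5
```

A constant sample has zero distance covariance with any other sample, which is a quick sanity check for the sketch.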
r-datavisualizations
|
public |
Gives access to data visualisation methods that are relevant from the data scientist's point of view. The flagship idea of 'DataVisualizations' is the mirrored density plot (MD-plot) for either classified or non-classified multivariate data published in Thrun, M.C. et al.: "Analyzing the Fine Structure of Distributions" (2020), PLoS ONE, <DOI:10.1371/journal.pone.0238835>. The MD-plot outperforms the box-and-whisker diagram (box plot), the violin plot, the bean plot and the geom_violin plot of ggplot2. Furthermore, a collection of various visualization methods for univariate data is provided. In the case of exploratory data analysis, 'DataVisualizations' makes it possible to inspect the distribution of each feature of a dataset visually through a combination of four methods. One of these methods is the Pareto density estimation (PDE) of the probability density function (pdf). Additionally, visualizations of the distribution of distances using PDE, the scatter-density plot using PDE for two variables, as well as the Shepard density plot and the Bland-Altman plot are presented here. Pertaining to classified high-dimensional data, a number of visualizations are described, such as the heat map and the silhouette plot. A political map of the world or Germany can be visualized with the additional information defined by a classification of countries or regions. By extending the political map further, an uncomplicated function for a choropleth map can be used, which is useful for measurements across a geographic area. For categorical features, pie charts, slope charts and fan plots, improved by the ABC analysis, become usable. More detailed explanations are found in the book by Thrun, M.C.: "Projection-Based Clustering through Self-Organization and Swarm Intelligence" (2018) <DOI:10.1007/978-3-658-20540-9>.
|
2024-01-16 |
r-dbarts
|
public |
Fits Bayesian additive regression trees (BART; Chipman, George, and McCulloch (2010) <doi:10.1214/09-AOAS285>) while allowing the updating of predictors or response so that BART can be incorporated as a conditional model in a Gibbs/Metropolis-Hastings sampler. Also serves as a drop-in replacement for package 'BayesTree'.
|
2024-01-16 |
r-date
|
public |
Functions for handling dates.
|
2024-01-16 |
r-dbscan
|
public |
A fast reimplementation of several density-based algorithms of the DBSCAN family. Includes the clustering algorithms DBSCAN (density-based spatial clustering of applications with noise) and HDBSCAN (hierarchical DBSCAN), the ordering algorithm OPTICS (ordering points to identify the clustering structure), shared nearest neighbor clustering, and the outlier detection algorithms LOF (local outlier factor) and GLOSH (global-local outlier score from hierarchies). The implementations use the kd-tree data structure (from library ANN) for faster k-nearest neighbor search. An R interface to fast kNN and fixed-radius NN search is also provided. Hahsler, Piekenbrock and Doran (2019) <doi:10.18637/jss.v091.i01>.
|
2024-01-16 |
r-dblcens
|
public |
Doubly censored data, as described in Chang and Yang (1987) <doi:10.1214/aos/1176350608>, are commonly seen in many fields. We use the EM algorithm to compute the non-parametric MLE (NPMLE) of the cumulative probability function/survival function and the two censoring distributions. One can also specify a constraint F(T)=C, in which case the constrained NPMLE and the -2 log empirical likelihood ratio for this constraint are returned. This can be used to test the hypothesis about the constraint and, by inverting the test, find confidence intervals for a probability or quantile via the empirical likelihood ratio theorem. Influence functions of hat F may also be calculated, but currently this may be slow.
|
2024-01-16 |
r-cyclops
|
public |
This model fitting tool incorporates cyclic coordinate descent and majorization-minimization approaches to fit a variety of regression models found in large-scale observational healthcare data. Implementations focus on computational optimization and fine-scale parallelization to yield efficient inference in massive datasets. Please see: Suchard, Simpson, Zorych, Ryan and Madigan (2013) <doi:10.1145/2414416.2414791>.
|
2024-01-16 |
r-data.table
|
None |
Fast aggregation of large data (e.g. 100GB in RAM), fast ordered joins, fast add/modify/delete of columns by group using no copies at all, list columns, friendly and fast character-separated-value read/write. Offers a natural and flexible syntax, for faster development.
|
2024-01-16 |
r-datagraph
|
public |
Functions to pipe data from 'R' to 'DataGraph', a graphing and analysis application for macOS. Create a live connection using either '.dtable' or '.dtbin' files that can be read by 'DataGraph'. Can save a data frame, a collection of data frames, sequences of data frames and individual vectors. For more information see <https://community.visualdatatools.com/datagraph/knowledge-base/r-package/>.
|
2024-01-16 |
r-datassim
|
public |
For estimation of a variable of interest using the Kalman filter by incorporating results from previous assessments, i.e. through the development of weighted estimates in which the weights are inversely proportional to the variances of the existing and new estimates. For reference see Ehlers et al. (2017) <doi:10.20944/preprints201710.0098.v1>.
|
2024-01-16 |
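The weighting idea in the r-datassim entry above can be sketched as follows. This is a minimal Python illustration, not the package's interface: the function name `combine_estimates` is hypothetical, and the combined variance assumes the two estimates are independent. Under those assumptions the result coincides with a scalar Kalman update with no process noise.

```python
def combine_estimates(prior_mean, prior_var, new_mean, new_var):
    """Combine an existing and a new estimate by inverse-variance weighting.

    Hypothetical helper for illustration; assumes independent estimates.
    """
    if prior_var <= 0 or new_var <= 0:
        raise ValueError("variances must be positive")
    w_prior = 1.0 / prior_var  # weight of the existing estimate
    w_new = 1.0 / new_var      # weight of the new estimate
    mean = (w_prior * prior_mean + w_new * new_mean) / (w_prior + w_new)
    var = 1.0 / (w_prior + w_new)  # variance of the combined estimate
    return mean, var
```

The more precise estimate (smaller variance) dominates the combined mean, and the combined variance is never larger than either input variance.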
r-dap
|
public |
An implementation of the Discriminant Analysis via Projections (DAP) method for high-dimensional binary classification in the case of unequal covariance matrices. See Irina Gaynanova and Tianying Wang (2018) <arXiv:1711.04817v2>.
|
2024-01-16 |
r-cvxr
|
public |
An object-oriented modeling language for disciplined convex programming (DCP) as described in Fu, Narasimhan, and Boyd (2020, <doi:10.18637/jss.v094.i14>). It allows the user to formulate convex optimization problems in a natural way following mathematical convention and DCP rules. The system analyzes the problem, verifies its convexity, converts it into a canonical form, and hands it off to an appropriate solver to obtain the solution. Interfaces to solvers on CRAN and elsewhere are provided, both commercial and open source.
|
2024-01-16 |
r-dang
|
public |
A collection of utility functions.
|
2024-01-16 |
r-daly
|
public |
The DALY Calculator is a free, open-source Graphical User Interface (GUI) for stochastic disability-adjusted life year (DALY) calculation.
|
2024-01-16 |
r-cutpointr
|
public |
Estimate cutpoints that optimize a specified metric in binary classification tasks and validate performance using bootstrapping. Some methods for more robust cutpoint estimation are supported, e.g. a parametric method assuming normal distributions, bootstrapped cutpoints, and smoothing of the metric values per cutpoint using Generalized Additive Models. Various plotting functions are included. For an overview of the package see Thiele and Hirschfeld (2021) <doi:10.18637/jss.v098.i11>.
|
2024-01-16 |
r-ctsem
|
public |
Hierarchical continuous (and discrete) time state space modelling, for linear and nonlinear systems measured by continuous variables, with limited support for binary data. The subject specific dynamic system is modelled as a stochastic differential equation (SDE) or difference equation; measurement models are typically multivariate normal factor models. Linear mixed effects SDEs estimated via maximum likelihood and optimization are the default. Nonlinearities (state dependent parameters) and random effects on all parameters are possible, using either maximum likelihood / maximum a posteriori optimization (with optional importance sampling) or Stan's Hamiltonian Monte Carlo sampling. See <https://github.com/cdriveraus/ctsem/raw/master/vignettes/hierarchicalmanual.pdf> for details. Priors may be used. For the conceptual overview of the hierarchical Bayesian linear SDE approach, see <https://www.researchgate.net/publication/324093594_Hierarchical_Bayesian_Continuous_Time_Dynamic_Modeling>. Exogenous inputs may also be included; for an overview of such possibilities see <https://www.researchgate.net/publication/328221807_Understanding_the_Time_Course_of_Interventions_with_Continuous_Time_Dynamic_Models>. Stan based functions are not available on 32 bit Windows systems at present. <https://cdriver.netlify.app/> contains some tutorial blog posts.
|
2024-01-16 |
r-cyclertools
|
public |
A suite of functions for analysing cycling data.
|
2024-01-16 |
r-cxhull
|
public |
Computes the convex hull in arbitrary dimension, based on the Qhull library (<http://www.qhull.org>). The package provides a complete description of the convex hull: edges, ridges, facets, adjacencies. Triangulation is optional.
|
2024-01-16 |
r-cusumdesign
|
public |
Computation of decision intervals (H) and average run lengths (ARL) for CUSUM charts. Details of the method are given in Hawkins and Olwell (2012): Cumulative sum charts and charting for quality improvement, Springer Science & Business Media.
|
2024-01-16 |
r-cusp
|
public |
Cobb's maximum likelihood method for cusp-catastrophe modeling (Grasman, van der Maas, and Wagenmakers (2009) <doi:10.18637/jss.v032.i08>; Cobb (1981), Behavioral Science, 26(1), 75-78). Includes a cusp() function for model fitting, and several utility functions for plotting, and for comparing the model to linear regression and logistic curve models.
|
2024-01-16 |
r-curstatci
|
public |
Computes the maximum likelihood estimator, the smoothed maximum likelihood estimator and pointwise bootstrap confidence intervals for the distribution function under current status data. Groeneboom and Hendrickx (2017) <doi:10.1214/17-EJS1345>.
|
2024-01-16 |
r-cusum
|
public |
Provides functions for constructing and evaluating CUSUM charts and RA-CUSUM charts with focus on false signal probability.
|
2024-01-16 |
r-cubist
|
public |
Regression modeling using rules with added instance-based corrections.
|
2024-01-16 |
r-cstab
|
public |
Selection of the number of clusters in cluster analysis using stability methods.
|
2024-01-16 |
r-cubature
|
public |
R wrappers around the cubature C library of Steven G. Johnson for adaptive multivariate integration over hypercubes and the Cuba C library of Thomas Hahn for deterministic and Monte Carlo integration. Scalar and vector interfaces for cubature and Cuba routines are provided; the vector interfaces are highly recommended as demonstrated in the package vignette.
|
2024-01-16 |
r-cubfits
|
public |
Estimating mutation and selection coefficients on synonymous codon bias usage based on models of ribosome overhead cost (ROC). Multinomial logistic regression and Markov Chain Monte Carlo are used to estimate and predict protein production rates with/without the presence of expressions and measurement errors. Work flows with examples for simulation, estimation and prediction processes are also provided with parallelization speedup. The whole framework is tested with yeast genome and gene expression data of Yassour, et al. (2009) <doi:10.1073/pnas.0812841106>.
|
2024-01-16 |
r-csvread
|
public |
Functions for loading large (10M+ lines) CSV and other delimited files, similar to read.csv, but typically faster and using less memory than the standard R loader. While not entirely general, it covers many common use cases when the types of columns in the CSV file are known in advance. In addition, the package provides a class 'int64', which represents 64-bit integers exactly when reading from a file. The latter is useful when working with 64-bit integer identifiers exported from databases. The CSV file loader supports common column types including 'integer', 'double', 'string', and 'int64', leaving further type transformations to the user.
|
2024-01-16 |
r-crs
|
public |
Regression splines that handle a mix of continuous and categorical (discrete) data often encountered in applied settings. I would like to gratefully acknowledge support from the Natural Sciences and Engineering Research Council of Canada (NSERC, <https://www.nserc-crsng.gc.ca>), the Social Sciences and Humanities Research Council of Canada (SSHRC, <https://www.sshrc-crsh.gc.ca>), and the Shared Hierarchical Academic Research Computing Network (SHARCNET, <https://www.sharcnet.ca>). We would also like to acknowledge the contributions of the GNU GSL authors. In particular, we adapt the GNU GSL B-spline routine gsl_bspline.c adding automated support for quantile knots (in addition to uniform knots), providing missing functionality for derivatives, and for extending the splines beyond their endpoints.
|
2024-01-16 |
r-crqa
|
public |
Auto, Cross and Multi-dimensional recurrence quantification analysis. Different methods for computing recurrence are proposed (cross vs. multidimensional, and profile, i.e., only looking at the diagonal recurrent points, vs. in-depth measures of the whole cross-recurrence plot), as well as functions for optimization and plotting. Please refer to Coco and others (2021) <doi:10.32614/RJ-2021-062>, Coco and Dale (2014) <doi:10.3389/fpsyg.2014.00510> and Wallot (2018) <doi:10.1080/00273171.2018.1512846> for further details about the method.
|
2024-01-16 |
r-crch
|
public |
Different approaches to censored or truncated regression with conditional heteroscedasticity are provided. First, continuous distributions can be used for the (right and/or left censored or truncated) response with separate linear predictors for the mean and variance. Second, cumulative link models for ordinal data (obtained by interval-censoring continuous data) can be employed for heteroscedastic extended logistic regression (HXLR). In the latter type of models, the intercepts depend on the thresholds that define the intervals. Infrastructure for working with censored or truncated normal, logistic, and Student-t distributions is also provided, i.e., d/p/q/r functions and distributions3 objects.
|
2024-01-16 |
r-crawl
|
public |
Fit continuous-time correlated random walk models with time indexed covariates to animal telemetry data. The model is fit using the Kalman-filter on a state space version of the continuous-time stochastic movement process.
|
2024-01-16 |
r-crrsc
|
public |
Extension of 'cmprsk' to stratified and clustered data. A goodness of fit test for the Fine-Gray model is also provided. Methods are detailed in the following articles: Zhou et al. (2011) <doi:10.1111/j.1541-0420.2010.01493.x>, Zhou et al. (2012) <doi:10.1093/biostatistics/kxr032>, Zhou et al. (2013) <doi:10.1002/sim.5815>.
|
2024-01-16 |
r-cronr
|
public |
Create, edit, and remove 'cron' jobs on your unix-alike system. The package provides a set of easy-to-use wrappers to 'crontab'. It also provides an RStudio add-in to easily launch and schedule your scripts.
|
2024-01-16 |
r-crm
|
public |
Functions for phase I clinical trials using the continual reassessment method.
|
2024-01-16 |
r-crimcv
|
public |
A finite mixture of Zero-Inflated Poisson (ZIP) models for analyzing criminal trajectories.
|
2024-01-16 |
r-crfsuite
|
public |
Wraps the 'CRFsuite' library <https://github.com/chokkan/crfsuite> allowing users to fit a Conditional Random Field model and to apply it on existing data. The focus of the implementation is in the area of Natural Language Processing where this R package allows you to easily build and apply models for named entity recognition, text chunking, part of speech tagging, intent recognition or classification of any category you have in mind. Next to training, a small web application is included in the package to allow you to easily construct training data.
|
2024-01-16 |
r-crf
|
public |
Implements modeling and computational tools for conditional random fields (CRF) model as well as other probabilistic undirected graphical models of discrete data with pairwise and unary potentials.
|
2024-01-16 |
r-cplm
|
public |
Likelihood-based and Bayesian methods for various compound Poisson linear models based on Zhang, Yanwei (2013) <https://link.springer.com/article/10.1007/s11222-012-9343-7>.
|
2024-01-16 |
r-credule
|
public |
Provides functions to bootstrap credit curves from market quotes (Credit Default Swap (CDS) spreads) and to price Credit Default Swaps.
|
2024-01-16 |
r-cpprouting
|
public |
Calculation of distances, shortest paths and isochrones on weighted graphs using several variants of Dijkstra's algorithm. Proposed algorithms are unidirectional Dijkstra (Dijkstra, E. W. (1959) <doi:10.1007/BF01386390>), bidirectional Dijkstra (Goldberg, Andrew & Fonseca F. Werneck, Renato (2005) <https://archive.siam.org/meetings/alenex05/papers/03agoldberg.pdf>), A* search (P. E. Hart, N. J. Nilsson and B. Raphael (1968) <doi:10.1109/TSSC.1968.300136>), new bidirectional A* (Pijls & Post (2009) <https://repub.eur.nl/pub/16100/ei2009-10.pdf>), Contraction hierarchies (R. Geisberger, P. Sanders, D. Schultes and D. Delling (2008) <doi:10.1007/978-3-540-68552-4_24>), PHAST (D. Delling, A. Goldberg, A. Nowatzyk, R. Werneck (2011) <doi:10.1016/j.jpdc.2012.02.007>). Algorithms for solving the traffic assignment problem are All-or-Nothing assignment, Method of Successive Averages, Frank-Wolfe algorithm (M. Fukushima (1984) <doi:10.1016/0191-2615(84)90029-8>), Conjugate and Bi-Conjugate Frank-Wolfe algorithms (M. Mitradjieva, P. O. Lindberg (2012) <doi:10.1287/trsc.1120.0409>), Algorithm-B (R. B. Dial (2006) <doi:10.1016/j.trb.2006.02.008>).
|
2024-01-16 |
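As an illustration of the simplest algorithm named in the r-cpprouting entry above, the following Python sketch implements unidirectional Dijkstra with a binary heap and derives an isochrone from the resulting distances. It is not the package's interface: the adjacency-dict graph format and the names `dijkstra` and `isochrone` are assumptions made for this example, and edge weights are assumed to be non-negative.

```python
import heapq


def dijkstra(graph, source):
    """Shortest distances from source; graph maps node -> list of (neighbor, weight).

    Illustrative sketch; assumes non-negative edge weights.
    """
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry, a shorter path to u was already found
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def isochrone(graph, source, limit):
    """Nodes reachable from source within the given cost limit."""
    return {n for n, d in dijkstra(graph, source).items() if d <= limit}
```

For example, with `graph = {"a": [("b", 1.0), ("c", 4.0)], "b": [("c", 2.0)], "c": []}`, the shortest distance from `"a"` to `"c"` goes through `"b"`.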
r-cpm
|
public |
Sequential and batch change detection for univariate data streams, using the change point model framework. Functions are provided to allow nonparametric distribution-free change detection in the mean, variance, or general distribution of a given sequence of observations. Parametric change detection methods are also provided for Gaussian, Bernoulli and Exponential sequences. Both the batch (Phase I) and sequential (Phase II) settings are supported, and the sequences may contain either a single or multiple change points. A full description of this package is available in Ross, G.J. (2015), "Parametric and nonparametric sequential change detection in R", available at <https://www.jstatsoft.org/article/view/v066i03>.
|
2024-01-16 |
r-cqrreg
|
public |
Estimate quantile regression (QR) and composite quantile regression (CQR), optionally with an adaptive lasso penalty, using the interior point (IP), majorize and minimize (MM), coordinate descent (CD), and alternating direction method of multipliers (ADMM) algorithms.
|
2024-01-16 |
r-cpcg
|
public |
Solves systems of linear equations using the (preconditioned) conjugate gradient algorithm, with improved efficiency through the Armadillo templated 'C++' linear algebra library and flexibility for a user-specified preconditioning method. Please check <https://github.com/styvon/cPCG> for the latest updates.
|
2024-01-16 |
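The preconditioned conjugate gradient method mentioned in the r-cpcg entry above can be sketched in a few lines. The Python version below uses a Jacobi (diagonal) preconditioner on a dense list-of-lists matrix purely for illustration; it assumes the matrix is symmetric positive definite, the function name `pcg_jacobi` is hypothetical, and it does not reflect the package's own interface or its Armadillo-based implementation.

```python
def pcg_jacobi(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b by conjugate gradient with a Jacobi preconditioner.

    Illustrative sketch; assumes A is symmetric positive definite.
    """
    n = len(b)

    def matvec(x):
        return [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]

    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    m_inv = [1.0 / A[i][i] for i in range(n)]  # inverse of diag(A)
    x = [0.0] * n
    r = list(b)                                # residual b - A x for x = 0
    z = [m_inv[i] * r[i] for i in range(n)]    # preconditioned residual
    p = list(z)
    rz = dot(r, z)
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 < tol:
            break                              # converged
        Ap = matvec(p)
        alpha = rz / dot(p, Ap)                # step length along p
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * Ap[i] for i in range(n)]
        z = [m_inv[i] * r[i] for i in range(n)]
        rz_new = dot(r, z)
        p = [z[i] + (rz_new / rz) * p[i] for i in range(n)]
        rz = rz_new
    return x
```

For the system with `A = [[4.0, 1.0], [1.0, 3.0]]` and `b = [1.0, 2.0]`, the exact solution is `[1/11, 7/11]`, which the sketch should reproduce up to the tolerance.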
r-coxsei
|
public |
Fit a CoxSEI (Cox type Self-Exciting Intensity) model to right-censored counting process data.
|
2024-01-16 |
r-corpustools
|
public |
Provides text analysis in R, focusing on the use of a tokenized text format. In this format, the positions of tokens are maintained, and each token can be annotated (e.g., part-of-speech tags, dependency relations). Prominent features include advanced Lucene-like querying for specific tokens or contexts (e.g., documents, sentences), similarity statistics for words and documents, exporting to DTM for compatibility with many text analysis packages, and the possibility to reconstruct original text from tokens to facilitate interpretation.
|
2024-01-16 |
r-coxphw
|
public |
Implements weighted estimation in Cox regression as proposed by Schemper, Wakounig and Heinze (Statistics in Medicine, 2009, <doi:10.1002/sim.3623>) and as described in Dunkler, Ploner, Schemper and Heinze (Journal of Statistical Software, 2018, <doi:10.18637/jss.v084.i02>). Weighted Cox regression provides unbiased average hazard ratio estimates also in case of non-proportional hazards. The approximated generalized concordance probability, an effect size measure for clear-cut decisions, can be obtained. The package provides options to estimate time-dependent effects conveniently by including interactions of covariates with arbitrary functions of time, with or without making use of the weighting option.
|
2024-01-16 |
r-coxplus
|
public |
A high performance package for estimating the Cox model when an event has more than one cause. It also supports random and fixed effects, tied events, and time-varying variables.
|
2024-01-16 |
r-coxrobust
|
public |
An implementation of robust estimation in the Cox model. Functionality includes efficient and robust fitting of the Cox proportional hazards regression model in its basic form, where explanatory variables are time independent with one event per subject. The method is based on a smooth modification of the partial likelihood.
|
2024-01-16 |
r-coxme
|
public |
Fit Cox proportional hazards models containing both fixed and random effects. The random effects can have a general form, of which familial interactions (a "kinship" matrix) is a particular special case. Note that the simplest case of a mixed effects Cox model, i.e. a single random per-group intercept, is also called a "frailty" model. The approach is based on Ripatti and Palmgren, Biometrics 2002.
|
2024-01-16 |