r-gbm
|
public |
An implementation of extensions to Freund and Schapire's AdaBoost algorithm and Friedman's gradient boosting machine. Includes regression methods for least squares, absolute loss, t-distribution loss, quantile regression, logistic, multinomial logistic, Poisson, Cox proportional hazards partial likelihood, AdaBoost exponential loss, Huberized hinge loss, and Learning to Rank measures (LambdaMart). Originally developed by Greg Ridgeway.
|
2025-04-22 |
r-gb
|
public |
A collection of algorithms and functions for fitting data to a generalized lambda distribution via moment matching methods, and generalized bootstrapping.
|
2025-04-22 |
r-gausscov
|
public |
Given the standard linear model, the traditional way of deciding whether to include the jth covariate is to apply the F-test to decide whether the corresponding beta coefficient is zero. The Gaussian covariate method is completely different. The question as to whether the beta coefficient is or is not zero is replaced by the question as to whether the covariate is better or worse than i.i.d. Gaussian noise. The P-value for the covariate is the probability that Gaussian noise is better. Surprisingly, this can be given exactly and it is the same as the P-value for the classical model based on the F-distribution. The Gaussian covariate P-value is model free: it is the same for any data set. Using the idea it is possible to do covariate selection for a small number of covariates 25 by considering all subsets. Post selection inference causes no problems as the P-values hold whatever the data. The idea extends to stepwise regression, again with exact probabilities. In the simplest version the only parameter is a specified cut-off P-value which can be interpreted as the probability of a false positive being included in the final selection. For more information see the web site below and the accompanying papers: L. Davies and L. Duembgen, "Covariate Selection Based on a Model-free Approach to Linear Regression with Exact Probabilities", 2022, <arXiv:2202.01553>. L. Davies, "Linear Regression, Covariate Selection and the Failure of Modelling", 2022, <arXiv:2112.08738>.
|
2025-04-22 |
r-gaselect
|
public |
Provides a genetic algorithm for finding variable subsets in high dimensional data with high prediction performance. The genetic algorithm can use ordinary least squares (OLS) regression models or partial least squares (PLS) regression models to evaluate the prediction power of variable subsets. By supporting different cross-validation schemes, the user can fine-tune the tradeoff between speed and quality of the solution.
|
2025-04-22 |
r-gap
|
public |
As first reported [Zhao, J. H. 2007. "gap: Genetic Analysis Package". J Stat Soft 23(8):1-18. <doi:10.18637/jss.v023.i08>], it is designed as an integrated package for genetic data analysis of both population and family data. Currently, it contains functions for sample size calculations of both population-based and family-based designs, probability of familial disease aggregation, kinship calculation, statistics in linkage analysis, and association analysis involving genetic markers including haplotype analysis with or without environmental covariates. Over the years, the package has been developed in between many projects, hence also in line with the name (gap).
|
2025-04-22 |
r-gammslice
|
public |
Uses a slice sampling-based Markov chain Monte Carlo to conduct Bayesian fitting and inference for generalized additive mixed models. Generalized linear mixed models and generalized additive models are also handled as special cases of generalized additive mixed models. The methodology and software are described in Pham, T.H. and Wand, M.P. (2018). Australian and New Zealand Journal of Statistics, 60, 279-330 <DOI:10.1111/ANZS.12241>.
|
2025-04-22 |
r-gamlss.dist
|
public |
A set of distributions which can be used for modelling the response variables in Generalized Additive Models for Location Scale and Shape, Rigby and Stasinopoulos (2005), <doi:10.1111/j.1467-9876.2005.00510.x>. The distributions can be continuous, discrete or mixed distributions. Extra distributions can be created by transforming any continuous distribution defined on the real line to a distribution defined on ranges 0 to infinity or 0 to 1, by using a 'log' or a 'logit' transformation respectively.
|
2025-04-22 |
r-gamlr
|
public |
The gamma lasso algorithm provides regularization paths corresponding to a range of non-convex cost functions between L0 and L1 norms. As much as possible, usage for this package is analogous to that for the glmnet package (which does the same thing for penalization between L1 and L2 norms). For details see: Taddy (2017 JCGS), 'One-Step Estimator Paths for Concave Regularization', <arXiv:1308.5623>.
|
2025-04-22 |
r-gamesga
|
public |
Finds adaptive strategies for sequential symmetric games using a genetic algorithm. Currently, any symmetric two by two matrix is allowed, and strategies can remember the history of an opponent's play from the previous three rounds of moves in iterated interactions between players. The genetic algorithm returns a list of adaptive strategies given payoffs, and the mean fitness of strategies in each generation.
|
2025-04-22 |
r-gamboost
|
public |
This package provides routines for fitting generalized linear and generalized additive models by likelihood based boosting, using penalized B-splines.
|
2025-04-22 |
r-gam
|
public |
Functions for fitting and working with generalized additive models, as described in chapter 7 of "Statistical Models in S" (Chambers and Hastie (eds), 1991), and "Generalized Additive Models" (Hastie and Tibshirani, 1990).
|
2025-04-22 |
r-galgo
|
public |
Build multivariate predictive models from large datasets having a far larger number of features than samples, such as in functional genomics datasets. Trevino and Falciani (2006) <doi:10.1093/bioinformatics/btl074>.
|
2025-04-22 |
r-gafit
|
public |
A group of sample points is evaluated against a user-defined expression; the sample points are lists of parameters with values that may be substituted into that expression. The genetic algorithm attempts to make the result of the expression as low as possible (usually this would be the sum of residuals squared).
|
2025-04-22 |
r-gadag
|
public |
Learning of sparse, large Directed Acyclic Graphs with a combination of a convex program and a tailored genetic algorithm (see Champion et al. (2017) <https://hal.archives-ouvertes.fr/hal-01172745v2/document>).
|
2025-04-22 |
r-ga
|
public |
Flexible general-purpose toolbox implementing genetic algorithms (GAs) for stochastic optimisation. Binary, real-valued, and permutation representations are available to optimize a fitness function, i.e. a function provided by users depending on their objective function. Several genetic operators are available and can be combined to explore the best settings for the current task. Furthermore, users can define new genetic operators and easily evaluate their performances. Local search using general-purpose optimisation algorithms can be applied stochastically to exploit interesting regions. GAs can be run sequentially or in parallel, using an explicit master-slave parallelisation or a coarse-grain islands approach.
|
2025-04-22 |
r-fwsim
|
public |
Simulates a population under the Fisher-Wright model (fixed or stochastic population size) with a one-step neutral mutation process (stepwise mutation model, logistic mutation model and exponential mutation model supported). The stochastic population sizes are random Poisson distributed and different kinds of population growth are supported. For the stepwise mutation model, it is possible to specify locus and direction specific mutation rate (in terms of upwards and downwards mutation rate). Intermediate generations can be saved in order to study e.g. drift.
|
2025-04-22 |
r-fuzzyranktests
|
public |
Does fuzzy tests and confidence intervals (following Geyer and Meeden, Statistical Science, 2005, <doi:10.1214/088342305000000340>) for sign test and Wilcoxon signed rank and rank sum tests.
|
2025-04-22 |
r-funitroots
|
public |
Provides four addons for analyzing trends and unit roots in financial time series: (i) functions for the density and probability of the augmented Dickey-Fuller Test, (ii) functions for the density and probability of MacKinnon's unit root test statistics, (iii) reimplementations for the ADF and MacKinnon Test, and (iv) an 'urca' Unit Root Test Interface for Pfaff's unit root test suite.
|
2025-04-22 |
r-funchisq
|
public |
Statistical hypothesis testing methods for inferring model-free functional dependency using asymptotic chi-squared or exact distributions. Functional test statistics are asymmetric and functionally optimal, unique from other related statistics. Tests in this package reveal evidence for causality based on the causality-by-functionality principle. They include asymptotic functional chi-squared tests (Zhang & Song 2013) <arXiv:1311.2707>, an adapted functional chi-squared test (Kumar & Song 2022) <doi:10.1093/bioinformatics/btac206>, and an exact functional test (Zhong & Song 2019) <doi:10.1109/TCBB.2018.2809743> (Nguyen et al. 2020) <doi:10.24963/ijcai.2020/372>. The normalized functional chi-squared test was used by Best Performer 'NMSUSongLab' in HPN-DREAM (DREAM8) Breast Cancer Network Inference Challenges (Hill et al. 2016) <doi:10.1038/nmeth.3773>. A function index (Zhong & Song 2019) <doi:10.1186/s12920-019-0565-9> (Kumar et al. 2018) <doi:10.1109/BIBM.2018.8621502> derived from the functional test statistic offers a new effect size measure for the strength of functional dependency, a better alternative to conditional entropy in many aspects. For continuous data, these tests offer an advantage over regression analysis when a parametric functional form cannot be assumed; for categorical data, they provide a novel means to assess directional dependency not possible with symmetrical Pearson's chi-squared or Fisher's exact tests.
|
2025-04-22 |
r-fts
|
public |
Fast operations for time series objects.
|
2025-04-22 |
r-ftnonpar
|
public |
The package contains R functions to perform the methods in nonparametric regression and density estimation described in: Davies, P. L. and Kovac, A. (2001) Local Extremes, Runs, Strings and Multiresolution (with discussion). Annals of Statistics 29, 1-65; Davies, P. L. and Kovac, A. (2004) Densities, Spectral Densities and Modality. Annals of Statistics 32, 1093-1136; Kovac, A. (2006) Smooth functions and local extreme values. Computational Statistics and Data Analysis (to appear); Dümbgen, L. and Kovac, A. (2006) Extensions of smoothing via taut strings; Davies, P. L. (1995) Data features. Statistica Neerlandica 49, 185-245.
|
2025-04-22 |
r-fst
|
public |
Multithreaded serialization of compressed data frames using the 'fst' format. The 'fst' format allows for full random access of stored data and a wide range of compression settings using the LZ4 and ZSTD compressors.
|
2025-04-22 |
r-fsinteract
|
public |
Performs fast detection of interactions in large-scale data using the method of random intersection trees introduced in Shah, R. D. and Meinshausen, N. (2014) <http://www.jmlr.org/papers/v15/shah14a.html>. The algorithm finds potentially high-order interactions in high-dimensional binary two-class classification data, without requiring lower order interactions to be informative. The search is particularly fast when the matrices of predictors are sparse. It can also be used to perform market basket analysis when supplied with a single binary data matrix. Here it will find collections of columns which for many rows contain all 1's.
|
2025-04-22 |
r-fselectorrcpp
|
public |
'Rcpp' (free of 'Java'/'Weka') implementation of 'FSelector' entropy-based feature selection algorithms based on an MDL discretization (Fayyad U. M., Irani K. B.: Multi-Interval Discretization of Continuous-Valued Attributes for Classification Learning. In 13th International Joint Conference on Artificial Intelligence (IJCAI93), pages 1022-1029, Chambery, France, 1993.) <https://www.ijcai.org/Proceedings/93-2/Papers/022.pdf> with sparse matrix support.
|
2025-04-22 |
r-fromo
|
public |
Fast, numerically robust computation of weighted moments via 'Rcpp'. Supports computation on vectors and matrices, and monoidal append of moments. Moments and cumulants over running fixed length windows can be computed, as well as over time-based windows. Moment computations are via a generalization of Welford's method, as described by Bennett et al. (2009) <doi:10.1109/CLUSTR.2009.5289161>.
|
2025-04-22 |
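The r-fromo entry above mentions weighted moments computed with a generalization of Welford's method and a monoidal append of moments. As a rough illustration of that idea only, the following Python sketch tracks a weighted mean and variance in one pass and merges two partial summaries. It is not the package's API: the class name `RunningMoments` and its methods are hypothetical, and it covers only the first two moments, whereas the package description refers to higher moments and cumulants as well.

```python
from dataclasses import dataclass


@dataclass
class RunningMoments:
    """Hypothetical helper: one-pass weighted mean and variance."""

    weight_sum: float = 0.0  # total weight seen so far
    mean: float = 0.0        # weighted mean of the values seen so far
    m2: float = 0.0          # weighted sum of squared deviations from the mean

    def push(self, x: float, w: float = 1.0) -> None:
        """Welford-style update with a single weighted observation."""
        if w <= 0:
            raise ValueError("weights must be positive")
        self.weight_sum += w
        delta = x - self.mean
        self.mean += (w / self.weight_sum) * delta
        # Uses the deviation from the old mean times the deviation from the new mean.
        self.m2 += w * delta * (x - self.mean)

    def merge(self, other: "RunningMoments") -> "RunningMoments":
        """Combine two summaries; an empty summary acts as the identity."""
        total = self.weight_sum + other.weight_sum
        if total == 0:
            return RunningMoments()
        delta = other.mean - self.mean
        mean = self.mean + delta * other.weight_sum / total
        m2 = (self.m2 + other.m2
              + delta * delta * self.weight_sum * other.weight_sum / total)
        return RunningMoments(total, mean, m2)

    def variance(self) -> float:
        """Weight-normalised (population) variance."""
        if self.weight_sum == 0:
            raise ValueError("no observations")
        return self.m2 / self.weight_sum


def summarize(values) -> RunningMoments:
    """Build a summary from an iterable of unit-weight values."""
    acc = RunningMoments()
    for v in values:
        acc.push(v)
    return acc
```

Because `merge` is associative and has an empty summary as its identity, partial summaries computed on separate chunks of data can be combined in any grouping, which is the property the word "monoidal" refers to.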