
r / packages

Package Name Access Summary Updated
r-acss.data public Data only package providing the algorithmic complexity of short strings, computed using the coding theorem method. For a given set of symbols in a string, all possible or a large number of random samples of Turing machines (TM) with a given number of states (e.g., 5) and number of symbols corresponding to the number of symbols in the strings were simulated until they reached a halting state or failed to end. This package contains data on 4.5 million strings from length 1 to 12 simulated on TMs with 2, 4, 5, 6, and 9 symbols. The complexity of the string corresponds to the distribution of the halting states of the TMs. 2024-01-16
r-acid public Functions for the analysis of income distributions for subgroups of the population as defined by a set of variables like age, gender, region, etc. This entails a Kolmogorov-Smirnov test for a mixture distribution as well as functions for moments, inequality measures, entropy measures and polarisation measures of income distributions. This package thus aids the analysis of income inequality by offering tools for the exploratory analysis of income distributions at the disaggregated level. 2024-01-16
r-accessibility public A set of fast and convenient functions to calculate multiple transport accessibility measures. Given a pre-computed travel cost matrix and a land use dataset (containing the location of jobs, healthcare and population, for example), the package allows one to calculate active and passive accessibility levels using multiple accessibility measures, such as: cumulative opportunities (using either travel cost cutoffs or intervals), minimum travel cost to closest N number of activities, gravity-based (with different decay functions) and different floating catchment area methods. 2024-01-16
r-acswr public A book designed to meet the requirements of masters students. Tattar, P.N., Suresh, R., and Manjunath, B.G. "A Course in Statistics with R", J. Wiley, ISBN 978-1-119-15272-9. 2024-01-16
r-acss public Main functionality is to provide the algorithmic complexity for short strings, an approximation of the Kolmogorov Complexity of a short string using the coding theorem method (see ?acss). The database containing the complexity is provided in the data only package acss.data, this package provides functions accessing the data such as prob_random returning the posterior probability that a given string was produced by a random process. In addition, two traditional (but problematic) measures of complexity are also provided: entropy and change complexity. 2024-01-16
r-acs public Provides a general toolkit for downloading, managing, analyzing, and presenting data from the U.S. Census (<https://www.census.gov/data/developers/data-sets.html>), including SF1 (Decennial short-form), SF3 (Decennial long-form), and the American Community Survey (ACS). Confidence intervals provided with ACS data are converted to standard errors to be bundled with estimates in complex acs objects. Package provides new methods to conduct standard operations on acs objects and present/plot data in statistically appropriate ways. 2024-01-16
r-academictwitter public Package to query the Twitter Academic Research Product Track, providing access to full-archive search and other v2 API endpoints. Functions are written with academic research in mind. They provide flexibility in how the user wishes to store collected data, and encourage regular storage of data to mitigate loss when collecting large volumes of tweets. They also provide workarounds to manage and reshape the format in which data is provided on the client side. 2024-01-16
r-acp public Analysis of count data exhibiting autoregressive properties, using the Autoregressive Conditional Poisson model (ACP(p,q)) proposed by Heinen (2003). 2024-01-16
r-acopula public Archimax copulas are a mixture of Archimedean and EV copulas. The package provides definitions of several parametric families of generator and dependence function, computes CDF and PDF, estimates parameters, tests for goodness of fit, generates random samples and checks copula properties for custom constructs. In the 2-dimensional case explicit formulas for density are used, contrary to higher dimensions, where all derivatives are linearly approximated. Several non-archimax families (normal, FGM, Plackett) are provided as well. 2024-01-16
r-acnr public Provides SNP array data from different types of copy-number regions. These regions were identified manually by the authors of the package and may be used to generate realistic data sets with known truth. 2024-01-16
r-acfmperiod public Non-robust and robust computations of the sample autocovariance (ACOVF) and sample autocorrelation functions (ACF) of univariate and multivariate processes. The methodology consists in reversing the diagonalization procedure involving the periodogram or the cross-periodogram and the Fourier transform vectors, and, thus, obtaining the ACOVF or the ACF as discussed in Fuller (1995) <doi:10.1002/9780470316917>. The robust version is obtained by fitting robust M-regressors to obtain the M-periodogram or M-cross-periodogram as discussed in Reisen et al. (2017) <doi:10.1016/j.jspi.2017.02.008>. 2024-01-16
r-ace2fastq public The ACE file format is used in genomics to store contigs from sequencing machines. This tool converts it into FASTQ format. Both formats contain the sequence characters and their corresponding quality information. Unlike the FASTQ file, the ACE file stores the quality values numerically. The conversion algorithm uses the standard Sanger formula. The package facilitates insertion into pipelines, and content inspection. 2024-01-16
r-abcanalysis public For a given data set, the package provides a novel method of computing precise limits to acquire subsets which are easily interpreted. Closely related to the Lorenz curve, the ABC curve visualizes the data by graphically representing the cumulative distribution function. Based on an ABC analysis the algorithm calculates, with the help of the ABC curve, the optimal limits by exploiting the mathematical properties pertaining to distribution of analyzed items. The data containing positive values is divided into three disjoint subsets A, B and C, with subset A comprising very profitable values, i.e., the largest data values ("the important few"), subset B comprising values where the yield equals the effort required to obtain it, and subset C comprising non-profitable values, i.e., the smallest data values ("the trivial many"). Package is based on "Computed ABC Analysis for rational Selection of most informative Variables in multivariate Data", PLoS One. Ultsch A., Lotsch J. (2015) <DOI:10.1371/journal.pone.0129767>. 2024-01-16
r-abd public The abd package contains data sets and sample code for The Analysis of Biological Data by Michael Whitlock and Dolph Schluter (2009; Roberts & Company Publishers). 2024-01-16
r-acceptancesampling public Provides functionality for creating and evaluating acceptance sampling plans. Sampling plans can be single, double or multiple. 2024-01-16
r-ac3net public Infers directional conservative causal core (gene) networks. It is an advanced version of the algorithm C3NET that provides a directional network. Gokmen Altay (2018) <doi:10.1101/271031>, bioRxiv. 2024-01-16
r-aca public Offers an interactive function for the detection of breakpoints in series. 2024-01-16
r-abc public Implements several ABC algorithms for performing parameter estimation, model selection, and goodness-of-fit. Cross-validation tools are also available for measuring the accuracy of ABC estimates, and to calculate the misclassification probabilities of different models. 2024-01-16
r-abps public An implementation of the Abnormal Blood Profile Score (ABPS, part of the Athlete Biological Passport program of the World Anti-Doping Agency), which combines several blood parameters into a single score in order to detect blood doping (Sottas et al. (2006) <doi:10.2202/1557-4679.1011>). The package also contains functions to calculate other scores used in anti-doping programs, such as the OFF-score (Gore et al. (2003) <http://www.haematologica.org/content/88/3/333>), as well as example data. 2024-01-16
r-abodoutlier public Performs angle-based outlier detection on a given dataframe. Three methods are available: a full but slow implementation using all the data that has cubic complexity, a fully randomized one which is much more efficient, and another using k-nearest neighbours. These algorithms are especially well suited for high dimensional data outlier detection. 2024-01-16
r-abnormality public Contains the functions to implement the methodology and considerations laid out by Marks et al. in the manuscript Measuring Abnormality in High Dimensional Spaces: Applications in Biomechanical Gait Analysis. As of 2/27/2018 this paper has been submitted and is under scientific review. Using high-dimensional datasets to measure a subject’s overall level of abnormality as compared to a reference population is often needed in outcomes research. Utilizing applications in instrumented gait analysis, that article demonstrates how using data that is inherently non-independent to measure overall abnormality may bias results. A methodology is introduced to address this bias to accurately measure overall abnormality in high dimensional spaces. While this methodology is in line with previous literature, it differs in two major ways. Advantageously, it can be applied to datasets in which the number of observations is less than the number of features/variables, and it can be abstracted to practically any number of domains or dimensions. After applying the proposed methodology to the original data, the researcher is left with a set of uncorrelated variables (i.e. principal components) with which overall abnormality can be measured without bias. Different considerations are discussed in that article in deciding the appropriate number of principal components to keep and the aggregate distance measure to utilize. 2024-01-16
r-abind public Combine multidimensional arrays into a single array. This is a generalization of 'cbind' and 'rbind'. Works with vectors, matrices, and higher-dimensional arrays. Also provides functions 'adrop', 'asub', and 'afill' for manipulating, extracting and replacing data in arrays. 2024-01-16
r-abe public Performs augmented backward elimination and checks the stability of the obtained model. Augmented backward elimination combines significance or information based criteria with the change in estimate to either select the optimal model for prediction purposes or to serve as a tool to obtain a practically sound, highly interpretable model. More details can be found in Dunkler et al. (2014) <doi:10.1371/journal.pone.0113677>. 2024-01-16
r-a3 public Supplies tools for tabulating and analyzing the results of predictive models. The methods employed are applicable to virtually any predictive model and make comparisons between different methodologies straightforward. 2024-01-16
r-abcp2 public Tests the goodness of fit of a distribution of offspring to the Normal, Poisson, and Gamma distribution and estimates the proportional paternity of the second male (P2) based on the best fit distribution. 2024-01-16
r-abc.rap public It aims to identify candidate genes that are “differentially methylated” between cases and controls. It applies Student’s t-test and delta beta analysis to identify candidate genes containing multiple “CpG sites”. 2024-01-16
r-abc.data public Contains data which are used by functions of the 'abc' package. 2024-01-16
r-yardstick public Tidy tools for quantifying how well a model fits a data set, such as confusion matrices, class probability curve summaries, and regression metrics (e.g., RMSE). 2024-01-16
r-wrs2 public A collection of robust statistical methods based on Wilcox' WRS functions. It implements robust t-tests (independent and dependent samples), robust ANOVA (including between-within subject designs), quantile ANOVA, robust correlation, robust mediation, and nonparametric ANCOVA models based on robust location measures. 2024-01-16
rpy2 None Python interface to the R language (embedded R) 2024-01-16
r-zoo None An S3 class with methods for totally ordered indexed observations. It is particularly aimed at irregular time series of numeric vectors/matrices and factors. zoo's key design goals are independence of a particular index/date/time class and consistency with ts and base R by providing methods to extend standard generics. 2024-01-16
r-zip public Cross-Platform 'zip' Compression Library. A replacement for the 'zip' function, that does not require any additional external tools on any platform. 2024-01-16
r-zic public Provides MCMC algorithms for the analysis of zero-inflated count models. The case of stochastic search variable selection (SVS) is also considered. All MCMC samplers are coded in C++ for improved efficiency. A data set considering the demand for health care is provided. 2024-01-16
r-ypinterimtesting public For any spending function specified by the user, this package provides corresponding boundaries for interim testing using the adaptively weighted log-rank test developed by Yang and Prentice (2010 <doi:10.1111/j.1541-0420.2009.01243.x>). The package uses a re-sampling method to obtain stopping boundaries at the interim looks. The output consists of stopping boundaries and observed values of the test statistics at the interim looks, along with nominal p-values defined as the probability of the test exceeding the specific observed test statistic value or critical value, regardless of the test behavior at other looks. The asymptotic validity of the stopping boundaries is established in Yang (2018 <doi:10.1002/sim.7958>). 2024-01-16
r-xts None Provide for uniform handling of R's different time-based data classes by extending zoo, maximizing native format information preservation and allowing for user level customization and extension, while simplifying cross-class interoperability. 2024-01-16
r-yaimpute public Performs nearest neighbor-based imputation using one or more alternative approaches to processing multivariate data. These include methods based on canonical correlation analysis, canonical correspondence analysis, and a multivariate adaptation of the random forest classification and regression techniques of Leo Breiman and Adele Cutler. Additional methods are also offered. The package includes functions for comparing the results from running alternative techniques, detecting imputation targets that are notably distant from reference observations, detecting and correcting for bias, bootstrapping and building ensemble imputations, and mapping results. 2024-01-16
r-xslt public An extension for the 'xml2' package to transform XML documents by applying an 'xslt' style-sheet. 2024-01-16
r-xnomial public Tests whether a set of counts fit a given expected ratio. For example, a genetic cross might be expected to produce four types in the relative frequencies of 9:3:3:1. To see whether a set of observed counts fits this expectation, one can examine all possible outcomes with xmulti() or a random sample of them with xmonte() and find the probability of an observation deviating from the expectation by at least as much as the observed. As a measure of deviation from the expected, one can use the log-likelihood ratio, the multinomial probability, or the classic chi-square statistic. A histogram of the test statistic can also be plotted and compared with the asymptotic curve. 2024-01-16
r-wordspace public An interactive laboratory for research on distributional semantic models ('DSM', see <https://en.wikipedia.org/wiki/Distributional_semantics> for more information). 2024-01-16
r-xplorerr public Tools for interactive data exploration built using 'shiny'. Includes apps for descriptive statistics, visualizing probability distributions, inferential statistics, linear regression, logistic regression and RFM analysis. 2024-01-16
r-xgboost public Extreme Gradient Boosting, which is an efficient implementation of the gradient boosting framework from Chen & Guestrin (2016) <doi:10.1145/2939672.2939785>. This package is its R interface. The package includes an efficient linear model solver and tree learning algorithms. The package can automatically do parallel computation on a single machine, which could be more than 10 times faster than existing gradient boosting packages. It supports various objective functions, including regression, classification and ranking. The package is made to be extensible, so that users are also allowed to define their own objectives easily. 2024-01-16
r-xbrl public Functions to extract business financial information from an Extensible Business Reporting Language ('XBRL') instance file and the associated collection of files that defines its 'Discoverable' Taxonomy Set ('DTS'). 2024-01-16
r-wwr public Calculate the (weighted) win loss statistics including the win ratio, win difference and win product and their variances, with which the p-values are also calculated. The variance estimation is based on Luo et al. (2015) <doi:10.1111/biom.12225> and Luo et al. (2017) <doi:10.1002/sim.7284>. This package also calculates general win loss statistics with user-specified win loss function with variance estimation based on Bebu and Lachin (2016) <doi:10.1093/biostatistics/kxv032>. This version corrected an error when outputting confidence interval for win difference. 2024-01-16
r-wtest public Perform the calculation of W-test, diagnostic checking, calculate minor allele frequency (MAF) and odds ratio. 2024-01-16
r-wsrf public A parallel implementation of Weighted Subspace Random Forest. The Weighted Subspace Random Forest algorithm was proposed in the International Journal of Data Warehousing and Mining by Baoxun Xu, Joshua Zhexue Huang, Graham Williams, Qiang Wang, and Yunming Ye (2012) <DOI:10.4018/jdwm.2012040103>. The algorithm can classify very high-dimensional data with random forests built using small subspaces. A novel variable weighting method is used for variable subspace selection in place of the traditional random variable sampling. This new approach is particularly useful in building models from high-dimensional data. 2024-01-16
r-word2vec public Learn vector representations of words by continuous bag of words and skip-gram implementations of the 'word2vec' algorithm. The techniques are detailed in the paper "Distributed Representations of Words and Phrases and their Compositionality" by Mikolov et al. (2013), available at <arXiv:1310.4546>. 2024-01-16
r-wskm public Entropy weighted k-means (ewkm) by Liping Jing, Michael K. Ng and Joshua Zhexue Huang (2007) <doi:10.1109/TKDE.2007.1048> is a weighted subspace clustering algorithm that is well suited to very high dimensional data. Weights are calculated as the importance of a variable with regard to cluster membership. The two-level variable weighting clustering algorithm tw-k-means (twkm) by Xiaojun Chen, Xiaofei Xu, Joshua Zhexue Huang and Yunming Ye (2013) <doi:10.1109/TKDE.2011.262> introduces two types of weights, the weights on individual variables and the weights on variable groups, and they are calculated during the clustering process. The feature group weighted k-means (fgkm) by Xiaojun Chen, Yunming Ye, Xiaofei Xu and Joshua Zhexue Huang (2012) <doi:10.1016/j.patcog.2011.06.004> extends this concept by grouping features and weighting the group in addition to weighting individual features. 2024-01-16
r-wrswor public A collection of implementations of classical and novel algorithms for weighted sampling without replacement. 2024-01-16
r-winch public Obtain the native stack trace and fuse it with R's stack trace for easier debugging of R packages with native code. 2024-01-16
r-writexl public Zero-dependency data frame to xlsx exporter based on 'libxlsxwriter'. Fast and no Java or Excel required. 2024-01-16
