Implementation of advanced data structures such as hashmaps, heaps, or queues. Advanced data structures are essential in many computer science and statistics problems, for example graph algorithms or string analysis. The package uses 'Boost' and 'STL' data types and extends these to R with 'Rcpp' modules.
Combining Univariate Association Test Results of Multiple Phenotypes for Detecting Pleiotropy.
Reads and writes CSV with selected conventions. Uses the same generic function for reading and writing to promote consistent formats.
Provides a new method for identification of clusters of genomic regions within chromosomes. Primarily, it is used for calling clusters of cis-regulatory elements (COREs). 'CREAM' uses genome-wide maps of genomic regions in the tissue or cell type of interest, such as those generated from chromatin-based assays including DNaseI, ATAC or ChIP-Seq. 'CREAM' considers proximity of the elements within chromosomes of a given sample to identify COREs in the following steps: 1) It identifies the window size, i.e. the maximum allowed distance between the elements within each CORE, 2) It identifies the number of elements which should be clustered as a CORE, 3) It calls COREs, 4) It filters out the lowest-order COREs that do not pass the threshold considered in the approach.
Retrieves data from the 'CPTEC/INPE' weather forecast API. 'CPTEC' stands for 'Centro de Previsão de Tempo e Estudos Climáticos' and 'INPE' for 'Instituto Nacional de Pesquisas Espaciais'. 'CPTEC' is the most advanced numerical weather and climate forecasting center in Latin America, with high-precision short and medium-term weather forecasting since the beginning of 1995. See <http://www.cptec.inpe.br/> for more information.
We propose to determine the correction of the significance level after multiple coding of an explanatory variable in a Generalized Linear Model. The available p-value corrections are the single-step Bonferroni procedure and the resampling-based methods developed by P.H. Westfall in 1993. The resampling methods are based on permutation and on the parametric bootstrap procedure. If some continuous and dichotomous transformations are performed, this package offers an exact correction of the p-value developed by B. Liquet & D. Commenges in 2005. The naive method with no correction is also available.
Generating multiple binary and normal variables simultaneously given marginal characteristics and association structure based on the methodology proposed by Demirtas and Doganay (2012) <DOI:10.1080/10543406.2010.521874>.
Functions for phase I clinical trials using the continual reassessment method.
It fits a CoxSEI (Cox type Self-Exciting Intensity) model to right-censored counting process data.
Implements Firth's penalized maximum likelihood bias reduction method for Cox regression which has been shown to provide a solution in case of monotone likelihood (nonconvergence of likelihood function). The program fits profile penalized likelihood confidence intervals which were proved to outperform Wald confidence intervals.
This package fits a simultaneous regression model for the mean vectors and covariance matrices of multivariate response variables, as described in Hoff and Niu (2012). The explanatory variables can be continuous or discrete. The current version of the package provides the Bayesian estimates.
Converts any word, sentence or speech into Trump's infamous "covfefe" format. Reference: <https://www.nytimes.com/2017/05/31/us/politics/covfefe-trump-twitter.html>. Inspiration thanks to: <https://codegolf.stackexchange.com/questions/123685/covfefify-a-string>.
Functions for clustering regions that form convergence clubs, according to the definition of Phillips and Sul (2009) <doi:10.1002/jae.1080>.
Provides functions for making contour plots. The contour plot can be created from grid data, a function, or a data set. If non-grid data is given, then a Gaussian process is fit to the data and used to create the contour plot.
CODATA internationally recommended values of the fundamental physical constants, provided as symbols for direct use within the R language. Optionally, the values with errors and/or the values with units are also provided if the 'errors' and/or the 'units' packages are installed. The Committee on Data for Science and Technology (CODATA) is an interdisciplinary committee of the International Council for Science which periodically provides the internationally accepted set of values of the fundamental physical constants. This package contains the "2014 CODATA" version, published on 25 June 2015: Mohr, P. J., Newell, D. B. and Taylor, B. N. (2016) <DOI:10.1103/RevModPhys.88.035009>, <DOI:10.1063/1.4954402>.
Manipulate and analyze 3-D structural geometry of Protein Data Bank (PDB) files.
Estimates a variable of interest using a Kalman filter that incorporates results from previous assessments, i.e. by developing weighted estimates in which the weights are inversely proportional to the variances of the existing and new estimates. For reference see Ehlers et al. (2017) <doi:10.20944/preprints201710.0098.v1>.
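The weighting idea in the entry above can be sketched in a few lines. This is a minimal illustration of inverse-variance weighting for two scalar estimates, not the package's API; the function name `combine_estimates` and its arguments are hypothetical.

```python
def combine_estimates(est_old, var_old, est_new, var_new):
    """Combine two estimates with weights inversely proportional to their variances.

    Hypothetical helper for illustration only; assumes both variances are > 0
    and that the two estimates are independent.
    """
    w_old = 1.0 / var_old
    w_new = 1.0 / var_new
    # Weighted mean: the more precise estimate (smaller variance) counts more.
    combined = (w_old * est_old + w_new * est_new) / (w_old + w_new)
    # Variance of the combined estimate under the independence assumption.
    combined_var = 1.0 / (w_old + w_new)
    return combined, combined_var


if __name__ == "__main__":
    # A previous assessment (variance 4) is updated with a new one (variance 1).
    print(combine_estimates(20.0, 4.0, 10.0, 1.0))
```

For a scalar state observed directly, this weighted mean coincides with the standard Kalman filter update.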
Creates a 3D data cube view of a RasterStack/Brick, typically a collection/array of RasterLayers (along z-axis) with the same geographical extent (x and y dimensions) and resolution, provided by package 'raster'. Slices through each dimension (x/y/z), freely adjustable in location, are mapped to the visible sides of the cube. The cube can be freely rotated. Zooming and panning can be used to focus on different areas of the cube.
Includes two functions to evaluate the hypothesis of complete spatial randomness (csr) in point processes. The function 'mwin' calculates quadrat counts to estimate the intensity of a spatial point process through the moving window approach proposed by Bailey and Gatrell (1995). Event counts are computed within a window of a set size over a fine lattice of points within the region of observation. The function 'pielou' uses the nearest neighbor test statistic and asymptotic distribution proposed by Pielou (1959) to compare the observed point process to one generated under csr. The value can be compared to that given by the more widely used test proposed by Clark and Evans (1954).
Provides Peruvian agricultural production data from the Agriculture Ministry of Peru (MINAGRI). The first version includes 6 crops: rice, quinoa, potato, sweet potato, tomato and wheat; all of them across 24 departments. The data were originally in Excel files, which have been transformed and assembled using tidy data principles, i.e. each variable is in a column, each observation is a row and each value is in a cell. The variables are sowing and harvest area per crop, yield, production and price per plot, recorded every year from 2004 to 2014.
A device closing function which is able to crop graphics (e.g., PDF, PNG files) on Unix-like operating systems with the required underlying command-line tools installed.
Functions for constructing simultaneous credible bands and identifying subsets via the "credible subsets" (also called "credible subgroups") method.
Functions for completing and recalculating rankings and sorting.
R functions for cosmological research. The main functions are similar to those of the Python library 'cosmolopy'.
Genome-wide association studies (GWAS) have been widely used for identifying common variants associated with complex diseases. Due to the small effect sizes of common variants, the power to detect individual risk variants is generally low. Complementary to SNP-level analysis, a variety of gene-based association tests have been proposed. However, the power of existing gene-based tests is often dependent on the underlying genetic models, and it is not known a priori which test is optimal. Here we proposed COMBined Association Test (COMBAT) to incorporate strengths from multiple existing gene-based tests, including VEGAS, GATES and simpleM. Compared to individual tests, COMBAT shows higher overall performance and robustness across a wide range of genetic models. The algorithm behind this method is described in Wang et al (2017) <doi:10.1534/genetics.117.300257>.
Perform competing risks analysis under bivariate Pareto models. See Shih et al. (2018) <doi:10.1080/03610926.2018.1425450> for details.
Generation of multiple binary and continuous non-normal variables simultaneously given the marginal characteristics and association structure based on the methodology proposed by Demirtas et al. (2012) <DOI:10.1002/sim.5362>.
Extension of cmprsk to Stratified and Clustered data. Goodness of fit test for Fine-Gray model.
Solves systems of linear equations using the (preconditioned) conjugate gradient algorithm, with improved efficiency from the Armadillo templated 'C++' linear algebra library and flexibility for user-specified preconditioning methods. Please check <https://github.com/styvon/cPCG> for the latest updates.
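As a rough illustration of the algorithm named in the entry above, the following is a minimal pure-Python sketch of conjugate gradient with a Jacobi (diagonal) preconditioner. It is not the package's implementation; the function names and the choice of preconditioner are assumptions made for the example, and the matrix is assumed symmetric positive definite.

```python
def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def matvec(A, v):
    """Dense matrix-vector product for a list-of-rows matrix."""
    return [dot(row, v) for row in A]


def pcg_jacobi(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive definite A (illustrative sketch)."""
    n = len(b)
    m_inv = [1.0 / A[i][i] for i in range(n)]  # Jacobi preconditioner: inverse diagonal
    x = [0.0] * n
    r = list(b)                                # residual b - A x for the start x = 0
    z = [m_inv[i] * r[i] for i in range(n)]    # preconditioned residual
    p = list(z)                                # first search direction
    rz = dot(r, z)
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 < tol:             # stop once the residual norm is small
            break
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)                # step length along p
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * Ap[i] for i in range(n)]
        z = [m_inv[i] * r[i] for i in range(n)]
        rz_new = dot(r, z)
        beta = rz_new / rz                     # keeps successive directions A-conjugate
        p = [z[i] + beta * p[i] for i in range(n)]
        rz = rz_new
    return x


if __name__ == "__main__":
    print(pcg_jacobi([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0]))
```

Replacing `m_inv` with another approximation of the inverse of `A` changes the preconditioner without touching the rest of the loop.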
Robustly fits a proportional hazards regression model.
Activated Region Fitting (ARF) is an analysis method for fMRI data.
Set of methods to constrain numerical series and time series within arbitrary boundaries.
Provides a minimal interface for applying annotators from the 'Stanford CoreNLP' java library. Methods are provided for tasks such as tokenisation, part of speech tagging, lemmatisation, named entity recognition, coreference detection and sentiment analysis.
Estimation and statistical process control are performed under copula-based time-series models. Available are statistical methods in Long and Emura (2014 JCSA), Emura et al. (2017 Commun Stat-Simul) <DOI:10.1080/03610918.2015.1073303>, Huang and Emura (2019, in revision) and Huang, Chen and Emura (2019-, in revision).
Computes the Conover-Iman test (1979) for stochastic dominance and reports the results among multiple pairwise comparisons after a Kruskal-Wallis test for stochastic dominance among k groups (Kruskal and Wallis, 1952). The interpretation of stochastic dominance requires an assumption that the CDF of one group does not cross the CDF of the other. conover.test makes k(k-1)/2 multiple pairwise comparisons based on Conover-Iman t-test-statistic of the rank differences. The null hypothesis for each pairwise comparison is that the probability of observing a randomly selected value from the first group that is larger than a randomly selected value from the second group equals one half; this null hypothesis corresponds to that of the Wilcoxon-Mann-Whitney rank-sum test. Like the rank-sum test, if the data can be assumed to be continuous, and the distributions are assumed identical except for a difference in location, Conover-Iman test may be understood as a test for median difference. conover.test accounts for tied ranks. The Conover-Iman test is strictly valid if and only if the corresponding Kruskal-Wallis null hypothesis is rejected.
Collects several different methods for analyzing and working with connectivity data in R. Though primarily oriented towards marine larval dispersal, many of the methods are general and useful for terrestrial systems as well.
R's default conflict management system gives the most recently loaded package precedence. This can make it hard to detect conflicts, particularly when they arise because a package update creates ambiguity that did not previously exist. 'conflicted' takes a different approach, making every conflict an error and forcing you to choose which function to use.
Calculates the AQI (Air Quality Index) from pollutant concentration data. O3, PM2.5, PM10, CO, SO2, and NO2 are currently available. The method follows the United States Environmental Protection Agency: EPA (2016) <https://www3.epa.gov/airnow/aqi-technical-assistance-document-may2016.pdf>.
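The EPA method interpolates linearly between concentration breakpoints. The sketch below shows this for 24-hour PM2.5 only and is not the package's code: the function name is hypothetical, and the breakpoint table is an assumption believed to match the May 2016 EPA document, so it should be checked against that document before use.

```python
import math

# (conc_low, conc_high, index_low, index_high) for 24-hour PM2.5 in ug/m3.
# Assumed to match the May 2016 EPA technical assistance document; verify before use.
PM25_BREAKPOINTS = [
    (0.0, 12.0, 0, 50),
    (12.1, 35.4, 51, 100),
    (35.5, 55.4, 101, 150),
    (55.5, 150.4, 151, 200),
    (150.5, 250.4, 201, 300),
    (250.5, 350.4, 301, 400),
    (350.5, 500.4, 401, 500),
]


def pm25_aqi(conc):
    """Return the AQI for a 24-hour PM2.5 concentration (hypothetical helper)."""
    # Truncate to one decimal place; the small epsilon guards against
    # floating-point products such as 35.9 * 10 landing just below 359.
    c = math.floor(conc * 10 + 1e-9) / 10
    for bp_lo, bp_hi, i_lo, i_hi in PM25_BREAKPOINTS:
        if bp_lo <= c <= bp_hi:
            # Linear interpolation within the matching breakpoint interval,
            # rounded to an integer with Python's built-in round().
            return round((i_hi - i_lo) / (bp_hi - bp_lo) * (c - bp_lo) + i_lo)
    raise ValueError("concentration outside the tabulated range")


if __name__ == "__main__":
    print(pm25_aqi(35.9))
```

Other pollutants would use the same interpolation with their own breakpoint tables and truncation rules.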
Interface with and extract data from the United Nations Comtrade API <https://comtrade.un.org/data/>. Comtrade provides country-level shipping data for a variety of commodities; these functions allow for easy API queries and return the data as a tidy data frame.
Partition data points (variables) into communities/clusters, similar to clustering algorithms, such as k-means and hierarchical clustering. This package implements a clustering algorithm based on a new metric CORD, defined for high dimensional parametric or semi-parametric distributions. Read http://arxiv.org/abs/1508.01939 for more details.
Routines for cone projection and quadratic programming, as well as for estimation and inference in constrained parametric regression and shape-restricted regression problems. See Mary C. Meyer (2013) <doi:10.1080/03610918.2012.659820> for more details.
This package contains a collection of functions to deal with nonparametric measurement error problems using deconvolution kernel methods. We focus on two measurement error models in the package: (1) an additive measurement error model, where the goal is to estimate the density or distribution function from contaminated data; (2) a nonparametric regression model with errors-in-variables. The R functions allow the measurement errors to be either homoscedastic or heteroscedastic. To make the deconvolution estimators computationally more efficient in R, we adapt the "Fast Fourier Transform" (FFT) algorithm for density estimation with error-free data to the deconvolution kernel estimation. Several methods for the selection of the data-driven smoothing parameter are also provided in the package. See details in: Wang, X.F. and Wang, B. (2011). Deconvolution estimation in measurement error models: The R package decon. Journal of Statistical Software, 39(10), 1-24.
Are you aiming at the right spot in darts? Maybe not! Use this package to find your optimal aiming location. For a better explanation, go to http://www-stat.stanford.edu/~ryantibs/darts/ or see the paper "A Statistician Plays Darts".
An iterative algorithm for solving a convex formulation of the biclustering problem.
A matrix-like data structure that allows for efficient, convenient, and scalable subsetting of binary genotype/phenotype files generated by PLINK (<https://www.cog-genomics.org/plink2>), the whole genome association analysis toolset, without loading the entire file into memory.
For ordinal rating data, estimate and test models within the family of CUB models and their extensions (where CUB stands for Combination of a discrete Uniform and a shifted Binomial distributions). Simulation routines, plotting facilities and fitting measures are also provided.
Support for import from and export to the CSVY file format. CSVY is a file format that combines the simplicity of CSV (comma-separated values) with the metadata of other plain text and binary formats (JSON, XML, Stata, etc.) by placing a YAML header on top of a regular CSV.
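To make the file layout described above concrete, here is a minimal sketch that separates the YAML header from the CSV body. It is not the package's reader: the function name is hypothetical, the header is assumed to be delimited by lines containing only `---`, and the header lines are returned unparsed so the example needs no YAML library.

```python
import csv
import io


def split_csvy(text):
    """Split CSVY text into (header_lines, csv_rows); illustrative only."""
    lines = text.splitlines()
    header = []
    body_start = 0
    # A header is recognised only if the first line is '---' and a closing
    # '---' follows; otherwise the whole input is treated as plain CSV.
    if lines and lines[0].strip() == "---":
        for i in range(1, len(lines)):
            if lines[i].strip() == "---":
                header = lines[1:i]
                body_start = i + 1
                break
    body = "\n".join(lines[body_start:])
    rows = list(csv.reader(io.StringIO(body)))
    return header, rows


if __name__ == "__main__":
    sample = "---\nname: demo\nfields:\n  - name: a\n  - name: b\n---\na,b\n1,2\n"
    header, rows = split_csvy(sample)
    print(header)
    print(rows)
```

The returned header lines could then be handed to any YAML parser to recover column types and other metadata.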
Enables transformation of Verbal Autopsy data collected with the WHO 2016 questionnaire (versions 1.4.1 & 1.5.1) or the WHO 2014 questionnaire for automated coding of Cause of Death using the InSilicoVA (data.type = "WHO2016") and InterVA5 algorithms. Previous versions of this package supported user-supplied mappings (via the map_records function), but this functionality has been removed. This package is made available by WHO and the Bloomberg Data for Health Initiative.
In competing risks regression, the proportional subdistribution hazards (PSH) model is popular for its direct assessment of covariate effects on the cumulative incidence function. This package allows for penalized variable selection for the PSH model. Penalties include LASSO, SCAD, MCP, and their group versions.
A high-performance package for estimating the Cox model when an event has more than one cause. It also supports random and fixed effects, tied events, and time-varying variables.
