Tools for Scraping Data from Web-Based Documents
Provides functions for estimating power or sample size for RNA-Seq studies. An empirical approach is used, and the data are assumed to be counts. The underlying distribution of the data is assumed to be Poisson or negative binomial. The package contains six functions: four provide estimates of sample size or power for the Poisson and negative binomial distributions, and two provide plots of power for a given sample size or of sample size for a given power.
Query plane tickets, from several airlines, using the 'Kiwi' API (similar to 'Google Flights'). The API is documented at <https://docs.kiwi.com/>.
Executes a post-rotation algorithm that REorders and/or REflects FACTors (REREFACT) for each replication of a simulation study with exploratory factor analysis.
A tool for refining data frames with formulas.
The package provides functions for estimating the relevant dimension of a data set in feature spaces, applications to model selection, graphical illustrations and prediction.
R version of 'GenClone' (a computer program to analyse genotypic data, test for clonality and describe spatial clonal organization, Arnaud-Haond & Belkhir 2007, <http://wwz.ifremer.fr/clonix/content/download/68205/903914/file/GenClone2.0.setup.zip>). This package allows clone handling as 'GenClone' does, plus the possibility to work with several populations, custom definition and use of MultiLocus Lineages (MLL), and p-value calculation for the psex statistic (probability of originating from distinct sexual events) and the psex_Fis statistic (taking account of Hardy-Weinberg equilibrium departure) as in 'MLGsim'/'MLGsim2' (a program for detecting clones using a simulation approach, Stenberg et al. 2003).
Provides a programmatic interface to the citation information and alternate metrics provided by 'Altmetric'. Data from Altmetric allows researchers to immediately track the impact of their published work, without having to wait for citations. This allows for faster engagement with the audience interested in your work. For more information, visit <https://www.altmetric.com/>.
Tools for the design of QTL experiments
Apply the popular real-time monitoring strategy proposed by Phillips, Shi and Yu (2015a,b; PSY) <doi:10.1111/iere.12132>, <doi:10.1111/iere.12131>, along with a new bootstrap procedure designed to mitigate the potential impact of heteroskedasticity and to effect family-wise size control in recursive testing algorithms (Phillips and Shi, forthcoming).
The PROTOLIDAR package contains functions to analyze LIDAR scans of plants (grapevine) and to make 3D maps in GRASS GIS.
Query and download satellite imagery and climate/atmospheric datasets using the SkyWatch API. Search datasets by wavelength (band), cloud cover, resolution, location, date, etc. Get the query results as a data frame and as HTML. To learn more about the SkyWatch API, see <https://github.com/skywatchspaceapps/api>.
The semidiscrete decomposition (SDD) approximates a matrix as a weighted sum of outer products formed by vectors with entries constrained to be in the set {-1, 0, 1}.
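The form of the approximation described above can be sketched in a few lines of Python. This is only an illustration of how a matrix is rebuilt from its SDD terms; the function name `sdd_reconstruct` and its arguments are hypothetical and are not part of the package, and the sketch does not show how the weights and vectors are found.

```python
def sdd_reconstruct(weights, xs, ys):
    """Return sum_k weights[k] * outer(xs[k], ys[k]) as a list of rows.

    Hypothetical helper for illustration: every entry of each x and y
    vector must lie in {-1, 0, 1}, as the SDD requires.
    """
    allowed = {-1, 0, 1}
    m, n = len(xs[0]), len(ys[0])
    approx = [[0.0] * n for _ in range(m)]
    for d, x, y in zip(weights, xs, ys):
        if not (set(x) <= allowed and set(y) <= allowed):
            raise ValueError("SDD vectors may only contain -1, 0 or 1")
        # Add the weighted outer product d * x * y^T to the running sum.
        for i in range(m):
            for j in range(n):
                approx[i][j] += d * x[i] * y[j]
    return approx


# Two terms approximating a 2 x 3 matrix.
A_hat = sdd_reconstruct(
    weights=[2.0, 0.5],
    xs=[[1, -1], [0, 1]],
    ys=[[1, 0, 1], [1, 1, -1]],
)
```

Because the vectors hold only three possible values, each term needs very little storage beyond its single floating-point weight.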
The purpose of this package is to save and load only the needed variables/columns of a data frame in special binary files (tar archives), which seems to be a much faster method than loading the whole binary object (RData files) via the load() function, or than loading columns from SQLite/MySQL databases via SQL commands (see vignettes). The performance gain over the basic load() function is much more noticeable on SSD drives. The performance improvement gained by loading only the chosen variables in binary format can be useful in some special cases (e.g. where merging data tables is not an option and very different datasets are needed for reporting), but before using this package be sure that you really need it, as non-standard file formats are used!
This package provides a set of tools for conducting exact or approximate randomization-based inference for experiments of arbitrary design. The primary functionality of the package is in the generation, manipulation and use of permutation matrices implied by given experimental designs. Among other features, the package facilitates estimation of average treatment effects, constant effects variance estimation, randomization inference for significance testing against sharp null hypotheses and visualization of data and results.
An Interface to the GA4GH API that allows users to easily GET responses and POST requests to GA4GH Servers. See <http://ga4gh.org> for more information about the GA4GH project.
Simple, easy to use, and flexible functionality for recoding variables. It allows for simple piecewise definition of transformations.
Contains the datasets used as default examples by the rattle package. The datasets themselves can be used independently of the rattle package to illustrate analytics, data mining, and data science tasks.
An implementation of routines for solving rate-distortion problems. Rate-distortion theory is a field within information theory that examines optimal lossy compression. That is, given that some information must be lost, how can a communication channel be designed that minimizes the cost of communication error? Rate-distortion theory is concerned with the optimal (minimal cost) solution to such tradeoffs. An important tool for solving rate-distortion problems is the Blahut algorithm, developed by Richard Blahut and described in: Blahut, R. E. (1972). Computation of channel capacity and rate-distortion functions. IEEE Transactions on Information Theory, IT-18(4), 460-473. This package implements the basic Blahut algorithm, and additionally contains a number of 'helper' functions, including a routine for searching for an information channel that minimizes cost subject to a constraint on information rate.
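A minimal Python sketch of the alternating updates commonly associated with the Blahut algorithm for rate-distortion problems is given here for orientation. It is not the package's code: the function name `blahut_rate_distortion`, the trade-off parameter `beta` and the fixed iteration count are assumptions made for this illustration, and the rate is reported in nats.

```python
import math


def blahut_rate_distortion(p_x, dist, beta, iters=200):
    """Return (rate_in_nats, expected_distortion) at trade-off slope beta.

    p_x  : source probabilities
    dist : dist[x][y] is the cost of reproducing source symbol x as y
    beta : assumed Lagrange-type parameter; larger values favour low distortion
    """
    n_x, n_y = len(p_x), len(dist[0])
    q_y = [1.0 / n_y] * n_y  # start from a uniform output distribution
    for _ in range(iters):
        # Channel update: Q(y|x) is proportional to q(y) * exp(-beta * d(x, y)).
        cond = []
        for x in range(n_x):
            row = [q_y[y] * math.exp(-beta * dist[x][y]) for y in range(n_y)]
            total = sum(row)
            cond.append([v / total for v in row])
        # Output update: q(y) is the marginal of the channel under p(x).
        q_y = [sum(p_x[x] * cond[x][y] for x in range(n_x)) for y in range(n_y)]
    # Mutual information between source and reproduction (the rate).
    rate = sum(
        p_x[x] * cond[x][y] * math.log(cond[x][y] / q_y[y])
        for x in range(n_x)
        for y in range(n_y)
        if p_x[x] > 0 and cond[x][y] > 0
    )
    distortion = sum(
        p_x[x] * cond[x][y] * dist[x][y] for x in range(n_x) for y in range(n_y)
    )
    return rate, distortion


# Uniform binary source with Hamming distortion.
rate, distortion = blahut_rate_distortion([0.5, 0.5], [[0, 1], [1, 0]], beta=2.0)
```

Sweeping `beta` over a range of values traces out points on the rate-distortion curve; a production implementation would normally stop on a convergence tolerance rather than a fixed number of iterations.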
Quantile Regression (QR) using Support Vector Machines under the Pinball-Loss. Estimation is based on "Nonparametric Quantile Regression" by I. Takeuchi, Q.V. Le, T. Sears, A.J. Smola (2004). The implementation relies on the 'quadprog' package, the kernel functions of package 'kernlab', and nearPD from package 'Matrix' to find the nearest positive definite kernel matrix. The package estimates quantiles individually; an implementation of non-crossing constraints is coming soon. The function multqrsvm() now supports a parallel backend for faster fitting.
Functions to simulate Poisson or Normally distributed responses relative to a baseline and compute achieved significance level and powers for tests on the simulated responses.
R implementation of the methods described in "A rank-based empirical likelihood approach to two-sample proportional odds model and its goodness-of-fit" by Zhong Guan and Cheng Peng, Journal of Nonparametric Statistics, to appear.
The package implements the model-based kernel machine method for detecting gene-centric gene-gene interactions of Li and Cui (2012).
Implements an algorithm to conduct advanced gene set enrichment analysis on the results of genomics experiments.
Provides supplemental 2000 census tract boundaries for the 19 states containing Seer Registries for use with the 'SeerMapper' package. The data contained in this package is derived from U.S. Census data and is in the public domain.
A complete set of functions to calculate several EBLUP (Empirical Best Linear Unbiased Predictor) estimators and their mean squared errors. All estimators are based on an area-level linear mixed model introduced by Rao and Yu in 1994 (see documentation). The REML method is used for fitting this model.
R interaction with 'pipedrive.com API'. All functions were created and documented according to <https://developers.pipedrive.com/docs/api/v1/>. Created with the objective of offering integration and even the development of 'APIs', making it possible to create workflows and easily download databases for analysis.
Provides utilities which interact with all R objects as if they were arranged in rows. It allows more consistent and predictable output to common functions, and generalizes a number of utility functions to be failsafe with any number and type of input objects.
'KEEL' is a popular Java software for a large number of different knowledge data discovery tasks. Furthermore, 'RKEEL' is a package with an R code layer between R and 'KEEL', for using 'KEEL' in R code. This package downloads and installs the .jar files necessary for executing 'RKEEL' algorithms. For more information about 'KEEL', see <http://www.keel.es/>.
Functionality required to efficiently use R with MarkLogic NoSQL Database Server, <http://www.marklogic.com/what-is-marklogic/>. Many basic and complex R operations are pushed down into the database, which removes the main memory boundary of R and allows full use of the MarkLogic server. To use the package, you need MarkLogic Server version 8 or higher.
Provides tools for working with Type S (Sign) and Type M (Magnitude) errors, as proposed in Gelman and Tuerlinckx (2000) <doi:10.1007/s001800000040> and Gelman & Carlin (2014) <doi:10.1177/1745691614551642>. In addition to simply calculating the probability of Type S/M error, the package includes functions for calculating these errors across a variety of effect sizes for comparison, and recommended sample size given "tolerances" for Type S/M errors. To improve the speed of these calculations, closed-form solutions for the probability of a Type S/M error from Lu, Qiu, and Deng (2018) <doi:10.1111/bmsp.12132> are implemented. As of 1.0.0, this includes support only for simple research designs. See the package vignette for a fuller exposition on how Type S/M errors arise in research, and how to analyze them using the type of design analysis proposed in the above papers.
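To make the Type S idea concrete, the following Python sketch computes power and the probability of a sign error for a two-sided test, assuming a normally distributed estimate with known standard error. The function name `design_analysis` and its arguments are hypothetical and do not reflect the package's interface; the Type M (exaggeration) ratio is omitted.

```python
from statistics import NormalDist


def design_analysis(true_effect, std_error, alpha=0.05):
    """Return (power, type_s) for a two-sided z-test.

    Assumes the estimate is Normal(true_effect, std_error). type_s is the
    probability that a statistically significant estimate has the wrong sign.
    """
    if true_effect == 0:
        raise ValueError("the sign of a zero effect is undefined")
    normal = NormalDist()
    z = normal.inv_cdf(1 - alpha / 2)      # critical value of the test
    ratio = true_effect / std_error
    p_hi = 1 - normal.cdf(z - ratio)       # significant and positive
    p_lo = normal.cdf(-z - ratio)          # significant and negative
    power = p_hi + p_lo
    wrong_sign = p_lo if true_effect > 0 else p_hi
    return power, wrong_sign / power


# A small true effect measured with a large standard error.
power, type_s = design_analysis(true_effect=0.1, std_error=1.0)
```

When the true effect is small relative to the standard error, power stays close to `alpha` and the sign-error probability approaches one half; with a large effect it becomes negligible.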
A collection of utilities for some reliability models/probability distributions.
In confirmatory factor analysis (CFA), structural constraints typically ensure that the model is identified up to all possible reflections, i.e., column sign changes of the matrix of loadings. Such reflection invariance is problematic for Bayesian CFA when the reflection modes are not well separated in the posterior distribution. Imposing rotational constraints -- fixing some loadings to be zero or positive in order to pick a factor solution that corresponds to one reflection mode -- may not provide a satisfactory solution for Bayesian CFA. The function 'relabel' uses the relabeling algorithm of Erosheva and Curtis to correct for sign invariance in MCMC draws from CFA models. The MCMC draws should come from Bayesian CFA models that are fit without rotational constraints.
Functions for fitting linear and generalized linear models with variable selection. The functions can automatically do Stepwise Regression, Lasso or Elastic Net as variable selection methods. Lasso and Elastic net are improved and handle factors better (they can either include or exclude all factor levels).
Reference-free method for conducting EWAS while deconvoluting DNA methylation arising as mixtures of cell types. The older method (Houseman et al., 2014, <doi:10.1093/bioinformatics/btu029>) is similar to surrogate variable analysis (SVA and ISVA), except that it makes additional use of a biological mixture assumption. The newer method (Houseman et al., 2016, <doi:10.1186/s12859-016-1140-4>) is similar to non-negative matrix factorization, with additional constraints and additional utilities.
Pathways in a database could have many redundancies among them. This package allows the user to set a maximum value for the proportion of these redundancies.
Provides programmatic access to Colombos, a web based interface for exploring and analyzing comprehensive organism-specific cross-platform expression compendia of bacterial organisms.
Implements tests for Type I error in Qualitative Comparative Analysis (QCA) that take into account the multiple hypothesis tests inherent in the procedure. Tests can be carried out on three variants of QCA: crisp-set QCA (csQCA), multi-value QCA (mvQCA) and fuzzy-set QCA (fsQCA). For fsQCA, the fsQCApermTest() command implements a permutation test that provides 95% confidence intervals for the number of counterexamples and degree of consistency, respectively. The distributions of permuted values can be plotted against the observed values. For csQCA and mvQCA, simple binomial tests are implemented in csQCAbinTest() and mvQCAbinTest(), respectively.
This package provides utilities for working with a library of SQL files.
Perform Wald's Sequential Probability Ratio Test on variables with a Normal, Bernoulli, Exponential or Poisson distribution. Plot acceptance and continuation regions, or create your own with the help of closures.
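As a rough illustration of how a sequential probability ratio test proceeds, the Python sketch below handles the Bernoulli case using Wald's approximate decision boundaries. The function name `sprt_bernoulli` and its arguments are hypothetical and are not the package's interface.

```python
import math


def sprt_bernoulli(observations, p0, p1, alpha=0.05, beta=0.05):
    """Test H0: p = p0 against H1: p = p1 on a stream of 0/1 outcomes.

    Returns (decision, number_of_observations_used). The boundaries are
    Wald's approximations based on the target error rates alpha and beta.
    """
    upper = math.log((1 - beta) / alpha)   # accept H1 at or above this value
    lower = math.log(beta / (1 - alpha))   # accept H0 at or below this value
    llr = 0.0                              # cumulative log-likelihood ratio
    for n, x in enumerate(observations, start=1):
        if x:
            llr += math.log(p1 / p0)
        else:
            llr += math.log((1 - p1) / (1 - p0))
        if llr >= upper:
            return "accept H1", n
        if llr <= lower:
            return "accept H0", n
    return "continue", len(observations)


decision, used = sprt_bernoulli([1, 1, 1, 1, 1, 1, 1, 1], p0=0.5, p1=0.8)
```

The test stops as soon as the cumulative log-likelihood ratio leaves the continuation region, so the number of observations used is itself random.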
This package is for optimizing non-linear complex functions based on Monte Carlo random sampling.
Contains two functions for simulating survival data from piecewise exponential hazards with a proportional hazards adjustment for covariates. The first function, SimUNIVPiecewise, simulates univariate survival data based on a piecewise exponential hazard, covariate matrix and true regression vector. The second function, SimSCRPiecewise, simulates semi-competing risks data based on three piecewise exponential hazards, three true regression vectors and three matrices of patient covariates (which can be different or the same). This simulates from the Semi-Markov model of Lee et al. (2015) given patient covariates, regression parameters, patient frailties and baseline hazard functions.
Provides supplemental 2010 census tract boundaries for the 13 states without Seer Registries that are west of the Mississippi river for use with the 'SeerMapper' package. The data contained in this package is derived from U.S. 2010 Census data and is in the public domain.
'Freesurfer' is an open-source software suite involving the segmentation of brain MRIs (see <http://freesurfer.net/> for more information). This package provides functionality to import the data generated by 'Freesurfer', functions to easily manipulate the data, and the brain-specific normalisation commonly used when studying structural brain MRIs. This package has been designed using an installation of, and data generated from, 'Freesurfer' version 5.3.
A convenience interface for communicating with the Stripe payment processor to accept payments online. See <https://stripe.com> for more information.
Implements the Gibbs sampling algorithm to randomly sample association rules with one pre-chosen item as the consequent from a transaction dataset. The Gibbs sampling algorithm was proposed in G. Qian, C.R. Rao, X. Sun and Y. Wu (2016) <DOI:10.1073/pnas.1604553113>.
Provides an interface to Geckoboard.
Exact one-sided p-values and confidence intervals for an outcome variable defined on an interval measurement scale with only qualitative and ordinal information available.
The package provides the plot function som.plot() to create high quality visualisations of hexagonal Kohonen maps (self-organising maps).
An optimal weighting strategy to compute simulation-efficient shortest probability intervals (spins).
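For context, the plain empirical shortest interval that such methods aim to improve on can be sketched as follows. This Python example is the simple unweighted version, not the optimal weighting strategy the package describes; the function name `shortest_interval` is hypothetical.

```python
import math


def shortest_interval(samples, prob=0.95):
    """Return the narrowest (low, high) pair covering `prob` of the draws.

    Unweighted empirical version: slide a window holding
    ceil(prob * n) sorted draws and keep the narrowest one.
    """
    xs = sorted(samples)
    n = len(xs)
    k = math.ceil(prob * n)  # number of draws the interval must contain
    if k < 1 or k > n:
        raise ValueError("prob must be in (0, 1] and samples non-empty")
    best = min(range(n - k + 1), key=lambda i: xs[i + k - 1] - xs[i])
    return xs[best], xs[best + k - 1]


low, high = shortest_interval([1, 2, 3, 10, 20, 30], prob=0.5)
```

With few simulation draws the endpoints of this unweighted interval are noisy, which is the kind of inefficiency a weighting scheme is meant to reduce.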
