A set of scikit-learn-style transformers for encoding categorical variables as numeric values using a range of techniques. While the ordinal, one-hot, and hashing encoders have equivalents in scikit-learn itself, the transformers in this library all share a few useful properties:
- First-class support for pandas dataframes as input (and optionally as output)
- Columns to encode can be configured explicitly by name or index, or non-numeric columns can be inferred regardless of input type
- Can optionally drop columns with very low variance, based on the training set
- Portability: train a transformer on data, pickle it, reuse it later, and get the same output
- Full compatibility with sklearn pipelines: they accept an array-like dataset like any other transformer
Uploaded | Mon Mar 31 20:59:01 2025 |
md5 checksum | c985af78172c73025017cc50f6e7f21c |
arch | x86_64 |
build | py311h06a4308_0 |
depends | numpy >=1.14.0,<2.0a0, pandas >=1.0.5, patsy >=0.5.1, python >=3.11,<3.12.0a0, scikit-learn >=0.20.0, scipy >=1.0.0, statsmodels >=0.9.0 |
license | BSD-3-Clause |
license_family | BSD |
md5 | c985af78172c73025017cc50f6e7f21c |
name | category_encoders |
platform | linux |
sha1 | b75e11c89080e9e01273c44ad6c64a6911c463b7 |
sha256 | d3c1f3b42cc3fe03b96cbd115e5078e1f1b568e4906c02eb20c7c1ddb8484ec4 |
size | 128394 |
subdir | linux-64 |
timestamp | 1676930344034 |
version | 2.6.0 |