A set of scikit-learn-style transformers for encoding categorical variables as numeric values using a variety of techniques. While the ordinal, one-hot, and hashing encoders have close equivalents in scikit-learn itself, the transformers in this library all share a few useful properties:

- First-class support for pandas DataFrames as input (and optionally as output)
- Columns to encode can be configured explicitly by name or index, or inferred as the non-numeric columns regardless of input type
- Columns with very low variance in the training set can optionally be dropped
- Portability: train a transformer on data, pickle it, reuse it later, and get the same output
- Full compatibility with sklearn pipelines: pass in an array-like dataset as with any other transformer

| Field | Value |
| --- | --- |
| Uploaded | Mon Mar 31 20:59:03 2025 |
| md5 checksum | 71e49726fdf7e51ce8d7f8c01449a3b5 |
| arch | x86_64 |
| build | py310h06a4308_0 |
| depends | numpy >=1.11.1,<2.0a0, pandas >=0.20.1, patsy >=0.4.1, python >=3.10,<3.11.0a0, scikit-learn >=0.17.1, scipy >=0.17.0, statsmodels >=0.6.1 |
| license | BSD-3-Clause |
| license_family | BSD |
| md5 | 71e49726fdf7e51ce8d7f8c01449a3b5 |
| name | category_encoders |
| platform | linux |
| sha1 | 866ba5e8c18130c84f053ec88a82313685632291 |
| sha256 | 9cff6746cdeea4385f8b33f6725a567462d829a629867d37b019cd9c1ff4b26d |
| size | 62539 |
| subdir | linux-64 |
| timestamp | 1690384694741 |
| version | 1.3.0 |