This folder contains the Python and Cython code for the algorithms and ML primitives of cuML that are distributed in the Python cuML package.
The build system uses setup.py for configuration and building.
cuML's setup.py can be configured through environment variables and command line arguments.
The environment variables are:
| Environment variable | Possible values | Default behavior if not set | Behavior |
| --- | --- | --- | --- |
| CUDA_HOME | path/to/cudatoolkit | Inferred from the location of nvcc | Optional variable for manually specifying the location of the CUDA toolkit. |
| CUML_BUILD_PATH | path/to/libcumlbuildfolder | Looked for in path_to_cuml_repo/cpp/build | Optional variable for manually specifying the location of the libcuml++ build folder. |
| RAFT_PATH | path/to/raft | Looked for in path_to_cuml_repo/cpp/build; cloned if not found | Optional variable for manually specifying the location of the RAFT repository. |
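
As an illustration of the defaults in the table, here is a minimal sketch of how a build script could resolve the CUDA toolkit location: use the toolkit variable (read here as `CUDA_HOME`) when it is set, and otherwise infer the toolkit root from where nvcc is found on the PATH. This is not the code that cuML's setup.py runs. The helper name `resolve_cuda_home` and the assumed `<toolkit>/bin/nvcc` layout are illustrative assumptions.

```python
import os
import shutil
from pathlib import Path
from typing import Callable, Mapping, Optional


def resolve_cuda_home(
    env: Mapping[str, str] = os.environ,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Return the CUDA toolkit root, or None if it cannot be determined.

    Hypothetical helper: it mirrors the behavior described in the table,
    not the actual implementation in cuML's setup.py.
    """
    # An explicitly set CUDA_HOME always takes priority.
    cuda_home = env.get("CUDA_HOME")
    if cuda_home:
        return cuda_home

    # Otherwise infer the toolkit root from the location of nvcc,
    # assuming the usual <toolkit>/bin/nvcc layout.
    nvcc = which("nvcc")
    if nvcc is None:
        return None
    return str(Path(nvcc).resolve().parent.parent)
```

Passing `env` and `which` as parameters keeps the lookup easy to exercise without a CUDA installation; calling `resolve_cuda_home()` with no arguments uses the real environment.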
The command line arguments (i.e., passed alongside setup.py when invoking, for example setup.py --singlegpu) are:
| Argument | Behavior |
| --- | --- |
| clean --all | Cleans all Python and Cython artifacts, including `__pycache__` folders, .cpp files resulting from cythonization, and compiled extensions. |
| --singlegpu | Option to build cuML without multiGPU algorithms. Removes dependency on nccl, libcumlprims and ucx-py. |
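
To make the idea of a custom flag concrete, the sketch below shows one common way a setup script can detect an option such as --singlegpu and strip it from the argument list before the remaining arguments are handed to the build tooling. The helper `pop_flag` is hypothetical and is not taken from cuML's setup.py. `build_ext --inplace` is used only as a familiar setuptools command for illustration.

```python
from typing import List, Tuple


def pop_flag(argv: List[str], flag: str) -> Tuple[bool, List[str]]:
    """Return (flag_present, argv_without_flag) without mutating argv."""
    remaining = [arg for arg in argv if arg != flag]
    return len(remaining) != len(argv), remaining


# Example: the kind of argument list produced by
#   python setup.py build_ext --inplace --singlegpu
argv = ["setup.py", "build_ext", "--inplace", "--singlegpu"]
single_gpu, argv = pop_flag(argv, "--singlegpu")

print(single_gpu)  # True
print(argv)        # ['setup.py', 'build_ext', '--inplace']
```

Removing the custom flag first matters because standard build commands generally reject options they do not recognize.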
RAFT's Python and Cython code is located in the RAFT repository. It was designed to be included in projects rather than distributed by itself, so at build time setup.py creates a symlink in cuML, located at /python/cuml/raft/, that points to the Python folder of RAFT.
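
The symlink step can be pictured with the following sketch, which runs entirely inside a temporary directory. It is an assumption-laden illustration, not cuML's build code: the helper `link_raft_python` and the stand-in folder names are hypothetical.

```python
import os
import tempfile
from pathlib import Path


def link_raft_python(raft_python_dir: Path, link_path: Path) -> None:
    """Create (or refresh) a symlink at link_path pointing to raft_python_dir."""
    # Replace a stale link left over from a previous build.
    if link_path.is_symlink():
        link_path.unlink()
    link_path.parent.mkdir(parents=True, exist_ok=True)
    os.symlink(raft_python_dir, link_path, target_is_directory=True)


# Demonstrate in a throwaway directory so no real checkout is touched.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    raft_python = root / "raft" / "python"  # stand-in for RAFT's Python folder
    raft_python.mkdir(parents=True)
    (raft_python / "__init__.py").touch()

    link = root / "cuml" / "python" / "cuml" / "raft"
    link_raft_python(raft_python, link)

    # Files that live in RAFT are now visible through the cuML path.
    print((link / "__init__.py").exists())
```

Because the link only points at RAFT's folder, edits made through either path affect the same files.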
Developers who need to modify RAFT code should refer to the RAFT Developer Guide for recommendations.
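To configure RAFT at build time:

- If the environment variable RAFT_PATH points to the RAFT repo, then that will be used.
- Otherwise, RAFT is looked for in path_to_cuml_repo/cpp/build and cloned if it is not found there.

The RAFT Python code gets included in the cuML build and distributable artifacts as if it was always present in the folder structure of cuML.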
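
The lookup order for RAFT (explicit RAFT_PATH first, then the libcuml++ build folder, then a fresh clone) can be sketched as below. This is a hedged illustration rather than the logic in cuML's setup.py: the function `resolve_raft` is hypothetical, and `raft_subdir` is an assumed name for where a previously cloned RAFT would sit inside the build folder.

```python
import os
from pathlib import Path
from typing import Mapping, Optional, Tuple


def resolve_raft(
    build_dir: Path,
    env: Mapping[str, str] = os.environ,
    raft_subdir: str = "raft",
) -> Tuple[Optional[Path], bool]:
    """Return (raft_path, needs_clone).

    raft_subdir is a hypothetical name for where a previously cloned
    RAFT would live inside the libcuml++ build folder.
    """
    # 1. An explicit RAFT_PATH takes priority.
    raft_path = env.get("RAFT_PATH")
    if raft_path:
        return Path(raft_path), False

    # 2. Otherwise look inside the libcuml++ build folder.
    candidate = build_dir / raft_subdir
    if candidate.is_dir():
        return candidate, False

    # 3. Nothing found: the caller has to clone RAFT.
    return None, True
```

Returning a `needs_clone` flag instead of cloning inside the function keeps the decision separate from the network operation, which makes the order easy to check in isolation.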
cuML's convenience development yaml files include all dependencies required to build cuML.
To build cuML's Python package, the following dependencies are required:
Packages required for multigpu algorithms*:

- libcumlprims version matching the cuML version
- ucx-py version matching the cuML version
- dask-cudf version matching the cuML version
- nccl>=2.5
- rapids-dask-dependency version matching the cuML version

\* These packages are not required when cuML is built with the --singlegpu argument flag.

Python tests are based on the pytest library. To run them, from the path_to_cuml/python/ folder, simply type pytest.