This is the UCX communication backend for Dask Distributed, providing high-performance communication using the UCX (Unified Communication X) framework. It enables efficient GPU-to-GPU communication via NVLink (CUDA IPC), as well as InfiniBand and other high-speed interconnects.
This package is typically installed as part of the UCXX project build process. It can also be installed separately via conda-forge:
mamba install -c conda-forge distributed-ucxx
Or via PyPI (choose the package matching your CUDA version):
pip install distributed-ucxx-cu13 # For CUDA 13.x
pip install distributed-ucxx-cu12 # For CUDA 12.x
This package provides its own configuration system, which replaces the UCX configuration previously found in the main Distributed package. Configuration can be provided via:
- a distributed-ucxx.yaml file placed in Dask's configuration directory (for example ~/.config/dask/)
- environment variables with the DASK_DISTRIBUTED_UCXX_ prefix
- programmatically through dask.config
The configuration schema is defined in distributed-ucxx-schema.yaml and supports options such as tcp, nvlink, infiniband, cuda-copy, create-cuda-context, multi-buffer, environment, and rmm.pool-size.
New schema (YAML configuration file):
distributed-ucxx:
  tcp: true
  nvlink: true
  infiniband: false
  cuda-copy: true
  create-cuda-context: true
  multi-buffer: false
  environment:
    log-level: "info"
  rmm:
    pool-size: "1GB"
Legacy schema (may be removed in the future):
distributed:
  comm:
    ucx:
      tcp: true
      nvlink: true
      infiniband: false
      cuda-copy: true
      create-cuda-context: true
      multi-buffer: false
      environment:
        log-level: "info"
  rmm:
    pool-size: "1GB"
New schema (environment variables):
export DASK_DISTRIBUTED_UCXX__TCP=true
export DASK_DISTRIBUTED_UCXX__NVLINK=true
export DASK_DISTRIBUTED_UCXX__RMM__POOL_SIZE=1GB
Legacy schema (may be removed in the future):
export DASK_DISTRIBUTED__COMM__UCX__TCP=true
export DASK_DISTRIBUTED__COMM__UCX__NVLINK=true
export DASK_DISTRIBUTED__RMM__POOL_SIZE=1GB
New schema (programmatically via dask.config):
import dask
dask.config.set({
    "distributed-ucxx.tcp": True,
    "distributed-ucxx.nvlink": True,
    "distributed-ucxx.rmm.pool-size": "1GB",
})
Legacy schema (may be removed in the future):
import dask
dask.config.set({
    "distributed.comm.ucx.tcp": True,
    "distributed.comm.ucx.nvlink": True,
    "distributed.rmm.pool-size": "1GB",
})
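Whichever method you use, the values actually in effect can be inspected with dask.config.get; a minimal sketch (the keys shown assume the new schema described above) is:
import dask

# Read a single option; the default is returned if the key is unset anywhere.
print(dask.config.get("distributed-ucxx.nvlink", default=None))

# Dump the whole distributed-ucxx namespace to see every option in effect.
print(dask.config.get("distributed-ucxx", default={}))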
The package automatically registers itself as a communication backend for Distributed using the ucxx entry point. Once installed, you can use it by specifying the protocol:
from distributed import Client
# Connect using UCXX protocol
client = Client("ucxx://scheduler-address:8786")
Or when starting a scheduler/worker:
dask scheduler --protocol ucxx
dask worker ucxx://scheduler-address:8786
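To confirm that the backend was picked up, you can resolve the ucxx scheme through Distributed's comm registry; a minimal sketch (assuming Distributed's distributed.comm.registry.get_backend helper) is:
from distributed.comm.registry import get_backend

# Resolving the "ucxx" scheme should return the backend object provided by
# distributed-ucxx; an error here means the entry point was not found.
backend = get_backend("ucxx")
print(type(backend))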
If you're migrating from the legacy UCX configuration in the main Distributed package, update your configuration keys:
distributed.comm.ucx.* is now distributed-ucxx.*
distributed.rmm.pool-size is now distributed-ucxx.rmm.pool-size
The old configuration schema is still valid for convenience, but may be removed in a future version.
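If you want to carry existing legacy settings over programmatically during the transition, a minimal sketch (limited to the keys listed above) is:
import dask

# Map the legacy keys onto their new-schema equivalents; only the keys
# mentioned in this document are covered here.
LEGACY_TO_NEW = {
    "distributed.comm.ucx.tcp": "distributed-ucxx.tcp",
    "distributed.comm.ucx.nvlink": "distributed-ucxx.nvlink",
    "distributed.comm.ucx.infiniband": "distributed-ucxx.infiniband",
    "distributed.comm.ucx.cuda-copy": "distributed-ucxx.cuda-copy",
    "distributed.rmm.pool-size": "distributed-ucxx.rmm.pool-size",
}

for legacy_key, new_key in LEGACY_TO_NEW.items():
    value = dask.config.get(legacy_key, default=None)
    if value is not None:
        dask.config.set({new_key: value})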