TileIR is a portable, language-agnostic intermediate representation for CUDA kernels.
conda install nvidia::cuda-tileiras
conda install nvidia/label/cuda-13.1.0::cuda-tileiras
conda install nvidia/label/cuda-13.1.1::cuda-tileiras
conda install nvidia/label/cuda-13.2.0::cuda-tileiras
With Tile IR, we introduce a new operation set and programming model that retains CUDA’s performance across architectures while regaining portability and improving productivity for developers who use matrix operations on new architectures. We virtualize Tensor Cores and their associated programming model so that we can innovate in hardware without invalidating existing investments in software.