cutlass
CUDA Templates for Linear Algebra Subroutines
CUTLASS is a collection of abstractions for implementing high-performance matrix-matrix multiplication (GEMM) and related computations at all levels and scales within CUDA. It incorporates strategies for hierarchical decomposition and data movement. CUTLASS decomposes these "moving parts" into reusable, modular software components and abstractions.
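The hierarchical decomposition mentioned above can be illustrated with a minimal CPU-side sketch in plain C++. This is not CUTLASS code and uses none of its API: the function name `tiled_gemm` and the tile sizes are hypothetical, chosen only for illustration. It shows the general idea of splitting a GEMM into output tiles and stepping through the K dimension in slices, which is roughly the shape of work a GPU implementation would distribute across thread blocks, warps, and threads.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical tile sizes for illustration only; these are not CUTLASS defaults.
constexpr std::size_t kTileM = 4;
constexpr std::size_t kTileN = 4;
constexpr std::size_t kTileK = 4;

// Computes C = A * B for row-major matrices.
// A is M x K, B is K x N, and the returned C is M x N.
std::vector<float> tiled_gemm(const std::vector<float>& A,
                              const std::vector<float>& B,
                              std::size_t M, std::size_t N, std::size_t K) {
    assert(A.size() == M * K && B.size() == K * N);
    std::vector<float> C(M * N, 0.0f);

    // Outer level: walk over output tiles of C
    // (on a GPU, each tile would typically map to one thread block).
    for (std::size_t m0 = 0; m0 < M; m0 += kTileM) {
        for (std::size_t n0 = 0; n0 < N; n0 += kTileN) {
            const std::size_t m1 = std::min(m0 + kTileM, M);
            const std::size_t n1 = std::min(n0 + kTileN, N);

            // Middle level: consume the K dimension one slice at a time
            // (on a GPU, each slice would be staged through fast memory).
            for (std::size_t k0 = 0; k0 < K; k0 += kTileK) {
                const std::size_t k1 = std::min(k0 + kTileK, K);

                // Inner level: accumulate the partial products for this tile.
                for (std::size_t m = m0; m < m1; ++m) {
                    for (std::size_t n = n0; n < n1; ++n) {
                        float acc = C[m * N + n];
                        for (std::size_t k = k0; k < k1; ++k) {
                            acc += A[m * K + k] * B[k * N + n];
                        }
                        C[m * N + n] = acc;
                    }
                }
            }
        }
    }
    return C;
}
```

Calling `tiled_gemm(A, B, M, N, K)` with row-major inputs returns the product as a flat row-major vector. The loop nesting is the point of the sketch: each level could be swapped out or tuned independently, which is the kind of modularity the description above attributes to CUTLASS components.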
Summary
CUDA Templates for Linear Algebra Subroutines
Last Updated
Mar 27, 2026 at 19:42
License
BSD-3-Clause
GitHub Repository
https://github.com/NVIDIA/cutlass
Documentation
https://docs.nvidia.com/cutlass