PyTorch-native quantization and sparsity for training and inference
copied from cf-post-staging / torchao

TorchAO is a PyTorch-native model optimization framework that uses quantization and sparsity to provide an end-to-end, training-to-serving workflow for AI models. TorchAO works out of the box with torch.compile() and FSDP2 across most Hugging Face PyTorch models.
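To make the two techniques named above concrete, here is a minimal plain-Python sketch of the underlying ideas. It does not use TorchAO's API. The helper names (`quantize_int8`, `dequantize`, `prune_2_4`) and the sample weights are hypothetical, and the schemes shown (symmetric per-tensor int8 quantization and a 2:4 sparsity pattern) are assumed only as common illustrations, not as a description of what the library does internally.

```python
def quantize_int8(weights):
    """Symmetric per-tensor int8 quantization (illustrative only).

    Maps floats to integers in [-127, 127] using a single scale factor.
    """
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the int8 representation."""
    return [v * scale for v in quantized]


def prune_2_4(weights):
    """2:4 sparsity sketch: in each full group of 4 values,
    zero out the 2 entries with the smallest magnitude."""
    pruned = list(weights)
    full_length = len(pruned) - len(pruned) % 4
    for start in range(0, full_length, 4):
        group = range(start, start + 4)
        for i in sorted(group, key=lambda i: abs(pruned[i]))[:2]:
            pruned[i] = 0.0
    return pruned


# Hypothetical weights, chosen only for illustration.
weights = [0.5, -1.27, 0.02, 0.9, -0.3, 0.0, 0.64, -0.01]

quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)
sparse = prune_2_4(weights)

# Rounding error per weight is bounded by half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print("int8 values:", quantized)
print("max round-trip error:", max_error)
print("2:4 pruned:", sparse)
```

Quantization trades a bounded rounding error for smaller integer storage, while the sparsity pattern halves the number of nonzero weights in each group. In practice these transformations would be applied to model tensors through the library rather than to Python lists.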