vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
To install this package, run the following (the standard command from the upstream vLLM project; a build-specific conda channel command may also apply):
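
    pip install vllm
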
Summary
A high-throughput and memory-efficient inference and serving engine for LLMs
Last Updated
Jul 30, 2025 at 16:25
License
Apache-2.0
Total Downloads
86
Supported Platforms
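
For reference, a minimal offline-inference sketch using vLLM's documented Python API; the model name, prompt, and sampling settings are illustrative:

    from vllm import LLM, SamplingParams

    # Load a model for offline batch inference (model name is illustrative).
    llm = LLM(model="facebook/opt-125m")

    # Sampling settings for generation.
    params = SamplingParams(temperature=0.8, max_tokens=64)

    # Generate completions for a batch of prompts and print the first
    # candidate completion of each.
    outputs = llm.generate(["Hello, my name is"], params)
    for output in outputs:
        print(output.outputs[0].text)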