llmlingua
To speed up LLM inference and enhance the model's perception of key information, LLMLingua compresses the prompt and KV-Cache, achieving up to 20x compression with minimal performance loss.
To install this package, run one of the following:
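The specific install commands are not reproduced on this page; as a common assumption, the package can be installed from PyPI with pip:

    pip install llmlingua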
Summary
To speed up LLM inference and enhance the model's perception of key information, LLMLingua compresses the prompt and KV-Cache, achieving up to 20x compression with minimal performance loss.
Last Updated
Apr 9, 2024 at 09:34
License
MIT
Total Downloads
4.7K
Supported Platforms
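To illustrate the prompt compression described in the summary above, here is a minimal usage sketch in Python. It follows the PromptCompressor API documented by the LLMLingua project; the sample prompt, instruction, question, and target_token value are illustrative assumptions rather than values taken from this page, and loading the default compression model downloads its weights on first use.

    from llmlingua import PromptCompressor

    # Load the default compression model (weights are downloaded on first use).
    compressor = PromptCompressor()

    long_prompt = "..."  # replace with the long context to be compressed

    # Compress the prompt down to roughly 200 tokens; the result includes
    # the compressed text along with token counts and the compression ratio.
    result = compressor.compress_prompt(
        long_prompt,
        instruction="Answer the question based on the context.",
        question="What is the main finding?",
        target_token=200,
    )
    print(result["compressed_prompt"])

The compressed prompt can then be passed to a downstream LLM in place of the original long context.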