ExLlamaV2: Inference library for running LLMs locally on consumer-class GPUs
📰 Hacker News · Palmik
ExLlamaV2: Inference library for running LLMs locally on consumer-class GPUs. 125 comments, 322 points on Hacker News.