Speculative Decoding • Accelerating LLMs, Part 2
📰 Medium · LLM
In this post we continue our series on how to accelerate LLMs. Previously we covered the FlashAttention algorithm in Part 1. Follow along…