I tested speculative decoding on my home GPU cluster. Here's why it didn't help.
📰 Dev.to · Christopher Maher
I spent Saturday night testing n-gram speculative decoding on consumer GPUs. The claim: speculative...