Speculative decoding made my local LLM actually usable
…But it doesn't change how the model feels to use. Speculative decoding is different. It changes speed, and that changes everything. Answers arrive much faster, which makes the entire experience…
