Briefing Findings · For local Qwen3 runs, KV cache and newer
Story-specific findings extracted from this briefing's coverage.
What to Watch
- Look for follow-up benchmark threads on r/LocalLLaMA testing KV cache effects with Qwen 3.6 variants. (Source: r/LocalLLaMA)
- Track releases and changes for BeeLlama v0.3.1 and its llama.cpp extras (DFlash, MTP, TurboQuant) for new throughput reports. (Source: r/LocalLLaMA)
What Changed
- BeeLlama v0.3.1 – latest llama.cpp with extras! DFlash, MTP, q6_0 cache, TurboQuant. Single RTX 3090: Qwen 3.6 27B & Gemma 4 31B up to 177.8 tps (4.93x over baseline) (Source: XDA Developers)