Google battles Chinese open weights models with Gemma 4
["google","google-gemma","kimi-k2"]
Tracked topic
Kimi K2 is a large language model service in the Kimi series; it is also referred to as Kimi 2.6 or Kimi K2.6.
…const response = await env.AI.run('@cf/moonshotai/kimi-k2.5', { prompt: 'What is AI Gateway?' }, { metadata: { "teamId": "AI", "userId": 12345 } } ); Bring your own model AI Gateway gives you access to models…
…May 19, 2026 Further Accelerating Kimi-K2.5 on AMD Instinct™ MI325X: W4A8 & W8A8 Quantization with AMD Quark — ROCm Blogs Quantize Kimi-K2.5 to W4A8 and W8A8 using AMD Quark and…
DeepSeek V4 Pro and Flash vs. Claude Opus 4.7 and Kimi K2.6
We Tested DeepSeek V4 Pro and Flash Against Claude Opus 4.7 and Kimi K2.6
As the title states, my build is indeed able to run a 1 trillion parameter model (in this case Kimi K2.5) locally at ~4 tokens/second. I thought r/LocalLLaMA would be interested in the build due to that stat line, and al…
Saw this post comparing Qwen 3.6 variants on coding primitives, so I wanted to see how local quants stack up against frontier models on a similar dense, single-file coding task. I ran the exact same prompt across local a…
Hey HN, we believe we have the easiest onboarding from signup to being able to spin up coding agents in Slack like Stripe, Ramp & Coinbase. Demo of the onboarding: https://www.tella.tv/video/connecting-cord-to-slack-1-19ep…
…Kimi K2.6 wants to be your agentic worker Built for long-horizon tasks Moonshot's Kimi K2.6 is the most recent release here, dropping on April 21. It's a…
…May 14, 2026 Further Accelerating Kimi-K2.5 on AMD Instinct™ MI325X: W4A8 & W8A8 Quantization with AMD Quark — ROCm Blogs Quantize Kimi-K2.5 to W4A8 and W8A8 using AMD Quark and…
…We're using Kimi K2.5 as the LLM via Workers AI. The model decides when to call the tools based on the conversation: import { AIChatAgent, type OnChatMessageOptions } from "@cloudflare/ai-chat…