Search

Showing top 47 results for "Kimi K2"

Kimi K2

Kimi K2 is a large language model service in the Kimi series, also referred to as Kimi 2.6 or Kimi K2.6.

16 articles indexed · Last updated 6d ago

Videos

Google battles Chinese open weights models with Gemma 4

Apr 2, 2026 · Tobias Mann

Cloudflare’s AI Platform: an inference layer designed for agents

…const response = await env.AI.run('@cf/moonshotai/kimi-k2.5', { prompt: 'What is AI Gateway?' }, { metadata: { "teamId": "AI", "userId": 12345 } } ); Bring your own model AI Gateway gives you access to models…

Apr 16, 2026 · Ming Lu

Accelerating AI with Open Software: AMD ROCm™ 7 is Here

…May 19, 2026 Further Accelerating Kimi-K2.5 on AMD Instinct™ MI325X: W4A8 & W8A8 Quantization with AMD Quark — ROCm Blogs Quantize Kimi-K2.5 to W4A8 and W8A8 using AMD Quark and…

May 19, 2026 · AMD Data Center Insights

SPEC CPU 2026 and the Value of Open, Trusted Performance Measurement

…May 19, 2026 Further Accelerating Kimi-K2.5 on AMD Instinct™ MI325X: W4A8 & W8A8 Quantization with AMD Quark — ROCm Blogs Quantize Kimi-K2.5 to W4A8 and W8A8 using AMD Quark and…

May 19, 2026 · Robert Hormuth

Discussions and forums

Hacker News · u/heymax054 · 2w ago

DeepSeek V4 Pro and Flash vs. Claude Opus 4.7 and Kimi K2.6

Hacker News · u/nl · 2w ago

We Tested DeepSeek V4 Pro and Flash Against Claude Opus 4.7 and Kimi K2.6

r/LocalLLaMA · u/APFrisco · 2w ago

Computer build using Intel Optane Persistent Memory - Can run 1 trillion parameter model at over 4 tokens/sec

As the title states, my build is indeed able to run a 1 trillion parameter model (in this case Kimi K2.5) locally at ~4 tokens/second. I thought r/LocalLLaMA would be interested in the build due to that stat line, and al…

r/LocalLLaMA · u/Fragrant-Remove-9031 · 1w ago

Local Qwen 3.6 vs frontier models on a coding primitive: single-file HTML canvas driving animation - results and GIFs

Saw this post comparing Qwen 3.6 variants on coding primitives, so I wanted to see how local quants stack up against frontier models on a similar dense, single-file coding task. I ran the exact same prompt across local a…

Hacker News · u/ramonga · 1d ago

Show HN: Free open source coding models in Slack

Hey HN, We believe we have the easiest onboarding from signup to being able to spin up coding agents in Slack like Stripe, Ramp & Coinbase. Demo of the onboarding: https://www.tella.tv/video/connecting-cord-to-slack-1-19ep…

AI at Scale Starts Here: The AMD Vision Comes Alive at Advancing AI 2025

…May 19, 2026 Further Accelerating Kimi-K2.5 on AMD Instinct™ MI325X: W4A8 & W8A8 Quantization with AMD Quark — ROCm Blogs Quantize Kimi-K2.5 to W4A8 and W8A8 using AMD Quark and…

May 19, 2026 · AMD Data Center Insights

Solving the Biggest Challenges in the Data Center

…May 19, 2026 Further Accelerating Kimi-K2.5 on AMD Instinct™ MI325X: W4A8 & W8A8 Quantization with AMD Quark — ROCm Blogs Quantize Kimi-K2.5 to W4A8 and W8A8 using AMD Quark and…

May 19, 2026

Claude, ChatGPT, and Gemini get all the hype, but the most interesting AI models are coming from elsewhere

…Kimi K2.6 wants to be your agentic worker Built for long-horizon tasks Moonshot's Kimi K2.6 is the most recent release here, dropping on April 21. It's a…

Apr 24, 2026 · Adam Conway

AMD AI Solutions

…May 14, 2026 Further Accelerating Kimi-K2.5 on AMD Instinct™ MI325X: W4A8 & W8A8 Quantization with AMD Quark — ROCm Blogs Quantize Kimi-K2.5 to W4A8 and W8A8 using AMD Quark and…

May 21, 2026

What is AI Networking?

…May 19, 2026 Further Accelerating Kimi-K2.5 on AMD Instinct™ MI325X: W4A8 & W8A8 Quantization with AMD Quark — ROCm Blogs Quantize Kimi-K2.5 to W4A8 and W8A8 using AMD Quark and…

May 19, 2026 · AMD Networking

AI Search: the search primitive for your agents

…We're using Kimi K2.5 as the LLM via Workers AI . The model decides when to call the tools based on the conversation: import { AIChatAgent, type OnChatMessageOptions } from "@cloudflare/ai-chat…

Apr 16, 2026 · Gabriel Massadas
