Showing top 129 results for "Qwen3"

Qwen3

Qwen3 is an AI model family developed by Alibaba, released as a set of large language models for natural-language tasks.

Advancing Emerging Optimizers for Accelerated LLM Training with NVIDIA Megatron | NVIDIA Technical Blog

…Muon training performance on NVIDIA GB300 NVL72. Table 1 summarizes training throughput of the Kimi K2 and Qwen3 30B models with Muon and the AdamW optimizer on the NVIDIA GB300 NVL72 system…

Apr 22, 2026 · Hao Wu

AMD Silo AI and University of Bologna Start Spatial AI Collaboration for Robotics and Autonomous Driving

…run Qwen3.5 9B–122B on Ryzen™ AI Max+ with 128GB UMA and Ollama, with generation benchmarks and a clear UMA setup path on Ubuntu/ROCm. May 24, 2026 LLM-D Serving…

May 26, 2026 · AMD Silo AI

Your old GPU can still run big LLMs – you just need the right tweaks

…Offloading layers lets me run massive LLMs on weak GPUs. That’s how I managed to deploy Qwen3.6-35B-A3B on 12GB of VRAM. Although your GPU is the ideal component…

May 6, 2026 · Ayush Pande

NVIDIA Platform Delivers Lowest Token Cost Enabled by Extreme Co-Design | NVIDIA Technical Blog

…The Qwen3-VL vision-language submission used the vLLM open source framework, showing how the community is rapidly building advanced multimodal optimizations to accelerate image-heavy inference workloads on the latest GPUs…

Apr 1, 2026 · Ashraf Eassa

Discussions and forums

r/LocalLLaMA · u/LLMFan46 · 2d ago

Qwen3.5 35B A3B uncensored heretic Native MTP Preserved is Out Now With the Full 785 MTPs Preserved and Retained, Available in Safetensors, GGUFs. NVFP4, NVFP4 GGUFs and GPTQ-Int4 Formats

Safetensors, llmfan46/Qwen3.5-35B-A3B-uncensored-heretic-v2-Native-MTP-Preserved: https://huggingface.co/llmfan46/Qwen3.5-35B-A3B-uncensored-heretic-v2-Native-MTP-Preserved GGUFs, llmfan46/Qwen3.5-35B-A3B-uncensored-here…

Hacker News · u/thc1006 · Apr 21, 2026

Qwen3.6-35B-A3B speculative decoding is net-negative on RTX 3090

Hacker News · u/GreenGames · Apr 20, 2026

We got 207 tok/s with Qwen3.5-27B on an RTX 3090

r/LocalLLaMA · u/Beamsters · 1w ago

Qwen3.7 Max scored by Artificial Analysis, 27B/35B waiting room

https://preview.redd.it/42ak5qmus82h1.png?width=1133&format=png&auto=webp&s=744ea3dfc06c83d0c4d8aa128c39b3238b17d7be Qwen 3.7 Max sitting at 5th, pretty much on par with GPT 5.4 (xhigh) and a notch above the just release…

Hacker News · u/freakynit · Apr 17, 2026

Show HN: Open Access Qwen3.6-35B-A3B-UD-Q5_K_M with TurboQuant

https://w418ufqpha7gzj-80.proxy.runpod.net Started for myself, but since I'm not using it continuously, sharing it: Open Access Qwen3.6-35B-A3B-UD-Q5_K_M with TurboQuant (TheTom/llama-cpp-turboquant) on RTX 3090 (Runpod spo…

Embedded AI Archives

…robotic systems, NVIDIA Nemotron speech models are used for fast and accurate natural voice interactions. Qwen3 4B, served locally via vLLM, interprets requests and generates responses with low latency, no cloud link…

May 7, 2026