Search

Showing top 103 results for "GPU needs for LLMs"

AR / VR – NVIDIA Technical Blog

…Your Essential Tool for Measuring GPU Interconnect and Memory Performance When you’re writing CUDA applications, one of the most important things you need to focus on to write great code is…

May 22, 2026

Developer Tools & Techniques – NVIDIA Technical Blog

…Your Essential Tool for Measuring GPU Interconnect and Memory Performance When you’re writing CUDA applications, one of the most important things you need to focus on to write great code is…

May 22, 2026

Launching AMD AI Playbooks: Step-by-Step Guides for Building with AI Locally with AMD

…Who they are for? AI Playbooks are written for developers who are comfortable with the basics: The command line Python environments Basic AI/ML concepts You don’t need deep expertise in…

May 12, 2026 · AMD AI Group

Optimized Software for Professionals With AMD and ISV Solutions

…Epic Games AMD partners with Epic Games to optimize Unreal Engine for peak performance on AMD CPUs and GPUs, delivering enhanced graphics and speed for developers and users. Maxon AMD and Maxon…

Winning Health Optimizes LLMs in Healthcare

…The fact that the solution manages to use CPU for inference means healthcare institutions can flexibly allocate CPU’s computing power between LLM inference and other IT applications as needed, which improves…

· PDF

CEO Interview with Adi Gelvan of Speedata - Semiwiki

…What new features/technology are you working on? Our near-term development is focused on deeper optimization for agentic analytics workloads as enterprises increasingly need to run LLM queries against large, structured…

May 17, 2026 · Daniel Nenni

n8n, Dify, and Ollama might be the best self-hosted AI automation stack right now

…First is connecting different apps and systems, second is building LLM apps and RAG workflows, and third is running models locally for privacy. n8n fits into the first layer. It’s built…

Apr 13, 2026 · Anurag Singh

Discussions and forums

r/homelab · u/AntifaAustralia · 2w ago

My first 10 inch rack with local LLM! No more Spotify, Google Home, Netflix, ChatGPT...

I'm pretty new to homelabbing and this is my first mini rack! Started with the Beelink ME Mini and then just kinda grew from there (it's always the way hey haha). It idles at 70 watts (not too shabby for how much is goin…

r/LocalLLaMA · u/APFrisco · 2w ago

Computer build using Intel Optane Persistent Memory - Can run 1 trillion parameter model at over 4 tokens/sec

As the title states, my build is indeed able to run a 1 trillion parameter model (in this case Kimi K2.5) locally at ~4 tokens/second. I thought r/LocalLLaMA would be interested in the build due to that stat line, and al…

r/LocalLLaMA · u/janvitos · 2w ago

80 tok/sec and 128K context on 12GB VRAM with Qwen3.6 35B A3B and llama.cpp MTP

Just wanted to share my config in hopes of helping other 12GB GPU owners achieve what I see as very respectable token generation speeds with modest VRAM. Using the latest llama.cpp build + MTP PR, I got over 80 tok/sec w…

r/LocalLLaMA · u/ex-arman68 · 3w ago

2.5x faster inference with Qwen 3.6 27B using MTP - Finally a viable option for local agentic coding - 262k context on 48GB - Fixed chat template - Drop-in OpenAI and Anthropic API endpoints

2026-05-07 edit: I have updated the hardware based recommendations with more focus on quality. I do not recommend q4_0 KV cache anymore beyond 64k context. After multiple rounds of testing with the different size quants,…

r/selfhosted · u/lazycodewiz · 1w ago

Take-Two's CEO says AI's not in the business of making hits, 'datasets by their very nature are backward looking', but that doesn't mean AI can't be 'super helpful'

…favorites 3 Best graphics cards in 2026: These are the GPUs worth spending money in right now 4 Best gaming laptop 2026: I've tested the best laptops for gaming of this…

May 18, 2026 · Elie Gould

Top stories

Trying to self-host LLMs made me realize local AI has a friction problem, not a quality problem

AMD just dropped a compact AI workstation that makes discrete GPUs look outdated for running LLMs

I added a second GPU just for local AI workloads, and it cost less than upgrading my main one

13 years later, the GTX Titan is still the most important GPU Nvidia ever made