Search

Showing top 33 results for "local LLM performance" · filtered from 37 indexed

Filtered by topic: LLMs Clear ✕

All sources xda-developers.com 24 developer.nvidia.com 6 amd.com 2 intel.com 2 phoronix.com 1 press.asus.com 1 research.google 1

Stop obsessing over your GPU's core clock — memory clock matters more for local LLM inference

… Your GPU's memory clock is usually the biggest LLM bottleneck More than raw compute Most users consider the GPU core frequency the primary performance metric, and while that's true for gaming, it doesn't move the needle for local AI workloads. …

Mar 28, 2026 · Tanveer Singh

You don't need an expensive GPU to run a local LLM that actually works

… Higher-end GPUs unlock more performance for running better models, but even 7B or smaller options can prove useful when used appropriately. With the right hardware configuration, you can run genuinely useful local LLMs. …

Apr 29, 2026 · Rich Edmonds

Local LLMs work best when you're not loyal to just one

… Switching between LLMs lets you harness their unique capabilities If I were to stick to a specific model family, I’d be significantly bottlenecking the potential of my local LLM servers. So, I tend to use a bunch of LLMs in my workflow, and cycle between them depending on my needs. …

Apr 26, 2026 · Ayush Pande

Create with LM Studio, Powered by AMD Ryzen™ and Radeon™

… Key Functionality A desktop application for running local LLMs A familiar chat interface Search & download functionality via Hugging Face 🤗 A local server that can listen on OpenAI-like endpoints Systems for managing local models and configurations Run Llama, Mistral, Mixtral, and other local LLMs … …

Discussions and forums

r/LocalLLaMA · u/The_Paradoxy · 2w ago

The Qwen 3.6 35B A3B hype is real!!!

My personal test for small local LLM intelligence is to check whether a model has any ability to understand the code that I write for my own academic research. My research is on some pretty niche topics and I doubt that …

r/LocalLLaMA · u/MikeNonect · 2w ago

Getting a feel for how fast X tokens/second really is.

I love following all your adventures with local LLM setups. Quality and size of the models are important, but so is performance. Numbers don't really convey the experienced speed well, however. If someone claims they run…

r/LocalLLaMA · u/gladkos · 3w ago

Qwen 3.6 27B vs Gemma 4 31B - making Packman game!

Gemma just crushed Qwen in a local LLM gamedev contest! Device: MacBook Pro M5 Max, 64GB RAM Qwen 3.6 27B: 32 tokens/sec · 18m 04s · 33,946 tokens. Gemma 4 31B: 27 tokens/sec · 3m 51s · 6,209 tokens. So what is more impo…

r/LocalLLaMA · u/Signal_Ad657 · 1w ago

M5 vs DGX Spark vs Strix Halo vs RTX 6000

Hey guys, super simple. There have been a lot of online debates about the new M5 Macs vs DGX Sparks vs Strix Halo vs dedicated GPUs etc. So I put them all in a room with good power and cooling and ran everything in paral…

r/LocalLLaMA · u/Porespellar · 2w ago

Unpopular Opinion: The DGX Spark Forum community of devs is talented AF and will make the crippled hardware a success through their sheer force of will.

There is a lot of disdain for DGX Sparks here on the sub. And I get it. A lot of people say “It could have been great if it had been better memory bandwidth”, “SM-121 is a fake /second-class Blackwell chip” yadda, yadda.…

Your old GPU can still run big LLMs – you just need the right tweaks

… Test your knowledge of running LLMs without breaking the bank. Hardware AI Models Performance Software RAM & CPU 01 / 8 Software Which popular open-source tool is widely used to run large language models locally on consumer hardware without writing any code? …

May 6, 2026 · Ayush Pande

After a year of self-hosting LLMs, I realized the real bottleneck isn’t the GPU

May 6, 2026 · Yash Patel

I connected my local LLM to my browser and it changed how I automated tasks

… Related 6 ways anyone can use LM Studio and a local LLM on their PC Most people can find a use for a local LLM on their PC, and here's how I use mine. Getting a local LLM running in your browser It's worth the effort Getting a local LLM running in your browser takes some effort. …

Apr 12, 2026 · Anurag Singh

One tiny change made my local LLMs more useful than ChatGPT for real work

… Related I use local LLMs and self-hosted apps to manage my documents instead of relying on ChatGPT Not every LLM-powered task requires a ChatGPT subscription RAG makes my LLMs more accurate without compromising my privacy Feeding my own documents to pre-trained AI models increases their context awa… …

Apr 12, 2026 · Ayush Pande

Self-hosted AI may not be for everyone, but I'm done paying for ChatGPT and other cloud models

… Related I ran local LLMs on a "dead" GPU, and the results surprised me My Pascal card may not be ideal for intensive workloads, but it's more than enough for light LLM-powered tasks My local AI-aided workflows insulate me on the privacy front Although the fact that I don’t have to pay monthly subsc… …

Mar 22, 2026 · Ayush Pande

Local LLMs changed how I use Home Assistant, and now my smart devices actually listen

… That’s where MCP servers come in handy, as they let my local LLMs access dozens of tools and API calls to control every aspect of my Home Assistant server. …

Apr 23, 2026 · Ayush Pande

Followed topics