Showing top 82 results for "Full specs and memory"

Discussions and forums

r/LocalLLaMA · u/APFrisco · 3w ago

Computer build using Intel Optane Persistent Memory - Can run 1 trillion parameter model at over 4 tokens/sec

As the title states, my build is indeed able to run a 1 trillion parameter model (in this case Kimi K2.5) locally at ~4 tokens/second. I thought r/LocalLLaMA would be interested in the build due to that stat line, and al…

Hacker News · u/wasnaga · Apr 28, 2026

Show HN: An agent that remembers across sessions (no chat history)

Hi HN, I built this in my off-hours over the last 3 months. Sharing now because I just filed the provisional patent yesterday (US 64/050,345) and the repo is freshly public. The frustration that started it: every time I …

Hacker News · u/bilgisoft · Mar 12, 2026

Show HN: Turkish Sieve Engine – Full Prime Statistics Up to 10^14 and V2 Preview

Hi HN, I’m announcing a major update for the Turkish Sieve Engine (TSE). We have just published comprehensive and deterministic distribution statistics across the entire 10^14 range. What we've achieved: Full Spectrum Stat…

r/LocalLLaMA · u/ex-arman68 · 3w ago

2.5x faster inference with Qwen 3.6 27B using MTP - Finally a viable option for local agentic coding - 262k context on 48GB - Fixed chat template - Drop-in OpenAI and Anthropic API endpoints

2026-05-07 edit: I have updated the hardware-based recommendations with more focus on quality. I no longer recommend q4_0 KV cache beyond 64k context. After multiple rounds of testing with the different size quants,…

r/LocalLLaMA · u/Kurcide · May 1, 2026

16x Spark Cluster (Build Update)

Build is done. 16 DGX Sparks on the fabric, all hitting line rate. Setup was time consuming but honestly smoother than I expected. Each Spark runs Nvidia’s flavor of Ubuntu out of the box with mostly everything pre insta…