Showing top 14 results for "GPU memory bandwidth"

People also ask

How many LPUs do you need?

The whole system requires a lot of LPUs. The exact ratio of GPUs to LPUs depends on the workload. Tasks requiring extremely large contexts, batch sizes, or concurrency may need a larger pool of GPUs. A general-purpose chatbot might run well on a single rack. This is because longer context windows require more memory for the key-value (KV) caches that store model state (think short-term memory) and attention operations. By keeping these on the GPU, Nvidia is able to get by with fewer LPUs. The actual number of required LPUs is directly proportional to the size of a model. For a trillion-paramet

A closer look at Nvidia's Groq-powered LPX rack systems

Nvidia crams 256 Vera CPUs into a single liquid cooled rack

… Nvidia also benefits from its use of LPDDR5X memory, more commonly found in notebook computers, rather than the RDIMMs used by conventional servers. Each Vera CPU can be equipped with up to 1.5 TB of LPDDR5X SOCAMM memory modules good for 1.2 TB/s of bandwidth per socket. …

Mar 16, 2026 · Tobias Mann

A closer look at Nvidia's Groq-powered LPX rack systems

… That compute is tied to a relatively large pool of SRAM memory, which is orders of magnitude faster than the high-bandwidth memory (HBM) found in GPUs today, but is also incredibly inefficient in terms of space required. Each LPU only has enough die space for 500 MB of on-chip memory. …

Mar 19, 2026 · Tobias Mann

Arm says AI agents need a new CPU. Intel doesn't buy it

… While it's true that Arm's 136-core parts deliver 6 GB/s of memory bandwidth per core, this is largely down to the ratio of compute to memory. In fact, it is common to see lower core count parts with large caches favored for memory-bound workloads like computational fluid dynamics. …

Mar 31, 2026 · Tobias Mann

Memory-makers' shares are down. Don't blame Google

… Demand for artificial intelligence infrastructure created the situations described above by giving memory-makers an incentive to produce the high-bandwidth, high-margin memory that GPUs require. Reduced supply of other memory sent prices soaring. …

Mar 31, 2026 · Simon Sharwood

Nvidia slaps Groq into new LPX racks for faster AI response

… With up to 50 petaFLOPs each, Nvidia's newly announced Rubin GPUs aren't hurting for compute, but with 22 TB/s of HBM4 memory bandwidth, Groq's latest chip tech is nearly 7x faster, achieving 150 TB/s apiece. …

Mar 16, 2026 · Tobias Mann

Why real-world AI performance depends on the control layer

… Rather than competing with specialized AI silicon, modern CPUs are designed to support it, increasing memory bandwidth, strengthening I/O throughput, and maintaining system-level efficiency under AI-scale workloads. …

Mar 19, 2026 · David Gordon

Rebellions eyes global expansion with rack-scale AI platform

… For disaggregated inference, it's using llm-d, another open source framework that enables compute-heavy prefill operations on one set of accelerators and memory bandwidth-heavy decode operations on another. …

Mar 30, 2026 · Tobias Mann

Intel Core Ultra 270K, 250K Plus, reviewed

… Memory bandwidth is a major bottleneck for a lot of modern applications, so it's nice to see that Intel's new parts can actually utilize a decent chunk of the bandwidth DDR5 7200 has to offer. …

Mar 23, 2026 · Tobias Mann

Guide to GPU virtualization: passthrough, vGPU, and MIG

… Where vGPU shares GPU resources in software, MIG partitions the physical GPU in silicon. Each MIG instance gets its own dedicated compute engine, memory controller, and memory bandwidth. …

Apr 16, 2026 · VergeIO

Intel eases reliance on TSMC with Core Series 3 CPUs

… But with only a single memory channel, bandwidth is halved compared to Intel's beefier Core Ultra Series 3 parts. In fact, looking at the SKU list, the biggest difference between the chips comes down to CPU, NPU, and GPU clocks. …

Apr 17, 2026 · Tobias Mann