Showing top 12 results for "AI model announcements"

People also ask

Why the rush?

One of the defining characteristics of SRAM-heavy architectures from Groq and its rival Cerebras is that they are very fast when running LLM inference workloads, routinely achieving generation rates exceeding 500 and even 1,000 tokens a second. The faster Nvidia can generate tokens, the faster code assistants and AI agents can act. But this kind of speed also opens the door to what Huang describes as test-time scaling. The idea is that by letting "reasoning" models generate more "thinking" tokens, they can produce smarter, more accurate results. So, the faster you can generate tokens, the les

A closer look at Nvidia's Groq-powered LPX rack systems

Who is LPX for?

If you're not a hyperscaler, neocloud, or model dev, LPX is probably not for you. The sheer number of LPUs required to serve large open models will likely put Nvidia's LPX platform out of reach for most enterprises. Speaking to press ahead of this week's keynote, Buck said Nvidia is focusing primarily on model builders and service providers that need to serve trillion-plus-parameter models with token rates exceeding 500 to 1,000 a second. Having said that, in a technical blog, Nvidia presented another use case for the LPUs as a speculative decode accelerator, something we suggested the company mi

A closer look at Nvidia's Groq-powered LPX rack systems

What happened to Rubin CPX?

You may be scratching your head, wondering "wasn't there supposed to be some kind of special Rubin chip optimized for large-context prefill processing?" You're not hallucinating. Back at Computex last northern spring, Nvidia unveiled the Rubin CPX, a version of Rubin that used slower, less expensive GDDR7 memory to speed up the time to first token – how long users or agents have to wait for the model to start generating an output – when working with large inputs. The idea was that Rubin CPX could cut down on wait times for applications that might involve processing large quantities of document

A closer look at Nvidia's Groq-powered LPX rack systems

Nvidia slaps Groq into new LPX racks for faster AI response

… To put that in perspective, OpenAI currently charges about $15 per million output tokens for API access to its top GPT-5.4 model. …

Mar 16, 2026 · Tobias Mann

Storage vendors orbit the Nvidia sun at GTC

… Thomas Cornely, EVP of Product Management at Nutanix, said in a statement: "Nutanix Agentic AI extends our AHV hypervisor, Flow Virtual Networking, Nutanix Kubernetes Platform, and Nutanix Enterprise AI to deliver a cloud operating model to enterprise AI factories, enabling infrastructure and platf…

Mar 18, 2026 · Chris Mellor

AI-pilled Arm CEO teases mystery products for $1T TAM

… Tuesday's event was all about Arm’s newly announced AGI CPU products, which will free the company from the shackles of its IP licensing model by enabling the company to sell directly to end customers. Haas has high hopes for agentic AI to accelerate the British chip designer's datacenter business. …

Mar 24, 2026 · Tobias Mann

A closer look at Nvidia's Groq-powered LPX rack systems

… MORE CONTEXT: Nvidia's on-again off-again H200 sales in China are now on again; Chips... in spaaaace – courtesy of Nvidia; Nvidia powers further into the CPU market with new rack systems packing 256 Vera processors; Nvidia slaps $20B Groq tech into massive new LPX racks to speed AI response time; Any ti…

Mar 19, 2026 · Tobias Mann

HPE adds Blackwell, Rubin systems to Nvidia-backed AI push

… It said its work would allow orgs to "scale AI initiatives" while "adhering to regional data sovereignty and compliance requirements." Dr Bastian Koller, Managing Director of the High Performance Computing Center at Stuttgart University and lead coordinator of HammerHAI said of the partnership: "Ha…

Mar 17, 2026 · Chris Mellor

Meta reveals custom AI chips it says beat Nvidia

… Each PE includes a pair of RISC-V vector cores. The chip is in production. MTIA 400: An evolution of the MTIA 300 that can support generative AI models and R&R workloads. …

Mar 12, 2026 · Simon Sharwood

OpenAI and Oracle reportedly abandon TX Stargate expansion

… Since announcing the $500 billion Stargate initiative just over a year ago, OpenAI has tapped Oracle to deploy 4.5 gigawatts of compute capacity to fuel model development and services. …

Mar 7, 2026 · Tobias Mann

Washington reportedly moves to put AI chips on tighter leash

… The Trump administration rescinded those rules before they took effect, with the Commerce Department saying they would have "stifled American innovation and saddled companies with burdensome new regulatory requirements." The department posted on X that it is "committed to promoting secure exports o…

Mar 6, 2026 · Dan Robinson

Nvidia GTC 2026: What to expect at AI Burning Man

… In fact, this capability is how Cerebras won OpenAI's business earlier this year to power its Codex model. …

Mar 13, 2026 · Tobias Mann

Nvidia's DLSS 5 seems to cross the uncanny valley

… Even AI generators like Sora 2 have some delay while you wait for them to process. "We combined 3D graphics, structured data, with generative AI, probabilistic computing," Huang said. …

Mar 16, 2026 · Avram Piltch
