Search

Showing top 22 results for "User model switching"

People also ask

What happened to Rubin CPX?

You may be scratching your head, wondering "wasn't there supposed to be some kind of special Rubin chip optimized for large-context prefill processing?" You're not hallucinating. Back at Computex last northern spring, Nvidia unveiled the Rubin CPX, a version of Rubin that used slower, less expensive GDDR7 memory to speed up the time to first token – how long users or agents have to wait for the model to start generating an output – when working with large inputs. The idea was that Rubin CPX could cut down on wait times for applications that might involve processing large quantities of document

A closer look at Nvidia's Groq-powered LPX rack systems