NVIDIA's Vera Rubin AI Platform Enters Full Production with 10x Throughput Claim

NVIDIA has moved its next-generation Vera Rubin computing platform from development into full-scale manufacturing, the company confirmed. The new system targets what NVIDIA calls “AI factories” — massive data centers built to run autonomous AI agents — and promises a tenfold improvement in throughput over current architectures.

What the Vera Rubin Platform Does

Unlike previous generations focused on training large models, Vera Rubin is designed specifically for inference workloads where AI agents make real-time decisions. NVIDIA says the 10x throughput gain comes from architectural changes in the GPU cluster and its interconnects, allowing more agents per rack with lower latency. The platform is named after the American astronomer Vera Rubin, continuing NVIDIA’s practice of naming GPU generations after scientists.

The company hasn’t released detailed specifications or a launch price, but full production means Vera Rubin components are now shipping to data-center builders and cloud providers. The first AI factories equipped with the platform are expected to come online later this year.

Targeting the Shift to Agentic AI

NVIDIA is betting that the next wave of AI deployment won’t be about training ever-larger models, but about running thousands of smaller, specialized agents in parallel — think customer-service bots, coding assistants, and autonomous logistics systems. Vera Rubin’s high throughput is meant to make those agent workflows economically viable at scale.

The platform arrives as hyperscale cloud operators and enterprise data-center developers increase spending on inference hardware. Competitors including AMD and Intel have also announced purpose-built AI chips, but NVIDIA’s existing software ecosystem, CUDA, gives Vera Rubin a built-in advantage for developers already working in the NVIDIA stack.

Production Ramp and Next Steps

Entering full production is a milestone that typically signals the end of sampling and qualification phases. NVIDIA’s manufacturing partners, including TSMC, are now running Vera Rubin on dedicated production lines. The company hasn’t disclosed volume targets, but industry analysts tracking supply chains have reported increased activity at TSMC’s CoWoS advanced-packaging facilities, which are used for NVIDIA’s high-end AI accelerators.

Early adopters are likely to be the same cloud giants that deployed the previous Hopper and Blackwell generations: Amazon Web Services, Microsoft Azure, and Google Cloud. NVIDIA’s own DGX systems will probably be the first to ship with Vera Rubin, followed by certified partner servers.

Unanswered Questions About Price and Power

NVIDIA hasn’t revealed the power draw or per-unit cost of Vera Rubin. The previous generation, Blackwell, draws more than 700 watts per GPU, and Vera Rubin is expected to require a similar or higher power budget, which could strain data-center power and cooling infrastructure. The company also hasn’t compared Vera Rubin’s price-performance ratio against rival hardware such as AMD’s MI300 series.
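Until those figures are published, a price-performance comparison can only be sketched in the abstract. The short Python example below shows one way such a comparison might be computed once real measurements exist: tokens served over a deployment period divided by purchase price plus electricity cost. The `Accelerator` class, the `throughput_per_dollar` function, and every number in it are hypothetical placeholders for illustration, not specifications of Vera Rubin, Blackwell, or any AMD product.

```python
from dataclasses import dataclass


@dataclass
class Accelerator:
    """Hypothetical accelerator description; all values are placeholders."""
    name: str
    tokens_per_second: float  # sustained inference throughput
    unit_price_usd: float     # purchase price per unit
    power_watts: float        # sustained power draw


def throughput_per_dollar(acc: Accelerator, hours: float, usd_per_kwh: float) -> float:
    """Tokens served per dollar of purchase price plus energy over `hours`."""
    energy_cost = acc.power_watts / 1000 * hours * usd_per_kwh
    total_cost = acc.unit_price_usd + energy_cost
    total_tokens = acc.tokens_per_second * 3600 * hours
    return total_tokens / total_cost


if __name__ == "__main__":
    # Placeholder inputs only: substitute measured values when available.
    baseline = Accelerator("baseline", 1_000, 30_000, 700)
    candidate = Accelerator("candidate", 10_000, 60_000, 1_400)
    for acc in (baseline, candidate):
        value = throughput_per_dollar(acc, hours=8_760, usd_per_kwh=0.10)
        print(f"{acc.name}: {value:,.0f} tokens per dollar")
```

The sketch deliberately ignores cooling overhead, networking, and utilization below 100 percent, all of which would lower the real figure. It also shows why a headline throughput multiple does not by itself settle the economics: price and power draw enter the denominator.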

For now, the only confirmed fact is that production has started. The first systems are expected to ship to customers in the coming months, and the benchmark numbers that matter, real-world agent throughput per dollar, can only be measured then.
