NVIDIA Launches Vera, Its First In-House CPU for AI Agent Workloads

NVIDIA on Tuesday unveiled Vera, its first internally designed CPU built specifically to handle AI agent tasks. The chip delivers a 1.8x performance gain over x86 processors, according to the company, and targets hyperscale data centers and AI research labs.

What Vera brings to AI workloads

The Vera CPU is purpose-built for the inference and orchestration work that AI agents require, such as coordinating multiple models, managing memory across GPU clusters, and running lightweight reasoning loops in real time. According to NVIDIA, the 1.8x speedup over x86 comes from a specialized architecture that cuts latency on agent-specific operations, rather than from gains on raw integer or floating-point benchmarks alone.

NVIDIA has long relied on partners like Intel and AMD for the CPUs in its DGX systems. With Vera, the company shifts its strategy, controlling both the brains that orchestrate AI workflows and the GPUs that do the heavy number crunching. The first systems using Vera are expected to reach hyperscalers for testing within months.

Why NVIDIA built its own CPU

Off-the-shelf x86 chips were never designed for the rapid decision-making loops that modern AI agents demand. Every time an agent calls a tool, fetches context from memory, or decides which GPU to send a task to, the CPU has to handle that request in microseconds. x86 can do it, but not efficiently at the scale NVIDIA’s customers need. Vera’s architectural changes include a leaner instruction set and on-chip accelerators for graph traversal and memory pooling.

The move also gives NVIDIA more pricing leverage. Today the company is a huge buyer of server CPUs. If Vera delivers on its performance claims, NVIDIA can cut that supply chain dependency and offer a more integrated system—similar to how Apple controls both the CPU and GPU in its M-series chips.

Who will buy Vera

At launch the chip is aimed squarely at hyperscalers—cloud providers like AWS, Google Cloud, and Azure—and at AI laboratories running large-scale agent frameworks. Those are the customers already pushing the limits of inference throughput and memory bandwidth. For smaller shops or enterprises, NVIDIA will likely sell complete server nodes rather than standalone CPUs, but the company hasn’t detailed pricing or availability tiers yet.

One unresolved question is whether the 1.8x speed advantage holds under real-world agent workloads, not just synthetic benchmarks. Independent testing will take months, and early adopters will likely publish their own numbers. Another open question is how much power Vera draws. NVIDIA’s fact sheet did not include TDP figures, though power efficiency is often a deciding factor in hyperscaler purchasing decisions.

NVIDIA has not announced a formal launch date for Vera-based products. The company will present more details at its GTC conference later this month.
