Cerebras Systems has built the world's largest artificial intelligence chip, a wafer-scale design that packs hundreds of thousands of cores onto a single slice of silicon. The chip, which the company says is already in operation, is meant to tackle the growing computational demands of AI models without relying on clusters of smaller processors. CEO Andrew Feldman said the company is betting that bigger hardware can upend conventional AI infrastructure.
The Wafer-Scale Approach
Most AI chips are cut from wafers into individual dies, then linked together in servers. Cerebras does the opposite: it leaves the wafer intact, creating one enormous chip that spans the silicon disk. Work that would otherwise be split across many chips stays on one piece of silicon, which removes the inter-chip communication that is a bottleneck in traditional systems. The result is a single chip with more than a trillion transistors and hundreds of thousands of cores, all working on the same problem at once.
CEO's Rationale
Feldman laid out the case during a briefing on the chip's architecture. He argued that AI training and inference are increasingly limited by memory bandwidth and latency, not just raw compute. A wafer-scale chip can keep data on the same piece of silicon as the cores that use it, reducing the delays that plague multi-chip setups. The company believes this design can handle models that would otherwise require racks of specialized hardware.
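The bandwidth argument can be illustrated with the standard roofline model, in which attainable throughput is the lesser of peak compute and memory bandwidth multiplied by arithmetic intensity. The sketch below is only an illustration: the function name and every figure in it are hypothetical and are not Cerebras or GPU specifications.

```python
def attainable_flops(peak_flops: float, mem_bandwidth: float, intensity: float) -> float:
    """Roofline bound: the lesser of peak compute and bandwidth * arithmetic intensity."""
    return min(peak_flops, mem_bandwidth * intensity)


# Hypothetical figures chosen for illustration only.
PEAK = 100e12          # peak compute, FLOP/s
OFF_CHIP_BW = 1e12     # bytes/s when data must be fetched from off-chip memory
ON_CHIP_BW = 100e12    # bytes/s when data stays on the same silicon
INTENSITY = 10.0       # FLOPs performed per byte moved

off_chip = attainable_flops(PEAK, OFF_CHIP_BW, INTENSITY)
on_chip = attainable_flops(PEAK, ON_CHIP_BW, INTENSITY)

print(f"off-chip memory: {off_chip / PEAK:.0%} of peak compute usable")
print(f"on-chip memory:  {on_chip / PEAK:.0%} of peak compute usable")
```

With these made-up numbers, the same compute units are bandwidth-bound in the first case and compute-bound in the second, which is the kind of gap Feldman's argument points to.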
If Cerebras delivers on its claims, the chip could reshape how data centers are built. Instead of stringing together thousands of GPUs, operators might deploy a handful of wafer-scale processors to do the same work. That would cut power consumption and simplify software orchestration. But the approach comes with challenges: yields on such a large die are harder to manage, and cooling a chip the size of a dinner plate requires new engineering.
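The yield concern can be made concrete with the textbook Poisson yield model, in which the probability that a chip has no defects falls exponentially with its area. The sketch below uses a hypothetical defect density and hypothetical areas; it does not describe Cerebras's actual process, and the source does not say how the company handles defective regions.

```python
import math


def poisson_yield(area_cm2: float, defects_per_cm2: float) -> float:
    """Poisson yield model: probability that a chip of the given area has zero defects."""
    return math.exp(-area_cm2 * defects_per_cm2)


# Hypothetical values for illustration only.
DEFECT_DENSITY = 0.1   # defects per square centimeter
SMALL_DIE_AREA = 1.0   # square centimeters, a conventional die
LARGE_DIE_AREA = 400.0 # square centimeters, a chip on the scale of a whole wafer

small = poisson_yield(SMALL_DIE_AREA, DEFECT_DENSITY)
large = poisson_yield(LARGE_DIE_AREA, DEFECT_DENSITY)

print(f"defect-free probability, small die: {small:.3f}")
print(f"defect-free probability, large die: {large:.3e}")
```

Under this simple model a perfect wafer-sized chip is vanishingly unlikely, so a design of this size would presumably have to tolerate some defective regions rather than depend on flawless silicon.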
The broader industry has been watching Cerebras closely since it first unveiled the wafer-scale concept in 2019. The latest chip is a second-generation design that the company says has already been purchased by customers in research and enterprise settings. Whether it can truly challenge the dominance of GPU-based systems depends on real-world performance data, which Cerebras has not yet published at scale.




