3 items

The inference economy requires a different chip for each class of workload, and Nvidia is positioning to be the company that integrates all three: GPUs for training, CPUs for orchestration, LPUs for inference. The Groq licensing deal, NVLink interconnect neutrality, and Grace/Vera CPU positioning are three facets of the same play: owning the integration layer for heterogeneous AI compute the way ARM captures licensing rent regardless of who fabs the core. The pressure this creates is asymmetric: vertically integrated players like Google are insulated because they consume their own TPU silicon, but pure-play inference startups now compete against Nvidia's ecosystem bundled with Groq's speed. Cerebras had a clean pitch when the comparison was "faster than GPUs at inference"; competing against GPU+LPU+NVLink while lacking a training story is a harder sell. The value is migrating up the stack, toward chip-agnostic inference routing: a middleware layer that barely exists yet but that every multi-chip architecture makes more necessary.

CNBC 2026-03-17-1

Nvidia GTC Preview: Why the CPU is Taking Center Stage

Agentic AI creates genuine CPU demand expansion: orchestration is sequential, CPU-bound work that GPUs can't do. Nvidia's "standalone CPU" story is really a coprocessor story, though; Grace and Vera are optimized to feed GPUs, not to compete for general-purpose workloads, where Nvidia holds a 6.2% share and fields 72 cores against rivals' 128. The higher-signal play is NVLink licensing, where Nvidia captures networking value regardless of whose CPU fills the socket.
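
A minimal sketch of why that orchestration work lands on the CPU, with every name here (`AgentState`, `call_llm_on_gpu`, `run_tool`) a hypothetical stand-in rather than any vendor's API: the accelerator handles only the model forward pass, while tool selection, result parsing, and state updates are serial control flow that cannot be batched.

```python
# Illustrative agent loop: sequential, CPU-bound orchestration around GPU calls.
# Every function and type here is a hypothetical stand-in, not a real API.
from dataclasses import dataclass, field

@dataclass
class AgentState:
    """Conversation and tool history the orchestrator mutates between model calls."""
    messages: list = field(default_factory=list)

def call_llm_on_gpu(messages: list) -> dict:
    """Stand-in for the one step that belongs on an accelerator: a model forward pass."""
    return {"action": "finish", "output": "stub answer"}  # hypothetical response shape

def run_tool(action: dict) -> str:
    """Stand-in for a tool call (search, code exec, DB query): ordinary CPU work."""
    return "tool result"

def orchestrate(task: str, max_steps: int = 8) -> str:
    state = AgentState(messages=[{"role": "user", "content": task}])
    for _ in range(max_steps):
        # The GPU does one forward pass; everything around it is serial CPU logic.
        response = call_llm_on_gpu(state.messages)
        if response["action"] == "finish":
            return response["output"]
        # Parsing, branching, and state mutation are inherently sequential:
        # step N+1 depends on step N's result, so this loop cannot be batched.
        state.messages.append({"role": "tool", "content": run_tool(response)})
    return "max steps exceeded"

print(orchestrate("summarize the GTC keynote"))  # -> "stub answer"
```

The sequential dependency is the point: each step's input is the previous step's output, so the loop cannot be widened into the batched, parallel form accelerators need.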

Wall Street Journal 2026-03-17-2

Can Nvidia's Dominance Survive the Sea Change Under Way in AI Computing?

Nvidia's 73% GPU margins are structurally incompatible with an efficiency-first inference economy, but the displacement story isn't "Cerebras replaces Nvidia." The workload mix is heterogeneous, and Nvidia is racing to sell all three form factors: GPUs for training, CPUs for orchestration, LPUs for inference throughput. The transition from monopolist-margin chipmaker to platform-margin integrator is the real architectural bet at GTC this year.

New York Times 2026-03-17-3

Nvidia Built the A.I. Era. Now It Has to Defend It.

Nvidia is the first major chipmaker to unbundle training from inference at the architecture level, pairing its GPUs with Groq's inference-optimized LPUs in a $20B licensing deal. The supply chain math is as interesting as the product: Groq fabs at Samsung with no HBM dependency, sidestepping both TSMC allocation constraints and memory-chip shortages. If inference grows to 70-80% of total AI compute spend, the companies building chip-agnostic inference routing will capture a middleware layer that doesn't exist yet.
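
To make the middleware claim concrete, here is a minimal routing sketch under invented assumptions; the backend classes, thresholds, and `route` heuristic are all hypothetical, since no standard chip-agnostic routing layer exists today, which is the articles' point.

```python
# Hypothetical chip-agnostic inference router: maps a request's stated
# requirements to a backend class. All thresholds are illustrative assumptions.
from dataclasses import dataclass
from enum import Enum

class Backend(Enum):
    GPU = "gpu"  # high-throughput batched inference
    LPU = "lpu"  # low-latency token streaming (Groq-style)
    CPU = "cpu"  # small models and orchestration-adjacent work

@dataclass
class Request:
    model_params_b: float   # model size in billions of parameters
    max_latency_ms: float   # per-token latency target
    batchable: bool         # can this request wait to be batched?

def route(req: Request) -> Backend:
    """Pick the cheapest backend class that meets the request's stated targets."""
    if req.model_params_b < 1:
        return Backend.CPU  # tiny models don't need an accelerator at all
    if req.max_latency_ms < 20 and not req.batchable:
        return Backend.LPU  # interactive, latency-bound traffic
    return Backend.GPU      # batched, throughput-bound traffic

# Example: an interactive agent step vs. an offline summarization job.
print(route(Request(model_params_b=70, max_latency_ms=10, batchable=False)))   # Backend.LPU
print(route(Request(model_params_b=70, max_latency_ms=500, batchable=True)))   # Backend.GPU
```

The design choice worth noting: routing keys off declared latency targets and batchability rather than chip identity, which is what would let a single request stream span GPU, LPU, and CPU pools behind one interface.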