Hi {{first_name|Investor}} -
Over the last two days, I’ve talked about Intel’s unprecedented 10x revision and the shift to long-term agreements (LTAs) between hyperscalers and semis.
Two paragraphs into Intel's earnings Q&A last Thursday, Lip-Bu Tan said something that should have moved stock prices more than it did.
An analyst asked about server CPU demand. Lip-Bu's answer, paraphrased: the ratio of CPUs to GPUs in AI deployments used to be 1-to-8. It's now 1-to-4. And he expects it to "move towards parity or even better."
CFO David Zinsner reinforced the point later with more specificity: training workloads run roughly 7–8 GPUs per CPU; inference runs closer to 3–4 GPUs per CPU; and agentic workloads are "potentially even flipping the other direction a little bit."
Translate that out of analyst-speak. The silicon mix inside an AI rack is rebalancing. The era when GPUs were the only AI story is ending. And the market, which has spent two years pricing AI infrastructure as an NVIDIA-and-only-NVIDIA trade, hasn't fully repriced for what comes next.
Why the Rack Is Rebalancing
To understand why the CPU is staging a comeback, you have to understand what AI workloads actually do.
Training is the phase where a model learns. It involves enormous parallel computation across billions of parameters — exactly the workload GPUs were built for. The math is heavily matrix-multiplication-driven, highly parallelizable, and scales beautifully across thousands of GPUs working simultaneously. CPUs play a small supporting role.
Inference is what happens when you actually use the model. A user types a prompt; the model produces a response. This is a different computational pattern. There's still GPU work, but inference involves more orchestration: managing input/output, routing requests, handling memory, coordinating across services. CPUs handle that orchestration work better than GPUs do.
Agentic AI is the next layer. An AI agent doesn't just answer one prompt. It plans, calls tools, makes decisions, routes to other agents, manages state across long conversations. Most of that work is not matrix multiplication. It's logic, control flow, and orchestration — exactly what CPUs are designed for.
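To make that split concrete, here's a toy agent loop in Python. It's a sketch, not anyone's production code; `toy_model` and the `search` tool are stand-ins I made up. The point is that only one line in the loop is the matmul-heavy model call; everything else is branching, state, and I/O, which is CPU territory.

```python
from dataclasses import dataclass

@dataclass
class Action:
    kind: str            # "tool" or "answer"
    tool: str = ""
    args: str = ""
    text: str = ""

def toy_model(history):
    """Stand-in for the GPU-bound model call (the matmul-heavy step)."""
    if len(history) < 2:
        return Action(kind="tool", tool="search", args=history[0])
    return Action(kind="answer", text=f"summary of {history[-1]}")

def run_agent(task, model, tools, max_steps=10):
    history = [task]                                     # state management: CPU
    for _ in range(max_steps):
        action = model(history)                          # the only GPU-shaped step
        if action.kind == "answer":                      # control flow: CPU
            return action.text
        history.append(tools[action.tool](action.args))  # tool I/O: CPU
    return "step budget exhausted"

print(run_agent("GDP of France", toy_model,
                {"search": lambda q: f"results for {q!r}"}))
```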
The industry is now scaling all three workloads simultaneously. Training continues. But inference is exploding (every chatbot interaction is inference). Agentic deployment is just beginning. The GPU-heavy ratio that defined the training era doesn't apply to the next two phases. The rack rebalances accordingly.
Lip-Bu's progression (1:8, then 1:4, then parity) isn't a forecast. It's an observation about what's already happening across his customer base.
What Rebalancing Looks Like in Practice
Picture an AI rack today. In a training-heavy 2023 deployment, you'd have 8 GPUs and 1 host CPU. Cost-wise, the GPUs dominate; they're the expensive items, and the CPU is a supporting cost.
Now picture an inference-heavy 2026 deployment running similar total wattage. You might have 8 GPUs and 2 CPUs, or 4 GPUs and 1 CPU running smaller inference models. The CPU isn't replacing the GPU; both are growing. But the mix has shifted, and the share of rack spending going to CPUs is materially higher.
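A quick back-of-envelope sketch makes that shift concrete. The prices below are round-number assumptions of mine, not quotes; the shape of the result is what matters, not the exact percentages.

```python
# Back-of-envelope: CPU share of rack silicon spend under shifting ratios.
# GPU_PRICE and CPU_PRICE are assumed round numbers, not real quotes.
GPU_PRICE = 30_000
CPU_PRICE = 10_000

def cpu_share(gpus: int, cpus: int) -> float:
    """Fraction of silicon spend going to CPUs."""
    return cpus * CPU_PRICE / (gpus * GPU_PRICE + cpus * CPU_PRICE)

for label, gpus, cpus in [("training era, 8:1", 8, 1),
                          ("inference era, 4:1", 4, 1),
                          ("agentic, ~parity", 4, 4)]:
    print(f"{label}: CPUs take {cpu_share(gpus, cpus):.0%} of silicon spend")
# training era, 8:1: CPUs take 4% of silicon spend
# inference era, 4:1: CPUs take 8% of silicon spend
# agentic, ~parity: CPUs take 25% of silicon spend
```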
Multiply that across the enormous rack volumes being deployed in 2026 and 2027, and the dollars going to CPU silicon are meaningfully larger than what the 2023 GPU-only narrative would have predicted.
This isn't a derate of the GPU thesis. NVIDIA is still going to sell every GPU it can make. Intel confirmed as much: Xeon 6 was selected as the host CPU for NVIDIA's DGX Rubin NVL8 systems — meaning every Rubin rack pulls more Intel silicon, not less. (Worth noting: Rubin also depends on the advanced packaging capacity I wrote about two issues ago.) The two architectures are now complements, not substitutes.
But it is a derate of the GPU-only thesis. And there's a meaningful difference between "AI infrastructure" and "GPU infrastructure" that the market is still in the process of pricing.
Who Wins When the Mix Shifts
The clearest beneficiary is AMD (AMD). They have the most competitive current-generation server CPU lineup — Turin and Genoa-X are arguably best-in-class for inference workloads — and they participate in any environment where the total x86 server CPU TAM is expanding. If Lip Bu is right and the ratio shifts toward parity, AMD's server segment is in a structurally larger market for the next half-decade.
Intel (INTC) is the obvious second beneficiary. The Xeon roadmap (Granite Rapids → Diamond Rapids → Coral Rapids) is back in serious competitive shape, they're winning hyperscaler designs (most visibly the multi-year LTA with Google I covered last week), and the broader CPU TAM expansion lifts them too. The Intel turnaround narrative has been overhyped in moments and underpriced in others — this thesis is a structural reason it might finally stick.
The broader x86 ecosystem. This is where the lens gets interesting. Server OEMs (Dell, HPE, Supermicro), motherboard vendors, and networking silicon companies that integrate with CPU-anchored architectures all see an incremental tailwind. The reframe to apply to your own portfolio: which of the names you own are priced as if "AI = pure GPU rack"? Anything that is, is mispriced if the rebalancing thesis is right.
What about ARM? Worth addressing directly. ARM-based server CPUs (Amazon's Graviton, Google's Axion, Ampere) are real and growing. Lip-Bu acknowledged this on the call. But the rebalancing thesis isn't really about which CPU architecture wins; it's about more total CPU silicon being deployed regardless of which instruction set wins the share war. ARM gaining 10–15% of server share in a market that's 40% larger is still a positive backdrop for the x86 incumbents; the quick math below shows why.
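Spelled out, using the worst case of that range (15 points of share loss, 40% market growth; both numbers are the hypotheticals from the sentence above, not forecasts):

```python
# Does ARM taking share offset the market growing? Worst case from above:
x86_share_today, x86_share_future = 1.00, 0.85   # ARM takes 15 points
market_growth = 1.40                             # market is 40% larger
x86_units = (x86_share_future * market_growth) / x86_share_today
print(f"x86 unit volume vs. today: {x86_units:.2f}x")  # 1.19x: still growth
```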
The NVIDIA Question
The hardest part of this thesis is what it implies for NVIDIA, because NVIDIA is the largest stock most readers own and has been the dominant AI infrastructure narrative for two years running.
The honest read: this isn't a sell signal. It's a ceiling adjustment. NVIDIA's 2024–2025 valuation embedded an assumption that GPUs would capture roughly the same share of AI infrastructure spend going forward. If the rebalancing thesis is right, that assumption is too aggressive. NVIDIA's absolute earnings power doesn't shrink — but the multiple the market is willing to pay for it might compress as the "GPU = AI" mental model gets refined into something more nuanced.
This is what gradual re-rating looks like. Not a single dramatic earnings miss. A slow grind as analysts revise their long-term assumptions and the multiple drifts down to reflect a more layered infrastructure reality.
For portfolios massively overweight NVIDIA, the right response isn't panic. It's to ask honestly: am I priced for the GPU-only world, or for the world where CPUs and GPUs both compound?
What Would Invalidate This Thesis?
The strongest counter-argument is that inference workloads might shift back toward GPU dominance if model architectures change. Specifically, if next-generation models become significantly more compute-intensive at inference time rather than less, GPU share could re-expand.
That's possible but not likely. The industry trend is the opposite — model distillation (training smaller models that match larger ones), smaller specialized models, and edge inference are all making per-query inference cheaper and less GPU-bound, not more. Reasoning models do consume more inference compute, but they consume it as smaller, repeated calls — exactly the workload pattern that benefits CPU orchestration.
The other invalidation path: hyperscalers consolidating their own custom CPU efforts (Graviton, Axion) faster than expected, displacing both Intel and AMD share. That's a real risk for the "who wins" question, but not for the "CPUs grow" thesis: whichever vendor ships those CPUs, the CPU layer of the rack still expands.
Some Final Thoughts
The market has spent two years pricing AI infrastructure as if the silicon mix were static. It isn't. Inference and agentic workloads are rebalancing the rack, and the beneficiaries are the CPU vendors that the dominant narrative has been writing off as commodity infrastructure for half a decade.
The work-optional portfolio isn't built by selling NVIDIA. It's built by recognizing that "AI infrastructure" is a broader category than the front-page story suggests — and positioning accordingly while the consensus is still catching up.
By the way, I’m writing an e-book about my investing strategy, frameworks, and research process. If you’re interested in getting notified about the book launch, fill out this 1-question survey.
Thank you again for your support!
Stay disciplined,
Koh
Disclaimer: Nothing in this newsletter constitutes investment advice or a recommendation to buy or sell any security. Numbers and observations are as of publication. I may hold positions in companies discussed above. Always do your own research and consult a licensed financial advisor before making investment decisions.
