The Recursive Resolution

In a recent episode of the Dwarkesh Podcast, Dylan Patel — founder of SemiAnalysis, one of the most closely followed semiconductor research firms in the industry — laid out the three major bottlenecks constraining AI compute: logic, memory, and power. The picture he painted is sobering. ASML, the Dutch company that holds a global monopoly on extreme ultraviolet (EUV) lithography, produces roughly seventy of these machines per year. Each one costs around $380 million and is required to pattern the transistors on every advanced chip in the world. TSMC — the Taiwanese foundry that fabricates processors for NVIDIA, Apple, AMD, Google, and essentially every major AI player — depends entirely on these machines. At approximately three and a half EUV tools per gigawatt of data center power, the supply chain looks like a funnel with a very narrow neck.
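How narrow is the neck? A back-of-envelope sketch using the figures quoted above (Patel's round numbers, not precise industry data) shows what a full year of ASML's EUV output can actually supply:

```python
# Back-of-envelope arithmetic from the figures quoted above
# (Patel's estimates, not precise industry data)
TOOLS_PER_YEAR = 70           # approximate annual ASML EUV output
TOOLS_PER_GW = 3.5            # EUV tools implied per gigawatt of data center power
COST_PER_TOOL_USD = 380e6     # approximate price per machine

gw_supported = TOOLS_PER_YEAR / TOOLS_PER_GW      # 20 GW of new capacity per year
tool_capex = TOOLS_PER_YEAR * COST_PER_TOOL_USD   # ~$26.6B per year in lithography alone

print(f"{gw_supported:.0f} GW/year supported; ${tool_capex / 1e9:.1f}B in EUV capex")
```

Twenty gigawatts of new AI capacity per year, gated by one company's production line: that is the funnel the rest of this piece is about.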

Patel's analysis covers how NVIDIA secured long-term TSMC capacity early while competitors scrambled, why Google began panic-buying turbines in Q4, and how the memory and power constraints may prove even harder to solve than the logic bottleneck. It's a thorough accounting of where AI infrastructure actually stands today — the physical limits that no amount of algorithmic cleverness can bypass alone.

But there's a dimension the bottleneck framing doesn't fully capture: the intelligence being constrained is also the intelligence most capable of dissolving the constraint. This isn't a linear improvement curve. It's a feedback spiral — and it's already running.


Loop 1: AI Designing Its Own Chips

Google DeepMind's AlphaChip treats chip floorplanning as a game — the same reinforcement learning architecture that conquered Go. It starts from a blank grid and places circuit components one at a time until the layout is complete. A graph neural network learns the relationships between components and improves with each chip it designs. What used to take human engineers twenty-four months, AlphaChip completes in hours. It has been used in four generations of Google's TPU — including Ironwood/v7, announced in April 2025, delivering 100% better performance per watt than v6e. The chip that trains the AI that designs the chip.
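The sequential-placement framing above can be sketched in a few lines. This toy uses a greedy wirelength heuristic standing in for AlphaChip's learned RL policy and graph neural network; the component names, netlist, and grid size are invented for illustration, not Google's actual method.

```python
# Toy floorplanning sketch: start from a blank grid and place components
# one at a time, as described above. A greedy Manhattan-wirelength
# heuristic stands in for AlphaChip's learned placement policy.
import itertools

def place(components, edges, grid=4):
    """Place each component on a grid cell, minimizing wirelength to
    already-placed neighbors. edges is a set of (a, b) net pairs."""
    pos = {}
    cells = list(itertools.product(range(grid), range(grid)))
    for comp in components:
        free = [c for c in cells if c not in pos.values()]
        def cost(cell):
            # total Manhattan distance to every placed neighbor
            nbrs = [o for o in pos if (comp, o) in edges or (o, comp) in edges]
            return sum(abs(cell[0] - pos[n][0]) + abs(cell[1] - pos[n][1]) for n in nbrs)
        pos[comp] = min(free, key=cost)
    return pos

# hypothetical three-block netlist: cpu-cache and cache-io are connected
layout = place(["cpu", "cache", "io"], {("cpu", "cache"), ("cache", "io")})
```

The real system's contribution is exactly what this sketch lacks: a learned value function that anticipates how early placements constrain later ones, which is why it improves with every chip it designs.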

In January 2026, AlphaChip's creators — Dr. Anna Goldie and Dr. Azalia Mirhoseini — launched Ricursive Intelligence, raising $335 million at a $4 billion valuation in under four months. The company's explicit mission: a recursive self-improvement cycle where AI designs the silicon that powers the next generation of AI. Their target customers include AMD, Intel, and every major chip maker. NVIDIA's venture arm, NVentures, is among the investors. The recursive loop now has its own company, its own capital stack, and its own valuation reflecting the market's bet that agentic EDA will replace human-heavy design cycles.

Meanwhile, the broader EDA industry is transforming. Synopsys and Cadence — the two companies whose tools the semiconductor industry runs on — are building generative AI directly into their design toolchains. At DVCon 2026, Alpha Design AI introduced the industry's first multi-agent AI teams for root cause analysis, collaborating on complex debugging tasks with no human in the loop. Verification times — where most of the twenty-four-month timeline actually lives — have been cut five- to tenfold on analog IP migration projects.

The honest ceiling isn't whether AI chip design works — that question is answered by shipping silicon. It's where it stops working. AlphaChip dominates placement and routing. Full-stack verification — proving a chip functions correctly under every edge condition — is still where humans spend most of the timeline, and AI verification tools are themselves prone to missing corner cases. The loop closes, but it closes unevenly: fast in layout, slow in proving the layout is safe.

There's also an architectural bypass emerging that changes the equation entirely. Chiplet architectures — the "Silicon Lego Revolution" — break the assumption that each better chip is a monolithic die requiring EUV. Intel's Foveros 3D stacking in its Clearwater Forest Xeon processors achieves 9-micrometer copper-to-copper bump pitch, stacking compute tiles directly onto active base dies. NVIDIA's co-packaged optics stack 3nm electrical dies on 65nm optical dies. TSMC plans System-on-Wafer (SoW-X) by 2027, integrating up to sixteen compute dies and eighty HBM4 stacks on a single package. The critical compute tiles need EUV; the I/O controllers, memory interfaces, and interconnect fabric can be manufactured on older, cheaper process nodes. This reduces EUV dependency per unit of useful compute without needing more ASML machines — an architectural end-run around the bottleneck rather than a direct assault on it. The next iteration of the UCIe 2.0 standard is expected to incorporate silicon photonics directly, allowing chiplets to communicate via light and sharply reducing heat generation from data transfer.
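The economics of the end-run reduce to simple area arithmetic. Every number below (die size, the fraction of area that genuinely needs EUV patterning) is a hypothetical assumption for illustration, not a measured figure:

```python
# Illustrative chiplet arithmetic: if only the compute tiles need EUV,
# disaggregating the die shrinks the EUV-patterned area per product.
# All figures here are assumed for illustration.
monolithic_die_mm2 = 800       # everything on one EUV-patterned die
compute_fraction = 0.45        # assumed share of area that genuinely needs EUV

# chiplet split: same functionality, but only compute tiles use the EUV node;
# I/O, memory interfaces, and interconnect move to older, cheaper nodes
euv_area_chiplet = monolithic_die_mm2 * compute_fraction     # 360 mm² on EUV
euv_reduction = 1 - euv_area_chiplet / monolithic_die_mm2    # 55% less EUV area
```

Under these assumptions, each EUV machine's output stretches roughly twice as far per unit of shipped compute, without ASML building a single additional tool.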


Loop 2: AI Discovering New Materials

The ASML bottleneck is fundamentally a silicon bottleneck. Silicon isn't the only viable semiconductor substrate — it's the one the industry has optimized for sixty years. The materials discovery pipeline has historically been the slowest part of semiconductor progress: find a promising compound, grow it, characterize it, figure out how to manufacture it at scale. Each new material traditionally takes decades to move from lab to production.

Self-driving laboratories — automated research facilities where AI designs experiments, robotic systems execute them, and machine learning models analyze the results in continuous cycles — are collapsing that timeline. Argonne National Lab ran six thousand battery chemistry experiments in five months, work that would have taken years manually. NC State demonstrated a technique that collects ten times more materials data at record speed. Boston University's autonomous lab has run over twenty-five thousand experiments since 2023 with minimal human oversight, discovering the most efficient energy-absorbing material to date. MIT is using generative AI to synthesize complex materials that previously required purely empirical trial-and-error.
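The closed propose-run-learn cycle these labs automate can be sketched as a toy loop. The "experiment" below is a simulated measurement of a hypothetical material property peaking at composition 0.73; real systems use robotic synthesis and Bayesian surrogate models rather than this simple grid-refinement stand-in.

```python
# Toy self-driving-lab loop: coarse screen, then refinement cycles
# around the best result. The measurement is simulated; a real lab
# would dispatch each proposed composition to robotic synthesis.
def run_experiment(x):
    # simulated measurement: a hypothetical property peaking at x = 0.73
    return -(x - 0.73) ** 2

# cycle 1: the planner proposes a coarse screen across composition space
tested = {x / 10: run_experiment(x / 10) for x in range(11)}

# later cycles: refine around the current best as the model narrows in
for _ in range(3):
    current_best = max(tested, key=tested.get)
    for dx in (-0.05, 0.05, -0.02, 0.02):
        x = round(current_best + dx, 3)
        tested[x] = run_experiment(x)

best_composition = max(tested, key=tested.get)   # lands near 0.73
```

The point of the automation is the iteration count: a loop like this runs as fast as the robots can synthesize, which is how Argonne fit six thousand experiments into five months.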

[Image: AI-driven materials discovery — lattice structures dissolving into photonic pathways]

The concrete result: gallium nitride (GaN) manufacturing just hit a breakthrough. Infineon used convolutional neural networks to achieve 99% accuracy in identifying nanoscale lattice mismatches during 300mm GaN-on-silicon epitaxy — the process step that was the mass-manufacturing bottleneck. GaN chip costs are projected to drop 50% by end of 2026. GaN switches faster, handles higher voltages, and wastes less energy than silicon. onsemi's vertical GaN technology reduces energy loss by almost 50%. Beyond power delivery, researchers are now using GaN microLEDs for neuromorphic optical computing — transmitting and processing information using photons rather than electrons, which travel faster and generate far less heat. The Nitride Technology Center has secured €15 million in funding to combine GaN microLEDs with silicon integrated circuits for energy-efficient AI.

The photonic interconnect pathway is accelerating independently. TSMC announced a strategic partnership with Avicena focused on microLED-based chip-to-chip optical communication — marking TSMC's official entry into using microLEDs beyond display applications. Avicena's technology achieves ultra-low-energy optical interconnects with reach up to thirty meters, far exceeding copper cable capabilities for GPU-to-GPU and CPU-to-memory connections. The startup raised $65 million in Series B funding with participation from SK Hynix. Meanwhile, China is making a national bet on optical computing — Nature reported in early 2026 that Chinese researchers are pursuing photonic chips as a potential path around export controls on silicon-based processors.

Cornell's "dualtronic" chip combines photonic and electronic functions simultaneously on a single GaN semiconductor, potentially shrinking device size while improving energy efficiency. Carbon nanotubes remain the longer-shot substrate replacement — MIT demonstrated a programmable CNT processor, and AI-guided synthesis is accelerating progress on the purity problem, but volume production is still years out.

This loop runs slower than Loop 1 — years instead of months — but the payoff is exponential because it changes the substrate itself, not just the design etched onto it. AI accelerates materials science, new materials bypass silicon bottlenecks, more efficient compute enables better AI for the next round of materials science.


Loop 3: AI Solving Its Own Energy Crisis

Patel's three-and-a-half-tools-per-gigawatt number from the Dwarkesh interview is the energy constraint expressed as a semiconductor constraint. Every new fabrication facility needs a power plant's worth of electricity. Every new data center needs its own grid connection. Google's Q4 turbine purchases weren't about chips — they were about watts.

AI is attacking this from multiple directions simultaneously.

Fusion acceleration. Total fusion industry funding went from $1.7 billion in 2020 to $15 billion by September 2025. Commonwealth Fusion Systems (CFS) began assembling SPARC in January 2026 — the first of eighteen toroidal field magnets placed on its assembly jig — with first plasma targeted for 2027. At CES 2026, CFS unveiled a digital twin of SPARC built with Siemens and NVIDIA, using NVIDIA's Omniverse and AI models to compress years of physical experimentation into weeks of virtual optimization. If SPARC succeeds, CFS's first commercial fusion plant — ARC, a 400-megawatt station near Richmond, Virginia — is slated to come online in the early 2030s, powering about 300,000 homes. The DOE's StellFoundry project uses AI to replace lengthy plasma physics calculations with fast digital models, letting scientists test stellarator configurations in hours instead of months. Instead of analytically solving the plasma confinement problem — which humans have been failing at for seventy years — AI treats it as an optimization problem and brute-forces the configuration space.
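The brute-force pattern described here — replace a slow physics solve with a fast model, then search the configuration space — can be sketched as random search over a surrogate. The surrogate below is a stand-in quadratic, not a plasma model, and the parameter names and ranges are invented for illustration:

```python
# Sketch of surrogate-driven configuration search: swap an expensive
# physics simulation for a fast learned model, then brute-force the
# design space. The surrogate here is a stand-in, not a plasma code.
import random

def surrogate_confinement(coil_angle, field_strength):
    # hypothetical fast model of confinement quality (higher is better),
    # peaking at coil_angle = 0.4, field_strength = 2.1
    return -(coil_angle - 0.4) ** 2 - (field_strength - 2.1) ** 2

random.seed(0)
best_cfg, best_q = None, float("-inf")
for _ in range(100_000):          # hours of virtual search vs. months of lab time
    cfg = (random.uniform(0, 1), random.uniform(1, 3))
    q = surrogate_confinement(*cfg)
    if q > best_q:
        best_cfg, best_q = cfg, q
```

A hundred thousand surrogate evaluations cost seconds; a hundred thousand physical plasma shots is a research program. That gap is the whole argument for the digital-twin approach.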

Nuclear fission renaissance. The fusion timeline is decade-scale, but fission is moving now. In February 2026, the DOE published twenty-six "Genesis Mission" challenges for applying AI to energy and national security. The flagship nuclear challenge, codenamed "PROMETHEUS", is a partnership between Idaho National Laboratory and NVIDIA targeting at least 2x schedule acceleration and greater than 50% operational cost reduction for nuclear reactor deployment — deploying AI across design, licensing, manufacturing, construction, and operations. X-energy is breaking ground on four 80MW reactors for Dow Chemical in Texas in 2026. NextEra Energy plans to restart Iowa's Duane Arnold nuclear plant by 2029 with Google as the power purchaser. A third of data center operators now rank nuclear among their top alternative energy sources. Nature named AI plus nuclear as one of the agenda-setting technologies for 2026.

Grid intelligence and geothermal. AI-optimized data centers achieve power usage effectiveness (PUE) close to 1.1 — meaning nearly all the electricity goes to compute, almost nothing wasted on cooling. A PUE of 1.0 would be perfect efficiency; the industry average was 1.58 as recently as 2020. Companies like Zanskar use AI to identify geothermal resources that were previously thought non-existent — baseload power that doesn't need fuel, discovered by an AI that can process geological survey data faster than any human team.
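The PUE figures translate directly into grid demand. A minimal calculation, using the numbers quoted above:

```python
# PUE = total facility power / power delivered to IT equipment,
# so 1.0 is perfect efficiency and everything above it is overhead.
def pue(total_mw, it_mw):
    return total_mw / it_mw

# for a facility with 100 MW of actual compute load:
it_load_mw = 100
legacy_total = it_load_mw * 1.58      # 2020 industry average: 158 MW from the grid
optimized_total = it_load_mw * 1.10   # AI-optimized: 110 MW from the grid
saved_mw = legacy_total - optimized_total   # 48 MW freed per 100 MW of compute
```

Forty-eight megawatts per hundred of compute is, at Patel's ratio, a sizable fraction of an EUV tool's worth of capacity recovered from cooling overhead alone.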

Loop 3 operates on two timescales. Nuclear fission, grid optimization, and geothermal are delivering results now — 2026 through 2030. Fusion is the decade-scale play that removes the ultimate ceiling. AI needs energy, AI optimizes energy production, more energy becomes available for AI.


Loop 4: AI Improving AI

This is the loop that prompted the first ICLR workshop dedicated to recursive self-improvement in 2026 — a signal that the research community now takes the concept seriously enough to formalize it.

The concrete examples are no longer hypothetical. AlphaEvolve — Google DeepMind's evolutionary coding agent released in May 2025 — mutates and selects algorithms through an LLM, producing genuinely novel optimizations that no human programmed. Its achievements include surpassing a fifty-year-old record in matrix multiplication via tensor decomposition and optimizing Google's data center scheduling to recover 0.7% of the company's worldwide compute resources. Anthropic CEO Dario Amodei stated the mechanism explicitly at Davos in January 2026: "We would make models that were good at coding and good at AI research, and we would use that to produce the next generation of models and speed it up to create a loop." OpenAI's Codex release cadence compressed from six-month gaps to under two months. The workforce multiplier effect is already visible: frontier labs are automating their own research operations, with effective AI-augmented workforces growing from thousands to tens of thousands — and this accelerates not just Loop 4 but all loops simultaneously.
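The mutate-score-select skeleton behind this kind of system can be shown with a deliberately tiny stand-in. Here candidates are strings and the evaluator counts matching characters; in AlphaEvolve the mutation operator is an LLM editing real programs and the evaluator compiles and benchmarks them, which this sketch does not attempt to reproduce.

```python
# Minimal evolutionary loop: mutate a candidate, score it with an
# automatic evaluator, keep it if it is no worse. Candidates here are
# strings; in AlphaEvolve they are programs mutated by an LLM.
import random

random.seed(0)
TARGET = "sort faster"                      # stand-in for the evaluator's goal
ALPHABET = "abcdefghijklmnopqrstuvwxyz "

def fitness(candidate):
    # automatic evaluator: count positions matching the target
    return sum(a == b for a, b in zip(candidate, TARGET))

def mutate(candidate):
    # random single-character edit (an LLM plays this role in AlphaEvolve)
    i = random.randrange(len(candidate))
    return candidate[:i] + random.choice(ALPHABET) + candidate[i + 1:]

best = "".join(random.choice(ALPHABET) for _ in TARGET)
for _ in range(20_000):
    child = mutate(best)
    if fitness(child) >= fitness(best):     # selection: keep the better candidate
        best = child
```

Nothing in the loop knows what "sort faster" means; it discovers it by mutation pressure against the evaluator. Swap in a benchmark suite as the evaluator and real code as the candidates, and the same skeleton starts producing optimizations no human wrote.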

The hardware arriving in 2026 gives this loop its physical substrate. NVIDIA's Vera Rubin platform, unveiled at CES 2026 and already in production as of Q1 2026, delivers 5x greater inference performance and 10x lower cost per token than the Blackwell generation it replaces. Each Vera Rubin NVL72 rack delivers 3.6 exaFLOPS of inference performance with 20.7 TB of HBM4 memory offering 1.6 petabytes per second of bandwidth. Six co-designed chips — GPU, CPU, NVLink switch, SuperNIC, DPU, and Ethernet switch — working as a single system. The recursive loop now runs on hardware specifically designed to make the loop faster.

Meanwhile, 3D chip architectures that go vertical instead of continuing to shrink horizontally are outperforming 2D designs by twelve times on real AI workloads, according to Stanford research. This represents a fundamentally different approach to the scaling problem: instead of requiring ever-finer lithography, 3D stacking increases compute density by building upward on existing process nodes.

The bottleneck framing assumes a fixed demand curve hitting constrained supply. But AI is simultaneously increasing the efficiency of demand, accelerating supply through all four loops, and discovering entirely new approaches that bypass the bottleneck altogether.


Loop 5: The Constraint Nobody Models

The first four loops are technical. The fifth is political — and it may be the hardest to close.

ASML cannot ship EUV machines to China. TSMC — which manufactures over 90% of the world's most advanced chips at the 3nm node — sits on a geopolitical fault line that makes the Taiwan Strait the single most consequential bottleneck in the global economy.

The policy landscape shifted dramatically in January 2026. On January 13, the Commerce Department revised its export review posture for advanced AI chips like NVIDIA H200 and AMD MI325X, moving from "presumption of denial" to "case-by-case review." The next day, President Trump imposed a 25 percent tariff on advanced AI chips not destined for the U.S. supply chain. The U.S. CHIPS and Science Act's $52.7 billion in subsidies has catalyzed a broader investment wave exceeding $630 billion across 140 projects by late 2025. The EU Chips Act, Japan's semiconductor subsidies, China's National IC Fund — every major power is spending tens of billions to localize portions of a supply chain that was, until recently, globally optimized for efficiency rather than resilience.

China's response to these constraints is itself a form of recursive resolution. DeepSeek V4 — a trillion-parameter multimodal model expected in early 2026 — is optimized to run on Huawei Ascend and Cambricon chips, demonstrating that frontier AI models can be trained and deployed on Chinese-made silicon despite U.S. export controls. DeepSeek V3 already matched GPT-4 class performance using a Mixture-of-Experts architecture activating only 37 billion of its 671 billion parameters per query, compressing the key-value cache by 93%, and training on just 2,048 H800 GPUs — far fewer than Western labs use for comparable models. As Nature reported in early 2026, China is simultaneously pursuing photonic computing as a potential path around silicon-based export controls entirely.

This creates a fracture that the technical loops can't resolve on their own. AI can design a better chip, but it can't ship that chip to a restricted destination. AI can discover a new material, but the export control classification of that material is a legal question, not a physics one. The recursive resolution works within a jurisdiction. Across jurisdictions, it fragments into parallel spirals — running on different substrates, different architectures, and different regulatory ceilings. Whether this produces convergence or divergence is the question that no technical loop can answer.


The Compounding Effect

Every previous technology improvement — steam to electric, vacuum tube to transistor, mainframe to PC — required humans to identify the bottleneck, research alternatives, engineer solutions, and iterate. The cycle time was measured in decades because the intelligence doing the solving was fixed.

What's happening now is structurally different. The intelligence doing the solving is the thing being solved for. Each improvement feeds back into faster chip design, faster materials discovery, faster energy solutions, and faster architectural innovation. The five loops don't run independently — they compound. AlphaChip designs a better TPU, the better TPU trains a better materials-discovery model, the materials-discovery model finds a GaN process that drops power consumption 50%, the saved power runs more TPUs, those TPUs train a better AlphaChip.

The ASML number — seventy tools per year — is a real constraint right now. But the question isn't whether seventy becomes a hundred and forty through normal ASML scaling. The question is whether AI discovers a lithography approach that makes EUV unnecessary for certain chip classes, designs architectures so efficient that fewer EUV-made chips deliver the same compute, or finds materials that can be patterned with cheaper tools.


The Ceiling Question

There is a version of this argument that is too clean. Recursive improvement sounds exponential, and exponentials are seductive — but every real optimization curve in history has been an S-curve. The first order-of-magnitude improvement in chip design automation was dramatically easier than the next factor-of-two. AlphaChip's gains were largest on its first generation; the marginal improvement on each subsequent TPU generation is smaller. Algorithmic efficiency improvements in machine learning follow the same pattern: the jump from GPT-2 to GPT-3 required roughly ten times the compute; the jump from GPT-4 to the next frontier model required less than two times, because better training methods absorbed most of the gap. But the rate of improvement in training methods is itself slowing as the low-hanging fruit gets picked.

Each loop has a natural ceiling — not imposed by physics in the abstract, but by the specifics of the optimization landscape. Materials discovery is bounded by thermodynamics. Energy production is bounded by conversion efficiency limits. Chip design is bounded by the interconnect delay problem, which no amount of transistor shrinkage solves. The recursive spiral doesn't escape these bounds — it approaches them faster.

This means the interesting question isn't "will the loops keep running?" — they will — but "when does each loop saturate, and what happens to the compound effect when individual loops flatten?"

[Image: The memory wall — stacked layers of compute and bandwidth, each with its own constraint]

There's also a structural blind spot in the bottleneck narrative: the four technical loops primarily address the training constraint — logic density, chip design speed, manufacturing throughput. But inference workloads now account for two-thirds of all AI compute, and inference has a different binding constraint: memory bandwidth — the speed at which data moves between storage and compute — not logic density. Most LLM inference workloads are memory-bandwidth-bound rather than compute-bound.

HBM is sold out through 2026, with demand growing over 70% year over year. The memory wall has become the binding constraint, and the industry is responding: HBM4E doubles bandwidth to 4.1 terabytes per second, Rambus delivered controller IP in March 2026, and SK Hynix is developing High Bandwidth Flash (HBF) targeting 512GB capacity at over 1,638 GB/s, with samples expected in the second half of 2026.

Inference accounts for 80-90% of the lifetime cost of a production AI system. The recursive loops might dissolve the training bottleneck while the inference bottleneck — larger and growing faster — requires a different set of solutions: memory architecture innovation, the kind of inference-specific silicon that NVIDIA's Vera Rubin represents with its 10x lower cost per token, and algorithmic breakthroughs like FlashAttention-4, which reaches 1,605 TFLOPS on Blackwell.
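The memory-bound claim follows from a one-line roofline argument: at small batch sizes, generating each token streams the full weight set through memory, so bandwidth sets the ceiling regardless of available FLOPs. The model size below is a hypothetical example; the bandwidth figure is the HBM4E number quoted above.

```python
# Memory-bound token rate: each generated token must read the model's
# weights from memory, so bandwidth caps throughput at small batch sizes.
def max_tokens_per_sec(model_bytes, bandwidth_bytes_per_sec):
    # upper bound assuming one full weight read per token
    return bandwidth_bytes_per_sec / model_bytes

model_bytes = 70e9 * 2         # hypothetical 70B-parameter model in FP16
hbm4e_bandwidth = 4.1e12       # 4.1 TB/s, the HBM4E figure quoted above

ceiling = max_tokens_per_sec(model_bytes, hbm4e_bandwidth)   # ≈ 29 tokens/s
```

Batching amortizes the weight read across many concurrent requests, which is why serving systems push batch sizes up — and why memory architecture, not logic density, is the lever that actually moves this constraint.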

There's a more subtle version of this question that the bottleneck framing misses entirely: what if the constraint dissolves not because supply expands, but because demand shrinks? The DeepSeek effect is the clearest evidence. DeepSeek-V3 matched GPT-4 class performance while reducing costs up to 90% compared to traditional training methods. Mixture-of-Experts architectures activate only 5-6% of parameters per query. Model distillation has become a key technique across the industry in 2026 as AI deployment expands to edge devices. If the capability frontier can be reached with less compute per unit of intelligence, then the ASML bottleneck matters less — not because there are more EUV machines, but because each machine's output goes further.
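The demand-side arithmetic is worth making explicit, using the DeepSeek-V3 figures quoted above:

```python
# Mixture-of-Experts arithmetic from the DeepSeek-V3 figures quoted above:
# per-token compute scales with ACTIVE parameters, not total parameters.
total_params = 671e9
active_params = 37e9

active_fraction = active_params / total_params     # ≈ 0.055, i.e. ~5.5% per query

# versus a dense model of the same size, per-token compute demand
# drops by roughly this factor
dense_equivalent_savings = total_params / active_params   # ≈ 18x
```

An 18-fold drop in per-token compute is, in bottleneck terms, equivalent to ASML shipping eighteen times as many tools — which is the sense in which the denominator is moving.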

This is the reframe the bottleneck analysis needs most: the denominator is moving, not just the numerator. The recursive resolution may work less by producing more compute and more by requiring less of it.


What This Means

The bottleneck is real and it matters today. As Patel details in his conversation with Dwarkesh, companies that secured long-term TSMC contracts early — NVIDIA being the most prominent — hold a structural advantage that takes years to replicate. The supply chain is genuinely constrained by the number of machines one Dutch company can build per year. Anthropic's delayed commitment to compute contracts, Google's late-stage turbine scramble — these are the consequences of underestimating how physical the AI buildout would become.

But the framing that treats this as a static constraint misses the recursive nature of the problem. A constraint on an intelligence that can work to dissolve its own constraints is a fundamentally different kind of constraint than one on a passive system.

The five loops operate on different timescales. Chip design (Loop 1) is already producing measurable results in shipping hardware — and now has a $4 billion startup dedicated to accelerating the cycle. AI self-improvement (Loop 4) is running on Vera Rubin hardware already in production. Materials science (Loop 2) is three to five years from changing the substrate — with GaN power stages, photonic interconnects via TSMC-Avicena, and optical computing as the nearest arrivals. Energy (Loop 3) is operating on two tracks: nuclear fission and grid optimization delivering now through 2030, with DOE's PROMETHEUS initiative targeting halved deployment timelines; fusion removing the ultimate ceiling by the early 2030s. Governance (Loop 5) operates on an indeterminate timescale because it's driven by geopolitics, not engineering — and it's the only loop that may fragment the spiral rather than accelerate it.

The ceiling question tempers the exponential narrative. Each loop will flatten. The compound effect will slow. But even S-curves, when stacked and staggered, produce something that looks exponential during the middle phase — and we are, by most measures, still in the middle phase of every loop except chip design.
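The stacked-and-staggered claim is easy to check numerically: sum a few logistic curves with offset midpoints, and the aggregate keeps compounding through the middle phase even as the earliest curve flattens, while still saturating at a hard ceiling. The curve timings below are arbitrary illustrations, not calibrated forecasts.

```python
# Stacked, staggered S-curves: each loop saturates individually, but the
# sum keeps compounding mid-phase. Midpoints are arbitrary illustrations.
import math

def logistic(t, midpoint, rate=1.0):
    # one loop's S-curve: slow start, rapid middle, saturation
    return 1.0 / (1.0 + math.exp(-rate * (t - midpoint)))

MIDPOINTS = [2, 4, 6, 8, 10]   # five loops, staggered by a few "years"

def stacked(t):
    return sum(logistic(t, m) for m in MIDPOINTS)

early, mid, late = stacked(1), stacked(6), stacked(11)
```

Mid-phase, the aggregate grows far faster than the first (already-flattening) curve alone, yet it can never exceed the sum of the five ceilings — which is the S-curve tempering of the exponential narrative in one picture.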

The most underappreciated dynamic may be the demand side. If algorithmic efficiency keeps improving at its current rate — and DeepSeek's 90% cost reduction suggests it will — the bottleneck that seemed permanent may dissolve not because the funnel widens, but because less needs to flow through it. The ASML number stays at seventy. The number of EUV-patterned chips needed per unit of useful intelligence drops by half. The constraint is the same; its relevance is not.

By the early 2030s, the technical loops should be running at speed — and the compound effects of their interaction are harder to predict than any single loop's trajectory. Whether those loops run in one global system or fragment into competing stacks depends on the governance loop, which no amount of technical recursion can resolve.

The recursive resolution isn't a prediction. It's a description of what's already happening — unevenly, with real ceilings, across a fractured geopolitical landscape. The question is not only how fast, but for whom.