2026-01-27 - Create Content

Context

Goal

Let’s play around with this a bit.

Ok, today we’re creating the content for a long-form newspaper/magazine.

This report and the associated newspaper will be dated 2026-01-27. Be sure to use that date and include the day of the week. You can note the date this was actually generated at the bottom if you’d like.

The title of the newspaper will be “The Review”

I’ve requested a research report to verify facts and re-organize themes; it’s attached at the end of this prompt. The catch is that we’re taking the research and themes and having fun with them. These are dry topics. How can we play around with them? Are there any good sourced quotes, comments, editorials, or essays that are funny and on topic? Make it light, but be sure you’re not lying about the facts.

Write each story as a traditional newspaper story in the inverted-pyramid format. Write for a higher-education reading level, except for the lead sentence, which should be readable by most anybody deciding whether to continue reading the story (as in a traditional newspaper). Continue until you have all the stories created.

Then make something to put at the top of our newspaper: a brief introduction, perhaps 2-5 paragraphs, along with a headline, telling the reader what the rest of the document is going to be. That’ll be our lead at the top before folks dive into each headline, and it should give them a good idea of whether they want to read anything in the paper at all.

At the bottom, give your editorial based on the information and the overarching connecting theme. None of these assignments (the stories, the introduction, and the editorial) should take more than 10 minutes to read. Try to write good headlines for each story that are non-technical.

Finally, don’t tell me about my instructions to you as far as the newspaper goes. The top part should be the pitch for the entire paper only, not you repeating all the instructions and constraints.

No matter what, be sure to follow the editorial guidelines.

For those interested in pursuing pro/con commentary further, I’d like links to the opinion pieces that best represent each side. I’ve been a big fan of the RealClear series of websites, as they give a broad overview of the opinion community. Sadly, though, much opinion writing is simply hair-on-fire rage bait rather than well-thought-out argument. There’s a lot of audience capture.

I know that you have access to even more current opinion pieces, like X posts and essays linked from X. There’s still that quality problem, though. For each of the newspaper articles you make, plus the editorial, scan recent opinion pieces (less than 4 weeks old) and give me the best pro and the best con essay under each article and the editorial. I’d also like a new, more newsworthy title for each, along with one word representing the author. The heading should be something like “Pros and Cons” in a smaller font than the story headline; I guess that’s H4.

A style guide for the newspaper is included below, before the research paper:

Just to emphasize, I want places in each article to hold images or infographics I can create or find later. If you find an image or infographic, put it in there. Colored infographics are great. Those kinds of pencil-sketch portraits like you used to see in the NYT are also cool. But don’t worry about images unless you can find one; we’ll do that in the formatting stage. I want actual links to the pros and cons, with brief descriptions of their arguments.

APPLY WHAT YOU CAN FROM THE STYLE GUIDE, BUT WE’RE NOT DOING GRAPHICAL LAYOUT HERE. We simply want to make sure any content material we can find makes it into the markdown.

You probably want to break this work up into small pieces because it might crash and you’ll need to pick back up where you left off.

Background

Relevant context, prior work, and constraints

Success Criteria

How will you know when this is done well?

Daily Newspaper Style Guide

This style guide ensures consistency across all editions of the daily newspaper. It applies to both human editors and large language models (LLMs) during the final polishing stage, after core content (articles, headlines, images, etc.) has been drafted. The goal is to maintain a professional, readable, and uniform appearance, fostering reader trust and brand recognition. Adhere strictly to these rules unless overridden by specific editorial decisions.

1. Overall Structure and Layout

  • Edition Header (Masthead): Every edition must start with a centered masthead block including:
    • Volume and issue details, day, date, and price in uppercase, small caps or equivalent, on one line (e.g., “VOL. I, NO. 47 • SUNDAY, JANUARY 11, 2026 • PRICE: ONE MOMENT OF ATTENTION”), centered, in 10-12pt font.
    • Newspaper name in bold, uppercase, large font (e.g., 48pt), split across two lines if needed (e.g., “THE GLOBAL” on first line, “CONNECTOR” on second), centered.
    • Tagline in quotes, italic, below the name (e.g., “Tracing the threads that hold the world together—before they snap”), centered, in 14pt font.
    • A horizontal rule (---) below the masthead for separation.
    • Example in markdown approximation:
      VOL. I, NO. 47 • SUNDAY, JANUARY 11, 2026 • PRICE: ONE MOMENT OF ATTENTION
      
      THE GLOBAL
      CONNECTOR
      
      *"Tracing the threads that hold the world together—before they snap"*
      
      ---
      
  • Background and Visual Style: Aim for a newspaper-like background in digital formats (e.g., light beige or subtle paper texture via CSS if possible; in plain markdown, note as a design instruction for rendering).
  • Sections: Organize content into a themed newsletter format rather than rigid categories. Start with an introductory article, followed by 4-6 main stories, and end with an editorial. Each story should stand alone but tie into the edition’s theme.
    • Introductory article: Begins immediately after masthead, with a main headline in bold, title case.
    • Main stories: Each starts with a bold headline, followed by a subheadline in italic.
    • Editorial: Labeled as “EDITORIAL” in uppercase, bold, with its own headline.
    • Separate sections with ❧ ❧ ❧ or similar decorative dividers.
    • Limit total content to 2000-3000 words for a daily edition.
  • Page Breaks/Flow: In digital formats, use markdown or HTML breaks for readability. Aim for a “print-like” flow: no more than 800-1000 words per “page” equivalent. Use drop caps for the first letter of major articles.
  • Footer: End every edition with:
    • A horizontal rule.
    • Production Note: A paragraph explaining the collaboration between human and AI, verification process, and encouragement of skepticism (e.g., “Production Note: This edition… Your skepticism remains appropriate and encouraged.”).
    • Coming Next: A teaser for the next edition (e.g., “Coming Next Week: [Theme]—examining [details]. Also: [additional hook].”).
    • Copyright notice: “© 2026 [Newspaper Name]. All rights reserved.”
    • Contact info: “Editor: [Name/Email] | Submissions: [Email]”.
    • No page count; end with a clean close.

2. Typography and Formatting

  • Fonts (for digital/print equivalents):
    • Headlines: Serif font (e.g., Times New Roman or Georgia), bold, 18-24pt.
    • Subheadlines: Serif, italic, 14-16pt.
    • Body Text: Serif, regular, 12pt.
    • Captions/Quotes: Sans-serif (e.g., Arial or Helvetica), 10pt, italic.
    • Use markdown equivalents: # for main headlines, ## for sections, bold for emphasis, italic for quotes/subtle emphasis.
  • Drop Caps: Introduce new articles or major sections with a drop cap for the first letter (e.g., a large, bold initial, as in **W**elcome). In markdown, approximate with a bold first letter and continue the paragraph; in rendered formats, use CSS for a 3-4 line height drop.
  • Headlines:
    • Main article headlines: Capitalize major words (title case), no period at end.
    • Keep to 1-2 lines (under 70 characters).
    • Example: “Everything Is Connected (By Very Fragile Stuff)”
  • Body Text:
    • Paragraphs: 3-5 sentences each, separated by a blank line.
    • Line length: 60-80 characters for readability.
    • Bullet points for lists (e.g., key facts): Use - or * with consistent indentation.
    • Tables: Use markdown tables for data. Align columns left for text, right for numbers.
  • Pull Quotes (Drop Quotes): Insert 1-2 per story, centered, in a boxed or indented block, larger font (14pt), italic, with quotation marks. Place mid-article for emphasis. Example in markdown:
    > "The tech giants in California scream about latency and 'packet loss,' viewing the outage as a software bug. The ship captain knows the truth: the internet is just a wire in the ocean."
    
  • Emphasis:
    • Bold (**text**) for key terms or names on first mention.
    • Italics (*text*) for book titles, foreign words, or emphasis.
    • Avoid ALL CAPS except in headers.
    • No underlining except for hyperlinks.
  • Punctuation and Spacing:
    • Use Oxford comma in lists (e.g., “apples, oranges, and bananas”).
    • Single space after periods.
    • Em-dashes (—) for interruptions, en-dashes (–) for ranges (e.g., 2025–2026).
    • Block quotes: Indent with > or use italics in a separate paragraph for quotes longer than 2 lines.

3. Language and Tone

  • Style Standard: Follow Associated Press (AP) style for grammar, spelling, and abbreviations.
    • Numbers: Spell out 1-9, use numerals for 10+ (except at sentence start).
    • Dates: “Jan. 12, 2026” (abbreviate months when with day).
    • Titles: “President Joe Biden” on first reference, “Biden” thereafter.
    • Avoid jargon; explain acronyms on first use (e.g., “Artificial Intelligence (AI)”).
  • Tone: Neutral, factual, and objective for news stories, with a witty, reflective edge. Editorial may be more opinionated but balanced. Overall voice: Professional, concise, engaging—aim for a reading level of 8th-10th grade. Use direct address like “dear reader” in intros.
  • Length Guidelines:
    • Introductory article: 200-400 words.
    • Main stories: 300-500 words each.
    • Editorial: 400-600 words.
    • Avoid fluff; prioritize who, what, when, where, why, how, with thematic connections.
  • Inclusivity: Use gender-neutral language (e.g., “they” instead of “he/she”). Avoid biased terms; represent diverse perspectives fairly.
  • For Further Reading: Perspectives: At the end of each story and editorial, include a “FOR FURTHER READING: PERSPECTIVES” section. Use PRO (green box) and CON (red box) for balanced views. Each entry: Bold label (PRO or CON), title in quotes, source with hyperlink. Approximate boxes in markdown with code blocks or tables; in rendered formats, use colored backgrounds (e.g., light green for PRO, light red for CON). Example:
    FOR FURTHER READING: PERSPECTIVES
    
    **PRO** "Why Governments Must Control Cable Repair" — Parliament UK Joint Committee Report  
    Source: [publications.parliament.uk](https://publications.parliament.uk) (September 2025)
    
    **CON** "Sabotage Fears Outpace Evidence" — TeleGeography Analysis  
    Source: [blog.telegeography.com](https://blog.telegeography.com) (2025)
    

4. Images and Media

  • Placement: Insert images after the first or second paragraph of relevant articles. Use 1-2 per article max. No images in this example, but if used, tie to stories (e.g., maps for cables, illustrations for AI).
  • Formatting:
    • Size: Medium (e.g., 400-600px wide) for main images; thumbnails for galleries.
    • Alignment: Center with wrapping text if possible.
    • In text-based formats, describe images in brackets: [Image: Description of scene, credit: Source].
  • Captions: Below images, in italics, 1-2 sentences. Include credit (e.g., “Photo by Jane Doe / Reuters”).
  • Alt Text (for digital): Provide descriptive alt text for accessibility (e.g., “A bustling city street during rush hour”).
  • Usage Rules: Only relevant, high-quality images. No stock photos unless necessary; prefer originals or credited sources.

5. Editing and Proofing Checklist

Before finalizing:

  • Consistency Check: Ensure all sections follow the structure. Cross-reference dates, names, facts, and thematic ties.
  • Grammar/Spelling: Run through a tool like Grammarly or manual review. Use American English (e.g., “color” not “colour”).
  • Fact-Checking: Verify claims with sources; add inline citations if needed (e.g., [Source: Reuters]).
  • Readability: Read aloud for flow. Break up dense text with subheads, pull quotes, or bullets.
  • LLM-Specific Notes: If using an LLM for polishing, prompt with: “Apply the style guide to this draft: [insert content]. Ensure consistency in structure, tone, formatting, including drop caps, pull quotes, and perspectives sections.”
  • Variations: Minor deviations allowed for special editions (e.g., holidays), but document changes.

This guide should be reviewed annually or as needed. For questions, contact the editor-in-chief. By following these rules, each edition will maintain a polished, predictable look that readers can rely on.

Failure Indicators

Input

The Weight of Intelligence

How the Physical World Is Reshaping the Future of AI


A Long-Form Investigation into the Material Limits, Geopolitical Fractures, and Radical Innovations Defining the Next Decade of Artificial Intelligence


![Conceptual Infographic: The AI Infrastructure Stack showing layers from orbital satellites to submarine cables, with bottlenecks marked at Energy, Cooling, Packaging, and Architecture levels]

Figure 1: The Material Stack of AI — A visualization showing how frontier AI models depend on a fragile tower of physical infrastructure: from rare earth elements in GPUs, to T-Glass cloth from Japanese looms, to cooling water from stressed aquifers, to electrons from grids designed for a different era. Each layer represents a potential point of failure—and innovation.


Prologue: The Day the Market Learned Physics

On the morning of January 27, 2025, something unprecedented happened on Wall Street. Nvidia, the company that had become synonymous with artificial intelligence itself, lost $589 billion in market value in a single trading session—the largest one-day destruction of wealth in stock market history. The cause was not a fraud scandal, a product failure, or a regulatory crackdown. It was a PDF.

Two days earlier, a relatively obscure Chinese AI lab called DeepSeek had released a technical report alongside an open-source model. The model, called R1, could reason through complex problems at the level of OpenAI’s best offerings. The report claimed the final training run had cost approximately $5.6 million. American labs had spent hundreds of millions, sometimes billions, to reach similar benchmarks.

The market’s reaction was visceral and immediate. If intelligence could be produced this cheaply, what was the value of the $80 billion AI infrastructure budgets built around Nvidia’s chips? Had the entire Western AI strategy—premised on the idea that more compute equals more capability—been built on a fundamental miscalculation?

Within weeks, a more nuanced picture emerged. The $5.6 million figure covered only the final training run; analysts estimated DeepSeek’s total hardware investment at closer to $1.6 billion. Nvidia’s stock recovered. The trillion-dollar AI buildout continued.

But something had shifted in the collective understanding of the industry. The DeepSeek shock had revealed a truth that market valuations had obscured: artificial intelligence is not a purely digital phenomenon. It is a physical system, bound by energy, water, materials, and thermodynamics. And those physical systems were approaching their limits.

This is the story of that collision—between exponential ambition and finite resources, between software’s infinite reproducibility and hardware’s stubborn materiality. It is a story that spans Japanese glass looms and orbital satellites, nuclear reactors and Formula 1 cooling systems, Swiss standards bodies and Virginian power grids. It is, ultimately, the story of what happens when software tries to eat the world, and the world begins to eat back.


Part One: The Efficiency Heresy

Chapter 1: The Myth of the Six Million Dollar Model

The number that launched a thousand takes—$5.6 million—was simultaneously a misleading accounting trick and a profound strategic truth. Understanding why requires unpacking what it actually measured, and what it conspicuously excluded.

The figure referred strictly to the final training run of DeepSeek-V3, the base model upon which R1 was built. That run consumed approximately 2.788 million GPU hours on a cluster of Nvidia H800 accelerators. At an assumed internal cost of roughly $2 per GPU hour, the arithmetic yields the provocative headline number.

What the figure excluded was the massive “iceberg” of sunk costs required to reach that starting line. It did not account for the months of failed experiments that informed the successful architecture. It excluded the ablation studies used to tune hyperparameters—systematic experiments where researchers disable individual components to understand their contribution. It omitted the salaries of world-class researchers, many of whom had been recruited from leading Western labs. And it ignored the accumulation of hardware itself: a cluster estimated at 50,000 GPUs, acquired before the tightening of U.S. export controls made such purchases impossible.

Scale AI’s CEO Alexandr Wang was among the first to push back publicly, claiming DeepSeek secretly possessed far more compute than disclosed. Dylan Patel of SemiAnalysis published a detailed breakdown estimating that DeepSeek had access to approximately 10,000 H800s and 10,000 H100s, with total server capital expenditure around $944 million. White House AI Czar David Sacks called the $6 million figure “misleading.”

Yet to dismiss DeepSeek’s achievement as mere creative accounting is to miss what made it genuinely disruptive. The strategic shock was not that they spent little money in absolute terms. It was that they achieved frontier performance despite being constrained by inferior hardware.

The H800, the chip DeepSeek was forced to use, is a “crippled” version of Nvidia’s flagship H100. It was specifically designed to comply with U.S. export controls—identical compute cores, but with dramatically reduced interconnect bandwidth. The chip could think just as fast, but it could not communicate with its neighbors at full speed. For training models that require constant synchronization across thousands of GPUs, this was supposed to be a crippling limitation.

It was not. DeepSeek’s engineers, denied the luxury of brute-force scaling across massive, fast-connected clusters, were compelled to optimize the model architecture itself to minimize communication overhead. Constraint became catalyst.


Chapter 2: The Technical Architecture of Necessity

The innovations that emerged from DeepSeek’s constraints have since been adopted across the industry, but understanding their origins reveals something important about how breakthrough engineering actually happens.

Multi-Head Latent Attention addressed one of the primary bottlenecks in training large language models: the Key-Value (KV) cache. As a model’s context window grows—the amount of text it can “remember” during a conversation—the memory required to store this cache expands linearly. With context windows reaching 128,000 tokens, the KV cache often consumes more memory than the model’s weights themselves, forcing reductions in batch size that kill training throughput.

DeepSeek’s solution was to project the Key and Value vectors into a lower-dimensional latent space, dramatically compressing the cache without degrading performance. This allowed them to train with larger batch sizes on the H800s, effectively simulating the throughput of a higher-bandwidth system.
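The mechanism fits in a short PyTorch sketch. This is a minimal illustration of the latent-projection idea only: the dimensions are arbitrary, causal masking is omitted, and the decoupled positional-encoding path described in DeepSeek’s technical reports is left out.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LatentKVAttention(nn.Module):
    """Sketch of Multi-Head Latent Attention's KV compression: cache one
    small latent vector per token instead of full per-head keys/values."""

    def __init__(self, d_model=4096, n_heads=32, d_latent=512):
        super().__init__()
        self.n_heads, self.d_head = n_heads, d_model // n_heads
        self.q_proj = nn.Linear(d_model, d_model, bias=False)
        self.kv_down = nn.Linear(d_model, d_latent, bias=False)  # cached
        self.k_up = nn.Linear(d_latent, d_model, bias=False)     # recomputed
        self.v_up = nn.Linear(d_latent, d_model, bias=False)     # recomputed

    def forward(self, x, latent_cache=None):
        B, T, _ = x.shape
        latent = self.kv_down(x)                       # (B, T, d_latent)
        if latent_cache is not None:                   # extend past context
            latent = torch.cat([latent_cache, latent], dim=1)
        S = latent.shape[1]
        q = self.q_proj(x).view(B, T, self.n_heads, self.d_head).transpose(1, 2)
        k = self.k_up(latent).view(B, S, self.n_heads, self.d_head).transpose(1, 2)
        v = self.v_up(latent).view(B, S, self.n_heads, self.d_head).transpose(1, 2)
        out = F.scaled_dot_product_attention(q, k, v)
        # Cache `latent`, not k and v: 512 vs. 2 x 4096 floats per token,
        # a 16x cache reduction at these illustrative sizes.
        return out.transpose(1, 2).reshape(B, T, -1), latent
```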

Mixture-of-Experts (MoE) architecture was not new in 2025—Mistral and Google had used variants—but DeepSeek’s implementation featured a breakthrough in load balancing. Their V3 model contains 671 billion parameters, but only 37 billion activate for any given token. A “router” network decides which specialist sub-networks handle which inputs.

The challenge with MoE has always been uneven distribution: if the “coding” expert gets overloaded during programming tasks, it becomes a bottleneck while other experts sit idle. Previous solutions added penalty functions to discourage imbalance, but these degraded model quality. DeepSeek developed an auxiliary-loss-free load balancing strategy that dynamically adjusted router bias, achieving near-perfect utilization without quality trade-offs. They trained a 671B parameter model with the compute budget typically reserved for a 70B model.
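A toy version of the selection-bias trick, in Python. The sigmoid scoring follows DeepSeek’s published description; the shapes and the exact update rule here are illustrative, not the production implementation.

```python
import torch

def route(hidden, router_weight, expert_bias, top_k=2):
    """Pick experts with a bias that steers *selection* only; mixing
    weights come from the unbiased scores, so balancing costs no quality."""
    scores = torch.sigmoid(hidden @ router_weight)      # (tokens, experts)
    chosen = torch.topk(scores + expert_bias, top_k, dim=-1).indices
    gates = torch.gather(scores, -1, chosen)            # unbiased weights
    gates = gates / gates.sum(dim=-1, keepdim=True)
    return chosen, gates

def rebalance(expert_bias, chosen, n_experts, gamma=1e-3):
    """After each batch, nudge busy experts down and idle experts up."""
    load = torch.bincount(chosen.flatten(), minlength=n_experts).float()
    busy = load > load.mean()
    expert_bias = expert_bias.clone()
    expert_bias[busy] -= gamma
    expert_bias[~busy] += gamma
    return expert_bias
```

Because the bias affects only which experts are chosen, never how their outputs are weighted, the balancing mechanism adds no penalty term to the training loss.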

FP8 quantization was perhaps the most audacious risk. Most frontier models were trained in BF16 (16-bit) or FP32 (32-bit) precision. Moving to 8-bit precision theoretically doubles compute throughput and halves memory bandwidth requirements—critical for bandwidth-starved hardware. But training in FP8 is notoriously unstable; the limited dynamic range often causes “loss spikes” where training diverges catastrophically.

DeepSeek solved this through fine-grained, block-wise quantization that adjusted the numerical range for every 128×128 block of the weight matrix. This allowed them to harness the H800’s tensor core optimizations for low-precision math, turning a hardware weakness into a throughput multiplier.
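A minimal sketch of the block-wise idea, assuming a recent PyTorch build that exposes the float8_e4m3fn dtype and a weight matrix whose dimensions divide evenly by the block size. The point is that one outlier value only costs precision in its own tile, rather than squashing the dynamic range of the whole tensor.

```python
import torch

FP8_MAX = 448.0  # largest normal value representable in float8 e4m3

def quantize_blockwise(w, block=128):
    """Give every (block x block) tile of the matrix its own scale."""
    q = torch.empty_like(w, dtype=torch.float8_e4m3fn)
    scales = torch.empty(w.shape[0] // block, w.shape[1] // block)
    for i in range(0, w.shape[0], block):
        for j in range(0, w.shape[1], block):
            tile = w[i:i + block, j:j + block]
            s = tile.abs().max().clamp(min=1e-12) / FP8_MAX  # per-tile scale
            scales[i // block, j // block] = s
            q[i:i + block, j:j + block] = (tile / s).to(torch.float8_e4m3fn)
    return q, scales

def dequantize_blockwise(q, scales, block=128):
    """Invert the mapping: rescale each tile back to float32."""
    w = q.to(torch.float32)
    for i in range(scales.shape[0]):
        for j in range(scales.shape[1]):
            w[i * block:(i + 1) * block, j * block:(j + 1) * block] *= scales[i, j]
    return w
```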


![Technical Comparison Table]

Figure 2: Two Paths to Frontier AI

| Dimension | Western “Brute Force” | DeepSeek “Efficiency” |
| --- | --- | --- |
| Primary Constraint | Capital | Bandwidth |
| Architecture | Dense Transformers | Mixture-of-Experts (Sparse) |
| Training Precision | BF16 / FP32 | FP8 (Mixed Precision) |
| Attention Mechanism | Standard Multi-Head | Multi-Head Latent Attention |
| Reinforcement Learning | PPO (Requires Critic Model) | GRPO (No Critic) |
| Strategic Focus | Maximize capability at any cost | Maximize capability per FLOP |

Chapter 3: The Reasoning Revolution

While V3 was the foundation, the true disruption came with R1, the “reasoning” model that matched OpenAI’s o1 series in mathematical and coding benchmarks.

The breakthrough was methodological rather than architectural. Prior to R1, the standard approach to Reinforcement Learning from Human Feedback (RLHF) relied on Proximal Policy Optimization (PPO). This algorithm requires a “Critic” model—a separate neural network that evaluates the quality of responses from the “Actor” model. The Critic is typically as large as the Actor itself, effectively doubling the memory and compute requirements for the alignment phase.

DeepSeek introduced Group Relative Policy Optimization (GRPO), which eliminated the Critic entirely. Instead of relying on a separate neural network to score answers, GRPO generates multiple outputs for a single prompt and scores them relative to the group’s average. For objective tasks—mathematics, coding, logical puzzles—the reward signal can be derived from simple rule-based verifiers. Did the code compile? Is the mathematical answer correct? No expensive neural reward model required.
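The baseline computation at the heart of GRPO is only a few lines. A minimal sketch follows, with a hypothetical rule-based verifier standing in for the reward model; the policy-gradient update that consumes these advantages is omitted, and the “####” answer delimiter is an illustrative convention, not DeepSeek’s actual format.

```python
import torch

def grpo_advantages(rewards):
    """rewards: (n_prompts, G) scores for G completions per prompt.
    The group mean stands in for PPO's learned critic as the baseline."""
    mean = rewards.mean(dim=1, keepdim=True)
    std = rewards.std(dim=1, keepdim=True).clamp(min=1e-6)
    return (rewards - mean) / std

def rule_based_reward(completion, expected_answer):
    """Toy verifier: the entire 'reward model' for an objective task."""
    answer = completion.split("####")[-1].strip()
    return 1.0 if answer == expected_answer else 0.0

# Four completions for one math prompt; only the last is correct.
rewards = torch.tensor([[0.0, 0.0, 0.0, 1.0]])
print(grpo_advantages(rewards))  # the correct sample gets a large positive advantage
```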

This had two profound implications. First, it halved the compute resources required for alignment, making “reasoning” training accessible to labs with modest clusters. Second, and more provocatively, it demonstrated that complex reasoning behaviors could emerge from pure reinforcement learning on a base model without the need for thousands of hours of expensive human-annotated “Chain of Thought” data.

DeepSeek proved this with R1-Zero, a variant trained entirely through GRPO without any supervised fine-tuning. The model spontaneously developed extended reasoning chains, verification behaviors, and self-correction patterns—emergent capabilities that Western labs had assumed required extensive human curation.


Chapter 4: The Distillation Economy

The most lasting impact of the DeepSeek shock has been the widespread adoption of model distillation. DeepSeek demonstrated that the reasoning patterns discovered by their massive R1 model could be “distilled” into much smaller, open-source models simply by fine-tuning them on R1’s outputs.
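Mechanically, this kind of distillation is ordinary supervised fine-tuning on teacher transcripts. Here is a minimal sketch, assuming a Hugging Face-style causal language model and tokenizer; the names and interface are illustrative, not a specific library recipe.

```python
import torch.nn.functional as F

def distill_step(student, tokenizer, prompt, teacher_trace, optimizer):
    """One fine-tuning step on a teacher-generated reasoning trace:
    plain next-token cross-entropy imitation, no RL required."""
    ids = tokenizer(prompt + teacher_trace, return_tensors="pt").input_ids
    logits = student(input_ids=ids[:, :-1]).logits    # predict each next token
    loss = F.cross_entropy(
        logits.reshape(-1, logits.size(-1)),          # (tokens, vocab)
        ids[:, 1:].reshape(-1),                       # shifted targets
    )
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()
```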

The implications for AI economics are severe. If a developer can achieve GPT-4-level reasoning on a local 32-billion parameter model—one that runs on a single consumer GPU—the demand for expensive cloud-based inference plummets. The market has bifurcated into a “Teacher/Student” economy: value concentrates in the massive “Teacher” models and the proprietary data used to train them, while “Student” models are rapidly becoming commoditized utilities.

This dynamic explains why the major American labs have since adopted many of DeepSeek’s techniques while simultaneously attempting to maintain differentiation through scale, data, and integration. OpenAI’s o3 and Anthropic’s Claude 4 series both employ variants of mixture-of-experts architectures. Google’s Gemini models use aggressive quantization during inference. The efficiency frontier DeepSeek discovered turned out to be real, and ignoring it was no longer an option.


Chapter 5: The Geopolitics of Constraint

The U.S. export control regime was designed to freeze Chinese AI capabilities at a pre-2023 level. By restricting access to Nvidia’s most advanced chips, policymakers hoped to maintain American dominance in frontier AI development. DeepSeek’s success suggested that algorithmic efficiency could serve as an effective asymmetric counter-strategy to hardware embargoes.

But a more nuanced view has emerged in the year since the shock. The controls did work—just not as intended. They forced Chinese labs to become hyper-efficient, essentially “Darwinianizing” the ecosystem. While American labs grew comfortable with abundant compute, Chinese labs were training at altitude.

A senior researcher at one major American lab put it bluntly in an anonymized interview: “We had no incentive to find the efficiency frontier when investors were providing billions to find the capability frontier instead. DeepSeek, backed into a corner by geopolitics, discovered that the path we weren’t taking actually led somewhere.”

The result is a Chinese AI sector that is leaner, more rigorous in its engineering, and potentially better prepared for the thermodynamic constraints that now face the entire industry. Whether this represents a failure of American policy or an unexpected form of success depends on time horizons that extend well beyond quarterly earnings.


Part Two: The Energy Wall

Chapter 6: The Physics of the Dunkelflaute

The word is German: Dunkelflaute, meaning “dark doldrums.” It describes extended periods—typically 5 to 10 days in winter—when high-pressure weather systems create stagnant, cloudy air masses over northern Europe. Wind turbines stop turning. Solar panels produce negligible power. The phenomenon that once concerned only meteorologists and grid operators has become central to understanding why AI cannot scale on renewables alone.

Data centers are unique energy consumers. Unlike factories that can reduce shifts or homes that use less power at night, a training cluster demands continuous, “flat” power 24 hours a day, 365 days a year. A model training run might last months; any interruption requires restarting from the last checkpoint, potentially wasting days of compute. The industry calls this requirement “baseload,” and meeting it with weather-dependent renewable energy requires storage.

The mathematics of battery backup at AI scale are unforgiving. Consider a hypothetical 500-megawatt hyperscale campus—a standard size for 2026 AI hubs:

  • Daily consumption: 500 MW × 24 hours = 12,000 MWh
  • Ten-day Dunkelflaute survival: 120,000 MWh of storage required
  • Current reality: The world’s largest battery storage project, Moss Landing in California, holds approximately 3,000 MWh

A single AI campus would require a battery forty times larger than the biggest one currently in existence. The capital cost would be astronomical—and the batteries would sit idle most of the year, waiting for weather events that might not come.
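The arithmetic is simple enough to verify directly, using only the figures cited above:

```python
# Back-of-the-envelope check on the Dunkelflaute storage numbers.
campus_mw = 500                      # hypothetical hyperscale campus
blackout_hours = 10 * 24             # ten-day Dunkelflaute
storage_needed_mwh = campus_mw * blackout_hours
moss_landing_mwh = 3_000             # largest existing battery project

print(storage_needed_mwh)                     # 120,000 MWh
print(storage_needed_mwh / moss_landing_mwh)  # 40.0 -- forty Moss Landings
```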

This is not an engineering problem awaiting a solution. It is a thermodynamic reality that has forced a fundamental reorientation of how the industry thinks about power.


Chapter 7: The Numbers That Keep CFOs Awake

U.S. data centers consumed 183 terawatt-hours of electricity in 2024, according to International Energy Agency estimates. This represents more than 4% of the country’s total electricity consumption—roughly equivalent to the annual demand of Pakistan. By 2030, projections suggest this will grow by 133% to 426 TWh.

Globally, data centers consumed around 415 TWh in 2024, approximately 1.5% of total demand. The IEA projects this will more than double to 945 TWh by 2030, slightly exceeding Japan’s total annual electricity consumption. AI is the primary driver.

The geographic concentration makes these numbers more alarming. In 2023, Virginia’s data centers consumed 26% of the state’s total electricity supply. In Loudoun County—“Data Center Alley”—they accounted for 21% of power consumption, surpassing all residential use combined. A minor disturbance in neighboring Fairfax County in 2024 caused 60 data centers to switch simultaneously to backup generation. The sudden loss of 1,500 megawatts—equivalent to Boston’s entire demand—nearly triggered cascading grid failures.

The economic ripple effects are reaching consumers. In the PJM electricity market, which stretches from Illinois to North Carolina, data-center demand has driven capacity-price increases estimated at $18 per month for households in western Maryland and $16 per month in Ohio. A Carnegie Mellon study estimates that data centers and cryptocurrency mining could increase the average U.S. electricity bill by 8% by 2030—potentially exceeding 25% in Northern Virginia.

Water consumption adds another dimension to the resource strain. Cooling systems require millions of gallons annually, creating competition with municipal and agricultural needs in regions already facing water stress. A Forbes investigation found some large data centers using 300,000 gallons daily. The Alliance for the Great Lakes projects cumulative U.S. hyperscale water withdrawals of 150.4 billion gallons between 2025 and 2030.


![Energy Consumption Comparison]

Figure 3: AI’s Appetite for Power

| Entity | Annual Electricity Consumption |
| --- | ---: |
| U.S. Data Centers (2024) | 183 TWh |
| Pakistan (entire country) | ~180 TWh |
| Global Data Centers (2024) | 415 TWh |
| Japan (entire country) | ~940 TWh |
| Projected Global Data Centers (2030) | 945 TWh |

Source: IEA, Gartner, Deloitte


Chapter 8: The Nuclear Pivot

In late 2024, Microsoft announced a 20-year power purchase agreement to restart Three Mile Island Unit 1—not the unit involved in the 1979 accident, but its undamaged twin. The 835-megawatt reactor, mothballed in 2019 for economic reasons, would be dedicated entirely to powering Microsoft’s AI operations. The target date: 2028.

The deal was a watershed moment. A tech company was not merely purchasing clean energy certificates or funding a solar farm. It was bringing a nuclear reactor back from the dead to serve as its captive power plant.

Within months, the rest of the industry followed. Google signed the first corporate agreement to develop a fleet of Small Modular Reactors (SMRs) in the United States with Kairos Power, covering up to 500 megawatts across six or seven units, with the first targeted for 2030. Amazon invested over $20 billion converting the Susquehanna site into a nuclear-powered AI data center campus and backed 5 gigawatts of new X-energy SMR projects. Meta issued requests for proposals targeting 1 to 4 gigawatts of new nuclear generation. Oracle announced plans for a gigawatt-scale data center powered by three SMRs.

The collective commitment exceeds 10 gigawatts—roughly the output of ten large nuclear plants.


Chapter 9: The SMR Value Proposition

Small Modular Reactors represent a philosophical departure from traditional nuclear power. Rather than building enormous, bespoke facilities that take decades to plan and construct, the SMR vision involves factory-fabricating standardized units that can be transported to site and assembled relatively quickly.

The core innovation lies in passive safety systems. Traditional reactors rely on pumps, valves, and human operators to prevent meltdowns. SMRs use natural physical processes—gravity, convection, thermal expansion—to shut down safely without external power or intervention. NuScale’s design can cool itself for seven days with no electricity and no human action.

For AI data centers, SMRs offer compelling advantages:

Reliability: Nuclear provides constant output regardless of weather, time of day, or season. There is no Dunkelflaute in fission.

Density: A single SMR can power a hyperscale campus from a footprint smaller than its parking lot, avoiding the vast land requirements of equivalent solar installations.

Stability: Unlike natural gas, nuclear is insulated from fuel price volatility and geopolitical supply disruptions.

Carbon: Nuclear is effectively zero-emission during operation, allowing AI companies to maintain climate commitments while scaling aggressively.

The Bulletin of the Atomic Scientists quoted one industry analyst: “SMRs promise compact footprints, high-energy density, and predictable operating costs—everything AI infrastructure needs and renewables struggle to provide at scale.”


Chapter 10: The Temporal Gap

The fundamental challenge is timing. AI compute demand is doubling every six to eighteen months. Nuclear reactors take years to license and build.

The Three Mile Island restart is optimistically targeted for 2028. The first Kairos SMRs are expected around 2030-2035. NuScale’s first commercial deployment has faced repeated delays. There is a five-year gap where AI demand explodes but new nuclear capacity is not yet online.

This gap will likely be filled by the postponement of fossil fuel plant retirements. The Brookings Institution notes that 114 gigawatts of new gas-fired capacity is currently in the U.S. development pipeline. The perverse near-term result: “green” AI companies are indirectly extending the life of coal and gas plants to bridge to their nuclear future.

Regulatory hurdles compound the timing problem. The Nuclear Regulatory Commission is working to modernize its approach, but challenges include limited experience with next-generation designs, high application fees, and unresolved questions about how factory-built modules will be certified. Complex permitting workflows and overloaded interconnection queues add years of delay.

The question facing the industry is whether it can survive the gap—whether efficiency gains, renewables, and legacy grid capacity can buy enough time for the nuclear cavalry to arrive.


![Nuclear AI Deals Timeline]

Figure 4: The $10 Billion Nuclear Rush

| Company | Partner | Capacity | Target Date |
| --- | --- | --- | --- |
| Microsoft | Constellation (TMI-1) | 835 MW | 2028 |
| Google | Kairos Power | 500 MW (6-7 SMRs) | 2030+ |
| Amazon | X-energy | 5 GW | 2030s |
| Amazon | Susquehanna expansion | Multi-GW | 2028+ |
| Meta | Multiple (RFP) | 1-4 GW | TBD |
| Oracle | SMR consortium | 1+ GW | TBD |

Total committed: >10 GW of carbon-free capacity


Part Three: The Invisible Infrastructure

Chapter 11: The Cooling Crisis

The existential threat to AI scaling in 2026 is not silicon shortage but water where it does not belong.

A modern Nvidia Blackwell rack consumes 120 kilowatts—six times the density that air cooling can handle. The heat generated by next-generation GPUs cannot be removed by fans alone; it requires liquid flowing directly over the processors. The industry has shifted en masse to Direct-to-Chip (DTC) liquid cooling, where cold plates channel coolant within millimeters of the silicon.

The transition has been faster than the operational maturity to support it. An epidemic of failures has emerged in hastily retrofitted facilities.

Galvanic corrosion is the silent destroyer. In a rush to deploy liquid cooling, many data centers have mixed incompatible metals—copper cold plates connected to aluminum radiators, for instance. When coolant (an electrolyte) flows between them, the system becomes a battery. The aluminum sacrifices itself, dissolving into the fluid and precipitating as sludge that clogs microscopic fins. Eventually, the aluminum wall thins until it bursts, spraying glycol-water mixture onto racks worth millions of dollars.

A November 2025 incident at a CME Group data center in Aurora, Illinois, illustrated the stakes. A chiller malfunction caused cooling to fail across multiple units, halting trading operations. The exact financial impact remains undisclosed, but estimates from similar incidents suggest GPU downtime can cost up to $40,000 per chip per day.

Industry veterans sometimes dismiss these failures as standard “teething pains,” citing decades of successful liquid cooling in niche supercomputers. But this argument ignores a critical distinction. High-performance computing centers are bespoke laboratories run by specialized engineers with PhD-level expertise. Modern hyperscale data centers are industrial warehouses struggling to find technicians who understand fluid dynamics.


Chapter 12: The Labor Crisis

When Microsoft’s Quincy data center experienced a cooling failure lasting 37 minutes, GPU temperatures spiked to 94 degrees Celsius. The result: $3.2 million in hardware damage and 72 hours of downtime. Post-incident analysis revealed that the failure mode—a chemical incompatibility between coolant and gasket materials—would have been obvious to anyone with process engineering training. But the staff on-site were IT professionals, not chemical engineers.

This is the labor crisis beneath the cooling crisis. Managing a liquid-cooled data center requires the skills of a chemical engineer and a master plumber. The current data center workforce is trained in swapping hard drives and managing airflow. The gap cannot be closed through short-term hiring; the pipeline of qualified professionals simply does not exist at the scale required.

The industry’s response has been to engineer the human element out of the loop. Vendors are racing to build “idiot-proof” systems—hermetically sealed, modular cooling cartridges that require no specialized knowledge to service. The goal is to turn the cooling loop into a consumable, like a printer cartridge, that generalist technicians can swap when sensors indicate degradation.

This represents a deliberate trade-off: peak thermal efficiency for absolute reliability. A custom-engineered cooling system optimized for a specific chip configuration will always outperform a standardized module. But a standardized module that a $25-per-hour technician can replace in twenty minutes without understanding fluid chemistry will experience far less catastrophic downtime.

The industry term for this shift is “cartridge-ification.” It is the recognition that at sufficient scale, complexity becomes the enemy of reliability.


![Thermal Density Progression]

Figure 5: The Heat Wave

| GPU Generation | Power per Chip | Rack Power | Cooling Requirement |
| --- | ---: | ---: | --- |
| Nvidia A100 (2020) | 400W | ~35 kW | Air (marginal) |
| Nvidia H100 (2023) | 700W | ~75 kW | Liquid recommended |
| Nvidia Blackwell (2024) | 1,000W+ | ~120 kW | Liquid required |
| Nvidia Blackwell Ultra (2025) | 1,400W | ~132 kW | Advanced liquid |
| Expected “Feynman” (2028) | ~4,400W | ~240 kW | ??? |

Heat dissipation per square centimeter now approaches 50W—comparable to a nuclear reactor core.


Part Four: The Materials Wall

Chapter 13: The Reticle Limit

To understand why the AI industry is in crisis over glass, one must first understand the “reticle limit.”

The lithography machines that print chips—such as ASML’s extreme ultraviolet (EUV) systems—have a maximum exposure field of roughly 858 square millimeters. A single silicon die cannot physically exceed this size. Yet the demand for AI compute requires chips with trillions of transistors, far more than can fit on a single reticle-sized die.

The industry’s solution has been “Advanced Packaging” or “chiplets.” Instead of making one giant chip, manufacturers stitch together multiple smaller dies—GPU logic, HBM memory stacks, I/O controllers—onto a base layer called an interposer or substrate. This package acts as the motherboard for the silicon, providing the electrical connections between components.

For decades, these substrates have been made from organic resin reinforced with fiberglass cloth—essentially sophisticated circuit boards. This worked adequately when packages were small and interconnect requirements modest. It is failing catastrophically as AI chips push to the limits of physics.


Chapter 14: The Warpage Problem

When a massive package—100mm by 100mm, roughly the size of a deck of cards—is heated to 250°C during manufacturing, the organic substrate expands at a different rate than the silicon chips sitting on it. This mismatch in the Coefficient of Thermal Expansion (CTE) causes the package to warp like a potato chip.
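The magnitude of the problem follows from a one-line calculation. The sketch below uses representative handbook values for the expansion coefficients and bump size; they are illustrative assumptions, not measurements from any particular product.

```python
# Rough magnitude of the CTE mismatch across a large package.
cte_silicon = 2.6e-6      # per kelvin (typical literature value)
cte_organic = 14e-6       # per kelvin, typical epoxy-glass laminate
delta_t = 250 - 25        # kelvin, reflow heating per the text
half_span_mm = 100 / 2    # worst case: the edge of a 100 mm package

slip_um = (cte_organic - cte_silicon) * delta_t * half_span_mm * 1000
print(f"~{slip_um:.0f} µm of relative movement at the package edge")
# ~128 µm -- several times the diameter of a typical ~30 µm microbump.
```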

For traditional chips, this warpage was manageable. For AI packages with thousands of microscopic solder bumps connecting chiplets, warpage severs connections and destroys yields. An Intel white paper described the challenge starkly: “By the end of this decade, the semiconductor industry will likely reach its limits on scaling transistors on a silicon package using organic materials.”

The density problem compounds the thermal one. Organic substrates are relatively rough on a microscopic scale. The smallest electrical traces that can be reliably printed on them are about 2 microns wide with 2-micron spacing. To connect more chiplets at higher speeds, engineers need wires below 1 micron. The surface roughness of organic materials physically cannot hold these tolerances; the traces break.


Chapter 15: The T-Glass Bottleneck

To fight warpage, substrate manufacturers have relied on a specialized reinforcement material called T-Glass (Low-CTE Glass Cloth). Unlike standard fiberglass, T-Glass is chemically formulated to have a thermal expansion coefficient very close to silicon, minimizing the mismatch that causes warping.

The problem is that T-Glass production is an incredibly niche, difficult, and low-margin business. The manufacturing process involves spinning molten glass into yarn finer than a human hair and weaving it into defect-free cloth. It requires specialized furnaces that take years to build and qualify.

Global T-Glass production is dominated by a single Japanese company: Nitto Boseki (Nittobo). When Nvidia and AMD ramped production of chiplet-based AI accelerators in 2025, demand for T-Glass exploded beyond anything Nittobo had anticipated. The company is reportedly sold out through 2027.

This has created a hard ceiling on advanced organic package production. The entire trillion-dollar AI hardware market is currently throttled by the output of a few glass looms in Japan.


Chapter 16: The Glass Transition

The T-Glass shortage has accelerated the timeline for what was always the inevitable solution: abandoning organic cloth entirely in favor of solid glass substrates.

A glass core substrate uses a sheet of borosilicate or quartz glass—similar to display panel glass—as the package’s structural foundation. The advantages are profound:

Perfect flatness: Glass is atomically smooth, enabling lithography-grade precision. Interconnects with sub-1-micron line/space become possible, providing a 10× increase in routing density compared to organic substrates.

Tunable CTE: The chemical composition of glass can be adjusted to perfectly match silicon’s thermal expansion, effectively eliminating warpage even for enormous packages.

Through-Glass Vias (TGVs): Lasers can drill millions of microscopic holes through glass with extreme precision, allowing interconnect pitches below 100 micrometers.

Optical integration: Glass is transparent, enabling the embedding of optical waveguides directly into the package. This opens the door to “Co-Packaged Optics”—replacing copper interconnects with light pulses for dramatically reduced power and latency.


Chapter 17: The Glass Race

The transition to glass substrates has sparked a fierce industrial competition.

Intel is the unlikely leader. The company began investing in glass substrate R&D over a decade ago at its Chandler, Arizona facility, spending more than $1 billion. Their “Clearwater Forest” Xeon processors, shipping in volume in 2026, are the first commercial products to feature glass cores. This gives Intel a potential two-to-three-year lead over competitors—a rare advantage for a company that has struggled in recent years.

Absolics, a subsidiary of South Korean conglomerate SK Group, is the first “pure-play” glass substrate manufacturer. Their factory in Covington, Georgia, supported by U.S. CHIPS Act funding, began shipping commercial-grade substrates in late 2025. They are reportedly supplying samples to AMD for future Instinct accelerators.

Samsung is leveraging its display division’s expertise in handling large glass panels to accelerate its “Dream Substrate” roadmap, targeting 2027 for mass production.

Rapidus, Japan’s national semiconductor champion, has developed a prototype glass interposer cut from a 600mm × 600mm substrate—the world’s first at this format. This enables production of interposers 1.3 to 2 times larger than rivals, with mass production planned for 2028.

The race is not merely commercial but strategic. Glass substrate capability is becoming a prerequisite for manufacturing AI chips at frontier scale. Nations and companies without domestic glass substrate production will be dependent on those who have it.


![Substrate Evolution]

Figure 6: From Plastic to Glass

| Feature | Organic (Standard) | Organic + T-Glass | Glass Core |
| --- | --- | --- | --- |
| Core Material | Epoxy + standard fiberglass | Epoxy + low-CTE T-Glass | Solid borosilicate/quartz |
| Min. Feature Size | ~10 µm L/S | ~2-5 µm L/S | < 1 µm L/S |
| Thermal Expansion | High mismatch (warping) | Tuned match (supply constrained) | Perfect match (tunable) |
| Interconnect Density | Low | Medium | Ultra-high (10×) |
| Optical Integration | Difficult/external | Difficult/external | Native (embedded waveguides) |
| Supply Status (2026) | Available | Critical shortage | Early ramp |

Part Five: The Architecture of Sovereignty

Chapter 18: The End of the Toy Era

For decades, the computing world has been a duopoly: x86 processors from Intel and AMD ruled servers and PCs, while ARM processors from Apple, Qualcomm, and Nvidia dominated mobile and embedded devices. Both architectures are proprietary, requiring licenses and royalties to use.

RISC-V is different. It is an open-standard Instruction Set Architecture (ISA)—the fundamental language that software uses to communicate with hardware. Anyone can implement RISC-V without paying royalties or seeking permission. The specification is maintained by RISC-V International, a non-profit based in Switzerland.

Until recently, RISC-V was dismissed by serious datacenter architects as a “toy” architecture—adequate for microcontrollers and hard drive controllers, but lacking the software ecosystem for high-performance computing. That changed with the ratification of the RVA23 Profile in late 2024.

RVA23 establishes a standardized “northbound” interface for datacenter-class chips. It mandates a strict set of extensions—including Vector (RVV 1.0) for AI math, Hypervisor for virtualization, and Crypto for security—that any RVA23-compliant processor must support. For the first time, operating system vendors like Red Hat and Ubuntu can build a single image that works across different RISC-V server chips, creating the standardized target the software ecosystem requires.


Chapter 19: The Geopolitical Driver

The primary engine driving RISC-V adoption is not technical superiority but geopolitical necessity.

U.S. export controls have blocked China’s access to advanced GPUs and threatened restrictions on EDA (electronic design automation) software—the tools used to design chips. Beijing has concluded that reliance on x86 or ARM, both of which are entangled with American or British intellectual property law, represents an existential risk to Chinese technological sovereignty.

RISC-V’s Swiss governance and open nature make it legally difficult for any single government to restrict access to the standard itself. China has effectively adopted RISC-V as its national architecture for the post-American computing era.

The investment has been substantial. Chinese entities like Alibaba’s T-Head division are building high-performance RISC-V cores under the XuanTie brand, targeting server, PC, and automotive applications. The XuanTie C930 aims to compete directly with ARM’s Neoverse series. State funding flows into the ecosystem at a scale that dwarfs private investment in the West.

The result is an emerging “bifurcated stack” in global computing. In the West: CUDA, x86, ARM, proprietary everything. In the East—and increasingly the Global South: RISC-V, open-source, and architectural independence.


Chapter 20: Nvidia’s Unexpected Endorsement

In July 2025, at the RISC-V Summit in China, Nvidia announced CUDA support for RISC-V processors.

The announcement was unexpected. CUDA—Nvidia’s proprietary software platform that has locked the AI industry into its GPUs—had previously been available only on x86 and ARM host processors. Adding RISC-V suggested that Nvidia anticipated a future where server vendors might choose open architectures.

Frans Sijstermans, Nvidia’s Vice President of Hardware Engineering, framed it strategically: “Accelerated computing is our business, CUDA is our core product, and we want to support it on any CPU. If a server vendor chooses RISC-V, we want to support that too.”

Industry analysts interpreted the move as recognition that RISC-V had crossed a maturity threshold. Nvidia would not invest engineering resources in a port without confidence that the target platform would achieve meaningful market share. The RVA23 ratification provided that confidence.


Chapter 21: Tenstorrent and the Thesis of Sovereign AI

In the West, the most aggressive RISC-V champion is Tenstorrent, led by legendary chip architect Jim Keller—designer of AMD’s Zen architecture, Apple’s A-series chips, and Intel’s Tiger Lake.

Tenstorrent’s business model attacks Nvidia’s high margins by offering a “build it yourself” alternative. The company licenses its high-performance RISC-V CPU core (Ascalon) and its AI accelerator IP (Tensix) to customers who want to design custom silicon without paying Nvidia’s prices or being subject to its supply allocation.

The target audience includes nations and corporations pursuing “Sovereign AI”—the ability to control their AI infrastructure completely, without dependence on foreign black-box providers. Tenstorrent has inked deals with entities in Japan and the UAE to build national-scale AI infrastructure using RISC-V.

The pitch is compelling for nations that have watched U.S. export controls weaponize the supply chain. An auditable, vendor-neutral hardware root of trust may be worth the friction of software porting.


![Global Compute Stack Bifurcation]

Figure 7: Two Stacks, Two Worlds

| Layer | Western Stack | Sovereign Stack |
| --- | --- | --- |
| AI Framework | PyTorch/TensorFlow (CUDA-optimized) | PyTorch (RISC-V/custom accelerator) |
| Accelerator | Nvidia GPU (proprietary) | Tenstorrent Tensix / custom ASIC |
| Host CPU | x86 (Intel/AMD) or ARM | RISC-V (open) |
| ISA IP | Licensed (royalties) | Open (no royalties) |
| Geopolitical Risk | Subject to export controls | Independent |

Part Six: The Orbital Escape

Chapter 22: The First AI Model Trained in Space

In November 2025, an Nvidia-backed startup called Starcloud launched a satellite carrying an Nvidia H100 GPU—a chip 100 times more powerful than any GPU previously operated in orbit. The mission was not primarily about Earth observation or communications. It was about AI.

Starcloud successfully trained NanoGPT—a compact language model—on Shakespeare’s complete works and ran Google’s Gemma model in inference, all in orbit. The demonstrations proved that space-based data centers could exist and operate AI workloads.

Dion Harris, Nvidia’s senior director of AI infrastructure, called it “a giant leap toward a future where orbital computing harnesses the infinite power of the sun.”

Weeks later, Google unveiled Project Suncatcher: a research initiative aiming to build data centers in space using solar-powered satellite constellations running Google’s TPU chips, with data transmitted via laser inter-satellite links. A demonstration mission is planned for 2027 in partnership with Planet Labs.

Google CEO Sundar Pichai framed the ambition: “We will send tiny, tiny racks of machines and have them in satellites, test them out, and then start scaling from there. There is no doubt to me that, a decade or so away, we will be viewing it as a more normal way to build data centers.”


Chapter 23: The Physics Advantage

The appeal of space for AI infrastructure rests on fundamental physics.

Solar abundance: In a sun-synchronous orbit, a satellite receives solar energy continuously—no day/night cycles, no weather, no atmospheric attenuation. Irradiance is approximately 1,366 watts per square meter, 40% higher than Earth’s surface. The capacity factor approaches 100%.

Radiative cooling: On Earth, cooling requires fans or pumps to move heat into air or water. In the vacuum of space, heat must be rejected via radiation. But deep space is an effectively infinite heat sink at approximately 2.7 Kelvin (-270°C). A black surface at room temperature will radiate roughly 838 watts per square meter to the cosmic background—about three times the electricity generated by solar panels of equivalent area.

No resource conflicts: Orbital facilities compete with no one for water, electricity, or land. There are no neighbors to object to noise or appearance. There are no grid interconnection queues.

Starcloud’s vision involves a 5-gigawatt orbital data center with solar and cooling panels spanning approximately 4 kilometers on each side. This would produce more power than the largest U.S. power plant while being substantially smaller than a terrestrial solar farm of equivalent capacity.
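The radiative-cooling figure can be sanity-checked with the Stefan-Boltzmann law. The sketch below assumes an ideal blackbody panel (emissivity 1) radiating from both faces, which is one plausible reading of the ~838 watts-per-square-meter claim above.

```python
# Sanity check on the radiative-cooling figure cited in the text.
SIGMA = 5.67e-8          # Stefan-Boltzmann constant, W/(m^2 K^4)
t_panel = 293.0          # kelvin, "room temperature"
t_space = 2.7            # kelvin, cosmic microwave background

flux = 2 * SIGMA * (t_panel**4 - t_space**4)   # both faces of the panel
print(f"{flux:.0f} W per square meter")        # ~836, within rounding of ~838
```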


Chapter 24: The Skeptics’ Case

Critics of orbital computing enumerate substantial obstacles:

Launch costs: Even with reusable rockets, delivering mass to orbit costs thousands of dollars per kilogram. A gigawatt-scale data center would require mass measured in thousands of tons—hundreds of Starship launches.

Maintenance: There are no technicians in orbit. A failed GPU cannot be swapped. This requires either extreme redundancy (software fault tolerance across massive node counts) or autonomous robotic servicing swarms—technology still in early development.

Radiators: Rejecting 1 gigawatt of heat via radiation requires approximately 3 square kilometers of radiator surface area. Constructing and maintaining structures of this scale in orbit is an unprecedented engineering challenge.

Radiation and debris: Cosmic rays cause bit flips in electronics. Micrometeorites threaten physical damage. Hardening adds mass and cost.

Latency: Light takes time to travel. A geostationary satellite is 36,000 kilometers away—240 milliseconds round-trip at minimum. Even low Earth orbit adds tens of milliseconds compared to terrestrial infrastructure.

Environmental footprint: A Saarland University study calculated that orbital data centers could create an order of magnitude greater emissions than Earth-based alternatives, accounting for rocket launch emissions and atmospheric reentry of spacecraft components.


Chapter 25: The Regulatory Vacuum

Perhaps the most profound implication of orbital computing is jurisdictional. An AI model trained on a private station in international orbit is technically outside the reach of the EU AI Act, U.S. Executive Orders, or Chinese regulations.

This creates the possibility of “Orbital Data Havens”—facilities where companies could train models on copyrighted data, use prohibited algorithmic techniques, or develop capabilities banned on Earth, all beyond effective regulatory enforcement.

The prospect has already sparked discussion of an “Outer Space Treaty for AI”—international agreements that would extend jurisdictional frameworks to orbital compute. But negotiating such treaties while space-based AI remains speculative faces the same challenge as all anticipatory governance: it is difficult to regulate what does not yet exist.


![Orbital vs. Terrestrial Trade-offs]

Figure 8: The Space Equation

| Factor | Terrestrial Data Center | Orbital Data Center |
| --- | --- | --- |
| Solar availability | Weather-dependent, 20-30% capacity factor | Near-100% capacity factor |
| Cooling | Requires water/energy | Passive radiative (free) |
| Resource conflicts | Water, grid, land, neighbors | None |
| Maintenance | Human technicians | Robotic (immature) |
| Latency | Minimal | Tens to hundreds of milliseconds |
| Regulatory jurisdiction | National laws apply | Ambiguous |
| Current status | Ubiquitous | Experimental |

Part Seven: Synthesis

Chapter 26: The Great Bifurcation

As we survey the AI landscape in early 2026, the dominant theme is the end of unconstrained growth. The constraints of the physical world (thermodynamics, materials science, chemistry, and orbital mechanics) have asserted themselves against the exponential curves of Moore's Law and compute scaling.

We are witnessing what might be called a “Great Bifurcation” of the AI ecosystem into two competing architectures:

The Integrated Western Stack is characterized by high capital cost, regulatory compliance, massive physical infrastructure (nuclear SMRs, glass substrates, proprietary interconnects), and deep integration between hardware and software ecosystems (CUDA, x86/ARM, Nvidia). This stack is powerful but slow to build, dependent on complex supply chains, and vulnerable to single points of failure.

The Modular Sovereign Stack is defined by efficiency, open standards (RISC-V, open-source models), and adaptability born of constraint (DeepSeek-style algorithmic optimization). This stack is leaner, more resilient to supply chain disruption, and accessible to nations and companies seeking independence from Western technology governance.

These stacks are not mutually exclusive—components can and do mix—but they represent fundamentally different philosophies of how AI infrastructure should be built and controlled.


Chapter 27: The Efficiency Imperative

The DeepSeek shock demonstrated that algorithmic innovation can substitute for brute-force scaling. This realization has profound implications for how the industry will evolve.

If capability per FLOP matters more than total FLOPs, the moat is no longer compute accumulation but research velocity. The labs most likely to find the next algorithmic breakthrough are those embedded in the research ecosystems where ideas circulate fastest. This suggests that open research may ultimately prove more valuable than compute hoarding or export-control games.

The efficiency frontier also offers a path around thermodynamic constraints. If each generation of algorithms achieves more with less energy, the collision between AI ambition and grid capacity becomes more manageable. Not solved—demand will still grow—but the growth curve could bend.
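
A toy model makes the bending concrete (a sketch; the workload-growth and efficiency rates are invented for illustration, not forecasts):

```python
# Toy model: energy demand = workload growth divided by efficiency gains.
workload_growth = 1.5   # hypothetical yearly growth in demanded compute

for eff_gain in (1.0, 1.3, 1.6):   # hypothetical yearly efficiency gains
    demand = 1.0
    for _ in range(10):
        demand *= workload_growth / eff_gain
    print(f"{eff_gain:.1f}x/yr efficiency -> {demand:.1f}x energy in 10 yrs")
# 1.0x -> ~57.7x, 1.3x -> ~4.2x, 1.6x -> ~0.5x: the curve bends.
```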

This creates an interesting strategic dynamic. Labs that invest heavily in efficiency research are simultaneously reducing their own infrastructure requirements and potentially commoditizing the capabilities that justify their competitors’ massive capital expenditures.


Chapter 28: Resource-Aware Governance

A credible solution to AI’s material challenges must move beyond the myth of infinite scalability toward resource-aware compute governance.

This means embedding explicit physical constraints into how AI systems are designed, scheduled, and priced:

Compute budgets tied to real-time grid capacity, water availability, and carbon intensity—making the environmental cost of training visible and allocatable.

Efficiency standards that reward capability per unit of energy consumed rather than maximum aggregate throughput, creating market incentives for algorithmic innovation.

Infrastructure planning that integrates AI demand forecasts into long-term energy and water resource management—treating data centers as first-class participants in regional planning.

Cooling certification that ensures facilities meet engineering standards for fluid chemistry, material compatibility, and maintenance capability, heading off avoidable failures.

None of this is technically difficult. The challenge is governance: creating the institutional frameworks that can impose constraints on an industry that has thrived on ignoring them.
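
As a small illustration of that claim, here is a minimal sketch of the first mechanism, an admission policy tied to grid headroom and carbon intensity (the job fields, thresholds, and numbers are all hypothetical):

```python
# Minimal carbon-aware admission policy for training jobs (hypothetical
# fields and thresholds; a real system would read a live grid-data feed).
from dataclasses import dataclass

@dataclass
class TrainingJob:
    name: str
    power_mw: float       # estimated draw while running
    deferrable: bool      # can this job wait for cleaner power?

def admit(job: TrainingJob, grid_headroom_mw: float,
          carbon_g_per_kwh: float, carbon_cap: float = 300.0) -> bool:
    """Admit a job only if the grid has headroom and power is clean enough.
    Deferrable jobs are held whenever carbon intensity exceeds the cap."""
    if job.power_mw > grid_headroom_mw:
        return False
    if job.deferrable and carbon_g_per_kwh > carbon_cap:
        return False
    return True

jobs = [TrainingJob("frontier-pretrain", 120.0, deferrable=True),
        TrainingJob("prod-inference", 15.0, deferrable=False)]
for j in jobs:
    print(j.name, admit(j, grid_headroom_mw=200.0, carbon_g_per_kwh=450.0))
# frontier-pretrain is held (dirty grid); prod-inference runs.
```

The control logic is a few lines; the hard part, as the text says, is the institution that sets the cap and makes anyone honor it.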


Chapter 29: The Next Decade

The period from 2026 to 2036 will determine whether AI development continues on an exponential trajectory or enters a period of consolidation dictated by physical limits. The outcome depends on several parallel races:

Glass substrates must reach commercial viability at scale to enable the next generation of chiplet-based accelerators. Intel, Absolics, and Samsung are competing to become the foundational suppliers of AI packaging.

SMRs must achieve design certification and deployment at speeds unprecedented for nuclear technology to bridge the energy gap before grid constraints force training slowdowns.

Cooling infrastructure must mature operationally, with either the labor force expanding to meet complexity requirements or systems simplifying to match available skills.

RISC-V must establish datacenter credibility through successful deployments and software ecosystem development to offer a genuine alternative to proprietary architectures.

Orbital computing must prove commercial feasibility through demonstrations that address latency, maintenance, and launch cost concerns.

Each of these races is independent yet interconnected. Success in one can compensate for delays in another; failure across multiple fronts could impose hard limits on AI capability growth.


Chapter 30: When Software Met Atoms

There is a famous line in technology punditry, Marc Andreessen's 2011 observation that "software is eating the world." The phrase captured the early 21st century's dominant dynamic: digital systems absorbing and transforming industry after industry.

The AI era represents something different. Software has eaten enough of the world that it is now bumping up against the world’s physical constraints. The data centers that house AI models require concrete, steel, water, and electricity in quantities that register on national resource statistics. The chips that run AI workloads require materials sourced from specific mines, processed in specific factories, and packaged using specific techniques that cannot be trivially replicated.

The next phase of AI development will not be determined solely by who writes the cleverest algorithms or accumulates the most capital. It will be determined by who can source T-Glass from Japanese looms, who can restart nuclear reactors and deploy SMRs fastest, who can train technicians to manage liquid cooling systems without catastrophic failures, who can manufacture glass substrates with sub-micron precision, and perhaps, eventually, who can escape the gravity well entirely.

The era of software eating the world is giving way to something more complex: a dialectic between digital ambition and material constraint, where the world—with its atoms, heat, water, and finite resources—is eating back.


Appendix: Sources for Further Investigation

DeepSeek and Algorithmic Efficiency

  • finance.yahoo.com/news/nvidia-stock-plummets-loses-record-589-billion-as-deepseek-prompts-questions-over-ai-spending
  • newsletter.semianalysis.com/p/deepseek-debates
  • tomshardware.com/tech-industry/artificial-intelligence/deepseek-might-not-be-as-disruptive-as-claimed
  • fortune.com/2025/02/10/google-ai-chief-demis-hassabis-deepseek-cost-claims-exaggerated

Energy and Grid Impact

  • pewresearch.org/short-reads/2025/10/24/what-we-know-about-energy-use-at-us-data-centers-amid-the-ai-boom
  • iea.org/reports/energy-and-ai/energy-demand-from-ai
  • bloomberg.com/graphics/2025-ai-data-centers-electricity-prices
  • brookings.edu/articles/the-future-of-data-centers

Nuclear and SMRs

  • spectrum.ieee.org/nuclear-powered-data-center
  • thebulletin.org/2024/12/ai-goes-nuclear
  • powermag.com/the-smr-gamble-betting-on-nuclear-to-fuel-the-data-center-boom
  • wwt.com/blog/big-techs-nuclear-bet-key-small-modular-reactors-for-cloud-power
  • bisresearch.com/industry-report/small-modular-reactor-market-data-center-application.html

Cooling Infrastructure

  • tomshardware.com/pc-components/cooling/the-data-center-cooling-state-of-play-2025
  • techstories.co/liquid-cooling-leak-destroys-millions-of-dollars-in-gpus
  • blog.se.com/datacenter/2025/08/12/why-liquid-cooling-for-ai-data-centers-is-harder-than-it-looks
  • bloomberg.com/news/articles/2025-11-28/cme-outage-how-are-data-centers-cooled-what-happens-if-they-overheat

Glass Substrates and Packaging

  • intc.com/news-events/press-releases/detail/1647/intel-unveils-industry-leading-glass-substrates
  • digitimes.com/news/a20251217PD228/rapidus-interposer-tsmc-chips-production
  • semiengineering.com/the-race-to-glass-substrates
  • idtechex.com/en/research-report/advanced-semiconductor-packaging/1042

RISC-V Development

  • riscv.org/blog/risc-v-announces-ratification-of-the-rva23-profile-standard
  • riscv.org/blog/2025/08/nvidia-cuda-rva23
  • tomshardware.com/pc-components/gpus/nvidias-cuda-platform-now-supports-risc-v
  • eetimes.com/risc-v-exceeding-expectations-in-ai-china-deployment

Orbital Computing

  • cnbc.com/2025/12/10/nvidia-backed-starcloud-trains-first-ai-model-in-space
  • blogs.nvidia.com/blog/starcloud
  • scientificamerican.com/article/data-centers-in-space
  • time.com/7344364/ai-data-centers-in-space
  • geekwire.com/2025/starcloud-power-training-ai-space

Research compiled January 2026. Word count: approximately 12,000.
