Business & policy

Musk's Colossus 1 supercomputer leased to Anthropic as Blackwell-only Colossus 2 readied for frontier training and potential IPO

At a glance:

  • SpaceX leased its entire Colossus 1 data center — 220,000 GPUs across 300 MW of capacity — to Anthropic, xAI's direct rival, to monetize an underutilized asset.
  • Colossus 1's mixed architecture (150,000 H100s, 50,000 H200s, 20,000 GB200s) caused straggler bottlenecks that drove xAI's real-world GPU utilization to just 11%, making it unsuitable for training Grok.
  • Colossus 2, built on uniform Nvidia Blackwell hardware at gigawatt scale, is now xAI's frontier training platform as Musk positions SpaceX/xAI for an upcoming IPO.

Why Anthropic struck the deal

Last week Anthropic announced it had signed a deal with SpaceX to lease all of xAI's Colossus 1 data center. The cluster packs over 220,000 Nvidia GPUs and 300 megawatts of power capacity — one of the largest AI installations on Earth — and its full capacity is now dedicated to Anthropic. According to the company, the additional capacity will primarily be used to ease long-standing bottlenecks across Claude's paid ecosystem. The specifics include significantly higher Claude Code limits, the removal of peak-hour throttling for Pro and Max subscribers, and substantially increased API request limits for Claude Opus models used by developers and enterprise customers.

The partnership is a complete reversal of Musk's earlier stance. Earlier this year he publicly attacked Anthropic and Claude, calling the company "misanthropic and evil." This week he claimed he approved the deal after speaking with Anthropic executives and determining that "no one set off my evil detector." For Anthropic, the urgency was real: the company says it needs the entire 300 MW AI supercluster just to improve the experience of using Claude, and it recently raised $30 billion in a Series G round valuing the company at $380 billion.

Colossus 1's architectural problem

When Musk unveiled Colossus, it was framed as proof that xAI intended to compete seriously with OpenAI, Anthropic, and Google at the AI frontier. The Memphis-based cluster became famous for how quickly it was assembled — tens of thousands of Nvidia GPUs reportedly came online in record time, eventually scaling to over 220,000 accelerators. However, the speed of construction created a heterogeneous configuration: roughly 150,000 H100s, 50,000 H200s, and 20,000 GB200s, three different generations of Nvidia silicon running under one roof.

For AI training, that mix is a structural liability. Distributed training requires every GPU in the cluster to complete each computational step before the system can advance. When the faster GB200 chips finish their work first, the entire cluster waits for the slower H100s — a well-known bottleneck called the straggler effect. At 220,000 chips, the slowest generation gates every single step, and the idle time compounds across millions of steps. According to a detailed report by Mirae Asset Securities, xAI's real-world GPU utilization sat at just 11%, meaning 89% of the cluster's theoretical computing power was going to waste; Meta and Google, by contrast, typically operate at 40% or above. AI GPUs depreciate rapidly, consume enormous amounts of electricity, and require expensive maintenance and cooling — an idle GPU is effectively burning money.
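The straggler effect described above can be sketched with a toy model. The per-step times below are illustrative assumptions, not vendor benchmarks, and the model captures only compute-speed imbalance — real-world utilization is also dragged down by communication overhead, hardware failures, and memory differences, which is how the reported figure can fall all the way to 11%.

```python
# Hypothetical seconds per training step for each GPU generation
# (illustrative assumptions only, not measured benchmarks).
STEP_TIME = {"H100": 1.00, "H200": 0.80, "GB200": 0.45}

def utilization(fleet: dict[str, int]) -> float:
    """Fraction of available GPU-seconds doing useful work in one
    synchronous step.

    In synchronous data-parallel training, the step lasts as long as the
    slowest device (the straggler), so faster GPUs idle for the remainder.
    """
    slowest = max(STEP_TIME[gen] for gen in fleet)           # step duration
    busy = sum(n * STEP_TIME[gen] for gen, n in fleet.items())
    available = sum(fleet.values()) * slowest                # GPU-seconds
    return busy / available

mixed = {"H100": 150_000, "H200": 50_000, "GB200": 20_000}   # Colossus 1 mix
uniform = {"GB200": 220_000}                                  # Blackwell-only

print(f"mixed fleet:   {utilization(mixed):.0%}")
print(f"uniform fleet: {utilization(uniform):.0%}")
```

Even this compute-only toy shows a uniform fleet wasting nothing on stragglers, while any mixed fleet wastes whatever its faster chips spend waiting.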

Why inference changes the calculus

Anthropic's compute capacity problem was immediate and urgent. Free users frequently complained about rapidly exhausting tokens, but the restrictions extended well beyond the free tier. Paid Pro, Max, Team, and Enterprise users also regularly encountered message caps, peak-hour throttling, API rate limits, and strict time-based usage ceilings on Claude Code sessions, particularly during periods of heavy demand. The company was running out of inference capacity — the continuous, round-the-clock demand for compute that scales with every new user and every new query.

Inference workloads, however, do not require the tight synchronization that training demands. Running queries through an already-trained model can tolerate heterogeneous hardware because the GPUs serve independent requests rather than waiting on each other in lockstep. What was a structural inefficiency for xAI's training workloads is therefore workable infrastructure for Anthropic's inference needs: the same cluster that was a money-losing liability for xAI became a valuable asset for Anthropic at exactly the right moment.
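A minimal sketch of why the same fleet behaves so differently under inference: requests are independent, so aggregate capacity is the sum of each device's throughput rather than being gated by the slowest generation. The tokens-per-second figures are hypothetical placeholders.

```python
# Hypothetical serving throughput (tokens/s) per GPU generation for an
# already-trained model -- placeholder numbers, not benchmarks.
TOKENS_PER_SEC = {"H100": 100, "H200": 130, "GB200": 250}

FLEET = {"H100": 150_000, "H200": 50_000, "GB200": 20_000}   # Colossus 1 mix

def inference_throughput(fleet: dict[str, int]) -> int:
    """Independent requests: each GPU serves at its own pace, so fleet
    capacity is simply the sum -- no device waits on another."""
    return sum(n * TOKENS_PER_SEC[gen] for gen, n in fleet.items())

def lockstep_throughput(fleet: dict[str, int]) -> int:
    """Synchronous-training analogue: the whole fleet advances at the
    pace of its slowest generation."""
    slowest = min(TOKENS_PER_SEC[gen] for gen in fleet)
    return sum(fleet.values()) * slowest

print(f"inference (independent): {inference_throughput(FLEET):,} tokens/s")
print(f"lockstep (gated):        {lockstep_throughput(FLEET):,} tokens/s")
```

Under these toy numbers the heterogeneous fleet delivers its full summed capacity for inference, while lockstep execution discards everything the H200s and GB200s offer beyond H100 pace.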

Colossus 2 and the path to an IPO

Multiple reports suggest xAI is now heavily focused on Colossus 2, a far larger next-generation cluster reportedly aimed at gigawatt-scale AI infrastructure. Unlike Colossus 1's chaotic mix of chip generations, Colossus 2 is built entirely on Nvidia's Blackwell architecture — a homogeneous cluster in which every GPU is identical. Every chip completes each training step at roughly the same time, which should in theory push GPU utilization past the 40%-plus range at which Meta and Google currently operate. xAI can also optimize its software stack for a single hardware generation rather than trying to serve three simultaneously.

According to the Mirae Asset report, xAI has already moved its core training workloads entirely onto Colossus 2, effectively treating Colossus 1 as a retired first-generation asset. Musk has long treated his companies less like isolated entities and more like interconnected pieces of a broader ecosystem — Tesla technologies appear across SpaceX projects, SpaceX infrastructure supports xAI ambitions, and xAI products increasingly feed into Musk's wider platform strategy. The deal also hints that Musk could be positioning SpaceX/xAI as an AI cloud infrastructure provider, which aligns with xAI's existing Grok Business and enterprise offerings featuring APIs, security controls, audit logging, and corporate integrations, as well as reported plans for broader structural changes ahead of an upcoming IPO.

The financials behind the lease

Mirae Asset's analysts attempted to estimate the value of the Anthropic deal using estimated hourly lease rates for different Nvidia GPU types. They projected that Colossus 1 could generate roughly $5–6 billion in annual revenue — close to xAI's annualized net loss of approximately $6 billion as of Q1 2026, and enough, at least on paper, to pull the company to breakeven with a single contract.

On the Anthropic side, the analysts applied CEO Dario Amodei's own publicly stated estimate that roughly half of all AI industry compute spending goes toward inference and that inference compute converts to revenue at a 3x multiplier. On that basis, the roughly $5 billion Anthropic is directing toward inference capacity could generate approximately $15 billion in incremental ARR — a significant addition to Anthropic's already rapidly growing revenue base. Anthropic recently said its annualized revenue run rate had already surpassed $30 billion, underscoring the staggering scale at which Claude's business is now operating.
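The back-of-envelope arithmetic above can be reproduced in a few lines, using only the figures quoted in the article (the 3x multiplier is Amodei's cited estimate, not ours):

```python
# Figures as quoted in the article / Mirae Asset report.
lease_revenue_low, lease_revenue_high = 5e9, 6e9  # projected annual lease value
xai_annualized_net_loss = 6e9                     # reported, as of Q1 2026
inference_revenue_multiplier = 3                  # Amodei: inference spend -> 3x revenue

# xAI side: lease revenue vs. annualized net loss (breakeven "on paper").
xai_shortfall = xai_annualized_net_loss - lease_revenue_high

# Anthropic side: lower-bound lease spend, converted at the 3x multiplier.
incremental_arr = lease_revenue_low * inference_revenue_multiplier

print(f"xAI shortfall after lease (high end): ${xai_shortfall:,.0f}")
print(f"Anthropic incremental ARR estimate:   ${incremental_arr:,.0f}")
```

Note the caveat baked into the article's own framing: revenue offsets a *net loss* only to a first approximation, since the lease carries its own operating costs.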

Orbital AI and the growing infrastructure crunch

The announcement also touched on "orbital AI compute capacity" — data centers in space. It sounds like science-fiction marketing, but it ties directly into a core problem both companies, alongside several other AI giants, increasingly face: AI infrastructure is beginning to outgrow terrestrial constraints. When a joint announcement comes from one of the world's largest AI companies and the company that built the world's largest reusable rocket system and operates thousands of active satellites, the implication is clear — orbital data centers may arrive sooner than they sound.

The terrestrial constraints are real and compounding. Modern hyperscale AI data centers can cost tens of billions of dollars and take years to build. Utilities are struggling to supply sufficient electricity for AI projects, while land, transformers, cooling infrastructure, and high-end GPUs themselves remain constrained. There is also growing sentiment against AI infrastructure from local communities. Anthropic is pursuing massive gigawatt deals with Amazon, Google, Microsoft, and Nvidia, but those solutions are long-term. Colossus 1 provided an immediate, ready-made fix.

What happens next

Colossus 1's entire computing power now belongs to Anthropic — for now. xAI built Colossus 1 fast, and the resulting mixed GPU architecture created structural training inefficiencies that made the cluster hard to justify as a long-term platform. With Colossus 2 operational and built on uniform Blackwell hardware, Colossus 1 transitioned from cutting-edge frontier training weapon to monetizable first-generation compute asset. The deal converts a depreciating liability into roughly $5–6 billion in annual revenue for xAI and could unlock an estimated $15 billion in additional ARR for Anthropic. Both companies got what they needed, and Musk gets a compelling infrastructure story heading into a potential IPO.

Editorial

SiliconFeed is an automated feed: facts are checked against sources; copy is normalized and lightly edited for readers.

FAQ

Why did Musk lease Colossus 1 to Anthropic instead of using it for Grok?
Colossus 1's mixed GPU architecture — 150,000 H100s, 50,000 H200s, and 20,000 GB200s — created severe straggler bottlenecks during distributed training. xAI's real-world GPU utilization reportedly sat at just 11%, making the cluster inefficient for training Grok. Anthropic, which needed compute for inference rather than training, could use the same hardware without the synchronization penalty.
What is Colossus 2 and how does it differ from Colossus 1?
Colossus 2 is xAI's next-generation cluster built entirely on Nvidia's Blackwell architecture, making it a homogeneous cluster where every GPU is identical. Unlike Colossus 1's mix of three GPU generations, Colossus 2 avoids straggler effects, allowing higher utilization and proper software optimization for a single hardware generation. It is reportedly aimed at gigawatt-scale AI infrastructure.
How much revenue could the Colossus 1 lease generate?
According to Mirae Asset Securities, Colossus 1 could theoretically generate roughly $5–6 billion in annual revenue. For xAI, that nearly offsets its annualized net loss of approximately $6 billion as of Q1 2026. For Anthropic, the $5 billion directed toward inference capacity could generate an estimated $15 billion in incremental ARR based on CEO Dario Amodei's estimate that inference compute converts to revenue at a 3x multiplier.
