Google in talks with Marvell to add a third custom AI chip partner
At a glance:
- Google is in talks with Marvell Technology to design a memory processing unit and an inference‑optimised TPU for its data‑center AI workloads.
- The move adds Marvell as a third design‑services partner alongside Broadcom and MediaTek in Google’s custom silicon supply chain.
- Industry analysts project the custom AI accelerator market to reach $118 billion by 2033, driven by a shift toward inference‑heavy compute.
What the talks involve
Google’s internal teams have approached Marvell Technology about two specific chips, according to a report from The Information. The first is a memory processing unit (MPU) that would sit alongside Google’s existing Tensor Processing Units, helping to accelerate data movement for large models. The second is a new TPU built specifically for inference, the phase where trained models serve user queries rather than learning from new data.
Marvell would act in a design‑services capacity, similar to MediaTek’s role on Google’s recently announced Ironwood TPU. The discussions are still at an exploratory stage and no contract has been signed, but the timing follows Broadcom’s announcement of a long‑term, through‑2031 agreement to supply TPUs and networking components for Google.
Why inference matters now
Google’s seventh‑generation TPU, Ironwood, launched this month as “the first Google TPU for the age of inference.” It delivers ten times the peak performance of the previous TPU v5p and scales to 9,216 liquid‑cooled chips in a super‑pod that consumes roughly 10 MW and achieves 42.5 exaflops of FP8 compute. Google plans to build millions of Ironwood units this year.
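The pod‑level figures above imply rough per‑chip numbers. A back‑of‑envelope sketch (treating the quoted ~10 MW as the whole pod budget, so networking and cooling overheads are folded into the per‑chip estimate):

```python
# Derive approximate per-chip figures from the pod-level specs quoted above:
# 9,216 chips per super-pod, ~10 MW, 42.5 exaflops of FP8 compute.
CHIPS = 9216
POD_POWER_W = 10e6            # ~10 MW, includes all pod overheads
POD_FP8_FLOPS = 42.5e18       # 42.5 exaflops (FP8)

per_chip_pflops = POD_FP8_FLOPS / CHIPS / 1e15        # petaFLOPS per chip
per_chip_watts = POD_POWER_W / CHIPS                  # upper-bound watts/chip
tflops_per_watt = POD_FP8_FLOPS / POD_POWER_W / 1e12  # pod-level efficiency

print(f"{per_chip_pflops:.2f} PFLOPS/chip, "
      f"{per_chip_watts:.0f} W/chip, "
      f"{tflops_per_watt:.2f} TFLOPS/W")
# → roughly 4.61 PFLOPS and ~1.1 kW per chip, ~4.25 TFLOPS/W at FP8
```

These ratios, not the headline exaflop number, are what matter for inference economics, since serving cost tracks sustained throughput per watt.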
Inference workloads differ from training in that they run continuously, serving billions of queries across Search, Gemini, and Cloud AI APIs. Because the cost scales with volume rather than peak capability, even modest efficiency gains translate into billions of dollars saved. Purpose‑built inference silicon therefore offers a competitive edge that general‑purpose GPUs cannot match on cost or power efficiency.
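The volume argument can be made concrete with a toy cost model. All figures below are hypothetical round numbers chosen only to illustrate the scaling, not actual Google economics:

```python
# Toy model: why small inference-efficiency gains matter at query scale.
# Both inputs are hypothetical, for illustration only.
QUERIES_PER_DAY = 1e10   # assumed AI-augmented queries served daily
COST_PER_QUERY = 0.002   # assumed all-in serving cost per query, USD

def annual_serving_cost(cost_per_query: float) -> float:
    """Annualised serving cost at the assumed query volume."""
    return QUERIES_PER_DAY * cost_per_query * 365

baseline = annual_serving_cost(COST_PER_QUERY)
improved = annual_serving_cost(COST_PER_QUERY * 0.9)  # a 10% efficiency gain

print(f"baseline: ${baseline / 1e9:.1f}B/yr, "
      f"10% gain saves ${(baseline - improved) / 1e9:.2f}B/yr")
# → baseline: $7.3B/yr, 10% gain saves $0.73B/yr
```

Under these assumed inputs, a single-digit efficiency improvement is worth hundreds of millions of dollars a year, which is the economic case for purpose-built inference silicon.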
Marvell’s credentials
Marvell reported a record $6.1 billion in data‑center revenue for its fiscal year ending February 2026, part of an $8.2 billion total that represented a 42 % year‑over‑year increase. The company runs a custom silicon business with a $1.5 billion annual run rate across 18 cloud‑provider design wins, including chips for Amazon (Trainium), Microsoft (Maia AI accelerator), and Meta (a new data‑processing unit). Marvell also supplies Google with the Axion ARM CPU.
In March, Nvidia invested $2 billion in Marvell and announced an NVLink Fusion partnership to integrate Marvell’s custom chips and networking with Nvidia’s interconnect fabric. Earlier, in December 2025, Marvell acquired Celestial AI for up to $5.5 billion, adding photonic interconnect technology that CEO Matt Murphy says will deliver “the industry’s most complete connectivity platform for AI and cloud customers.” Murphy targets a 20 % market share in custom AI chips and expects roughly 30 % YoY revenue growth in fiscal 2027.
Broadcom’s position and market outlook
Broadcom still dominates the custom AI accelerator space with more than 70 % market share. Its AI revenue hit $8.4 billion in the most recent quarter, up 106 % YoY, and the company guided $10.7 billion for the following quarter. Broadcom aims for $100 billion in AI chip revenue by 2027.
Analysts project the broader ASIC market to outpace GPUs: TrendForce expects custom chip sales to grow 45 % in 2026, versus a 16 % rise in GPU shipments. Counterpoint Research forecasts Broadcom will hold roughly 60 % of the custom AI accelerator market by 2027, with Marvell at about 25 %. The total market is expected to reach $118 billion by 2033.
Implications for Google’s chip strategy
Google’s supply chain would then involve four external partners: Broadcom, MediaTek, and Marvell for design services, plus TSMC for fabrication, alongside its own in‑house design team. The diversification mirrors automotive manufacturers that spread component risk across multiple suppliers rather than relying on a single vendor.
By adding Marvell, Google can target different workload profiles or cost points for inference, complementing the high‑performance Ironwood chips from Broadcom and the cost‑optimised “e” variants from MediaTek. This multi‑supplier architecture reduces pricing and supply risk while giving Google leverage to negotiate better terms across the board.
The talks are still preliminary, and any resulting products are likely years away from production. Nevertheless, the direction is clear: Google is building a resilient, inference‑focused silicon ecosystem that can sustain the massive scale of its AI‑augmented services. For Marvell, winning a Google inference TPU contract would cement its status as the second‑most important custom AI chip designer worldwide.
Prepared by the editorial stack from public data and external sources.