
For model providers such as OpenAI and Anthropic, Dai pointed out, the choice of chips enables a clearer separation between training and serving fleets while still allowing reuse of common tools and code paths, in turn lowering total costs, improving fleet efficiency, and simplifying model lifecycle transitions.
In fact, Google is not the only chip provider that is walking the split-design path, said Stephen Sopko, analyst at HyperFRAME Research, giving the example of AWS, which has two distinct chips — Trainium and Inferentia — for different AI workloads.
How are TPU 8t and 8i better than Ironwood?
While the design split reflects changing economics, the two chips are also built for distinct technical advantages over their predecessor, Ironwood.
According to Google, TPU 8t, the training-focused variant, offers nearly 3x the compute performance per pod, larger superpods, and double the inter-chip bandwidth of Ironwood.
While Ironwood delivers 42.5 exaflops across a 9,216-chip pod, TPU 8t scales to 121 exaflops across 9,600 chips, alongside a doubling of bidirectional scale-up bandwidth to 19.2 Tbps per chip and a fourfold increase in scale-out networking bandwidth to 400 Gbps, the company said in a statement.
According to Omdia principal analyst Alexander Harrowell, the boost to performance and inter-rack bandwidth will support training even larger models with shorter runs than Ironwood allows.