
The new strategic partnership between OpenAI and NVIDIA, formalized via a letter of intent in September 2025, is designed to both power and finance the next generation of OpenAI’s compute infrastructure, with initial deployments expected in the second half of 2026. According to the joint press release, both parties position this as “the biggest AI infrastructure deployment in history,” explicitly aimed at training and running OpenAI’s next-generation models.
At a high level:
- The target scale is 10 gigawatts (GW) or more of deployed compute capacity, realized via NVIDIA systems (comprising millions of GPUs).
- The first phase (1 GW) is slated for the second half of 2026, built on the forthcoming Vera Rubin platform.
- NVIDIA will progressively invest up to $100 billion in OpenAI, with each tranche contingent on capacity being deployed in stages.
- An initial $10 billion investment from NVIDIA is tied to the execution of a definitive purchase agreement for the first gigawatt of systems.
- The equity stake NVIDIA will acquire is described as non-voting / non-controlling, meaning it gives financial skin in the game without governance control.
From a strategic standpoint, tying investment to capacity deployment helps OpenAI lock in capital and hardware over a long horizon, mitigating supply-chain and financing risk. With compute frequently cited as a binding constraint on advancing models, this kind of staged, anchored commitment gives OpenAI a more predictable growth path, at least in theory; the precise economic terms and risk-sharing have yet to be fully disclosed.
Press statements emphasize that millions of GPUs will ultimately be involved, and that co-optimization of NVIDIA’s hardware with OpenAI’s software/stack will be a key feature of the collaboration.
Importantly, this deal also fits into OpenAI’s broader strategy of diversifying infrastructure partnerships beyond any single cloud provider. Microsoft remains a central backer and collaborator, but this NVIDIA tie-up further broadens OpenAI’s compute base, complementing other announced initiatives such as the Stargate project with Oracle and SoftBank.
This deal also marks a strategic shift for NVIDIA: rather than being purely a vendor of chips, it becomes a strategic investor in an anchor customer whose growth directly drives GPU demand. This alignment tightens the coupling between NVIDIA’s roadmap, its software ecosystem, and real-world end deployments.
In a CNBC interview, NVIDIA CEO Jensen Huang characterized the initiative thus:
“This is the biggest AI infrastructure project in history. This partnership is about building an AI infrastructure that enables AI to go from the labs into the world.”
NVIDIA’s press materials also add that this new AI infrastructure will deliver “a billion times more computational power” than the first DGX system Huang delivered to OpenAI in 2016 — a rhetorical contrast intended to highlight the scale leap.
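As a rough plausibility check on that framing, a back-of-envelope sketch follows. The 2016 DGX-1 figure (~170 TFLOPS of FP16 compute) is my assumption rather than something stated in the announcement, and the FP16-versus-NVFP4 precision mismatch means this is an order-of-magnitude comparison only.

```python
# Back-of-envelope check on the "a billion times more computational power" framing.
# ASSUMPTIONS (not from the announcement):
#   - The 2016 DGX-1 delivered roughly 170 TFLOPS of FP16 compute.
#   - The rack-level figure of ~8 exaflops of NVFP4 per Vera Rubin NVL144 CPX rack
#     (quoted in the next section) is used as the unit of new capacity.
#   - FP16 and NVFP4 are different precisions, so this is order-of-magnitude only.

DGX1_FLOPS = 170e12        # ~170 TFLOPS (FP16), first DGX-1 delivered to OpenAI in 2016
TARGET_MULTIPLE = 1e9      # "a billion times"
RACK_FLOPS = 8e18          # ~8 exaflops of NVFP4 per NVL144 CPX rack

target_flops = DGX1_FLOPS * TARGET_MULTIPLE     # ~1.7e23 FLOPS in aggregate
racks_implied = target_flops / RACK_FLOPS       # ~21,000 racks

print(f"Aggregate compute implied by the claim: {target_flops:.2e} FLOPS")
print(f"NVL144 CPX racks implied: {racks_implied:,.0f}")
```

Tens of thousands of rack-scale systems is broadly consistent with a multi-gigawatt buildout, though the precision mismatch means the “billion times” line is best read as the rhetorical contrast it is, not a like-for-like benchmark.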
Significant Architectural Upgrade for OpenAI
NVIDIA’s Vera Rubin family and the new NVL144 CPX systems represent a deliberate architectural pivot toward ultra-dense, rack-scale platforms optimized for very long context windows, multimodal workloads, and generative video. NVIDIA’s Rubin CPX announcement frames the platform around “million-token” inference and prefill use cases, emphasizing very large on-rack memory and extreme cross-rack bandwidth. The Vera Rubin NVL144 CPX rack is advertised as delivering roughly 8 exaflops of NVFP4 AI compute, around 100 TB of high-speed memory, and on the order of 1.7 PB/s of aggregate memory bandwidth, with an overall performance uplift NVIDIA puts at about 7.5× versus the prior GB300 NVL72 systems.
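Taking the advertised rack-level figures at face value, a quick derived sketch of the compute-to-bandwidth balance helps explain why the platform is pitched at long-context inference. The input numbers come from the paragraph above; the ratios are a standard roofline-style heuristic of my own derivation, not figures NVIDIA publishes in this form.

```python
# Derived roofline-style ratios from the advertised NVL144 CPX rack figures.
# The inputs are the rack-level numbers quoted above; the ratios are my own
# derivation, not published specifications.

RACK_COMPUTE_FLOPS = 8e18      # ~8 exaflops of NVFP4 compute per rack
RACK_MEMORY_BYTES = 100e12     # ~100 TB of high-speed memory per rack
RACK_BANDWIDTH_BPS = 1.7e15    # ~1.7 PB/s of aggregate memory bandwidth

# Arithmetic-intensity break-even: FLOPs that must be performed per byte moved
# before the rack stops being memory-bandwidth-bound (classic roofline heuristic).
flops_per_byte_moved = RACK_COMPUTE_FLOPS / RACK_BANDWIDTH_BPS      # ~4,700
flops_per_resident_byte = RACK_COMPUTE_FLOPS / RACK_MEMORY_BYTES    # ~80,000

print(f"Break-even arithmetic intensity: ~{flops_per_byte_moved:,.0f} FLOPs per byte moved")
print(f"Compute per byte of on-rack memory: ~{flops_per_resident_byte:,.0f} FLOPs per resident byte")
```

Decode-phase token generation typically sits well below an arithmetic intensity of several thousand FLOPs per byte, which is consistent with the platform’s emphasis on aggregate memory bandwidth and with the disaggregated prefill/decode design described next.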
Technically, the Rubin family also introduces a disaggregated approach to long-context inference. NVIDIA is shipping both a compute-optimized Rubin CPX part (reported at roughly 30 PFLOPS of NVFP4 with 128 GB of GDDR7 per socket in some configurations) and higher-bandwidth Rubin GPUs with larger HBM configurations, so the different phases of inference (prefill/context versus generate/decode) can be mapped to the most appropriate silicon. This split, together with next-generation NVLink/NVSwitch fabrics, a new NVLink-144 switch, and the enhanced silicon photonics and NICs referenced in platform materials, is intended to deliver much higher effective context lengths and throughput when combined with software optimizations.
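To make the disaggregation concrete, here is a minimal, purely illustrative sketch of phase-aware routing. The pool names, request structure, and scheduler are hypothetical and are not drawn from NVIDIA’s or OpenAI’s actual software stacks; the only thing the sketch encodes is the mapping of compute-bound prefill and bandwidth-bound decode to different hardware pools.

```python
# Illustrative sketch of disaggregated long-context inference: route compute-heavy
# prefill (context ingestion) and bandwidth-heavy decode (token-by-token generation)
# to different hardware pools. All names here are hypothetical.

from dataclasses import dataclass
from enum import Enum, auto


class Phase(Enum):
    PREFILL = auto()   # process the full prompt/context once; compute-bound
    DECODE = auto()    # generate tokens one at a time; memory-bandwidth-bound


@dataclass
class InferenceRequest:
    request_id: str
    context_tokens: int
    phase: Phase


def route(request: InferenceRequest) -> str:
    """Map each inference phase to the pool best suited to its bottleneck."""
    if request.phase is Phase.PREFILL:
        # Long-context prefill benefits from dense compute (compute-optimized parts).
        return "compute_optimized_pool"
    # Decode reuses the cached context and streams weights/KV-cache,
    # so high-bandwidth memory matters more than peak FLOPS.
    return "high_bandwidth_pool"


if __name__ == "__main__":
    requests = [
        InferenceRequest("r1", context_tokens=1_000_000, phase=Phase.PREFILL),
        InferenceRequest("r1", context_tokens=1_000_000, phase=Phase.DECODE),
    ]
    for r in requests:
        print(r.request_id, r.phase.name, "->", route(r))
```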
For OpenAI, the practical implication is hardware + software co-design at scale: Rubin-era GPUs, NVL144 NVLink/NVSwitch fabrics and accelerated networking will be tuned to OpenAI’s training and inference pipelines, and CUDA/SDK roadmap alignment should ease integration and performance tuning. Public coverage and NVIDIA materials explicitly call out co-optimization and rack-scale system designs intended for million-token workloads.
There are operational and deployment consequences worth flagging. NVIDIA’s public timetable positions the NVL144 CPX racks for production shipments in late 2026, which aligns with the 1-GW first-stage timing announced in the NVIDIA–OpenAI LOI. Scaling to the capacity OpenAI and NVIDIA describe will almost certainly require distributed deployment across multiple campuses and providers: independent reporting and systems-level analyses emphasize the platform’s rack-scale nature and the practical limits on power, cooling, and site procurement that make a single-campus 10-GW buildout unlikely and operationally risky. The “distributed campuses” point should be read as an informed inference from platform design and the industry’s power and siting realities, not as a detail taken from the announcement or a definitive filing.
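One way to see why a single-campus 10-GW buildout strains credulity is a simple power-budget sketch. The per-rack power draw and facility overhead below are assumptions on my part (neither is confirmed in the materials cited here), and it is also unclear whether the announced gigawatt figures refer to IT load or total facility power.

```python
# Back-of-envelope: how many rack-scale systems fit into 1 GW and 10 GW of capacity.
# ASSUMPTIONS (not from the OpenAI–NVIDIA announcement):
#   - ~150 kW of IT power per NVL144 CPX-class rack (a guess for this class of system).
#   - A 1.3x facility overhead factor for cooling, power conversion, and networking.
#   - The announced "GW" figures are treated here as total facility power.

RACK_IT_POWER_W = 150e3        # assumed per-rack IT draw, in watts
FACILITY_OVERHEAD = 1.3        # assumed PUE-style multiplier


def racks_for(capacity_gw: float) -> float:
    """Racks supportable at a given facility capacity under the assumptions above."""
    usable_it_power_w = capacity_gw * 1e9 / FACILITY_OVERHEAD
    return usable_it_power_w / RACK_IT_POWER_W


for gw in (1, 10):
    print(f"{gw} GW -> roughly {racks_for(gw):,.0f} racks under these assumptions")
```

Even under these generous round numbers, the program implies tens of thousands of racks plus matching substation, cooling, and land capacity, which is why multi-site deployment is the natural reading.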