The Rise of AI Factories: Transforming Intelligence at Scale

AI Factories Redefine Infrastructure

The architecture of AI factories reflects a paradigm shift that mirrors the evolution of the industrial age itself—from manual processes to automation, and now to autonomous intelligence. Nvidia’s framing of these systems as “factories” isn’t just branding; it’s a conceptual leap that positions AI infrastructure as the new production line. GPUs are the engines, data is the raw material, and the output isn’t a physical product, but predictive power at unprecedented scale. In this vision, compute capacity becomes a strategic asset, and the ability to iterate faster on AI models becomes a competitive differentiator, not just a technical milestone.

This evolution also introduces a new calculus for data center investment. The cost-per-token of inference—how efficiently a system can produce usable AI output—emerges as a critical KPI, replacing traditional metrics like PUE or rack density as primary indicators of performance. That changes the game for developers, operators, and regulators alike. Just as cloud computing shifted the industry’s center of gravity over the past decade, the rise of AI factories is likely to redraw the map again—favoring locations with not only robust power and cooling, but with access to clean energy, proximity to data-rich ecosystems, and incentives that align with national digital strategies.

The Economics of AI: Scaling Laws and Compute Demand

Understanding the AI factory model starts with the scaling laws that govern AI economics.

Initially, AI development centered on pretraining large models, which requires massive amounts of compute, expert labor, and curated data. Over the past five years, pretraining compute needs have increased by a factor of 50 million. Once a foundation model is trained, however, its downstream uses multiply rapidly, and running a fully trained model for standard inference takes far less compute than training or fine-tuning it.

The challenge then shifts to post-training scaling and test-time scaling. By Nvidia's estimate, fine-tuning models to suit specific applications can demand 30x more compute than the original pretraining. Meanwhile, advanced inference tasks such as agentic AI, where models reason iteratively before responding, can require 100x more compute than standard inference. Demands of this size exceed what general-purpose data centers were designed to handle.
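The growth figures above can be turned into rough arithmetic. The sketch below uses only the multipliers quoted in the text (50 million over five years, 30x, 100x). The function names and the normalised baselines are hypothetical, and treating the five-year growth as a constant yearly rate is a simplification.

```python
# Rough arithmetic on the multipliers quoted in the text; names are illustrative.

PRETRAIN_GROWTH_5Y = 50_000_000   # 50-million-fold growth over five years (from the text)
POST_TRAINING_MULTIPLIER = 30     # post-training vs. pretraining (from the text)
AGENTIC_MULTIPLIER = 100          # agentic vs. standard inference (from the text)


def implied_annual_growth(total_growth: float, years: int) -> float:
    """Constant yearly factor that compounds to total_growth over `years`."""
    return total_growth ** (1.0 / years)


def scaled_compute(baseline: float, multiplier: float) -> float:
    """Compute demand after applying a multiplier to a baseline (any unit)."""
    return baseline * multiplier


annual = implied_annual_growth(PRETRAIN_GROWTH_5Y, 5)
print(f"Implied pretraining growth: about {annual:.1f}x per year")

# Hypothetical baselines, normalised to 1 unit each.
print("Post-training demand:", scaled_compute(1.0, POST_TRAINING_MULTIPLIER), "units")
print("Agentic inference demand:", scaled_compute(1.0, AGENTIC_MULTIPLIER), "units")
```

Under the constant-rate assumption, a 50-million-fold increase over five years works out to compute needs growing by a factor in the mid-thirties every year.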

AI factories are designed with this exponential growth in mind. From the ground up, they are built to support massive inference demands, iterative reasoning, and adaptive model deployment.

AI’s New Cost Curve

This shift in workload dynamics rewrites the economic blueprint for infrastructure investment. Where once the ROI of data center capacity was measured against steady-state cloud or enterprise workloads, AI factories demand a forward-looking calculus based on scaling behavior and future inference velocity.

The cost per token or decision point becomes a more meaningful financial metric than simple cost per kWh or per-core performance. Operators must not only provision for peak demand but architect systems flexible enough to evolve with model complexity—supporting seamless upgrades in compute density, interconnect bandwidth, and software orchestration.
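To make the cost-per-token metric concrete, here is a minimal sketch of how an operator might estimate it. Every input value, the `InferenceSystem` structure, and the cost model itself (amortized capital plus energy, divided by tokens served) are assumptions for illustration; none of the figures come from Nvidia or from this article.

```python
from dataclasses import dataclass


@dataclass
class InferenceSystem:
    """Hypothetical inputs; none of these figures come from the article."""
    capex_usd: float            # purchase price of the system
    amortization_years: float   # period over which capex is written off
    power_kw: float             # average electrical draw, including cooling
    energy_usd_per_kwh: float   # blended electricity price
    tokens_per_second: float    # sustained output throughput while serving
    utilization: float          # fraction of time spent serving traffic (0-1)


HOURS_PER_YEAR = 24 * 365


def cost_per_million_tokens(s: InferenceSystem) -> float:
    """Yearly capital plus energy cost, divided by yearly token output.

    Simplification: power draw is treated as constant, even when idle.
    """
    capital_per_year = s.capex_usd / s.amortization_years
    energy_per_year = s.power_kw * HOURS_PER_YEAR * s.energy_usd_per_kwh
    tokens_per_year = s.tokens_per_second * 3600 * HOURS_PER_YEAR * s.utilization
    return (capital_per_year + energy_per_year) / tokens_per_year * 1_000_000


baseline = InferenceSystem(3_000_000, 4, 120, 0.08, 500_000, 0.6)
faster = InferenceSystem(3_000_000, 4, 120, 0.08, 1_000_000, 0.6)
print(f"baseline: ${cost_per_million_tokens(baseline):.4f} per million tokens")
print(f"2x throughput: ${cost_per_million_tokens(faster):.4f} per million tokens")
```

In this simple model, doubling throughput at the same capital and energy cost halves the cost per token, which is why throughput gains weigh so heavily in the investment case.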

Moreover, these economics aren’t confined to hyperscale players alone. Enterprises deploying vertical-specific models—whether for fraud detection, supply chain optimization, or autonomous control systems—are increasingly recognizing that the benefits of faster, smarter AI decisions justify the infrastructure premium.

This drives demand for regional and modular AI factories tailored to industry use cases, where latency, data locality, and compliance matter as much as raw compute. As with previous inflections in the digital economy, those who internalize and invest early in the new cost curves will be best positioned to lead in a world where intelligence itself is the product.

AI Factory Development Around the World

Nvidia is not alone in recognizing the strategic importance of AI factories. Governments and enterprises across the globe are racing to deploy them:

India: Through a high-profile partnership with NVIDIA, Yotta Data Services has launched the Shakti Cloud Platform—one of the country’s first AI supercomputing infrastructures. Positioned as a national resource, Shakti aims to democratize access to high-performance GPU resources for startups, research institutions, and public sector innovation, reflecting India’s broader ambition to become a global AI hub.

Japan: Cloud providers like GMO Internet and KDDI are rapidly scaling NVIDIA-powered AI infrastructure to accelerate advancements in robotics, precision medicine, and smart cities. These efforts align with Japan’s Society 5.0 vision, which emphasizes the fusion of cyber and physical systems to tackle demographic and economic challenges through AI and automation.

Europe: The European Union is taking a coordinated, multi-national approach to AI factory development, investing in seven advanced computing centers across 17 member states via the High Performance Computing Joint Undertaking (EuroHPC JU). These sites are being positioned not just as data centers but as digital sovereignty assets—powering AI research, public sector applications, and secure industrial innovation.

Norway: Telenor’s NVIDIA-powered AI factory exemplifies how Nordic countries are integrating sustainability into digital transformation. With a strong emphasis on green energy, regional talent development, and cross-border collaboration, the initiative is laying a foundation for climate-conscious AI infrastructure that aligns with European ESG priorities.

United States: AI factory development is taking a dual-track approach. Public-private initiatives like the Stargate project—focused on frontier-scale computing—and executive directives from the White House underscore Washington’s intent to lead in both commercial and governmental AI capabilities. The U.S. sees AI infrastructure not just as a competitive edge but as a strategic imperative for national resilience.

Saudi Arabia: Through its Vision 2030 strategy, the Kingdom is investing heavily in AI infrastructure, including a partnership between the Saudi Data and Artificial Intelligence Authority (SDAIA) and global hyperscalers. Recent announcements include the creation of sovereign AI compute clusters designed to support Arabic-language models and AI-driven public services.

Singapore: Known for its methodical approach to digital infrastructure, Singapore is building out AI factories as part of its National AI Strategy 2.0. With investments in sovereign compute capabilities and robust data governance, the city-state is positioning itself as Southeast Asia’s AI nerve center—balancing innovation with regulatory foresight.

These projects highlight how AI factories are quickly becoming essential national infrastructure, akin to telecommunications and energy grids. More than just data centers, they represent strategic bets on where intelligence will be created, who controls its production, and how nations will compete in an AI-first global economy.

Inside the AI Factory: A Full-Stack Approach to Intelligence Production

Nvidia’s AI factory model isn’t just a high-powered compute stack—it’s a vertically integrated platform purpose-built to accelerate every stage of the AI lifecycle. From training foundational models to deploying them at scale in real-time applications, the architecture spans compute, networking, software, data pipelines, and digital twin simulation. Each layer is engineered for high-efficiency throughput, reflecting Nvidia’s belief that intelligence production requires the same rigor and precision as modern manufacturing.

1. Compute Performance: The Engine Room of Intelligence

At the core of the AI factory is GPU horsepower. Nvidia’s Hopper, Blackwell, and the forthcoming Blackwell Ultra architectures each deliver large generational gains in performance. The flagship GB200 NVL72 system, a rack-scale unit that links 72 Blackwell GPUs and 36 Grace CPUs through NVLink Switch, delivers 50x more AI inference throughput compared to the A100 generation. Integrated into DGX SuperPOD clusters, these systems can scale to tens of thousands of nodes, forming the compute backbone for hyperscale AI development.

DGX Cloud extends these capabilities into a managed, consumption-based model, allowing enterprises to access AI factory-grade infrastructure through major cloud platforms like Microsoft Azure, Google Cloud, and Oracle. It’s an operating model built for rapid deployment and elastic growth.

2. High-Performance Networking: Compute Without Bottlenecks

Scaling AI requires more than raw compute; it demands precision networking. Nvidia’s NVLink, Quantum-2 InfiniBand, and Spectrum-X Ethernet fabrics are designed to minimize latency and ensure lossless, high-bandwidth data movement between tens of thousands of GPUs. ConnectX-8 SuperNICs and BlueField-3 DPUs enable secure, multi-tenant environments while offloading network and storage tasks to free up GPU cycles. The result is a tightly coupled infrastructure where compute and data flow at AI-native speeds.

3. Orchestration and Operational Intelligence

Orchestrating AI workloads at scale is non-trivial. Tools like Nvidia Run:ai, Base Command, and Mission Control provide full-stack visibility and GPU-aware scheduling, ensuring optimal utilization across heterogeneous environments. These platforms support multi-tenant operations, dynamic scaling, and fine-grained workload isolation—critical in enterprise and sovereign AI environments where uptime and performance cannot be compromised.
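As a rough illustration of what GPU-aware scheduling means, the toy placement routine below assigns jobs to nodes by free GPU count using a best-fit rule. It is not the Run:ai, Base Command, or Mission Control API; the function, node names, and job names are all hypothetical, and production schedulers also weigh topology, priorities, and preemption.

```python
from typing import Dict, List, Optional, Tuple


def place_jobs(
    free_gpus: Dict[str, int],
    jobs: List[Tuple[str, int]],
) -> Dict[str, Optional[str]]:
    """Best-fit placement: each job goes to the node that leaves the fewest idle GPUs.

    free_gpus maps node name -> free GPU count; jobs is (job name, GPUs needed).
    Jobs that fit nowhere map to None (they would wait in a queue).
    """
    remaining = dict(free_gpus)  # do not mutate the caller's view
    placement: Dict[str, Optional[str]] = {}
    # Place the largest jobs first so small jobs do not fragment big nodes.
    for name, need in sorted(jobs, key=lambda j: j[1], reverse=True):
        candidates = [n for n, free in remaining.items() if free >= need]
        if not candidates:
            placement[name] = None
            continue
        # Tightest fit wins; the node name breaks ties deterministically.
        best = min(candidates, key=lambda n: (remaining[n] - need, n))
        remaining[best] -= need
        placement[name] = best
    return placement


nodes = {"node-a": 8, "node-b": 4, "node-c": 2}
queue = [("finetune", 8), ("eval", 2), ("serve", 4), ("notebook", 1)]
print(place_jobs(nodes, queue))
```

Here the three largest jobs fill the three nodes exactly, and the one-GPU notebook job is left unplaced until capacity frees up.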

4. Inference Stack: From Model to Real-Time Decisions

The Nvidia inference stack—including TensorRT for optimized execution, NVIDIA Inference Microservices (NIMs) for containerized deployment, and NVIDIA Triton for scalable serving—enables low-latency, high-throughput AI services. These tools are optimized for transformer architectures and multimodal models, addressing the growing demand for agentic inference, edge reasoning, and continuous learning in production.

5. Data Infrastructure: Feeding the Intelligence Pipeline

AI performance is bound by the quality and availability of data. The Nvidia AI Data Platform enables seamless integration with modern data lakes, object stores, and streaming platforms. It provides end-to-end support for preprocessing, labeling, and versioning at scale—turning chaotic data pipelines into repeatable, high-performance processes. Certified storage partners (like NetApp, Dell, and VAST Data) ensure that storage throughput can keep pace with real-time inference and training demands.

6. Omniverse Blueprint: Digital Twin-Driven Infrastructure Planning

Designing an AI factory involves massive complexity: up to 5 billion components, 210,000 miles of cabling, and gigawatt-scale power demands. Nvidia’s Omniverse Blueprint introduces a systems-level digital twin to simulate, validate, and optimize AI factory builds before breaking ground. This includes everything from airflow and thermals to rack placement and interconnect design.

By enabling real-time collaboration across electrical, mechanical, and IT disciplines, Omniverse reduces time-to-deployment and mitigates critical risk. In environments where downtime can cost millions of dollars a day in lost inference capacity, this level of planning precision is no longer optional; it is a necessity.

AI factories represent more than just technical innovation—they are a new class of infrastructure, purpose-built for the intelligence economy. Nvidia’s full-stack platform provides the modularity, scalability, and performance required to manufacture intelligence at scale, redefining how enterprises and nations deploy AI as a core strategic asset.

Deep Dive on Omniverse Developments: Advancing AI Factory Design and Simulation

As AI continues to drive unprecedented demand for specialized infrastructure, NVIDIA is taking bold steps to help design and optimize the next generation of AI factories with its new Omniverse Blueprint for AI factory design and operations. Unveiled during NVIDIA’s GTC keynote, this innovative blueprint is designed to help engineers simulate, plan, and optimize the development of gigawatt-scale AI factories, which require the seamless integration of billions of components and complex systems.

In collaboration with leading simulation and infrastructure partners, including Cadence, ETAP, Schneider Electric, and Vertiv, the Omniverse Blueprint enables digital twin technology to support the design, testing, and optimization of AI factory components such as power, cooling, and networking systems long before physical construction begins.

Engineering AI Factories: A Simulation-First Approach

Using OpenUSD libraries, NVIDIA’s Omniverse Blueprint aggregates 3D data from multiple sources, including building layouts, accelerated computing systems like NVIDIA DGX SuperPODs, and power/cooling units from partners such as Schneider Electric and Vertiv. This unified approach allows engineers to address key challenges in AI factory development, such as:

Component Integration and Space Optimization: Seamlessly integrating NVIDIA systems with billions of components for optimal layouts.
Cooling Efficiency: Using the Cadence Reality Digital Twin Platform to simulate and evaluate cooling solutions, from hybrid air to liquid cooling.
Power Distribution: Designing scalable, redundant systems to simulate and optimize power reliability using ETAP.
Networking Topology: Fine-tuning high-bandwidth networking infrastructure with NVIDIA Spectrum-X and NVIDIA Air.

The blueprint empowers engineers to collaborate in real-time across disciplines, reducing inefficiencies and enabling parallel workflows. Real-time simulations allow for faster decision-making and optimization, with teams able to adjust configurations and immediately see the impact — drastically reducing design time and avoiding costly mistakes during construction.
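The idea of adjusting a configuration and immediately seeing the impact can be sketched with a toy "what-if" check. This is not the Omniverse, Cadence, or ETAP tooling, which run physics-based simulations; the `RowDesign` structure, the two pass/fail rules, and every number below are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class RowDesign:
    """One row of racks in a hypothetical layout; all values are made up."""
    racks: int
    kw_per_rack: float
    cooling_capacity_kw: float   # heat the row's cooling loop can remove
    power_feeds: int             # independent feeds serving the row
    kw_per_feed: float


def evaluate(row: RowDesign) -> dict:
    """Check a row against two simple rules: thermal headroom and N+1 power."""
    load_kw = row.racks * row.kw_per_rack
    # N+1: the row must stay powered with any single feed out of service.
    surviving_kw = (row.power_feeds - 1) * row.kw_per_feed
    return {
        "load_kw": load_kw,
        "cooling_ok": load_kw <= row.cooling_capacity_kw,
        "power_n_plus_1_ok": load_kw <= surviving_kw,
    }


before = RowDesign(racks=10, kw_per_rack=120, cooling_capacity_kw=1300,
                   power_feeds=3, kw_per_feed=500)
after = RowDesign(racks=10, kw_per_rack=120, cooling_capacity_kw=1300,
                  power_feeds=4, kw_per_feed=500)
print("before:", evaluate(before))
print("after: ", evaluate(after))
```

In this made-up case the original row passes the cooling check but fails N+1 power, and adding a fourth feed resolves it, the kind of finding that is far cheaper to make in a model than on a construction site.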

Building Resilience Into the AI Frontier

As AI workloads continue to evolve, the blueprint offers advanced features such as workload-aware simulations and failure scenario testing to ensure AI factories can scale and adapt to future demands. With the growing importance of minimizing downtime (which can cost millions per day in gigawatt-scale AI factories), the Omniverse Blueprint reduces risk, improves efficiency, and helps AI factory operators stay ahead of infrastructure needs.

NVIDIA’s ongoing efforts with partners like Vertech and Phaidra will bring AI-enabled operations into the fold, including reinforcement-learning agents that optimize energy efficiency and system stability. These advancements ensure that AI factories can adapt to changing hardware and environmental conditions in real-time, contributing to ongoing operational resilience.

The integration of digital twin technology into AI factory design is not just a theoretical enhancement—it’s essential for the future of AI-driven data centers. With over $1 trillion projected for AI-related upgrades, NVIDIA’s Omniverse Blueprint stands poised to lead this transformation, helping AI factory operators navigate the complexities of AI workloads while minimizing risk and maximizing efficiency.

To explore these developments further, watch the GTC keynote, and discover how NVIDIA and its partners are shaping the future of AI factory infrastructure.

The Age of Reasoning and Agentic AI

Nvidia defines its Blackwell Ultra platform not just as another leap in GPU performance, but as the gateway to a new phase in AI development—what it calls the age of reasoning. As workloads transition from static inference to dynamic decision-making, AI systems must increasingly mimic human-like cognition: analyzing context, planning multistep actions, and adapting behavior in real time. This shift is giving rise to two transformative paradigms—agentic AI and physical AI—both of which are redefining the infrastructure requirements for scalable intelligence.

Agentic AI involves AI models that operate autonomously to solve complex, multistep problems. These models reason iteratively, self-correct, and manage workflows across multiple domains. They’re already emerging in tools like AutoGPT, Devin, and AI copilots that can write code, generate research plans, or manage enterprise workflows. Unlike traditional inference, agentic AI requires continual interaction with large-scale memory, context retrieval, and recursive reasoning—all of which drive up compute needs by orders of magnitude.
Physical AI focuses on embodied intelligence—where simulation, sensor fusion, and real-world control intersect. Applications include real-time photorealistic simulation for digital twins, robotics, autonomous vehicles, and industrial automation. These workloads demand ultra-low latency and tight coupling between simulation and inference engines.
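A simple token count shows why iterative reasoning drives up compute. The sketch below compares one standard request with an agentic loop that feeds its own output back into the context at every step. The token counts and step count are hypothetical, and the model assumes the full context is re-read at each step with no caching, so it overstates the cost relative to optimized serving stacks.

```python
def single_pass_tokens(prompt_tokens: int, output_tokens: int) -> int:
    """Tokens processed by one standard request: read the prompt, write the answer."""
    return prompt_tokens + output_tokens


def agentic_tokens(prompt_tokens: int, output_tokens: int, steps: int) -> int:
    """Tokens processed when the model loops `steps` times.

    Simplification: each step re-reads the whole context so far (no caching)
    and appends `output_tokens` of new reasoning to it.
    """
    context = prompt_tokens
    total = 0
    for _ in range(steps):
        total += context + output_tokens   # read context, generate this step
        context += output_tokens           # the step's output joins the context
    return total


base = single_pass_tokens(1_000, 500)
looped = agentic_tokens(1_000, 500, steps=20)
print(f"single pass: {base} tokens, 20-step loop: {looped} tokens, "
      f"ratio {looped / base:.1f}x")
```

Because the context grows with every step, the total work rises faster than the number of steps, which is one intuition for why agentic workloads are so much heavier than single-shot inference.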

Blackwell Ultra is engineered for this new class of demands. It enables AI factories to scale compute across the full lifecycle—from massive pretraining runs to highly variable post-training tasks, including fine-tuning, retraining, and real-time inference. Crucially, Nvidia’s Dynamo software stack coordinates these large-scale operations, orchestrating token generation and communication across thousands of GPUs with efficiency that keeps latency low and throughput high.

In this new era, compute isn’t just about speed—it’s about intelligence per watt, adaptability per dollar, and the ability to support inference that behaves less like static prediction and more like dynamic reasoning. Blackwell Ultra and its supporting ecosystem are designed to meet that challenge head-on, reshaping not only how AI runs, but what it can become.

Oracle and NVIDIA Team Up to Accelerate the AI Factory Model with Agentic AI Integration

At NVIDIA’s 2025 GTC conference, Oracle and NVIDIA unveiled a major step forward in the buildout of enterprise-scale AI infrastructure — a key component of the emerging “AI Factory” model. The companies announced a deep integration between Oracle Cloud Infrastructure (OCI) and the NVIDIA AI Enterprise software platform, aimed at accelerating the deployment of agentic AI — autonomous AI systems capable of reasoning, planning, and executing complex tasks.

This collaboration brings NVIDIA’s inference stack — including 160+ AI tools and more than 100 NIM™ (NVIDIA Inference Microservices) — natively into the OCI Console. Oracle customers can now tap into a fully integrated AI stack, available in Oracle’s cloud regions, sovereign clouds, on-premises via OCI Dedicated Region, and even at the edge.

“Oracle has become the platform of choice for both AI training and inferencing,” said Oracle CEO Safra Catz. “This partnership enhances our ability to help customers achieve greater innovation and business results.”

NVIDIA CEO Jensen Huang underscored the significance of the integration for enterprise AI: “Together, we help enterprises innovate with agentic AI to deliver amazing things for their customers and partners.”

No-Code Blueprints and Turnkey Inference

A key element of the Oracle-NVIDIA collaboration is the launch of no-code OCI AI Blueprints, which allow enterprises to deploy multimodal large language models, inference pipelines, and observability tools without managing infrastructure. These blueprints are optimized for NVIDIA GPUs and microservices, and can reduce the time-to-deployment from weeks to minutes.

NVIDIA is also contributing its own Blueprints to the OCI Marketplace, preloaded with workflows for enterprise use cases in customer service, simulation, and robotics. For example, Oracle plans to offer NVIDIA Omniverse and Isaac Sim tools on OCI, bundled with preconfigured NVIDIA L40S GPU instances for simulation and physical AI development.

Pipefy, a business process automation platform, is already deploying multimodal LLMs on OCI using these AI Blueprints. “Using these prepackaged and verified blueprints, deploying our AI models on OCI is now fully automated and significantly faster,” said Gabriel Custódio, principal software engineer at Pipefy.

Enabling Real-Time Inference and Vector Search

Oracle is also integrating NVIDIA NIM microservices into OCI Data Science, enabling real-time inference with a pay-as-you-go model. These microservices can be deployed within a customer’s OCI tenancy for AI use cases ranging from copilots to recommendation engines, delivering rapid time-to-value while maintaining data security and compliance.

In the AI data stack, Oracle Database 23ai now supports accelerated vector search powered by NVIDIA GPUs and the cuVS library — enabling fast creation of vector embeddings and indexes for massive datasets. Companies like DeweyVision, which provides AI-driven media cataloging and search tools, are using this integration to ingest, search, and manage high volumes of video content efficiently.
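For readers unfamiliar with vector search, the core operation is ranking stored embeddings by similarity to a query embedding. The sketch below does this by brute force with cosine similarity in plain Python. It does not use Oracle Database 23ai or the cuVS library; GPU-accelerated systems typically build approximate indexes to avoid scanning every vector. The clip names and three-dimensional vectors are invented, and vectors are assumed to be non-zero.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(
    query: Sequence[float],
    index: Dict[str, Sequence[float]],
    k: int,
) -> List[Tuple[str, float]]:
    """Score every stored vector against the query and return the k best."""
    scored = [(key, cosine_similarity(query, vec)) for key, vec in index.items()]
    scored.sort(key=lambda kv: kv[1], reverse=True)
    return scored[:k]


# Hypothetical embeddings for three video clips.
index = {
    "clip-001": [0.9, 0.1, 0.0],
    "clip-002": [0.1, 0.9, 0.2],
    "clip-003": [0.8, 0.2, 0.1],
}
print(top_k([1.0, 0.0, 0.0], index, k=2))
```

The brute-force scan costs one similarity computation per stored vector per query, which is the work that GPU acceleration and indexing aim to cut down on massive datasets.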

“Oracle Database 23ai with AI Vector Search can significantly increase Dewey’s search performance while increasing the scalability of the DeweyVision platform,” said CEO Majid Bemanian.

Blackwell-Powered Superclusters Signal the AI Factory Future

Perhaps most notably, Oracle is among the first cloud providers to roll out NVIDIA’s latest generation Blackwell Ultra GPUs across its AI Supercluster. The NVIDIA GB300 NVL72 and HGX B300 NVL16 platforms — successors to last year’s GB200 — promise up to 1.5x performance gains and are designed for large-scale AI factories spanning tens of thousands of GPUs. Oracle’s Supercluster deployments will soon support up to 131,072 GPUs, connected by NVIDIA’s Quantum-2 InfiniBand and NVLink fabrics.
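To give a sense of physical scale, the following back-of-the-envelope sketch estimates how many 72-GPU racks a 131,072-GPU cluster would need. It assumes the whole cluster is built from NVL72-style racks, which the announcement does not state, so the result is illustrative only.

```python
import math

GPUS_PER_NVL72_RACK = 72     # the "72" in the NVL72 product name
TARGET_GPUS = 131_072        # cluster size quoted in the text


def racks_needed(total_gpus: int, gpus_per_rack: int) -> int:
    """Smallest whole number of racks that holds total_gpus."""
    return math.ceil(total_gpus / gpus_per_rack)


# Assumption: every GPU sits in an NVL72-style rack (not stated in the source).
racks = racks_needed(TARGET_GPUS, GPUS_PER_NVL72_RACK)
spare = racks * GPUS_PER_NVL72_RACK - TARGET_GPUS
print(f"{racks} racks, {spare} GPU slots beyond the target")
```

Under that assumption the cluster spans more than 1,800 racks, which helps explain why networking fabric and facility design feature so prominently in the AI factory discussion.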

Companies like Soley Therapeutics and SoundHound AI are already leveraging this full-stack Oracle-NVIDIA platform to train next-generation models for drug discovery and voice AI, respectively. “The combination of OCI and NVIDIA delivers a full-stack AI solution,” said Yerem Yeghiazarians, CEO of Soley Therapeutics. “It provides us the storage, compute, software tools, and support necessary to innovate faster with petabytes of data.”

As AI workloads continue to demand ever-larger compute clusters and sophisticated software integration, partnerships like Oracle and NVIDIA’s are laying the foundation for scalable, enterprise-ready AI factories — designed to push the limits of reasoning, automation, and insight.

Secure AI Factories: The Cisco-NVIDIA Collaboration

As AI infrastructure becomes a foundational layer of national and enterprise strategy, its security posture can no longer be an afterthought—it must be embedded from the silicon up. Cisco and NVIDIA have partnered to deliver exactly that with the Secure AI Factory: a full-stack architecture that merges scalable compute and high-performance networking with zero-trust security principles and AI-native threat protection.

The collaboration tightly integrates Cisco’s security and networking stack—including Hypershield, AI Defense, and hybrid mesh firewalls—with NVIDIA’s BlueField-3 DPUs and AI Enterprise platform. The result is a unified framework that provides policy enforcement, observability, and real-time threat detection across every layer of the AI stack.

Hypershield applies adaptive segmentation and micro-isolation, using AI to identify and quarantine threats across east-west traffic inside data centers.
AI Defense leverages behavior-based analysis to protect against AI-specific risks such as prompt injection, model hijacking, adversarial inputs, and data leakage during runtime.
BlueField-3 DPUs offload security and network processing from host CPUs, enabling wire-speed telemetry, access control, and cryptographic operations without impacting AI performance.
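The zero-trust idea behind segmentation can be sketched as a default-deny check on east-west traffic: a flow is permitted only if it is explicitly allowlisted and neither endpoint is quarantined. This is a toy model, not Cisco Hypershield or the BlueField-3 programming interface; the segment names, ports, and policy structure are hypothetical.

```python
from typing import NamedTuple, Set, Tuple


class Flow(NamedTuple):
    src_segment: str
    dst_segment: str
    port: int


# Hypothetical allowlist: (source segment, destination segment, port).
ALLOWED: Set[Tuple[str, str, int]] = {
    ("inference-gateway", "model-servers", 8000),
    ("model-servers", "vector-store", 5432),
}


def is_allowed(flow: Flow, quarantined: Set[str]) -> bool:
    """Default deny: permit only allowlisted flows between healthy segments."""
    if flow.src_segment in quarantined or flow.dst_segment in quarantined:
        return False
    return (flow.src_segment, flow.dst_segment, flow.port) in ALLOWED


normal = Flow("inference-gateway", "model-servers", 8000)
lateral = Flow("inference-gateway", "vector-store", 5432)

print(is_allowed(normal, quarantined=set()))               # allowlisted flow
print(is_allowed(lateral, quarantined=set()))              # not on the allowlist
print(is_allowed(normal, quarantined={"model-servers"}))   # endpoint quarantined
```

Anything not named in the allowlist is dropped, so a compromised gateway cannot reach the vector store directly, and quarantining a segment cuts it off without rewriting the rest of the policy.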

This joint platform supports on-premises deployments through Cisco UCS AI servers and Nexus switches, or cloud and hybrid deployments using validated reference architectures optimized for AI factories. Security scales automatically with workload changes—eliminating blind spots in dynamic, multi-tenant environments where AI models evolve in real time.

By embedding security into every node, packet, and process, Cisco and NVIDIA are enabling enterprises to move fast without sacrificing control. In an era where AI models make mission-critical decisions and process sensitive data, the Secure AI Factory ensures that trust is not just assumed—it’s architected.

Chuck Robbins, Chair and CEO, Cisco, said:


Equinor Canada Ltd. signed a heads of agreement (HoA) with BW Offshore, confirming its selection as preferred bidder for the floating production, storage and offloading (FPSO) unit for the Bay du Nord deepwater oil project offshore Newfoundland and Labrador, Canada. Equinor operates Bay du Nord, Canada’s first deepwater oil project, in partnership with bp plc. The project holds an estimated 400 million bbl of recoverable light crude in its initial phase. The oil discovery lies in the Jurassic reservoirs of the Flemish Pass basin, about 500 km east of St. John’s in 1,170 m of water. Later discoveries, and potential tie-ins, lie in adjacent exploration licence EL1156 (Cappahayden and Cambriol) in waters about 650 m deep. Development of the project was postponed in 2023 for up to 3 years due to “changing market conditions and subsequent high cost inflation,” according to Equinor. During that time, however, Equinor and bp have advanced work to actively mature the project toward future development. Under the newly signed HoA, Equinor and BW Offshore will continue to advance discussions on all technical and commercial aspects of the FPSO project. These include further maturation of design through front-end engineering design (FEED) work, and agreeing on a commercial solution. The FPSO will be tailored for the harsh environment of the sub-Arctic. The unit is expected to support production of up to 160,000 b/d of oil and will feature a disconnectable turret system and extensive winterization, BW Offshore said. The topside will include emission reduction initiatives such as high-efficiency power generation and heat recovery, variable speed drives and a closed flare system. The FPSO also will be designed for future tiebacks. Following pre-FEED completion mid-September, the two companies are expected to enter into a bridging phase to prepare for FEED in early 2026, subject to approvals by Equinor and bp.

Chevron takes over operatorship of block offshore Uruguay

Chevron Corp. has officially taken over operatorship of AREA OFF-1 block in Uruguay and 3D seismic acquisition on the block is expected late in this year’s fourth quarter. Handover of the South American block occurred in first-half 2025, partner Challenger Energy Group PLC said in a half-year report Sept. 3. In November 2024, Chevron completed a farm-in with Challenger to acquire a 60% interest in the offshore block, along with “various work streams necessary to prepare for 3D seismic acquisition,” Challenger’s chief executive officer Eytan Uliel told stakeholders. Uliel noted the start of seismic acquisition is still subject to finalization of permitting by the Uruguayan Ministry of Environment, “a process which is well advanced,” he said. In July, Challenger said the Ministry has consultations planned ahead of permit issuances, and that a final consultation was expected late that month. At the time, Challenger said it expected permits to be granted in August/September. Chevron will carry the full cost of the seismic campaign up to a total program cost of $37.5 million. The 14,557-sq km block lies about 100 km offshore in water depths of 80-1,000 m, and holds prospective inventory of about 2 billion bbl of recoverable resource (Pmean) through multiple prospects (Teru Teru, Anapero, Lenteja) in a range of play types, according to Challenger. Elsewhere in Uruguay, Challenger progressed work at the 13,000 sq km AREA OFF-3 block, substantially completing its planned technical work program in August. The primary geotechnical work focused on the licensing, reprocessing, and interpretation of a 1,250 sq km 3D seismic data set. Other subsurface studies addressed the geochemistry and further de-risked AREA OFF-3 exploration potential, the company said. The company began a formal farmout process for the block on Sept. 1.

Google adds Gemini to its on-prem cloud for increased data protection

Google has announced the general availability of its Gemini artificial intelligence models on Google Distributed Cloud (GDC), making its generative AI product available in enterprise and government data centers. GDC is an on-premises implementation of Google Cloud, aimed at heavily regulated industries like medical and financial services to bring Google Cloud services within company firewalls rather than the public cloud. The launch of Gemini on GDC allows organizations with strict data residency and compliance requirements to deploy generative AI without compromising control over sensitive information. GDC uses Nvidia Hopper and Blackwell-era GPU accelerators with automated load balancing and zero-touch updates for high availability. Security features include audit logging and access control capabilities that provide full transparency for customers. The platform also features Confidential Computing support for both CPUs (with Intel TDX) and GPUs (with Nvidia’s confidential computing) to secure sensitive data and prevent tampering or exfiltration.

Nvidia networking roadmap: Ethernet, InfiniBand, co-packaged optics will shape data center of the future

Nvidia is baking into its Spectrum-X Ethernet platform a suite of algorithms that can implement networking protocols to allow Spectrum-X switches, ConnectX-8 SuperNICs, and systems with Blackwell GPUs to connect over wider distances without requiring hardware changes. These Spectrum-XGS algorithms use real-time telemetry (tracking traffic patterns, latency, congestion levels, and inter-site distances) to adjust controls dynamically. Ethernet and InfiniBand Developing and building Ethernet technology is a key part of Nvidia’s roadmap. Since it first introduced Spectrum-X in 2023, the vendor has rapidly made Ethernet a core development effort. This is in addition to InfiniBand development, which is still Nvidia’s bread-and-butter connectivity offering. “InfiniBand was designed from the ground up for synchronous, high-performance computing, with features like RDMA to bypass CPU jitter, adaptive routing, and congestion control. It’s the gold standard for AI training at scale, connecting more than 270 of the world’s top supercomputers. Ethernet is catching up, but traditional Ethernet designs, built for telco, enterprise, or hyperscale cloud, aren’t optimized for AI’s unique demands,” Shainer said. Most industry analysts predict Ethernet deployment for AI networking in enterprise and hyperscale deployments will increase in the next year; that makes Ethernet advancements a core direction for Nvidia and any vendor looking to offer AI connectivity options to customers. “When we first initiated our coverage of AI back-end Networks in late 2023, the market was dominated by InfiniBand, holding over 80% share,” wrote Sameh Boujelbene, vice president of Dell’Oro Group, in a recent report. “Despite its dominance, we have consistently predicted that Ethernet would ultimately prevail at scale. What is notable, however, is the rapid pace at which Ethernet gained ground in AI back-end networks. As the industry moves to 800 Gbps and beyond, we believe Ethernet is now firmly positioned to overtake InfiniBand in these high-performance deployments.”

Inside the AI-optimized data center: Why next-gen infrastructure is non-negotiable

How are AI data centers different from traditional data centers? AI data centers and traditional data centers can be physically similar, as they contain hardware, servers, networking equipment, and storage systems. The difference lies in their capabilities: Traditional data centers were built to support general computing tasks, while AI data centers are specifically designed for more sophisticated, time- and resource-intensive workloads. Conventional data centers are simply not optimized for AI’s advanced tasks and necessary high-speed data transfer. Here’s a closer look at their differences: AI-optimized vs. traditional data centers Traditional data centers: Handle everyday computing needs such as web browsing, cloud services, email and enterprise app hosting, data storage and retrieval, and a variety of other relatively low-resource tasks. They can also support simpler AI applications, such as chatbots, that do not require intensive processing power or speed. AI data centers: Built to compute significant volumes of data and run complex algorithms, ML and AI tasks, including agentic AI workflows. They feature high-speed networking and low-latency interconnects for rapid scaling and data transfer to support AI apps and edge and internet of things (IoT) use cases. Physical infrastructure Traditional data centers: Typically composed of standard networking architectures such as CPUs suitable for handling networking, apps, and storage. AI data centers: Feature more advanced graphics processing units (GPUs) (popularized by chip manufacturer Nvidia), tensor processing units (TPUs) (developed by Google), and other specialized accelerators and equipment. Storage and data management Traditional data centers: Generally store data in more static cloud storage systems, databases, data lakes, and data lakehouses. AI data centers: Handle huge amounts of unstructured data including text, images, video, audio, and other files. They also incorporate high-performance tools including parallel file systems, multiple network servers, and NVMe solid state drives (SSDs). Power consumption Traditional data centers: Require robust cooling

From Cloud to Concrete: How Explosive Data Center Demand is Redefining Commercial Real Estate

The world will generate 181 ZB of data in 2025, an increase of 23.13% year over year, with 2.5 quintillion bytes (2.5 exabytes, EB) created daily, according to a report from Demandsage. To put that in perspective: One exabyte is equal to 1 quintillion bytes, which is 1,000,000,000,000,000,000 bytes. At 2.5 EB per day, that’s about 29 TB every second, or 2.5 million TB per day. It’s no wonder data centers have become so crucial for creating, consuming, and storing data, and no wonder investor interest has skyrocketed. The surging demand for secure, scalable, high-performance retail and wholesale colocation and hyperscale data centers is spurred by the relentless, global expansion of cloud computing and demand for AI as data generation from businesses, governments, and consumers continues to surge. Power access, sustainable infrastructure, and land acquisition have become critical factors shaping where and how data center facilities are built. As a result, investors increasingly view these facilities not just as technology assets, but as a unique convergence of real estate, utility infrastructure, and mission-critical systems. Capitalizing on this momentum, private equity and real estate investment firms are rapidly expanding into the sector through acquisitions, joint ventures, and new funds, targeting opportunities to build and operate facilities with a focus on energy efficiency and scalability.
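The per-second and per-day figures quoted above can be checked with a few lines of arithmetic. The sketch below assumes decimal (SI) units, where 1 TB is 10^12 bytes and 1 EB is 10^18 bytes; the function name `daily_rate` is illustrative only and does not come from the Demandsage report.

```python
# Decimal (SI) units are assumed: 1 TB = 10**12 bytes, 1 EB = 10**18 bytes.
BYTES_PER_TB = 10**12
BYTES_PER_EB = 10**18
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def daily_rate(exabytes_per_day: float) -> tuple[float, float]:
    """Return (TB per day, TB per second) for a daily volume given in exabytes."""
    bytes_per_day = exabytes_per_day * BYTES_PER_EB
    tb_per_day = bytes_per_day / BYTES_PER_TB
    return tb_per_day, tb_per_day / SECONDS_PER_DAY


# 2.5 EB per day is the daily figure quoted in the text.
tb_day, tb_sec = daily_rate(2.5)
print(f"{tb_day:,.0f} TB per day")    # 2,500,000 TB per day
print(f"{tb_sec:.1f} TB per second")  # 28.9 TB per second
```

Rounded to the nearest whole number, 28.9 TB per second matches the 29 TB figure in the text.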

Ai4 2025 Navigates Rapid Change in AI Policy, Education

The pace of innovation in artificial intelligence is fundamentally reshaping the landscape of education, and the changes are happening rapidly. At the forefront of this movement stand developers, policy makers, educational practitioners, and associated experts at the recent Ai4 2025 conference (Aug. 11-13) in Las Vegas, where leading voices such as Geoffrey Hinton, “The Godfather of AI,” top executives from Google and U.S. Bank, and representatives from multiple government agencies gathered to chart the future of AI development. Importantly, educators and academic institutions played a central role, ensuring that the approach to AI in schools is informed by those closest to the classroom. Key discussions at Ai4 and recent educator symposia underscored both the promise and peril of swift technological change. Generative AI, with its lightning-fast adoption since the advent of tools like ChatGPT, is opening new possibilities for personalized learning, skills development, and operational efficiency. But participants were quick to note that acceleration brings good and bad consequences. On one hand, there’s excitement about practical classroom implementations and the potential for students to engage with cutting-edge technology. On the other, concerns about governance, ethics, safety, and the depth of genuine learning remain at the forefront. This urgency to “do this right” is echoed by teachers, unions, and developers who are united by the challenges and opportunities on the ground. Their voices highlight the need for agreement on education policy and associated regulations to keep pace with technological progress, create frameworks for ethical and responsible use, and ensure that human agency remains central in shaping the future of childhood and learning. In this rapidly evolving environment, bringing all stakeholders to the table is no longer optional; it is essential for steering AI in education toward outcomes that benefit both students and society.

Two Lenses on One Market: JLL and CBRE Show Data Centers in a Pinch

The two dominant real estate research houses, JLL and CBRE, have released midyear snapshots of the North American data center market, and both paint the same picture in broad strokes: demand remains insatiable, vacancy has plunged to record lows, and the growth of AI and hyperscale deployments is reshaping every aspect of the business. But their lenses capture different angles of the same story: one emphasizing preleasing and capital flows, the other highlighting hyperscale requirements and regional shifts. Vacancy Falls Through the Floor JLL sets the stage with a stark headline: colocation vacancy is nearing 0%. The JLL Midyear 2025 North America Data Center report warns that this scarcity “is constraining economic growth and undermining national security,” underscoring the role of data centers as critical infrastructure. CBRE’s North American Data Center Trends H1 2025 numbers back this up, recording an all-time low North America vacancy rate of 1.6%, the tightest in more than a decade. Both agree that market loosening is years away — JLL projecting vacancy hovering around 2% through 2027, CBRE noting 74.3% of new capacity already spoken for. The takeaway seems clear: without preleasing, operators and tenants alike are effectively shut out of core markets. Absorption and Preleasing Drive Growth JLL drills down into the mechanics. With virtually all absorption tied to preleasing, the firm points to Northern Virginia (647 MW) and Dallas (575 MW) as the twin engines of growth in H1, joined by Chicago, Austin/San Antonio, and Atlanta. CBRE’s absorption math is slightly different, but the conclusion aligns: Northern Virginia again leads the nation, with 538.6 MW net absorption and a remarkable 80% surge in under-construction capacity. CBRE sharpens the view by noting that the fiercest competition is at the top end: single-tenant requirements of 10 MW or more are setting pricing records as hyperscalers

Microsoft will invest $80B in AI data centers in fiscal 2025

And Microsoft isn’t the only one that is ramping up its investments in AI-enabled data centers. Rival cloud service providers are all investing in either upgrading or opening new data centers to capture a larger chunk of business from developers and users of large language models (LLMs). In a report published in October 2024, Bloomberg Intelligence estimated that demand for generative AI would push Microsoft, AWS, Google, Oracle, Meta, and Apple to devote $200 billion between them to capex in 2025, up from $110 billion in 2023. Microsoft is one of the biggest spenders, followed closely by Google and AWS, Bloomberg Intelligence said. Its estimate of Microsoft’s capital spending on AI, at $62.4 billion for calendar 2025, is lower than Smith’s claim that the company will invest $80 billion in the fiscal year to June 30, 2025. Both figures, though, are way higher than Microsoft’s 2020 capital expenditure of “just” $17.6 billion. The majority of the increased spending is tied to cloud services and the expansion of AI infrastructure needed to provide compute capacity for OpenAI workloads. Separately, last October Amazon CEO Andy Jassy said his company planned total capex spend of $75 billion in 2024 and even more in 2025, with much of it going to AWS, its cloud computing division.

John Deere unveils more autonomous farm machines to address skill labor shortage

Self-driving tractors might be the path to self-driving cars. John Deere has revealed a new line of autonomous machines and tech across agriculture, construction and commercial landscaping. The Moline, Illinois-based John Deere has been in business for 187 years, yet it’s been a regular as a non-tech company showing off technology at the big tech trade show in Las Vegas and is back at CES 2025 with more autonomous tractors and other vehicles. This is not something we usually cover, but John Deere has a lot of data that is interesting in the big picture of tech. The message from the company is that there aren’t enough skilled farm laborers to do the work that its customers need. It’s been a challenge for most of the last two decades, said Jahmy Hindman, CTO at John Deere, in a briefing. Much of the tech will come this fall and after that. He noted that the average farmer in the U.S. is over 58 and works 12 to 18 hours a day to grow food for us. And he said the American Farm Bureau Federation estimates there are roughly 2.4 million farm jobs that need to be filled annually, and the agricultural work force continues to shrink. (This is my hint to the anti-immigration crowd). John Deere’s autonomous 9RX Tractor. Farmers can oversee it using an app. While each of these industries experiences its own set of challenges, a commonality across all is skilled labor availability. In construction, about 80% of contractors struggle to find skilled labor. And in commercial landscaping, 86% of landscaping business owners can’t find labor to fill open positions, he said. “They have to figure out how to do

2025 playbook for enterprise AI success, from agents to evals

2025 is poised to be a pivotal year for enterprise AI. The past year has seen rapid innovation, and this year will see the same. This has made it more critical than ever to revisit your AI strategy to stay competitive and create value for your customers. From scaling AI agents to optimizing costs, here are the five critical areas enterprises should prioritize for their AI strategy this year. 1. Agents: the next generation of automation AI agents are no longer theoretical. In 2025, they’re indispensable tools for enterprises looking to streamline operations and enhance customer interactions. Unlike traditional software, agents powered by large language models (LLMs) can make nuanced decisions, navigate complex multi-step tasks, and integrate seamlessly with tools and APIs. At the start of 2024, agents were not ready for prime time, making frustrating mistakes like hallucinating URLs. They started getting better as frontier large language models themselves improved. “Let me put it this way,” said Sam Witteveen, cofounder of Red Dragon, a company that develops agents for companies, and that recently reviewed the 48 agents it built last year. “Interestingly, the ones that we built at the start of the year, a lot of those worked way better at the end of the year just because the models got better.” Witteveen shared this in the video podcast we filmed to discuss these five big trends in detail. Models are getting better and hallucinating less, and they’re also being trained to do agentic tasks. Another feature that the model providers are researching is a way to use the LLM as a judge, and as models get cheaper (something we’ll cover below), companies can use three or more models to

OpenAI’s red teaming innovations define new essentials for security leaders in the AI era

OpenAI has taken a more aggressive approach to red teaming than its AI competitors, demonstrating its security teams’ advanced capabilities in two areas: multi-step reinforcement and external red teaming. OpenAI recently released two papers that set a new competitive standard for improving the quality, reliability and safety of AI models in these two techniques and more. The first paper, “OpenAI’s Approach to External Red Teaming for AI Models and Systems,” reports that specialized teams outside the company have proven effective in uncovering vulnerabilities that might otherwise have made it into a released model because in-house testing techniques may have missed them. In the second paper, “Diverse and Effective Red Teaming with Auto-Generated Rewards and Multi-Step Reinforcement Learning,” OpenAI introduces an automated framework that relies on iterative reinforcement learning to generate a broad spectrum of novel, wide-ranging attacks. Going all-in on red teaming pays practical, competitive dividends It’s encouraging to see competitive intensity in red teaming growing among AI companies. When Anthropic released its AI red team guidelines in June of last year, it joined AI providers including Google, Microsoft, Nvidia, OpenAI, and even the U.S.’s National Institute of Standards and Technology (NIST), which all had released red teaming frameworks. Investing heavily in red teaming yields tangible benefits for security leaders in any organization. OpenAI’s paper on external red teaming provides a detailed analysis of how the company strives to create specialized external teams that include cybersecurity and subject matter experts. The goal is to see if knowledgeable external teams can defeat models’ security perimeters and find gaps in their security, biases and controls that prompt-based testing couldn’t find. 
What makes OpenAI’s recent papers noteworthy is how well they define using human-in-the-middle

Authors
Brendan Tracey, Jonas Buchli

Our novel Deep Loop Shaping method improves control of gravitational

Imagining the future of banking with agentic AI

In association with EY Agentic AI is coming of age. And with it comes new opportunities in the financial services sector. Banks are increasingly employing agentic