Introducing Gemma 3n: The developer guide

Stay Ahead, Stay ONMINE

Introducing Gemma 3n: The developer guide

The first Gemma model launched early last year and has since grown into a thriving Gemmaverse of over 160 million collective downloads. This ecosystem includes our family of over a dozen specialized models for everything from safeguarding to medical applications and, most inspiringly, the countless innovations from the community. From innovators like Roboflow building enterprise computer vision to the Institute of Science Tokyo creating highly-capable Japanese Gemma variants, your work has shown us the path forward.Building on this incredible momentum, we’re excited to announce the full release of Gemma 3n. While last month’s preview offered a glimpse, today unlocks the full power of this mobile-first architecture. Gemma 3n is designed for the developer community that helped shape Gemma. It’s supported by your favorite tools including Hugging Face Transformers, llama.cpp, Google AI Edge, Ollama, MLX, and many others, enabling you to fine-tune and deploy for your specific on-device applications with ease. This post is the developer deep dive: we’ll explore some of the innovations behind Gemma 3n, share new benchmark results, and show you how to start building today.What’s new in Gemma 3n?Gemma 3n represents a major advancement for on-device AI, bringing powerful multimodal capabilities to edge devices with performance previously only seen in last year’s cloud-based frontier models. Achieving this leap in on-device performance required rethinking the model from the ground up. The foundation is Gemma 3n’s unique mobile-first architecture, and it all starts with MatFormer.MatFormer: One model, many sizesAt the core of Gemma 3n is the MatFormer (🪆Matryoshka Transformer) architecture, a novel nested transformer built for elastic inference. Think of it like Matryoshka dolls: a larger model contains smaller, fully functional versions of itself. This approach extends the concept of Matryoshka Representation Learning from just embeddings to all transformer components. During the MatFormer training of the 4B effective parameter (E4B) model, a 2B effective parameter (E2B) sub-model is simultaneously optimized within it, as shown in the figure above. This provides developers two powerful capabilities and use cases today:1: Pre-extracted models: You can directly download and use either the main E4B model for the highest capabilities, or the standalone E2B sub-model which we have already extracted for you, offering up to 2x faster inference.2: Custom sizes with Mix-n-Match: For more granular control tailored to specific hardware constraints, you can create a spectrum of custom-sized models between E2B and E4B using a method we call Mix-n-Match. This technique allows you to precisely slice the E4B model’s parameters, primarily by adjusting the feed forward network hidden dimension per layer (from 8192 to 16384) and selectively skipping some layers. We are releasing the MatFormer Lab, a tool that shows how to retrieve these optimal models, which were identified by evaluating various settings on benchmarks like MMLU. MMLU scores for the pre-trained Gemma 3n checkpoints at different model sizes (using Mix-n-Match) Looking ahead, the MatFormer architecture also paves the way for elastic execution. While not part of today’s launched implementations, this capability allows a single deployed E4B model to dynamically switch between E4B and E2B inference paths on the fly, enabling real-time optimization of performance and memory usage based on the current task and device load.Per-Layer Embeddings (PLE): Unlocking more memory efficiencyGemma 3n models incorporate Per-Layer Embeddings (PLE). This innovation is tailored for on-device deployment as it dramatically improves model quality without increasing the high-speed memory footprint required on your device’s accelerator (GPU/TPU).While the Gemma 3n E2B and E4B models have a total parameter count of 5B and 8B respectively, PLE allows a significant portion of these parameters (the embeddings associated with each layer) to be loaded and computed efficiently on the CPU. This means only the core transformer weights (approximately 2B for E2B and 4B for E4B) need to sit in the typically more constrained accelerator memory (VRAM). With Per-Layer Embeddings, you can use Gemma 3n E2B while only having ~2B parameters loaded in your accelerator. KV Cache sharing: Faster long-context processingProcessing long inputs, such as the sequences derived from audio and video streams, is essential for many advanced on-device multimodal applications. Gemma 3n introduces KV Cache Sharing, a feature designed to significantly accelerate time-to-first-token for streaming response applications.KV Cache Sharing optimizes how the model handles the initial input processing stage (often called the “prefill” phase). The keys and values of the middle layer from local and global attention are directly shared with all the top layers, delivering a notable 2x improvement on prefill performance compared to Gemma 3 4B. This means the model can ingest and understand lengthy prompt sequences much faster than before.Audio understanding: Introducing speech to text and translationGemma 3n uses an advanced audio encoder based on the Universal Speech Model (USM). The encoder generates a token for every 160ms of audio (about 6 tokens per second), which are then integrated as input to the language model, providing a granular representation of the sound context.This integrated audio capability unlocks key features for on-device development, including:Automatic Speech Recognition (ASR): Enable high-quality speech-to-text transcription directly on the device.Automatic Speech Translation (AST): Translate spoken language into text in another language.We’ve observed particularly strong AST results for translation between English and Spanish, French, Italian, and Portuguese, offering great potential for developers targeting applications in these languages. For tasks like speech translation, leveraging Chain-of-Thought prompting can significantly enhance results. Here’s an example: user Transcribe the following speech segment in Spanish, then translate it into English: model Plain text At launch time, the Gemma 3n encoder is implemented to process audio clips up to 30 seconds. However, this is not a fundamental limitation. The underlying audio encoder is a streaming encoder, capable of processing arbitrarily long audios with additional long form audio training. Follow-up implementations will unlock low-latency, long streaming applications.MobileNet-V5: New state-of-the-art vision encoderAlongside its integrated audio capabilities, Gemma 3n features a new, highly efficient vision encoder, MobileNet-V5-300M, delivering state-of-the-art performance for multimodal tasks on edge devices.Designed for flexibility and power on constrained hardware, MobileNet-V5 gives developers:Multiple input resolutions: Natively supports resolutions of 256×256, 512×512, and 768×768 pixels, allowing you to balance performance and detail for your specific applications.Broad visual understanding: Co-trained on extensive multimodal datasets, it excels at a wide range of image and video comprehension tasks.High throughput: Processes up to 60 frames per second on a Google Pixel, enabling real-time, on-device video analysis and interactive experiences.This level of performance is achieved with multiple architectural innovations, including:An advanced foundation of MobileNet-V4 blocks (including Universal Inverted Bottlenecks and Mobile MQA).A significantly scaled up architecture, featuring a hybrid, deep pyramid model that is 10x larger than the biggest MobileNet-V4 variant.A novel Multi-Scale Fusion VLM adapter that enhances the quality of tokens for better accuracy and efficiency.Benefiting from novel architectural designs and advanced distillation techniques, MobileNet-V5-300M substantially outperforms the baseline SoViT in Gemma 3 (trained with SigLip, no distillation). On a Google Pixel Edge TPU, it delivers a 13x speedup with quantization (6.5x without), requires 46% fewer parameters, and has a 4x smaller memory footprint, all while providing significantly higher accuracy on vision-language tasksWe’re excited to share more about the work behind this model. Look out for our upcoming MobileNet-V5 technical report, which will deep dive into the model architecture, data scaling strategies, and advanced distillation techniques.Making Gemma 3n accessible from day one has been a priority. We’re proud to partner with many incredible open source developers to ensure broad support across popular tools and platforms, including contributions from teams behind AMD, Axolotl, Docker, Hugging Face, llama.cpp, LMStudio, MLX, NVIDIA, Ollama, RedHat, SGLang, Unsloth, and vLLM.But this ecosystem is just the beginning. The true power of this technology is in what you will build with it. That’s why we’re launching the Gemma 3n Impact Challenge. Your mission: use Gemma 3n’s unique on-device, offline, and multimodal capabilities to build a product for a better world. With $150,000 in prizes, we’re looking for a compelling video story and a “wow” factor demo that shows real-world impact. Join the challenge and help build a better future.Get started with Gemma 3n todayReady to explore the potential of Gemma 3n today? Here’s how:Experiment directly: Use Google AI Studio to try Gemma 3n in just a couple of clicks. Gemma models can also be deployed directly to Cloud Run from AI Studio.Learn & integrate: Dive into our comprehensive documentation to quickly integrate Gemma into your projects or start with our inference and fine-tuning guides.

Building on this incredible momentum, we’re excited to announce the full release of Gemma 3n. While last month’s preview offered a glimpse, today unlocks the full power of this mobile-first architecture. Gemma 3n is designed for the developer community that helped shape Gemma. It’s supported by your favorite tools including Hugging Face Transformers, llama.cpp, Google AI Edge, Ollama, MLX, and many others, enabling you to fine-tune and deploy for your specific on-device applications with ease. This post is the developer deep dive: we’ll explore some of the innovations behind Gemma 3n, share new benchmark results, and show you how to start building today.

What’s new in Gemma 3n?

Gemma 3n represents a major advancement for on-device AI, bringing powerful multimodal capabilities to edge devices with performance previously only seen in last year’s cloud-based frontier models.

Achieving this leap in on-device performance required rethinking the model from the ground up. The foundation is Gemma 3n’s unique mobile-first architecture, and it all starts with MatFormer.

MatFormer: One model, many sizes

At the core of Gemma 3n is the MatFormer (🪆Matryoshka Transformer) architecture, a novel nested transformer built for elastic inference. Think of it like Matryoshka dolls: a larger model contains smaller, fully functional versions of itself. This approach extends the concept of Matryoshka Representation Learning from just embeddings to all transformer components.

During the MatFormer training of the 4B effective parameter (E4B) model, a 2B effective parameter (E2B) sub-model is simultaneously optimized within it, as shown in the figure above. This provides developers two powerful capabilities and use cases today:

1: Pre-extracted models: You can directly download and use either the main E4B model for the highest capabilities, or the standalone E2B sub-model which we have already extracted for you, offering up to 2x faster inference.

2: Custom sizes with Mix-n-Match: For more granular control tailored to specific hardware constraints, you can create a spectrum of custom-sized models between E2B and E4B using a method we call Mix-n-Match. This technique allows you to precisely slice the E4B model’s parameters, primarily by adjusting the feed forward network hidden dimension per layer (from 8192 to 16384) and selectively skipping some layers. We are releasing the MatFormer Lab, a tool that shows how to retrieve these optimal models, which were identified by evaluating various settings on benchmarks like MMLU.

MMLU scores for the pre-trained Gemma 3n checkpoints at different model sizes (using Mix-n-Match)

Looking ahead, the MatFormer architecture also paves the way for elastic execution. While not part of today’s launched implementations, this capability allows a single deployed E4B model to dynamically switch between E4B and E2B inference paths on the fly, enabling real-time optimization of performance and memory usage based on the current task and device load.

Per-Layer Embeddings (PLE): Unlocking more memory efficiency

Gemma 3n models incorporate Per-Layer Embeddings (PLE). This innovation is tailored for on-device deployment as it dramatically improves model quality without increasing the high-speed memory footprint required on your device’s accelerator (GPU/TPU).

While the Gemma 3n E2B and E4B models have a total parameter count of 5B and 8B respectively, PLE allows a significant portion of these parameters (the embeddings associated with each layer) to be loaded and computed efficiently on the CPU. This means only the core transformer weights (approximately 2B for E2B and 4B for E4B) need to sit in the typically more constrained accelerator memory (VRAM).

With Per-Layer Embeddings, you can use Gemma 3n E2B while only having ~2B parameters loaded in your accelerator.

Processing long inputs, such as the sequences derived from audio and video streams, is essential for many advanced on-device multimodal applications. Gemma 3n introduces KV Cache Sharing, a feature designed to significantly accelerate time-to-first-token for streaming response applications.

KV Cache Sharing optimizes how the model handles the initial input processing stage (often called the “prefill” phase). The keys and values of the middle layer from local and global attention are directly shared with all the top layers, delivering a notable 2x improvement on prefill performance compared to Gemma 3 4B. This means the model can ingest and understand lengthy prompt sequences much faster than before.

Audio understanding: Introducing speech to text and translation

Gemma 3n uses an advanced audio encoder based on the Universal Speech Model (USM). The encoder generates a token for every 160ms of audio (about 6 tokens per second), which are then integrated as input to the language model, providing a granular representation of the sound context.

This integrated audio capability unlocks key features for on-device development, including:

Automatic Speech Recognition (ASR): Enable high-quality speech-to-text transcription directly on the device.

Automatic Speech Translation (AST): Translate spoken language into text in another language.

We’ve observed particularly strong AST results for translation between English and Spanish, French, Italian, and Portuguese, offering great potential for developers targeting applications in these languages. For tasks like speech translation, leveraging Chain-of-Thought prompting can significantly enhance results. Here’s an example:

user
Transcribe the following speech segment in Spanish, then translate it into English: 

model

Plain text

At launch time, the Gemma 3n encoder is implemented to process audio clips up to 30 seconds. However, this is not a fundamental limitation. The underlying audio encoder is a streaming encoder, capable of processing arbitrarily long audios with additional long form audio training. Follow-up implementations will unlock low-latency, long streaming applications.

MobileNet-V5: New state-of-the-art vision encoder

Alongside its integrated audio capabilities, Gemma 3n features a new, highly efficient vision encoder, MobileNet-V5-300M, delivering state-of-the-art performance for multimodal tasks on edge devices.

Designed for flexibility and power on constrained hardware, MobileNet-V5 gives developers:

Multiple input resolutions: Natively supports resolutions of 256×256, 512×512, and 768×768 pixels, allowing you to balance performance and detail for your specific applications.

Broad visual understanding: Co-trained on extensive multimodal datasets, it excels at a wide range of image and video comprehension tasks.

High throughput: Processes up to 60 frames per second on a Google Pixel, enabling real-time, on-device video analysis and interactive experiences.

This level of performance is achieved with multiple architectural innovations, including:

An advanced foundation of MobileNet-V4 blocks (including Universal Inverted Bottlenecks and Mobile MQA).

A significantly scaled up architecture, featuring a hybrid, deep pyramid model that is 10x larger than the biggest MobileNet-V4 variant.

A novel Multi-Scale Fusion VLM adapter that enhances the quality of tokens for better accuracy and efficiency.

Benefiting from novel architectural designs and advanced distillation techniques, MobileNet-V5-300M substantially outperforms the baseline SoViT in Gemma 3 (trained with SigLip, no distillation). On a Google Pixel Edge TPU, it delivers a 13x speedup with quantization (6.5x without), requires 46% fewer parameters, and has a 4x smaller memory footprint, all while providing significantly higher accuracy on vision-language tasks

We’re excited to share more about the work behind this model. Look out for our upcoming MobileNet-V5 technical report, which will deep dive into the model architecture, data scaling strategies, and advanced distillation techniques.

Making Gemma 3n accessible from day one has been a priority. We’re proud to partner with many incredible open source developers to ensure broad support across popular tools and platforms, including contributions from teams behind AMD, Axolotl, Docker, Hugging Face, llama.cpp, LMStudio, MLX, NVIDIA, Ollama, RedHat, SGLang, Unsloth, and vLLM.

But this ecosystem is just the beginning. The true power of this technology is in what you will build with it. That’s why we’re launching the Gemma 3n Impact Challenge. Your mission: use Gemma 3n’s unique on-device, offline, and multimodal capabilities to build a product for a better world. With $150,000 in prizes, we’re looking for a compelling video story and a “wow” factor demo that shows real-world impact. Join the challenge and help build a better future.

Get started with Gemma 3n today

Ready to explore the potential of Gemma 3n today? Here’s how:

Experiment directly: Use Google AI Studio to try Gemma 3n in just a couple of clicks. Gemma models can also be deployed directly to Cloud Run from AI Studio.

Learn & integrate: Dive into our comprehensive documentation to quickly integrate Gemma into your projects or start with our inference and fine-tuning guides.

Stay Ahead

Explore More Insights

Stay ahead with more perspectives on cutting-edge power, infrastructure, energy, bitcoin and AI solutions. Explore these articles to uncover strategies and insights shaping the future of industries.

AI power efficiency the target of Lotus Microsystems energy advances

By shortening current paths and integrating thermal management directly into the power-delivery structure, vStrata aims to reduce conversion losses while improving cooling efficiency. According to Lotus Microsystems, the module can achieve point-of-load efficiencies of up to 96% while reducing power-conversion losses by more than 50% compared with conventional approaches. “We

Northern Gulf Petroleum secures drilling contract for operations offshore Thailand

Valesto will provide a rig for a drilling campaign in the Gulf of Thailand, which includes four infill wells and three exploration wells. <!–> June 4, 2026 –> Key Highlights Northern Gulf Petroleum let a jack-up rig contract for offshore Thailand. The work scope comprises four infill wells and three

Continental Resources seeks to explore Vaca Muerta’s eastern flank with La Huella block

The filing remains under technical and legal review by Río Negro authorities. If accepted, the province must launch a public tender in which the original proponent retains a right of preference. Continental could match a higher third-party bid to secure the block. While not confirmed, given the size of the

Zscaler launches zero trust platform for agentic AI

The company will be extending its Zscaler Zero Trust Exchange platform to cover AI agents, including how they connect, how they access data, and how they run on devices. According to Christina Powers, partner and cybersecurity consulting leader at management consulting firm West Monroe Partners, zero trust for agentic systems

Energy Department Releases Finalized Fusion Science and Technology Roadmap to Accelerate Commercial Fusion Power

WASHINGTON—The U.S. Department of Energy (DOE) today released the finalized Fusion Science and Technology (FS&T) Roadmap, a national strategy to accelerate the development and commercialization of fusion energy on the most rapid, responsible timeline in history. Building on earlier roadmap efforts, the finalized roadmap brings together fusion science, technology, infrastructure, workforce development, and commercialization priorities into a single national strategy to support fusion pilot plants and commercial fusion power in the mid-2030s. Fusion is the process that powers the sun and stars. For decades, scientists and engineers have worked to bring that same process to Earth as a source of abundant, reliable energy. The finalized roadmap outlines how DOE, industry, universities, and national laboratories will work together to accelerate the path toward commercial fusion energy in the United States. This effort advances President Trump’s energy dominance agenda and reinforces the Administration’s commitment to expanding reliable American energy production, strengthening domestic supply chains, and maintaining U.S. leadership in critical technologies. By accelerating progress toward commercial fusion power, DOE is helping secure a future of abundant and reliable energy. “Fusion energy has entered a new era defined by extraordinary scientific progress and public-private momentum,” said DOE Under Secretary for Science Dr. Darío Gil. “With this roadmap, we now have the clarity, coordination, and sustained commitment needed to turn the promise of fusion into a reality for the American people.” Developed with input from more than 800 scientists and engineers across the public and private sectors, the finalized FS&T Roadmap reflects contributions from more than 15 private companies, over 10 National Laboratories, and more than 70 universities. The roadmap identifies the critical science and technology gaps that must be closed to realize fusion pilot plants and strengthen U.S. leadership in the global fusion industry. The FS&T Roadmap establishes a unified strategy for the U.S.

Aramco to divest Malaysian refining assets

Petroliam Nasional Bhd. (PETRONAS) subsidiary PETRONAS Refinery & Petrochemical Corp. Sdn. Bhd. (PRPC) has agreed to buyout Saudi Arabian Oil Co.’s (Aramco) equity interests in the partners’ dual 50-50 joint ventures responsible for operating the 300,000-b/d integrated refining and petrochemical refinery of the Pengerang Integrated Complex (PIC) in southeastern Johor, Malaysia. Subject to fulfillment of customary closing conditions, Petronas will take 100% ownership and become full operator of Pengerang Refining Co. Sdn. Bhd. and Pengerang Petrochemical Co. Sdn. Bhd., collectively known as PRefChem, Aramco and Petronas said in separate releases. Aramco said divestment of the Malaysian assets will support the strategic optimization of the company’s own downstream portfolio by providing additional flexibility to pursue investments aligned with its broader downstream strategy. While Aramco will no longer hold ownership in the Malaysian ventures, the company said it will continue actively explore commercial arrangements with Petronas following the sale, including continuing its existing agreement to supply Saudi Arabian crude oil to the site, as well as opportunities related to technology exchange and integrated product distribution. Petronas said its move to take full control of the downstream assets will allow the company to further enhance operational alignment and flexibility across PRefChem’s value chain, while harnessing its international oil supply network and integrated operating model to support continued reliability and resilience across varying market conditions. Full ownership of PRefChem’s in-country operations also will strengthen Petronas’ ability to support Malaysia’s long-term energy security and industry resilience, the operator said. A definitive timeframe for when the parties expect to finalize the proposed transaction was not revealed. PRefChem operations In addition to the Johor refinery, PRefChem’s operations at PIC include a steam cracker complex equipped to produce 3.4 million tonnes/year (tpy) combined of ethylene, propylene, butadiene, benzene and raffinate-2. PRefChem also operates an associated petrochemical complex at the

Delfin Midstream takes $5-billion FID for first FLNG vessel

@import url(‘https://fonts.googleapis.com/css2?family=Inter:[email protected]&display=swap’); .ebm-page__main h1, .ebm-page__main h2, .ebm-page__main h3, .ebm-page__main h4, .ebm-page__main h5, .ebm-page__main h6 { font-family: Inter; } body { line-height: 150%; letter-spacing: 0.025em; } button, .ebm-button-wrapper { font-family: Inter; } .label-style { text-transform: uppercase; color: var(–color-grey); font-weight: 600; font-size: 0.75rem; } .caption-style { font-size: 0.75rem; opacity: .6; } #onetrust-pc-sdk [id*=btn-handler], #onetrust-pc-sdk [class*=btn-handler] { background-color: #c19a06 !important; border-color: #c19a06 !important; } #onetrust-policy a, #onetrust-pc-sdk a, #ot-pc-content a { color: #c19a06 !important; } #onetrust-consent-sdk #onetrust-pc-sdk .ot-active-menu { border-color: #c19a06 !important; } #onetrust-consent-sdk #onetrust-accept-btn-handler, #onetrust-banner-sdk #onetrust-reject-all-handler, #onetrust-consent-sdk #onetrust-pc-btn-handler.cookie-setting-link { background-color: #c19a06 !important; border-color: #c19a06 !important; } #onetrust-consent-sdk .onetrust-pc-btn-handler { color: #c19a06 !important; border-color: #c19a06 !important; } <!–> Delfin Midstream Inc., Houston, has taken a final investment decision (FID) for the first floating liquefied natural gas (FLNG) vessel of the Delfin LNG project under development in Louisiana and offshore in the Gulf of Mexico. Delfin FLNG 1 will be the first FLNG vessel in the US and the largest FLNG project globally, with an expected export capacity of 4.4 million tonnes/year (tpy) of LNG, the company said in its June 3 release. Concurrent with the FID, a group of investors led by Global Infrastructure Partners (GIP), a part of BlackRock—including existing Delfin investors Mitsui OSK Lines Ltd. (MOL), Vitol, and Diameter Capital Partners—has agreed to invest in the first phase of the project. ]–> <!–> –><!–> –> Oct. 9, 2023 <!–> –><!–> –> July 11, 2023 <!–> –><!–> –> June 9, 2023 <!–> –><!–> –> July 2, 2021 <!–> –> <!–> The vessel is backed by long-term LNG sales agreements with Vitol, Expand Energy, Centrica, and Gunvor, Delfin said, and all necessary permits and licenses required to begin construction have been secured. Construction contracts have been executed with Samsung Heavy Industries Co. Ltd. and Black & Veatch. LNG production is scheduled to begin

Chevron files $13.8-billion Argentina oil development proposal

Chevron Corp. applied June 2 to join Argentina’s Large Investment Incentive Regime (RIGI) for a $13.8-billion unconventional oil development at its 100% operated El Trapial-Este block in northern Neuquén province. Until recently, RIGI had attracted about $93 billion across 36 projects. Chevron’s application, which remains subject to government approval, is equivalent to almost one seventh of that total. The filing, which does not consitute a final investment decision, is Chevron’s largest individual investment proposal in Argentina since it entered the country in 1999 and the second-largest project submitted under RIGI, behind YPF SA’s $25-billion LLL Oil development. Chevron said it is targeting production of about 30,000 b/d from El Trapial-Este, subject to the availability of takeaway infrastructure. The block currently produces about 7,000 b/d. Chevron tested the block with a 7-well pilot in 2021 and has been carrying out development since late 2022, using laterals of more than 3,000 m and techniques transferred from the US Permian basin. In 2023, Chevron committed $500 million to that phase. During the company’s first-quarter earnings call on May 1, chief executive officer Mike Wirth anchored Chevron’s 2030 targets in “assets that are operating today.” El Trapial-Este was not explicitly identified among assets described as the main base for those targets. Wirth also said Chevron would not accelerate Permian production even with Brent above $100/bbl, preferring to manage that asset for free cash flow rather than volume. In the same presentation, Wirth named Argentina among the sources of equity crude that feed Chevron’s global refining system, along with Tengiz, Guyana, the Permian, and Venezuela. The earnings call came weeks before the El proposal filing. Vaca Muerta costs, takeaway capacity Breakeven costs in Vaca Muerta’s best blocks are about $40/bbl at the wellhead, according to Rystad Energy, while normalized well productivity—adjusted for lateral length and fracture

Santos lets rig contract for Bedout subbasin exploration campaign

Santos Ltd. has let a contract for the Transocean Equinox semisubmersible mobile offshore drilling unit for a multi-well campaign at Bedout subbasin exploration permits offshore North West Shelf Australia, said partner Carnarvon Energy Ltd. in a June 1 release. The objective of the 2027 Bedout exploration campaign is to define the scale of the subbasin’s resource potential and target some of the largest prospects in the exploration portfolio. Shortlisted prospects include Ara, Yuma, Goats Eye, and Hutton, which are all defined on the Bedout MegaMerge 3D seismic survey. The Bedout exploration campaign is on track to start begin in April 2027, with one firm well and one contingent well. The Transocean Equinox is currently engaged in a multi-well exploration drilling campaign off the coast of Victoria, which is expected to be completed by early 2027. Bedout basin is proposed to be an integrated gas and liquids project. To date, five fields have been discovered. Net 2C contingent resource of 230 MMboe is booked as of Dec. 31, 2024. Santos is operator at Bedout. Carnarvon holds 20% interest in Yuma, Goats Eye, and Hutton, and 10% interest in Ara.

US underground natural gas storage capacity edges higher in 2025

Underground working natural gas storage capacity in the Lower 48 states increased modestly in 2025, with most additions concentrated in the South Central and Mountain regions, according to the US Energy Information Administration (EIA). Natural gas storage plays a key role in balancing seasonal demand fluctuations, allowing supplies to be injected during periods of lower consumption and withdrawn during periods of peak demand. EIA calculates natural gas storage capacity in two ways: demonstrated peak capacity and working gas design capacity. Both increased in 2025. EIA data show demonstrated peak storage capacity rose by 6 bcf, or 0.1%, from the previous year, marking the third consecutive annual increase. Demonstrated peak capacity is the sum of the largest volume of working gas stored in each storage field during the previous five-year period, regardless of when the peaks occurred. The South Central and Mountain regions posted the largest gains, with demonstrated peak capacity increasing by 16 bcf and 18 bcf, respectively. Capacity declined in other regions, falling 15 bcf in the East, 8 bcf in the Pacific, and 5 bcf in the Midwest. Working gas design capacity, sometimes referred to as nameplate capacity, is based on the physical characteristics of the reservoir, installed equipment, and operating procedures on the site, which federal or state regulators usually must certify. As of November, Lower 48 design capacity totaled 4,683 bcf, up 26 bcf from a year earlier. The South Central region accounted for most of the increase, adding 21 bcf of design capacity, while the Mountain region added 6 bcf. Design capacity in the East declined by 2 bcf, primarily because of base gas adjustments. Capacity in the Midwest and Pacific regions was unchanged from the previous year.

Arista unveils 1.6T rack-scale switch family for AI infrastructure

The new Arista family joins a growing ecosystem of vendors looking to tap into the 1.6T Ethernet world, which includes Cisco, Nvidia, Celestica and others. “Arista Network’s new 7060XE7 Series is a strong signal of where large-scale AI fabrics are heading: higher bandwidth, better power efficiency, and tighter integration between compute, optics, silicon, cooling, and network operating software,” wrote Sameh Boujelbene, vice president, data center switch and AI networks market research for Dell Oro, in a LinkedIn post. Among the features that stand out to her are “strong customer and ecosystem validation from Microsoft Azure, Oracle Cloud Infrastructure, Meta, AMD, and Broadcom.”

Water Emerges as a Critical Constraint for AI Data Centers

“There really has been a major shift within the last couple of years,” Bajpayee said. “I would even say within the last 12 months is where we have seen suddenly a rapid increase in the data center operators’ desire to control their water destiny.” For Gradiant, the MIT-born water technology company that built its reputation serving semiconductor manufacturers, pharmaceutical companies, and industrial customers worldwide, that shift has translated into a rapidly expanding pipeline of data center opportunities. More importantly, Bajpayee believes it signals a fundamental change in how the industry thinks about water itself. The conversation is no longer centered primarily on sustainability metrics or corporate environmental goals. Instead, operators increasingly view water as a business continuity issue. “We’re seeing operators themselves come to us and tell us that these are issues they are facing,” Bajpayee said. “They want to make sure they don’t get stalled, their permits don’t get pulled, their business doesn’t get stopped, and communities don’t push them out because they didn’t figure out a way to control their water.” From Water Treatment to Water Strategy That shift is occurring as Gradiant expands deployments of its recently announced HyperSolved platform, an end-to-end cooling water management system purpose-built for AI data centers. The company says HyperSolved is now being deployed with several of the world’s largest hyperscale operators across North America, Europe, and Asia, reflecting growing industry demand for integrated approaches to water infrastructure. While compute, networking, and power systems have evolved rapidly during the AI era, water management often remains fragmented, requiring operators to coordinate multiple vendors responsible for sourcing, treatment, cooling, wastewater management, reuse, discharge, and regulatory compliance. Gradiant’s approach seeks to consolidate those functions into a single integrated platform and operating model. The timing reflects the growing scale of the challenge. New AI data center

Data Center Jobs: Engineering, Construction, Commissioning, Sales, Field Service and Facility Tech Jobs Available in Major Data Center Hotspots

Each month Data Center Frontier, in partnership with Pkaza, posts some of the hottest data center career opportunities in the market. Here’s a look at some of the latest data center jobs posted on the Data Center Frontier jobs board, powered by Pkaza Critical Facilities Recruiting. Looking for Data Center Candidates? Check out Pkaza’s Active Candidate / Featured Candidate Hotlist Mechanical Applications Engineer Pittsburgh, PA This position is also available in: Denver, CO; Richmond, VA and Georgetown, SC (live by the beach!). Relo available. Our client is a leading provider and manufacturer of industrial HVAC mechanical equipment used in industrial cooling applications for mission critical operations. They help their customers save money by reducing energy and operating costs and provide solutions for modernizing their customer’s existing mechanical infrastructure. This company provides cooling solutions to many of the world’s largest organizations and government facilities and enterprise clients, colocation providers and hyperscale companies. This career-growth minded opportunity offers exciting projects with leading-edge technology and innovation as well as competitive salaries and benefits. Electrical Commissioning Engineer New Albany, OH This traveling position is also available in: New York, NY; White Plains, NY; Dallas, TX; Richmond, VA; Ashburn, VA; Montvale, NJ; Charlotte, NC; Atlanta, GA; Hampton, GA; Cedar Rapids, IA; Phoenix, AZ; Salt Lake City, UT; Kansas City, MO; Omaha, NE; Chesterton, IN; Indianapolis, IN or Chicago, IL. *** ALSO looking for a LEAD EE and ME CxA Agents and CxA PMs *** Our client is an engineering design and commissioning company that has a national footprint and specializes in MEP critical facilities design. They provide design, commissioning, consulting and management expertise in the critical facilities space. They have a mindset to provide reliability, energy efficiency, sustainable design and LEED expertise when providing these consulting services for Enterprise, Colocation and Hyperscale Companies. This career-growth minded opportunity offers exciting projects

Fiber’s Next Act: How AI Is Driving Connectivity Closer to the Edge

ORLANDO, Fla. — Much of the conversation surrounding AI infrastructure has focused on GPUs, power generation, cooling systems, and the unprecedented scale of next-generation data center development. But at Fiber Connect 2026, another reality became increasingly clear: none of those investments matter without the network infrastructure required to connect them. That theme emerged repeatedly during a conversation between Data Center Frontier Editor-in-Chief Matt Vincent and Clearfield Chief Commercial Officer Anis Khemakhem, whose perspective sits at the intersection of broadband infrastructure, fiber deployment, and emerging AI connectivity requirements. While Clearfield is best known throughout the broadband industry for its fiber management and connectivity solutions, Khemakhem argued that AI’s rapid expansion is creating new opportunities, and new challenges, that extend well beyond traditional fiber-to-the-home deployments. “AI is driving that connectivity closer and closer to the edge,” Khemakhem said, noting that growing compute requirements and increasingly latency-sensitive workloads are fundamentally changing assumptions about where infrastructure must reside and how it must be connected. For Data Center Frontier readers, the significance lies in a growing realization that AI infrastructure is becoming as much a networking challenge as a compute challenge. Beyond the Traditional Data Center One of the more notable themes of the discussion was Khemakhem’s view that the term “data center” has become too broad to be useful. The industry often speaks of data centers as a single category, but Clearfield increasingly differentiates between hyperscale campuses, colocation facilities, central office environments, and a rapidly emerging class of edge deployments. “There is no one-size-fits-all data center,” Khemakhem said, describing a continuum that extends from hyperscale facilities all the way to edge locations positioned near users and applications. That distinction matters because many AI applications are introducing latency requirements that cannot always be addressed by centralized facilities alone. As AI inference moves closer to users,

Liquid Cooling Market Matures: Innovations, Acquisitions, and Modular Solutions for AI Infrastructure

Thermal Validation Becomes a Strategic Capability Cooling is no longer simply a matter of installing enough CRAC units, chillers, CDUs, or rear-door heat exchangers. As rack densities climb and chip-level heat flux intensifies, the performance of the entire thermal chain increasingly depends on how coldplates, manifolds, pumps, controls, facility water loops, power systems, commissioning practices, and service workflows interact. Vertiv said Strategic Thermal Labs will help it simulate and emulate real-world high-density compute conditions, optimize interactions between the thermal chain and power train, and support customers across design, integration, commissioning, and lifecycle operations. That reflects a broader evolution underway in AI infrastructure. Data centers are becoming tightly coupled systems where thermal behavior influences power design, reliability, serviceability, operational efficiency, and ultimately the utilization of increasingly expensive accelerator platforms. Vertiv also emphasized that the acquisition does not alter its commitment to interoperable, server- and silicon-agnostic infrastructure solutions. That distinction matters because hyperscale and colocation operators remain wary of vendor lock-in at a time when chip architectures, server designs, and cooling strategies continue to evolve rapidly. Viewed through that lens, Vertiv’s acquisition reflects a larger industry shift. Infrastructure providers are no longer waiting for server OEMs or chipmakers to define the cooling roadmap. Instead, they are investing deeper in modeling, validation, and chip-level thermal expertise because the next generation of AI infrastructure performance will increasingly be determined by how effectively those systems work together. Accelsius Moves from Technology Validation to Market Scaling Accelsius offers a different view of where the liquid-cooling market is headed. While some vendors are extending existing architectures, Accelsius is focused on making two-phase direct-to-chip cooling easier to deploy, validate, and scale. The company’s recently introduced NeuCool IR150 is designed around that objective. Described by Accelsius as the industry’s first fully integrated rack-level cooling solution for two-phase liquid cooling,

DCF Poll: Which Technology Will Define the Next Generation of AI Data Centers?

Matt Vincent is Editor in Chief of Data Center Frontier, where he leads editorial strategy and coverage focused on the infrastructure powering cloud computing, artificial intelligence, and the digital economy. A veteran B2B technology journalist with more than two decades of experience, Vincent specializes in the intersection of data centers, power, cooling, and emerging AI-era infrastructure. Since assuming the EIC role in 2023, he has helped guide Data Center Frontier’s coverage of the industry’s transition into the gigawatt-scale AI era, with a focus on hyperscale development, behind-the-meter power strategies, liquid cooling architectures, and the evolving energy demands of high-density compute, while working closely with the Digital Infrastructure Group at Endeavor Business Media to expand the brand’s analytical and multimedia footprint. Vincent also hosts The Data Center Frontier Show podcast, where he interviews industry leaders across hyperscale, colocation, utilities, and the data center supply chain to examine the technologies and business models reshaping digital infrastructure. Since its inception he serves as Head of Content for the Data Center Frontier Trends Summit. Before becoming Editor in Chief, he served in multiple senior editorial roles across Endeavor Business Media’s digital infrastructure portfolio, with coverage spanning data centers and hyperscale infrastructure, structured cabling and networking, telecom and datacom, IP physical security, and wireless and Pro AV markets. He began his career in 2005 within PennWell’s Advanced Technology Division and later held senior editorial positions supporting brands such as Cabling Installation & Maintenance, Lightwave Online, Broadband Technology Report, and Smart Buildings Technology. Vincent is a frequent moderator, interviewer, and keynote speaker at industry events including the HPC Forum, where he delivers forward-looking analysis on how AI and high-performance computing are reshaping digital infrastructure. He graduated with honors from Indiana University Bloomington with a B.A. in English Literature and Creative Writing and lives in southern New Hampshire with

Microsoft will invest $80B in AI data centers in fiscal 2025

And Microsoft isn’t the only one that is ramping up its investments into AI-enabled data centers. Rival cloud service providers are all investing in either upgrading or opening new data centers to capture a larger chunk of business from developers and users of large language models (LLMs). In a report published in October 2024, Bloomberg Intelligence estimated that demand for generative AI would push Microsoft, AWS, Google, Oracle, Meta, and Apple would between them devote $200 billion to capex in 2025, up from $110 billion in 2023. Microsoft is one of the biggest spenders, followed closely by Google and AWS, Bloomberg Intelligence said. Its estimate of Microsoft’s capital spending on AI, at $62.4 billion for calendar 2025, is lower than Smith’s claim that the company will invest $80 billion in the fiscal year to June 30, 2025. Both figures, though, are way higher than Microsoft’s 2020 capital expenditure of “just” $17.6 billion. The majority of the increased spending is tied to cloud services and the expansion of AI infrastructure needed to provide compute capacity for OpenAI workloads. Separately, last October Amazon CEO Andy Jassy said his company planned total capex spend of $75 billion in 2024 and even more in 2025, with much of it going to AWS, its cloud computing division.

John Deere unveils more autonomous farm machines to address skill labor shortage

Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More Self-driving tractors might be the path to self-driving cars. John Deere has revealed a new line of autonomous machines and tech across agriculture, construction and commercial landscaping. The Moline, Illinois-based John Deere has been in business for 187 years, yet it’s been a regular as a non-tech company showing off technology at the big tech trade show in Las Vegas and is back at CES 2025 with more autonomous tractors and other vehicles. This is not something we usually cover, but John Deere has a lot of data that is interesting in the big picture of tech. The message from the company is that there aren’t enough skilled farm laborers to do the work that its customers need. It’s been a challenge for most of the last two decades, said Jahmy Hindman, CTO at John Deere, in a briefing. Much of the tech will come this fall and after that. He noted that the average farmer in the U.S. is over 58 and works 12 to 18 hours a day to grow food for us. And he said the American Farm Bureau Federation estimates there are roughly 2.4 million farm jobs that need to be filled annually; and the agricultural work force continues to shrink. (This is my hint to the anti-immigration crowd). John Deere’s autonomous 9RX Tractor. Farmers can oversee it using an app. While each of these industries experiences their own set of challenges, a commonality across all is skilled labor availability. In construction, about 80% percent of contractors struggle to find skilled labor. And in commercial landscaping, 86% of landscaping business owners can’t find labor to fill open positions, he said. “They have to figure out how to do

2025 playbook for enterprise AI success, from agents to evals

Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More 2025 is poised to be a pivotal year for enterprise AI. The past year has seen rapid innovation, and this year will see the same. This has made it more critical than ever to revisit your AI strategy to stay competitive and create value for your customers. From scaling AI agents to optimizing costs, here are the five critical areas enterprises should prioritize for their AI strategy this year. 1. Agents: the next generation of automation AI agents are no longer theoretical. In 2025, they’re indispensable tools for enterprises looking to streamline operations and enhance customer interactions. Unlike traditional software, agents powered by large language models (LLMs) can make nuanced decisions, navigate complex multi-step tasks, and integrate seamlessly with tools and APIs. At the start of 2024, agents were not ready for prime time, making frustrating mistakes like hallucinating URLs. They started getting better as frontier large language models themselves improved. “Let me put it this way,” said Sam Witteveen, cofounder of Red Dragon, a company that develops agents for companies, and that recently reviewed the 48 agents it built last year. “Interestingly, the ones that we built at the start of the year, a lot of those worked way better at the end of the year just because the models got better.” Witteveen shared this in the video podcast we filmed to discuss these five big trends in detail. Models are getting better and hallucinating less, and they’re also being trained to do agentic tasks. Another feature that the model providers are researching is a way to use the LLM as a judge, and as models get cheaper (something we’ll cover below), companies can use three or more models to

OpenAI’s red teaming innovations define new essentials for security leaders in the AI era

Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More OpenAI has taken a more aggressive approach to red teaming than its AI competitors, demonstrating its security teams’ advanced capabilities in two areas: multi-step reinforcement and external red teaming. OpenAI recently released two papers that set a new competitive standard for improving the quality, reliability and safety of AI models in these two techniques and more. The first paper, “OpenAI’s Approach to External Red Teaming for AI Models and Systems,” reports that specialized teams outside the company have proven effective in uncovering vulnerabilities that might otherwise have made it into a released model because in-house testing techniques may have missed them. In the second paper, “Diverse and Effective Red Teaming with Auto-Generated Rewards and Multi-Step Reinforcement Learning,” OpenAI introduces an automated framework that relies on iterative reinforcement learning to generate a broad spectrum of novel, wide-ranging attacks. Going all-in on red teaming pays practical, competitive dividends It’s encouraging to see competitive intensity in red teaming growing among AI companies. When Anthropic released its AI red team guidelines in June of last year, it joined AI providers including Google, Microsoft, Nvidia, OpenAI, and even the U.S.’s National Institute of Standards and Technology (NIST), which all had released red teaming frameworks. Investing heavily in red teaming yields tangible benefits for security leaders in any organization. OpenAI’s paper on external red teaming provides a detailed analysis of how the company strives to create specialized external teams that include cybersecurity and subject matter experts. The goal is to see if knowledgeable external teams can defeat models’ security perimeters and find gaps in their security, biases and controls that prompt-based testing couldn’t find. What makes OpenAI’s recent papers noteworthy is how well they define using human-in-the-middle

Stay Ahead, Stay ONMINE