T5Gemma: A new collection of encoder-decoder Gemma models

In the rapidly evolving landscape of large language models (LLMs), the spotlight has largely focused on the decoder-only architecture. While these models have shown impressive capabilities across a wide range of generation tasks, the classic encoder-decoder architecture, such as T5 (the Text-to-Text Transfer Transformer), remains a popular choice for many real-world applications. Encoder-decoder models often excel at summarization, translation, QA, and more due to their high inference efficiency, design flexibility, and richer encoder representation for understanding input. Nevertheless, the powerful encoder-decoder architecture has received relatively little attention.

Today, we revisit this architecture and introduce T5Gemma, a new collection of encoder-decoder LLMs developed by converting pretrained decoder-only models into the encoder-decoder architecture through a technique called adaptation. T5Gemma is based on the Gemma 2 framework, including adapted Gemma 2 2B and 9B models as well as a set of newly trained T5-sized models (Small, Base, Large and XL). We are excited to release pretrained and instruction-tuned T5Gemma models to the community to unlock new opportunities for research and development.

From decoder-only to encoder-decoder

In T5Gemma, we ask the following question: can we build top-tier encoder-decoder models based on pretrained decoder-only models? We answer this question by exploring a technique called model adaptation. The core idea is to initialize the parameters of an encoder-decoder model using the weights of an already pretrained decoder-only model, and then further adapt them via UL2 or PrefixLM-based pre-training.
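The initialization step described above can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the layer layout, the helper names `make_decoder_only_layer` and `adapt_to_encoder_decoder`, and in particular the choice to seed the new cross-attention weights from a copy of the self-attention weights, which this post does not specify. The follow-up UL2 or PrefixLM pre-training is not shown.

```python
import copy


def make_decoder_only_layer(seed):
    """Toy stand-in for one pretrained decoder-only layer (hypothetical layout)."""
    return {
        "self_attn": {"q": [seed], "k": [seed + 1], "v": [seed + 2], "o": [seed + 3]},
        "mlp": {"up": [seed + 4], "down": [seed + 5]},
    }


def adapt_to_encoder_decoder(pretrained_layers):
    """Initialize encoder and decoder stacks from decoder-only layers."""
    encoder, decoder = [], []
    for layer in pretrained_layers:
        # Encoder layer: reuse the pretrained self-attention and MLP weights.
        # (Switching to bidirectional attention would be a masking change,
        # not a weight change, so nothing else is needed here.)
        encoder.append(copy.deepcopy(layer))

        # Decoder layer: same pretrained weights, plus cross-attention,
        # which a decoder-only model does not have.
        dec_layer = copy.deepcopy(layer)
        # ASSUMPTION: seed cross-attention from a copy of self-attention.
        dec_layer["cross_attn"] = copy.deepcopy(layer["self_attn"])
        decoder.append(dec_layer)
    return {"encoder": encoder, "decoder": decoder}


pretrained = [make_decoder_only_layer(i) for i in range(2)]
adapted = adapt_to_encoder_decoder(pretrained)
```

Both stacks start from the same pretrained weights but are independent copies, so the subsequent pre-training can move them apart.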

Figure: An overview of our approach, showing how we initialize a new encoder-decoder model using the parameters from a pretrained, decoder-only model.

This adaptation method is highly flexible, allowing for creative combinations of model sizes. For instance, we can pair a large encoder with a small decoder (e.g., a 9B encoder with a 2B decoder) to create an “unbalanced” model. This allows us to fine-tune the quality-efficiency trade-off for specific tasks, such as summarization, where a deep understanding of the input is more critical than the complexity of the generated output.
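One practical consequence of pairing different sizes can be seen in a small shape check. In cross-attention, keys and values are projected from encoder outputs, so the projection's input width must match the encoder. The sketch below is a simplification under stated assumptions: the widths are toy numbers, multi-head details are ignored, the function name `cross_attention_init` is hypothetical, and the "copy" versus "fresh initialization" policy is an illustrative guess rather than the method confirmed by this post.

```python
def cross_attention_init(encoder_width, decoder_width):
    """Pick a (hypothetical) init strategy for cross-attention K/V projections."""
    # Cross-attention K/V maps encoder states into the decoder's space.
    needed_shape = (encoder_width, decoder_width)
    # A decoder-only checkpoint only offers self-attention K/V of this shape.
    available_shape = (decoder_width, decoder_width)
    if needed_shape == available_shape:
        return "copy_from_self_attention", needed_shape
    # Shapes differ, so the pretrained weights cannot be copied directly.
    return "fresh_init", needed_shape


balanced = cross_attention_init(encoder_width=4, decoder_width=4)
unbalanced = cross_attention_init(encoder_width=8, decoder_width=4)
```

With equal widths the pretrained self-attention shape fits; with a wider encoder it does not, so some other initialization is needed for that one component.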

Towards better quality-efficiency trade-off

How does T5Gemma perform?

In our experiments, T5Gemma models achieve performance comparable to or better than their decoder-only Gemma counterparts, nearly dominating the quality-inference efficiency Pareto frontier across several benchmarks, such as SuperGLUE, which measures the quality of the learned representation.

Figure: Encoder-decoder models consistently offer better performance for a given level of inference compute, leading the quality-efficiency frontier across a range of benchmarks.

This performance advantage isn’t just theoretical; it translates to real-world quality and speed too. When measuring the actual latency for GSM8K (math reasoning), T5Gemma provided a clear win. For example, T5Gemma 9B-9B achieves higher accuracy than Gemma 2 9B with similar latency. Even more impressively, T5Gemma 9B-2B delivers a significant accuracy boost over the 2B-2B model, yet its latency is nearly identical to that of the much smaller Gemma 2 2B model. Ultimately, these experiments show that encoder-decoder adaptation offers a flexible, powerful way to balance quality and inference speed.
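To make the idea of a quality-efficiency frontier concrete, here is a minimal sketch of how a Pareto frontier over (latency, accuracy) pairs can be computed. The model names and numbers are hypothetical placeholders chosen for illustration; they are not measurements from these experiments.

```python
def pareto_frontier(models):
    """Return the names of models not dominated on (latency, accuracy).

    A model is dominated if another model is at least as fast and at least
    as accurate, and strictly better on one of the two.
    """
    frontier = []
    for name, latency, accuracy in models:
        dominated = any(
            other_lat <= latency
            and other_acc >= accuracy
            and (other_lat < latency or other_acc > accuracy)
            for _, other_lat, other_acc in models
        )
        if not dominated:
            frontier.append(name)
    return frontier


# Hypothetical (name, latency, accuracy) triples, NOT real benchmark results.
models = [
    ("dec-only-small", 1.0, 40.0),
    ("enc-dec-large-small", 1.1, 55.0),
    ("dec-only-large", 3.0, 52.0),
    ("enc-dec-large-large", 3.1, 60.0),
]
frontier = pareto_frontier(models)
```

In this made-up data, `dec-only-large` falls off the frontier because `enc-dec-large-small` is both faster and more accurate, which mirrors the kind of comparison described above.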

Unlocking foundational and fine-tuned capabilities

Could encoder-decoder LLMs have similar capabilities to decoder-only models?

Yes, T5Gemma shows promising capabilities both before and after instruction tuning.

After pre-training, T5Gemma achieves impressive gains on complex tasks that require reasoning. For instance, T5Gemma 9B-9B scores over 9 points higher on GSM8K (math reasoning) and 4 points higher on DROP (reading comprehension) than the original Gemma 2 9B model. This pattern demonstrates that the encoder-decoder architecture, when initialized via adaptation, has the potential to create a more capable, performant foundational model.

Figure: Detailed results for pretrained models, illustrating how adapted models have significant gains on several reasoning-intensive benchmarks compared to decoder-only Gemma 2.

These foundational improvements from pre-training set the stage for even more dramatic gains after instruction tuning. For example, comparing Gemma 2 IT to T5Gemma IT, the performance gap widens significantly across the board. T5Gemma 2B-2B IT sees its MMLU score jump by nearly 12 points over Gemma 2 2B, and its GSM8K score increases from 58.0% to 70.7%. The adapted architecture not only potentially provides a better starting point but also responds more effectively to instruction tuning, ultimately leading to a substantially more capable and helpful final model.

Figure: Detailed results for fine-tuned + RLHFed models, illustrating the capabilities of post-training to significantly amplify the performance advantages of the encoder-decoder architecture.

Explore our models: Releasing T5Gemma checkpoints

We’re very excited to present this new method of building powerful, general-purpose encoder-decoder models by adapting from pretrained decoder-only LLMs like Gemma 2. To help accelerate further research and allow the community to build on this work, we are releasing a suite of our T5Gemma checkpoints.

The release includes:

  • Multiple Sizes: Checkpoints for T5-sized models (Small, Base, Large, and XL), the Gemma 2-based models (2B and 9B), as well as an additional model in between T5 Large and T5 XL.
  • Multiple Variants: Pretrained and instruction-tuned models.
  • Flexible Configurations: A powerful and efficient unbalanced 9B-2B checkpoint to explore the trade-offs between encoder and decoder size.
  • Different Training Objectives: Models trained with either PrefixLM or UL2 objectives to provide either state-of-the-art generative performance or representation quality.
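For readers unfamiliar with the PrefixLM objective mentioned in the list above, the sketch below shows one common way a single token sequence can be turned into an encoder-decoder training example: the prefix goes to the encoder and the model learns to generate the remainder. The function name `prefix_lm_example`, the `<bos>` start token, and the fixed split point are assumptions for illustration, not details taken from the T5Gemma training setup. UL2, which mixes several denoising tasks, is not sketched here.

```python
def prefix_lm_example(tokens, split):
    """Split one sequence into encoder input and decoder input/target."""
    if not 0 < split < len(tokens):
        raise ValueError("split must leave a non-empty prefix and suffix")
    prefix, suffix = tokens[:split], tokens[split:]
    bos = "<bos>"  # hypothetical start-of-sequence token
    return {
        "encoder_input": prefix,               # encoder reads the whole prefix
        "decoder_input": [bos] + suffix[:-1],  # targets shifted right by one
        "decoder_target": suffix,              # decoder predicts the suffix
    }


example = prefix_lm_example(["the", "cat", "sat", "on", "the", "mat"], split=3)
```

Here the encoder would see "the cat sat" and the decoder would be trained to produce "on the mat" one token at a time.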

We hope these checkpoints will provide a valuable resource for investigating model architecture, efficiency, and performance.

Getting started with T5Gemma

We can’t wait to see what you build with T5Gemma. Please see the following links for more information:

  • Learn about the research behind this project by reading the paper.
  • Explore the models' capabilities or fine-tune them for your own use cases with the Colab notebook.
Shape
Shape
Stay Ahead

Explore More Insights

Stay ahead with more perspectives on cutting-edge power, infrastructure, energy,  bitcoin and AI solutions. Explore these articles to uncover strategies and insights shaping the future of industries.

Shape

Eying AI factories, Nvidia buys bigger stake in CoreWeave

Nvidia continues to throw its sizable bank account around, this time making a $2 billion investment in GPU cloud service provider CoreWeave. The company says the investment reflects Nvidia’s “confidence in CoreWeave’s business, team and growth strategy as a cloud platform built on Nvidia infrastructure.” CoreWeave is not the only

Read More »

AI, security tailwinds signal promising 2026 for Cisco

A big component of AI in communications is agentic agents talking to employees and customers, and bringing trust to the system is where Cisco should shine. It builds and runs its own infrastructure, which is secure by design. Cisco has relationships with governments all over the world, and between Webex

Read More »

Enterprise Spotlight: Manufacturing Reimagined

Emerging technologies from AI and extended reality to edge computing, digital twins, and more are driving big changes in the manufacturing world.  Download the February 2026 issue of the Enterprise Spotlight from the editors of CIO, Computerworld, CSO, InfoWorld, and Network World and learn about the new tech at the forefront

Read More »

Oil Closes Sharply Lower as Iran Risk Fades

Oil fell sharply as geopolitical risk premiums faded after US President Donald Trump said Washington is talking with Iran, while a broader commodities selloff exacerbated the slide. West Texas Intermediate plummeted 4.7% to settle near $62 a barrel, the biggest loss since June, while Brent futures also nosedived. Trump downplayed Iran supreme leader Ayatollah Ali Khamenei’s threats of a regional war over the weekend, reiterating he’s hopeful they’ll make a deal. The Islamic Republic’s foreign ministry said it hopes diplomatic efforts will avert a war. White House envoy Steve Witkoff and Iranian Foreign Minister Abbas Araghchi are set to meet in Istanbul on Friday, Axios reported, citing two people familiar with the matter. “The move lower looks more like a positioning reset than a fundamental shift,” said Haris Khurshid, chief investment officer at Karobaar Capital LP. “With no new supply shock, oil is giving back some risk premium as the market recalibrates after pricing in near-term disruption that just didn’t materialize.” Crude was also hit as commodities, particularly metals, came under intense selling pressure. Gold fell as much as 10%, and copper at one point dropped more than 5% as they continued a retreat that started on Friday. The precipitous drop comes on the back of WTI’s biggest monthly increase since 2023, supported by broad-based flows into commodities during the same period. The prospect of conflict with Iran and pockets of supply disruption led to a surprisingly tight first month of the year. Still, the wider backdrop is one of elevated supplies, particularly in the first half of 2026. At current prices, the sharp reversal will trigger selling from trend-following commodity trading advisors, according to James Taylor, head of the quant service at consultant Energy Aspects. More selling would come if Brent falls below $65 a barrel, he added, noting

Read More »

Trump to Launch $12B Critical Mineral Stockpile

President Donald Trump is set to launch a strategic critical-minerals stockpile with $12 billion in seed money, a bid to insulate manufacturers from supply shocks as the US works to slash its reliance on Chinese rare earths and other metals.  The venture — dubbed Project Vault — is set to marry $1.67 billion in private capital with a $10 billion loan from the US Export-Import Bank to procure and store the minerals for automakers, tech firms and other manufacturers.  US rare-earths stocks jumped in premarket trading upon news of the administration’s plan, including USA Rare Earth Inc., Critical Metals Corp., United States Antimony Corp. and NioCorp Developments Ltd. Details of the initiative, which would represent a first-of-its-kind stockpile for the US private sector, were described by senior administration officials, who asked not to be identified discussing a plan that has yet to be announced. The effort is akin to the nation’s existing emergency oil stockpile. But instead of crude, its focus would be minerals — such as gallium and cobalt — used in products such as iPhones, batteries and jet engines. The stockpile is expected to include both rare earths and critical minerals as well as other strategically important elements that are subject to volatile prices. A Gallium Arsenide semiconducting wafer is processed into chips for radio frequency communications devices at RF Micro Devices Inc. (RFMD) headquarters in Greensboro, North Carolina, U.S., on Wednesday, Feb. 15, 2012. RF Micro Devices Inc. manufactures radio-frequency components and semiconductor technologies. Photographer: Victor J. Blue/Bloomberg It represents a major commitment to accumulate minerals deemed critical to the industrial economy — including the automotive, aerospace and energy sectors — and highlights Trump’s effort to wean US supply chains from China, the world’s dominant provider and processor of critical minerals.  The project has participation from more

Read More »

Trump Says He Welcomes China, India Investment in VEN Oil

President Donald Trump said Saturday he welcomed investment by China and India in Venezuela’s oil industry. “China is welcome to come in and will make a great deal on oil,” Trump told reporters during a flight to Mar-a-Lago on Air Force One. He added that the US is working with India on a deal to purchase Venezuelan oil. “India’s coming in and they’re going to be buying Venezuelan oil, as opposed to buying it from Iran,” he said. “We’ve already made the deal, the concept of that deal.” Earlier this week, Venezuela’s acting president signed off on historic changes to the country’s nationalist oil policy that would reduce taxes and allow greater ownership for foreign oil companies, less than a month after US forces captured longtime leader Nicolas Maduro. Shortly after, US Treasury Department issued a general license expanding the ability for US companies to export, sell and refine crude coming from the sanctioned South American country.  The US is set to import the most Venezuelan oil in a year after the Trump administration moved to control the country’s energy supply and pressed oil companies to invest $100 billion in rebuilding the country’s oil infrastructure. Yet as the US emerges as the biggest recipient of Venezuelan oil following Maduro’s capture, shipments to China — which averaged 400,000 barrels a day last year — fell to zero in January amid a US naval crackdown on the so-called dark fleet of vessels used to transport sanctioned oil to China.  Most of the oil arriving in the US comes from Chevron Corp., which holds a US license to sell sanctioned Venezuelan crude. About 20% is being supplied by commodity traders Trafigura Group and Vitol Group, which were tapped by the Trump administration to help sell up to 50 million barrels of oil after

Read More »

Energy Star gets full 2026 funding from Congress

Congress has fully funded the Energy Star program through fiscal year 2026 as part of a funding bill that President Trump signed into law Jan. 23. The administration tried to zero-out the program in early 2025. “The funding is a huge win,” Sabine Rogers, federal policy manager at the U.S. Green Building Council, said in a blog post. A provision in the fiscal 2026 appropriations bill that funds the U.S. Environmental Protection Agency and several other federal agencies, H.R. 6938, mandates that the administration provide at least $33 million to carry out the program through the fiscal year ending Sept. 30 — a modest increase over the $32 million provided in FY2024, the most recent year where program funding data is available. The provision includes a directive from Congress that the administration not take actions to reduce the amount. “[This is] the very first time that Congress has stipulated a mandatory annual spending level for Energy Star,” Rogers said, “placing a clear and binding legal requirement on the administration.” More than 1,200 organizations lobbied Congress last year to save Energy Star after the Trump administration in May proposed eliminating EPA’s Office of Atmospheric Protection, which oversees the program. In letters calling for the EPA to continue the program, organizations said its elimination would damage the real estate sector at a time when it is already facing significant uncertainty.  The Energy Star program has saved consumers and organizations some 5.2 trillion kilowatt-hours of energy and more than $500 billion in costs since it was created in 1992, according to the program website. “Energy Star has grown to become the international standard for energy efficiency and one of the most successful voluntary U.S. government programs in history,” the site says.  Energy Star Portfolio Manager, a free tool that allows commercial building operators

Read More »

Iran Edges Toward Nuclear Talks With USA in Bid to Avoid War

(Update) February 2, 2026, 3:29 PM GMT: Article updated with with more on the talks in fifth paragraph. Iran said talks with the US over a new nuclear deal could get underway in the coming days, building on a flurry of diplomatic activity aimed at averting war between the two sides.  President Masoud Pezeshkian ordered the start of negotiations with Washington “within the framework of the nuclear issue,” Iran’s semi-official Fars news service reported Monday, citing a government source. Talks could include senior officials from both countries such as US envoy Steve Witkoff and Iran’s Foreign Minister Abbas Araghchi, the Tasnim news service said, citing a source it didn’t identify. “We’re ready for diplomacy, but they must understand that diplomacy is not compatible with threats, intimidation or pressure,” Araghchi said on state TV. “We will remain steadfast on this path and hope to see its results soon.” Multiple countries in the Middle East have been acting as intermediaries between Tehran and Washington, according to Esmail Baghaei, a spokesman for Iran’s foreign ministry. No time or location for an initial meeting have been set, Tasnim said, while details of what would be discussed remain unclear, such as whether the US would push for the Islamic Republic to end uranium enrichment.   Iran’s priority in new talks will be sanctions relief and Tehran is “realistic” in its approach, Baghaei said. The developments underline the international effort to ease Middle East tensions as US President Donald Trump threatens Iran with military action if it doesn’t reach an agreement to curb its nuclear program. American naval assets have been dispatched toward the region and Trump said Sunday they were “a couple of days” away, even while unspecified Gulf allies negotiate to “make a deal.” Oil prices fell sharply on Monday, partly because of the heightened diplomatic

Read More »

Texas Upstream Oil, Gas Employment Was Steady in 2025

In a statement sent to Rigzone recently, the Texas Oil & Gas Association (TXOGA) said Texas upstream oil and gas employment was “steady in 2025, despite market headwinds”. TXOGA noted in the statement that, according to data released by the Texas Workforce Commission, Texas upstream oil and gas employment “remained essentially flat in 2025, even as producers continued to deliver strong output amid challenging market conditions”. “Through November 2025, upstream employment totaled 201,200 jobs. While employment declined by 3,500 jobs in November compared with October, year to date employment was little changed, with a net gain of 300 direct upstream jobs,” it added. “Employment was also modestly higher than a year earlier, rising by 100 jobs, or 0.1 percent,” it continued. TXOGA noted in the statement that, “since the Covid-era low point in September 2020”, Texas upstream oil and natural gas employment has “increased by more than 44,000 jobs, a 28 percent gain”. The industry body outlined in the statement that this increase “underscor[es]… the industry’s continued role as a high-wage employer in the Texas economy”. TXOGA President Todd Staples said in the statement that “reaching new production highs in multiple categories with employment essentially remaining steady is absolutely remarkable”. “Navigating these volatile circumstances is a vivid reminder: growth is not guaranteed,” he added. “This resilience demonstrated by increased energy output in 2025 depends on policies that support infrastructure development and market flexibility so the oil and natural gas industry can adapt to uncertainty and continue delivering the affordable, reliable energy that powers our modern way of life,” he continued. TXOGA highlighted in its statement that upstream employment includes oil and natural gas extraction and related support activities, and excludes downstream sectors such as refining, petrochemicals, pipelines, and fuels distribution. 
“The combined industry sectors moved up slightly on average from

Read More »

How Robotics Is Re-Engineering Data Center Construction and Operations

Physical AI: A Reusable Robotics Stack for Data Center Operations This is where the recent collaboration between Multiply Labs and NVIDIA becomes relevant, even though the application is biomanufacturing rather than data centers. Multiply Labs has outlined a robotics approach built on three core elements: Digital twins using NVIDIA Isaac Sim to model hardware and validate changes in simulation before deployment. Foundation-model-based skill learning via NVIDIA Isaac GR00T, enabling robots to generalize tasks rather than rely on brittle, hard-coded behaviors. Perception pipelines including FoundationPose and FoundationStereo, that convert expert demonstrations into structured training data. Taken together, this represents a reusable blueprint for data center robotics. Applying the Lesson to Data Center Environments The same physical-AI techniques now being applied in lab and manufacturing environments map cleanly onto the realities of data center operations, particularly where safety, uptime, and variability intersect. Digital-twin-first deployment Before a robot ever enters a live data hall, it needs to be trained in simulation. That means modeling aisle geometry, obstacles, rack layouts, reflective surfaces, and lighting variation; along with “what if” scenarios such as blocked aisles, emergency egress conditions, ladders left in place, or spill events. Simulation-first workflows make it possible to validate behavior and edge cases before introducing any new system into a production environment. Skill learning beats hard-coded rules Data centers appear structured, but in practice they are full of variability: temporary cabling, staged parts, mixed-vendor racks, and countless human exceptions. Foundation-model approaches to manipulation are designed to generalize across that messiness far better than traditional rule-based automation, which tends to break when conditions drift even slightly from the expected state. 
Imitation learning captures tribal knowledge Many operational tasks rely on tacit expertise developed over years in the field, such as how to manage stiff patch cords, visually confirm latch engagement, or stage a

Read More »

Applied Digital CEO Wes Cummins On the Hard Part of the AI Boom: Execution

Designing for What Comes After the Current AI Cycle Applied Digital’s design philosophy starts with a premise many developers still resist: today’s density assumptions may not hold. “We’re designing for maximum flexibility for the future—higher density power, lower density power, higher voltage delivery, and more floor space,” Cummins said. “It’s counterintuitive because densities are going up, but we don’t know what comes next.” That choice – to allocate more floor space even as rack densities climb – signals a long-view approach. Facilities are engineered to accommodate shifts in voltage, cooling topology, and customer requirements without forcing wholesale retrofits. Higher-voltage delivery, mixed cooling configurations, and adaptable data halls are baked in from the start. The goal is not to predict the future perfectly, Cummins stressed, but to avoid painting infrastructure into a corner. Supply Chain as Competitive Advantage If flexibility is the design thesis, supply chain control is the execution weapon. “It’s a huge advantage that we locked in our MEP supply chain 18 to 24 months ago,” Cummins said. “It’s a tight environment, and more timelines are going to get missed in 2026 because of it.” Applied Digital moved early to secure long-lead mechanical, electrical, and plumbing components; well before demand pressure fully rippled through transformers, switchgear, chillers, generators, and breakers. That foresight now underpins the company’s ability to make credible delivery commitments while competitors confront procurement bottlenecks. Cummins was blunt: many delays won’t stem from poor planning, but from simple unavailability. From 100 MW to 700 MW Without Losing Control The past year marked a structural pivot for Applied Digital. What began as a single, 100-megawatt “field of dreams” facility in North Dakota has become more than 700 MW under construction, with expansion still ahead. “A hundred megawatts used to be considered scale,” Cummins said. 
“Now we’re at 700

Read More »

From Silicon to Cooling: Dell’Oro Maps the AI Data Center Buildout

For much of the past decade, data center growth could be measured in incremental gains: another efficiency point here, another capacity tranche there. That era is over. According to a cascade of recent research from Dell’Oro Group, the AI investment cycle has crossed into a new phase, one defined less by experimentation and more by industrial-scale execution. Across servers, networks, power, and cooling, Dell’Oro’s latest data points to a market being reshaped end-to-end by AI workloads which are pulling forward capital spending, redefining bill-of-material assumptions, and forcing architectural transitions that are rapidly becoming non-negotiable. Capex Becomes the Signal The clearest indicator of the shift is spending. Dell’Oro reported that worldwide data center capital expenditures rose 59 percent year-over-year in 3Q 2025, marking the eighth consecutive quarter of double-digit growth. Importantly, this is no longer a narrow, training-centric surge. “The Top 4 US cloud service providers—Amazon, Google, Meta, and Microsoft—continue to raise data center capex expectations for 2025, supported by increased investments in both AI and general-purpose infrastructure,” said Baron Fung, Senior Research Director at Dell’Oro Group. He added that Oracle is on track to double its data center capex as it expands capacity for the Stargate project. “What is notable this cycle is not just the pace of spending, but the expanding scope of investment,” Fung said. Hyperscalers are now scaling accelerated compute, general-purpose servers, and the supporting infrastructure required to deploy AI at production scale, while simultaneously applying tighter discipline around asset lifecycles and depreciation to preserve cash flow. The result is a capex environment that looks less speculative and more structural, with investment signals extending well into 2026. Accelerators Redefine the Hardware Stack At the component level, the AI effect is even more pronounced. 
Dell’Oro found that global data center server and storage component revenue jumped 40 percent


Rethinking Water in the AI Data Center Era

Finding Water by Eliminating Waste: Leakage as a Hidden Demand Driver

ION Water and Meta frame leakage not as a marginal efficiency issue, but as one of the largest and least visible sources of water demand. According to the release, more than half of the water paid for at some properties can be lost to “invisible leaks,” including running toilets, aging water heaters, and faulty fixtures that go undetected for extended periods. ION’s platform is designed to surface that hidden demand. By monitoring water consumption at the unit level, the system flags anomalies in real time and directs maintenance teams to specific fixtures, rather than entire buildings. The company says this approach can reduce leak-driven water waste by as much as 60%. This represents an important evolution in how hyperscalers defend and contextualize their water footprints: Instead of focusing solely on their own direct WUE metrics, operators are investing in demand reduction within the same watershed where their data centers operate. That shift reframes the narrative from simply managing active water consumption to actively helping stabilize stressed local water systems.

The Accounting Shift: Volumetric Water Benefits (VWB)

The release explicitly positions the project as a model for Volumetric Water Benefits (VWB) initiatives, projects intended to deliver measurable environmental gains while also producing operational and financial benefits for underserved communities. This framing aligns with a broader stewardship accounting movement promoted by organizations such as the World Resources Institute, which has developed Volumetric Water Benefit Accounting (VWBA) as a standardized method for quantifying and valuing watershed-scale benefits. Meta is explicit that the project supports its water-positive commitment tied to its Temple, Texas data center community.
The company has set a 2030 goal to restore more water than it consumes across its global operations and has increasingly emphasized “water stewardship in our data center


Microsoft and Meta’s Earnings Week Put the AI Data Center Cycle in Sharp Relief

If you’re trying to understand where the hyperscalers really are in the AI buildout, beyond the glossy campus renders and “superintelligence” rhetoric, this week’s earnings calls from Microsoft and Meta offered a more grounded view. Both companies are spending at a scale the data center industry has never had to absorb at once. Both are navigating the same hard constraints: power, capacity, supply chain, silicon allocation, and time-to-build. But the market’s reaction split decisively, and that divergence tells its own story about what investors will tolerate in 2026. To wit: Massive capex is acceptable when the return narrative is already visible in the P&L…and far less so when the payoff is still being described as “early innings.”

Microsoft: AI Demand Is Real. So Is the Cost

Microsoft’s fiscal Q2 2026 results reinforced the core fact that has been driving North American hyperscale development for two years: Cloud + AI growth is still accelerating, and Azure remains one of the primary runways. Microsoft said Q2 total revenue rose to $81.3 billion, while Microsoft Cloud revenue reached $51.5 billion, up 26% (constant currency 24%). Intelligent Cloud revenue hit $32.9 billion, up 29%, and Azure and other cloud services revenue grew 39%. That’s the demand signal. The supply signal is more complicated. On the call and in follow-on reporting, Microsoft’s leadership framed the moment as a deliberate capacity build into persistent AI adoption. Yet the bill for that build is now impossible to ignore: Reuters reported Microsoft’s capital spending totaled $37.5 billion in the quarter, up nearly 66% year-over-year, with roughly two-thirds going toward computing chips. That “chips first” allocation matters for the data center ecosystem. It implies a procurement and deployment reality that many developers and colo operators have been living: the short pole is not only power and buildings; it’s GPU


Network engineers take on NetDevOps roles to advance stalled automation efforts

What NetDevOps looks like

Most enterprises begin their NetDevOps journey modestly by automating a limited set of repetitive, lower-level tasks. Nearly 70% of enterprises pursuing infrastructure automation start with task-level scripting, rather than end-to-end automation, according to theCUBE Research’s AppDev Done Right Summit. This can include using tools such as Ansible or Python scripts to standardize device provisioning, configuration changes, or other routine changes. Then, more mature teams adopt Git for version control, define golden configurations, and apply basic validation before and after changes, explains Bob Laliberte, principal analyst at SiliconANGLE and theCUBE. A smaller group of enterprises extends automation efforts into complete CI/CD-style workflows with consistent testing, staged deployments, and automated verification, Laliberte adds. This capability is present in less than 25% of enterprises today, according to theCUBE, and it is typically focused on specific domains such as data center fabric or cloud networking. NetDevOps usually exists within the network organization as a dedicated automation or platform subgroup, and more than 60% of enterprises anchor NetDevOps initiatives within traditional infrastructure teams rather than application or platform engineering groups, according to Laliberte. “In larger enterprises, NetDevOps capabilities are increasingly centralized within shared infrastructure or platform teams that provide tooling, pipelines, and guardrails across compute, storage, and networking,” Laliberte says. “In more advanced or cloud-native environments, network specialists may be embedded within application, site reliability engineering (SRE), or platform teams, particularly where networking directly impacts application performance.”

Transforming work

At its core, NetDevOps isn’t just about changing titles for network engineers. It is about changing workflows, behaviors, and operating models across network operations.


Microsoft will invest $80B in AI data centers in fiscal 2025

And Microsoft isn’t the only one ramping up its investments in AI-enabled data centers. Rival cloud service providers are all investing in either upgrading or opening new data centers to capture a larger chunk of business from developers and users of large language models (LLMs). In a report published in October 2024, Bloomberg Intelligence estimated that demand for generative AI would push Microsoft, AWS, Google, Oracle, Meta, and Apple to devote $200 billion between them to capex in 2025, up from $110 billion in 2023. Microsoft is one of the biggest spenders, followed closely by Google and AWS, Bloomberg Intelligence said. Its estimate of Microsoft’s capital spending on AI, at $62.4 billion for calendar 2025, is lower than Smith’s claim that the company will invest $80 billion in the fiscal year to June 30, 2025. Both figures, though, are way higher than Microsoft’s 2020 capital expenditure of “just” $17.6 billion. The majority of the increased spending is tied to cloud services and the expansion of AI infrastructure needed to provide compute capacity for OpenAI workloads. Separately, last October Amazon CEO Andy Jassy said his company planned total capex spend of $75 billion in 2024 and even more in 2025, with much of it going to AWS, its cloud computing division.


John Deere unveils more autonomous farm machines to address skill labor shortage

Self-driving tractors might be the path to self-driving cars. John Deere has revealed a new line of autonomous machines and tech across agriculture, construction and commercial landscaping. The Moline, Illinois-based company has been in business for 187 years, yet it has become a regular at the big tech trade show in Las Vegas as a non-tech company showing off technology, and it is back at CES 2025 with more autonomous tractors and other vehicles. This is not something we usually cover, but John Deere has a lot of data that is interesting in the big picture of tech. The message from the company is that there aren’t enough skilled farm laborers to do the work that its customers need. It’s been a challenge for most of the last two decades, said Jahmy Hindman, CTO at John Deere, in a briefing. Much of the tech will come this fall and after that. He noted that the average farmer in the U.S. is over 58 and works 12 to 18 hours a day to grow food for us. And he said the American Farm Bureau Federation estimates there are roughly 2.4 million farm jobs that need to be filled annually, and the agricultural work force continues to shrink. (This is my hint to the anti-immigration crowd.)

John Deere’s autonomous 9RX Tractor. Farmers can oversee it using an app.

While each of these industries experiences its own set of challenges, a commonality across all is skilled labor availability. In construction, about 80% of contractors struggle to find skilled labor. And in commercial landscaping, 86% of landscaping business owners can’t find labor to fill open positions, he said. “They have to figure out how to do


2025 playbook for enterprise AI success, from agents to evals

2025 is poised to be a pivotal year for enterprise AI. The past year has seen rapid innovation, and this year will see the same. This has made it more critical than ever to revisit your AI strategy to stay competitive and create value for your customers. From scaling AI agents to optimizing costs, here are the five critical areas enterprises should prioritize for their AI strategy this year.

1. Agents: the next generation of automation

AI agents are no longer theoretical. In 2025, they’re indispensable tools for enterprises looking to streamline operations and enhance customer interactions. Unlike traditional software, agents powered by large language models (LLMs) can make nuanced decisions, navigate complex multi-step tasks, and integrate seamlessly with tools and APIs. At the start of 2024, agents were not ready for prime time, making frustrating mistakes like hallucinating URLs. They started getting better as frontier large language models themselves improved. “Let me put it this way,” said Sam Witteveen, cofounder of Red Dragon, a company that develops agents for companies, and that recently reviewed the 48 agents it built last year. “Interestingly, the ones that we built at the start of the year, a lot of those worked way better at the end of the year just because the models got better.” Witteveen shared this in the video podcast we filmed to discuss these five big trends in detail. Models are getting better and hallucinating less, and they’re also being trained to do agentic tasks. Another feature that the model providers are researching is a way to use the LLM as a judge, and as models get cheaper (something we’ll cover below), companies can use three or more models to


OpenAI’s red teaming innovations define new essentials for security leaders in the AI era

OpenAI has taken a more aggressive approach to red teaming than its AI competitors, demonstrating its security teams’ advanced capabilities in two areas: multi-step reinforcement and external red teaming. OpenAI recently released two papers that set a new competitive standard for improving the quality, reliability and safety of AI models in these two techniques and more. The first paper, “OpenAI’s Approach to External Red Teaming for AI Models and Systems,” reports that specialized teams outside the company have proven effective in uncovering vulnerabilities that might otherwise have made it into a released model because in-house testing techniques may have missed them. In the second paper, “Diverse and Effective Red Teaming with Auto-Generated Rewards and Multi-Step Reinforcement Learning,” OpenAI introduces an automated framework that relies on iterative reinforcement learning to generate a broad spectrum of novel, wide-ranging attacks.

Going all-in on red teaming pays practical, competitive dividends

It’s encouraging to see competitive intensity in red teaming growing among AI companies. When Anthropic released its AI red team guidelines in June of last year, it joined AI providers including Google, Microsoft, Nvidia, OpenAI, and even the U.S.’s National Institute of Standards and Technology (NIST), which all had released red teaming frameworks. Investing heavily in red teaming yields tangible benefits for security leaders in any organization. OpenAI’s paper on external red teaming provides a detailed analysis of how the company strives to create specialized external teams that include cybersecurity and subject matter experts. The goal is to see if knowledgeable external teams can defeat models’ security perimeters and find gaps in their security, biases and controls that prompt-based testing couldn’t find.
What makes OpenAI’s recent papers noteworthy is how well they define using human-in-the-middle
