Nous Research drops Hermes 4 AI models that outperform ChatGPT without content restrictions

Nous Research, a secretive artificial intelligence startup that has emerged as a leading voice in the open-source AI movement, quietly released Hermes 4 on Monday, a family of large language models that the company claims can match the performance of leading proprietary systems while offering unprecedented user control and minimal content restrictions.

The release represents a significant escalation in the battle between open-source AI advocates and major technology companies over who should control access to advanced artificial intelligence capabilities. Unlike models from OpenAI, Google, or Anthropic, Hermes 4 is designed to respond to nearly any request without the safety guardrails that have become standard in commercial AI systems.

“Hermes 4 builds on our legacy of user-aligned models with expanded test-time compute capabilities,” Nous Research announced on X (formerly Twitter). “Special attention was given to making the models creative and interesting to interact with, unencumbered by censorship, and neutrally aligned while maintaining state of the art level math, coding, and reasoning performance for open weight models.”

Hermes 4 introduces what Nous Research calls “hybrid reasoning,” allowing users to toggle between fast responses and deeper, step-by-step thinking processes. When activated, the models generate their internal reasoning within special tags before providing a final answer — similar to OpenAI’s o1 reasoning models but with full transparency into the AI’s thought process.
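
To make the idea of tagged reasoning concrete, here is a minimal sketch of how a client might separate the thinking trace from the final answer. The tag name `<think>` and the helper `split_reasoning` are assumptions for illustration; the article says only that the reasoning appears within "special tags."

```python
import re

# Assumption: reasoning is wrapped in <think>...</think>. The article only
# mentions "special tags", so adjust this to the model's actual template.
THINK_PATTERN = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(output: str) -> tuple[str, str]:
    """Return (reasoning, answer) extracted from a raw model output string."""
    match = THINK_PATTERN.search(output)
    if match is None:
        # Fast mode: the model produced no reasoning block.
        return "", output.strip()
    reasoning = match.group(1).strip()
    # Everything after the closing tag is treated as the final answer.
    answer = output[match.end():].strip()
    return reasoning, answer


raw = "<think>2 + 2 is 4, then double it.</think>The result is 8."
reasoning, answer = split_reasoning(raw)
print("reasoning:", reasoning)
print("answer:", answer)
```

Because the trace is plain text in the output, an application could log it, display it, or discard it before showing the answer to a user.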

The technical achievement is substantial. In testing, Hermes 4’s largest 405-billion parameter model scored 96.3% on the MATH-500 benchmark in reasoning mode and 81.9% on the challenging AIME’24 mathematics competition — performance that rivals or exceeds many proprietary systems costing millions more to develop.

“The challenge is making thinking traces useful and verifiable without runaway reasoning,” noted AI researcher Rohan Paul on X, highlighting one of the technical breakthroughs in the release.

Perhaps most notably, Hermes 4 achieved the highest score among all tested models on “RefusalBench,” a new benchmark Nous Research created to measure how often AI systems refuse to answer questions. The model scored 57.1% in reasoning mode, significantly outperforming GPT-4o (17.67%) and Claude Sonnet 4 (17%).

Hermes 4 models from Nous Research answered significantly more questions than competing AI systems on RefusalBench, a test measuring how often models refuse to respond to user requests. (Credit: Nous Research)

Inside DataForge and Atropos: The breakthrough training systems behind Hermes 4’s capabilities

Behind Hermes 4’s capabilities lies a sophisticated training infrastructure that Nous Research has developed over several years. The models were trained using two novel systems: DataForge, a graph-based synthetic data generator, and Atropos, an open-source reinforcement learning framework.

DataForge creates training data through what the company describes as “random walks” through directed graphs, transforming simple pre-training data into complex instruction-following examples. The system can, for instance, take a Wikipedia article and transform it into a rap song, then generate questions and answers based on that transformation.
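
The "random walk" description can be illustrated with a toy example. This is a sketch only: the graph, node names, and `random_walk` function below are hypothetical and are not taken from DataForge itself.

```python
import random

# Hypothetical transformation graph: each key is a step, and its list holds
# the steps that may follow it. The node names are invented for illustration.
GRAPH = {
    "source_article": ["rap_song", "summary"],
    "rap_song": ["question_answer"],
    "summary": ["question_answer"],
    "question_answer": [],  # terminal node: no outgoing edges
}


def random_walk(graph, start, rng):
    """Follow randomly chosen outgoing edges from start until a terminal node."""
    path = [start]
    while graph[path[-1]]:
        path.append(rng.choice(graph[path[-1]]))
    return path


rng = random.Random(0)  # seeded so repeated runs give the same walk
print(random_walk(GRAPH, "source_article", rng))
```

Each walk yields one recipe for a training example, such as article to rap song to question and answer. Sampling many walks would give varied recipes from the same source text.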

Atropos, meanwhile, operates like hundreds of specialized training environments where AI models practice specific skills—mathematics, coding, tool use, and creative writing—receiving feedback only when they produce correct solutions. This “rejection sampling” approach ensures that only verified, high-quality responses make it into the training data.
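
As a rough illustration of rejection sampling with a verifier, the sketch below keeps a sampled response only if it passes a correctness check. The "model" here is a stand-in that adds two numbers with occasional errors. None of the function names come from Atropos.

```python
import random


def propose_answer(a, b, rng):
    """Stand-in for a model: returns the correct sum most of the time."""
    noise = rng.choice([0, 0, 0, 1, -1])
    return a + b + noise


def verify(a, b, answer):
    """Environment-side check that decides whether a sample is kept."""
    return answer == a + b


def collect_verified(problems, samples_per_problem, rng):
    """Sample answers per problem and keep the first one that verifies."""
    kept = []
    for a, b in problems:
        for _ in range(samples_per_problem):
            answer = propose_answer(a, b, rng)
            if verify(a, b, answer):
                kept.append({"prompt": f"{a} + {b} = ?", "response": str(answer)})
                break  # one verified sample per problem is enough here
    return kept


rng = random.Random(0)
dataset = collect_verified([(1, 2), (10, 5), (7, 7)], samples_per_problem=4, rng=rng)
for row in dataset:
    print(row)
```

Incorrect samples never reach the dataset. A problem for which no sample verifies contributes nothing.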

Atropos is Nous’ Reinforcement Learning framework
Atropos is an open source reinforcement learning environment by Nous that has hundreds of “gyms” (like math, coding, games, tool‑use, vision) to train and evaluate LLM trajectories via scalable, async RL loops.
In other words… pic.twitter.com/fjxaQKClEZ
— Tommy (@Shaughnessy119) August 26, 2025

“Nous used these environments to generate the dataset for Hermes 4!” explained Tommy Shaughnessy, a venture capitalist at Delphi Ventures who has invested in Nous Research. “All in the dataset contains 3.5 million reasoning samples and 1.6 million non-reasoning samples! Hermes was trained on RL data, not just static datasets of question and answer!”

The training process required 192 Nvidia B200 GPUs and 71,616 GPU hours for the largest model — a significant but not unprecedented computational investment that demonstrates how specialized techniques can compete with the massive scale of tech giants.
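
For a sense of scale, the two figures above can be turned into an approximate wall-clock time. This back-of-the-envelope calculation assumes all 192 GPUs ran concurrently for the whole job, which the article does not state.

```python
GPU_COUNT = 192      # from the article
GPU_HOURS = 71_616   # from the article, largest model

# Assumption: every GPU was busy for the full duration of the run.
wall_clock_hours = GPU_HOURS / GPU_COUNT
wall_clock_days = wall_clock_hours / 24

print(f"{wall_clock_hours:.0f} hours, about {wall_clock_days:.1f} days")
# prints "373 hours, about 15.5 days"
```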

Why Nous Research believes AI safety guardrails are ‘annoying as hell’ and hurt innovation

Nous Research has built its reputation on a philosophy that puts user control above corporate content policies. The company’s models are designed to be “steerable,” meaning they can be fine-tuned or prompted to behave in specific ways without the rigid safety constraints that characterize commercial AI systems.

“Hermes 4 is not shackled by disclaimers, rules and being overly cautious which is annoying as hell and hurts innovation and usability,” wrote Shaughnessy in a detailed thread analyzing the release. “If its open source but refuses all requests its pointless. Not an issue with Hermes 4.”

Hermes 4 is not shackled by disclaimers, rules and being overly cautious which is annoying as hell and hurts innovation and usability.
Hermes 4 70B is at the complete opposite of the spectrum vs OpenAI’s open source model. It’s also ~4x more open vs ChatGPT 4o!
If its open… pic.twitter.com/q5RpX1oOzo
— Tommy (@Shaughnessy119) August 26, 2025

This approach has made Nous Research popular among AI researchers and developers who want maximum flexibility, but it also places the company at the center of ongoing debates about AI safety and content moderation. While the models can theoretically be used for harmful purposes, Nous Research argues that transparency and user control are preferable to corporate gatekeeping.

The company’s technical report, released alongside the models, provides unprecedented detail about the training process, evaluation results, and even the actual text outputs from benchmark tests. “We believe this report sets a new standard for transparency in benchmarking,” the company stated.

How a small startup with 192 GPUs is competing against Big Tech’s billion-dollar AI budgets

Hermes 4’s release comes at a pivotal moment in the AI industry. While major technology companies have poured billions into developing increasingly powerful AI systems, a growing open-source movement argues that these capabilities should not be controlled by a handful of corporations.

Recent months have seen significant advances in open-source AI, with models like Meta’s Llama 3.1, DeepSeek’s R1, and Alibaba’s Qwen series achieving performance that rivals proprietary systems. Hermes 4 represents another step in this progression, particularly in the area of reasoning—long considered a strength of closed systems like OpenAI’s o1.

“First up, Nous is a startup with dozens of extremely talented people,” noted Shaughnessy. “They do not have the $100b+ annual capex spend of a hyperscaler nor 1,000’s of employees and despite that they continue to put out innovative models and research at an insane pace.”

The startup, which raised $65 million in funding earlier this year led by Paradigm, has also been developing Psyche Network, a distributed training system that aims to coordinate AI training across internet-connected computers using blockchain technology.

The technical fix that stopped Hermes 4 from thinking in endless loops

One of Hermes 4’s most significant technical contributions addresses a problem plaguing reasoning models: overly long thinking processes. The researchers found that their smaller 14-billion parameter model would reach maximum context length 60% of the time when reasoning, essentially getting stuck in endless loops of thinking.

Their solution involved a second training stage that teaches models to stop reasoning at exactly 30,000 tokens, reducing overlong generation by 65-79% while maintaining most of the reasoning performance. This “length control” technique could prove valuable for the broader AI research community.
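
One way to picture length control is as a cap on the reasoning trace in each training target. The sketch below truncates an overlong trace at the budget and appends a closing tag. The tag name `</think>` and the function `cap_reasoning` are assumptions, and the article does not describe how the second training stage builds its examples.

```python
REASONING_BUDGET = 30_000  # token budget cited in the article
CLOSE_TAG = "</think>"     # assumed tag name, not confirmed by the article


def cap_reasoning(reasoning_tokens, budget=REASONING_BUDGET, close_tag=CLOSE_TAG):
    """Keep at most `budget` reasoning tokens, then close the reasoning block."""
    return list(reasoning_tokens[:budget]) + [close_tag]


overlong = ["tok"] * 50_000
capped = cap_reasoning(overlong)
print(len(capped), capped[-1])
```

A trace under the budget passes through unchanged apart from the closing tag. Only overlong traces are cut.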

“Smaller models (Muyu He on X, highlighting insights from the technical report.

However, Hermes 4 still faces limitations common to open-source models. Despite impressive benchmark performance, the models require significant computational resources to run and may not match the ease of use or reliability of commercial AI services for many applications.

Where to try Hermes 4 and what it costs compared to ChatGPT and Claude

Nous Research has made Hermes 4 available through multiple channels, reflecting its open-source philosophy. The model weights are freely downloadable on Hugging Face, and the company also offers access through its revamped chat interface and through partnerships with inference providers like Chutes, Nebius, and Luminal.

“You can try Hermes 4 in the new, revamped Nous Chat UI,” the company announced, highlighting features like parallel interactions and a memory system.

For enterprise users and researchers, the models represent a potentially attractive alternative to paying for API access to proprietary systems, especially for applications requiring high levels of customization or handling of sensitive content.

The bigger picture: What Hermes 4 means for the future of AI development

The release of Hermes 4 represents more than just another AI model launch — it’s a statement about who should control the future of artificial intelligence. In an industry increasingly dominated by a handful of tech giants with virtually unlimited resources, Nous Research has demonstrated that innovation can still come from unexpected places.

The company’s approach raises fundamental questions about the trade-offs between safety and capability, between corporate control and user freedom. While major technology companies argue that careful content moderation and safety guardrails are essential for responsible AI deployment, Nous Research contends that transparency and user agency are more important than corporate-imposed restrictions.

Whether this philosophy will ultimately prove beneficial or problematic remains to be seen. But one thing is certain: Hermes 4 has shown that the future of AI won’t be determined solely by the companies with the deepest pockets.

In a field where yesterday’s impossibilities become tomorrow’s commodities, Nous Research just proved that the only thing more dangerous than an AI that says no might be one that’s willing to say yes.

Daily insights on business use cases with VB Daily

If you want to impress your boss, VB Daily has you covered. We give you the inside scoop on what companies are doing with generative AI, from regulatory shifts to practical deployments, so you can share insights for maximum ROI.

Read our Privacy Policy

Thanks for subscribing. Check out more VB newsletters here.

An error occured.

Stay Ahead

Explore More Insights

Stay ahead with more perspectives on cutting-edge power, infrastructure, energy, bitcoin and AI solutions. Explore these articles to uncover strategies and insights shaping the future of industries.

Kubernetes v1.34 brings networking refinements for cloud-native infrastructure

That’s one of the core challenges that KEP #3015 aims to help solve. Titled as ‘PreferSameZone and PreferSameNode Traffic Distribution’ the enhancement provides network operators with fine-grained control over traffic routing decisions within clusters. According to the KEP, the goals of the enhancement are to make traffic distribution less ambiguous.

Intel touts efficiency and performance in new 288-core Xeon processor

The processor packs 288 Efficiency cores and is the successor to the 144-core Sierra Forest chip currently on the market. In recent years, Intel has switched CPU design to Efficiency cores (E-cores) and Performance cores (P-cores). Their names are descriptors of what they are: E-cores are lower power and lower

Linux Foundation launches Essedum 1.0 to simplify AI integration in network operations

Essedum simplifies this entire workflow by providing a unified and comprehensive framework that contains all the necessary components in one place. Specifically, he noted that it offers networking teams: Easy access to AI building blocks: Essedum makes it simple to access and integrate the different layers required to build AI

Cisco launches dedicated wireless certification track

CCIE Wireless The CCIE Wireless certification validates networking professionals’ ability to “maximize the potential of any enterprise wireless solution from designing and deploying to operating and optimizing,” Cisco says. “Our Cisco CCIE Wireless certification also reflects the growth and evolution of wireless technologies. It includes Cisco’s cloud-based network management solution,

Karoon Reports Increase in 2P Reserves in Brazil’s Bauna Project

Karoon Energy Ltd reported an increase of 13.7 million barrels (MMbbl) of 2P reserves in the Bauna project in the southern post-salt region of the Santos Basin, Brazil. The increase in 2P reserves, which are 35 percent higher as of June 30 from six months ago, reflect the acquisition of the Baúna floating production, storage and offloading vessel (FPSO), facility operating life extension, and “better reservoir performance than anticipated,” Karoon said in a news release. The project’s economic field life has also been extended by seven years to 2039, limited by the current production concession expiry, the company said. The Bauna Project comprises the Bauna, Piracaba and Patola fields in Concession BM-S-40 offshore Brazil The asset’s 2C contingent resources have been reduced from 11.2 MMbbl to 3.0 MMbbl over the same period, reflecting an upwards revision of 5.5 MMbbl, less the transfer of 13.7 MMbbl to the 2P reserves category. The reassessment of contingent resources was based on the potential to produce until the assessed end of the facility’s operating life of 2040, according to the release. Karoon said it expects to invest $55 million to $65 million in an FPSO revitalization campaign in 2026, and approximately $80 million to $90 million in life extension activities between 2030 and 2034. Life extension capital expenditures are expected to include two flotel campaigns and associated equipment upgrades. The reassessment included an updated reservoir performance, modelling and activities outlook, as well as the removal of Altera & Ocyan FPSO charter costs, leading to a reduction in minimum economic production rates. It also included an updated assessment of long-term operating costs and field abandonment costs, with the production concession expiry date of February 2039 factored in, Karoon said. “One of the key drivers underpinning the [first-half] Baúna FPSO acquisition was the potential to reduce

How Much Crude Oil is in the USA Strategic Petroleum Reserve?

According to the U.S. Energy Information Administration’s (EIA) latest weekly petroleum status report, which was released on August 27 and included data for the week ending August 22, there were 404.2 million barrels of crude oil in the U.S. Strategic Petroleum Reserve (SPR) on August 22. The EIA’s report showed that crude oil in the SPR increased week on week and year on year. Crude oil in the SPR stood at 403.4 million barrels on August 15 and 377.9 million barrels on August 23, 2024, the report highlighted. In its previous weekly petroleum status report, which was released on August 20 and included data for the week ending August 15, the EIA showed that crude oil in the SPR stood at 403.4 million barrels on August 15, 403.2 million barrels on August 8, and 377.2 million barrels on August 16, 2024. In its latest short term energy outlook (STEO), which was released on August 12, the EIA projected that crude oil in the SPR will increase in 2025 and 2026. The EIA forecast in this STEO that crude oil in the SPR will come in at 419.7 million barrels this year and 426.4 million barrels next year. In this STEO, the EIA highlighted that crude oil in the SPR came in at 393.6 million barrels in 2024. In its August STEO, the EIA projected that crude oil in the SPR would come in at 409.7 million barrels in the third quarter of this year, 419.7 million barrels in the fourth quarter, and 426.4 million barrels across all four quarters of 2026. The EIA forecast in its previous STEO, which was released in July, that crude oil in the SPR would come in at 423.5 million barrels this year and 430.2 million barrels next year. That STEO projected that crude oil

Industry Body Pays Tribute to Phil Kirk

In a statement posted on social media, industry body Offshore Energies UK (OEUK) paid tribute to former OEUK Board Co-Chair Phil Kirk, who passed away recently. “Phil was a tireless champion of the North Sea industry and was respected by everyone he worked alongside,” OEUK Chief Executive David Whitehouse said in the statement. “As co-chair of OEUK’s board, Phil’s guidance, passion, and generosity helped drive the North Sea Transition Deal and made sure the voice of our 200,000 people was heard by governments and politicians of all parties,” he added. “In business, he helped transform the North Sea through the companies he led, gaining much respect from his peers for the way in which he led his organizations. Our thoughts are with his family, friends, and former colleagues during these difficult times. We will miss him,” he continued. In a statement posted on its site this week, the Aberdeen & Grampian Chamber of Commerce (AGCC) said Kirk “played a significant role in the oil and gas sector, and also indulged in his other passion, football, owning Chesterfield FC along with his brother”. The AGCC highlighted in that statement that Kirk founded Chrysaor Ltd, “a predecessor company to Harbour Energy”. Chesterfield FC said in a statement posted on its site on Monday, “it is with great sadness that we announce that the club’s owner, Phil Kirk, has died at the age of 59, following a short illness”. “Our thoughts are with Phil’s family and friends,” that statement added. A statement posted on AFC Wimbledon’s website this week said, “Chesterfield Football Club has announced with great sadness that club owner Phil Kirk has passed away at the age of 59”. “Under Phil’s stewardship, Chesterfield achieved promotion back to the EFL in 2024, marking a proud return to the EFL, before reaching the

India Refiners Boost US Crude Buys

Indian refiners have stepped up their purchases of US crude after price drops, as Washington cracks down on the Asian nation for buying Russian barrels. This week, state and private oil processors including Reliance Industries Ltd., Indian Oil Corp. and Bharat Petroleum Corp. bought more US West Texas Intermediate crude than normal, according to traders who asked not to be identified as they’re not authorized to speak to the media. The main driver was more favorable prices for the grade, which have weakened relative to Middle East benchmarks, they said. Reliance, IOC and BPCL didn’t immediately respond to requests for comment. White House trade adviser Peter Navarro earlier this week cranked up pressure on India to halt its purchases of Russian oil, repeating accusations that New Delhi is funding the Kremlin’s campaign in Ukraine. The remarks came after the Trump administration doubled tariffs on Indian goods to 50 percent. New Delhi has defended its ties with Russia and called Washington’s actions “unfair, unjustified and unreasonable.” It has eased, but not stopped, purchases since US criticism began to ramp up. What do you think? We’d love to hear from you, join the conversation on the Rigzone Energy Network. The Rigzone Energy Network is a new social experience created for you and all energy professionals to Speak Up about our industry, share knowledge, connect with peers and industry insiders and engage in a professional community that will empower your career in energy.

Gas Treatment, Liquefaction Units of Congo LNG Phase 2 Completed

The floating liquefied natural gas (FLNG) unit for the expansion of Congo LNG has departed Shanghai for the Republic of the Congo. Congo LNG Phase II’s floating production unit (FPU), which will treat natural gas for delivery to the FLNG or liquefaction unit, has also been completed and is set to sail to the Central African country in the coming days, Eni SpA said in a statement online. The new FLNG platform, called Nguya, has a liquefaction capacity of 2.4 million metric tons per annum (MMtpa). It will raise Congo LNG’s capacity to three MMtpa. “Designed with advanced technologies to ensure a reduced carbon footprint, it stands as a benchmark in the industry”, Italy’s state-backed oil and gas major Eni said. “Conceived, designed, and built in only 33 months – from contract award to sail-away – the FLNG sets a record for time-to-market in the entire sector. “Moreover, its cutting-edge technical features allow it to process gas from multiple fields, making it suitable for the development of future fields as well”. Nguya, 376 meters (1,233.6 feet) long and 60 meters wide, will be moored at a depth of 35 meters, according to Eni. Meanwhile Saipem SpA, an Italian energy engineering company, said it had completed the conversion of the Scarabeo 5 semi-submersible drilling unit into an FPU for Congo LNG. The FPU will be installed northwest of the Djeno Terminal at a depth of around 35 meters, Saipem said. “The Scarabeo 5, built in Italy in the early 1990s, is one of the best units of its generation, hence it was chosen as an asset for conversion into a floating gas treatment facility”, it said. “Once installed, it will serve as a control hub for the entire offshore development field of Eni’s Congo LNG Project. “The conversion of Scarabeo 5 was completed in less

Technip Energies, JGC Win Abadi LNG FEED Contracts

Technip Energies NV, together with JGC Holding Corp., secured a pair of Front-End Engineering Design (FEED) contracts for the Abadi liquefied natural gas (LNG) project being developed by INPEX Corp. in Indonesia’s Masela Block. The first contract is for the gas Floating Production Storage and Offloading (FPSO) vessel and the second one is for the onshore LNG facility, Technip Energies said. The FPSO FEED contract involves engineering a gas FPSO for the Abadi gas field. This unit will process the gas and export dry gas via a subsea pipeline to the onshore LNG plant for liquefaction. The onshore LNG FEED contract includes designing two LNG trains and supporting infrastructure including a jetty, materials offloading facilities, and a logistics supply base. Dry gas from the FPSO will undergo impurity removal before liquefaction, storage, and offloading. The carbon dioxide (CO2) captured from the dry gas will be reinjected into the well. “LNG is a critical transition fuel for global energy security. We are honored to be selected as one of the FEED contractors for the two essential components of the Abadi Masela ambitious development, leveraging our recognized expertise in LNG and gas FPSOs”, Marco Villa, Chief Business Officer of Technip Energies, commented. The Abadi LNG project aims to provide 9.5 million tons of LNG a year, along with an extra 150 million standard cubic feet of natural gas per day for domestic use, Technip Energies said. Additionally, the project incorporates carbon capture and storage technology, which is in line with Indonesia’s goal of achieving net-zero CO2 emissions by 2060, the company said. “This project represents a significant step forward in the development of low-carbon energy solutions, incorporating CCS technologies to deliver sustainable LNG, which is in line with the direction of our energy transition strategy”, Shoji Yamada, Representative Director and President of

AI networking success requires deep, real-time observability

Most research participants also told us they need to improve visibility into their data center network fabrics and WAN edge connectivity services. (See also: 10 network observability certifications to boost IT operations skills) The need for real-time data Observability of AI networks will require many enterprises to optimize how their tools collect network data. For instance, most observability tools rely on SNMP polling to pull metrics from network infrastructure, and these tools typically poll devices at five minute intervals. Shorter polling intervals can adversely impact network performance and tool performance. Sixty-nine percent of survey participants told EMA that AI networks require real-time infrastructure monitoring that SNMP simply cannot support. Real-time telemetry closes visibility gaps. For instance, AI traffic bursts that create congestion and packet drops may last only seconds, an issue that a five-minute polling interval would miss entirely. To achieve this level of metric granularity, network teams will have to adopt streaming network telemetry. Unfortunately, support of such technology is still uneven among network infrastructure and network observability vendors due to a lack of industry standardization and a perception among vendors that customers simply don’t need it. Well, AI is about to create a lot of demand for it. In parallel to the need for granular infrastructure metrics, 51% of respondents told EMA that they need more real-time network flow monitoring. In general, network flow technologies such as NetFlow and IPFIX can deliver data nearly in real-time, with delays of seconds or a couple minutes depending on the implementation. However, other technologies are less timely. In particular, the VPC flow logs generated by cloud providers are do not offer the same data granularity. Network teams may need to turn to real-time packet monitoring to close cloud visibility gaps. 
Smarter analysis for smarter networks Network teams also need their network

Equinix Bets on Nuclear and Fuel Cells to Meet Exploding Data Center Energy Demand

A New Chapter in Data Center Energy Strategy Equinix’s strategic investments in advanced nuclear and fuel cell technologies mark a pivotal moment in the evolution of data center energy infrastructure. By proactively securing power sources like Oklo’s fast reactors and Radiant’s microreactors, Equinix is not merely adapting to the industry’s growing energy demands but is actively shaping the future of sustainable, resilient power solutions. This forward-thinking approach is mirrored across the tech sector. Google, for instance, has partnered with Kairos Power to develop small modular reactors (SMRs) in Tennessee, aiming to supply power to its data centers by 2030 . Similarly, Amazon has committed to deploying 5 gigawatts of nuclear energy through partnerships with Dominion Energy and X-energy, underscoring the industry’s collective shift towards nuclear energy as a viable solution to meet escalating power needs . The urgency of these initiatives is underscored by projections from the U.S. Department of Energy, which anticipates data center electricity demand could rise to 6.7%–12% of total U.S. production by 2028, up from 4.4% in 2023. This surge, primarily driven by AI technologies, is straining existing grid infrastructure and prompting both public and private sectors to explore innovative solutions. Equinix’s approach, i.e. investing in both immediate and long-term energy solutions, sets a precedent for the industry. By integrating fuel cells for near-term needs and committing to advanced nuclear projects for future scalability, Equinix exemplifies a balanced strategy that addresses current challenges while preparing for future demands. As the industry moves forward, the collaboration between data center operators, energy providers, and policymakers will be crucial. 
The path to a sustainable, resilient energy future for data centers lies in continued innovation, strategic partnerships, and a shared commitment to meeting the digital economy’s power needs responsibly.

Evolving to Meet AI-Era Data Center Power Demands: A Conversation with Rehlko CEO Brian Melka

On the latest episode of the Data Center Frontier Show Podcast, we sat down with Brian Melka, CEO of Rehlko, to explore how the century-old mission-critical power provider is reinventing itself to support the new realities of AI-driven data center growth. Rehlko, formerly known as Kohler Energy, rebranded a year ago but continues to draw on more than a century of experience in power generation and backup systems. Melka emphasized that while the name has changed, the mission has not: delivering reliable, scalable, and flexible energy solutions to support always-on digital infrastructure. Meeting Surging AI Power Demands Asked how Rehlko is evolving to support the next wave of data center development, Melka pointed to two major dynamics shaping the market: Unprecedented capacity needs driven by AI training and inference. New, “spiky” usage patterns that strain traditional backup systems. “Power generation is something we’ve been doing longer than anyone else, starting in 1920,” Melka noted. “As we look forward, it’s not just about the scale of backup power required — it’s about responsiveness. AI has very large short-duration power demands that put real strain on traditional systems.” To address this, Rehlko is scaling its production capacity fourfold over the next three to four years, while also leveraging its global in-house EPC (engineering, procurement, construction) capabilities to design and deliver hybrid systems. These combine diesel or gas generation with battery storage and short-duration modulation, creating a more responsive power backbone for AI data centers. “We’re the only ones out there that can deliver that breadth of capability on a full turnkey basis,” Melka said. “It positions us to support customers as they navigate these new patterns of energy demand.” Speed to Power Becomes a Priority In today’s market, “speed to power” has become the defining theme. Developers and operators are increasingly considering

Data Center Chip Giants Negotiate Political Moves, Tariffs, and Corporate Strategies

And with the current restrictions being placed on US manufacturers selling AI parts to China, reporting says NVIDIA is developing a Blackwell-based China chip, more capable than the current H20 but still structured to comply with U.S. export rules. Reuters reported that it would be a single-die design (roughly half the compute of the dual-die B300), with HBM and NVLink, sampling as soon as next month. A second compliant workstation/inference product (RTX6000D) is also in development. Chinese agencies have reportedly discouraged use of NVIDIA H20 in government work, favoring Huawei Ascend. However, there have been reports describing AI training using the Ascend to be “challenging”, forcing some AI firms to revert to NVIDIA for large-scale training while using Ascend for inference. This keeps China demand alive for compliant NVIDIA/AMD parts—hence the U.S. interest in revenue-sharing. Meanwhile, AMD made its announcements at June’s “Advancing AI 2025” to set MI350 (CDNA 4) expectations and a yearly rollout rhythm that’s designed to erase NVIDIA’s time lead as much as fight on absolute perf/Watt. If MI350 systems ramp aligns with major cloud designs in 2026, AMD’s near-term objective is defending MI300X momentum while converting large customers to multi-vendor strategies (often pairing MI clusters with NVIDIA estates for redundancy and price leverage). The 15% China license fee will shape how AMD prices MI-series export SKUs and whether Chinese hyperscalers still prefer them to the domestic alternative (Huawei Ascend), which continue to face software/toolchain challenges. If Chinese buyers balk or Beijing discourages purchases, the revenue-share may be moot; if they don’t, AMD has a path to keep seats warm in China while building MI350 demand elsewhere. Beyond China export licenses, the U.S. and EU recently averted a larger trade war by settling near 15% on certain sectors, which included semiconductors, as opposed to the far more

Johnson Controls Brings Data Center Cooling into the “As-a-Service” Era

Cooling Without the Risk

Johnson Controls’ Data Center Cooling as a Service (DCCaaS) approach is designed to take cooling risk off the operator’s shoulders. The company doesn’t just provide the technology; it delivers a comprehensive, long-term service package that covers design, build, operation, maintenance, and life cycle management. The model shifts cooling from a capital expense to an operating expense, providing financial flexibility at a time when operators are pouring billions into AI-ready infrastructure. “We take on the risk of performance and uptime,” Renkis explained. “If we don’t meet the agreed-upon KPIs, there are financial consequences for us, not the customer.”

The AI Advantage

A key differentiator in Johnson Controls’ approach is its integration of AI, machine learning, and advanced analytics. Through its OpenBlue and Metasys platforms, supplemented by partnerships with three to four external AI providers, the company is able to continuously optimize cooling system performance. These AI-driven systems not only extend the life of equipment but also deliver financially guaranteed outcomes. “We tie our results to customer-defined KPIs,” said Renkis. “If we miss, we pay. That accountability drives everything we do.”

Modularity with Flexibility

While the industry is trending toward modularity and prefabricated builds, Renkis stressed that every DCCaaS project remains unique. Johnson Controls designs contracts with “detour functionality”: flexible pathways to upgrade and adapt as technology shifts. That flexibility is crucial given the rapid emergence of AI factory-scale demands. New chip architectures and ultra-dense racks (600kW, 1MW, even 1.5MW) are reshaping expectations for cooling and power. “Nobody knows exactly how this will evolve,” Renkis noted. “That uncertainty makes the as-a-service model the most prudent path forward.”

Beyond Traditional Facilities Management

Cooling-as-a-service is distinct from conventional facilities management in both scope and financial muscle. Johnson Controls brings to the table its own capital arm, Johnson Controls Capital, and a joint venture with Apollo Group, known as Ionic

Meta’s Dual-Track Data Center Strategy: Owning AI Campuses, Leasing Cloud, and Expanding Nationwide

Provisioning the Power is a Major Project All its Own

Powering a data center campus on this scale in an area like rural Louisiana is not a simple task. News reports and a utility commission filing by power company Entergy are starting to reveal the scope of project preparation already in process to get the site the power it will need. To bring in outside power, Entergy plans a 100-mile, 500kV transmission project (at an approximate cost of $1.2 billion) to move bulk power into the area. Substations and lines tied to the site will include a new “Smalling” 500/230kV substation, a new “Car Gas Road” 500kV switchyard, six customer substations on Meta’s property, two 30-mile 500kV lines, and multiple 230kV feeders into the campus. Additionally, Entergy has sought approval for three combined-cycle gas plants generating about 2.25 GW of power and associated lines to meet the immediate load while broader transmission is built out; state hearings are underway with a vote on this part of the project expected before the end of August 2025. Approval is being sought from the Louisiana Public Service Commission to build these three new gas plants and their associated infrastructure at a cost of just under $4 billion. Concerns are being raised by local community groups as well as the Union of Concerned Scientists (UCS) and Louisiana-based Alliance for Affordable Energy (AAE), not just about how much of the initial costs will be passed on to Louisiana ratepayers, but also about what happens as the first series of contracts for power begin to expire in 15 years. The plans being presented were initially scheduled to be voted on in October 2025, and the fast-tracking of project approval has highlighted the concerns of the opposition. Both the short- and long-term

Microsoft will invest $80B in AI data centers in fiscal 2025

And Microsoft isn’t the only one that is ramping up its investments into AI-enabled data centers. Rival cloud service providers are all investing in either upgrading or opening new data centers to capture a larger chunk of business from developers and users of large language models (LLMs). In a report published in October 2024, Bloomberg Intelligence estimated that demand for generative AI would push Microsoft, AWS, Google, Oracle, Meta, and Apple to devote $200 billion between them to capex in 2025, up from $110 billion in 2023. Microsoft is one of the biggest spenders, followed closely by Google and AWS, Bloomberg Intelligence said. Its estimate of Microsoft’s capital spending on AI, at $62.4 billion for calendar 2025, is lower than Smith’s claim that the company will invest $80 billion in the fiscal year to June 30, 2025. Both figures, though, are way higher than Microsoft’s 2020 capital expenditure of “just” $17.6 billion. The majority of the increased spending is tied to cloud services and the expansion of AI infrastructure needed to provide compute capacity for OpenAI workloads. Separately, last October Amazon CEO Andy Jassy said his company planned total capex spend of $75 billion in 2024 and even more in 2025, with much of it going to AWS, its cloud computing division.

John Deere unveils more autonomous farm machines to address skill labor shortage

Self-driving tractors might be the path to self-driving cars. John Deere has revealed a new line of autonomous machines and tech across agriculture, construction and commercial landscaping. The Moline, Illinois-based John Deere has been in business for 187 years, yet it has become a regular at the big tech trade show in Las Vegas as a non-tech company showing off technology, and it is back at CES 2025 with more autonomous tractors and other vehicles. This is not something we usually cover, but John Deere has a lot of data that is interesting in the big picture of tech. The message from the company is that there aren’t enough skilled farm laborers to do the work that its customers need. It’s been a challenge for most of the last two decades, said Jahmy Hindman, CTO at John Deere, in a briefing. Much of the tech will come this fall and after that. He noted that the average farmer in the U.S. is over 58 and works 12 to 18 hours a day to grow food for us. And he said the American Farm Bureau Federation estimates there are roughly 2.4 million farm jobs that need to be filled annually, and the agricultural work force continues to shrink. (This is my hint to the anti-immigration crowd). John Deere’s autonomous 9RX Tractor. Farmers can oversee it using an app. While each of these industries experiences its own set of challenges, a commonality across all is skilled labor availability. In construction, about 80% of contractors struggle to find skilled labor. And in commercial landscaping, 86% of landscaping business owners can’t find labor to fill open positions, he said. “They have to figure out how to do

2025 playbook for enterprise AI success, from agents to evals

2025 is poised to be a pivotal year for enterprise AI. The past year has seen rapid innovation, and this year will see the same. This has made it more critical than ever to revisit your AI strategy to stay competitive and create value for your customers. From scaling AI agents to optimizing costs, here are the five critical areas enterprises should prioritize for their AI strategy this year.

1. Agents: the next generation of automation

AI agents are no longer theoretical. In 2025, they’re indispensable tools for enterprises looking to streamline operations and enhance customer interactions. Unlike traditional software, agents powered by large language models (LLMs) can make nuanced decisions, navigate complex multi-step tasks, and integrate seamlessly with tools and APIs. At the start of 2024, agents were not ready for prime time, making frustrating mistakes like hallucinating URLs. They started getting better as frontier large language models themselves improved. “Let me put it this way,” said Sam Witteveen, cofounder of Red Dragon, a company that develops agents for companies, and that recently reviewed the 48 agents it built last year. “Interestingly, the ones that we built at the start of the year, a lot of those worked way better at the end of the year just because the models got better.” Witteveen shared this in the video podcast we filmed to discuss these five big trends in detail. Models are getting better and hallucinating less, and they’re also being trained to do agentic tasks. Another feature that the model providers are researching is a way to use the LLM as a judge, and as models get cheaper (something we’ll cover below), companies can use three or more models to

OpenAI’s red teaming innovations define new essentials for security leaders in the AI era

OpenAI has taken a more aggressive approach to red teaming than its AI competitors, demonstrating its security teams’ advanced capabilities in two areas: multi-step reinforcement and external red teaming. OpenAI recently released two papers that set a new competitive standard for improving the quality, reliability and safety of AI models in these two techniques and more. The first paper, “OpenAI’s Approach to External Red Teaming for AI Models and Systems,” reports that specialized teams outside the company have proven effective in uncovering vulnerabilities that might otherwise have made it into a released model because in-house testing techniques may have missed them. In the second paper, “Diverse and Effective Red Teaming with Auto-Generated Rewards and Multi-Step Reinforcement Learning,” OpenAI introduces an automated framework that relies on iterative reinforcement learning to generate a broad spectrum of novel, wide-ranging attacks.

Going all-in on red teaming pays practical, competitive dividends

It’s encouraging to see competitive intensity in red teaming growing among AI companies. When Anthropic released its AI red team guidelines in June of last year, it joined AI providers including Google, Microsoft, Nvidia, OpenAI, and even the U.S.’s National Institute of Standards and Technology (NIST), which all had released red teaming frameworks. Investing heavily in red teaming yields tangible benefits for security leaders in any organization. OpenAI’s paper on external red teaming provides a detailed analysis of how the company strives to create specialized external teams that include cybersecurity and subject matter experts. The goal is to see if knowledgeable external teams can defeat models’ security perimeters and find gaps in their security, biases and controls that prompt-based testing couldn’t find.
What makes OpenAI’s recent papers noteworthy is how well they define using human-in-the-middle
