
DeepSeek unveils new technique for smarter, scalable AI reward models




DeepSeek AI, a Chinese research lab gaining recognition for its powerful open-source language models such as DeepSeek-R1, has introduced a significant advancement in reward modeling for large language models (LLMs). 

Their new technique, Self-Principled Critique Tuning (SPCT), aims to create generalist and scalable reward models (RMs). This could potentially lead to more capable AI applications for open-ended tasks and domains where current models can’t capture the nuances and complexities of their environment and users.

The crucial role and current limits of reward models

Reinforcement learning (RL) has become a cornerstone in developing state-of-the-art LLMs. In RL, models are fine-tuned based on feedback signals that indicate the quality of their responses. 

Reward models are the critical component that provides these signals. Essentially, an RM acts as a judge, evaluating LLM outputs and assigning a score or “reward” that guides the RL process and teaches the LLM to produce more useful responses.
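As a minimal illustration of this judge-and-guide loop (a toy heuristic, not an actual learned reward model), an RM can be thought of as a function scoring a (prompt, response) pair, with RL-style training steering the policy toward higher-scored outputs:

```python
# Minimal sketch of how a reward model steers response selection.
# The scoring heuristic below is a stand-in for a trained model.

def reward_model(prompt: str, response: str) -> float:
    """Toy 'judge': rewards responses that address the prompt and stay concise."""
    overlap = len(set(prompt.lower().split()) & set(response.lower().split()))
    length_penalty = 0.01 * max(0, len(response.split()) - 50)
    return overlap - length_penalty

def pick_best(prompt: str, candidates: list[str]) -> str:
    """RL-style selection: prefer the candidate the RM scores highest."""
    return max(candidates, key=lambda r: reward_model(prompt, r))

best = pick_best(
    "Explain reinforcement learning",
    ["Reinforcement learning trains agents via reward signals.",
     "I like turtles."],
)
```

In real RLHF pipelines, the RM's score becomes the reward signal for a policy-gradient update rather than a simple argmax, but the judging role is the same.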

However, current RMs often face limitations. They typically excel in narrow domains with clear-cut rules or easily verifiable answers. For example, current state-of-the-art reasoning models such as DeepSeek-R1 underwent an RL phase, in which they were trained on math and coding problems where the ground truth is clearly defined.

However, creating a reward model for complex, open-ended, or subjective queries in general domains remains a major hurdle. In the paper explaining their new technique, researchers at DeepSeek AI write, “Generalist RM requires to generate high-quality rewards beyond specific domains, where the criteria for rewards are more diverse and complex, and there are often no explicit reference or ground truth.” 

They highlight four key challenges in creating generalist RMs capable of handling broader tasks:

  1. Input flexibility: The RM must handle various input types and be able to evaluate one or more responses simultaneously.
  2. Accuracy: It must generate accurate reward signals across diverse domains where the criteria are complex and the ground truth is often unavailable. 
  3. Inference-time scalability: The RM should produce higher-quality rewards when more computational resources are allocated during inference.
  4. Learning scalable behaviors: For RMs to scale effectively at inference time, they need to learn behaviors that allow for improved performance as more computation is used.
Figure: Different types of reward models. Credit: arXiv

Reward models can be broadly classified by their “reward generation paradigm” (e.g., scalar RMs outputting a single score, generative RMs producing textual critiques) and their “scoring pattern” (e.g., pointwise scoring assigns individual scores to each response, while pairwise scoring selects the better of two responses). These design choices affect the model’s suitability for generalist tasks, particularly its input flexibility and potential for inference-time scaling.

For instance, simple scalar RMs struggle with inference-time scaling because sampling them repeatedly just reproduces the same score, so extra compute adds no new information, while pairwise RMs can’t easily rate single responses. 

The researchers propose that “pointwise generative reward modeling” (GRM), where the model generates textual critiques and derives scores from them, can offer the flexibility and scalability required for generalist requirements.

The DeepSeek team conducted preliminary experiments on models like GPT-4o and Gemma-2-27B, and found that “certain principles could guide reward generation within proper criteria for GRMs, improving the quality of rewards, which inspired us that inference-time scalability of RM might be achieved by scaling the generation of high-quality principles and accurate critiques.” 

Training RMs to generate their own principles

Based on these findings, the researchers developed Self-Principled Critique Tuning (SPCT), which trains the GRM to dynamically generate principles and critiques based on queries and responses. 

The researchers propose that principles should be a “part of reward generation instead of a preprocessing step.” This way, the GRMs could generate principles on the fly based on the task they are evaluating and then generate critiques based on the principles. 

“This shift enables [the] principles to be generated based on the input query and responses, adaptively aligning [the] reward generation process, and the quality and granularity of the principles and corresponding critiques could be further improved with post-training on the GRM,” the researchers write.

Figure: Self-Principled Critique Tuning (SPCT). Credit: arXiv

SPCT involves two main phases:

  1. Rejective fine-tuning: This phase trains the GRM to generate principles and critiques for various input types using the correct format. The model generates principles, critiques and rewards for given queries/responses. Trajectories (generation attempts) are accepted only if the predicted reward aligns with the ground truth (correctly identifying the better response, for instance) and rejected otherwise. This process is repeated and the model is fine-tuned on the filtered examples to improve its principle/critique generation capabilities.
  2. Rule-based RL: In this phase, the model is further fine-tuned through outcome-based reinforcement learning. The GRM generates principles and critiques for each query, and the reward signals are calculated based on simple accuracy rules (e.g., did it pick the known best response?). Then the model is updated. This encourages the GRM to learn how to generate effective principles and accurate critiques dynamically and in a scalable way.
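The rejection step in phase 1 can be sketched as a simple filter: sample several trajectories per query, keep only those whose predicted reward agrees with the ground truth, and fine-tune on the survivors. The data structures below are hypothetical stand-ins for illustration:

```python
from dataclasses import dataclass

@dataclass
class Trajectory:
    """One generation attempt: principles, a critique, and the resulting verdict."""
    principles: str
    critique: str
    predicted_best: int  # index of the response the GRM scored highest

def rejective_filter(trajectories: list[Trajectory],
                     ground_truth_best: int) -> list[Trajectory]:
    """Accept only trajectories whose reward prediction matches the label."""
    return [t for t in trajectories if t.predicted_best == ground_truth_best]

samples = [
    Trajectory("be factual", "Response A cites sources", predicted_best=0),
    Trajectory("be concise", "Response B is shorter", predicted_best=1),
]
# If response 0 is the known-better answer, only the first trajectory survives.
accepted = rejective_filter(samples, ground_truth_best=0)
```

Fine-tuning only on accepted trajectories teaches the model which principle/critique styles actually lead to correct judgments.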

“By leveraging rule-based online RL, SPCT enables GRMs to learn to adaptively posit principles and critiques based on the input query and responses, leading to better outcome rewards in general domains,” the researchers write.

To tackle the inference-time scaling challenge (getting better results with more compute), the researchers run the GRM multiple times for the same input, generating different sets of principles and critiques. The final reward is determined by voting (aggregating the sample scores). This allows the model to consider a broader range of perspectives, leading to potentially more accurate and nuanced final judgments as it is provided with more resources.
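The voting step can be sketched as summing each response's scores across the independent GRM runs and picking the response with the highest total (a simplified sketch; the paper's aggregation details may differ):

```python
from collections import defaultdict

def vote(sampled_scores: list[dict[str, float]]) -> str:
    """Aggregate per-response scores from k independent GRM runs,
    then return the response with the highest total."""
    totals: dict[str, float] = defaultdict(float)
    for run in sampled_scores:
        for response, score in run.items():
            totals[response] += score
    return max(totals, key=totals.get)

runs = [
    {"A": 7, "B": 6},   # run 1: slight preference for A
    {"A": 5, "B": 8},   # run 2: prefers B
    {"A": 6, "B": 9},   # run 3: prefers B
]
winner = vote(runs)  # totals: A=18, B=23, so B wins
```

Because each run generates different principles and critiques, adding runs genuinely adds information, which is why this scales with compute where a scalar RM would not.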

However, some generated principles/critiques might be low-quality or biased due to model limitations or randomness. To address this, the researchers introduced a “meta RM”—a separate, lightweight scalar RM trained specifically to predict whether a principle/critique generated by the primary GRM will likely lead to a correct final reward. 

During inference, the meta RM evaluates the generated samples and filters out the low-quality judgments before the final voting, further enhancing scaling performance.
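Combining the meta RM with voting, low-quality samples are dropped before aggregation. In this sketch, `meta_rm` is a placeholder heuristic standing in for the trained lightweight scorer:

```python
def filtered_vote(samples: list[dict], meta_rm, keep_top: int) -> str:
    """Score each sample's critique with the meta RM, keep the top-k
    samples, then vote over the survivors."""
    ranked = sorted(samples, key=lambda s: meta_rm(s["critique"]), reverse=True)
    kept = ranked[:keep_top]
    totals: dict[str, float] = {}
    for s in kept:
        for resp, score in s["scores"].items():
            totals[resp] = totals.get(resp, 0.0) + score
    return max(totals, key=totals.get)

# Toy meta RM: longer critiques score higher (purely a placeholder heuristic).
meta_rm = lambda critique: len(critique)

samples = [
    {"critique": "thorough, well-grounded analysis", "scores": {"A": 8, "B": 5}},
    {"critique": "detailed comparison of both answers", "scores": {"A": 7, "B": 6}},
    {"critique": "ok", "scores": {"A": 2, "B": 9}},  # low-quality judgment, filtered out
]
winner = filtered_vote(samples, meta_rm, keep_top=2)
```

With the weak third sample filtered out, the vote goes to "A"; without filtering, the noisy judgment would have dragged the aggregate toward "B".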

Putting SPCT into practice with DeepSeek-GRM

The researchers applied SPCT to Gemma-2-27B, Google’s open-weight model, creating DeepSeek-GRM-27B. They evaluated it against several strong baseline RMs (including LLM-as-a-Judge, scalar RMs, and semi-scalar RMs) and public models (like GPT-4o and Nemotron-4-340B-Reward) across multiple benchmarks.

They found that DeepSeek-GRM-27B outperformed baseline methods trained on the same data. SPCT significantly improved the quality and, crucially, the inference-time scalability compared to standard fine-tuning.

Figure: The performance of DeepSeek-GRM (trained with SPCT) continues to improve with inference-time scaling. Credit: arXiv

When scaled at inference time by generating more samples, DeepSeek-GRM-27B’s performance increased substantially, surpassing even much larger models like Nemotron-4-340B-Reward and GPT-4o. The meta RM further improved the scaling, achieving the best results by filtering judgments. 

“With larger-scale sampling, DeepSeek-GRM could judge more accurately upon principles with higher diversity, and output rewards with finer granularity,” the researchers write.

Interestingly, SPCT showed less bias across different domains compared to scalar RMs, which often performed well on verifiable tasks but poorly elsewhere.

Implications for the enterprise

Developing more generalist and scalable reward models can be promising for enterprise AI applications. Potential areas that can benefit from generalist RMs include creative tasks and applications where the model must adapt to dynamic environments such as evolving customer preferences. 

Despite the strong results, DeepSeek-GRM still lags behind specialized scalar RMs on purely verifiable tasks where explicit reasoning generation might be less efficient than direct scoring. Efficiency also remains a challenge compared to non-generative RMs. 

The DeepSeek team suggests future work will focus on efficiency improvements and deeper integration. As they conclude, “Future directions could include integrating GRMs into online RL pipelines as versatile interfaces of reward systems, exploring inference-time co-scaling with policy models, or serving as robust offline evaluators for foundation models.” 

Shape
Shape
Stay Ahead

Explore More Insights

Stay ahead with more perspectives on cutting-edge power, infrastructure, energy,  bitcoin and AI solutions. Explore these articles to uncover strategies and insights shaping the future of industries.

Shape

Kyndryl launches private cloud services for enterprise AI deployments

Kyndryl’s AI Private Cloud environment includes services and capabilities around containerization, data science tools, and microservices to deploy and manage AI applications on the private cloud. The service supports AI data foundations and MLOps/LLMOps services, letting customers manage their AI data pipelines and machine learning operation, Kyndryl stated. These tools facilitate

Read More »

Cato Networks augments CASB with genAI security

CASBs sit between an end user and a cloud service to enforce security policies, protect data, and ensure compliance. CASBs provide enterprise network and security teams with information on how end users are accessing and using cloud resources such as data, applications, and services. They provide visibility into cloud usage,

Read More »

8 unusual Linux commands

3. The column command The column command will display text in columns. Here are two examples of how to use it: $ cat staff | columnJohn Doe Lisa Stone Joanne Zahn Eric Docker Ben MatsonMary Berry Elaine Henry David Bloom Sam Adams Sally Rose$ cat staff | column -tJohn DoeMary

Read More »

AI agents vs. agentic AI: What do enterprises want?

The cloud-provider technical people I know don’t like this approach; they see it as likely to raise barriers to the use of their online generative AI services. Enterprises see their AI agent vision as facilitating cloud AI services instead. If there’s one massive AI entity doing everything, then data sovereignty

Read More »

Major shareholder revolt against BP chairman amid climate clash

Outgoing BP chairman Helge Lund received a near 25% vote against his reelection at the UK oil major’s annual general meeting in a shareholder revolt. The company’s board was dealt a bloody nose from shareholders as it faced conflicting pressures over climate goals during the meeting at its Sunbury-on-Thames hub on Thursday. It follows BP announcing a drastic shift away from investing in renewables in February after some shareholders pushed for a refocus on fossil fuels to boost its profits and share price, which have lagged behind its rivals. But ahead of the AGM, a group of 48 institutional investors criticised the board for not offering a direct vote on the oil major’s revised strategy, while environmental groups fiercely criticised the climate row-back. A resolution for Lund’s reelection received a provisional 24.3% of opposed votes, which marks a major rebuttal for a FTSE 100 company. © Supplied by Kenny Elrick/DC ThomBP’s North Sea HQ in Dyce, Aberdeen. Lund, who played a key role in setting BP’s green agenda, announced he will step down as the company plots a new course, meaning votes against his reelection were largely seen as a protest. Tarek Bouhouch, from the activist group Follow This, argued a vote against of 10% or more would have a “sole ESG purpose” and send a “strong signal”. According to the campaign group, a vote against the chairman likely never breached 10% in the firm’s history, or at least in the last decades. “Double digits is history,” Bouhouch said, claiming BP had never seen an oppose vote hit 10% at an AGM, at least not in the last decade. During the 90-minute meeting, board members and executives discussed the new strategy, with a large sign saying “A reset BP” on the set above their seats. Lund spoke about recent concerns

Read More »

TGS Expands its Mauritania 3D Seismic Library

Energy data and intelligence provider TGS has expanded its multi-client data library offshore Mauritania with the addition of over 101,500 square kilometers (39,189 square miles) of high-quality 3D seismic data. This expansion adds to the existing library of over 19,000 square kilometers (7,335.9 square miles) of reprocessed PSDM 3D seismic data and multiple regional 2D seismic surveys, TGS said in a media release. Together, these integrated datasets provide a more comprehensive and nuanced view of the subsurface, helping to refine known plays while revealing new exploration opportunities.  This data was released in collaboration with the Islamic Republic of Mauritania, the company said. Initial insights are beginning to surface through the integration of regional geological expertise and comprehensive mapping of trap trends, which extend from shelf-edge wells to deeper outboard fairways. Significantly, the discovery of Late Cretaceous channel-fan systems within the basin region is offering a renewed understanding of the area’s hydrocarbon potential, TGS said. “With the extension of our offshore data library with new 3D seismic data, Mauritania is opening the door to a new era of energy exploration. Our rich, underexplored basins and stable investment climate make Mauritania one of the most exciting frontiers for oil and gas”, Mohamed Ould Khaled, the Mauritanian Minister of Energy and Petroleum, said. “We are ready to work hand in hand with international partners to unlock this immense potential and deliver long-term, mutually beneficial growth”. “Aside from greatly expanding the quantity of high-quality multi-client subsurface data in West Africa, the true value of this extensive regional study lies in its ability to contextualize borehole data across the area. 
By re-evaluating historical exploration results – both successes and failures – exploration teams are better equipped to de-risk future ventures and make informed investment decisions”, David Hajovsky, Executive Vice President of Multi-Client, commented. TGS said

Read More »

Trump administration ordered to resume IRA funding

A federal judge Tuesday ordered the Trump administration to take “immediate steps” to reinstate already awarded funding from the Inflation Reduction Act and the Infrastructure Investment and Jobs Act, after the president broadly froze the disbursements on his first day in office.  Judge Mary McElroy of the U.S. District Court for Rhode Island ordered the Departments of Energy, Housing and Urban Development, Interior and Agriculture, as well as the Environmental Protection Agency, to release awards previously withheld, after the ruling found the agencies lacked authority to freeze the funding.  The decision applies to all awardees nationwide, and will remain in effect until McElroy rules on the merits of the lawsuit. The agencies must update the court of the status of their compliance by 5 p.m. EST on Wednesday.  “Agencies do not have unlimited authority to further a President’s agenda, nor do they have unfettered power to hamstring in perpetuity two statutes passed by Congress during the previous administration,” McElroy wrote in her decision.  The decision is a blow to President Donald Trump’s plans to dismantle the Biden administration’s hallmark climate funding law. The Inflation Reduction Act, passed in August 2022, provides hundreds of billions of dollars in direct funding and loan financing. It also offers lucrative tax credits for manufacturers that meet domestic production requirements, incentivizing a host of companies to invest in domestic facilities over the past three years. The Infrastructure Investment and Jobs Act, also known as the Bipartisan Infrastructure Law, also provides billions of dollars in clean energy funding.  
Following the funding freeze, six nonprofits — Woonasquatucket River Watershed Council, Eastern Rhode Island Conservation District, Childhood Lead Action Project, Codman Square Neighborhood Development Corp., Green Infrastructure Center, and National Council of Nonprofits — sued the agencies in March in a bid to access their awarded funding, after other court orders failed. McElroy’s

Read More »

Meeting unprecedented load growth: Challenges and opportunities for resource adequacy

Samuel Newell is a principal at The Brattle Group. The U.S. electric system is under more pressure than ever before. In order to meet the challenges of unprecedented growth in electricity demand from data centers, re-shoring and other uses, the U.S. electricity grid will have to expand more than five times faster than in the previous two decades. All regions of the country will need massive amounts of new resources and increased grid capability yet face lagging infrastructure, supply chain issues and slow-moving planning processes. Current forecasts, based on Brattle’s aggregation of most recent RTO and utility forecasts across the country, suggest peak loads will increase by 175 GW by 2030, and 270 GW by 2035 (24% and 36%, respectively) relative to 2024. Annual energy use is projected to grow even faster, adding 53% by 2035, since many of the new loads have higher load factors than existing loads. Forecast of annual electric use (TWh), based on individual RTOs’ and utilities’ most recent forecasts. Permission granted by The Brattle Group These forecasts are uncertain, with wider error bars than in more stable periods. Uncertainties surround the firmness of hyperscalers’ plans for new data centers and their future expansion, as everything about artificial intelligence training, usage and computational efficiencies evolve. There is also uncertainty around whether planned manufacturing plants will proceed with federal incentives that have been depended upon. Perhaps there is more downside than upside if some service requests are tentative or duplicative of requests in other candidate locations. Yet even if only half materializes, the growth rate will still far exceed that experienced in recent decades. This rate of growth is difficult to meet because of its sheer magnitude and because it is so much higher than anticipated in forecasts from even a mere 1.5 years ago, which were closer

Read More »

‘Critical’ Greenlink connects the UK and Ireland

The Greenlink interconnector between Wales and Ireland has come online, doubling capacity to one gigawatt. Spanning from County Wexford to National Grid’s Pembroke substation, the 504MW interconnector has been welcomed by both the Irish and UK governments. Irish minister for climate, environment and energy Darragh O’Brien said: “I want to congratulate the team at Greenlink for bringing this critical piece of energy infrastructure for Ireland and the UK to life.” The infrastructure connects into the electricity transmission networks of the UK’s National Grid and Ireland’s EirGrid, with the latter operating the link. Michael Kelly, interim chief operations and asset management officer at EirGrid, commented: “This latest connection marks a vital step forward in strengthening our shared commitment to energy resilience and security and was made possible through combining expertise, resources and innovation with our UK colleagues and through collaboration with the Greenlink team.” The interconnector is made up of two 320kV high-voltage direct current (HVDC) subsea cables and associated converter stations. National Grid has claimed that the project “strengthens energy security”, while UK minister Michael Shanks said that it will allow both nations to achieve their “clean energy potential.” The energy minister said: “It is important that Ireland and the UK work together to strengthen our mutual energy security, and drive forward in reaching our clean energy potential. “This cable between Wexford and Wales will help deliver our clean power 2030 mission and support Ireland’s renewable expansion by allowing us to trade more cheaper-to-generate clean energy with each other, helping both nations to move away from volatile fossil fuel markets.” © Mathew Perry/DCT MediaUK Energy Minister Michael Shanks speaking at the 2024 OEUK Conference. O’Brien added that the delivery of Greenlink is a symbol of the “ever-strengthening energy relationship” between the UK and Ireland. 
“Increased electricity interconnection will be a

Read More »

OEG bags East Anglia Three vessel contract

ScottishPower Renewables has awarded a charter agreement to Aberdeen-headquartered OEG Group to help develop the East Anglia Three offshore wind farm. In addition, Caister-based NR Marine Services also received a deal to provide ships for the project. Combined, both companies’ agreements are worth more than £16 million, with the vessels operating out of the port of Lowestoft. OEG will provide support vessels as part of the construction of the 1.4GW project, which is due to come into operation next year. These include the support vessel Tess, which will carry out guard operations at the wind farm site. Thanks to its design and capabilities, the Tess can stay out at sea for longer periods, making it the suitable for East Anglia Three’s needs. © Supplied by ScottishPower RenewaNR Marine Services’ crew transfer vessel NR Rebellion. OEG business development director George Moore said it has worked on the project for a number of years during the offshore wind farm’s construction phase. He added: “Having supported ScottishPower Renewables for a number of years now, OEG has been able to establish firm roots in the region, and this contract further strengthens our commitment to the East of England. “It is a source of great pride here at OEG that our collaboration with ScottishPower Renewables continues to flourish as our shared commitment to developing a truly robust local supply chain endures. We now look forward to delivering a safe and efficient project.” OEG was recently acquired by US fund manager Apollo as part of $1 billion deal, with the transaction expected to close in the second quarter of this year. NR Marine Services vessels on the job include two crew transfer vessels (CTVs) – NR Rebellion and NR Hunter – with the Typhoon Class Rebellion taking to the water from April, and the Storm Class Hunter

Read More »

Intel sells off majority stake in its FPGA business

Altera will continue offering field-programmable gate array (FPGA) products across a wide range of use cases, including automotive, communications, data centers, embedded systems, industrial, and aerospace.  “People were a bit surprised at Intel’s sale of the majority stake in Altera, but they shouldn’t have been. Lip-Bu indicated that shoring up Intel’s balance sheet was important,” said Jim McGregor, chief analyst with Tirias Research. The Altera has been in the works for a while and is a relic of past mistakes by Intel to try to acquire its way into AI, whether it was through FPGAs or other accelerators like Habana or Nervana, note Anshel Sag, principal analyst with Moor Insight and Research. “Ultimately, the 50% haircut on the valuation of Altera is unfortunate, but again is a demonstration of Intel’s past mistakes. I do believe that finishing the process of spinning it out does give Intel back some capital and narrows the company’s focus,” he said. So where did it go wrong? It wasn’t with FPGAs because AMD is making a good run of it with its Xilinx acquisition. The fault, analysts say, lies with Intel, which has a terrible track record when it comes to acquisitions. “Altera could have been a great asset to Intel, just as Xilinx has become a valuable asset to AMD. However, like most of its acquisitions, Intel did not manage Altera well,” said McGregor.

Read More »

Intelligence at the edge opens up more risks: how unified SASE can solve it

In an increasingly mobile and modern workforce, smart technologies such as AI-driven edge solutions and the Internet of Things (IoT) can help enterprises improve productivity and efficiency—whether to address operational roadblocks or respond faster to market demands. However, new solutions also come with new challenges, mainly in cybersecurity. The decentralized nature of edge computing—where data is processed, transmitted, and secured closer to the source rather than in a data center—has presented new risks for businesses and their everyday operations. This shift to the edge increases the number of exposed endpoints and creates new vulnerabilities as the attack surface expands. Enterprises will need to ensure their security is watertight in today’s threat landscape if they want to reap the full benefits of smart technologies at the edge. Bypassing the limitations of traditional network security  For the longest time, enterprises have relied on traditional network security approaches to protect their edge solutions. However, these methods are becoming increasingly insufficient as they typically rely on static rules and assumptions, making them inflexible and predictable for malicious actors to circumvent.  While effective in centralized infrastructures like data centers, traditional network security models fall short when applied to the distributed nature of edge computing. Instead, organizations need to adopt more adaptive, decentralized, and intelligent security frameworks built with edge deployments in mind.  Traditional network security typically focuses on keeping out external threats. But today’s threat landscape has evolved significantly, with threat actors leveraging AI to launch advanced attacks such as genAI-driven phishing, sophisticated social engineering attacks, and malicious GPTs. 
Combined with the lack of visibility with traditional network security, a cybersecurity breach could remain undetected until it’s too late, resulting in consequences extending far beyond IT infrastructures.  Next generation of enterprise security with SASE As organizations look into implementing new technologies to spearhead their business, they

Read More »

Keysight tools tackle data center deployment efficiency

Test and performance measurement vendor Keysight Technologies has developed Keysight Artificial Intelligence (KAI) to identify performance inhibitors affecting large GPU deployments. It emulates workload profiles, rather than using actual resources, to pinpoint performance bottlenecks. Scaling AI data centers requires testing throughout the design and build process – every chip, cable, interconnect, switch, server, and GPU needs to be validated, Keysight says. From the physical layer through the application layer, KAI is designed to identify weak links that degrade the performance of AI data centers, and it validates and optimizes system-level performance for optimal scaling and throughput. AI providers, semiconductor fabricators, and network equipment manufacturers can use KAI to accelerate design, development, deployment, and operations by pinpointing performance issues before deploying in production.

Read More »

U.S. Advances AI Data Center Push with RFI for Infrastructure on DOE Lands

ORNL is also the home of the Center for Artificial Intelligence Security Research (CAISER), which Edmon Begoli, CAISER founding director, described as being in place to build the security necessary by defining a new field of AI research targeted at fighting future AI security risks. Also, at the end of 2024, Google partner Kairos Power started construction of their Hermes demonstration SMR in Oak Ridge. Hermes is a high-temperature gas-cooled reactor (HTGR) that uses triso-fueled pebbles and a molten fluoride salt coolant (specifically Flibe, a mix of lithium fluoride and beryllium fluoride). This demonstration reactor is expected to be online by 2027, with a production level system becoming available in the 2030 timeframe. Also located in a remote area of Oak Ridge is the Tennessee Valley Clinch River project, where the TVA announced a signed agreement with GE-Hitachi to plan and license a BWRX-300 small modular reactor (SMR). On Integrating AI and Energy Production The foregoing are just examples of ongoing projects at the sites named by the DOE’s RFI. Presuming that additional industry power, utility, and data center providers get on board with these locations, any of the 16 could be the future home of AI data centers and on-site power generation. The RFI marks a pivotal step in the U.S. government’s strategy to solidify its global dominance in AI development and energy innovation. By leveraging the vast resources and infrastructure of its national labs and research sites, the DOE is positioning the country to meet the enormous power and security demands of next-generation AI technologies. The selected locations, already home to critical energy research and cutting-edge supercomputing, present a compelling opportunity for industry stakeholders to collaborate on building integrated, sustainable AI data centers with dedicated energy production capabilities. With projects like Oak Ridge’s pioneering SMRs and advanced AI security

Read More »

Generac Sharpens Focus on Data Center Power with Scalable Diesel and Natural Gas Generators

In a digital economy defined by constant uptime and explosive compute demand, power reliability is more than a design criterion—it’s a strategic imperative. In response to such demand, Generac Power Systems, a company long associated with residential backup and industrial emergency power, is making an assertive move into the heart of the digital infrastructure sector with a new portfolio of high-capacity generators engineered for the data center market. Unveiled this week, Generac’s new lineup includes five generators ranging from 2.25 MW to 3.25 MW. These units are available in both diesel and natural gas configurations, and form part of a broader suite of multi-asset energy systems tailored to hyperscale, colocation, enterprise, and edge environments. The product introductions expand Generac’s commercial and industrial capabilities, building on decades of experience with mission-critical power in hospitals, telecom, and manufacturing, now optimized for the scale and complexity of modern data centers.

“Coupled with our expertise in designing generators specific to a wide variety of industries and uses, this new line of generators is designed to meet the most rigorous standards for performance, packaging, and after-treatment specific to the data center market,” said Ricardo Navarro, SVP & GM, Global Telecom and Data Centers, Generac.

Engineering for the Demands of Digital Infrastructure

Each of the five new generators is designed for seamless integration into complex energy ecosystems. Generac is emphasizing modularity, emissions compliance, and high-ambient operability as central to the offering, reflecting a deep understanding of the real-world challenges facing data center operators today. The systems are built around the Baudouin M55 engine platform, which is engineered for fast transient response and high operating temperatures—key for data center loads that swing sharply under AI and cloud workloads. The M55’s high-pressure common rail fuel system supports low NOx emissions and Tier 4 readiness, aligning with the most

Read More »

CoolIT and Accelsius Push Data Center Liquid Cooling Limits Amid Soaring Rack Densities

The CHx1500’s construction reflects CoolIT’s 24 years of DLC experience, using stainless-steel piping and high-grade wetted materials to meet the rigors of enterprise and hyperscale data centers. It’s also designed to scale: not just for today’s most power-hungry processors, but for future platforms expected to surpass today’s limits. The CHx1500 is now available for global orders, with CoolIT offering full lifecycle support in over 75 countries, including system design, installation, CDU-to-server certification, and maintenance services—critical ingredients as liquid cooling shifts from a high-performance niche to a requirement for AI infrastructure at scale.

Capex Follows Thermals: Dell’Oro Forecast Signals Surge In Cooling and Rack Power Infrastructure

Between Accelsius and CoolIT, the message is clear: direct liquid cooling is stepping into its maturity phase, with products engineered not just for performance, but for mass deployment. Still, technology alone doesn’t determine the pace of adoption. The surge in thermal innovation from Accelsius and CoolIT isn’t happening in a vacuum. As the capital demands of AI infrastructure rise, the industry is turning a sharper eye toward how data center operators account for, prioritize, and report their AI-driven investments. To wit: according to new market data from Dell’Oro Group, the transition toward high-power, high-density AI racks is now translating into long-term investment shifts across the data center physical layer. Dell’Oro has raised its forecast for the Data Center Physical Infrastructure (DCPI) market, predicting a 14% CAGR through 2029, with total revenue reaching $61 billion. That revision stems from stronger-than-expected 2024 results, particularly in the adoption of accelerated computing by both Tier 1 and Tier 2 cloud service providers. The research firm cited three catalysts for the upward adjustment:

- Accelerated server shipments outpaced expectations.
- Demand for high-power infrastructure is spreading to smaller hyperscalers and regional clouds.
- Governments and Tier 1 telecoms are joining the buildout effort, reinforcing AI as a
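As a side note on the arithmetic: a 14% CAGR reaching $61 billion by 2029 implies a rough size for the market today. A minimal sketch of the compound-growth calculation (the 2024 base year and the derived figure are assumptions for illustration, not Dell'Oro numbers):

```python
# Compound annual growth rate (CAGR): value_end = value_start * (1 + cagr) ** years.
# Dell'Oro projects the DCPI market reaching $61B by 2029 at a 14% CAGR.
# Assuming 2024 is the base year (an assumption; the excerpt does not state it),
# we can back out the implied starting market size.
target_2029 = 61.0   # $B, from the Dell'Oro forecast
cagr = 0.14
years = 5            # 2024 -> 2029, assumed

implied_base = target_2029 / (1 + cagr) ** years
print(f"Implied 2024 base: ${implied_base:.1f}B")  # roughly $31-32B
```

The same formula run forward from the derived base reproduces the $61B figure, which is a quick way to sanity-check growth claims quoted in market research.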

Read More »

Microsoft will invest $80B in AI data centers in fiscal 2025

And Microsoft isn’t the only one ramping up its investments in AI-enabled data centers. Rival cloud service providers are all investing in either upgrading or opening new data centers to capture a larger chunk of business from developers and users of large language models (LLMs). In a report published in October 2024, Bloomberg Intelligence estimated that demand for generative AI would push Microsoft, AWS, Google, Oracle, Meta, and Apple to devote $200 billion between them to capex in 2025, up from $110 billion in 2023. Microsoft is one of the biggest spenders, followed closely by Google and AWS, Bloomberg Intelligence said. Its estimate of Microsoft’s capital spending on AI, at $62.4 billion for calendar 2025, is lower than Microsoft president Brad Smith’s claim that the company will invest $80 billion in the fiscal year to June 30, 2025. Both figures, though, are far higher than Microsoft’s 2020 capital expenditure of “just” $17.6 billion. The majority of the increased spending is tied to cloud services and the expansion of AI infrastructure needed to provide compute capacity for OpenAI workloads. Separately, last October Amazon CEO Andy Jassy said his company planned total capex of $75 billion in 2024 and even more in 2025, with much of it going to AWS, its cloud computing division.

Read More »

John Deere unveils more autonomous farm machines to address skilled labor shortage

Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More Self-driving tractors might be the path to self-driving cars. John Deere has revealed a new line of autonomous machines and tech across agriculture, construction and commercial landscaping. The Moline, Illinois-based company has been in business for 187 years, yet it has been a regular presence as a non-tech company showing off technology at the big tech trade show in Las Vegas, and it is back at CES 2025 with more autonomous tractors and other vehicles. This is not something we usually cover, but John Deere has a lot of data that is interesting in the big picture of tech. The message from the company is that there aren’t enough skilled farm laborers to do the work that its customers need. It’s been a challenge for most of the last two decades, said Jahmy Hindman, CTO at John Deere, in a briefing. Much of the tech will come this fall and after that. He noted that the average farmer in the U.S. is over 58 and works 12 to 18 hours a day to grow food for us. And he said the American Farm Bureau Federation estimates there are roughly 2.4 million farm jobs that need to be filled annually, and the agricultural workforce continues to shrink. (This is my hint to the anti-immigration crowd.) John Deere’s autonomous 9RX Tractor. Farmers can oversee it using an app. While each of these industries experiences its own set of challenges, a commonality across all is skilled labor availability. In construction, about 80% of contractors struggle to find skilled labor. And in commercial landscaping, 86% of landscaping business owners can’t find labor to fill open positions, he said. “They have to figure out how to do

Read More »

2025 playbook for enterprise AI success, from agents to evals

Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More 2025 is poised to be a pivotal year for enterprise AI. The past year has seen rapid innovation, and this year will see the same. This has made it more critical than ever to revisit your AI strategy to stay competitive and create value for your customers. From scaling AI agents to optimizing costs, here are the five critical areas enterprises should prioritize for their AI strategy this year. 1. Agents: the next generation of automation AI agents are no longer theoretical. In 2025, they’re indispensable tools for enterprises looking to streamline operations and enhance customer interactions. Unlike traditional software, agents powered by large language models (LLMs) can make nuanced decisions, navigate complex multi-step tasks, and integrate seamlessly with tools and APIs. At the start of 2024, agents were not ready for prime time, making frustrating mistakes like hallucinating URLs. They started getting better as frontier large language models themselves improved. “Let me put it this way,” said Sam Witteveen, cofounder of Red Dragon, a company that develops agents for companies, and that recently reviewed the 48 agents it built last year. “Interestingly, the ones that we built at the start of the year, a lot of those worked way better at the end of the year just because the models got better.” Witteveen shared this in the video podcast we filmed to discuss these five big trends in detail. Models are getting better and hallucinating less, and they’re also being trained to do agentic tasks. Another feature that the model providers are researching is a way to use the LLM as a judge, and as models get cheaper (something we’ll cover below), companies can use three or more models to

Read More »

OpenAI’s red teaming innovations define new essentials for security leaders in the AI era

Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More OpenAI has taken a more aggressive approach to red teaming than its AI competitors, demonstrating its security teams’ advanced capabilities in two areas: multi-step reinforcement learning and external red teaming. OpenAI recently released two papers that set a new competitive standard for improving the quality, reliability and safety of AI models in these two techniques and more. The first paper, “OpenAI’s Approach to External Red Teaming for AI Models and Systems,” reports that specialized teams outside the company have proven effective in uncovering vulnerabilities that might otherwise have made it into a released model because in-house testing techniques may have missed them. In the second paper, “Diverse and Effective Red Teaming with Auto-Generated Rewards and Multi-Step Reinforcement Learning,” OpenAI introduces an automated framework that relies on iterative reinforcement learning to generate a broad spectrum of novel, wide-ranging attacks. Going all-in on red teaming pays practical, competitive dividends It’s encouraging to see competitive intensity in red teaming growing among AI companies. When Anthropic released its AI red team guidelines in June of last year, it joined AI providers including Google, Microsoft, Nvidia, OpenAI, and even the U.S. National Institute of Standards and Technology (NIST), which all had released red teaming frameworks. Investing heavily in red teaming yields tangible benefits for security leaders in any organization. OpenAI’s paper on external red teaming provides a detailed analysis of how the company strives to create specialized external teams that include cybersecurity and subject matter experts. The goal is to see if knowledgeable external teams can defeat models’ security perimeters and find gaps in their security, biases and controls that prompt-based testing couldn’t find. 
What makes OpenAI’s recent papers noteworthy is how well they define using human-in-the-middle

Read More »