New AI architecture delivers 100x faster reasoning than LLMs with just 1,000 training examples

Singapore-based AI startup Sapient Intelligence has developed a new AI architecture that can match, and in some cases vastly outperform, large language models (LLMs) on complex reasoning tasks, all while being significantly smaller and more data-efficient.

The architecture, known as the Hierarchical Reasoning Model (HRM), is inspired by how the human brain utilizes distinct systems for slow, deliberate planning and fast, intuitive computation. The model achieves impressive results with a fraction of the data and memory required by today’s LLMs. This efficiency could have important implications for real-world enterprise AI applications where data is scarce and computational resources are limited.

When faced with a complex problem, current LLMs largely rely on chain-of-thought (CoT) prompting, breaking down problems into intermediate text-based steps, essentially forcing the model to “think out loud” as it works toward a solution.

While CoT has improved the reasoning abilities of LLMs, it has fundamental limitations. In their paper, researchers at Sapient Intelligence argue that “CoT for reasoning is a crutch, not a satisfactory solution. It relies on brittle, human-defined decompositions where a single misstep or a misorder of the steps can derail the reasoning process entirely.”

This dependency on generating explicit language tethers the model’s reasoning to the token level, often requiring massive amounts of training data and producing long, slow responses. This approach also overlooks the type of “latent reasoning” that occurs internally, without being explicitly articulated in language.

As the researchers note, “A more efficient approach is needed to minimize these data requirements.”

A hierarchical approach inspired by the brain

To move beyond CoT, the researchers explored “latent reasoning,” where instead of generating “thinking tokens,” the model reasons in its internal, abstract representation of the problem. This is more aligned with how humans think; as the paper states, “the brain sustains lengthy, coherent chains of reasoning with remarkable efficiency in a latent space, without constant translation back to language.”

However, achieving this level of deep, internal reasoning in AI is challenging. Simply stacking more layers in a deep learning model often leads to a “vanishing gradient” problem, where learning signals weaken across layers, making training ineffective. The alternative, recurrent architectures that loop over computations, can suffer from “early convergence,” where the model settles on a solution too quickly without fully exploring the problem.
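To make the vanishing-gradient point concrete, here is a minimal sketch under stated assumptions: a chain of scalar tanh layers that all share one weight. The function name `gradient_through_depth`, the weight of 0.9 and the input of 0.5 are illustrative choices, not anything taken from the paper.

```python
import math

def gradient_through_depth(depth, weight=0.9, x=0.5):
    """Return d(output)/d(input) for a chain of `depth` scalar tanh layers.

    Hypothetical toy model: every layer computes h = tanh(weight * h_prev).
    """
    grad = 1.0
    h = x
    for _ in range(depth):
        h = math.tanh(weight * h)
        # Chain rule: d tanh(w * h_prev) / d h_prev = w * (1 - tanh(w * h_prev)^2)
        grad *= weight * (1.0 - h * h)
    return grad

if __name__ == "__main__":
    for depth in (1, 5, 20, 50):
        print(depth, gradient_through_depth(depth))
```

Each layer multiplies the gradient by a factor of at most 0.9 in this setup, so the learning signal that reaches the first layer shrinks roughly geometrically as depth grows.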

*The Hierarchical Reasoning Model (HRM) is inspired by the structure of the brain. Source: arXiv*

Seeking a better approach, the Sapient team turned to neuroscience for a solution. “The human brain provides a compelling blueprint for achieving the effective computational depth that contemporary artificial models lack,” the researchers write. “It organizes computation hierarchically across cortical regions operating at different timescales, enabling deep, multi-stage reasoning.”

Inspired by this, they designed HRM with two coupled, recurrent modules: a high-level (H) module for slow, abstract planning, and a low-level (L) module for fast, detailed computations. This structure enables a process the team calls “hierarchical convergence.” Intuitively, the fast L-module addresses a portion of the problem, executing multiple steps until it reaches a stable, local solution. At that point, the slow H-module takes this result, updates its overall strategy, and gives the L-module a new, refined sub-problem to work on. This effectively resets the L-module, preventing it from getting stuck (early convergence) and allowing the entire system to perform a long sequence of reasoning steps with a lean model architecture that doesn’t suffer from vanishing gradients.

*HRM (left) smoothly converges on the solution across computation cycles and avoids early convergence (center, RNNs) and vanishing gradients (right, classic deep neural networks). Source: arXiv*

According to the paper, “This process allows the HRM to perform a sequence of distinct, stable, nested computations, where the H-module directs the overall problem-solving strategy and the L-module executes the intensive search or refinement required for each step.” This nested-loop design allows the model to reason deeply in its latent space without needing long CoT prompts or huge amounts of data.
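The nested loop described above can be illustrated with a toy sketch. This is not Sapient's implementation: the scalar states, the update rules `l_step` and `h_step`, and the default cycle and step counts are all assumptions, chosen only to show the control flow of a fast inner module that settles and a slow outer module that then updates and hands it a new target.

```python
import math

def l_step(z_l, z_h, x):
    """Fast module (hypothetical rule): move halfway toward a target set by z_h and x."""
    return 0.5 * z_l + 0.5 * math.tanh(z_h + x)

def h_step(z_h, z_l):
    """Slow module (hypothetical rule): update the plan from the settled fast state."""
    return math.tanh(0.5 * z_h + z_l)

def hrm_forward(x, n_cycles=3, t_steps=8):
    """Run n_cycles slow updates, each preceded by t_steps fast updates.

    Returns the final states and, per cycle, the size of each fast update.
    """
    z_h, z_l = 0.0, 0.0
    residuals = []
    for _ in range(n_cycles):
        cycle_residuals = []
        for _ in range(t_steps):
            new_z_l = l_step(z_l, z_h, x)
            cycle_residuals.append(abs(new_z_l - z_l))
            z_l = new_z_l
        residuals.append(cycle_residuals)
        # The slow update changes the target the fast module converges to,
        # which restarts its progress in the next cycle.
        z_h = h_step(z_h, z_l)
    return z_h, z_l, residuals

if __name__ == "__main__":
    _, _, residuals = hrm_forward(1.0)
    for i, cycle in enumerate(residuals):
        print(f"cycle {i}: first={cycle[0]:.4f} last={cycle[-1]:.4f}")
```

In this toy version, the fast module's update size shrinks within each cycle as it settles, then jumps again after the slow module updates. That qualitative pattern is what the article means by the H-module resetting the L-module rather than letting it stall.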

A natural question is whether this “latent reasoning” comes at the cost of interpretability. Guan Wang, Founder and CEO of Sapient Intelligence, pushes back on this idea, explaining that the model’s internal processes can be decoded and visualized, similar to how CoT provides a window into a model’s thinking. He also points out that CoT itself can be misleading. “CoT does not genuinely reflect a model’s internal reasoning,” Wang told VentureBeat, referencing studies showing that models can sometimes yield correct answers with incorrect reasoning steps, and vice versa. “It remains essentially a black box.”

*Example of how HRM reasons over a maze problem across different compute cycles. Source: arXiv*

HRM in action

To test their model, the researchers pitted HRM against benchmarks that require extensive search and backtracking, such as the Abstraction and Reasoning Corpus (ARC-AGI), extremely difficult Sudoku puzzles and complex maze-solving tasks.

The results show that HRM learns to solve problems that are intractable for even advanced LLMs. For instance, on the “Sudoku-Extreme” and “Maze-Hard” benchmarks, state-of-the-art CoT models failed completely, scoring 0% accuracy. In contrast, HRM achieved near-perfect accuracy after being trained on just 1,000 examples for each task.

On the ARC-AGI benchmark, a test of abstract reasoning and generalization, the 27M-parameter HRM scored 40.3%. This surpasses leading CoT-based models like the much larger o3-mini-high (34.5%) and Claude 3.7 Sonnet (21.2%). This performance, achieved without a large pre-training corpus and with very limited data, highlights the power and efficiency of its architecture.

*HRM outperforms large models on complex reasoning tasks. Source: arXiv*

While solving puzzles demonstrates the model’s power, the real-world implications lie in a different class of problems. According to Wang, developers should continue using LLMs for language-based or creative tasks, but for “complex or deterministic tasks,” an HRM-like architecture offers superior performance with fewer hallucinations. He points to “sequential problems requiring complex decision-making or long-term planning,” especially in latency-sensitive fields like embodied AI and robotics, or data-scarce domains like scientific exploration.

In these scenarios, HRM doesn’t just solve problems; it learns to solve them better. “In our Sudoku experiments at the master level… HRM needs progressively fewer steps as training advances—akin to a novice becoming an expert,” Wang explained.

For the enterprise, this is where the architecture’s efficiency translates directly to the bottom line. Instead of the serial, token-by-token generation of CoT, HRM’s parallel processing allows for what Wang estimates could be a “100x speedup in task completion time.” This means lower inference latency and the ability to run powerful reasoning on edge devices.

The cost savings are also substantial. “Specialized reasoning engines such as HRM offer a more promising alternative for specific complex reasoning tasks compared to large, costly, and latency-intensive API-based models,” Wang said. To put the efficiency into perspective, he noted that training the model for professional-level Sudoku takes roughly two GPU hours, and for the complex ARC-AGI benchmark, between 50 and 200 GPU hours—a fraction of the resources needed for massive foundation models. This opens a path to solving specialized business problems, from logistics optimization to complex system diagnostics, where both data and budget are finite.

Looking ahead, Sapient Intelligence is already working to evolve HRM from a specialized problem-solver into a more general-purpose reasoning module. “We are actively developing brain-inspired models built upon HRM,” Wang said, highlighting promising initial results in healthcare, climate forecasting, and robotics. He teased that these next-generation models will differ significantly from today’s text-based systems, notably through the inclusion of self-correcting capabilities.

The work suggests that for a class of problems that have stumped today’s AI giants, the path forward may not be bigger models, but smarter, more structured architectures inspired by the ultimate reasoning engine: the human brain.

Daily insights on business use cases with VB Daily

If you want to impress your boss, VB Daily has you covered. We give you the inside scoop on what companies are doing with generative AI, from regulatory shifts to practical deployments, so you can share insights for maximum ROI.

Read our Privacy Policy

Thanks for subscribing. Check out more VB newsletters here.

An error occured.

Stay Ahead

Explore More Insights

Stay ahead with more perspectives on cutting-edge power, infrastructure, energy, bitcoin and AI solutions. Explore these articles to uncover strategies and insights shaping the future of industries.

Sysadmins ready for AI, but skepticism abounds

The report shows that AI is being deployed typically in high-volume, repetitive tasks. Troubleshooting and log analysis lead the way, with 41% and 35% of sysadmins, respectively, reporting use of AI in those areas—up significantly from 2024. Respondents reported that the following tasks are most likely to be automated with

Unexpected costs drive on-premises computing

“This reversal challenges the assumption that cloud is always the end goal, and highlights growing concerns about cost predictability, control, and performance in shared cloud environments,” MacDonald told Network World. The survey found 86% of IT professionals report that their organizations currently use dedicated servers, with government (93%), information technology

CISPE seeks to annul Broadcom’s VMware takeover

However, Forrester Research Senior Analyst Dario Maisto said, “Broadcom VMware commercial practices have been under the lenses for quite some time now. While we may agree or disagree with the European Commission’s decision to approve Broadcom’s acquisition of VMware, the fact is that a number of European organizations are suffering from

CompTIA updates Linux+ certification

CompTIA has updated its Linux+ certification exam to include new and expanded content on artificial intelligence, automation, cybersecurity, DevOps, infrastructure as code (IaC), scalability, and systems troubleshooting. The Linux+ V8 certification validates IT professionals’ abilities to manage, secure, automate, and troubleshoot Linux systems in cloud and hybrid environments, according to

Supertanker Hauling Saudi Diesel Heads to Europe

A supertanker carrying a cargo of diesel from the Middle East is en route to the fuel-starved European market, reflecting supply tightness in the region. The VLCC Nissos Keros loaded about 2 million barrels of ultra-low sulfur diesel from Saudi Arabia’s Jubail terminal and is currently signaling France where it’s due to arrive Aug. 30, according to Kpler and ship-tracking data compiled by Bloomberg. The vessel, which usually transports crude oil, was re-configured to carry diesel. Cargoes of the fuel would typically be carried on smaller tankers, but with freight rates elevated after the latest attacks on shipping in the Red Sea, operators have an incentive to clean up dirty tankers to haul products instead and reap the economies of scale. Europe’s diesel market remains under pressure, driven by a combination of lower refinery output, costly rerouting of imports to replace shunned Russian supplies and sanctions-related uncertainty. The arrival of a large shipment may provide temporary relief, but dependence on long-haul imports continues to expose the European market to spikes in freight costs and supply volatility. WHAT DO YOU THINK? Generated by readers, the comments included herein do not reflect the views and opinions of Rigzone. All comments are subject to editorial review. Off-topic, inappropriate or insulting comments will be removed.

Oil Slips on Stronger Dollar, Trade Doubts

Oil fell as the dollar strengthened and conviction waned that the US will reach agreements with key trade partners ahead of a deadline next week. West Texas Intermediate crude slid more than 1% to settle near $65 a barrel after President Donald Trump said the US has a 50-50 chance of striking a trade deal with Europe, a contrast to the optimism the bloc’s diplomats expressed this week. Trump also said most tariff rates are essentially settled now. The effective US tariff rate is at the highest in a century, by some estimates, a potential threat to energy demand. In another headwind, Trump indicated he had no plans to fire Federal Reserve Chair Jerome Powell, boosting the dollar and making the commodities priced in the currency less attractive. Crude has remained in a holding pattern this month, but is down for the year as increased supply from OPEC+ adds to concerns of a looming glut. The group will next meet on Aug. 3 to decide on production levels. On Thursday, one member, Venezuela, was given a production reprieve by a US decision to let Chevron resume pumping oil in the country. “We expect crude to slowly sell off this fall, driven by steady acceleration of stock builds, softening physical markets, reduced refinery margin support and continued deescalation of geopolitically driven supply risk,” Macquarie Group analysts including Vikas Dwivedi wrote in a note. Oil Prices WTI for September delivery fell 1.3% to settle at $65.16 a barrel. Brent for September settlement slipped 1.1% to $68.44 a barrel. What do you think? We’d love to hear from you, join the conversation on the Rigzone Energy Network. The Rigzone Energy Network is a new social experience created for you and all energy professionals to Speak Up about our industry, share knowledge, connect with

BP to Exit $36B Australian Green Hydrogen Hub

BP Plc will exit its role in a massive green hydrogen production facility planned in Australia as the British oil major refocuses on the fossil fuels that drive its profits. The company told its partners in the Australian Renewable Energy Hub that it plans to leave the project as both operator and equity holder, according to a statement from a BP spokesperson. It’s the latest setback for green hydrogen, a fuel once touted as a key way for Big Oil to profit from the energy transition that has so far proved too costly for mass production and consumption. The AREH project company will take over as operator over coming months with support from founding partner InterContinental Energy, according to an AREH spokesperson. BP’s decision to exit the project doesn’t reflect the opportunity the hub presents to decarbonize the Pilbara and support the creation of a green iron industry, they said. BP’s entry into the project – once estimated to cost about $36 billion – came at a time when the company sought to rapidly build up a business in low-carbon energy and shrink its oil business. But after years of stock under-performance compared with its peers and the departure of the plan’s architect – Chief Executive Officer Bernard Looney – BP has refined its strategy to focus more squarely on profits than green goals. The company is far from alone in leaving its ambitions for green hydrogen behind. Scores of companies that once saw the fuel as the next big thing in energy have cut back plans as hoped for cost declines failed to materialize. Also on Thursday, Fortescue Ltd. said it would abandon plans for a $550 million Arizona Hydrogen Project in the US and a $150 million PEM50 Project in Gladstone, Australia – resulting in a pretax writedown of $150 million. Meanwhile, Woodside

Eni Profit Tops Estimates

Eni SpA reported profit that beat analyst estimates as proceeds from asset sales and sweeping cost cuts helped counter a weak oil market. While crude prices were lower in the second quarter — weighing on earnings at other European oil companies — Eni has been buoyed by a cost-reduction program introduced earlier this year, while asset disposals brought down debt. Adjusted net income fell 25% from a year earlier to €1.13 billion ($1.3 billion), the Italian energy company said Friday in a statement. That exceeded the €932.6 million average estimate of analysts surveyed by Bloomberg. Eni said it’s now targeting €3 billion of cost cuts this year, up from €2 billion previously. The company has also reaped billions of euros by offloading stakes in its renewables arm and mobility division, and is in talks to sell half of its carbon capture unit. “The combination of divestments set to come through this year, ongoing ‘self-help,’ as well as the additional cash flow from new ramp-ups sets Eni up for a strong second half of 2025 and 2026,” RBC Europe Ltd. analyst Biraj Borkhataria said in a note. He expects “growing free cash flow and a more resilient balance sheet than we’ve seen for many years.” The shares rose as much as 0.6% at the open in Milan, before trading little changed as of 9:08 a.m. local time. Eni confirmed plans for shareholders’ returns this year. It expects free cash flow before working capital of about €11.5 billion at $70-a-barrel crude, up from previous guidance of €11 billion. The company also raised its forecast for annual earnings from its gas division to €1 billion from €800 million. Net debt shrank to €29.1 billion at the end of June. WHAT DO YOU THINK? Generated by readers, the comments included herein do not reflect the views and opinions of

Chevron to Cut Positions as Part of Hess Integration

Chevron will “consolidate or eliminate some positions” as part of its integration with Hess Corporation, a Chevron spokesperson told Rigzone. “Chevron completed the merger with Hess Corporation on July 18,” the spokesperson said. “We are working quickly to integrate the Hess workforce and are focused on maintaining safe and reliable operations throughout the transition period,” the spokesperson added. “As part of the integration, we will consolidate or eliminate some positions. As required by the WARN Act, Chevron has provided notice of a planned workforce reduction to appropriate state and local government representatives for Downtown Houston and North Dakota,” the spokesperson went on to state. When asked by Rigzone to confirm how many positions will be affected, the Chevron spokesperson said, “for the WARN Notices issued on July 21, Chevron anticipates a reduction of approximately 575 employees in Downtown Houston and 70 employees in North Dakota”. The spokesperson told Rigzone that “these are difficult decisions which … [the company does] not make lightly”. “We understand the impact this news may have on employees, their families and the communities where we operate,” the spokesperson said. “Our priority is to support our employees through this transition. We are offering severance benefits and outplacement support,” the Chevron representative added. In a statement posted on its website on July 18, Chevron announced that it had completed its acquisition of Hess Corporation following the satisfaction of all necessary closing conditions, including a favorable arbitration outcome regarding Hess’ offshore Guyana asset. “This merger of two great American companies brings together the best in the industry,” Chevron Chairman and CEO Mike Wirth said in that statement. “The combination enhances and extends our growth profile well into the next decade, which we believe will drive greater long-term value to shareholders,” he added. 
In this statement, former Hess Corporation CEO

Coal- and gas-fired power plants have a new best friend: data centers

Abbe Ramanan is a project director at Clean Energy Group. In 2020, the Virginia Assembly passed the Virginia Clean Economy Act, a law that required the state’s largest utility, Dominion Energy, to generate all its electricity from renewable resources by 2045. However, Dominion has found a useful loophole to get around the law’s requirements — data centers. Viriginia hosts the largest data center market in the world, and is home to at least 150 hyperscale data centers, with more being proposed. In its recent integrated resource plan, Dominion cited projected energy demand from these data centers as a key reason to delay retiring existing power plants, including the Clover Power Station, a coal-powered peaker plant in Halifax County, a disproportionately low-income region. In addition to delaying peaker retirements, Dominion has proposed building new gas-powered generation, including a 1-GW peaker plant in Chesterfield, a community that already shoulders an undue environmental burden from existing natural gas- and coal-fired generation. Similar stories have played out across the country as data centers become more and more ubiquitous, particularly in the Southeast. Utilities in Virginia, Georgia, North Carolina and South Carolina have proposed building 20,000 MW of new gas power plants by 2040. Data centers driving the projected load growth are being used to justify this buildout. In Virginia, Georgia and South Carolina, data centers are responsible for at least 65% of projected load growth. Data centers are also delaying the retirement of fossil fuel power plants nationwide, with at least 17 fossil fuel generators originally scheduled for closure now delaying retirement. This new gas buildout, as well as the delayed retirement of fossil fuel generators, overwhelmingly harms Black and brown communities, who face higher energy and environmental burdens. The gas bonanza is especially concerning because the projected demand from data centers could be

AI Project Stargate struggles to get off the ground

Analysts aren’t surprised at the news. “Big IT projects have a long history of dramatically overpromising and it appears that trend is quickly moving into the world of AI data center-based projects as well. The Stargate project, in particular, also seems to have more of a political bent to it than many other projects so that’s likely complicating matters as well,” said Bob O’Donnell, president and chief analyst with TECHnalysis Research. “There’s little doubt we will see massive investments by many different organizations to build out AI infrastructure here in the US, but I’m not convinced that individual projects will end up mattering that much in the long run,” he added. “I have always been skeptical about the huge number that was projected. In the hundreds of billions,” said Patrick Moorhead, CEO & chief analyst with Moor Insights & Strategy. “The only problem was that only a few billion in new funding was raised. And now there’s strife between OpenAI and SoftBank. To be fair, Oracle is part of Stargate now and OpenAI will soak up many GPUs in the Texas facility, but this was already in process when the Stargate announcement happened.”

Storage vendors bring record capacity devices to handle massive data generation

Both are built on Seagate’s Mozaic3+ with advanced storage technology called HAMR, or Heat-Assisted Magnetic Recording. By heating the platter to as much as 500°C, they can squeeze up to 3TB per platter. Other than that, it looks like a standard hard drive: 3.5-inch enclosure, 7,200 RPM spin rotation, and SATA III interface with 6Gbps/s transfer speeds. The drivers are available now and are rather affordable. The 30TB Exos is just $599 on NewEgg.com. On the enterprise solid state drive (SSD) front, KIOXIA America has expanded its high-capacity KIOXIA LC9 Series enterprise SSD lineup with the introduction of a 245.76TB NVMe SSD. The drive comes in a 2.5-inch and Enterprise and Datacenter Standard Form Factor (EDSFF) E3.L form factor and is purpose-built for the performance and efficiency demands of generative AI environments.

Technology is coming so fast data centers are obsolete by the time they launch

Tariffs aside, Enderle feels that AI technology and ancillary technology around it like battery backup is still in the early stages of development and there will be significant changes coming in the next few years. GPUs from AMD and Nvidia are the primary processors for AI, and they are derived from video game accelerators. They were never meant for use in AI processing, but they are being fine-tuned for the task. It’s better to wait to get a more mature product than something that is still in a relatively early state. But Alan Howard, senior analyst for data center infrastructure at Omdia, disagrees and says not to wait. One reason is the rate at which people that are building data centers is all about seizing market opportunity.” You must have a certain amount of capacity to make sure that you can execute on strategies meant to capture more market share.” The same sentiment exists on the colocation side, where there is a considerable shortage of capacity as demand outstrips supply. “To say, well, let’s wait and see if maybe we’ll be able to build a better, more efficient data center by not building anything for a couple of years. That’s just straight up not going to happen,” said Howard. “By waiting, you’re going to miss market opportunities. And these companies are all in it to make money. And so, the almighty dollar rules,” he added. Howard acknowledges that by the time you design and build the data center, it’s obsolete. The question is, does that mean it can’t do anything? “I mean, if you start today on a data center that’s going to be full of [Nvidia] Blackwells, and let’s say you deploy in two years when they’ve already retired Blackwell, and they’re making something completely new. Is that data

‘Significant’ outage at Alaska Airlines not a security incident, but a hardware breakdown

The airline told Network World that when the critical piece of what it described as “third-party multi-redundant hardware” failed unexpectedly, “it impacted several of our key systems that enable us to run various operations.” The company is currently working with its vendor to replace the faulty equipment at the data center. The airline has cancelled more than 150 flights since Sunday evening, including 64 on Monday. The company said additional flight disruptions are likely as it repositions aircraft and crews throughout its network. Alaska Airlines emphasized that the safety of its flights was never compromised, and that “the IT outage is not related to any other current events, and it’s not connected to the recent cybersecurity incident at Hawaiian Airlines.” The airline did not provide additional information to Network World about the specifics of the outage. “There are many redundant components that can fail,” said Roberts, noting that it could have been something as simple as a RAID array (which combines multiple physical data storage components into one or more logical units). Or, on the network side, it could have been the failure of a pair of load balancers. “It’s interesting that redundancy didn’t save them,” said Roberts. “Perhaps multiple pieces of hardware were impacted by the same issue, like a firmware update. Or, maybe they’re just really unlucky.”

Cisco upgrades 400G optical receiver to boost AI infrastructure throughput

“In the data center, what’s really changed in the last year or so is that with AI buildouts, there’s much, much more optics that are part of 400G and 800G. It’s not so much using 10G and 25G optics, which we still sell a ton of, for campus applications. But for AI infrastructure, the 400G and 800G optics are really the dominant optics for that application,” Gartner said. Most of the AI infrastructure builds have been for training models, especially in hyperscaler environments, Gartner said. “I expect, towards the tail end of this year, we’ll start to see more enterprises deploying AI infrastructure for inference. And once they do that, because it has an Nvidia GPU attached to it, it’s going to be a 400G or 800G optic.” Core enterprise applications – such as real-time trading, high-frequency transactions, multi-cloud communications, cybersecurity analytics, network forensics, and industrial IoT – can also utilize the higher network throughput, Gartner said.

Supermicro bets big on 4-socket X14 servers to regain enterprise trust

In April, Dell announced its PowerEdge R470, R570, R670, and R770 servers with Intel Xeon 6 Processors with P-cores, but with single and double-socket servers. Similarly, Lenovo’s ThinkSystem V4 servers are also based on the Intel Xeon 6 processor but are limited to dual socket configurations. The launch of 4-socket servers by Supermicro reflects a growing enterprise need for localized compute that can support memory-bound AI and reduce the complexity of distributed architectures. “The modern 4-socket servers solve multiple pain points that have intensified with GenAI and memory-intensive analytics. Enterprises are increasingly challenged by latency, interconnect complexity, and power budgets in distributed environments. High-capacity, scale-up servers provide an architecture that is more aligned with low-latency, large-model processing, especially where data residency or compliance constraints limit cloud elasticity,” said Sanchit Vir Gogia, chief analyst and CEO at Greyhound Research. “Launching a 4-socket Xeon 6 platform and packaging it within their modular ‘building block’ strategy shows Supermicro is focusing on staying ahead in enterprise and AI data center compute,” said Devroop Dhar, co-founder and MD at Primus Partner. A critical launch after major setbacks Experts peg this to be Supermicro’s most significant product launch since it became mired in governance and regulatory controversies. In 2024, the company lost Ernst & Young, its second auditor in two years, following allegations by Hindenburg Research involving accounting irregularities and the alleged export of sensitive chips to sanctioned entities. Compounding its troubles, Elon Musk’s AI startup xAI redirected its AI server orders to Dell, a move that reportedly cost Supermicro billions in potential revenue and damaged its standing in the hyperscaler ecosystem. Earlier this year, HPE signed a $1 billion contract to provide AI servers for X, a deal Supermicro was also bidding for. 
“The X14 launch marks a strategic reinforcement for Supermicro, showcasing its commitment

Microsoft will invest $80B in AI data centers in fiscal 2025

And Microsoft isn’t the only one ramping up its investments in AI-enabled data centers. Rival cloud service providers are all either upgrading existing data centers or opening new ones to capture a larger share of business from developers and users of large language models (LLMs). In a report published in October 2024, Bloomberg Intelligence estimated that demand for generative AI would push Microsoft, AWS, Google, Oracle, Meta, and Apple to devote a combined $200 billion to capex in 2025, up from $110 billion in 2023. Microsoft is one of the biggest spenders, followed closely by Google and AWS, Bloomberg Intelligence said. Its estimate of Microsoft’s capital spending on AI, at $62.4 billion for calendar 2025, is lower than Smith’s claim that the company will invest $80 billion in the fiscal year to June 30, 2025. Both figures, though, are far higher than Microsoft’s 2020 capital expenditure of “just” $17.6 billion. The majority of the increased spending is tied to cloud services and the expansion of AI infrastructure needed to provide compute capacity for OpenAI workloads. Separately, last October Amazon CEO Andy Jassy said his company planned total capex spend of $75 billion in 2024 and even more in 2025, with much of it going to AWS, its cloud computing division.

John Deere unveils more autonomous farm machines to address skill labor shortage

Self-driving tractors might be the path to self-driving cars. John Deere has revealed a new line of autonomous machines and tech across agriculture, construction and commercial landscaping. The Moline, Illinois-based John Deere has been in business for 187 years, yet as a non-tech company it has become a regular at the big tech trade show in Las Vegas, and it is back at CES 2025 with more autonomous tractors and other vehicles. This is not something we usually cover, but John Deere has a lot of data that is interesting in the big picture of tech.

The message from the company is that there aren’t enough skilled farm laborers to do the work that its customers need. It’s been a challenge for most of the last two decades, said Jahmy Hindman, CTO at John Deere, in a briefing. Much of the tech will come this fall and after that. He noted that the average farmer in the U.S. is over 58 and works 12 to 18 hours a day to grow food for us. And he said the American Farm Bureau Federation estimates there are roughly 2.4 million farm jobs that need to be filled annually, and the agricultural workforce continues to shrink. (This is my hint to the anti-immigration crowd.)

John Deere’s autonomous 9RX Tractor. Farmers can oversee it using an app.

While each of these industries experiences its own set of challenges, a commonality across all is skilled labor availability. In construction, about 80% of contractors struggle to find skilled labor. And in commercial landscaping, 86% of landscaping business owners can’t find labor to fill open positions, he said. “They have to figure out how to do

2025 playbook for enterprise AI success, from agents to evals

2025 is poised to be a pivotal year for enterprise AI. The past year has seen rapid innovation, and this year will see the same. This has made it more critical than ever to revisit your AI strategy to stay competitive and create value for your customers. From scaling AI agents to optimizing costs, here are the five critical areas enterprises should prioritize for their AI strategy this year.

1. Agents: the next generation of automation

AI agents are no longer theoretical. In 2025, they’re indispensable tools for enterprises looking to streamline operations and enhance customer interactions. Unlike traditional software, agents powered by large language models (LLMs) can make nuanced decisions, navigate complex multi-step tasks, and integrate seamlessly with tools and APIs. At the start of 2024, agents were not ready for prime time, making frustrating mistakes like hallucinating URLs. They started getting better as frontier large language models themselves improved.

“Let me put it this way,” said Sam Witteveen, cofounder of Red Dragon, a company that develops agents for companies, and that recently reviewed the 48 agents it built last year. “Interestingly, the ones that we built at the start of the year, a lot of those worked way better at the end of the year just because the models got better.” Witteveen shared this in the video podcast we filmed to discuss these five big trends in detail.

Models are getting better and hallucinating less, and they’re also being trained to do agentic tasks. Another feature that the model providers are researching is a way to use the LLM as a judge, and as models get cheaper (something we’ll cover below), companies can use three or more models to

OpenAI’s red teaming innovations define new essentials for security leaders in the AI era

OpenAI has taken a more aggressive approach to red teaming than its AI competitors, demonstrating its security teams’ advanced capabilities in two areas: multi-step reinforcement and external red teaming. OpenAI recently released two papers that set a new competitive standard for improving the quality, reliability and safety of AI models in these two techniques and more.

The first paper, “OpenAI’s Approach to External Red Teaming for AI Models and Systems,” reports that specialized teams outside the company have proven effective in uncovering vulnerabilities that might otherwise have made it into a released model because in-house testing techniques may have missed them. In the second paper, “Diverse and Effective Red Teaming with Auto-Generated Rewards and Multi-Step Reinforcement Learning,” OpenAI introduces an automated framework that relies on iterative reinforcement learning to generate a broad spectrum of novel, wide-ranging attacks.

Going all-in on red teaming pays practical, competitive dividends

It’s encouraging to see competitive intensity in red teaming growing among AI companies. When Anthropic released its AI red team guidelines in June of last year, it joined AI providers including Google, Microsoft, Nvidia, OpenAI, and even the U.S.’s National Institute of Standards and Technology (NIST), which all had released red teaming frameworks. Investing heavily in red teaming yields tangible benefits for security leaders in any organization.

OpenAI’s paper on external red teaming provides a detailed analysis of how the company strives to create specialized external teams that include cybersecurity and subject matter experts. The goal is to see if knowledgeable external teams can defeat models’ security perimeters and find gaps in their security, biases and controls that prompt-based testing couldn’t find.
What makes OpenAI’s recent papers noteworthy is how well they define using human-in-the-middle
