SWiRL: The business case for AI that thinks like your best problem-solvers

Researchers from Stanford University and Google DeepMind have unveiled Step-Wise Reinforcement Learning (SWiRL), a technique designed to enhance the ability of large language models (LLMs) to tackle complex tasks requiring multi-step reasoning and tool use. 

As the interest in AI agents and LLM tool use continues to increase, this technique could offer substantial benefits for enterprises looking to integrate reasoning models into their applications and workflows.

The challenge of multi-step problems

Real-world enterprise applications often involve multi-step processes. For example, planning a complex marketing campaign may involve market research, internal data analysis, budget calculations and a review of customer support tickets. Completing these steps requires online searches, access to internal databases and the ability to run code.

Traditional reinforcement learning (RL) methods used to fine-tune LLMs, such as Reinforcement Learning from Human Feedback (RLHF) or RL from AI Feedback (RLAIF), typically focus on optimizing models for single-step reasoning tasks. 

The lead authors of the SWiRL paper, Anna Goldie, research scientist at Google DeepMind, and Azalia Mirhoseini, assistant professor of computer science at Stanford University, believe that current LLM training methods are not suited for the multi-step reasoning tasks that real-world applications require.

“LLMs trained via traditional methods typically struggle with multi-step planning and tool integration, meaning that they have difficulty performing tasks that require retrieving and synthesizing documents from multiple sources (e.g., writing a business report) or multiple steps of reasoning and arithmetic calculation (e.g., preparing a financial summary),” they told VentureBeat.

Step-Wise Reinforcement Learning (SWiRL)

SWiRL tackles this multi-step challenge through a combination of synthetic data generation and a specialized RL approach that trains models on entire sequences of actions. 

As the researchers state in their paper, “Our goal is to teach the model how to decompose complex problems into a sequence of more manageable subtasks, when to call the tool, how to formulate a call to the tool, when to use the results of these queries to answer the question, and how to effectively synthesize its findings.”

SWiRL employs a two-stage methodology. First, it generates and filters large amounts of multi-step reasoning and tool-use data. Second, it uses a step-wise RL algorithm to optimize a base LLM using these generated trajectories. 

“This approach has the key practical advantage that we can quickly generate large volumes of multi-step training data via parallel calls to avoid throttling the training process with slow tool use execution,” the paper notes. “In addition, this offline process enables greater reproducibility due to having a fixed dataset.”

Generating training data

SWiRL data generation process. Credit: arXiv

The first stage involves creating the synthetic data SWiRL learns from. An LLM is given access to a relevant tool, like a search engine or a calculator. The model is then prompted iteratively to generate a “trajectory,” a sequence of steps to solve a given problem. At each step, the model can generate internal reasoning (its “chain of thought“), call a tool, or produce the final answer. If it calls a tool, the query is extracted, executed (e.g., a search is performed), and the result is fed back into the model’s context for the next step. This continues until the model provides a final answer.
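In code, this data-generation loop might look like the minimal Python sketch below. The model-call and tool functions (`generate`, `run_search`, `run_calculator`) and the tag format used to detect tool calls are hypothetical stand-ins for illustration, not details from the paper.

```python
import re

MAX_STEPS = 10  # cap on the number of steps per trajectory


def generate(context: str) -> str:
    """Call the LLM on the context so far (hypothetical stand-in)."""
    raise NotImplementedError


def run_search(query: str) -> str:
    """Execute a search query (hypothetical stand-in)."""
    raise NotImplementedError


def run_calculator(expr: str) -> str:
    """Evaluate an arithmetic expression (toy; use a safe parser in practice)."""
    return str(eval(expr))


def generate_trajectory(question: str) -> list[tuple[str, str]]:
    """Iteratively prompt the model, executing any tool call it emits.

    Returns the trajectory as (source, text) pairs, where source is
    'model' for a model action or 'tool' for a tool result.
    """
    context, steps = question, []
    for _ in range(MAX_STEPS):
        step = generate(context)  # reasoning, a tool call, or a final answer
        steps.append(("model", step))
        context += "\n" + step
        search = re.search(r"<search>(.*?)</search>", step, re.S)
        calc = re.search(r"<calc>(.*?)</calc>", step, re.S)
        if search:
            result = run_search(search.group(1))
        elif calc:
            result = run_calculator(calc.group(1))
        else:
            break  # no tool call: the model produced its final answer
        steps.append(("tool", result))
        context += "\n" + result  # feed the tool result back into context
    return steps
```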

Each complete trajectory, from the initial prompt to the final answer, is then broken down into multiple overlapping sub-trajectories. Each sub-trajectory represents the process up to a specific action, providing a granular view of the model’s step-by-step reasoning. Using this method, the team compiled large datasets based on questions from multi-hop question-answering (HotPotQA) and math problem-solving (GSM8K) benchmarks, generating tens of thousands of trajectories.
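Under the trajectory representation from the sketch above, this decomposition is a simple prefix split: each model action becomes one training example whose input is everything that preceded it. A minimal sketch, again with hypothetical names:

```python
def split_into_subtrajectories(question: str,
                               steps: list[tuple[str, str]]) -> list[dict]:
    """One training example per model action: the context up to that point,
    paired with the action the model took next."""
    examples, context = [], question
    for source, text in steps:
        if source == "model":
            examples.append({"context": context, "target": text})
        context += "\n" + text  # tool results also extend the context
    return examples
```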

The researchers explored four different data filtering strategies: no filtering, filtering based solely on the correctness of the final answer (outcome filtering), filtering based on the judged reasonableness of each individual step (process filtering) and filtering based on both process and outcome.
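As a sketch, the four regimes can be expressed as a single predicate over a trajectory, assuming a judge that scores each step's reasonableness and a checker for the final answer (both hypothetical stand-ins here, not the paper's implementation):

```python
from dataclasses import dataclass


@dataclass
class Trajectory:
    model_steps: list[str]
    final_answer: str


def judge_step(step: str) -> bool:
    """Ask a judge model whether this step is reasonable (hypothetical)."""
    raise NotImplementedError


def is_correct(answer: str) -> bool:
    """Compare against the dataset's gold answer (hypothetical)."""
    raise NotImplementedError


def keep(traj: Trajectory, strategy: str) -> bool:
    """The four filtering regimes described above."""
    process_ok = all(judge_step(s) for s in traj.model_steps)
    outcome_ok = is_correct(traj.final_answer)
    return {
        "none": True,                       # no filtering
        "outcome": outcome_ok,              # final answer is correct
        "process": process_ok,              # every step judged reasonable
        "both": process_ok and outcome_ok,  # process and outcome
    }[strategy]
```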

Many standard approaches, such as Supervised Fine-Tuning (SFT), rely heavily on “golden labels” (perfect, predefined correct answers) and often discard data that does not lead to the correct final answer. Recent popular RL approaches, such as the one used in DeepSeek-R1, also use outcome-based rewards to train the model.

In contrast, SWiRL achieved its best results using process-filtered data. This means the data included trajectories where each reasoning step or tool call was deemed logical given the previous context, even if the final answer turned out to be wrong. 

The researchers found that SWiRL can “learn even from trajectories that end in incorrect final answers. In fact, we achieve our best results by including process-filtered data, regardless of the correctness of the outcome.” 

Training LLMs with SWiRL

SWiRL training process. Credit: arXiv

In the second stage, SWiRL uses reinforcement learning to train a base LLM on the generated synthetic trajectories. At every step within a trajectory, the model is optimized to predict the next appropriate action (an intermediate reasoning step, a tool call, or the final answer) based on the preceding context.

The LLM receives feedback at each step from a separate generative reward model, which assesses the model’s generated action given the context up to that point. 

“Our granular, step-by-step finetuning paradigm enables the model to learn both local decision-making (next-step prediction) and global trajectory optimization (final response generation) while being guided by immediate feedback on the soundness of each prediction,” the researchers write.
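The article does not spell out the exact training objective, but one simple way to realize reward-guided, step-wise finetuning is to weight a standard next-action loss by the reward model's score for each step. Below is a minimal sketch assuming Hugging Face-style causal-LM `model` and `tokenizer` objects; `reward_model` is a hypothetical callable returning a scalar score, and this is an illustration rather than the paper's exact method.

```python
def swirl_step(model, tokenizer, reward_model, optimizer,
               context: str, action: str) -> float:
    """One step-wise update: reward-weighted next-action prediction."""
    reward = reward_model(context, action)  # hypothetical API, scalar in [0, 1]
    ids = tokenizer(context + action, return_tensors="pt")["input_ids"]
    ctx_len = tokenizer(context, return_tensors="pt")["input_ids"].shape[1]
    labels = ids.clone()
    # Mask context tokens so the loss covers only the action tokens.
    # (Token alignment is approximate here; production code should
    # tokenize context and action more carefully.)
    labels[:, :ctx_len] = -100
    loss = reward * model(input_ids=ids, labels=labels).loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return float(loss)
```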

SWiRL during inference. Credit: arXiv

At inference time, a SWiRL-trained model works in the same iterative fashion. It receives a prompt and generates text in response. If it outputs a tool call (such as a search query or a mathematical expression), the system parses it, executes the tool, and feeds the result back into the model’s context window. The model then continues generating, potentially making more tool calls, until it outputs a final answer or reaches a pre-set limit on the number of steps.
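Note that this is the same parse-execute-feed-back loop as in the data-generation sketch above, now driven by the trained model and capped by a step budget. A hypothetical usage, reusing `generate_trajectory` from that sketch:

```python
# Hypothetical example question; the last model output is the final answer.
steps = generate_trajectory(
    "What was the population of the host city of the 2008 Olympics, "
    "divided by 1,000?"
)
source, final_answer = steps[-1]
print(final_answer)
```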

“By training the model to take reasonable steps at each moment in time (and to do so in a coherent and potentially more explainable way), we address a core weakness of traditional LLMs, namely their brittleness in the face of complex, multi-step tasks, where the probability of success decays exponentially with path length,” Goldie and Mirhoseini said. “Useful and robust Enterprise AI will inevitably need to integrate a wide variety of different tools, chaining them together into complex sequences.”

SWiRL in action

The Stanford and Google DeepMind team evaluated SWiRL across several challenging multi-step question-answering and mathematical reasoning tasks. Compared to baseline models, SWiRL demonstrated significant relative accuracy improvements, ranging from 11% to over 21% on datasets like GSM8K, HotPotQA, MuSiQue and BeerQA.

The experiments confirmed that training a Gemma 2-27B model with SWiRL on process-filtered data yielded the best results, outperforming models trained on outcome-filtered data or using traditional SFT. This suggests SWiRL learns the underlying reasoning process more effectively, rather than just memorizing paths to correct answers, which aids performance on unseen problems.

More importantly, SWiRL exhibited strong generalization capabilities. For example, training a model using SWiRL on text-based question-answering examples improved its performance on math reasoning tasks, even though the model wasn’t explicitly trained on math problems. 

This transferability across tasks and tool types is highly valuable as agentic applications for language models proliferate: methods that generalize across datasets and tasks will be easier, cheaper and faster to adapt to new environments.

“SWiRL’s generalization seems quite robust in the domains that we explored, but it would be interesting to test this in other areas such as coding,” Goldie and Mirhoseini said. “Our findings suggest that an enterprise AI model trained on one core task using SWiRL would likely exhibit significant performance improvements on other, seemingly unrelated tasks without task-specific fine-tuning. SWiRL generalizes better when applied to larger (i.e. more powerful) models, indicating that this technique may be even more effective in the future as baseline capabilities grow.”
