Large Language Models (LLMs) have made impressive progress and can perform a variety of tasks, from generating human-like text to answering questions. However, understanding how these models work remains challenging, in particular due to a phenomenon called superposition, where many features are mixed into one neuron, making it very difficult to extract human-understandable representations from the original model structure. This is where methods like the Sparse Autoencoder come in to disentangle features for interpretability.
In this blog post, we will use a Sparse Autoencoder to find some feature circuits for a particularly interesting case, subject-verb agreement, and understand how the model components contribute to the task.
Key concepts
Feature circuits
In the context of neural networks, feature circuits describe how networks learn to combine input features to form complex patterns at higher levels. We use the metaphor of “circuits” to describe how features are processed along the layers of a neural network because such processes remind us of electronic circuits that process and combine signals.
These feature circuits form gradually through the connections between neurons and layers, where each neuron or layer is responsible for transforming input features, and their interactions lead to useful feature combinations that play together to make the final predictions.
Here is one example of feature circuits: in lots of vision neural networks, we can find “a circuit as a family of units detecting curves in different angular orientations. Curve detectors are primarily implemented from earlier, less sophisticated curve detectors and line detectors. These curve detectors are used in the next layer to create 3D geometry and complex shape detectors” [1].
In the following sections, we will work on one feature circuit in LLMs for a subject-verb agreement task.
Superposition and Sparse Autoencoder
In the context of machine learning, we sometimes observe superposition: the phenomenon in which one neuron in a model represents multiple overlapping features rather than a single, distinct one. For example, InceptionV1 contains one neuron that responds to cat faces, fronts of cars, and cat legs.
This is where the Sparse Autoencoder (SAE) comes in.
The SAE helps us disentangle the network’s activations into a set of sparse features. These sparse features are normally human-understandable, allowing us to get a better understanding of the model. By applying an SAE to the hidden-layer activations of an LLM, we can isolate the features that contribute to the model’s output.
You can find the details of how the SAE works in my former blog post.
Case study: Subject-Verb Agreement
Subject-Verb Agreement
Subject-verb agreement is a fundamental grammar rule in English: the subject and the verb in a sentence must agree in number, i.e., singular or plural. For example:
- “The cat runs.” (Singular subject, singular verb)
- “The cats run.” (Plural subject, plural verb)
This rule is simple for humans, yet understanding it is important for tasks like text generation, translation, and question answering. But how do we know if an LLM has actually learned this rule?
In this section, we will explore how the LLM forms a feature circuit for such a task.
Building the Feature Circuit
Let’s now walk through the process of creating the feature circuit. We will do it in four steps:
- We start by inputting sentences into the model. For this case study, we consider sentences like:
- “The cat runs.” (singular subject)
- “The cats run.” (plural subject)
- We run the model on these sentences to get hidden activations. These activations represent how the model processes the sentences at each layer.
- We pass the activations to an SAE to “decompress” the features.
- We construct a feature circuit as a computational graph:
- The input nodes represent the singular and plural sentences.
- The hidden nodes represent the model layers to process the input.
- The sparse nodes represent the features obtained from the SAE.
- The output node represents the final decision. In this case: runs or run.
Toy Model
We start by building a toy language model with the following code. It may make no real linguistic sense; it is simply a network with two simple layers.
For the subject-verb agreement, the model is supposed to:
- Take as input a sentence with either a singular or plural subject.
- The hidden layer transforms such information into an abstract representation.
- The model selects the correct verb form as output.
# ====== Define Base Model (Simulating Subject-Verb Agreement) ======
import torch
import torch.nn as nn

class SubjectVerbAgreementNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.hidden = nn.Linear(2, 4)   # 2 inputs → 4 hidden activations
        self.output = nn.Linear(4, 2)   # 4 hidden activations → 2 outputs (runs/run)
        self.relu = nn.ReLU()

    def forward(self, x):
        x = self.relu(self.hidden(x))   # Compute hidden activations
        return self.output(x)           # Predict the verb form
It is unclear what happens inside the hidden layer, so we introduce the following Sparse Autoencoder:
# ====== Define Sparse Autoencoder (SAE) ======
class SparseAutoencoder(nn.Module):
    def __init__(self, input_dim, hidden_dim):
        super().__init__()
        self.encoder = nn.Linear(input_dim, hidden_dim)  # "Decompress" activations into sparse features
        self.decoder = nn.Linear(hidden_dim, input_dim)  # Reconstruct the original activations
        self.relu = nn.ReLU()

    def forward(self, x):
        encoded = self.relu(self.encoder(x))  # Sparse feature activations
        decoded = self.decoder(encoded)       # Reconstruction of the original activations
        return encoded, decoded
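To see how the two networks connect, here is a minimal sketch of the forward data flow with untrained instances. The two-dimensional input encoding (one slot for a singular subject, one for a plural subject) and the number of SAE features are assumptions of this sketch, not part of the original setup.

# ====== Sketch: data flow from base model to SAE (assumed input encoding) ======
import torch

model = SubjectVerbAgreementNN()
sae = SparseAutoencoder(input_dim=4, hidden_dim=4)  # feature count chosen arbitrarily for this sketch

# Assumed toy encoding: [1, 0] = singular subject, [0, 1] = plural subject
x = torch.tensor([[1.0, 0.0], [0.0, 1.0]])

with torch.no_grad():
    h = model.relu(model.hidden(x))  # the base model's hidden activations, shape (2, 4)
    encoded, decoded = sae(h)        # sparse features and their reconstruction of h

print(encoded)  # the sparse features we will read the circuit from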
We train the original model SubjectVerbAgreementNN and the SparseAutoencoder with sentences designed to represent different singular and plural verb forms, such as “The cat runs” and “The babies run”. However, just as before, the toy inputs may not carry actual meaning.
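The full training code is omitted here; below is a rough sketch of how such training could look, assuming the toy inputs are the two-dimensional singular/plural indicators from above and the labels index the correct verb form (0 = “runs”, 1 = “run”). The SAE is trained on the frozen hidden activations with a reconstruction loss plus an L1 sparsity penalty.

# ====== Sketch: training the base model, then the SAE (assumed toy data) ======
import torch
import torch.nn as nn

# Assumed toy dataset: [1, 0] → "runs" (label 0), [0, 1] → "run" (label 1)
X = torch.tensor([[1.0, 0.0], [0.0, 1.0]] * 50)
y = torch.tensor([0, 1] * 50)

# --- Train the base model on the agreement task ---
model = SubjectVerbAgreementNN()
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
ce = nn.CrossEntropyLoss()
for _ in range(200):
    opt.zero_grad()
    loss = ce(model(X), y)
    loss.backward()
    opt.step()

# --- Train the SAE on the frozen hidden activations ---
with torch.no_grad():
    H = model.relu(model.hidden(X))  # hidden activations, shape (100, 4)

sae = SparseAutoencoder(input_dim=4, hidden_dim=4)
opt_sae = torch.optim.Adam(sae.parameters(), lr=1e-2)
for _ in range(200):
    opt_sae.zero_grad()
    encoded, decoded = sae(H)
    loss = ((decoded - H) ** 2).mean() + 1e-3 * encoded.abs().mean()  # reconstruction + L1 sparsity
    loss.backward()
    opt_sae.step()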
Now we visualise the feature circuit. As introduced before, a feature circuit is a unit of neurons that processes specific features. In our model, the circuit consists of:
- The hidden layer, which transforms language properties into an abstract representation.
- The SAE, with independent features that contribute directly to the subject-verb agreement task.

As you can see in the plot, we visualize the feature circuit as a graph:
- Hidden activations and the encoder’s outputs are all nodes of the graph.
- The output nodes represent the correct verb form.
- Edges in the graph are weighted by activation strength, showing which pathways are most important in the subject-verb agreement decision. For example, you can see that the path from H3 to F2 plays an important role.
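A plot like this can be produced with a standard graph library. Below is a minimal sketch using networkx, continuing from the sketches above (so model and sae are assumed to be the trained toy networks); the node names and the way edge weights are derived (encoder weight magnitudes for H→F edges, feature activations for F→output edges) are choices made for this sketch, not necessarily the exact recipe behind the plot.

# ====== Sketch: drawing the feature circuit graph (assumed weighting scheme) ======
import torch
import networkx as nx
import matplotlib.pyplot as plt

G = nx.DiGraph()
hidden_nodes = [f"H{i}" for i in range(4)]
feature_nodes = [f"F{j}" for j in range(4)]

# H → F edges, weighted by the magnitude of the SAE encoder weights
enc_w = sae.encoder.weight.detach().abs()  # shape: (n_features, n_hidden)
for j, f in enumerate(feature_nodes):
    for i, h in enumerate(hidden_nodes):
        G.add_edge(h, f, weight=float(enc_w[j, i]))

# F → output edges, weighted by how strongly each feature fires for each input class
with torch.no_grad():
    acts, _ = sae(model.relu(model.hidden(torch.eye(2))))  # row 0: singular, row 1: plural
for j, f in enumerate(feature_nodes):
    G.add_edge(f, "runs", weight=float(acts[0, j]))
    G.add_edge(f, "run", weight=float(acts[1, j]))

pos = nx.spring_layout(G, seed=0)
widths = [1 + 3 * G[u][v]["weight"] for u, v in G.edges()]
nx.draw(G, pos, with_labels=True, node_color="lightblue", width=widths)
plt.show()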
GPT2-Small
For a real case, we run similar code on GPT2-small. We show the graph of a feature circuit representing the decision to choose the singular verb.
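The full code for this experiment is not reproduced here; below is a rough sketch of how the activations can be gathered with the TransformerLens library before training an SAE on them. The prompts, the choice of layer 6, and the hook point are assumptions of this sketch, not necessarily the exact setup behind the plot.

# ====== Sketch: gathering GPT2-small activations with TransformerLens ======
# pip install transformer_lens
import torch
from transformer_lens import HookedTransformer

model = HookedTransformer.from_pretrained("gpt2")  # GPT2-small

prompts = ["The cat", "The cats"]
logits, cache = model.run_with_cache(prompts)

# Compare the model's preference for the singular vs. plural verb form at the last position
runs_id = model.to_single_token(" runs")
run_id = model.to_single_token(" run")
last_logits = logits[:, -1, :]
print(last_logits[:, runs_id] - last_logits[:, run_id])  # positive → the model prefers "runs"

# Residual-stream activations at a middle layer (layer 6 is an arbitrary choice here)
resid = cache["resid_post", 6]  # shape: (batch, seq, d_model)

# These activations would then be fed to an SAE trained on this layer (training not shown),
# e.g. encoded, decoded = sae(resid[:, -1, :]), and the strongest features become circuit nodes.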

Conclusion
Feature circuits help us understand how different parts of a complex LLM contribute to a final output. We showed that it is possible to use an SAE to build a feature circuit for a subject-verb agreement task.
However, we have to admit that this method still requires some human intervention, in the sense that we don’t always know whether a circuit will really form without a proper design.
References
[1] Olah, C., et al. “Zoom In: An Introduction to Circuits.” Distill, 2020. https://distill.pub/2020/circuits/zoom-in/