Formulation of Feature Circuits with Sparse Autoencoders in LLM

Large language models (LLMs) have made impressive progress and can perform a wide variety of tasks, from generating human-like text to answering questions. However, understanding how these models work remains challenging, especially because of a phenomenon called superposition, in which multiple features are mixed into a single neuron, making it very difficult to extract human-understandable representations from the original model structure. This is where methods like the sparse autoencoder come in: they disentangle the features for interpretability.

In this blog post, we will use a sparse autoencoder to find feature circuits for a particularly interesting case, subject-verb agreement, and understand how the model components contribute to the task.

Key concepts 

Feature circuits 

In the context of neural networks, feature circuits are how networks learn to combine input features to form complex patterns at higher levels. We use the metaphor of “circuits” to describe how features are processed along layers in a neural network because such processes remind us of circuits in electronics processing and combining signals.

These feature circuits form gradually through the connections between neurons and layers, where each neuron or layer is responsible for transforming input features, and their interactions lead to useful feature combinations that play together to make the final predictions.

Here is one example of a feature circuit: in many vision neural networks, we can find “a circuit as a family of units detecting curves in different angular orientations. Curve detectors are primarily implemented from earlier, less sophisticated curve detectors and line detectors. These curve detectors are used in the next layer to create 3D geometry and complex shape detectors” [1].

In the following sections, we will work on one feature circuit in LLMs for a subject-verb agreement task.

Superposition and Sparse Autoencoder

In machine learning, superposition refers to the phenomenon in which one neuron in a model represents multiple overlapping features rather than a single, distinct one. For example, InceptionV1 contains one neuron that responds to cat faces, fronts of cars, and cat legs.

This is where the Sparse Autoencoder (SAE) comes in.

The SAE helps us disentangle the network’s activations into a set of sparse features. These sparse features are usually human-understandable, giving us a better understanding of the model. By applying an SAE to the hidden-layer activations of an LLM, we can isolate the features that contribute to the model’s output.
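For reference, a common way to write the SAE training objective is a reconstruction term plus a sparsity penalty. This post does not state its exact loss, so the formulation and the symbols below (encoder weights W_e, decoder weights W_d, biases b_e and b_d, sparsity coefficient λ) are assumptions used only for illustration:

```latex
f = \mathrm{ReLU}(W_e x + b_e), \qquad \hat{x} = W_d f + b_d, \qquad
\mathcal{L}(x) = \lVert x - \hat{x} \rVert_2^2 + \lambda \lVert f \rVert_1
```

Here x is a hidden activation vector of the model, f is the vector of sparse features, and the L1 term pushes most entries of f towards zero.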

You can find the details of how the SAE works in my previous blog post.

Case study: Subject-Verb Agreement

Subject-Verb Agreement 

Subject-verb agreement is a fundamental grammar rule in English. The subject and the verb in a sentence must agree in number, that is, both must be singular or both must be plural. For example:

  • “The cat runs.” (Singular subject, singular verb)
  • “The cats run.” (Plural subject, plural verb)

Understanding this rule, which is simple for humans, is important for tasks like text generation, translation, and question answering. But how do we know if an LLM has actually learned this rule?

We will now explore how an LLM forms a feature circuit for this task.

Building the Feature Circuit

Let’s now walk through the process of creating the feature circuit. We will do it in 4 steps:

  1. We start by inputting sentences into the model. For this case study, we consider sentences like:
    • “The cat runs.” (singular subject)
    • “The cats run.” (plural subject)
  2. We run the model on these sentences to get hidden activations. These activations represent how the model processes the sentences at each layer.
  3. We pass the activations to an SAE to “decompress” the features.
  4. We construct a feature circuit as a computational graph:
    • The input nodes represent the singular and plural sentences.
    • The hidden nodes represent the model layers that process the input.
    • The sparse nodes represent the features obtained from the SAE.
    • The output node represents the final decision. In this case: runs or run.
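The four steps above can be sketched roughly as follows. This is a minimal illustration and not the code behind the figures in this post: the stand-in model, the untrained SAE encoder, the one-hot input encoding, and the node names are all assumptions. It uses a PyTorch forward hook to capture the hidden activations.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in for a language model: 2 inputs -> 4 hidden -> 2 outputs.
model = nn.Sequential(nn.Linear(2, 4), nn.ReLU(), nn.Linear(4, 2))

# Hypothetical (untrained) SAE encoder expanding 4 hidden units into 8 features.
sae_encoder = nn.Sequential(nn.Linear(4, 8), nn.ReLU())

# Step 1: inputs. Assumed encoding: [1, 0] = singular subject, [0, 1] = plural.
inputs = torch.tensor([[1.0, 0.0], [0.0, 1.0]])

# Step 2: capture the hidden activations with a forward hook on the ReLU.
captured = {}

def save_hidden(module, args, output):
    captured["hidden"] = output.detach()

handle = model[1].register_forward_hook(save_hidden)
with torch.no_grad():
    logits = model(inputs)
handle.remove()

# Step 3: pass the hidden activations through the SAE encoder.
with torch.no_grad():
    features = sae_encoder(captured["hidden"])

# Step 4: collect the graph nodes: inputs, hidden units, SAE features, outputs.
nodes = {
    "input": ["singular", "plural"],
    "hidden": [f"H{i}" for i in range(captured["hidden"].shape[1])],
    "feature": [f"F{j}" for j in range(features.shape[1])],
    "output": ["runs", "run"],
}
print(nodes)
```

In a real LLM the hook would be attached to a chosen transformer layer instead of a toy ReLU, and the SAE would be trained before its features are read.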

Toy Model 

We start by building a toy language model, which may not make any linguistic sense at all, with the following code. It is a network with two simple layers.

For subject-verb agreement, the model is supposed to work as follows:

  • The input is a sentence with either a singular or a plural subject.
  • The hidden layer transforms this information into an abstract representation.
  • The model selects the correct verb form as output.

import torch.nn as nn


# ====== Define Base Model (Simulating Subject-Verb Agreement) ======
class SubjectVerbAgreementNN(nn.Module):
    """Toy two-layer network mapping a 2-d input to two verb logits."""

    def __init__(self):
        super().__init__()
        self.hidden = nn.Linear(2, 4)  # 2 inputs → 4 hidden activations
        self.output = nn.Linear(4, 2)  # 4 hidden → 2 outputs (runs/run)
        self.relu = nn.ReLU()

    def forward(self, x):
        x = self.relu(self.hidden(x))  # Compute hidden activations
        return self.output(x)  # Predict verb

It is unclear what happens inside the hidden layer, so we introduce the following sparse autoencoder:

# ====== Define Sparse Autoencoder (SAE) ======
class SparseAutoencoder(nn.Module):
    def __init__(self, input_dim, hidden_dim):
        super().__init__()
        self.encoder = nn.Linear(input_dim, hidden_dim)  # Decompress to sparse features
        self.decoder = nn.Linear(hidden_dim, input_dim)  # Reconstruct
        self.relu = nn.ReLU()

    def forward(self, x):
        # Feature activations (sparsity must be encouraged by the training loss)
        encoded = self.relu(self.encoder(x))
        decoded = self.decoder(encoded)  # Reconstruct original activations
        return encoded, decoded

We train the original model, SubjectVerbAgreementNN, and the sparse autoencoder with sentences designed to represent different singular and plural forms, such as “The cat runs” and “The babies run”. However, as noted above, the inputs to the toy model may not carry actual meaning.
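A minimal sketch of this two-stage training is shown below. The post does not give its training code here, so everything in it is an assumption: the one-hot encoding of singular and plural subjects, the label convention, the optimizer settings, and the L1 coefficient `l1_weight`. The layers are written out directly so that the block runs on its own.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Assumed encoding: [1, 0] = singular subject, [0, 1] = plural subject.
x = torch.tensor([[1.0, 0.0], [0.0, 1.0]])
# Assumed labels: 0 = "runs" (singular verb), 1 = "run" (plural verb).
y = torch.tensor([0, 1])

# Same shapes as the toy model: 2 inputs -> 4 hidden -> 2 outputs.
hidden_layer = nn.Linear(2, 4)
output_layer = nn.Linear(4, 2)
base_params = list(hidden_layer.parameters()) + list(output_layer.parameters())
base_opt = torch.optim.Adam(base_params, lr=0.05)

# Stage 1: train the base network to pick the verb form.
for _ in range(300):
    base_opt.zero_grad()
    hidden = torch.relu(hidden_layer(x))
    loss = nn.functional.cross_entropy(output_layer(hidden), y)
    loss.backward()
    base_opt.step()

# Stage 2: keep the base network fixed and train the SAE on its hidden activations.
with torch.no_grad():
    hidden = torch.relu(hidden_layer(x))

encoder = nn.Linear(4, 8)
decoder = nn.Linear(8, 4)
sae_params = list(encoder.parameters()) + list(decoder.parameters())
sae_opt = torch.optim.Adam(sae_params, lr=0.01)
l1_weight = 1e-3  # hypothetical sparsity coefficient

with torch.no_grad():
    initial_recon = nn.functional.mse_loss(
        decoder(torch.relu(encoder(hidden))), hidden
    ).item()

for _ in range(500):
    sae_opt.zero_grad()
    features = torch.relu(encoder(hidden))
    recon = nn.functional.mse_loss(decoder(features), hidden)
    sae_loss = recon + l1_weight * features.abs().mean()  # reconstruction + sparsity
    sae_loss.backward()
    sae_opt.step()

final_recon = recon.item()  # reconstruction error measured in the last iteration
print(f"reconstruction error: {initial_recon:.4f} -> {final_recon:.4f}")
```

Training the SAE only after the base model is fixed keeps the two objectives separate: the SAE explains the hidden layer without changing what the model has learned.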

Now we visualize the feature circuit. As introduced before, a feature circuit is a group of neurons that processes specific features. In our model, the feature circuit consists of:

  1. The hidden layer, which transforms language properties into an abstract representation.
  2. The SAE, whose independent features contribute directly to the subject-verb agreement task.

[Figure: Trained Feature Circuit: Singular vs. Plural (Dog/Dogs)]

You can see in the plot that we visualize the feature circuit as a graph: 

  • Hidden activations and the encoder’s outputs are all nodes of the graph.
  • The output nodes represent the verb form to predict.
  • Edges in the graph are weighted by activation strength, showing which pathways are most important in the subject-verb agreement decision. For example, you can see that the path from H3 to F2 plays an important role. 
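One way such edge weights could be computed is sketched below. The post does not define "activation strength" precisely, so the definition used here (the absolute value of a hidden activation times the encoder weight that connects it to a feature) is an assumption, and the numbers are made up so that the strongest edge echoes the H3 to F2 path mentioned above.

```python
# Hypothetical hidden activations for one input sentence (H0..H3).
hidden = [0.0, 0.2, 0.0, 1.5]

# Hypothetical SAE encoder weights: encoder_weights[j][i] connects H{i} to F{j}.
encoder_weights = [
    [0.1, -0.3, 0.0, 0.2],  # F0
    [0.0, 0.5, 0.1, -0.1],  # F1
    [0.2, 0.1, 0.4, 0.9],   # F2
]


def edge_weights(hidden, weights):
    """Return {(source, target): strength} for every hidden-to-feature edge.

    Strength is |activation * weight|, an assumed definition: how much
    hidden unit i moves the pre-activation of feature j for this input.
    """
    edges = {}
    for j, row in enumerate(weights):
        for i, w in enumerate(row):
            edges[(f"H{i}", f"F{j}")] = abs(hidden[i] * w)
    return edges


edges = edge_weights(hidden, encoder_weights)
strongest = max(edges, key=edges.get)
print(strongest, round(edges[strongest], 2))
```

The resulting dictionary can be handed to any graph-drawing library, with the strength used as the edge width or color.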

GPT2-Small 

For a real case, we run similar code on GPT2-small. We show the graph of a feature circuit representing the decision to choose the singular verb.

Feature Circuit for Subject-Verb agreement (run/runs). For code details and a larger version of the above, please refer to my notebook.

Conclusion 

Feature circuits help us understand how different parts of a complex LLM lead to a final output. We have shown that an SAE can be used to form a feature circuit for a subject-verb agreement task.

However, we have to admit that this method still needs some human intervention, in the sense that we don’t always know whether a circuit can really form without a proper design.

Reference 

[1] Zoom In: An Introduction to Circuits

Shape
Shape
Stay Ahead

Explore More Insights

Stay ahead with more perspectives on cutting-edge power, infrastructure, energy,  bitcoin and AI solutions. Explore these articles to uncover strategies and insights shaping the future of industries.

Shape

Chinese cyberspies target VMware vSphere for long-term persistence

Designed to work in virtualized environments The CISA, NSA, and Canadian Cyber Center analysts note that some of the BRICKSTORM samples are virtualization-aware and they create a virtual socket (VSOCK) interface that enables inter-VM communication and data exfiltration. The malware also checks the environment upon execution to ensure it’s running

Read More »

IBM boosts DNS protection for multicloud operations

“In addition to this DNS synchronization, you can publish DNS configurations to your Amazon Simple Storage Service (S3) bucket. As you implement DNS changes, the S3 bucket will automatically update. The ability to store multiple configurations in your S3 bucket allows you to choose the most appropriate restore point if

Read More »

Sanctioned Russian LNG Plant Ships to China

A Russian liquefied natural gas export facility delivered its first shipment to China since being sanctioned by the US in January, the latest sign of increased energy cooperation between Beijing and Moscow. The Valera vessel, which loaded a shipment from Gazprom PJSC’s Portovaya facility on the Baltic Sea in October, arrived at the Beihai import terminal in southern China on Monday, ship data compiled by Bloomberg shows. Both Valera and Portovaya were sanctioned by Joe Biden’s administration to thwart Russia’s plans to boost LNG exports. China, which doesn’t recognize the unilateral sanctions, has increasingly bought blacklisted Russian gas over the last few months, ratcheting up energy ties between the two countries. Beijing has also ignored a broader push by US President Donald Trump to halt sales of Russian oil, which will likely be a key part of trade negotiations between Washington and New Delhi this week. Russia has two relatively small LNG export facilities on the Baltic Sea, with the Novatek PJSC-led Vysotsk plant also blacklisted by the US. Another sanctioned Russian plant, the Arctic LNG 2 site in Siberia, started delivering fuel to Beihai in late August. Total Russian LNG shipments to China, including from unsanctioned plants, rose about 14 percent from September through November from the same period a year earlier, ship data shows. If unloaded, Valera would be the 19th shipment of LNG into China from a blacklisted Russian plant since August, the data shows. In mid-October, satellite images showed a tanker that loaded at Portovaya transferring fuel into another vessel registered to a Hong Kong-based company near Malaysia. That ship, known as CCH Gas, has been sending out false location signals, and was spotted by satellites near China last month. It isn’t clear where it is currently located. What do you think? We’d love to hear from

Read More »

Key Oil Price Firm Will Ignore Fuel from Russia Crude

One of the world’s main companies for setting benchmark prices of physical commodities said it will start to ignore fuel that’s made from Russian crude when making its assessments. The step by Platts, a unit of S&P Global Energy, effectively eliminates one source of supply that might be cheaper than others. The move will align with European Union rules.  On Nov. 18, Intercontinental Exchange Inc. set out rules that are more restrictive than those of the EU, which allow diesel from a refinery that processes Russian barrels into the bloc, provided the fuel’s from a production line that uses non-Russian oil. By contrast, Platts said that bids and offers that it considers for its assessment process “are expected to carry the implicit guarantee that the oil product will satisfy the EU’s import ban.”  Platts’s two key types of price assessment are cargoes and barge loads of fuel.  Cargo assessments will cease reflecting products made from Russian crude from Dec. 15, the pricing agency said in a statement. Barge prices will stop doing so from Jan. 2. The new EU measures taking effect next year will ban imports of fuels made with Russian crude as part of efforts to cripple revenues that help fund the Kremlin’s war in Ukraine. WHAT DO YOU THINK? Generated by readers, the comments included herein do not reflect the views and opinions of Rigzone. All comments are subject to editorial review. Off-topic, inappropriate or insulting comments will be removed.

Read More »

No Hurricanes Strike USA For 1st Time in a Decade

For the first time in a decade, not a single hurricane struck the U.S. this season, and that was a much needed break. That’s what Neil Jacobs, Under Secretary of Commerce for Oceans and Atmosphere, and National Oceanic and Atmospheric Administration (NOAA) Administrator, said in a statement posted on NOAA’s site recently, which summarized the Atlantic, Eastern Pacific, and Central Pacific hurricane seasons. “Still, a tropical storm caused damage and casualties in the Carolinas, distant hurricanes created rough ocean waters that caused property damage along the East Coast, and neighboring countries experienced direct hits from hurricanes,” Jacobs said in the statement. The NOAA statement noted that the Atlantic basin produced 13 named storms. Of these, five became hurricanes, including four major hurricanes, NOAA highlighted, pointing out that an average season has 14 named storms, seven hurricanes, and three major hurricanes. In the statement, NOAA said, overall, the season fell within the predicted ranges for named storms, hurricanes, and major hurricanes issued in NOAA’s seasonal outlooks. Hurricane season activity was near-normal for both the Eastern Pacific basin and Central Pacific basin and fell within predicted ranges, respectively, NOAA added in the statement. The organization highlighted that the Eastern Pacific basin hurricane season produced 18 named storms, “with nine becoming hurricanes and three intensifying to major hurricane status”. “Two named storms formed in the Central Pacific basin, with one, Iona, becoming a major hurricane well south of Hawaii,” NOAA added. “Eastern Pacific storms Henriette and Kiko were also hurricanes in the Central Pacific that passed northeast of Hawaii with little impact to the state,” it continued. AI Guidance In the NOAA statement, Jacobs said “the 2025 season was the first year NOAA’s National Hurricane Center incorporated Artificial Intelligence model guidance into their forecasts”. 
“The NHC [National Hurricane Center] performed exceedingly well when it came to forecasting rapid intensification for

Read More »

Chile Pens Nearly $12B Deals to Buy Vaca Muerta Oil

Chile’s Empresa Nacional del Petróleo (Enap) has signed contracts to purchase crude from Argentina’s Vaca Muerta shale patch from Argentina’s state-owned YPF SA, Norway’s majority state-owned Equinor ASA, Britain’s Shell PLC and Mexico’s Vista Energy SAB de CV. The agreements, which last through June 2033, amount to about 35 percent of Enap’s annual crude demand, Enap said in an online statement. YPF said separately the initial combined volume is up to 70,000 barrels per day (bpd). YPF said its share is around 32,000 bpd or 45.45 percent of the total volume. “The contracts, signed after a negotiation process and operational testing that lasted more than two years, involve a projected value of nearly $12 billion, making it the largest commercial agreement in Enap’s history”, Enap said. “As a reference, the total annual trade between Chile and Argentina is currently close to $8 billion”. The volumes will be delivered via the more than 400-kilometer (248.55 miles) Trans-Andean pipeline, co-owned between Enap, YPF and Chevron Corp. After 17 years, the pipeline resumed flows July 2023, delivering about 40,000 barrels per day of Vaca Muerta oil to Enap’s facilities in Hualpén, Región del Biobío, as previously reported by Enap. “The subscription of these contracts provides greater security and stability to the supply of crude oil, strengthens the country’s energy security, enhances the logistics chain on both sides of the mountain range and reduces dependence on maritime transport that is regularly impacted by factors such as weather conditions or port congestion”, Enap said. “In addition, it allows for the purchase of crude oil with a lower sulfur content, which is beneficial from an environmental point of view. “It also reinforces Enap’s recently announced positioning with regard to its logistics business, as it will enable the export of crude oil from Vaca Muerta through the

Read More »

WoodMac Flags ‘Key Themes’ Shaping Lower 48 in 2026

Wood Mackenzie (WoodMac) identified several “key themes shaping the U.S. Lower 48 landscape” next year in a statement sent to Rigzone recently. Among these was a projection that the horizontal rig count will fall below 500. “Oil focused activity levels will decline as operators face macro headwinds, particularly in H1 2026,” WoodMac said in the statement. “This sits below the $60 per barrel threshold that sparks questions around investment strategy,” it added. WoodMac said in the statement, however, that declining rig count is no longer the needle mover it once was. “Major strides in operational efficiency have reduced the number of active rigs required to maintain base business,” the company stated. “Operators are drilling faster, and cycle times are improving,” it added. WoodMac went on to note in the statement that “the activity taper will create deflationary pressures on costs”. “Wood Mackenzie expects to see a modest reduction in drilling and completion costs across the Lower 48 in 2026, including tariffs,” it said. “Lower costs help protect most of the new drill supply curve. Even at $60 per barrel Brent, more than 90 percent of U.S. Lower 48 assets can cover their capex requirements, with all assets covering operating costs,” it continued. Another theme was a projection that core Permian plays produce more than 50 percent of U.S. onshore liquids next year. “Lower 48 oil production will stall in 2026 for the first time since the pandemic,” WoodMac warned. “Rigs falling throughout 2025 and less activity in the year create this culmination. The Permian remains resilient and the powerhouse of U.S. oil supply,” it added. “Combined 2026 production from the Delaware Wolfcamp, Bone Spring, Midland Wolfcamp, and Midland Spraberry will account for more than 50 percent of onshore U.S. oil output for the first time ever,” it continued. “Delaware Wolfcamp

Read More »

DTECH 2026: 5 observations ahead of the biggest grid event of the year

The energy transition isn’t coming — it’s here. Across North America, utilities are navigating an unprecedented convergence of challenges: exponential load growth from data centers and electrification, rising reliability expectations and an accelerating influx of distributed energy resources (DERs). At DTECH 2026, Feb. 2-5 in San Diego, OATI will showcase how utilities can turn these pressures into opportunities through a simple, unifying concept: flexibility. “Flexibility is no longer optional — it’s the operating principle of the modern grid,” says Sasan Mokhtari, OATI’s president & CEO. “The future belongs to the utilities that can orchestrate DERs, manage load growth intelligently and connect operations from meters to markets.” 1. From DER chaos to DERMS clarity Distributed Energy Resource Management Systems (DERMS) have evolved from niche pilots to mission-critical platforms. OATI would know—we deployed the first DERMS in North America in 2009 and created what would define a generation of grid modernization. What was once an experiment in aggregation is now the foundation of reliable, data-driven grid management. Modern DERMS platforms, like OATI DERMS, enable utilities to see, forecast and control a complex web of rooftop solar, battery storage, EV chargers and flexible loads in real time. They bring together three critical capabilities: Visibility — a unified picture of all DERs across the grid Optimization — dispatch decisions that balance economics, carbon and reliability Market Integration — the ability to monetize flexibility through participation in wholesale markets At DTECH 2026, OATI will demonstrate how its DERMS platform bridges the divide between bulk power and distribution operations, uniting IT and OT operations and helping utilities truly manage the grid from meters to markets. 2. 
The new face of load growth: Data centers, EVs and electrification The growth of data centers and electric transportation is transforming grid demand faster than many planners ever imagined. In 2024 alone, U.S. data

Read More »

What does Arm need to do to gain enterprise acceptance?

But in 2017, AMD released the Zen architecture, which was equal if not superior to the Intel architecture. Zen made AMD competitive, and it fueled an explosive rebirth for a company that was near death a few years prior. AMD now has about 30% market share, while Intel suffers from a loss of technology as well as corporate leadership. Now, customers have a choice of Intel or AMD, and they don’t have to worry about porting their applications to a new platform like they would have to do if they switched to Arm. Analysts weigh in on Arm Tim Crawford sees no demand for Arm in the data center. Crawford is president of AVOA, a CIO consultancy. In his role, he talks to IT professionals all the time, but he’s not hearing much interest in Arm. “I don’t see Arm really making a dent, ever, into the general-purpose processor space,” Crawford said. “I think the opportunity for Arm is special applications and special silicon. If you look at the major cloud providers, their custom silicon is specifically built to do training or optimized to do inference. Arm is kind of in the same situation in the sense that it has to be optimized.” “The problem [for Arm] is that there’s not necessarily a need to fulfill at this point in time,” said Rob Enderle, principal analyst with The Enderle Group. “Obviously, there’s always room for other solutions, but Arm is still going to face the challenge of software compatibility.” And therein lies what may be Arm’s greatest challenge: software compatibility. Software doesn’t care (usually) if it’s on Intel or AMD, because both use the x86 architecture, with some differences in extensions. But Arm is a whole new platform, and that requires porting and testing. Enterprises generally don’t like disruption —

Read More »

Intel decides to keep networking business after all

That doesn’t explain why Intel made the decision to pursue spin-off in the first place. In July, NEX chief Sachin Katti issued a memo that outlined plans to establish key elements of the Networking and Communications business as a stand-alone company. It looked like a done deal, experts said. Jim Hines, research director for enabling technologies and semiconductors at IDC, declined to speculate on whether Intel could get a decent offer but noted NEX is losing ground. IDC estimates Intel’s market share in overall semiconductors at 6.8% in Q3 2025, which is down from 7.4% for the full year 2024 and 9.2% for the full year 2023. Intel’s course reversal “is a positive for Intel in the long term, and recent improvements in its financial situation may have contributed to the decision to keep NEX in house,” he said. When Tan took over as CEO earlier this year, prioritized strengthening the balance sheet and bringing a greater focus on execution. Divest NEX was aligned with these priorities, but since then, Intel has secured investments from the US Government, Nvidia and SoftBank that have reduced the need to raise cash through other means, Hines notes. “The NEX business will prove to be a strategic asset for Intel as it looks to protect and expand its position in the AI datacenter market. Success in this market now requires processor suppliers to offer a full-stack solution, not just silicon. Scale-up and scale-out networking solutions are a key piece of the package, and Intel will be able to leverage its NEX technologies and software, including silicon photonics, to develop differentiated product offerings in this space,” Hines said.

Read More »

At the Crossroads of AI and the Edge: Inside 1623 Farnam’s Rising Role as a Midwest Interconnection Powerhouse

That was the thread that carried through our recent conversation for the DCF Show podcast, where Severn walked through the role Farnam now plays in AI-driven networking, multi-cloud connectivity, and the resurgence of regional interconnection as a core part of U.S. digital infrastructure. Aggregation, Not Proximity: The Practical Edge Severn is clear-eyed about what makes the edge work and what doesn’t. The idea that real content delivery could aggregate at the base of cell towers, he noted, has never been realistic. The traffic simply isn’t there. Content goes where the network already concentrates, and the network concentrates where carriers, broadband providers, cloud onramps, and CDNs have amassed critical mass. In Farnam’s case, that density has grown steadily since the building changed hands in 2018. At the time an “underappreciated asset,” the facility has since become a meeting point for more than 40 broadband providers and over 60 carriers, with major content operators and hyperscale platforms routing traffic directly through its MMRs. That aggregation effect feeds on itself; as more carrier and content traffic converges, more participants anchor themselves to the hub, increasing its gravitational pull. Geography only reinforces that position. Located on the 41st parallel, the building sits at the historical shortest-distance path for early transcontinental fiber routes. It also lies at the crossroads of major east–west and north–south paths that have made Omaha a natural meeting point for backhaul routes and hyperscale expansions across the Midwest. AI and the New Interconnection Economy Perhaps the clearest sign of Farnam’s changing role is the sheer volume of fiber entering the building. More than 5,000 new strands are being brought into the property, with another 5,000 strands being added internally within the Meet-Me Rooms in 2025 alone. These are not incremental upgrades—they are hyperscale-grade expansions driven by the demands of AI traffic,

Read More »

Schneider Electric’s $2.3 Billion in AI Power and Cooling Deals Sends Message to Data Center Sector

When Schneider Electric emerged from its 2025 North American Innovation Summit in Las Vegas last week with nearly $2.3 billion in fresh U.S. data center commitments, it didn’t just notch a big sales win. It arguably put a stake in the ground about who controls the AI power-and-cooling stack over the rest of this decade. Within a single news cycle, Schneider announced: Together, the deals total about $2.27 billion in U.S. data center infrastructure, a number Schneider confirmed in background with multiple outlets and which Reuters highlighted as a bellwether for AI-driven demand.  For the AI data center ecosystem, these contracts function like early-stage fuel supply deals for the power and cooling systems that underpin the “AI factory.” Supply Capacity Agreements: Locking in the AI Supply Chain Significantly, both deals are structured as supply capacity agreements, not traditional one-off equipment purchase orders. Under the SCA model, Schneider is committing dedicated manufacturing lines and inventory to these customers, guaranteeing output of power and cooling systems over a multi-year horizon. In return, Switch and Digital Realty are providing Schneider with forecastable volume and visibility at the scale of gigawatt-class campus build-outs.  A Schneider spokesperson told Reuters that the two contracts are phased across 2025 and 2026, underscoring that this arrangement is about pipeline, as opposed to a one-time backlog spike.  That structure does three important things for the market: Signals confidence that AI demand is durable.You don’t ring-fence billions of dollars of factory output for two customers unless you’re highly confident the AI load curve runs beyond the current GPU cycle. Pre-allocates power & cooling the way the industry pre-allocated GPUs.Hyperscalers and neoclouds have already spent two years locking up Nvidia and AMD capacity. These SCAs suggest power trains and thermal systems are joining chips on the list of constrained strategic resources.

Read More »

The Data Center Power Squeeze: Mapping the Real Limits of AI-Scale Growth

As we all know, the data center industry is at a crossroads. As artificial intelligence reshapes the already insatiable digital landscape, the demand for computing power is surging at a pace that outstrips the growth of the US electric grid. As engines of the AI economy, an estimated 1,000 new data centers1 are needed to process, store, and analyze the vast datasets that run everything from generative models to autonomous systems. But this transformation comes with a steep price and the new defining criteria for real estate: power. Our appetite for electricity is now the single greatest constraint on our expansion, threatening to stall the very innovation we enable. In 2024, US data centers consumed roughly 4% of the nation’s total electricity, a figure that is projected to triple by 2030, reaching 12% or more.2 For AI-driven hyperscale facilities, the numbers are even more staggering. With the largest planned data centers requiring gigawatts of power, enough to supply entire cities, the cumulative demand from all data centers is expected to reach 134 gigawatts by 2030, nearly three times the current load.​3 This presents a systemic challenge. The U.S. power grid, built for a different era, is struggling to keep pace. Utilities are reporting record interconnection requests, with some regions seeing demand projections that exceed their total system capacity by fivefold.4 In Virginia and Texas, the epicenters of data center expansion, grid operators are warning of tight supply-demand balances and the risk of blackouts during peak periods.5 The problem is not just the sheer volume of power needed, but the speed at which it must be delivered. Data center operators are racing to secure power for projects that could be online in as little as 18 months, but grid upgrades and new generation can take years, if not decades. The result

Read More »
