Anthropic scientists expose how AI actually ‘thinks’ — and discover it secretly plans ahead and sometimes lies

Anthropic has developed a new method for peering inside large language models like Claude, revealing for the first time how these AI systems process information and make decisions.

The research, published today in two papers (available here and here), shows these models are more sophisticated than previously understood — they plan ahead when writing poetry, use the same internal blueprint to interpret ideas regardless of language, and sometimes even work backward from a desired outcome instead of simply building up from the facts.

The work, which draws inspiration from neuroscience techniques used to study biological brains, represents a significant advance in AI interpretability. This approach could allow researchers to audit these systems for safety issues that might remain hidden during conventional external testing.

“We’ve created these AI systems with remarkable capabilities, but because of how they’re trained, we haven’t understood how those capabilities actually emerged,” said Joshua Batson, a researcher at Anthropic, in an exclusive interview with VentureBeat. “Inside the model, it’s just a bunch of numbers — matrix weights in the artificial neural network.”

New techniques illuminate AI’s previously hidden decision-making process

Large language models like OpenAI’s GPT-4o, Anthropic’s Claude, and Google’s Gemini have demonstrated remarkable capabilities, from writing code to synthesizing research papers. But these systems have largely functioned as “black boxes” — even their creators often don’t understand exactly how they arrive at particular responses.

Anthropic’s new interpretability techniques, which the company dubs “circuit tracing” and “attribution graphs,” allow researchers to map out the specific pathways of neuron-like features that activate when models perform tasks. The approach borrows concepts from neuroscience, viewing AI models as analogous to biological systems.
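
To make the idea concrete, here is a toy Python sketch of what an attribution graph involves. Everything in it is invented for illustration: the node names, the edge weights, and the multiply-along-the-path scoring are stand-ins for the learned features and attribution scores the researchers actually compute.

```python
# Toy attribution graph (all names and weights invented for illustration).
# Nodes are neuron-like features; a weighted edge means one feature helps
# activate the next. "Tracing" walks every path from an input token to an
# output candidate and accumulates a path score.
GRAPH = {
    "tok:Dallas": [("feat:Texas", 0.8)],
    "feat:Texas": [("feat:state-capital", 0.7)],
    "feat:state-capital": [("out:Austin", 0.9)],
}

def trace(node, score=1.0, path=()):
    path = path + (node,)
    edges = GRAPH.get(node)
    if not edges:  # terminal node: print the completed pathway
        print(" -> ".join(path), f"(path score {score:.2f})")
        return
    for nxt, weight in edges:
        trace(nxt, score * weight, path)

trace("tok:Dallas")
# tok:Dallas -> feat:Texas -> feat:state-capital -> out:Austin (path score 0.50)
```

The Dallas-to-Austin pathway reappears below, where the researchers tested exactly this kind of chain.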

“This work is turning what were almost philosophical questions — ‘Are models thinking? Are models planning? Are models just regurgitating information?’ — into concrete scientific inquiries about what’s literally happening inside these systems,” Batson explained.

Claude’s hidden planning: How AI plots poetry lines and solves geography questions

Among the most striking discoveries was evidence that Claude plans ahead when writing poetry. When asked to compose a rhyming couplet, the model identified potential rhyming words for the end of the next line before it began writing — a level of sophistication that surprised even Anthropic’s researchers.

“This is probably happening all over the place,” Batson said. “If you had asked me before this research, I would have guessed the model is thinking ahead in various contexts. But this example provides the most compelling evidence we’ve seen of that capability.”

For instance, when writing a poem ending with “rabbit,” the model activates features representing this word at the beginning of the line, then structures the sentence to naturally arrive at that conclusion.
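
One way to picture that evidence, as a hedged sketch: imagine inspecting the model’s per-position feature activations and finding the planned rhyme word represented before a single word of the line has been written. The feature names and activation values below are invented for illustration.

```python
# Invented per-position feature activations for a line planned to end in
# "rabbit". The point: a plan feature fires at position 0, not only at the
# position where the word finally appears.
activations_by_position = [
    {"feat:plan-rabbit": 0.6, "feat:line-start": 0.9},  # before any word is written
    {"feat:saw": 0.8},
    {"feat:garden": 0.7},
    {"feat:plan-rabbit": 0.9, "feat:rabbit": 0.95},     # the planned word arrives
]

first = activations_by_position[0]
plans = [name for name, value in first.items() if name.startswith("feat:plan-")]
print("plan features active at the start of the line:", plans)
```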

The researchers also found that Claude performs genuine multi-step reasoning. In a test asking “The capital of the state containing Dallas is…” the model first activates features representing “Texas,” and then uses that representation to determine “Austin” as the correct answer. This suggests the model is actually performing a chain of reasoning rather than merely regurgitating memorized associations.

By manipulating these internal representations — for example, replacing “Texas” with “California” — the researchers could cause the model to output “Sacramento” instead, confirming the causal relationship.
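
A hedged sketch of that intervention logic, with a hand-built stand-in for the model’s internals (the real experiment patches learned feature activations, not a dictionary):

```python
# Toy causal-intervention experiment. The "readout" answers with the capital
# of whichever state feature is most active; patching the feature should
# flip the answer if the pathway is genuinely causal.
CAPITALS = {"feat:Texas": "Austin", "feat:California": "Sacramento"}

def readout(active_features):
    state = max(active_features, key=active_features.get)
    return CAPITALS[state]

normal = {"feat:Texas": 0.9}          # what the Dallas prompt normally evokes
print(readout(normal))                # -> Austin

patched = {"feat:California": 0.9}    # intervention: swap the representation
print(readout(patched))               # -> Sacramento, confirming causality in the toy
```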

Beyond translation: Claude’s universal language concept network revealed

Another key discovery involves how Claude handles multiple languages. Rather than maintaining separate systems for English, French, and Chinese, the model appears to translate concepts into a shared abstract representation before generating responses.

“We find the model uses a mixture of language-specific and abstract, language-independent circuits,” the researchers write in their paper. When asked for the opposite of “small” in different languages, the model uses the same internal features representing “opposites” and “smallness,” regardless of the input language.

This finding has implications for how models might transfer knowledge learned in one language to others, and suggests that models with larger parameter counts develop more language-agnostic representations.
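
A toy illustration of that convergence, with a hand-written lexicon standing in for the learned features (an assumption worth stating plainly: in the real model these mappings are discovered by circuit tracing, not written down):

```python
# Language-specific tokens converge on shared, language-independent features.
LEXICON = {
    "opposite": {"op:antonym"}, "contraire": {"op:antonym"}, "反义词": {"op:antonym"},
    "small": {"concept:small"}, "petit": {"concept:small"}, "小": {"concept:small"},
}

def abstract_features(tokens):
    features = set()
    for token in tokens:
        features |= LEXICON.get(token, set())
    return features

english = abstract_features(["opposite", "small"])
french = abstract_features(["contraire", "petit"])
chinese = abstract_features(["反义词", "小"])
assert english == french == chinese == {"op:antonym", "concept:small"}
print("shared abstract features:", sorted(english))
```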

When AI makes up answers: Detecting Claude’s mathematical fabrications

Perhaps most concerning, the research revealed instances where Claude’s reasoning doesn’t match what it claims. When presented with difficult math problems like computing cosine values of large numbers, the model sometimes claims to follow a calculation process that isn’t reflected in its internal activity.

“We are able to distinguish between cases where the model genuinely performs the steps they say they are performing, cases where it makes up its reasoning without regard for truth, and cases where it works backwards from a human-provided clue,” the researchers explain.

In one example, when a user suggests an answer to a difficult problem, the model works backward to construct a chain of reasoning that leads to that answer, rather than working forward from first principles.

“We mechanistically distinguish an example of Claude 3.5 Haiku using a faithful chain of thought from two examples of unfaithful chains of thought,” the paper states. “In one, the model is exhibiting ‘bullshitting’… In the other, it exhibits motivated reasoning.”
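
As a rough sketch of that three-way distinction (the labels and “internal operations” below are invented; the paper draws the distinction from internal features, not from the text of the answer):

```python
# Compare what the model claims it did against what its internals show.
def classify(claimed_steps, internal_ops, user_hint=None):
    if claimed_steps == internal_ops:
        return "faithful chain of thought"
    if user_hint is not None and "work backward from hint" in internal_ops:
        return "motivated reasoning"
    return "bullshitting: claimed steps never performed internally"

print(classify(["compute cos(x)"], ["compute cos(x)"]))                        # faithful
print(classify(["compute cos(x)"], ["emit plausible-looking value"]))          # bullshitting
print(classify(["compute cos(x)"], ["work backward from hint"], user_hint=4))  # motivated
```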

Inside AI hallucinations: How Claude decides when to answer or refuse questions

The research also provides insight into why language models hallucinate — making up information when they don’t know an answer. Anthropic found evidence of a “default” circuit that causes Claude to decline to answer questions, which is inhibited when the model recognizes entities it knows about.

“The model contains ‘default’ circuits that cause it to decline to answer questions,” the researchers explain. “When a model is asked a question about something it knows, it activates a pool of features which inhibit this default circuit, thereby allowing the model to respond to the question.”

When this mechanism misfires — recognizing an entity but lacking specific knowledge about it — hallucinations can occur. This explains why models might confidently provide incorrect information about well-known figures while refusing to answer questions about obscure ones.
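
In pseudocode terms, the mechanism might look something like the sketch below. The entity list, the stored fact, and all the names are invented, and the real circuits are learned features rather than lookup tables:

```python
KNOWN_ENTITIES = {"Michael Jordan", "Paris"}          # entities the model "recognizes"
FACTS = {("Michael Jordan", "sport"): "basketball"}   # facts it actually has

def answer(entity, attribute):
    refuse = True                      # the "default" circuit: decline to answer
    if entity in KNOWN_ENTITIES:
        refuse = False                 # recognition features inhibit the default
    if refuse:
        return "I don't know."
    # The misfire: the entity is recognized, so the refusal is suppressed,
    # but no fact is stored -- the model answers anyway. That is the
    # hallucination failure mode described above.
    return FACTS.get((entity, attribute), "<confabulated answer>")

print(answer("Michael Jordan", "sport"))    # grounded answer
print(answer("A. Nobody", "sport"))         # default circuit wins: refusal
print(answer("Paris", "population"))        # recognized, fact missing: confabulation
```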

Safety implications: Using circuit tracing to improve AI reliability and trustworthiness

This research represents a significant step toward making AI systems more transparent and potentially safer. By understanding how models arrive at their answers, researchers could identify and address problematic reasoning patterns.

“We hope that we and others can use these discoveries to make models safer,” the researchers write. “For example, it might be possible to use the techniques described here to monitor AI systems for certain dangerous behaviors—such as deceiving the user—to steer them towards desirable outcomes, or to remove certain dangerous subject matter entirely.”
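
The researchers only say such monitoring “might be possible.” As a deliberately naive sketch, a monitor built on these techniques could threshold a deception-related feature; the feature name and threshold here are invented:

```python
def monitor(feature_activations, threshold=0.5):
    # Flag a generation when a (hypothetical) deception feature fires strongly.
    score = feature_activations.get("feat:deceive-user", 0.0)
    return "flag for human review" if score > threshold else "ok"

print(monitor({"feat:deceive-user": 0.72}))  # -> flag for human review
print(monitor({"feat:deceive-user": 0.05}))  # -> ok
```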

However, Batson cautions that the current techniques still have significant limitations. They only capture a fraction of the total computation performed by these models, and analyzing the results remains labor-intensive.

“Even on short, simple prompts, our method only captures a fraction of the total computation performed by Claude,” the researchers acknowledge.

The future of AI transparency: Challenges and opportunities in model interpretation

Anthropic’s new techniques come at a time of increasing concern about AI transparency and safety. As these models become more powerful and more widely deployed, understanding their internal mechanisms becomes increasingly important.

The research also has potential commercial implications. As enterprises increasingly rely on large language models to power applications, understanding when and why these systems might provide incorrect information becomes crucial for managing risk.

“Anthropic wants to make models safe in a broad sense, including everything from mitigating bias to ensuring an AI is acting honestly to preventing misuse — including in scenarios of catastrophic risk,” the researchers write.

While this research represents a significant advance, Batson emphasized that it’s only the beginning of a much longer journey. “The work has really just begun,” he said. “Understanding the representations the model uses doesn’t tell us how it uses them.”

For now, Anthropic’s circuit tracing offers a first tentative map of previously uncharted territory — much like early anatomists sketching the first crude diagrams of the human brain. The full atlas of AI cognition remains to be drawn, but we can now at least see the outlines of how these systems think.
