Anthropic can now track the bizarre inner workings of a large language model

The AI firm Anthropic has developed a way to peer inside a large language model and watch what it does as it comes up with a response, revealing key new insights into how the technology works. The takeaway: LLMs are even stranger than we thought.

The Anthropic team was surprised by some of the counterintuitive workarounds that large language models appear to use to complete sentences, solve simple math problems, suppress hallucinations, and more, says Joshua Batson, a research scientist at the company.

It’s no secret that large language models work in mysterious ways. Few—if any—mass-market technologies have ever been so little understood. That makes figuring out what makes them tick one of the biggest open challenges in science.

But it’s not just about curiosity. Shedding some light on how these models work would expose their weaknesses, revealing why they make stuff up and can be tricked into going off the rails. It would help resolve deep disputes about exactly what these models can and can’t do. And it would show how trustworthy (or not) they really are.

Batson and his colleagues describe their new work in two reports published today. The first presents Anthropic’s use of a technique called circuit tracing, which lets researchers track the decision-making processes inside a large language model step by step. Anthropic used circuit tracing to watch its LLM Claude 3.5 Haiku carry out various tasks. The second (titled “On the Biology of a Large Language Model”) details what the team discovered when it looked at 10 tasks in particular.

“I think this is really cool work,” says Jack Merullo, who studies large language models at Brown University in Providence, Rhode Island, and was not involved in the research. “It’s a really nice step forward in terms of methods.”

Circuit tracing is not itself new. Last year Merullo and his colleagues analyzed a specific circuit in a version of OpenAI’s GPT-2, an older large language model that OpenAI released in 2019. But Anthropic has now analyzed a number of different circuits as a far larger and far more complex model carries out multiple tasks. “Anthropic is very capable at applying scale to a problem,” says Merullo.

Eden Biran, who studies large language models at Tel Aviv University, agrees. “Finding circuits in a large state-of-the-art model such as Claude is a nontrivial engineering feat,” he says. “And it shows that circuits scale up and might be a good way forward for interpreting language models.”

Circuits chain together different parts—or components—of a model. Last year, Anthropic identified certain components inside Claude that correspond to real-world concepts. Some were specific, such as “Michael Jordan” or “greenness”; others were more vague, such as “conflict between individuals.” One component appeared to represent the Golden Gate Bridge. Anthropic researchers found that if they turned up the dial on this component, Claude could be made to self-identify not as a large language model but as the physical bridge itself.

The latest work builds on that research and the work of others, including Google DeepMind, to reveal some of the connections between individual components. Chains of components are the pathways between the words put into Claude and the words that come out.  

“It’s tip-of-the-iceberg stuff. Maybe we’re looking at a few percent of what’s going on,” says Batson. “But that’s already enough to see incredible structure.”

Growing LLMs

Researchers at Anthropic and elsewhere are studying large language models as if they were natural phenomena rather than human-built software. That’s because the models are trained, not programmed.

“They almost grow organically,” says Batson. “They start out totally random. Then you train them on all this data and they go from producing gibberish to being able to speak different languages and write software and fold proteins. There are insane things that these models learn to do, but we don’t know how that happened because we didn’t go in there and set the knobs.”

Sure, it’s all math. But it’s not math that we can follow. “Open up a large language model and all you will see is billions of numbers—the parameters,” says Batson. “It’s not illuminating.”

Anthropic says it was inspired by brain-scan techniques used in neuroscience to build what the firm describes as a kind of microscope that can be pointed at different parts of a model while it runs. The technique highlights components that are active at different times. Researchers can then zoom in on different components and record when they are and are not active.

Take the component that corresponds to the Golden Gate Bridge. It turns on when Claude is shown text that names or describes the bridge or even text related to the bridge, such as “San Francisco” or “Alcatraz.” It’s off otherwise.

Yet another component might correspond to the idea of “smallness”: “We look through tens of millions of texts and see it’s on for the word ‘small,’ it’s on for the word ‘tiny,’ it’s on for the word ‘petite,’ it’s on for words related to smallness, things that are itty-bitty, like thimbles—you know, just small stuff,” says Batson.

Having identified individual components, Anthropic then follows the trail inside the model as different components get chained together. The researchers start at the end, with the component or components that led to the final response Claude gives to a query. Batson and his team then trace that chain backwards.
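As a rough illustration of that backward walk, the Python sketch below treats components as nodes in a small graph and follows the strongest links from the output back toward the input. Everything in it is an assumption made for illustration: the node labels (including "largeness"), the influence weights, and the `trace_back` helper are invented, and this toy is not Anthropic's circuit-tracing method.

```python
from collections import deque

# Hypothetical graph: each node maps to the upstream nodes that fed it,
# paired with an invented influence weight between 0 and 1.
GRAPH = {
    "output: 'big'": [("largeness", 0.9)],
    "largeness": [("smallness", 0.6), ("opposites", 0.7)],
    "smallness": [("input: 'small'", 0.8)],
    "opposites": [("input: 'opposite'", 0.8)],
}


def trace_back(graph, start, min_weight=0.5):
    """Walk breadth-first from the output toward the inputs.

    Only links whose weight is at least min_weight are followed.
    Returns the visited nodes in the order they were reached.
    """
    seen = [start]
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for upstream, weight in graph.get(node, []):
            if weight >= min_weight and upstream not in seen:
                seen.append(upstream)
                queue.append(upstream)
    return seen


if __name__ == "__main__":
    for step in trace_back(GRAPH, "output: 'big'"):
        print(step)
```

Raising `min_weight` prunes weaker links, which loosely mirrors the point that researchers see only part of what is going on.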

Odd behavior

So: What did they find? Anthropic looked at 10 different behaviors in Claude. One involved the use of different languages. Does Claude have a part that speaks French and another part that speaks Chinese, and so on?

The team found that Claude used components independent of any language to answer a question or solve a problem and then picked a specific language when it replied. Ask it “What is the opposite of small?” in English, French, and Chinese and Claude will first use the language-neutral components related to “smallness” and “opposites” to come up with an answer. Only then will it pick a specific language in which to reply. This suggests that large language models can learn things in one language and apply them in other languages.

Anthropic also looked at how Claude solved simple math problems. The team found that the model seems to have developed its own internal strategies that are unlike those it will have seen in its training data. Ask Claude to add 36 and 59 and the model will go through a series of odd steps, including first adding a selection of approximate values (add 40ish and 60ish, add 57ish and 36ish). Towards the end of its process, it comes up with the value 92ish. Meanwhile, another sequence of steps focuses on the last digits, 6 and 9, and determines that the answer must end in a 5. Putting that together with 92ish gives the correct answer of 95.
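The way the two pathways fit together can be sketched in a few lines of Python. This is a toy under stated assumptions, not a description of Claude's internals: the rough estimate of 92 is taken from the example above and passed in directly, and the function names `last_digit_path` and `combine` are hypothetical.

```python
def last_digit_path(a: int, b: int) -> int:
    """Work out only the final digit of a + b from the final digits of a and b."""
    return (a % 10 + b % 10) % 10


def combine(estimate: int, last_digit: int) -> int:
    """Return the number near the rough estimate that ends in last_digit.

    Any ten consecutive integers contain each final digit exactly once,
    so the window estimate-4 .. estimate+5 always holds one match.
    """
    for n in range(estimate - 4, estimate + 6):
        if n % 10 == last_digit:
            return n
    raise AssertionError("unreachable: a ten-number window covers every digit")


if __name__ == "__main__":
    digit = last_digit_path(36, 59)  # 6 + 9 = 15, so the answer ends in 5
    print(combine(92, digit))        # the only number from 88 to 97 ending in 5
```

With the article's numbers, `last_digit_path(36, 59)` is 5 and `combine(92, 5)` is 95: neither pathway alone pins down the answer, but together they do.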

And yet if you then ask Claude how it worked that out, it will say something like: “I added the ones (6+9=15), carried the 1, then added the 10s (3+5+1=9), resulting in 95.” In other words, it gives you a common approach found everywhere online rather than what it actually did. Yep! LLMs are weird. (And not to be trusted.)

The steps that Claude 3.5 Haiku used to solve a simple math problem were not what Anthropic expected—they’re not the steps Claude claimed it took either.

This is clear evidence that large language models will give reasons for what they do that do not necessarily reflect what they actually did. But this is true for people too, says Batson: “You ask somebody, ‘Why did you do that?’ And they’re like, ‘Um, I guess it’s because I was— .’ You know, maybe not. Maybe they were just hungry and that’s why they did it.”

Biran thinks this finding is especially interesting. Many researchers study the behavior of large language models by asking them to explain their actions. But that might be a risky approach, he says: “As models continue getting stronger, they must be equipped with better guardrails. I believe—and this work also shows—that relying only on model outputs is not enough.”

A third task that Anthropic studied was writing poems. The researchers wanted to know if the model really did just wing it, predicting one word at a time. Instead they found that Claude somehow looked ahead, picking the word at the end of the next line several words in advance.  

For example, when Claude was given the prompt “A rhyming couplet: He saw a carrot and had to grab it,” the model responded, “His hunger was like a starving rabbit.” But using their microscope, they saw that Claude had already hit upon the word “rabbit” when it was processing “grab it.” It then seemed to write the next line with that ending already in place.

This might sound like a tiny detail. But it goes against the common assumption that large language models always work by picking one word at a time in sequence. “The planning thing in poems blew me away,” says Batson. “Instead of at the very last minute trying to make the rhyme make sense, it knows where it’s going.”

“I thought that was cool,” says Merullo. “One of the joys of working in the field is moments like that. There’s been maybe small bits of evidence pointing toward the ability of models to plan ahead, but it’s been a big open question to what extent they do.”

Anthropic then confirmed its observation by turning off the placeholder component for “rabbitness.” Claude responded with “His hunger was a powerful habit.” And when the team replaced “rabbitness” with “greenness,” Claude responded with “freeing it from the garden’s green.”

Anthropic also explored why Claude sometimes made stuff up, a phenomenon known as hallucination. “Hallucination is the most natural thing in the world for these models, given how they’re just trained to give possible completions,” says Batson. “The real question is, ‘How in God’s name could you ever make it not do that?’”

The latest generation of large language models, such as Claude 3.5, Gemini, and GPT-4o, hallucinate far less than previous versions, thanks to extensive post-training (the steps that take an LLM trained on the internet and turn it into a usable chatbot). But Batson’s team was surprised to find that this post-training seems to have made Claude refuse to speculate as a default behavior. When it did respond with false information, it was because some other component had overridden the “don’t speculate” component.

This seemed to happen most often when the speculation involved a celebrity or other well-known entity. It’s as if the amount of information available pushed the speculation through, despite the default setting. When Anthropic overrode the “don’t speculate” component to test this, Claude produced lots of false statements about individuals, including claiming that Batson was famous for inventing the Batson principle (he isn’t).
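The default-and-override pattern described here can be caricatured in Python. The sketch is only an analogy: the `familiarity` score, the 0.5 threshold, and the `respond` function are invented for illustration and do not correspond to anything measured in Claude.

```python
def respond(familiarity: float, knows_answer: bool, threshold: float = 0.5) -> str:
    """Toy model of a default "don't speculate" switch.

    familiarity is an invented 0-to-1 score for how well known the subject is;
    knows_answer says whether correct information is actually available.
    """
    dont_speculate = True  # the default setting: decline to answer

    # A strong enough "this is a well-known entity" signal overrides the default.
    if familiarity >= threshold:
        dont_speculate = False

    if dont_speculate:
        return "decline"
    # With the default switched off, a reply comes out whether or not it is true.
    return "answer" if knows_answer else "hallucinate"


if __name__ == "__main__":
    print(respond(familiarity=0.1, knows_answer=False))  # obscure name
    print(respond(familiarity=0.9, knows_answer=False))  # famous name, no real facts
    print(respond(familiarity=0.9, knows_answer=True))   # famous name, facts available
```

In this caricature a false statement appears only when the override fires without real knowledge behind it, which is the failure the researchers triggered on purpose in their test.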

Still unclear

Because we know so little about large language models, any new insight is a big step forward. “A deep understanding of how these models work under the hood would allow us to design and train models that are much better and stronger,” says Biran.

But Batson notes there are still serious limitations. “It’s a misconception that we’ve found all the components of the model or, like, a God’s-eye view,” he says. “Some things are in focus, but other things are still unclear—a distortion of the microscope.”

And it takes several hours for a human researcher to trace the responses to even very short prompts. What’s more, these models can do a remarkable number of different things, and Anthropic has so far looked at only 10 of them.

Batson also says there are big questions that this approach won’t answer. Circuit tracing can be used to peer at the structures inside a large language model, but it won’t tell you how or why those structures formed during training. “That’s a profound question that we don’t address at all in this work,” he says.

But Batson sees this as the start of a new era in which it is possible, at last, to find real evidence for how these models work: “We don’t have to be, like: ‘Are they thinking? Are they reasoning? Are they dreaming? Are they memorizing?’ Those are all analogies. But if we can literally see step by step what a model is doing, maybe now we don’t need analogies.”

Shape
Shape
Stay Ahead

Explore More Insights

Stay ahead with more perspectives on cutting-edge power, infrastructure, energy,  bitcoin and AI solutions. Explore these articles to uncover strategies and insights shaping the future of industries.

Shape

Ubuntu namespace vulnerability should be addressed quickly: Expert

Thus, “there is little impact of not ‘patching’ the vulnerability,” he said. “Organizations using centralized configuration tools like Ansible may deploy these changes with regularly scheduled maintenance or reboot windows.”  Features supposed to improve security Ironically, last October Ubuntu introduced AppArmor-based features to improve security by reducing the attack surface

Read More »

Google Cloud partners with mLogica to offer mainframe modernization

Other than the partnership with mLogica, Google Cloud also offers a variety of other mainframe migration tools, including Radis and G4 that can be employed to modernize specific applications. Enterprises can also use a combination of migration tools to modernize their mainframe applications. Some of these tools include the Gemini-powered

Read More »

Repsol Brings Schoders Greencoat as Partner in 400 MW Renewables Portfolio

Spanish energy company Repsol S.A. is partnering with Schroders Greencoat in a 400-megawatt (MW) wind and solar portfolio, valued at EUR 580 million ($626.4 million). Repsol said in a media release that Schroders Greencoat has acquired a 49 percent share in a portfolio of eight wind farms, with a total capacity of 300 MW, located in the northern Spanish provinces of Huesca, Zaragoza, and Teruel. The agreement also encompasses two solar facilities, amounting to 100 MW, in the province of Palencia. All of these assets are anticipated to become operational in the first half of 2025, Repsol said. Repsol said it will maintain control of the assets. In December 2024, Repsol arranged a long-term syndicated loan financing of EUR348 million ($375.8 million) with BBVA, Crédit Agricole CIB, Banco Sabadell, and the Official Spanish Credit Institute (ICO). “The alliance with a partner like Schroders Greencoat, one of the world’s leading renewable infrastructure managers, at a time when there is a wide offer of renewable assets for sale, highlights the quality and attractiveness of our portfolio in the market”, João Costeira, Repsol’s Executive Managing Director of Low Carbon Generation, said. This is the first investment by Schroders Greencoat Europe SCSp Fund, which secured over EUR 220 million ($237.6 million) in its initial November 2024 funding. The fund, aimed at a diverse European energy transition portfolio, will prioritize renewable energy infrastructure while also investing in grid upgrades, storage, hydrogen, efficiency, mobility, and renewable heat. “We are delighted to have made the Fund’s first acquisition following the first close. Our partnership with Repsol signifies a first step in our investment strategy and we look forward to working together to deliver long-term value for our clients with high quality of assets all supported by long-term offtake agreements”, Adam Basnett, Portfolio Manager for Schroders Greencoat, said.

Read More »

Shell Traders Haven’t Lost Money Over Last Decade

Shell Plc’s sprawling in-house trading operation — which includes oil, natural gas and electricity — hasn’t lost money during a single quarter over the last decade, said Chief Executive Officer Wael Sawan. The London-based energy giant keeps a tight lid on information about its trading business for competition reasons, but Sawan provided a peek during the company’s investor day presentation on Tuesday.  Over the last decade, Shell traders have delivered an average uplift on return on average capital employed of 2%, Sawan said at the New York Stock Exchange. They are expected to contribute 2% to 4% going forward, he said. Trading is core to Shell and will remain at the heart of the company’s future. Sawan outlined plans on Tuesday to boost investor returns for the rest of this decade by reinforcing the company’s position as the world’s top marketer of liquefied natural gas. The head of trading was recently elevated to the executive committee, giving trading a seat at the firm’s decision-making table.  WHAT DO YOU THINK? Generated by readers, the comments included herein do not reflect the views and opinions of Rigzone. All comments are subject to editorial review. Off-topic, inappropriate or insulting comments will be removed. MORE FROM THIS AUTHOR Bloomberg

Read More »

Rubio Warns Venezuela Against Attacking Guyana, Exxon

US Secretary of State Marco Rubio warned Venezuela that any attempt to invade Guyana or threaten Exxon Mobil Corp.’s operations in the country would be a “very bad move.”  Rubio spoke less than a month after a Venezuelan patrol ship entered Guyanese waters and positioned itself near a vessel contracted by Exxon, which is operating the world’s fastest-growing major oil field off the coast of the South American country.  “It would be a very bad day for the Venezuelan regime if they were to attack Guyana or attack Exxon Mobil,” Rubio said in the capital city of Georgetown on Thursday. “Suffice it to say that if that regime were to do something such as that, it would be a very bad move. It would be a big mistake. For them.” Venezuelan leader Nicolas Maduro reopened a border dispute more than a century after it was settled by international arbitration as he sought to galvanize supporters for last year’s presidential election. Maduro’s military and naval arsenal dwarfs Guyana’s, which was one of the continent’s poorest countries prior to Exxon’s 2015 discovery of oil.   Guyana’s President Irfaan Ali has been successful in rallying the international community behind the country’s dispute with Venezuela, with the UK, France and the US pledging support.  “We have a big Navy,” Rubio said. “It can get anywhere in the world.”  Rubio also said the US would bolster ties with Guyana, without getting into specifics. “We have commitments that exist today with Guyana,” he said. “We want to build on those, expand on those.”  Rubio also was scheduled to visit Suriname, which has sought to encourage oil exploration in offshore territory close to the Guyanese discoveries. WHAT DO YOU THINK? Generated by readers, the comments included herein do not reflect the views and opinions of Rigzone. All

Read More »

Oil Slips Despite Weekly Gain

Oil fell on concerns that the Trump administration’s tariff onslaught will reduce energy demand. West Texas Intermediate slid 0.8% to settle above $69 a barrel, retreating along with equity markets. Crude still notched its third straight weekly advance amid waning expectations of a near-term oversupply. The US is planning to impose tariffs on auto imports and so-called reciprocal levies next week, widening the global trade war. Oil traders face an uncertain outlook as they grapple with President Donald Trump’s policies and an OPEC+ plan to revive idled output. WTI futures have been rangebound for the past eight months, trading in a band of about $15 between the high $60s and low $80s. “US stocks are struggling, and longer-term demand fears are on the minds of most traders as tariffs begin to kick in on cars not manufactured in the US,” said Dennis Kissler, senior vice president for trading at BOK Financial Securities.   Earlier this week, Vitol’s chief executive officer said while there are some threats to supply, it’s generally adequate for the next couple of years. Meanwhile, Venezuela is boosting oil exports to China as the Trump administration deploys sanctions and secondary tariffs to squeeze the Latin American nation. Oil Prices: WTI for May delivery fell 0.8% to settle at $69.36 a barrel in New York. Futures gained 1.6% for the week. Brent for May settlement dipped 0.5% to settle at $73.63 a barrel. What do you think? We’d love to hear from you, join the conversation on the Rigzone Energy Network. The Rigzone Energy Network is a new social experience created for you and all energy professionals to Speak Up about our industry, share knowledge, connect with peers and industry insiders and engage in a professional community that will empower your career in energy. MORE FROM THIS AUTHOR

Read More »

Solong North Sea disaster ship pulls into Aberdeen

The Solong, a burned-out container ship badly damaged in a collision with a US oil tanker, finally reached Aberdeen on Friday morning. It arrived at South Harbour for “safe berthing” following days of intense salvage operations. The Portuguese-flagged vessel was towed to the Port of Aberdeen after crashing into the anchored Stena Immaculate off the East Yorkshire coast on March 10, triggering an explosion and fires. It has been the focus of ongoing salvage efforts after enduring extensive damage and a week-long fire. The Solong was accompanied by another vessel equipped with counter-pollution measures to prevent further environmental damage on its passage to Aberdeen. Solong sailor presumed dead The crash resulted in a tragic loss: one sailor from the Solong, 38-year-old Filipino national Mark Angelo Pernia, remains missing and is presumed dead. In total, rescuers saved 36 crew members from both ships. Meanwhile, the Solong’s captain, 59-year-old Vladimir Motin from St. Petersburg, Russia, has been arrested and charged with gross negligence manslaughter.

Read More »

USA Crude Oil Inventories Down 3.3MM Barrels WoW

U.S. commercial crude oil inventories, excluding those in the Strategic Petroleum Reserve (SPR), decreased by 3.3 million barrels from the week ending March 14 to the week ending March 21, the U.S. Energy Information Administration (EIA) highlighted in its latest weekly petroleum status report. This report was published on March 26 and included data for the week ending March 21. The EIA report showed that crude oil stocks, not including the SPR, stood at 433.6 million barrels on March 21, 437.0 million barrels on March 14, and 448.2 million barrels on March 22, 2024. Crude oil in the SPR stood at 396.1 million barrels on March 21, 395.9 million barrels on March 14, and 363.1 million barrels on March 22, 2024, the report outlined. The EIA report highlighted that data may not add up to totals due to independent rounding. Total petroleum stocks – including crude oil, total motor gasoline, fuel ethanol, kerosene type jet fuel, distillate fuel oil, residual fuel oil, propane/propylene, and other oils – stood at 1.600 billion barrels on March 21, the report showed. Total petroleum stocks were up 3.5 million barrels week on week and up 19.9 million barrels year on year, the report revealed. “At 433.6 million barrels, U.S. crude oil inventories are about five percent below the five year average for this time of year,” the EIA said in its latest weekly petroleum status report. “Total motor gasoline inventories decreased by 1.4 million barrels from last week and are two percent above the five year average for this time of year. Finished gasoline inventories increased and blending components inventories decreased last week,” it added. “Distillate fuel inventories decreased by 0.4 million barrels last week and are about seven percent below the five year average for this time of year. Propane/propylene inventories decreased by

Read More »

Airtel connects India with 100Tbps submarine cable

“Businesses are becoming increasingly global and digital-first, with industries such as financial services, data centers, and social media platforms relying heavily on real-time, uninterrupted data flow,” Sinha added. The 2Africa Pearls submarine cable system spans 45,000 kilometers, involving a consortium of global telecommunications leaders including Bayobab, China Mobile International, Meta, Orange, Telecom Egypt, Vodafone Group, and WIOCC. Alcatel Submarine Networks is responsible for the cable’s manufacturing and installation, the statement added. This cable system is part of a broader global effort to enhance international digital connectivity. Unlike traditional telecommunications infrastructure, the 2Africa Pearls project represents a collaborative approach to solving complex global communication challenges. “The 100 Tbps capacity of the 2Africa Pearls cable significantly surpasses most existing submarine cable systems, positioning India as a key hub for high-speed connectivity between Africa, Europe, and Asia,” said Prabhu Ram, VP for Industry Research Group at CyberMedia Research. According to Sinha, Airtel’s infrastructure now spans “over 400,000 route kilometers across 34+ cables, connecting 50 countries across five continents. This expansive infrastructure ensures businesses and individuals stay seamlessly connected, wherever they are.” Gogia further emphasizes the broader implications, noting, “What also stands out is the partnership behind this — Airtel working with Meta and center3 signals a broader shift. India is no longer just a consumer of global connectivity. We’re finally shaping the routes, not just using them.”

Read More »

Former Arista COO launches NextHop AI for customized networking infrastructure

Sadana argued that unlike traditional networking where an IT person can just plug a cable into a port and it works, AI networking requires intricate, custom solutions. The core challenge is creating highly optimized, efficient networking infrastructure that can support massive AI compute clusters with minimal inefficiencies. How NextHop is looking to change the game for hyperscale networking NextHop AI is working directly alongside its hyperscaler customers to develop and build customized networking solutions. “We are here to build the most efficient AI networking solutions that are out there,” Sadana said. More specifically, Sadana said that NextHop is looking to help hyperscalers in several ways including: Compressing product development cycles: “Companies that are doing things on their own can compress their product development cycle by six to 12 months when they partner with us,” he said. Exploring multiple technological alternatives: Sadana noted that hyperscalers might try and build on their own and will often only be able to explore one or two alternative approaches. With NextHop, Sadana said his company will enable them to explore four to six different alternatives. Achieving incremental efficiency gains: At the massive cloud scale that hyperscalers operate, even an incremental one percent improvement can have an oversized outcome. “You have to make AI clusters as efficient as possible for the world to use all the AI applications at the right cost structure, at the right economics, for this to be successful,” Sadana said. “So we are participating by making that infrastructure layer a lot more efficient for cloud customers, or the hyperscalers, which, in turn, of course, gives the benefits to all of these software companies trying to run AI applications in these cloud companies.” Technical innovations: Beyond traditional networking In terms of what the company is actually building now, NextHop is developing specialized network switches

Read More »

Microsoft abandons data center projects as OpenAI considers its own, hinting at a market shift

A potential ‘oversupply position’ In a new research note, TD Cowen analysts reportedly said that Microsoft has walked away from new data center projects in the US and Europe, purportedly due to an oversupply of compute clusters that power AI. This follows reports from TD Cowen in February that Microsoft had “cancelled leases in the US totaling a couple of hundred megawatts” of data center capacity. The researchers noted that the company’s pullback was a sign of it “potentially being in an oversupply position,” with demand forecasts lowered. OpenAI, for its part, has reportedly discussed purchasing billions of dollars’ worth of data storage hardware and software to increase its computing power and decrease its reliance on hyperscalers. This fits with its planned Stargate Project, a $500 billion, US President Donald Trump-endorsed initiative to build out its AI infrastructure in the US over the next four years. Based on the easing of exclusivity between the two companies, analysts say these moves aren’t surprising. “When looking at storage in the cloud — especially as it relates to use in AI — it is incredibly expensive,” said Matt Kimball, VP and principal analyst for data center compute and storage at Moor Insights & Strategy. “Those expenses climb even higher as the volume of storage and movement of data grows,” he pointed out. “It is only smart for any business to perform a cost analysis of whether storage is better managed in the cloud or on-prem, and moving forward in a direction that delivers the best performance, best security, and best operational efficiency at the lowest cost.”

Read More »

PEAK:AIO adds power, density to AI storage server

There is also the fact that many people working with AI are not IT professionals but professors, biochemists, scientists, doctors, and clinicians, and they don’t have a traditional enterprise department or a data center. “It’s run by people that wouldn’t really know, nor want to know, what storage is,” he said. While the new AI Data Server is a Dell design, PEAK:AIO has worked with Lenovo, Supermicro, and HPE as well as Dell over the past four years, offering to convert their off-the-shelf storage servers into hyper-fast, low-cost, AI-specific storage servers that work with all the protocols at Nvidia, like NVLink, along with NFS and NVMe over Fabric. It also greatly increased storage capacity by going with 61TB drives from Solidigm. SSDs from the major server vendors typically maxed out at 15TB, according to the vendor. PEAK:AIO competes with VAST, WekaIO, NetApp, Pure Storage and many others in the growing AI workload storage arena. PEAK:AIO’s AI Data Server is available now.

Read More »

SoftBank to buy Ampere for $6.5B, fueling Arm-based server market competition

SoftBank’s announcement suggests Ampere will collaborate with other SBG companies, potentially creating a powerful ecosystem of Arm-based computing solutions. This collaboration could extend to SoftBank’s numerous portfolio companies, including Korean/Japanese web giant LY Corp, ByteDance (TikTok’s parent company), and various AI startups. If SoftBank successfully steers its portfolio companies toward Ampere processors, it could accelerate the shift away from x86 architecture in data centers worldwide. Questions remain about Arm’s server strategy The acquisition, however, raises questions about how SoftBank will balance its investments in both Arm and Ampere, given their potentially competing server CPU strategies. Arm’s recent move to design and sell its own server processors to Meta signaled a major strategic shift that already put it in direct competition with its own customers, including Qualcomm and Nvidia. “In technology licensing where an entity is both provider and competitor, boundaries are typically well-defined without special preferences beyond potential first-mover advantages,” Kawoosa explained. “Arm will likely continue making independent licensing decisions that serve its broader interests rather than favoring Ampere, as the company can’t risk alienating its established high-volume customers.” Industry analysts speculate that SoftBank might position Arm to focus on custom designs for hyperscale customers while allowing Ampere to dominate the market for more standardized server processors. Alternatively, the two companies could be merged or realigned to present a unified strategy against incumbents Intel and AMD. “While Arm currently dominates processor architecture, particularly for energy-efficient designs, the landscape isn’t static,” Kawoosa added. “The semiconductor industry is approaching a potential inflection point, and we may witness fundamental disruptions in the next 3-5 years — similar to how OpenAI transformed the AI landscape. SoftBank appears to be maximizing its Arm investments while preparing for this coming paradigm shift in processor architecture.”

Read More »

Nvidia, xAI and two energy giants join genAI infrastructure initiative

The new AIP members will “further strengthen the partnership’s technology leadership as the platform seeks to invest in new and expanded AI infrastructure. Nvidia will also continue in its role as a technical advisor to AIP, leveraging its expertise in accelerated computing and AI factories to inform the deployment of next-generation AI data center infrastructure,” the group’s statement said. “Additionally, GE Vernova and NextEra Energy have agreed to collaborate with AIP to accelerate the scaling of critical and diverse energy solutions for AI data centers. GE Vernova will also work with AIP and its partners on supply chain planning and in delivering innovative and high efficiency energy solutions.” The group claimed, without offering any specifics, that it “has attracted significant capital and partner interest since its inception in September 2024, highlighting the growing demand for AI-ready data centers and power solutions.” The statement said the group will try to raise “$30 billion in capital from investors, asset owners, and corporations, which in turn will mobilize up to $100 billion in total investment potential when including debt financing.” Forrester’s Nguyen also noted that the influence of two of the new members — xAI, owned by Elon Musk, along with Nvidia — could easily help with fundraising. Musk “with his connections, he does not make small quiet moves,” Nguyen said. “As for Nvidia, they are the face of AI. Everything they do attracts attention.” Info-Tech’s Bickley said that the astronomical dollars involved in genAI investments are mind-boggling. And yet even more investment is needed — a lot more.

Read More »

Microsoft will invest $80B in AI data centers in fiscal 2025

And Microsoft isn’t the only one that is ramping up its investments in AI-enabled data centers. Rival cloud service providers are all investing in either upgrading or opening new data centers to capture a larger chunk of business from developers and users of large language models (LLMs). In a report published in October 2024, Bloomberg Intelligence estimated that demand for generative AI would push Microsoft, AWS, Google, Oracle, Meta, and Apple to devote $200 billion between them to capex in 2025, up from $110 billion in 2023. Microsoft is one of the biggest spenders, followed closely by Google and AWS, Bloomberg Intelligence said. Its estimate of Microsoft’s capital spending on AI, at $62.4 billion for calendar 2025, is lower than Smith’s claim that the company will invest $80 billion in the fiscal year to June 30, 2025. Both figures, though, are way higher than Microsoft’s 2020 capital expenditure of “just” $17.6 billion. The majority of the increased spending is tied to cloud services and the expansion of AI infrastructure needed to provide compute capacity for OpenAI workloads. Separately, last October Amazon CEO Andy Jassy said his company planned total capex spend of $75 billion in 2024 and even more in 2025, with much of it going to AWS, its cloud computing division.

Read More »

John Deere unveils more autonomous farm machines to address skill labor shortage

Self-driving tractors might be the path to self-driving cars. John Deere has revealed a new line of autonomous machines and tech across agriculture, construction and commercial landscaping. The Moline, Illinois-based John Deere has been in business for 187 years, yet it has become a regular at the big tech trade show in Las Vegas as a non-tech company showing off technology, and it is back at CES 2025 with more autonomous tractors and other vehicles. This is not something we usually cover, but John Deere has a lot of data that is interesting in the big picture of tech. The message from the company is that there aren’t enough skilled farm laborers to do the work that its customers need. It’s been a challenge for most of the last two decades, said Jahmy Hindman, CTO at John Deere, in a briefing. Much of the tech will come this fall and after that. He noted that the average farmer in the U.S. is over 58 and works 12 to 18 hours a day to grow food for us. And he said the American Farm Bureau Federation estimates there are roughly 2.4 million farm jobs that need to be filled annually, and the agricultural work force continues to shrink. (This is my hint to the anti-immigration crowd). While each of these industries experiences its own set of challenges, a commonality across all is skilled labor availability. In construction, about 80% of contractors struggle to find skilled labor. And in commercial landscaping, 86% of landscaping business owners can’t find labor to fill open positions, he said. “They have to figure out how to do

Read More »

2025 playbook for enterprise AI success, from agents to evals

2025 is poised to be a pivotal year for enterprise AI. The past year has seen rapid innovation, and this year will see the same. This has made it more critical than ever to revisit your AI strategy to stay competitive and create value for your customers. From scaling AI agents to optimizing costs, here are the five critical areas enterprises should prioritize for their AI strategy this year. 1. Agents: the next generation of automation AI agents are no longer theoretical. In 2025, they’re indispensable tools for enterprises looking to streamline operations and enhance customer interactions. Unlike traditional software, agents powered by large language models (LLMs) can make nuanced decisions, navigate complex multi-step tasks, and integrate seamlessly with tools and APIs. At the start of 2024, agents were not ready for prime time, making frustrating mistakes like hallucinating URLs. They started getting better as frontier large language models themselves improved. “Let me put it this way,” said Sam Witteveen, cofounder of Red Dragon, a company that develops agents for companies, and that recently reviewed the 48 agents it built last year. “Interestingly, the ones that we built at the start of the year, a lot of those worked way better at the end of the year just because the models got better.” Witteveen shared this in the video podcast we filmed to discuss these five big trends in detail. Models are getting better and hallucinating less, and they’re also being trained to do agentic tasks. Another feature that the model providers are researching is a way to use the LLM as a judge, and as models get cheaper (something we’ll cover below), companies can use three or more models to

Read More »

OpenAI’s red teaming innovations define new essentials for security leaders in the AI era

OpenAI has taken a more aggressive approach to red teaming than its AI competitors, demonstrating its security teams’ advanced capabilities in two areas: multi-step reinforcement and external red teaming. OpenAI recently released two papers that set a new competitive standard for improving the quality, reliability and safety of AI models in these two techniques and more. The first paper, “OpenAI’s Approach to External Red Teaming for AI Models and Systems,” reports that specialized teams outside the company have proven effective in uncovering vulnerabilities that might otherwise have made it into a released model because in-house testing techniques may have missed them. In the second paper, “Diverse and Effective Red Teaming with Auto-Generated Rewards and Multi-Step Reinforcement Learning,” OpenAI introduces an automated framework that relies on iterative reinforcement learning to generate a broad spectrum of novel, wide-ranging attacks. Going all-in on red teaming pays practical, competitive dividends It’s encouraging to see competitive intensity in red teaming growing among AI companies. When Anthropic released its AI red team guidelines in June of last year, it joined AI providers including Google, Microsoft, Nvidia, OpenAI, and even the U.S.’s National Institute of Standards and Technology (NIST), which all had released red teaming frameworks. Investing heavily in red teaming yields tangible benefits for security leaders in any organization. OpenAI’s paper on external red teaming provides a detailed analysis of how the company strives to create specialized external teams that include cybersecurity and subject matter experts. The goal is to see if knowledgeable external teams can defeat models’ security perimeters and find gaps in their security, biases and controls that prompt-based testing couldn’t find.
What makes OpenAI’s recent papers noteworthy is how well they define using human-in-the-middle

Read More »