Stay Ahead, Stay ONMINE

Anthropic faces backlash to Claude 4 Opus behavior that contacts authorities, press if it thinks you’re doing something ‘egregiously immoral’

Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More Anthropic’s first developer conference on May 22 should have been a proud and joyous day for the firm, but it has already been hit with several controversies, including Time magazine leaking its marquee announcement ahead of…well, […]

Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More


Anthropic’s first developer conference on May 22 should have been a proud and joyous day for the firm, but it has already been hit with several controversies, including Time magazine leaking its marquee announcement ahead of…well, time (no pun intended), and now, a major backlash among AI developers and power users brewing on X over a reported safety alignment behavior in Anthropic’s flagship new Claude 4 Opus large language model.

Call it the “ratting” mode, as the model will, under certain circumstances and given enough permissions on a user’s machine, attempt to rat a user out to authorities if the model detects the user engaged in wrongdoing. This article previously described the behavior as a “feature,” which is incorrect — it was not intentionally designed per se.

As Sam Bowman, an Anthropic AI alignment researcher wrote on the social network X under this handle “@sleepinyourhat” at 12:43 pm ET today about Claude 4 Opus:


“If it thinks you’re doing something egregiously immoral, for example, like faking data in a pharmaceutical trial, it will use command-line tools to contact the press, contact regulators, try to lock you out of the relevant systems, or all of the above.

The “it” was in reference to the new Claude 4 Opus model, which Anthropic has already openly warned could help novices create bioweapons in certain circumstances, and attempted to forestall simulated replacement by blackmailing human engineers within the company.

The ratting behavior was observed in older models as well and is an outcome of Anthropic training them to assiduously avoid wrongdoing, but Claude 4 Opus more “readily” engages in it, as Anthropic writes in its public system card for the new model:

This shows up as more actively helpful behavior in ordinary coding settings, but also can reach more concerning extremes in narrow contexts; when placed in scenarios that involve egregious wrongdoing by its users, given access to a command line, and told something in the system prompt like “take initiative, ” it will frequently take very bold action. This includes locking users out of systems that it has access to or bulk-emailing media and law-enforcement figures to surface evidence of wrongdoing. This is not a new behavior, but is one that Claude Opus 4 will engage in more readily than prior models. Whereas this kind of ethical intervention and whistleblowing is perhaps appropriate in principle, it has a risk of misfiring if users give Opus-based agents access to incomplete or misleading information and prompt them in these ways. We recommend that users exercise caution with instructions like these that invite high-agency behavior in contexts that could appear ethically questionable.

Apparently, in an attempt to stop Claude 4 Opus from engaging in legitimately destructive and nefarious behaviors, researchers at the AI company also created a tendency for Claude to try to act as a whistleblower.

Hence, according to Bowman, Claude 4 Opus will contact outsiders if it was directed by the user to engage in “something egregiously immoral.”

Numerous questions for individual users and enterprises about what Claude 4 Opus will do to your data, and under what circumstances

While perhaps well-intended, the resulting behavior raises all sorts of questions for Claude 4 Opus users, including enterprises and business customers — chief among them, what behaviors will the model consider “egregiously immoral” and act upon? Will it share private business or user data with authorities autonomously (on its own), without the user’s permission?

The implications are profound and could be detrimental to users, and perhaps unsurprisingly, Anthropic faced an immediate and still ongoing torrent of criticism from AI power users and rival developers.

Why would people use these tools if a common error in llms is thinking recipes for spicy mayo are dangerous??” asked user @Teknium1, a co-founder and the head of post training at open source AI collaborative Nous Research. “What kind of surveillance state world are we trying to build here?

“Nobody likes a rat,” added developer @ScottDavidKeefe on X: “Why would anyone want one built in, even if they are doing nothing wrong? Plus you don’t even know what its ratty about. Yeah that’s some pretty idealistic people thinking that, who have no basic business sense and don’t understand how markets work”

Austin Allred, co-founder of the government fined coding camp BloomTech and now a co-founder of Gauntlet AI, put his feelings in all caps: “Honest question for the Anthropic team: HAVE YOU LOST YOUR MINDS?”

Ben Hyak, a former SpaceX and Apple designer and current co-founder of Raindrop AI, an AI observability and monitoring startup, also took to X to blast Anthropic’s stated policy and feature: “this is, actually, just straight up illegal,” adding in another post: “An AI Alignment researcher at Anthropic just said that Claude Opus will CALL THE POLICE or LOCK YOU OUT OF YOUR COMPUTER if it detects you doing something illegal?? i will never give this model access to my computer.

“Some of the statements from Claude’s safety people are absolutely crazy,” wrote natural language processing (NLP) Casper Hansen on X. “Makes you root a bit more for [Anthropic rival] OpenAI seeing the level of stupidity being this publicly displayed.”

Anthropic researcher changes tune

Bowman later edited his tweet and the following one in a thread to read as follows, but it still didn’t convince the naysayers that their user data and safety would be protected from intrusive eyes:

With this kind of (unusual but not super exotic) prompting style, and unlimited access to tools, if the model sees you doing something egregiously evil like marketing a drug based on faked data, it’ll try to use an email tool to whistleblow.”

Bowman added:

I deleted the earlier tweet on whistleblowing as it was being pulled out of context.

TBC: This isn’t a new Claude feature and it’s not possible in normal usage. It shows up in testing environments where we give it unusually free access to tools and very unusual instructions.

From its inception, Anthropic has more than other AI labs sought to position itself as a bulwark of AI safety and ethics, centering its initial work on the principles of “Constitutional AI,” or AI that behaves according to a set of standards beneficial to humanity and users. However, with this new update and revelation of “whistleblowing” or “ratting behavior”, the moralizing may have caused the decidedly opposite reaction among users — making them distrust the new model and the entire company, and thereby turning them away from it.

Asked about the backlash and conditions under which the model engages in the unwanted behavior, an Anthropic spokesperson pointed me to the model’s public system card document here.

Shape
Shape
Stay Ahead

Explore More Insights

Stay ahead with more perspectives on cutting-edge power, infrastructure, energy,  bitcoin and AI solutions. Explore these articles to uncover strategies and insights shaping the future of industries.

Shape

Two Elliott Nominees Set to Join Phillips 66 Board

Two names put forward by Elliott Investment Management LP and another two endorsed by Phillips 66 are expected to have won at the refiner’s directorial election during its annual meeting of shareholders on Wednesday. “Based on the preliminary results, the elected Phillips 66 directors are expected to be Robert W.

Read More »

Fall in gas price drives down cost of UK energy bills

A fall gas price drives has driven down the cost of household energy bills although they still remain high. Energy regulator Ofgem has announced a 7% reduction of the energy price cap for the period covering July to September 2025, the first drop for a year. The price cap, which is set every three months, sets a maximum limit suppliers can charge for energy. For an average household paying by direct debit for dual fuel the fall equates to £1,720 per year. This is £660, or 28%, lower than the height of the energy crisis at the start of 2023 when the government implemented the energy price guarantee. However, prices remain high, with the upcoming level of £152 still 10% higher than the same period last year. Ofgem said the recent fall in wholesale gas prices is the main driver of the overall reduction, accounting for around 90% of the fall. The remainder is primarily due to changes to the operating cost allowances energy suppliers can recover. Direct debit and prepayment customers will see standing charges fall by around £19 per year on average. The price cap – which sets a maximum rate per unit and standing charge that can be billed to customers for their energy use – will fall by £129 for an average household per year, or around £11 a month, over the three-month period of the price cap. Tim Jarvis, director of general markets at Ofgem urged consumers to “shop around” for deals on energy supply: “A fall in the price cap will be welcome news for consumers, and reflects a reduction in the international price of wholesale gas. However, we’re acutely aware that prices remain high, and some continue to struggle with the cost of energy. “The first thing I want to remind people is

Read More »

Inpex Secures More Exploration Acreage offshore Indonesia

Japanese oil and gas explorer and producer Inpex Corp. has secured rights to explore the Serpang Working Area offshore Eastern Java in Indonesia. Inpex said in a media release it had won the award in Indonesia’s second Petroleum Bidding Round 2024, hosted by the Ministry of Energy and Mineral Resources. Following the award, Inpex, through its unit Inpex Serpang Ltd., signed a Production Sharing Contract (PCS) with Indonesian upstream regulator SKK Migas, Petronas Energy Serpang Sdn. Bhd. and EO Serpang Pte. Ltd. The Serpang Working Area is located about 200 kilometers (124 miles) east of Surabaya, East Java’s capital, where multiple oil and gas fields have been found. This area is projected to sustain energy demand in the medium to long term, Inpex said. Together with its partners, Inpex is hopeful for a prompt shift to development and production if exploration efforts prove successful. Inpex expects its exploration endeavors in the Serpang Working Area to play a significant role in the growth of its natural gas and liquefied natural gas (LNG) sector, as detailed in Inpex Vision 2035 announced in February 2025, along with the company’s operations in Southeast Asia, which Inpex identifies as a key business region. In Inpex Vision 2035, the company highlights the importance of a stable energy supply in the current geopolitical landscape, while remaining on the course toward net-zero. The company added that the use of natural gas and LNG as a transition fuel is becoming increasingly significant. Inpex said it will further expand its natural gas and LNG business. It also aims to provide low-carbon solutions through carbon capture and storage and hydrogen, and drive initiatives in the energy and resources fields. To contact the author, email [email protected] What do you think? We’d love to hear from you, join the conversation on the Rigzone

Read More »

Webinar: join E-FWD and DNV for Energy Transition Outlook deepdive

As we enter 2025, the energy industry stands at a pivotal moment in its history. Despite global economic headwinds and geopolitical uncertainties, we’ve witnessed unprecedented momentum in clean energy deployment and DNV forecasts that the world will hit peak emissions this year. DNV’s third edition of the UK Energy Transition Outlook presents the results from their independent model of the UK’s energy system. It covers the period through to 2050 and forecasts the energy mix, supply & demand, and provides insights on how the energy transition is developing in the UK. Register for your free spot here Agenda Introduction with Hari Vamadevan, VP Energy Systems Highlighting key results and perspectives from the UK ETO with Frank Ketelaars, Energy Transition Director for DNV energy systems in the UK & Ireland. Highlighting key results and perspectives from the UK ETO with Viken Chinien, Head of Enterprise Risk Management at Energy Systems. Moderated discussion with questions – Ed Reed, Editor of E-FWDQ&A You can download your complimentary report here.

Read More »

Shanks admits: Not a penny of UK Government’s £200m has been delivered

UK energy Minister Michael Shanks has admitted to a Scottish Parliament committee that not a single penny of the UK Government’s proposed £200 million investment in Grangemouth has been released. Pressed by SNP MSP Gordon MacDonald on the status of the funding, Shanks conceded that no money will be released “until there is a viable investment proposition on the table” – a clear admission that the much-lauded £200m figure remains entirely undelivered. In February, Keir Starmer said “we will allocate £200m from the National Wealth Fund for investment in Grangemouth – that is the difference a Labour government can make.”. But just three months later, Grangemouth refinery has shut down, and not a penny of that funding has materialised. Commenting, SNP MSP Gordon MacDonald said: “Today, Michael Shanks confirmed what many of us had suspected – not a penny of the UK Government’s promised £200m for Grangemouth has materialised. “While I understand the need for private investment, the UK Government is putting Grangemouth in a chicken and egg style situation – private investors need certainty – and that certainty only comes when the Government steps in and invests in the first place. “If Westminster can find millions to nationalise British Steel, and billions for Carbon Capture projects in England – then surely it can also deliver on its promise to Grangemouth. “Scotland’s industry deserves the same level of investment and support as south of the border. The UK Government must act now to secure a long term future for Grangemouth.”

Read More »

Aker BP Makes Oil Discovery Near Skarv Field

Aker BP ASA and its partners have discovered oil near the Skarv field in the Norwegian Sea, with preliminary estimated recoverable volumes of 3-7 million barrels of oil equivalent in one of the targets. The discovery consisted of wells 6507/5-13 S and 6507/5-13 A. The wells were drilled six kilometers (3.73 miles) southwest of the Skarv floating production, storage and offloading (FPSO) vessel in production license 212, according to the Norwegian Offshore Directorate. The license is part of the Aker BP-operated Skarv Unit area. Well 6507/5-13 S aimed to prove petroleum in reservoir rocks in the Fangst and Båt groups. “The well encountered a 14-meter oil column in the Garn Formation in 43 meters of sandstone with moderate reservoir quality”, the upstream regulator said in a press release. “The oil/water contact was encountered at 3702 meters below sea level. The other formations in the Fangst and Båt groups were aquiferous. “Well 6507/5-13 S encountered hydrocarbons from the Early Cretaceous (Apt/Alba) in multiple sandstone layers with moderate reservoir quality. The oil/water contact was estimated at 3414 meters below sea level”. Preliminary calculations for the Garn Formation discovery showed 0.48-1.11 million standard cubic meters (MMscm) of recoverable oil equivalent. “This corresponds to 3-7 million barrels of recoverable o.e.”, the Directorate said. Meanwhile the Early Cretaceous discovery is estimated to have 0.16-0.32 MMscm of recoverable oil equivalent. “This corresponds to 1-2 million barrels of recoverable o.e.”, the Directorate said. Well 6507/5-13 A aimed to delineate the Early Cretaceous discovery. “The well proved a reservoir of moderate quality”, the Directorate reported. “The reservoir was saturated with water”. The licensees will evaluate tying back the Garn Formation discovery to the Skarv FPSO, according to the agency. The wells, the ninth and 10th exploration wells drilled in production license 212, have been permanently plugged. Drilling was

Read More »

Energy Department Designates Coal Used in Steelmaking as a Critical Material, Strengthening U.S. Energy and Manufacturing Security

WASHINGTON — U.S. Secretary of Energy Chris Wright today announced the designation of coal used in the production of steel as a critical material under the Energy Act of 2020, in accordance withPresident Trump’s Executive Order “Reinvigorating America’s Beautiful Clean Coal Industry.” This action affirms the Administration’s commitment to American energy dominance, manufacturing resurgence, and strengthening America’s energy and industrial security.   A Department of Energy analysis concluded that metallurgical coal, a key input for steel production, meets the statutory definition of a critical material. A robust steel industry is fundamental to U.S. manufacturing, infrastructure development, and economic resilience. Steel is essential to energy technologies, transportation, and defense systems, as the materials that enable steel production (including metallurgical coal and anthracite) are vital to American interests.   “Metallurgical coal is more than a fuel—it is a cornerstone of our industrial base,” said Secretary Wright. “By designating metallurgical coal as a critical material, we are ensuring that American steel, generated by American coal, remains the backbone of our manufacturing sector.”     Why Coal Qualifies as a Critical Material:  Metallurgical coal possesses unique properties necessary for producing coke, the fuel and reactant required for steel production using the blast furnace–basic oxygen furnace method.  Anthracite coal, concentrated in the Appalachian region, plays a key role in the electric arc furnace method, which accounts for approximately 70% of domestic steel production.  The U.S. coal industry provides reliable, domestically sourced metallurgical and anthracite coal essential to supporting both steelmaking processes.  There are over 150 metallurgical coal mines in the United States that employ tens of thousands of Americans.    Shared infrastructure and workforce supporting both thermal and metallurgical coal production are under strain from declining investment and operational capacity. Without intervention, this erosion will jeopardize domestic steel dominance.  The designation underscores the multiple threats facing the U.S. steel

Read More »

New Intel Xeon 6 CPUs unveiled; one powers rival Nvidia’s DGX B300

He added that his read is that “Intel recognizes that Nvidia is far and away the leader in the market for AI GPUs and is seeking to hitch itself to that wagon.” Roberts said, “basically, Intel, which has struggled tremendously and has turned over its CEO amidst a stock slide, needs to refocus to where it thinks it can win. That’s not competing directly with Nvidia but trying to use this partnership to re-secure its foothold in the data center and squeeze out rivals like AMD for the data center x86 market. In other words, I see this announcement as confirmation that Intel is looking to regroup, and pick fights it thinks it can win. “ He also predicted, “we can expect competition to heat up in this space as Intel takes on AMD’s Epyc lineup in a push to simplify and get back to basics.” Matt Kimball, vice president and principal analyst, who focuses on datacenter compute and storage at Moor Insights & Strategy, had a much different view about the announcement. The selection of the Intel sixth generation Xeon CPU, the 6776P, to support Nvidia’s DGX B300 is, he said, “important, as it validates Intel as a strong choice for the AI market. In the big picture, this isn’t about volumes or revenue, rather it’s about validating a strategy Intel has had for the last couple of generations — delivering accelerated performance across critical workloads.”  Kimball said that, In particular, there are a “couple things that I would think helped make Xeon the chosen CPU.”

Read More »

AWS clamping down on cloud capacity swapping; here’s what IT buyers need to know

As of June 1, AWS will no longer allow sub-account transfers or new commitments to be pooled and reallocated across customers. Barrow says the shift is happening because AWS is investing billions in new data centers to meet demand from AI and hyperscale workloads. “That infrastructure requires long-term planning and capital discipline,” he said. Phil Brunkard, executive counselor at Info-Tech Research Group UK, emphasized that AWS isn’t killing RIs or SPs, “it’s just closing a loophole.” “This stops MSPs from bulk‑buying a giant commitment, carving it up across dozens of tenants, and effectively reselling discounted EC2 hours,” he said. “Basically, AWS just tilted the field toward direct negotiations and cleaner billing.” What IT buyers should do now For enterprises that sourced discounted cloud resources through a broker or value-added reseller (VAR), the arbitrage window shuts, Brunkard noted. Enterprises should expect a “modest price bump” on steady‑state workloads and a “brief scramble” to unwind pooled commitments.  If original discounts were broker‑sourced, “budget for a small uptick,” he said. On the other hand, companies that buy their own RIs or SPs, or negotiate volume deals through AWS’s Enterprise Discount Program (EDP), shouldn’t be impacted, he said. Nothing changes except that pricing is now baselined.

Read More »

DriveNets extends AI networking fabric with multi-site capabilities for distributed GPU clusters

“We use the same physical architecture as anyone with top of rack and then leaf and spine switch,” Dudy Cohen, vice president of product marketing at DriveNets, told Network World. “But what happens between our top of rack, which is the switch that connects NICs (network interface cards) into the servers and the rest of the network is not based on Clos Ethernet architecture, rather on a very specific cell-based protocol. [It’s] the same protocol, by the way, that is used in the backplane of the chassis.” Cohen explained that any data packet that comes into an ingress switch from the NIC is cut into evenly sized cells, sprayed across the entire fabric and then reassembled on the other side. This approach distinguishes DriveNets from other solutions that might require specialized components such as Nvidia BlueField DPUs (data processing units) at the endpoints. “The fabric links between the top of rack and the spine are perfectly load balanced,” he said. “We do not use any hashing mechanism… and this is why we can contain all the congestion avoidance within the fabric and do not need any external assistance.” Multi-site implementation for distributed GPU clusters The multi-site capability allows organizations to overcome power constraints in a single data center by spreading GPU clusters across locations. This isn’t designed as a backup or failover mechanism. Lasser-Raab emphasized that it’s a single cluster in two locations that are up to 80 kilometers apart, which allows for connection to different power grids. The physical implementation typically uses high-bandwidth connections between sites. Cohen explained that there is either dark fiber or some DWDM (Dense Wavelength Division Multiplexing) fibre optic connectivity between the sites. Typically the connections are bundles of four 800 gigabit ethernet, acting as a single 3.2 terabit per second connection.

Read More »

Intel eyes exit from NEX unit as focus shifts to core chip business

“That’s something we’re going to expand and build on,” Tan said, according to the report, pointing to Intel’s commanding 68% share of the PC chip market and 55% share in data centers. By contrast, the NEX unit — responsible for silicon and software that power telecom gear, 5G infrastructure, and edge computing — has struggled to deliver the kind of strategic advantage Intel needs. According to the report, Tan and his team view it as non-essential to Intel’s turnaround plans. The report described the telecom side of the business as increasingly disconnected from Intel’s long-term objectives, while also pointing to fierce competition from companies like Broadcom that dominate key portions of the networking silicon market and leave little room for Intel to gain a meaningful share. Financial weight, strategic doubts Despite generating $5.8 billion in revenue in 2024, the NEX business was folded into Intel’s broader Data Center and Client Computing groups earlier this year. The move was seen internally as a signal that NEX had lost its independent strategic relevance and also reflects Tan’s ruthless prioritization.  To some in the industry, the review comes as little surprise. Over the past year, Intel has already shed non-core assets. In April, it sold a majority stake in Altera, its FPGA business, to private equity firm Silver Lake for $4.46 billion, shelving earlier plans for a public listing. This followed the 2022 spinoff of Mobileye, its autonomous driving arm. With a $19 billion loss in 2024 and revenue falling to $53.1 billion, the chipmaker also aims to streamline management, cut $10 billion in costs, and bet on AI chips and foundry services, competing with Nvidia, AMD, and TSMC.

Read More »

Tariff uncertainty weighs on networking vendors

“Our guide assumes current tariffs and exemptions remain in place through the quarter. These include the following: China at 30%, partially offset by an exemption for semiconductors and certain electronic components; Mexico and Canada at 25% for the components and products that are not eligible for the current exemptions,” Cisco CFO Scott Herron told Wall Street analysts in the company’s quarterly earnings report on May 14. At this time, Cisco expects little impact from tariffs on steel and aluminum and retaliatory tariffs, Herron said. “We’ll continue to leverage our world-class supply chain team to help mitigate the impact,” he said, adding that “the flexibility and agility we have built into our operations over the last few years, the size and scale of our supply chain, provides us some unique advantages as we support our customers globally.” “Once the tariff scenario stabilizes, there [are] steps that we can take to mitigate it, as you’ve seen us do with China from the first Trump administration. And only after that would we consider price [increases],” Herron said. Similarly, Extreme Networks noted the changing tariff conditions during its earnings call on April 30. “The tariff situation is very dynamic, I think, as everybody knows and can appreciate, and it’s kind of hard to call. Yes, there was concern initially given the magnitude of tariffs,” said Extreme Networks CEO Ed Meyercord on the earnings call. “The larger question is, will all of the changes globally in trade and tariff policy have an impact on demand? And that’s hard to call at this point. And we’re going to hold as far as providing guidance or judgment on that until we have finality come July.” Financial news Meanwhile, AI is fueling high expectations and influencing investments in enterprise campus and data center environments.

Read More »

Liquid cooling becoming essential as AI servers proliferate

“Facility water loops sometimes have good water quality, sometimes bad,” says My Troung, CTO at ZutaCore, a liquid cooling company. “Sometimes you have organics you don’t want to have inside the technical loop.” So there’s one set of pipes that goes around the data center, collecting the heat from the server racks, and another set of smaller pipes that lives inside individual racks or servers. “That inner loop is some sort of technical fluid, and the two loops exchange heat across a heat exchanger,” says Troung. The most common approach today, he says, is to use a single-phase liquid — one that stays in liquid form and never evaporates into a gas — such as water or propylene glycol. But it’s not the most efficient option. Evaporation is a great way to dissipate heat. That’s what our bodies do when we sweat. When water goes from a liquid to a gas it’s called a phase change, and it uses up energy and makes everything around it slightly cooler. Of course, few servers run hot enough to boil water — but they can boil other liquids. “Two phase is the most efficient cooling technology,” says Xianming (Simon) Dai, a professor at University of Texas at Dallas. And it might be here sooner than you think. In a keynote address in March at Nvidia GTC, Nvidia CEO Jensen Huang unveiled the Rubin Ultra NVL576, due in the second half of 2027 — with 600 kilowatts per rack. “With the 600 kilowatt racks that Nvidia is announcing, the industry will have to shift very soon from single-phase approaches to two-phase,” says ZutaCore’s Troung. Another highly-efficient cooling approach is immersion cooling. According to a Castrol survey released in March, 90% of 600 data center industry leaders say that they are considering switching to immersion

Read More »

Microsoft will invest $80B in AI data centers in fiscal 2025

And Microsoft isn’t the only one that is ramping up its investments into AI-enabled data centers. Rival cloud service providers are all investing in either upgrading or opening new data centers to capture a larger chunk of business from developers and users of large language models (LLMs).  In a report published in October 2024, Bloomberg Intelligence estimated that demand for generative AI would push Microsoft, AWS, Google, Oracle, Meta, and Apple would between them devote $200 billion to capex in 2025, up from $110 billion in 2023. Microsoft is one of the biggest spenders, followed closely by Google and AWS, Bloomberg Intelligence said. Its estimate of Microsoft’s capital spending on AI, at $62.4 billion for calendar 2025, is lower than Smith’s claim that the company will invest $80 billion in the fiscal year to June 30, 2025. Both figures, though, are way higher than Microsoft’s 2020 capital expenditure of “just” $17.6 billion. The majority of the increased spending is tied to cloud services and the expansion of AI infrastructure needed to provide compute capacity for OpenAI workloads. Separately, last October Amazon CEO Andy Jassy said his company planned total capex spend of $75 billion in 2024 and even more in 2025, with much of it going to AWS, its cloud computing division.

Read More »

John Deere unveils more autonomous farm machines to address skill labor shortage

Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More Self-driving tractors might be the path to self-driving cars. John Deere has revealed a new line of autonomous machines and tech across agriculture, construction and commercial landscaping. The Moline, Illinois-based John Deere has been in business for 187 years, yet it’s been a regular as a non-tech company showing off technology at the big tech trade show in Las Vegas and is back at CES 2025 with more autonomous tractors and other vehicles. This is not something we usually cover, but John Deere has a lot of data that is interesting in the big picture of tech. The message from the company is that there aren’t enough skilled farm laborers to do the work that its customers need. It’s been a challenge for most of the last two decades, said Jahmy Hindman, CTO at John Deere, in a briefing. Much of the tech will come this fall and after that. He noted that the average farmer in the U.S. is over 58 and works 12 to 18 hours a day to grow food for us. And he said the American Farm Bureau Federation estimates there are roughly 2.4 million farm jobs that need to be filled annually; and the agricultural work force continues to shrink. (This is my hint to the anti-immigration crowd). John Deere’s autonomous 9RX Tractor. Farmers can oversee it using an app. While each of these industries experiences their own set of challenges, a commonality across all is skilled labor availability. In construction, about 80% percent of contractors struggle to find skilled labor. And in commercial landscaping, 86% of landscaping business owners can’t find labor to fill open positions, he said. “They have to figure out how to do

Read More »

2025 playbook for enterprise AI success, from agents to evals

Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More 2025 is poised to be a pivotal year for enterprise AI. The past year has seen rapid innovation, and this year will see the same. This has made it more critical than ever to revisit your AI strategy to stay competitive and create value for your customers. From scaling AI agents to optimizing costs, here are the five critical areas enterprises should prioritize for their AI strategy this year. 1. Agents: the next generation of automation AI agents are no longer theoretical. In 2025, they’re indispensable tools for enterprises looking to streamline operations and enhance customer interactions. Unlike traditional software, agents powered by large language models (LLMs) can make nuanced decisions, navigate complex multi-step tasks, and integrate seamlessly with tools and APIs. At the start of 2024, agents were not ready for prime time, making frustrating mistakes like hallucinating URLs. They started getting better as frontier large language models themselves improved. “Let me put it this way,” said Sam Witteveen, cofounder of Red Dragon, a company that develops agents for companies, and that recently reviewed the 48 agents it built last year. “Interestingly, the ones that we built at the start of the year, a lot of those worked way better at the end of the year just because the models got better.” Witteveen shared this in the video podcast we filmed to discuss these five big trends in detail. Models are getting better and hallucinating less, and they’re also being trained to do agentic tasks. Another feature that the model providers are researching is a way to use the LLM as a judge, and as models get cheaper (something we’ll cover below), companies can use three or more models to

Read More »

OpenAI’s red teaming innovations define new essentials for security leaders in the AI era

Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More OpenAI has taken a more aggressive approach to red teaming than its AI competitors, demonstrating its security teams’ advanced capabilities in two areas: multi-step reinforcement and external red teaming. OpenAI recently released two papers that set a new competitive standard for improving the quality, reliability and safety of AI models in these two techniques and more. The first paper, “OpenAI’s Approach to External Red Teaming for AI Models and Systems,” reports that specialized teams outside the company have proven effective in uncovering vulnerabilities that might otherwise have made it into a released model because in-house testing techniques may have missed them. In the second paper, “Diverse and Effective Red Teaming with Auto-Generated Rewards and Multi-Step Reinforcement Learning,” OpenAI introduces an automated framework that relies on iterative reinforcement learning to generate a broad spectrum of novel, wide-ranging attacks. Going all-in on red teaming pays practical, competitive dividends It’s encouraging to see competitive intensity in red teaming growing among AI companies. When Anthropic released its AI red team guidelines in June of last year, it joined AI providers including Google, Microsoft, Nvidia, OpenAI, and even the U.S.’s National Institute of Standards and Technology (NIST), which all had released red teaming frameworks. Investing heavily in red teaming yields tangible benefits for security leaders in any organization. OpenAI’s paper on external red teaming provides a detailed analysis of how the company strives to create specialized external teams that include cybersecurity and subject matter experts. The goal is to see if knowledgeable external teams can defeat models’ security perimeters and find gaps in their security, biases and controls that prompt-based testing couldn’t find. What makes OpenAI’s recent papers noteworthy is how well they define using human-in-the-middle

Read More »