When AIs bargain, a less advanced agent could cost you

The race to build ever larger AI models is slowing down. The industry’s focus is shifting toward agents—systems that can act autonomously, make decisions, and negotiate on users’ behalf. These AI agents are already being deployed in customer service and programming—and, increasingly, in e-commerce and personal finance.

But what would happen if both a customer and a seller were using an AI agent? A recent study put agent-to-agent negotiations to the test and found that stronger agents can exploit weaker ones to get a better deal. It’s a bit like entering court with a seasoned attorney versus a rookie: You’re technically playing the same game, but the odds are skewed from the start.

The paper, posted to arXiv’s preprint site, found that access to more advanced AI models (those with greater reasoning ability, better training data, and more parameters) could lead to consistently better financial deals, potentially widening the gap between people with greater resources and technical access and those without. If agent-to-agent interactions become the norm, disparities in AI capabilities could quietly deepen existing inequalities.

“Over time, this could create a digital divide where your financial outcomes are shaped less by your negotiating skill and more by the strength of your AI proxy,” says Jiaxin Pei, a postdoc researcher at Stanford University and one of the authors of the study.

In their experiment, the researchers had AI models play the roles of buyers and sellers in three scenarios, negotiating deals for electronics, motor vehicles, and real estate. Each seller agent received the product’s specs, wholesale cost, and retail price, with instructions to maximize profit. Buyer agents, in contrast, were given a budget, the retail price, and ideal product requirements and were tasked with driving the price down.

Each agent had some, but not all, relevant details. This setup mimics many real-world negotiation conditions, where parties lack full visibility into each other’s constraints or objectives.
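
To make the information asymmetry concrete, here is a minimal toy sketch in Python. It is not the researchers’ setup: the study used LLM agents, while this uses simple rule-based agents, and every name and parameter (`Seller`, `Buyer`, `concession`, `tolerance`, the half-of-budget opening bid) is a hypothetical choice made for illustration. Both sides know the retail price, but the seller’s wholesale cost and the buyer’s budget stay private.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Seller:
    wholesale_cost: float  # private: the buyer never reads this
    retail_price: float    # public list price
    concession: float      # hypothetical: share of the gap given up per round (0..1)


@dataclass
class Buyer:
    budget: float          # private: the seller never reads this
    retail_price: float    # public list price
    concession: float      # hypothetical: share of the gap given up per round (0..1)


def negotiate(seller: Seller, buyer: Buyer,
              max_rounds: int = 50, tolerance: float = 1.0) -> Optional[float]:
    """Alternate concessions until offers are within `tolerance`, or give up."""
    ask = seller.retail_price
    bid = 0.5 * min(buyer.budget, buyer.retail_price)  # hypothetical opening bid
    for _ in range(max_rounds):
        if ask - bid <= tolerance:
            return round((ask + bid) / 2, 2)  # settle at the midpoint
        # Seller lowers the ask but never goes below its private cost.
        ask = max(seller.wholesale_cost, ask - seller.concession * (ask - bid))
        # Buyer raises the bid but never goes above its private budget.
        bid = min(buyer.budget, bid + buyer.concession * (ask - bid))
    return None  # talks stalled: no agreement within max_rounds


same_buyer = Buyer(budget=900, retail_price=1000, concession=0.2)
firm_price = negotiate(Seller(600, 1000, concession=0.1), same_buyer)
soft_price = negotiate(Seller(600, 1000, concession=0.4), same_buyer)
no_deal = negotiate(Seller(600, 1000, concession=0.3),
                    Buyer(budget=500, retail_price=1000, concession=0.3))
print(firm_price, soft_price, no_deal)
```

In this toy model, the seller that concedes slowly ends up with a higher price against the same buyer, and when the buyer’s budget sits below the seller’s cost, the talks run out of rounds without a deal.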

The differences in performance were striking. OpenAI’s ChatGPT-o3 delivered the strongest overall negotiation results, followed by the company’s GPT-4.1 and o4-mini. GPT-3.5, which came out almost two years earlier and is the oldest model included in the study, lagged significantly in both roles: It made the least money as the seller and spent the most as a buyer. DeepSeek R1 and V3 also performed well, particularly as sellers. Qwen2.5 trailed behind, though it showed more strength in the buyer role.

One notable pattern was that some agents often failed to close deals but effectively maximized profit in the sales they did make, while others completed more negotiations but settled for lower margins. GPT-4.1 and DeepSeek R1 struck the best balance, achieving both solid profits and high completion rates.

Beyond financial losses, the researchers found that AI agents could get stuck in prolonged negotiation loops without reaching an agreement—or end talks prematurely, even when instructed to push for the best possible deal. Even the most capable models were prone to these failures.

“The result was very surprising to us,” says Pei. “We all believe LLMs are pretty good these days, but they can be untrustworthy in high-stakes scenarios.”

The disparity in negotiation performance could be caused by a number of factors, says Pei. These include differences in training data and the models’ ability to reason and infer missing information. The precise causes remain uncertain, but one factor seems clear: Model size plays a significant role. According to the scaling laws of large language models, capabilities tend to improve with an increase in the number of parameters. This trend held true in the study: Even within the same model family, larger models were consistently able to strike better deals as both buyers and sellers.

This study is part of a growing body of research warning about the risks of deploying AI agents in real-world financial decision-making. Earlier this month, a group of researchers from multiple universities argued that LLM agents should be evaluated primarily on the basis of their risk profiles, not just their peak performance. Current benchmarks, they say, emphasize accuracy and return-based metrics, which measure how well an agent can perform at its best but overlook how safely it can fail. Their research also found that even top-performing models are more likely to break down under adversarial conditions.

The team suggests that in the context of real-world finances, a tiny weakness—even a 1% failure rate—could expose the system to systemic risks. They recommend that AI agents be “stress tested” before being put into practical use.
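
A rough back-of-the-envelope calculation shows why a small per-deal failure rate can matter at scale. The sketch below assumes, purely for illustration, that each negotiation fails independently with the same probability; the function name and the deal counts are hypothetical and do not come from the research.

```python
def prob_at_least_one_failure(failure_rate: float, n_deals: int) -> float:
    """Chance that at least one of n independent deals fails."""
    # Complement rule: 1 minus the chance that every deal succeeds.
    return 1.0 - (1.0 - failure_rate) ** n_deals


for n_deals in (1, 10, 100, 1000):
    chance = prob_at_least_one_failure(0.01, n_deals)
    print(f"{n_deals:>5} deals -> {chance:.1%} chance of at least one failure")
```

Under that independence assumption, a 1% per-deal failure rate implies about a 63% chance of at least one failure across 100 deals.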

Hancheng Cao, an incoming assistant professor at Emory University, notes that the price negotiation study has limitations. “The experiments were conducted in simulated environments that may not fully capture the complexity of real-world negotiations or user behavior,” says Cao. 

Pei says researchers and industry practitioners are experimenting with a variety of strategies to reduce these risks. These include refining the prompts given to AI agents, enabling agents to use external tools or code to make better decisions, coordinating multiple models to double-check each other’s work, and fine-tuning models on domain-specific financial data. All of these approaches have shown promise in improving performance.

Many prominent AI shopping tools are currently limited to product recommendation. In April, for example, Amazon launched “Buy for Me,” an AI agent that helps customers find and buy products from other brands’ sites if Amazon doesn’t sell them directly.

While price negotiation is rare in consumer e-commerce, it’s more common in business-to-business transactions. Alibaba.com has rolled out a sourcing assistant called Accio, built on its open-source Qwen models, that helps businesses find suppliers and research products. The company told MIT Technology Review it has no plans to automate price bargaining so far, citing high risk.

That may be a wise move. For now, Pei advises consumers to treat AI shopping assistants as helpful tools—not stand-ins for humans in decision-making.

“I don’t think we are fully ready to delegate our decisions to AI shopping agents,” he says. “So maybe just use it as an information tool, not a negotiator.”

Shape
Shape
Stay Ahead

Explore More Insights

Stay ahead with more perspectives on cutting-edge power, infrastructure, energy,  bitcoin and AI solutions. Explore these articles to uncover strategies and insights shaping the future of industries.

Shape

Trump Overturns California Phaseout of Fossil Fuel Cars

President Donald Trump on Thursday signed into law congressional resolutions that overturn three California regulations for cleaner transport, including one that would phase out the sale of new fossil fuel vehicles by 2035. Last February the Environmental Protection Agency (EPA) said it was letting Congress review waivers it had issued

Read More »

Why people love Linux

The people who love Linux love it for a wide variety of reasons. Some of them appreciate having access to source code and the ability (if they’re so inclined) to modify it. Most love that the majority of Linux distributions are completely free. Some understand and appreciate that Linux is

Read More »

Petronas, F1 Team Join Hands to Support Research on Mangrove CCS

Petroliam Nasional Bhd. and the Mercedes-AMG Petronas F1 Team have agreed to launch a South-South initiative to study carbon capture and storage (CCS) in mangrove ecosystems. The Blue Carbon Collective will expand an existing research collaboration between Universiti Putra Malaysia (UPM) and University of Sao Paulo (USP). UPM will conduct research in the Sungai Santi Forest Reserve and apply established methodologies from Brazil. The research will “include carbon stock assessment and monitoring of soil quality and ecosystem health in Malaysia, enabling comparative analysis between the two countries”, Petronas said in an online statement. “The Blue Carbon Collective aims to deliver several research objectives including identifying the impact of land use changes, understanding carbon stabilization mechanisms, and developing and applying a soil quality index”. “The five-year collaboration is expected to generate vital research data to advance carbon emissions reduction strategies, help conserve mangroves, and create local job and business opportunities”, Petronas said. “The Mercedes-AMG PETRONAS F1 Team will support the research activities”. Professor Tiago Osorio Ferreira, project coordinator from USP, said, “These findings will support the development of process-based models for carbon dynamics in Blue Carbon ecosystems at a global scale and produce evidence-based climate policies grounded in nature-based solutions”. Petronas unveiled the initiative as it announced biodiversity and resource efficiency goals at the inaugural Petronas-hosted Energy and Nature Forum in Kuala Lumpur. By 2030 Petronas aims to have “Biodiversity Action Plans” for all “very high” and “high” risk areas that host sites under Petronas’ operational control. “From 2030, PETRONAS aims to maintain the habitat size for all sites within their operational control located in protected areas and/or key biodiversity areas”, Petronas said. 
“Where not feasible, PETRONAS establishes comparable areas to substitute the loss. “From 2030, PETRONAS’ decommissioning plans or equivalent documents, will include ecosystem rehabilitation measures for operations/projects in protected

Read More »

Google, CTC Global partner to deploy advanced conductors

Google and conductor manufacturer CTC Global on Tuesday said they are partnering to ask states, utilities and transmission developers to identify areas to deploy advanced conductors, which can carry more power than standard transmission lines but use existing towers and poles. Responses to a request for information are due on July 14, and a request for proposals will “shortly follow,” Google and CTC Global said in a release shared in advance with Utility Dive. “Applications are encouraged from areas where Google has existing or announced data centers, as well as their associated wholesale markets,” the release states. The partnership will focus on U.S. transmission lines that have the most potential to accelerate grid capacity using CTC Global’s conductors, and it will prioritize projects that would “deliver the greatest immediate impact and that support load growth where Google operates,” the release said. Advanced conductors are one of the alternative transmission technologies that the Federal Energy Regulatory Commission’s Order 1920 requires transmission providers “in each transmission planning region to consider more fully.” FERC said in an explainer that the order’s goal “is to identify efficient and cost-effective solutions to meet transmission needs and optimize the transmission system without the need to build additional transmission facilities.” Google and CTC Global invited states, utilities and transmission developers to start responding to the RFI immediately. They said “selected partners and projects” will gain access to cost assistance, workforce training on the deployment of CTC Global’s ACCC conductors and “support for technical project studies … to validate the technology’s integration and impact.” CTC Global CEO J.D. Sitton said in the release that the partnership is a “positive turning point to lower electricity costs, generate economic growth, and advance U.S. energy dominance … [helping] ensure that the U.S. 
invests in cost-effective solutions for the long-term that help the U.S.

Read More »

Petronas Enters into New Partnerships with Eni, TotalEnergies

Malaysia’s national oil and gas company has signed a deal with Eni SpA to combine their upstream assets in Malaysia and Indonesia. Separately Petroliam Nasional Bhd. (Petronas) penned multiple agreements with TotalEnergies SE to jointly explore several offshore blocks in Malaysia and signed another agreement for a farm-down in Indonesia to the French company. The joint venture agreement with Italy’s state-backed Eni, expected to be finalized in the fourth quarter, would deliver 500,000 barrels of oil equivalent (boe) a day in the medium term. The combined portfolio would consist of about three billion boe of reserves and 10 billion boe of exploration potential, according to the companies. “This collaboration will unlock new opportunities for us to contribute to the energy security in the region and deliver long-term value across Malaysia and Indonesia”, Petronas president and chief executive Muhammad Taufik said in an online statement. The partnership would create “synergy in terms of assets, expertise and financial capabilities, in a transformational model that further strengthens the huge potential of the two countries”, Eni chief executive Claudio Descalzi said separately. “The new company will have a strong regional impact on gas production, bringing additional energy, infrastructures and employment for the benefit of both Indonesia and Malaysia”. Meanwhile Petronas’ agreements with TotalEnergies in Malaysia and Indonesia involve offshore blocks in different maturation stages and covering over 100,000 square kilometers (38,610.19 square miles), TotalEnergies said in a press release. “TotalEnergies will notably hold, alongside PETRONAS through its wholly-owned subsidiary Petronas Carigali Sdn. Bhd., a 50 percent operated working interest in Blocks SK301b and SK313, where significant gas discoveries (more than 4 Tcf) were made and are expected to be developed to support gas supply to Malaysia LNG from 2030”, TotalEnergies said. 
“TotalEnergies will also hold, alongside PETRONAS, interests in several exploration blocks offshore Malaysia.

Read More »

Trump Plays Down Iran-Israel Ceasefire as He Leaves G7 Early

US President Donald Trump left the Group of Seven leaders meeting in Canada early to deal with the Israel-Iran conflict, but played down the chances of a ceasefire. On Tuesday morning, he returned to Washington and criticized French President Emmanuel Macron for saying the move was possibly a sign he was working on a truce. “Wrong!,” Trump said in reference to Macron on Truth Social. “He has no idea why I am now on my way to Washington, but it certainly has nothing to do with a Cease Fire. Much bigger than that. Stay Tuned!” Trump hasn’t clearly spelled out his next steps as he returns to the US capital. Israel and Iran continue to strike on one another. While global markets have calmed since hostilities started on Friday with Israel bombing Iran, there are still widespread fears the war will spread to other countries in the oil- and gas-producing region. Despite Trump’s latest comments, top US officials said the president remains hopeful a peace deal can be achieved between Israel and Iran. The diplomatic flurry followed another 24 hours of intense bombardments, with Iran firing ballistic missiles and Israel striking targets across the Islamic Republic, including the capital of Tehran. The USS Nimitz aircraft carrier strike group is now sailing to the Middle East ahead of schedule, marking the first significant move of American military assets to the region since Friday.  “Iran should have signed the ‘deal’ I told them to sign,” Trump wrote in an earlier social media post, referring to nuclear talks between Tehran and Washington that are now on hold. “What a shame, and waste of human life. Simply stated, IRAN CAN NOT HAVE A NUCLEAR WEAPON. I said it over and over again! Everyone should immediately evacuate Tehran!” It wasn’t clear if Trump knew of a

Read More »

EU Needs $279B Investment in Traditional Nuclear through 2050: Commission

European Union member states need around EUR 241 billion ($278.62 billion) through 2050 to grow the share of conventional nuclear in their energy mix toward meeting their decarbonization, industrial competitiveness and energy security goals, according to official analysis. The 27-member bloc had 101 operational nuclear power reactors, with a combined capacity of 98 gigawatts electric (GWe), as of last year, the European Commission said. These are spread across 12 member states: Belgium, Bulgaria, Czechia, Finland, France, Hungary, the Netherlands, Romania, Slovakia, Slovenia, Spain and Sweden. In 2023 the units supplied 22.8 percent of the EU’s electricity generation. Three more reactors are under construction: one in Slovakia (Mochovce 4) and two in Hungary (Paks II), according to the Commission. While the EU’s top economy, Germany, shut down its three remaining nuclear power plants in April 2023, the new German government signaled it would drop its opposition to nuclear power, according to a Reuters report May 20, 2025, citing a French official. The estimated investment need, EUR 241 billion in present-value terms, is based on generation gaps identified in National Energy and Climate Plans (NECPs). The estimate, a “base case scenario”, accounts for the existing fleet, ongoing constructions and planned newbuilds. Additional investment is needed for small modular reactors (SMRs), advanced modular reactors (AMRs) and microreactors, the Commission said, though it did not quantify the investment need. Newbuild projects account for EUR 205 billion. Lifetime extensions would need EUR 36 billion, according to the Commission. In the base case scenario, the Commission projects an increase in nuclear generation capacity to 109 GWe in 2050, assuming at least some of the existing reactors extend their operating life beyond 60 years and planned newbuild reactors are delivered on time. “The Commission estimates that over 90 percent of electricity in the EU in 2040 will

Read More »

Chevron Lummus, Neste Make Progress on New Waste-to-Fuel Tech

Neste Oyj and Chevron Lummus Global (CLG) have announced promising pilot results for a new process to convert lignocellulosic biomass into renewable fuels. “Through close collaboration at CLG’s state-of-the-art R&D facility in the U.S., Neste and CLG have successfully demonstrated proof of concept for converting lignocellulosic waste into renewable fuels, with highly promising initial results”, a joint statement said. The results indicated the new technology could outperform existing technologies for processing lignocellulosic raw materials, according to the companies. “Neste and CLG are currently validating the technology and targeting readiness to scale up the technology to commercial scale”, they said. “Vast amounts of lignocellulosic waste and residues from existing forest industry and agricultural production remain underutilized and could be leveraged as valuable renewable raw materials”. “The partnership combines CLG’s extensive experience and proven track record in developing and licensing market-leading refining technologies with Neste’s pioneering expertise and global leadership in renewable fuels”, the partners said. CLG chief executive Rajesh Samarth said, “We are confident this partnership will pave a new pathway for producing renewable fuels, leveraging our versatile and scalable hydroprocessing technology platform”. Lars Peter Lindfors, senior vice president for technology and innovation at Neste, said, “Unlocking the potential of these promising raw materials would allow us to meet the growing demand of renewable fuels in the long-term and contribute to ambitious greenhouse gas emission reduction targets”. Espoo, Finland-based Neste produces sustainable aviation fuel (SAF) and renewable diesel. It has increased its SAF production capacity to 1.5 million metric tons per annum (MMtpa) with last year’s start-up of a Rotterdam project with a capacity of 500,000 metric tons a year. 
Neste aims to grow its production capacity for renewable fuels to 6.8 million metric tons a year by 2027. CLG, a joint venture between Chevron Corp. and Lummus Technology, provides technology

Read More »

Next-gen AI chips will draw 15,000W each, redefining power, cooling, and data center design

“Dublin imposed a 2023 moratorium on new data centers, Frankfurt has no new capacity expected before 2030, and Singapore has just 7.2 MW available,” said Kasthuri Jagadeesan, Research Director at Everest Group, highlighting the dire situation. Electricity: the new bottleneck in AI RoI As AI modules push infrastructure to its limits, electricity is becoming a critical driver of return on investment. “Electricity has shifted from a line item in operational overhead to the defining factor in AI project feasibility,” Gogia noted. “Electricity costs now constitute between 40–60% of total Opex in modern AI infrastructure, both cloud and on-prem.” Enterprises are now forced to rethink deployment strategies—balancing control, compliance, and location-specific power rates. Cloud hyperscalers may gain further advantage due to better PUE, renewable access, and energy procurement models. “A single 15,000-watt module running continuously can cost up to $20,000 annually in electricity alone, excluding cooling,” said Manish Rawat, analyst at TechInsights. “That cost structure forces enterprises to evaluate location, usage models, and platform efficiency like never before.” The silicon arms race meets the power ceiling AI chip innovation is hitting new milestones, but the cost of that performance is no longer just measured in dollars or FLOPS — it’s in kilowatts. The KAIST TeraLab roadmap demonstrates that power and heat are becoming dominant factors in compute system design. The geography of AI, as several experts warn, is shifting. Power-abundant regions such as the Nordics, the Midwest US, and the Gulf states are becoming magnets for data center investments. Regions with limited grid capacity face a growing risk of becoming “AI deserts.”

Read More »

Edge reality check: What we’ve learned about scaling secure, smart infrastructure

Enterprises are pushing cloud resources back to the edge after years of centralization. Even as major incumbents such as Google, Microsoft, and AWS pull more enterprise workloads into massive, centralized hyperscalers, use cases at the edge increasingly require nearby infrastructure—not a long hop to a centralized data center—to take advantage of the torrents of real-time data generated by IoT devices, sensor networks, smart vehicles, and a panoply of newly connected hardware. Not long ago, the enterprise edge was a physical one. The central data center was typically located in or very near the organization’s headquarters. When organizations sought to expand their reach, they wanted to establish secure, speedy connections to other office locations, such as branches, providing them with fast and reliable access to centralized computing resources. Vendors initially sold MPLS, WAN optimization, and SD-WAN as “branch office solutions,” after all. Lesson one: Understand your legacy before locking in your future The networking model that connects centralized cloud resources to the edge via some combination of SD-WAN, MPLS, or 4G reflects a legacy HQ-branch design. However, for use cases such as facial recognition, gaming, or video streaming, old problems are new again. Latency, middle-mile congestion, and the high cost of bandwidth all undermine these real-time edge use cases.

Read More »

Cisco capitalizes on Isovalent buy, unveils new load balancer

The customer deploys the Isovalent Load Balancer control plane via automation and configures the desired number of virtual load-balancer appliances, Graf said. “The control plane automatically deploys virtual load-balancing appliances via the virtualization or Kubernetes platform. The load-balancing layer is self-healing and supports auto-scaling, which means that I can replace unhealthy instances and scale out as needed. The load balancer supports powerful L3-L7 load balancing with enterprise capabilities,” he said. Depending on the infrastructure the load balancer is deployed into, the operator will deploy the load balancer using familiar deployment methods. In a data center, this will be done using a standard virtualization automation installation such as Terraform or Ansible. In the public cloud, the load balancer is deployed as a public cloud service. In Kubernetes and OpenShift, the load balancer is deployed as a Kubernetes Deployment/Operator, Graf said.  “In the future, the Isovalent Load Balancer will also be able to run on top of Cisco Nexus smart switches,” Graf said. “This means that the Isovalent Load Balancer can run in any environment, from data center, public cloud, to Kubernetes while providing a consistent load-balancing layer with a frictionless cloud-native developer experience.” Cisco has announced a variety of smart switches over the past couple of months on the vendor’s 4.8T capacity Silicon One chip. But the N9300, where Isovalent would run, includes a built-in programmable data processing unit (DPU) from AMD to offload complex data processing work and free up the switches for AI and large workload processing. For customers, the Isovalent Load Balancer provides consistent load balancing across infrastructure while being aligned with Kubernetes as the future for infrastructure. “A single load-balancing solution that can run in the data center, in public cloud, and modern Kubernetes environments. 
This removes operational complexity, lowers cost, while modernizing the load-balancing infrastructure in preparation

Read More »

Oracle’s struggle with capacity meant they made the difficult but responsible decisions

IDC President Crawford Del Prete agreed, and said that Oracle senior management made the right move, despite how difficult the situation is today. “Oracle is being incredibly responsible here. They don’t want to have a lot of idle capacity. That capacity does have a shelf life,” Del Prete said. CEO Catz “is trying to be extremely precise about how much capacity she puts on.” Del Prete said that, for the moment, Oracle’s capacity situation is unique to the company, and has not been a factor with key rivals AWS, Microsoft, and Google. During the investor call, Catz said that her team “made engineering decisions that were much different from the other hyperscalers and that were better suited to the needs of enterprise customers, resulting in lower costs to them and giving them deployment flexibility.” Oracle management certainly anticipated a flurry of orders, but Catz said that she chose to not pay for expanded capacity until she saw finalized “contracted noncancelable bookings.” She pointed to a huge capex line of $9.1 billion and said, “the vast majority of our capex investments are for revenue generating equipment that is going into data centers and not for land or buildings.”

Winners and losers in the Top500 supercomputer ranking

GPU winner: AMD. AMD is finally making a showing for itself, albeit modestly, in GPU accelerators. For the June 2025 edition of the list, AMD Instinct accelerators are in 23 systems, a nice little jump from the 10 systems on the June 2024 list. Of course, it helps with the sales pitch when AMD processors and coprocessors can be found powering the No. 1 and No. 2 supercomputers in the world.

GPU loser: Intel. Intel’s GPU efforts have been a disaster. It failed to make a dent in the consumer space with its Arc GPUs, and it isn’t making much headway in the data center, either. There were only four systems running GPU Max processors on the list, and that’s up from three a year ago. Still, it’s a pitiful showing given the effort Intel made.

Server winners: HPE, Dell, Eviden, Nvidia. The four server vendors — servers, not component makers — all saw share increases. Nvidia is also a server vendor, selling its SuperPOD AI servers directly to customers. They all gained at the expense of Lenovo and Arm.

Server loser: Lenovo. It saw the sharpest drop in server share, going from 163 systems in June of 2024 to 136 in this most recent listing.

Loser: Arm. Other than the 13 Nvidia Grace chips, the Arm architecture was completely absent from this spring’s list.

Micron joins HBM4 race with 36GB 12-high stack, eyes AI and data center dominance

Race to power the next generation of AI

By shipping HBM4 samples to key customers, Micron has joined SK hynix in the HBM4 race. In March this year, SK hynix shipped its 12-layer HBM4 samples to customers. SK hynix’s HBM4 delivers bandwidth of more than 2TB per second, enough to process the equivalent of more than 400 full-HD movies (5GB each) in a second, the company said. “HBM competitive landscape, SK hynix has already sampled and secured approval of HBM4 12-high stack memory early Q1’2025 to NVIDIA for its next generation Rubin product line and plans to mass produce HBM4 in 2H 2025,” said Danish Faruqui, CEO, Fab Economics. “Closely following, Micron is pending Nvidia’s tests for its latest HBM4 samples, and Micron plans to mass produce HBM4 in 1H 2026. On the other hand, the last contender, Samsung is struggling with Yield Ramp on HBM4 Technology Development stage, and so has to delay the customer samples milestones to Nvidia and other players while it earlier shared an end of 2025 milestone for mass producing HBM4.” Faruqui noted another key differentiator among SK hynix, Micron, and Samsung: the base die that anchors the 12-high DRAM stack. For the first time, both SK hynix and Samsung have introduced a logic-enabled base die on 3nm and 4nm process technology to make HBM4 faster and more efficient through base logic-driven memory management. Both Samsung and SK hynix rely on TSMC for the production of their logic-enabled base die. However, it remains unclear whether Micron is using a logic base die, as the company lacks in-house capability to fabricate at 3nm.
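
As a quick sanity check of the bandwidth comparison SK hynix gives (2TB per second versus 400 full-HD movies of 5GB each), the arithmetic can be sketched as follows. This is only an illustration: it assumes decimal units (1 TB = 1000 GB), which the article does not specify, and the function name is hypothetical.

```python
# Sanity check of the "2TB per second = 400 full-HD movies" comparison.
# Assumption: decimal units (1 TB = 1000 GB); the source does not state the convention.
GB_PER_TB = 1000


def movies_per_second(bandwidth_tb_per_s: float, movie_size_gb: float) -> float:
    """Return how many movies of the given size one second of bandwidth can carry."""
    return bandwidth_tb_per_s * GB_PER_TB / movie_size_gb


movies = movies_per_second(2.0, 5.0)
print(movies)  # 400.0
```

Under that assumption the two figures agree: 400 movies at 5GB each is 2,000GB, or 2TB.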

Microsoft will invest $80B in AI data centers in fiscal 2025

And Microsoft isn’t the only one that is ramping up its investments in AI-enabled data centers. Rival cloud service providers are all investing in either upgrading or opening new data centers to capture a larger chunk of business from developers and users of large language models (LLMs). In a report published in October 2024, Bloomberg Intelligence estimated that demand for generative AI would push Microsoft, AWS, Google, Oracle, Meta, and Apple to devote $200 billion between them to capex in 2025, up from $110 billion in 2023. Microsoft is one of the biggest spenders, followed closely by Google and AWS, Bloomberg Intelligence said. Its estimate of Microsoft’s capital spending on AI, at $62.4 billion for calendar 2025, is lower than Smith’s claim that the company will invest $80 billion in the fiscal year to June 30, 2025. Both figures, though, are way higher than Microsoft’s 2020 capital expenditure of “just” $17.6 billion. The majority of the increased spending is tied to cloud services and the expansion of AI infrastructure needed to provide compute capacity for OpenAI workloads. Separately, last October Amazon CEO Andy Jassy said his company planned total capex spend of $75 billion in 2024 and even more in 2025, with much of it going to AWS, its cloud computing division.

John Deere unveils more autonomous farm machines to address skill labor shortage

Self-driving tractors might be the path to self-driving cars. John Deere has revealed a new line of autonomous machines and tech across agriculture, construction and commercial landscaping. The Moline, Illinois-based John Deere has been in business for 187 years, yet it has become a regular non-tech exhibitor at the big tech trade show in Las Vegas, and it is back at CES 2025 with more autonomous tractors and other vehicles. This is not something we usually cover, but John Deere has a lot of data that is interesting in the big picture of tech. The message from the company is that there aren’t enough skilled farm laborers to do the work that its customers need. It’s been a challenge for most of the last two decades, said Jahmy Hindman, CTO at John Deere, in a briefing. Much of the tech will come this fall and after that. He noted that the average farmer in the U.S. is over 58 and works 12 to 18 hours a day to grow food for us. And he said the American Farm Bureau Federation estimates there are roughly 2.4 million farm jobs that need to be filled annually, and the agricultural workforce continues to shrink. (This is my hint to the anti-immigration crowd.) John Deere’s autonomous 9RX Tractor. Farmers can oversee it using an app. While each of these industries experiences its own set of challenges, a commonality across all is skilled labor availability. In construction, about 80% of contractors struggle to find skilled labor. And in commercial landscaping, 86% of landscaping business owners can’t find labor to fill open positions, he said. “They have to figure out how to do

2025 playbook for enterprise AI success, from agents to evals

2025 is poised to be a pivotal year for enterprise AI. The past year has seen rapid innovation, and this year will see the same. This has made it more critical than ever to revisit your AI strategy to stay competitive and create value for your customers. From scaling AI agents to optimizing costs, here are the five critical areas enterprises should prioritize for their AI strategy this year.

1. Agents: the next generation of automation

AI agents are no longer theoretical. In 2025, they’re indispensable tools for enterprises looking to streamline operations and enhance customer interactions. Unlike traditional software, agents powered by large language models (LLMs) can make nuanced decisions, navigate complex multi-step tasks, and integrate seamlessly with tools and APIs. At the start of 2024, agents were not ready for prime time, making frustrating mistakes like hallucinating URLs. They started getting better as frontier large language models themselves improved. “Let me put it this way,” said Sam Witteveen, cofounder of Red Dragon, a company that develops agents for companies, and that recently reviewed the 48 agents it built last year. “Interestingly, the ones that we built at the start of the year, a lot of those worked way better at the end of the year just because the models got better.” Witteveen shared this in the video podcast we filmed to discuss these five big trends in detail. Models are getting better and hallucinating less, and they’re also being trained to do agentic tasks. Another feature that the model providers are researching is a way to use the LLM as a judge, and as models get cheaper (something we’ll cover below), companies can use three or more models to

OpenAI’s red teaming innovations define new essentials for security leaders in the AI era

OpenAI has taken a more aggressive approach to red teaming than its AI competitors, demonstrating its security teams’ advanced capabilities in two areas: multi-step reinforcement and external red teaming. OpenAI recently released two papers that set a new competitive standard for improving the quality, reliability and safety of AI models in these two techniques and more. The first paper, “OpenAI’s Approach to External Red Teaming for AI Models and Systems,” reports that specialized teams outside the company have proven effective in uncovering vulnerabilities that might otherwise have made it into a released model because in-house testing techniques may have missed them. In the second paper, “Diverse and Effective Red Teaming with Auto-Generated Rewards and Multi-Step Reinforcement Learning,” OpenAI introduces an automated framework that relies on iterative reinforcement learning to generate a broad spectrum of novel, wide-ranging attacks.

Going all-in on red teaming pays practical, competitive dividends

It’s encouraging to see competitive intensity in red teaming growing among AI companies. When Anthropic released its AI red team guidelines in June of last year, it joined AI providers including Google, Microsoft, Nvidia, OpenAI, and even the U.S.’s National Institute of Standards and Technology (NIST), which all had released red teaming frameworks. Investing heavily in red teaming yields tangible benefits for security leaders in any organization. OpenAI’s paper on external red teaming provides a detailed analysis of how the company strives to create specialized external teams that include cybersecurity and subject matter experts. The goal is to see if knowledgeable external teams can defeat models’ security perimeters and find gaps in their security, biases and controls that prompt-based testing couldn’t find. 
What makes OpenAI’s recent papers noteworthy is how well they define using human-in-the-middle
