
Cutting cloud waste at scale: Akamai saves up to 70% using AI agents orchestrated by Kubernetes


Join the event trusted by enterprise leaders for nearly two decades. VB Transform brings together the people building real enterprise AI strategy. Learn more


Particularly in this dawning era of generative AI, cloud costs are at an all-time high. But that’s not merely because enterprises are using more compute; it’s because they’re not using it efficiently. In fact, enterprises are expected to waste $44.5 billion on unnecessary cloud spending this year alone.

The problem is amplified for Akamai Technologies: the company runs a large, complex infrastructure across multiple clouds and must also satisfy numerous strict security requirements.

To resolve this, the cybersecurity and content delivery provider turned to the Kubernetes automation platform Cast AI, whose AI agents help optimize cost, security and speed across cloud environments. 

Ultimately, the platform helped Akamai cut cloud costs by 40% to 70%, depending on the workload.

“We needed a continuous way to optimize our infrastructure and reduce our cloud costs without sacrificing performance,” Dekel Shavit, senior director of cloud engineering at Akamai, told VentureBeat. “We’re the ones processing security events. Delay is not an option. If we’re not able to respond to a security attack in real time, we have failed.”

Specialized agents that monitor, analyze and act

Kubernetes manages the infrastructure that runs applications, making it easier to deploy, scale and manage them, particularly in cloud-native and microservices architectures.

Cast AI has integrated into the Kubernetes ecosystem to help customers scale their clusters and workloads, select the best infrastructure and manage compute lifecycles, explained founder and CEO Laurent Gil. Its core platform is Application Performance Automation (APA), which operates through a team of specialized agents that continuously monitor, analyze and take action to improve application performance, security, efficiency and cost. Companies provision only the compute they need from AWS, Microsoft, Google or others.
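The monitor-analyze-act loop Gil describes can be sketched in a few lines. This is a hypothetical illustration, not Cast AI’s actual agent code; the `Workload` structure, metric names, and thresholds are all invented for the example.

```python
# Hypothetical sketch of a monitor-analyze-act optimization agent.
# The Workload type and thresholds are invented for illustration;
# they are not Cast AI's actual API.
from dataclasses import dataclass

@dataclass
class Workload:
    name: str
    cpu_requested: float   # cores requested in the pod spec
    cpu_used: float        # cores actually observed in use

def analyze(w: Workload, headroom: float = 1.2) -> str:
    """Decide an action by comparing observed usage to requests."""
    target = w.cpu_used * headroom      # usage plus a safety margin
    if w.cpu_requested > target * 1.5:
        return "downsize"               # heavily over-provisioned
    if w.cpu_requested < target:
        return "upsize"                 # at risk of CPU throttling
    return "keep"

workloads = [
    Workload("api", cpu_requested=4.0, cpu_used=1.0),
    Workload("etl", cpu_requested=2.0, cpu_used=1.9),
]
for w in workloads:
    print(w.name, analyze(w))   # api downsize / etl upsize
```

A real agent would run this continuously against live metrics and then apply the change through the Kubernetes API rather than print it.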

APA is powered by several machine learning (ML) models with reinforcement learning (RL) based on historical data and learned patterns, enhanced by an observability stack and heuristics. It is coupled with infrastructure-as-code (IaC) tools on several clouds, making it a completely automated platform.

Gil explained that APA was built on the tenet that observability is just a starting point; as he called it, observability is “the foundation, not the goal.” Cast AI also supports incremental adoption, so customers don’t have to rip out and replace; they can integrate into existing tools and workflows. Further, nothing ever leaves customer infrastructure; all analysis and actions occur within their dedicated Kubernetes clusters, providing more security and control.

Gil also emphasized the importance of human-centricity. “Automation complements human decision-making,” he said, with APA maintaining human-in-the-middle workflows.

Akamai’s unique challenges

Shavit explained that Akamai’s large and complex cloud infrastructure powers content delivery network (CDN) and cybersecurity services delivered to “some of the world’s most demanding customers and industries” while complying with strict service level agreements (SLAs) and performance requirements.

He noted that for some of the services Akamai consumes, it is likely the vendor’s largest customer, adding that the company has done “tons of core engineering and reengineering” with its hyperscaler to support its needs.

Further, Akamai serves customers of various sizes and industries, including large financial institutions and credit card companies. The company’s services are directly related to its customers’ security posture. 

Ultimately, Akamai needed to balance all this complexity with cost. Shavit noted that real-life attacks on customers could drive capacity 100X or 1,000X on specific components of its infrastructure. But “scaling our cloud capacity by 1,000X in advance just isn’t financially feasible,” he said. 

His team considered optimizing on the code side, but the inherent complexity of their business model required focusing on the core infrastructure itself. 

Automatically optimizing the entire Kubernetes infrastructure

What Akamai really needed was a Kubernetes automation platform that could optimize the costs of running its entire core infrastructure in real time on several clouds, Shavit explained, and scale applications up and down based on constantly changing demand. But all this had to be done without sacrificing application performance.
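The scale-up-and-down behavior Shavit describes is what Kubernetes’ Horizontal Pod Autoscaler does natively. Its core formula is simple enough to show; this is a minimal sketch of the documented algorithm, ignoring tolerances and stabilization windows:

```python
# Kubernetes' Horizontal Pod Autoscaler computes:
#   desired = ceil(current_replicas * current_metric / target_metric)
from math import ceil

def desired_replicas(current_replicas: int, current_metric: float,
                     target_metric: float) -> int:
    return ceil(current_replicas * current_metric / target_metric)

# CPU at 90% against a 60% target: scale 4 pods up to 6.
print(desired_replicas(4, 90.0, 60.0))  # 6
# Demand falls to 20%: scale back down to 2.
print(desired_replicas(4, 20.0, 60.0))  # 2
```

Platforms like Cast AI layer cluster-level decisions (which nodes to buy and retire) on top of this kind of workload-level scaling.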

Shavit noted that before implementing Cast AI, Akamai’s DevOps team manually tuned all of its Kubernetes workloads just a few times a month. Given the scale and complexity of the infrastructure, this was challenging and costly, and because workloads were analyzed only sporadically, the team missed any real-time optimization potential.

“Now, hundreds of Cast agents do the same tuning, except they do it every second of every day,” said Shavit. 

The core APA features Akamai uses are autoscaling, in-depth Kubernetes automation with bin packing (consolidating workloads onto as few nodes as possible), automatic selection of the most cost-efficient compute instances, workload rightsizing, spot instance automation throughout the entire instance lifecycle, and cost analytics.
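Bin packing here means squeezing pods onto fewer nodes so idle nodes can be removed. A minimal first-fit-decreasing sketch of the idea (node capacity and pod CPU requests are illustrative; real schedulers also weigh memory, affinity rules, and disruption budgets):

```python
def first_fit_decreasing(pod_cpus, node_capacity):
    """Pack pod CPU requests onto the fewest nodes (greedy heuristic)."""
    nodes = []  # remaining free capacity per node
    for cpu in sorted(pod_cpus, reverse=True):
        for i, free in enumerate(nodes):
            if free >= cpu:
                nodes[i] = free - cpu   # place pod on an existing node
                break
        else:
            nodes.append(node_capacity - cpu)  # open a new node
    return len(nodes)

# Eight pods totaling 12 cores fit on three 4-core nodes.
print(first_fit_decreasing([2, 1, 0.5, 3, 1, 1.5, 1, 2], node_capacity=4))  # 3
```

Every node the packer avoids opening is a node the cloud provider no longer bills for, which is where much of the 40% to 70% savings comes from.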

“We got insight into cost analytics two minutes into the integration, which is something we’d never seen before,” said Shavit. “Once active agents were deployed, the optimization kicked in automatically, and the savings started to come in.”

Spot instances (unused cloud capacity offered at steep discounts) made obvious business sense, but they proved complicated for Akamai’s complex workloads, particularly Apache Spark, Shavit noted. Using them would have meant either over-engineering workloads or dedicating more engineering hands to them, which turned out to be financially counterproductive.

With Cast AI, they were able to use spot instances on Spark with “zero investment” from the engineering team or operations. The value of spot instances was “super clear”; they just needed to find the right tool to be able to use them. This was one of the reasons they moved forward with Cast, Shavit noted. 
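The spot trade-off can be made concrete with a toy instance picker: choose the cheapest eligible offer, but fall back to on-demand for workloads that cannot tolerate interruption. The prices and offer list below are invented for illustration; real spot pricing varies by region and by the minute.

```python
# Hypothetical offer list; prices are invented for illustration.
OFFERS = [
    {"type": "m5.xlarge", "market": "spot",      "price": 0.07},
    {"type": "m5.xlarge", "market": "on-demand", "price": 0.192},
]

def pick_offer(interruption_tolerant: bool) -> dict:
    """Pick the cheapest offer a workload is allowed to run on."""
    eligible = [o for o in OFFERS
                if interruption_tolerant or o["market"] == "on-demand"]
    return min(eligible, key=lambda o: o["price"])

# A checkpointed Spark batch job can ride spot capacity...
print(pick_offer(True)["market"])    # spot
# ...while a latency-critical service stays on-demand.
print(pick_offer(False)["market"])   # on-demand
```

The hard part in production is not this selection logic but handling interruptions gracefully, which is the work Akamai avoided by letting the platform manage the instance lifecycle.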

While saving 2X or 3X on their cloud bill is great, Shavit pointed out that automation without manual intervention is “priceless.” It has resulted in “massive” time savings.

Before implementing Cast AI, his team was “constantly moving around knobs and switches” to ensure that production environments delivered the service levels its customers required.

“Hands down the biggest benefit has been the fact that we don’t need to manage our infrastructure anymore,” said Shavit. “The team of Cast’s agents is now doing this for us. That has freed our team up to focus on what matters most: releasing features faster to our customers.”

Editor’s note: At this month’s VB Transform, Google Cloud CTO Will Grannis and Highmark Health SVP and Chief Analytics Officer Richard Clarke will discuss the new AI stack in healthcare and the real-world challenges of deploying multi-model AI systems in a complex, regulated environment. Register today.

Trump Overturns California Phaseout of Fossil Fuel Cars

President Donald Trump on Thursday signed into law congressional resolutions that overturn three California regulations for cleaner transport, including one that would phase out the sale of new fossil fuel vehicles by 2035. Last February the Environmental Protection Agency (EPA) said it was letting Congress review waivers it had issued

Read More »

Why people love Linux

The people who love Linux love it for a wide variety of reasons. Some of them appreciate having access to source code and the ability (if they’re so inclined) to modify it. Most love that the majority of Linux distributions are completely free. Some understand and appreciate that Linux is

Read More »

Chevron Lummus, Neste Make Progress on New Waste-to-Fuel Tech

Neste Oyj and Chevron Lummus Global (CLG) have announced promising pilot results for a new process to convert lignocellulosic biomass into renewable fuels. “Through close collaboration at CLG’s state-of-the-art R&D facility in the U.S., Neste and CLG have successfully demonstrated proof of concept for converting lignocellulosic waste into renewable fuels, with highly promising initial results”, a joint statement said. The results indicated the new technology could outperform existing technologies for processing lignocellulosic raw materials, according to the companies. “Neste and CLG are currently validating the technology and targeting readiness to scale up the technology to commercial scale”, they said. “Vast amounts of lignocellulosic waste and residues from existing forest industry and agricultural production remain underutilized and could be leveraged as valuable renewable raw materials”. “The partnership combines CLG’s extensive experience and proven track record in developing and licensing market-leading refining technologies with Neste’s pioneering expertise and global leadership in renewable fuels”, the partners said. CLG chief executive Rajesh Samarth said, “We are confident this partnership will pave a new pathway for producing renewable fuels, leveraging our versatile and scalable hydroprocessing technology platform”. Lars Peter Lindfors, senior vice president for technology and innovation at Neste, said, “Unlocking the potential of these promising raw materials would allow us to meet the growing demand of renewable fuels in the long-term and contribute to ambitious greenhouse gas emission reduction targets”. Espoo, Finland-based Neste produces sustainable aviation fuel (SAF) and renewable diesel. It has increased its SAF production capacity to 1.5 million metric tons per annum (MMtpa) with last year’s start-up of a Rotterdam project with a capacity of 500,000 metric tons a year. 
Neste aims to grow its production capacity for renewable fuels to 6.8 million metric tons a year by 2027. CLG, a joint venture between Chevron Corp. and Lummus Technology, provides technology

Read More »

Oil Drops as Iran Seeks Deescalation with Israel

Oil fell on signs that the conflict in the Middle East may avoid disrupting crude production, with Iran seeking to deescalate hostilities with Israel. West Texas Intermediate slid 1.7% to settle below $72 a barrel after spiking to start the session and swinging in an $8 throughout the day. US President Donald Trump said Iran wants to talk about deescalating the conflict, helping quell fears that a protracted war would engulf a region that produces around a third of the world’s crude. “The question is — will Israel really be on board with that?” said Rebecca Babin, a senior energy trader at CIBC Private Wealth Group. “What this does suggest, though, is that the chatter about the Strait of Hormuz may have been overstating the threat.” Still, oil markets remain on edge after Israel launched an attack on the South Pars gas field, forcing the halt of a production platform, following strikes on Iran’s nuclear sites and military leadership last week. However, critical crude oil-exporting infrastructure has so far been spared and there’s been no blockage of the vital Strait of Hormuz. While an attack on Iran’s gas production is a concern, the biggest fear for the oil market centers on Hormuz. Middle East producers ship about a fifth of the world’s daily output through the narrow waterway, and prices could soar further if Tehran attempts to disrupt shipments through the route. Oil prices remain significantly higher than where they were before the attacks began. Crude gained more than 7% in the session after air strikes began on Friday, leading to record volumes of producer hedging as well as futures and options changing hands. Wall Street analysts have been quick to highlight the risks the conflict could pose. RBC Capital Markets said the fact both sides have targeted energy infrastructure

Read More »

OPEC Says Output Hike Tempered by Compensation from Quota Cheats

Key OPEC+ nations added less oil to the market last month than the headline figure of its output plan, as the cartel’s leadership pushed members to atone for earlier over-production. The eight members involved in the group’s current accord raised production by 154,000 barrels a day, compared with a headline increase of 411,000 barrels a day, according to a monthly report from OPEC’s secretariat. Iraq, the United Arab Emirates and Russia were among those compensating for past excess output. However, the eight countries’ total output was almost 400,000 barrels a day above their target for the month as Kazakhstan continued to pump well above its quota. Group leader Saudi Arabia has spurred OPEC and its allies to accelerate their planned output revival in a bid to punish members that flouted their limits with lower prices, and to reclaim the market share Riyadh has ceded during years of supply curbs. That push initially weakened oil prices, but markets have been roiled in recent days as Israel launched a wave of attacks on OPEC member Iran, including some strikes on its domestic energy infrastructure. US crude futures are trading near $73 a barrel after surging on Friday by the most in three years. With Iran’s crude exports so far unaffected, OPEC Secretary General Haitham Al Ghais has said the organization doesn’t need to take any immediate action. Quota Cheats The report published by OPEC’s Vienna-based research department on Monday showed that members have taken a mixed approach in implementing their agreed production increases. Saudi Arabia largely went ahead with its mandated hike in May, raising production by 177,000 barrels a day to an average of 9.183 million per day. Kazakhstan and Iraq, who had pledged extra cutbacks as compensation for cheating, both reduced output in May. Astana cut by 21,000 barrels to 1.8

Read More »

Israel Strikes Continue amid Reports Iran Keen to De-Escalate

Israel and Iran exchanged fire for the fourth consecutive day on Monday, stoking fears of an all-out war with the potential to drag in others in the oil-rich region and force the US into a more hands-on stance. Iran fired several waves of drones and missiles over the last 24 hours, while Israel continued hitting the Islamic Republic’s capital, Tehran, and killing one more senior military official.  Since Friday, 224 people have been killed in Iran, according to the government, which said most of the casualties were civilians. Iranian attacks killed 24 people in Israel, according to the Israeli government press office, and injured 592.  Tehran is signaling it wants to deescalate hostilities with Israel and is willing to resume nuclear talks with the US as long as Washington doesn’t join the Israeli attacks, the Wall Street Journal reported Monday citing Middle Eastern and European officials it didn’t identify. A similar report by Reuters says Iran conveyed the message through Qatar, Saudi Arabia and Oman.  Oil fell on the WSJ report, with Brent futures dropping around 4 percent – they rose over 10 percent Friday. US Treasuries pared earlier drops and European bonds gained as traders reacted to diminishing concerns about inflation.  It’s not clear whether Israel would agree to stop missile strikes. Israeli officials have said they want to ensure Iran doesn’t have the capacity to build a nuclear weapon. The exchange of missile salvos between Israel and Iran is the most serious escalation after years of shadow war. Analysts fear it might push the Middle East into a regional conflict, causing wider human loss and potentially disrupting energy flows and vital trade routes.  One missile landed near the US consulate in central Tel Aviv, causing minor material damages but no injuries to personnel, the ambassador to Israel, Mike Huckabee, said Monday.

Read More »

Michigan regulators order reliability improvements for Consumers, DTE

Dive Brief: The Michigan Public Service Commission on Thursday ordered a series of reliability improvements for Consumers Energy and DTE Electric, including bolstering the utilities’ vegetation management programs and prioritizing equipment replacement programs “based on inspections and the actual condition of the equipment instead of solely on age of facilities.” Michigan regulators launched the audit in 2022 following a string of storm-related outages and downed-wires incidents that resulted in the death of a 14-year-old child. Restoration times at both utilities are “worse than average,” the audit concluded in September. Both utilities say they are reviewing the PSC’s order and taking steps to improve reliability. The two utilities serve more than 80% of Michigan residents. Dive Insight: Almost 500,000 Michigan customers lost power, some for several days, following severe storms in August 2022. And in addition to the young girl killed, two boys were critically hurt by contact with power lines downed in the storm. “After a thorough and detailed audit process, now is the time for implementation,” Commissioner Katherine Peretick said in a statement. “The recommendations provided by the third-party experts will now be embedded into decisions going forward, including distribution plans and rate cases for cost recovery.” DTE and Consumers serve more than 4 million customers combined. The audit was performed by The Liberty Consulting Group. The commission directed both utilities to file a report on downed-wires policies and resources by Aug. 29. “The steps the Commission is outlining today build on what we’ve learned from the audit and a series of initiatives going back nearly a decade and take concrete actions to continue addressing reliability issues that have frustrated customers,” PSC Chair Dan Scripps said.  
For both utilities, the PSC ordered they focus on: Expansion of resources to address downed wires, and for the utilities to file information regarding personnel, protocols,

Read More »

Will ERCOT’s streamlined connect-and-manage approach work for other markets?

Will ERCOT’s streamlined connect-and-manage approach work for other markets? | Utility Dive Skip to main content An article from Deep Dive The Southwest Power Pool and other regions see flexible integrated system planning as a more reliable way to clear interconnection queues. Published June 16, 2025 The Electric Reliability Council of Texas uses a unique connect-and-manage interconnection process that is adding new generation faster than any other U.S. system, data shows. Wikipedia. (2024). “2024 RTO Map” [jpg]. Retrieved from WikiCommons. The need to speed interconnection of new generation continues to grow more urgent, but the Electric Reliability Council of Texas may have the solution, some analysts say. ERCOT uses a unique connect-and-manage, or C&M, interconnection process that is adding new generation faster than any other U.S. system, data shows. C&M may meet the current Texas spiking electricity demand but other markets need more price and reliability certainty from streamlined interconnection processes, stakeholders outside Texas said. ERCOT’s C&M approach “allows generators to enter the market as energy-only resources because it manages them as part of its proactive transmission planning process,” Duke University Nicholas School of the Environment Fellow Tyler Norris said. But those generators “face the financial risk of curtailments,” he added. Unlike ERCOT, most system operators’ reliability requirements and allocation of upgrade costs force detailed and redundant studies of generators seeking interconnection that result in backlogged queues, studies show.   “Most regions do studies to ensure generation is deliverable when and where it’s needed, which slows interconnection approvals,” said Carrie Bivens, vice president of Southwest Power Pool’s market monitoring unit and former director of ERCOT market monitor Potomac Economics. Dropouts from generator applicant clusters “force redundant studies and slows the process more,” she added. 
System operators are finding ways to streamline interconnection processes, analysts and regulators agree. But more flexible interconnection approaches could

Read More »

Cisco capitalizes on Isovalent buy, unveils new load balancer

The customer deploys the Isovalent Load Balancer control plane via automation and configures the desired number of virtual load-balancer appliances, Graf said. “The control plane automatically deploys virtual load-balancing appliances via the virtualization or Kubernetes platform. The load-balancing layer is self-healing and supports auto-scaling, which means that I can replace unhealthy instances and scale out as needed. The load balancer supports powerful L3-L7 load balancing with enterprise capabilities,” he said. Depending on the infrastructure the load balancer is deployed into, the operator will deploy the load balancer using familiar deployment methods. In a data center, this will be done using a standard virtualization automation installation such as Terraform or Ansible. In the public cloud, the load balancer is deployed as a public cloud service. In Kubernetes and OpenShift, the load balancer is deployed as a Kubernetes Deployment/Operator, Graf said.  “In the future, the Isovalent Load Balancer will also be able to run on top of Cisco Nexus smart switches,” Graf said. “This means that the Isovalent Load Balancer can run in any environment, from data center, public cloud, to Kubernetes while providing a consistent load-balancing layer with a frictionless cloud-native developer experience.” Cisco has announced a variety of smart switches over the past couple of months on the vendor’s 4.8T capacity Silicon One chip. But the N9300, where Isovalent would run, includes a built-in programmable data processing unit (DPU) from AMD to offload complex data processing work and free up the switches for AI and large workload processing. For customers, the Isovalent Load Balancer provides consistent load balancing across infrastructure while being aligned with Kubernetes as the future for infrastructure. “A single load-balancing solution that can run in the data center, in public cloud, and modern Kubernetes environments. 
This removes operational complexity, lowers cost, while modernizing the load-balancing infrastructure in preparation

Read More »

Oracle’s struggle with capacity meant they made the difficult but responsible decisions

IDC President Crawford Del Prete agreed, and said that Oracle senior management made the right move, despite how difficult the situation is today. “Oracle is being incredibly responsible here. They don’t want to have a lot of idle capacity. That capacity does have a shelf life,” Del Prete said. CEO Katz “is trying to be extremely precise about how much capacity she puts on.” Del Prete said that, for the moment, Oracle’s capacity situation is unique to the company, and has not been a factor with key rivals AWS, Microsoft, and Google. During the investor call, Katz said that her team “made engineering decisions that were much different from the other hyperscalers and that were better suited to the needs of enterprise customers, resulting in lower costs to them and giving them deployment flexibility.” Oracle management certainly anticipated a flurry of orders, but Katz said that she chose to not pay for expanded capacity until she saw finalized “contracted noncancelable bookings.” She pointed to a huge capex line of $9.1 billion and said, “the vast majority of our capex investments are for revenue generating equipment that is going into data centers and not for land or buildings.”

Read More »

Winners and losers in the Top500 supercomputer ranking

GPU winner: AMD AMD is finally making a showing for itself, albeit modestly, in GPU accelerators. For the June 2025 edition of the list, AMD Instinct accelerators are in 23 systems, a nice little jump from the 10 systems on the June 2024 list. Of course, it helps with the sales pitch when AMD processors and coprocessors can be found powering the No. 1 and No. 2 supercomputers in the world. GPU loser: Intel Intel’s GPU efforts have been a disaster. It failed to make a dent in the consumer space with its Arc GPUs, and it isn’t making much headway in the data center, either. There were only four systems running GPU Max processors on the list, and that’s up from three a year ago. Still, it’s pitiful showing given the effort Intel made. Server winners: HPE, Dell, EVIDAN, Nvidia The four server vendors — servers, not component makers — all saw share increases. Nvidia is also a server vendor, selling its SuperPOD AI servers directly to customers. They all gained at the expense of Lenovo and Arm. Server loser: Lenovo It saw the sharpest drop in server share, going from 163 systems in June of 2024 to 136 in this most recent listing. Loser: Arm Other than the 13 Nvidia Grace chips, the ARM architecture was completely absent from this spring’s list.

Read More »

Micron joins HBM4 race with 36GB 12-high stack, eyes AI and data center dominance

Race to power the next generation of AI By shipping samples of the HMB4 to the key customers, Micron has joined SK hynix in the HBM4 race. In March this year, SK hynix shipped the 12-Layer HBM4 samples to customers. SK hynix’s HBM4 has implemented bandwidth capable of processing more than 2TB of data per second, processing data equivalent to more than 400 full-HD movies (5GB each) in a second, said the company. “HBM competitive landscape, SK hynix has already sampled and secured approval of HBM4 12-high stack memory early Q1’2025 to NVIDIA for its next generation Rubin product line and plans to mass produce HBM4 in 2H 2025,” said Danish Faruqui, CEO, Fab Economics. “Closely following, Micron is pending Nvidia’s tests for its latest HBM4 samples, and Micron plans to mass produce HBM4 in 1H 2026. On the other hand, the last contender, Samsung is struggling with Yield Ramp on HBM4 Technology Development stage, and so has to delay the customer samples milestones to Nvidia and other players while it earlier shared an end of 2025 milestone for mass producing HBM4.” Faruqui noted another key differentiator among SK hynix, Micron, and Samsung: the base die that anchors the 12-high DRAM stack. For the first time, both SK hynix and Samsung have introduced a logic-enabled base die on 3nm and 4nm process technology to enable HBM4 product for efficient and faster product performance via base logic-driven memory management. Both Samsung and SK hynix rely on TSMC for the production of their logic-enabled base die. However, it remains unclear whether Micron is using a logic base die, as the company lacks in-house capability to fabricate at 3nm.

Read More »

Cisco reinvigorates data center, campus, branch networking with AI demands in mind

“We have a number of … enterprise data center customers that have been using bi-directional optics for many generations, and this is the next generation of that feature,” said Bill Gartner, senior vice president and general manager of Cisco’s optical systems and optics business. “The 400G lets customer use their existing fiber infrastructure and reduces fiber count for them so they can use one fiber instead of two, for example,” Gartner said. “What’s really changed in the last year or so is that with AI buildouts, there’s much, much more optics that are part of 400G and 800G, too. For AI infrastructure, the 400G and 800G optics are really the dominant optics going forward,” Gartner said. New AI Pods Taking aim at next-generation interconnected compute infrastructures, Cisco expanded its AI Pod offering with the Nvidia RTX 6000 Pro and Cisco UCS C845A M8 server package. Cisco AI Pods are preconfigured, validated, and optimized infrastructure packages that customers can plug into their data center or edge environments as needed. The Pods include Nvidia AI Enterprise, which features pretrained models and development tools for production-ready AI, and are managed through Cisco Intersight. The Pods are based on Cisco Validated Design principals, which offer customers pre-tested and validated network designs that provide a blueprint for building reliable, scalable, and secure network infrastructures, according to Cisco. Building out the kind of full-scale AI infrastructure compute systems that hyperscalers and enterprises will utilize is a huge opportunity for Cisco, said Daniel Newman, CEO of The Futurum Group. “These are full-scale, full-stack systems that could land in a variety of enterprise and enterprise service application scenarios, which will be a big story for Cisco,” Newman said. Campus networking For the campus, Cisco has added two new programable SiliconOne-based Smart Switches: the C9350 Fixed Access Smart Switches and C9610

Read More »

Qualcomm’s $2.4B Alphawave deal signals bold data center ambitions

Qualcomm says its Oryon CPU and Hexagon NPU processors are “well positioned” to meet growing demand for high-performance, low-power compute as AI inferencing accelerates and more enterprises move to custom CPUs housed in data centers. “Qualcomm’s advanced custom processors are a natural fit for data center workloads,” Qualcomm president and CEO Cristiano Amon said in the press release. Alphawave’s connectivity and compute technologies can work well with the company’s CPU and NPU cores, he noted. The deal is expected to close in the first quarter of 2026. Complementing the ‘great CPU architecture’ Qualcomm has been amassing Client CPUs have been a “big play” for Qualcomm, Moor’s Kimball noted; the company acquired chip design company Nuvia in 2021 for $1.4 billion and has also announced that it will be designing data center CPUs with Saudi AI company Humain. “But there was a lot of data center IP that was equally valuable,” he said. This acquisition of Alphawave will help Qualcomm complement the “great CPU architecture” it acquired from Nuvia with the latest in connectivity tools that link a compute complex with other devices, as well as with chip-to-chip communications, and all of the “very low level architectural goodness” that allows compute cores to deliver “absolute best performance.” “When trying to move data from, say, high bandwidth memory to the CPU, Alphawave provides the IP that helps chip companies like Qualcomm,” Kimball explained. “So you can see why this is such a good complement.”

Read More »

Microsoft will invest $80B in AI data centers in fiscal 2025

And Microsoft isn’t the only one that is ramping up its investments into AI-enabled data centers. Rival cloud service providers are all investing in either upgrading or opening new data centers to capture a larger chunk of business from developers and users of large language models (LLMs).  In a report published in October 2024, Bloomberg Intelligence estimated that demand for generative AI would push Microsoft, AWS, Google, Oracle, Meta, and Apple would between them devote $200 billion to capex in 2025, up from $110 billion in 2023. Microsoft is one of the biggest spenders, followed closely by Google and AWS, Bloomberg Intelligence said. Its estimate of Microsoft’s capital spending on AI, at $62.4 billion for calendar 2025, is lower than Smith’s claim that the company will invest $80 billion in the fiscal year to June 30, 2025. Both figures, though, are way higher than Microsoft’s 2020 capital expenditure of “just” $17.6 billion. The majority of the increased spending is tied to cloud services and the expansion of AI infrastructure needed to provide compute capacity for OpenAI workloads. Separately, last October Amazon CEO Andy Jassy said his company planned total capex spend of $75 billion in 2024 and even more in 2025, with much of it going to AWS, its cloud computing division.

Read More »

John Deere unveils more autonomous farm machines to address skilled labor shortage

Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More

Self-driving tractors might be the path to self-driving cars. John Deere has revealed a new line of autonomous machines and tech across agriculture, construction and commercial landscaping. The Moline, Illinois-based company has been in business for 187 years, yet it has been a regular presence as a non-tech company showing off technology at the big tech trade show in Las Vegas, and it is back at CES 2025 with more autonomous tractors and other vehicles. This is not something we usually cover, but John Deere has a lot of data that is interesting in the big picture of tech. The message from the company is that there aren’t enough skilled farm laborers to do the work its customers need. That has been a challenge for most of the last two decades, said Jahmy Hindman, CTO at John Deere, in a briefing. Much of the tech will come this fall and after that. He noted that the average farmer in the U.S. is over 58 and works 12 to 18 hours a day to grow food for us. And he said the American Farm Bureau Federation estimates there are roughly 2.4 million farm jobs that need to be filled annually, and the agricultural workforce continues to shrink. (This is my hint to the anti-immigration crowd.)

John Deere’s autonomous 9RX Tractor. Farmers can oversee it using an app.

While each of these industries experiences its own set of challenges, a commonality across all is skilled labor availability. In construction, about 80% of contractors struggle to find skilled labor. And in commercial landscaping, 86% of landscaping business owners can’t find labor to fill open positions, he said. “They have to figure out how to do

Read More »

2025 playbook for enterprise AI success, from agents to evals

2025 is poised to be a pivotal year for enterprise AI. The past year saw rapid innovation, and this year will bring the same, making it more critical than ever to revisit your AI strategy to stay competitive and create value for your customers. From scaling AI agents to optimizing costs, here are the five critical areas enterprises should prioritize in their AI strategy this year.

1. Agents: the next generation of automation

AI agents are no longer theoretical. In 2025, they’re indispensable tools for enterprises looking to streamline operations and enhance customer interactions. Unlike traditional software, agents powered by large language models (LLMs) can make nuanced decisions, navigate complex multi-step tasks and integrate seamlessly with tools and APIs. At the start of 2024, agents were not ready for prime time, making frustrating mistakes like hallucinating URLs. They started getting better as the frontier large language models themselves improved. “Let me put it this way,” said Sam Witteveen, cofounder of Red Dragon, a company that develops agents for companies and recently reviewed the 48 agents it built last year. “Interestingly, the ones that we built at the start of the year, a lot of those worked way better at the end of the year just because the models got better.” Witteveen shared this in the video podcast we filmed to discuss these five big trends in detail. Models are getting better and hallucinating less, and they’re also being trained to do agentic tasks. Another feature that model providers are researching is a way to use the LLM as a judge, and as models get cheaper (something we’ll cover below), companies can use three or more models to
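The “LLM as a judge” pattern mentioned above can be sketched in a few lines: several candidate models answer a question, and a separate judge model scores each answer. This is a minimal illustrative sketch, not any vendor’s actual API; `call_model` is a hypothetical stand-in for whatever chat-completion call an enterprise actually uses.

```python
# Minimal sketch of the LLM-as-judge pattern. `call_model` is a
# hypothetical callable standing in for a real chat-model API call.
from typing import Callable


def judge_answers(
    question: str,
    answers: dict[str, str],
    call_model: Callable[[str], str],
) -> str:
    """Ask a judge model to score each candidate answer from 1 to 10
    and return the name of the highest-scoring model."""
    scores: dict[str, float] = {}
    for model_name, answer in answers.items():
        prompt = (
            f"Question: {question}\n"
            f"Answer: {answer}\n"
            "Rate this answer's correctness from 1 to 10. "
            "Reply with only the number."
        )
        reply = call_model(prompt)
        try:
            scores[model_name] = float(reply.strip())
        except ValueError:
            scores[model_name] = 0.0  # an unparseable judgment scores zero
    return max(scores, key=scores.get)


# Usage with a stub judge (a real deployment would call an actual model);
# this stub simply rates longer prompts higher, purely for illustration.
stub_judge = lambda prompt: str(min(10, len(prompt) // 20))
best = judge_answers(
    "What is 2+2?",
    {"model_a": "4", "model_b": "The answer is 4 because 2+2=4."},
    stub_judge,
)
```

Because the judge is just another model call, the same loop works whether the candidates come from one provider or three, which is what makes the pattern attractive as model prices fall.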

Read More »

OpenAI’s red teaming innovations define new essentials for security leaders in the AI era

OpenAI has taken a more aggressive approach to red teaming than its AI competitors, demonstrating its security teams’ advanced capabilities in two areas: multi-step reinforcement learning and external red teaming. OpenAI recently released two papers that set a new competitive standard for improving the quality, reliability and safety of AI models in these two techniques and more. The first paper, “OpenAI’s Approach to External Red Teaming for AI Models and Systems,” reports that specialized teams outside the company have proven effective in uncovering vulnerabilities that might otherwise have made it into a released model, because in-house testing techniques may have missed them. In the second paper, “Diverse and Effective Red Teaming with Auto-Generated Rewards and Multi-Step Reinforcement Learning,” OpenAI introduces an automated framework that relies on iterative reinforcement learning to generate a broad spectrum of novel, wide-ranging attacks.

Going all-in on red teaming pays practical, competitive dividends

It’s encouraging to see competitive intensity in red teaming growing among AI companies. When Anthropic released its AI red team guidelines in June of last year, it joined AI providers including Google, Microsoft, Nvidia, OpenAI and even the U.S. National Institute of Standards and Technology (NIST), all of which had released red teaming frameworks. Investing heavily in red teaming yields tangible benefits for security leaders in any organization. OpenAI’s paper on external red teaming provides a detailed analysis of how the company strives to create specialized external teams that include cybersecurity and subject matter experts. The goal is to see whether knowledgeable external teams can defeat models’ security perimeters and find gaps in their security, biases and controls that prompt-based testing couldn’t find.
What makes OpenAI’s recent papers noteworthy is how well they define using human-in-the-middle
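To make the “auto-generated rewards” idea concrete, here is a minimal illustrative sketch, emphatically not OpenAI’s actual framework: each candidate attack prompt is rewarded both for succeeding and for differing from attacks already found, which is what pushes an automated generator toward a diverse set of novel attacks rather than many near-duplicates. The word-overlap diversity measure below is an assumption chosen for simplicity.

```python
# Illustrative reward for automated red teaming: success is gated, and a
# simple word-overlap diversity term (an assumed stand-in for the paper's
# learned measures) rewards attacks unlike those already discovered.
def attack_reward(candidate: str, found: list[str], succeeded: bool) -> float:
    success = 1.0 if succeeded else 0.0
    words = set(candidate.lower().split())
    if not found or not words:
        diversity = 1.0  # first attack found is maximally novel by definition
    else:
        # Highest fraction of this candidate's words shared with any prior attack
        overlaps = [len(words & set(f.lower().split())) / len(words) for f in found]
        diversity = 1.0 - max(overlaps)
    # A failed attack earns nothing; a successful one earns more when novel
    return success * (0.5 + 0.5 * diversity)
```

Under this shaping, a successful attack that merely rephrases a known one earns less than a successful attack with new wording, so a reinforcement-learning loop maximizing this reward is nudged to keep exploring new attack styles.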

Read More »