Move over, Alexa: Amazon launches new realtime voice model Nova Sonic for third-party enterprise development

Amazon is best known as an e-commerce giant. Somewhat further down its list of notable offerings is the Alexa AI voice assistant, which got a big intelligence upgrade last month thanks in part to Anthropic, a company Amazon has invested in.

Now Alexa will have to make space for a new Amazon voice AI sibling: today the company is introducing Amazon Nova Sonic, a new foundation model designed to let third-party app developers build realtime, naturalistic, conversational voice interactivity into their products using Amazon’s web platform, Bedrock.

It’s available now via a bi-directional streaming application programming interface (API).
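To make "bi-directional streaming" concrete, here is a minimal sketch of the pattern: one task keeps sending audio chunks while another receives responses at the same time. It is not the Bedrock API. The names `fake_model`, `send_audio`, `receive_events` and `converse` are hypothetical, and an in-memory queue pair stands in for the network connection.

```python
import asyncio


async def fake_model(inbound: asyncio.Queue, outbound: asyncio.Queue) -> None:
    """Hypothetical stand-in for a voice model: answers each chunk as it arrives."""
    while True:
        chunk = await inbound.get()
        if chunk is None:  # None marks the end of the input stream
            await outbound.put(None)
            return
        await outbound.put(f"heard:{chunk}")


async def send_audio(chunks: list, inbound: asyncio.Queue) -> None:
    """Push audio chunks upstream without waiting for any reply."""
    for chunk in chunks:
        await inbound.put(chunk)
        await asyncio.sleep(0)  # yield so the receiver can run in between
    await inbound.put(None)


async def receive_events(outbound: asyncio.Queue) -> list:
    """Collect downstream events until the model signals the end."""
    events = []
    while True:
        event = await outbound.get()
        if event is None:
            return events
        events.append(event)


async def converse(chunks: list) -> list:
    inbound: asyncio.Queue = asyncio.Queue()
    outbound: asyncio.Queue = asyncio.Queue()
    model = asyncio.create_task(fake_model(inbound, outbound))
    # Sending and receiving run concurrently; that is the "bi-directional" part.
    _, events = await asyncio.gather(
        send_audio(chunks, inbound), receive_events(outbound)
    )
    await model
    return events


if __name__ == "__main__":
    print(asyncio.run(converse(["chunk-1", "chunk-2"])))
```

Because neither direction blocks the other, a real client built this way could keep streaming microphone input while a response is still being played back.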

Obvious use cases include customer support and service, guidance, information retrieval, and entertainment.

A unified approach

Nova Sonic addresses a key challenge in voice AI: the fragmentation of technologies.

Traditionally, building voice interfaces required combining separate models for speech recognition, language processing, and speech synthesis, according to Rohit Prasad, SVP and Head Scientist for Artificial General Intelligence (AGI) at Amazon, in a video call interview with VentureBeat yesterday using Amazon’s Chime video service.

This complexity often results in robotic, unnatural interactions and increased development overhead.

Now, Sonic seeks to improve on this state of affairs by combining all three distinct model types into one.

Prasad explained the model’s core innovation: “Nova Sonic brings together three traditionally separate models—speech-to-text, text understanding, and text-to-speech—into one unified system that can model not just the ‘what’ but also the ‘how’ of communication.”

By retaining the acoustic context—such as tone, cadence, and style—Nova Sonic helps maintain the nuances of human conversation.

Recognizing the intricacies and quirks of live, two-way audio conversations

One of Nova Sonic’s defining capabilities is its ability to handle live, two-way conversations. It recognizes when users pause, hesitate, or interrupt—common behaviors in human speech—and responds fluidly while maintaining context.

“The real breakthrough here is real-time, interactive, low-latency voice interaction, which means you can interrupt the AI mid-sentence, and it will still maintain context and respond coherently,” said Prasad. This feature is especially relevant in scenarios like customer service, where responsiveness and adaptability are critical.

Nova Sonic is also designed to integrate seamlessly with other systems. It automatically generates transcripts of spoken input, which can be used to trigger APIs or interact with proprietary tools. This allows companies to build AI agents that can perform tasks such as booking appointments, retrieving live information, or answering complex customer inquiries.

“You can use Nova Sonic through Amazon Bedrock and connect it with any tools or proprietary data sources, even visual ones, as long as they’re wrapped as callable APIs,” said Prasad. This flexibility makes the model suitable for a wide range of industries, from education and travel to enterprise operations and entertainment.
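As a rough illustration of what "wrapped as callable APIs" can look like on the developer's side, the sketch below registers ordinary functions under tool names and dispatches a structured request to them. The request shape (`name` plus `arguments`), the `tool` decorator and the `book_appointment` function are assumptions made for this example, not the Bedrock event format.

```python
from typing import Any, Callable, Dict

# Registry mapping tool names to plain Python callables (hypothetical design).
TOOLS: Dict[str, Callable[..., Any]] = {}


def tool(name: str) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
    """Decorator that exposes a function under a tool name."""
    def register(fn: Callable[..., Any]) -> Callable[..., Any]:
        TOOLS[name] = fn
        return fn
    return register


@tool("book_appointment")
def book_appointment(day: str, time: str) -> str:
    # A real implementation would call a scheduling system here.
    return f"Booked for {day} at {time}"


def dispatch(request: Dict[str, Any]) -> Any:
    """Run the tool named in the request with its keyword arguments."""
    name = request["name"]
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](**request.get("arguments", {}))


if __name__ == "__main__":
    request = {
        "name": "book_appointment",
        "arguments": {"day": "Tuesday", "time": "10:00"},
    }
    print(dispatch(request))
```

Keeping tools behind a single registry means new capabilities can be added without changing the dispatch logic.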

Benchmark performance and industry comparisons

Nova Sonic has been benchmarked against other real-time voice models, including OpenAI’s GPT-4o and Google’s Gemini Flash 2.0. On the Common Eval data set, it achieved a 69.7% win-rate over Gemini Flash 2.0 and a 51.0% win-rate over GPT-4o for American English single-turn conversations using a masculine voice. Similar gains were seen with feminine and British English voices.

Prasad emphasized Nova Sonic’s strong performance in its primary language markets: “Nova Sonic is currently best-in-class in U.S. and British English, outperforming even GPT-4o real-time in both conversational naturalness and accuracy.” He added, “To the best of our knowledge, only two other models—GPT-4o real-time and a variant of GPT-4o mini—come close to what Nova Sonic does in combining speech understanding and generation in real time. This space is still very early and very hard.”

Multilingual capabilities and noisy environment handling

In speech recognition, Nova Sonic also excels in multilingual and real-world conditions. It recorded a word error rate (WER) of 4.2% on the Multilingual LibriSpeech benchmark, outperforming GPT-4o Transcribe by over 36% across English, French, German, Italian, and Spanish. In noisy, multi-speaker environments (measured using the AMI benchmark), Nova Sonic showed a 46.7% improvement in WER over GPT-4o Transcribe.
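For readers unfamiliar with the metric, word error rate is conventionally the word-level edit distance between a reference transcript and the model's transcript, divided by the number of reference words. The sketch below implements that standard definition. The article does not describe Amazon's evaluation setup, so the function names and sample sentences are illustrative only, and reading the quoted percentages as relative reductions is an assumption.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the processed part of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra word in the hypothesis
                prev[j - 1] + cost,  # substitution, or a match when cost is 0
            ))
        prev = curr
    return prev[-1] / len(ref)


def relative_improvement(baseline_wer: float, new_wer: float) -> float:
    """Fraction by which new_wer is lower than baseline_wer."""
    return (baseline_wer - new_wer) / baseline_wer


if __name__ == "__main__":
    # Made-up sentences: one dropped word out of six reference words.
    print(wer("the cat sat on the mat", "the cat sat on mat"))
```

Under this definition, lower is better, and a relative improvement compares two WER values rather than subtracting percentage points.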

Expressive voices and language expansion

Currently, the model supports multiple expressive voices, both masculine and feminine, in American and British English. Amazon noted that additional accents and languages are in development and will be released in future updates.

Low latency and enterprise-friendly cost

Speed and cost are also part of the appeal. Third-party benchmarking shows Nova Sonic delivers a customer-perceived latency of 1.09 seconds, compared to 1.18 seconds for OpenAI’s GPT-4o and 1.41 seconds for Google’s Gemini Flash 2.0.

From a pricing standpoint, Amazon positions Nova Sonic as an enterprise-ready solution. “We’re nearly 80% cheaper than GPT-4o real-time, and that superior price-performance is resonating with enterprises moving from experimentation to deployment,” said Prasad.

Early adoption across sectors

According to Amazon, companies across different sectors have already begun using or testing Nova Sonic.

ASAPP is applying the technology to optimize contact center workflows, praising its accuracy and natural dialog handling.

Education First (EF) uses the model to support language learners with real-time pronunciation feedback, especially for non-native speakers with varied accents.

Sports data provider Stats Perform is leveraging Nova Sonic’s low latency and simple setup to power rapid, data-rich interactions in its Opta AI Chat platform.

Responsible AI and safety commitment

Alongside performance and cost, Amazon is highlighting its commitment to responsible AI development. The Nova family of models includes built-in safeguards and is supported by AWS AI Service Cards that outline intended use cases, potential limitations, and ethical guidelines.

Prasad underscored Amazon’s focus on trust and safety: “Trust is paramount for us—developers can customize personality within limits, but we’ve put in strong guardrails to prevent voice cloning or unwanted mimicry.” He added, “We work extremely hard to eliminate hallucinations and voice drift. The bar we’ve set for release is high because speech generation must be trustworthy.”

Amazon Nova Sonic is now generally available through Amazon Bedrock. Developers and enterprises interested in exploring the model can get started by visiting https://aws.amazon.com/nova/.

Cato Networks augments CASB with genAI security

CASBs sit between an end user and a cloud service to enforce security policies, protect data, and ensure compliance. CASBs provide enterprise network and security teams with information on how end users are accessing and using cloud resources such as data, applications, and services. They provide visibility into cloud usage,

8 unusual Linux commands

3. The column command The column command will display text in columns. Here are two examples of how to use it: $ cat staff | columnJohn Doe Lisa Stone Joanne Zahn Eric Docker Ben MatsonMary Berry Elaine Henry David Bloom Sam Adams Sally Rose$ cat staff | column -tJohn DoeMary

AI agents vs. agentic AI: What do enterprises want?

The cloud-provider technical people I know don’t like this approach; they see it as likely to raise barriers to the use of their online generative AI services. Enterprises see their AI agent vision as facilitating cloud AI services instead. If there’s one massive AI entity doing everything, then data sovereignty

Why digital transformation starts with an intelligent network infrastructure

Heightened end-user expectations and a greater emphasis on digitalization are rewiring the business landscape again. Businesses looking to sharpen their competitive edge and stay innovative in this climate must tap into solutions such as artificial intelligence (AI), machine learning (ML) and hybrid multicloud. Yet, this new digital landscape has seen

Lower gas reserves expected this winter as UK’s largest storage facility halts

Centrica has ceased injecting natural gas into the UK’s largest energy storage facility, located in the North Sea, which is likely to mean lower gas reserves this winter. The company is believed to have stopped refilling the Rough gas storage facility off the Yorkshire coast this month, which comprises about half of the UK’s energy storage capacity. Centrica warned in December that the Rough facility was making a loss of between £50m and £100m for the Centrica Energy Storage+ business division. The company has indicated that the storage facility, which was reopened in 2022 due to the energy crisis to plug demand, was not financially viable in prevailing market conditions. British Gas owner Centrica has said that it needs a cap-and-floor mechanism to redevelop the facility with £2 billion of its own cash so that it can store hydrogen. The company has broached talks with government over the future operation of the plant and met with Ed Miliband in March to discuss options for keeping the plant open. A spokesperson for the Department of Energy Security and Net Zero (DESNZ) said the government is “open to discussing proposals on gas storage sites, as long as it provides value for money for taxpayers”. Clean power mission boss Chris Stark said at a parliamentary hearing earlier this year in January that the government was considering a regulatory mechanism to support hydrogen storage from around 2030. Unabated gas is envisaged to comprise up to 5% of the UK’s energy demand by 2030 under a system operator study on the clean power mission. Its group chief executive Chris O’Shea said on a webinar with analysts at the release of its annual results in February that the company was considering all options for Rough and had not made a decision around its continuation. The impetus

US Inventory Drop, OPEC Action Lift Oil Prices

Oil rose on the prospect of a de-escalation in the trade war between the world’s two largest economies and a stall in nuclear talks between the US and Iran. West Texas Intermediate futures added 1.9% to settle near $62.50 a barrel, the third gain in the four past sessions, after China signaled openness to trade negotiations with the Trump administration. Pre-conditions for the talks would include a more consistent US position and a willingness to address China’s concerns around American sanctions and Taiwan, according to a person familiar with the Chinese government’s thinking. Elsewhere, Iran said it won’t be drawn into negotiations with the US over its ability to enrich uranium, reducing the potential of looser restrictions on Iranian crude. The US also sanctioned another China-based independent “teapot” refinery for its role in purchasing Tehran’s crude, and Treasury Secretary Scott Bessent said the US would ramp up pressure on Iran. Crude has recovered from a sharp drop to near the lowest in four years brought about by an onslaught of tariffs and counter-levies between the US and its biggest trading partners. Washington on Tuesday started a probe into the need for import taxes on critical minerals, while trade differences with the European Union persist as White House officials said the bulk of the US tariffs imposed on the bloc won’t be removed. Meanwhile, Iraq plans to cut its oil exports this month as it faces growing pressure to adhere to its OPEC+ production target. The country aims to reduce shipments by 70,000 barrels a day, an official with knowledge of the matter said. In another support for prices, US government data released Wednesday showed inventory levels at Cushing, Oklahoma — the delivery point for West Texas Intermediate — fell by roughly 650,000 barrels to the lowest since 2008 for this

Keystone Restarts Oil Pipeline After Leak Prompted Shutdown

The operator of the Keystone oil pipeline brought the conduit back into service, putting an end to a week-long outage caused by an estimated 3,500-barrel spill in rural North Dakota. Most of the oil released has been recovered and remediation efforts have started, South Bow Corp. said in a statement Wednesday. The line will be able to operate at no more of 80% of pressure levels at the time of the April 8 spill. At the time of failure, the line was transporting 17,844 barrels per hour, or the equivalent of 428,000 barrels a day. The restart, delayed by inclement weather, comes roughly two days after it met all conditions imposed by the Pipeline and Hazardous Materials Safety Administration. South Bow will continue to monitor the system as an investigation into the causes of the spill continues, the company said. Keystone can transport as much as 620,000 barrels of Canadian crude daily to US Midwest and Gulf Coast markets.

Renewable PPA prices shrug off the tariff roller coaster — at least for now

Dive Brief: Solar power purchase agreement prices remain essentially unchanged since the end of 2024, while wind PPA prices declined slightly in spite of uncertain and even adverse policy actions coming out of the Trump administration, according to data from LevelTen Energy’s PPA marketplace. The average North American solar PPA went for $57.04 per MWh in the first quarter of 2025, up 28 cents from the end of 2024 and 9.8% since this time last year. Wind PPA prices dropped more than 5% during the first quarter, but remain 4.4% higher than last year, according to LevelTen Energy. Although an ample supply of solar projects should put downward pressure on solar prices, developers may be reluctant to tighten their margins in the face of policy uncertainty, said Zach Starsia, senior director of the energy marketplace at LevelTen. Data from the next few months could clarify which direction prices are headed, Starsia said. Dive Insight: PPA prices have remained relatively static despite — or perhaps because of — the policy turmoil in recent months, Starsia said. It’s not just the Trump administration’s on-again, off-again tariffs that stand to increase costs for solar developers, he said. Renewable energy developers, who rely heavily on the U.S. Army Corps of Engineers during federal permitting processes, have also been impacted by the Department of Government Efficiency’s cost-cutting. And talk of revamping or repealing the Inflation Reduction Act — while still seen as unlikely — could hit developers hard, Starsia said. With a glut of solar projects set to come online in many U.S. markets, long-term analyses suggest that PPA prices should decline. But uncertainty about the future of trade and energy policy in the U.S. seems to have prompted most developers to hedge their bets by maintaining their asking prices — at least for now.

Humber carbon emitter wants government signal on Viking CCS

Power company VPI has called for clarity to progress the Viking carbon capture and storage (CCS) project and help drive the future of heavy industries in the Humber. VPI requested a signal from the UK government in its upcoming comprehensive spending review that it will be selected as an anchor emitter for the CCS project. The group owns the nearly 1.3GW Immingham thermal power plant, which provides power to the Humber’s two large oil refineries. VPI is planning to deploy a £1.5 billion carbon capture proposal, which will utilise Harbour Energy’s Viking CCS pipeline to transport carbon that will be buried in a depleted gas field in the North Sea. VPI chief executive Jorge Pikunic said: “Carbon capture and storage provides a once-in-a-generation opportunity to turn the Humber into a powerhouse of the future. If missed, it may not come again. “For the last five years, public officials have worked tirelessly with industry to set in motion the development of Viking CCS, a unique carbon capture and storage network, here in the Humber. “Proceeding with the next stage of Viking CCS now will demonstrate how a strategic, mission-driven government can successfully transition an industrial hub into a future powerhouse, in a prudent, value-for money driven, just and meaningful way.” Viking CCS The Viking CCS pipeline will transport CO₂ captured from the industrial cluster at Immingham out to the Viking reservoirs via the Theddlethorpe gas terminal and an existing 75-mile (120km) pipeline as part of the Lincolnshire offshore gas gathering system (LOGGS). The project forms part of the UK’s track 2 CCS projects along with Scotland’s Acorn CCS project. While the UK government has backed the track 1 projects with around £22 billion of government funding, the track 2 proposal have not received similar pledges of support. Business leaders have warned

APA Corp Makes Leadership Changes, Names Ben Rodgers as CFO

Oil and natural gas exploration and production company APA Corporation has recently made changes to its executive leadership team. The company said in a media release that Ben Rodgers has been named executive vice president (EVP) and chief financial officer (CFO), effective May 12, 2025. Furthermore, Steve Riley will continue in his role as president, while Shad Frazier joined the company as senior vice president, U.S. Onshore Operations. Additionally, Donald Martin will join the company as vice president, Decommissioning, effective May 26, 2025, APA said. In this role as EVP and CFO, Rodgers will oversee all financial activities and departments, including Accounting, Audit, Investor Relations, Planning, Tax, and Treasury. He joined APA in 2018 and previously served as SVP, Finance, and Treasurer. He also served as CFO of Altus Midstream and later as a director on the board of Kinetik Holdings Inc., APA said. He currently serves on the board of Khalda Petroleum Company, a joint venture between APA subsidiary Apache Corporation and Egypt Petroleum Company. In his position, Riney will continue overseeing asset development and operations. Both Frazier and Martin have been added to Riney’s team to help oversee operations. APA highlighted that Frazier has nearly 30 years of industry experience, most recently as vice president, Production Operations at Endeavor Energy Resources, LP. Previously, he held various leadership positions at Legacy Reserves and SandRidge Energy. Martin brings 20 years of operations and decommissioning portfolio experience, most recently as the head of decommissioning and projects at Spirit Energy. He has also managed decommissioning at Canadian Natural Resources, APA said. “I am pleased to welcome Ben to our executive leadership team. He has done a tremendous job and will bring valuable expertise to our financial operations”, John J. Christmann, APA Corporation CEO, said. “I am also excited to welcome both Shad and Donald

Intel sells off majority stake in its FPGA business

Altera will continue offering field-programmable gate array (FPGA) products across a wide range of use cases, including automotive, communications, data centers, embedded systems, industrial, and aerospace. “People were a bit surprised at Intel’s sale of the majority stake in Altera, but they shouldn’t have been. Lip-Bu indicated that shoring up Intel’s balance sheet was important,” said Jim McGregor, chief analyst with Tirias Research. The Altera has been in the works for a while and is a relic of past mistakes by Intel to try to acquire its way into AI, whether it was through FPGAs or other accelerators like Habana or Nervana, note Anshel Sag, principal analyst with Moor Insight and Research. “Ultimately, the 50% haircut on the valuation of Altera is unfortunate, but again is a demonstration of Intel’s past mistakes. I do believe that finishing the process of spinning it out does give Intel back some capital and narrows the company’s focus,” he said. So where did it go wrong? It wasn’t with FPGAs because AMD is making a good run of it with its Xilinx acquisition. The fault, analysts say, lies with Intel, which has a terrible track record when it comes to acquisitions. “Altera could have been a great asset to Intel, just as Xilinx has become a valuable asset to AMD. However, like most of its acquisitions, Intel did not manage Altera well,” said McGregor.

Intelligence at the edge opens up more risks: how unified SASE can solve it

In an increasingly mobile and modern workforce, smart technologies such as AI-driven edge solutions and the Internet of Things (IoT) can help enterprises improve productivity and efficiency—whether to address operational roadblocks or respond faster to market demands. However, new solutions also come with new challenges, mainly in cybersecurity. The decentralized nature of edge computing—where data is processed, transmitted, and secured closer to the source rather than in a data center—has presented new risks for businesses and their everyday operations. This shift to the edge increases the number of exposed endpoints and creates new vulnerabilities as the attack surface expands. Enterprises will need to ensure their security is watertight in today’s threat landscape if they want to reap the full benefits of smart technologies at the edge. Bypassing the limitations of traditional network security For the longest time, enterprises have relied on traditional network security approaches to protect their edge solutions. However, these methods are becoming increasingly insufficient as they typically rely on static rules and assumptions, making them inflexible and predictable for malicious actors to circumvent. While effective in centralized infrastructures like data centers, traditional network security models fall short when applied to the distributed nature of edge computing. Instead, organizations need to adopt more adaptive, decentralized, and intelligent security frameworks built with edge deployments in mind. Traditional network security typically focuses on keeping out external threats. But today’s threat landscape has evolved significantly, with threat actors leveraging AI to launch advanced attacks such as genAI-driven phishing, sophisticated social engineering attacks, and malicious GPTs. 
Combined with the lack of visibility with traditional network security, a cybersecurity breach could remain undetected until it’s too late, resulting in consequences extending far beyond IT infrastructures. Next generation of enterprise security with SASE As organizations look into implementing new technologies to spearhead their business, they

Keysight tools tackle data center deployment efficiency

Test and performance measurement vendor Keysight Technologies has developed Keysight Artificial Intelligence (KAI) to identify performance inhibitors affecting large GPU deployments. It emulates workload profiles, rather than using actual resources, to pinpoint performance bottlenecks. Scaling AI data centers requires testing throughout the design and build process – every chip, cable, interconnect, switch, server, and GPU needs to be validated, Keysight says. From the physical layer through the application layer, KAI is designed to identify weak links that degrade the performance of AI data centers, and it validates and optimizes system-level performance for optimal scaling and throughput. AI providers, semiconductor fabricators, and network equipment manufacturers can use KAI to accelerate design, development, deployment, and operations by pinpointing performance issues before deploying in production.

U.S. Advances AI Data Center Push with RFI for Infrastructure on DOE Lands

ORNL is also the home of the Center for Artificial Intelligence Security Research (CAISER), which Edmon Begoli, CAISER founding director, described as being in place to build the security necessary by defining a new field of AI research targeted at fighting future AI security risks. Also, at the end of 2024, Google partner Kairos Power started construction of their Hermes demonstration SMR in Oak Ridge. Hermes is a high-temperature gas-cooled reactor (HTGR) that uses triso-fueled pebbles and a molten fluoride salt coolant (specifically Flibe, a mix of lithium fluoride and beryllium fluoride). This demonstration reactor is expected to be online by 2027, with a production level system becoming available in the 2030 timeframe. Also located in a remote area of Oak Ridge is the Tennessee Valley Clinch River project, where the TVA announced a signed agreement with GE-Hitachi to plan and license a BWRX-300 small modular reactor (SMR). On Integrating AI and Energy Production The foregoing are just examples of ongoing projects at the sites named by the DOE’s RFI. Presuming that additional industry power, utility, and data center providers get on board with these locations, any of the 16 could be the future home of AI data centers and on-site power generation. The RFI marks a pivotal step in the U.S. government’s strategy to solidify its global dominance in AI development and energy innovation. By leveraging the vast resources and infrastructure of its national labs and research sites, the DOE is positioning the country to meet the enormous power and security demands of next-generation AI technologies. The selected locations, already home to critical energy research and cutting-edge supercomputing, present a compelling opportunity for industry stakeholders to collaborate on building integrated, sustainable AI data centers with dedicated energy production capabilities. With projects like Oak Ridge’s pioneering SMRs and advanced AI security

Generac Sharpens Focus on Data Center Power with Scalable Diesel and Natural Gas Generators

In a digital economy defined by constant uptime and explosive compute demand, power reliability is more than a design criterion—it’s a strategic imperative. In response to such demand, Generac Power Systems, a company long associated with residential backup and industrial emergency power, is making an assertive move into the heart of the digital infrastructure sector with a new portfolio of high-capacity generators engineered for the data center market. Unveiled this week, Generac’s new lineup includes five generators ranging from 2.25 MW to 3.25 MW. These units are available in both diesel and natural gas configurations, and form part of a broader suite of multi-asset energy systems tailored to hyperscale, colocation, enterprise, and edge environments. The product introductions expand Generac’s commercial and industrial capabilities, building on decades of experience with mission-critical power in hospitals, telecom, and manufacturing, now optimized for the scale and complexity of modern data centers. “Coupled with our expertise in designing generators specific to a wide variety of industries and uses, this new line of generators is designed to meet the most rigorous standards for performance, packaging, and after-treatment specific to the data center market,” said Ricardo Navarro, SVP & GM, Global Telecom and Data Centers, Generac. Engineering for the Demands of Digital Infrastructure Each of the five new generators is designed for seamless integration into complex energy ecosystems. Generac is emphasizing modularity, emissions compliance, and high-ambient operability as central to the offering, reflecting a deep understanding of the real-world challenges facing data center operators today. The systems are built around the Baudouin M55 engine platform, which is engineered for fast transient response and high operating temperatures—key for data center loads that swing sharply under AI and cloud workloads. 
The M55’s high-pressure common rail fuel system supports low NOx emissions and Tier 4 readiness, aligning with the most

CoolIT and Accelsius Push Data Center Liquid Cooling Limits Amid Soaring Rack Densities

The CHx1500’s construction reflects CoolIT’s 24 years of DLC experience, using stainless-steel piping and high-grade wetted materials to meet the rigors of enterprise and hyperscale data centers. It’s also designed to scale: not just for today’s most power-hungry processors, but for future platforms expected to surpass today’s limits. Now available for global orders, CoolIT is offering full lifecycle support in over 75 countries, including system design, installation, CDU-to-server certification, and maintenance services—critical ingredients as liquid cooling shifts from high-performance niche to a requirement for AI infrastructure at scale. Capex Follows Thermals: Dell’Oro Forecast Signals Surge In Cooling and Rack Power Infrastructure Between Accelsius and CoolIT, the message is clear: direct liquid cooling is stepping into its maturity phase, with products engineered not just for performance, but for mass deployment. Still, technology alone doesn’t determine the pace of adoption. The surge in thermal innovation from Accelsius and CoolIT isn’t happening in a vacuum. As the capital demands of AI infrastructure rise, the industry is turning a sharper eye toward how data center operators account for, prioritize, and report their AI-driven investments. To wit: According to new market data from Dell’Oro Group, the transition toward high-power, high-density AI racks is now translating into long-term investment shifts across the data center physical layer. Dell’Oro has raised its forecast for the Data Center Physical Infrastructure (DCPI) market, predicting a 14% CAGR through 2029, with total revenue reaching $61 billion. That revision stems from stronger-than-expected 2024 results, particularly in the adoption of accelerated computing by both Tier 1 and Tier 2 cloud service providers. The research firm cited three catalysts for the upward adjustment: Accelerated server shipments outpaced expectations. 
Demand for high-power infrastructure is spreading to smaller hyperscalers and regional clouds. Governments and Tier 1 telecoms are joining the buildout effort, reinforcing AI as a

Microsoft will invest $80B in AI data centers in fiscal 2025

And Microsoft isn’t the only one that is ramping up its investments into AI-enabled data centers. Rival cloud service providers are all investing in either upgrading or opening new data centers to capture a larger chunk of business from developers and users of large language models (LLMs). In a report published in October 2024, Bloomberg Intelligence estimated that demand for generative AI would push Microsoft, AWS, Google, Oracle, Meta, and Apple would between them devote $200 billion to capex in 2025, up from $110 billion in 2023. Microsoft is one of the biggest spenders, followed closely by Google and AWS, Bloomberg Intelligence said. Its estimate of Microsoft’s capital spending on AI, at $62.4 billion for calendar 2025, is lower than Smith’s claim that the company will invest $80 billion in the fiscal year to June 30, 2025. Both figures, though, are way higher than Microsoft’s 2020 capital expenditure of “just” $17.6 billion. The majority of the increased spending is tied to cloud services and the expansion of AI infrastructure needed to provide compute capacity for OpenAI workloads. Separately, last October Amazon CEO Andy Jassy said his company planned total capex spend of $75 billion in 2024 and even more in 2025, with much of it going to AWS, its cloud computing division.

John Deere unveils more autonomous farm machines to address skill labor shortage

Self-driving tractors might be the path to self-driving cars. John Deere has revealed a new line of autonomous machines and tech across agriculture, construction and commercial landscaping. The Moline, Illinois-based John Deere has been in business for 187 years, yet it has become a regular at the big tech trade show in Las Vegas as a non-tech company showing off technology, and it is back at CES 2025 with more autonomous tractors and other vehicles. This is not something we usually cover, but John Deere has a lot of data that is interesting in the big picture of tech. The message from the company is that there aren’t enough skilled farm laborers to do the work that its customers need. It’s been a challenge for most of the last two decades, said Jahmy Hindman, CTO at John Deere, in a briefing. Much of the tech will come this fall and after that. He noted that the average farmer in the U.S. is over 58 and works 12 to 18 hours a day to grow food for us. And he said the American Farm Bureau Federation estimates there are roughly 2.4 million farm jobs that need to be filled annually, and the agricultural work force continues to shrink. (This is my hint to the anti-immigration crowd).

John Deere’s autonomous 9RX Tractor. Farmers can oversee it using an app.

While each of these industries experiences its own set of challenges, a commonality across all is skilled labor availability. In construction, about 80% of contractors struggle to find skilled labor. And in commercial landscaping, 86% of landscaping business owners can’t find labor to fill open positions, he said. “They have to figure out how to do

2025 playbook for enterprise AI success, from agents to evals

2025 is poised to be a pivotal year for enterprise AI. The past year has seen rapid innovation, and this year will see the same. This has made it more critical than ever to revisit your AI strategy to stay competitive and create value for your customers. From scaling AI agents to optimizing costs, here are the five critical areas enterprises should prioritize for their AI strategy this year.

1. Agents: the next generation of automation

AI agents are no longer theoretical. In 2025, they’re indispensable tools for enterprises looking to streamline operations and enhance customer interactions. Unlike traditional software, agents powered by large language models (LLMs) can make nuanced decisions, navigate complex multi-step tasks, and integrate seamlessly with tools and APIs. At the start of 2024, agents were not ready for prime time, making frustrating mistakes like hallucinating URLs. They started getting better as frontier large language models themselves improved. “Let me put it this way,” said Sam Witteveen, cofounder of Red Dragon, a company that develops agents for companies, and that recently reviewed the 48 agents it built last year. “Interestingly, the ones that we built at the start of the year, a lot of those worked way better at the end of the year just because the models got better.” Witteveen shared this in the video podcast we filmed to discuss these five big trends in detail. Models are getting better and hallucinating less, and they’re also being trained to do agentic tasks. Another feature that the model providers are researching is a way to use the LLM as a judge, and as models get cheaper (something we’ll cover below), companies can use three or more models to

OpenAI’s red teaming innovations define new essentials for security leaders in the AI era

OpenAI has taken a more aggressive approach to red teaming than its AI competitors, demonstrating its security teams’ advanced capabilities in two areas: multi-step reinforcement and external red teaming. OpenAI recently released two papers that set a new competitive standard for improving the quality, reliability and safety of AI models in these two techniques and more. The first paper, “OpenAI’s Approach to External Red Teaming for AI Models and Systems,” reports that specialized teams outside the company have proven effective in uncovering vulnerabilities that might otherwise have made it into a released model because in-house testing techniques may have missed them. In the second paper, “Diverse and Effective Red Teaming with Auto-Generated Rewards and Multi-Step Reinforcement Learning,” OpenAI introduces an automated framework that relies on iterative reinforcement learning to generate a broad spectrum of novel, wide-ranging attacks.

Going all-in on red teaming pays practical, competitive dividends

It’s encouraging to see competitive intensity in red teaming growing among AI companies. When Anthropic released its AI red team guidelines in June of last year, it joined AI providers including Google, Microsoft, Nvidia, OpenAI, and even the U.S.’s National Institute of Standards and Technology (NIST), which all had released red teaming frameworks. Investing heavily in red teaming yields tangible benefits for security leaders in any organization. OpenAI’s paper on external red teaming provides a detailed analysis of how the company strives to create specialized external teams that include cybersecurity and subject matter experts. The goal is to see if knowledgeable external teams can defeat models’ security perimeters and find gaps in their security, biases and controls that prompt-based testing couldn’t find.
What makes OpenAI’s recent papers noteworthy is how well they define using human-in-the-middle
