Nvidia’s Nemotron Model Families will advance AI agents

Nvidia announced Nemotron Model Families to advance agentic AI as part of its bevy of AI announcements at CES 2025 today.

Available as Nvidia NIM microservices, open Llama Nemotron large language models and Cosmos Nemotron vision language models can supercharge AI agents on any accelerated system.

Artificial intelligence is entering a new era, agentic AI, where teams of specialized agents can help people solve complex problems and automate repetitive tasks. Nvidia made the announcement during CEO Jensen Huang’s opening keynote today at CES 2025.

With custom AI agents, enterprises across industries can manufacture intelligence and achieve unprecedented productivity. These advanced AI agents require a system of multiple generative AI models optimized for agentic AI functions and capabilities. This complexity means that the need for powerful, efficient, enterprise-grade models has never been greater.

“AI agents is the next robotic industry and likely to be a multibillion-dollar opportunity,” Huang said.

To provide a foundation for enterprise agentic AI, Nvidia today announced the Llama Nemotron family of open large language models (LLMs). Built with Llama, the models can help developers create and deploy AI agents across a range of applications, including customer support, fraud detection, and product supply chain and inventory management optimization.

To be effective, many AI agents need both language skills and the ability to perceive the world and respond with the appropriate action.

With new Nvidia Cosmos Nemotron vision language models (VLMs) and Nvidia NIM microservices for video search and summarization, developers can build agents that analyze and respond to images and video from autonomous machines, hospitals, stores and warehouses, as well as sports events, movies and news. For developers seeking to generate physics-aware videos for robotics and autonomous vehicles, Nvidia today separately announced Nvidia Cosmos world foundation models.

Open Llama Nemotron Models Optimize Compute Efficiency, Accuracy for AI Agents

Built with Llama foundation models (one of the most popular commercially viable open source model collections, downloaded over 650 million times), Nvidia Llama Nemotron models provide optimized building blocks for AI agent development.

Llama Nemotron models are pruned and trained with Nvidia’s latest techniques and high-quality datasets for enhanced agentic capabilities. They excel at instruction following, chat, function calling, coding and math, while being size-optimized to run on a broad range of Nvidia accelerated computing resources.

“Agentic AI is the next frontier of AI development, and delivering on this opportunity requires full-stack optimization across a system of LLMs to deliver efficient, accurate AI agents,” said Ahmad Al-Dahle, vice president and head of GenAI at Meta, in a statement. “Through our collaboration with Nvidia and our shared commitment to open models, the Nvidia Llama Nemotron family built on Llama can help enterprises quickly create their own custom AI agents.”

Leading AI agent platform providers including SAP and ServiceNow are expected to be among the first to use the new Llama Nemotron models.

“AI agents that collaborate to solve complex tasks across multiple lines of the business will unlock a whole new level of enterprise productivity beyond today’s generative AI scenarios,” said Philipp Herzig, chief AI officer at SAP, in a statement. “Through SAP’s Joule, hundreds of millions of enterprise users will interact with these agents to accomplish their goals faster than ever before. Nvidia’s new open Llama Nemotron model family will foster the development of multiple specialized AI agents to transform business processes.”

“AI agents make it possible for organizations to achieve more with less effort, setting new standards for business transformation,” said Jeremy Barnes, vice president of platform AI at ServiceNow, in a statement. “The improved performance and accuracy of Nvidia’s open Llama Nemotron models can help build advanced AI agent services that solve complex problems across functions, in any industry.”

The Nvidia Llama Nemotron models use Nvidia NeMo for distillation, pruning and alignment. These techniques make the models small enough to run on a variety of computing platforms while providing high accuracy and increased model throughput.

The Llama Nemotron model family will be available as downloadable models and as Nvidia NIM microservices that can be easily deployed on clouds, data centers, PCs and workstations. They offer enterprises industry-leading performance with reliable, secure and seamless integration into their agentic AI application workflows.

Customize and Connect to Business Knowledge With Nvidia NeMo

The Llama Nemotron and Cosmos Nemotron model families are coming in Nano, Super and Ultra sizes to provide options for deploying AI agents at every scale.

● Nano: The most cost-effective model optimized for real-time applications with low latency, ideal for deployment on PCs and edge devices.

● Super: A high-accuracy model offering exceptional throughput on a single GPU.

● Ultra: The highest-accuracy model, designed for data-center-scale applications demanding the highest performance.

Enterprises can also customize the models for their specific use cases and domains with Nvidia NeMo microservices to simplify data curation, accelerate model customization and evaluation, and apply guardrails to keep responses on track.

With Nvidia NeMo Retriever, developers can also integrate retrieval-augmented generation (RAG) capabilities to connect models to their enterprise data.

And using Nvidia Blueprints for agentic AI, enterprises can quickly create their own applications using Nvidia’s advanced AI tools and end-to-end development expertise. In fact, Nvidia Cosmos Nemotron, Nvidia Llama Nemotron and NeMo Retriever supercharge the new Nvidia Blueprint for video search and summarization, announced separately today.

NeMo, NeMo Retriever and Nvidia Blueprints are all available with the Nvidia AI Enterprise software platform.

Availability

Llama Nemotron and Cosmos Nemotron models will be available as hosted APIs and for download on build.nvidia.com and on Hugging Face. Access for development, testing and research is free for members of the Nvidia Developer Program.

Enterprises can run Llama Nemotron and Cosmos Nemotron NIM microservices in production with the Nvidia AI Enterprise software platform on accelerated data center and cloud infrastructure.

Daily insights on business use cases with VB Daily

If you want to impress your boss, VB Daily has you covered. We give you the inside scoop on what companies are doing with generative AI, from regulatory shifts to practical deployments, so you can share insights for maximum ROI.

Read our Privacy Policy

Thanks for subscribing. Check out more VB newsletters here.

An error occured.

Stay Ahead

Explore More Insights

Stay ahead with more perspectives on cutting-edge power, infrastructure, energy, bitcoin and AI solutions. Explore these articles to uncover strategies and insights shaping the future of industries.

Fortinet adds AI protections to endpoint security platform

“FortiEndpoint provides centralized visibility into AI applications and agents operating across managed endpoints. Security teams can identify sanctioned and unsanctioned tools, detect shadow AI, monitor adoption trends, and understand user activity through unified dashboards,” wrote Ankit Gupta, product and marketing leader for Fortinet, in a blog post about the enhancements.

ISC2: AI raises accountability demands for cybersecurity teams

Artificial intelligence is changing how cybersecurity teams work, with security professionals spending more time validating AI-generated recommendations and deciding when to trust AI outputs, according to new research from ISC2. The ISC2 survey of 856 cybersecurity professionals found that 65% spent more time deciding when to trust or act on

Governments to enterprises: Improve your router security hygiene

Actors have exploited, at the very least, CVE-2018-0171 (published in 2018) and CVE-2008-4128 (published in 2008), according to the bulletin. Both of these targeted Cisco routers, giving remote, unauthenticated attackers the ability to execute arbitrary code, take unauthorized actions, or cause a denial of service (DoS). Notable groups using this

Routine maintenance as a failure vector in modern networks

Pre-checks should include both control-plane and data-plane evidence. Control-plane checks confirm configuration, synchronization, device health, routing tables, interface status and object availability. Data-plane checks validate real traffic movement: TCP handshakes, TLS negotiation, HTTP status codes, API responses, session persistence, source NAT behavior and return-path consistency. During the change, monitoring should

Energy Secretary Secures Mid-Atlantic Grid Ahead of Period of Hot Weather

WASHINGTON—The U.S. Department of Energy (DOE) today issued an emergency order to mitigate blackout risks in the Mid-Atlantic ahead of the forecasted hot weather conditions and expected system load increase. The order directs PJM Interconnection, L.L.C. (PJM) to dispatch specified units and to order their operation as needed to maintain reliability. The order also authorizes PJM to direct backup generation resources to operate as a last resort before declaring an Energy Emergency Alert (EEA) 3 or during an EEA 3. PJM is authorized to call upon its Transmission Owners and Electric Distribution Companies to implement the order as needed. The order was issued pursuant to an application from PJM submitted on July 13, 2026. “Maintaining affordable, reliable, and secure power in the PJM service territory is non-negotiable,” said U.S. Secretary of Energy Chris Wright. “The previous administration’s energy subtraction policies weakened the grid, leaving Americans more vulnerable during events like this. Thanks to President Trump’s leadership, we are reversing those failures and using every available tool ensuring Americans in the Mid-Atlantic have continued access to affordable, reliable, and secure energy to power and cool their homes.” DOE estimates more than 35 GW of unused backup generation remains available nationwide. On day one, President Trump declared a national energy emergency after the Biden administration’s energy subtraction agenda left behind a grid increasingly vulnerable to risks of blackouts. According to the North American Electric Reliability Corporation’s (NERC) 2026 Summer Reliability Assessment, the peak electricity demand in PJM occurs during the summer season. NERC further notes that “if extreme high temperatures are experienced, PJM anticipates the need for demand-response resources to help reduce load.” Power outages cost the American people $44 billion per year, according to data from DOE’s National Laboratories. 
This order will mitigate the possibility of power outages in the Mid-Atlantic and highlights the common sense policies of the Trump Administration to ensure Americans have access to affordable, reliable,

DOE Alternative Fuels and Feedstocks Office Announces Intent to Advance Innovative Chemical Technologies

Proposed DOE funding will accelerate domestic chemical production WASHINGTON—The U.S. Department of Energy’s (DOE) Alternative Fuels and Feedstocks Office (AFFO) today announced its intent to fund the advancement of novel, high-impact chemical technologies. The proposed funding opportunity, Accelerating Scale-up and Pre-piloting of Emerging Chemical Technologies (ASPECT), will advance technologies for producing chemicals from alternative and waste feedstocks. In accordance with President Trump’s Executive Order Unleashing American Energy, this funding opportunity will strengthen domestic supply chains, reduce reliance on foreign imports, and accelerate American technology innovation. More than 96% of manufactured goods rely on products from the U.S. chemical sector, which directly employs more than half a million Americans. The ASPECT funding opportunity will reinforce domestic manufacturing and chemical supply chains by expanding the use of alternative feedstocks. Funding provided through ASPECT will target chemical technologies that improve performance, reduce costs, and show large market growth potential. AFFO expects to issue a notice of funding opportunity (NOFO) in August 2026, making up to $58 million available for projects that address the following topic areas: Topic Area 1: Bench ASPECT – to support the development and adoption of new technologies for producing chemicals from alternative feedstocks, moving beyond proof-of-concept to bench and pre-pilot scale. Topic Area 2: Pre-pilot ASPECT – to accelerate the development and market entry of strategically valuable, domestically produced chemicals. Following the NOFO announcement, AFFO will host an informational webinar to discuss a new, streamlined application and review process. Learn more about the topic areas, applicant eligibility and registration requirements, and the Teaming Partner List. Visit DOE eXCHANGE to view the full NOI.

PTTEP achieves Thailand’s first wellhead platform reuse in Gulf of Thailand

PTT Exploration and Production Public Co. Ltd. (PTTEP) has completed Thailand’s first total wellhead platform reuse project by redeploying an entire decommissioned petroleum wellhead platform as a complete structure in Funan field in the Gulf of Thailand. The reuse project comes as part of PTTEP’s program to maximize value and extend utilization of wellhead platforms that remain structurally sound and safe after depleting resources at a location by redeploying the platform as a complete structure. The first implementation was carried out at the Jakrawan K wellhead platform (JKWK), in Funan field under the G1/61 Project. As part of the project, PTTEP adopted the wet-tow method to relocate the jacket, helping curb energy consumption and minimize impacts on marine life attached to the platform structure, supporting a balance between energy production and marine environmental stewardship. The topside, jacket, and selected pile sections were relocated and reinstalled for use within the same field, reducing the overall construction and installation period to only 6 months, down from about 20 months for a newly built platform. Additionally, the approach cut construction costs by about 35–50% compared with construction of an entirely new wellhead platform. PTTEP said it expects the initiative to also reduce greenhouse gas emissions by about 3,270 tonnes of CO2e/platform by limiting the use of steel and other equipment required for construction of new platforms. PTTEP is operator of the G1/61 project (60%) with partner Mubadala Investment Co. (40%).

Trump declares Iran ceasefire over; oil surges on renewed supply risk

US President Donald Trump said the ceasefire and memorandum of understanding (MOU) reached with Iran last month is effectively over following a fresh exchange of strikes, reigniting supply concerns and sending crude prices sharply higher. Speaking alongside NATO Secretary-General Mark Rutte at the alliance’s summit in Ankara, Pres. Trump said Washington no longer sees value in maintaining the ceasefire framework with Tehran, though he left open the possibility of continued talks. He added that further US military action against Iran remains likely after strikes overnight. Stay updated on oil price volatility, shipping disruptions, LNG market analysis, and production output at OGJ’s Iran war content hub. The escalation was triggered by alleged Iranian attacks on three commercial vessels transiting the Strait of Hormuz on July 7. US Central Command said it responded with strikes on more than 80 Iranian targets, including air defense systems, command-and-control infrastructure, anti-ship missile capabilities, and over 60 Islamic Revolutionary Guard Corps (IRGC) fast boats operating in and near the strait. US Central Command described the tanker attacks as a clear violation of the June 17 agreement. Iran’s Foreign Ministry called the US strikes a breach of the MOU and said Tehran would continue to defend its sovereignty. The IRGC said it retaliated with drone and missile strikes targeting US military facilities in Bahrain and Kuwait. Authorities in both countries reported intercepting incoming projectiles, with no material damage confirmed. Trump said on July 8 the US is considering reinstating a naval blockade targeting Iranian ports and vessels. He also raised the possibility of strikes on civilian infrastructure, including electric plants and desalination facilities, as well as a potential move to take control of Kharg Island, home to the bulk of Iran’s crude export infrastructure. He said Tuesday’s strikes had reached the island but had not targeted its

US EIA forecasts declining oil prices as supply disruptions ease

In its July 7 Short-Term Energy Outlook (STEO) report, the US Energy Information Administration (EIA) said it expects global oil prices to decline as supply disruptions linked to the Strait of Hormuz ease and production recovers. On June 18, the US and Iran signed a memorandum of understanding to end the conflict and reopen the strait, which had been largely closed since Feb. 28. The disruption to this critical oil transit chokepoint constrained global flows, driving major price volatility. Brent crude averaged $85/bbl in June, down $22/bbl from May and $32/bbl below its April peak. Prices fell below $70/bbl on July 1 as tanker traffic through the strait increased sharply, easing supply concerns. EIA now expects most shut-in crude production to return to near pre-conflict levels by yearend, with full restoration largely to be completed by first-quarter 2027. Despite the recovery in flows, global inventories remain significantly depleted following earlier draws. EIA estimates oil inventories declined by an average of 5.1 million b/d in second-quarter 2026 and will fall by a further 2.2 million b/d in third-quarter 2026, as much of the recent tanker movement reflects previously stranded cargoes. As a result, the market is expected to remain relatively tight through most of third-quarter 2026 before shifting back into oversupply. EIA forecasts global oil consumption will decline by 1.2 million b/d in 2026, led by a 0.8 million b/d drop in non-OECD demand, particularly in the Asia Pacific. Demand is expected to rebound in 2027 as prices ease and supply normalizes, with consumption rising by 2.0 million b/d to 104.8 million b/d. As supply growth outpaces demand, inventories are projected to build by 2.7 million b/d in fourth-quarter 2026 and by 5.0 million b/d in 2027. This shift is expected to place sustained downward pressure on prices. EIA forecasts Brent

Eni lets EPCI contract for Kutei North Hub field FPSO

Eni North Ganal has let an engineering, procurement, construction, and installation (EPCI) contract to a joint venture between PT Saipem Indonesia and PT Tripatra Engineers and Constructors for a floating production, storage, and offloading (FPSO) unit for the Kutei North Hub Field Development Project in Kutei basin, offshore Indonesia, about 70 km off East Kalimantan. The project execution, with an estimated duration of 48 months, includes project management, engineering, procurement of materials, fabrication, construction and installation activities, as well as commissioning and start-up of the FPSO unit. The contract is valued at about $2 billion for Saipem’s share. The Kutei FPSO project is part of the Kutei North Hub Development, which comprises a subsea development tied back to the new FPSO, a dedicated gas export pipeline to the Bontang LNG plant, and domestic gas users via the existing East Kalimantan System. Eni North Ganal is controlled by Searah Ltd., which was formed through a strategic partnership between Eni and Petronas.

Google Cloud configuration update disrupts VMware Engine stretched clusters

“Google made a network setting change that accidentally broke the connection between the two data center zones in VMware Engine. The virtual machines themselves kept running fine, but nobody could reach them, and there was a risk that some machines might lose the ability to save data properly. This indicates that even managed cloud infrastructure can experience failures in critical shared network components,” said Pareekh Jain, CEO at EIIRTrend & Pareekh Consulting. Neil Shah, vice president at Counterpoint Research, said the real culprit here is the SDN orchestration control plane, where a routine internal network update or configuration tweak introduced routing failure across multiple zones. “While most of the physical nodes are distributed for exactly this redundancy purpose, they are still tightly coupled to a singular shared orchestration fabric, so if that control plane crashes, then everything comes crashing down, and the physical distributed nodes become irrelevant.” Stretched clusters fall short Although the outage did not bring down virtual machines, the incident undermined the primary reason enterprises deploy stretched clusters.

AI’s Future Must Return to the Edge: How Power Constraints and Local Politics Are Redefining AI Infrastructure

Over the past two years, AI build plans have driven a sharp escalation in projected data center power demand. One recent assessment1 found that the U.S. disclosed data center development pipeline reached roughly 241 gigawatts by the end of 2025—an increase of about 159% in a single year—illustrating the unprecedented pace at which AI infrastructure demand is expanding. Forecasts from major analysts indicate that total data center power consumption could grow at least 50% by 2027 and potentially as much as 165% by 2030, with AI training and inference responsible for most of the incremental load.2 At this pace, planned AI capacity is growing faster than electric infrastructure can realistically be expanded. In many markets, available land and fiber are not the limiting factors; dependable megawatt delivery is.3 At the facility level, AI hardware is moving standard designs into new ranges. Power densities that once centered around 10–20 kW per rack are being replaced by configurations nearer 40 kW, with dense AI racks pushing toward 85 kW today and credible roadmaps to 200–250 kW per rack by 2030, though we’ve all seen the reports of even larger. These levels do not only affect cooling and white‑space layouts; they materially change the electrical infrastructure required per room and per building, and by extension the strain on local grids. On the power‑system side, constraints are now explicit. Transmission operators and regulators are stating that current generation, interconnection, and build‑out timelines are not sufficient to accommodate another decade of large demand centers in their present form. Analysts tracking AI data center energy demand point to electricity, grid access, and firm capacity as the primary constraints on new builds, with grid bottlenecks and transmission limitations flagged as risks for up to 20% of planned projects.4, 5 At the facility level, AI hardware is moving

Data Center Frontier Trends Summit 2026 Preview

The Hidden Constraints of Delivery If power gets the headlines, supply chain and logistics often decide the schedule. Kleyman notes that a seemingly small missing component can delay a multibillion-dollar facility. A busway, switchgear component, cooling element, or logistics failure can ripple through construction sequencing, commissioning, customer handoff, and revenue recognition. “The weakest link may not be the most expensive component,” he says. That reality receives sustained attention across the Summit agenda. Day One’s “Beyond the Dashboard: Active Exception Management for Hyperscale AI” features CargoSense CEO Rich Kilmer in a live case study examining how organizations are moving beyond passive shipment visibility toward active exception management. For hyperscale AI projects, supply chain disruption is not simply about delayed shipments. It can affect site readiness, construction sequencing, commissioning windows, and the ability to bring capacity online as planned. Day Two’s “The Hidden Constraint: Supply Chains in the Age of AI Infrastructure” continues the discussion, examining how global supply chains are becoming a defining constraint and differentiator in AI data center delivery. The execution lens sharpens again on Day Three with “The Last 90 Days: Solving the Final Infrastructure Bottlenecks Before Go-Live.” This session focuses on the phase where projects can be won or lost: generator delivery, electrical integration, controls validation, startup sequencing, fuel systems, utility coordination, commissioning, and operational readiness. Even projects that have secured power, capital, customers, and equipment can face costly delays if the final stretch is not executed with precision. In the AI infrastructure era, the last 90 days may determine whether a project becomes energized capacity—or another delayed announcement. Capital Meets Execution Reality The Summit also examines whether capital is moving in step with what can actually be built. 
Day Two’s investment panel, “AI Infrastructure Investment: Bubble, Breakthrough, or Both?” will assess how investors are underwriting risk

DCF Poll: How Much of the AI Data Center Pipeline Will Actually Get Built?

Matt Vincent is Editor in Chief of Data Center Frontier, where he leads editorial strategy and coverage focused on the infrastructure powering cloud computing, artificial intelligence, and the digital economy. A veteran B2B technology journalist with more than two decades of experience, Vincent specializes in the intersection of data centers, power, cooling, and emerging AI-era infrastructure. Since assuming the EIC role in 2023, he has helped guide Data Center Frontier’s coverage of the industry’s transition into the gigawatt-scale AI era, with a focus on hyperscale development, behind-the-meter power strategies, liquid cooling architectures, and the evolving energy demands of high-density compute, while working closely with the Digital Infrastructure Group at Endeavor Business Media to expand the brand’s analytical and multimedia footprint. Vincent also hosts The Data Center Frontier Show podcast, where he interviews industry leaders across hyperscale, colocation, utilities, and the data center supply chain to examine the technologies and business models reshaping digital infrastructure. Since its inception he serves as Head of Content for the Data Center Frontier Trends Summit. Before becoming Editor in Chief, he served in multiple senior editorial roles across Endeavor Business Media’s digital infrastructure portfolio, with coverage spanning data centers and hyperscale infrastructure, structured cabling and networking, telecom and datacom, IP physical security, and wireless and Pro AV markets. He began his career in 2005 within PennWell’s Advanced Technology Division and later held senior editorial positions supporting brands such as Cabling Installation & Maintenance, Lightwave Online, Broadband Technology Report, and Smart Buildings Technology. 
Vincent is a frequent moderator, interviewer, and keynote speaker at industry events including the HPC Forum, where he delivers forward-looking analysis on how AI and high-performance computing are reshaping digital infrastructure. He graduated with honors from Indiana University Bloomington with a B.A. in English Literature and Creative Writing and lives in southern New Hampshire with

Powering Canada’s AI Future: Electricity, Policy, and the Race for Data Center Leadership

Alberta represents the biggest point of contention in Canada’s data center strategy. The province is aggressively pursuing data center development with its Artificial Intelligence Data Center Strategy. It has abundant natural gas, large land parcels, a deregulated power market, experienced energy developers and political leaders actively courting AI infrastructure. That makes it attractive to data center operators that care most about speed to power. Alberta has also promoted “bring your own generation” models, where data center developers pair facilities with dedicated generation rather than relying entirely on the public grid. But Alberta’s electricity system is much more carbon-intensive than Québec, British Columbia, Manitoba or Ontario. The same feature that makes it attractive for development, potential large AI build-outs powered primarily by natural gas, would undercut Canada’s claim that its data centers can run on some of the cleanest power in the world. Saskatchewan illustrates another version of the opportunity. Bell Canada’s planned 300-megawatt AI data center in the Rural Municipality of Sherwood near Regina is a major signal that large-scale AI infrastructure can move beyond the traditional Toronto-Montreal-Calgary corridor. The project combines domestic telecom infrastructure, sovereign compute ambitions, hyperscale tenants, fiber partnerships, Indigenous procurement participation and closed-loop cooling. It also shows why power availability is now the deciding factor in site selection. At 300 megawatts, a single facility becomes a grid-planning event, not merely a real estate development. British Columbia, meanwhile, is trying to prioritize power among competing industrial demands. Data centers are arriving at the same time as mining, LNG, manufacturing, forestry, hydrogen and electrification projects. 
The province has moved toward limiting and screening certain high-load uses, including data centers and cryptocurrency mining, so that scarce clean electricity is allocated to projects with the strongest public benefit. This seems to be a preview of the future for these industrial

DC Byte’s Colby Cox Talks Power, Density and the AI Data Center Map

For much of the past two years, the data center industry’s central question was whether artificial intelligence demand could sustain the unprecedented scale of infrastructure being announced in its name. That is no longer the most urgent question. The more immediate concern is whether developers can assemble the power, land, cooling systems, interconnections, capital, political support and community acceptance required to convert that demand into operating capacity.

“AI infrastructure has moved from being a fast-growing demand segment to becoming the organizing principle of data center development,” said Colby Cox, Managing Director for the Americas at DC Byte, during a recent appearance on the Data Center Frontier Show podcast.

The phrase captures a fundamental shift. AI is no longer simply one workload category competing for space inside the conventional data center market. It is beginning to determine where facilities are built, how campuses are financed, how electrical and mechanical systems are designed, and which regions can realistically participate in the next phase of digital infrastructure growth.

The market has also moved beyond incremental expansion. Campuses planned around hundreds of megawatts, and increasingly multiple gigawatts, are no longer treated strictly as outliers. They are being conceived from the beginning around GPU density, liquid cooling, accelerated deployment and enormous concentrations of electrical load.

Behind that transformation lies a second, increasingly decisive reality: Demand may be abundant, but deployable power is not.

Executable Power Separates Announcements From Infrastructure

From DC Byte’s market-intelligence vantage point, the dividing line between announced capacity and capacity likely to reach operation is increasingly straightforward. Has the power been secured? When can it be energized? Can the grid, or an onsite alternative, support the project’s intended phasing?

“The market is no longer constrained primarily by demand or capital,” Cox said. “It is constrained by executable power.” That distinction matters because a headline capacity figure

Microsoft will invest $80B in AI data centers in fiscal 2025

And Microsoft isn’t the only one ramping up its investments in AI-enabled data centers. Rival cloud service providers are all either upgrading existing data centers or opening new ones to capture a larger chunk of business from developers and users of large language models (LLMs). In a report published in October 2024, Bloomberg Intelligence estimated that demand for generative AI would push Microsoft, AWS, Google, Oracle, Meta, and Apple to devote $200 billion between them to capex in 2025, up from $110 billion in 2023. Microsoft is one of the biggest spenders, followed closely by Google and AWS, Bloomberg Intelligence said. Its estimate of Microsoft’s capital spending on AI, at $62.4 billion for calendar 2025, is lower than Smith’s claim that the company will invest $80 billion in the fiscal year to June 30, 2025. Both figures, though, are way higher than Microsoft’s 2020 capital expenditure of “just” $17.6 billion. The majority of the increased spending is tied to cloud services and the expansion of AI infrastructure needed to provide compute capacity for OpenAI workloads. Separately, last October Amazon CEO Andy Jassy said his company planned total capex spend of $75 billion in 2024 and even more in 2025, with much of it going to AWS, its cloud computing division.

John Deere unveils more autonomous farm machines to address skilled labor shortage

Self-driving tractors might be the path to self-driving cars. John Deere has revealed a new line of autonomous machines and tech across agriculture, construction and commercial landscaping.

The Moline, Illinois-based John Deere has been in business for 187 years, yet it has become a regular non-tech exhibitor at the big tech trade show in Las Vegas, and it is back at CES 2025 with more autonomous tractors and other vehicles. This is not something we usually cover, but John Deere has a lot of data that is interesting in the big picture of tech.

The message from the company is that there aren’t enough skilled farm laborers to do the work that its customers need. It’s been a challenge for most of the last two decades, said Jahmy Hindman, CTO at John Deere, in a briefing. Much of the tech will come this fall and after that.

He noted that the average farmer in the U.S. is over 58 and works 12 to 18 hours a day to grow food for us. And he said the American Farm Bureau Federation estimates there are roughly 2.4 million farm jobs that need to be filled annually, and the agricultural workforce continues to shrink. (This is my hint to the anti-immigration crowd).

John Deere’s autonomous 9RX Tractor. Farmers can oversee it using an app.

While each of these industries has its own set of challenges, a commonality across all is skilled labor availability. In construction, about 80% of contractors struggle to find skilled labor. And in commercial landscaping, 86% of landscaping business owners can’t find labor to fill open positions, he said. “They have to figure out how to do

2025 playbook for enterprise AI success, from agents to evals

2025 is poised to be a pivotal year for enterprise AI. The past year has seen rapid innovation, and this year will see the same. This has made it more critical than ever to revisit your AI strategy to stay competitive and create value for your customers. From scaling AI agents to optimizing costs, here are the five critical areas enterprises should prioritize for their AI strategy this year.

1. Agents: the next generation of automation

AI agents are no longer theoretical. In 2025, they’re indispensable tools for enterprises looking to streamline operations and enhance customer interactions. Unlike traditional software, agents powered by large language models (LLMs) can make nuanced decisions, navigate complex multi-step tasks, and integrate seamlessly with tools and APIs.

At the start of 2024, agents were not ready for prime time, making frustrating mistakes like hallucinating URLs. They started getting better as frontier large language models themselves improved.

“Let me put it this way,” said Sam Witteveen, cofounder of Red Dragon, a company that develops agents for companies, and that recently reviewed the 48 agents it built last year. “Interestingly, the ones that we built at the start of the year, a lot of those worked way better at the end of the year just because the models got better.” Witteveen shared this in the video podcast we filmed to discuss these five big trends in detail.

Models are getting better and hallucinating less, and they’re also being trained to do agentic tasks. Another feature that the model providers are researching is a way to use the LLM as a judge, and as models get cheaper (something we’ll cover below), companies can use three or more models to

OpenAI’s red teaming innovations define new essentials for security leaders in the AI era

OpenAI has taken a more aggressive approach to red teaming than its AI competitors, demonstrating its security teams’ advanced capabilities in two areas: multi-step reinforcement and external red teaming. OpenAI recently released two papers that set a new competitive standard for improving the quality, reliability and safety of AI models in these two techniques and more.

The first paper, “OpenAI’s Approach to External Red Teaming for AI Models and Systems,” reports that specialized teams outside the company have proven effective in uncovering vulnerabilities that might otherwise have made it into a released model because in-house testing techniques may have missed them.

In the second paper, “Diverse and Effective Red Teaming with Auto-Generated Rewards and Multi-Step Reinforcement Learning,” OpenAI introduces an automated framework that relies on iterative reinforcement learning to generate a broad spectrum of novel, wide-ranging attacks.

Going all-in on red teaming pays practical, competitive dividends

It’s encouraging to see competitive intensity in red teaming growing among AI companies. When Anthropic released its AI red team guidelines in June of last year, it joined AI providers including Google, Microsoft, Nvidia, OpenAI, and even the U.S.’s National Institute of Standards and Technology (NIST), which all had released red teaming frameworks.

Investing heavily in red teaming yields tangible benefits for security leaders in any organization. OpenAI’s paper on external red teaming provides a detailed analysis of how the company strives to create specialized external teams that include cybersecurity and subject matter experts. The goal is to see if knowledgeable external teams can defeat models’ security perimeters and find gaps in their security, biases and controls that prompt-based testing couldn’t find.

What makes OpenAI’s recent papers noteworthy is how well they define using human-in-the-middle
