Stay Ahead, Stay ONMINE

Nvidia’s Nemotron Model Families will advance AI agents

Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More Nvidia announced Nemotron Model Families to advance agentic AI as part of its bevy of AI announcements at CES 2025 today. Available as Nvidia NIM microservices, open Llama Nemotron large language models and Cosmos Nemotron vision […]

Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More


Nvidia announced Nemotron Model Families to advance agentic AI as part of its bevy of AI announcements at CES 2025 today.

Available as Nvidia NIM microservices, open Llama Nemotron large language models and Cosmos Nemotron vision language models can supercharge AI agents on any accelerated system.

Artificial intelligence is entering a new era — agentic AI — where teams of specialized agents can help people solve complex problems and automate repetitive tasks. Nvidia made the announcement as part of Nvidia CEO Jensen Huang’s opening keynote today at CES 2025.

With custom AI agents, enterprises across industries can manufacture intelligence and achieve unprecedented productivity. These advanced AI agents require a system of multiple generative AI models optimized for agentic AI functions and capabilities. This complexity means that the need for powerful, efficient, enterprise-grade models has never been greater.

“AI agents is the next robotic industry and likely to be a multibillion-dollar opportunity,” Huang said.

To provide a foundation for enterprise agentic AI, Nvidia today announced the Llama Nemotron family of open large language models (LLMs). Built with Llama, the models can help developers create and deploy AI agents across a range of applications —- including customer support, fraud detection, and product supply chain and inventory management optimization.

To be effective, many AI agents need both language skills and the ability to perceive the world and respond with the appropriate action.

Nvidia Nemotron
Nvidia Nemotron

With new Nvidia Cosmos Nemotron vision language models (VLMs) and Nvidia NIM microservices for video search and summarization, developers can build agents that analyze and respond to images and video from autonomous machines, hospitals, stores and warehouses, as well as sports events, movies and news. For developers seeking to generate physics-aware videos for robotics and autonomous vehicles, Nvidia today separately announced Nvidia Cosmos world foundation models.

Open Llama Nemotron Models Optimize Compute Efficiency, Accuracy for AI Agents Built with Llama foundation models — one of the most popular commercially viable open source model collections, downloaded over 650 million times — Nvidia Llama Nemotron models provide optimized building blocks for AI agent development.

Llama Nemotron models are pruned and trained with Nvidia’s latest techniques and high-quality datasets for enhanced agentic capabilities. They excel at instruction following, chat, function calling, coding and math, while being size-optimized to run on a broad range of Nvidia accelerated computing resources.

“Agentic AI is the next frontier of AI development, and delivering on this opportunity requires full-stack optimization across a system of LLMs to deliver efficient, accurate AI agents,” said Ahmad Al-Dahel, vice president and head of GenAI at Meta, in a statement. “Through our collaboration with Nvidia and our shared commitment to open models, the Nvidia Llama Nemotron family built on Llama can help enterprises quickly create their own custom AI agents.”

Leading AI agent platform providers including SAP and ServiceNow are expected to be among the first to use the new Llama Nemotron models.

“AI agents that collaborate to solve complex tasks across multiple lines of the business will unlock a whole new level of enterprise productivity beyond today’s generative AI scenarios,” said Philipp Herzig, chief AI officer at SAP, in a statement. “Through SAP’s Joule, hundreds of millions enterprise users will interact with these agents to accomplish their goals faster than ever before. Nvidia’s new open Llama Nemotron model family will foster the development of multiple specialized AI agents to transform business processes.”

“AI agents make it possible for organizations to achieve more with less effort, setting new standards for business transformation,” said Jeremy Barnes, vice president of platform AI at ServiceNow, in a statement. “The improved performance and accuracy of Nvidia’s open Llama Nemotron models can help build advanced AI agent services that solve complex problems across functions, in any industry.”

The Nvidia Llama Nemotron models use Nvidia NeMo for distilling, pruning and alignment. Using these techniques, the models are small enough to run on a variety of computing platforms while providing high accuracy as well as increased model throughput.

The Llama Nemotron model family will be available as downloadable models and as Nvidia NIM microservices that can be easily deployed on clouds, data centers, PCs and workstations. They offer enterprises industry-leading performance with reliable, secure and seamless integration into their agentic AI application workflows.

Customize and Connect to Business Knowledge With Nvidia NeMo

The Llama Nemotron and Cosmos Nemotron model families are coming in Nano, Super and Ultra sizes to provide options for deploying AI agents at every scale.

● Nano: The most cost-effective model optimized for real-time applications with low latency, ideal for deployment on PCs and edge devices.

● Super: A high-accuracy model offering exceptional throughput on a single GPU.

● Ultra: The highest-accuracy model, designed for data-center-scale applications demanding the highest performance.

Enterprises can also customize the models for their specific use cases and domains with Nvidia NeMo microservices to simplify data curation, accelerate model customization and evaluation, and apply guardrails to keep responses on track.

With Nvidia NeMo Retriever, developers can also integrate retrieval-augmented generation (RAG) capabilities to connect models to their enterprise data.

And using Nvidia Blueprints for agentic AI, enterprises can quickly create their own applications using Nvidia’s advanced AI tools and end-to-end development expertise. In fact, Nvidia Cosmos Nemotron, Nvidia Llama Nemotron and NeMo Retriever supercharge the new Nvidia Blueprint for video search and summarization, announced separately today.

NeMo, NeMo Retriever and Nvidia Blueprints are all available with the Nvidia AI Enterprise software platform.

Availability

Llama Nemotron and Cosmos Nemotron models will be available as hosted APIs and for download on build.nvidia.com and on Hugging Face. Access for development, testing and research is free for members of the Nvidia Developer Program.

Enterprises can run Llama Nemotron and Cosmos Nemotron NIM microservices in production with the Nvidia AI Enterprise software platform on accelerated data center and cloud infrastructure.

Shape
Shape
Stay Ahead

Explore More Insights

Stay ahead with more perspectives on cutting-edge power, infrastructure, energy,  bitcoin and AI solutions. Explore these articles to uncover strategies and insights shaping the future of industries.

Shape

Intel nabs Qualcomm veteran to lead GPU initiative

Intel has struggled for more than two decades to develop a successful GPU/accelerated computing strategy, going all the way back to the aughts and the ill-fated Larrabee effort.  Its most recent efforts centered around Ponte Vecchio and Gaudi chips, neither of which have gained any traction. Still, CEO Lip-Bu Tan

Read More »

New Relic extends observability into ChatGPT-hosted apps

New Relic’s cloud-based observability platform monitors applications and services in real time to provide insights into software, hardware, and cloud performance. The new capability extends the platform’s browser agent into the GPT iframe environment. It captures standard telemetry data, including latency and connectivity of an application within the GPT iframe.

Read More »

AI can’t fix a broken NetOps practice

Data collection errors, inconsistent data formatting issues across vendors, data storage issues, and network monitoring blind spots were the top issues that are impacting this data quality. Bad data leads to bad AI insights. Network teams will need to assess their data before they invest time and money in AI

Read More »

Work-from-office mandate? Expect top talent turnover, culture rot

IT workers value flexibility Ivanti’s survey suggests that IT workers are skeptical of return-to-office (RTO) mandates. Eighty-three percent of IT workers surveyed say flexible work arrangements are either “high value” or “essential,” compared to 73% of office workers. Meanwhile, IT workers facing work-from-office mandates are two to three times more

Read More »

Energy Secretary Prepares to Unleash Backup Generation Ahead of Winter Storm Fern

Secretary Wright issues letter to grid operators detailing how unused backup generation can keep the grid stable, save lives, and lower costs during the coming winter storm. WASHINGTON—The U.S. Department of Energy (DOE) announced today it is prepared to take emergency action to prevent blackouts during winter storm Fern. In a letter today, U.S. Secretary of Energy Chris Wright asked the nation’s grid operators to maintain communication with DOE during the storm and be prepared to make backup generation resources at data centers and other major facilities available as needed. DOE estimates more than 35 GW of unused backup generation remains available nationwide. These actions could mitigate blackouts and reduce costs for potentially hundreds of millions of Americans during the winter storm.  “The Trump administration will not stand by and allow the previous administration’s reckless energy subtraction policies and bureaucratic red tape put American lives at risk,” said Secretary Wright. “We have identified more than 35 GW of unused backup generation that exists across the country and are taking action to ensure that if the nation needs it, the generation will be made available. Rest assured, President Trump and the Energy Department remain committed to doing everything in our power to mitigate blackouts and lower energy costs for the American people.” On day one, President Trump declared a national energy emergency after the Biden administration’s energy subtraction agenda left behind a grid increasingly vulnerable to blackouts. According to the North American Electric Reliability Corporation (NERC), “Winter electricity demand is rising at the fastest rate in recent years,” while the premature forced closure of reliable generation such as coal and natural gas plants leaves American families vulnerable to power outages. The NERC 2025 – 2026 Winter Reliability Assessment further warns that areas across the continental United States have an elevated risk

Read More »

USA Won’t Offer OTG Security to Oil Firms in VEN

The Trump administration has no plans to directly provide security to oil producers in Venezuela, Energy Secretary Chris Wright said Thursday, dismissing the notion US troops will be used to address companies’ concerns about safety in the troubled nation. “We are not going to get involved in providing on-the-ground security,” Wright said during an interview with Bloomberg Television. “The US involvement right now in controlling the flow of funds in Venezuela gives us huge leverage to reduce the criminality in that country, reestablish peace and better business conditions.” Oil executives and industry leaders have stressed companies need political and legal reforms, contract certainty and security guarantees before investing in Venezuela following the apprehension of former President Nicolás Maduro. While US President Donald Trump has vowed to provide “total safety” to companies operating there, it remains unclear how the US would accomplish that.   During the interview Thursday, Wright said the steps the US has taken in Venezuela have already made the nation a more secure place to work and that oil companies are well versed in operating in challenging environments around the world.   Ultimately, he said, Venezuela will need a representative government, new laws and changes to its constitution.     “But that will take time,” Wright said. “There’s always different risk and reward situations in time, which is why the wildcatters will move first,” Wright said. “The bigger, longer-term, tens of million of dollars of investment, they’re going to wait until there’s more clarity in that environment.”  Wright said he plans to travel to Venezuela within the next few weeks to meet with government officials, look at the oil infrastructure and meet with the nation’s acting President Delcy Rodríguez. “We will definitely see a number of American oil and gas companies going down as well and investigating opportunities

Read More »

Energy Department Reins in Over $83 Billion in Biden-Era Loans and Conditional Commitments

WASHINGTON—The U.S. Department of Energy (DOE) announced today that the Office of Energy Dominance Financing (EDF) is restructuring, revising, or eliminating more than $83 billion in Green New Scam loans and conditional commitments from the Biden-era loan portfolio. This action follows an exhaustive first-year review of the previous administration’s $104 billion principal loan obligations, including approximately $85 billion rushed out the door in the final months after Election Day. Previously known as the Loan Programs Office (LPO), EDF continues to reform the office to more responsibly steward taxpayer dollars and support financing opportunities that accelerate the deployment of affordable, reliable, and secure American energy. During the first year of the Trump administration, EDF conducted a thorough review of each borrower to ensure loans were a responsible investment of taxpayer dollars and aligned with the Administration’s priorities. “Over the past year, the Energy Department individually reviewed our entire loan portfolio to ensure the responsible investment of taxpayer dollars,” Secretary Wright said. “We found more dollars were rushed out the door of the Loan Programs Office in the final months of the Biden Administration than had been disbursed in over fifteen years. President Trump promised to protect taxpayer dollars and expand America’s supply of affordable, reliable, and secure energy. Thanks to the Working Families Tax Cut, the newly re-structured Energy Dominance Financing is playing a key role in fulfilling that mission.” EDF has eliminated around $9.5 billion in government-subsidized, intermittent wind and solar projects, and is replacing them with investments in natural gas and nuclear uprates that provide more affordable and reliable energy for the American people. Of the $104 billion in Biden-era principal loan obligations, EDF has completed or is in the process of de-obligating almost $30 billion, with another $53 billion in revision. EDF currently has more than $289 billion

Read More »

Can rising power demand boost renewables above policy obstacles in 2026?

Listen to the article 14 min This audio is auto-generated. Please let us know if you have feedback. In 2026, the renewable energy industry is facing considerable headwinds from Trump administration policies. The One Big Beautiful Bill Act set a new July 4 deadline for wind and solar projects to start construction to qualify for the Inflation Reduction Act’s production and investment tax credits, most notably. It also set new, strict foreign entity of concern rules for which the Treasury Department has yet to issue final guidance. The OBBBA was “definitely a bad outcome for the industry, worse than most people expected,” said Dan Smith, vice president of markets at DSD Renewables, a provider of distributed renewable energy resources. “I think we all expected something to happen, but this was a little more draconian than we and most people we know were hoping for,” he added.  “Now is the time to be disciplined and focused and be realistic about what projects we’re going to be able to complete under the ITC and which ones are just likely not going to benefit.” Dan Smith Vice president of markets at DSD Renewables The administration is also dragging out the timeline for approving projects in federal lands and waters. The Department of the Interior has been issuing stop-work orders to offshore wind projects and revoking their permits, and it cancelled its environmental review for the 6.2-GW Esmeralda 7 solar project on federal land in Nevada, saying it would instead review each of the seven project components individually. Renewables poised to meet rising demand Despite these setbacks, some industry sectors still see considerable opportunity for renewables in the U.S., as renewable energy generation can often come online more quickly and cheaply than a fossil fuel plant.  Analysts continue to forecast staggering load growth over the

Read More »

Venezuelan Oil Heads to Europe

Europe is set to receive some of its first shipments of Venezuelan oil in almost a year after traders rolled out offers worldwide to sell cargoes at the behest of the Trump administration.  The ship Poliegos is on its way to pick up Venezuelan oil and deliver it to a port in Italy, according to a shipping report seen by Bloomberg. Energy trader Vitol Group, which together with Trafigura Group was enlisted by the US to sell Venezuelan oil, is listed as owner of the cargo.  Another crude tanker named Folegandros is also scheduled to set sail from Venezuela to the Mediterranean in the coming days, according to people familiar with the matter, adding the vessel would deliver the barrels to Repsol SA’s oil refinery in Cartagena, Spain. A spokesperson for the Madrid-based company declined to comment. The crude, scheduled to arrive in Europe in February, is part of US President Donald Trump’s plan to shore up the Venezuelan economy after three decades of mismanagement, underinvestment and corruption. The pace and scale at which the nation’s output and exports is restored will be an important detail in an oil market that’s dealing with a large supply excess, with global prices stuck near $60 a barrel. Vitol declined to comment.  After the Jan. 3 capture of strongman Nicolás Maduro by US forces, Trump officials enlisted the help of Vitol and Trafigura to help sell as much as 50 million barrels of oil, with proceeds earmarked to help rebuild the country’s economy. The shipments would market the first visible exports of Venezuelan oil to Europe since April, when the Vitol-backed Saras SpA took 1 million barrels of Merey 16 oil to the Sarroch refinery in Italy, according to vessel movements compiled by Bloomberg. The traders are quickly moving Venezuelan barrels as the US

Read More »

Acting CISA chief defends workforce cuts, declares agency ‘back on mission’

The Cybersecurity and Infrastructure Security Agency’s acting leader used a hearing on Wednesday to defend the Trump administration’s mass layoffs at CISA and reassure lawmakers that the agency was still prepared to defend government and critical infrastructure networks from hackers. “A disciplined mission requires the right workforce — not a larger one, but a more capable and skilled one,” Madhu Gottumukkala said during a House Homeland Security Committee hearing that featured him and two other Department of Homeland Security officials. In the coming year, Gottumukkala added, “CISA will continue targeted hiring in mission critical roles while remaining aligned with [DHS’s] broader efforts to control costs and maximize return.” For now, though, he said, “we have the staff that we need.” CISA turmoil over layoffs, transfers CISA has lost more than one-third of its workforce since President Donald Trump took office almost exactly one year ago. The Trump administration has forced out key experts, eliminated a major collaboration framework, withdrawn funding from a state and local cybersecurity group and shuttered offices that managed important partnerships with states, businesses and foreign allies. At least 998 CISA employees have quit or been laid off or transferred since the start of the Trump administration, with 65 receiving forced reassignments to other agencies, according to an internal agency report that House Homeland Security Committee ranking member Bennie Thompson, D-Miss., entered into the record at the end of the hearing. Thompson’s office later provided the report to Cybersecurity Dive. Lawmakers of both parties expressed concern about the turmoil at CISA, with committee chairman Andrew Gabarino, R-N.Y., saying that “workforce continuity, clear leadership, and mission readiness are essential to effective cyber defenses.” Democrats were more critical. Rep. James Walkinshaw, D-Va., said the cuts at CISA “have weakened our defenses and left our critical systems and infrastructure more exposed

Read More »

CBRE’s 2026 Data Center Outlook: Demand Surges as Delivery Becomes the Constraint

The U.S. data center market is entering 2026 with fundamentals that remain unmatched across commercial real estate, but the nature of the dominant constraint has shifted. Demand is no longer gated by capital, connectivity, or even land. It is gated by the ability to deliver very large blocks of power, on aggressive timelines, at a predictable cost. According to the CBRE 2026 U.S. Real Estate Market Outlook as overseen by Gordon Dolven and Pat Lynch, the sector is on track to post another record year for leasing activity, even as vacancy remains at historic lows and pricing reaches all-time highs. What has changed is the scale at which demand now presents itself, and the difficulty of meeting it. Large-Block Leasing Rewrites the Economics AI-driven workloads are reshaping leasing dynamics in ways that break from prior hyperscale norms. Where 10-MW-plus deployments once commanded pricing concessions, CBRE now observes the opposite behavior: large, contiguous blocks of capacity are commanding premiums. Neocloud providers, GPU-as-a-service platforms and AI startups, many backed by aggressive capital deployment strategies, are actively competing for full-building and campus-scale capacity.  For operators, this is altering development and merchandising strategies. Rather than subdividing shells for flexibility, owners increasingly face a strategic choice: hold buildings intact to preserve optionality for single-tenant, high-density users who are willing to pay for scale. In effect, scale itself has become the scarce asset. Behind-the-Meter Power Moves to the Foreground For data centers, power availability meaning not just access, but certainty of delivery, is now the defining variable in the market.  CBRE notes accelerating adoption of behind-the-meter strategies as operators seek to bypass increasingly constrained utility timelines. On-site generation using natural gas, solar, wind, and battery storage is gaining traction, particularly in deregulated electricity markets where operators have more latitude to structure BYOP (bring your own power) solutions. 

Read More »

Blue Origin targets enterprise networks with a multi-terabit satellite connectivity plan

“It’s ideal for remote, sparse, or sensitive regions,” said Manish Rawat, analyst at TechInsights. “Key use cases include cloud-to-cloud links, data center replication, government, defense, and disaster recovery workloads. It supports rapid or temporary deployments and prioritizes fewer customers with high capacity, strict SLAs, and deep carrier integration.” Adoption, however, is expected to largely depend on the sector. For governments and organizations operating highly critical or sensitive infrastructure, where reliability and security outweigh cost considerations, this could be attractive as a redundancy option. “Banks, national security agencies, and other mission-critical operators may consider it as an alternate routing path,” Jain said. “For most enterprises, however, it is unlikely to replace terrestrial connectivity and would instead function as a supplementary layer.” Real-world performance Although satellite connectivity offers potential advantages, analysts note that questions remain around real-world performance. “TeraWave’s 6 Tbps refers to total constellation capacity, not per-user throughput, achieved via multiple optical inter-satellite links and ground gateways,” Rawat said. “Optical crosslinks provide high aggregate bandwidth but not a single terabit-class pipe. Performance lies between fiber and GEO satellites, with lower intercontinental latency than GEO but higher than fiber.” Operational factors could also affect network stability. Jitter is generally low, but handovers, rerouting, and weather conditions can introduce intermittent performance spikes. Packet loss is expected to remain modest but episodic, Rawat added.

Read More »

CyrusOne Hones AI-Era Data Center Strategy for Power, Pace, and Reliability

In the second half of 2025, CyrusOne was racing to secure buildable power and faster time-to-market capacity for AI-era customers. At the same time, its reputation for mission-critical reliability took a very public hit when a disruption at a CyrusOne facility helped knock CME trading offline. The incident forced the company into an unusually open conversation about redundancy, cooling systems, and operational discipline: systems that are meant to disappear in normal operation, and dominate the story when they malfunction. From Projects to a Playbook Which projects, missteps, and strategic moves from 2025 are now shaping how CyrusOne enters 2026? Nowhere is that view clearer than in Texas. There, CyrusOne has been leaning hard into a “power + land + interconnect” model: treating deliverable power and grid position as part of the product, not just a prerequisite. If you map the company’s announcements since late July, Texas reveals the playbook. Secure power, secure substations and grid position, then build multi-phase campuses designed to scale quickly as demand materializes. The Calpine “Powered Land” Deal: From 190 MW to 400 MW in Three Months On July 30, 2025, CyrusOne and Calpine announced a 190-MW agreement tied to a hyperscale campus (DFW10) adjacent to Calpine’s Thad Hill Energy Center in Bosque County, Texas. The structure bundled power, grid connection, and land into a single development package, with CyrusOne saying the site was already under construction and targeting operation by Q4 2026. Just three months later, on November 3–4, the partners announced a second phase, adding 210 MW and taking the campus to 400 MW. The update emphasized coordination to support grid reliability during scarcity; such curtailment and operational-coordination concepts are becoming table stakes for ERCOT-scale megaprojects. Together, the two announcements show CyrusOne placing a large bet on an emerging model: power-ready campuses, or “powered

Read More »

Forrester study quantifies benefits of Cisco Intersight

If IT groups are to be the strategic business partners their companies need, they require solutions that can improve infrastructure life cycle management in the age of artificial intelligence (AI) and heightened security threats. To quantify the value of such solutions, Cisco recently commissioned Forrester Consulting to conduct a Total Economic Impact™ analysis of Cisco Intersight. The comprehensive study found that for a composite organization, Intersight delivered 192% return on investment (ROI) and a payback period of less than six months, along with significant tangible benefits to IT and businesses. Cisco Intersight overview Cisco Intersight is a cloud-native IT operations platform for infrastructure life cycle management. It provides IT teams with comprehensive visibility, control, and automation capabilities for Cisco’s portfolio of compute solutions for data centers, colocation facilities, and edge environments based on the Cisco Unified Computing System (Cisco UCS). Intersight also integrates with leading operating systems, storage providers, hypervisors, and third-party IT service management and security tools. Intersight’s unified, policy-driven approach to infrastructure management helps IT groups automate numerous tasks and, as Forrester found, free up time to dedicate to strategic projects. Forrester study quantifies the benefits of Cisco Intersight  A composite organization using Cisco Intersight achieved:192% ROI and payback in less than six months$3.3M net present value over three years$2.7M from improved uptime and resilience 50% reduction in mean time to resolution $1.7M from increased IT productivity$267K benefit from decreased time to value due to faster project execution and earlier return on infrastructure investments Forrester Total Economic Impact study findings The analyst firm conducted detailed interviews with IT decision-makers and Intersight users at six organizations, from which it created one composite organization: a multinational technology-driven company with $10 billion in annual revenue, 120 branch locations, and a team of six engineers managing its 1,000 servers deployed in several

Read More »

SoftBank launches software stack for AI data center operations

Addressing enterprise challenges The software provides two main services, according to SoftBank. The Kubernetes-as-a-Service component automates the stack from BIOS and RAID settings through the OS, GPU drivers, networking, Kubernetes controllers, and storage, the company said. It reconfigures physical connectivity using Nvidia NVLink and memory allocation as users create, update, or delete clusters, according to the announcement. The system allocates nodes based on GPU proximity and NVLink domain configuration to reduce latency, SoftBank said. Enterprises currently face complex GPU cluster provisioning, Kubernetes lifecycle management, inference scaling, and infrastructure tuning challenges that require deep expertise, according to Dai. SoftBank’s automated approach addresses these pain points by handling BIOS-to-Kubernetes configuration, optimizing GPU interconnects, and abstracting inference into API-based services, he said. This allows teams to focus on model development rather than infrastructure maintenance, Dai said. The Inference-as-a-Service component lets users deploy inference services by selecting large language models without configuring Kubernetes or underlying infrastructure, according to the company. It provides OpenAI-compatible APIs and scales across multiple nodes on platforms including the GB200 NVL72, SoftBank said. The software includes tenant isolation through encrypted communications, automated system monitoring and failover, and APIs for connecting to portal, customer management, and billing systems, according to the announcement.

Read More »

OpenAI shifts AI data center strategy toward power-first design

The shift to ‘energy sovereignty’  Analysts say the move reflects a fundamental shift in data center strategy, moving from “fiber-first” to “power-first” site selection. “Historically, data centers were built near internet exchange points and urban centers to minimize latency,” said Ashish Banerjee, senior principal analyst at Gartner. “However, as AI training requirements reach the gigawatt scale, OpenAI is signaling that they will prioritize regions with ‘energy sovereignty’, places where they can build proprietary generation and transmission, rather than fighting for scraps on an overtaxed public grid.” For network architecture, this means a massive expansion of the “middle mile.” By placing these behemoth data centers in energy-rich but remote locations, the industry will have to invest heavily in long-haul, high-capacity dark fiber to connect these “power islands” back to the edge. “We should expect a bifurcated network: a massive, centralized core for ‘cold’ model training located in the wilderness, and a highly distributed edge for ‘hot’ real-time inference located near the users,” Banerjee added. Manish Rawat, a semiconductor analyst at TechInsights, also noted that the benefits may come at the cost of greater architectural complexity. “On the network side, this pushes architectures toward fewer mega-hubs and more regionally distributed inference and training clusters, connected via high-capacity backbone links,” Rawat said. “The trade-off is higher upfront capex but greater control over scalability timelines, reducing dependence on slow-moving utility upgrades.”

Read More »

Microsoft will invest $80B in AI data centers in fiscal 2025

And Microsoft isn’t the only one that is ramping up its investments into AI-enabled data centers. Rival cloud service providers are all investing in either upgrading or opening new data centers to capture a larger chunk of business from developers and users of large language models (LLMs).  In a report published in October 2024, Bloomberg Intelligence estimated that demand for generative AI would push Microsoft, AWS, Google, Oracle, Meta, and Apple would between them devote $200 billion to capex in 2025, up from $110 billion in 2023. Microsoft is one of the biggest spenders, followed closely by Google and AWS, Bloomberg Intelligence said. Its estimate of Microsoft’s capital spending on AI, at $62.4 billion for calendar 2025, is lower than Smith’s claim that the company will invest $80 billion in the fiscal year to June 30, 2025. Both figures, though, are way higher than Microsoft’s 2020 capital expenditure of “just” $17.6 billion. The majority of the increased spending is tied to cloud services and the expansion of AI infrastructure needed to provide compute capacity for OpenAI workloads. Separately, last October Amazon CEO Andy Jassy said his company planned total capex spend of $75 billion in 2024 and even more in 2025, with much of it going to AWS, its cloud computing division.

Read More »

John Deere unveils more autonomous farm machines to address skill labor shortage

Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More Self-driving tractors might be the path to self-driving cars. John Deere has revealed a new line of autonomous machines and tech across agriculture, construction and commercial landscaping. The Moline, Illinois-based John Deere has been in business for 187 years, yet it’s been a regular as a non-tech company showing off technology at the big tech trade show in Las Vegas and is back at CES 2025 with more autonomous tractors and other vehicles. This is not something we usually cover, but John Deere has a lot of data that is interesting in the big picture of tech. The message from the company is that there aren’t enough skilled farm laborers to do the work that its customers need. It’s been a challenge for most of the last two decades, said Jahmy Hindman, CTO at John Deere, in a briefing. Much of the tech will come this fall and after that. He noted that the average farmer in the U.S. is over 58 and works 12 to 18 hours a day to grow food for us. And he said the American Farm Bureau Federation estimates there are roughly 2.4 million farm jobs that need to be filled annually; and the agricultural work force continues to shrink. (This is my hint to the anti-immigration crowd). John Deere’s autonomous 9RX Tractor. Farmers can oversee it using an app. While each of these industries experiences their own set of challenges, a commonality across all is skilled labor availability. In construction, about 80% percent of contractors struggle to find skilled labor. And in commercial landscaping, 86% of landscaping business owners can’t find labor to fill open positions, he said. “They have to figure out how to do

Read More »

2025 playbook for enterprise AI success, from agents to evals

Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More 2025 is poised to be a pivotal year for enterprise AI. The past year has seen rapid innovation, and this year will see the same. This has made it more critical than ever to revisit your AI strategy to stay competitive and create value for your customers. From scaling AI agents to optimizing costs, here are the five critical areas enterprises should prioritize for their AI strategy this year. 1. Agents: the next generation of automation AI agents are no longer theoretical. In 2025, they’re indispensable tools for enterprises looking to streamline operations and enhance customer interactions. Unlike traditional software, agents powered by large language models (LLMs) can make nuanced decisions, navigate complex multi-step tasks, and integrate seamlessly with tools and APIs. At the start of 2024, agents were not ready for prime time, making frustrating mistakes like hallucinating URLs. They started getting better as frontier large language models themselves improved. “Let me put it this way,” said Sam Witteveen, cofounder of Red Dragon, a company that develops agents for companies, and that recently reviewed the 48 agents it built last year. “Interestingly, the ones that we built at the start of the year, a lot of those worked way better at the end of the year just because the models got better.” Witteveen shared this in the video podcast we filmed to discuss these five big trends in detail. Models are getting better and hallucinating less, and they’re also being trained to do agentic tasks. Another feature that the model providers are researching is a way to use the LLM as a judge, and as models get cheaper (something we’ll cover below), companies can use three or more models to

Read More »

OpenAI’s red teaming innovations define new essentials for security leaders in the AI era

Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More OpenAI has taken a more aggressive approach to red teaming than its AI competitors, demonstrating its security teams’ advanced capabilities in two areas: multi-step reinforcement and external red teaming. OpenAI recently released two papers that set a new competitive standard for improving the quality, reliability and safety of AI models in these two techniques and more. The first paper, “OpenAI’s Approach to External Red Teaming for AI Models and Systems,” reports that specialized teams outside the company have proven effective in uncovering vulnerabilities that might otherwise have made it into a released model because in-house testing techniques may have missed them. In the second paper, “Diverse and Effective Red Teaming with Auto-Generated Rewards and Multi-Step Reinforcement Learning,” OpenAI introduces an automated framework that relies on iterative reinforcement learning to generate a broad spectrum of novel, wide-ranging attacks. Going all-in on red teaming pays practical, competitive dividends It’s encouraging to see competitive intensity in red teaming growing among AI companies. When Anthropic released its AI red team guidelines in June of last year, it joined AI providers including Google, Microsoft, Nvidia, OpenAI, and even the U.S.’s National Institute of Standards and Technology (NIST), which all had released red teaming frameworks. Investing heavily in red teaming yields tangible benefits for security leaders in any organization. OpenAI’s paper on external red teaming provides a detailed analysis of how the company strives to create specialized external teams that include cybersecurity and subject matter experts. The goal is to see if knowledgeable external teams can defeat models’ security perimeters and find gaps in their security, biases and controls that prompt-based testing couldn’t find. What makes OpenAI’s recent papers noteworthy is how well they define using human-in-the-middle

Read More »