The gloves came off on Tuesday at VB Transform 2025 as alternative chip makers directly challenged Nvidia’s dominance narrative during a panel on inference, exposing a fundamental contradiction: How can AI inference be a commoditized “factory” and still command 70% gross margins?
Jonathan Ross, CEO of Groq, didn’t mince words when discussing Nvidia’s carefully crafted messaging. “AI factory is just a marketing way to make AI sound less scary,” Ross said during the panel. Sean Lie, CTO of rival chip maker Cerebras, was equally direct: “I don’t think Nvidia minds having all of the service providers fighting it out for every last penny while they’re sitting there comfortable with 70 points.”
Hundreds of billions in infrastructure investment and the future architecture of enterprise AI are at stake. For CISOs and AI leaders currently locked in weekly negotiations with OpenAI and other providers for more capacity, the panel exposed uncomfortable truths about why their AI initiatives keep hitting roadblocks.
The capacity crisis no one talks about
“Anyone who’s actually a big user of these gen AI models knows that you can go to OpenAI, or whoever it is, and they won’t actually be able to serve you enough tokens,” explained Dylan Patel, founder of SemiAnalysis. “There are weekly meetings between some of the biggest AI users and their model providers to try to persuade them to allocate more capacity. Then there’s weekly meetings between those model providers and their hardware providers.”
Panel participants also pointed to the token shortage as exposing a fundamental flaw in the factory analogy. Traditional manufacturing responds to demand signals by adding capacity. However, when enterprises require 10 times more inference capacity, they discover that the supply chain can’t flex. GPUs require two-year lead times. Data centers need permits and power agreements. The infrastructure wasn’t built for exponential scaling, forcing providers to ration access through API limits.
According to Patel, Anthropic jumped from $2 billion to $3 billion in ARR in just six months. Cursor went from essentially zero to $500 million ARR. OpenAI crossed $10 billion. Yet enterprises still can’t get the tokens they need.
Why ‘factory’ thinking breaks AI economics
Jensen Huang’s “AI factory” concept implies standardization, commoditization and efficiency gains that drive down costs. But the panel revealed three fundamental ways this metaphor breaks down:
First, inference isn’t uniform. “Even today, for inference of, say, DeepSeek, there’s a number of providers along the curve of sort of how fast they provide at what cost,” Patel noted. DeepSeek serves its own model at the lowest cost but only delivers 20 tokens per second. “Nobody wants to use a model at 20 tokens a second. I talk faster than 20 tokens a second.”
Second, quality varies wildly. Ross drew a historical parallel to Standard Oil: “When Standard Oil started, oil had varying quality. You could buy oil from one vendor and it might set your house on fire.” Today’s AI inference market faces similar quality variations, with providers using various techniques to reduce costs that inadvertently compromise output quality.
Third, and most critically, the economics are inverted. “One of the things that’s unusual about AI is that you can spend more to get better results,” Ross explained. “You can’t just have a software application, say, I’m going to spend twice as much to host my software, and applications can get better.”
When Ross mentioned that Mark Zuckerberg praised Groq for being “the only ones who launched it with the full quality,” he inadvertently revealed the industry’s quality crisis. This wasn’t just recognition. It was an indictment of every other provider cutting corners.
Ross spelled out the mechanics: “A lot of people do a lot of tricks to reduce the quality, not intentionally, but to lower their cost, improve their speed.” The techniques sound technical, but the impact is straightforward. Quantization reduces precision. Pruning removes parameters. Each optimization degrades model performance in ways enterprises may not detect until production fails.
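To make that trade-off concrete, here is a minimal, purely illustrative sketch in Python (an assumption for illustration, not any provider’s actual serving pipeline) of naive symmetric int8 quantization: the weights become smaller and faster to move, and every value picks up a rounding error that compounds across billions of parameters.

```python
# Illustrative sketch only: naive symmetric int8 quantization of a weight matrix,
# showing the precision that cheaper, faster inference trades away.
import numpy as np

rng = np.random.default_rng(0)
weights = rng.normal(0.0, 0.02, size=(4096, 4096)).astype(np.float32)

# Map the float range onto [-127, 127] with a single scale factor.
scale = np.abs(weights).max() / 127.0
quantized = np.round(weights / scale).astype(np.int8)   # 4x smaller than float32
restored = quantized.astype(np.float32) * scale

# The round-trip error is the per-weight "quality tax"; summed over billions of
# parameters and dozens of layers, it surfaces as degraded model output.
error = np.abs(weights - restored)
print(f"mean abs error: {error.mean():.2e}, max abs error: {error.max():.2e}")
```

Production serving stacks use far more careful schemes, such as per-channel scales and calibration data, which is precisely why output quality varies so much from one inference provider to the next.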
The Standard Oil parallel Ross drew illuminates the stakes. Providers betting that enterprises won’t notice the difference between 95% and 100% accuracy are betting against companies like Meta that have the sophistication to measure the degradation.
This creates immediate imperatives for enterprise buyers.
- Establish quality benchmarks before selecting providers.
- Audit existing inference partners for undisclosed optimizations.
- Accept that premium pricing for full model fidelity is now a permanent market feature. The era of assuming functional equivalence across inference providers ended when Zuckerberg called out the difference.
The $1-per-million-tokens paradox
The most revealing moment came when the panel discussed pricing. Lie highlighted an uncomfortable truth for the industry: “If these million tokens are as valuable as we believe they can be, right? That’s not about moving words. You don’t charge $1 for moving words. I pay my lawyer $800 for an hour to write a two-page memo.”
This observation cuts to the heart of AI’s price discovery problem. The industry is racing to drive token costs below $1.50 per million while claiming those same tokens will transform every aspect of business. The panel implicitly agreed that the math doesn’t add up.
“Pretty much everyone is spending, like all of these fast-growing startups, the amount that they’re spending on tokens as a service almost matches their revenue one to one,” Ross revealed. This 1:1 spend ratio on AI tokens versus revenue represents an unsustainable business model that panel participants contend the “factory” narrative conveniently ignores.
Performance changes everything
Cerebras and Groq aren’t just competing on price; they are also competing on performance. They’re fundamentally changing what is possible in terms of inference speed. “With the wafer scale technology that we’ve built, we’re enabling 10 times, sometimes 50 times, faster performance than even the fastest GPUs today,” Lie said.
This isn’t an incremental improvement. It’s enabling entirely new use cases. “We have customers who have agentic workflows that might take 40 minutes, and they want these things to run in real time,” Lie explained. “These things just aren’t even possible, even if you’re willing to pay top dollar.”
The speed differential creates a bifurcated market that defies factory standardization. Enterprises needing real-time inference for customer-facing applications can’t use the same infrastructure as those running overnight batch processes.
The real bottleneck: power and data centers
While everyone focuses on chip supply, the panel revealed the actual constraint throttling AI deployment. “Data center capacity is a big problem. You can’t really find data center space in the U.S.,” Patel said. “Power is a big problem.”
The infrastructure challenge goes beyond chip manufacturing to fundamental resource constraints. As Patel explained, “TSMC in Taiwan is able to make over $200 million worth of chips, right? It’s not even… it’s the speed at which they scale up is ridiculous.”
But chip production means nothing without infrastructure. “The reason we see these big Middle East deals, and partially why both of these companies have big presences in the Middle East is, it’s power,” Patel revealed. The global scramble for compute has enterprises “going across the world to get wherever power does exist, wherever data center capacity exists, wherever there are electricians who can build these electrical systems.”
Google’s ‘success disaster’ becomes everyone’s reality
Ross shared a telling anecdote from Google’s history: “There was a term that became very popular at Google in 2015 called Success Disaster. Some of the teams had built AI applications that began to work better than human beings for the first time, and the demand for compute was so high, they were going to need to double or triple the global data center footprint quickly.”
This pattern now repeats across every enterprise AI deployment. Applications either fail to gain traction or experience hockey stick growth that immediately hits infrastructure limits. There’s no middle ground, no smooth scaling curve that factory economics would predict.
What this means for enterprise AI strategy
For CIOs, CISOs and AI leaders, the panel’s revelations demand strategic recalibration:
Capacity planning requires new models. Traditional IT forecasting assumes linear growth. AI workloads break this assumption. When successful applications increase token consumption by 30% monthly, annual capacity plans become obsolete within quarters. Enterprises must shift from static procurement cycles to dynamic capacity management. Build contracts with burst provisions. Monitor usage weekly, not quarterly. Accept that AI scaling resembles viral adoption curves, not traditional enterprise software rollouts, as the back-of-envelope sketch below shows.
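A quick illustration of that compounding (the 3x headroom figure is an assumption for illustration, not a number from the panel): 30% month-over-month growth overruns a generously sized static plan in a matter of months.

```python
# Back-of-envelope sketch: 30% month-over-month token growth vs. a static annual plan
# sized at 3x current consumption (an illustrative assumption).
monthly_growth = 0.30
tokens = 1.0          # normalized current monthly token consumption
plan_headroom = 3.0   # capacity procured up front for the year

for month in range(1, 13):
    tokens *= 1 + monthly_growth
    if tokens > plan_headroom:
        print(f"Plan exhausted in month {month} at {tokens:.1f}x baseline")
        break

# Compounded over a full year, 30% monthly growth is roughly a 23x increase.
print(f"Year-end consumption: {1.3 ** 12:.0f}x baseline")
```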
Speed premiums are permanent. The idea that inference will commoditize to uniform pricing ignores the massive performance gaps between providers. Enterprises need to budget for speed where it matters.
Architecture beats optimization. Groq and Cerebras aren’t winning by doing GPUs better. They’re winning by rethinking the fundamental architecture of AI compute. Enterprises that bet everything on GPU-based infrastructure may find themselves stuck in the slow lane.
Power infrastructure is strategic. The constraint isn’t chips or software but kilowatts and cooling. Smart enterprises are already locking in power capacity and data center space for 2026 and beyond.
The infrastructure reality enterprises can’t ignore
The panel revealed a fundamental truth: the AI factory metaphor isn’t just wrong, it’s dangerous. Enterprises building strategies around commodity inference pricing and standardized delivery are planning for a market that doesn’t exist.
The real market operates on three brutal realities.
- Capacity scarcity creates power inversions, where suppliers dictate terms and enterprises beg for allocations.
- Quality variance, the difference between 95% and 100% accuracy, determines whether your AI applications succeed or catastrophically fail.
- Infrastructure constraints, not technology, set the binding limits on AI transformation.
The path forward for CISOs and AI leaders requires abandoning factory thinking entirely. Lock in power capacity now. Audit inference providers for hidden quality degradation. Build vendor relationships based on architectural advantages, not marginal cost savings. Most critically, accept that paying 70% margins for reliable, high-quality inference may be your smartest investment.
The alternative chip makers at Transform didn’t just challenge Nvidia’s narrative. They revealed that enterprises face a choice: pay for quality and performance, or join the weekly negotiation meetings. The panel’s consensus was clear: success requires matching specific workloads to appropriate infrastructure rather than pursuing one-size-fits-all solutions.
