DeepSeek R1’s bold bet on reinforcement learning: How it outpaced OpenAI at 3% of the cost

DeepSeek R1’s Monday release has sent shockwaves through the AI community, disrupting assumptions about what’s required to achieve cutting-edge AI performance. Matching OpenAI’s o1 at just 3%-5% of the cost, this open-source model has not only captivated developers but also challenged enterprises to rethink their AI strategies.

The model has rocketed to the top of Hugging Face’s trending downloads (109,000 as of this writing) as developers rush to try it out and to understand what it means for their AI development. Users are commenting that DeepSeek’s accompanying search feature (available on DeepSeek’s site) is now superior to competitors like OpenAI and Perplexity, and is rivaled only by Google’s Gemini Deep Research.

The implications for enterprise AI strategies are profound: With reduced costs and open access, enterprises now have an alternative to costly proprietary models like OpenAI’s. DeepSeek’s release could democratize access to cutting-edge AI capabilities, enabling smaller organizations to compete effectively in the AI arms race.

This story focuses on exactly how DeepSeek managed this feat and what it means for the vast number of users of AI models. For enterprises developing AI-driven solutions, DeepSeek’s breakthrough challenges assumptions of OpenAI’s dominance and offers a blueprint for cost-efficient innovation. It’s the “how” of what DeepSeek did that should be most educational here.

DeepSeek’s breakthrough: Moving to pure reinforcement learning

In November, DeepSeek made headlines with its announcement that it had achieved performance surpassing OpenAI’s o1, but at the time it only offered a limited R1-lite-preview model. With Monday’s full release of R1 and the accompanying technical paper, the company revealed a surprising innovation: a deliberate departure from the conventional supervised fine-tuning (SFT) process widely used in training large language models (LLMs).

SFT, a standard step in AI development, involves training models on curated datasets to teach step-by-step reasoning, often referred to as chain-of-thought (CoT). It is considered essential for improving reasoning capabilities. However, DeepSeek challenged this assumption by skipping SFT entirely, opting instead to rely on reinforcement learning (RL) to train the model.
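
To make the conventional approach concrete, here is a minimal sketch of how SFT data for chain-of-thought reasoning is typically structured: each curated record pairs a question with a worked-out reasoning trace that the model learns to imitate. The records and formatting template below are illustrative assumptions, not DeepSeek’s (or anyone’s) actual dataset.

```python
# A minimal sketch of how supervised fine-tuning (SFT) data for chain-of-thought
# reasoning is typically structured. The example record and the template below
# are illustrative assumptions, not an actual training set.

cot_examples = [
    {
        "question": "A train travels 120 km in 2 hours. What is its average speed?",
        "reasoning": "Average speed is distance divided by time. 120 km / 2 h = 60 km/h.",
        "answer": "60 km/h",
    },
]

def format_sft_example(example: dict) -> dict:
    """Turn a curated CoT record into a (prompt, target) pair for supervised training."""
    prompt = f"Question: {example['question']}\nLet's think step by step."
    target = f"{example['reasoning']}\nFinal answer: {example['answer']}"
    return {"prompt": prompt, "target": target}

if __name__ == "__main__":
    for ex in cot_examples:
        pair = format_sft_example(ex)
        print(pair["prompt"])
        print(pair["target"])
```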

This bold move forced DeepSeek-R1 to develop independent reasoning abilities, avoiding the brittleness often introduced by prescriptive datasets. While some flaws emerged, leading the team to reintroduce a limited amount of SFT during the final stages of building the model, the results confirmed the fundamental breakthrough: reinforcement learning alone could drive substantial performance gains.

The company got much of the way there using open source – a conventional and unsurprising approach

First, some background on how DeepSeek got to where it did. DeepSeek, a 2023 spin-off from Chinese hedge fund High-Flyer Quant, began by developing AI models for its proprietary chatbot before releasing them for public use. Little is known about the company’s exact approach, but it quickly open-sourced its models, and it’s extremely likely that the company built upon open projects such as Meta’s Llama models and the ML library PyTorch.

To train its models, High-Flyer Quant secured over 10,000 Nvidia GPUs before U.S. export restrictions, and reportedly expanded to 50,000 GPUs through alternative supply routes, despite trade barriers. This pales in comparison to leading AI labs like OpenAI, Google, and Anthropic, which operate with more than 500,000 GPUs each.

DeepSeek’s ability to achieve competitive results with limited resources highlights how ingenuity and resourcefulness can challenge the high-cost paradigm of training state-of-the-art LLMs.

Despite speculation, DeepSeek’s full budget is unknown

DeepSeek reportedly trained its base model — called V3 — on a $5.58 million budget over two months, according to Nvidia engineer Jim Fan. While the company hasn’t divulged the exact training data it used (side note: critics say this means DeepSeek isn’t truly open-source), modern techniques make training on web and open datasets increasingly accessible. Estimating the total cost of training DeepSeek-R1 is challenging. While running 50,000 GPUs suggests significant expenditures (potentially hundreds of millions of dollars), precise figures remain speculative.
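
To see why such estimates vary so widely, here is a back-of-envelope calculation. Every number in it is an illustrative assumption rather than a reported figure, but it shows how quickly a large GPU fleet translates into nine-figure compute bills.

```python
# Back-of-envelope training-cost estimate. Every number below is an
# illustrative assumption, not a figure reported by DeepSeek or this article.
gpu_count = 50_000          # assumed cluster size
cost_per_gpu_hour = 2.00    # assumed $/GPU-hour (varies widely by chip and market)
training_days = 60          # assumed wall-clock duration

total_cost = gpu_count * cost_per_gpu_hour * training_days * 24
print(f"Estimated compute cost: ${total_cost:,.0f}")  # prints $144,000,000
```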

What’s clear, though, is that DeepSeek has been very innovative from the get-go. Last year, reports emerged about some of its early innovations, around techniques like mixture-of-experts (MoE) architectures and multi-head latent attention (MLA).

How DeepSeek-R1 got to the “aha moment”

The journey to DeepSeek-R1’s final iteration began with an intermediate model, DeepSeek-R1-Zero, which was trained using pure reinforcement learning. By relying solely on RL, DeepSeek incentivized this model to think independently, rewarding both correct answers and the logical processes used to arrive at them.
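
As a rough illustration of what that incentive looks like, here is a minimal sketch of a reward function that scores both a correct final answer and well-structured reasoning. The tag format, weights and matching rules are assumptions for illustration, not DeepSeek’s published reward design.

```python
import re

# A minimal, illustrative sketch of a rule-based reward signal of the kind the
# article describes: rewarding correct final answers plus well-formed reasoning.
# The tags, weights, and scoring rules are assumptions, not DeepSeek's design.

def reward(model_output: str, reference_answer: str) -> float:
    score = 0.0
    # Reward a correct final answer (exact match here; real systems use more
    # robust checkers, e.g. math verifiers or unit tests for code).
    answer_match = re.search(r"<answer>(.*?)</answer>", model_output, re.DOTALL)
    if answer_match and answer_match.group(1).strip() == reference_answer.strip():
        score += 1.0
    # Reward outputs that expose their reasoning in the expected structure.
    if re.search(r"<think>.*?</think>", model_output, re.DOTALL):
        score += 0.2
    return score

if __name__ == "__main__":
    out = "<think>2 + 2 = 4</think><answer>4</answer>"
    print(reward(out, "4"))  # prints 1.2
```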

This approach led to an unexpected phenomenon: The model began allocating additional processing time to more complex problems, demonstrating an ability to prioritize tasks based on their difficulty. DeepSeek’s researchers described this as an “aha moment,” where the model itself identified and articulated novel solutions to challenging problems (see screenshot below). This milestone underscored the power of reinforcement learning to unlock advanced reasoning capabilities without relying on traditional training methods like SFT.

Source: DeepSeek-R1 paper. Don’t let this graphic intimidate you. The key takeaway is the red line, where the model literally used the phrase “aha moment.” Researchers latched onto this as a striking example of the model’s ability to rethink problems in an anthropomorphic tone. The researchers said it was their own “aha moment” as well.

The researchers conclude: “It underscores the power and beauty of reinforcement learning: rather than explicitly teaching the model on how to solve a problem, we simply provide it with the right incentives, and it autonomously develops advanced problem-solving strategies.”

More than RL

However, it’s true that the model needed more than just RL. The paper notes that despite the RL producing unexpected and powerful reasoning behaviors, the intermediate DeepSeek-R1-Zero model faced some challenges, including poor readability and language mixing (starting in Chinese and switching over to English, for example). Only then did the team decide to create a new model, which would become the final DeepSeek-R1. This model, again built on the V3 base model, was first given a limited amount of SFT, focused on a “small amount of long CoT data,” or what the paper calls cold-start data, to address those challenges. After that, it was put through the same reinforcement learning process as R1-Zero, and the paper describes some final rounds of fine-tuning before release.
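
Schematically, the recipe looks something like the sketch below. The stage functions are placeholders and the boundaries are simplified for illustration; consult the DeepSeek-R1 paper for the actual procedure and hyperparameters.

```python
# A schematic sketch of the multi-stage recipe described above. The stage
# functions are placeholders (no real training happens here); names and
# boundaries are simplifications, not the paper's exact recipe.

def supervised_fine_tune(model, dataset):
    """Placeholder for an SFT stage; a real run trains on (prompt, target) pairs."""
    return model

def reinforcement_learning(model, prompts):
    """Placeholder for a reasoning-oriented RL stage with rule-based rewards."""
    return model

def train_r1(v3_base_model, cold_start_cot_data, rl_prompts, final_sft_data):
    # Stage 1: limited SFT on a small amount of long chain-of-thought
    # ("cold start") data to address readability and language mixing.
    model = supervised_fine_tune(v3_base_model, cold_start_cot_data)
    # Stage 2: the same large-scale RL process applied to R1-Zero.
    model = reinforcement_learning(model, rl_prompts)
    # Stage 3: final rounds of fine-tuning before release.
    model = supervised_fine_tune(model, final_sft_data)
    return model
```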

The ramifications

One question is why there has been so much surprise at the release. It’s not as if open-source models are new. Open-source models have a huge logic and momentum behind them. Their free cost and malleability are why we reported recently that these models are going to win in the enterprise.

Meta’s open-weights model Llama 3, for example, exploded in popularity last year as it was fine-tuned by developers wanting their own custom models. Similarly, DeepSeek-R1 is already being used to distill its reasoning into an array of other, much smaller models – the difference being that DeepSeek offers industry-leading performance. This includes running tiny versions of the model on mobile phones, for example.
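
Distillation of this kind typically means sampling reasoning traces from the large “teacher” model and fine-tuning a smaller “student” on them. The sketch below is a hedged outline of that workflow with placeholder helpers; it is not DeepSeek’s exact distillation setup.

```python
# A minimal sketch of reasoning distillation as commonly practiced: sample
# reasoning traces from a large "teacher" model and fine-tune a smaller
# "student" on them. The helpers are placeholders and the prompts illustrative;
# this is not DeepSeek's exact distillation pipeline.

def generate_trace(teacher, prompt: str) -> str:
    """Placeholder: a real implementation would call the teacher model's API."""
    return f"<think>reasoning for: {prompt}</think><answer>...</answer>"

def fine_tune_step(student, prompt: str, target: str):
    """Placeholder for one supervised update on a (prompt, target) pair."""
    return student

def distill(teacher, student, prompts):
    # 1. Collect (prompt, teacher-trace) pairs from the large model.
    traces = [(p, generate_trace(teacher, p)) for p in prompts]
    # 2. Supervised fine-tuning of the student on the teacher's traces.
    for prompt, trace in traces:
        student = fine_tune_step(student, prompt, trace)
    return student
```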

DeepSeek-R1 not only performs better than the leading open-source alternative, Llama 3; it also shows the entire chain of thought behind its answers transparently. Meta’s Llama isn’t instructed to do this by default; it takes aggressive prompting to get Llama to do so.
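
Because the reasoning is exposed in the output, developers can inspect it programmatically. The sketch below assumes the reasoning is wrapped in <think> tags, as commonly seen in R1’s responses; treat the exact format as an assumption to verify against the API you actually use.

```python
import re

# A minimal sketch of separating the visible reasoning trace from the final
# answer in an R1-style response. The <think>-tag format is an assumption based
# on R1's commonly observed output; verify against the actual response format.

def split_reasoning(response: str) -> tuple[str, str]:
    match = re.search(r"<think>(.*?)</think>", response, re.DOTALL)
    reasoning = match.group(1).strip() if match else ""
    answer = re.sub(r"<think>.*?</think>", "", response, flags=re.DOTALL).strip()
    return reasoning, answer

if __name__ == "__main__":
    resp = "<think>The user asked for 2+2, so compute 4.</think>The answer is 4."
    reasoning, answer = split_reasoning(resp)
    print("Reasoning:", reasoning)
    print("Answer:", answer)
```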

The transparency has also given a PR black eye to OpenAI, which has so far hidden its chains of thought from users, citing competitive reasons and a desire not to confuse users when a model gets something wrong. Transparency allows developers to pinpoint and address errors in a model’s reasoning, streamlining customizations to meet enterprise requirements more effectively.

For enterprise decision-makers, DeepSeek’s success underscores a broader shift in the AI landscape: leaner, more efficient development practices are increasingly viable. Organizations may need to reevaluate their partnerships with proprietary AI providers, considering whether the high costs associated with these services are justified when open-source alternatives can deliver comparable, if not superior, results.

To be sure, no massive lead

While DeepSeek’s innovation is groundbreaking, by no means has it established a commanding market lead. Because it published its research, other model companies will learn from it and adapt. Meta and Mistral, the French open-source model company, may be a beat behind, but it will probably be only a few months before they catch up. As Meta’s chief AI scientist Yann LeCun put it: “The idea is that everyone profits from everyone else’s ideas. No one ‘outpaces’ anyone and no country ‘loses’ to another. No one has a monopoly on good ideas. Everyone’s learning from everyone else.” So it’s execution that matters.

Ultimately, it’s the consumers, startups and other users who will win the most, because DeepSeek’s offerings will continue to drive the price of using these models toward zero (aside, again, from the cost of running models at inference). This rapid commoditization could pose challenges – indeed, massive pain – for leading AI providers that have invested heavily in proprietary infrastructure. As many commentators have put it, including Chamath Palihapitiya, an investor and former executive at Meta, this could mean that years of OpEx and CapEx spending by OpenAI and others will be wasted.

There is substantial commentary about whether it is ethical to use the DeepSeek-R1 model because of the biases instilled in it by Chinese laws, for example that it shouldn’t answer questions about the Chinese government’s brutal crackdown at Tiananmen Square. Despite these ethical concerns, many developers view the biases as infrequent edge cases in real-world applications that can be mitigated through fine-tuning. Moreover, they point to different but analogous biases held by models from OpenAI and other companies. Meta’s Llama has emerged as a popular open model despite its datasets not being made public, its own hidden biases, and the lawsuits filed against it as a result.

Questions abound about the ROI of OpenAI’s big investments

This all raises big questions about the investment plans pursued by OpenAI, Microsoft and others. OpenAI’s $500 billion Stargate project reflects its commitment to building massive data centers to power its advanced models. Backed by partners like Oracle and Softbank, this strategy is premised on the belief that achieving artificial general intelligence (AGI) requires unprecedented compute resources. However, DeepSeek’s demonstration of a high-performing model at a fraction of the cost challenges the sustainability of this approach, raising doubts about OpenAI’s ability to deliver returns on such a monumental investment.

Entrepreneur and commentator Arnaud Bertrand captured this dynamic, contrasting China’s frugal, decentralized innovation with the U.S. reliance on centralized, resource-intensive infrastructure: “It’s about the world realizing that China has caught up — and in some areas overtaken — the U.S. in tech and innovation, despite efforts to prevent just that.” Indeed, yesterday another Chinese company, ByteDance, announced Doubao-1.5-pro, which includes a “Deep Thinking” mode that surpasses OpenAI’s o1 on the AIME benchmark.

Want to dive deeper into how DeepSeek-R1 is reshaping AI development? Check out our in-depth discussion on YouTube, where I explore this breakthrough with ML developer Sam Witteveen. Together, we break down the technical details, implications for enterprises, and what this means for the future of AI.
