
Alibaba’s new open source model QwQ-32B matches DeepSeek R1 with way smaller compute requirements




Qwen Team, a division of Chinese e-commerce giant Alibaba developing its growing family of open-source Qwen large language models (LLMs), has introduced QwQ-32B, a new 32-billion-parameter reasoning model designed to improve performance on complex problem-solving tasks through reinforcement learning (RL).

The model is available as open-weight on Hugging Face and on ModelScope under an Apache 2.0 license. This means it’s available for commercial and research uses, so enterprises can employ it immediately to power their products and applications (even ones they charge customers to use).

Individual users can also access it via Qwen Chat.

Qwen-with-Questions was Alibaba’s answer to OpenAI’s original reasoning model o1

QwQ, short for Qwen-with-Questions, was first introduced by Alibaba in November 2024 as an open-source reasoning model aimed at competing with OpenAI’s o1-preview.

At launch, the model was designed to enhance logical reasoning and planning by reviewing and refining its own responses during inference, a technique that made it particularly effective in math and coding tasks.

The initial version of QwQ featured 32 billion parameters and a 32,000-token context length, with Alibaba highlighting its ability to outperform o1-preview in mathematical benchmarks like AIME and MATH, as well as scientific reasoning tasks such as GPQA.

Despite its strengths, QwQ’s early iterations struggled with programming benchmarks like LiveCodeBench, where OpenAI’s models maintained an edge. Additionally, as with many emerging reasoning models, QwQ faced challenges such as language mixing and occasional circular reasoning loops.

However, Alibaba’s decision to release the model under an Apache 2.0 license ensured that developers and enterprises could freely adapt and commercialize it, distinguishing it from proprietary alternatives like OpenAI’s o1.

Since QwQ’s initial release, the AI landscape has evolved rapidly. The limitations of traditional LLMs have become more apparent, with scaling laws yielding diminishing returns in performance improvements.

This shift has fueled interest in large reasoning models (LRMs) — a new category of AI systems that use inference-time reasoning and self-reflection to enhance accuracy. These include OpenAI’s o3 series and the massively successful DeepSeek-R1 from rival Chinese lab DeepSeek, an offshoot of Hong Kong quantitative analysis firm High-Flyer Capital Management.

A new report from web traffic analytics and research firm SimilarWeb found that since R1’s launch in January 2025, DeepSeek has rocketed up the charts to become the second most-visited AI model-provider website, behind only OpenAI.

Credit: SimilarWeb, AI Global Sector Trends on Generative AI

QwQ-32B, Alibaba’s latest iteration, builds on these advancements by integrating RL and structured self-questioning, positioning it as a serious competitor in the growing field of reasoning-focused AI.

Scaling up performance with multi-stage reinforcement learning

Traditional instruction-tuned models often struggle with difficult reasoning tasks, but the Qwen Team’s research suggests that RL can significantly improve a model’s ability to solve complex problems.

QwQ-32B builds on this idea by implementing a multi-stage RL training approach to enhance mathematical reasoning, coding proficiency and general problem-solving.

The model has been benchmarked against leading alternatives such as DeepSeek-R1, o1-mini and DeepSeek-R1-Distilled-Qwen-32B, demonstrating competitive results despite having fewer parameters than some of these models.

For example, while DeepSeek-R1 operates with 671 billion parameters (37 billion of them activated at a time), QwQ-32B achieves comparable performance with a much smaller footprint. It typically requires about 24 GB of VRAM on a single GPU (Nvidia’s H100 carries 80 GB), versus more than 1,500 GB of VRAM spread across roughly 16 Nvidia A100 GPUs to run the full DeepSeek-R1, highlighting the efficiency of Qwen’s RL approach.

QwQ-32B follows a causal language model architecture and includes several optimizations:

  • 64 transformer layers with RoPE, SwiGLU, RMSNorm and Attention QKV bias;
  • Grouped-query attention (GQA) with 40 attention heads for queries and 8 for key-value pairs;
  • Extended context length of 131,072 tokens, allowing for better handling of long-sequence inputs;
  • Multi-stage training including pretraining, supervised fine-tuning and RL.
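To make the GQA numbers concrete, here is a back-of-the-envelope sketch of why using 8 key-value heads instead of 40 shrinks the KV cache fivefold. The per-head dimension of 128 and the fp16 cache format are assumptions typical of Qwen-family models, not figures stated here:

```python
# Rough KV-cache sizing for a GQA model with QwQ-32B's published shape.
# Assumed values (not from the article): head_dim=128, fp16 cache (2 bytes).
LAYERS = 64      # transformer layers (from the spec above)
KV_HEADS = 8     # key-value heads under GQA
Q_HEADS = 40     # query heads (what full multi-head attention would cache)
HEAD_DIM = 128   # assumed per-head dimension
BYTES = 2        # fp16

def kv_cache_bytes_per_token(kv_heads: int) -> int:
    # Two tensors (K and V) per layer, each kv_heads * head_dim values.
    return 2 * LAYERS * kv_heads * HEAD_DIM * BYTES

gqa = kv_cache_bytes_per_token(KV_HEADS)  # 262,144 bytes, about 0.25 MB/token
mha = kv_cache_bytes_per_token(Q_HEADS)   # what 40 KV heads would cost
print(gqa, mha // gqa)  # -> 262144 5
```

Under these assumptions, even the full 131,072-token context needs roughly 32 GB of KV cache rather than five times that, which is part of how a 32B model stays deployable on modest hardware.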

The RL process for QwQ-32B was executed in two phases:

  1. Math and coding focus: The model was trained using an accuracy verifier for mathematical reasoning and a code execution server for coding tasks. This approach ensured that generated answers were validated for correctness before being reinforced.
  2. General capability enhancement: In a second phase, the model received reward-based training using general reward models and rule-based verifiers. This stage improved instruction following, human alignment and agent reasoning without compromising its math and coding capabilities.
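The two phase-1 verifiers can be sketched in a few lines. This is an illustrative reconstruction, not Qwen’s actual reward code; the function names and the normalization are invented for the example:

```python
# Illustrative sketch of verifier-based rewards (not Qwen's real pipeline).

def math_reward(model_answer: str, reference: str) -> float:
    """Accuracy verifier: reward 1.0 only if the final answer matches the
    reference after simple normalization (a large simplification)."""
    norm = lambda s: s.strip().rstrip(".").replace(" ", "")
    return 1.0 if norm(model_answer) == norm(reference) else 0.0

def code_reward(program: str, tests: list) -> float:
    """Code-execution verifier: run the generated program and return the
    fraction of unit tests it passes (a real server would sandbox this)."""
    env = {}
    exec(program, env)  # never exec untrusted output outside a sandbox
    passed = sum(1 for fn, arg, want in tests if env[fn](arg) == want)
    return passed / len(tests)

# Only validated-correct answers earn full reward:
print(math_reward(" 42. ", "42"))                    # 1.0
print(code_reward("def sq(x): return x * x",
                  [("sq", 3, 9), ("sq", 2, 5)]))     # 0.5
```

The key design point carried over from the article: answers are checked for correctness (by a verifier or an execution server) before being reinforced, rather than scored only by a learned reward model.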

What it means for enterprise decision-makers

For enterprise leaders—including CEOs, CTOs, IT leaders, team managers and AI application developers—QwQ-32B represents a potential shift in how AI can support business decision-making and technical innovation.

With its RL-driven reasoning capabilities, the model can provide more accurate, structured and context-aware insights, making it valuable for use cases such as automated data analysis, strategic planning, software development and intelligent automation.

Companies looking to deploy AI solutions for complex problem-solving, coding assistance, financial modeling or customer service automation may find QwQ-32B’s efficiency an attractive option. Additionally, its open-weight availability allows organizations to fine-tune and customize the model for domain-specific applications without proprietary restrictions, making it a flexible choice for enterprise AI strategies.

The fact that it comes from a Chinese e-commerce giant may raise security and bias concerns for some non-Chinese users, especially when using the Qwen Chat interface. But as with DeepSeek-R1, the model’s availability on Hugging Face for download, offline use, fine-tuning and retraining suggests those concerns can be overcome fairly easily, and it stands as a viable alternative to DeepSeek-R1.

Early reactions from AI power users and influencers

The release of QwQ-32B has already gained attention from the AI research and development community, with several developers and industry professionals sharing their initial impressions on X (formerly Twitter):

  • Hugging Face’s Vaibhav Srivastav (@reach_vb) highlighted QwQ-32B’s speed in inference thanks to provider Hyperbolic Labs, calling it “blazingly fast” and comparable to top-tier models. He also noted that the model “beats DeepSeek-R1 and OpenAI o1-mini with Apache 2.0 license.”
  • AI news and rumor publisher Chubby (@kimmonismus) was impressed by the model’s performance, emphasizing that QwQ-32B sometimes outperforms DeepSeek-R1, despite being 20 times smaller. “Holy moly! Qwen cooked!” they wrote.
  • Yuchen Jin (@Yuchenj_UW), co-founder and CTO of Hyperbolic Labs, celebrated the release by noting the efficiency gains. “Small models are so powerful! Alibaba Qwen released QwQ-32B, a reasoning model that beats DeepSeek-R1 (671B) and OpenAI o1-mini!”
  • Another Hugging Face team member, Erik Kaunismäki (@ErikKaum) emphasized the ease of deployment, sharing that the model is available for one-click deployment on Hugging Face endpoints, making it accessible to developers without extensive setup.

Agentic capabilities

QwQ-32B incorporates agentic capabilities, allowing it to dynamically adjust reasoning processes based on environmental feedback.

For optimal performance, Qwen Team recommends using the following inference settings:

  • Temperature: 0.6
  • TopP: 0.95
  • TopK: Between 20 and 40
  • YaRN Scaling: Recommended for handling sequences longer than 32,768 tokens
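For readers unfamiliar with what the TopK and TopP knobs do, here is a minimal, purely pedagogical sketch of top-k plus nucleus (top-p) filtering over a toy probability distribution; it is not the sampler QwQ-32B actually ships with:

```python
# Toy illustration of top-k / top-p (nucleus) filtering over token
# probabilities. Pedagogical sketch only, not a production sampler.

def filter_candidates(probs, top_k=40, top_p=0.95):
    """Keep the top_k most likely tokens, then trim to the smallest
    prefix whose cumulative probability reaches top_p."""
    ranked = sorted(range(len(probs)), key=lambda i: -probs[i])[:top_k]
    kept, cum = [], 0.0
    for i in ranked:
        kept.append(i)
        cum += probs[i]
        if cum >= top_p:
            break
    return kept  # indices of tokens still eligible for sampling

# With a peaked distribution, few tokens survive nucleus filtering:
print(filter_candidates([0.5, 0.3, 0.15, 0.05], top_k=3, top_p=0.7))  # [0, 1]
```

Temperature (the third recommended setting) acts earlier in the pipeline, scaling the logits before they become probabilities: values below 1.0 sharpen the distribution, values above 1.0 flatten it.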

The model supports deployment using vLLM, a high-throughput inference framework. However, current implementations of vLLM only support static YaRN scaling, which maintains a fixed scaling factor regardless of input length.
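In practice, enabling static YaRN usually means adding a `rope_scaling` block to the model’s `config.json`. The values below follow the pattern Qwen has published for its long-context models (a 4.0 factor over the 32,768-token native window), but the QwQ-32B model card should be treated as the authority on the exact recommended settings:

```json
{
  "rope_scaling": {
    "factor": 4.0,
    "original_max_position_embeddings": 32768,
    "type": "yarn"
  }
}
```

Because the scaling is static, vLLM applies the factor to every request regardless of length, which can slightly degrade quality on short prompts; it is therefore sensible to enable it only when inputs longer than 32,768 tokens are actually expected.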

Future developments

Qwen’s team sees QwQ-32B as the first step in scaling RL to enhance reasoning capabilities. Looking ahead, the team plans to:

  • Further explore scaling RL to improve model intelligence;
  • Integrate agents with RL for long-horizon reasoning;
  • Continue developing foundation models optimized for RL;
  • Move toward artificial general intelligence (AGI) through more advanced training techniques.

With QwQ-32B, Qwen Team is positioning RL as a key driver of the next generation of AI models, demonstrating that scaling can produce highly performant and effective reasoning systems.

Shape
Shape
Stay Ahead

Explore More Insights

Stay ahead with more perspectives on cutting-edge power, infrastructure, energy,  bitcoin and AI solutions. Explore these articles to uncover strategies and insights shaping the future of industries.

Shape

SolarWinds buys Squadcast to speed incident response

Squadcast customers shared their experiences with the technology. “Since implementing Squadcast, we’ve reduced incoming alerts from tens of thousands to hundreds, thanks to flexible deduplication. It has a direct impact on reducing alert fatigue and increasing awareness,” said Avner Yaacov, Senior Manager at Redis, in a statement. According to SolarWinds,

Read More »

Tariffs won’t impact IT organizations, for now anyway

The idea behind tariffs is to increase domestic manufacturing, but Almassy notes the United Stated doesn’t have the manufacturing capacity or capability that Taiwan does. TSMC, Samsung, and GlobalFoundries have some fabs here but they are not building the most leading edge technologies. “Those are all in Taiwan at the

Read More »

Massachusetts cuts $500M from state’s energy efficiency program

Massachusetts utility regulators on Friday approved a three-year budget for the state’s energy efficiency program Mass Save, cutting $500 million from the original $5 billion proposal. “Even with this budget reduction, the state’s energy efficiency program will continue to provide customers with billions in savings and benefits each year by supporting improvements to homes and businesses like energy efficient heating systems and appliances as well as low-cost weatherization,” the Department of Public Utilities said in a statement. Mass Save has helped customers save 18 million MWh of annual electricity consumption in the last 15 years, DPU said. The budget reduction will lower total residential program budgets by 25% for gas and 15% for electric, regulators said. Even with the reduction, the program’s 2025-2027 budget remains higher than its $3.95 billion 2022-2024 budget, according to Western Mass News. “The decrease will vary by utility provider, as the utilities must work together to reduce the total budget of Mass Save,” DPU said. Each utility budgets for their own programs that are then collectively proposed in the total budget, the agency explained. Mass Save members include Eversource Energy, Berkshire Gas, Cape Light Compact, Liberty Utilities, National Grid and Unitil. Eversource issued a statement commending the DPU “for listening to customer concerns about affordability and taking the difficult action.” “This is the most immediate step the state can take to provide long-term rate relief to customers and ensure that the pace of the energy transition in Massachusetts is affordable and attainable,” Eversource said. “To be clear, we are steadfastly committed to the Mass Save programs, which are essential to meeting the commonwealth’s decarbonization goals.” Massachusetts is targeting net zero emissions by 2050. Efficiency advocates say the DPU’s decision will result in $1.5 billion loss in benefits and savings. “Paring back energy efficiency programs designed

Read More »

Oil Futures Plunge Amid Trade Wars and Supply Hike

Oil plunged to the lowest in about six months as US President Donald Trump’s trade wars hammer the outlook for demand just as OPEC+ signals it’s ready to start opening the taps on supply. Brent crude plummeted 2.4% to settle just above $69, while West Texas Intermediate dropped 2.9% to settle near $66. Both closing prices were the lowest since early September. Global benchmark Brent at one point grazed the lowest level since December 2021 during the session, before paring losses. Trump’s trade measures are threatening to reduce global energy demand and redraw oil flows, though just how they will play out depends on their final makeup and duration, both of which remain uncertain. On the supply side, OPEC+ nations are forging ahead with a scheduled production hike and US domestic stockpiles swelled last week, adding to expectations of a surplus. Crude has trended lower since mid-January as Trump’s policies raised fears of multiple trade wars. Oil options traders are the most bearish in five months, and volumes of bearish put contracts surged Tuesday. EIA Data: Crude +3,614k Bbl, Median Est. +800k Bbl The mounting gloom is leading many firms to revise price forecasts lower. Industry consultant Enverus downgraded its view for Brent to $70 a barrel for the year from $80 a barrel previously. Morgan Stanley cut its price forecast by $5 to $70 a barrel for the second quarter of 2025. Citigroup sees Brent sliding to $60 a barrel. “The market is repricing the downside risk in crude, shifting from a $65 floor in WTI to closer to $60,” said Rebecca Babin, a senior energy trader at CIBC Private Wealth Group. “At this point, the focus has completely shifted from supply risks to demand concerns, which could signal we’re approaching a bottom.” Oil Prices: WTI for April delivery

Read More »

UK Drafts Post-2030 Oil Tax Plan to Replace Windfall Duties

The UK government is drafting a new tax regime for oil and gas companies to replace a controversial windfall levy after 2030, with an aim to hit companies only when prices are unusually high.  The new mechanism will be permanent, which the government says will give the industry more predictability, and will mean higher payments only if there’s a need to respond to oil and gas “price shocks,” according to a consultation published on Wednesday. The industry is expected to contribute £19 billion ($24.5 billion) in tax receipts between now and 2030. “We will ensure that it minimizes distortions on investment decisions when prices are not unusually high,” Treasury Secretary James Murray said in the statement.  The previous Conservative government imposed a windfall tax on surging oil and gas profits during the energy crisis three years ago, and Labour will use some of the proceeds to help fund its state-owned company GB Energy. Maintaining a tough stance on oil and gas firms formed a prominent part of the Labour campaign during the last election.  The government is seeking input from the industry as it tries to define two thresholds — one for oil and one for gas — to use in the new regime. The UK also confirmed plans to end new North Sea exploration licenses for oil and gas, in line with the government’s manifesto commitment, but specified that project extensions — blocks where there “is a valid license” — will not be banned.  The move to design a new tax regime comes after persistent calls from the nation’s top producers for more clarity on duties and drilling permits to allow investment decisions.  While global energy prices have retreated from 2022 peaks, the Energy Profits Levy continued to rise, bringing the total tax rate to 78% late last year and prompting dramatic cuts in industry forecasts

Read More »

DOE Issues Export Approval to Golden Pass LNG, Accelerating President Trump’s Pledge to Restore American Energy Dominance

WASHINGTON—U.S. Secretary of Energy Chris Wright today approved an LNG export permit extension for Golden Pass LNG Terminal LLC (Golden Pass), marking yet another step toward meeting President Trump’s and Secretary Wright’s commitment to unleash American energy dominance and restore regular order to liquefied natural gas (LNG) export reviews. The approval will grant additional time to begin LNG exports from the Golden Pass LNG Terminal, currently under construction in Sabine Pass, Texas. “Exporting U.S. LNG supports American jobs, bolsters our national security and strengthens America’s position as a world energy leader. President Trump has pledged to restore energy dominance for the American people, and I am proud to help deliver on that agenda with today’s permit extension,” said Secretary Wright. The issuance to Golden Pass marks the third LNG-related approval from DOE since President Trump took office, following an export approval to Commonwealth LNG on February 14 and an order on rehearing removing barriers for the use of LNG as bunkering fuel announced on February 28. “Golden Pass was the first project approved for exports to non-free trade agreement countries by DOE during the first Trump Administration, and it is gratifying that this project is so close to being able to deliver its first LNG,” said Tala Goudarzi, Acting Principal Deputy Assistant Secretary of the Office of Fossil Energy and Carbon Management. Golden Pass, owned by QatarEnergy and ExxonMobil, is set to begin exporting as early as later this year, and once operational, will become the ninth large-scale export terminal operating in the United States. Once completed, Golden Pass will be able to export up to 2.57 billion cubic feet per day (Bcf/d) of natural gas as LNG and will bring unprecedented levels of LNG exports from the United States.  ###

Read More »

USA Energy Sec Says We’re at the Start of a New Manhattan Project

We’re at the start of a new Manhattan Project. That’s what U.S. Secretary of Energy Chris Wright said in a release posted on the U.S. Department of Energy (DOE) website on Saturday. Wright made the statement “after participating in the launch of an AI collaboration session involving more than 1,000 Energy Department scientists and OpenAI employees”, the release highlighted. The DOE release noted that Wright joined Senator Bill Hagerty, Chairman of House Committee on Energy and Waters Chuck Fleischmann, and OpenAI President and Co-Founder Greg Brockman,   at the ‘1,000 Scientist AI Jam Session’ on February 28. This first of its kind event co-hosted by OpenAI and nine of the U.S. Department of Energy’s national labs explored how AI can accelerate scientific discovery, the release stated. “More than 70 years ago, experts from the Department of Energy’s Oak Ridge National Lab joined with innovators from around the United States in one of the greatest scientific and engineering accomplishments in history: the Manhattan Project,” Wright said in the release. “We’re at the start of a new Manhattan Project. With President Trump’s leadership, the United States will win the global AI race, but first, we must unleash our energy dominance and restore American competitiveness,” he added. “Today’s collaboration of DOE’s national labs and technology companies is an important step in our efforts to secure America’s future,” he continued. Hagerty, Fleischmann, and Brockman also made statements following the event, the DOE release highlighted. “It was great to join Secretary Wright and Representative Chuck Fleischmann this morning in Oak Ridge,” Hagerty said in the release. “In order for the U.S. to win the Artificial Intelligence race, we need computing power and energy, and Tennessee has both,” he added. “The U.S. can lead in this space with the Volunteer State at the tip of the spear. I

Read More »

US SMR deals remain elusive for NuScale

Dive Brief: NuScale is in “advanced commercial dialogue with major technology and industrial companies, utilities, and national and local governments” on potential small modular reactor deals as it looks forward to U.S. Nuclear Regulatory Commission approval for its 77-MW reactor uprate application later this year, the nuclear technology company said in a Monday earnings release. The company ordered “long-lead materials” for six additional modules from supply chain partner Doosan Enerbility in anticipation of customer orders, and continues to advance a 462-MW power plant project in Romania as a subcontractor to Fluor Corp., CEO John Hopkins said on the company’s Monday earnings call. But NuScale has yet to finalize a deal with any U.S. data center operators or other industrial customers due to “the complexity of putting these projects together,” Hopkins said. Dive Insight: In a Q4 and full-year 2024 earnings presentation that noted a significant revenue boost toward the end of last year, NuScale detailed “broad customer interest” in its technology, which it said can support multiple industries and grid use cases. Those grid use cases include resiliency, since reactors like NuScale’s can continue operating through outages and remain available once grid connectivity is restored, and microgrid support for mission-critical facilities like hospitals, the company said. NuScale reactors can also firm an increasingly renewables-heavy grid and replace baseload capacity lost due to coal-fired power plant retirements, the company said. Last year, a Department of Energy report on coal-to-nuclear transition potential found up to 174 GW of potential nuclear electric-generating capacity at 145 existing U.S. coal sites. The company presentation also highlighted four industrial opportunities: data centers and AI, carbon capture and sequestration, water desalination and hydrogen production. 
New nuclear power capacity is eligible for the maximum $3/kg credit for clean hydrogen production under the Inflation Reduction Act’s Section 45V tax

Read More »

AI driving a 165% rise in data center power demand by 2030

Goldman Sachs Research estimates the power usage by the global data center market to be around 55 gigawatts, which breaks down as 54% for cloud computing workloads, 32% for traditional line of business workloads and 14% for AI. By 2027, that number jumps to 84 GW, with AI growing to 27% of the overall market, cloud dropping to 50%, and traditional workloads falling to 23%, Schneider stated. Goldman Sachs Research estimates that there will be around 122 GW of data center capacity online by the end of 2030, and the density of power use in data centers is likely to grow as well, from 162 kilowatts per square foot to 176 KW per square foot in 2027, thanks to AI, Schneider stated.  “Data center supply — specifically the rate at which incremental supply is built — has been constrained over the past 18 months,” Schneider wrote. These constraints have arisen from the inability of utilities to expand transmission capacity because of permitting delays, supply chain bottlenecks, and infrastructure that is both costly and time-intensive to upgrade. The result is that due to power demand from data centers, there will need to be additional utility investment, to the tune of about $720 billion of grid spending through 2030. And then they are subject to the pace of public utilities, which move much slower than hyperscalers. “These transmission projects can take several years to permit, and then several more to build, creating another potential bottleneck for data center growth if the regions are not proactive about this given the lead time,” Schneider wrote.

Read More »

Top data storage certifications to sharpen your skills

Organization: Hitachi Vantara Skills acquired: Knowledge of data center infrastructure management tasks automation using Hitachi Ops Center Automator. Price: $100 Exam duration: 60 minutes How to prepare: Knowledge of all storage-related operations from an end-user perspective, including planning, allocating, and managing storage and architecting storage layouts. Read more about Hitachi Vantara’s training and certification options here. Certifications that bundle cloud, networking and storage skills AWS Certified Solutions Architect – Professional The AWS Certified Solutions Architect – Professional certification from leading cloud provider Amazon Web Services (AWS) helps individuals showcase advanced knowledge and skills in optimizing security, cost, and performance, and automating manual processes. The certification is a means for organizations to identify and develop talent with these skills for implementing cloud initiatives, according to AWS. The ideal candidate has the ability to evaluate cloud application requirements, make architectural recommendations for deployment of applications on AWS, and provide expert guidance on architectural design across multiple applications and projects within a complex organization, AWS says. Certified individuals report increased credibility with technical colleagues and customers as a result of earning this certification, it says. Organization: Amazon Web Services Skills acquired: Helps individuals showcase skills in optimizing security, cost, and performance, and automating manual processes Price: $300 Exam duration: 180 minutes How to prepare: The recommended experience prior to taking the exam is two or more years of experience in using AWS services to design and implement cloud solutions Cisco Certified Internetwork Expert (CCIE) Data Center The Cisco CCIE Data Center certification enables individuals to demonstrate advanced skills to plan, design, deploy, operate, and optimize complex data center networks. 
They will gain comprehensive expertise in orchestrating data center infrastructure, focusing on seamless integration of networking, compute, and storage components. Other skills gained include building scalable, low-latency, high-performance networks that are optimized to support artificial intelligence (AI)

Read More »

Netskope expands SASE footprint, bolsters AI and automation

Netskope is expanding its global presence by adding multiple regions to its NewEdge carrier-grade infrastructure, which now includes more than 75 locations to ensure processing remains close to end users. The secure access service edge (SASE) provider also enhanced its digital experience monitoring (DEM) capabilities with AI-powered root-cause analysis and automated network diagnostics. “We are announcing continued expansion of our infrastructure and our continued focus on resilience. I’m a believer that nothing gets adopted if end users don’t have a great experience,” says Netskope CEO Sanjay Beri. “We monitor traffic, we have multiple carriers in every one of our more than 75 regions, and when traffic goes from us to that destination, the path is direct.” Netskope added regions including data centers in Calgary, Helsinki, Lisbon, and Prague as well as expanded existing NewEdge regions including data centers in Bogota, Jeddah, Osaka, and New York City. Each data center offers customers a range of SASE capabilities including cloud firewalls, secure web gateway (SWG), inline cloud access security broker (CASB), zero trust network access (ZTNA), SD-WAN, secure service edge (SSE), and threat protection. The additional locations enable Netskope to provide coverage for more than 220 countries and territories with 200 NewEdge Localization Zones, which deliver a local direct-to-net digital experience for users, the company says.

Read More »

Inside the Nuclear Race for Data Center Energy with Aalo Atomics CEO Matt Loszak

The latest episode of the DCF Show podcast delves into one of the most pressing challenges facing the data center industry today: the search for sustainable, high-density power solutions. And how, as hyperscale operators like Google and Meta contend with growing energy demands—and, in some cases, resistance from utilities unwilling or unable to support their expanding footprints—the conversation around nuclear energy has intensified.  Both legacy nuclear providers and innovative startups are racing to secure the future business of data center giants, each bringing unique approaches to the table. Our guest for this podcast episode is Matt Loszak, co-founder and CEO of Aalo Atomics, an Austin-based company that’s taking a fresh approach to nuclear energy. Aalo, which secured a $29.5 million Series A funding round in 2024, stands out in the nuclear sector with its 10-megawatt sodium-cooled reactor design—eliminating the need for water, a critical advantage for siting flexibility. Inspired by the Department of Energy’s MARVEL microreactor, Aalo’s technology benefits from direct expertise, as the company’s CTO was the chief architect behind MARVEL. Beyond reactor design, Aalo’s vision extends to full-scale modular plant production. Instead of just building reactors, the company aims to manufacture entire nuclear plants using prefabricated, LEGO-style components. The fully modular plants, shipped in standard containers, are designed to match the footprint of a data center while requiring no onsite water—features that could make them particularly attractive to hyperscale operators seeking localized, high-density power.  Aalo has already made significant strides, with the Department of Energy identifying land at Idaho National Laboratory (INL) as a potential site for its first nuclear facility. 
The company is on an accelerated timeline, expecting to complete a non-nuclear prototype within three months and break ground on its first nuclear reactor in about a year—remarkably fast progress for the nuclear industry. In our discussion,

Read More »

Does It Matter If Microsoft Is Cancelling AI Data Center Leases?

Strategic Reallocation: Microsoft is a major owner and operator of data centers and might be reallocating resources to in-house infrastructure rather than leased spaces. Supply Chain Delays: TD Cowen noted that Microsoft used power and facility delays as justifications for voiding agreements, a tactic previously employed by Meta. Oversupply Issues: Analysts at TD Cowen speculate that Microsoft may have overestimated AI demand, leading to an excess in capacity. As it is all speculation, it could simply be that the latest information has driven Microsoft to reevaluate demand and move to more closely align projected supply with projected demand. Microsoft has reiterated their commitment to spend $80 billion on AI in the coming year. Reallocating this spending internally or wit a different set of partners remains on the table. And when you put the TD Cowen report that Microsoft has cancelled leases for “a couple hundred megawatts” into context with Microsoft’s overall leased power, which is estimated at around 20 GW, you see that more than 98% of their energy commitment remains unchanged. Investment Markets Might See the Biggest Hits Microsoft’s retreat has had ripple effects on the stock market, particularly among energy and infrastructure companies. European firms like Schneider Electric and Siemens Energy experienced a decline in stock value, indicating fears that major AI companies might scale back energy-intensive data center investments. However, at press time we have not seen any other indicators that this is an issue as despite these concerns about potential AI overcapacity, major tech firms continue to invest heavily in AI infrastructure:         Amazon: Pledged $100 billion towards AI data centers.         Alphabet (Google): Committed $75 billion.         Meta (Facebook): Planning to spend up to $65 billion.         Alibaba: Announced a $53 billion investment over the next three years. If we see a rush of announcements

Read More »

Dual Feed: Vantage Data Centers, VoltaGrid, Equinix, Bloom Energy, Constellation, Calpine

Nuclear Giant Constellation Acquires Natural Gas Stalwart Calpine, Creating the Largest U.S. Clean Energy Provider

On January 10, 2025, Constellation (Nasdaq: CEG) announced a definitive agreement to acquire Calpine Corp. in a $16.4 billion cash-and-stock transaction, including the assumption of $12.7 billion in net debt. A landmark transaction, the acquisition positions Constellation as the largest clean energy provider in the United States, significantly enhancing its generation portfolio with natural gas and geothermal assets. With an expanded coast-to-coast footprint, the combined company will provide 60 GW of power, reinforcing grid reliability and offering businesses and consumers a broader array of sustainability solutions. The move strengthens Constellation’s competitive retail electricity presence, serving 2.5 million customers across key U.S. markets, including Texas, California, and the Northeast. “This acquisition will help us better serve our customers across America, from families to businesses and utilities,” said Joe Dominguez, president and CEO of Constellation. “By combining Constellation’s unmatched expertise in zero-emission nuclear energy with Calpine’s industry-leading, low-carbon natural gas and geothermal generation, we can deliver the most comprehensive clean energy portfolio in the industry.”

A Strategic Move for the Data Center Industry

With skyrocketing demand for AI and cloud services, data centers are under increasing pressure to secure reliable, low-carbon energy sources. The Constellation-Calpine combination is particularly relevant for large-scale hyperscale operators and colocation providers seeking flexible energy solutions. For the data center industry, this consolidation offers several advantages:
Diverse Energy Mix: The integration of nuclear, geothermal, and low-emission natural gas provides data centers with flexible and reliable energy options.
Grid Stability: Calpine’s extensive natural gas fleet enhances grid reliability, crucial for data centers operating in high-demand regions.
Sustainability Initiatives: The combined entity is well-positioned to invest in clean energy infrastructure, including battery storage and carbon sequestration, aligning with the sustainability goals of hyperscale operators.
The

Read More »

Microsoft will invest $80B in AI data centers in fiscal 2025

And Microsoft isn’t the only one that is ramping up its investments into AI-enabled data centers. Rival cloud service providers are all investing in either upgrading or opening new data centers to capture a larger chunk of business from developers and users of large language models (LLMs). In a report published in October 2024, Bloomberg Intelligence estimated that demand for generative AI would push Microsoft, AWS, Google, Oracle, Meta, and Apple to devote, between them, $200 billion to capex in 2025, up from $110 billion in 2023. Microsoft is one of the biggest spenders, followed closely by Google and AWS, Bloomberg Intelligence said. Its estimate of Microsoft’s capital spending on AI, at $62.4 billion for calendar 2025, is lower than Microsoft President Brad Smith’s claim that the company will invest $80 billion in the fiscal year to June 30, 2025. Both figures, though, are way higher than Microsoft’s 2020 capital expenditure of “just” $17.6 billion. The majority of the increased spending is tied to cloud services and the expansion of AI infrastructure needed to provide compute capacity for OpenAI workloads. Separately, last October Amazon CEO Andy Jassy said his company planned total capex spend of $75 billion in 2024 and even more in 2025, with much of it going to AWS, its cloud computing division.

Read More »

John Deere unveils more autonomous farm machines to address skilled labor shortage

Self-driving tractors might be the path to self-driving cars. John Deere has revealed a new line of autonomous machines and tech across agriculture, construction and commercial landscaping. Moline, Illinois-based John Deere has been in business for 187 years, yet it’s been a regular as a non-tech company showing off technology at the big tech trade show in Las Vegas, and it is back at CES 2025 with more autonomous tractors and other vehicles. This is not something we usually cover, but John Deere has a lot of data that is interesting in the big picture of tech. The message from the company is that there aren’t enough skilled farm laborers to do the work that its customers need. It’s been a challenge for most of the last two decades, said Jahmy Hindman, CTO at John Deere, in a briefing. Much of the tech will come this fall and after that. He noted that the average farmer in the U.S. is over 58 and works 12 to 18 hours a day to grow food for us. And he said the American Farm Bureau Federation estimates there are roughly 2.4 million farm jobs that need to be filled annually, and the agricultural workforce continues to shrink. (This is my hint to the anti-immigration crowd.) John Deere’s autonomous 9RX Tractor. Farmers can oversee it using an app. While each of these industries experiences its own set of challenges, a commonality across all is skilled labor availability. In construction, about 80% of contractors struggle to find skilled labor. And in commercial landscaping, 86% of landscaping business owners can’t find labor to fill open positions, he said. “They have to figure out how to do

Read More »

2025 playbook for enterprise AI success, from agents to evals

2025 is poised to be a pivotal year for enterprise AI. The past year has seen rapid innovation, and this year will see the same. This has made it more critical than ever to revisit your AI strategy to stay competitive and create value for your customers. From scaling AI agents to optimizing costs, here are the five critical areas enterprises should prioritize for their AI strategy this year.

1. Agents: the next generation of automation

AI agents are no longer theoretical. In 2025, they’re indispensable tools for enterprises looking to streamline operations and enhance customer interactions. Unlike traditional software, agents powered by large language models (LLMs) can make nuanced decisions, navigate complex multi-step tasks, and integrate seamlessly with tools and APIs. At the start of 2024, agents were not ready for prime time, making frustrating mistakes like hallucinating URLs. They started getting better as frontier large language models themselves improved. “Let me put it this way,” said Sam Witteveen, cofounder of Red Dragon, a company that develops agents for businesses and recently reviewed the 48 agents it built last year. “Interestingly, the ones that we built at the start of the year, a lot of those worked way better at the end of the year just because the models got better.” Witteveen shared this in the video podcast we filmed to discuss these five big trends in detail. Models are getting better and hallucinating less, and they’re also being trained to do agentic tasks. Another feature that the model providers are researching is a way to use the LLM as a judge, and as models get cheaper (something we’ll cover below), companies can use three or more models to
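The LLM-as-judge pattern mentioned above can be sketched in a few lines: several cheaper models draft answers, and a judge model scores each one. This is a minimal illustration, not any provider's actual API; `call_model` is a hypothetical stand-in that a real implementation would replace with a chat-completion client.

```python
# Minimal sketch of "LLM as a judge": cheaper models draft answers,
# a judge model scores them, and the highest-scoring answer wins.
# `call_model` is a hypothetical stub standing in for a real chat API.

def call_model(model: str, prompt: str) -> str:
    # Placeholder responses; a real implementation calls a provider's API.
    if model == "judge":
        return "9" if "Paris" in prompt else "2"
    drafts = {
        "model-a": "The capital of France is Lyon.",
        "model-b": "Paris is the capital of France.",
    }
    return drafts.get(model, "")

def judge_score(question: str, answer: str) -> int:
    # Ask the judge model for a 0-10 correctness score, clamped and
    # defaulting to 0 if the reply isn't a parseable integer.
    prompt = (
        "Rate this answer from 0 to 10 for correctness.\n"
        f"Question: {question}\nAnswer: {answer}\nScore:"
    )
    try:
        return max(0, min(10, int(call_model("judge", prompt).strip())))
    except ValueError:
        return 0

def best_answer(question: str, models: list[str]) -> str:
    # Collect one draft per model and keep the judge's favorite.
    candidates = [call_model(m, question) for m in models]
    return max(candidates, key=lambda a: judge_score(question, a))

print(best_answer("What is the capital of France?", ["model-a", "model-b"]))
```

In practice the judge is usually a stronger (or at least different) model than the drafters, and its prompt includes a rubric; as model prices fall, running three or more drafters plus a judge becomes economical.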

Read More »

OpenAI’s red teaming innovations define new essentials for security leaders in the AI era

OpenAI has taken a more aggressive approach to red teaming than its AI competitors, demonstrating its security teams’ advanced capabilities in two areas: multi-step reinforcement learning and external red teaming. OpenAI recently released two papers that set a new competitive standard for improving the quality, reliability and safety of AI models in these two techniques and more. The first paper, “OpenAI’s Approach to External Red Teaming for AI Models and Systems,” reports that specialized teams outside the company have proven effective in uncovering vulnerabilities that might otherwise have made it into a released model because in-house testing techniques may have missed them. In the second paper, “Diverse and Effective Red Teaming with Auto-Generated Rewards and Multi-Step Reinforcement Learning,” OpenAI introduces an automated framework that relies on iterative reinforcement learning to generate a broad spectrum of novel, wide-ranging attacks. Going all-in on red teaming pays practical, competitive dividends. It’s encouraging to see competitive intensity in red teaming growing among AI companies. When Anthropic released its AI red team guidelines in June of last year, it joined AI providers including Google, Microsoft, Nvidia, OpenAI, and even the U.S. National Institute of Standards and Technology (NIST), all of which had released red teaming frameworks. Investing heavily in red teaming yields tangible benefits for security leaders in any organization. OpenAI’s paper on external red teaming provides a detailed analysis of how the company strives to create specialized external teams that include cybersecurity and subject matter experts. The goal is to see if knowledgeable external teams can defeat models’ security perimeters and find gaps in their security, biases and controls that prompt-based testing couldn’t find.
What makes OpenAI’s recent papers noteworthy is how well they define using human-in-the-middle

Read More »