Alibaba’s new open source model QwQ-32B matches DeepSeek R1 with way smaller compute requirements

Qwen Team, a division of Chinese e-commerce giant Alibaba developing its growing family of open-source Qwen large language models (LLMs), has introduced QwQ-32B, a new 32-billion-parameter reasoning model designed to improve performance on complex problem-solving tasks through reinforcement learning (RL).

The model is available as an open-weight release on Hugging Face and on ModelScope under an Apache 2.0 license. This means it’s available for commercial and research uses, so enterprises can employ it immediately to power their products and applications (even ones they charge customers to use).

Individual users can also access it via Qwen Chat.

Qwen-with-Questions was Alibaba’s answer to OpenAI’s original reasoning model o1

QwQ, short for Qwen-with-Questions, was first introduced by Alibaba in November 2024 as an open-source reasoning model aimed at competing with OpenAI’s o1-preview.

At launch, the model was designed to enhance logical reasoning and planning by reviewing and refining its own responses during inference, a technique that made it particularly effective in math and coding tasks.

The initial version of QwQ featured 32 billion parameters and a 32,000-token context length, with Alibaba highlighting its ability to outperform o1-preview in mathematical benchmarks like AIME and MATH, as well as scientific reasoning tasks such as GPQA.

Despite its strengths, QwQ’s early iterations struggled with programming benchmarks like LiveCodeBench, where OpenAI’s models maintained an edge. Additionally, as with many emerging reasoning models, QwQ faced challenges such as language mixing and occasional circular reasoning loops.

However, Alibaba’s decision to release the model under an Apache 2.0 license ensured that developers and enterprises could freely adapt and commercialize it, distinguishing it from proprietary alternatives like OpenAI’s o1.

Since QwQ’s initial release, the AI landscape has evolved rapidly. The limitations of traditional LLMs have become more apparent, with scaling laws yielding diminishing returns in performance improvements.

This shift has fueled interest in large reasoning models (LRMs) — a new category of AI systems that use inference-time reasoning and self-reflection to enhance accuracy. These include OpenAI’s o3 series and the massively successful DeepSeek-R1 from rival Chinese lab DeepSeek, an offshoot of Hong Kong quantitative analysis firm High-Flyer Capital Management.

A new report from web traffic analytics and research firm SimilarWeb found that since the launch of R1 back in January 2025, DeepSeek has rocketed up the charts to become the most-visited AI model-providing website behind OpenAI.

Credit: SimilarWeb, AI Global: Global Sector Trends on Generative AI

QwQ-32B, Alibaba’s latest iteration, builds on these advancements by integrating RL and structured self-questioning, positioning it as a serious competitor in the growing field of reasoning-focused AI.

Scaling up performance with multi-stage reinforcement learning

Traditional instruction-tuned models often struggle with difficult reasoning tasks, but the Qwen Team’s research suggests that RL can significantly improve a model’s ability to solve complex problems.

QwQ-32B builds on this idea by implementing a multi-stage RL training approach to enhance mathematical reasoning, coding proficiency and general problem-solving.

The model has been benchmarked against leading alternatives such as DeepSeek-R1, o1-mini and DeepSeek-R1-Distill-Qwen-32B, demonstrating competitive results despite having fewer parameters than some of these models.

For example, DeepSeek-R1 operates with 671 billion parameters (with 37 billion activated), while QwQ-32B achieves comparable performance with a much smaller footprint. It typically requires 24 GB of VRAM on a GPU (Nvidia’s H100s have 80 GB), compared with more than 1,500 GB of VRAM for running the full DeepSeek-R1 (16 Nvidia A100 GPUs), highlighting the efficiency of Qwen’s RL approach.
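The footprint comparison can be approximated with simple arithmetic. The sketch below assumes that weight memory is roughly parameter count times bytes per parameter; the function name and the chosen precisions are illustrative, and KV cache, activations and runtime overhead are ignored, so real requirements would be higher. Under that assumption, a figure near 24 GB for a 32-billion-parameter model would be consistent with quantized weights rather than 16-bit ones.

```python
# Back-of-the-envelope VRAM needed just to hold model weights.
# Assumption: memory ~= parameter_count * bytes_per_parameter.
# KV cache, activations and framework overhead are deliberately ignored.

def weight_memory_gb(params_billion: float, bits_per_param: int) -> float:
    """Return approximate weight storage in decimal gigabytes."""
    bytes_total = params_billion * 1e9 * bits_per_param / 8
    return bytes_total / 1e9


if __name__ == "__main__":
    # Parameter counts are the ones quoted in the article.
    for name, params in [("QwQ-32B", 32), ("DeepSeek-R1", 671)]:
        for bits in (16, 8, 4):
            gb = weight_memory_gb(params, bits)
            print(f"{name:12s} {bits:2d}-bit weights: {gb:7.1f} GB")
```

At 4 bits per parameter the 32-billion-parameter weights come to about 16 GB, and at 16 bits to about 64 GB; the 671-billion-parameter weights at 16 bits come to about 1,342 GB before any overhead.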

QwQ-32B follows a causal language model architecture and includes several optimizations:

  • 64 transformer layers with RoPE, SwiGLU, RMSNorm and Attention QKV bias;
  • Grouped query attention (GQA) with 40 attention heads for queries and 8 for key-value pairs;
  • Extended context length of 131,072 tokens, allowing for better handling of long-sequence inputs;
  • Multi-stage training including pretraining, supervised fine-tuning and RL.
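To illustrate what the 40-query-head, 8-key-value-head split means in practice, here is a minimal sketch of how query heads could map onto shared key-value heads and what that does to cache size. The layer and head counts come from the list above; the head dimension of 128 and the 16-bit cache entries are assumptions for illustration, not figures from the article.

```python
# Sketch of how grouped query attention shares key-value heads.
# From the article: 64 layers, 40 query heads, 8 key-value heads.
# Assumptions (not in the article): head_dim = 128 and a 16-bit KV cache.

NUM_LAYERS = 64
NUM_Q_HEADS = 40
NUM_KV_HEADS = 8
HEAD_DIM = 128         # assumed
BYTES_PER_VALUE = 2    # assumed 16-bit cache entries


def kv_head_for(query_head: int) -> int:
    """Map a query head index to the KV head its group shares."""
    group_size = NUM_Q_HEADS // NUM_KV_HEADS  # 5 query heads per KV head
    return query_head // group_size


def kv_cache_bytes_per_token(num_kv_heads: int) -> int:
    """Bytes of cache per token: keys and values, for every layer."""
    return 2 * NUM_LAYERS * num_kv_heads * HEAD_DIM * BYTES_PER_VALUE


if __name__ == "__main__":
    print([kv_head_for(h) for h in range(NUM_Q_HEADS)])
    gqa = kv_cache_bytes_per_token(NUM_KV_HEADS)
    # Hypothetical baseline in which every query head keeps its own KV head.
    mha = kv_cache_bytes_per_token(NUM_Q_HEADS)
    print(f"GQA: {gqa} bytes/token, no sharing: {mha} bytes/token")
```

With these assumed values, sharing 8 key-value heads among 40 query heads cuts the per-token cache to one fifth of the no-sharing baseline, which matters most at long context lengths.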

The RL process for QwQ-32B was executed in two phases:

  1. Math and coding focus: The model was trained using an accuracy verifier for mathematical reasoning and a code execution server for coding tasks. This approach ensured that generated answers were validated for correctness before being reinforced.
  2. General capability enhancement: In a second phase, the model received reward-based training using general reward models and rule-based verifiers. This stage improved instruction following, human alignment and agent reasoning without compromising its math and coding capabilities.
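The first phase relies on outcome checks rather than a learned reward model. A toy version of that idea might look like the sketch below: a reward of 1.0 when a verifier accepts the output and 0.0 otherwise. Every function name here is hypothetical, and this is not the Qwen Team's training code; in particular, the use of exec() stands in for a sandboxed code execution server.

```python
# Illustrative outcome-based rewards: 1.0 if a verifier accepts, else 0.0.
# All names are hypothetical; this is not Qwen's training code.

def math_reward(model_answer: str, reference: str) -> float:
    """Accuracy-verifier stand-in: compare final answers numerically."""
    try:
        close = abs(float(model_answer) - float(reference)) < 1e-9
        return 1.0 if close else 0.0
    except ValueError:
        # Fall back to exact string match for non-numeric answers.
        return 1.0 if model_answer.strip() == reference.strip() else 0.0


def code_reward(source: str, func_name: str, test_cases: list) -> float:
    """Execution-server stand-in: run code against (args, expected) pairs.

    exec() is for illustration only; a real system needs sandboxing.
    """
    namespace: dict = {}
    try:
        exec(source, namespace)
        func = namespace[func_name]
        passed = all(func(*args) == expected for args, expected in test_cases)
    except Exception:
        # Syntax errors, missing functions and runtime failures all score zero.
        return 0.0
    return 1.0 if passed else 0.0


if __name__ == "__main__":
    print(math_reward("0.50", "0.5"))
    good = "def add(a, b):\n    return a + b\n"
    bad = "def add(a, b):\n    return a - b\n"
    tests = [((1, 2), 3), ((0, 0), 0)]
    print(code_reward(good, "add", tests), code_reward(bad, "add", tests))
```

Because the reward depends only on whether the final answer or the test suite passes, it cannot be satisfied by output that merely looks plausible, which is the property the article attributes to this phase.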

What it means for enterprise decision-makers

For enterprise leaders—including CEOs, CTOs, IT leaders, team managers and AI application developers—QwQ-32B represents a potential shift in how AI can support business decision-making and technical innovation.

With its RL-driven reasoning capabilities, the model can provide more accurate, structured and context-aware insights, making it valuable for use cases such as automated data analysis, strategic planning, software development and intelligent automation.

Companies looking to deploy AI solutions for complex problem-solving, coding assistance, financial modeling or customer service automation may find QwQ-32B’s efficiency an attractive option. Additionally, its open-weight availability allows organizations to fine-tune and customize the model for domain-specific applications without proprietary restrictions, making it a flexible choice for enterprise AI strategies.

The fact that it comes from a Chinese e-commerce giant may raise some security and bias concerns for some non-Chinese users, especially when using the Qwen Chat interface. But as with DeepSeek-R1, the fact that the model is available on Hugging Face for download and offline usage and fine-tuning or retraining suggests that these can be overcome fairly easily. And it is a viable alternative to DeepSeek-R1.

Early reactions from AI power users and influencers

The release of QwQ-32B has already gained attention from the AI research and development community, with several developers and industry professionals sharing their initial impressions on X (formerly Twitter):

  • Hugging Face’s Vaibhav Srivastav (@reach_vb) highlighted QwQ-32B’s speed in inference thanks to provider Hyperbolic Labs, calling it “blazingly fast” and comparable to top-tier models. He also noted that the model “beats DeepSeek-R1 and OpenAI o1-mini with Apache 2.0 license.”
  • AI news and rumor publisher Chubby (@kimmonismus) was impressed by the model’s performance, emphasizing that QwQ-32B sometimes outperforms DeepSeek-R1, despite being 20 times smaller. “Holy moly! Qwen cooked!” they wrote.
  • Yuchen Jin (@Yuchenj_UW), co-founder and CTO of Hyperbolic Labs, celebrated the release by noting the efficiency gains. “Small models are so powerful! Alibaba Qwen released QwQ-32B, a reasoning model that beats DeepSeek-R1 (671B) and OpenAI o1-mini!”
  • Another Hugging Face team member, Erik Kaunismäki (@ErikKaum), emphasized the ease of deployment, sharing that the model is available for one-click deployment on Hugging Face endpoints, making it accessible to developers without extensive setup.

Agentic capabilities

QwQ-32B incorporates agentic capabilities, allowing it to dynamically adjust reasoning processes based on environmental feedback.

For optimal performance, Qwen Team recommends using the following inference settings:

  • Temperature: 0.6
  • TopP: 0.95
  • TopK: Between 20 and 40
  • YaRN Scaling: Recommended for handling sequences longer than 32,768 tokens
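For readers unfamiliar with the first three settings, the following is a minimal pure-Python sketch of what temperature, top-k and top-p do to a next-token distribution. It uses one common ordering (temperature, then top-k, then top-p); real inference engines may differ in details, and the function names and toy logits are made up for illustration.

```python
import math
import random


def filtered_distribution(logits, temperature=0.6, top_k=40, top_p=0.95):
    """Return {token_index: probability} after temperature, top-k and top-p."""
    # Temperature: divide logits before softmax; values below 1 sharpen it.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    weights = [math.exp(x - peak) for x in scaled]
    total = sum(weights)
    probs = sorted(((w / total, i) for i, w in enumerate(weights)), reverse=True)

    # Top-k: keep only the k most likely tokens, then renormalize.
    probs = probs[:top_k]
    norm_k = sum(p for p, _ in probs)
    probs = [(p / norm_k, i) for p, i in probs]

    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for p, i in probs:
        kept.append((p, i))
        cumulative += p
        if cumulative >= top_p:
            break

    # Renormalize so the surviving probabilities sum to 1.
    norm = sum(p for p, _ in kept)
    return {i: p / norm for p, i in kept}


def sample(logits, rng=random, **settings):
    """Draw one token index from the filtered distribution."""
    dist = filtered_distribution(logits, **settings)
    return rng.choices(list(dist), weights=list(dist.values()), k=1)[0]


if __name__ == "__main__":
    toy_logits = [2.0, 1.0, 0.5, -1.0, -3.0]  # made-up values
    print(filtered_distribution(toy_logits))
    print(sample(toy_logits, rng=random.Random(0)))
```

The defaults mirror the recommended values, with top_k set to 40 as one point in the suggested range. Lower temperature and tighter top-k or top-p cutoffs both reduce the chance of sampling unlikely tokens.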

The model supports deployment using vLLM, a high-throughput inference framework. However, current implementations of vLLM only support static YaRN scaling, which maintains a fixed scaling factor regardless of input length.
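A static factor can be derived from the two context lengths mentioned in this article. The sketch below builds a rope-scaling entry of the kind commonly placed in a Hugging Face-style config.json; the key names are an assumption to verify against the model card, and the helper function is hypothetical.

```python
import json

ORIGINAL_CONTEXT = 32_768   # threshold from the article
TARGET_CONTEXT = 131_072    # extended context length from the article


def yarn_rope_scaling(original: int, target: int) -> dict:
    """Build a static YaRN rope-scaling entry covering `target` tokens.

    Key names are assumed, not confirmed by the article.
    """
    return {
        "type": "yarn",
        "factor": target / original,
        "original_max_position_embeddings": original,
    }


if __name__ == "__main__":
    entry = {"rope_scaling": yarn_rope_scaling(ORIGINAL_CONTEXT, TARGET_CONTEXT)}
    print(json.dumps(entry, indent=2))
```

Because the factor is fixed, it applies to short prompts as well as long ones, so a setting like this is probably best enabled only when inputs are actually expected to exceed 32,768 tokens.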

Future developments

The Qwen Team sees QwQ-32B as the first step in scaling RL to enhance reasoning capabilities. Looking ahead, the team plans to:

  • Further explore scaling RL to improve model intelligence;
  • Integrate agents with RL for long-horizon reasoning;
  • Continue developing foundation models optimized for RL;
  • Move toward artificial general intelligence (AGI) through more advanced training techniques.

With QwQ-32B, Qwen Team is positioning RL as a key driver of the next generation of AI models, demonstrating that scaling can produce highly performant and effective reasoning systems.

Shape
Shape
Stay Ahead

Explore More Insights

Stay ahead with more perspectives on cutting-edge power, infrastructure, energy,  bitcoin and AI solutions. Explore these articles to uncover strategies and insights shaping the future of industries.

Shape

TotalEnergies farms out 40% participating interest in certain licenses offshore Nigeria to Chevron

@import url(‘https://fonts.googleapis.com/css2?family=Inter:[email protected]&display=swap’); a { color: var(–color-primary-main); } .ebm-page__main h1, .ebm-page__main h2, .ebm-page__main h3, .ebm-page__main h4, .ebm-page__main h5, .ebm-page__main h6 { font-family: Inter; } body { line-height: 150%; letter-spacing: 0.025em; font-family: Inter; } button, .ebm-button-wrapper { font-family: Inter; } .label-style { text-transform: uppercase; color: var(–color-grey); font-weight: 600; font-size: 0.75rem; } .caption-style

Read More »

Harbour Energy to Buy Waldorf Subsidiaries for $170MM

Harbour Energy Plc, one of the largest independent oil and gas firms in the UK, agreed to pay $170 million for all the subsidiaries of Waldorf Energy Partners Ltd. and Waldorf Production Ltd. The deal will add 20,000 barrels of oil equivalent a day to production and increases the company’s share of the Catcher field in the North Sea to 90% from 50%, Harbour Energy said in a statement on Friday. The subsidiaries are currently in administration and the acquisition will release an estimated $350 million of cash posted to secure Waldorf’s decommissioning liabilities, it added. Harbour Energy shares gained as much as 7.6% in London trading, the most since August. Many oil and gas companies, already suffering declines in production at mature fields in the British North Sea, have been reassessing their activities after a UK windfall tax was extended and increased several times. Harbour Energy, which completed the acquisition of Wintershall Dea’s non-Russian assets last year, operates in nine countries, including in Norway, Germany, Argentina, Mexico and North Africa.  “This transaction is an important step for Harbour in the UK North Sea, building on the action we’ve already taken to sustain our position in the basin given the ongoing fiscal and regulatory challenges,” said Scott Barr, managing director of Harbour Energy’s UK business unit. Harbour Energy accounts for about 15% of UK’s total oil and gas production, pumping 156,000 barrels of oil equivalent daily in the first nine months of the year.  The sale could be a step toward ending Waldorf’s struggle to get a debt restructuring over the line. In August, the UK’s High Court of Justice in August rejected a plan proposed by the company on the basis that it had “not discharged the burden on it of showing the plan is fair and that it is appropriate, just

Read More »

Oil Drifts Lower Despite Geopolitical Tensions

Oil prices edged down in choppy trading, with US crude falling to the lowest since May, as weakness in US equities markets added to bearish sentiment about oversupply. West Texas Intermediate settled below $58 a barrel, the lowest since May, while global benchmark Brent slumped to the weakest in about two months. Diesel futures, which were down about 1.4%, were the biggest drag on the oil complex on Friday, while a selloff in US stocks compounded declines. Thin trading activity ahead of the Christmas and New Year holidays, as well as traders being cautious about deploying risk after a tough year for profits, also contributed to choppy trading. Growing consensus about supplies exceeding demand next year has pushed crude toward the lower end of a band it has traded in since mid-October. Some traders are positioning for further declines as bearish bets on Brent crude reached their highest in seven weeks, according to data released on Friday. The International Energy Agency on Thursday reiterated its prediction for an unprecedented surplus, although slightly below its forecast last month, and said global inventories have swollen to a four-year high. Geopolitical tensions may add some support to oil prices. President Donald Trump announced new sanctions on three of Venezuelan counterpart Nicolas Maduro’s nephews as well as six oil tankers, after the US seized a supertanker off the coast of the Latin American nation on Wednesday. The ship seizure was just the beginning of a new phase in the Trump administration’s ramped-up pressure campaign against the Venezuelan president, according to people familiar with the operation. The act of economic statecraft is designed to deny Maduro a lifeline of oil revenue and force him to relinquish power, the people said. A murky outlook for a peace deal to end Russia’s war in Ukraine, which could

Read More »

New Supertankers Sail Empty to Collect Oil

A shortage of oil tankers is becoming so acute that newly built vessels, which usually carry refined fuels on their maiden voyages, are instead racing empty to pick up crude as soon as possible.  Six supertankers that were delivered this year have traveled without cargoes from East Asia to load crude in the Middle East, Africa or the Americas, ship-tracking and fixtures data reviewed by Bloomberg and Signal Ocean show. That compares with just one such journey last year. The Atrebates was delivered in early November. It sailed empty from China to the Middle East to pick up a crude cargo from Iraq, and is now headed for Gibraltar. Tanker owners about to receive new ships almost always use them to carry fuels like gasoline on their maiden voyages to pick up crude. This makes both economic and geographical sense, given that oil products are cleaner than crude and the vessels won’t need to be washed after carrying them, and also because many of the ships are built in East Asia, which imports a lot of unprocessed oil and exports refined fuels. A severe shortage of tankers is now upending that logic. Oil producers — both within and outside OPEC — have ramped up output this year. Western sanctions on Russia and the risk of traveling through the Red Sea, meanwhile, have disrupted traditional routes, resulting in longer voyages and more ships being used. Smaller product tankers have also been drawn into the oil trade, while some traders have had to break up cargoes due to the lack of larger vessels, pushing up transport costs even further. The Baltic Dirty Tanker Index, which tracks rates to carry crude oil on 12 major routes, has jumped more than 50% since the end of July, while the Baltic Clean Tanker Index only rose 12%. “When very large crude carriers

Read More »

Analyst Looks at Natural Gas Price Moves

In a natural gas focused EBW Analytics Group report sent to Rigzone by the EBW team on Friday, Eli Rubin, an energy analyst at the company, warned that late December heating demand “continues to disintegrate”. “Yesterday’s 177 billion cubic foot withdrawal did little to stop the massive sell-off in natural gas, with the NYMEX front-month plummeting to close at a seven-week low of $4.231 [per million British thermal units (MMBtu)],” Rubin said in the report. “Although a frigid early December may erode storage surpluses over the next two EIA [U.S. Energy Information Administration] reports, the market is focused on eroding late-December heating demand,” he added. In the report, Rubin noted that the week leading into Christmas “shed another seven gHDDs over the past 24 hours, with exceptionally mild weather anticipated across the country in the back half of the month”. “Daily demand may still surge into Sunday’s peak – but is expected to plunge 26 billion cubic feet per day [Bcfpd] into mid-next week, likely delivering a blow to physical gas prices,” he added. Rubin went on to warn in the report that technicals also appear weak, “with prices falling below the 20-day, 50-day, 100-day and 200-day moving averages”. “Shorts may take profits off the table ahead of the weekend, and medium to long term fundamentals appear more supportive than recent price action suggests, but momentum is bearish and this week’s 133 billion cubic foot loss of weather-driven demand will leave an enduring mark on NYMEX futures,” he said. This EBW report highlighted that the January natural gas contract closed at $4.231 per MMBtu on Thursday. It outlined that this was down 36.4 cents, or 7.9 percent, from Wednesday’s close. In an EBW report sent to Rigzone by the EBW team on December 10, Rubin highlighted that a “weather collapse

Read More »

EIA Ups Brent Price Forecast, Still Sees Drop in 2026

In its latest short term energy outlook (STEO), which was released on December 9, the U.S. Energy Information Administration (EIA) increased its Brent price forecast for 2025 and 2026 but still projected that the commodity will drop next year compared to 2025. According to its December STEO, the EIA now sees the Brent spot price averaging $68.91 per barrel this year and $55.08 per barrel next year. In its previous STEO, which was released in November, the EIA projected that the Brent spot price would average $68.76 per barrel in 2025 and $54.92 per barrel in 2026. The EIA’s October STEO forecast that the commodity would average $68.64 per barrel this year and $52.16 per barrel next year, and its September STEO saw the commodity coming in at $67.80 per barrel in 2025 and $51.43 per barrel in 2026. A quarterly breakdown included in the EIA’s latest STEO projected that the Brent spot price will average $63.10 per barrel in the fourth quarter of this year, $54.93 per barrel in the first quarter of next year, $54.02 per barrel in the second quarter, $55.32 per barrel in the third quarter, and $56.00 per barrel in the fourth quarter of 2026. The commodity averaged $75.83 per barrel in the first quarter of this year, $68.01 per barrel in the second quarter, and $69.00 per barrel in the third quarter, the EIA’s December STEO showed. It also pointed out that the Brent spot price averaged $80.56 per barrel overall last year. In its December STEO, the EIA highlighted that the Brent crude oil spot price averaged $64 per barrel in November, which it pointed out was $11 per barrel lower than in November 2024. “Crude oil prices continue to fall as growing crude oil production outweighs the effect of increased drone attacks

Read More »

BP Starts Up Atlantis Expansion Project in US Gulf

BP PLC said Thursday it has put onstream an expansion project in the Atlantis field in the Gulf of America that will add 15,000 barrels of oil equivalent a day to the deepwater development’s production capacity. Atlantis Drill Center 1 Expansion, BP’s “seventh upstream major project startup of the year”, ties back two wells to the subsea hub via new pipelines, according to the British operator. Atlantis, discovered 1998, has been producing for nearly 20 years and has one of BP’s longest-running platforms in the U.S. Gulf. The field also contains BP’s deepest moored floating platform in the U.S. Gulf, operating in 7,074 feet of water about 150 miles south of New Orleans, according to the company. Atlantis currently has a declared peak output of 200,000 barrels of oil and 180 million cubic feet of gas per day. “Atlantis Drill Center 1 caps off an excellent year of seven major project start-ups for BP. This project supports our plans to safely grow our upstream business, which includes increasing U.S. production to around one million barrels of oil equivalent per day by 2030”, Gordon Birrell, BP executive vice president for production and operations, said in an online statement. BP said, “BP delivered the Atlantis Drill Center 1 expansion project two months ahead of its original schedule by utilizing existing subsea inventory, drilling and completing wells more efficiently, and streamlining offshore execution planning. This is BP’s fifth major startup that has been delivered ahead of schedule this year”. Atlantis Drill Center 1 Expansion is one of three U.S. Gulf projects on a list of 10 upstream projects across BP’s global portfolio that the company aims to complete by 2027. On August 4 BP announced the start of production at Argos Southwest Extension, adding 20,000 bpd of capacity to the Argos platform, which started

Read More »

FinOps Foundation sharpens FOCUS to reduce cloud cost chaos

“The big change that’s really started to happen in late 2024 early 2025 is that the FinOps practice started to expand past the cloud,” Storment said. “A lot of organizations got really good at using FinOps to manage the value of cloud, and then their organizations went, ‘oh, hey, we’re living in this happily hybrid state now where we’ve got cloud, SaaS, data center. Can you also apply the FinOps practice to our SaaS? Or can you apply it to our Snowflake? Can you apply it to our data center?’” The FinOps Foundation’s community has grown to approximately 100,000 practitioners. The organization now includes major cloud vendors, hardware providers like Nvidia and AMD, data center operators and data cloud platforms like Snowflake and Databricks. Some 96 of the Fortune 100 now participate in FinOps Foundation programs. The practice itself has shifted in two directions. It has moved left into earlier architectural and design processes, becoming more proactive rather than reactive. It has also moved up organizationally, from director-level cloud management roles to SVP and COO positions managing converged technology portfolios spanning multiple infrastructure types. This expansion has driven the evolution of FOCUS beyond its original cloud billing focus. Enterprises are implementing FOCUS as an internal standard for chargeback reporting even when their providers don’t generate native FOCUS data. Some newer cloud providers, particularly those focused on AI infrastructure, are using the FOCUS specification to define their billing data structures from the ground up rather than retrofitting existing systems. The FOCUS 1.3 release reflects this maturation, addressing technical gaps that have emerged as organizations apply cost management practices across increasingly complex hybrid environments. 
FOCUS 1.3 exposes cost allocation logic for shared infrastructure The most significant technical enhancement in FOCUS 1.3 addresses a gap in how shared infrastructure costs are allocated and

Read More »

Aetherflux joins the race to launch orbital data centers by 2027

Enterprises will connect to and manage orbital workloads “the same way they manage cloud workloads today,” using optical links, the spokesperson added. The company’s approach is to “continuously launch new hardware and quickly integrate the latest architectures,” with older systems running lower-priority tasks to serve out the full useful lifetime of their high-end GPUs. The company declined to disclose pricing. Aetherflux plans to launch about 30 satellites at a time on SpaceX Falcon 9 rockets. Before the data center launch, the company will launch a power-beaming demonstration satellite in 2026 to test transmission of one kilowatt of energy from orbit to ground stations, using infrared lasers. Competition in the sector has intensified in recent months. In November, Starcloud launched its Starcloud-1 satellite carrying an Nvidia H100 GPU, which is 100 times more powerful than any previous GPU flown in space, according to the company, and demonstrated running Google’s Gemma AI model in orbit. In the same month, Google announced Project Suncatcher, with a 2027 demonstration mission planned. Analysts see limited near-term applications Despite the competitive activity, orbital data centers won’t replace terrestrial cloud regions for general hosting through 2030, said Ashish Banerjee, senior principal analyst at Gartner. Instead, they suit specific workloads, including meeting data sovereignty requirements for jurisdictionally complex scenarios, offering disaster recovery immune to terrestrial risks, and providing asynchronous high-performance computing, he said. “Orbital centers are ideal for high-compute, low-I/O batch jobs,” Banerjee said. “Think molecular folding simulations for pharma, massive Monte Carlo financial simulations, or training specific AI model weights. If the job takes 48 hours, the 500ms latency penalty of LEO is irrelevant.” One immediate application involves processing satellite-generated data in orbit, he said. 
Earth observation satellites using synthetic aperture radar generate roughly 10 gigabytes per second, but limited downlink bandwidth creates bottlenecks. Processing data in

Read More »

Here’s what Oracle’s soaring infrastructure spend could mean for enterprises

He said he had earlier told analysts in a separate call that margins for AI workloads in these data centers would be in the 30% to 40% range over the life of a customer contract. Kehring reassured that there would be demand for the data centers when they were completed, pointing to Oracle’s increasing remaining performance obligations, or services contracted but not yet delivered, up $68 billion on the previous quarter, saying that Oracle has been seeing unprecedented demand for AI workloads driven by the likes of Meta and Nvidia. Rising debt and margin risks raise flags for CIOs For analysts, though, the swelling debt load is hard to dismiss, even with Oracle’s attempts to de-risk its spend and squeeze more efficiency out of its buildouts. Gogia sees Oracle already under pressure, with the financial ecosystem around the company pricing the risk — one of the largest debts in corporate history, crossing $100 billion even before the capex spend this quarter — evident in the rising cost of insuring the debt and the shift in credit outlook. “The combination of heavy capex, negative free cash flow, increasing financing cost and long-dated revenue commitments forms a structural pressure that will invariably finds its way into the commercial posture of the vendor,” Gogia said, hinting at an “eventual” increase in pricing of the company’s offerings. He was equally unconvinced by Magouyrk’s assurances about the margin profile of AI workloads as he believes that AI infrastructure, particularly GPU-heavy clusters, delivers significantly lower margins in the early years because utilisation takes time to ramp.

Read More »

New Nvidia software gives data centers deeper visibility into GPU thermals and reliability

Addressing the challenge Modern AI accelerators now draw more than 700W per GPU, and multi-GPU nodes can reach 6kW, creating concentrated heat zones, rapid power swings, and a higher risk of interconnect degradation in dense racks, according to Manish Rawat, semiconductor analyst at TechInsights. Traditional cooling methods and static power planning increasingly struggle to keep pace with these loads. “Rich vendor telemetry covering real-time power draw, bandwidth behavior, interconnect health, and airflow patterns shifts operators from reactive monitoring to proactive design,” Rawat said. “It enables thermally aware workload placement, faster adoption of liquid or hybrid cooling, and smarter network layouts that reduce heat-dense traffic clusters.” Rawat added that the software’s fleet-level configuration insights can also help operators catch silent errors caused by mismatched firmware or driver versions. This can improve training reproducibility and strengthen overall fleet stability. “Real-time error and interconnect health data also significantly accelerates root-cause analysis, reducing MTTR and minimizing cluster fragmentation,” Rawat said. These operational pressures can shape budget decisions and infrastructure strategy at the enterprise level.

Read More »

Arista goes big with campus wireless tech

In a white paper describing how VESPA works, Arista wrote: The first component of VESPA involves Arista access points creating VXLAN tunnels to Arista switches serving as WLAN Gateways…. Second, as device packets arrive via the AP, it dynamically creates an Ethernet Segment Identifier (Type 6 ESI) based on the AP’s VTEP IP address. These dynamically created tunnels can scale to 30K ESI’s spread across paired switches in the cluster which provide active/active load sharing (performance+HA) to the APs. Third, the gateway switches use Type 2 EVPN NLRI (Network Layer Reachability Information) to learn and exchange end point MAC addresses across the cluster. … With this architecture, adding more EVPN WLAN gateways scales both AP and user connections, to tens of thousands of end points. To manage the forwarding information for hundreds of thousands of clients (e.g: FIB next hop and rewrite) would prove very complex and expensive if using conventional networking solutions. Arista’s innovation is to distribute this function across the WiFi access points with a unique MAC Rewrite Offload feature (MRO). With MRO, the access point is responsible for servicing mobile client ARP requests (using its own mac address), building a localized MAC-IP binding table, and forwarding client IP addresses to the WLAN gateways with the APs MAC address. The WLAN Gateways therefore only learns one (MAC) address for all the clients associated with the AP. This improves the gateway’s scaling from 10X to 100X, allowing these cost effective gateways to support hundreds of thousands of clients attached to the APs. AVA system gets a boost In addition to the new wireless technology, Arista is also bolstering the capabilities of its natural-language, generative AI-based Autonomous Virtual Assist (AVA) system for delivering network insights and AIOps.  AVA is aimed at providing an intelligent assistant that’s not there to replace

Read More »

Most significant networking acquisitions of 2025

Cisco makes two AI deals: EzDubs and NeuralFabric

Last month Cisco completed its acquisition of EzDubs, a privately held AI software company with speech-to-speech translation technology. EzDubs translates conversations across 31 languages and will accelerate Cisco’s delivery of next-generation features, such as live voice translation that preserves the characteristics of speech, the vendor stated. Cisco plans to incorporate EzDubs’ technology in its Cisco Collaboration portfolio. Also in November, Cisco bought AI platform company NeuralFabric, which offers a generative AI platform that lets organizations develop domain-specific small language models using their own proprietary data.

CoreWeave buys Core Scientific

Nvidia-backed AI cloud provider CoreWeave acquired crypto miner Core Scientific for about $9 billion, giving it access to 1.3 gigawatts of contracted power to support growing demand for AI and high-performance computing workloads. CoreWeave said the deal augments its vertical integration by expanding its owned and operated data center footprint, allowing it to scale GPU-powered services for enterprise and research customers.

F5 picks up three: CalypsoAI, Fletch and MantisNet

F5 acquired Dublin, Ireland-based CalypsoAI for $180 million. CalypsoAI’s platform creates what the company calls an Inference Perimeter that protects across models, vendors, and environments. F5 says it will integrate CalypsoAI’s adaptive AI security capabilities into its F5 Application Delivery and Security Platform (ADSP). F5’s ADSP also stands to gain from F5’s acquisition of agentic AI and threat management startup Fletch. Fletch’s technology turns external threat intelligence and internal logs into real-time, prioritized insights; its agentic AI capabilities will be integrated into ADSP, according to F5. Lastly, F5 grabbed startup MantisNet to enhance cloud-native observability in F5’s ADSP. MantisNet leverages extended Berkeley Packet Filter (eBPF)-powered, kernel-level telemetry to provide real-time insights into encrypted protocol activity and allow organizations “to gain visibility into even the most elusive traffic, all without performance overhead,” according to an F5 blog

Read More »

Microsoft will invest $80B in AI data centers in fiscal 2025

And Microsoft isn’t the only one that is ramping up its investments in AI-enabled data centers. Rival cloud service providers are all investing in either upgrading or opening new data centers to capture a larger chunk of business from developers and users of large language models (LLMs).

In a report published in October 2024, Bloomberg Intelligence estimated that demand for generative AI would push Microsoft, AWS, Google, Oracle, Meta, and Apple to devote $200 billion between them to capex in 2025, up from $110 billion in 2023. Microsoft is one of the biggest spenders, followed closely by Google and AWS, Bloomberg Intelligence said. Its estimate of Microsoft’s capital spending on AI, at $62.4 billion for calendar 2025, is lower than Smith’s claim that the company will invest $80 billion in the fiscal year to June 30, 2025. Both figures, though, are way higher than Microsoft’s 2020 capital expenditure of “just” $17.6 billion. The majority of the increased spending is tied to cloud services and the expansion of AI infrastructure needed to provide compute capacity for OpenAI workloads.

Separately, last October Amazon CEO Andy Jassy said his company planned total capex spend of $75 billion in 2024 and even more in 2025, with much of it going to AWS, its cloud computing division.

Read More »

John Deere unveils more autonomous farm machines to address skill labor shortage

Self-driving tractors might be the path to self-driving cars. John Deere has revealed a new line of autonomous machines and tech across agriculture, construction and commercial landscaping.

The Moline, Illinois-based John Deere has been in business for 187 years, yet as a non-tech company it has become a regular at the big tech trade show in Las Vegas, and it is back at CES 2025 with more autonomous tractors and other vehicles. This is not something we usually cover, but John Deere has a lot of data that is interesting in the big picture of tech.

The message from the company is that there aren’t enough skilled farm laborers to do the work that its customers need. It’s been a challenge for most of the last two decades, said Jahmy Hindman, CTO at John Deere, in a briefing. Much of the tech will come this fall and after that. He noted that the average farmer in the U.S. is over 58 and works 12 to 18 hours a day to grow food for us. And he said the American Farm Bureau Federation estimates there are roughly 2.4 million farm jobs that need to be filled annually, and the agricultural work force continues to shrink. (This is my hint to the anti-immigration crowd.)

John Deere’s autonomous 9RX Tractor. Farmers can oversee it using an app.

While each of these industries experiences its own set of challenges, a commonality across all is skilled labor availability. In construction, about 80% of contractors struggle to find skilled labor. And in commercial landscaping, 86% of landscaping business owners can’t find labor to fill open positions, he said. “They have to figure out how to do

Read More »

2025 playbook for enterprise AI success, from agents to evals

2025 is poised to be a pivotal year for enterprise AI. The past year has seen rapid innovation, and this year will see the same. This has made it more critical than ever to revisit your AI strategy to stay competitive and create value for your customers. From scaling AI agents to optimizing costs, here are the five critical areas enterprises should prioritize for their AI strategy this year.

1. Agents: the next generation of automation

AI agents are no longer theoretical. In 2025, they’re indispensable tools for enterprises looking to streamline operations and enhance customer interactions. Unlike traditional software, agents powered by large language models (LLMs) can make nuanced decisions, navigate complex multi-step tasks, and integrate seamlessly with tools and APIs.

At the start of 2024, agents were not ready for prime time, making frustrating mistakes like hallucinating URLs. They started getting better as frontier large language models themselves improved.

“Let me put it this way,” said Sam Witteveen, cofounder of Red Dragon, a company that develops agents for companies, and that recently reviewed the 48 agents it built last year. “Interestingly, the ones that we built at the start of the year, a lot of those worked way better at the end of the year just because the models got better.” Witteveen shared this in the video podcast we filmed to discuss these five big trends in detail.

Models are getting better and hallucinating less, and they’re also being trained to do agentic tasks. Another feature that the model providers are researching is a way to use the LLM as a judge, and as models get cheaper (something we’ll cover below), companies can use three or more models to

Read More »

OpenAI’s red teaming innovations define new essentials for security leaders in the AI era

OpenAI has taken a more aggressive approach to red teaming than its AI competitors, demonstrating its security teams’ advanced capabilities in two areas: multi-step reinforcement and external red teaming. OpenAI recently released two papers that set a new competitive standard for improving the quality, reliability and safety of AI models in these two techniques and more.

The first paper, “OpenAI’s Approach to External Red Teaming for AI Models and Systems,” reports that specialized teams outside the company have proven effective in uncovering vulnerabilities that might otherwise have made it into a released model because in-house testing techniques may have missed them.

In the second paper, “Diverse and Effective Red Teaming with Auto-Generated Rewards and Multi-Step Reinforcement Learning,” OpenAI introduces an automated framework that relies on iterative reinforcement learning to generate a broad spectrum of novel, wide-ranging attacks.

Going all-in on red teaming pays practical, competitive dividends

It’s encouraging to see competitive intensity in red teaming growing among AI companies. When Anthropic released its AI red team guidelines in June of last year, it joined AI providers including Google, Microsoft, Nvidia, OpenAI, and even the U.S.’s National Institute of Standards and Technology (NIST), which all had released red teaming frameworks.

Investing heavily in red teaming yields tangible benefits for security leaders in any organization. OpenAI’s paper on external red teaming provides a detailed analysis of how the company strives to create specialized external teams that include cybersecurity and subject matter experts. The goal is to see if knowledgeable external teams can defeat models’ security perimeters and find gaps in their security, biases and controls that prompt-based testing couldn’t find.
What makes OpenAI’s recent papers noteworthy is how well they define using human-in-the-middle

Read More »