MiniMax unveils its own open source LLM with industry-leading 4M token context

MiniMax is perhaps best known in the U.S. today as the Chinese company behind Hailuo, a realistic, high-resolution generative AI video model that competes with Runway, OpenAI’s Sora and Luma AI’s Dream Machine.

But the company has far more tricks up its sleeve: Today, for instance, it announced the release and open-sourcing of the MiniMax-01 series, a new family of models built to handle ultra-long contexts and enhance AI agent development.

The series includes MiniMax-Text-01, a foundation large language model (LLM), and MiniMax-VL-01, a visual multi-modal model.

A massive context window

MiniMax-Text-01 is of particular note for supporting a context window of up to 4 million tokens — equivalent to a small library’s worth of books. The context window is how much information the LLM can handle in a single input/output exchange, with words and concepts represented as numerical “tokens,” the LLM’s internal mathematical abstraction of the data it was trained on.
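
For a sense of scale: a common rule of thumb is that one token corresponds to roughly 0.75 English words. The quick calculation below uses that heuristic, plus an assumed 80,000-word novel, to translate the 4-million-token window into books — both numbers are rough approximations for illustration, not MiniMax figures.

```python
# Rough scale of a 4M-token context window (heuristic, not official figures).
CONTEXT_TOKENS = 4_000_000
WORDS_PER_TOKEN = 0.75       # common rule of thumb for English text
WORDS_PER_NOVEL = 80_000     # assumed length of a typical novel

words = CONTEXT_TOKENS * WORDS_PER_TOKEN
print(f"~{words:,.0f} words, or about {words / WORDS_PER_NOVEL:.0f} novels in one prompt")
# ~3,000,000 words, or about 38 novels in one prompt
```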

While Google previously led the pack with the 2-million-token context window of its Gemini 1.5 Pro model, MiniMax has now doubled that.

As MiniMax posted on its official X account today: “MiniMax-01 efficiently processes up to 4M tokens — 20 to 32 times the capacity of other leading models. We believe MiniMax-01 is poised to support the anticipated surge in agent-related applications in the coming year, as agents increasingly require extended context handling capabilities and sustained memory.”

The models are available now for download on Hugging Face and GitHub under a custom MiniMax license. Users can try them directly on Hailuo AI Chat (a ChatGPT/Gemini/Claude competitor), or access them through MiniMax’s application programming interface (API), where third-party developers can connect their own apps to the models.
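
For developers who want to experiment locally, the weights can in principle be loaded with the Hugging Face transformers library. Below is a minimal sketch; the repository ID and the trust_remote_code requirement are assumptions based on how custom-architecture releases are usually published, so verify both against the official model card before running.

```python
# Minimal sketch: loading MiniMax-Text-01 from Hugging Face.
# Assumptions (verify against the official model card):
#   - repo ID "MiniMaxAI/MiniMax-Text-01"
#   - custom architecture code shipped with the repo (trust_remote_code=True)
# The full 456B-parameter model needs a multi-GPU node; this is illustrative only.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "MiniMaxAI/MiniMax-Text-01"  # assumed repo ID

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,   # the custom attention stack ships as repo code
    device_map="auto",        # shard across available GPUs
    torch_dtype="auto",
)

inputs = tokenizer("Summarize the following report:", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```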

MiniMax is offering APIs for text and multi-modal processing at competitive rates:

  • $0.2 per 1 million input tokens
  • $1.1 per 1 million output tokens

For comparison, OpenAI’s GPT-4o costs $2.50 per 1 million input tokens through its API — a staggering 12.5 times MiniMax’s input price.
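
For a concrete sense of what those rates mean, here is a back-of-the-envelope cost comparison using the published per-token prices; the monthly token volumes are hypothetical, chosen only to illustrate the gap.

```python
# Back-of-the-envelope API cost comparison using the published per-token rates.
# The workload numbers below are hypothetical, chosen only to illustrate scale.
MINIMAX_INPUT_PER_M = 0.20   # USD per 1M input tokens
MINIMAX_OUTPUT_PER_M = 1.10  # USD per 1M output tokens
GPT4O_INPUT_PER_M = 2.50     # USD per 1M input tokens (OpenAI list price)

input_tokens = 50_000_000    # hypothetical monthly input volume
output_tokens = 5_000_000    # hypothetical monthly output volume

minimax_cost = (input_tokens / 1e6) * MINIMAX_INPUT_PER_M \
             + (output_tokens / 1e6) * MINIMAX_OUTPUT_PER_M
gpt4o_input_cost = (input_tokens / 1e6) * GPT4O_INPUT_PER_M

print(f"MiniMax total:       ${minimax_cost:,.2f}")       # $15.50
print(f"GPT-4o (input only): ${gpt4o_input_cost:,.2f}")   # $125.00
```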

MiniMax has also integrated a mixture of experts (MoE) framework with 32 experts to optimize scalability. This design balances computational and memory efficiency while maintaining competitive performance on key benchmarks.
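
The article doesn’t detail MiniMax’s router, but the general top-k MoE pattern a 32-expert design follows is easy to sketch: a small gating network picks a few experts per token, so only a fraction of the total parameters does work on any given token. The PyTorch sketch below shows that standard pattern — not MiniMax’s actual implementation.

```python
# Generic top-k mixture-of-experts routing (illustrative, not MiniMax's code).
# Each token is routed to its top-k experts; only those experts' parameters
# are exercised, which is how a 456B model can activate only ~45.9B at a time.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    def __init__(self, dim: int, num_experts: int = 32, k: int = 2):
        super().__init__()
        self.k = k
        self.gate = nn.Linear(dim, num_experts)  # router
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            for _ in range(num_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (tokens, dim)
        scores = self.gate(x)                        # (tokens, num_experts)
        weights, idx = scores.topk(self.k, dim=-1)   # route each token to k experts
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e in range(len(self.experts)):
                mask = idx[:, slot] == e             # tokens assigned to expert e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * self.experts[e](x[mask])
        return out

moe = TopKMoE(dim=64)
tokens = torch.randn(10, 64)
print(moe(tokens).shape)  # torch.Size([10, 64])
```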

Striking new ground with Lightning Attention Architecture

At the heart of MiniMax-01 is Lightning Attention, an innovative alternative to the standard attention mechanism used in transformer architectures.

This design significantly reduces computational complexity. The models comprise 456 billion parameters in total, of which 45.9 billion are activated for each token.

Unlike earlier architectures, Lightning Attention employs a mix of linear and traditional softmax attention layers, achieving near-linear complexity for long inputs. Softmax, for those like myself who are new to the concept, is the transformation of raw input scores into probabilities that add up to 1, so that the LLM can estimate which interpretation of the input is most likely.
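
Both ideas fit in a few lines of code. The sketch below contrasts standard softmax attention, which materializes an n-by-n score matrix and therefore scales quadratically with sequence length, against a generic kernelized linear attention of the family Lightning Attention belongs to — the textbook formulation, not MiniMax’s exact kernels.

```python
# Softmax attention vs. a generic linear-attention reformulation (illustrative).
# Softmax attention builds an (n x n) score matrix: O(n^2) in sequence length.
# Linear attention reassociates the matrix product so cost is O(n * d^2) instead.
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))  # subtract max for stability
    return e / e.sum(axis=axis, keepdims=True)       # rows sum to 1

def softmax_attention(Q, K, V):
    scores = Q @ K.T / np.sqrt(Q.shape[-1])  # (n, n) matrix: the quadratic bottleneck
    return softmax(scores, axis=-1) @ V

def linear_attention(Q, K, V, phi=lambda x: np.maximum(x, 0) + 1e-6):
    # phi is a positive feature map standing in for the softmax kernel.
    Qp, Kp = phi(Q), phi(K)
    KV = Kp.T @ V                 # (d, d): computed once, no (n x n) matrix
    Z = Qp @ Kp.sum(axis=0)       # per-query normalizer
    return (Qp @ KV) / Z[:, None]

n, d = 1024, 64
Q, K, V = (np.random.randn(n, d) for _ in range(3))
print(softmax_attention(Q, K, V).shape, linear_attention(Q, K, V).shape)
```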

MiniMax has rebuilt its training and inference frameworks to support the Lightning Attention architecture. Key improvements include:

  • MoE all-to-all communication optimization: Reduces inter-GPU communication overhead.
  • Varlen ring attention: Minimizes computational waste for long-sequence processing.
  • Efficient kernel implementations: Tailored CUDA kernels improve Lightning Attention performance.

These advancements make MiniMax-01 models accessible for real-world applications, while maintaining affordability.

Performance and Benchmarks

On mainstream text and multi-modal benchmarks, MiniMax-01 rivals top-tier models like GPT-4 and Claude-3.5, with especially strong results on long-context evaluations. Notably, MiniMax-Text-01 achieved 100% accuracy on the Needle-In-A-Haystack task with a 4-million-token context.
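
Needle-In-A-Haystack is straightforward to reproduce in spirit: hide one unique fact at a random depth inside a long filler context and check whether the model retrieves it. The harness below is a minimal sketch with a stub in place of a real model call, since the article doesn’t specify an evaluation API.

```python
# Minimal Needle-In-A-Haystack harness (illustrative).
# Buries a unique "needle" sentence at a chosen depth inside filler text,
# then asks the model to retrieve it. `ask_model` is a placeholder for
# whatever chat API or local checkpoint is being evaluated.
import random

FILLER = "The sky was clear and the market was quiet that day. "
NEEDLE = "The secret passphrase is 'amber-falcon-42'."
QUESTION = "What is the secret passphrase mentioned in the text?"

def build_haystack(total_sentences: int, needle_depth: float) -> str:
    sentences = [FILLER] * total_sentences
    sentences.insert(int(total_sentences * needle_depth), NEEDLE + " ")
    return "".join(sentences)

def run_trial(ask_model, total_sentences: int = 10_000) -> bool:
    context = build_haystack(total_sentences, needle_depth=random.random())
    answer = ask_model(f"{context}\n\n{QUESTION}")
    return "amber-falcon-42" in answer

# Stub "model" that just searches the prompt; a real run would call an API:
stub = lambda prompt: "amber-falcon-42" if "amber-falcon-42" in prompt else "unknown"
print(run_trial(stub))  # True
```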

The models also demonstrate minimal performance degradation as input length increases.

MiniMax plans regular updates to expand the models’ capabilities, including code and multi-modal enhancements.

The company views open-sourcing as a step toward building foundational AI capabilities for the evolving AI agent landscape.

With 2025 predicted to be a transformative year for AI agents, the need for sustained memory and efficient inter-agent communication is increasing. MiniMax’s innovations are designed to meet these challenges.

Open to collaboration

MiniMax invites developers and researchers to explore the capabilities of MiniMax-01. Beyond open-sourcing, its team welcomes technical suggestions and collaboration inquiries at [email protected].

With its commitment to cost-effective and scalable AI, MiniMax positions itself as a key player in shaping the AI agent era. The MiniMax-01 series offers an exciting opportunity for developers to push the boundaries of what long-context AI can achieve.
