MiniMax unveils its own open source LLM with industry-leading 4M token context

MiniMax is perhaps best known in the U.S. today as the Chinese company behind Hailuo, a realistic, high-resolution generative AI video model that competes with Runway, OpenAI’s Sora and Luma AI’s Dream Machine.

But the company has far more tricks up its sleeve: Today, for instance, it announced the release and open-sourcing of the MiniMax-01 series, a new family of models built to handle ultra-long contexts and enhance AI agent development.

The series includes MiniMax-Text-01, a foundation large language model (LLM), and MiniMax-VL-01, a visual multi-modal model.

A massive context window

MiniMax-Text-01 is of particular note for supporting a context window of up to 4 million tokens — equivalent to a small library’s worth of books. The context window is how much information the LLM can handle in a single input/output exchange, with words and concepts represented as numerical “tokens,” the LLM’s internal mathematical abstraction of the data it was trained on.
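
For a sense of scale: a common rule of thumb is that one token corresponds to roughly 0.75 English words. The quick calculation below uses that heuristic, plus an assumed 80,000-word novel, to translate the 4-million-token window into books — both numbers are rough approximations for illustration, not MiniMax figures.

```python
# Rough scale of a 4M-token context window (heuristic, not official figures).
CONTEXT_TOKENS = 4_000_000
WORDS_PER_TOKEN = 0.75       # common rule of thumb for English text
WORDS_PER_NOVEL = 80_000     # assumed length of a typical novel

words = CONTEXT_TOKENS * WORDS_PER_TOKEN
print(f"~{words:,.0f} words, or about {words / WORDS_PER_NOVEL:.0f} novels in one prompt")
# ~3,000,000 words, or about 38 novels in one prompt
```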

While Google previously led the pack with the 2-million-token context window of its Gemini 1.5 Pro model, MiniMax has now doubled that.

As MiniMax posted on its official X account today: “MiniMax-01 efficiently processes up to 4M tokens — 20 to 32 times the capacity of other leading models. We believe MiniMax-01 is poised to support the anticipated surge in agent-related applications in the coming year, as agents increasingly require extended context handling capabilities and sustained memory.”

The models are available now for download on Hugging Face and GitHub under a custom MiniMax license. Users can try them directly on Hailuo AI Chat (a ChatGPT/Gemini/Claude competitor), or access them through MiniMax’s application programming interface (API), where third-party developers can connect their own apps to the models.
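
For developers who want to experiment locally, the weights can in principle be loaded with the Hugging Face transformers library. Below is a minimal sketch; the repository ID and the trust_remote_code requirement are assumptions based on how custom-architecture releases are usually published, so verify both against the official model card before running.

```python
# Minimal sketch: loading MiniMax-Text-01 from Hugging Face.
# Assumptions (verify against the official model card):
#   - repo ID "MiniMaxAI/MiniMax-Text-01"
#   - custom architecture code shipped with the repo (trust_remote_code=True)
# The full 456B-parameter model needs a multi-GPU node; this is illustrative only.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "MiniMaxAI/MiniMax-Text-01"  # assumed repo ID

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,   # the custom attention stack ships as repo code
    device_map="auto",        # shard across available GPUs
    torch_dtype="auto",
)

inputs = tokenizer("Summarize the following report:", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```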

MiniMax is offering APIs for text and multi-modal processing at competitive rates:

  • $0.2 per 1 million input tokens
  • $1.1 per 1 million output tokens

For comparison, OpenAI’s GPT-4o costs $2.50 per 1 million input tokens through its API — a staggering 12.5 times MiniMax’s input price.
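
For a concrete sense of what those rates mean, here is a back-of-the-envelope cost comparison using the published per-token prices; the monthly token volumes are hypothetical, chosen only to illustrate the gap.

```python
# Back-of-the-envelope API cost comparison using the published per-token rates.
# The workload numbers below are hypothetical, chosen only to illustrate scale.
MINIMAX_INPUT_PER_M = 0.20   # USD per 1M input tokens
MINIMAX_OUTPUT_PER_M = 1.10  # USD per 1M output tokens
GPT4O_INPUT_PER_M = 2.50     # USD per 1M input tokens (OpenAI list price)

input_tokens = 50_000_000    # hypothetical monthly input volume
output_tokens = 5_000_000    # hypothetical monthly output volume

minimax_cost = (input_tokens / 1e6) * MINIMAX_INPUT_PER_M \
             + (output_tokens / 1e6) * MINIMAX_OUTPUT_PER_M
gpt4o_input_cost = (input_tokens / 1e6) * GPT4O_INPUT_PER_M

print(f"MiniMax total:       ${minimax_cost:,.2f}")       # $15.50
print(f"GPT-4o (input only): ${gpt4o_input_cost:,.2f}")   # $125.00
```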

MiniMax has also integrated a mixture of experts (MoE) framework with 32 experts to optimize scalability. This design balances computational and memory efficiency while maintaining competitive performance on key benchmarks.
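
The article doesn’t detail MiniMax’s router, but the general top-k MoE pattern a 32-expert design follows is easy to sketch: a small gating network picks a few experts per token, so only a fraction of the total parameters does work on any given token. The PyTorch sketch below shows that standard pattern — not MiniMax’s actual implementation.

```python
# Generic top-k mixture-of-experts routing (illustrative, not MiniMax's code).
# Each token is routed to its top-k experts; only those experts' parameters
# are exercised, which is how a 456B model can activate only ~45.9B at a time.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    def __init__(self, dim: int, num_experts: int = 32, k: int = 2):
        super().__init__()
        self.k = k
        self.gate = nn.Linear(dim, num_experts)  # router
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            for _ in range(num_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (tokens, dim)
        scores = self.gate(x)                        # (tokens, num_experts)
        weights, idx = scores.topk(self.k, dim=-1)   # route each token to k experts
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e in range(len(self.experts)):
                mask = idx[:, slot] == e             # tokens assigned to expert e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * self.experts[e](x[mask])
        return out

moe = TopKMoE(dim=64)
tokens = torch.randn(10, 64)
print(moe(tokens).shape)  # torch.Size([10, 64])
```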

Striking new ground with Lightning Attention Architecture

At the heart of MiniMax-01 is Lightning Attention, an innovative alternative to the standard attention mechanism used in transformer architectures.

This design significantly reduces computational complexity. The models comprise 456 billion parameters in total, of which 45.9 billion are activated for each token.

Unlike earlier architectures, Lightning Attention employs a mix of linear and traditional softmax attention layers, achieving near-linear complexity for long inputs. Softmax, for those like myself who are new to the concept, is the transformation of raw input scores into probabilities that add up to 1, so that the LLM can estimate which interpretation of the input is most likely.
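
Both ideas fit in a few lines of code. The sketch below contrasts standard softmax attention, which materializes an n-by-n score matrix and therefore scales quadratically with sequence length, against a generic kernelized linear attention of the family Lightning Attention belongs to — the textbook formulation, not MiniMax’s exact kernels.

```python
# Softmax attention vs. a generic linear-attention reformulation (illustrative).
# Softmax attention builds an (n x n) score matrix: O(n^2) in sequence length.
# Linear attention reassociates the matrix product so cost is O(n * d^2) instead.
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))  # subtract max for stability
    return e / e.sum(axis=axis, keepdims=True)       # rows sum to 1

def softmax_attention(Q, K, V):
    scores = Q @ K.T / np.sqrt(Q.shape[-1])  # (n, n) matrix: the quadratic bottleneck
    return softmax(scores, axis=-1) @ V

def linear_attention(Q, K, V, phi=lambda x: np.maximum(x, 0) + 1e-6):
    # phi is a positive feature map standing in for the softmax kernel.
    Qp, Kp = phi(Q), phi(K)
    KV = Kp.T @ V                 # (d, d): computed once, no (n x n) matrix
    Z = Qp @ Kp.sum(axis=0)       # per-query normalizer
    return (Qp @ KV) / Z[:, None]

n, d = 1024, 64
Q, K, V = (np.random.randn(n, d) for _ in range(3))
print(softmax_attention(Q, K, V).shape, linear_attention(Q, K, V).shape)
```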

MiniMax has rebuilt its training and inference frameworks to support the Lightning Attention architecture. Key improvements include:

  • MoE all-to-all communication optimization: Reduces inter-GPU communication overhead.
  • Varlen ring attention: Minimizes computational waste for long-sequence processing.
  • Efficient kernel implementations: Tailored CUDA kernels improve Lightning Attention performance.

These advancements make MiniMax-01 models accessible for real-world applications, while maintaining affordability.

Performance and Benchmarks

On mainstream text and multi-modal benchmarks, MiniMax-01 rivals top-tier models like GPT-4 and Claude-3.5, with especially strong results on long-context evaluations. Notably, MiniMax-Text-01 achieved 100% accuracy on the Needle-In-A-Haystack task with a 4-million-token context.
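
Needle-In-A-Haystack is straightforward to reproduce in spirit: hide one unique fact at a random depth inside a long filler context and check whether the model retrieves it. The harness below is a minimal sketch with a stub in place of a real model call, since the article doesn’t specify an evaluation API.

```python
# Minimal Needle-In-A-Haystack harness (illustrative).
# Buries a unique "needle" sentence at a chosen depth inside filler text,
# then asks the model to retrieve it. `ask_model` is a placeholder for
# whatever chat API or local checkpoint is being evaluated.
import random

FILLER = "The sky was clear and the market was quiet that day. "
NEEDLE = "The secret passphrase is 'amber-falcon-42'."
QUESTION = "What is the secret passphrase mentioned in the text?"

def build_haystack(total_sentences: int, needle_depth: float) -> str:
    sentences = [FILLER] * total_sentences
    sentences.insert(int(total_sentences * needle_depth), NEEDLE + " ")
    return "".join(sentences)

def run_trial(ask_model, total_sentences: int = 10_000) -> bool:
    context = build_haystack(total_sentences, needle_depth=random.random())
    answer = ask_model(f"{context}\n\n{QUESTION}")
    return "amber-falcon-42" in answer

# Stub "model" that just searches the prompt; a real run would call an API:
stub = lambda prompt: "amber-falcon-42" if "amber-falcon-42" in prompt else "unknown"
print(run_trial(stub))  # True
```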

The models also demonstrate minimal performance degradation as input length increases.

MiniMax plans regular updates to expand the models’ capabilities, including code and multi-modal enhancements.

The company views open-sourcing as a step toward building foundational AI capabilities for the evolving AI agent landscape.

With 2025 predicted to be a transformative year for AI agents, the need for sustained memory and efficient inter-agent communication is increasing. MiniMax’s innovations are designed to meet these challenges.

Open to collaboration

MiniMax invites developers and researchers to explore the capabilities of MiniMax-01. Beyond open-sourcing, its team welcomes technical suggestions and collaboration inquiries at [email protected].

With its commitment to cost-effective and scalable AI, MiniMax positions itself as a key player in shaping the AI agent era. The MiniMax-01 series offers an exciting opportunity for developers to push the boundaries of what long-context AI can achieve.
