Microsoft launches Phi-4-reasoning-plus, a small, powerful, open-weights reasoning model!

Microsoft Research has announced the release of Phi-4-reasoning-plus, an open-weight language model built for tasks requiring deep, structured reasoning.

Building on the architecture of the previously released Phi-4, the new model integrates supervised fine-tuning and reinforcement learning to deliver improved performance on benchmarks in mathematics, science, coding, and logic-based tasks.

Phi-4-reasoning-plus is a 14-billion parameter dense decoder-only Transformer model that emphasizes quality over scale. Its training process involved 16 billion tokens—about 8.3 billion of them unique—drawn from synthetic and curated web-based datasets.

A reinforcement learning (RL) phase, using only about 6,400 math-focused problems, further refined the model’s reasoning capabilities.

The model has been released under a permissive MIT license — enabling its use for broad commercial and enterprise applications, and fine-tuning or distillation, without restriction — and is compatible with widely used inference frameworks including Hugging Face Transformers, vLLM, llama.cpp, and Ollama.

Microsoft provides detailed recommendations on inference parameters and system prompt formatting to help developers get the most from the model.
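As a rough illustration, the snippet below loads the model through Hugging Face Transformers and generates a completion in a chat setting. The checkpoint name matches the public Hugging Face release, but the sampling values shown are illustrative placeholders rather than Microsoft's documented recommendations, so consult the model card before relying on them.

```python
# Minimal sketch: loading Phi-4-reasoning-plus with Hugging Face Transformers.
# Sampling values here are placeholders, not Microsoft's official settings.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/Phi-4-reasoning-plus"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [
    {"role": "system", "content": "Reason step by step, then give the final answer."},
    {"role": "user", "content": "What is the sum of the first 50 odd numbers?"},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(
    inputs, max_new_tokens=4096, do_sample=True, temperature=0.8, top_p=0.95
)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```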

Outperforms larger models

The model’s development reflects Microsoft’s growing emphasis on training smaller models capable of rivaling much larger systems in performance.

Despite its relatively modest size, Phi-4-reasoning-plus outperforms larger open-weight models such as DeepSeek-R1-Distill-70B on a number of demanding benchmarks.

On the AIME 2025 math exam, for instance, it delivers higher average first-attempt accuracy across the exam’s 30 questions (a metric known as “pass@1”) than the 70B-parameter distilled model, and approaches the performance of DeepSeek-R1 itself, which is far larger at 671B parameters.
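For readers unfamiliar with the metric, pass@1 is typically estimated by sampling several completions per problem and averaging the per-problem success rate of a single attempt. A minimal sketch with made-up outcomes:

```python
# Minimal sketch of a pass@1 estimate: average, over problems, the fraction
# of sampled attempts that were correct. All outcomes below are made up.
def pass_at_1(results: list[list[bool]]) -> float:
    """results[i] holds correctness of each sampled attempt for problem i."""
    per_problem = [sum(attempts) / len(attempts) for attempts in results]
    return sum(per_problem) / len(per_problem)

# Example: 3 problems, 4 sampled attempts each (hypothetical outcomes).
print(pass_at_1([
    [True, True, False, True],
    [False, False, False, False],
    [True, True, True, True],
]))  # ~0.583
```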

Structured thinking via fine-tuning

To achieve this, Microsoft employed a data-centric training strategy.

During the supervised fine-tuning stage, the model was trained using a curated blend of synthetic chain-of-thought reasoning traces and filtered high-quality prompts.

A key innovation in the training approach was the use of structured reasoning outputs marked with special <think> and </think> tokens.

These guide the model to separate its intermediate reasoning steps from the final answer, promoting both transparency and coherence in long-form problem solving.
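In practice, downstream code can split a response on those delimiters to surface only the final answer while retaining the trace for inspection. A minimal sketch, assuming the <think>...</think> convention described above:

```python
# Minimal sketch: separating the reasoning trace from the final answer,
# assuming the model emits <think>...</think> around its intermediate steps.
import re

def split_reasoning(text: str) -> tuple[str, str]:
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if match is None:
        return "", text.strip()  # no reasoning block found
    return match.group(1).strip(), text[match.end():].strip()

reasoning, answer = split_reasoning(
    "<think>The sum of the first n odd numbers is n^2, so 50^2 = 2500.</think>"
    " The answer is 2500."
)
print(answer)  # The answer is 2500.
```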

Reinforcement learning for accuracy and depth

Following fine-tuning, Microsoft used outcome-based reinforcement learning—specifically, the Group Relative Policy Optimization (GRPO) algorithm—to improve the model’s output accuracy and efficiency.

The RL reward function was crafted to balance correctness with conciseness, penalize repetition, and enforce formatting consistency. This led to longer but more thoughtful responses, particularly on questions where the model initially lacked confidence.
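GRPO's distinguishing trait is that it scores each sampled completion against the other completions drawn for the same prompt, normalizing rewards within the group instead of training a separate value network. The sketch below illustrates that group-relative normalization alongside a toy reward that trades correctness against verbosity; the weights and length budget are illustrative assumptions, not Microsoft's actual reward function.

```python
# Minimal sketch of GRPO-style group-relative advantages plus a toy reward
# balancing correctness and conciseness. All constants are illustrative.
import statistics

def reward(correct: bool, num_tokens: int, length_budget: int = 2048) -> float:
    score = 1.0 if correct else -1.0
    # Penalize output beyond the budget to encourage conciseness (toy weight).
    score -= 0.1 * max(0, num_tokens - length_budget) / length_budget
    return score

def group_relative_advantages(rewards: list[float]) -> list[float]:
    mean = statistics.mean(rewards)
    std = statistics.pstdev(rewards) or 1.0  # guard against zero variance
    return [(r - mean) / std for r in rewards]

# Four completions sampled for one prompt (hypothetical outcomes).
group = [reward(True, 1500), reward(True, 4000), reward(False, 900), reward(False, 3000)]
print(group_relative_advantages(group))
```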

Optimized for research and engineering constraints

Phi-4-reasoning-plus is intended for use in applications that benefit from high-quality reasoning under memory or latency constraints. It supports a context length of 32,000 tokens by default and has demonstrated stable performance in experiments with inputs up to 64,000 tokens.

It is best used in a chat-like setting and performs optimally with a system prompt that explicitly instructs it to reason through problems step-by-step before presenting a solution.
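When the model is served behind an OpenAI-compatible endpoint, as vLLM provides, such a system prompt might look like the following; the endpoint URL and the exact prompt wording are illustrative assumptions, not Microsoft's published template:

```python
# Minimal sketch: querying a locally served instance (e.g., via vLLM) with a
# system prompt that asks for step-by-step reasoning before the final answer.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
response = client.chat.completions.create(
    model="microsoft/Phi-4-reasoning-plus",
    messages=[
        {
            "role": "system",
            "content": "Think through the problem step by step inside <think> "
                       "tags, then state the final answer after the closing tag.",
        },
        {"role": "user", "content": "A train covers 120 km in 1.5 hours. What is its average speed?"},
    ],
    max_tokens=2048,
)
print(response.choices[0].message.content)
```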

Extensive safety testing and use guidelines

Microsoft positions the model as a research tool and a component for generative AI systems rather than a drop-in solution for all downstream tasks.

Developers are advised to carefully evaluate performance, safety, and fairness before deploying the model in high-stakes or regulated environments.

Phi-4-reasoning-plus has undergone extensive safety evaluation, including red-teaming by Microsoft’s AI Red Team and benchmarking with tools like Toxigen to assess its responses across sensitive content categories.

According to Microsoft, this release demonstrates that with carefully curated data and training techniques, small models can deliver strong reasoning performance, with democratized, open access to boot.

Implications for enterprise technical decision-makers

The release of Microsoft’s Phi-4-reasoning-plus may present meaningful opportunities for enterprise technical stakeholders managing AI model development, orchestration, or data infrastructure.

For AI engineers and model lifecycle managers, the model’s 14B parameter size coupled with competitive benchmark performance introduces a viable option for high-performance reasoning without the infrastructure demands of significantly larger models. Its compatibility with frameworks such as Hugging Face Transformers, vLLM, llama.cpp, and Ollama provides deployment flexibility across different enterprise stacks, including containerized and serverless environments.

Teams responsible for deploying and scaling machine learning models may find the model’s support for 32k-token contexts—expandable to 64k in testing—particularly useful in document-heavy use cases such as legal analysis, technical QA, or financial modeling. The built-in structure of separating chain-of-thought reasoning from the final answer could also simplify integration into interfaces where interpretability or auditability is required.

For AI orchestration teams, Phi-4-reasoning-plus offers a model architecture that can be more easily slotted into pipelines with resource constraints. This is relevant in scenarios where real-time reasoning must occur under latency or cost limits. Its demonstrated ability to generalize to out-of-domain problems, including NP-hard tasks like 3SAT and TSP, suggests utility in algorithmic planning and decision support use cases beyond those explicitly targeted during training.

Data engineering leads may also consider the model’s reasoning format—designed to reflect intermediate problem-solving steps—as a mechanism for tracking logical consistency across long sequences of structured data. The structured output format could be integrated into validation layers or logging systems to support explainability in data-rich applications.
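One hypothetical way to wire that up is to route the reasoning trace into an audit log while returning only the final answer to the caller; the field names and tag-splitting logic below are illustrative assumptions, not a Microsoft-provided interface:

```python
# Minimal sketch: log the reasoning trace for later audit, return only the
# final answer. Assumes the <think>...</think> delimiters described earlier.
import json
import logging
import re
import time

logging.basicConfig(level=logging.INFO)
audit_log = logging.getLogger("reasoning_audit")

def answer_with_audit(raw_response: str, request_id: str) -> str:
    match = re.search(r"<think>(.*?)</think>", raw_response, flags=re.DOTALL)
    reasoning = match.group(1).strip() if match else ""
    answer = raw_response[match.end():].strip() if match else raw_response.strip()
    audit_log.info(json.dumps({
        "request_id": request_id,   # illustrative field names
        "timestamp": time.time(),
        "reasoning": reasoning,     # retained for explainability review
    }))
    return answer

print(answer_with_audit("<think>2 + 2 = 4.</think> The answer is 4.", "req-001"))
```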

From a governance and safety standpoint, Phi-4-reasoning-plus incorporates multiple layers of post-training safety alignment and has undergone adversarial testing by Microsoft’s internal AI Red Team. For organizations subject to compliance or audit requirements, this may reduce the overhead of developing custom alignment workflows from scratch.

Overall, Phi-4-reasoning-plus shows how the reasoning craze kicked off by the likes of OpenAI’s “o” series of models and DeepSeek R1 is continuing to accelerate and move downstream to smaller, more accessible, affordable, and customizable models.

For technical decision-makers tasked with managing performance, scalability, cost, and risk, it offers a modular, interpretable alternative that can be evaluated and integrated on a flexible basis—whether in isolated inference endpoints, embedded tooling, or full-stack generative AI systems.
