OpenAI launches GPT-5, nano, mini and Pro — not AGI, but capable of generating ‘software-on-demand’
After years of hype and speculation, OpenAI has officially launched a new lineup of large language models (LLMs), all different-sized variants of GPT-5, the long-awaited successor to its GPT-4 model from March 2023, nearly 2.5 years ago.
The company is rolling out four distinct versions of the model — GPT-5, GPT-5 Mini, GPT-5 Nano, and GPT-5 Pro — to meet varying needs for speed, cost, and computational depth.
GPT-5 will soon power ChatGPT exclusively, replacing all other models going forward for its 700 million weekly users, though ChatGPT Pro subscribers ($200 per month) can still select older models for the next 60 days.
As rumors and reports had anticipated, OpenAI has replaced the previous system of having users manually switch the underlying model powering ChatGPT with an automatic router: it engages a special “GPT-5 thinking” mode with “deeper reasoning” (which takes longer to respond) on harder queries, and uses the regular GPT-5 or mini models for simpler ones.
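For intuition only, here is a toy illustration of what such a router could look like. This is not OpenAI's implementation (the company has not disclosed its routing signals); the heuristic and the "gpt-5-thinking" model name below are assumptions made purely for the sketch.

```python
# Purely illustrative toy router -- NOT OpenAI's actual implementation.
# It mimics the described behavior: queries that look hard are sent to a
# slower "thinking" model; simple queries go to the fast default.
def route(query: str) -> str:
    hard_markers = ("prove", "debug", "step by step", "optimize", "why")
    looks_hard = len(query) > 400 or any(m in query.lower() for m in hard_markers)
    return "gpt-5-thinking" if looks_hard else "gpt-5"  # or "gpt-5-mini"

print(route("What's the capital of France?"))            # gpt-5
print(route("Prove this algorithm runs in O(n log n)"))  # gpt-5-thinking
```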
In the API, the three reasoning-focused models — GPT-5, GPT-5 mini, and GPT-5 nano — are available as gpt-5, gpt-5-mini, and gpt-5-nano, respectively. GPT-5 Pro is not currently accessible via the API; for now, it only powers ChatGPT for Pro-tier subscribers.
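For developers, switching between the three API tiers is just a matter of changing the model ID. A minimal sketch using the official openai Python SDK, assuming you have API access to the new models and an OPENAI_API_KEY set in your environment:

```python
# Minimal sketch: the same request pointed at each GPT-5 API tier.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

for model_id in ("gpt-5", "gpt-5-mini", "gpt-5-nano"):
    response = client.chat.completions.create(
        model=model_id,
        messages=[{"role": "user", "content": "Summarize GPT-5's launch in one sentence."}],
    )
    print(model_id, "->", response.choices[0].message.content)
```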
The biggest takeaway, though, is likely not what GPT-5 is, but what it isn’t: AGI, artificial general intelligence, OpenAI’s stated goal of an autonomous AI system that outperforms humans at most economically valuable work.
Whether or not you the reader personally believe such a system is possible or desirable, OpenAI declaring AGI would have material business impacts. Wired reported previously that there is a clause in OpenAI’s contract with Microsoft that permits OpenAI to begin charging Microsoft for access to its newest models, or cut it off from accessing OpenAI models, if OpenAI’s board determines the company has achieved AGI or generates more than $100 billion in profit.
But apparently, that is not the case today. As co-founder and CEO Sam Altman said, flanked by other OpenAI staffers on an embargoed video call with reporters last night, “the way that most of us define AGI, we’re still missing something quite important — many things that are quite important, actually — but one big one is a model that continuously learns as it’s deployed, and GPT-5 does not.”
I also asked OpenAI the following question directly: “Is OpenAI considering GPT-5 AGI? Will it trigger any changes regarding Microsoft negotiations?”
To which an OpenAI spokesperson responded over email:
GPT-5 is a significant step toward AGI in that it shows substantial improvements in reasoning and generalization, bringing us closer to systems that can perform a wide range of tasks with human-level capability. However, AGI is still a weakly defined term and means different things to different people. While GPT-5 meets some early criteria for AGI, it doesn’t yet reach the threshold of fully human-level AGI. There are still key limitations in areas like persistent memory, autonomy, and adaptability across tasks. Our focus remains on advancing these capabilities safely, rather than speculating on specific timelines.
Yet benchmark results shared by OpenAI show GPT-5 approaching, and in some cases matching or exceeding, average human expert performance at tasks across law, logistics, sales, and engineering.
As OpenAI writes: “When using reasoning, GPT-5 is comparable to or better than experts in roughly half the cases, while outperforming OpenAI o3 and ChatGPT Agent.”
Why use GPT-5?
With so many alternative models now available from OpenAI and from a growing list of competitors (notably Chinese startups offering powerful open-source models), what does GPT-5 bring to the table?
Altman described the leap in capability as more than incremental. He compared the experience of using GPT-5 to upgrading from a pixelated display to a retina screen — something users simply don’t want to go back from.
“GPT-3 felt like talking to a high school student,” Altman said. “GPT-4 was like a college student. GPT-5 is the first time it feels like talking to a PhD-level expert in your pocket.”
Among the most impressive capabilities demoed for reporters during the embargoed call was the generation of code for a fully working web application from a single prompt: in this case, a French language learning app with a built-in game in which English-to-French phrases appeared each time the user guided a virtual mouse to collect slices of cheese, complete with emoji-inspired characters, a backdrop, and clickable interactive menus. The prompt was only a single paragraph long.
As Altman stated: “This idea of software on demand will be a defining part of the new GPT-5 era.”
However, this basic capability — prompt to working software — was already available from prior OpenAI models such as o3, o4-mini, and o4-mini-high, as well as rival services like Anthropic’s Claude Artifacts, which I (and many others) have used for many months to create interactive first-person and clickable games.
The advantage GPT-5 seems to offer in making games, apps, and other software from prompts is twofold: speed — it produced this demo app in a matter of minutes — and completeness, with very few discernible bugs and a fully playable experience delivered “one-shot,” as developers like to say: from a single prompt, without back-and-forth conversation.
Available to ChatGPT free users and all plans
GPT-5 is not restricted to premium subscribers. OpenAI has made the model available across all ChatGPT tiers, including free users — a deliberate move aligned with the company’s mission to ensure broad benefits from AI.
Free-tier users can access GPT-5 and GPT-5 Mini, with usage limits — though exactly what those usage limits are remains undefined for now, and I’d guess will likely change on an irregular cadence depending on demand.
Subscribers to the ChatGPT Plus tier ($20 per month) receive higher usage allowances, while ChatGPT Pro ($200 per month), Team ($30 per month or $240 annually), and Enterprise (variable pricing depending on company size and usage) customers get unlimited or prioritized access.
GPT-5 Pro will become available to Team, Enterprise, and EDU customers in the coming days.
The new unified ChatGPT experience eliminates the need to select a model manually. Once users reach usage limits on GPT-5, the system automatically shifts to GPT-5 mini — a more lightweight but still highly capable fallback.
Improved metrics across the board, including 100% in AIME 2025 Math
According to OpenAI, GPT-5 is the most accurate, responsive, and context-aware AI system the company has ever shipped.
It reduces hallucinations, handles multi-step reasoning more reliably, and generates better-quality code, content, and responses across diverse domains.
The GPT-5 system delivers ~45% fewer factual errors than GPT-4o in real-world traffic, and up to ~80% fewer when using its “thinking” mode.
This mode, which users can trigger by explicitly asking the model to take its time, enables more complex and robust responses — powered by GPT-5 Pro in certain configurations. In tests, GPT-5 Pro sets new state-of-the-art scores on benchmarks like GPQA (88.4%), AIME 2025 math (100% when using Python to answer the questions), and HealthBench Hard (46.2%).
Performance improvements show up across key academic and real-world benchmarks. In coding, GPT-5 sets new state-of-the-art results on SWE-Bench Verified (74.9%) and Aider Polyglot (88%).
Perhaps most incredibly, on Humanity’s Last Exam — a relatively new benchmark of 2,500 extremely difficult questions spanning many domains — GPT-5 Pro achieves a record-high 42%, blowing away the competition and all prior OpenAI models except the new ChatGPT agent unveiled last month, which controls its own computer and cursor like a human.
On writing tasks, GPT-5 adapts more smoothly to tone, context, and user intent. It is better at maintaining coherence, structuring information clearly, and completing complex writing assignments.
The improvements are not just technical — OpenAI’s team emphasized how GPT-5 feels more natural and humanlike in conversation.
Health-related use cases have also been enhanced. While OpenAI continues to caution that ChatGPT is not a replacement for medical professionals, GPT-5 is more proactive about flagging concerns, helping users interpret medical results, and guiding them through preparing for appointments or evaluating options. The system also adjusts answers based on user location, background knowledge, and context — leading to safer and more personalized assistance.
One of the most significant updates is in safe completions, a new system that helps GPT-5 avoid abrupt refusals or unsafe outputs.
Instead of declining queries outright, GPT-5 aims to provide the most helpful response within its safety boundaries and explains when it cannot assist — a change that dramatically reduces unnecessary denials while maintaining trustworthiness.
GPT-5 is also a major upgrade for developers working on agentic systems and tool-assisted workflows. OpenAI has introduced a suite of developer-friendly controls in the GPT-5 API, including the following (a usage sketch follows the list):
Free-form function calling – Tools can now accept raw strings such as SQL queries or shell commands, without requiring JSON structure.
Reasoning effort control – Developers can toggle between rapid responses and deeper analytical processing depending on the task.
Verbosity control – A new parameter allows users to select whether responses are brief, standard, or detailed.
Structured outputs with grammar constraints – Developers can now guide outputs using custom grammars or regular expressions.
Tool call preambles – GPT-5 can now explain its reasoning before using tools or making external requests.
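These controls surface in the request itself. A hedged sketch of how the reasoning-effort and verbosity knobs are exposed via the Responses API, based on OpenAI's launch materials; treat the exact parameter shapes as assumptions and confirm against the current API reference before relying on them:

```python
# Hedged sketch of the new GPT-5 developer controls (Responses API).
# The `reasoning` and `text` parameter shapes follow OpenAI's launch docs,
# but verify against the live API reference -- names may differ.
from openai import OpenAI

client = OpenAI()

response = client.responses.create(
    model="gpt-5",
    input="Write a SQL query listing the ten largest customers by revenue.",
    reasoning={"effort": "low"},  # rapid response; "high" buys deeper analysis
    text={"verbosity": "low"},    # brief answer; "medium"/"high" for more detail
)
print(response.output_text)
```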
Developers can access GPT-5 through OpenAI’s platform for the following prices:
gpt-5: $1.25 / $10 per 1 million input/output tokens (with up to 90% input-cache discount)
gpt-5-mini: $0.50 / $5 per 1 million input/output tokens
gpt-5-nano: $0.15 / $1.50 per 1 million input/output tokens
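As a back-of-envelope check on what those rates mean per request (using the list prices above and ignoring cache discounts):

```python
# Per-request cost at the list prices above (no caching discounts applied).
PRICES = {  # (input $, output $) per 1M tokens
    "gpt-5": (1.25, 10.00),
    "gpt-5-mini": (0.50, 5.00),
    "gpt-5-nano": (0.15, 1.50),
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    in_rate, out_rate = PRICES[model]
    return input_tokens / 1e6 * in_rate + output_tokens / 1e6 * out_rate

# A 10,000-token prompt with a 2,000-token reply on gpt-5:
print(f"${request_cost('gpt-5', 10_000, 2_000):.4f}")  # $0.0325
```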
The context window now spans 256,000 tokens (about the length of a 600-800 page book), allowing GPT-5 to handle substantially larger documents and more extensive conversations than its predecessor, GPT-4 Turbo.
For those who require even more, GPT-4.1 (which supports 1 million-token context windows) remains available.
Compared to its primary competitors — Anthropic and Google — OpenAI’s GPT-5 models are on par with or cheaper than the alternatives for developers to access through the API, placing more downward pressure on the cost of intelligence.
| Model / Tier | Input Cost (per 1M tokens) | Output Cost (per 1M tokens) | Notes |
| --- | --- | --- | --- |
| GPT‑5 | $1.25 (before cache) | $10 | With up to 90% input caching |
| GPT‑5‑mini | $0.50 | $5 | — |
| GPT‑5‑nano | $0.15 | $1.50 | — |
| Claude Sonnet 4 | $3 | $15 | Up to 90% prompt-caching discount |
| Claude Opus 4 | $15 | $75 | High-end model aimed at complex tasks |
| Gemini 2.5 Pro (≤200K) | $1.25 | $10 | Interactive prompts up to 200K tokens |
| Gemini 2.5 Pro (Batch ≤200K) | $0.625 | $5 | Batch processing reduces cost |
| Gemini 2.5 Pro (>200K) | $2.50 | $15 | For long prompts over 200K tokens |
| Gemini 2.5 Flash‑Lite | $0.10 | $0.40 | Google’s most cost-efficient LLM to date |
Early enterprise testers have high praise
Several high-profile companies have already adopted GPT-5 in early trials. JetBrains is using it to power intelligent developer tools, and Notion has integrated GPT-5 to improve document generation and productivity workflows.
At AI developer tool startup Cursor, co-founder and CEO Michael Truell said in a quote provided to reporters by OpenAI: “Our team has found GPT-5 to be remarkably intelligent, easy to steer, and even to have a personality we haven’t seen in any other model. It not only catches tricky, deeply-hidden bugs but can also run long, multi-turn background agents to see complex tasks through to the finish—the kinds of problems that used to leave other models stuck. It’s become our daily driver for everything from scoping and planning PRs to completing end-to-end builds.”
Other customers report major gains: GitLab cites a drop in tool call volume, GitHub notes improvements in reasoning across large codebases, and Uber is testing GPT-5 for real-time, domain-aware service applications. At Amgen, the model has already improved output quality and reduced ambiguity in scientific tasks.
More updates still to come
GPT-5’s launch coincides with several new ChatGPT features, some available now and others coming soon.
Users can now personalize the interface with chat colors (with exclusive options for paid users) and experiment with preset personalities like Cynic, Robot, Listener, and Nerd — designed to match different communication styles.
ChatGPT will also soon support seamless integration with Gmail, Google Calendar, and Google Contacts. Once enabled, these services will be automatically referenced during chats, with no manual toggling required. These connectors launch for Pro subscribers next week, with broader availability to follow.
A new Advanced Voice mode understands instructions better and allows users to adjust tone and delivery. Voice will be available across all user tiers and included in custom GPTs.
In 30 days, OpenAI will retire the older “Standard Voice Mode” and fully transition to this unified experience.
With safer design, more robust reasoning, expanded developer tooling, and broad user access, GPT-5 reflects a maturing AI ecosystem that’s inching closer to real-world utility on a global scale.
OpenAI’s approach this time is less about flash and more about integration. GPT-5 isn’t a separate offering that users have to seek out — it’s simply there, powering the tools millions already use, making them smarter and more capable and unlocking a whole new raft of use cases for developers.