
Apple makes major AI advance with image generation technology rivaling DALL-E and Midjourney

Apple’s machine learning research team has developed a breakthrough AI system for generating high-resolution images that could challenge the dominance of diffusion models, the technology powering popular image generators like DALL-E and Midjourney.

The advancement, detailed in a research paper published last week, introduces “STARFlow,” a system developed by Apple researchers in collaboration with academic partners that combines normalizing flows with autoregressive transformers to achieve what the team calls “competitive performance” with state-of-the-art diffusion models.

The breakthrough comes at a critical moment for Apple, which has faced mounting criticism over its struggles with artificial intelligence. At Monday’s Worldwide Developers Conference, the company unveiled only modest AI updates to its Apple Intelligence platform, highlighting the competitive pressure facing a company that many view as falling behind in the AI arms race.

“To our knowledge, this work is the first successful demonstration of normalizing flows operating effectively at this scale and resolution,” wrote the research team, which includes Apple machine learning researchers Jiatao Gu, Joshua M. Susskind, and Shuangfei Zhai, along with academic collaborators from institutions including UC Berkeley and Georgia Tech.

How Apple is fighting back against OpenAI and Google in the AI wars

The STARFlow research represents Apple’s broader effort to develop distinctive AI capabilities that could differentiate its products from competitors. While companies like Google and OpenAI have dominated headlines with their generative AI advances, Apple has been working on alternative approaches that could offer unique advantages.

The research team tackled a fundamental challenge in AI image generation: scaling normalizing flows to work effectively with high-resolution images. Normalizing flows, a type of generative model that learns to transform simple distributions into complex ones, have traditionally been overshadowed by diffusion models and generative adversarial networks in image synthesis applications.
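
For readers unfamiliar with the technique, a normalizing flow is an invertible map with a tractable Jacobian, which makes the model's density exactly computable. The following is a minimal illustrative sketch (not Apple's code; the parameters are invented) of how an exact log-density falls out of the change-of-variables rule:

```python
import math

# Toy normalizing flow: one invertible affine map x = a*z + b pushing
# a standard normal base distribution forward. Real systems such as
# STARFlow stack many invertible layers parameterized by transformers,
# but the change-of-variables rule below is the same.
a, b = 2.0, 0.5  # illustrative flow parameters

def forward(z):
    return a * z + b            # simple -> complex direction

def inverse(x):
    return (x - b) / a          # complex -> simple direction

def log_prob(x):
    # log p_X(x) = log p_Z(f^{-1}(x)) + log |d f^{-1}(x) / dx|
    z = inverse(x)
    log_base = -0.5 * (z * z + math.log(2 * math.pi))  # N(0, 1) log-density
    log_det = -math.log(abs(a))                        # Jacobian correction
    return log_base + log_det

# The flow's exact density matches the analytic N(b, a^2) density:
x = 1.7
analytic = -0.5 * ((x - b) / a) ** 2 - math.log(a) - 0.5 * math.log(2 * math.pi)
assert abs(log_prob(x) - analytic) < 1e-12
assert abs(inverse(forward(1.7)) - 1.7) < 1e-12
```

Because every layer is invertible, the same machinery yields both sampling (run forward) and exact likelihoods (run inverse), which diffusion models can only approximate.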

“STARFlow achieves competitive performance in both class-conditional and text-conditional image generation tasks, approaching state-of-the-art diffusion models in sample quality,” the researchers wrote, demonstrating the system’s versatility across different types of image synthesis challenges.

Inside the mathematical breakthrough that powers Apple’s new AI system

Apple’s research team introduced several key innovations to overcome the limitations of existing normalizing flow approaches. The system employs what researchers call a “deep-shallow design,” using “a deep Transformer block [that] captures most of the model representational capacity, complemented by a few shallow Transformer blocks that are computationally efficient yet substantially beneficial.”

The breakthrough also involves operating in the “latent space of pretrained autoencoders, which proves more effective than direct pixel-level modeling,” according to the paper. This approach allows the model to work with compressed representations of images rather than raw pixel data, significantly improving efficiency.
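
The pipeline the paper describes can be caricatured in a few lines. The encoder and decoder below are crude stand-in linear maps, assumed purely for illustration; they are not Apple's pretrained autoencoder:

```python
import random

# Hedged sketch of latent-space generative modeling: compress images
# with a (pretrained) autoencoder, fit the generative model -- a flow,
# in STARFlow's case -- on the compact latents, then decode samples
# back to pixels. Dimensions here are invented for illustration.
LATENT_DIM = 4   # assumed: far smaller than the pixel dimension

def encode(pixels):
    # stand-in "pretrained" encoder: average-pool fixed pixel blocks
    block = len(pixels) // LATENT_DIM
    return [sum(pixels[i * block:(i + 1) * block]) / block
            for i in range(LATENT_DIM)]

def decode(latent):
    # stand-in decoder: upsample each latent value back to its block
    block = 4
    return [v for v in latent for _ in range(block)]

pixels = [random.random() for _ in range(16)]
z = encode(pixels)          # generative modeling happens in this 4-d space
reconstruction = decode(z)  # samples are decoded back to 16 "pixels"
assert len(z) == LATENT_DIM and len(reconstruction) == len(pixels)
```

The payoff is the one the paper claims: the expensive flow only ever sees the small latent vector, not the full pixel grid.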

Unlike diffusion models, which rely on iterative denoising processes, STARFlow maintains the mathematical properties of normalizing flows, enabling “exact maximum likelihood training in continuous spaces without discretization.”
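
That property can be made concrete with a toy case: for a one-layer affine flow over a Gaussian base, exact maximum likelihood has a closed form, so "training" reduces to sample statistics. This is illustrative only; STARFlow optimizes a deep transformer flow by gradient descent on the same exact likelihood:

```python
import math
import random

# Toy exact maximum-likelihood fit of an affine flow x = a*z + b with
# z ~ N(0, 1). Because the flow's density is exact (no denoising
# approximation), the MLE is simply the sample mean and standard
# deviation of the data. True parameters below are invented.
random.seed(0)
data = [random.gauss(3.0, 1.5) for _ in range(10_000)]

b_hat = sum(data) / len(data)                                  # MLE shift
a_hat = math.sqrt(sum((x - b_hat) ** 2 for x in data) / len(data))  # MLE scale

# Recovered parameters land close to the generating values (3.0, 1.5):
assert abs(b_hat - 3.0) < 0.1
assert abs(a_hat - 1.5) < 0.1
```

Diffusion models instead optimize a variational or score-matching surrogate, which is why exact likelihoods are a distinguishing selling point for flows.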

What STARFlow means for Apple’s future iPhone and Mac products

The research arrives as Apple faces increasing pressure to demonstrate meaningful progress in artificial intelligence. A recent Bloomberg analysis highlighted how Apple Intelligence and Siri have struggled to compete with rivals, while Apple’s modest announcements at WWDC this week underscored the company’s challenges in the AI space.

For Apple, STARFlow’s exact likelihood training could offer advantages in applications requiring precise control over generated content or in scenarios where understanding model uncertainty is critical for decision-making — potentially valuable for enterprise applications and on-device AI capabilities that Apple has emphasized.

The research demonstrates that alternative approaches to diffusion models can achieve comparable results, potentially opening new avenues for innovation that could play to Apple’s strengths in hardware-software integration and on-device processing.

Why Apple is betting on university partnerships to solve its AI problem

The research exemplifies Apple’s strategy of collaborating with leading academic institutions to advance its AI capabilities. Co-author Tianrong Chen, a PhD student at Georgia Tech who interned with Apple’s machine learning research team, brings expertise in stochastic optimal control and generative modeling.

The collaboration also includes Ruixiang Zhang from UC Berkeley’s mathematics department and Laurent Dinh, a machine learning researcher known for pioneering work on flow-based models during his time at Google Brain and DeepMind.

“Crucially, our model remains an end-to-end normalizing flow,” the researchers emphasized, distinguishing their approach from hybrid methods that sacrifice mathematical tractability for improved performance.

The full research paper is available on arXiv, providing technical details for researchers and engineers looking to build upon this work in the competitive field of generative AI. While STARFlow represents a significant technical achievement, the real test will be whether Apple can translate such research breakthroughs into the kind of consumer-facing AI features that have made products like ChatGPT household names. For a company that once revolutionized entire industries with products like the iPhone, the question isn’t whether Apple can innovate in AI — it’s whether it can do so fast enough.


Explore More Insights

Stay ahead with more perspectives on cutting-edge power, infrastructure, energy, bitcoin and AI solutions. Explore these articles to uncover strategies and insights shaping the future of industries.


AWS cuts prices of some EC2 Nvidia GPU-accelerated instances

“Price cuts on P4d, P4de, P5, and P5en GPU instances suggest a targeted price competition move. These instances, powered by Nvidia A100 and H100-class GPUs, are central to generative AI workloads and already in demand,” said Kaustubh K, practice director at Everest Group. “The reductions can be considered all about

Read More »

Ending the great network depression

This brings us to AI. In both these enterprise examples, we see the concept of a new technology model deploying from a seedling-like start in a single location and then expanding outward and, at the same time, expanding to other related areas of business operation. Through this double-build-out, to be

Read More »

Nvidia aims to bring AI to wireless

Key features of ARC-Compact include:
- Energy efficiency: Utilizing the L4 GPU (72-watt power footprint) and an energy-efficient ARM CPU, ARC-Compact aims for a total system power comparable to custom baseband unit (BBU) solutions currently in use.
- 5G vRAN support: It fully supports 5G TDD, FDD, massive MIMO, and all O-RAN

Read More »

China Gas Sector Lobbies for More Power Plants to Boost Demand

China’s natural gas producers are lobbying Beijing to increase the number of power plants that run on the fuel, in a bid to help prop up faltering demand. The power sector – which currently accounts for 18 percent of China’s gas consumption – is viewed by the industry as a key engine of growth, according to people involved in advising on energy policy. Under the sector’s latest proposal, China would build nearly 70 gigawatts of new gas-fired capacity by 2030, an almost 50 percent increase from 2025’s estimated level, they said, asking not to be named as the plan is not public.

The government has started collecting proposals as it drafts China’s next five-year plan, which will be ratified by the National People’s Congress in March 2026. The strategic blueprint will outline economy-wide targets that balance growth, decarbonization and energy security goals.

China’s gas demand, once fast-expanding, has slowed over the last few years due to weaker industrial activity, booming renewable-energy supply and a continued reliance on coal. An unseasonably warm winter and strong inventories have prompted analysts to cut forecasts for China’s imports of liquefied natural gas in 2025, with deliveries slated to fall compared to the previous year.

For domestic drillers, which have increasingly leaned on gas as oil consumption too stutters, expanding the amount that can be sold to the power sector offers a way to offset weaker growth in heating and elsewhere. Slowing urbanization and improved air quality have essentially ended a decade-long coal-to-gas transition among households.

China is advancing energy market reforms that will favor more cost-effective sources of electricity generation. Although gas power is more expensive than solar, which currently trades at less than half the price, it’s able to ramp up more quickly than baseload coal or nuclear. That agility could secure the fuel a larger role

Read More »

UK Trials Underwater Defense Robot

The Defense Science and Technology Laboratory (Dstl) has trialed an underwater robot which can prevent adversaries from sabotaging undersea cables and pipelines by disarming or removing threats, a statement posted on the UK government website on Monday said.

A commercially available remotely operated vehicle (ROV) has been adapted by the Dstl and industry partners to deal with sabotage threats and clear legacy unexploded ordnance, the statement noted, adding that these present hazards to both vessels and divers deployed to deal with them. “Dstl has incorporated or developed a number of systems to enable the ROV to detect unexploded ordnance and remotely place explosive charges to enable safe neutralization,” the statement said. “The new technology and systems developed will work in partnership with other robots to scan the seabed for hazards and will be able to deal with them once one is spotted,” it added.

The robot can be launched from a ship or a shoreline and is operated remotely, the statement highlighted, pointing out that it feeds video and sonar images back to the operators. “The robot is not normally destroyed, which means it can be used multiple times giving the public better value for money in addition to the economic benefits of partnering with industry,” the statement said, adding that this project “supports numerous specialist jobs in industry”. The statement also noted that the robot can operate at depths greater than divers can reach and pointed out that it can “work there safely for much longer”.

According to the statement, trials have taken place at Horsea Island in Portsmouth, Portland Harbour, South Wales, and Norway. Alford Technologies, Atlantas Marine, Sonardyne and ECS Special Projects are among the industry partners involved, the statement said. “This technology would be a valuable toolset for keeping our Armed Forces safe whilst providing the public with value

Read More »

PTAS Aker Solutions Secures Contract Extension with Brunei Shell Petroleum

PTAS Aker Solutions Sdn. Bhd., a joint venture between Aker Solutions ASA and PTAS Sdn. Bhd., has secured a two-year deal extension with Brunei Shell Petroleum (BSP) for offshore maintenance and modification services. Aker Solutions said in a media release that the contract extension is a result of BSP exercising an option included in the current agreement. Aker Solutions said the deal is a significant one, meaning it is valued between NOK 1.5 billion ($149.2 million) and NOK 2.5 billion ($248.6 million). The work will be managed by PTAS Aker Solutions’ office in Kuala Belait, Brunei Darussalam. The scope of work covers maintenance and upgrades to sustain production levels across offshore assets in the South China Sea, with PTAS Aker Solutions serving as the main contractor, Aker Solutions said. “We look forward to leveraging an optimized delivery model and driving targeted improvement initiatives during this contract period. As the main contractor, we are committed to enhancing new ways of working and improving performance and efficiency while delivering cost reductions across the value chain”, Paal Eikeseth, Executive Vice President and head of Aker Solutions’ Life Cycle Business, said. Aker Solutions said it had obtained its initial ORMC contract with Brunei Shell Petroleum in 2012 and renewed the contract in 2020 under the name PTAS Aker Solutions. PTAS Aker Solutions merges local execution skills with the extensive international expertise of Aker Solutions. This contract will be recorded as an order intake in the Life Cycle segment during the second quarter of 2025, Aker Solutions said.

Read More »

SAKURA Internet, JERA Explore Building Data Centers

SAKURA internet Inc. has signed a deal with JERA Co. Inc. to explore the potential of establishing data centers (DCs) connected to JERA’s power infrastructure. To meet customer demand for both stable supply and decarbonization, JERA is creating a clean energy platform, aiming to deliver low-carbon and decarbonized electricity on a large scale. The agreement involves exploring the establishment of data centers on the grounds of existing thermal power stations owned by JERA, JERA said in a media release. Progress in this exploration will enable SAKURA internet to provide the digital infrastructure that utilizes clean energy from JERA. By promoting the integration of power and telecommunications (watt-bit collaboration), the companies aim to accelerate the development of DCs that are crucial for upgrading Japan’s industrial structure, thereby helping to address the country’s “digital deficit”, make efficient use of power infrastructure, contribute to decarbonization, and enhance Japan’s industrial competitiveness, JERA said. Potential sites include Tokyo Bay. A key aspect of their collaboration involves using JERA’s regasification terminals to provide cold energy, aiming to significantly reduce energy consumption at the DCs. Additionally, the partnership will explore various technologies to decarbonize that energy in the future.

Read More »

BofA Sees Saudis Embarking on Long But Shallow Oil Price War

OPEC+’s oil-output hikes are part of a Saudi strategy that will see the kingdom embark on a long but shallow price war designed to recapture market share, Bank of America Corp.’s head of commodities research said. The producer group, of which Saudi Arabia is the de-facto leader, announced a third output increase of more than 400,000 barrels a day last month, bigger than previously planned. The additions are reversing years of supply curbs that were aimed at keeping prices higher. “It’s not a price war that is going to be short and steep; rather it’s going to be a price war that is long and shallow,” BofA’s Francisco Blanch said in a Bloomberg Television interview. That reflects a desire to take market share from US shale, which is in relatively good health but faces higher costs of production, he said. The kingdom is also working to regain market share from fellow OPEC+ members, according to Blanch. “They’ve done this price support already by themselves for three-plus years,” which has allowed competitors’ output to rise, he said. “They’re done with that.” Blanch noted that the change in strategy is already producing results, with the latest US oil-drilling data from Baker Hughes Co. showing the lowest rig count in about four years.

Read More »

Oil Edges Up as Traders Await Outcome of USA-China Trade Talks

Oil extended last week’s gain as a renewed round of US-China trade talks offered the potential for reduced global tensions. West Texas Intermediate futures gained 1.1% to settle near $65 a barrel, the highest price since early April. Negotiators from the US and China were holding talks in London on Monday, raising the possibility the two largest economies can make progress on disputes that have rattled markets this year. Commodity trading advisers, which can accelerate price momentum, liquidated short positions to sit flat in WTI on Monday, compared with 64% short on June 5, according to data from Bridgeton Research Group. A 3% to 4% price move higher from current levels may trigger the funds to flip to net-long for the first time since February, the group added. The United Nations nuclear watchdog, meanwhile, said Iran’s rapidly increasing stockpile of uranium can’t be ignored ahead of a consequential meeting this week in Vienna. Traders have been keeping a close eye on the progress of nuclear talks between Washington and Tehran, with a setback potentially crimping flows from the OPEC member. Crude has recovered after plunging earlier this year on the twin hit of bumper OPEC+ supply increases and concerns about the outlook for demand following President Donald Trump’s tariff policies. Now, though, the peak summer demand season is looming and markets are looking tighter. The nearest US crude futures are trading more than $1 above the next month, which indicates tight short-term supplies. WTI for July delivery climbed 1.1% to settle at $65.29 a barrel in New York. Futures jumped 6.2% last week. Brent for August settlement rose 0.9% to settle at $67.04 a barrel.

Read More »

Qualcomm’s $2.4B Alphawave deal signals bold data center ambitions

Qualcomm says its Oryon CPU and Hexagon NPU processors are “well positioned” to meet growing demand for high-performance, low-power compute as AI inferencing accelerates and more enterprises move to custom CPUs housed in data centers. “Qualcomm’s advanced custom processors are a natural fit for data center workloads,” Qualcomm president and CEO Cristiano Amon said in the press release. Alphawave’s connectivity and compute technologies can work well with the company’s CPU and NPU cores, he noted. The deal is expected to close in the first quarter of 2026.

Complementing the ‘great CPU architecture’ Qualcomm has been amassing

Client CPUs have been a “big play” for Qualcomm, Moor’s Kimball noted; the company acquired chip design company Nuvia in 2021 for $1.4 billion and has also announced that it will be designing data center CPUs with Saudi AI company Humain. “But there was a lot of data center IP that was equally valuable,” he said. This acquisition of Alphawave will help Qualcomm complement the “great CPU architecture” it acquired from Nuvia with the latest in connectivity tools that link a compute complex with other devices, as well as with chip-to-chip communications, and all of the “very low level architectural goodness” that allows compute cores to deliver “absolute best performance.” “When trying to move data from, say, high bandwidth memory to the CPU, Alphawave provides the IP that helps chip companies like Qualcomm,” Kimball explained. “So you can see why this is such a good complement.”

Read More »

LiquidStack launches cooling system for high density, high-powered data centers

The CDU is serviceable from the front of the unit, with no rear or end access required, allowing the system to be placed against the wall. The skid-mounted system can come with rail and overhead piping pre-installed or shipped as separate cabinets for on-site assembly. The single-phase system has high-efficiency dual pumps designed to protect critical components from leaks, and a centralized design with separate pump and control modules reduces both the number of components and complexity. “AI will keep pushing thermal output to new extremes, and data centers need cooling systems that can be easily deployed, managed, and scaled to match heat rejection demands as they rise,” said Joe Capes, CEO of LiquidStack in a statement. “With up to 10MW of cooling capacity at N, N+1, or N+2, the GigaModular is a platform like no other—we designed it to be the only CDU our customers will ever need. It future-proofs design selections for direct-to-chip liquid cooling without traditional limits or boundaries.”

Read More »

Enterprises face data center power design challenges

“Now, with AI, GPUs need data to do a lot of compute and send that back to another GPU. That connection needs to be close together, and that is what’s pushing the density, the chips are more powerful and so on, but the necessity of everything being close together is what’s driving this big revolution,” he said. That revolution in architecture means new data center designs. Cordovil said that instead of putting the power shelves within the rack, system administrators are putting a sidecar next to those racks and loading the sidecar with the power system, which serves two to four racks. This allows for more compute per rack and lower latency since the data doesn’t have to travel as far. The problem is that 1 MW racks are uncharted territory and no one knows how to manage the power, which is considerable now. “There’s no user manual that says, hey, just follow this and everything’s going to be all right. You really need to push the boundaries of understanding how to work. You need to start designing something somehow, so that is a challenge to data center designers,” he said. And this brings up another issue: many corporate data centers have power plugs that are like the ones that you have at home, more or less, so they didn’t need to have an advanced electrician certification. “We’re not playing with that power anymore. You need to be very aware of how to connect something. Some of the technicians are going to need to be certified electricians, which is a skills gap in the market that we see in most markets out there,” said Cordovil. A CompTIA A+ certification will teach you the basics of power, but not the advanced skills needed for these increasingly dense racks. Cordovil

Read More »

HPE Nonstop servers target data center, high-throughput applications

HPE has bumped up the size and speed of its fault-tolerant Nonstop Compute servers. There are two new servers – the 8TB, Intel Xeon-based Nonstop Compute NS9 X5 and Nonstop Compute NS5 X5 – aimed at enterprise customers looking to upgrade their transaction processing network infrastructure or support larger application workloads. Like other HPE Nonstop systems, the two new boxes include compute, software, storage, networking and database resources as well as full-system clustering and HPE’s specialized Nonstop operating system. The flagship NS9 X5 features support for dual-fabric HDR200 InfiniBand interconnect, which effectively doubles the interconnect bandwidth between it and other servers compared to the current NS8 X4, according to an HPE blog detailing the new servers. It supports up to 270 networking ports per NS9 X system, can be clustered with up to 16 other NS9 X5s, and can support 25 GbE network connectivity for modern data center integration and high-throughput applications, according to HPE.

Read More »

AI boom exposes infrastructure gaps: APAC’s data center demand to outstrip supply by 42%

“Investor confidence in data centres is expected to strengthen over the remainder of the decade,” the report said. “Strong demand and solid underlying fundamentals fuelled by AI and cloud services growth will provide a robust foundation for investors to build scale.” Enterprise strategies must evolve With supply constrained and prices rising, CBRE recommended that enterprises rethink data center procurement models. Waiting for optimal sites or price points is no longer viable in many markets. Instead, enterprises should pursue early partnerships with operators that have robust development pipelines and focus on securing power-ready land. Build-to-suit models are becoming more relevant, especially for larger capacity requirements. Smaller enterprise facilities — those under 5MW — may face sustainability challenges in the long term. The report suggested that these could become “less relevant” as companies increasingly turn to specialized colocation and hyperscale providers. Still, traditional workloads will continue to represent up to 50% of total demand through 2030, preserving value in existing facilities for non-AI use cases, the report added. The region’s projected 15 to 25 GW gap is more than a temporary shortage — it signals a structural shift, CBRE said. Enterprises that act early to secure infrastructure, invest in emerging markets, and align with power availability will be best positioned to meet digital transformation goals. “Those that wait may find themselves locked out of the digital infrastructure they need to compete,” the report added.

Read More »

Cisco bolsters DNS security package

The software can block domains associated with phishing, malware, botnets, and other high-risk categories such as cryptomining or new domains that haven’t been reported previously. It can also create custom block and allow lists and offers the ability to pinpoint compromised systems using real-time security activity reports, Brunetto wrote. According to Cisco, many organizations leave DNS resolution to their ISP. “But the growth of direct enterprise internet connections and remote work make DNS optimization for threat defense, privacy, compliance, and performance ever more important,” Cisco stated. “Along with core security hygiene, like a patching program, strong DNS-layer security is the leading cost-effective way to improve security posture. It blocks threats before they even reach your firewall, dramatically reducing the alert pressure your security team manages.” “Unlike other Secure Service Edge (SSE) solutions that have added basic DNS security in a ‘checkbox’ attempt to meet market demand, Cisco Secure Access – DNS Defense embeds strong security into its global network of 50+ DNS data centers,” Brunetto wrote. “Among all SSE solutions, only Cisco’s features a recursive DNS architecture that ensures low-latency, fast DNS resolution, and seamless failover.”

Read More »

Microsoft will invest $80B in AI data centers in fiscal 2025

And Microsoft isn’t the only one that is ramping up its investments into AI-enabled data centers. Rival cloud service providers are all investing in either upgrading or opening new data centers to capture a larger chunk of business from developers and users of large language models (LLMs). In a report published in October 2024, Bloomberg Intelligence estimated that demand for generative AI would push Microsoft, AWS, Google, Oracle, Meta, and Apple to devote a combined $200 billion to capex in 2025, up from $110 billion in 2023. Microsoft is one of the biggest spenders, followed closely by Google and AWS, Bloomberg Intelligence said. Its estimate of Microsoft’s capital spending on AI, at $62.4 billion for calendar 2025, is lower than Smith’s claim that the company will invest $80 billion in the fiscal year to June 30, 2025. Both figures, though, are way higher than Microsoft’s 2020 capital expenditure of “just” $17.6 billion. The majority of the increased spending is tied to cloud services and the expansion of AI infrastructure needed to provide compute capacity for OpenAI workloads. Separately, last October Amazon CEO Andy Jassy said his company planned total capex spend of $75 billion in 2024 and even more in 2025, with much of it going to AWS, its cloud computing division.

Read More »

John Deere unveils more autonomous farm machines to address skilled labor shortage

Self-driving tractors might be the path to self-driving cars. John Deere has revealed a new line of autonomous machines and tech across agriculture, construction and commercial landscaping. The Moline, Illinois-based John Deere has been in business for 187 years, yet it’s been a regular as a non-tech company showing off technology at the big tech trade show in Las Vegas and is back at CES 2025 with more autonomous tractors and other vehicles. This is not something we usually cover, but John Deere has a lot of data that is interesting in the big picture of tech. The message from the company is that there aren’t enough skilled farm laborers to do the work that its customers need. It’s been a challenge for most of the last two decades, said Jahmy Hindman, CTO at John Deere, in a briefing. Much of the tech will come this fall and after that. He noted that the average farmer in the U.S. is over 58 and works 12 to 18 hours a day to grow food for us. And he said the American Farm Bureau Federation estimates there are roughly 2.4 million farm jobs that need to be filled annually, and the agricultural work force continues to shrink. (This is my hint to the anti-immigration crowd). John Deere’s autonomous 9RX Tractor. Farmers can oversee it using an app. While each of these industries experiences its own set of challenges, a commonality across all is skilled labor availability. In construction, about 80% of contractors struggle to find skilled labor. And in commercial landscaping, 86% of landscaping business owners can’t find labor to fill open positions, he said. “They have to figure out how to do

Read More »

2025 playbook for enterprise AI success, from agents to evals

2025 is poised to be a pivotal year for enterprise AI. The past year has seen rapid innovation, and this year will see the same. This has made it more critical than ever to revisit your AI strategy to stay competitive and create value for your customers. From scaling AI agents to optimizing costs, here are the five critical areas enterprises should prioritize for their AI strategy this year. 1. Agents: the next generation of automation AI agents are no longer theoretical. In 2025, they’re indispensable tools for enterprises looking to streamline operations and enhance customer interactions. Unlike traditional software, agents powered by large language models (LLMs) can make nuanced decisions, navigate complex multi-step tasks, and integrate seamlessly with tools and APIs. At the start of 2024, agents were not ready for prime time, making frustrating mistakes like hallucinating URLs. They started getting better as frontier large language models themselves improved. “Let me put it this way,” said Sam Witteveen, cofounder of Red Dragon, a company that develops agents for companies, and that recently reviewed the 48 agents it built last year. “Interestingly, the ones that we built at the start of the year, a lot of those worked way better at the end of the year just because the models got better.” Witteveen shared this in the video podcast we filmed to discuss these five big trends in detail. Models are getting better and hallucinating less, and they’re also being trained to do agentic tasks. Another feature that the model providers are researching is a way to use the LLM as a judge, and as models get cheaper (something we’ll cover below), companies can use three or more models to

Read More »

OpenAI’s red teaming innovations define new essentials for security leaders in the AI era

OpenAI has taken a more aggressive approach to red teaming than its AI competitors, demonstrating its security teams’ advanced capabilities in two areas: multi-step reinforcement and external red teaming. OpenAI recently released two papers that set a new competitive standard for improving the quality, reliability and safety of AI models in these two techniques and more. The first paper, “OpenAI’s Approach to External Red Teaming for AI Models and Systems,” reports that specialized teams outside the company have proven effective in uncovering vulnerabilities that might otherwise have made it into a released model because in-house testing techniques may have missed them. In the second paper, “Diverse and Effective Red Teaming with Auto-Generated Rewards and Multi-Step Reinforcement Learning,” OpenAI introduces an automated framework that relies on iterative reinforcement learning to generate a broad spectrum of novel, wide-ranging attacks. Going all-in on red teaming pays practical, competitive dividends It’s encouraging to see competitive intensity in red teaming growing among AI companies. When Anthropic released its AI red team guidelines in June of last year, it joined AI providers including Google, Microsoft, Nvidia, OpenAI, and even the U.S.’s National Institute of Standards and Technology (NIST), which all had released red teaming frameworks. Investing heavily in red teaming yields tangible benefits for security leaders in any organization. OpenAI’s paper on external red teaming provides a detailed analysis of how the company strives to create specialized external teams that include cybersecurity and subject matter experts. The goal is to see if knowledgeable external teams can defeat models’ security perimeters and find gaps in their security, biases and controls that prompt-based testing couldn’t find.
What makes OpenAI’s recent papers noteworthy is how well they define using human-in-the-middle

Read More »