Google’s Jules aims to out-code Codex in battle for the AI developer stack

Vibe coding and the growth of AI-powered coding platforms have given rise to yet another battleground among tech companies.

In December, Google released Jules, an autonomous coding agent that can fix bugs asynchronously, as an experiment. At Google I/O, however, the company announced that Jules will now be available in beta.

With the broader release of Jules, Google positions itself as a strong competitor against a rising number of AI coding assistants designed to write, check and fix code autonomously. 

Josh Woodward, vice president of Google Labs, told reporters in a briefing that Jules “will be available to help developers fix bugs, create tests, consult documentation all happening in the background.”

“People are describing apps into existence,” Woodward said. “This started out as an asynchronous coding agent with the idea that, what if you created a way where you could assign tasks to this agent for the things you didn’t want to do?”

Jules will be integrated into GitHub and uses Google’s Gemini 2.5 Pro. During the public beta phase, developers can access Jules for free but with usage limits. 

Asynchronous and parallel

Jules works asynchronously, allowing developers to assign it a task while they work separately on something else. It runs tasks inside a virtual machine, shows tasks and their reasoning and even offers audio summaries. 

But Jules is not the only asynchronous and parallel task coding agent around, nor is it the only one announced in May. 

OpenAI surprised the industry by releasing a research preview of its coding agent Codex, after rumors circulated that the company would buy the coding startup Windsurf. Codex began life as a coding model but has since transformed into a coding agent able to write code, fix bugs and answer codebase questions in a separate sandbox.

Codex was also behind one of the first code completion assistants, GitHub Copilot. During Microsoft Build this week, GitHub announced GitHub Copilot Agent, which does much of the same asynchronous work as Codex and Jules.

The upcoming arms race around coding agents is gaining interest on social media, even before Jules and Codex are fully released to the public.

These more autonomous coding platforms follow the growth of “vibe coding,” where code and applications are generated mostly through prompting rather than written by hand. The entrance of Big Tech companies like Google and OpenAI into this arena brings coding agents even more to the forefront of the AI arms race.

More AI-powered code

Even inside Google, Jules is not the only AI platform for building applications. The company also offers Code Assist, AI Studio and Firebase.

Firebase, announced in April, allows non-coders to build applications and add AI features. Google updated the platform, adding a new AI Workspace for Firebase Studio and Firebase AI Logic for monitoring AI usage. 

Firebase Studio is powered by Gemini 2.5 Pro so that people can build more sophisticated applications. Firebase AI Logic offers developers the means to add features to the app’s backend, like authentication and identity. It also allows people to check token usage or resolve latency issues without needing a third-party orchestration program.

Jeanine Banks, vice president and general manager for Developer X and head of Developer Relations at Google, told VentureBeat that Firebase differentiates itself from Jules and other Google coding products by being the first place people new to coding can experiment with making their own AI applications. 

“Google offers many wonderful tools to help you with specialized parts of your stack. So, for example, you can use Google AI Studio, which helps in experimenting with your AI inference to figure out the best optimized prompts,” Banks said. “But Firebase is the single place that integrates all of those things together, and it’s a single place for full-stack developers and professionals, but also creators who are vibe coding.”
