Companies are rushing AI agents into production, and many of those deployments will fail. But the reason has nothing to do with their AI models.
On day two of VB Transform 2025, industry leaders shared hard-won lessons from deploying AI agents at scale. A panel moderated by Joanne Chen, general partner at Foundation Capital, included Sean Malhotra, CTO at Rocket Companies, which uses agents across the home ownership journey from mortgage underwriting to customer chat; Shailesh Nalawadi, head of product at Sendbird, which builds agentic customer service experiences for companies across multiple verticals; and Thys Waanders, SVP of AI transformation at Cognigy, whose platform automates customer experiences for large enterprise contact centers.
Their shared discovery: Companies that build evaluation and orchestration infrastructure first succeed, while those that rush powerful models into production fail at scale.
The ROI reality: Beyond simple cost cutting
A key part of engineering AI agents for success is understanding the return on investment (ROI). Early AI agent deployments focused on cost reduction. While that remains a key component, enterprise leaders now report more complex ROI patterns that demand different technical architectures.
Cost reduction wins
Malhotra shared the most dramatic cost example from Rocket Companies. “We had an engineer [who] in about two days of work was able to build a simple agent to handle a very niche problem called ‘transfer tax calculations’ in the mortgage underwriting part of the process. And that two days of effort saved us a million dollars a year in expense,” he said.
For Cognigy, Waanders pointed to cost per call as a key contact-center metric. When AI agents automate parts of a call, he said, the average handling time can drop.
Revenue generation methods
Saving money is one thing; making more revenue is another. Malhotra reported that his team has also seen conversion gains: Clients who get answers to their questions faster and have a good experience convert at higher rates.
Proactive revenue opportunities
Nalawadi highlighted entirely new revenue opportunities from proactive outreach: customer service agents that reach out before customers even realize they have a problem.
A food delivery example illustrates this perfectly. “They already know when an order is going to be late, and rather than waiting for the customer to get upset and call them, they realize that there was an opportunity to get ahead of it,” he said.
Why AI agents break in production
While there are solid ROI opportunities for enterprises that deploy agentic AI, there are also some challenges in production deployments.
Nalawadi identified the core technical failure: Companies build AI agents without evaluation infrastructure.
“Before you even start building it, you should have an eval infrastructure in place,” Nalawadi said. “All of us used to be software engineers. No one deploys to production without running unit tests. And I think a very simplistic way of thinking about eval is that it’s the unit test for your AI agent system.”
Traditional software testing approaches don't work for AI agents. It's impossible to enumerate every input or write comprehensive test cases for natural language interactions. Nalawadi's team learned this through customer service deployments across retail, food delivery and financial services, where standard quality assurance missed edge cases that emerged only in production.
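The panelists didn't walk through implementations, but an eval layer in the spirit Nalawadi describes can start small. Below is a minimal sketch of agent evals as unit tests; the `run_agent` stub and the keyword-based rubric are hypothetical placeholders for illustration, not Sendbird's API.

```python
# Minimal sketch of an agent eval harness: "unit tests for your AI
# agent system." run_agent and the keyword rubric are hypothetical
# stand-ins for a real agent call and a real grading method.
from dataclasses import dataclass

@dataclass
class EvalCase:
    prompt: str                  # user input to replay against the agent
    must_contain: list[str]      # content the response must include
    must_not_contain: list[str]  # e.g. policy violations, false promises

CASES = [
    EvalCase(
        prompt="My order is 40 minutes late. Where is it?",
        must_contain=["apolog", "track"],
        must_not_contain=["refund guaranteed"],
    ),
]

def run_agent(prompt: str) -> str:
    """Stub standing in for the real agent call (API, local model, etc.)."""
    return "I apologize for the delay. Here is a tracking link: ..."

def run_evals() -> None:
    failures = []
    for case in CASES:
        response = run_agent(case.prompt).lower()
        if not all(s in response for s in case.must_contain):
            failures.append((case.prompt, "missing required content"))
        if any(s in response for s in case.must_not_contain):
            failures.append((case.prompt, "forbidden content present"))
    # Gate deployment on the eval suite, exactly like a unit test run.
    assert not failures, f"{len(failures)} eval case(s) failed: {failures}"

run_evals()
```

In a real pipeline, the keyword checks would typically give way to regression transcripts or an LLM-as-judge scorer, but the principle is the same: no deployment until the suite passes.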
AI testing AI: The new quality assurance paradigm
Given the complexity of AI testing, what should organizations do? Waanders solved the testing problem through simulation.
“We have a feature that we’re releasing soon that is about simulating potential conversations,” Waanders explained. “So it’s essentially AI agents testing AI agents.”
The testing goes beyond conversation quality; it's behavioral analysis at scale. How does an agent respond to angry customers? How does it handle multiple languages? What happens when customers use slang?
“The biggest challenge is you don’t know what you don’t know,” Waanders said. “How does it react to anything that anyone could come up with? You only find it out by simulating conversations, by really pushing it under thousands of different scenarios.”
The approach tests demographic variations, emotional states and edge cases that human QA teams can’t cover comprehensively.
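Cognigy has not published how its simulation feature works, but the pattern Waanders describes is straightforward to sketch: one model plays the customer under systematically varied personas while the agent under test responds. The `simulate_user`, `agent_reply` and `score_transcript` functions below are hypothetical stand-ins.

```python
# Sketch of "AI agents testing AI agents": a simulator plays the
# customer across many personas while the production agent responds.
import itertools

MOODS = ["calm", "angry", "confused"]
LANGUAGES = ["English", "Spanish", "German"]
STYLES = ["formal", "slang-heavy"]

def simulate_user(persona: dict, history: list[str]) -> str:
    # In production: an LLM prompted to act as a customer with this persona.
    return f"[{persona['mood']}/{persona['language']}/{persona['style']}] Where is my order?"

def agent_reply(history: list[str]) -> str:
    # In production: the agent under test.
    return "Let me check on that for you."

def score_transcript(transcript: list[str]) -> float:
    # In production: LLM-as-judge or rubric scoring of the conversation.
    return 1.0

def run_simulations(turns: int = 6) -> list[tuple[dict, float]]:
    results = []
    # Cross every mood, language and style. A real run would expand
    # this grid into thousands of scenarios, far more than human QA
    # teams can cover.
    for mood, lang, style in itertools.product(MOODS, LANGUAGES, STYLES):
        persona = {"mood": mood, "language": lang, "style": style}
        transcript: list[str] = []
        for _ in range(turns):
            transcript.append(simulate_user(persona, transcript))
            transcript.append(agent_reply(transcript))
        results.append((persona, score_transcript(transcript)))
    return results

print(len(run_simulations()), "scenarios simulated")
```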
The coming complexity explosion
Current AI agents handle single tasks independently. Enterprise leaders need to prepare for a different reality: Hundreds of agents per organization learning from each other.
The infrastructure implications are massive. When agents share data and collaborate, failure modes multiply exponentially. Traditional monitoring systems can’t track these interactions.
Companies must architect for this complexity now. Retrofitting infrastructure for multi-agent systems costs significantly more than building it correctly from the start.
“If you fast forward in what’s theoretically possible, there could be hundreds of them in an organization, and perhaps they are learning from each other,” Chen said. “The number of things that could happen just explodes. The complexity explodes.”
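None of the panelists described their monitoring stacks, but the gap they point to is concrete: Per-agent logs stop being enough once agents call each other. Below is a minimal sketch of correlated tracing across agent-to-agent hops, with illustrative names rather than any specific product's API.

```python
# Sketch of agent-to-agent interaction tracing. Every hop in a chain
# of agent calls is logged under one shared trace_id, so a failure
# anywhere in a multi-agent workflow can be followed end to end.
import time
import uuid
from typing import Callable

TRACE_LOG: list[dict] = []

def traced_call(caller: str, callee: str, handler: Callable[[str], str],
                message: str, trace_id: str) -> str:
    start = time.time()
    response = handler(message)
    # One record per hop; all hops for a request share the trace_id.
    TRACE_LOG.append({
        "trace_id": trace_id,
        "caller": caller,
        "callee": callee,
        "message": message,
        "response": response,
        "latency_s": round(time.time() - start, 3),
    })
    return response

# Usage: mint one trace_id per user request, reuse it on every hop.
trace_id = str(uuid.uuid4())
answer = traced_call("router", "underwriting_agent",
                     lambda m: "stub response", "calc transfer tax", trace_id)
print(TRACE_LOG)
```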
