Google’s AlphaEvolve: The AI agent that reclaimed 0.7% of Google’s compute – and how to copy it

Google’s new AlphaEvolve shows what happens when an AI agent graduates from lab demo to production work, and you’ve got one of the most talented technology companies driving it.

Built by Google DeepMind, the system autonomously rewrites critical code and already pays for itself inside Google. It shattered a 56-year-old record in matrix multiplication (the core of many machine learning workloads) and clawed back 0.7% of compute capacity across the company’s global data centers.

Those headline feats matter, but the deeper lesson for enterprise tech leaders is how AlphaEvolve pulls them off. Its architecture – controller, fast-draft models, deep-thinking models, automated evaluators and versioned memory – illustrates the kind of production-grade plumbing that makes autonomous agents safe to deploy at scale.

Google’s AI technology is arguably second to none. So the trick is figuring out how to learn from it, or even use it directly. Google says an Early Access Program is coming for academic partners and that “broader availability” is being explored, but details are thin. Until then, AlphaEvolve is a best-practice template: If you want agents that touch high-value workloads, you’ll need comparable orchestration, testing and guardrails.

Consider just the data center win. Google won’t put a price tag on the reclaimed 0.7%, but its annual capex runs into the tens of billions of dollars. Even a rough estimate puts the savings in the hundreds of millions annually—enough, as independent developer Sam Witteveen noted on our recent podcast, to pay for training one of the flagship Gemini models, estimated to cost upwards of $191 million for a version like Gemini Ultra.

VentureBeat was the first to report the AlphaEvolve news earlier this week. Now we’ll go deeper: how the system works, where the engineering bar really sits and the concrete steps enterprises can take to build (or buy) something comparable.

1. Beyond simple scripts: The rise of the “agent operating system”

AlphaEvolve runs on what is best described as an agent operating system – a distributed, asynchronous pipeline built for continuous improvement at scale. Its core pieces are a controller, a pair of large language models (Gemini Flash for breadth; Gemini Pro for depth), a versioned program-memory database and a fleet of evaluator workers, all tuned for high throughput rather than just low latency.
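
To make that plumbing concrete, here is a minimal sketch of what an AlphaEvolve-style loop could look like. Everything named below (ProgramDB, the model clients, the evaluate function) is a hypothetical stand-in rather than Google's implementation; the point is the shape of the loop: versioned memory in, parallel evaluation out.

```python
# Illustrative sketch of an AlphaEvolve-style controller loop; not Google's code.
# program_db, draft_model, refine_model and evaluate() are hypothetical stand-ins.
import concurrent.futures

def controller_loop(program_db, draft_model, refine_model, evaluate, generations=100):
    """Evolve programs: sample parents, draft edits, refine them, score, store survivors."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=32) as pool:
        for _ in range(generations):
            parents = program_db.sample(k=8)                 # versioned memory of past programs
            prompts = [program_db.build_prompt(p) for p in parents]

            drafts = [draft_model.generate(p) for p in prompts]        # breadth: fast model
            refined = [refine_model.generate(p + "\n" + d)             # depth: stronger model
                       for p, d in zip(prompts, drafts)]

            # Score every candidate in parallel; evaluate() returns machine-gradable metrics.
            for candidate, score in zip(refined, pool.map(evaluate, refined)):
                program_db.insert(candidate, score)          # the database decides what survives
```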

A high-level overview of the AlphaEvolve agent structure. Source: AlphaEvolve paper.

This architecture isn’t conceptually new, but the execution sets it apart. “It’s just an unbelievably good execution,” Witteveen says.

The AlphaEvolve paper describes the orchestrator as an “evolutionary algorithm that gradually develops programs that improve the score on the automated evaluation metrics” (p. 3); in short, an “autonomous pipeline of LLMs whose task is to improve an algorithm by making direct changes to the code” (p. 1).

Takeaway for enterprises: If your agent plans include unsupervised runs on high-value tasks, plan for similar infrastructure: job queues, a versioned memory store, service-mesh tracing and secure sandboxing for any code the agent produces. 
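
Sandboxing in particular deserves attention: agent-generated code should never run inside the orchestrator's own process. A bare-bones sketch using only the Python standard library is shown below; a production setup would layer containers, resource limits and network isolation on top of it.

```python
# Bare-bones sandboxing sketch: run agent-generated code in a separate process with a timeout.
# A production setup would add containers and resource limits; this is illustrative only.
import subprocess
import sys
import tempfile

def run_candidate(code: str, timeout_s: int = 30) -> tuple[bool, str]:
    """Execute untrusted candidate code out-of-process and capture its output."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code)
        path = f.name
    try:
        proc = subprocess.run(
            [sys.executable, "-I", path],   # -I: isolated mode, ignores user site-packages/env
            capture_output=True, text=True, timeout=timeout_s,
        )
        return proc.returncode == 0, proc.stdout + proc.stderr
    except subprocess.TimeoutExpired:
        return False, "timed out"
```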

2. The evaluator engine: driving progress with automated, objective feedback

A key element of AlphaEvolve is its rigorous evaluation framework. Every iteration proposed by the pair of LLMs is accepted or rejected based on a user-supplied “evaluate” function that returns machine-gradable metrics. This evaluation system begins with ultrafast unit-test checks on each proposed code change – simple, automatic tests (similar to the unit tests developers already write) that verify the snippet still compiles and produces the right answers on a handful of micro-inputs – before passing the survivors on to heavier benchmarks and LLM-generated reviews. These checks run in parallel across the evaluator fleet, so the search stays fast and safe.

In short: Let the models suggest fixes, then verify each one against tests you trust. AlphaEvolve also supports multi-objective optimization (optimizing latency and accuracy simultaneously), evolving programs that hit several metrics at once. Counter-intuitively, balancing multiple goals can improve a single target metric by encouraging more diverse solutions.
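
A hedged sketch of what such a user-supplied evaluator could look like follows: cheap checks gate access to expensive benchmarks, and the result is a dictionary of metrics rather than a single number. The helper functions are assumptions for illustration, not AlphaEvolve's actual interface.

```python
# Illustrative evaluation cascade: cheap checks gate access to heavier benchmarks,
# and the result is a dictionary of metrics rather than a single score.
# run_candidate() is the sandbox helper sketched earlier; the other helpers are hypothetical.

def evaluate(candidate_code: str):
    """Return machine-gradable metrics for a candidate, or None if it is rejected early."""
    ok, _log = run_candidate(candidate_code, timeout_s=5)      # smoke test on micro-inputs
    if not ok:
        return None                                            # fail fast, skip heavy work

    if not passes_unit_tests(candidate_code):                  # hypothetical project test suite
        return None

    return {
        "latency_ms": run_benchmark(candidate_code),           # hypothetical heavier benchmark
        "accuracy": run_accuracy_suite(candidate_code),        # hypothetical correctness metric
    }
```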

Takeaway for enterprises: Production agents need deterministic scorekeepers, whether that’s unit tests, full simulators or canary traffic analysis. Automated evaluators are both your safety net and your growth engine. Before you launch an agentic project, ask: “Do we have a metric the agent can score itself against?”

3. Smart model use, iterative code refinement

AlphaEvolve tackles every coding problem with a two-model rhythm. First, Gemini Flash fires off quick drafts, giving the system a broad set of ideas to explore. Then Gemini Pro studies those drafts in more depth and returns a smaller set of stronger candidates. Feeding both models is a lightweight “prompt builder,” a helper script that assembles the question each model sees. It blends three kinds of context: earlier code attempts saved in a project database, any guardrails or rules the engineering team has written and relevant external material such as research papers or developer notes. With that richer backdrop, Gemini Flash can roam widely while Gemini Pro zeroes in on quality.
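
In code, a prompt builder of this kind is mostly careful string assembly, and the two-model hand-off becomes a generate-many, keep-few funnel. The sketch below uses hypothetical model clients (flash, pro) and is not Google's implementation.

```python
# Illustrative prompt builder and draft-then-refine funnel; names are hypothetical.

def build_prompt(problem: str, past_attempts: list[str], guardrails: str, references: str) -> str:
    """Blend prior attempts, team-written rules and external context into one prompt."""
    return (
        f"Problem:\n{problem}\n\n"
        "Previous attempts and their scores:\n" + "\n---\n".join(past_attempts) + "\n\n"
        f"Constraints the patch must respect:\n{guardrails}\n\n"
        f"Relevant notes and papers:\n{references}\n\n"
        "Propose an improved version as a diff."
    )

def draft_then_refine(prompt: str, flash, pro, n_drafts: int = 16, keep: int = 3) -> list[str]:
    """Fast model explores broadly; the stronger model refines only the best few drafts."""
    drafts = [flash.generate(prompt) for _ in range(n_drafts)]
    shortlist = sorted(drafts, key=lambda d: flash.quick_score(prompt, d), reverse=True)[:keep]
    return [pro.generate(prompt + "\n\nDraft to improve:\n" + d) for d in shortlist]
```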

Unlike many agent demos that tweak one function at a time, AlphaEvolve edits entire repositories. It describes each change as a standard diff block – the same patch format engineers push to GitHub – so it can touch dozens of files without losing track. Afterward, automated tests decide whether the patch sticks. Over repeated cycles, the agent’s memory of success and failure grows, so it proposes better patches and wastes less compute on dead ends.
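
Because the agent speaks in standard diffs, the gate between proposed and merged can be ordinary tooling: apply the patch on a throwaway branch, run the tests, and keep the change only if everything passes. A rough sketch, using git and pytest as stand-ins for whatever your repository already uses:

```python
# Illustrative patch gate: apply an agent-generated diff on a throwaway branch,
# run the test suite, and keep the change only if everything passes.
# git and pytest are familiar stand-ins, not a prescribed toolchain.
import subprocess

def try_patch(repo_dir: str, diff_text: str) -> bool:
    def git(*args):
        return subprocess.run(["git", "-C", repo_dir, *args], capture_output=True, text=True)

    git("checkout", "-B", "agent/candidate")        # scratch branch for the candidate
    applied = subprocess.run(["git", "-C", repo_dir, "apply", "--whitespace=fix", "-"],
                             input=diff_text, capture_output=True, text=True)
    if applied.returncode != 0:
        git("checkout", "-")                        # diff didn't apply cleanly; abandon it
        return False

    tests = subprocess.run(["pytest", "-q"], cwd=repo_dir, capture_output=True, text=True)
    if tests.returncode != 0:
        git("checkout", ".")                        # discard the failing patch
        git("checkout", "-")
        return False
    return True                                     # patch survives; record it in the agent's memory
```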

Takeaway for enterprises: Let cheaper, faster models handle brainstorming, then call on a more capable model to refine the best ideas. Preserve every trial in a searchable history, because that memory speeds up later work and can be reused across teams. Accordingly, vendors are rushing to provide developers with new tooling around things like memory. Products such as OpenMemory MCP, which provides a portable memory store, and the new long- and short-term memory APIs in LlamaIndex are making this kind of persistent context almost as easy to plug in as logging.
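
A searchable trial history does not require exotic infrastructure to get started; even a single SQLite table recording each attempt's diff, metrics and verdict gives later runs, and other teams, something to query. A minimal, generic sketch (deliberately not tied to OpenMemory MCP or LlamaIndex):

```python
# Minimal trial-history store: every attempted patch and its scores, queryable later.
# Plain SQLite here; swap in whatever memory layer your stack standardizes on.
import json
import sqlite3
import time

def open_history(path: str = "trials.db") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.execute("""CREATE TABLE IF NOT EXISTS trials (
        id INTEGER PRIMARY KEY, ts REAL, task TEXT, diff TEXT, metrics TEXT, accepted INTEGER)""")
    return conn

def record_trial(conn, task: str, diff: str, metrics, accepted: bool) -> None:
    conn.execute("INSERT INTO trials (ts, task, diff, metrics, accepted) VALUES (?, ?, ?, ?, ?)",
                 (time.time(), task, diff, json.dumps(metrics or {}), int(accepted)))
    conn.commit()

def best_trials(conn, task: str, metric: str, limit: int = 5):
    """Return the highest-scoring accepted diffs for a task, ranked by one metric."""
    rows = conn.execute("SELECT diff, metrics FROM trials WHERE task = ? AND accepted = 1", (task,))
    scored = []
    for diff, metrics_json in rows:
        metrics = json.loads(metrics_json)
        if metric in metrics:
            scored.append((metrics[metric], diff))
    return sorted(scored, reverse=True)[:limit]
```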

OpenAI’s Codex-1 software-engineering agent, also released today, underscores the same pattern. It fires off parallel tasks inside a secure sandbox, runs unit tests and returns pull-request drafts—effectively a code-specific echo of AlphaEvolve’s broader search-and-evaluate loop.

4. Measure to manage: targeting agentic AI for demonstrable ROI

AlphaEvolve’s tangible wins – reclaiming 0.7% of data center capacity, cutting Gemini training kernel runtime by 23%, speeding up FlashAttention by 32%, and simplifying TPU design – share one trait: they target domains with airtight metrics.

For data center scheduling, AlphaEvolve evolved a heuristic that was evaluated using a simulator of Google’s data centers based on historical workloads. For kernel optimization, the objective was to minimize actual runtime on TPU accelerators across a dataset of realistic kernel input shapes.
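
The pattern generalizes: if you can replay historical workloads through a simulator that scores a candidate policy, the agent's proposals can be evaluated offline before they ever touch production. A hedged, generic sketch in which the simulator API and workload format are assumptions:

```python
# Illustrative offline evaluation of a candidate scheduling heuristic against historical
# workloads, with a simulator as the deterministic scorekeeper. The simulator API is assumed.

def evaluate_heuristic(heuristic, historical_workloads, simulator) -> dict:
    """Replay recorded workloads through the simulator and return aggregate metrics."""
    total_stranded, peak_util = 0.0, 0.0
    for workload in historical_workloads:
        result = simulator.run(workload, placement_fn=heuristic)   # hypothetical simulator call
        total_stranded += result.stranded_resources
        peak_util = max(peak_util, result.peak_utilization)
    return {
        "avg_stranded_resources": total_stranded / len(historical_workloads),
        "peak_utilization": peak_util,
    }
```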

Takeaway for enterprises: When starting your agentic AI journey, look first at workflows where “better” is a quantifiable number your system can compute – be it latency, cost, error rate or throughput. This focus allows automated search and de-risks deployment because the agent’s output (often human-readable code, as in AlphaEvolve’s case) can be integrated into existing review and validation pipelines.

This clarity allows the agent to self-improve and demonstrate unambiguous value.

5. Laying the groundwork: essential prerequisites for enterprise agentic success

While AlphaEvolve’s achievements are inspiring, Google’s paper is also clear about its scope and requirements.

The primary limitation is the need for an automated evaluator; problems requiring manual experimentation or “wet-lab” feedback are currently out of scope for this specific approach. The system can also consume significant compute: “on the order of 100 compute-hours to evaluate any new solution” (AlphaEvolve paper, page 8), which necessitates parallelization and careful capacity planning.

Before allocating significant budget to complex agentic systems, technical leaders must ask critical questions:

  • Machine-gradable problem? Do we have a clear, automatable metric against which the agent can score its own performance?
  • Compute capacity? Can we afford the potentially compute-heavy inner loop of generation, evaluation, and refinement, especially during the development and training phase?
  • Codebase & memory readiness? Is your codebase structured for iterative, possibly diff-based, modifications? And can you implement the instrumented memory systems vital for an agent to learn from its evolutionary history?

Takeaway for enterprises: The increasing focus on robust agent identity and access management, as seen with platforms like Frontegg, Auth0 and others, also points to the maturing infrastructure required to deploy agents that interact securely with multiple enterprise systems.

The agentic future is engineered, not just summoned

AlphaEvolve’s message for enterprise teams is manifold. First, your operating system around agents is now far more important than model intelligence. Google’s blueprint shows three pillars that can’t be skipped:

  • Deterministic evaluators that give the agent an unambiguous score every time it makes a change.
  • Long-running orchestration that can juggle fast “draft” models like Gemini Flash with slower, more rigorous models – whether that’s Google’s stack or a framework such as LangChain’s LangGraph.
  • Persistent memory so each iteration builds on the last instead of relearning from scratch.

Enterprises that already have logging, test harnesses and versioned code repositories are closer than they think. The next step is to wire those assets into a self-serve evaluation loop so multiple agent-generated solutions can compete, and only the highest-scoring patch ships. 

Cisco’s Anurag Dhingra, VP and GM of Enterprise Connectivity and Collaboration, put it plainly in an interview with VentureBeat this week. “It’s happening, it is very, very real,” he said of enterprises using AI agents in manufacturing, warehouses and customer contact centers. “It is not something in the future. It is happening there today.” He warned that as these agents become more pervasive, doing “human-like work,” the strain on existing systems will be immense: “The network traffic is going to go through the roof,” Dhingra said. Your network, budget and competitive edge will likely feel that strain before the hype cycle settles. Start proving out a contained, metric-driven use case this quarter – then scale what works.

Watch the video podcast I did with developer Sam Witteveen, where we go deep on production-grade agents, and how AlphaEvolve is showing the way:
