How much information do LLMs really memorize? Now we know, thanks to Meta, Google, Nvidia and Cornell

Most people interested in generative AI likely already know that Large Language Models (LLMs) — like those behind ChatGPT, Anthropic’s Claude, and Google’s Gemini — are trained on massive datasets: trillions of words pulled from websites, books, codebases, and, increasingly, other media such as images, audio, and video. But why?

From this data, LLMs develop a statistical, generalized understanding of language, its patterns, and the world — encoded in the form of billions of parameters, or “settings,” in a network of artificial neurons (which are mathematical functions that transform input data into output signals).

By being exposed to all this training data, LLMs learn to detect and generalize patterns that are reflected in the parameters of their neurons. For instance, the word “apple” often appears near terms related to food, fruit, or trees, and sometimes computers. The model picks up that apples can be red, green, or yellow (or occasionally other colors if rotten or rare), that the word is spelled “a-p-p-l-e” in English, and that apples are edible. This statistical knowledge influences how the model responds when a user enters a prompt, shaping the output it generates based on the associations it “learned” from the training data.

But a big question — even among AI researchers — remains: how much of an LLM’s training data is used to build generalized representations of concepts, and how much is instead memorized verbatim or stored in a way that is identical or nearly identical to the original data?

This matters not only for understanding how LLMs operate and when they go wrong, but also for model providers defending themselves in copyright infringement lawsuits brought by data creators and owners, such as artists and record labels. If LLMs are shown to reproduce significant portions of their training data verbatim, courts could be more likely to side with plaintiffs arguing that the models unlawfully copied protected material. If instead the models are found to generate outputs based on generalized patterns rather than exact replication, developers may be able to continue scraping and training on copyrighted data under existing legal defenses such as fair use.

Now, we finally have an answer to the question of how much LLMs memorize versus generalize: a new study released this week from researchers at Meta, Google DeepMind, Cornell University, and NVIDIA finds that GPT-style models have a fixed memorization capacity of approximately 3.6 bits per parameter.

To understand what 3.6 bits means in practice:

A single bit is the smallest unit of digital data, representing either a 0 or a 1. Eight bits make up one byte.
Storing 3.6 bits allows for approximately 12.13 distinct values, as calculated by 2^3.6.
This is about the amount of information needed to choose one of 12 options—similar to selecting a month of the year or the outcome of a roll of a 12-sided die.
It is not enough to store even one English letter (which needs about 4.7 bits), but it is just enough to encode a character from a reduced set of 10 common English letters (which requires about 3.32 bits).
In bytes, 3.6 bits is 0.45 bytes—less than half the size of a typical character stored in ASCII (which uses 8 bits or 1 byte).
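
The figures in the list above can be checked with a few lines of Python. This is only a sanity-check sketch; the variable names are illustrative and do not come from the study.

```python
import math

BITS_PER_PARAM = 3.6  # capacity figure reported in the article

# Number of distinct states that 3.6 bits can distinguish.
distinct_values = 2 ** BITS_PER_PARAM

# Bits needed to pick one symbol from an alphabet of a given size.
bits_for_letter = math.log2(26)       # full English alphabet
bits_for_ten_letters = math.log2(10)  # reduced set of 10 letters

# Convert bits to bytes (8 bits per byte).
bytes_per_param = BITS_PER_PARAM / 8

print(f"2^3.6          = {distinct_values:.2f}")
print(f"log2(26)       = {bits_for_letter:.2f} bits")
print(f"log2(10)       = {bits_for_ten_letters:.2f} bits")
print(f"3.6 bits       = {bytes_per_param:.2f} bytes")
```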

This figure held steady across reasonable architectural variations: models of different depths, widths, and sizes produced similar results. Numerical precision had only a small effect, with full-precision (float32) models reaching slightly higher values (up to 3.83 bits per parameter).

More training data DOES NOT lead to more memorization — in fact, a model will be less likely to memorize any single data point

One key takeaway from the research is that models do not memorize more when trained on more data. Instead, a model’s fixed capacity is distributed across the dataset, meaning each individual data point receives a smaller share of it.

Jack Morris, the lead author, explained via the social network X that “training on more data will force models to memorize less per-sample.”

These findings may help ease concerns around large models memorizing copyrighted or sensitive content.

If memorization is limited and diluted across many examples, the likelihood of reproducing any one specific training example decreases. In essence, more training data leads to safer generalization behavior, not increased risk.
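
The dilution argument can be made concrete with a back-of-the-envelope sketch. It assumes, purely for illustration, that the whole 3.6 bits-per-parameter budget is spread evenly over the training samples; real models also spend capacity on generalization, so this is closer to an upper bound on the average than a prediction. The function names and sample counts are hypothetical.

```python
BITS_PER_PARAM = 3.6  # capacity figure reported in the article


def capacity_bits(num_params: int) -> float:
    """Total memorization budget of a model, in bits."""
    return num_params * BITS_PER_PARAM


def avg_bits_per_sample(num_params: int, num_samples: int) -> float:
    """Average budget per training sample if capacity is spread evenly
    (a simplifying assumption, not a claim from the study)."""
    return capacity_bits(num_params) / num_samples


params = 1_500_000_000  # largest model size mentioned in the article
for n in (1_000_000, 100_000_000, 1_000_000_000):  # hypothetical dataset sizes
    print(f"{n:>13,} samples -> {avg_bits_per_sample(params, n):,.1f} bits/sample")
```

Holding the model fixed and growing the dataset a thousandfold shrinks the average per-sample budget by the same factor, which is the sense in which more data forces less memorization per sample.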

How the researchers identified these findings

To precisely quantify how much language models memorize, the researchers used an unconventional but powerful approach: they trained transformer models on datasets composed of uniformly random bitstrings. Each of these bitstrings was sampled independently, ensuring that no patterns, structure, or redundancy existed across examples.

Because each sample is unique and devoid of shared features, any ability the model shows in reconstructing or identifying these strings during evaluation directly reflects how much information it retained—or memorized—during training.

The key reason for this setup was to completely eliminate the possibility of generalization. Unlike natural language—which is full of grammatical structure, semantic overlap, and repeating concepts—uniform random data contains no such information. Every example is essentially noise, with no statistical relationship to any other. In such a scenario, any performance by the model on test data must come purely from memorization of the training examples, since there is no distributional pattern to generalize from.
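
A dataset of this kind is easy to sketch. The snippet below is a minimal illustration, not the researchers’ code: the function names, sample count, and string length are assumptions. Because every bit is an independent fair coin flip, the dataset’s total information content is exactly the number of bits it contains.

```python
import random


def make_random_bitstring_dataset(num_samples, bits_per_sample, seed=0):
    """Return num_samples lists of independent, uniformly random bits."""
    rng = random.Random(seed)
    return [
        [rng.getrandbits(1) for _ in range(bits_per_sample)]
        for _ in range(num_samples)
    ]


def dataset_entropy_bits(num_samples, bits_per_sample):
    """Entropy of the dataset in bits: each uniform random bit carries
    exactly one bit of information and no sample shares structure
    with any other."""
    return num_samples * bits_per_sample


dataset = make_random_bitstring_dataset(num_samples=4, bits_per_sample=16)
for sample in dataset:
    print("".join(map(str, sample)))
print("total entropy:", dataset_entropy_bits(4, 16), "bits")
```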

The authors argue their method is perhaps one of the few principled ways to decouple memorization from learning in practice. When LLMs are trained on real language, even an output that matches the training data does not settle the question: it is difficult to know whether the model memorized the input or merely inferred the underlying structure from the patterns it observed.

This method allows the researchers to map a direct relationship between the number of model parameters and the total information stored. By gradually increasing model size and training each variant to saturation, across hundreds of experiments on models ranging from 500K to 1.5 billion parameters, they observed consistent results: 3.6 bits memorized per parameter, which they report as a fundamental measure of LLM memory capacity.
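
The article does not spell out how “total information stored” is measured. One plausible way to make it concrete, assumed here for illustration only, is a compression view: a random bitstring costs one bit per position to encode without the model, and costs -log2(p) bits per position with a model that assigns probability p to the correct bit. The difference is the information the model supplies. All names below are hypothetical.

```python
import math


def memorized_bits(true_bit_probs):
    """Bits saved when encoding a random bitstring with the model's help.

    true_bit_probs[i] is the probability the model assigns to the
    correct value of bit i. Without the model each bit costs 1 bit;
    with the model it costs -log2(p).
    """
    baseline_cost = len(true_bit_probs)
    model_cost = sum(-math.log2(p) for p in true_bit_probs)
    return baseline_cost - model_cost


def bits_per_parameter(total_memorized_bits, num_params):
    """Memorized information normalized by model size."""
    return total_memorized_bits / num_params


# A model that has learned nothing guesses 50/50 on every bit.
print(memorized_bits([0.5] * 8))
# A model that recalls the string perfectly assigns probability 1.
print(memorized_bits([1.0] * 8))
```

Under this view, summing the saved bits over a saturated training set and dividing by the parameter count gives a bits-per-parameter figure of the kind the study reports.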

The team applied their methodology to models trained on real-world datasets as well. When trained on text, models exhibited a balance of memorization and generalization.

Smaller datasets encouraged more memorization, but as dataset size increased, models shifted toward learning generalizable patterns. This transition was marked by a phenomenon known as “double descent,” where performance temporarily dips before improving once generalization kicks in.

The study also examined how model precision—comparing training in bfloat16 versus float32—affects memorization capacity. They observed a modest increase from 3.51 to 3.83 bits-per-parameter when switching to full 32-bit precision. However, this gain is far less than the doubling of available bits would suggest, implying diminishing returns from higher precision.
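
The size of that gap is easy to quantify from the two figures quoted above. The sketch below only restates the arithmetic; the variable names are illustrative.

```python
BF16_BITS, FP32_BITS = 16, 32                # storage bits per parameter
BF16_CAPACITY, FP32_CAPACITY = 3.51, 3.83    # memorized bits per parameter (from the article)

storage_ratio = FP32_BITS / BF16_BITS            # how much more storage float32 uses
capacity_ratio = FP32_CAPACITY / BF16_CAPACITY   # how much more it memorizes

# Fraction of each parameter's raw storage that ends up holding memorized data.
bf16_utilization = BF16_CAPACITY / BF16_BITS
fp32_utilization = FP32_CAPACITY / FP32_BITS

print(f"storage grows  {storage_ratio:.2f}x")
print(f"capacity grows {capacity_ratio:.2f}x")
print(f"utilization: bf16 {bf16_utilization:.1%}, float32 {fp32_utilization:.1%}")
```

Doubling the storage per parameter buys roughly a 9% gain in memorized bits, so the share of raw storage actually used for memorization falls from about 22% to about 12%.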

Unique data is more likely to be memorized

The paper proposes a scaling law that relates a model’s capacity and dataset size to the effectiveness of membership inference attacks.

These attacks attempt to determine whether a particular data point was part of a model’s training set. The research shows that such attacks become unreliable as dataset size grows, supporting the argument that large-scale training helps reduce privacy risk.
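
A simple, generic form of membership inference is a loss-threshold test: guess that a sample was in the training set if the model’s loss on it is unusually low. The sketch below illustrates that idea only. It is not necessarily the attack evaluated in the paper, and the loss values are invented to show the two regimes.

```python
def loss_threshold_attack(losses, threshold):
    """Predict 'member' (True) for every sample whose loss is below threshold."""
    return [loss < threshold for loss in losses]


def attack_accuracy(member_losses, nonmember_losses, threshold):
    """Fraction of samples the attack labels correctly."""
    member_preds = loss_threshold_attack(member_losses, threshold)
    nonmember_preds = loss_threshold_attack(nonmember_losses, threshold)
    correct = sum(member_preds) + sum(not p for p in nonmember_preds)
    return correct / (len(member_losses) + len(nonmember_losses))


# Invented losses: a small dataset the model has largely memorized,
# so training samples stand out with much lower loss.
small_data = attack_accuracy([0.2, 0.3, 0.25, 0.4], [1.1, 0.9, 1.3, 1.0], threshold=0.6)

# Invented losses: a large dataset where per-sample memorization is diluted,
# so member and non-member losses overlap.
large_data = attack_accuracy([0.9, 1.1, 1.0, 1.2], [1.0, 0.95, 1.15, 1.05], threshold=1.05)

print(small_data, large_data)
```

When member and non-member losses overlap, no threshold separates them well and the attack drifts toward coin-flip accuracy, which is the behavior the article describes for large datasets.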

While the paper focuses on average-case behavior, some researchers have pointed out that certain types of data—such as highly unique or stylized writing—may still be more susceptible to memorization.

The authors acknowledge this limitation and emphasize that their method is designed to characterize general trends rather than edge cases.

Moving toward greater human understanding of LLM understanding

By introducing a principled and quantifiable definition of memorization, the study gives developers and researchers new tools for evaluating the behavior of language models. This helps not only with model transparency but also with compliance, privacy, and ethical standards in AI development. The findings suggest that more data—and not less—may be the safer path when training large-scale language models.

To put total model memorization in perspective:

A 500K-parameter model can memorize roughly 1.8 million bits, or 225 KB of data.
A 1.5 billion parameter model can hold about 5.4 billion bits, or 675 megabytes of raw information.
This is not comparable to typical file storage like images (e.g., a 3.6 MB uncompressed image is about 29 million bits), but it is significant when distributed across discrete textual patterns.
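
These totals follow directly from multiplying parameter count by 3.6 bits and converting units. The sketch below reproduces the arithmetic using decimal units (1 KB = 1,000 bytes, 1 MB = 1,000,000 bytes), which is an assumption about how the figures above were rounded.

```python
BITS_PER_PARAM = 3.6  # capacity figure reported in the article


def capacity(num_params):
    """Return (bits, bytes) a model of this size could memorize."""
    bits = num_params * BITS_PER_PARAM
    return bits, bits / 8


small_bits, small_bytes = capacity(500_000)
large_bits, large_bytes = capacity(1_500_000_000)
image_bits = 3.6e6 * 8  # an uncompressed 3.6 MB file, in bits

print(f"500K params: {small_bits / 1e6:.1f} million bits = {small_bytes / 1e3:.0f} KB")
print(f"1.5B params: {large_bits / 1e9:.1f} billion bits = {large_bytes / 1e6:.0f} MB")
print(f"3.6 MB file: {image_bits / 1e6:.1f} million bits")
```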

I’m no lawyer or legal expert, but I fully expect such research to be cited in the numerous ongoing lawsuits between AI providers and data creators and rights owners.

Daily insights on business use cases with VB Daily

If you want to impress your boss, VB Daily has you covered. We give you the inside scoop on what companies are doing with generative AI, from regulatory shifts to practical deployments, so you can share insights for maximum ROI.

Read our Privacy Policy

Thanks for subscribing. Check out more VB newsletters here.

An error occured.

Stay Ahead

Explore More Insights

Stay ahead with more perspectives on cutting-edge power, infrastructure, energy, bitcoin and AI solutions. Explore these articles to uncover strategies and insights shaping the future of industries.

Nvidia aims to bring AI to wireless

Key features of ARC-Compact include: Energy Efficiency: Utilizing the L4 GPU (72-watt power footprint) and an energy-efficient ARM CPU, ARC-Compact aims for a total system power comparable to custom baseband unit (BBU) solutions currently in use. 5G vRAN support: It fully supports 5G TDD, FDD, massive MIMO, and all O-RAN

Netgear’s enterprise ambitions grow with SASE acquisition

Addressing the SME security gap The acquisition directly addresses a portfolio gap that Netgear (Nasdaq:NTGR) has identified through customer feedback. According to Badjate, customers have been saying that they like the Netgear products, but they also really need more security capabilities. Netgear’s target market focuses on organizations with fewer than

IBM’s cloud crisis deepens: 54 services disrupted in latest outage

Rawat said IBM’s incident response appears slow and ineffective, hinting at procedural or resource limitations. The situation also raises concerns about IBM Cloud’s adherence to zero trust principles, its automation in threat response, and the overall enforcement of security controls. “The recent IBM Cloud outages are part of a broader

AMD acquires Brium to loosen Nvidia’s grip on AI software

According to Greyhound Research, nearly 67 percent of global CIOs identify software maturity, particularly in middleware and runtime optimization, as the primary barrier to adopting alternatives to Nvidia. Brium’s compiler-based approach to AI inference could ease this dependency. While Nvidia still leads among developers, AMD’s expanding open-source stack, now backed

Oil Rises as Solid USA Jobs Data Pushes Algos to Drop Short Bets

Oil rose as stronger-than-expected US jobs data eased concerns about an economic slowdown that would crimp demand, spurring algorithmic traders to reduce short positions. West Texas Intermediate climbed almost 2% to settle above $64 a barrel, notching the largest weekly gain since November. Crude followed equities higher after US job growth in May narrowly surpassed economist forecasts, allaying concerns of near-term demand deterioration. The figures also pushed economy-sensitive diesel futures to a two-week high. “Trading is relatively quiet today, with macroeconomic factors continuing to drive the narrative,” said Rebecca Babin, a senior energy trader at CIBC Private Wealth Group. “The unemployment data is easing concerns that demand will sharply decline due to tariff uncertainty.” The positive economic data spurred commodity trading advisers to ease off of their bearish tilt. The funds, which can accelerate price momentum, liquidated short positions to sit at negative 9% short in WTI on Friday, compared with 64% short on June 5, according to data from Bridgeton Research Group. The rally was supported by enduring risk-on sentiment from optimistic signs on trade talks between the US and China, the world’s largest importer of crude. President Donald Trump and his Chinese counterpart, Xi Jinping, agreed to further negotiations over tariffs and supplies of rare earth minerals. The positive signals come against the backdrop of an oil market that has been increasingly rangebound in recent weeks. Prices have traded in a $5 band since the middle of May, and a gauge of volatility for US crude futures is at the lowest since early April. Oil has been buffeted in Trump’s second term as trade tensions between the world’s two largest economies menace demand. At the same time, the OPEC+ alliance has been adding barrels back to the market at a faster-than-expected rate, further clouding an already weak outlook for the second half

USA Targets Niche Gas That China Can’t Replace as Trade War Chip

The US is using its dominance of a niche petroleum gas as a bargaining chip in its trade war with China. America supplies China with almost all of its ethane, a product of the shale boom that’s used as a building block for making plastics. But the commerce department is now ordering shippers to apply for export licenses, and has told at least one, Enterprise Products Partners LP, that it intends to withhold permits for three China-bound cargoes. The trade war is throwing a spotlight on how the US and China rely on each other for certain commodities — dependencies that both nations are seeking to leverage as they negotiate terms to resolve their dispute. In this case, America is the world’s biggest producer of ethane, which is converted into ethylene for plastics factories, and China is its largest customer. The commerce department has cited risks that petroleum products like ethane could be diverted to the military, copying the playbook deployed by Beijing in justifying restrictions on what it calls dual-use items such as rare earths and other critical minerals. “Ethane is no longer just a byproduct of shale — it’s now a geopolitical weapon,” said Julian Renton, lead analyst covering natural gas liquids at East Daley Analytics. “China bet billions building infrastructure around US ethane, and Washington is now questioning whether that bet should continue to pay off.” America’s shale revolution and China’s rapid industrialization have coincided this century to create a market where cheap energy byproducts are parlayed into millions of tons of materials used as trash bags and shampoo bottles, car seats and computer keyboards. But companies that prospered from cooperation are now caught in the crossfire of an increasingly antagonistic trade relationship between Washington and Beijing. Chinese firms such as Satellite Chemical Co. operate giant petrochemical plants that

Alaska LNG Attracts Potential Partners in Asia, EU

The first round of partner selection for the planned Alaska LNG project attracted over 50 companies, lead developer Glenfarne Group LLC has said. The potential partners are from the United States, the European Union, Japan, Korea, Taiwan, Thailand and India. The companies “expressed interest for over $115 billion of contract value for various partnerships with the Project, including equipment and material supply, services, investment, and customer agreements”, Glenfarne said in an online statement. Alaska LNG, approved by the Federal Energy Regulatory Commission May 2020, will deliver natural gas from the state’s North Slope to both domestic and global markets. It is the only federally permitted liquefied natural gas (LNG) project on the United States Pacific Coast, according to co-developer Alaska Gasline Development Corp. (AGDC). “Alaska LNG’s economic fundamentals allow it to deliver LNG into Asia at prices that are lower than Henry Hub pricing from the U.S. Gulf Coast”, the statement said. Alaska LNG is planned to have an LNG export terminal with a capacity of 20 million metric tons per annum (MMtpa), an 807-mile 42-inch pipeline and a carbon capture plant with a storage capacity of seven MMtpa. Phase 1 aims to deliver gas about 765 miles from the North Slope to the Anchorage region. Phase 2 would install compression equipment and around 42 miles of pipeline under Cook Inlet to the Alaska LNG Export Facility in Nikiski, which would be constructed at the same time. “Glenfarne anticipates a final investment decision on the domestic portion of the Alaska LNG pipeline in late Q4 2025”, the statement said. Brendan Duval, chief executive and founder of New York City-based energy investor Glenfarne, commented, “The many expressions of interest received reinforce that the market recognizes Alaska LNG’s advantaged economics, fully permitted status, and powerful federal, state, and local support”. The project’s

NERC overstates MISO reliability risks: market monitor

The North American Electric Reliability Corp. is overstating the reliability risks faced by the Midcontinent Independent System Operator, according to David Patton, president of Potomac Economics, the grid operator’s market monitor. In its 2024 Long-Term Reliability Assessment released in December, NERC said MISO was at a high risk of having a shortfall in electricity supplies at the peak of an average summer or winter season in the next five years — the worst ranking of all North American regions. “I’d love to work with NERC to figure out where they got their numbers from, because I don’t think they’re accurate,” Patton said Thursday during a technical conference on resource adequacy challenges in the United States held by the Federal Energy Regulatory Commission. NERC understates MISO’s capacity for demand response, behind-the-meter generation and firm capacity imports by more than 8 GW, Patton said in written testimony to FERC. Also, NERC considered possible power plant retirements that have not occurred, according to Patton. Potomac Economics also acts as the market monitor for the Electric Reliability Council of Texas, ISO New England and the New York Independent System Operator. “MISO is the most reliable of any of them,” Patton said. “If I was concerned about the lights going out somewhere, it would not be MISO.” In recent winter storms, MISO exported power to neighboring grid operators to help meet their needs, Patton said, noting the exports reflected the value interconnections between grid operators can have. Further, MISO has been vastly overestimating the power supplies it will need in coming years to meet demand for electricity, Patton said. In its 2024 Regional Resource Assessment, MISO said its footprint may need 17 GW of new resources every year for the next 20 years. “That’s a result of a clearly flawed planning process,” Patton told FERC. In

Transferability is transforming clean energy project finance, say dealmakers

Dive Brief: The tax credit transferability provision included in the Inflation Reduction Act has introduced new deal structures and is allowing clean energy developers to secure project financing faster, said speakers at a Thursday panel at the American Council on Renewable Energy’s Finance Forum. “The closing of transactions has become so much easier,” said Gaurav Raniwala, global renewable energy leader at GE Vernova. “You don’t have to line up two different structures simultaneously and then close everything when there’s already enough mess going on. And the type of players that are now able to enter the market has broadened significantly.” A Wednesday report from Crux, a finance technology company that connects tax credit buyers and sellers, said lenders are “increasingly” looking to finance less established technologies like carbon capture, and that “this openness is supported by the robust and progressively more liquid market for transferable tax credits.” Dive Insight: Raniwala said that financing had previously relied on the tax equity market, which “was limited in capacity. The industry wanted to be bigger.” “If you really want to have a dominant energy industry which has abundance of supply to help with electrification, to help with all the AI stuff, we need all sources of energy out there,” he said. “And I think what transferability did was it broadened the market from just traditional tax equity to a whole host of players.” Crux’s analysis said that “tax equity structures have evolved to hybrid structures, or t-flips, which explicitly contemplate the sale of a portion of tax credits in the transfer market” and found that t-flips “made up about 60% of the tax equity committed in 2024, and that share is expected to rise.” “Historically, the tax equity market was about a $20 billion a year market dominated for many years by a

Vast array of solar power equipment left exposed online

Dive Brief: Nearly 35,000 solar power devices are remotely manageable and openly accessible to anyone from anywhere in the world, according to a new report from industrial cybersecurity firm Forescout. These exposed devices with internet-accessible management interfaces, which are made by 42 different companies, include equipment that is essential for operating solar energy infrastructure, according to the Tuesday report. Some of the management interfaces may include password protections, but Forescout said that virtually none of them needed to be online and that any exceptions should be placed behind VPNs. The 10 vendors with the greatest number of exposed devices have each disclosed vulnerabilities in the past decade, increasing the risk of their sitting exposed on the public internet. Dive Insight: The transition to renewable energy sources and the increasing digitization of the power grid have combined to create serious cybersecurity risks. Forescout’s latest findings illustrate how the absence of secure design practices in critical infrastructure devices also can endanger people’s lives and present opportunities to destabilize entire regions. Forescout’s report — based on a scan of public IP addresses using the Shodan search engine — contains details about the distribution of solar equipment with internet-accessible management interfaces. For example, these devices are more prevalent in Europe and Asia than elsewhere, with three-quarters of the devices residing in Europe and 17% in Asia. Germany and Greece each have one-fifth of the total number of exposed devices. In addition, the 10 vendors with the most exposed devices were not the same as the 10 vendors with the biggest market shares; global titan Huawei, for example, is not on Forescout’s list. 
SMA’s Sunny WebBox, a device that collects and reports information about the performance of solar inverters, was the most commonly observed piece of equipment left remotely accessible, followed by Fronius International inverters.

LiquidStack launches cooling system for high density, high-powered data centers

The CDU is serviceable from the front of the unit, with no rear or end access required, allowing the system to be placed against the wall. The skid-mounted system can come with rail and overhead piping pre-installed or shipped as separate cabinets for on-site assembly. The single-phase system has high-efficiency dual pumps designed to protect critical components from leaks and a centralized design with separate pump and control modules reduce both the number of components and complexity. “AI will keep pushing thermal output to new extremes, and data centers need cooling systems that can be easily deployed, managed, and scaled to match heat rejection demands as they rise,” said Joe Capes, CEO of LiquidStack in a statement. “With up to 10MW of cooling capacity at N, N+1, or N+2, the GigaModular is a platform like no other—we designed it to be the only CDU our customers will ever need. It future-proofs design selections for direct-to-chip liquid cooling without traditional limits or boundaries.”

Enterprises face data center power design challenges

” Now, with AI, GPUs need data to do a lot of compute and send that back to another GPU. That connection needs to be close together, and that is what’s pushing the density, the chips are more powerful and so on, but the necessity of everything being close together is what’s driving this big revolution,” he said. That revolution in new architecture is new data center designs. Cordovil said that instead of putting the power shelves within the rack, system administrators are putting a sidecar next to those racks and loading the sidecar with the power system, which serves two to four racks. This allows for more compute per rack and lower latency since the data doesn’t have to travel as far. The problem is that 1 mW racks are uncharted territory and no one knows how to manage the power, which is considerable now. ”There’s no user manual that says, hey, just follow this and everything’s going to be all right. You really need to push the boundaries of understanding how to work. You need to start designing something somehow, so that is a challenge to data center designers,” he said. And this brings up another issue: many corporate data centers have power plugs that are like the ones that you have at home, more or less, so they didn’t need to have an advanced electrician certification. “We’re not playing with that power anymore. You need to be very aware of how to connect something. Some of the technicians are going to need to be certified electricians, which is a skills gap in the market that we see in most markets out there,” said Cordovil. A CompTIA A+ certification will teach you the basics of power, but not the advanced skills needed for these increasingly dense racks. Cordovil

HPE Nonstop servers target data center, high-throughput applications

HPE has bumped up the size and speed of its fault-tolerant Nonstop Compute servers. There are two new servers – the 8TB, Intel Xeon-based Nonstop Compute NS9 X5 and Nonstop Compute NS5 X5 – aimed at enterprise customers looking to upgrade their transaction processing network infrastructure or support larger application workloads. Like other HPE Nonstop systems, the two new boxes include compute, software, storage, networking and database resources as well as full-system clustering and HPE’s specialized Nonstop operating system. The flagship NS9 X5 features support for dual-fabric HDR200 InfiniBand interconnect, which effectively doubles the interconnect bandwidth between it and other servers compared to the current NS8 X4, according to an HPE blog detailing the new servers. It supports up to 270 networking ports per NS9 X system, can be clustered with up to 16 other NS9 X5s, and can support 25 GbE network connectivity for modern data center integration and high-throughput applications, according to HPE.

AI boom exposes infrastructure gaps: APAC’s data center demand to outstrip supply by 42%

“Investor confidence in data centres is expected to strengthen over the remainder of the decade,” the report said. “Strong demand and solid underlying fundamentals fuelled by AI and cloud services growth will provide a robust foundation for investors to build scale.” Enterprise strategies must evolve With supply constrained and prices rising, CBRE recommended that enterprises rethink data center procurement models. Waiting for optimal sites or price points is no longer viable in many markets. Instead, enterprises should pursue early partnerships with operators that have robust development pipelines and focus on securing power-ready land. Build-to-suit models are becoming more relevant, especially for larger capacity requirements. Smaller enterprise facilities — those under 5MW — may face sustainability challenges in the long term. The report suggested that these could become “less relevant” as companies increasingly turn to specialized colocation and hyperscale providers. Still, traditional workloads will continue to represent up to 50% of total demand through 2030, preserving value in existing facilities for non-AI use cases, the report added. The region’s projected 15 to 25 GW gap is more than a temporary shortage — it signals a structural shift, CBRE said. Enterprises that act early to secure infrastructure, invest in emerging markets, and align with power availability will be best positioned to meet digital transformation goals. “Those that wait may find themselves locked out of the digital infrastructure they need to compete,” the report added.

Cisco bolsters DNS security package

The software can block domains associated with phishing, malware, botnets, and other high-risk categories such as cryptomining or new domains that haven’t been reported previously. It can also create custom block and allow lists and offers the ability to pinpoint compromised systems using real-time security activity reports, Brunetto wrote. According to Cisco, many organizations leave DNS resolution to their ISP. “But the growth of direct enterprise internet connections and remote work make DNS optimization for threat defense, privacy, compliance, and performance ever more important,” Cisco stated. “Along with core security hygiene, like a patching program, strong DNS-layer security is the leading cost-effective way to improve security posture. It blocks threats before they even reach your firewall, dramatically reducing the alert pressure your security team manages.” “Unlike other Secure Service Edge (SSE) solutions that have added basic DNS security in a ‘checkbox’ attempt to meet market demand, Cisco Secure Access – DNS Defense embeds strong security into its global network of 50+ DNS data centers,” Brunetto wrote. “Among all SSE solutions, only Cisco’s features a recursive DNS architecture that ensures low-latency, fast DNS resolution, and seamless failover.”

HPE Aruba unveils raft of new switches for data center, campus modernization

And in large-scale enterprise environments embracing collapsed-core designs, the switch acts as a high-performance aggregation layer. It consolidates services, simplifies network architecture, and enforces security policies natively, reducing complexity and operational cost, Gray said. In addition, the switch offers the agility and security required at colocation facilities and edge sites. Its integrated Layer 4 stateful security and automation-ready platform enable rapid deployment while maintaining robust control and visibility over distributed infrastructure, Gray said. The CX 10040 significantly expands the capacity it can provide and the roles it can serve for enterprise customers, according to one industry analyst. “From the enterprise side, this expands on the feature set and capabilities of the original 10000, giving customers the ability to run additional services directly in the network,” said Alan Weckel, co-founder and analyst with The 650 Group. “It helps drive a lower TCO and provide a more secure network.” Aimed as a VMware alternative Gray noted that HPE Aruba is combining its recently announced Morpheus VM Essentials plug-in package, which offers a hypervisor-based package aimed at hybrid cloud virtualization environments, with the CX 10040 to deliver a meaningful alternative to Broadcom’s VMware package. “If customers want to get out of the business of having to buy VM cloud or Cloud Foundation stuff and all of that, they can replace the distributed firewall, microsegmentation and lots of the capabilities found in the old VMware NSX [networking software] and the CX 10k, and Morpheus can easily replace that functionality [such as VM orchestration, automation and policy management],” Gray said. The 650 Group’s Weckel weighed in on the idea of the CX 10040 as a VMware alternative:

Microsoft will invest $80B in AI data centers in fiscal 2025

And Microsoft isn’t the only one ramping up its investments in AI-enabled data centers. Rival cloud service providers are all either upgrading existing data centers or opening new ones to capture a larger chunk of business from developers and users of large language models (LLMs). In a report published in October 2024, Bloomberg Intelligence estimated that demand for generative AI would push Microsoft, AWS, Google, Oracle, Meta, and Apple to devote $200 billion between them to capex in 2025, up from $110 billion in 2023. Microsoft is one of the biggest spenders, followed closely by Google and AWS, Bloomberg Intelligence said. Its estimate of Microsoft’s capital spending on AI, at $62.4 billion for calendar 2025, is lower than Smith’s claim that the company will invest $80 billion in the fiscal year to June 30, 2025. Both figures, though, are far higher than Microsoft’s 2020 capital expenditure of “just” $17.6 billion. The majority of the increased spending is tied to cloud services and the expansion of AI infrastructure needed to provide compute capacity for OpenAI workloads. Separately, last October Amazon CEO Andy Jassy said his company planned total capex spend of $75 billion in 2024 and even more in 2025, with much of it going to AWS, its cloud computing division.

John Deere unveils more autonomous farm machines to address skill labor shortage

Self-driving tractors might be the path to self-driving cars. John Deere has revealed a new line of autonomous machines and tech across agriculture, construction and commercial landscaping. The Moline, Illinois-based John Deere has been in business for 187 years, yet it has become a regular at the big tech trade show in Las Vegas as a non-tech company showing off technology, and it is back at CES 2025 with more autonomous tractors and other vehicles. This is not something we usually cover, but John Deere has a lot of data that is interesting in the big picture of tech. The message from the company is that there aren’t enough skilled farm laborers to do the work that its customers need. It’s been a challenge for most of the last two decades, said Jahmy Hindman, CTO at John Deere, in a briefing. Much of the tech will come this fall and after that. He noted that the average farmer in the U.S. is over 58 and works 12 to 18 hours a day to grow food for us. And he said the American Farm Bureau Federation estimates there are roughly 2.4 million farm jobs that need to be filled annually, and the agricultural work force continues to shrink. (This is my hint to the anti-immigration crowd). John Deere’s autonomous 9RX Tractor. Farmers can oversee it using an app. While each of these industries experiences its own set of challenges, a commonality across all is skilled labor availability. In construction, about 80% of contractors struggle to find skilled labor. And in commercial landscaping, 86% of landscaping business owners can’t find labor to fill open positions, he said. “They have to figure out how to do

2025 playbook for enterprise AI success, from agents to evals

2025 is poised to be a pivotal year for enterprise AI. The past year has seen rapid innovation, and this year will see the same. This has made it more critical than ever to revisit your AI strategy to stay competitive and create value for your customers. From scaling AI agents to optimizing costs, here are the five critical areas enterprises should prioritize for their AI strategy this year.

1. Agents: the next generation of automation

AI agents are no longer theoretical. In 2025, they’re indispensable tools for enterprises looking to streamline operations and enhance customer interactions. Unlike traditional software, agents powered by large language models (LLMs) can make nuanced decisions, navigate complex multi-step tasks, and integrate seamlessly with tools and APIs. At the start of 2024, agents were not ready for prime time, making frustrating mistakes like hallucinating URLs. They started getting better as frontier large language models themselves improved. “Let me put it this way,” said Sam Witteveen, cofounder of Red Dragon, a company that develops agents for companies, and that recently reviewed the 48 agents it built last year. “Interestingly, the ones that we built at the start of the year, a lot of those worked way better at the end of the year just because the models got better.” Witteveen shared this in the video podcast we filmed to discuss these five big trends in detail. Models are getting better and hallucinating less, and they’re also being trained to do agentic tasks. Another feature that the model providers are researching is a way to use the LLM as a judge, and as models get cheaper (something we’ll cover below), companies can use three or more models to

OpenAI’s red teaming innovations define new essentials for security leaders in the AI era

OpenAI has taken a more aggressive approach to red teaming than its AI competitors, demonstrating its security teams’ advanced capabilities in two areas: multi-step reinforcement and external red teaming. OpenAI recently released two papers that set a new competitive standard for improving the quality, reliability and safety of AI models in these two techniques and more. The first paper, “OpenAI’s Approach to External Red Teaming for AI Models and Systems,” reports that specialized teams outside the company have proven effective in uncovering vulnerabilities that might otherwise have made it into a released model because in-house testing techniques may have missed them. In the second paper, “Diverse and Effective Red Teaming with Auto-Generated Rewards and Multi-Step Reinforcement Learning,” OpenAI introduces an automated framework that relies on iterative reinforcement learning to generate a broad spectrum of novel, wide-ranging attacks.

Going all-in on red teaming pays practical, competitive dividends

It’s encouraging to see competitive intensity in red teaming growing among AI companies. When Anthropic released its AI red team guidelines in June of last year, it joined AI providers including Google, Microsoft, Nvidia, OpenAI, and even the U.S.’s National Institute of Standards and Technology (NIST), which all had released red teaming frameworks. Investing heavily in red teaming yields tangible benefits for security leaders in any organization. OpenAI’s paper on external red teaming provides a detailed analysis of how the company strives to create specialized external teams that include cybersecurity and subject matter experts. The goal is to see if knowledgeable external teams can defeat models’ security perimeters and find gaps in their security, biases and controls that prompt-based testing couldn’t find.
What makes OpenAI’s recent papers noteworthy is how well they define using human-in-the-middle
