Chinese researchers unveil MemOS, the first ‘memory operating system’ that gives AI human-like recall

A team of researchers from leading institutions including Shanghai Jiao Tong University and Zhejiang University has developed what they’re calling the first “memory operating system” for artificial intelligence, addressing a fundamental limitation that has hindered AI systems from achieving human-like persistent memory and learning.

The system, called MemOS, treats memory as a core computational resource that can be scheduled, shared, and evolved over time — much like how traditional operating systems manage CPU and storage resources. The research, published July 4th on arXiv, demonstrates significant performance improvements over existing approaches, including a 159% boost in temporal reasoning tasks compared to OpenAI’s memory systems.

“Large Language Models (LLMs) have become an essential infrastructure for Artificial General Intelligence (AGI), yet their lack of well-defined memory management systems hinders the development of long-context reasoning, continual personalization, and knowledge consistency,” the researchers write in their paper.

AI systems struggle with persistent memory across conversations

Current AI systems face what researchers call the “memory silo” problem — a fundamental architectural limitation that prevents them from maintaining coherent, long-term relationships with users. Each conversation or session essentially starts from scratch, with models unable to retain preferences, accumulated knowledge, or behavioral patterns across interactions. This creates a frustrating user experience where an AI assistant might forget a user’s dietary restrictions mentioned in one conversation when asked about restaurant recommendations in the next.

While some solutions like Retrieval-Augmented Generation (RAG) attempt to address this by pulling in external information during conversations, the researchers argue these remain “stateless workarounds without lifecycle control.” The problem runs deeper than simple information retrieval — it’s about creating systems that can genuinely learn and evolve from experience, much like human memory does.

“Existing models mainly rely on static parameters and short-lived contextual states, limiting their ability to track user preferences or update knowledge over extended periods,” the team explains. This limitation becomes particularly apparent in enterprise settings, where AI systems are expected to maintain context across complex, multi-stage workflows that might span days or weeks.

New system delivers dramatic improvements in AI reasoning tasks

MemOS introduces a fundamentally different approach through what the researchers call “MemCubes” — standardized memory units that can encapsulate different types of information and be composed, migrated, and evolved over time. These range from explicit text-based knowledge to parameter-level adaptations and activation states within the model, creating a unified framework for memory management that previously didn’t exist.
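
The paper doesn’t publish a full schema, but the description suggests a tagged container holding a payload, a memory type, and lifecycle metadata. The following Python sketch is purely illustrative; the field names are assumptions made for this article, not MemOS’s actual definitions:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import Any

class MemoryType(Enum):
    PLAINTEXT = "plaintext"    # explicit text-based knowledge
    ACTIVATION = "activation"  # cached activation / KV states
    PARAMETRIC = "parametric"  # parameter-level adaptations (e.g., adapters)

@dataclass
class MemCube:
    """A self-describing memory unit that can be composed, migrated, and evolved."""
    memory_type: MemoryType
    payload: Any                 # text, tensors, or adapter weights
    provenance: str = "unknown"  # where the memory came from
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    access_count: int = 0        # usage signal for scheduling decisions
    tags: list[str] = field(default_factory=list)

    def touch(self) -> None:
        """Record an access so a scheduler can promote frequently used memories."""
        self.access_count += 1
```

The usage counter is the detail that matters here: it is the signal that lets a scheduler treat memory as a managed resource rather than inert data.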

In tests on the LOCOMO benchmark, which evaluates memory-intensive reasoning tasks, MemOS consistently outperformed established baselines across all categories. The system achieved a 38.98% overall improvement compared to OpenAI’s memory implementation, with particularly strong gains in complex reasoning scenarios that require connecting information across multiple conversation turns.

“MemOS (MemOS-0630) consistently ranks first in all categories, outperforming strong baselines such as mem0, LangMem, Zep, and OpenAI-Memory, with especially large margins in challenging settings like multi-hop and temporal reasoning,” according to the research. The system also delivered substantial efficiency gains, with up to a 94% reduction in time-to-first-token latency in certain configurations through its KV-cache memory injection mechanism.
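
The latency gain rests on a familiar transformer mechanic: once the key-value (KV) states for a piece of text have been computed, they can be reused rather than re-encoded on every request. Here is a rough sketch of that general pattern using the HuggingFace transformers API, with GPT-2 as a stand-in model; it illustrates generic KV-cache reuse, not MemOS’s actual injection code:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

# Offline: encode a stored memory once and keep its key-value states.
memory = "The user is vegetarian and allergic to peanuts. "
mem_ids = tok(memory, return_tensors="pt").input_ids
with torch.no_grad():
    kv_cache = model(mem_ids, use_cache=True).past_key_values

# Online: inject the cached states so only the new prompt is encoded,
# which is what shrinks time-to-first-token.
prompt_ids = tok("Suggest a dinner spot: ", return_tensors="pt").input_ids
with torch.no_grad():
    out = model(prompt_ids, past_key_values=kv_cache, use_cache=True)
first_token = tok.decode(out.logits[:, -1].argmax(-1))
```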

These performance gains suggest that the memory bottleneck has been a more significant limitation than previously understood. By treating memory as a first-class computational resource, MemOS appears to unlock reasoning capabilities that were previously constrained by architectural limitations.

The technology could reshape how businesses deploy artificial intelligence

The implications for enterprise AI deployment could be transformative, particularly as businesses increasingly rely on AI systems for complex, ongoing relationships with customers and employees. MemOS enables what the researchers describe as “cross-platform memory migration,” allowing AI memories to be portable across different platforms and devices, breaking down what they call “memory islands” that currently trap user context within specific applications.

Consider the current frustration many users experience when insights explored in one AI platform can’t carry over to another. A marketing team might develop detailed customer personas through conversations with ChatGPT, only to start from scratch when switching to a different AI tool for campaign planning. MemOS addresses this by creating a standardized memory format that can move between systems.
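
In practice, portability of this kind comes down to serializing a memory unit into a platform-neutral format and rebuilding it on the other side. Continuing the illustrative MemCube sketch above (again, the format shown is an assumption for exposition, not MemOS’s published schema):

```python
import json
from datetime import datetime

def export_memory(cube: MemCube) -> str:
    """Serialize a plaintext MemCube into a portable, platform-neutral blob."""
    return json.dumps({
        "memory_type": cube.memory_type.value,
        "payload": cube.payload,
        "provenance": cube.provenance,
        "created_at": cube.created_at.isoformat(),
        "tags": cube.tags,
    })

def import_memory(blob: str) -> MemCube:
    """Rebuild the same memory unit on a different platform or device."""
    data = json.loads(blob)
    return MemCube(
        memory_type=MemoryType(data["memory_type"]),
        payload=data["payload"],
        provenance=data["provenance"],
        created_at=datetime.fromisoformat(data["created_at"]),
        tags=data["tags"],
    )
```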

The research also outlines potential for “paid memory modules,” where domain experts could package their knowledge into purchasable memory units. The researchers envision scenarios where “a medical student in clinical rotation may wish to study how to manage a rare autoimmune condition. An experienced physician can encapsulate diagnostic heuristics, questioning paths, and typical case patterns into a structured memory” that can be installed and used by other AI systems.

This marketplace model could fundamentally alter how specialized knowledge is distributed and monetized in AI systems, creating new economic opportunities for experts while democratizing access to high-quality domain knowledge. For enterprises, this could mean rapidly deploying AI systems with deep expertise in specific areas without the traditional costs and timelines associated with custom training.

Three-layer design mirrors traditional computer operating systems

The technical architecture of MemOS reflects decades of learning from traditional operating system design, adapted for the unique challenges of AI memory management. The system employs a three-layer architecture: an interface layer for API calls, an operation layer for memory scheduling and lifecycle management, and an infrastructure layer for storage and governance.
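
A toy skeleton of that layering, reusing the illustrative MemCube types from earlier (class and method names are invented for exposition):

```python
class InfrastructureLayer:
    """Storage and governance: persistence plus access control would live here."""
    def __init__(self) -> None:
        self._store: dict[str, MemCube] = {}
    def put(self, key: str, cube: MemCube) -> None:
        self._store[key] = cube
    def get(self, key: str) -> MemCube | None:
        return self._store.get(key)

class OperationLayer:
    """Scheduling and lifecycle: tracks usage and decides what to keep hot."""
    def __init__(self, infra: InfrastructureLayer) -> None:
        self.infra = infra
    def recall(self, key: str) -> MemCube | None:
        cube = self.infra.get(key)
        if cube is not None:
            cube.touch()  # usage signal feeds future scheduling decisions
        return cube

class InterfaceLayer:
    """The API surface applications call; everything below stays hidden."""
    def __init__(self, ops: OperationLayer) -> None:
        self.ops = ops
    def remember(self, key: str, text: str) -> None:
        self.ops.infra.put(key, MemCube(MemoryType.PLAINTEXT, text))
    def recall(self, key: str) -> str | None:
        cube = self.ops.recall(key)
        return cube.payload if cube is not None else None
```

As in a conventional operating system, each layer talks only to the one beneath it, so an application written against the interface layer never touches storage or scheduling details directly.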

The system’s MemScheduler component dynamically manages different types of memory — from temporary activation states to permanent parameter modifications — selecting optimal storage and retrieval strategies based on usage patterns and task requirements. This represents a significant departure from current approaches, which typically treat memory as either completely static (embedded in model parameters) or completely ephemeral (limited to conversation context).
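
The paper does not disclose the scheduler’s exact decision rules, but the idea can be conveyed with a deliberately simple policy; the thresholds below are invented for illustration:

```python
from datetime import datetime, timezone

def select_tier(cube: MemCube) -> str:
    """Pick a storage/retrieval strategy from simple usage signals.

    Hot, repeatedly reused memories are kept as activation (KV) caches for
    fast injection; old but stable knowledge is a candidate for consolidation
    into parameters; everything else stays as retrievable plaintext.
    """
    age_days = (datetime.now(timezone.utc) - cube.created_at).days
    if cube.access_count >= 50 and age_days >= 30:
        return "parametric"   # consolidate into adapter weights
    if cube.access_count >= 5:
        return "activation"   # keep precomputed KV states ready
    return "plaintext"        # cheap archival, retrieved on demand
```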

“The focus shifts from how much knowledge the model learns once to whether it can transform experience into structured memory and repeatedly retrieve and reconstruct it,” the researchers note, describing their vision for what they call “Mem-training” paradigms. This architectural philosophy suggests a fundamental rethinking of how AI systems should be designed, moving away from the current paradigm of massive pre-training toward more dynamic, experience-driven learning.

The parallels to operating system development are striking. Just as early computers required programmers to manually manage memory allocation, current AI systems require developers to carefully orchestrate how information flows between different components. MemOS abstracts this complexity, potentially enabling a new generation of AI applications that can be built on top of sophisticated memory management without requiring deep technical expertise.

Researchers release code as open source to accelerate adoption

The team has released MemOS as an open-source project, with full code available on GitHub and integration support for major AI platforms including HuggingFace, OpenAI, and Ollama. This open-source strategy appears designed to accelerate adoption and encourage community development, rather than pursuing a proprietary approach that might limit widespread implementation.
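
To give a concrete sense of what such integrations involve, the snippet below shows the general shape of a memory layer sitting between an application and an OpenAI-compatible client. Every helper here is hypothetical glue code written for this article, not MemOS’s actual API:

```python
# Hypothetical glue code: these helpers are written for this article and are
# not MemOS's actual API.
from openai import OpenAI

client = OpenAI()
memory_store: list[str] = []  # stand-in for a persistent MemCube store

def chat_with_memory(user_msg: str) -> str:
    # Prepend remembered facts so the model sees persistent context.
    facts = "\n".join(memory_store)
    messages = [
        {"role": "system", "content": f"Known user facts:\n{facts}"},
        {"role": "user", "content": user_msg},
    ]
    reply = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
    return reply.choices[0].message.content

memory_store.append("User is vegetarian.")           # learned in an earlier session
print(chat_with_memory("Suggest a dinner recipe."))  # answer respects the memory
```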

“We hope MemOS helps advance AI systems from static generators to continuously evolving, memory-driven agents,” project lead Zhiyu Li commented in the GitHub repository. The system currently supports Linux platforms, with Windows and macOS support planned, suggesting the team is prioritizing enterprise and developer adoption over immediate consumer accessibility.

The open-source release strategy reflects a broader trend in AI research where foundational infrastructure improvements are shared openly to benefit the entire ecosystem. This approach has historically accelerated innovation in areas like deep learning frameworks and could have similar effects for memory management in AI systems.

Tech giants race to solve AI memory limitations

The research arrives as major AI companies grapple with the limitations of current memory approaches, highlighting just how fundamental this challenge has become for the industry. OpenAI recently introduced memory features for ChatGPT, while Anthropic, Google, and other providers have experimented with various forms of persistent context. However, these implementations have generally been limited in scope and often lack the systematic approach that MemOS provides.

The timing of this research suggests that memory management has emerged as a critical competitive battleground in AI development. Companies that can solve the memory problem effectively may gain significant advantages in user retention and satisfaction, as their AI systems will be able to build deeper, more useful relationships over time.

Industry observers have long predicted that the next major breakthrough in AI wouldn’t necessarily come from larger models or more training data, but from architectural innovations that better mimic human cognitive capabilities. Memory management represents exactly this type of fundamental advancement — one that could unlock new applications and use cases that aren’t possible with current stateless systems.

The development represents part of a broader shift in AI research toward more stateful, persistent systems that can accumulate and evolve knowledge over time — capabilities seen as essential for artificial general intelligence. For enterprise technology leaders evaluating AI implementations, MemOS could represent a significant advancement in building AI systems that maintain context and improve over time, rather than treating each interaction as isolated.

The research team indicates it plans to explore cross-model memory sharing, self-evolving memory blocks, and the development of a broader “memory marketplace” ecosystem in future work. But perhaps the most significant impact of MemOS won’t be the specific technical implementation, but rather the proof that treating memory as a first-class computational resource can unlock dramatic improvements in AI capabilities. In an industry that has largely focused on scaling model size and training data, MemOS suggests that the next breakthrough might come from better architecture rather than raw scale.
