Algorithm Protection in the Context of Federated Learning

Stay Ahead, Stay ONMINE

Algorithm Protection in the Context of Federated Learning

While working at a biotech company, we aim to advance ML & AI Algorithms to enable, for example, brain lesion segmentation to be executed at the hospital/clinic location where patient data resides, so it is processed in a secure manner. This, in essence, is guaranteed by federated learning mechanisms, which we have adopted in numerous real-world hospital settings. However, when an algorithm is already considered as a company asset, we also need means that protect not only sensitive data, but also secure algorithms in a heterogeneous federated environment. Fig.1 High-level workflow and attack surface. Image by author Most algorithms are assumed to be encapsulated within docker-compatible containers, allowing them to use different libraries and runtimes independently. It is assumed that there is a 3rd party IT administrator who will aim to secure patients’ data and lock the deployment environment, making it inaccessible for algorithm providers. This perspective describes different mechanisms intended to package and protect containerized workloads against theft of intellectual property by a local system administrator. To ensure a comprehensive approach, we will address protection measures across three critical layers: Algorithm code protection: Measures to secure algorithm code, preventing unauthorized access or reverse engineering. Runtime environment: Evaluates risks of administrators accessing confidential data within a containerized system. Deployment environment: Infrastructure safeguards against unauthorized system administrator access. Fig.2 Different layers of protection. Image by author Methodology After analysis of risks, we have identified two protection measures categories: Intellectual property theft and unauthorized distribution: preventing administrator users from accessing, copying, executing the algorithm. Reverse engineering risk reduction: blocking administrator users from analyzing code to uncover and claim ownership. While understanding the subjectivity of this assessment, we have considered both qualitative and quantitative characteristics of all mechanisms. Qualitative assessment Categories mentioned were considered when selecting suitable solution and are considered in summary: Hardware dependency: potential lock-in and scalability challenges in federated systems. Software dependency: reflects maturity and long-term stability Hardware and Software dependency: measures setup complexity, deployment and maintenance effort Cloud dependency: risks of lock-in with a single cloud hypervisor Hospital environment: evaluates technology maturity and requirements heterogeneous hardware setups. Cost: covers for dedicated hardware, implementation and maintenance Quantitative assessment Subjective risk reduction quantitative assessment description: Considering the above methodology and assessment criteria, we came up with a list of mechanisms that have the potential to guarantee the objective. Confidential containers Confidential Containers (CoCo) is an emerging CNCF technology that aims to deliver confidential runtime environments that will run CPU and GPU workloads while protecting the algorithm code and data from the hosting company. CoCo supports multiple TEE, including Intel TDX/SGX and AMD SEV hardware technologies, including extensions of NVidia GPU operators, that use hardware-backed protection of code and data during its execution, preventing scenarios in which a determined and skillful local administrator uses a local debugger to dump the contents of the container memory and has access to both the algorithm and data being processed. Trust is built using cryptographic attestation of runtime environment and code that is executed. It makes sure the code is not tempered with nor read by remote admin. This appears to be a perfect fit for our problem, as the remote data site admin would not be able to access the algorithm code. Unfortunately, the current state of the CoCo software stack, despite continuous efforts, still suffers from security gaps that enable the malicious administrators to issue attestation for themselves and effectively bypass all the other protection mechanisms, rendering all of them effectively useless. Each time the technology gets closer to practical production readiness, a new fundamental security issue is discovered that needs to be addressed. It is worth noting that this community is fairly transparent in communicating gaps. The often and rightfully recognized additional complexity introduced by TEEs and CoCo (specialized hardware, configuration burden, runtime overhead due to encryption) would be justifiable if the technology delivered on its promise of code protection. While TEE seems to be well adopted, CoCo is close but not there yet and based on our experiences the horizon keeps on moving, as new fundamental vulnerabilities are discovered and need to be addressed. In other words, if we had production-ready CoCo, it would have been a solution to our problem. Host-based container image encryption at rest (protection at rest and in transit) This strategy is based on end-to-end protection of container images containing the algorithm. It protects the source code of the algorithm at rest and in transit but does not protect it at runtime, as the container needs to be decrypted prior to the execution. The malicious administrator at the site has direct or indirect access to the decryption key, so he can read container contents just after it is decrypted for the execution time. Another attack scenario is to attach a debugger to the running container image. So host-based container image encryption at rest makes it harder to steal the algorithm from a storage device and in transit due to encryption, but moderately skilled administrators can decrypt and expose the algorithm. In our opinion, the increased practical effort of decrypting the algorithm (time, effort, skillset, infrastructure) from the container by the administrator who has access to the decryption key is too low to be considered as a valid algorithm protection mechanism. Prebaked custom virtual machine In this scenario the algorithm owner is delivering an encrypted virtual machine. The key can be added at boot time from the keyboard by someone else than admin (required at each reboot), from external storage (USB Key, very vulnerable, as anyone with physical access can attach the key storage), or using a remote SSH session (using Dropbear for instance) without allowing local admin to unlock the bootloader and disk. Effective and established technologies such as LUKS can be used to fully encrypt local VM filesystems including bootloader. However, even if the remote key is provided using a boot-level tiny SSH session by someone other than a malicious admin, the runtime is exposed to a hypervisor-level debugger attack, as after boot, the VM memory is decrypted and can be scanned for code and data. Still, this solution, especially with remotely provided keys by the algorithm owner, provides significantly increased algorithm code protection compared to encrypted containers because an attack requires more skills and determination than just decrypting the container image using a decryption key. To prevent memory dump analysis, we considered deploying a prebaked host machine with ssh possessed keys at boot time, this removes any hypervisor level access to memory. As a side note, there are methods to freeze physical memory modules to delay loss of data. Distroless container images Distroless container images are reducing the number of layers and components to a minimum required to run the algorithm. The attack surface is greatly reduced, as there are fewer components prone to vulnerabilities and known attacks. They are also lighter in terms of storage, network transmission, and latency. However, despite these improvements, the algorithm code is not protected at all. Distroless containers are recommended as more secure containers but not the containers that protect the algorithm, as the algorithm is there, container image can be easily mounted and algorithm can be stolen without a significant effort. Being distroless does not address our goal of protecting the algorithm code. Compiled algorithm Most machine learning algorithms are written in Python. This interpreted language makes it really easy not only to execute the algorithm code on other machines and in other environments but also to access source code and be able to modify the algorithm. The potential scenario even enables the party that steals the algorithm code to modify it, let’s say 30% or more of the source code, and claim it’s no longer the original algorithm, and could even make a legal action much harder to provide evidence of intellectual property infringement. Compiled languages, such as C, C++, Rust, when combined with strong compiler optimization (-O3 in the case of C, linker-time optimizations), make the source code not only unavailable as such, but also much harder to reverse engineer source code. Compiler optimizations introduce significant control flow changes, mathematical operations substitutions, function inlining, code restructuring, and difficult stack tracing. This makes it much harder to reverse engineer the code, making it a practically infeasible option in some scenarios, thus it can be considered as a way to increase the cost of reverse engineering attack by orders of magnitude compared to plain Python code. There’s an increased complexity and skill gap, as most of the algorithms are written in Python and would have to be converted to C, C++ or Rust. This option does increase the cost of further development of the algorithm and even modifying it to make a claim of its ownership but it does not prevent the algorithm from being executed outside of the agreed contractual scope. Code obfuscation The established technique of making the code much less readable, harder to understand and develop further can be used to make algorithm evolutions much harder. Unfortunately, it does not prevent the algorithm from being executed outside of contractual scope. Also, the de-obfuscation technologies are getting much better, thanks to advanced language models, lowering the practical effectiveness of code obfuscation. Code obfuscation does increase the practical cost of algorithm reverse engineering, so it’s worth considering as an option combined with other options (for instance, with compiled code and custom VMs). Homomorphic Encryption as code protection mechanism Homomorphic Encryption (HE) is a promised technology aimed at protecting the data, very interesting from secure aggregation strategies of partial results in Federated Learning and analytics scenarios. The aggregation party (with limited trust) can only process encrypted data and perform encrypted aggregations, then it can decrypt aggregated results without being able to decrypt any individual data. Practical applications of HE are limited due to its complexity, performance hits, limited number of supported operations, there’s observable progress (including GPU acceleration for HE) but still it’s a niche and emerging data protection technique. From an algorithm protection goal perspective, HE is not designed, nor can be made to protect the algorithm. So it’s not an algorithm protection mechanism at all. Conclusions Fig.3 Risk reduction scores, Image by author In essence, we described and assessed strategies and technologies to protect algorithm IP and sensitive data in the context of deploying Medical Algorithms and running them in potentially untrusted environments, such as hospitals. What’s visible, the most promising technologies are those that provide a degree of hardware isolation. However those make an algorithm provider completely dependent on the runtime it will be deployed. While compilation and obfuscation do not mitigate completely the risk of intellectual property theft, especially even basic LLM seem to be helpful, those methods, especially when combined, make algorithms very difficult, thus expensive, to use and modify the code. Which would already provide a degree of security. Prebaked host/virtual machines are the most common and adopted methods, extended with features like full disk encryption with keys acquired during boot via SSH, which could make it fairly difficult for local admin to access any data. However, especially pre-baked machines could cause certain compliance concerns at the hospital, and this needs to be assessed prior to establishing a federated network. Key Hardware and Software vendors(Intel, AMD, NVIDIA, Microsoft, RedHat) recognized significant demand and continue to evolve, which gives a promise that training IP-protected algorithms in a federated manner, without disclosing patients’ data, will soon be within reach. However, hardware-supported methods are very sensitive to hospital internal infrastructure, which by nature is quite heterogeneous. Therefore, containerisation provides some promise of portability. Considering this, Confidential Containers technology seems to be a very tempting promise provided by collaborators, while it’s still not fullyproduction-readyy. Certainly combining above mechanisms, code, runtime and infrastructure environment supplemented with proper legal framework decrease residual risks, however no solution provides absolute protection particularly against determined adversaries with privileged access – the combined effect of these measures creates substantial barriers to intellectual property theft. We deeply appreciate and value feedback from the community helping to further steer future efforts to develop sustainable, secure and effective methods for accelerating AI development and deployment. Together, we can tackle these challenges and achieve groundbreaking progress, ensuring robust security and compliance in various contexts. Contributions: The author would like to thank Jacek Chmiel, Peter Fernana Richie, Vitor Gouveia and the Federated Open Science team at Roche for brainstorming, pragmatic solution-oriented thinking, and contributions. Link & Resources Intel Confidential Containers Guide Nvidia blog describing integration with CoCo Confidential Containers Github & Kata Agent Policies Commercial Vendors: Edgeless systems contrast, Redhat & Azure Remote Unlock of LUKS encrypted disk A perfect match to elevate privacy-enhancing healthcare analytics Differential Privacy and Federated Learning for Medical Data

Fig.1 High-level workflow and attack surface. Image by author

Most algorithms are assumed to be encapsulated within docker-compatible containers, allowing them to use different libraries and runtimes independently. It is assumed that there is a 3rd party IT administrator who will aim to secure patients’ data and lock the deployment environment, making it inaccessible for algorithm providers. This perspective describes different mechanisms intended to package and protect containerized workloads against theft of intellectual property by a local system administrator.

To ensure a comprehensive approach, we will address protection measures across three critical layers:

Algorithm code protection: Measures to secure algorithm code, preventing unauthorized access or reverse engineering.
Runtime environment: Evaluates risks of administrators accessing confidential data within a containerized system.
Deployment environment: Infrastructure safeguards against unauthorized system administrator access.

Fig.2 Different layers of protection. Image by author

Methodology

After analysis of risks, we have identified two protection measures categories:

Intellectual property theft and unauthorized distribution: preventing administrator users from accessing, copying, executing the algorithm.
Reverse engineering risk reduction: blocking administrator users from analyzing code to uncover and claim ownership.

While understanding the subjectivity of this assessment, we have considered both qualitative and quantitative characteristics of all mechanisms.

Qualitative assessment

Categories mentioned were considered when selecting suitable solution and are considered in summary:

Hardware dependency: potential lock-in and scalability challenges in federated systems.
Software dependency: reflects maturity and long-term stability
Hardware and Software dependency: measures setup complexity, deployment and maintenance effort
Cloud dependency: risks of lock-in with a single cloud hypervisor
Hospital environment: evaluates technology maturity and requirements heterogeneous hardware setups.
Cost: covers for dedicated hardware, implementation and maintenance

Quantitative assessment

Subjective risk reduction quantitative assessment description:

Considering the above methodology and assessment criteria, we came up with a list of mechanisms that have the potential to guarantee the objective.

Confidential containers

Confidential Containers (CoCo) is an emerging CNCF technology that aims to deliver confidential runtime environments that will run CPU and GPU workloads while protecting the algorithm code and data from the hosting company.

CoCo supports multiple TEE, including Intel TDX/SGX and AMD SEV hardware technologies, including extensions of NVidia GPU operators, that use hardware-backed protection of code and data during its execution, preventing scenarios in which a determined and skillful local administrator uses a local debugger to dump the contents of the container memory and has access to both the algorithm and data being processed.

Trust is built using cryptographic attestation of runtime environment and code that is executed. It makes sure the code is not tempered with nor read by remote admin.

This appears to be a perfect fit for our problem, as the remote data site admin would not be able to access the algorithm code. Unfortunately, the current state of the CoCo software stack, despite continuous efforts, still suffers from security gaps that enable the malicious administrators to issue attestation for themselves and effectively bypass all the other protection mechanisms, rendering all of them effectively useless. Each time the technology gets closer to practical production readiness, a new fundamental security issue is discovered that needs to be addressed. It is worth noting that this community is fairly transparent in communicating gaps.

The often and rightfully recognized additional complexity introduced by TEEs and CoCo (specialized hardware, configuration burden, runtime overhead due to encryption) would be justifiable if the technology delivered on its promise of code protection. While TEE seems to be well adopted, CoCo is close but not there yet and based on our experiences the horizon keeps on moving, as new fundamental vulnerabilities are discovered and need to be addressed.

In other words, if we had production-ready CoCo, it would have been a solution to our problem.

Host-based container image encryption at rest (protection at rest and in transit)

This strategy is based on end-to-end protection of container images containing the algorithm.

It protects the source code of the algorithm at rest and in transit but does not protect it at runtime, as the container needs to be decrypted prior to the execution.

The malicious administrator at the site has direct or indirect access to the decryption key, so he can read container contents just after it is decrypted for the execution time.

Another attack scenario is to attach a debugger to the running container image.

So host-based container image encryption at rest makes it harder to steal the algorithm from a storage device and in transit due to encryption, but moderately skilled administrators can decrypt and expose the algorithm.

In our opinion, the increased practical effort of decrypting the algorithm (time, effort, skillset, infrastructure) from the container by the administrator who has access to the decryption key is too low to be considered as a valid algorithm protection mechanism.

Prebaked custom virtual machine

In this scenario the algorithm owner is delivering an encrypted virtual machine.

The key can be added at boot time from the keyboard by someone else than admin (required at each reboot), from external storage (USB Key, very vulnerable, as anyone with physical access can attach the key storage), or using a remote SSH session (using Dropbear for instance) without allowing local admin to unlock the bootloader and disk.

Effective and established technologies such as LUKS can be used to fully encrypt local VM filesystems including bootloader.

However, even if the remote key is provided using a boot-level tiny SSH session by someone other than a malicious admin, the runtime is exposed to a hypervisor-level debugger attack, as after boot, the VM memory is decrypted and can be scanned for code and data.

Still, this solution, especially with remotely provided keys by the algorithm owner, provides significantly increased algorithm code protection compared to encrypted containers because an attack requires more skills and determination than just decrypting the container image using a decryption key.

To prevent memory dump analysis, we considered deploying a prebaked host machine with ssh possessed keys at boot time, this removes any hypervisor level access to memory. As a side note, there are methods to freeze physical memory modules to delay loss of data.

Distroless container images

Distroless container images are reducing the number of layers and components to a minimum required to run the algorithm.

The attack surface is greatly reduced, as there are fewer components prone to vulnerabilities and known attacks. They are also lighter in terms of storage, network transmission, and latency.

However, despite these improvements, the algorithm code is not protected at all.

Distroless containers are recommended as more secure containers but not the containers that protect the algorithm, as the algorithm is there, container image can be easily mounted and algorithm can be stolen without a significant effort.

Being distroless does not address our goal of protecting the algorithm code.

Compiled algorithm

Most machine learning algorithms are written in Python. This interpreted language makes it really easy not only to execute the algorithm code on other machines and in other environments but also to access source code and be able to modify the algorithm.

The potential scenario even enables the party that steals the algorithm code to modify it, let’s say 30% or more of the source code, and claim it’s no longer the original algorithm, and could even make a legal action much harder to provide evidence of intellectual property infringement.

Compiled languages, such as C, C++, Rust, when combined with strong compiler optimization (-O3 in the case of C, linker-time optimizations), make the source code not only unavailable as such, but also much harder to reverse engineer source code.

Compiler optimizations introduce significant control flow changes, mathematical operations substitutions, function inlining, code restructuring, and difficult stack tracing.

This makes it much harder to reverse engineer the code, making it a practically infeasible option in some scenarios, thus it can be considered as a way to increase the cost of reverse engineering attack by orders of magnitude compared to plain Python code.

There’s an increased complexity and skill gap, as most of the algorithms are written in Python and would have to be converted to C, C++ or Rust.

This option does increase the cost of further development of the algorithm and even modifying it to make a claim of its ownership but it does not prevent the algorithm from being executed outside of the agreed contractual scope.

Code obfuscation

The established technique of making the code much less readable, harder to understand and develop further can be used to make algorithm evolutions much harder.

Unfortunately, it does not prevent the algorithm from being executed outside of contractual scope.

Also, the de-obfuscation technologies are getting much better, thanks to advanced language models, lowering the practical effectiveness of code obfuscation.

Code obfuscation does increase the practical cost of algorithm reverse engineering, so it’s worth considering as an option combined with other options (for instance, with compiled code and custom VMs).

Homomorphic Encryption as code protection mechanism

Homomorphic Encryption (HE) is a promised technology aimed at protecting the data, very interesting from secure aggregation strategies of partial results in Federated Learning and analytics scenarios.

The aggregation party (with limited trust) can only process encrypted data and perform encrypted aggregations, then it can decrypt aggregated results without being able to decrypt any individual data.

Practical applications of HE are limited due to its complexity, performance hits, limited number of supported operations, there’s observable progress (including GPU acceleration for HE) but still it’s a niche and emerging data protection technique.

From an algorithm protection goal perspective, HE is not designed, nor can be made to protect the algorithm. So it’s not an algorithm protection mechanism at all.

Conclusions

Chart — Fig.3 Risk reduction scores, Image by author

In essence, we described and assessed strategies and technologies to protect algorithm IP and sensitive data in the context of deploying Medical Algorithms and running them in potentially untrusted environments, such as hospitals.

What’s visible, the most promising technologies are those that provide a degree of hardware isolation. However those make an algorithm provider completely dependent on the runtime it will be deployed. While compilation and obfuscation do not mitigate completely the risk of intellectual property theft, especially even basic LLM seem to be helpful, those methods, especially when combined, make algorithms very difficult, thus expensive, to use and modify the code. Which would already provide a degree of security.

Prebaked host/virtual machines are the most common and adopted methods, extended with features like full disk encryption with keys acquired during boot via SSH, which could make it fairly difficult for local admin to access any data. However, especially pre-baked machines could cause certain compliance concerns at the hospital, and this needs to be assessed prior to establishing a federated network.

Key Hardware and Software vendors(Intel, AMD, NVIDIA, Microsoft, RedHat) recognized significant demand and continue to evolve, which gives a promise that training IP-protected algorithms in a federated manner, without disclosing patients’ data, will soon be within reach. However, hardware-supported methods are very sensitive to hospital internal infrastructure, which by nature is quite heterogeneous. Therefore, containerisation provides some promise of portability. Considering this, Confidential Containers technology seems to be a very tempting promise provided by collaborators, while it’s still not fullyproduction-readyy.

Certainly combining above mechanisms, code, runtime and infrastructure environment supplemented with proper legal framework decrease residual risks, however no solution provides absolute protection particularly against determined adversaries with privileged access – the combined effect of these measures creates substantial barriers to intellectual property theft.

We deeply appreciate and value feedback from the community helping to further steer future efforts to develop sustainable, secure and effective methods for accelerating AI development and deployment. Together, we can tackle these challenges and achieve groundbreaking progress, ensuring robust security and compliance in various contexts.

Contributions: The author would like to thank Jacek Chmiel, Peter Fernana Richie, Vitor Gouveia and the Federated Open Science team at Roche for brainstorming, pragmatic solution-oriented thinking, and contributions.

Link & Resources

Intel Confidential Containers Guide

Nvidia blog describing integration with CoCo Confidential Containers Github & Kata Agent Policies

Commercial Vendors: Edgeless systems contrast, Redhat & Azure

Remote Unlock of LUKS encrypted disk

A perfect match to elevate privacy-enhancing healthcare analytics

Differential Privacy and Federated Learning for Medical Data

Stay Ahead

Explore More Insights

Stay ahead with more perspectives on cutting-edge power, infrastructure, energy, bitcoin and AI solutions. Explore these articles to uncover strategies and insights shaping the future of industries.

Solving multicloud networking challenges to scale businesses with AI

Artificial Intelligence (AI) and its demand for massive computational power have spurred the growth of cloud and edge computing. As AI pilot projects take off and scale, there has been an increasing demand for more flexible and high-performing infrastructure. This has led to the rise of multi-cloud networking as organisations

What businesses should consider while adopting multiple cloud technologies

In today’s world, the majority of enterprises have more than one cloud provider, with more data than ever in the cloud, particularly given the rise of work-from-anywhere and ubiquitous customer access. Research shows that over 80% of enterprises have more than one cloud provider. A multi-cloud approach combines the strengths

Why enterprise networks need both reach and resilience

As enterprises expand across regions, so do their cloud platforms and digital ecosystems. But with the rise of AI and its unprecedented appetite for data, networks are now under more pressure. Many businesses are learning the limits of legacy architecture the hard way. In the race to meet today’s standard

Can your network handle the demands of today’s connected workplace?

Enterprise innovation is accelerating at a dizzying pace, from AI-powered applications and real-time data platforms to immersive customer experiences. Solutions like Microsoft Copilot are transforming productivity. Platforms-as-a-Service, such as Lattice, are reimagining how teams collaborate and grow. But beneath this digital renaissance lies the hard truth: none of it works

White House Highlights Senate Passing of ‘Big Beautiful Bill’

A statement posted on the White House website on Tuesday said the Senate “delivered a resounding victory for American workers, farmers, and small businesses by passing President Donald J. Trump’s One Big Beautiful Bill”. The White House statement described the bill as “a transformative legislative package that locks in historic tax relief, delivers border security, reforms welfare, funds critical infrastructure, and more”. The statement linked to an X post by the “Official Rapid Response account of the Trump 47 White House” which noted that Vice President JD Vance “cast[ed]… the deciding vote as the Senate approve[d]… the One Big Beautiful Bill – moving it back to the House and one step closer to President Trump’s desk”. A summary of the bill on the Congress website, dated May 22, states that it “reduces taxes, reduces or increases spending for various federal programs, increases the statutory debt limit, and otherwise addresses agencies and programs throughout the federal government”. A tracker on the site showed that the bill had to pass the House and the Senate before going to the President and becoming law. In a statement sent to Rigzone late Tuesday, Independent Petroleum Association of America (IPAA) President and CEO Jeff Eshelman said “President Trump’s One Big Beautiful Bill remains a win for American energy”. “The bill passed today improves the ability of independent oil and natural gas producers to supply reliable, affordable energy to the American people,” he noted. “IPAA is pleased that the legislation reinstates oil and natural gas lease sales for onshore and offshore federal lands and makes common sense reforms to the permitting and leasing process on federal lands. IPAA members, the small businesses of the oil patch, are grateful that industry tax treatments including intangible drilling costs and percentage depletion were protected, along with carried interest deductions being preserved,” he added. “While

Saudis Led Surge in OPEC Stalwarts’ Oil Exports Last Month

Saudi Arabia led a surge in crude exports from stalwart oil-producing giants in the Middle East in June. The flood may reflect a race by Riyadh – along with neighboring Kuwait and the United Arab Emirates – to expedite barrels out of the Persian Gulf while a conflict between Israel and Iran threatened the region’s shipping corridor. While the trio are entitled to raise crude production under an OPEC+ accord, their exports appear to have climbed even further. The three nations loaded 11.9 million barrels a day onto tankers in June, ship-tracking compiled by Bloomberg shows. Their collective seaborne exports were the highest since April 2023. “Amid supply disruption concerns, Middle Eastern oil producers might have looked at additional storage locations around the world,” said Giovanni Staunovo, a commodity analyst UBS Group AG. The boost from the three producers, who rank among the biggest and most reliable in the Organization of the Petroleum Exporting Countries, signals the group and its partners are pressing on with plans to speed up the return of halted output – a strategy shift that has rattled crude traders and depressed prices, given faltering demand and an impending oversupply. The correlation between exports and production is patchy because countries load barrels from storage and can also ship them to storage. Likewise, the timing of a few large tanker shipments can have a big impact on flow rates. The bottom line, though is that the month-on-month gain from the trio – at 937,000 a day, their largest collective hike since September 2023 – means large amounts of extra supply are now en route toward buyer countries. Riyadh bolstered crude exports by 441,000 barrels a day, or about 7 percent, this month to 6.36 million a day, according to a preliminary analysis of tanker-tracking data compiled by Bloomberg. Kuwait’s exports rose to the highest

Eni Forms New Company to Expand Oilfield Chemicals Business

Eni SpA’s chemical arm, Versalis SpA, has transferred its oilfield chemicals assets to a new company called Versalis Oilfield Solutions SRL in an expansion bid, Eni said Tuesday. “The operation aims to consolidate Versalis’ position in the oilfield services sector by bringing together key expertise and strategic activities in a single, focused and operationally efficient entity”, Italy’s state-controlled Eni said in an online statement. “Versalis’ oilfield operations encompass research and development of advanced chemical formulations, outsourcing their production and marketing solvents and additives designed for the oil drilling industry. Products are tailored to meet specific client requirements, while services include sales and after-sales support, ensuring continuous and qualified technical assistance”. Launched 2010, Eni’s oilfield chemicals business now operates in Africa, the Americas, Asia and Europe, according to the company. Versalis Oilfield Solutions will expand “the scope of activities in terms of products and services, achieve higher revenue targets, and maintain strong profitability”, Eni said. Last year Eni finalized a plan to transform Versalis, which makes basic chemicals, chemical products including plastics and biochemical products such as biolubricants. With an investment of around EUR 2 billion ($2.36 billion), the plan “aims to reduce emissions by approximately 1 million tonnes of CO2, currently about 40 percent of Versalis’ emissions in Italy”, Eni said in a press release October 24, 2024. “It includes the set-up of new industrial plants consistent with the energy transition and decarbonization of industrial sites across sustainable chemistry, as well as biorefining and energy storage”, Eni added. “To enable the construction of the new plants, activity at the cracking plants in Brindisi and Priolo, and the polyethylene plant in Ragusa, will be phased out”. “Eni aims to significantly reduce Versalis’ exposure to basic chemicals, a sector that is facing structural and irreversible decline in Europe, and which has led

Macquarie Strategists Forecast USA Crude Inventory Drop

Macquarie strategists are forecasting that U.S. crude inventories will be down by 1.3 million barrels for the week ending June 27, an oil and gas report sent to Rigzone by the Macquarie team late Monday revealed. “This follows a 5.8 million barrel draw in the prior week, with the crude balance again realizing significantly tighter than our expectations amidst another week of disappointing net imports,” the strategists stated in the oil and gas report. “For this week’s crude balance, from refineries, we model an increase in crude runs (+0.2 million barrels per day),” the Macquarie strategists added in the report. “Among net imports, we again model a sharp increase, with exports significantly lower (-1.0 million barrels per day) and imports modestly higher (+0.2 million barrels per day) on a nominal basis,” they continued. The strategists highlighted in the report that timing of cargoes remains a source of potential volatility in this week’s crude balance. “From implied domestic supply (prod.+adj.+transfers), we look for a nominal reduction (-0.3 million barrels per day) this week,” the strategists said in the report. “Rounding out the picture, we anticipate another small increase in Strategic Petroleum Reserve (SPR) stocks (+0.3 million barrels) this week,” they added. The Macquarie strategists went on to state in the report that “among products”, they “look for a gasoline build (+2.3 million barrels) with distillate stocks slightly higher (+0.2 million barrels), and jet stocks modestly lower (-0.4 million barrels)”. “We model implied demand for these three products at 14.7 million barrels per day for the week ending June 27,” the strategists went on to state. U.S. commercial crude oil inventories, excluding those in the SPR, dropped by 5.8 million barrels from the week ending June 13 to the week ending June 20, the U.S. Energy Information Administration (EIA) highlighted in its

Uniper to Supply Solar Power to Baden-Wuerttemberg Universities

German utility Uniper SE has signed a deal to supply the state of Baden-Wuerttemberg with 86 gigawatt hours (GWh) of solar electricity per year for three years, or around 258 gigawatt hours. The power is meant for the universities in Ulm, Mannheim, Hohenheim, and Constance. The delivery period will run from January 1, 2026, to December 31, 2028, the company said in a media release. The contract involves a full green supply with flexible guarantees of origin adapted to customer consumption and is based on electricity from Spanish photovoltaic systems that have recently been connected to the grid, Uniper said. The agreement brings the universities one step closer to a sustainable electricity supply. The University of Konstanz already operates its own combined heat and power plants, which cover 50 percent of its electricity requirements. The University of Mannheim has also been sourcing all of its electricity from renewable energies since 2012, Uniper said. “Our customers’ demand for green electricity has increased enormously and we are pleased to be able to meet this demand – always with the aim of harmonizing security of supply and low CO2 energy. In this context, we are supporting the universities in Baden-Wuerttemberg on their way to a better CO2 balance”, Gundolf Schweppe, CCO for Sales at Uniper, said. Uniper and Baden-Wuerttemberg had partnered for energy supply before. In 2021, they signed a gas contract with the University of Stuttgart for January 2023 to December 2026. “Green power contracts have become a key driver in times of energy transition: they guarantee the supply of renewable energy, protect against price fluctuations, promote innovation, and ultimately reduce emissions”, Uniper said. To contact the author, email [email protected] What do you think? We’d love to hear from you, join the conversation on the Rigzone Energy Network. The Rigzone Energy Network is

Petronas Signs Vehicle Fluids Deal with Mahindra

Malaysia’s Petroliam Nasional Berhad (Petronas) has secured a deal with Mahindra & Mahindra Ltd. Petronas Lubricants International (PLI) and Petronas Lubricants (India) Pvt. Ltd. bagged an Aftermarket Service Fill contract from India’s largest sports utility vehicle (SUV) manufacturer. Under the agreement, Petronas Lubricants will be the sole distributor of vehicle fluids to Mahindra’s authorized dealers, workshops, and stockists under the Maximile brand within Mahindra’s South Zone Distribution network in India, Petronas said in a media release. The deal involves engine oils, transmission oils, axle oils, and steering fluids. The deal spans 50 Stock Keeping Units (SKUs), catering to a wide range of passenger cars and SUVs across the region, Petronas said. “This collaboration marks a strategic step for PLIPL, reinforcing its commitment to deliver high-performance, OEM-aligned solutions to India’s rapidly growing automotive market”, Petronas said. “Through decades of engineering expertise, PLIPL’s innovative products will now be accessible to a broader market, enabling more customers to experience its award-winning Fluid Technology Solutions™ while supporting the future of mobility in India with Mahindra as a trusted partner. “With proven capabilities in R&D, manufacturing, and global distribution, PETRONAS Lubricants International is ideally positioned to support Mahindra’s expansive service network and evolving customer expectations”, it added. The agreement was signed by Binu Chandy, Chief Executive Officer of Petronas Lubricants India Pvt. Ltd., and R. Veeraraghavan, Senior Vice President of Strategic Sourcing, Mahindra & Mahindra Ltd., in Mumbai. To contact the author, email [email protected] What do you think? We’d love to hear from you, join the conversation on the Rigzone Energy Network. The Rigzone Energy Network is a new social experience created for you and all energy professionals to Speak Up about our industry, share knowledge, connect with peers and industry insiders and engage in a professional community that will empower your career in energy. MORE FROM THIS

Data center capacity continues to shift to hyperscalers

However, even though colocation and on-premises data centers will continue to lose share, they will still continue to grow. They just won’t be growing as fast as hyperscalers. So, it creates the illusion of shrinkage when it’s actually just slower growth. In fact, after a sustained period of essentially no growth, on-premises data center capacity is receiving a boost thanks to genAI applications and GPU infrastructure. “While most enterprise workloads are gravitating towards cloud providers or to off-premise colo facilities, a substantial subset are staying on-premise, driving a substantial increase in enterprise GPU servers,” said John Dinsdale, a chief analyst at Synergy Research Group.

Oracle inks $30 billion cloud deal, continuing its strong push into AI infrastructure.

He pointed out that, in addition to its continued growth, OCI has a remaining performance obligation (RPO) — total future revenue expected from contracts not yet reported as revenue — of $138 billion, a 41% increase, year over year. The company is benefiting from the immense demand for cloud computing largely driven by AI models. While traditionally an enterprise resource planning (ERP) company, Oracle launched OCI in 2016 and has been strategically investing in AI and data center infrastructure that can support gigawatts of capacity. Notably, it is a partner in the $500 billion SoftBank-backed Stargate project, along with OpenAI, Arm, Microsoft, and Nvidia, that will build out data center infrastructure in the US. Along with that, the company is reportedly spending about $40 billion on Nvidia chips for a massive new data center in Abilene, Texas, that will serve as Stargate’s first location in the country. Further, the company has signaled its plans to significantly increase its investment in Abu Dhabi to grow out its cloud and AI offerings in the UAE; has partnered with IBM to advance agentic AI; has launched more than 50 genAI use cases with Cohere; and is a key provider for ByteDance, which has said it plans to invest $20 billion in global cloud infrastructure this year, notably in Johor, Malaysia. Ellison’s plan: dominate the cloud world CTO and co-founder Larry Ellison announced in a recent earnings call Oracle’s intent to become No. 1 in cloud databases, cloud applications, and the construction and operation of cloud data centers. He said Oracle is uniquely positioned because it has so much enterprise data stored in its databases. He also highlighted the company’s flexible multi-cloud strategy and said that the latest version of its database, Oracle 23ai, is specifically tailored to the needs of AI workloads. Oracle

Datacenter industry calls for investment after EU issues water consumption warning

CISPE’s response to the European Commission’s report warns that the resulting regulatory uncertainty could hurt the region’s economy. “Imposing new, standalone water regulations could increase costs, create regulatory fragmentation, and deter investment. This risks shifting infrastructure outside the EU, undermining both sustainability and sovereignty goals,” CISPE said in its latest policy recommendation, Advancing water resilience through digital innovation and responsible stewardship. “Such regulatory uncertainty could also reduce Europe’s attractiveness for climate-neutral infrastructure investment at a time when other regions offer clear and stable frameworks for green data growth,” it added. CISPE’s recommendations are a mix of regulatory harmonization, increased investment, and technological improvement. Currently, water reuse regulation is directed towards agriculture. Updated regulation across the bloc would encourage more efficient use of water in industrial settings such as datacenters, the asosciation said. At the same time, countries struggling with limited public sector budgets are not investing enough in water infrastructure. This could only be addressed by tapping new investment by encouraging formal public-private partnerships (PPPs), it suggested: “Such a framework would enable the development of sustainable financing models that harness private sector innovation and capital, while ensuring robust public oversight and accountability.” Nevertheless, better water management would also require real-time data gathered through networks of IoT sensors coupled to AI analytics and prediction systems. To that end, cloud datacenters were less a drain on water resources than part of the answer: “A cloud-based approach would allow water utilities and industrial users to centralize data collection, automate operational processes, and leverage machine learning algorithms for improved decision-making,” argued CISPE.

HPE-Juniper deal clears DOJ hurdle, but settlement requires divestitures

In HPE’s press release following the court’s decision, the vendor wrote that “After close, HPE will facilitate limited access to Juniper’s advanced Mist AIOps technology.” In addition, the DOJ stated that the settlement requires HPE to divest its Instant On business and mandates that the merged firm license critical Juniper software to independent competitors. Specifically, HPE must divest its global Instant On campus and branch WLAN business, including all assets, intellectual property, R&D personnel, and customer relationships, to a DOJ-approved buyer within 180 days. Instant On is aimed primarily at the SMB arena and offers a cloud-based package of wired and wireless networking gear that’s designed for so-called out-of-the-box installation and minimal IT involvement, according to HPE. HPE and Juniper focused on the positive in reacting to the settlement. “Our agreement with the DOJ paves the way to close HPE’s acquisition of Juniper Networks and preserves the intended benefits of this deal for our customers and shareholders, while creating greater competition in the global networking market,” HPE CEO Antonio Neri said in a statement. “For the first time, customers will now have a modern network architecture alternative that can best support the demands of AI workloads. The combination of HPE Aruba Networking and Juniper Networks will provide customers with a comprehensive portfolio of secure, AI-native networking solutions, and accelerate HPE’s ability to grow in the AI data center, service provider and cloud segments.” “This marks an exciting step forward in delivering on a critical customer need – a complete portfolio of modern, secure networking solutions to connect their organizations and provide essential foundations for hybrid cloud and AI,” said Juniper Networks CEO Rami Rahim. “We look forward to closing this transaction and turning our shared vision into reality for enterprise, service provider and cloud customers.”

Data center costs surge up to 18% as enterprises face two-year capacity drought

“AI workloads, especially training and archival, can absorb 10-20ms latency variance if offset by 30-40% cost savings and assured uptime,” said Gogia. “Des Moines and Richmond offer better interconnection diversity today than some saturated Tier-1 hubs.” Contract flexibility is also crucial. Rather than traditional long-term leases, enterprises are negotiating shorter agreements with renewal options and exploring revenue-sharing arrangements tied to business performance. Maximizing what you have With expansion becoming more costly, enterprises are getting serious about efficiency through aggressive server consolidation, sophisticated virtualization and AI-driven optimization tools that squeeze more performance from existing space. The companies performing best in this constrained market are focusing on optimization rather than expansion. Some embrace hybrid strategies blending existing on-premises infrastructure with strategic cloud partnerships, reducing dependence on traditional colocation while maintaining control over critical workloads. The long wait When might relief arrive? CBRE’s analysis shows primary markets had a record 6,350 MW under construction at year-end 2024, more than double 2023 levels. However, power capacity constraints are forcing aggressive pre-leasing and extending construction timelines to 2027 and beyond. The implications for enterprises are stark: with construction timelines extending years due to power constraints, companies are essentially locked into current infrastructure for at least the next few years. Those adapting their strategies now will be better positioned when capacity eventually returns.

Cisco backs quantum networking startup Qunnect

In partnership with Deutsche Telekom’s T-Labs, Qunnect has set up quantum networking testbeds in New York City and Berlin. “Qunnect understands that quantum networking has to work in the real world, not just in pristine lab conditions,” Vijoy Pandey, general manager and senior vice president of Outshift by Cisco, stated in a blog about the investment. “Their room-temperature approach aligns with our quantum data center vision.” Cisco recently announced it is developing a quantum entanglement chip that could ultimately become part of the gear that will populate future quantum data centers. The chip operates at room temperature, uses minimal power, and functions using existing telecom frequencies, according to Pandey.

Microsoft will invest $80B in AI data centers in fiscal 2025

And Microsoft isn’t the only one that is ramping up its investments into AI-enabled data centers. Rival cloud service providers are all investing in either upgrading or opening new data centers to capture a larger chunk of business from developers and users of large language models (LLMs). In a report published in October 2024, Bloomberg Intelligence estimated that demand for generative AI would push Microsoft, AWS, Google, Oracle, Meta, and Apple would between them devote $200 billion to capex in 2025, up from $110 billion in 2023. Microsoft is one of the biggest spenders, followed closely by Google and AWS, Bloomberg Intelligence said. Its estimate of Microsoft’s capital spending on AI, at $62.4 billion for calendar 2025, is lower than Smith’s claim that the company will invest $80 billion in the fiscal year to June 30, 2025. Both figures, though, are way higher than Microsoft’s 2020 capital expenditure of “just” $17.6 billion. The majority of the increased spending is tied to cloud services and the expansion of AI infrastructure needed to provide compute capacity for OpenAI workloads. Separately, last October Amazon CEO Andy Jassy said his company planned total capex spend of $75 billion in 2024 and even more in 2025, with much of it going to AWS, its cloud computing division.

John Deere unveils more autonomous farm machines to address skill labor shortage

Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More Self-driving tractors might be the path to self-driving cars. John Deere has revealed a new line of autonomous machines and tech across agriculture, construction and commercial landscaping. The Moline, Illinois-based John Deere has been in business for 187 years, yet it’s been a regular as a non-tech company showing off technology at the big tech trade show in Las Vegas and is back at CES 2025 with more autonomous tractors and other vehicles. This is not something we usually cover, but John Deere has a lot of data that is interesting in the big picture of tech. The message from the company is that there aren’t enough skilled farm laborers to do the work that its customers need. It’s been a challenge for most of the last two decades, said Jahmy Hindman, CTO at John Deere, in a briefing. Much of the tech will come this fall and after that. He noted that the average farmer in the U.S. is over 58 and works 12 to 18 hours a day to grow food for us. And he said the American Farm Bureau Federation estimates there are roughly 2.4 million farm jobs that need to be filled annually; and the agricultural work force continues to shrink. (This is my hint to the anti-immigration crowd). John Deere’s autonomous 9RX Tractor. Farmers can oversee it using an app. While each of these industries experiences their own set of challenges, a commonality across all is skilled labor availability. In construction, about 80% percent of contractors struggle to find skilled labor. And in commercial landscaping, 86% of landscaping business owners can’t find labor to fill open positions, he said. “They have to figure out how to do

2025 playbook for enterprise AI success, from agents to evals

Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More 2025 is poised to be a pivotal year for enterprise AI. The past year has seen rapid innovation, and this year will see the same. This has made it more critical than ever to revisit your AI strategy to stay competitive and create value for your customers. From scaling AI agents to optimizing costs, here are the five critical areas enterprises should prioritize for their AI strategy this year. 1. Agents: the next generation of automation AI agents are no longer theoretical. In 2025, they’re indispensable tools for enterprises looking to streamline operations and enhance customer interactions. Unlike traditional software, agents powered by large language models (LLMs) can make nuanced decisions, navigate complex multi-step tasks, and integrate seamlessly with tools and APIs. At the start of 2024, agents were not ready for prime time, making frustrating mistakes like hallucinating URLs. They started getting better as frontier large language models themselves improved. “Let me put it this way,” said Sam Witteveen, cofounder of Red Dragon, a company that develops agents for companies, and that recently reviewed the 48 agents it built last year. “Interestingly, the ones that we built at the start of the year, a lot of those worked way better at the end of the year just because the models got better.” Witteveen shared this in the video podcast we filmed to discuss these five big trends in detail. Models are getting better and hallucinating less, and they’re also being trained to do agentic tasks. Another feature that the model providers are researching is a way to use the LLM as a judge, and as models get cheaper (something we’ll cover below), companies can use three or more models to

OpenAI’s red teaming innovations define new essentials for security leaders in the AI era

Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More OpenAI has taken a more aggressive approach to red teaming than its AI competitors, demonstrating its security teams’ advanced capabilities in two areas: multi-step reinforcement and external red teaming. OpenAI recently released two papers that set a new competitive standard for improving the quality, reliability and safety of AI models in these two techniques and more. The first paper, “OpenAI’s Approach to External Red Teaming for AI Models and Systems,” reports that specialized teams outside the company have proven effective in uncovering vulnerabilities that might otherwise have made it into a released model because in-house testing techniques may have missed them. In the second paper, “Diverse and Effective Red Teaming with Auto-Generated Rewards and Multi-Step Reinforcement Learning,” OpenAI introduces an automated framework that relies on iterative reinforcement learning to generate a broad spectrum of novel, wide-ranging attacks. Going all-in on red teaming pays practical, competitive dividends It’s encouraging to see competitive intensity in red teaming growing among AI companies. When Anthropic released its AI red team guidelines in June of last year, it joined AI providers including Google, Microsoft, Nvidia, OpenAI, and even the U.S.’s National Institute of Standards and Technology (NIST), which all had released red teaming frameworks. Investing heavily in red teaming yields tangible benefits for security leaders in any organization. OpenAI’s paper on external red teaming provides a detailed analysis of how the company strives to create specialized external teams that include cybersecurity and subject matter experts. The goal is to see if knowledgeable external teams can defeat models’ security perimeters and find gaps in their security, biases and controls that prompt-based testing couldn’t find. What makes OpenAI’s recent papers noteworthy is how well they define using human-in-the-middle