Stay Ahead, Stay ONMINE

Are You Still Using LoRA to Fine-Tune Your LLM?

LoRA (Low Rank Adaptation – arxiv.org/abs/2106.09685) is a popular technique for fine-tuning Large Language Models (LLMs) on the cheap. But 2024 has seen an explosion of new parameter-efficient fine-tuning techniques, an alphabet soup of LoRA alternatives: SVF, SVFT, MiLoRA, PiSSA, LoRA-XS 🤯… And most are based on a matrix technique I like a lot: the SVD (Singular Value Decomposition). Let’s dive in. LoRA The original Lora insight is that fine-tuning all the weights of a model is overkill. Instead, LoRA freezes the model and only trains a small pair of low-rank “adapter” matrices. See the illustrations below (where W is any matrix of weights in a transformer LLM). This saves memory and compute cycles since far fewer gradients have to be computed and stored. For example, here is a Gemma 8B model fine-tuned to speak like a pirate using LoRA: only 22M parameters are trainable, 8.5B parameters remain frozen. LoRA is very popular. It has even made it as a single-line API into mainstream ML frameworks like Keras: gemma.backbone.enable_lora(rank=8) But is LoRA the best? Researchers have been trying hard to improve on the formula. Indeed, there are many ways of selecting smaller “adapter” matrices. And since most of them make clever use of the singular value decomposition (SVD) of a matrix, let’s pause for a bit of Math. SVD: the simple math The SVD is a great tool for understanding the structure of matrices. The technique splits a matrix into three: W = USVT where U and V are orthogonal (i.e., base changes), and S is the diagonal matrix of sorted singular values. This decomposition always exists. In “textbook” SVD, U and V are square, while S is a rectangle with singular values on the diagonal and a tail of zeros. In practice, you can work with a square S and a rectangular U or V – see the picture – the chopped-off pieces are just multiplications by zero. This “economy-sized” SVD is what is used in common libraries, for example, numpy.linalg.svd. So how can we use this to more efficiently select the weights to train? Let’s quickly go through five recent SVD-based low-rank fine-tuning techniques, with commented illustrations. SVF The simplest alternative to LoRA is to use the SVD on the model’s weight matrices and then fine-tune the singular values directly. Oddly, this is the most recent technique, called SVF, published in the Transformers² paper (arxiv.org/abs/2501.06252v2). SVF is much more economical in parameters than LoRA. And as a bonus, it makes tuned models composable. For more info on that, see my Transformers² explainer here, but composing two SVF fine-tuned models is just an addition: SVFT Should you need more trainable parameters, the SVFT paper (arxiv.org/abs/2405.19597) explores multiple ways of doing that, starting by adding more trainable weights on the diagonal. It also evaluates multiple alternatives like spreading them randomly through the “M” matrix. More importantly, the SVFT paper confirms that having more trainable values than just the diagonal is useful. See their fine-tuning results below. Next come several techniques that split singular values in two sets, “large” and “small”. But before we proceed, let’s pause for a bit more SVD math. More SVD math The SVD is usually seen as a decomposition into three matrices W=USVT but it can also be thought of as a weighted sum of many rank-1 matrices, weighted by the singular values: Should you want to prove it, express individual matrix elements Wjk using the USVT form and the formula for matrix multiplication on one hand, theΣ siuiviT form, on the other, simplify using the fact that S is diagonal and notice that it’s the same thing. In this representation, it’s easy to see that you can split the sum in two. And as you can always sort the singular values, you can make this a split between “large” and “small” singular values. Going back to the tree-matrix form W=USVT, this is what the split looks like: Based on this formula, two papers have explored what happens if you tune only the large singular values or only the small ones, PiSSA and MiLoRA. PiSSA PiSSA (Principal Singular values and Singular Vectors Adaptation, arxiv.org/abs/2404.02948) claims that you should only tune the large principal values. The mechanism is illustrated below: From the paper: “PiSSA is designed to approximate full finetuning by adapting the principal singular components, which are believed to capture the essence of the weight matrices. In contrast, MiLoRA aims to adapt to new tasks while maximally retaining the base model’s knowledge.” The PiSSA paper also has an interesting finding: full fine-tuning is prone to over-fitting. You might get better results in the absolute with a low-rank fine-tuning technique. MiLoRA MiLoRA (Minor singular component LoRA arxiv.org/abs/2406.09044), on the other hand, claims that you should only tune the small principal values. It uses a similar mechanism to PiSSA: Surprisingly, MiLoRA seems to have the upper hand, at least when tuning on math datasets which are probably fairly aligned with the original pre-training. Arguably, PiSSA should be better for bending the behavior of the LLM further from its pre-training. LoRA-XS Finally, I’d like to mention LoRA-XS (arxiv.org/abs/2405.17604). Very similar to PiSSA but slightly different mechanism. It also shows good results with significantly fewer params than LoRA. The paper offers a mathematical explanation of why this setup is “ideal’ under two conditions: that truncating the bottom principal values from the SVD still offers a good approximation of the weights matrices that the fine-tuning data distribution is close to the pre-training one Both are questionable IMHO, so I won’t detail the math. Some results: The underlying assumption seems to be that singular values come in “large” and “small” varieties but is it true? I made a quick Colab to check this on Gemma2 9B. Bottom line: 99% of the singular values are in the 0.1 – 1.1 range.  I’m not sure partitioning them into “large” and “small” makes that much sense. Conclusion There are many more parameter-efficient fine-tuning techniques. Worth mentioning: My conclusion: to go beyond the LoRA standard with 10x fewer params, I like the simplicity of Transformers²’s SVF. And if you need more trainable weights, SVFT is an easy extension. Both use all singular values (full rank, no singular value pruning) and are still cheap 😁. Happy tuning! Note: All illustrations are either created by the author or extracted from arxiv.org papers for comment and discussion purposes.

LoRA (Low Rank Adaptation – arxiv.org/abs/2106.09685) is a popular technique for fine-tuning Large Language Models (LLMs) on the cheap. But 2024 has seen an explosion of new parameter-efficient fine-tuning techniques, an alphabet soup of LoRA alternatives: SVF, SVFT, MiLoRA, PiSSA, LoRA-XS 🤯… And most are based on a matrix technique I like a lot: the SVD (Singular Value Decomposition). Let’s dive in.

LoRA

The original Lora insight is that fine-tuning all the weights of a model is overkill. Instead, LoRA freezes the model and only trains a small pair of low-rank “adapter” matrices. See the illustrations below (where W is any matrix of weights in a transformer LLM).

This saves memory and compute cycles since far fewer gradients have to be computed and stored. For example, here is a Gemma 8B model fine-tuned to speak like a pirate using LoRA: only 22M parameters are trainable, 8.5B parameters remain frozen.

LoRA is very popular. It has even made it as a single-line API into mainstream ML frameworks like Keras:

gemma.backbone.enable_lora(rank=8)

But is LoRA the best? Researchers have been trying hard to improve on the formula. Indeed, there are many ways of selecting smaller “adapter” matrices. And since most of them make clever use of the singular value decomposition (SVD) of a matrix, let’s pause for a bit of Math.

SVD: the simple math

The SVD is a great tool for understanding the structure of matrices. The technique splits a matrix into three: W = USVT where U and V are orthogonal (i.e., base changes), and S is the diagonal matrix of sorted singular values. This decomposition always exists.

In “textbook” SVD, U and V are square, while S is a rectangle with singular values on the diagonal and a tail of zeros. In practice, you can work with a square S and a rectangular U or V – see the picture – the chopped-off pieces are just multiplications by zero. This “economy-sized” SVD is what is used in common libraries, for example, numpy.linalg.svd.

So how can we use this to more efficiently select the weights to train? Let’s quickly go through five recent SVD-based low-rank fine-tuning techniques, with commented illustrations.

SVF

The simplest alternative to LoRA is to use the SVD on the model’s weight matrices and then fine-tune the singular values directly. Oddly, this is the most recent technique, called SVF, published in the Transformers² paper (arxiv.org/abs/2501.06252v2).

SVF is much more economical in parameters than LoRA. And as a bonus, it makes tuned models composable. For more info on that, see my Transformers² explainer here, but composing two SVF fine-tuned models is just an addition:

SVFT

Should you need more trainable parameters, the SVFT paper (arxiv.org/abs/2405.19597) explores multiple ways of doing that, starting by adding more trainable weights on the diagonal.

It also evaluates multiple alternatives like spreading them randomly through the “M” matrix.

More importantly, the SVFT paper confirms that having more trainable values than just the diagonal is useful. See their fine-tuning results below.

Next come several techniques that split singular values in two sets, “large” and “small”. But before we proceed, let’s pause for a bit more SVD math.

More SVD math

The SVD is usually seen as a decomposition into three matrices W=USVT but it can also be thought of as a weighted sum of many rank-1 matrices, weighted by the singular values:

Should you want to prove it, express individual matrix elements Wjk using the USVT form and the formula for matrix multiplication on one hand, the
Σ siuiviT form, on the other, simplify using the fact that S is diagonal and notice that it’s the same thing.

In this representation, it’s easy to see that you can split the sum in two. And as you can always sort the singular values, you can make this a split between “large” and “small” singular values.

Going back to the tree-matrix form W=USVT, this is what the split looks like:

Based on this formula, two papers have explored what happens if you tune only the large singular values or only the small ones, PiSSA and MiLoRA.

PiSSA

PiSSA (Principal Singular values and Singular Vectors Adaptation, arxiv.org/abs/2404.02948) claims that you should only tune the large principal values. The mechanism is illustrated below:

From the paper: “PiSSA is designed to approximate full finetuning by adapting the principal singular components, which are believed to capture the essence of the weight matrices. In contrast, MiLoRA aims to adapt to new tasks while maximally retaining the base model’s knowledge.”

The PiSSA paper also has an interesting finding: full fine-tuning is prone to over-fitting. You might get better results in the absolute with a low-rank fine-tuning technique.

MiLoRA

MiLoRA (Minor singular component LoRA arxiv.org/abs/2406.09044), on the other hand, claims that you should only tune the small principal values. It uses a similar mechanism to PiSSA:

Surprisingly, MiLoRA seems to have the upper hand, at least when tuning on math datasets which are probably fairly aligned with the original pre-training. Arguably, PiSSA should be better for bending the behavior of the LLM further from its pre-training.

LoRA-XS

Finally, I’d like to mention LoRA-XS (arxiv.org/abs/2405.17604). Very similar to PiSSA but slightly different mechanism. It also shows good results with significantly fewer params than LoRA.

The paper offers a mathematical explanation of why this setup is “ideal’ under two conditions:

  • that truncating the bottom principal values from the SVD still offers a good approximation of the weights matrices
  • that the fine-tuning data distribution is close to the pre-training one

Both are questionable IMHO, so I won’t detail the math. Some results:

The underlying assumption seems to be that singular values come in “large” and “small” varieties but is it true? I made a quick Colab to check this on Gemma2 9B. Bottom line: 99% of the singular values are in the 0.1 – 1.1 range.  I’m not sure partitioning them into “large” and “small” makes that much sense.

Conclusion

There are many more parameter-efficient fine-tuning techniques. Worth mentioning:

My conclusion: to go beyond the LoRA standard with 10x fewer params, I like the simplicity of Transformers²’s SVF. And if you need more trainable weights, SVFT is an easy extension. Both use all singular values (full rank, no singular value pruning) and are still cheap 😁. Happy tuning!

Note: All illustrations are either created by the author or extracted from arxiv.org papers for comment and discussion purposes.

Shape
Shape
Stay Ahead

Explore More Insights

Stay ahead with more perspectives on cutting-edge power, infrastructure, energy,  bitcoin and AI solutions. Explore these articles to uncover strategies and insights shaping the future of industries.

Shape

US lets China buy semiconductor design software again

The reversal marks a dramatic shift from the aggressive stance the Trump administration took in May, when it imposed sweeping restrictions on electronic design automation (EDA) software — the critical tools needed to design advanced semiconductors.  A short-lived stoppage  The restrictions had targeted what analysts called the “upstream” of chip

Read More »

Hardcoded root credentials in Cisco Unified CM trigger max-severity alert

The affected products-Cisco Unified CM and Unified CM SME–are core components of enterprise telephony infrastructure, widely deployed across government agencies, financial institutions, and large corporations to manage voice, video, and messaging at scale. A flaw in these systems could allow attackers to compromise an organization’s communications, letting them log in

Read More »

House passes Senate version of megabill, sending it to Trump’s desk

The Republican budget megabill, which makes steep cuts to the Inflation Reduction Act’s clean energy tax credits, now heads to President Donald Trump’s desk after passing both houses of Congress. The House passed the Senate’s version of the bill 218-214 on Thursday, after Republicans debated through the night and House Minority Leader Hakeem Jeffries, D-N.Y., used the leadership prerogative of a “magic minute” to speak in opposition to the bill for a record-breaking 8 hours and 44 minutes. Jeffries at one point referenced a June letter 13 House Republicans sent to the Senate, in which the Republicans stated they were “proud to have worked to ensure that the bill did not include a full repeal of the clean energy tax credits, [but] remain deeply concerned by several provisions” cutting back those credits the House version did include. “Every single one of these signatories voted for a House bill that undermines the clean energy economy, noted that it would hurt their own constituents, voted for the bill anyway, then begged the Senate to make a difference,” Jeffries said. “That is not how we should be legislating in the United States House of Representatives … It limped out of the House by a single vote, so every single signatory on this letter could have stopped the bill.” The bill restricts the ability of projects to qualify for the tech-neutral clean electricity 45Y production tax credit and 48E investment tax credit, shortens the timeline for those credits, and ends the 25D residential solar credit after this year. The 25E, 30D, 30C and 45W electric vehicle credits will terminate after Sept. 30. While clean energy advocates and congressional Democrats maintain that the final version of the bill goes too far in slashing IRA credits, some Republicans wanted to see more significant cuts.  Before the House

Read More »

Airloom Energy to pilot novel wind power tech at Wyoming site

Dive Brief: Airloom Energy has broken ground on a utility-scale pilot of the novel wind power technology that it says offers better energy density and siting flexibility at significantly lower capital cost. The Bill Gates-backed startup says the southeastern Wyoming project will validate its low-slung turbine design, which captures wind energy using 30-foot, vertically oriented airfoils that move around an ovular trackway about 80 feet above the ground. Airloom plans to begin generating power at the site near Rock River later this year, and begin commercial-scale demonstrations in 2027, according to a timeline posted on its website. Dive Insight: The Wyoming Energy Authority awarded Airloom $5 million in November to “design, build and test a 1 MW demonstration device that validates the company’s innovative, low-profile design.” That description matches what Airloom said about the pilot project in its announcement last week, but Airloom CEO Neal Rickner told TechCrunch on June 25 that the pilot would generate about 150 kW of electricity while using the same parts as future megawatt-scale turbines. Airloom did not respond to requests for comment. The key difference between the pilot- and commercial-scale plants, Rickner told TechCrunch, is the physical footprint: The straightaways run about 100 meters on the former and 500 meters on the latter. Despite the relatively large area inside the trackway, Airloom says its technology offers higher energy densities than traditional wind, which the U.S. Department of Agriculture says requires about 10 times less land per megawatt than solar PV. Whereas a traditional wind turbine has a circular swept area, Airloom’s design sweeps a rectangular area, capturing more wind and conserving more energy, it says. Airloom touts the undisturbed land inside the trackway as fit for complementary uses like livestock grazing, crop production or solar arrays. The design has other advantages over traditional turbines,

Read More »

Trinidad Aims to Boost Gas Output with BP, Woodside Projects

Trinidad and Tobago is working to reverse a major slump in gas production and revive a key export industry in the Caribbean nation to boost its economic growth. Energy Minister Roodal Moonilal sees strong potential in the country’s offshore waters at a time when auditors said more gas output must be added to satisfy demand. During the opening of the Society of Petroleum Engineers 2025 Mature Basin Energy Symposium Tuesday, Moonilal said Woodside Energy’s Calypso offshore project is expected to add 700 million standard cubic feet per day to domestic gas production.  “The current and upcoming gas projects will, however, only provide relief in the short to medium term,” he added. Calypso, with reserves of 3.5 trillion cubic feet, is “one of the main projects we would be seeking to accelerate,” Moonilal told Bloomberg in an interview on June 27.  Trinidad and Tobago has seen a 33 percent drop in gas output since 2015. The need to ramp up production comes as the country aims to fill a supply gap and deals with a crippling shortage of foreign exchange.  Natural gas reserves auditors DeGoyler and MacNaughton estimated in its latest report that 58.25 trillion cubic feet of prospective resources were offshore at the end of 2023. The country’s gas production has been declining, and it produced 3.8 billion cubic feet in 2015, Moonilal said. Trinidad and Tobago produced around 2.5 billion cubic feet in the first quarter of 2025, 46 percent of which is exported as liquefied natural gas, the energy minister told Bloomberg Wednesday. The auditors advised the country to fast track gas projects and exploration efforts to “convert prospective resources into reserves and contingent resources, to meet gas demand,” he added. Several projects from BP’s subsidiary in Trinidad and Tobago are planned with the oil and gas major set to

Read More »

Mozambique President Urges Total LNG Restart Despite Risks

Mozambican President Daniel Chapo said his government and private companies will have to collectively ensure the necessary security is in place to enable TotalEnergies SE to restart construction of a $20 billion gas project that has stalled due to a militant insurgency — and even then risks will remain. The project in the northern Cabo Delgado province along with others that are at earlier stages of development are seen as crucial to the future of the southern African nation, which ranks among the world’s poorest. The French oil major halted work, evacuated workers and declared force majeure in 2021 following an escalation in attacks by Islamic State-linked militants. “Regarding security, it’s relatively stable compared to this past four years, but continuity of this stability doesn’t depend only on the government of the republic but on all partners in that area,” Chapo said in an interview with Bloomberg in Seville, Spain, on Tuesday. “If we’re waiting for Cabo Delgado to be a heaven, we won’t lift force majeure.”  Total’s facility, which will take another four years to complete, will liquefy and export the extensive gas reserves off northeast Mozambique that were discovered 15 years ago. Since then, only one floating plant operated by Eni SpA has come on line. A final investment decision on a third planned venture, Exxon Mobil Corp.’s $27 billion Rovuma LNG project, is expected next year. Mozambique called on nations in the region to help secure Cabo Delgado after its attempts to use mercenaries to halt the violence failed.  Rwandan troops, brought in months after the attacks that led to Total’s evacuation, provided an effective response. The European Council gave the Rwanda Defence Force additional funding in November to help fight the insurgency, but the terms of its deployment remain uncertain. “How long will depend a lot on security

Read More »

USA Crude Oil Inventories Rise by Almost 4MM Barrels WoW

U.S. commercial crude oil inventories, excluding those in the Strategic Petroleum Reserve (SPR), increased by 3.8 million barrels from the week ending June 20 to the week ending June 27, the U.S. Energy Information Administration (EIA) highlighted in its latest weekly petroleum status report. That report was released on July 2 and included data for the week ending June 27. It showed that crude oil stocks, not including the SPR, stood at 419.0 million barrels on June 27, 415.1 million barrels on June 20, and 448.5 million barrels on June 28, 2024. The EIA report highlighted that data may not add up to totals due to independent rounding. Crude oil in the SPR stood at 402.8 million barrels on June 27, 402.5 million barrels on June 20, and 372.6 million barrels on June 28, 2024, the report revealed. Total petroleum stocks – including crude oil, total motor gasoline, fuel ethanol, kerosene type jet fuel, distillate fuel oil, residual fuel oil, propane/propylene, and other oils – stood at 1.642 billion barrels on June 27, the EIA report pointed out. Total petroleum stocks were up 9.6 million barrels week on week and down 12.8 million barrels year on year, the report showed. “At 419 million barrels, U.S. crude oil inventories are about nine percent below the five year average for this time of year,” the EIA said in its latest weekly petroleum status report. “Total motor gasoline inventories increased by 4.2 million barrels from last week and are about one percent below the five year average for this time of year. Both Finished gasoline inventories and blending components inventories increased last week,” it added. “Distillate fuel inventories decreased by 1.7 million barrels last week and are about 21 percent below the five year average for this time of year. Propane/propylene inventories increased

Read More »

TotalEnergies Raises Stake in AES Renewables Portfolio in Caribbean

TotalEnergies SE said Wednesday it had completed the purchase of a 50 percent stake in AES Corp.’s renewable energy and battery energy storage system (BESS) portfolio of over one gigawatt in the Dominican Republic. Last year the French energy giant acquired a 30 percent stake in Arlington, Virginia-based AES’ renewables portfolio in Puerto Rico. “The combined portfolio now exceeds 1.5 GW of renewable energy and BESS capacity across the Caribbean”, TotalEnergies said in an online statement. The Dominican portfolio includes wind, solar and BESS projects, 410 megawatts of which are operational or under construction. Under-development wind and solar total over 500 MW. The BESS projects will be integrated into solar plants to mitigate intermittency, according to TotalEnergies. The company already owns a 103-MW solar plant under construction and a partially solarized network of 184 service stations, as well as operates natural gas distribution, in the Dominican Republic. The Puerto Rican portfolio includes 200 MW of solar and 285 MW/1,140 MW hours of BESS projects under construction. “TotalEnergies is pursuing deployment of its multi-energy strategy on the island, where it is already active in the fuel, lubricants, and aviation sectors, and operates a network of 200 service stations between Puerto Rico and the island of St Thomas”, the company said. Stephane Michel, TotalEnergies president for gas, renewables and power, said, “We are pleased to expand our multi-energy strategy through this partnership with AES, focusing on renewables and battery storage in a region where TotalEnergies is already a leading supplier of LNG, notably for power generation. Since 2018, we have been supplying LNG to AES’s subsidiaries in Panama and the Dominican Republic”. On April 15, 2025, TotalEnergies announced a heads of agreement with Energia Natural Dominicana (EnaDom), the joint venture between AES and Energas in the Dominican Republic, for the supply of 400,000 metric tons a year of LNG for 15 years from 2027. The

Read More »

Oracle to power OpenAI’s AGI ambitions with 4.5GW expansion

“For CIOs, this shift means more competition for AI infrastructure. Over the next 12–24 months, securing capacity for AI workloads will likely get harder, not easier. Though cost is coming down but demand is increasing as well, due to which CIOs must plan earlier and build stronger partnerships to ensure availability,” said Pareekh Jain, CEO at EIIRTrend & Pareekh Consulting. He added that CIOs should expect longer wait times for AI infrastructure. To mitigate this, they should lock in capacity through reserved instances, diversify across regions and cloud providers, and work with vendors to align on long-term demand forecasts.  “Enterprises stand to benefit from more efficient and cost-effective AI infrastructure tailored to specialized AI workloads, significantly lower their overall future AI-related investments and expenses. Consequently, CIOs face a critical task: to analyze and predict the diverse AI workloads that will prevail across their organizations, business units, functions, and employee personas in the future. This foresight will be crucial in prioritizing and optimizing AI workloads for either in-house deployment or outsourced infrastructure, ensuring strategic and efficient resource allocation,” said Neil Shah, vice president at Counterpoint Research. Strategic pivot toward AI data centers The OpenAI-Oracle deal comes in stark contrast to developments earlier this year. In April, AWS was reported to be scaling back its plans for leasing new colocation capacity — a move that AWS Vice President for global data centers Kevin Miller described as routine capacity management, not a shift in long-term expansion plans. Still, these announcements raised questions around whether the hyperscale data center boom was beginning to plateau. “This isn’t a slowdown, it’s a strategic pivot. The era of building generic data center capacity is over. The new global imperative is a race for specialized, high-density, AI-ready compute. Hyperscalers are not slowing down; they are reallocating their capital to

Read More »

Arista Buys VeloCloud to reboot SD-WANs amid AI infrastructure shift

What this doesn’t answer is how Arista Networks plans to add newer, security-oriented Secure Access Service Edge (SASE) capabilities to VeloCloud’s older SD-WAN technology. Post-acquisition, it still has only some of the building blocks necessary to achieve this. Mapping AI However, in 2025 there is always more going on with networking acquisitions than simply adding another brick to the wall, and in this case it’s the way AI is changing data flows across networks. “In the new AI era, the concepts of what comprises a user and a site in a WAN have changed fundamentally. The introduction of agentic AI even changes what might be considered a user,” wrote Arista Networks CEO, Jayshree Ullal, in a blog highlighting AI’s effect on WAN architectures. “In addition to people accessing data on demand, new AI agents will be deployed to access data independently, adapting over time to solve problems and enhance user productivity,” she said. Specifically, WANs needed modernization to cope with the effect AI traffic flows are having on data center traffic. Sanjay Uppal, now VP and general manager of the new VeloCloud Division at Arista Networks, elaborated. “The next step in SD-WAN is to identify, secure and optimize agentic AI traffic across that distributed enterprise, this time from all end points across to branches, campus sites, and the different data center locations, both public and private,” he wrote. “The best way to grab this opportunity was in partnership with a networking systems leader, as customers were increasingly looking for a comprehensive solution from LAN/Campus across the WAN to the data center.”

Read More »

Data center capacity continues to shift to hyperscalers

However, even though colocation and on-premises data centers will continue to lose share, they will still continue to grow. They just won’t be growing as fast as hyperscalers. So, it creates the illusion of shrinkage when it’s actually just slower growth. In fact, after a sustained period of essentially no growth, on-premises data center capacity is receiving a boost thanks to genAI applications and GPU infrastructure. “While most enterprise workloads are gravitating towards cloud providers or to off-premise colo facilities, a substantial subset are staying on-premise, driving a substantial increase in enterprise GPU servers,” said John Dinsdale, a chief analyst at Synergy Research Group.

Read More »

Oracle inks $30 billion cloud deal, continuing its strong push into AI infrastructure.

He pointed out that, in addition to its continued growth, OCI has a remaining performance obligation (RPO) — total future revenue expected from contracts not yet reported as revenue — of $138 billion, a 41% increase, year over year. The company is benefiting from the immense demand for cloud computing largely driven by AI models. While traditionally an enterprise resource planning (ERP) company, Oracle launched OCI in 2016 and has been strategically investing in AI and data center infrastructure that can support gigawatts of capacity. Notably, it is a partner in the $500 billion SoftBank-backed Stargate project, along with OpenAI, Arm, Microsoft, and Nvidia, that will build out data center infrastructure in the US. Along with that, the company is reportedly spending about $40 billion on Nvidia chips for a massive new data center in Abilene, Texas, that will serve as Stargate’s first location in the country. Further, the company has signaled its plans to significantly increase its investment in Abu Dhabi to grow out its cloud and AI offerings in the UAE; has partnered with IBM to advance agentic AI; has launched more than 50 genAI use cases with Cohere; and is a key provider for ByteDance, which has said it plans to invest $20 billion in global cloud infrastructure this year, notably in Johor, Malaysia. Ellison’s plan: dominate the cloud world CTO and co-founder Larry Ellison announced in a recent earnings call Oracle’s intent to become No. 1 in cloud databases, cloud applications, and the construction and operation of cloud data centers. He said Oracle is uniquely positioned because it has so much enterprise data stored in its databases. He also highlighted the company’s flexible multi-cloud strategy and said that the latest version of its database, Oracle 23ai, is specifically tailored to the needs of AI workloads. Oracle

Read More »

Datacenter industry calls for investment after EU issues water consumption warning

CISPE’s response to the European Commission’s report warns that the resulting regulatory uncertainty could hurt the region’s economy. “Imposing new, standalone water regulations could increase costs, create regulatory fragmentation, and deter investment. This risks shifting infrastructure outside the EU, undermining both sustainability and sovereignty goals,” CISPE said in its latest policy recommendation, Advancing water resilience through digital innovation and responsible stewardship. “Such regulatory uncertainty could also reduce Europe’s attractiveness for climate-neutral infrastructure investment at a time when other regions offer clear and stable frameworks for green data growth,” it added. CISPE’s recommendations are a mix of regulatory harmonization, increased investment, and technological improvement. Currently, water reuse regulation is directed towards agriculture. Updated regulation across the bloc would encourage more efficient use of water in industrial settings such as datacenters, the asosciation said. At the same time, countries struggling with limited public sector budgets are not investing enough in water infrastructure. This could only be addressed by tapping new investment by encouraging formal public-private partnerships (PPPs), it suggested: “Such a framework would enable the development of sustainable financing models that harness private sector innovation and capital, while ensuring robust public oversight and accountability.” Nevertheless, better water management would also require real-time data gathered through networks of IoT sensors coupled to AI analytics and prediction systems. To that end, cloud datacenters were less a drain on water resources than part of the answer: “A cloud-based approach would allow water utilities and industrial users to centralize data collection, automate operational processes, and leverage machine learning algorithms for improved decision-making,” argued CISPE.

Read More »

HPE-Juniper deal clears DOJ hurdle, but settlement requires divestitures

In HPE’s press release following the court’s decision, the vendor wrote that “After close, HPE will facilitate limited access to Juniper’s advanced Mist AIOps technology.” In addition, the DOJ stated that the settlement requires HPE to divest its Instant On business and mandates that the merged firm license critical Juniper software to independent competitors. Specifically, HPE must divest its global Instant On campus and branch WLAN business, including all assets, intellectual property, R&D personnel, and customer relationships, to a DOJ-approved buyer within 180 days. Instant On is aimed primarily at the SMB arena and offers a cloud-based package of wired and wireless networking gear that’s designed for so-called out-of-the-box installation and minimal IT involvement, according to HPE. HPE and Juniper focused on the positive in reacting to the settlement. “Our agreement with the DOJ paves the way to close HPE’s acquisition of Juniper Networks and preserves the intended benefits of this deal for our customers and shareholders, while creating greater competition in the global networking market,” HPE CEO Antonio Neri said in a statement. “For the first time, customers will now have a modern network architecture alternative that can best support the demands of AI workloads. The combination of HPE Aruba Networking and Juniper Networks will provide customers with a comprehensive portfolio of secure, AI-native networking solutions, and accelerate HPE’s ability to grow in the AI data center, service provider and cloud segments.” “This marks an exciting step forward in delivering on a critical customer need – a complete portfolio of modern, secure networking solutions to connect their organizations and provide essential foundations for hybrid cloud and AI,” said Juniper Networks CEO Rami Rahim. “We look forward to closing this transaction and turning our shared vision into reality for enterprise, service provider and cloud customers.”

Read More »

Microsoft will invest $80B in AI data centers in fiscal 2025

And Microsoft isn’t the only one that is ramping up its investments into AI-enabled data centers. Rival cloud service providers are all investing in either upgrading or opening new data centers to capture a larger chunk of business from developers and users of large language models (LLMs).  In a report published in October 2024, Bloomberg Intelligence estimated that demand for generative AI would push Microsoft, AWS, Google, Oracle, Meta, and Apple would between them devote $200 billion to capex in 2025, up from $110 billion in 2023. Microsoft is one of the biggest spenders, followed closely by Google and AWS, Bloomberg Intelligence said. Its estimate of Microsoft’s capital spending on AI, at $62.4 billion for calendar 2025, is lower than Smith’s claim that the company will invest $80 billion in the fiscal year to June 30, 2025. Both figures, though, are way higher than Microsoft’s 2020 capital expenditure of “just” $17.6 billion. The majority of the increased spending is tied to cloud services and the expansion of AI infrastructure needed to provide compute capacity for OpenAI workloads. Separately, last October Amazon CEO Andy Jassy said his company planned total capex spend of $75 billion in 2024 and even more in 2025, with much of it going to AWS, its cloud computing division.

Read More »

John Deere unveils more autonomous farm machines to address skill labor shortage

Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More Self-driving tractors might be the path to self-driving cars. John Deere has revealed a new line of autonomous machines and tech across agriculture, construction and commercial landscaping. The Moline, Illinois-based John Deere has been in business for 187 years, yet it’s been a regular as a non-tech company showing off technology at the big tech trade show in Las Vegas and is back at CES 2025 with more autonomous tractors and other vehicles. This is not something we usually cover, but John Deere has a lot of data that is interesting in the big picture of tech. The message from the company is that there aren’t enough skilled farm laborers to do the work that its customers need. It’s been a challenge for most of the last two decades, said Jahmy Hindman, CTO at John Deere, in a briefing. Much of the tech will come this fall and after that. He noted that the average farmer in the U.S. is over 58 and works 12 to 18 hours a day to grow food for us. And he said the American Farm Bureau Federation estimates there are roughly 2.4 million farm jobs that need to be filled annually; and the agricultural work force continues to shrink. (This is my hint to the anti-immigration crowd). John Deere’s autonomous 9RX Tractor. Farmers can oversee it using an app. While each of these industries experiences their own set of challenges, a commonality across all is skilled labor availability. In construction, about 80% percent of contractors struggle to find skilled labor. And in commercial landscaping, 86% of landscaping business owners can’t find labor to fill open positions, he said. “They have to figure out how to do

Read More »

2025 playbook for enterprise AI success, from agents to evals

Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More 2025 is poised to be a pivotal year for enterprise AI. The past year has seen rapid innovation, and this year will see the same. This has made it more critical than ever to revisit your AI strategy to stay competitive and create value for your customers. From scaling AI agents to optimizing costs, here are the five critical areas enterprises should prioritize for their AI strategy this year. 1. Agents: the next generation of automation AI agents are no longer theoretical. In 2025, they’re indispensable tools for enterprises looking to streamline operations and enhance customer interactions. Unlike traditional software, agents powered by large language models (LLMs) can make nuanced decisions, navigate complex multi-step tasks, and integrate seamlessly with tools and APIs. At the start of 2024, agents were not ready for prime time, making frustrating mistakes like hallucinating URLs. They started getting better as frontier large language models themselves improved. “Let me put it this way,” said Sam Witteveen, cofounder of Red Dragon, a company that develops agents for companies, and that recently reviewed the 48 agents it built last year. “Interestingly, the ones that we built at the start of the year, a lot of those worked way better at the end of the year just because the models got better.” Witteveen shared this in the video podcast we filmed to discuss these five big trends in detail. Models are getting better and hallucinating less, and they’re also being trained to do agentic tasks. Another feature that the model providers are researching is a way to use the LLM as a judge, and as models get cheaper (something we’ll cover below), companies can use three or more models to

Read More »

OpenAI’s red teaming innovations define new essentials for security leaders in the AI era

Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More OpenAI has taken a more aggressive approach to red teaming than its AI competitors, demonstrating its security teams’ advanced capabilities in two areas: multi-step reinforcement and external red teaming. OpenAI recently released two papers that set a new competitive standard for improving the quality, reliability and safety of AI models in these two techniques and more. The first paper, “OpenAI’s Approach to External Red Teaming for AI Models and Systems,” reports that specialized teams outside the company have proven effective in uncovering vulnerabilities that might otherwise have made it into a released model because in-house testing techniques may have missed them. In the second paper, “Diverse and Effective Red Teaming with Auto-Generated Rewards and Multi-Step Reinforcement Learning,” OpenAI introduces an automated framework that relies on iterative reinforcement learning to generate a broad spectrum of novel, wide-ranging attacks. Going all-in on red teaming pays practical, competitive dividends It’s encouraging to see competitive intensity in red teaming growing among AI companies. When Anthropic released its AI red team guidelines in June of last year, it joined AI providers including Google, Microsoft, Nvidia, OpenAI, and even the U.S.’s National Institute of Standards and Technology (NIST), which all had released red teaming frameworks. Investing heavily in red teaming yields tangible benefits for security leaders in any organization. OpenAI’s paper on external red teaming provides a detailed analysis of how the company strives to create specialized external teams that include cybersecurity and subject matter experts. The goal is to see if knowledgeable external teams can defeat models’ security perimeters and find gaps in their security, biases and controls that prompt-based testing couldn’t find. What makes OpenAI’s recent papers noteworthy is how well they define using human-in-the-middle

Read More »