Are You Still Using LoRA to Fine-Tune Your LLM?


LoRA (Low Rank Adaptation – arxiv.org/abs/2106.09685) is a popular technique for fine-tuning Large Language Models (LLMs) on the cheap. But 2024 has seen an explosion of new parameter-efficient fine-tuning techniques, an alphabet soup of LoRA alternatives: SVF, SVFT, MiLoRA, PiSSA, LoRA-XS 🤯… And most are based on a matrix technique I like a lot: the SVD (Singular Value Decomposition). Let’s dive in.

LoRA

The original LoRA insight is that fine-tuning all the weights of a model is overkill. Instead, LoRA freezes the model and only trains a small pair of low-rank “adapter” matrices. See the illustrations below (where W is any matrix of weights in a transformer LLM).

This saves memory and compute cycles since far fewer gradients have to be computed and stored. For example, here is a Gemma 8B model fine-tuned to speak like a pirate using LoRA: only 22M parameters are trainable, 8.5B parameters remain frozen.
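
For intuition, here is a minimal numpy sketch of the LoRA setup (illustrative only, with made-up layer sizes; real implementations work per layer inside the training framework):

import numpy as np

d_out, d_in, rank = 4096, 4096, 8        # hypothetical layer size and adapter rank

W = np.random.randn(d_out, d_in)         # frozen pre-trained weights (never updated)
A = np.random.randn(rank, d_in) * 0.01   # trainable “down” adapter, small random init
B = np.zeros((d_out, rank))              # trainable “up” adapter, zero init so the delta starts at 0

W_eff = W + B @ A                        # weights actually used in the forward pass

print(W.size, "frozen params,", A.size + B.size, "trainable params")
# 16777216 frozen params, 65536 trainable params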

LoRA is very popular. It has even made it as a single-line API into mainstream ML frameworks like Keras:

gemma.backbone.enable_lora(rank=8)
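
In context, a minimal KerasNLP-style fine-tuning sketch might look like this (the preset name and the toy pirate dataset are placeholders, not a verified recipe):

import keras_nlp

# Load a pre-trained Gemma model (preset name is illustrative).
gemma = keras_nlp.models.GemmaCausalLM.from_preset("gemma_2b_en")

# Freeze the backbone and attach rank-8 LoRA adapters.
gemma.backbone.enable_lora(rank=8)
gemma.summary()   # shows trainable vs. non-trainable parameter counts

# Toy pirate-speak examples, purely illustrative.
pirate_data = [
    "Instruction: say hello\nResponse: Ahoy, matey!",
    "Instruction: say goodbye\nResponse: Fair winds to ye!",
]
gemma.fit(pirate_data, epochs=1, batch_size=1)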

But is LoRA the best? Researchers have been trying hard to improve on the formula. Indeed, there are many ways of selecting smaller “adapter” matrices. And since most of them make clever use of the singular value decomposition (SVD) of a matrix, let’s pause for a bit of Math.

SVD: the simple math

The SVD is a great tool for understanding the structure of matrices. The technique splits a matrix into three: W = USVᵀ, where U and V are orthogonal (i.e., changes of basis) and S is the diagonal matrix of sorted singular values. This decomposition always exists.

In “textbook” SVD, U and V are square, while S is a rectangle with singular values on the diagonal and a tail of zeros. In practice, you can work with a square S and a rectangular U or V – see the picture – the chopped-off pieces are just multiplications by zero. This “economy-sized” SVD is what is used in common libraries, for example, numpy.linalg.svd.
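
A quick numpy check of those shapes (toy matrix; numpy’s full_matrices=False gives the economy-sized form):

import numpy as np

W = np.random.randn(6, 4)                         # stand-in for a weight matrix

U, s, Vt = np.linalg.svd(W, full_matrices=False)  # “economy-sized” SVD
print(U.shape, s.shape, Vt.shape)                 # (6, 4) (4,) (4, 4)

print(np.allclose(W, U @ np.diag(s) @ Vt))        # True: U @ diag(s) @ Vt reconstructs W
print(np.all(s[:-1] >= s[1:]))                    # True: singular values come sorted, largest first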

So how can we use this to more efficiently select the weights to train? Let’s quickly go through five recent SVD-based low-rank fine-tuning techniques, with commented illustrations.

SVF

The simplest alternative to LoRA is to use the SVD on the model’s weight matrices and then fine-tune the singular values directly. Oddly, this is the most recent technique, called SVF, published in the Transformers² paper (arxiv.org/abs/2501.06252v2).

SVF is much more economical in parameters than LoRA. And as a bonus, it makes tuned models composable. For more info on that, see my Transformers² explainer here, but composing two SVF fine-tuned models is just an addition:
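
Here is a minimal numpy sketch of that idea, as I understand it (illustrative, not the Transformers² code): the SVD factors are frozen, only the vector of singular values is trained, and merging two fine-tunes reduces to adding their singular-value updates.

import numpy as np

W = np.random.randn(512, 512)                     # frozen pre-trained weights
U, s, Vt = np.linalg.svd(W, full_matrices=False)  # U and Vt stay frozen too

s_tuned = s.copy()                # the only trainable parameters: 512 values vs. 512*512 in W
W_tuned = U @ np.diag(s_tuned) @ Vt

# One reading of “composing is just an addition”: merge two fine-tunes by adding
# their singular-value updates back onto the base values.
delta_task_a = s_tuned - s        # placeholder for the update learned on task A
delta_task_b = s_tuned - s        # placeholder for the update learned on task B
W_merged = U @ np.diag(s + delta_task_a + delta_task_b) @ Vt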

SVFT

Should you need more trainable parameters, the SVFT paper (arxiv.org/abs/2405.19597) explores multiple ways of doing that, starting by adding more trainable weights on the diagonal.

It also evaluates multiple alternatives like spreading them randomly through the “M” matrix.
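
A rough numpy sketch of that structure (illustrative; the paper studies several patterns for where to place the trainable entries):

import numpy as np

W = np.random.randn(512, 512)
U, s, Vt = np.linalg.svd(W, full_matrices=False)   # frozen factors

M = np.diag(s)                                     # starts as the diagonal of singular values
trainable = np.eye(512, dtype=bool)                # the diagonal is trainable (as in SVF)...
rows = np.random.randint(0, 512, size=1024)
cols = np.random.randint(0, 512, size=1024)
trainable[rows, cols] = True                       # ...plus extra entries spread at random

# During training, gradients are applied only to M[trainable]; U and Vt stay frozen.
W_tuned = U @ M @ Vt                               # equals W at initialization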

More importantly, the SVFT paper confirms that having more trainable values than just the diagonal is useful. See their fine-tuning results below.

Next come several techniques that split the singular values into two sets, “large” and “small”. But before we proceed, let’s pause for a bit more SVD math.

More SVD math

The SVD is usually seen as a decomposition into three matrices, W = USVᵀ, but it can also be thought of as a weighted sum of rank-1 matrices, weighted by the singular values: W = Σᵢ sᵢuᵢvᵢᵀ, where uᵢ and vᵢ are the columns of U and V.

Should you want to prove it, express individual matrix elements Wⱼₖ using the USVᵀ form and the formula for matrix multiplication on one hand, and the Σᵢ sᵢuᵢvᵢᵀ form on the other; simplify using the fact that S is diagonal and notice that both give the same thing.

In this representation, it’s easy to see that you can split the sum in two. And as you can always sort the singular values, you can make this a split between “large” and “small” singular values.
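
The same identity and split, checked numerically on a toy matrix (the split rank r is arbitrary here):

import numpy as np

W = np.random.randn(8, 5)
U, s, Vt = np.linalg.svd(W, full_matrices=False)

# W as a weighted sum of rank-1 matrices s_i * u_i * v_i^T
rank1_sum = sum(s[i] * np.outer(U[:, i], Vt[i, :]) for i in range(len(s)))
print(np.allclose(W, rank1_sum))                   # True

# Split the sorted singular values into “large” and “small” groups.
r = 2
W_large = U[:, :r] @ np.diag(s[:r]) @ Vt[:r, :]    # top-r (principal) components
W_small = U[:, r:] @ np.diag(s[r:]) @ Vt[r:, :]    # remaining (minor) components
print(np.allclose(W, W_large + W_small))           # True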

Going back to the three-matrix form W = USVᵀ, this is what the split looks like:

Based on this formula, two papers have explored what happens if you tune only the large singular values or only the small ones, PiSSA and MiLoRA.

PiSSA

PiSSA (Principal Singular Values and Singular Vectors Adaptation, arxiv.org/abs/2404.02948) claims that you should tune only the large (principal) singular values. The mechanism is illustrated below:
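
Since the paper’s figure is not reproduced here, a minimal numpy sketch of the setup as I read it (illustrative, assuming a split rank r, not the paper’s code): the r largest singular components are re-packed as a trainable low-rank adapter, while the minor components stay frozen as a residual.

import numpy as np

W = np.random.randn(512, 512)
U, s, Vt = np.linalg.svd(W, full_matrices=False)
r = 16                                            # split rank, illustrative

# Trainable part: the r largest singular components, factored LoRA-style.
A = U[:, :r] * np.sqrt(s[:r])                     # (512, r), trainable
B = np.sqrt(s[:r])[:, None] * Vt[:r, :]           # (r, 512), trainable

# Frozen part: everything carried by the smaller singular values.
W_residual = U[:, r:] @ np.diag(s[r:]) @ Vt[r:, :]

print(np.allclose(W, W_residual + A @ B))         # True at initialization; only A and B are trained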

From the paper: “PiSSA is designed to approximate full finetuning by adapting the principal singular components, which are believed to capture the essence of the weight matrices. In contrast, MiLoRA aims to adapt to new tasks while maximally retaining the base model’s knowledge.”

The PiSSA paper also has an interesting finding: full fine-tuning is prone to over-fitting. You might get better results, in absolute terms, with a low-rank fine-tuning technique.

MiLoRA

MiLoRA (Minor singular component LoRA, arxiv.org/abs/2406.09044), on the other hand, claims that you should tune only the small (minor) singular values. It uses a similar mechanism to PiSSA:
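
In the same spirit as the PiSSA sketch above, the only change is which components become trainable (again, an illustrative sketch rather than the paper’s code):

import numpy as np

W = np.random.randn(512, 512)
U, s, Vt = np.linalg.svd(W, full_matrices=False)
r = 16

# MiLoRA: the r smallest singular components become the trainable adapter...
A = U[:, -r:] * np.sqrt(s[-r:])
B = np.sqrt(s[-r:])[:, None] * Vt[-r:, :]
# ...while the principal components are kept frozen.
W_frozen = U[:, :-r] @ np.diag(s[:-r]) @ Vt[:-r, :]
print(np.allclose(W, W_frozen + A @ B))           # True at initialization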

Surprisingly, MiLoRA seems to have the upper hand, at least when tuning on math datasets, which are probably fairly well aligned with the original pre-training. Arguably, PiSSA should be better for bending the behavior of the LLM further from its pre-training.

LoRA-XS

Finally, I’d like to mention LoRA-XS (arxiv.org/abs/2405.17604). It is very similar to PiSSA but uses a slightly different mechanism. It also shows good results with significantly fewer parameters than LoRA.
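
A minimal sketch of the mechanism as I read it (illustrative and hedged, not the paper’s code): the outer factors come from a truncated SVD of W and stay frozen; only a tiny r×r matrix between them is trained.

import numpy as np

W = np.random.randn(512, 512)
U, s, Vt = np.linalg.svd(W, full_matrices=False)
r = 16

A = U[:, :r] @ np.diag(s[:r])   # frozen, taken from the truncated SVD of W
B = Vt[:r, :]                   # frozen
R = np.zeros((r, r))            # the only trainable parameters: r*r = 256 values per matrix

W_tuned = W + A @ R @ B         # equals W at initialization; fine-tuning only updates R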

The paper offers a mathematical explanation of why this setup is “ideal” under two conditions:

  • that truncating the bottom singular values from the SVD still offers a good approximation of the weight matrices
  • that the fine-tuning data distribution is close to the pre-training one

Both are questionable IMHO, so I won’t detail the math. Some results:

The underlying assumption seems to be that singular values come in “large” and “small” varieties, but is it true? I made a quick Colab to check this on Gemma2 9B. Bottom line: 99% of the singular values are in the 0.1 – 1.1 range. I’m not sure partitioning them into “large” and “small” makes that much sense.
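
For reference, a sketch of that kind of check (hedged: the preset name is illustrative and the weight-extraction details depend on your framework version):

import numpy as np
import keras_nlp

# Load the model and collect its 2-D weight matrices (preset name illustrative).
model = keras_nlp.models.GemmaCausalLM.from_preset("gemma2_9b_en")
mats = [np.array(v) for v in model.backbone.weights if len(v.shape) == 2]

# Pool all singular values and look at how spread out they really are.
all_s = np.concatenate([np.linalg.svd(m, compute_uv=False) for m in mats])
print(np.percentile(all_s, [0.5, 50, 99.5]))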

Conclusion

There are many more parameter-efficient fine-tuning techniques worth mentioning.

My conclusion: to go beyond the LoRA standard with 10x fewer params, I like the simplicity of Transformers²’s SVF. And if you need more trainable weights, SVFT is an easy extension. Both use all singular values (full rank, no singular value pruning) and are still cheap 😁. Happy tuning!

Note: All illustrations are either created by the author or extracted from arxiv.org papers for comment and discussion purposes.
