Deep Learning (DL) applications often require processing video data for tasks such as object detection, classification, and segmentation. However, conventional video processing pipelines are typically inefficient for deep learning inference, leading to performance bottlenecks. In this post, we will leverage PyTorch and FFmpeg with NVIDIA hardware acceleration to remove those bottlenecks.
The inefficiency comes from how video frames are typically decoded and transferred between CPU and GPU. The standard workflow found in the majority of tutorials follows this structure:
- Decode Frames on CPU: Video files are first decoded into raw frames using CPU-based decoding tools (e.g., OpenCV, FFmpeg without GPU support).
- Transfer to GPU: These frames are then transferred from CPU to GPU memory to perform deep learning inference using frameworks like TensorFlow, PyTorch, ONNX, etc.
- Inference on GPU: Once the frames are in GPU memory, the model performs inference.
- Transfer Back to CPU (if needed): Some post-processing steps may require data to be moved back to the CPU.

This CPU-GPU transfer process introduces a significant performance bottleneck, especially when processing high-resolution videos at high frame rates. The unnecessary memory copies and context switches slow down the overall inference speed, limiting real-time processing capabilities.
As an example, the following snippet shows the typical video processing pipeline that you come across when starting to learn deep learning:
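The original snippet is not reproduced here, so below is a minimal sketch of such a pipeline, assuming OpenCV for CPU decoding and a torchvision ResNet-50 as a placeholder model; the video path is hypothetical.

```python
import cv2
import torch
from torchvision.models import resnet50, ResNet50_Weights

device = torch.device("cuda")
weights = ResNet50_Weights.DEFAULT
model = resnet50(weights=weights).eval().to(device)
preprocess = weights.transforms()

cap = cv2.VideoCapture("video.mp4")  # hypothetical input file

with torch.no_grad():
    while True:
        ok, frame = cap.read()                              # 1. decode on CPU
        if not ok:
            break
        frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        tensor = torch.from_numpy(frame).permute(2, 0, 1)   # HWC -> CHW, uint8
        batch = preprocess(tensor).unsqueeze(0).to(device)  # 2. copy to GPU
        logits = model(batch)                               # 3. inference on GPU
        preds = logits.argmax(dim=1).cpu()                  # 4. back to CPU if needed

cap.release()
```

Every frame is decoded on the CPU, copied to the GPU for inference, and the predictions are copied back, which is exactly the round trip we want to avoid.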
The Solution: GPU-Based Video Decoding and Inference
A more efficient approach is to keep the entire pipeline on the GPU, from video decoding to inference, eliminating redundant CPU-GPU transfers. This can be achieved using FFmpeg with NVIDIA GPU hardware acceleration.
Key Optimisations
- GPU-Accelerated Video Decoding: Instead of using CPU-based decoding, we leverage FFmpeg with NVIDIA GPU acceleration (NVDEC) to decode video frames directly on the GPU.
- Zero-Copy Frame Processing: The decoded frames remain in GPU memory, avoiding unnecessary memory transfers.
- GPU-Optimized Inference: Once the frames are decoded, we perform inference directly using any model on the same GPU, significantly reducing latency.

Hands on!
Prerequisites
To achieve the aforementioned improvements, we will rely on FFmpeg built with NVIDIA GPU acceleration (NVDEC) together with the PyTorch stack (torch, torchaudio, torchvision); the exact versions used are listed below.
Installation
For a detailed guide on how to build and install FFmpeg with NVIDIA GPU acceleration, follow these instructions.
Tested with:
- System: Ubuntu 22.04
- NVIDIA Driver Version: 550.120
- CUDA Version: 12.4
- Torch: 2.4.0
- Torchaudio: 2.4.0
- Torchvision: 0.19.0
1. Install the NV-Codecs
2. Clone and configure FFmpeg
3. Validate whether the installation was successful with torchaudio.utils
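One way to perform the validation in step 3 is to ask torchaudio which FFmpeg decoders it can see; if the build picked up NVDEC, the NVIDIA (cuvid) decoders show up in the list. A minimal check using torchaudio's ffmpeg_utils helpers could look like this:

```python
import torch
from torchaudio.utils import ffmpeg_utils

# The GPU must be visible to PyTorch for the rest of the pipeline to work.
print("CUDA available:", torch.cuda.is_available())

# FFmpeg library versions that torchaudio was linked against.
print("FFmpeg versions:", ffmpeg_utils.get_versions())

# If FFmpeg was built with NVDEC support, the NVIDIA decoders are listed here.
decoders = ffmpeg_utils.get_video_decoders()
print("h264_cuvid available:", "h264_cuvid" in decoders)
print("hevc_cuvid available:", "hevc_cuvid" in decoders)
```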
Time to code an optimised pipeline!
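The exact code from the benchmark is not shown here, so the following is a minimal sketch of what such a pipeline can look like with torchaudio's StreamReader: frames are decoded with the h264_cuvid decoder (NVDEC) and, thanks to the hw_accel option, delivered as CUDA tensors, so they go straight into the model without a CPU round trip. The model (ResNet-50), chunk size, preprocessing, and video path are illustrative assumptions, not the author's exact setup.

```python
import torch
import torch.nn.functional as F
from torchaudio.io import StreamReader
from torchvision.models import resnet50, ResNet50_Weights

device = torch.device("cuda:0")
weights = ResNet50_Weights.DEFAULT
model = resnet50(weights=weights).eval().to(device)

# ImageNet normalisation constants, kept on the GPU.
mean = torch.tensor([0.485, 0.456, 0.406], device=device).view(1, 3, 1, 1)
std = torch.tensor([0.229, 0.224, 0.225], device=device).view(1, 3, 1, 1)

# Decode on the GPU: h264_cuvid uses NVDEC, hw_accel keeps the frames in CUDA memory.
reader = StreamReader("video.mp4")  # hypothetical input file
reader.add_video_stream(frames_per_chunk=16, decoder="h264_cuvid", hw_accel="cuda:0")

with torch.no_grad():
    for (chunk,) in reader.stream():
        # chunk is a uint8 CUDA tensor of shape (frames, channels, height, width).
        batch = chunk.float().div(255)
        batch = F.interpolate(batch, size=(224, 224), mode="bilinear", align_corners=False)
        batch = (batch - mean) / std
        preds = model(batch).argmax(dim=1)  # stays on the GPU until you need it
```

Preprocessing (resize and normalisation) is done with plain tensor ops on the device, so the frames never leave GPU memory until the predictions are actually read out.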
Benchmarking
To check whether the optimisation actually makes a difference, we will be using this video from Pexels by Pawel Perzanowski. Since most videos there are really short, I have stacked the same video several times to get results for different video lengths. The original video is 32 seconds long, which gives us a total of 960 frames; the modified videos have 5520 and 9300 frames respectively.
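A comparison like this can be reproduced with nothing more than a wall-clock timer around each pipeline; the function names below are hypothetical wrappers around the two snippets shown earlier.

```python
import time

def benchmark(pipeline, video_path):
    """Run a pipeline end to end and report its wall-clock time."""
    start = time.perf_counter()
    pipeline(video_path)
    print(f"{pipeline.__name__}: {time.perf_counter() - start:.2f}s")

# benchmark(typical_pipeline, "original_video.mp4")    # hypothetical wrappers
# benchmark(optimised_pipeline, "original_video.mp4")
```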
Original video (960 frames)
- typical workflow: 28.51s
- optimised workflow: 24.2s
Okay… it doesn’t seem like a real improvement, right? Let’s test it with longer videos.
Modified video v1 (5520 frames)
- typical workflow: 118.72s
- optimised workflow: 100.23s
Modified video v2 (9300 frames)
- typical workflow: 292.26s
- optimised workflow: 240.85s
As the video duration increases, the benefits of the optimisation become more evident. In the longest test case, processing time drops by roughly 18%, a significant reduction. These gains matter most when handling large video datasets or real-time video analysis tasks, where small efficiency improvements accumulate into substantial time savings.
Conclusion
In today’s post, we have explored two video processing pipelines: the typical one, where frames are decoded on the CPU and copied to the GPU, introducing noticeable bottlenecks, and an optimised one, where frames are decoded on the GPU and passed directly to inference, saving a considerable amount of time as video duration increases.