Six Ways to Control Style and Content in Diffusion Models

Stable Diffusion 1.5/2.0/2.1/XL 1.0, DALL-E, Imagen… In recent years, diffusion models have showcased stunning quality in image generation. However, while they produce great quality on generic concepts, they struggle to generate high quality for more specialised queries, such as generating images in a specific style that was rarely seen in the training dataset.

We could retrain the whole model on a vast number of images, teaching it the needed concepts from scratch. However, this doesn’t sound practical: first, we would need a large set of images for the idea, and second, it is simply too expensive and time-consuming.

There are solutions, however, that, given a handful of images and an hour of fine-tuning at worst, would enable diffusion models to produce reasonable quality on the new concepts.

Below, I cover approaches like DreamBooth, LoRA, hypernetworks, textual inversion, IP-Adapters and ControlNets, which are widely used to customise and condition diffusion models. The idea behind all these methods is to memorise a new concept we are trying to learn; each technique simply approaches it differently.

Diffusion architecture

Before diving into various methods that help to condition diffusion models, let’s first recap what diffusion models are.

Diffusion process visualisation. Image by the Author.

The original idea of diffusion models is to train a model to reconstruct a coherent image from noise. During training, we gradually add small amounts of Gaussian noise to an image (the forward process) and then reconstruct the image iteratively by optimizing the model to predict the added noise; subtracting the predicted noise step by step brings us closer to the target image (the reverse process).
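The forward process above has a convenient closed form: x_t = √(ᾱ_t)·x_0 + √(1−ᾱ_t)·ε. A minimal PyTorch sketch of this noising step (toy schedule and shapes, not tied to any particular implementation):

```python
import torch

def forward_diffusion(x0, t, alphas_cumprod):
    """Sample x_t ~ q(x_t | x_0) in closed form: sqrt(a_bar)*x0 + sqrt(1-a_bar)*eps."""
    noise = torch.randn_like(x0)
    a_bar = alphas_cumprod[t]
    xt = a_bar.sqrt() * x0 + (1 - a_bar).sqrt() * noise
    return xt, noise  # the model is trained to predict `noise` given (xt, t)

# toy linear beta schedule with 1000 steps
betas = torch.linspace(1e-4, 0.02, 1000)
alphas_cumprod = torch.cumprod(1 - betas, dim=0)

x0 = torch.randn(1, 4, 8, 8)  # a latent, in latent-diffusion terms
xt, eps = forward_diffusion(x0, 500, alphas_cumprod)
```

The training loss is then simply the MSE between `eps` and the model's noise prediction at step `t`.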

This original pixel-space formulation has evolved into a more practical and lightweight architecture in which images are first compressed to a latent space, and all manipulation with added noise is performed in that low-dimensional space.

To add textual information to the diffusion model, we first pass the prompt through a text encoder (typically CLIP) to produce latent embeddings, which are then injected into the model through cross-attention layers.
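A minimal sketch of such a cross-attention layer, where queries come from the image latents and keys/values from the text embeddings (dimensions are illustrative, not the actual Stable Diffusion configuration):

```python
import torch
import torch.nn as nn

class CrossAttention(nn.Module):
    """Queries from image latents; keys/values from text embeddings."""
    def __init__(self, dim=320, text_dim=768):
        super().__init__()
        self.to_q = nn.Linear(dim, dim, bias=False)
        self.to_k = nn.Linear(text_dim, dim, bias=False)
        self.to_v = nn.Linear(text_dim, dim, bias=False)

    def forward(self, x, text):
        q, k, v = self.to_q(x), self.to_k(text), self.to_v(text)
        # scaled dot-product attention over the 77 prompt tokens
        attn = torch.softmax(q @ k.transpose(-1, -2) / q.shape[-1] ** 0.5, dim=-1)
        return attn @ v

x = torch.randn(1, 64, 320)     # flattened latent patches
text = torch.randn(1, 77, 768)  # CLIP token embeddings
out = CrossAttention()(x, text)
```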

Dreambooth

Dreambooth visualisation. Trainable blocks are marked in red. Image by the Author.

The idea is to take a rare token, typically {SKS}, and teach the model to map it to a feature we would like to learn. That might, for example, be a style the model has never seen, like van Gogh’s. We would show a dozen of his paintings and fine-tune on the phrase “A painting of boots in the {SKS} style”. We can similarly personalise generation, for example learning to generate images of a particular person from a set of their selfies with prompts like “{SKS} in the mountains”.

To maintain the information learned in the pre-training stage, Dreambooth encourages the model not to deviate too much from the original, pre-trained version by adding text-image pairs generated by the original model to the fine-tuning set.
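The resulting objective is a noise-prediction loss on the new-concept images plus a prior-preservation term on images sampled from the original model. A hedged sketch (the function name and `prior_weight` are my own; the real training loop also involves the frozen text encoder and noise schedule):

```python
import torch
import torch.nn.functional as F

def dreambooth_loss(pred_inst, noise_inst, pred_prior, noise_prior, prior_weight=1.0):
    """Instance loss on the new-concept images plus a prior-preservation
    term computed on text-image pairs generated by the original, frozen model."""
    instance = F.mse_loss(pred_inst, noise_inst)
    prior = F.mse_loss(pred_prior, noise_prior)
    return instance + prior_weight * prior

# toy tensors standing in for the UNet's noise predictions and the true noise
loss = dreambooth_loss(torch.randn(2, 4, 8, 8), torch.randn(2, 4, 8, 8),
                       torch.randn(2, 4, 8, 8), torch.randn(2, 4, 8, 8))
```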

When to use and when not
Dreambooth produces the best quality across all methods; however, since the whole model is updated, the technique can degrade already-learnt concepts. The training schedule also limits the number of concepts the model can absorb at once. Training is time-consuming, taking 1–2 hours. And if we decide to introduce several new concepts, we need to store a full model checkpoint for each, which wastes a lot of space.

Textual Inversion, paper, code

Textual inversion visualisation. Trainable blocks are marked in red. Image by the Author.

The assumption behind textual inversion is that the knowledge stored in the latent space of diffusion models is vast. Hence, the style or condition we want to reproduce is likely already known to the model; we just don’t have the token to access it. Thus, instead of fine-tuning the model to reproduce the desired output when fed the rare token “in the {SKS} style”, we optimize a textual embedding that results in the desired output.
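In other words, the text encoder and diffusion model stay frozen, and only one new embedding vector is optimized. A minimal sketch with a toy embedding table and a stand-in loss (shapes and names are illustrative):

```python
import torch

# frozen text-encoder embedding table (toy vocabulary); only `concept` is trained
embed = torch.randn(1000, 768)
concept = torch.randn(768, requires_grad=True)  # the learnable {SKS} vector
opt = torch.optim.AdamW([concept], lr=5e-3)

def encode(prompt_ids, sks_pos):
    """Frozen embeddings for ordinary tokens; the learnable vector at the {SKS} slot."""
    return torch.stack([concept if i == sks_pos else embed[t]
                        for i, t in enumerate(prompt_ids)])

tokens = encode([3, 14, 159], sks_pos=1)  # e.g. "a {SKS} painting"
loss = tokens.pow(2).mean()               # stand-in for the diffusion noise-prediction loss
loss.backward()
opt.step()                                # only `concept` moves; the table is untouched
```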

When to use and when not
It takes very little space, as only the token embedding will be stored. It is also relatively quick to train, with an average training time of 20–30 minutes. However, it comes with its shortcomings: since we are fine-tuning a single vector that guides the model to produce a particular style, the result won’t generalise beyond that style.

LoRA

LoRA visualisation. Trainable blocks are marked in red. Image by the Author.

Low-Rank Adaptation (LoRA) was proposed for Large Language Models and was first adapted to diffusion models by Simo Ryu. The original idea of LoRA is that instead of fine-tuning the whole model, which can be rather costly, we blend a small fraction of new trainable weights into the original model, combined with a similar rare-token approach.

In diffusion models, rank decomposition is applied to the cross-attention layers, which are responsible for merging prompt and image information. LoRA is applied to the weight matrices W_Q, W_K, W_V, and W_O of these layers.
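A minimal sketch of wrapping one such matrix: the frozen weight W gets a low-rank update (α/r)·BA, with B zero-initialised so the wrapped layer starts out identical to the original (dimensions are illustrative):

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """y = W x + (alpha/r) * B A x, with only A and B trained; W stays frozen."""
    def __init__(self, base: nn.Linear, r=4, alpha=4):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False          # freeze the original weights
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # zero-init
        self.scale = alpha / r

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.A.t() @ self.B.t())

wq = LoRALinear(nn.Linear(320, 320))  # e.g. wrap a cross-attention W_Q
```

Since B is zero at initialisation, the wrapped layer initially computes exactly the frozen layer's output, and fine-tuning only ever touches the 2·r·320 LoRA parameters.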

When to use and when not
LoRAs take very little time to train (5–15 minutes), since we update only a handful of parameters compared to the whole model, and unlike Dreambooth checkpoints they take up much less space. However, models fine-tuned with LoRAs generally show worse quality than DreamBooth.

Hyper-networks, paper, code

Hyper-networks visualisation. Trainable blocks are marked in red. Image by the Author.

Hyper-networks are, in some sense, extensions of LoRAs. Instead of learning the relatively small injected weights directly, we train a separate network that predicts the weights for these newly injected modules.

By having the network predict the weights for a specific concept, we can teach a single hypernetwork several concepts, reusing the same model for multiple tasks.
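A minimal sketch of the idea: a small network maps a concept embedding to the weights of an injected layer, so the same hypernetwork serves many concepts (all names and dimensions here are illustrative assumptions, not a specific published architecture):

```python
import torch
import torch.nn as nn

class HyperNetwork(nn.Module):
    """Maps a concept embedding to the weight matrix of an injected layer,
    so one trained hypernetwork can serve many concepts."""
    def __init__(self, concept_dim=64, out_features=320, in_features=320):
        super().__init__()
        self.shape = (out_features, in_features)
        self.net = nn.Linear(concept_dim, out_features * in_features)

    def forward(self, concept):
        return self.net(concept).view(self.shape)

hyper = HyperNetwork()
w_style_a = hyper(torch.randn(64))  # predicted injected weights for one concept
w_style_b = hyper(torch.randn(64))  # same network, a different concept
```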

When to use and when not
Hypernetworks do not specialise in a single style but are instead capable of producing a plethora of them; as a result, they generally do not reach the quality of the other methods and can take significant time to train. On the plus side, they can store many more concepts than single-concept fine-tuning methods.

IP-Adapter

IP-adapter visualisation. Trainable blocks are marked in red. Image by the Author.

Instead of controlling image generation with text prompts, IP adapters propose a method to control the generation with an image without any changes to the underlying model.

The core idea behind the IP-Adapter is a decoupled cross-attention mechanism that combines reference-image features with the text and generated-image features. This is achieved by adding a separate cross-attention layer for the image features, allowing the model to learn image-specific attention.
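A minimal sketch of decoupled cross-attention: the same queries attend separately to text and image contexts, and the two results are summed; only the new image key/value projections would be trained (dimensions are illustrative):

```python
import torch
import torch.nn as nn

class DecoupledCrossAttention(nn.Module):
    """Existing text cross-attention (frozen in the real method) plus a new,
    trainable image cross-attention; the two outputs are summed."""
    def __init__(self, dim=320, ctx_dim=768):
        super().__init__()
        self.to_q = nn.Linear(dim, dim, bias=False)
        self.text_kv = nn.Linear(ctx_dim, 2 * dim, bias=False)
        self.image_kv = nn.Linear(ctx_dim, 2 * dim, bias=False)  # the only new weights

    def attend(self, q, kv):
        k, v = kv.chunk(2, dim=-1)
        attn = torch.softmax(q @ k.transpose(-1, -2) / q.shape[-1] ** 0.5, dim=-1)
        return attn @ v

    def forward(self, x, text_emb, image_emb):
        q = self.to_q(x)  # shared queries, separate key/value streams
        return self.attend(q, self.text_kv(text_emb)) + self.attend(q, self.image_kv(image_emb))

out = DecoupledCrossAttention()(torch.randn(1, 64, 320),
                                torch.randn(1, 77, 768),  # text tokens
                                torch.randn(1, 4, 768))   # image tokens from the encoder
```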

When to use and when not
IP-Adapters are lightweight, adaptable and fast. However, their performance is highly dependent on the quality and diversity of the training data. They generally work better for supplying stylistic attributes (e.g. an image of Marc Chagall’s paintings) that we would like to see in the generated image, and can struggle to provide control over exact details, such as pose.

ControlNet

ControlNet visualisation. Trainable blocks are marked in red. Image by the Author.

The ControlNet paper proposes a way to extend the input of a text-to-image model to any modality, allowing fine-grained control of the generated image.

In the original formulation, ControlNet is a trainable copy of the encoder of the pre-trained diffusion model that takes as input the prompt, the noise and the control data (e.g. a depth map, landmarks, etc.). To guide the generation, the intermediate features of the ControlNet are then added to the activations of the frozen diffusion model.

The injection is achieved through zero-convolutions: the weights and biases of 1×1 convolutions are initialized to zero and gradually learn meaningful transformations during training. This is similar to how LoRAs are trained: initialised with zeros, they start from the identity function.
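A minimal sketch of a zero-convolution and the injection point: because the 1×1 conv starts at zero, the ControlNet branch contributes nothing at step zero and the frozen model's behaviour is initially untouched:

```python
import torch
import torch.nn as nn

def zero_conv(channels):
    """1x1 convolution with weights and bias initialised to zero, so the
    ControlNet branch contributes nothing at the start of training."""
    conv = nn.Conv2d(channels, channels, kernel_size=1)
    nn.init.zeros_(conv.weight)
    nn.init.zeros_(conv.bias)
    return conv

frozen_activation = torch.randn(1, 320, 8, 8)  # from the locked diffusion UNet
control_feature = torch.randn(1, 320, 8, 8)    # from the trainable encoder copy
out = frozen_activation + zero_conv(320)(control_feature)
```

At initialisation `out` equals `frozen_activation` exactly; as the zero-conv weights move away from zero during training, the control signal starts steering the generation.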

When to use and not
ControlNets are preferable when we want to control the output structure, for example through landmarks, depth maps, or edge maps. Since an entire copy of the encoder weights is trained, training can be time-consuming; however, this also yields the best fine-grained control through rigid control signals.

Summary

  • DreamBooth: Full fine-tuning of the model for custom subjects or styles; high level of control, but training takes a long time and the result fits one purpose only.
  • Textual Inversion: Embedding-based learning of new concepts; low level of control, but fast to train.
  • LoRA: Lightweight fine-tuning for new styles/characters; medium level of control and quick to train.
  • Hypernetworks: A separate model predicts LoRA-like weights for a given request; lower control level, but supports many styles; takes time to train.
  • IP-Adapter: Soft style/content guidance via reference images; medium level of stylistic control, lightweight and efficient.
  • ControlNet: Very precise control via pose, depth, and edges; however, training takes longer.

Best practice: combining an IP-Adapter, with its softer stylistic guidance, and a ControlNet for pose and object arrangement produces the best results.

If you want to go into more detail on diffusion, check out this article, which I found very well written and accessible to any level of machine learning and maths. If you want an intuitive explanation of the maths with entertaining commentary, check out this video or this video.

For more on ControlNets, I found this explanation very helpful; this article and this article are good introductions as well.

Liked the author? Stay connected!

Have I missed anything? Do not hesitate to leave a note, comment or message me directly on LinkedIn or Twitter!

The opinions in this blog are my own and not attributable to or on behalf of Snap.

Shape
Shape
Stay Ahead

Explore More Insights

Stay ahead with more perspectives on cutting-edge power, infrastructure, energy,  bitcoin and AI solutions. Explore these articles to uncover strategies and insights shaping the future of industries.

Shape

Petronas launches Malaysia Bid Round 2026

@import url(‘https://fonts.googleapis.com/css2?family=Inter:[email protected]&display=swap’); a { color: var(–color-primary-main); } .ebm-page__main h1, .ebm-page__main h2, .ebm-page__main h3, .ebm-page__main h4, .ebm-page__main h5, .ebm-page__main h6 { font-family: Inter; } body { line-height: 150%; letter-spacing: 0.025em; font-family: Inter; } button, .ebm-button-wrapper { font-family: Inter; } .label-style { text-transform: uppercase; color: var(–color-grey); font-weight: 600; font-size: 0.75rem; } .caption-style

Read More »

Intel says Google engineers spotted Xeon vulnerabilities

“In a perfect world, the [Trusted Computer Base] would be bug-free; in reality, the complexity of modern systems makes continuous assessment essential. Collaborative reviews allow industry leaders to proactively fix vulnerabilities while fostering transparency for everyone who relies on the technology,” Google researchers wrote. The main problem arose when using

Read More »

How Cisco’s platform mindset is meeting the AI era

4. Sovereignty, trust, and the rise of sovereign AI In EMEA, trust and sovereignty were more than talk—they were central to almost every discussion. This came across loud and clear at the event and in Davos in January. Cisco emphasized four dimensions of trust: security, innovation, execution, and sovereign control.​

Read More »

Verde Clean Fuels suspends Permian basin GTG project

@import url(‘https://fonts.googleapis.com/css2?family=Inter:[email protected]&display=swap’); a { color: var(–color-primary-main); } .ebm-page__main h1, .ebm-page__main h2, .ebm-page__main h3, .ebm-page__main h4, .ebm-page__main h5, .ebm-page__main h6 { font-family: Inter; } body { line-height: 150%; letter-spacing: 0.025em; font-family: Inter; } button, .ebm-button-wrapper { font-family: Inter; } .label-style { text-transform: uppercase; color: var(–color-grey); font-weight: 600; font-size: 0.75rem; } .caption-style { font-size: 0.75rem; opacity: .6; } #onetrust-pc-sdk [id*=btn-handler], #onetrust-pc-sdk [class*=btn-handler] { background-color: #c19a06 !important; border-color: #c19a06 !important; } #onetrust-policy a, #onetrust-pc-sdk a, #ot-pc-content a { color: #c19a06 !important; } #onetrust-consent-sdk #onetrust-pc-sdk .ot-active-menu { border-color: #c19a06 !important; } #onetrust-consent-sdk #onetrust-accept-btn-handler, #onetrust-banner-sdk #onetrust-reject-all-handler, #onetrust-consent-sdk #onetrust-pc-btn-handler.cookie-setting-link { background-color: #c19a06 !important; border-color: #c19a06 !important; } #onetrust-consent-sdk .onetrust-pc-btn-handler { color: #c19a06 !important; border-color: #c19a06 !important; } <!–> Verde Clean Fuels Inc., Houston, has suspended development of a natural gas-to-gasoline (GTG) project in the Permian basin. The company cited “changing market conditions driven by increasing demand for natural gas,” in the region. ]–> <!–> –><!–> –> June 5, 2024 In 2024, Verde Clean Fuels and Midland-based Diamondback Energy Inc. subsidiary Cottonmouth Ventures LLC agreed to a joint development of a GTG plant in Martin County, Tex., in the Permian’s Midland basin. 
The project was originally slated to combine Verde’s technology with a feedstock of stranded or otherside-flared associated natural from Diamondback’s Permian basin operations for commercial-scale production of almost 3,000 b/d of fully finished reformulated blend stock for oxygenate blending (RBOB) gasoline. A front-end engineering and design (FEED) study was completed in December 2025. Verde’s chief executive officer, Ernest Miller, said knowledge collected from the work completed “will continue to be useful as we explore other opportunities to deploy our technology.” He said the company will devote resources toward other opportunities “in regions where natural gas is stranded or flared without access to a higher value outlet to market.”

Read More »

Oil prices ride the Iran seesaw

Oil, fundamental analysis Crude prices were up-and-down this week as traders eyed both optimistic and pessimistic signs regarding the proposed US/Iran talks held Friday. A large drop in crude and distillate inventories were the main fundamental factors noticed. Despite the rollercoaster ride, US prices remained above the key $60/bbl level. WTI had a High of $65.55/bbl on Wednesday with a weekly Low of $61.10 on Tuesday. Brent crude’s High was $69.75/bbl on Wednesday while its Low was $65.75 Monday. Both grades settled lower week-on-week. The WTI/Brent spread has tightened to ($4.50). The week started with a bearish tone as US President Trump spoke with optimism about the upcoming US/Iran talks. However, later in the week, the Iranian government objected to the specific topics the US wants to discuss beyond their nuclear developments and the meeting appeared doomed. The two sides did decide to follow through with the planned meeting Friday with Iranian officials labeling the meeting as a ‘very good start.’ Meanwhile, the Iranian Revolutionary Guard Corps (IRGCC) attempted to stop a US-flagged vessel in the Strait of Hormuz and sent drones to a US Navy ship in the area. Both attempts were thwarted by US naval forces. Some Very Large Crude Carriers (VLCC) are said to have been increasing their normal speeds for faster passage through the Strait of Hormuz while it’s still open. It is estimate that slightly more than 25% of global oil supplies move through the Strait. Russia is now relying heavily on its relationship with China to buy its oil exports after the US offered India a deal whereby tariffs could be cut if India halts buying Russian Urals. Additionally, Indian refiners would have access to Venezuelan exports. No official action has yet to be taken by the Indian government in terms of such a

Read More »

Perenco Congo installs Kombi 2 platform on KLL field

@import url(‘https://fonts.googleapis.com/css2?family=Inter:[email protected]&display=swap’); a { color: var(–color-primary-main); } .ebm-page__main h1, .ebm-page__main h2, .ebm-page__main h3, .ebm-page__main h4, .ebm-page__main h5, .ebm-page__main h6 { font-family: Inter; } body { line-height: 150%; letter-spacing: 0.025em; font-family: Inter; } button, .ebm-button-wrapper { font-family: Inter; } .label-style { text-transform: uppercase; color: var(–color-grey); font-weight: 600; font-size: 0.75rem; } .caption-style { font-size: 0.75rem; opacity: .6; } #onetrust-pc-sdk [id*=btn-handler], #onetrust-pc-sdk [class*=btn-handler] { background-color: #c19a06 !important; border-color: #c19a06 !important; } #onetrust-policy a, #onetrust-pc-sdk a, #ot-pc-content a { color: #c19a06 !important; } #onetrust-consent-sdk #onetrust-pc-sdk .ot-active-menu { border-color: #c19a06 !important; } #onetrust-consent-sdk #onetrust-accept-btn-handler, #onetrust-banner-sdk #onetrust-reject-all-handler, #onetrust-consent-sdk #onetrust-pc-btn-handler.cookie-setting-link { background-color: #c19a06 !important; border-color: #c19a06 !important; } #onetrust-consent-sdk .onetrust-pc-btn-handler { color: #c19a06 !important; border-color: #c19a06 !important; } Perenco Congo installed the new Kombi 2 platform on Kombi-Likalala-Libondo II (KLL II) field offshore Congo-Brazzaville. The new mobile offshore production unit has improved water and effluent treatment, increased associated gas recovery, and 8 MW of electricity generated by two gas turbines. Total investment in the KLL II project was over $200 million. Kombi 2 will host a six-well drilling campaign starting in 2026, with the aim of increasing production, extending the life of the field. Connection work is currently under way, with commissioning scheduled for early March 2026.

Read More »

EnQuest secures extension for Block 12W offshore Vietnam

EnQuest PLC was awarded a 4-year extension to the Block 12W Production Sharing Contract by PetroVietnam. This extension runs to July 2034 from the original contract end date of November 2030, allowing the operator to access prospectivity across three gas discoveries and several additional targets. EnQuest aims to initiate plans to progress discovered resources into reserves. Since completion of the acquisition of Block 12W from Harbour Energy in July 2025, EnQuest completed three proactive well interventions from Chim Sáo and Dua fields that boosted gross fourth-quarter 2025 production to 10,400 boe/d (5,500 boe/d net to EnQuest). Block 12W is comprised of three producing oil and gas fields; Chim Sáo, Chim Sáo North West (CSNW), and Dua, all in the the Nam Con Son basin, about 400 km south west of Vung Tau, Vietnam. Developed via a single wellhead platform, Chim Sáo and CSNW oil production is exported via the Lewek Emas floating production storage and offloading (FPSO) vessel, and gas is exported by pipeline to Vung Tau near Ho Chi Minh City. Chim Sáo currently has 14 active oil producers and seven water injectors. CSNW is developed via a single injector and producer pair. Dua oil and gas field was developed as a subsea tie-back to Chim Sáo. Its production is via three subsea oil producers. EnQuest is operator of the block (53.125%) with partners Bitexco (31.875%) and PetroVietnam (15%).

Read More »

Vår Energi drills dry well at Price Updip prospect in North Sea Balder area

@import url(‘https://fonts.googleapis.com/css2?family=Inter:[email protected]&display=swap’); a { color: var(–color-primary-main); } .ebm-page__main h1, .ebm-page__main h2, .ebm-page__main h3, .ebm-page__main h4, .ebm-page__main h5, .ebm-page__main h6 { font-family: Inter; } body { line-height: 150%; letter-spacing: 0.025em; font-family: Inter; } button, .ebm-button-wrapper { font-family: Inter; } .label-style { text-transform: uppercase; color: var(–color-grey); font-weight: 600; font-size: 0.75rem; } .caption-style { font-size: 0.75rem; opacity: .6; } #onetrust-pc-sdk [id*=btn-handler], #onetrust-pc-sdk [class*=btn-handler] { background-color: #c19a06 !important; border-color: #c19a06 !important; } #onetrust-policy a, #onetrust-pc-sdk a, #ot-pc-content a { color: #c19a06 !important; } #onetrust-consent-sdk #onetrust-pc-sdk .ot-active-menu { border-color: #c19a06 !important; } #onetrust-consent-sdk #onetrust-accept-btn-handler, #onetrust-banner-sdk #onetrust-reject-all-handler, #onetrust-consent-sdk #onetrust-pc-btn-handler.cookie-setting-link { background-color: #c19a06 !important; border-color: #c19a06 !important; } #onetrust-consent-sdk .onetrust-pc-btn-handler { color: #c19a06 !important; border-color: #c19a06 !important; } <!–> Vår Energi and its partner Kistos Energy drilled a dry well at the Prince Updip prospect in the Balder area in the North Sea. The well has been permanently plugged. Well 25/8-C-23 D, the 13th exploration well in production license 027, was drilled from the Ringhorne platform, just over 200 km west of Stavanger, to a vertical depth of 2,318 m subsea. It was terminated in basement rock. Water depth at the site is 129 m. ]–> <!–> –><!–> –> March 30, 2021 <!–> –><!–> –> June 28, 2021 <!–> Balder and Ringhorne Øst fields are contained within the license, where exploration activity in 2021—wells 25/8-20 S, B, and C—proved hydrocarbons at two levels. 
The objective of the well drilled by Vår Energi was to prove petroleum in the Prince Updip prospect in the Skagerrak formation in the Upper Triassic. The secondary exploration target was to prove petroleum in basement rock. The well encountered the Skagerrak formation with a thickness of about 127 m, 18 m of which were sandstone layers with moderate to good reservoir quality. Certain sandstone layers had hydrocarbon shows. The well drilled about 30

Read More »

Expand board moves on from CEO Dell’Osso

The directors of Expand Energy Corp. have thanked chief executive officer Nick Dell’Osso for his services and installed chairman Mike Wichterich as interim leader while they search for a successor. In a statement, Expand said the 49-year-old Dell’Osso—who had led the company since October 2021 (when it was still Chesapeake Energy) and steered it through the 2024 merger with Southwestern Energy—also has resigned his seat on the company’s board. Dell’Osso will, however, be an external advisor to the company during a transition period. In a filing with the US Securities and Exchange Commission, Expand said it is working on a severance agreement with Dell’Osso, whose exit is being deemed a termination without cause. “On behalf of the board, I want to thank Nick for his leadership and many contributions since first joining the company in 2008,” Wichterich said in the statement. “During his tenure as CEO, the company has grown from a $5 billion business to a $26 billion investment-grade enterprise included in the S&P 500 Index. We are grateful for his leadership in setting a strong foundation for our future.” Wichterich was interim chief executive of the former Chesapeake before Dell’Osso took over in 2021 and served as executive chairman of the new Expand through the end of 2022. He is the founder of Three Rivers Operating Co. LLC, an exploration and production company focused on the Permian basin. In early trading Feb. 9, shares of Expand (Ticker: EXE) were down nearly 6% to about $103.80. That move down knocked about $1.6 billion from the market capitalization of Expand, which now stands at roughly $24.7 billion. In announcing Dell’Osso departure, Expand’s board said nothing has changed about its operating and spending forecasts for fourth-quarter 2025, results of which will be released Feb. 17. Another major change is the company’s plan

Read More »

Cisco highlights memory costs, Silicon One growth in Q2 recap

“AI infrastructure orders taken from hyperscalers totaled $2.1 billion in Q2 compared to $1.3 billion just last quarter and equal to the total orders taken in all of fiscal year ’25, marking another significant acceleration in growth across our silicon, systems and optics,” Robbins said. “Given the strong demand for our Silicon One systems and optics, we now expect to take AI orders in excess of $5 billion and to recognize over $3 billion in AI infrastructure revenue from hyperscalers in FY ’26.” Regarding enterprise uptake, Robbins said Cisco took in $350 million in AI orders from enterprise customers in Q2 and has a pipeline in excess of $2.5 billion for its high-performance AI infrastructure portfolio. Cisco is seeing early enterprise use cases for AI around fraud detection and video analytics in sectors such as financial, manufacturing and pharmaceuticals, for example. “I also see examples in retail, where customers are leveraging agents on mobile devices in retail to help their staff do a better job engaging with their customers. We’re seeing a combination of both investment in cloud-based architectures as well as on prem,” Robbins said. Networking rules Cisco is experiencing a faster-than-historical ramp-up of next-generation platforms, including its Catalyst 9K, Wi-Fi 7, and smart switches, stated Sebastien Naji, a research analyst with William Blair, in a report after the call. He attributed it to three factors: an accelerated refresh cycle in the data center; early AI-readiness efforts in the enterprise; and end-of-support for legacy Catalyst and Nexus switches.  “We are seeing strong demand for our next-generation switching, routing and wireless products, which continue to ramp faster than prior product launches. We’re delivering AI-native capabilities across these products, including weaving security into the fabric of the network and modernizing the operational stack of campus networks,” Robbins said. Co-packaged optics? When asked

Read More »

Energy providers seek flexible load strategies for data center operations

“In theory, yes, they’d have to wait a little bit longer while their queries are routed to a data center that has capacity,” said Lawrence. The one thing the industry cannot do is operate like it has in the past, where data center power was tuned and then forgotten for six months. Previously, data centers would test their power sources once or twice a year. They don’t have that luxury anymore. They need to check their power sources and loads far more regularly, according to Lawrence. “I think that for that for the data center industry to continue to survive like we all need it, there’s going to have to be some realignment on the incentives to why somebody would become flexible,” said Lawrence. The survey suggests that utilities and load operators expect to expand their demand response activities and budgets in the near term. Sixty-three percent of respondents anticipate DR program funding to grow by 50% or more over the next three years. While they remain a major source of load growth and system strain, 57% of respondents indicate that onsite power generation from data centers will be most important to improving grid stability over the next five years. One of the proposed fixes to the power shortage has been small modular nuclear reactors. These have gained a lot of traction in the marketplace even if they have nothing to sell yet. But Lawrence said that that’s not an ideal solution for existing power generators, ironically enough.

Read More »

Nokia predicts huge WAN traffic growth, but experts question assumptions

Consumer, which includes both mobile access and fixed access, including fixed wireless access. Enterprise and industrial, which covers wide-area connectivity that supports knowledge work, automation, machine vision, robotics coordination, field support, and industrial IoT. AI, including applications that people directly invoke, such as assistants, copilots, and media generation, as well as autonomous use cases in which AI systems trigger other AI systems to perform functions and move data across networks. The report outlines three scenarios: conservative, moderate, and aggressive. “Our goal is to present scenarios that fall within a realistic range of possible outcomes, encouraging stakeholders to plan across the full spectrum of high-impact demand possibilities,” the report says. Nokia’s prediction for global WAN traffic growth ranges from a 13% CAGR for the conservative scenario to 16% CAGR for moderate and 22% CAGR for aggressive. Looking more closely at the moderate scenario, it’s clear that consumer traffic dominates. Enterprise and industrial traffic make up only about 14% to 17% of overall WAN traffic, although their share is expected to grow during the 10-year forecast period. “On the consumer side, the vast majority of traffic by volume is video,” says William Webb, CEO of the consulting firm Commcisive. Asked whether any of that consumer traffic is at some point served up by enterprises, the answer is a decisive “no.” It’s mostly YouTube and streaming services like Netflix, he says. In short, that doesn’t raise enterprise concerns. Nokia predicts AI traffic boom AI is a different story. “Consumer- and enterprise-generated AI traffic imposes a substantial impact on the wide-area network (WAN) by adding AI workloads processed by data centers across the WAN. AI traffic does not stay inside one data center; it moves across edge, metro, core, and cloud infrastructure, driving dense lateral flows and new capacity demands,” the report says. An

Read More »

Cisco amps up Silicon One line, delivers new systems and optics for AI networking

Those building blocks include the new G300 as well as the G200 51.2 Tbps chip, which is aimed at spine and aggregation applications, and the G100 25.6 Tbps chip, which is aimed at leaf operations. Expanded portfolio of Silicon One P200-powered systems Cisco in October rolled out the P200 Silicon One chip and the high-end, 51.2 Tbps 8223 router aimed at distributed AI workloads. That system supports Octal Small Form-Factor Pluggable (OSFP) and Quad Small Form-Factor Pluggable Double Density (QSFP-DD) optical form factors that help the box support geographically dispersed AI clusters. Cisco grew the G200 family this week with the addition of the 8122X-64EF-O, a 64x800G switch that will run the SONiC OS and includes support for Cisco 800G Linear Pluggable Optics (LPO) connectivity. LPO components typically set up direct links between fiber optic modules, eliminating the need for traditional components such as a digital signal processor. Cisco said its P200 systems running IOS XR software now better support core routing services to allow data-center-to-data-center links and data center interconnect applications. In addition, Cisco introduced a P200-powered 88-LC2-36EF-M line card, which delivers 28.8T of capacity. “Available for both our 8-slot and 18-slot modular systems, this line card enables up to an unprecedented 518.4T of total system bandwidth, the highest in the industry,” wrote Guru Shenoy, senior vice president of the Cisco provider connectivity group, in a blog post about the news. “When paired with Cisco 800G ZR/ZR+ coherent pluggable optics, these systems can easily connect sites over 1,000 kilometers apart, providing the high-density performance needed for modern data center interconnects and core routing.”

Read More »

NetBox Labs ships AI copilot designed for network engineers, not developers

Natural language for network engineers Beevers explained that network operations teams face two fundamental barriers to automation. First, they lack accurate data about their infrastructure. Second, they aren’t software developers and shouldn’t have to become them. “These are not software developers. They are network engineers or IT infrastructure engineers,” Beevers said. “The big realization for us through the copilot journey is they will never be software developers. Let’s stop trying to make them be. Let’s let these computers that are really good at being software developers do that, and let’s let the network engineers or the data center engineers be really good at what they’re really good at.”  That vision drove the development of NetBox Copilot’s natural language interface and its capabilities. Grounding AI in infrastructure reality The challenge with deploying AI  in network operations is trust. Generic large language models hallucinate, produce inconsistent results, and lack the operational context to make reliable decisions. NetBox Copilot addresses this by grounding the AI agent in NetBox’s comprehensive infrastructure data model. NetBox serves as the system of record for network and infrastructure teams, maintaining a semantic map of devices, connections, IP addressing, rack layouts, power distribution and the relationships between these elements. Copilot has native awareness of this data structure and the context it provides. This enables queries that would be difficult or impossible with traditional interfaces. Network engineers can ask “Which devices are missing IP addresses?” to validate data completeness, “Who changed this prefix last week?” for change tracking and compliance, or “What depends on this switch?” for impact analysis before maintenance windows.

Read More »

US pushes voluntary pact to curb AI data center energy impact

Others note that cost pressure isn’t limited to the server rack. Danish Faruqui, CEO of Fab Economics, said the AI ecosystem is layered from silicon to software services, creating multiple points where infrastructure expenses eventually resurface. “Cloud service providers are likely to gradually introduce more granular pricing models across cloud, AI, and SaaS offerings, tailored by customer type, as they work to absorb the costs associated with the White House energy and grid compact,” Faruqui said. This may not show up as explicit energy surcharges, but instead surface through reduced discounts, higher spending commitments, and premiums for guaranteed capacity or performance. “Smaller enterprises will feel the impact first, while large strategic customers remain insulated longer,” Rawat said. “Ultimately, the compact would delay and redistribute cost pressure; it does not eliminate it.”

Implications for data center design

The proposal is also likely to accelerate changes in how AI facilities are designed. “Data centers will evolve into localized microgrids that combine utility power with on-site generation and higher-level implementation of battery energy storage systems,” Faruqui said. “Designing for grid interaction will become imperative for AI data centers, requiring intelligent, high-speed switching gear, increased battery energy storage capacity for frequency regulation, and advanced control systems that can manage on-site resources.”

Read More »

Microsoft will invest $80B in AI data centers in fiscal 2025

And Microsoft isn’t the only one that is ramping up its investments into AI-enabled data centers. Rival cloud service providers are all investing in either upgrading or opening new data centers to capture a larger chunk of business from developers and users of large language models (LLMs).  In a report published in October 2024, Bloomberg Intelligence estimated that demand for generative AI would push Microsoft, AWS, Google, Oracle, Meta, and Apple to devote a combined $200 billion to capex in 2025, up from $110 billion in 2023. Microsoft is one of the biggest spenders, followed closely by Google and AWS, Bloomberg Intelligence said. Its estimate of Microsoft’s capital spending on AI, at $62.4 billion for calendar 2025, is lower than Smith’s claim that the company will invest $80 billion in the fiscal year to June 30, 2025. Both figures, though, are way higher than Microsoft’s 2020 capital expenditure of “just” $17.6 billion. The majority of the increased spending is tied to cloud services and the expansion of AI infrastructure needed to provide compute capacity for OpenAI workloads. Separately, last October Amazon CEO Andy Jassy said his company planned total capex spend of $75 billion in 2024 and even more in 2025, with much of it going to AWS, its cloud computing division.
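To put the Bloomberg Intelligence estimate in perspective, the jump from $110 billion to $200 billion in combined capex is a roughly 82% increase over two years (the dollar figures are from the excerpt above; the percentage is my own arithmetic):

```python
capex_2023 = 110  # combined big-tech capex, $B, per Bloomberg Intelligence
capex_2025 = 200  # estimated combined capex, $B
growth = (capex_2025 - capex_2023) / capex_2023
print(f"{growth:.0%}")  # roughly 82% growth over two years
```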

Read More »

John Deere unveils more autonomous farm machines to address skilled labor shortage

Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More Self-driving tractors might be the path to self-driving cars. John Deere has revealed a new line of autonomous machines and tech across agriculture, construction and commercial landscaping. The Moline, Illinois-based John Deere has been in business for 187 years, yet it’s been a regular presence as a non-tech company showing off technology at the big tech trade show in Las Vegas and is back at CES 2025 with more autonomous tractors and other vehicles. This is not something we usually cover, but John Deere has a lot of data that is interesting in the big picture of tech. The message from the company is that there aren’t enough skilled farm laborers to do the work that its customers need. It’s been a challenge for most of the last two decades, said Jahmy Hindman, CTO at John Deere, in a briefing. Much of the tech will come this fall and after that. He noted that the average farmer in the U.S. is over 58 and works 12 to 18 hours a day to grow food for us. And he said the American Farm Bureau Federation estimates there are roughly 2.4 million farm jobs that need to be filled annually, and the agricultural work force continues to shrink. (This is my hint to the anti-immigration crowd).

John Deere’s autonomous 9RX Tractor. Farmers can oversee it using an app.

While each of these industries experiences its own set of challenges, a commonality across all is skilled labor availability. In construction, about 80% of contractors struggle to find skilled labor. And in commercial landscaping, 86% of landscaping business owners can’t find labor to fill open positions, he said. “They have to figure out how to do

Read More »

2025 playbook for enterprise AI success, from agents to evals

2025 is poised to be a pivotal year for enterprise AI. The past year has seen rapid innovation, and this year will see the same. This has made it more critical than ever to revisit your AI strategy to stay competitive and create value for your customers. From scaling AI agents to optimizing costs, here are the five critical areas enterprises should prioritize for their AI strategy this year.

1. Agents: the next generation of automation

AI agents are no longer theoretical. In 2025, they’re indispensable tools for enterprises looking to streamline operations and enhance customer interactions. Unlike traditional software, agents powered by large language models (LLMs) can make nuanced decisions, navigate complex multi-step tasks, and integrate seamlessly with tools and APIs. At the start of 2024, agents were not ready for prime time, making frustrating mistakes like hallucinating URLs. They started getting better as frontier large language models themselves improved. “Let me put it this way,” said Sam Witteveen, cofounder of Red Dragon, a company that develops agents for companies, and that recently reviewed the 48 agents it built last year. “Interestingly, the ones that we built at the start of the year, a lot of those worked way better at the end of the year just because the models got better.” Witteveen shared this in the video podcast we filmed to discuss these five big trends in detail. Models are getting better and hallucinating less, and they’re also being trained to do agentic tasks. Another feature that the model providers are researching is a way to use the LLM as a judge, and as models get cheaper (something we’ll cover below), companies can use three or more models to
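One simple way to combine three or more models, in the spirit of the cross-checking the passage describes, is majority voting over their answers. A minimal sketch, with stubbed callables standing in for real model API clients (all names here are hypothetical illustrations, not part of any vendor’s API):

```python
from collections import Counter

def majority_answer(models, prompt):
    """Ask several models the same prompt and keep the most common answer.

    `models` is a list of callables taking a prompt string; in a real
    deployment each would wrap an LLM API client.
    Returns (winning answer, fraction of models that agreed).
    """
    answers = [model(prompt) for model in models]
    winner, count = Counter(answers).most_common(1)[0]
    return winner, count / len(answers)

# Hypothetical stand-ins for three model endpoints:
model_a = lambda prompt: "Paris"
model_b = lambda prompt: "Paris"
model_c = lambda prompt: "Lyon"

answer, agreement = majority_answer([model_a, model_b, model_c],
                                    "What is the capital of France?")
```

An LLM-as-judge setup replaces the vote with a fourth model that scores or picks among the candidate answers, which handles free-form outputs that rarely match exactly.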

Read More »

OpenAI’s red teaming innovations define new essentials for security leaders in the AI era

OpenAI has taken a more aggressive approach to red teaming than its AI competitors, demonstrating its security teams’ advanced capabilities in two areas: multi-step reinforcement and external red teaming. OpenAI recently released two papers that set a new competitive standard for improving the quality, reliability and safety of AI models in these two techniques and more. The first paper, “OpenAI’s Approach to External Red Teaming for AI Models and Systems,” reports that specialized teams outside the company have proven effective in uncovering vulnerabilities that might otherwise have made it into a released model because in-house testing techniques may have missed them. In the second paper, “Diverse and Effective Red Teaming with Auto-Generated Rewards and Multi-Step Reinforcement Learning,” OpenAI introduces an automated framework that relies on iterative reinforcement learning to generate a broad spectrum of novel, wide-ranging attacks.

Going all-in on red teaming pays practical, competitive dividends

It’s encouraging to see competitive intensity in red teaming growing among AI companies. When Anthropic released its AI red team guidelines in June of last year, it joined AI providers including Google, Microsoft, Nvidia, OpenAI, and even the U.S.’s National Institute of Standards and Technology (NIST), which all had released red teaming frameworks. Investing heavily in red teaming yields tangible benefits for security leaders in any organization. OpenAI’s paper on external red teaming provides a detailed analysis of how the company strives to create specialized external teams that include cybersecurity and subject matter experts. The goal is to see if knowledgeable external teams can defeat models’ security perimeters and find gaps in their security, biases and controls that prompt-based testing couldn’t find. What makes OpenAI’s recent papers noteworthy is how well they define using human-in-the-middle

Read More »