Supercharge Your RAG with Multi-Agent Self-RAG

Introduction

Many of us might have tried to build a RAG application and noticed it falls significantly short of addressing real-life needs. Why is that? It’s because many real-world problems require multiple steps of information retrieval and reasoning. We need our agent to perform those as humans normally do, yet most RAG applications fall short of this.

This article explores how to supercharge your RAG application by making its data retrieval and reasoning process similar to how a human would, under a multi-agent framework. The framework presented here is based on the Self-RAG strategy but has been significantly modified to enhance its capabilities. Prior knowledge of the original strategy is not necessary for reading this article.

Real-life Case

Consider this: I was going to fly from Delhi to Munich (let’s assume I am taking a flight operated by an EU airline), but I was denied boarding. Now I want to know what the compensation should be.

These two webpages contain relevant information, so I add them to my vector store and try to have my agent answer the question for me by retrieving the right information.

Now, I pass this question to the vector store: “how much can I receive if I am denied boarding, for flights from Delhi to Munich?”.

– – – – – – – – – – – – – – – – – – – – – – – – –
Overview of US Flight Compensation Policies To get compensation for delayed flights, you should contact your airline via their customer service or go to the customer service desk. At the same time, you should bear in mind that you will only receive compensation if the delay is not weather-related and is within the carrier's control. According to the US Department of Transportation, US airlines are not required to compensate you if a flight is cancelled or delayed. You can be compensated if you are bumped or moved from an overbooked flight. If your provider cancels your flight less than two weeks before departure and you decide to cancel your trip entirely, you can receive a refund of both pre-paid baggage fees and your plane ticket. There will be no refund if you choose to continue your journey. In the case of a delayed flight, the airline will rebook you on a different flight. According to federal law, you will not be provided with money or other compensation. Comparative Analysis of EU vs. US Flight Compensation Policies
– – – – – – – – – – – – – – – – – – – – – – – – –
(AUTHOR-ADDED NOTE: IMPORTANT, PAY ATTENTION TO THIS)
Short-distance flight delays – if it is up to 1,500 km, you are due 250 Euro compensation.
Medium distance flight delays – for all the flights between 1,500 and 3,500 km, the compensation should be 400 Euro.
Long-distance flight delays – if it is over 3,500 km, you are due 600 Euro compensation. To receive this kind of compensation, the following conditions must be met; Your flight starts in a non-EU member state or in an EU member state and finishes in an EU member state and is organised by an EU airline. Your flight reaches the final destination with a delay that exceeds three hours. There is no force majeure.
– – – – – – – – – – – – – – – – – – – – – – – – –
Compensation policies in the EU and US are not the same, which implies that it is worth knowing more about them. While you can always count on Skycop flight cancellation compensation, you should still get acquainted with the information below.
– – – – – – – – – – – – – – – – – – – – – – – – –
Compensation for flight regulations EU: The EU does regulate flight delay compensation, which is known as EU261. US: According to the US Department of Transportation, every airline has its own policies about what should be done for delayed passengers. Compensation for flight delays EU: Just like in the United States, compensation is not provided when the flight is delayed due to uncontrollable reasons. However, there is a clear approach to compensation calculation based on distance. For example, if your flight was up to 1,500 km, you can receive 250 euros. US: There are no federal requirements. That is why every airline sets its own limits for compensation in terms of length. However, it is usually set at three hours. Overbooking EU: In the EU, they call for volunteers if the flight is overbooked. These people are entitled to a choice of: Re-routing to their final destination at the earliest opportunity. Refund of their ticket cost within a week if not travelling. Re-routing at a later date at the person's convenience.

Unfortunately, they contain only generic flight compensation policies, without telling me how much I can expect when denied boarding from Delhi to Munich specifically. If the RAG agent takes these as the sole context, it can only provide a generic answer about flight compensation policy, without giving the answer we want.

However, while the documents are not immediately useful, there is an important insight contained in the 2nd piece of context: compensation varies according to flight distance. If the RAG agent thinks more like a human, it should follow these steps to provide an answer:

  1. Based on the retrieved context, reason that compensation varies with flight distance
  2. Next, retrieve the flight distance between Delhi and Munich
  3. Given the distance (which is around 5900km), classify the flight as a long-distance one
  4. Combined with the previously retrieved context, figure out I am due 600 EUR, assuming other conditions are fulfilled
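
The classification in steps 3 and 4 is simple enough to express directly in code. A minimal sketch, using the EU261 distance tiers quoted in the retrieved context (the function name is my own):

```python
def eu261_compensation(distance_km: float) -> int:
    """Map flight distance to the EU261 compensation tier in euros."""
    if distance_km <= 1500:
        return 250   # short distance
    elif distance_km <= 3500:
        return 400   # medium distance
    else:
        return 600   # long distance

# Delhi to Munich is roughly 5,900 km, so this is a long-distance flight
print(eu261_compensation(5900))  # 600
```

The hard part for the agent is not this arithmetic, but realizing that the distance needs to be retrieved in the first place.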

This example demonstrates how a simple RAG, in which a single retrieval is made, falls short for several reasons:

  1. Complex Queries: Users often have questions that a simple search can’t fully address. For example, “What’s the best smartphone for gaming under $500?” requires consideration of multiple factors like performance, price, and features, which a single retrieval step might miss.
  2. Deep Information: Some information is spread across documents. For example, research papers, medical records, or legal documents often include references that must be made sense of before one can fully understand a given article. Multiple retrieval steps help dig deeper into the content.

Multiple retrievals supplemented with human-like reasoning allow for a more nuanced, comprehensive, and accurate response, adapting to the complexity and depth of user queries.

Multi-Agent Self-RAG

Here I explain the reasoning process behind this strategy; afterwards, I will provide the code to show you how to achieve it!

Note: For readers interested in knowing how my approach differs from the original Self-RAG, I will describe the discrepancies in quotation boxes like this. But general readers who are unfamiliar with the original Self-RAG can skip them.

In the below graphs, each circle represents a step (aka Node), which is performed by a dedicated agent working on the specific problem. We orchestrate them to form a multi-agent RAG application.

1st iteration: Simple RAG

A simple RAG chain

This is just the vanilla RAG approach I described in “Real-life Case”, represented as a graph. After Retrieve documents, the new_documents will be used as input for Generate Answer. Nothing special, but it serves as our starting point.
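
As a rough sketch, the vanilla chain is just two steps wired together (the retriever and LLM here are stubs standing in for a real vector store and model, to show the data flow only):

```python
def retrieve(state: dict, retriever) -> dict:
    """Fetch documents for the current question."""
    docs = retriever(state["question"])
    return {**state, "new_documents": docs}

def generate(state: dict, llm) -> dict:
    """Answer using only what was just retrieved."""
    context = "\n".join(state["new_documents"])
    answer = llm(f"Context:\n{context}\n\nQuestion: {state['question']}")
    return {**state, "generation": answer}

# Stubs standing in for a vector store and an LLM
fake_retriever = lambda q: ["EU261 compensation tiers: 250/400/600 EUR by distance"]
fake_llm = lambda prompt: "Compensation depends on flight distance."

state = {"question": "How much for denied boarding, Delhi to Munich?"}
state = retrieve(state, fake_retriever)
state = generate(state, fake_llm)
print(state["generation"])
```

With only one retrieval pass, whatever the retriever misses on the first try can never enter the context.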

2nd iteration: Digest documents with “Grade documents”

Reasoning like humans do

Remember I said in the “Real-life Case” section, that as a next step, the agent should “reason that compensation varies with flight distance”? The Grade documents step is exactly for this purpose.

Given the new_documents, the agent will try to output two items:

  1. useful_documents: Comparing against the question asked, the agent determines whether the documents are useful, and retains those deemed useful for future reference. For example, since our question does not concern US compensation policies, documents describing them are discarded, leaving only those for the EU
  2. hypothesis: Based on the documents, the agent forms a hypothesis about how the question can be answered: in this case, that the flight distance needs to be identified

Notice how the above reasoning resembles human thinking! But still, while these outputs are useful, we need to instruct the agent to use them as input for performing the next document retrieval. Without this, the answer provided in Generate answer is still not useful.

useful_documents are appended on each document retrieval loop, instead of being overwritten, to keep a memory of documents previously deemed useful. hypothesis is formed from useful_documents and new_documents to provide an “abstract reasoning” that informs how the query is to be transformed subsequently.

The hypothesis is especially useful when no useful documents can be identified initially, as the agent can still form a hypothesis from documents that are not immediately deemed useful, or that bear only an indirect relationship to the question at hand, to inform what questions to ask next.
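
A minimal sketch of this grading step, with the two LLM calls abstracted behind `grader` and `hypothesizer` callables (these names and the plain-dict state are my simplification, not the article's final code):

```python
def grade_documents(state: dict, grader, hypothesizer) -> dict:
    """Keep documents the grader deems useful, and form a hypothesis
    about what extra information is still needed."""
    useful = [d for d in state["new_documents"]
              if grader(state["question"], d)]
    # Accumulate useful documents across iterations instead of overwriting
    state["useful_documents"] = state.get("useful_documents", []) + useful
    # The hypothesis may draw on all retrieved documents, even those not
    # graded useful, since they can hint at what to ask next
    state["hypothesis"] = hypothesizer(
        state["question"], state["useful_documents"], state["new_documents"]
    )
    return state

# Demo with stub callables standing in for LLM calls
demo = {
    "question": "denied boarding compensation, Delhi to Munich",
    "new_documents": ["EU261 tiers by distance", "US refund policy"],
}
demo = grade_documents(
    demo,
    grader=lambda q, d: d.startswith("EU"),
    hypothesizer=lambda q, useful, new: "need the Delhi-Munich flight distance",
)
print(demo["useful_documents"])  # ['EU261 tiers by distance']
```

In the real application, both callables would be LLM prompts rather than lambdas.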

3rd iteration: Brainstorm new questions to ask

Suggest questions for additional information retrieval

We have the agent reflect upon whether the answer is useful and grounded in context. If not, it should proceed to Transform query to ask further questions.

The output new_queries will be a list of new questions that the agent considers useful for obtaining extra information. Given the useful_documents (compensation policies for the EU) and the hypothesis (the need to identify the flight distance between Delhi and Munich), it asks questions like “What is the distance between Delhi and Munich?”

Now we are ready to use the new_queries for further retrieval!

The transform_query node will use useful_documents (which are accumulated per iteration, instead of being overwritten) and hypothesis as input for providing the agent directions to ask new questions.

The new questions will be a list of questions (instead of a single question), kept separate from the original question so that the original question remains in the state; otherwise, the agent could lose track of it after multiple iterations.
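
Sketched in the same style (again with the LLM call behind a callable, a simplification of my own), the node appends a fresh batch of sub-questions while leaving the original question untouched:

```python
def transform_query(state: dict, question_generator) -> dict:
    """Generate follow-up questions from the hypothesis and the documents
    kept so far; never overwrite the original question."""
    batch = question_generator(
        state["question"],
        state.get("useful_documents", []),
        state.get("hypothesis", ""),
    )
    # new_queries is a list of batches, one batch per iteration
    state["new_queries"] = state.get("new_queries", []) + [batch]
    return state

# Demo with a stub question generator
demo = {
    "question": "denied boarding, Delhi to Munich",
    "useful_documents": ["EU261 tiers"],
    "hypothesis": "need flight distance",
}
demo = transform_query(
    demo,
    question_generator=lambda q, docs, hyp:
        ["What is the distance between Delhi and Munich?"],
)
print(demo["new_queries"])
```

Keeping each batch separate also makes it easy to inspect, per iteration, what the agent decided to ask.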

Final iteration: Further retrieval with new questions

Issuing new queries to retrieve extra documents

The output new_queries from Transform query will be passed to the Retrieve documents step, forming a retrieval loop.

Since the question “What is the distance between Delhi and Munich?” is asked, we can expect the flight distance is then retrieved as new_documents, and subsequently graded as useful_documents, further used as an input for Generate answer.

The grade_documents node will compare the documents against both the original question and the new_queries list, so that documents considered useful for the new_queries, even if not for the original question, are kept.

This is because those documents might help answer the original question indirectly, by being relevant to the new_queries (like “What is the distance between Delhi and Munich?”).

Final answer!

Equipped with this new context about flight distance, the agent is now ready to provide the right answer: 600 EUR!

Next, let us dive into the code to see how this multi-agent RAG application is created.

Implementation

The source code can be found here. Our multi-agent RAG application involves iterations and loops, and LangGraph is a great library for building such complex multi-agent applications. If you are not familiar with LangGraph, I strongly suggest you have a look at LangGraph’s Quickstart guide to understand more about it!

To keep this article concise, I will focus on the key code snippets only.

Important note: I am using OpenRouter as the LLM interface, but the code can be easily adapted for other LLM interfaces. Also, while in my code I am using Claude 3.5 Sonnet as the model, you can use any LLM as long as it supports tools as a parameter (check this list here), so you can also run this with other models, like DeepSeek V3 and OpenAI o1!

State definition

In the previous section, I defined various elements, e.g. new_documents and hypothesis, that are to be passed to each step (aka Node); in LangGraph’s terminology, these elements are called State.

We define the State formally with the following snippet.

from typing import List, Annotated
from typing_extensions import TypedDict

def append_to_list(original: list, new: list) -> list:
    original.append(new)
    return original

def combine_list(original: list, new: list) -> list:
    return original + new

class GraphState(TypedDict):
    """
    Represents the state of our graph.

    Attributes:
        question: question
        generation: LLM generation
        new_documents: newly retrieved documents for the current iteration
        useful_documents: documents that are considered useful
        graded_documents: documents that have been graded
        new_queries: newly generated questions
        hypothesis: hypothesis
    """

    question: str
    generation: str
    new_documents: List[str]
    useful_documents: Annotated[List[str], combine_list]
    graded_documents: List[str]
    new_queries: Annotated[List[str], append_to_list]
    hypothesis: str
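
The two reducer functions behave differently on purpose, which is easy to see in isolation: combine_list flattens useful documents into one growing list, while append_to_list keeps each iteration's queries as a separate batch (which is why the inspection code later iterates over batches). A self-contained demonstration:

```python
def append_to_list(original: list, new: list) -> list:
    original.append(new)
    return original

def combine_list(original: list, new: list) -> list:
    return original + new

# combine_list flattens: one growing list of useful documents
useful = combine_list(["doc A"], ["doc B", "doc C"])
print(useful)    # ['doc A', 'doc B', 'doc C']

# append_to_list nests: each iteration's queries stay as a batch
queries = append_to_list([["q1", "q2"]], ["q3", "q4"])
print(queries)   # [['q1', 'q2'], ['q3', 'q4']]
```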

Graph definition

This is where we combine the different steps to form a “Graph”, which is a representation of our multi-agent application. The definitions of various steps (e.g. grade_documents) are represented by their respective functions.

from langgraph.graph import END, StateGraph, START
from langgraph.checkpoint.memory import MemorySaver
from IPython.display import Image, display

workflow = StateGraph(GraphState)

# Define the nodes
workflow.add_node("retrieve", retrieve)  # retrieve
workflow.add_node("grade_documents", grade_documents)  # grade documents
workflow.add_node("generate", generate)  # generate
workflow.add_node("transform_query", transform_query)  # transform query

# Build graph
workflow.add_edge(START, "retrieve")
workflow.add_edge("retrieve", "grade_documents")
workflow.add_conditional_edges(
    "grade_documents",
    decide_to_generate,
    {
        "transform_query": "transform_query",
        "generate": "generate",
    },
)
workflow.add_edge("transform_query", "retrieve")
workflow.add_conditional_edges(
    "generate",
    grade_generation_v_documents_and_question,
    {
        "useful": END,
        "not supported": "transform_query",
        "not useful": "transform_query",
    },
)

# Compile
memory = MemorySaver()
app = workflow.compile(checkpointer=memory)
display(Image(app.get_graph(xray=True).draw_mermaid_png()))

Running the above code, you should see this graphical representation of our RAG application. Notice how it is essentially equivalent to the graph I showed in the final iteration of “Multi-Agent Self-RAG”!

Visualizing the multi-agent RAG graph

After generate, if the answer is considered “not supported”, the agent will proceed to transform_query instead of to generate again, so that the agent will look for additional information rather than trying to regenerate answers based on existing context, which might not suffice for providing a “supported” answer.
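
The two routing functions referenced in the graph, decide_to_generate and grade_generation_v_documents_and_question, live in the source code; as a rough sketch of their logic, with the LLM grading checks abstracted into injectable callables (the shortened name and signatures are my simplification, not the repository's actual code):

```python
def decide_to_generate(state: dict) -> str:
    """Route after grading: generate only if useful documents survived,
    otherwise go ask new questions."""
    return "generate" if state.get("useful_documents") else "transform_query"

def grade_generation(state: dict, is_grounded, answers_question) -> str:
    """Route after generating: both checks must pass to reach END.
    Either failure routes to transform_query, never straight back to
    generate, so the agent searches for more information instead of
    re-answering from the same insufficient context."""
    if not is_grounded(state["generation"], state.get("useful_documents", [])):
        return "not supported"
    if not answers_question(state["generation"], state["question"]):
        return "not useful"
    return "useful"
```

In the real application, is_grounded corresponds to the hallucination check and answers_question to the generation-vs-question check seen in the logs below.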

Now we are ready to put the multi-agent application to the test! With the below code snippet, we ask the question: “how much can I receive if I am denied boarding, for flights from Delhi to Munich?”

from uuid import uuid4
from pprint import pprint

config = {"configurable": {"thread_id": str(uuid4())}}

# Run
inputs = {
    "question": "how much can I receive if I am denied boarding, for flights from Delhi to Munich?",
}
for output in app.stream(inputs, config):
    for key, value in output.items():
        # Node
        pprint(f"Node '{key}':")
        # Optional: print full state at each node
        # print(app.get_state(config).values)
    pprint("\n---\n")

# Final generation
pprint(value["generation"])

While output might vary (sometimes the application provides the answer without any iterations, because it “guessed” the distance between Delhi and Munich), it should look something like this, which shows the application went through multiple rounds of data retrieval for RAG.

---RETRIEVE---
"Node 'retrieve':"
'\n---\n'
---CHECK DOCUMENT RELEVANCE TO QUESTION---
---GRADE: DOCUMENT NOT RELEVANT---
---GRADE: DOCUMENT RELEVANT---
---GRADE: DOCUMENT NOT RELEVANT---
---GRADE: DOCUMENT NOT RELEVANT---
---ASSESS GRADED DOCUMENTS---
---DECISION: GENERATE---
"Node 'grade_documents':"
'\n---\n'
---GENERATE---
---CHECK HALLUCINATIONS---
'---DECISION: GENERATION IS NOT GROUNDED IN DOCUMENTS, RE-TRY---'
"Node 'generate':"
'\n---\n'
---TRANSFORM QUERY---
"Node 'transform_query':"
'\n---\n'
---RETRIEVE---
"Node 'retrieve':"
'\n---\n'
---CHECK DOCUMENT RELEVANCE TO QUESTION---
---GRADE: DOCUMENT NOT RELEVANT---
---GRADE: DOCUMENT NOT RELEVANT---
---GRADE: DOCUMENT RELEVANT---
---GRADE: DOCUMENT NOT RELEVANT---
---GRADE: DOCUMENT NOT RELEVANT---
---GRADE: DOCUMENT NOT RELEVANT---
---GRADE: DOCUMENT NOT RELEVANT---
---ASSESS GRADED DOCUMENTS---
---DECISION: GENERATE---
"Node 'grade_documents':"
'\n---\n'
---GENERATE---
---CHECK HALLUCINATIONS---
---DECISION: GENERATION IS GROUNDED IN DOCUMENTS---
---GRADE GENERATION vs QUESTION---
---DECISION: GENERATION ADDRESSES QUESTION---
"Node 'generate':"
'\n---\n'
('Based on the context provided, the flight distance from Munich to Delhi is '
 '5,931 km, which falls into the long-distance category (over 3,500 km). '
 'Therefore, if you are denied boarding on a flight from Delhi to Munich '
 'operated by an EU airline, you would be eligible for 600 Euro compensation, '
 'provided that:\n'
 '1. The flight is operated by an EU airline\n'
 '2. There is no force majeure\n'
 '3. Other applicable conditions are met\n'
 '\n'
 "However, it's important to note that this compensation amount is only valid "
 'if all the required conditions are met as specified in the regulations.')

And the final answer is what we aimed for!

Based on the context provided, the flight distance from Munich to Delhi is
5,931 km, which falls into the long-distance category (over 3,500 km).
Therefore, if you are denied boarding on a flight from Delhi to Munich
operated by an EU airline, you would be eligible for 600 Euro compensation,
provided that:
1. The flight is operated by an EU airline
2. There is no force majeure
3. Other applicable conditions are met

However, it's important to note that this compensation amount is only valid
if all the required conditions are met as specified in the regulations.

Inspecting the State, we see how the hypothesis and new_queries enhance the effectiveness of our multi-agent RAG application by mimicking the human thinking process.

Hypothesis

print(app.get_state(config).values.get('hypothesis',""))
--- Output ---
To answer this question accurately, I need to determine:

1. Is this flight operated by an EU airline? (Since Delhi is non-EU and Munich is EU)
2. What is the flight distance between Delhi and Munich? (To determine compensation amount)
3. Are we dealing with a denied boarding situation due to overbooking? (As opposed to delay/cancellation)

From the context, I can find information about compensation amounts based on distance, but I need to verify:
- If the flight meets EU compensation eligibility criteria
- The exact distance between Delhi and Munich to determine which compensation tier applies (250€, 400€, or 600€)
- If denied boarding compensation follows the same amounts as delay compensation

The context doesn't explicitly state compensation amounts specifically for denied boarding, though it mentions overbooking situations in the EU require offering volunteers re-routing or refund options.

Would you like me to proceed with the information available, or would you need additional context about denied boarding compensation specifically?

New Queries

for questions_batch in app.get_state(config).values.get('new_queries',""):
    for q in questions_batch:
        print(q)
--- Output ---
What is the flight distance between Delhi and Munich?
Does EU denied boarding compensation follow the same amounts as flight delay compensation?
Are there specific compensation rules for denied boarding versus flight delays for flights from non-EU to EU destinations?
What are the compensation rules when flying with non-EU airlines from Delhi to Munich?
What are the specific conditions that qualify as denied boarding under EU regulations?

Conclusion

Simple RAG, while easy to build, might fall short in tackling real-life questions. By incorporating a human-like thinking process into a multi-agent RAG framework, we make RAG applications much more practical.

*Unless otherwise noted, all images are by the author


Shape
Shape
Stay Ahead

Explore More Insights

Stay ahead with more perspectives on cutting-edge power, infrastructure, energy,  bitcoin and AI solutions. Explore these articles to uncover strategies and insights shaping the future of industries.

Shape

Cirrascale to offer on-prem Google Gemini models

Google Distributed Cloud can be deployed in customer-controlled environments, including installations that are disconnected from the Internet, which is a key requirement for some government and critical-infrastructure users. One of the big challenges is that these models are incredibly valuable and they need to be delivered in a trusted, secure

Read More »

Golden Pass LNG ships first export cargo

Editor’s Note: Updated Apr. 23 to include information provided by the US Energy Information Administration.  Golden Pass LNG, a joint venture between QatarEnergy and ExxonMobil Corp., has loaded and shipped its first LNG export cargo from the plant in Sabine Pass, Tex. The departure comes following first LNG production from Train 1 late last month. Once fully operational, Golden Pass LNG expects to export about 18 million tons/year (tpy) of LNG. Golden Pass LNG is the 10th LNG plant in the US, the US Energy Information Administration (EIA) noted in a separate release Apr. 23. It is the only new US LNG export plant currently expected to begin LNG shipments this year, EIA said. Construction and commissioning continue on Trains 2 and 3, which are expected to come online in turn, following stable operation of Train 1. EIA noted Golden Pass aims to start up Train 2 in second-half 2026 and Train 3 in first-half 2027. QatarEnergy holds 70% interest in Golden Pass LNG, while ExxonMobil holds the remaining 30%. LNG demand  ExxonMobil forecasts natural gas demand to rise 20% by 2050 and LNG demand to rise by 3% per year through 2050. The operator is developing four LNG projects and, by 2030, expects to double its supply compared to 2020 to more than 40 million tpy.

Read More »

Ecopetrol agrees to acquire equity stake in Brava Energia with plans for increased ownership

State-owned Ecopetrol SA, Bogotá, Colombia, has agreed to acquire a 26% equity stake in Brava Energia SA from a group of shareholders and plans to launch a tender offer to increase its ownership to 51%, which would give it control of the Brazilian oil and gas independent. The move would add exposure to roughly 81,000 boe/d of production and 459 MMboe of reserves, expanding Ecopetrol’s footprint in Brazil. Ecopetrol entered into share purchase agreement with Jive, Yellowstone, and Bloco Somah Printemps Quantum, which together constitute a group holding about 26% of the outstanding common shares of Brava Energia. Brava Energia, the second-largest independent company listed in the Brazilian market in terms of reserves and production, was incorporated in 2024 from the merger between 3R Petroleum Óleo e Gás SA and Enauta Participações SA. Completion of the deal is subject to certain conditions, including, among others, approval by Brazil’s Administrative Council for Economic Defense (CADE), the grant of certain waivers and consents considering Brava’s financing instruments and relevant commercial agreements, as well as the purchase by Ecopetrol SA, or one of its affiliates or subsidiaries within the Ecopetrol Group, of the number of shares required to achieve a 51% controlling stake of Brava’s voting share capital. Ecopetrol plans to launch a voluntary tender offer on the B3 stock exchange in Brazil to buy additional shares to reach 51% controlling stake at R$23.00 per share, subject to regulatory requirements and certain conditions. Ecopetrol in Brazil In Brazil, Ecopetrol, through subsidiary Ecopetrol Óleo e Gás do Brasil Ltda., holds 30% interest in 11 blocks in the southern area of Santos basin in consortium with Shell Brasil Petróleo Ltda. (operator, 70%).  The company also holds a 30% non-operated interest in Gato do Mato (BM-S-54) and Sul de Gato do Mato (production sharing agreement), which

Read More »

China leads global oil stockpiles in 2025

China, the United States, and Japan held the world’s largest strategic oil inventories as of December 2025, the US Energy Information Administration (EIA) said in a recent note.  The EIA examined significant global buildup in strategic oil inventories as of December 2025, prior to the International Energy Agency (IEA)-coordinated emergency release in March 2026 triggered by the Strait of Hormuz disruption. These reserves—first established by OECD countries in the 1970s—continue to serve as a critical buffer against supply shocks. China holds the largest volume of oil inventories globally. EIA estimates about 360 million bbl in government-held stocks and roughly 1 billion bbl in commercial inventories, bringing its total to nearly 1.4 billion bbl. The agency said China added about 1.1 million b/d to inventories in 2025, reflecting an aggressive stockpiling strategy. The US follows, with about 413 million bbl in its Strategic Petroleum Reserve (SPR) as of December 2025, alongside more than 400 million bbl in commercial crude stocks, EIA said. Japan ranks third, holding 263 million bbl in government reserves, with an additional 220 million bbl required under Japan’s Oil Stockpiling Act. OECD Europe held about 179 million bbl, and South Korea maintained roughly 79 million bbl.  Among non-OECD countries, estimates are less transparent, EIA noted. Saudi Arabia held about 82 million bbl, Iran 71 million bbl, and the UAE 34 million bbl in on-land inventories, while India’s SPR totaled 21.4 million bbl, with plans to expand storage capacity domestically and abroad. Global estimates remain conservative due to limited transparency and varying definitions of “strategic” inventories, EIA said. In most countries, only government or national oil company holdings are counted, though China is a key exception where commercial inventories are included due to state-directed stockpiling. EIA plans to update its assessment periodically in its Short-Term Energy Outlook beginning this May.

Read More »

Peace signals temper crude rally, Europe jet fuel tightness intensifies

Tentative diplomatic signals offer limited relief to markets still dominated by supply disruption concerns surrounding the Iran war. At the time of writing, Brent crude futures hovered around $105–106/bbl after earlier trading above $107/bbl, while West Texas Intermediate (WTI) held near $95–97/bbl. Prices softened modestly following reports of renewed diplomatic engagement. Iranian Foreign Minister Seyed Abbas Araghchi is expected to visit Pakistan for talks. Separately, Israel and Lebanon agreed to extend their ceasefire by 3 weeks after meetings with US officials in Washington. Stay updated on oil price volatility, shipping disruptions, LNG market analysis, and production output at OGJ’s Iran war content hub. Despite these developments, market participants remain cautious, with analysts warning that any easing in risk premiums may prove temporary. Ongoing tensions linked to the US-Iran conflict continue to disrupt flows through the Strait of Hormuz, a critical artery for global oil trade. In remarks at CNBC’s Converge Live conference, Fatih Birol, executive director of the International Energy Agency (IEA), described the situation as an unprecedented energy security challenge, noting that the strait is operating under what he termed a “double blockade,” severely constraining tanker movements. The impact is being felt acutely in refined product markets, particularly in Europe’s aviation sector. With Middle Eastern exports curtailed, European refiners have shifted output toward jet fuel production, though with limited flexibility. According to Frans Everts of Shell plc, refineries across the region are operating in “max jet mode,” with only marginal capacity to increase yields further. Inventory data underscore the tightening balance. Jet fuel and kerosene stocks in the Amsterdam-Rotterdam-Antwerp hub fell to 597,000 metric tons, the lowest level since April 2020, declining 10% year-on-year. 
In response, Europe has increasingly relied on imports from the US Gulf Coast to offset lost Middle Eastern supply. According to IEA, global oil supply

Read More »

Shell to expand Canadian operations with $16.4-billion acquisition of ARC Resources

Shell plc has agreed to acquire ARC Resources Ltd. in a transaction valued at about $16.4 billion, including $13.6 billion in equity and roughly $2.8 billion in assumed net debt and leases. The acquisition is expected to strengthen Shell’s integrated gas portfolio and expand its position in Canada through the addition of long-life Montney resources in British Columbia and Alberta, the companies said Apr. 27. “ARC is a high-quality, low-cost, and top-quartile low carbon intensity producer in the Montney that complements our existing footprint in Canada and strengthens our resource base for decades,” said Wael Sawan, Shell chief executive officer. “This establishes Canada as a heartland for Shell while furthering our strategy to deliver more value with less emissions.” ARC produced 374,000 boe/d in 2025 (before royalties). Its assets overlap with Shell’s existing Groundbirch position in British Columbia and the Gold Creek development in Alberta. Groundbirch supplies gas to the 14-million tonnes/year LNG Canada liquefaction plant (Shell, 40%), as well as to the domestic market.

Read More »

Brent holds above $100/bbl; US shale response remains restrained

Global crude markets remained firmly supported Apr. 27 as the ongoing Iran conflict and continued disruption in the Strait of Hormuz reinforced a persistent geopolitical risk premium, offsetting intermittent diplomatic signals. Brent crude traded in the upper-$100/bbl range, while West Texas Intermediate (WTI) held in the high-$90s/bbl, reflecting tight physical supply conditions and uncertainty surrounding Middle East export flows. Stay updated on oil price volatility, shipping disruptions, LNG market analysis, and production output at OGJ’s Iran war content hub. While diplomatic efforts between the US and Iran have produced occasional signs of progress—including reported proposals to reopen the strait—negotiations remain fragile. The situation has evolved into a prolonged stalemate, with neither a full escalation nor a clear resolution in sight. Current market structure reflects a geopolitically driven pricing regime, with volatility concentrated in near-term crude futures while longer-dated contracts remain relatively anchored. The impact of Iran-related supply disruptions is being priced primarily into prompt contracts, whereas deferred benchmarks—such as 2027 WTI—have moved more modestly, holding in the low-$70/bbl range. This divergence suggests that traders view the current supply shock as severe but not necessarily permanent, with expectations of eventual normalization. However, according to the latest Dallas Fed survey, 86% of US oil and gas executives view another future Hormuz disruption within the next 5 years as somewhat or very likely, while 40% do not expect normalization of Hormuz traffic by August. A further 35% believe less than 90% of shut-in Gulf production will eventually return. These figures suggest the industry is calibrating its medium-term strategy around a world of elevated and recurring geopolitical risk. 
US shale response remains restrained According to an analysis from Macquire, despite favorable pricing, the US upstream response is expected to be measured. With average breakeven levels near $43/bbl WTI, current prices offer highly attractive margins

Read More »

AI data flows force rethink of data center networking at Backblaze

According to a report that Backblaze released this morning, traffic from content delivery networks and from hosting and Internet service providers has stayed largely within historical norms over the past year. But traffic from hyperscalers and neoclouds fluctuated dramatically, with steep climbs in September and October and another uptick in March. Another network traffic change related to AI is geography. “Traditionally, it didn’t matter where cloud infrastructure was located,” says Nowak. But with AI workloads, if storage is close to compute, enterprises get lower latency and higher throughput. Today, Virginia and California have a high concentration of AI compute providers. This, in turn, brings in more storage companies. “In July, we chose to double our footprint in US East to increase the proximity to hyperscalers and neoclouds,” says Nowak. And that, in turn, leads to even more demand for compute, and even greater concentration. “There’s a snowball effect,” Nowak says. Why neoclouds for AI? Enterprises might think that they don’t need to worry about network traffic details if they’re using a hyperscaler for their AI workloads because the data and the processing both stay within the cloud. But there are advantages to using a third-party storage provider combined with neoclouds for the GPUs. According to a report released by Synergy Research Group in early April, neocloud revenues hit $9 billion in the fourth quarter of 2025, a 223% year-over-year increase. Revenues passed $25 billion for the whole year and are expected to hit $400 billion by 2031.

Read More »

TD Cowen: AI Adoption Is Already Here. Infrastructure Demand Is What Comes Next.

Enterprise AI adoption is no longer emerging. It is already embedded and beginning to scale in ways that will reshape data center demand. The latest TD Cowen GenAI Adoption Survey makes that clear. Across 689 U.S. enterprises, 92% are now using at least one major AI platform, with Microsoft Copilot, Google Gemini, and ChatGPT forming the core triad of daily enterprise tooling. That’s the baseline. The more important story is what comes next. AI is moving quickly from assistive software to autonomous systems, and that shift carries direct implications for compute demand, power consumption, and infrastructure design. From Copilots to Autonomous Systems Today’s enterprise AI footprint is already broad, but it is still largely human-in-the-loop. That is beginning to change. Roughly a third of respondents say they already have semi-autonomous AI agents running in production, while another large cohort is piloting or planning deployments over the next 12 to 18 months. By 2027, more than three-quarters expect to be running AI agents capable of executing multi-step workflows without human intervention. This is not incremental adoption. It is a step-function shift. Autonomous agents don’t just respond to prompts; they execute tasks, interact with enterprise systems, and continuously access data. For data centers, that translates into more persistent, baseline load: exactly the kind of demand profile that stresses power delivery, increases utilization, and accelerates capacity planning timelines. To wit: AI is moving from a bursty workload to a continuous one. ROI Is No Longer the Question At the same time, the debate around AI return on investment is effectively over. Three-quarters of respondents report positive ROI, while only a small minority report negative outcomes. A meaningful share is already seeing multiples of return on their investments. The implication seems straightforward: AI budgets are becoming durable. This is no longer experimental spend that

Read More »

BYOP Moves to the Center of Data Center Strategy

Self-Sufficiency Becomes a Feature, Not a Risk Consider Wyoming’s Project Jade, where county commissioners approved an AI campus tied to 2.7 GW of new natural gas-fired generation being developed by Tallgrass Energy. Reporting from POWER described the project as a “bring your own power” model designed for a high degree of self-sufficiency, with a mix of natural gas generation and Bloom fuel cells. The campus is expected to scale significantly over time. What stands out is not only the size, but the positioning. Self-sufficiency is becoming a selling point both for developers seeking to de-risk timelines, and for local stakeholders wary of overloading existing utility infrastructure. Fuel Cells and Nuclear: The Middle Ground and the Long Game Fuel cells occupy an important middle ground in this shift. Bloom Energy’s 2026 report positions fuel cells as a leading onsite option due to shorter lead times, modular deployment, and lower local emissions. Market activity suggests that interest is real. For developers, fuel cells can be easier to permit than large turbine installations and can be deployed incrementally. That makes them effective as bridge-to-grid solutions or as permanent components of hybrid architectures. Advanced nuclear remains the most strategically significant, but least immediate, BYOP pathway. Companies including Switch and other data center operators have explored partnerships with Oklo around its Aurora small modular reactor design. Nuclear holds long-term appeal because it offers firm, low-carbon power at scale. But for current AI buildouts, it remains a future option rather than a near-term construction solution. The immediate reality is that gas and modular onsite systems are closing the time-to-power gap, while nuclear is being positioned as a longer-duration successor as licensing and deployment timelines evolve. The model itself is also evolving. BYOP is beginning to blur the line between developer, energy provider, and compute customer. 

Read More »

Microsoft Builds for Two Worlds: Sovereign Cloud and AI Factories

So far in 2026, across the United States and overseas, Microsoft is building an infrastructure portfolio at full hyperscale. The strategy runs on two tracks. The first is familiar: sovereign cloud expansion involving new regions, local data residency, and compliance-driven enterprise infrastructure. The second is larger and more consequential: purpose-built AI factory campuses designed for dense GPU clusters, liquid cooling, private fiber, and power acquisition at a scale that extends far beyond traditional cloud infrastructure. Despite reports last year that Microsoft was pulling back on data center development, the company is accelerating. It is not only advancing its own large-scale campuses, but also absorbing premium AI capacity originally aligned with OpenAI. In Texas and Norway, projects tied to OpenAI’s infrastructure plans have shifted back into Microsoft’s orbit. Even after contractual changes gave OpenAI greater flexibility to source compute elsewhere, Microsoft remains the market’s most reliable backstop buyer for top-tier AI infrastructure. It no longer needs to control every OpenAI build to maintain its position. In 2026, Microsoft is still the company best positioned to turn uncertain AI demand into deployed capacity, e.g. concrete, steel, power, and silicon at scale. Building at Industrial Scale The clearest indicator of Microsoft’s intent is its capital spending. In its January 2026 earnings cycle, Reuters reported that Microsoft’s quarterly capital expenditures reached a record $37.5 billion, up nearly 66% year over year. The company’s cloud backlog rose to $625 billion, with roughly 45% of remaining performance obligations tied to OpenAI. About two-thirds of that quarterly capex was directed toward compute chips. To be clear: this is no speculative buildout. Microsoft is deploying capital against a massive, committed demand pipeline, even as it maintains significant exposure to OpenAI-driven workloads. 
The company is solving two infrastructure problems at once: supporting broad Azure and Copilot growth, while ensuring

Read More »

AI’s Execution Era: Aligned and Netrality on Power, Speed, and the New Data Center Reality

At Data Center World 2026, the industry didn’t need convincing that something fundamental has shifted. “This feels different,” said Bill Kleyman as he opened a keynote fireside with Phill Lawson-Shanks and Amber Caramella. “In the past 24 months, we’ve seen more evolution… than in the two decades before.” What followed was less a forecast than a field report from the front lines of the AI infrastructure buildout—where demand is immediate, power is decisive, and execution is everything. A Different Kind of Growth Cycle For Caramella, the shift starts with scale—and speed. “What feels fundamentally different is just the sheer pace and breadth of the demand combined with a real shift in architecture,” she said. Vacancy rates have collapsed even as capacity expands. AI workloads are not just additive—they are redefining absorption curves across the market. But the deeper change is behavioral. “Over 75% of people are using AI in their day-to-day business… and now the conversation is shifting to agentic AI,” Caramella noted. That shift—from tools to delegated workflows—points to a second wave of infrastructure demand that has not yet fully materialized. Lawson-Shanks framed the transformation in more structural terms. The industry, he said, has always followed a predictable chain: workload → software → hardware → facility → location. That chain has broken. “We had a very predictable industry… prior to Covid. And Covid changed everything,” he said, describing how hyperscale demand compressed deployment cycles overnight. What followed was a surge that utilities—and supply chains—were not prepared to meet. From Capacity to Constraint: Power Becomes Strategy If AI has a gating factor, it is no longer compute. It is power. “Before it used to be an operational convenience,” Caramella said. “Now it’s a strategic advantage—or constraint if you don’t have it.” That shift is reshaping executive decision-making. Power is no

Read More »

The Trillion-Dollar AIDC Boom Gets Real: Omdia Maps the Path From Megaclusters to Microgrids

The AI data center buildout is getting bigger, denser, and more electrically complex than even many bullish observers expected. That was the core message from Omdia’s Data Center World analyst summit, where Senior Director Vlad Galabov and Practice Lead Shen Wang laid out a view of the market that has grown more expansive in just the past year. What had been a large-scale infrastructure story is now, in Omdia’s telling, something closer to a full-stack industrial transition: hyperscalers are still leading, but enterprises, second-tier cloud providers, and new AI use cases are beginning to add demand on top of demand. Omdia’s updated forecast reflects that shift. Galabov said the firm has now raised its 2030 projection for data center investment beyond the $1.6 trillion figure it showed a year ago, arguing that surging AI usage, expanding buyer classes, and the emergence of new power infrastructure categories have all forced a rethink. “One of the reasons why we raised it is that people keep using more AI,” Galabov said. “And that just means more money, because we need to buy more GPUs to run the AI.” That is the simple version. The more consequential one is that AI is no longer behaving like a contained technology cycle. It is spilling outward into adjacent infrastructure markets, including batteries, gas-fired onsite generation, and high-voltage DC power architectures that until recently sat well outside the mainstream data center conversation. A Market Moving Faster Than the Forecasts Galabov opened by revisiting the predictions Omdia made last year for 2030. On several fronts, he said, the market is already validating them faster than expected. AI applications are becoming commonplace. AI has become the dominant driver of data center investment. Self-generation is no longer a fringe strategy. Even some of the rack-scale architecture concepts that once looked

Read More »

Microsoft will invest $80B in AI data centers in fiscal 2025

And Microsoft isn’t the only one that is ramping up its investments into AI-enabled data centers. Rival cloud service providers are all investing in either upgrading or opening new data centers to capture a larger chunk of business from developers and users of large language models (LLMs). In a report published in October 2024, Bloomberg Intelligence estimated that demand for generative AI would push Microsoft, AWS, Google, Oracle, Meta, and Apple to devote a combined $200 billion to capex in 2025, up from $110 billion in 2023. Microsoft is one of the biggest spenders, followed closely by Google and AWS, Bloomberg Intelligence said. Its estimate of Microsoft’s capital spending on AI, at $62.4 billion for calendar 2025, is lower than Smith’s claim that the company will invest $80 billion in the fiscal year to June 30, 2025. Both figures, though, are way higher than Microsoft’s 2020 capital expenditure of “just” $17.6 billion. The majority of the increased spending is tied to cloud services and the expansion of AI infrastructure needed to provide compute capacity for OpenAI workloads. Separately, last October Amazon CEO Andy Jassy said his company planned total capex spend of $75 billion in 2024 and even more in 2025, with much of it going to AWS, its cloud computing division.

Read More »

John Deere unveils more autonomous farm machines to address skilled labor shortage

Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More Self-driving tractors might be the path to self-driving cars. John Deere has revealed a new line of autonomous machines and tech across agriculture, construction and commercial landscaping. The Moline, Illinois-based John Deere has been in business for 187 years, yet it has been a regular at the big tech trade show in Las Vegas as a non-tech company showing off technology, and is back at CES 2025 with more autonomous tractors and other vehicles. This is not something we usually cover, but John Deere has a lot of data that is interesting in the big picture of tech. The message from the company is that there aren’t enough skilled farm laborers to do the work that its customers need. It’s been a challenge for most of the last two decades, said Jahmy Hindman, CTO at John Deere, in a briefing. Much of the tech will come this fall and after that. He noted that the average farmer in the U.S. is over 58 and works 12 to 18 hours a day to grow food for us. And he said the American Farm Bureau Federation estimates there are roughly 2.4 million farm jobs that need to be filled annually, and the agricultural workforce continues to shrink. (This is my hint to the anti-immigration crowd.) John Deere’s autonomous 9RX Tractor. Farmers can oversee it using an app. While each of these industries experiences its own set of challenges, a commonality across all is skilled labor availability. In construction, about 80% of contractors struggle to find skilled labor. And in commercial landscaping, 86% of landscaping business owners can’t find labor to fill open positions, he said. “They have to figure out how to do

Read More »

2025 playbook for enterprise AI success, from agents to evals

2025 is poised to be a pivotal year for enterprise AI. The past year has seen rapid innovation, and this year will see the same. This has made it more critical than ever to revisit your AI strategy to stay competitive and create value for your customers. From scaling AI agents to optimizing costs, here are the five critical areas enterprises should prioritize for their AI strategy this year. 1. Agents: the next generation of automation AI agents are no longer theoretical. In 2025, they’re indispensable tools for enterprises looking to streamline operations and enhance customer interactions. Unlike traditional software, agents powered by large language models (LLMs) can make nuanced decisions, navigate complex multi-step tasks, and integrate seamlessly with tools and APIs. At the start of 2024, agents were not ready for prime time, making frustrating mistakes like hallucinating URLs. They started getting better as frontier large language models themselves improved. “Let me put it this way,” said Sam Witteveen, cofounder of Red Dragon, a company that develops agents for companies and recently reviewed the 48 agents it built last year. “Interestingly, the ones that we built at the start of the year, a lot of those worked way better at the end of the year just because the models got better.” Witteveen shared this in the video podcast we filmed to discuss these five big trends in detail. Models are getting better and hallucinating less, and they’re also being trained to do agentic tasks. Another feature that the model providers are researching is a way to use the LLM as a judge, and as models get cheaper (something we’ll cover below), companies can use three or more models to

Read More »

OpenAI’s red teaming innovations define new essentials for security leaders in the AI era

OpenAI has taken a more aggressive approach to red teaming than its AI competitors, demonstrating its security teams’ advanced capabilities in two areas: multi-step reinforcement learning and external red teaming. OpenAI recently released two papers that set a new competitive standard for improving the quality, reliability and safety of AI models in these two techniques and more. The first paper, “OpenAI’s Approach to External Red Teaming for AI Models and Systems,” reports that specialized teams outside the company have proven effective in uncovering vulnerabilities that might otherwise have made it into a released model because in-house testing techniques may have missed them. In the second paper, “Diverse and Effective Red Teaming with Auto-Generated Rewards and Multi-Step Reinforcement Learning,” OpenAI introduces an automated framework that relies on iterative reinforcement learning to generate a broad spectrum of novel, wide-ranging attacks. Going all-in on red teaming pays practical, competitive dividends It’s encouraging to see competitive intensity in red teaming growing among AI companies. When Anthropic released its AI red team guidelines in June of last year, it joined AI providers including Google, Microsoft, Nvidia, OpenAI, and even the U.S. National Institute of Standards and Technology (NIST), which all had released red teaming frameworks. Investing heavily in red teaming yields tangible benefits for security leaders in any organization. OpenAI’s paper on external red teaming provides a detailed analysis of how the company strives to create specialized external teams that include cybersecurity and subject matter experts. The goal is to see if knowledgeable external teams can defeat models’ security perimeters and find gaps in their security, biases and controls that prompt-based testing couldn’t find.
What makes OpenAI’s recent papers noteworthy is how well they define using human-in-the-middle

Read More »