➡️ Start Asking Your Data ‘Why?’ — A Gentle Intro To Causality

Correlation does not imply causation. It turns out, however, that with some simple ingenious tricks one can, potentially, unveil causal relationships within standard observational data, without having to resort to expensive randomised control trials.

This post is targeted towards anyone making data driven decisions. The main takeaway message is that causality may be possible by understanding that the story behind the data is as important as the data itself.

By introducing Simpson’s and Berkson’s Paradoxes, situations where the outcome of a population is in conflict with that of its cohorts, I shine a light on the importance of using causal reasoning to identify these paradoxes in data and avoid misinterpretation. Specifically, I introduce causal graphs as a method to visualise the story behind the data and point out that by adding this to your arsenal you are likely to conduct better analyses and experiments.

My ultimate objective is to whet your appetite to explore more on causality, as I believe that by asking data “Why?” you will be able to go beyond correlation calculations and extract more insights, as well as avoid common misjudgement pitfalls.

Note that throughout this gentle intro I do not use equations but demonstrate using accessible intuitive visuals. That said I provide resources for you to take your next step in adding Causal Inference to your statistical toolbox so that you may get more value from your data.

The Era of Data Driven Decision Making

In [Deity] We Trust, All Others Bring Data! — William E. Deming

In this digital age it is common to put a lot of faith in data. But this raises an overlooked question: Should we trust data on its own?

Judea Pearl, who is considered the godfather of Causality, articulated it best:

“The collection of information is as important as the information itself” — Judea Pearl

In other words the story behind the data is as important as the data itself.

Judea Pearl is considered the Godfather of Causality. Credit: Aleksander Molak

This manifests in a growing awareness of the importance of identifying bias in datasets. By the end of this post I hope that you will appreciate that causality provides the fundamental tools to best express, quantify and attempt to correct for these biases.

In causality introductions it is customary to demonstrate why “correlation does not imply causation” by highlighting limitations of association analysis due to spurious correlations (e.g., shark attacks 🦈 and ice-cream sales 🍦). In an attempt to reduce the length of this post I defer this aspect to an older one of mine. Here I focus on two mind-boggling paradoxes 🤯 and their resolution via causal graphs to make a similar point.

Paradoxes in Analysis

To understand the importance of the story behind the data we will examine two counter-intuitive (but nonetheless true) paradoxes which are classical situations of data misinterpretation.

In the first we imagine a clinical trial in which patients are given a treatment that results in a health score. Our objective is to assess the average impact of increased treatment on the health outcome. For pedagogical purposes in these examples we assume that samples are representative (i.e., the sample size is not an issue) and that variances in measurements are minimal.

Population outcome of imaginary clinical trial. Each dot is one patient and the red line indicates the naïve population trend.

In the figure above we learn that on average increasing the treatment appears to be beneficial since it results in a better outcome.

Now we’ll colour-code by age and gender groupings and examine how increasing the treatment impacts each cohort.

Same data as before where each symbol represents an age-gender cohort.

Track any cohort (e.g., “Girls” representing young females) and you immediately realise that an increase in treatment appears adverse.

What is the conclusion of the study? On the one hand increasing the treatment appears to be better for the population at large, but when examining gender-age cohorts it seems disadvantageous. This is Simpson’s Paradox which may be stated:

“Trends can exist in subgroups but reverse for the whole”
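
To make the reversal concrete, here is a minimal sketch that generates made-up data with this property. The cohort names, doses, outcomes and the within-cohort slope are all assumptions chosen for illustration; they are not the data behind the figures.

```python
import random

def slope(xs, ys):
    """Least-squares slope of ys regressed on xs."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var

rng = random.Random(0)  # fixed seed so reruns give the same numbers

# Hypothetical cohorts: name -> (typical dose, outcome at that dose). Made-up values.
COHORTS = {"cohort A": (2.0, 2.0), "cohort B": (6.0, 8.0)}
WITHIN_COHORT_SLOPE = -0.5  # assumed: inside a cohort, more treatment hurts

data = {}
for name, (typical_dose, typical_outcome) in COHORTS.items():
    doses = [typical_dose + rng.uniform(-1, 1) for _ in range(200)]
    outcomes = [
        typical_outcome + WITHIN_COHORT_SLOPE * (d - typical_dose) + rng.gauss(0, 0.2)
        for d in doses
    ]
    data[name] = (doses, outcomes)

cohort_slopes = {name: slope(d, o) for name, (d, o) in data.items()}

# Pool everyone together and ignore cohort membership
all_doses = [d for doses, _ in data.values() for d in doses]
all_outcomes = [o for _, outcomes in data.values() for o in outcomes]
population_slope = slope(all_doses, all_outcomes)

print(cohort_slopes)     # each cohort slope comes out negative
print(population_slope)  # the pooled slope comes out positive
```

The sign flips because the cohort that receives the higher dose also starts from a higher outcome, so the pooled trend mostly reflects the gap between cohorts rather than the effect of the treatment.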

Below we will resolve this paradox using causality tools, but beforehand let’s explore another interesting one, which also examines made up data.

Imagine that we quantify for the general population their attractiveness and how talented they are as in this figure:

General population. Source: Wikipedia, created by Cmglee

We find no apparent correlation.

Now we’ll focus on an unusual subset — famous people:

A subset of celebrities. Source: Wikipedia created by Cmglee

Here we clearly see an anti-correlation that doesn’t exist in the general population.

Should we conclude that Talent and Attractiveness are independent variables as per the first plot of the general population or that they are correlated as per that of celebrities?

This is Berkson’s Paradox where one population has a trait trend that another lacks.
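
The same effect can be reproduced with simulated data. The sketch below assumes a hypothetical selection rule (someone becomes famous when talent plus attractiveness exceeds a cut-off); the rule and the threshold are illustrative assumptions, not something taken from the Wikipedia figures.

```python
import random

def correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs) ** 0.5
    spread_y = sum((y - mean_y) ** 2 for y in ys) ** 0.5
    return cov / (spread_x * spread_y)

rng = random.Random(1)  # fixed seed so reruns give the same numbers

# General population: talent and attractiveness are drawn independently
population = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(20_000)]

# Assumed selection rule: fame requires talent + attractiveness above a cut-off
FAME_THRESHOLD = 2.0
celebrities = [(t, a) for t, a in population if t + a > FAME_THRESHOLD]

corr_population = correlation(*zip(*population))
corr_celebrities = correlation(*zip(*celebrities))

print(corr_population)   # close to zero
print(corr_celebrities)  # clearly negative
```

Nothing about the individuals changed between the two calculations; only the rule for who gets into the sample did.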

Whereas an algorithm would identify these correlations, resolving these paradoxes requires a full understanding of the context which normally is not fed to a computer. In other words without knowing the story behind the data results may be misinterpreted and wrong conclusions may be inferred.

Mastering the identification and resolution of these paradoxes is an important first step to elevating one’s analyses from correlations to causal inference.

Whereas these simple examples may be explained away logically, for the purposes of learning causal tools in the next section I’ll introduce Causal Graphs.

Causal Graphs— Visualising The Story Behind The Data

“[From the Simpson’s and Berkson’s Paradoxes we learn that] certain decisions cannot be made based on the basis of data alone, but instead depend on the story behind the data. … Graph Theory enables these stories to be conveyed” — Judea Pearl

Causal graph models are probabilistic graphical models used to visualise the story behind the data. They are perhaps one of the most powerful tools for analysts that is not taught in most statistics curricula. They are both elegant and highly informative. Hopefully by the end of this post you will appreciate it when Judea Pearl says that this is the missing vocabulary to communicate causality.

To understand causal graph models (or causal graphs for short) we start with the following illustration of an example undirected graph with four nodes/vertices and three edges.

An undirected graph with four nodes/vertices and three edges

Each node is a variable and the edges communicate “who is related to whom?” (i.e., correlations, joint probabilities).

A directed graph is one in which we add arrows as in this figure.

A directed graph with four nodes/vertices and five directed edges

A directed edge communicates “who listens to whom?” which is the essence of causation.

In this specific example you can notice a cyclical relationship between the C and D nodes.

A useful subset of directed graphs are the directed acyclic graphs (DAGs), which have no cycles as in the next figure.

A directed acyclic graph with four nodes/vertices and four directed edges

Here we see that when starting from any node (e.g, A) there isn’t a path that gets back to it.
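
As a small sketch, a directed graph can be stored as a list of (cause, effect) pairs and checked for cycles with Python's standard `graphlib` module (Python 3.9+). The edge lists below are hypothetical stand-ins for the two figures; only the C/D cycle is taken from the text, and the helper name `is_dag` is my own.

```python
from graphlib import CycleError, TopologicalSorter

def is_dag(edges):
    """Return True if the directed (cause, effect) edges contain no cycle."""
    predecessors = {}
    for cause, effect in edges:
        predecessors.setdefault(cause, set())
        predecessors.setdefault(effect, set()).add(cause)
    try:
        # prepare() raises CycleError when the graph has a cycle
        TopologicalSorter(predecessors).prepare()
    except CycleError:
        return False
    return True

# Hypothetical edges: five arrows including a C <-> D feedback loop
with_cycle = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D"), ("D", "C")]
# Same graph with the D -> C arrow dropped, leaving four arrows
acyclic = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D")]

print(is_dag(with_cycle))
print(is_dag(acyclic))
```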

DAGs are the go-to choice in causality for simplicity, as the absence of feedback between parameters greatly simplifies the flow of information. (For mechanisms that have feedback, e.g., temporal systems, one may consider rolling out nodes as a function of time, but that is beyond the scope of this intro.)

Causal graphs are powerful at conveying the cause/effect relationships between the parameters and hence how data was generated (the story behind the data).

From a practical point of view, graphs enable us to understand which parameters are confounders that need to be controlled for, and, as important, which not to control for, because doing so causes spurious correlations. This will be demonstrated below.

The practice of attempting to build a causal graph enables:

  • Designing better experiments.
  • Drawing causal conclusions (going beyond correlations by means of representing interventions, counterfactuals and encoding conditional independence relationships; all beyond the scope of this post).

To further motivate the usage of causal graph models we will use them to resolve the Simpson’s and Berkson’s paradoxes introduced above.

💊 Causal Graph Resolution of Simpson’s Paradox

For simplicity we’ll examine Simpson’s paradox focusing on two cohorts, male and female adults.

Outcome of the imaginary therapeutic trial, similar to the previous but focusing on the adults. Each symbol is one patient from the respective age-gender cohort and the red line indicates the naïve population trend.

Examining this data we can make three statements about three variables of interest:

  • Gender is an independent variable (it does not “listen to” the other two)
  • Treatment depends on Gender (as we can see, in this setting the level given depends on Gender — women have been given, for some reason, a higher dosage.)
  • Outcome depends on both Gender and Treatment

According to these we can draw the causal graph as the following:

Simpson’s paradox Graphic Model where Gender is a confounding variable between Treatment and Outcome

Notice how each arrow contributes to communicate the statements above. As important, the lack of an arrow pointing into Gender conveys that it is an independent variable.

We also notice that by having arrows pointing from Gender to Treatment and Outcome it is considered a common cause between them.

The essence of Simpson’s paradox is that although the Outcome is affected by changes in Treatment, as expected, there is also a backdoor path through which information flows via Gender.

As you may have guessed by this stage, the solution to this paradox is that the common cause Gender is a confounding variable that needs to be controlled.

Controlling for a variable, in terms of a causal graph, means eliminating the relationship between Gender and Treatment.

This may be done in two manners:

  • Pre data collection: Setting up a Randomised Control Trial (RCT) in which participants are given a dosage regardless of their Gender.
  • Post data collection: E.g., in this made-up scenario the data has already been collected and hence we need to deal with what is referred to as Observational Data.

In both pre- and post-data collection the elimination of Treatment’s dependency on Gender (i.e., controlling for Gender) may be done by modifying the graph such that the arrow between them is removed as in the following:

A modified version of the Simpson’s paradox Graphic Model. The dark node means we control for Gender.

Applying this “graphical surgery” means that the last two statements need to be modified (for convenience I’ll write all three):

  • Gender is an independent variable
  • Treatment is an independent variable
  • Outcome depends on Gender and Treatment (but with no backdoor path).

This enables obtaining the causal relationship of interest: we can assess the direct impact of modifying Treatment on the Outcome.

The process of controlling for a confounder, i.e., manipulating the data generation process, is formally referred to as applying an intervention. That is to say we are no longer passive observers of the data, but we are taking an active role in modifying it to assess the causal impact.
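
The pre data collection case can be illustrated with a simulation. This is a sketch under assumed numbers: the true effect, the doses and the baselines are all made up. It compares a setting where the dose "listens to" Gender with one where the dose is assigned at random.

```python
import random

def slope(xs, ys):
    """Least-squares slope of ys regressed on xs."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var

TRUE_EFFECT = -0.5  # assumed effect of one extra unit of Treatment on Outcome

def simulate(rng, randomised, n=4000):
    """Draw (treatments, outcomes) with or without the Gender -> Treatment arrow."""
    treatments, outcomes = [], []
    for _ in range(n):
        is_woman = rng.random() < 0.5
        if randomised:
            # Intervention: the dose is assigned without looking at Gender
            treatment = rng.uniform(1, 7)
        else:
            # Observational: women tend to receive a higher dose
            treatment = (6.0 if is_woman else 2.0) + rng.uniform(-1, 1)
        baseline = 8.0 if is_woman else 2.0  # Gender -> Outcome arrow
        treatments.append(treatment)
        outcomes.append(baseline + TRUE_EFFECT * treatment + rng.gauss(0, 0.2))
    return treatments, outcomes

rng = random.Random(3)  # fixed seed so reruns give the same numbers
observational_slope = slope(*simulate(rng, randomised=False))
randomised_slope = slope(*simulate(rng, randomised=True))

print(observational_slope)  # positive: the backdoor path via Gender dominates
print(randomised_slope)     # close to TRUE_EFFECT
```

With the arrow from Gender to Treatment removed, the naive population trend lands near the assumed true effect without any further adjustment.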

How is this manifested in practice?

In the case of RCTs the researcher needs to control for important confounding variables. Here we limit the discussion to Gender (but in real world settings you can imagine other variables such as Age, Social Status and anything else that might be relevant to one’s health).

RCTs are considered the gold standard for causal analysis in many experimental settings thanks to their practice of controlling for confounding variables. That said, they have several drawbacks:

  • It may be expensive to recruit individuals and may be complicated logistically
  • The intervention under investigation may not be physically possible or ethical to conduct (e.g, one can’t ask randomly selected people to smoke or not for ten years)
  • Artificial setting of a laboratory — not a true natural habitat of the population.

Observational data, on the other hand, is much more readily available in industry and academia, and hence much cheaper, and could be more representative of the actual habits of individuals. But as illustrated in the Simpson’s diagram it may have confounding variables that need to be controlled.

This is where ingenious solutions developed in the causal community in the past few decades are making headway. Detailing them is beyond the scope of this post, but I briefly mention how to learn more at the end.

To resolve this Simpson’s paradox with the given observational data one:

  1. Calculates for each cohort the impact of the change of the treatment on the outcome
  2. Calculates a weighted average contribution of each cohort on the population.

Here we will focus on intuition, but in a future post we will describe the maths behind this solution.
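
One simple reading of the two steps above, sketched on made-up data: fit a trend per cohort, then average those trends weighted by cohort size. The cohort sizes, doses and effects are assumptions, and the helper `adjusted_effect` is hypothetical. It also assumes the treatment effect is linear within each cohort.

```python
import random

def slope(xs, ys):
    """Least-squares slope of ys regressed on xs."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var

def adjusted_effect(records):
    """Size-weighted average of per-cohort slopes.

    records: list of (cohort, treatment, outcome) tuples.
    """
    by_cohort = {}
    for cohort, treatment, outcome in records:
        treatments, outcomes = by_cohort.setdefault(cohort, ([], []))
        treatments.append(treatment)
        outcomes.append(outcome)
    total = len(records)
    # Step 1: slope per cohort. Step 2: weight each by the cohort's share.
    return sum(len(t) / total * slope(t, o) for t, o in by_cohort.values())

rng = random.Random(2)  # fixed seed so reruns give the same numbers
# Made-up cohorts: (name, size, typical dose, outcome at that dose, effect)
SETTINGS = [("men", 300, 2.0, 2.0, -0.4), ("women", 100, 6.0, 9.0, -0.8)]

records = []
for cohort, size, typical_dose, typical_outcome, effect in SETTINGS:
    for _ in range(size):
        t = typical_dose + rng.uniform(-1, 1)
        y = typical_outcome + effect * (t - typical_dose) + rng.gauss(0, 0.1)
        records.append((cohort, t, y))

naive = slope([t for _, t, _ in records], [y for _, _, y in records])
adjusted = adjusted_effect(records)

print(naive)     # positive, misleading population trend
print(adjusted)  # close to 0.75 * (-0.4) + 0.25 * (-0.8) = -0.5
```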

I am sure that many analysts, just like myself, have noticed Simpson’s paradox at some stage in their data and hopefully have corrected for it. Now you know the name of this effect and hopefully start to appreciate how causal tools are useful.

That said … being confused at this stage is OK 😕

I’ll be the first to admit that I struggled to understand this concept and it took me three weekends of deep diving into examples to internalise it. This was the gateway drug to causality for me. Part of my process for understanding statistics is playing with data. For this purpose I created an interactive web application hosted in Streamlit which I call Simpson’s Calculator 🧮. I’ll write a separate post for this in the future.

Even if you are confused, the main takeaways of Simpson’s paradox are that:

  • It is a situation where trends can exist in subgroups but reverse for the whole.
  • It may be resolved by identifying confounding variables between the treatment and the outcome variables and controlling for them.

This raises the question: should we just control for all variables except for the treatment and outcome? Let’s keep this in mind when resolving Berkson’s paradox.

🦚 Causal Graph Resolution of Berkson’s Paradox

As in the previous section we are going to make clear statements about how we believe the data was generated and then draw these in a causal graph.

Let’s examine the case of the general population. For convenience I’m copying the image from above:

General population. Source: Wikipedia, created by Cmglee

Here we understand that:

  • Talent is an independent variable
  • Attractiveness is an independent variable

A causal graph for this is quite simple, two nodes without an edge.

In the general population one’s Talent and Attractiveness are independent

Let’s examine the plot of the celebrity subset.

A subset of celebrities. Source: Wikipedia created by Cmglee

The cheeky insight from this mock data is that the more attractive one is, the less talented they need to be to become a celebrity. Hence we can deduce that:

  • Talent is an independent variable
  • Attractiveness is an independent variable
  • Celebrity variable depends on both Talent and Attractiveness variables. (Imagine this variable is boolean as in: true for celebrities or false for not).

Hence we can draw the causal graph as:

Being a celebrity depends on Talent and Attractiveness

By having arrows pointing into it Celebrity is a collider node between Talent and Attractiveness.

Berkson’s paradox is the fact that when controlling for Celebrity we see an interesting trend (anti-correlation between Attractiveness and Talent) not seen in the general population.

This can be visualised in the causal graph: by controlling for the Celebrity parameter we create a spurious correlation between the otherwise independent variables Talent and Attractiveness. We can draw this as follows:

Berkson’s paradox Graphic Model. The dark node means we control for Celebrity. Controlling this collider variable generates a spurious correlation (dashed line) between Talent and Attractiveness.

The solution of this Berkson’s paradox should be apparent here: Talent and Attractiveness are independent variables in general, but controlling for the collider Celebrity node causes a spurious correlation in the data.

Let’s compare the resolution of both paradoxes:

  • Simpson’s Paradox is resolved by controlling for the common cause (Gender)
  • Berkson’s Paradox is resolved by not controlling for the collider (Celebrity)
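
The two rules above can be sketched as a lookup on a graph's arrows. The function `role_of` is a hypothetical helper that only inspects direct arrows; real graphs with longer paths need the fuller machinery (d-separation) listed in the resources at the end.

```python
def role_of(node, x, y, edges):
    """Classify `node` relative to variables x and y.

    edges: iterable of (cause, effect) pairs. Only direct arrows are checked.
    """
    edge_set = set(edges)
    if (node, x) in edge_set and (node, y) in edge_set:
        # Arrows point out of the node into both variables
        return "common cause: control for it"
    if (x, node) in edge_set and (y, node) in edge_set:
        # Arrows from both variables point into the node
        return "collider: do not control for it"
    return "neither"

# The two causal graphs from this post, written as (cause, effect) pairs
simpson = [("Gender", "Treatment"), ("Gender", "Outcome"), ("Treatment", "Outcome")]
berkson = [("Talent", "Celebrity"), ("Attractiveness", "Celebrity")]

print(role_of("Gender", "Treatment", "Outcome", simpson))
print(role_of("Celebrity", "Talent", "Attractiveness", berkson))
```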

The next figure combines both insights in the form of their causal graphs:

Graph models show how to resolve the paradoxes. Dark nodes are controlled for. Left: Modified graph to resolve Simpson’s paradox by controlling for Gender. Right: To resolve Berkson’s paradox the collider should not be controlled.

The main takeaway from the resolution of these paradoxes is that controlling for parameters requires a justification. Common causes should be controlled for but colliders should not.

Even though this is common knowledge for those who study causality (e.g, Economics majors), it is unfortunate that most analysts and machine learning practitioners are not aware of this (including myself in 2020 after over 15 years of analysis and predictive modelling experience).

Oddly, statisticians both over- and underrate the importance of confounders — Judea Pearl

Summary

The main takeaway from this post is that the story behind the data is as important as the data itself.

Appreciating this will help you avoid misinterpreting results, whether through spurious correlations or, as demonstrated here, through Simpson’s and Berkson’s paradoxes.

Causal Graphs are an essential tool to visualise the story behind the data. By using them to resolve the paradoxes we learnt that controlling for variables requires justification (common causes ✅, colliders ⛔️).

For those interested in taking the next step in their causal journey I highly suggest mastering Simpson’s paradox. One great way is by playing with data. Feel free to do so with my interactive “Simpson-calculator” 🧮.

Loved this post? 💌 Join me on LinkedIn or ☕ Buy me a coffee!

Credits

Unless otherwise noted, all images were created by the author.

Many thanks to Jim Parr, Will Reynolds, Hedva Kazin and Betty Kazin for their useful comments.

Wondering what your next step should be in your causal journey? Check out my new article on mastering Simpson’s Paradox — you will never look at data the same way. 🔎

Useful Resources

Here I provide resources that I find useful as well as a shopping list of topics for beginners to learn.

📚 Books

Credit: Gaelle Marcel
  • The Book of Why — popular science reading (NY Times level)
  • Causal Inference in Statistics A Primer — excellent short technical book (site)
  • Causal Inference and Discovery in Python by Aleksander Molak (Packt, github) — clearly explained with python applications 🐍.
  • What If? — a cohesive presentation of concepts of, and methods for, causal inference (site, github)
  • Causal Inference The Mixtape — Social Science focused using Python, R and Stata (site, resources, mooc)
  • Counterfactuals and Causal Inference — Methods and Principles (Social Science focused)

This list is far from comprehensive, but I’m glad to add to it if anyone has suggestions (please mention why the book stands out from the pack).

🔏 Courses

Credit: Austrian National Library

There are probably a few courses online. I love the 🆓 one by Brady Neal: bradyneal.com/causal-inference-course.

  • Clearly explained
  • Covers many aspects
  • Thorough
  • Provides memorable examples
  • F.R.E.E

One paid course 💰 that is targeted to practitioners is Altdeep.

💾 Software

Credit: Artturi Jalli

This list is far from comprehensive because the space is rapidly growing:

The Causal Wizard app also has an article about Causal Diagram tools.

🐾 Suggested Next Steps In The Causal Journey

Here I highlight a list of topics which I would have found useful when I started learning about the field. If I’m missing anything I’d be more than glad to get feedback and add it. I bold face the ones which were briefly discussed here.

Pearl’s Causal Hierarchy of seeing, doing, imagining and their applications. This is an approved modification of the original illustration by Maayan Harel from MaayanVisuals.com in The Book of Why.
  • Pearl’s Causal Hierarchy of seeing, doing and imagining (figure above)
  • Observational data vs. Randomised Control Trials
  • d-separation, common causes, colliders, mediators, instrumental variables
  • Causal Graphs
  • Structural Causal Models
  • Assumptions: Ignorability, SUTVA, Consistency, Positivity
  • “Do” Algebra — assessing impact on cohorts by intervention
  • Counterfactuals — assessing impact on individuals by comparing real outcomes to potential ones
  • The fundamental problem of causality
  • Estimand, Estimator, Estimate, Identifiability — relating causal definitions to observable statistics (e.g, conditional probabilities)
  • Causal Discovery — finding causal graphs with data (e.g, Markov Equivalence)
  • Causal Machine Learning (e.g, Double Machine Learning)

For completeness it is useful to know that there are different streams of causality. Although there is a lot of overlap you may find that methods differ in naming conventions due to development in different fields of research: Computer Science, Social Sciences, Health, Economics.

Here I used definitions mostly from the Pearlian perspective (as developed in the field of computer science).

The Story Behind This Post

This narrative is a result of two study groups that I conducted in a previous role to get myself and colleagues to learn about causality, which I felt was missing from my skill set. If there is any interest I’m glad to write a post about the study group experience.

This intro was created as the one I felt that I needed when I started my journey in causality.

In the first iteration of this post I wrote and presented the limitations of spurious correlations and Simpson’s paradox. The main reason for this revision to focus on two paradoxes is that, whereas most causality intros focus on the limitations of correlations, I feel that understanding the concept of justification of confounders is important for all analysts and machine learning practitioners to be aware of.

On September 5th 2024 I presented this content in a contributed talk at the Royal Statistical Society Annual Conference in Brighton, England (abstract link).

Unfortunately there is no recording, but there are recordings of previous talks of mine:

The slides are available at bit.ly/start-ask-why. Presenting this material for the first time at PyData Global 2021

Shape
Shape
Stay Ahead

Explore More Insights

Stay ahead with more perspectives on cutting-edge power, infrastructure, energy,  bitcoin and AI solutions. Explore these articles to uncover strategies and insights shaping the future of industries.

Shape

CompTIA training targets workplace AI use

CompTIA AI Essentials (V2) delivers training to help employees, students, and other professionals strengthen the skills they need for effective business use of AI tools such as ChatGPT, Copilot, and Gemini. In its first iteration, CompTIA’s AI Essentials focused on AI fundamentals to help professionals learn how to apply AI technology

Read More »

OPEC Receives Updated Compensation Plans

A statement posted on OPEC’s website this week announced that the OPEC Secretariat has received updated compensation plans from Iraq, the United Arab Emirates (UAE), Kazakhstan, and Oman. A table accompanying this statement showed that these compensation plans amount to a total of 221,000 barrels per day in November, 272,000

Read More »

LogicMonitor closes Catchpoint buy, targets AI observability

The acquisition combines LogicMonitor’s observability platform with Catchpoint’s internet-level intelligence, which monitors performance from thousands of global vantage points. Once integrated, Catchpoint’s synthetic monitoring, network data, and real-user monitoring will feed directly into Edwin AI, LogicMonitor’s intelligence engine. The goal is to let enterprise customers shift from reactive alerting to

Read More »

Akamai acquires Fermyon for edge computing as WebAssembly comes of age

Spin handles compilation from source to WebAssembly bytecode and manages execution on target platforms. The runtime abstracts the underlying technology while preserving WebAssembly’s performance and security characteristics. This bet on WebAssembly standards has paid off as the technology matured.  WebAssembly has evolved significantly beyond its initial browser-focused design to support

Read More »

US Extends Reprieve for Lukoil Fuel Stations Outside Russia

The Trump administration extended a waiver for Lukoil PJSC’s gas stations outside of Russia, allowing them to continue operations until late April 2026.  The measure by the US Treasury Department’s Office of Foreign Assets Control could help ease concerns about sudden disruptions across Europe and the Americas, where Russia’s second-largest oil producer has retail operations.  As Russia’s most internationally diversified oil firm, Lukoil has stakes in European oil refineries as well as significant holdings in oil fields from Iraq to Kazakhstan. Its brand also extends to filling stations from the US to Belgium and Romania. Washington’s move shows that even as the US tightens the screws on Moscow, it’s willing to make targeted exceptions to prevent an unnecessary shock to local economies. Retail fuel networks in nations from the Balkans to Central Europe rely on international suppliers, and abrupt cutoffs risk spilling into fuel shortages, higher prices or operational headaches for foreign partners.  The decision lands with a key Dec. 13 deadline looming for Lukoil to offload its international assets. What do you think? We’d love to hear from you, join the conversation on the Rigzone Energy Network. The Rigzone Energy Network is a new social experience created for you and all energy professionals to Speak Up about our industry, share knowledge, connect with peers and industry insiders and engage in a professional community that will empower your career in energy.

Read More »

Var Energi Hits New Oil Discovery Near Goliat

Var Energi ASA on Thursday confirmed a new oil discovery on Norway’s side of the Barents Sea, the second discovery in weeks in its ongoing Goliat Ridge drilling campaign. Located five kilometers (3.11 miles) north of the producing Goliat field, operated by Var Energi, the Goliat North exploration well encountered hydrocarbons in the Realgrunnen and Kobbe formations, the Norwegian company said in an online statement. Estimated gross recoverable resources are up to five million barrels of oil equivalent (MMboe). “Including the latest discovery, the Goliat Ridge is estimated to contain gross discovered resources of 39-108 MMboe and with additional prospective resources taking the total gross potential to up to 200 MMboe”, Var Energi said. “A tie-back of the Goliat Ridge discoveries to the nearby Goliat FPSO [floating production, storage and offloading vessel] is being planned”. Goliat Nord, or well 7122/7-8, aimed to prove hydrocarbons in Lower Jurassic/Upper Triassic and Middle Triassic rocks in the Realgrunnen Subgroup and the Kobbe Formation respectively, according to the Norwegian Offshore Directorate (NOD). “Well 7122/7-8 S encountered an eight-meter [26.25 feet] gas/oil column in the Tubaen Formation in the Realgrunnen Subgroup in reservoir rocks totaling 6.5 meters, with good reservoir quality”, the NOD reported separately. “The gas/oil contact was encountered 1,255 meters below sea level. The oil/water contact was not encountered. “The well also encountered a six-meter gas/oil column in the Fruholmen Formation in the Realgrunnen Subgroup in reservoir rocks with good reservoir quality. The gas/oil contact was encountered 1,285 meters below sea level. The oil/water contact was encountered 1,290 meters below sea level. “In the Kobbe Formation, the well encountered a 17-meter oil column in reservoir rocks totaling 12 meters, with good reservoir quality. The oil/water contact was encountered 2,048 meters below sea level”. 
Goliat North was drilled to a vertical depth of 2,197 meters below

Read More »

PTTEP Eyes 8 Percent Growth in Sales Volume Next Year

Thailand’s state-owned PTT Exploration and Production Public Company Ltd (PTTEP) on Thursday announced a spending budget of around $7.73 billion for 2026, targeting an eight percent sales volume increase to 556,000 barrels of oil equivalent a day (boed). “This growth reflects strong momentum from our operational expansion in Thailand and overseas this year, which has already translated into higher sales volume and revenue, and will continue to support our performance into 2026 and beyond”, PTTEP chief executive Montri Rawanchaikul said in an online statement. Of the 2026 budget, $5.16 billion is for capital expenditure and $2.56 billion is for operating expenditure. PTTEP said it aims to maximize volumes from current producing assets to strengthen the Southeast Asian country’s energy security. “Main producing projects include G1/61 (Erawan, Platong, Satun and Funan fields), G2/61 (Bongkot field), Arthit, S1, Contract 4 projects and projects in the Malaysia-Thailand Joint Development Area”, PTTEP said. “This plan also includes other overseas projects in Malaysia, Oman and Algeria. The capex budget of USD 3,605 million (equivalent to THB 118,064 million) is allocated to support these activities”. It said it has allotted $118 million for emission reduction activities including a carbon capture and storage (CCS) project in the Arthit field in the Gulf of Thailand. It announced a positive final investment decision on the CCS project on September 8, earmarking a five-year investment of $320 million. Expected to start operations 2028, the project is designed to capture and store up to one million metric tons of carbon dioxide a year. 
The 2026 plan also involves “accelerating the activities of key projects under the development phase, including Ghasha Concession, Abu Dhabi Offshore 2, Mozambique Area 1, Malaysia greenfields such as Malaysia SK405B, Malaysia SK417 and Malaysia SK438 Projects, to achieve production start-up timelines as planned, with the allocated capex budget

Read More »

U.S. Department of Energy Announces New Research, Technology, and Economic Security Framework

The U.S. Department of Energy (DOE) Office of Energy Efficiency and Renewable Energy announced the release of a memo by the Deputy Secretary of Energy that describes a framework designed to minimize foreign risks to the scientific enterprise of DOE and the National Nuclear Security Administration (NNSA). The newly published Research, Technology & Economic Security (RTES) Framework highlights DOE’s goals, process, high level risk factors, and commitment to mitigation when assessing RTES risk. This framework outlines a harmonized approach across all DOE/NNSA funding offices that undertakes to protect DOE’s early-stage research and development (R&D) in academic settings, applied R&D stage projects, and demonstration and deployment stage projects while maintaining an open collaborative, and world leading scientific enterprise. The framework also highlights DOE’s commitment to mitigation when assessing RTES risk and outlines its goals and processes. Join an RTES Informational Webinar To Learn More To assist the applicant and recipient community in understanding and adapting to the recently published framework, DOE will host a webinar to introduce the approach and answer questions. Funding awardees and prospective applicants are encouraged to review the framework and attend Monday, December 16, 2024. Register today. About DOE’s RTES Office DOE’s Office of Research, Technology & Economic Security (RTES), situated in DOE’s Office of International Affairs, undertakes several risk mitigation activities that support DOE’s responsibility to protect federal funding from undue foreign influence and to accomplish its mission in ways that protect and further energy security and technological advancement of the United States. 
Specifically, RTES identifies and addresses potential security risks that threaten the scientific enterprise; establishes best practices for programs; conducts outreach activities for stakeholders; educates Department programs on potential security risks; and conducts or facilitates risk assessments of DOE proposals, loans, and awards. More information about RTES’s mission, activities, events, and ways to get involved

Read More »

@H2Spotlight: Fall 2024

Spotlight on Success: First Megawatt-Scale Demonstration of Hydrogen Fuel Cells for Data Center Backup Power

Earlier this year, Caterpillar Inc. announced successful completion of a first-of-a-kind collaboration with Microsoft and Ballard Power Systems to demonstrate the viability of using large-format hydrogen fuel cells to supply reliable backup power for data centers. The demonstration, hosted at Microsoft’s Cheyenne, Wyoming, data center, simulated a 48-hour power outage, providing critical insights into the capabilities of fuel cell systems to power multi-megawatt data centers, ensuring uninterrupted power supply to meet 99.999% uptime requirements. Caterpillar served as project lead, providing overall system integration, power electronics, and microgrid controls that form the central structure of the hydrogen power solution. Hardware for the demonstration included two Caterpillar power grid stabilization storage systems alongside a 1.5-MW hydrogen fuel cell supplied by Ballard Power Systems. Over the course of the project, researchers evaluated the cost and performance of the fuel cell system including analysis of key performance characteristics such as power transfer time and load acceptance. Launched in 2020 and completed this year, the project was supported and partially funded by DOE under the H2@Scale initiative, which brings stakeholders together to advance affordable hydrogen production, transport, storage, and utilization in multiple energy sectors. During the demonstration, researchers at DOE’s National Renewable Energy Laboratory (NREL) analyzed safety, techno-economics, and greenhouse gas impacts.

Read More »

U.S. Department of Energy Releases Request for Information on Defining Sustainable Maritime Fuels in the United States

To support and advance future maritime fuel technology and investment, the U.S. Department of Energy (DOE) released a Request for Information (RFI) to establish a consistent and reliable definition for sustainable maritime fuel (SMF) that informs and aligns community, industry, governments, and other maritime stakeholders. The Action Plan for Maritime Energy and Emissions Innovation (Action Plan), a summary of which was released in December 2024, builds on the 2023 U.S. National Blueprint for Transportation Decarbonization to define actions that aim to achieve a clean, safe, accessible and affordable U.S. maritime transportation system. The Action Plan calls for the federal government to define “Sustainable Maritime Fuel,” which is critical to evaluating and determining future SMF production volume goals in the Action Plan and alternative fuels that align with the U.S. 2050 net emission goals. “The global maritime sector is pursuing sustainable maritime fuels. The United States is well positioned to be a global leader in producing, distributing, and selling these sustainable fuels that can provide more affordable options to the market,” said Michael Berube, deputy assistant secretary for sustainable transportation and fuels, Office of Energy Efficiency and Renewable Energy. “This Request for Information will help align the industry around common definitions, enabling broader adoption across the economy.” The U.S. maritime sector connects virtually every aspect of American life—from our clothes and food, to our cars, and the oil and natural gas used to heat and cool homes. About 99% of U.S. overseas trade enters or leaves the United States by ship. This waterborne cargo and associated activity contribute more than $500 billion to the U.S. gross domestic product and sustain over 10 million U.S. jobs. 
However, the Action Plan estimates the total amount of greenhouse gas (GHG) emissions from fuel sold in the United States for use in maritime applications accounts for 4% of the U.S. transportation sector’s GHG

Read More »

With AI Factories, AWS aims to help enterprises scale AI while respecting data sovereignty

“The AWS AI Factory seeks to resolve the tension between cloud-native innovation velocity and sovereign control. Historically, these objectives lived in opposition. CIOs faced an unsustainable dilemma: choose between on-premises security or public cloud cost and speed benefits,” he said. “This is arguably AWS’s most significant move in the sovereign AI landscape.”

On-premises GPUs are already a thing

AI Factories isn’t the first attempt to put cloud-managed AI accelerators in customers’ data centers. Oracle introduced Nvidia processors to its Cloud@Customer managed on-premises offering in March, while Microsoft announced last month that it will add Nvidia processors to its Azure Local service. Google Distributed Cloud also includes a GPU offering, and even AWS offers lower-powered Nvidia processors in its AWS Outposts. AWS’ AI Factories is also likely to square off against a range of similar products, such as Nvidia’s AI Factory, Dell’s AI Factory stack, and HPE’s Private Cloud for AI — each tightly coupled with Nvidia GPUs, networking, or software, and all vying to become the default on-premises AI platform. But, said Sopko, AWS will have an advantage over rivals due to its hardware-software integration and operational maturity: “The secret sauce is the software, not the infrastructure,” he said. Omdia principal analyst Alexander Harrowell expects AWS’s AI Factories to combine the on-premises control of Outposts with the flexibility and ability to run a wider variety of services offered by AWS Local Zones, which puts small data centers close to large population centers to reduce service latency. Sopko cautioned that enterprises are likely to face high commitment costs, drawing a parallel with Oracle’s OCI Dedicated Region, one of its Cloud@Customer offerings.

Read More »

HPE loads up AI networking portfolio, strengthens Nvidia, AMD partnerships

On the hardware front, HPE is targeting the AI data center edge with a new MX router and scale-out networking with a new QFX switch. Juniper’s MX series is its flagship routing family aimed at carriers, large-scale enterprise data center and WAN customers, while the QFX line serves data center customers anchoring spine/leaf networks as well as top-of-rack systems. The new 1U, 1.6Tbps MX301 multiservice edge router, available now, is aimed at bringing AI inferencing closer to the source of data generation and can be positioned in metro, mobile backhaul, and enterprise routing applications, Rahim said. It includes high-density support for 16 x 1/10/25/50GbE, 10 x 100Gb and 4 x 400Gb interfaces. “The MX301 is essentially the on-ramp to provide high speed, secure connections from distributed inference cluster users, devices and agents from the edge all the way to the AI data center,” Rahim said. “The requirements here are typically around high performance, but also very high logical scale and integrated security.” In the QFX arena, the new QFX5250 switch, available in 1Q 2026, is a fully liquid-cooled box aimed at tying together Nvidia Rubin and/or AMD MI400 GPUs for AI consumption across the data center. It is built on Broadcom Tomahawk 6 silicon and supports up to 102.4Tbps Ethernet bandwidth, Rahim said. “The QFX5250 combines HPE liquid cooling technology with Juniper networking software (Junos) and integrated AIops intelligence to deliver a high-performance, power-efficient and simplified operations for next-generation AI inference,” Rahim said.

Partnership expansions

Also key to HPE/Juniper’s AI networking plans are its partnerships with Nvidia and AMD. The company announced its relationship with Nvidia now includes HPE Juniper edge onramp and long-haul data center interconnect (DCI) support in its Nvidia AI Computing by HPE portfolio. This extension uses the MX and Juniper’s PTX hyperscaler routers to support high-scale, secure

Read More »

What is co-packaged optics? A solution for surging capacity in AI data center networks

When it announced its CPO-capable switches, Nvidia said they would improve resiliency by 10 times at scale compared to previous switch generations. Several factors contribute to this claim, including the fact that the optical switches require four times fewer lasers, Shainer says. Whereas the laser source was previously part of the transceiver, the optical engine is now incorporated onto the ASIC, allowing multiple optical channels to share a single laser. Additionally, in Nvidia’s implementation, the laser source is located outside of the switch. “We want to keep the ability to replace a laser source in case it has failed and needs to be replaced,” he says. “They are completely hot-swappable, so you don’t need to shut down the switch.” Nonetheless, you may often hear that when something fails in a CPO box, you need to replace the entire box. That may be true if it’s the photonics engine embedded in silicon inside the box. “But they shouldn’t fail that often. There are not a lot of moving parts in there,” Wilkinson says. While he understands the argument around failures, he doesn’t expect it to pan out as CPO gets deployed. “It’s a fallacy,” he says. There’s also a simple workaround to the resiliency issue, which hyperscalers are already talking about, Karavalas says: overbuild. “Have 10% more ports than you need or 5%,” he says. “If you lose a port because the optic goes bad, you just move it and plug it in somewhere else.”

Which vendors are backing co-packaged optics?

In terms of vendors that have or plan to have CPO offerings, the list is not long, unless you include various component players like TSMC. But in terms of major switch vendors, here’s a rundown: Broadcom has been making steady progress on CPO since 2021. It is now shipping “to

Read More »

Nvidia’s $2B Synopsys stake tests independence of open AI interconnect standard

But the concern for enterprise IT leaders is whether Nvidia’s financial stakes in UALink consortium members could influence the development of an open standard specifically designed to compete with Nvidia’s proprietary technology and to give enterprises more choices in the datacenter. Organizations planning major AI infrastructure investments view such open standards as critical to avoiding vendor lock-in and maintaining competitive pricing. “This does put more pressure on UALink since Intel is also a member and also took investment from Nvidia,” Sag said.

UALink and Synopsys’s critical role

UALink represents the industry’s most significant effort to prevent vendor lock-in for AI infrastructure. The consortium ratified its UALink 200G 1.0 Specification in April, defining an open standard for connecting up to 1,024 AI accelerators within computing pods at 200 Gbps per lane — directly competing with Nvidia’s NVLink for scale-up applications. Synopsys plays a critical role. The company joined UALink’s board in January and in December announced the industry’s first UALink design components, enabling chip designers to build UALink-compatible accelerators.

Analysts flag governance concerns

Gaurav Gupta, VP analyst at Gartner, acknowledged the tension. “The Nvidia-Synopsys deal does raise questions around the future of UALink as Synopsys is a key partner of the consortium and holds critical IP for UALink, which competes with Nvidia’s proprietary NVLink,” he said. Sanchit Vir Gogia, chief analyst at Greyhound Research, sees deeper structural concerns. “Synopsys is not a peripheral player in this standard; it is the primary supplier of UALink IP and a board member within the UALink Consortium,” he said. “Nvidia’s entry into Synopsys’ shareholder structure risks contaminating that neutrality.”

Read More »

Cooling crisis at CME: A wakeup call for modern infrastructure governance

Organizations should reassess redundancy

However, he pointed out, “the deeper concern is that CME had a secondary data center ready to take the load, yet the failover threshold was set too high, and the activation sequence remained manually gated. The decision to wait for the cooling issue to self-correct rather than trigger the backup site immediately revealed a governance model that had not evolved to keep pace with the operational tempo of modern markets.” Thermal failures, he said, “do not unfold on the timelines assumed in traditional disaster recovery playbooks. They escalate within minutes and demand automated responses that do not depend on human certainty about whether a facility will recover in time.” Matt Kimball, VP and principal analyst at Moor Insights & Strategy, said that to some degree what happened in Aurora highlights an issue that may arise on occasion: “the communications gap that can exist between IT executives and data center operators. Think of ‘rack in versus rack out’ mindsets.” Often, he said, the operational elements of that data center environment, such as cooling, power, fire hazards, physical security, and so forth, fall outside the realm of an IT executive focused on delivering IT services to the business. “And even if they don’t fall outside the realm, these elements are certainly not a primary focus,” he noted. “This was certainly true when I was living in the IT world.” Additionally, said Kimball, “this highlights the need for organizations to reassess redundancy and resilience in a new light. Again, in IT, we tend to focus on resilience and redundancy at the app, server, and workload layers. Maybe even cluster level. But as we continue to place more and more of a premium on data, and the terms ‘business critical’ or ‘mission critical’ have real relevance, we have to zoom out

Read More »

Microsoft loses two senior AI infrastructure leaders as data center pressures mount

Microsoft did not immediately respond to a request for comment.

Microsoft’s constraints

Analysts say the twin departures mark a significant setback for Microsoft at a critical moment in the AI data center race, with pressure mounting from both OpenAI’s model demands and Google’s infrastructure scale. “Losing some of the best professionals working on this challenge could set Microsoft back,” said Neil Shah, partner and co-founder at Counterpoint Research. “Solving the energy wall is not trivial, and there may have been friction or strategic differences that contributed to their decision to move on, especially if they saw an opportunity to make a broader impact and do so more lucratively at a company like Nvidia.” Even so, Microsoft has the depth and ecosystem strength to continue doubling down on AI data centers, said Prabhu Ram, VP for industry research at Cybermedia Research. According to Sanchit Gogia, chief analyst at Greyhound Research, the departures come at a sensitive moment because Microsoft is trying to expand its AI infrastructure faster than physical constraints allow. “The executives who have left were central to GPU cluster design, data center engineering, energy procurement, and the experimental power and cooling approaches Microsoft has been pursuing to support dense AI workloads,” Gogia said. “Their exit coincides with pressures the company has already acknowledged publicly. GPUs are arriving faster than the company can energize the facilities that will house them, and power availability has overtaken chip availability as the real bottleneck.”

Read More »

Microsoft will invest $80B in AI data centers in fiscal 2025

And Microsoft isn’t the only one that is ramping up its investments in AI-enabled data centers. Rival cloud service providers are all investing in either upgrading or opening new data centers to capture a larger chunk of business from developers and users of large language models (LLMs). In a report published in October 2024, Bloomberg Intelligence estimated that demand for generative AI would push Microsoft, AWS, Google, Oracle, Meta, and Apple to devote $200 billion between them to capex in 2025, up from $110 billion in 2023. Microsoft is one of the biggest spenders, followed closely by Google and AWS, Bloomberg Intelligence said. Its estimate of Microsoft’s capital spending on AI, at $62.4 billion for calendar 2025, is lower than Smith’s claim that the company will invest $80 billion in the fiscal year to June 30, 2025. Both figures, though, are way higher than Microsoft’s 2020 capital expenditure of “just” $17.6 billion. The majority of the increased spending is tied to cloud services and the expansion of AI infrastructure needed to provide compute capacity for OpenAI workloads. Separately, last October Amazon CEO Andy Jassy said his company planned total capex spend of $75 billion in 2024 and even more in 2025, with much of it going to AWS, its cloud computing division.

Read More »

John Deere unveils more autonomous farm machines to address skilled labor shortage

Self-driving tractors might be the path to self-driving cars. John Deere has revealed a new line of autonomous machines and tech across agriculture, construction and commercial landscaping. The Moline, Illinois-based John Deere has been in business for 187 years, yet it’s been a regular as a non-tech company showing off technology at the big tech trade show in Las Vegas and is back at CES 2025 with more autonomous tractors and other vehicles. This is not something we usually cover, but John Deere has a lot of data that is interesting in the big picture of tech. The message from the company is that there aren’t enough skilled farm laborers to do the work that its customers need. It’s been a challenge for most of the last two decades, said Jahmy Hindman, CTO at John Deere, in a briefing. Much of the tech will come this fall and after that. He noted that the average farmer in the U.S. is over 58 and works 12 to 18 hours a day to grow food for us. And he said the American Farm Bureau Federation estimates there are roughly 2.4 million farm jobs that need to be filled annually; and the agricultural work force continues to shrink. (This is my hint to the anti-immigration crowd).

John Deere’s autonomous 9RX Tractor. Farmers can oversee it using an app.

While each of these industries experiences its own set of challenges, a commonality across all is skilled labor availability. In construction, about 80% of contractors struggle to find skilled labor. And in commercial landscaping, 86% of landscaping business owners can’t find labor to fill open positions, he said. “They have to figure out how to do

Read More »

2025 playbook for enterprise AI success, from agents to evals

2025 is poised to be a pivotal year for enterprise AI. The past year has seen rapid innovation, and this year will see the same. This has made it more critical than ever to revisit your AI strategy to stay competitive and create value for your customers. From scaling AI agents to optimizing costs, here are the five critical areas enterprises should prioritize for their AI strategy this year.

1. Agents: the next generation of automation

AI agents are no longer theoretical. In 2025, they’re indispensable tools for enterprises looking to streamline operations and enhance customer interactions. Unlike traditional software, agents powered by large language models (LLMs) can make nuanced decisions, navigate complex multi-step tasks, and integrate seamlessly with tools and APIs. At the start of 2024, agents were not ready for prime time, making frustrating mistakes like hallucinating URLs. They started getting better as frontier large language models themselves improved. “Let me put it this way,” said Sam Witteveen, cofounder of Red Dragon, a company that develops agents for companies, and that recently reviewed the 48 agents it built last year. “Interestingly, the ones that we built at the start of the year, a lot of those worked way better at the end of the year just because the models got better.” Witteveen shared this in the video podcast we filmed to discuss these five big trends in detail. Models are getting better and hallucinating less, and they’re also being trained to do agentic tasks. Another feature that the model providers are researching is a way to use the LLM as a judge, and as models get cheaper (something we’ll cover below), companies can use three or more models to

Read More »

OpenAI’s red teaming innovations define new essentials for security leaders in the AI era

OpenAI has taken a more aggressive approach to red teaming than its AI competitors, demonstrating its security teams’ advanced capabilities in two areas: multi-step reinforcement and external red teaming. OpenAI recently released two papers that set a new competitive standard for improving the quality, reliability and safety of AI models in these two techniques and more. The first paper, “OpenAI’s Approach to External Red Teaming for AI Models and Systems,” reports that specialized teams outside the company have proven effective in uncovering vulnerabilities that might otherwise have made it into a released model because in-house testing techniques may have missed them. In the second paper, “Diverse and Effective Red Teaming with Auto-Generated Rewards and Multi-Step Reinforcement Learning,” OpenAI introduces an automated framework that relies on iterative reinforcement learning to generate a broad spectrum of novel, wide-ranging attacks.

Going all-in on red teaming pays practical, competitive dividends

It’s encouraging to see competitive intensity in red teaming growing among AI companies. When Anthropic released its AI red team guidelines in June of last year, it joined AI providers including Google, Microsoft, Nvidia, OpenAI, and even the U.S.’s National Institute of Standards and Technology (NIST), which all had released red teaming frameworks. Investing heavily in red teaming yields tangible benefits for security leaders in any organization. OpenAI’s paper on external red teaming provides a detailed analysis of how the company strives to create specialized external teams that include cybersecurity and subject matter experts. The goal is to see if knowledgeable external teams can defeat models’ security perimeters and find gaps in their security, biases and controls that prompt-based testing couldn’t find.
What makes OpenAI’s recent papers noteworthy is how well they define using human-in-the-middle

Read More »