How LLMs Work: Pre-Training to Post-Training, Neural Networks, Hallucinations, and Inference

Stay Ahead, Stay ONMINE

How LLMs Work: Pre-Training to Post-Training, Neural Networks, Hallucinations, and Inference

With the recent explosion of interest in large language models (LLMs), they often seem almost magical. But let’s demystify them. I wanted to step back and unpack the fundamentals — breaking down how LLMs are built, trained, and fine-tuned to become the AI systems we interact with today. This two-part deep dive is something I’ve been meaning […]

With the recent explosion of interest in large language models (LLMs), they often seem almost magical. But let’s demystify them.

I wanted to step back and unpack the fundamentals — breaking down how LLMs are built, trained, and fine-tuned to become the AI systems we interact with today.

This two-part deep dive is something I’ve been meaning to do for a while and was also inspired by Andrej Karpathy’s widely popular 3.5-hour YouTube video, which has racked up 800,000+ views in just 10 days. Andrej is a founding member of OpenAI, his insights are gold— you get the idea.

If you have the time, his video is definitely worth watching. But let’s be real — 3.5 hours is a long watch. So, for all the busy folks who don’t want to miss out, I’ve distilled the key concepts from the first 1.5 hours into this 10-minute read, adding my own breakdowns to help you build a solid intuition.

What you’ll get

Part 1 (this article): Covers the fundamentals of LLMs, including pre-training to post-training, neural networks, Hallucinations, and inference.

Part 2: Reinforcement learning with human/AI feedback, investigating o1 models, DeepSeek R1, AlphaGo

Let’s go! I’ll start with looking at how LLMs are being built.

At a high level, there are 2 key phases: pre-training and post-training.

1. Pre-training

Before an LLM can generate text, it must first learn how language works. This happens through pre-training, a highly computationally intensive task.

Step 1: Data collection and preprocessing

The first step in training an LLM is gathering as much high-quality text as possible. The goal is to create a massive and diverse dataset containing a wide range of human knowledge.

One source is Common Crawl, which is a free, open repository of web crawl data containing 250 billion web pages over 18 years. However, raw web data is noisy — containing spam, duplicates and low quality content — so preprocessing is essential.If you’re interested in preprocessed datasets, FineWeb offers a curated version of Common Crawl, and is made available on Hugging Face.

Once cleaned, the text corpus is ready for tokenization.

Step 2: Tokenization

Before a neural network can process text, it must be converted into numerical form. This is done through tokenization, where words, subwords, or characters are mapped to unique numerical tokens.

Think of tokens as the building blocks — the fundamental building blocks of all language models. In GPT4, there are 100,277 possible tokens.A popular tokenizer, Tiktokenizer, allows you to experiment with tokenization and see how text is broken down into tokens. Try entering a sentence, and you’ll see each word or subword assigned a series of numerical IDs.

Step 3: Neural network training

Once the text is tokenized, the neural network learns to predict the next token based on its context. As shown above, the model takes an input sequence of tokens (e.g., “we are cook ing”) and processes it through a giant mathematical expression — which represents the model’s architecture — to predict the next token.

A neural network consists of 2 key parts:

Parameters (weights) — the learned numerical values from training.
Architecture (mathematical expression) — the structure defining how the input tokens are processed to produce outputs.

Initially, the model’s predictions are random, but as training progresses, it learns to assign probabilities to possible next tokens.

When the correct token (e.g. “food”) is identified, the model adjusts its billions of parameters (weights) through backpropagation — an optimization process that reinforces correct predictions by increasing their probabilities while reducing the likelihood of incorrect ones.

This process is repeated billions of times across massive datasets.

Base model — the output of pre-training

At this stage, the base model has learned:

How words, phrases and sentences relate to each other
Statistical patterns in your training data

However, base models are not yet optimised for real-world tasks. You can think of them as an advanced autocomplete system — they predict the next token based on probability, but with limited instruction-following ability.

A base model can sometimes recite training data verbatim and can be used for certain applications through in-context learning, where you guide its responses by providing examples in your prompt. However, to make the model truly useful and reliable, it requires further training.

2. Post training — Making the model useful

Base models are raw and unrefined. To make them helpful, reliable, and safe, they go through post-training, where they are fine-tuned on smaller, specialised datasets.

Because the model is a neural network, it cannot be explicitly programmed like traditional software. Instead, we “program” it implicitly by training it on structured labeled datasets that represent examples of desired interactions.

How post training works

Specialised datasets are created, consisting of structured examples on how the model should respond in different situations.

Some types of post training include:

Instruction/conversation fine tuning
Goal: To teach the model to follow instructions, be task oriented, engage in multi-turn conversations, follow safety guidelines and refuse malicious requests, etc.
Eg: InstructGPT (2022): OpenAI hired some 40 contractors to create these labelled datasets. These human annotators wrote prompts and provided ideal responses based on safety guidelines. Today, many datasets are generated automatically, with humans reviewing and editing them for quality.
Domain specific fine tuning
Goal: Adapt the model for specialised fields like medicine, law and programming.

Post training also introduces special tokens — symbols that were not used during pre-training — to help the model understand the structure of interactions. These tokens signal where a user’s input starts and ends and where the AI’s response begins, ensuring that the model correctly distinguishes between prompts and replies.

Now, we’ll move on to some other key concepts.

Inference — how the model generates new text

Inference can be performed at any stage, even midway through pre-training, to evaluate how well the model has learned.

When given an input sequence of tokens, the model assigns probabilities to all possible next tokens based on patterns it has learned during training.

Instead of always choosing the most likely token, it samples from this probability distribution — similar to flipping a biased coin, where higher-probability tokens are more likely to be selected.

This process repeats iteratively, with each newly generated token becoming part of the input for the next prediction.

Token selection is stochastic and the same input can produce different outputs. Over time, the model generates text that wasn’t explicitly in its training data but follows the same statistical patterns.

Hallucinations — when LLMs generate false info

Why do hallucinations occur?

Hallucinations happen because LLMs do not “know” facts — they simply predict the most statistically likely sequence of words based on their training data.

Early models struggled significantly with hallucinations.

For instance, in the example below, if the training data contains many “Who is…” questions with definitive answers, the model learns that such queries should always have confident responses, even when it lacks the necessary knowledge.

When asked about an unknown person, the model does not default to “I don’t know” because this pattern was not reinforced during training. Instead, it generates its best guess, often leading to fabricated information.

How do you reduce hallucinations?

Method 1: Saying “I don’t know”

Improving factual accuracy requires explicitly training the model to recognise what it does not know — a task that is more complex than it seems.

This is done via self interrogation, a process that helps define the model’s knowledge boundaries.

Self interrogation can be automated using another AI model, which generates questions to probe knowledge gaps. If it produces a false answer, new training examples are added, where the correct response is: “I’m not sure. Could you provide more context?”

If a model has seen a question many times in training, it will assign a high probability to the correct answer.

If the model has not encountered the question before, it distributes probability more evenly across multiple possible tokens, making the output more randomised. No single token stands out as the most likely choice.

Fine tuning explicitly trains the model to handle low-confidence outputs with predefined responses.

For example, when I asked ChatGPT-4o, “Who is asdja rkjgklfj?”, it correctly responded: “I’m not sure who that is. Could you provide more context?”

Method 2: Doing a web search

A more advanced method is to extend the model’s knowledge beyond its training data by giving it access to external search tools.

At a high level, when a model detects uncertainty, it can trigger a web search. The search results are then inserted into a model’s context window — essentially allowing this new data to be part of it’s working memory. The model references this new information while generating a response.

Vague recollections vs working memory

Generally speaking, LLMs have two types of knowledge access.

Vague recollections — the knowledge stored in the model’s parameters from pre-training. This is based on patterns it learned from vast amounts of internet data but is not precise nor searchable.
Working memory — the information that is available in the model’s context window, which is directly accessible during inference. Any text provided in the prompt acts as a short term memory, allowing the model to recall details while generating responses.

Adding relevant facts within the context window significantly improves response quality.

Knowledge of self

When asked questions like “Who are you?” or “What built you?”, an LLM will generate a statistical best guess based on its training data, unless explicitly programmed to respond accurately.

LLMs do not have true self-awareness, their responses depend on patterns seen during training.

One way to provide the model with a consistent identity is by using a system prompt, which sets predefined instructions about how it should describe itself, its capabilities, and its limitations.

To end off

That’s a wrap for Part 1! I hope this has helped you build intuition on how LLMs work. In Part 2, we’ll dive deeper into reinforcement learning and some of the latest models.

Got questions or ideas for what I should cover next? Drop them in the comments — I’d love to hear your thoughts. See you in Part 2! 🙂

Stay Ahead

Explore More Insights

Stay ahead with more perspectives on cutting-edge power, infrastructure, energy, bitcoin and AI solutions. Explore these articles to uncover strategies and insights shaping the future of industries.

Meta’s compute grab continues with agreement to deploy tens of millions of AWS Graviton cores

“This is really about control of the AI system, not just scale,” said Kimball. As AI evolves toward persistent, agentic workloads, the role of the CPU becomes “quite meaningful;” it serves as the control plane, handling orchestration, managing memory, scheduling, and other intensive tasks across accelerators. “This is especially true

Cirrascale to offer on-prem Google Gemini models

Google Distributed Cloud can be deployed in customer-controlled environments, including installations that are disconnected from the Internet, which is a key requirement for some government and critical-infrastructure users. One of the big challenges is that these models are incredibly valuable and they need to be delivered in a trusted, secure

Cisco switch aimed at building practical quantum networks

Cisco today unveiled a prototype switch it says will significantly accelerate the timeline for practical, distributed, quantum-computing-based networks. Cisco’s Universal Quantum Switch is designed to connect quantum systems from different vendors, such as IBM, IonQ, Google and Rigetti, in all major qubit encoding technologies, at room temperature, and over standard

It’s the end of set-and-forget security

For IT pros, this translates into: Designing topologies and routing policies that support near real‑time, partial restores of critical services without hard cutovers. Ensuring backup traffic, recovery workflows, and security tooling share telemetry so SecOps can correlate “what changed on the wire” with “what was restored.” Treating recovery points and

US BLM to offer 400,000 acres for oil and gas leasing under ANWR’s coastal plain in June

The US Bureau of Land Management (BLM) will offer oil and gas leases on 400,000 acres under the Alaska National Wildlife Refuge (ANWR)’s coastal plain on June 5, the first in a series of at least four sales required under the One Big Beautiful Bill Act (OBBBA), which the Trump administration now calls the Working Families Tax Cut act. Recent attempts to lease land for oil and gas development in the 1.5-million-acre coastal plain (the “1002 Area”) of ANWR have generated little interest, with the most recent federal lease sale in January 2025 yielding zero bids and no revenue for federal or state taxpayers. This sale was the second auction mandated by another bill, the 2017 Tax Cuts and Jobs Act. The first sale under that law, held in January 2021, offered 1.1 million acres but yielded only $14.4 million in high bids, less than 1% of the roughly $1 billion originally estimated. BLM noted, however, that a recent federal lease sale in the National Petroleum Reserve in Alaska generated strong participation, which could portend a stronger showing for the upcoming ANWR sale. “The record-breaking success of last month’s lease sale in Alaska’s National Petroleum Reserve sent a clear signal: There is robust and continuing demand for Alaskan energy, underscoring the need for more opportunities like the Coastal Plain sale,” Acting BLM Director Bill Groffy said in a statement. “By expanding these opportunities, we strengthen our national energy security, support high-paying jobs for Alaskans, and help ensure Americans have access to affordable energy.” The Mar. 18 NPR-A sale resulted in 187 leases and $163.7 million in total receipts. Oil and gas development in ANWR remains contentious because of its ecologically sensitive environment and ongoing lawsuits from indigenous groups and environmental organizations. Majors, including ExxonMobil, ConocoPhillips, and bp have left the area

Oil prices decline as Strait traffic resumes

Friday’s move has the May 2026 WTI NYMEX futures are trading below the 8-, 13-, and 21-day Moving Averages with a Low that breached the Lower-Bollinger Band limit. Volume is down to 80,000 as May expires next week and traders turn their attention to June. The Relative Strength Indicator (RSI), a momentum indicator, has fallen back into neutral territory at 42. Resistance is now pegged at $93.70 (8-day MA) while near-term Support is $82.45 (Bollinger Band). As has been the pattern for several weeks now, traders have to be cautious with their Friday positions as the market is closed until Sunday evening and the US/Iran talks continue on Saturday. Looking ahead Questions now remain in terms of the duration of the Israeli ceasefire with Lebanon which Iran has tied to the opening of the Strait of Hormuz. Should Israel violate the ceasefire, it would put Iran’s IRGC back in direct conflict with US naval forces in the area should the former attempt to close the Strait again. US/Iran negotiations are scheduled to continue this weekend in Islamabad. Once again, markets will be closed until Sunday evening so the outcome of those talks will be key to market direction on the Open. Should peace hold, there will need to be a very detailed assessment of the long-term damage to all oil and gas infrastructure in the region. The tanker tracking map below indicates loaded oil vessels are exiting the Strait of Hormuz. Natural gas, fundamental analysis May NYMEX natural gas futures have now been on a 5-week downtrend on mild weather and a larger-than-expected storage injections despite healthy LNG export volumes. The week’s High was Monday’s $2.72/MMbtu while the Low was Tuesday’s $2.56, a tight range which indicates market direction uncertainty. Natural gas demand this week has been estimated at about

Phillips 66, Kinder Morgan move forward with Western Gateway pipeline with secured shipper interest

Phillips 66 Co. and Kinder Morgan Inc. have secured sufficient shipper interest to advance the proposed Western Gateway refined products pipeline project to supply fuel to ‌Arizona and California, the companies said in a joint release Apr. 20. Following a second open season to secure long-term shipper commitments, the companies will “move the project forward, subject to the execution of definitive transportation service agreements, joint venture agreements, and respective board approvals,” the companies said. “Customer response during the open season underscores the importance of Western Gateway in addressing long term refined products logistics needs in the region,” said Phillips 66 chairman and chief executive officer Mark Lashier. “By utilizing existing pipeline assets across multiple states along the route, we’re uniquely well-positioned to support a refined products transportation solution,” said Kim Dang, Kinder Morgan chief executive officer. Western Gateway pipeline specs The planned 200,000-b/d Western Gateway project is designed as a 1,300-mile refined products system with a new-build pipeline from Borger, Tex. to Phoenix, Ariz., combined with Kinder Morgan’s existing SFPP LP pipeline from Colton, Calif. to Phoenix, Ariz., which will be reversed to enable east-to-west product flows into California. It will be fed from supplies connected to Borger as well as supplies already connected to SFPP’s system in El Paso, Tex. The Gold Pipeline, operated by Phillips 66, which currently flows from Borger to St. Louis, will be reversed to enable refined products from midcontinent refineries to flow toward Borger and supply the Western Gateway pipeline. Western Gateway will also have connectivity to Las Vegas, Nev. via Kinder Morgan’s 566-mile CALNEV Pipeline. The Western Gateway Pipeline is targeting completion by 2029. Phillips 66 will build the entirety of the new pipeline and will operate the line from Borger, Tex., to El Paso, Tex. Kinder Morgan will operate the line from El

Viva Energy reports on Geelong refinery status following fire

Viva Energy Group Ltd. has stabilized operations at its 120,000-b/d Geelong refinery in Victoria, Australia, which continues operating at reduced rates following a mid-April fire in the site’s gasoline complex. In an Apr. 20 update to the market, Viva Energy confirmed the Apr. 15 fire specifically occurred in the complex’s alkylation unit and was not fully extinguished until the morning of Apr. 16. While the refinery’s crude distillation units and reformer continue operating, the site’s residue catalytic cracking unit (RCCU) remains temporarily offline as part of ongoing stabilization efforts, according to the company. In the near term, Viva Energy said it expects the refinery’s diesel and jet fuel production to average about 80% normal capacity, with gasoline output reduced to about 60% capacity. The company anticipates production constraints to ease in the coming weeks, subject to inspection and restart of the RCCU, which would allow the refinery’s combined output diesel, jet fuel, and gasoline to exceed 90% of nameplate capacity until all necessary repairs are completed. With sufficient fuel inventories already on hand, Viva Energy said it remains well-positioned to maintain normal fuel supplies to customers during the production shortfalls. “The whole Viva Energy team understands how important our refinery is to the energy security of the country, especially at the current time. We will progressively restore production once we are confident that it is safe to do so, and do not expect any disruptions to fuel availability or price increases for Viva Energy’s customers as a result of this incident,” Scott Wyatt, Viva Energy’s chief executive officer, said in a separate statement. While the company confirmed an assessment of damage to the alkylation unit and associated systems is under way, estimated timelines for full repairs and financial impacts resulting from the fire have yet to be determined. Alongside prioritizing

Oil prices plunge following full reopening of the Strait of Hormuz to commercial vessels

Oil prices plunged on Apr. 17, as geopolitical tensions in the Middle East showed signs of easing, following the full reopening of the Strait of Hormuz to commercial vessels. Global crude markets reacted sharply after Iran confirmed that the Strait of Hormuz is now “completely open” to commercial shipping during an ongoing ceasefire tied to regional conflict negotiations. The announcement marked a major turning point after weeks of disruption that had severely constrained global oil flows. Stay updated on oil price volatility, shipping disruptions, LNG market analysis, and production output at OGJ’s Iran war content hub. Brent crude fell by more than 10%, dropping to around $88–89/bbl, while US West Texas Intermediate (WTI) declined to the low $80s—both benchmarks hitting their lowest levels in over a month. The sell-off reflects a rapid unwinding of the geopolitical risk premium that had built up during the conflict. The reopening follows a fragile, 10-day ceasefire involving Israel and Lebanon, alongside tentative progress in US–Iran negotiations. While the waterway is now open, the US has maintained a naval blockade on Iranian ports, signaling that broader geopolitical risks have not fully dissipated. The return of tanker traffic through the Gulf could gradually restore millions of barrels per day to global markets, easing the tight conditions that had driven recent price volatility. However, some uncertainty remains over how quickly shipping activity will normalize and whether the ceasefire will hold. Despite the sharp price decline, the oil market remains structurally fragile. Weeks of disruption have depleted inventories and altered trade flows, and it may take time for supply chains to fully recover. Additionally, any breakdown in ceasefire talks could quickly reverse the current trend. Beyond energy markets, the development rippled across global financial systems. Equity markets surged, with major US indices posting strong gains as lower oil

EIA: US crude inventories up 1.9 million bbl

US crude oil inventories for the week ended Apr. 17, excluding the Strategic Petroleum Reserve, increased by 1.9 million bbl from the previous week, according to data from the US Energy Information Administration (EIA). At 465.7 million bbl, US crude oil inventories are about 3% above the 5-year average for this time of year, the EIA report indicated. EIA said total motor gasoline inventories decreased by 4.6 million bbl from last week and are about 0.5% below the 5-year average for this time of year. Finished gasoline inventories increased while blending components inventories decreased last week. Distillate fuel inventories decreased by 3.4 million bbl last week and are about 8% below the 5-year average for this time of year. Propane-propylene inventories increased by 2.1 million bbl from last week and are 69% above the 5-year average for this time of year, EIA said. US crude oil refinery inputs averaged 16.0 million b/d for the week, which was 55,000 b/d less than the previous week’s average. Refineries operated at 89.1% of capacity. Gasoline production increased, averaging 10.1 million b/d. Distillate fuel production increased, averaging 5.0 million b/d. US crude oil imports averaged 6.1 million b/d, up 787,000 b/d from the previous week. Over the last 4 weeks, crude oil imports averaged about 6.0 million b/d, 0.4% less than the same 4-week period last year. Total motor gasoline imports averaged 587,000 b/d. Distillate fuel imports averaged 190,000 b/d.

BYOP Moves to the Center of Data Center Strategy

Self-Sufficiency Becomes a Feature, Not a Risk Consider Wyoming’s Project Jade, where county commissioners approved an AI campus tied to 2.7 GW of new natural gas-fired generation being developed by Tallgrass Energy. Reporting from POWER described the project as a “bring your own power” model designed for a high degree of self-sufficiency, with a mix of natural gas generation and Bloom fuel cells. The campus is expected to scale significantly over time. What stands out is not only the size, but the positioning. Self-sufficiency is becoming a selling point both for developers seeking to de-risk timelines, and for local stakeholders wary of overloading existing utility infrastructure. Fuel Cells and Nuclear: The Middle Ground and the Long Game Fuel cells occupy an important middle ground in this shift. Bloom Energy’s 2026 report positions fuel cells as a leading onsite option due to shorter lead times, modular deployment, and lower local emissions. Market activity suggests that interest is real. For developers, fuel cells can be easier to permit than large turbine installations and can be deployed incrementally. That makes them effective as bridge-to-grid solutions or as permanent components of hybrid architectures. Advanced nuclear remains the most strategically significant, but least immediate, BYOP pathway. Companies including Switch and other data center operators have explored partnerships with Oklo around its Aurora small modular reactor design. Nuclear holds long-term appeal because it offers firm, low-carbon power at scale. But for current AI buildouts, it remains a future option rather than a near-term construction solution. The immediate reality is that gas and modular onsite systems are closing the time-to-power gap, while nuclear is being positioned as a longer-duration successor as licensing and deployment timelines evolve. The model itself is also evolving. BYOP is beginning to blur the line between developer, energy provider, and compute customer. Reuters

Microsoft Builds for Two Worlds: Sovereign Cloud and AI Factories

So far in 2026, across the United States and overseas, Microsoft is building an infrastructure portfolio at full hyperscale. The strategy runs on two tracks. The first is familiar: sovereign cloud expansion involving new regions, local data residency, and compliance-driven enterprise infrastructure. The second is larger and more consequential: purpose-built AI factory campuses designed for dense GPU clusters, liquid cooling, private fiber, and power acquisition at a scale that extends far beyond traditional cloud infrastructure. Despite reports last year that Microsoft was pulling back on data center development, the company is accelerating. It is not only advancing its own large-scale campuses, but also absorbing premium AI capacity originally aligned with OpenAI. In Texas and Norway, projects tied to OpenAI’s infrastructure plans have shifted back into Microsoft’s orbit. Even after contractual changes gave OpenAI greater flexibility to source compute elsewhere, Microsoft remains the market’s most reliable backstop buyer for top-tier AI infrastructure. It no longer needs to control every OpenAI build to maintain its position. In 2026, Microsoft is still the company best positioned to turn uncertain AI demand into deployed capacity, e.g. concrete, steel, power, and silicon at scale. Building at Industrial Scale The clearest indicator of Microsoft’s intent is its capital spending. In its January 2026 earnings cycle, Reuters reported that Microsoft’s quarterly capital expenditures reached a record $37.5 billion, up nearly 66% year over year. The company’s cloud backlog rose to $625 billion, with roughly 45% of remaining performance obligations tied to OpenAI. About two-thirds of that quarterly capex was directed toward compute chips. To be clear: this is no speculative buildout. Microsoft is deploying capital against a massive, committed demand pipeline, even as it maintains significant exposure to OpenAI-driven workloads. The company is solving two infrastructure problems at once: supporting broad Azure and Copilot growth, while ensuring

AI’s Execution Era: Aligned and Netrality on Power, Speed, and the New Data Center Reality

At Data Center World 2026, the industry didn’t need convincing that something fundamental has shifted. “This feels different,” said Bill Kleyman as he opened a keynote fireside with Phill Lawson-Shanks and Amber Caramella. “In the past 24 months, we’ve seen more evolution… than in the two decades before.” What followed was less a forecast than a field report from the front lines of the AI infrastructure buildout—where demand is immediate, power is decisive, and execution is everything. A Different Kind of Growth Cycle For Caramella, the shift starts with scale—and speed. “What feels fundamentally different is just the sheer pace and breadth of the demand combined with a real shift in architecture,” she said. Vacancy rates have collapsed even as capacity expands. AI workloads are not just additive—they are redefining absorption curves across the market. But the deeper change is behavioral. “Over 75% of people are using AI in their day-to-day business… and now the conversation is shifting to agentic AI,” Caramella noted. That shift—from tools to delegated workflows—points to a second wave of infrastructure demand that has not yet fully materialized. Lawson-Shanks framed the transformation in more structural terms. The industry, he said, has always followed a predictable chain: workload → software → hardware → facility → location. That chain has broken. “We had a very predictable industry… prior to Covid. And Covid changed everything,” he said, describing how hyperscale demand compressed deployment cycles overnight. What followed was a surge that utilities—and supply chains—were not prepared to meet. From Capacity to Constraint: Power Becomes Strategy If AI has a gating factor, it is no longer compute. It is power. “Before it used to be an operational convenience,” Caramella said. “Now it’s a strategic advantage—or constraint if you don’t have it.” That shift is reshaping executive decision-making. Power is no

The Trillion-Dollar AIDC Boom Gets Real: Omdia Maps the Path From Megaclusters to Microgrids

The AI data center buildout is getting bigger, denser, and more electrically complex than even many bullish observers expected. That was the core message from Omdia’s Data Center World analyst summit, where Senior Director Vlad Galabov and Practice Lead Shen Wang laid out a view of the market that has grown more expansive in just the past year. What had been a large-scale infrastructure story is now, in Omdia’s telling, something closer to a full-stack industrial transition: hyperscalers are still leading, but enterprises, second-tier cloud providers, and new AI use cases are beginning to add demand on top of demand. Omdia’s updated forecast reflects that shift. Galabov said the firm has now raised its 2030 projection for data center investment beyond the $1.6 trillion figure it showed a year ago, arguing that surging AI usage, expanding buyer classes, and the emergence of new power infrastructure categories have all forced a rethink. “One of the reasons why we raised it is that people keep using more AI,” Galabov said. “And that just means more money, because we need to buy more GPUs to run the AI.” That is the simple version. The more consequential one is that AI is no longer behaving like a contained technology cycle. It is spilling outward into adjacent infrastructure markets, including batteries, gas-fired onsite generation, and high-voltage DC power architectures that until recently sat well outside the mainstream data center conversation. A Market Moving Faster Than the Forecasts Galabov opened by revisiting the predictions Omdia made last year for 2030. On several fronts, he said, the market is already validating them faster than expected. AI applications are becoming commonplace. AI has become the dominant driver of data center investment. Self-generation is no longer a fringe strategy. Even some of the rack-scale architecture concepts that once looked

Data Center World 2026: Innovation Spotlight

Belden + OptiCool: Modular Cooling for the AI Middle Market At Data Center World 2026, company representatives from Belden and OptiCool described a joint push into integrated rack-level infrastructure—pairing connectivity, power, and modular cooling into a single deployable system aimed squarely at enterprise and mid-market colocation providers. The partnership reflects a shift already underway inside Belden itself. Long known as a manufacturer of wire, cable, and connectivity products, the company said it has spent the last several years evolving into a solutions provider—leveraging a broader portfolio that spans industrial networking, automation, and control systems. That repositioning is now extending into AI infrastructure. From Components to Fully Integrated Systems Rather than selling discrete products into bid cycles, Belden is now packaging racks, PDUs, cable management, and cooling into a unified offering—delivered as a manufacturer-backed system rather than a third-party integration. “We can bring a full solution to the table now,” a company representative said, emphasizing that the company is “standing behind the solution as a manufacturer, not as a system integrator.” The cooling layer comes via OptiCool, whose rear-door heat exchanger (RDHx) technology is designed to scale alongside uncertain AI workloads. Two-Phase Rear Door Cooling at Rack Scale OptiCool’s approach centers on two-phase cooling applied at the rear door, combining the non-invasive characteristics of RDHx with the efficiency gains typically associated with direct-to-chip liquid cooling. According to company representatives, the system: Supports up to 120 kW per rack (with 60 kW demonstrated on the show floor) Delivers up to 10x cooling capacity compared to traditional approaches Operates at roughly one-third the energy consumption of comparable single-phase systems Instead of injecting cold air, the system extracts heat using refrigerant as the heat sink, reducing demand on CRAC units and broader facility cooling infrastructure. Designing for Uncertainty: Modular, Swappable Capacity The defining feature—and

Space data-center news: Roundup of extraterrestrial AI endeavors

Orbital is betting that distributed inference can scale as a constellation, with each satellite handling workloads in parallel. The company is also filing with the FCC for a larger constellation. Lonestar announces first commercial space data storage service April 2026: Lonestar Data Holdings announced StarVault, which it’s calling “the world’s first commercially operational space-based sovereign data storage platform.” The service launches in October 2026 aboard Sidus Space’s LizzieSat-4 mission. StarVault isn’t a full data center — it’s data storage with “advanced cryptographic key escrow capabilities,” according to the announcement. But it’s the first commercial space data service that enterprises can actually buy. Lonestar says demand from governments, financial institutions, and critical infrastructure operators has already exceeded expectations, and the company has ordered a second payload for launch next year. Lonestar has already flown four proof-of-concept data centers to space, including two to the Moon, according to the announcement. This is different because it’s the first one designed for paying customers. Atomic-6 launches a marketplace for buying orbital capacity April 2026: Atomic-6, a space systems company in Marietta, Georgia, has launched ODC.space — basically, a marketplace where you spec, price, and order orbital data center capacity the way you’d order a rack from a colo provider. You can buy either a sovereign satellite, where you get the whole thing, or colocated, where you rent space on someone else’s capacity, according to the announcement. Atomic-6 handles spacecraft build, launch, licensing, and operations through a partner network. You just supply the processors and the workload. Delivery runs two to three years, which Atomic-6 is carefully positioning against terrestrial data center timelines that now routinely run five-plus. Base configurations start with 1U nodes on satellites rated up to 100 kW. Connectivity starts at 1 Gbps. A sovereign rack runs $3.5 million a month, Atomic-6

Microsoft will invest $80B in AI data centers in fiscal 2025

And Microsoft isn’t the only one that is ramping up its investments into AI-enabled data centers. Rival cloud service providers are all investing in either upgrading or opening new data centers to capture a larger chunk of business from developers and users of large language models (LLMs). In a report published in October 2024, Bloomberg Intelligence estimated that demand for generative AI would push Microsoft, AWS, Google, Oracle, Meta, and Apple would between them devote $200 billion to capex in 2025, up from $110 billion in 2023. Microsoft is one of the biggest spenders, followed closely by Google and AWS, Bloomberg Intelligence said. Its estimate of Microsoft’s capital spending on AI, at $62.4 billion for calendar 2025, is lower than Smith’s claim that the company will invest $80 billion in the fiscal year to June 30, 2025. Both figures, though, are way higher than Microsoft’s 2020 capital expenditure of “just” $17.6 billion. The majority of the increased spending is tied to cloud services and the expansion of AI infrastructure needed to provide compute capacity for OpenAI workloads. Separately, last October Amazon CEO Andy Jassy said his company planned total capex spend of $75 billion in 2024 and even more in 2025, with much of it going to AWS, its cloud computing division.

John Deere unveils more autonomous farm machines to address skill labor shortage

Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More Self-driving tractors might be the path to self-driving cars. John Deere has revealed a new line of autonomous machines and tech across agriculture, construction and commercial landscaping. The Moline, Illinois-based John Deere has been in business for 187 years, yet it’s been a regular as a non-tech company showing off technology at the big tech trade show in Las Vegas and is back at CES 2025 with more autonomous tractors and other vehicles. This is not something we usually cover, but John Deere has a lot of data that is interesting in the big picture of tech. The message from the company is that there aren’t enough skilled farm laborers to do the work that its customers need. It’s been a challenge for most of the last two decades, said Jahmy Hindman, CTO at John Deere, in a briefing. Much of the tech will come this fall and after that. He noted that the average farmer in the U.S. is over 58 and works 12 to 18 hours a day to grow food for us. And he said the American Farm Bureau Federation estimates there are roughly 2.4 million farm jobs that need to be filled annually; and the agricultural work force continues to shrink. (This is my hint to the anti-immigration crowd). John Deere’s autonomous 9RX Tractor. Farmers can oversee it using an app. While each of these industries experiences their own set of challenges, a commonality across all is skilled labor availability. In construction, about 80% percent of contractors struggle to find skilled labor. And in commercial landscaping, 86% of landscaping business owners can’t find labor to fill open positions, he said. “They have to figure out how to do

2025 playbook for enterprise AI success, from agents to evals

Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More 2025 is poised to be a pivotal year for enterprise AI. The past year has seen rapid innovation, and this year will see the same. This has made it more critical than ever to revisit your AI strategy to stay competitive and create value for your customers. From scaling AI agents to optimizing costs, here are the five critical areas enterprises should prioritize for their AI strategy this year. 1. Agents: the next generation of automation AI agents are no longer theoretical. In 2025, they’re indispensable tools for enterprises looking to streamline operations and enhance customer interactions. Unlike traditional software, agents powered by large language models (LLMs) can make nuanced decisions, navigate complex multi-step tasks, and integrate seamlessly with tools and APIs. At the start of 2024, agents were not ready for prime time, making frustrating mistakes like hallucinating URLs. They started getting better as frontier large language models themselves improved. “Let me put it this way,” said Sam Witteveen, cofounder of Red Dragon, a company that develops agents for companies, and that recently reviewed the 48 agents it built last year. “Interestingly, the ones that we built at the start of the year, a lot of those worked way better at the end of the year just because the models got better.” Witteveen shared this in the video podcast we filmed to discuss these five big trends in detail. Models are getting better and hallucinating less, and they’re also being trained to do agentic tasks. Another feature that the model providers are researching is a way to use the LLM as a judge, and as models get cheaper (something we’ll cover below), companies can use three or more models to

OpenAI’s red teaming innovations define new essentials for security leaders in the AI era

Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More OpenAI has taken a more aggressive approach to red teaming than its AI competitors, demonstrating its security teams’ advanced capabilities in two areas: multi-step reinforcement and external red teaming. OpenAI recently released two papers that set a new competitive standard for improving the quality, reliability and safety of AI models in these two techniques and more. The first paper, “OpenAI’s Approach to External Red Teaming for AI Models and Systems,” reports that specialized teams outside the company have proven effective in uncovering vulnerabilities that might otherwise have made it into a released model because in-house testing techniques may have missed them. In the second paper, “Diverse and Effective Red Teaming with Auto-Generated Rewards and Multi-Step Reinforcement Learning,” OpenAI introduces an automated framework that relies on iterative reinforcement learning to generate a broad spectrum of novel, wide-ranging attacks. Going all-in on red teaming pays practical, competitive dividends It’s encouraging to see competitive intensity in red teaming growing among AI companies. When Anthropic released its AI red team guidelines in June of last year, it joined AI providers including Google, Microsoft, Nvidia, OpenAI, and even the U.S.’s National Institute of Standards and Technology (NIST), which all had released red teaming frameworks. Investing heavily in red teaming yields tangible benefits for security leaders in any organization. OpenAI’s paper on external red teaming provides a detailed analysis of how the company strives to create specialized external teams that include cybersecurity and subject matter experts. The goal is to see if knowledgeable external teams can defeat models’ security perimeters and find gaps in their security, biases and controls that prompt-based testing couldn’t find. What makes OpenAI’s recent papers noteworthy is how well they define using human-in-the-middle