Stay Ahead, Stay ONMINE

How LLMs Work: Pre-Training to Post-Training, Neural Networks, Hallucinations, and Inference

With the recent explosion of interest in large language models (LLMs), they often seem almost magical. But let’s demystify them. I wanted to step back and unpack the fundamentals — breaking down how LLMs are built, trained, and fine-tuned to become the AI systems we interact with today. This two-part deep dive is something I’ve been meaning […]

With the recent explosion of interest in large language models (LLMs), they often seem almost magical. But let’s demystify them.

I wanted to step back and unpack the fundamentals — breaking down how LLMs are built, trained, and fine-tuned to become the AI systems we interact with today.

This two-part deep dive is something I’ve been meaning to do for a while and was also inspired by Andrej Karpathy’s widely popular 3.5-hour YouTube video, which has racked up 800,000+ views in just 10 days. Andrej is a founding member of OpenAI, his insights are gold— you get the idea.

If you have the time, his video is definitely worth watching. But let’s be real — 3.5 hours is a long watch. So, for all the busy folks who don’t want to miss out, I’ve distilled the key concepts from the first 1.5 hours into this 10-minute read, adding my own breakdowns to help you build a solid intuition.

What you’ll get

Part 1 (this article): Covers the fundamentals of LLMs, including pre-training to post-training, neural networks, Hallucinations, and inference.

Part 2: Reinforcement learning with human/AI feedback, investigating o1 models, DeepSeek R1, AlphaGo

Let’s go! I’ll start with looking at how LLMs are being built.

At a high level, there are 2 key phases: pre-training and post-training.

1. Pre-training

Before an LLM can generate text, it must first learn how language works. This happens through pre-training, a highly computationally intensive task.

Step 1: Data collection and preprocessing

The first step in training an LLM is gathering as much high-quality text as possible. The goal is to create a massive and diverse dataset containing a wide range of human knowledge.

One source is Common Crawl, which is a free, open repository of web crawl data containing 250 billion web pages over 18 years. However, raw web data is noisy — containing spam, duplicates and low quality content — so preprocessing is essential.If you’re interested in preprocessed datasets, FineWeb offers a curated version of Common Crawl, and is made available on Hugging Face.

Once cleaned, the text corpus is ready for tokenization.

Step 2: Tokenization

Before a neural network can process text, it must be converted into numerical form. This is done through tokenization, where words, subwords, or characters are mapped to unique numerical tokens.

Think of tokens as the building blocks — the fundamental building blocks of all language models. In GPT4, there are 100,277 possible tokens.A popular tokenizer, Tiktokenizer, allows you to experiment with tokenization and see how text is broken down into tokens. Try entering a sentence, and you’ll see each word or subword assigned a series of numerical IDs.

Step 3: Neural network training

Once the text is tokenized, the neural network learns to predict the next token based on its context. As shown above, the model takes an input sequence of tokens (e.g., “we are cook ing”) and processes it through a giant mathematical expression — which represents the model’s architecture — to predict the next token.

A neural network consists of 2 key parts:

  1. Parameters (weights) — the learned numerical values from training.
  2. Architecture (mathematical expression) — the structure defining how the input tokens are processed to produce outputs.

Initially, the model’s predictions are random, but as training progresses, it learns to assign probabilities to possible next tokens.

When the correct token (e.g. “food”) is identified, the model adjusts its billions of parameters (weights) through backpropagation — an optimization process that reinforces correct predictions by increasing their probabilities while reducing the likelihood of incorrect ones.

This process is repeated billions of times across massive datasets.

Base model — the output of pre-training

At this stage, the base model has learned:

  • How words, phrases and sentences relate to each other
  • Statistical patterns in your training data

However, base models are not yet optimised for real-world tasks. You can think of them as an advanced autocomplete system — they predict the next token based on probability, but with limited instruction-following ability.

A base model can sometimes recite training data verbatim and can be used for certain applications through in-context learning, where you guide its responses by providing examples in your prompt. However, to make the model truly useful and reliable, it requires further training.

2. Post training — Making the model useful

Base models are raw and unrefined. To make them helpful, reliable, and safe, they go through post-training, where they are fine-tuned on smaller, specialised datasets.

Because the model is a neural network, it cannot be explicitly programmed like traditional software. Instead, we “program” it implicitly by training it on structured labeled datasets that represent examples of desired interactions.

How post training works

Specialised datasets are created, consisting of structured examples on how the model should respond in different situations. 

Some types of post training include:

  1. Instruction/conversation fine tuning
    Goal: To teach the model to follow instructions, be task oriented, engage in multi-turn conversations, follow safety guidelines and refuse malicious requests, etc.
    Eg: InstructGPT (2022): OpenAI hired some 40 contractors to create these labelled datasets. These human annotators wrote prompts and provided ideal responses based on safety guidelines. Today, many datasets are generated automatically, with humans reviewing and editing them for quality.
  2. Domain specific fine tuning
    Goal: Adapt the model for specialised fields like medicine, law and programming.

Post training also introduces special tokens — symbols that were not used during pre-training — to help the model understand the structure of interactions. These tokens signal where a user’s input starts and ends and where the AI’s response begins, ensuring that the model correctly distinguishes between prompts and replies.

Now, we’ll move on to some other key concepts.

Inference — how the model generates new text

Inference can be performed at any stage, even midway through pre-training, to evaluate how well the model has learned.

When given an input sequence of tokens, the model assigns probabilities to all possible next tokens based on patterns it has learned during training.

Instead of always choosing the most likely token, it samples from this probability distribution — similar to flipping a biased coin, where higher-probability tokens are more likely to be selected.

This process repeats iteratively, with each newly generated token becoming part of the input for the next prediction. 

Token selection is stochastic and the same input can produce different outputs. Over time, the model generates text that wasn’t explicitly in its training data but follows the same statistical patterns.

Hallucinations — when LLMs generate false info

Why do hallucinations occur?

Hallucinations happen because LLMs do not “know” facts — they simply predict the most statistically likely sequence of words based on their training data.

Early models struggled significantly with hallucinations.

For instance, in the example below, if the training data contains many “Who is…” questions with definitive answers, the model learns that such queries should always have confident responses, even when it lacks the necessary knowledge.

When asked about an unknown person, the model does not default to “I don’t know” because this pattern was not reinforced during training. Instead, it generates its best guess, often leading to fabricated information.

How do you reduce hallucinations?

Method 1: Saying “I don’t know”

Improving factual accuracy requires explicitly training the model to recognise what it does not know — a task that is more complex than it seems.

This is done via self interrogation, a process that helps define the model’s knowledge boundaries.

Self interrogation can be automated using another AI model, which generates questions to probe knowledge gaps. If it produces a false answer, new training examples are added, where the correct response is: “I’m not sure. Could you provide more context?”

If a model has seen a question many times in training, it will assign a high probability to the correct answer.

If the model has not encountered the question before, it distributes probability more evenly across multiple possible tokens, making the output more randomised. No single token stands out as the most likely choice.

Fine tuning explicitly trains the model to handle low-confidence outputs with predefined responses. 

For example, when I asked ChatGPT-4o, “Who is asdja rkjgklfj?”, it correctly responded: “I’m not sure who that is. Could you provide more context?”

Method 2: Doing a web search

A more advanced method is to extend the model’s knowledge beyond its training data by giving it access to external search tools.

At a high level, when a model detects uncertainty, it can trigger a web search. The search results are then inserted into a model’s context window — essentially allowing this new data to be part of it’s working memory. The model references this new information while generating a response.

Vague recollections vs working memory

Generally speaking, LLMs have two types of knowledge access.

  1. Vague recollections — the knowledge stored in the model’s parameters from pre-training. This is based on patterns it learned from vast amounts of internet data but is not precise nor searchable.
  2. Working memory — the information that is available in the model’s context window, which is directly accessible during inference. Any text provided in the prompt acts as a short term memory, allowing the model to recall details while generating responses.

Adding relevant facts within the context window significantly improves response quality.

Knowledge of self 

When asked questions like “Who are you?” or “What built you?”, an LLM will generate a statistical best guess based on its training data, unless explicitly programmed to respond accurately. 

LLMs do not have true self-awareness, their responses depend on patterns seen during training.

One way to provide the model with a consistent identity is by using a system prompt, which sets predefined instructions about how it should describe itself, its capabilities, and its limitations.

To end off

That’s a wrap for Part 1! I hope this has helped you build intuition on how LLMs work. In Part 2, we’ll dive deeper into reinforcement learning and some of the latest models.

Got questions or ideas for what I should cover next? Drop them in the comments — I’d love to hear your thoughts. See you in Part 2! 🙂

Shape
Shape
Stay Ahead

Explore More Insights

Stay ahead with more perspectives on cutting-edge power, infrastructure, energy,  bitcoin and AI solutions. Explore these articles to uncover strategies and insights shaping the future of industries.

Shape

Western Digital wants to ramp-up hard disk drive speeds

Most enterprises are not using SATA drives, at least not with hot data. Perhaps cold storage but not frequently accessed data. They are using PCI Express based drives and those are considerably faster than anything Western Digital can engineer in a hard disk. Capacity aside, Western Digital is also aiming

Read More »

LoRaWAN reaches 125 million devices as industrial IoT expands

Satellite integration is set to grow Terrestrial LoRaWAN networks cannot achieve complete geographic coverage. Yegin cited Swisscom’s nationwide Switzerland deployment, which covers 97.2% of the population but cannot reach remote alpine terrain. Two LoRa Alliance members, Lacuna Space and Plan-S, already operate commercial LoRaWAN services from low Earth orbit. Standard

Read More »

Data stored in glass could last over 10,000 years, Microsoft says

Magnetic tape, the most widely deployed archival medium today, reflects those constraints. An LTO-10 (Linear Tape-Open) cartridge, the current enterprise benchmark, holds 30TB to 40TB native at 400MB/s, but its rated shelf life is just 30 years. It requires climate-controlled storage between 16°C and 25°C and migration roughly every five

Read More »

Insights: Venezuela – new legal frameworks vs. the inertia of history

@import url(‘https://fonts.googleapis.com/css2?family=Inter:[email protected]&display=swap’); a { color: var(–color-primary-main); } .ebm-page__main h1, .ebm-page__main h2, .ebm-page__main h3, .ebm-page__main h4, .ebm-page__main h5, .ebm-page__main h6 { font-family: Inter; } body { line-height: 150%; letter-spacing: 0.025em; font-family: Inter; } button, .ebm-button-wrapper { font-family: Inter; } .label-style { text-transform: uppercase; color: var(–color-grey); font-weight: 600; font-size: 0.75rem; } .caption-style { font-size: 0.75rem; opacity: .6; } #onetrust-pc-sdk [id*=btn-handler], #onetrust-pc-sdk [class*=btn-handler] { background-color: #c19a06 !important; border-color: #c19a06 !important; } #onetrust-policy a, #onetrust-pc-sdk a, #ot-pc-content a { color: #c19a06 !important; } #onetrust-consent-sdk #onetrust-pc-sdk .ot-active-menu { border-color: #c19a06 !important; } #onetrust-consent-sdk #onetrust-accept-btn-handler, #onetrust-banner-sdk #onetrust-reject-all-handler, #onetrust-consent-sdk #onetrust-pc-btn-handler.cookie-setting-link { background-color: #c19a06 !important; border-color: #c19a06 !important; } #onetrust-consent-sdk .onetrust-pc-btn-handler { color: #c19a06 !important; border-color: #c19a06 !important; } In this Insights episode of the Oil & Gas Journal ReEnterprised podcast, Head of Content Chris Smith updates the evolving situation in Venezuela as the industry attempts to navigate the best path forward while the two governments continue to hammer out the details. The discussion centers on the new legal frameworks being established in both countries within the context of fraught relations stretching back for decades. Want to hear more? Listen in on a January episode highlighting industry’s initial take following the removal of Nicholas Maduro from power. References Politico podcast Monaldi Substack Baker webinar Washington, Caracas open Venezuela to allow more oil sales 

Read More »

Eni makes Calao South discovery offshore Ivory Coast

@import url(‘https://fonts.googleapis.com/css2?family=Inter:[email protected]&display=swap’); a { color: var(–color-primary-main); } .ebm-page__main h1, .ebm-page__main h2, .ebm-page__main h3, .ebm-page__main h4, .ebm-page__main h5, .ebm-page__main h6 { font-family: Inter; } body { line-height: 150%; letter-spacing: 0.025em; font-family: Inter; } button, .ebm-button-wrapper { font-family: Inter; } .label-style { text-transform: uppercase; color: var(–color-grey); font-weight: 600; font-size: 0.75rem; } .caption-style { font-size: 0.75rem; opacity: .6; } #onetrust-pc-sdk [id*=btn-handler], #onetrust-pc-sdk [class*=btn-handler] { background-color: #c19a06 !important; border-color: #c19a06 !important; } #onetrust-policy a, #onetrust-pc-sdk a, #ot-pc-content a { color: #c19a06 !important; } #onetrust-consent-sdk #onetrust-pc-sdk .ot-active-menu { border-color: #c19a06 !important; } #onetrust-consent-sdk #onetrust-accept-btn-handler, #onetrust-banner-sdk #onetrust-reject-all-handler, #onetrust-consent-sdk #onetrust-pc-btn-handler.cookie-setting-link { background-color: #c19a06 !important; border-color: #c19a06 !important; } #onetrust-consent-sdk .onetrust-pc-btn-handler { color: #c19a06 !important; border-color: #c19a06 !important; } Eni SPA discovered gas and condensate in the Murene South-1X exploration well in Block CI-501, Ivory Coast. The well is the first exploration in the block and was drilled by the Saipem Santorini drilling ship about 8 km southwest of the Murene-1X discovery well in adjacent CI-205 block. The well was drilled to about 5,000 m TD in 2,200 m of water. Extensive data acquisition confirmed a main hydrocarbon bearing interval in high-quality Cenomanian sands with a gross thickness of about 50 m with excellent petrophysical properties, the operator said. Murene South-1X will undergo a full conventional drill stem test (DST) to assess the production capacity of this discovery, named Calao South. Calao South confirms the potential of the Calao channel complex that also includes the Calao discovery. It is the second largest discovery in the country after Baleine, with estimated volumes of up to 5.0 tcf of gas and 450 million bbl of condensate (about 1.4 billion bbl of oil). Eni is operator of Block CI-501 (90%) with partner Petroci Holding (10%).

Read More »

CFEnergía to supply natural gas to low-carbon methanol plant in Mexico

CFEnergía, a subsidiary of Mexico’s Federal Electricity Commission (CFE), has agreed to supply natural gas to Transition Industries LLC for its Pacifico Mexinol project near Topolobampo, Sinaloa, Mexico. Under the signed agreement, which enables the start of Pacifico Mexinol’s construction phase, CFEnergía will supply about 160 MMcfd of natural gas for an unspecified timeframe noted as “long term,” Transition Industries said in a release Feb. 16. The natural gas—to be sourced from the US and supplied at market prices via existing infrastructure—will be used as “critical input for Mexinol’s production of ultra-low carbon methanol,” the company said. Pacifico Mexinol The $3.3-billion Mexinol project, when it begins operations in late 2029 to early 2030, is expected to be the world’s largest ultra-low carbon chemicals plant with production of about 1.8 million tonnes of blue methanol and 350,000 tonnes of green methanol annually. Supply is aimed at markets in Asia, including Japan, while also boosting the development of the domestic market and the Mexican chemical industry. Mitsubishi Gas Chemical has committed to purchasing about 1 million tonnes/year of methanol from the project, about 50% of the project’s planned production. Transition Industries is jointly developing Pacifico Mexinol with the International Finance Corporation (IFC), a member of the World Bank Group. Last year, the company signed a contingent engineering, procurement, and construction (EPC) contract with the consortium of Samsung E&A Co., Ltd., Grupo Samsung E&A Mexico SA de CV, and Techint Engineering and Construction for the project. MAIRE group’s technology division NextChem, through its subsidiary KT TECH SpA, also signed a basic engineering, critical and proprietary equipment supply agreement with Samsung E&A in connection with its proprietary NX AdWinMethanol®Zero technology supply to the project.

Read More »

North Atlantic’s Gravenchon refinery scheduled for major turnaround

Canada-based North Atlantic Refining Ltd. France-based subsidiary North Atlantic France SAS is undertaking planned maintenance in March at its North Atlantic Energies-operated 230,000-b/d Notre-Dame-de-Gravenchon refinery in Port-Jérôme-sur-Seine, Normandy. Scheduled to begin on Mar. 3 with the phased shutdown of unidentified units at the refinery, the upcoming turnaround will involve thorough inspections of associated equipment designed for continuous operation, as well as unspecified works to improve energy efficiency, environmental performance, and overall competitiveness of the site, North Atlantic Energies said on Feb. 16. Part of the operator’s routine maintenance program aimed at meeting regulatory requirements to ensure the safety, compliance, and long-term performance of the refinery, North Atlantic Energies said the scheduled turnaround will not interrupt product supplies to customers during the shutdown period. While the company confirmed the phased shutdown of units slated for work during the maintenance event would last for several days, the operator did not reveal a definitive timeline for the entire duration of the turnaround. Further details regarding specific works to be carried out during the major maintenance event were not revealed. The upcoming turnaround will be the first to be executed under North Atlantic Group’s ownership, which completed its purchase of the formerly majority-owned ExxonMobil Corp. refinery and associated petrochemical assets at the site in November 2025.

Read More »

Azule Energy starts Ndungu full field production offshore Angola

Azule Energy has started full field production from Ndungu, part of the Agogo Integrated West Hub Project (IWH) in the western area of Block 15/06, offshore Angola. Ndungo full field lies about 10 km from the NGOMA FPSO in a water depth of around 1,100 m and comprises seven production wells and four injection wells, with an expected production peak of 60,000 b/d of oil. The National Agency for Petroleum, Gas and Biofuels (ANPG) and Azule Energy noted the full field start-up with first oil of three production wells. The phased integration of IWH, with Ndungu full field producing first via N’goma FPSO and later via Agogo FPSO, is expected to reach a peak output of about 175,000 b/d across the two fields. The fields have combined estimated reserves of about 450 million bbl. The Agogo IWH project is operated by Azule Energy with a 36.84% stake alongside partners Sonangol E&P (36.84%) and Sinopec International (26.32%).   

Read More »

Ovintiv to divest Anadarko assets for $3 billion

In a release Feb. 17, Brendan McCracken, Ovintiv president and chief executive officer, said the company has “built one of the deepest premium inventory positions in our industry in the two most valuable plays in North America, the Permian and the Montney,” and that the Anadarko assets sale “positions [Ovintiv] to deliver superior returns for our shareholders for many years to come.” Ovintiv in 2025 had noted plans to sell the asset to help offset the cost of its acquisition of NuVista Energy Ltd. That $2.7-billion cash and stock deal, which closed earlier this month, added about 930 net 10,000-ft equivalent well locations and about 140,000 net acres (70% undeveloped) in the core of the oil-rich Alberta Montney.  Proceeds from the Anadarko assets sale are earmarked for accelerated debt reduction, the company said.  Ovintiv’s sale of its Anadarko assets is expected to close early in this year’s second quarter, subject to customary conditions, with an effective date of Jan. 1, 2026.

Read More »

Meta scoops up more of Nvidia’s AI chip output

“No one deploys AI at Meta’s scale,” Nvidia CEO Jensen Huang said in a news release. Meta plans capital expenditure, mostly on data centers and the computing infrastructure they contain, of $115 billion-$135 billion this year — more than some hyperscalers, which rent their computing capacity to others. Meta is keeping it all for itself. This could be bad news for other enterprises, as the demands of the hyperscalers and big customers like Meta is leading to a decrease in the availability of chips to train and run AI models. IDC is predicting that the broader AI-driven chip shortage will have a significant effect on the IT market over the next two years as companies struggle to replace everything from laptops to servers. In particular, businesses looking for Nvidia processors may be forced to look elsewhere.

Read More »

ECL targets AI data centers with fuel-agnostic power platform

Power availability has become a gating factor for many data center projects, particularly where developers need larger connections or rapid delivery. Grid constraints can also influence where operators place compute for low-latency AI workloads. “Inference has to live close to people, data and applications, in and around major cities, smaller metros and industrial hubs where there is rarely a spare 50 or 100 megawatts sitting on the grid, and almost never a mature hydrogen ecosystem,” said Bachar. In typical data center design, the facilities are planned around 1 energy source, be it electrical grid, solar and other renewables, or diesel generated. All require different layouts and designs. One design does not fit all power sources. FlexGrid lets the data center use any power source it wants and switch to a new source without requiring a redesign of the facilities.

Read More »

AI likely to put a major strain on global networks—are enterprises ready?

“When AI pipelines slow down or traffic overloads common infrastructure, business processes slow down, and customer experience degrades,” Kale says. “Since many organizations are using AI to enable their teams to make critical decisions, disruptions caused by AI-related failures will be experienced instantly by both internal teams and external customers.” A single bottleneck can quickly cascade through an organization, Kales says, “reducing the overall value of the broader digital ecosystem.” In 2026, “we will see significant disruption from accelerated appetite for all things AI,” research firm Forrester noted in a late-year predictions post. “Business demands of AI systems, network connectivity, AI for IT operations, the conversational AI-powered service desk, and more are driving substantial changes that tech leaders must enable within their organizations.” And in a 2025 study of about 1,300 networking, operations, cloud, and architecture professionals worldwide, Broadcom noted a “readiness gap” between the desire for AI and network preparedness. While 99% of organizations have cloud strategies and are adopting AI, only 49% say their networks can support the bandwidth and low latency that AI requires, according to Broadcom’s  2026 State of Network Operations report. “AI is shifting Internet traffic from human-paced to machine-paced, and machines generate 100 times more requests with zero off-hours,” says Ed Barrow, CEO of Cloud Capital, an investment management firm focused on acquiring, managing, and operating data centers. “Inference workloads in particular create continuous, high-intensity, globally distributed traffic patterns,” Barrow says. “A single AI feature can trigger millions of additional requests per hour, and those requests are heavier—higher bandwidth, higher concurrency, and GPU-accelerated compute on the other side of the network.”

Read More »

Adani bets $100 billion on AI data centers as India eyes global hub status

The sovereignty question Adani framed the investment as a matter of national digital sovereignty, saying it would reserve a significant portion of GPU capacity for Indian AI startups and research institutions. Analysts were not convinced the structure supported the claim. “I believe it is too distant from digital sovereignty if the majority of the projects are being built to serve leading MNC AI hyperscalers,” said Shah. “Equal investments have to happen for public AI infrastructure, and the data of billions of users — from commerce to content to health — must remain sovereign.” Gogia framed the gap in operational terms. “Ownership alone does not define sovereignty,” he said. “The practical determinants are who controls privileged access during incidents, where critical workloads fail over when grids are stressed, and what regulatory oversight mechanisms are contractually enforceable.” Those are questions Adani has not yet answered and the market, analysts say, will be watching for more than just construction progress. But Banerjee said the market would not wait nine years to judge the announcement. “In practice, I think the market will judge this on near-term proof points, grid capacity secured, power contracting in place, and anchor tenants signed, rather than the headline capex or long-dated targets,” he said.

Read More »

Arista laments ‘horrendous’ memory situation

Digging in on campus Arista has been clear about its plans to grow its presence campus networking environments. Last Fall, Ullal said she expects Arista’s campus and WAN business would grow from the current $750 million-$800 million run rate to $1.25 billion, representing a 60% growth opportunity for the company. “We are committed to our aggressive goal of $1.25 billion for ’26 for the cognitive campus and branch. We have also successfully deployed in many routing edge, core spine and peering use cases,” Ullal said. “In Q4 2025, Arista launched our flagship 7800 R4 spine for many routing use cases, including DCI, AI spines with that massive 460 terabits of capacity to meet the demanding needs of multiservice routing, AI workloads and switching use cases. The combined campus and routing adjacencies together contribute approximately 18% of revenue.” Ethernet leads the way “In terms of annual 2025 product lines, our core cloud, AI and data center products built upon our highly differentiated Arista EOS stack is successfully deployed across 10 gig to 800 gigabit Ethernet speeds with 1.6 terabit migration imminent,” Ullal said. “This includes our portfolio of EtherLink AI and our 7000 series platforms for best-in-class performance, power efficiency, high availability, automation, agility for both the front and back-end compute, storage and all of the interconnect zones.” Ullal said she expects Ethernet will get even more of a boost later this year when the multivendor Ethernet for Scale-Up Networking (ESUN) specification is released.  “We have consistently described that today’s configurations are mostly a combination of scale out and scale up were largely based on 800G and smaller ratings. Now that the ESUN specification is well underway, we need a good solid spec. Otherwise, we’ll be shipping proprietary products like some people in the world do today. And so we will tie our

Read More »

From NIMBY to YIMBY: A Playbook for Data Center Community Acceptance

Across many conversations at the start of this year, at PTC and other conferences alike, the word on everyone’s lips seems to be “community.” For the data center industry, that single word now captures a turning point from just a few short years ago: we are no longer a niche, back‑of‑house utility, but a front‑page presence in local politics, school board budgets, and town hall debates. That visibility is forcing a choice in how we tell our story—either accept a permanent NIMBY-reactive framework, or actively build a YIMBY narrative that portrays the real value digital infrastructure brings to the markets and surrounding communities that host it. Speaking regularly with Ilissa Miller, CEO of iMiller Public Relations about this topic, there is work to be done across the ecosystem to build communications. Miller recently reflected: “What we’re seeing in communities isn’t a rejection of digital infrastructure, it’s a rejection of uncertainty driven by anxiety and fear. Most local leaders have never been given a framework to evaluate digital infrastructure developments the way they evaluate roads, water systems, or industrial parks. When there’s no shared planning language, ‘no’ becomes the safest answer.” A Brief History of “No” Community pushback against data centers is no longer episodic; it has become organized, media‑savvy, and politically influential in key markets. In Northern Virginia, resident groups and environmental organizations have mobilized against large‑scale campuses, pressing counties like Loudoun and Prince William to tighten zoning, question incentives, and delay or reshape projects.1 Loudoun County’s move in 2025 to end by‑right approvals for new facilities, requiring public hearings and board votes, marked a watershed moment as the world’s densest data center market signaled that communities now expect more say over where and how these campuses are built. Prince William County’s decision to sharply increase its tax rate on

Read More »

Microsoft will invest $80B in AI data centers in fiscal 2025

And Microsoft isn’t the only one that is ramping up its investments into AI-enabled data centers. Rival cloud service providers are all investing in either upgrading or opening new data centers to capture a larger chunk of business from developers and users of large language models (LLMs).  In a report published in October 2024, Bloomberg Intelligence estimated that demand for generative AI would push Microsoft, AWS, Google, Oracle, Meta, and Apple would between them devote $200 billion to capex in 2025, up from $110 billion in 2023. Microsoft is one of the biggest spenders, followed closely by Google and AWS, Bloomberg Intelligence said. Its estimate of Microsoft’s capital spending on AI, at $62.4 billion for calendar 2025, is lower than Smith’s claim that the company will invest $80 billion in the fiscal year to June 30, 2025. Both figures, though, are way higher than Microsoft’s 2020 capital expenditure of “just” $17.6 billion. The majority of the increased spending is tied to cloud services and the expansion of AI infrastructure needed to provide compute capacity for OpenAI workloads. Separately, last October Amazon CEO Andy Jassy said his company planned total capex spend of $75 billion in 2024 and even more in 2025, with much of it going to AWS, its cloud computing division.

Read More »

John Deere unveils more autonomous farm machines to address skill labor shortage

Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More Self-driving tractors might be the path to self-driving cars. John Deere has revealed a new line of autonomous machines and tech across agriculture, construction and commercial landscaping. The Moline, Illinois-based John Deere has been in business for 187 years, yet it’s been a regular as a non-tech company showing off technology at the big tech trade show in Las Vegas and is back at CES 2025 with more autonomous tractors and other vehicles. This is not something we usually cover, but John Deere has a lot of data that is interesting in the big picture of tech. The message from the company is that there aren’t enough skilled farm laborers to do the work that its customers need. It’s been a challenge for most of the last two decades, said Jahmy Hindman, CTO at John Deere, in a briefing. Much of the tech will come this fall and after that. He noted that the average farmer in the U.S. is over 58 and works 12 to 18 hours a day to grow food for us. And he said the American Farm Bureau Federation estimates there are roughly 2.4 million farm jobs that need to be filled annually; and the agricultural work force continues to shrink. (This is my hint to the anti-immigration crowd). John Deere’s autonomous 9RX Tractor. Farmers can oversee it using an app. While each of these industries experiences their own set of challenges, a commonality across all is skilled labor availability. In construction, about 80% percent of contractors struggle to find skilled labor. And in commercial landscaping, 86% of landscaping business owners can’t find labor to fill open positions, he said. “They have to figure out how to do

Read More »

2025 playbook for enterprise AI success, from agents to evals

Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More 2025 is poised to be a pivotal year for enterprise AI. The past year has seen rapid innovation, and this year will see the same. This has made it more critical than ever to revisit your AI strategy to stay competitive and create value for your customers. From scaling AI agents to optimizing costs, here are the five critical areas enterprises should prioritize for their AI strategy this year. 1. Agents: the next generation of automation AI agents are no longer theoretical. In 2025, they’re indispensable tools for enterprises looking to streamline operations and enhance customer interactions. Unlike traditional software, agents powered by large language models (LLMs) can make nuanced decisions, navigate complex multi-step tasks, and integrate seamlessly with tools and APIs. At the start of 2024, agents were not ready for prime time, making frustrating mistakes like hallucinating URLs. They started getting better as frontier large language models themselves improved. “Let me put it this way,” said Sam Witteveen, cofounder of Red Dragon, a company that develops agents for companies, and that recently reviewed the 48 agents it built last year. “Interestingly, the ones that we built at the start of the year, a lot of those worked way better at the end of the year just because the models got better.” Witteveen shared this in the video podcast we filmed to discuss these five big trends in detail. Models are getting better and hallucinating less, and they’re also being trained to do agentic tasks. Another feature that the model providers are researching is a way to use the LLM as a judge, and as models get cheaper (something we’ll cover below), companies can use three or more models to

Read More »

OpenAI’s red teaming innovations define new essentials for security leaders in the AI era

Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More OpenAI has taken a more aggressive approach to red teaming than its AI competitors, demonstrating its security teams’ advanced capabilities in two areas: multi-step reinforcement and external red teaming. OpenAI recently released two papers that set a new competitive standard for improving the quality, reliability and safety of AI models in these two techniques and more. The first paper, “OpenAI’s Approach to External Red Teaming for AI Models and Systems,” reports that specialized teams outside the company have proven effective in uncovering vulnerabilities that might otherwise have made it into a released model because in-house testing techniques may have missed them. In the second paper, “Diverse and Effective Red Teaming with Auto-Generated Rewards and Multi-Step Reinforcement Learning,” OpenAI introduces an automated framework that relies on iterative reinforcement learning to generate a broad spectrum of novel, wide-ranging attacks. Going all-in on red teaming pays practical, competitive dividends It’s encouraging to see competitive intensity in red teaming growing among AI companies. When Anthropic released its AI red team guidelines in June of last year, it joined AI providers including Google, Microsoft, Nvidia, OpenAI, and even the U.S.’s National Institute of Standards and Technology (NIST), which all had released red teaming frameworks. Investing heavily in red teaming yields tangible benefits for security leaders in any organization. OpenAI’s paper on external red teaming provides a detailed analysis of how the company strives to create specialized external teams that include cybersecurity and subject matter experts. The goal is to see if knowledgeable external teams can defeat models’ security perimeters and find gaps in their security, biases and controls that prompt-based testing couldn’t find. What makes OpenAI’s recent papers noteworthy is how well they define using human-in-the-middle

Read More »

Community service

The bird is a beautiful silver-gray, and as she dies twitching in the lasernet I’m grateful for two things: First, that she didn’t make a

Read More »