This is the second in a two-part series on using SQLite for Machine Learning. In my last article, I dove into how SQLite is rapidly becoming a production-ready database for web applications. In this article, I will discuss how to perform retrieval-augmented generation using SQLite.
If you’d like a custom web application with generative AI integration, visit losangelesaiapps.com
The code referenced in this article can be found here.
When I first learned how to perform retrieval-augmented generation (RAG) as a budding data scientist, I followed the traditional path. This usually looks something like:
- Google retrieval-augmented-generation and look for tutorials
- Find the most popular framework, usually LangChain or LlamaIndex
- Find the most popular cloud vector database, usually Pinecone or Weaviate
- Read a bunch of docs, put all the pieces together, and success!
In fact, I wrote an article about my experience building a RAG system in LangChain with Pinecone.
There is nothing terribly wrong with using a RAG framework with a cloud vector database. However, I would argue that for first-time learners it overcomplicates the situation. Do we really need an entire framework to learn how to do RAG? Is it necessary to perform API calls to cloud vector databases? These databases act as black boxes, which is never good for learners (or frankly for anyone).
In this article, I will walk you through how to perform RAG on the simplest stack possible. In fact, this ‘stack’ is just SQLite with the sqlite-vec extension, plus the OpenAI API for its embedding and chat models. I recommend you read part 1 of this series for a deep dive on SQLite and how it is rapidly becoming production ready for web applications. For our purposes here, it is enough to understand that SQLite is the simplest kind of database possible: a single file in your repository.
So ditch your cloud vector databases and your bloated frameworks, and let’s do some RAG.
SQLite-Vec
One of the powers of the SQLite database is the use of extensions. For those of us familiar with Python, extensions are a lot like libraries. They are modular pieces of code written in C to extend the functionality of SQLite, making things that were once impossible possible. One popular example of a SQLite extension is the Full-Text Search (FTS) extension. This extension allows SQLite to perform efficient searches across large volumes of textual data. Because the extension is written purely in C, we can run it anywhere a SQLite database can run, including Raspberry Pis and browsers.
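To make this concrete, here is a minimal full-text search sketch. It assumes the FTS5 module is compiled into your SQLite build (it is in most modern builds, including the one bundled with CPython); the table and column names are made up for illustration:

```python
import sqlite3

db = sqlite3.connect(":memory:")

# FTS5 ships compiled into most modern SQLite builds, so no
# separate loading step is needed for this particular extension
db.execute("CREATE VIRTUAL TABLE notes USING fts5(body)")
db.executemany(
    "INSERT INTO notes (body) VALUES (?)",
    [("the quick brown fox",), ("a lazy dog sleeps",)],
)

# MATCH finds documents containing the exact token 'fox'
rows = db.execute("SELECT body FROM notes WHERE notes MATCH 'fox'").fetchall()
print(rows)  # [('the quick brown fox',)]
```

Note that `MATCH` only finds documents containing the literal token; this is exactly the limitation that vector search, discussed next, overcomes.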
In this article I will be going over the extension known as sqlite-vec. This gives SQLite the power of performing vector search. Vector search is similar to full-text search in that it allows for efficient search across textual data. However, rather than search for an exact word or phrase in the text, vector search has a semantic understanding. In other words, searching for “horses” will find matches of “equestrian”, “pony”, “Clydesdale”, etc. Full-text search is incapable of this.
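Under the hood, this semantic understanding comes from comparing embedding vectors with a distance or similarity metric. Here is a stdlib-only toy sketch with made-up 3-dimensional vectors (real embedding models produce hundreds or thousands of dimensions, and the values below are invented for illustration):

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: dot product divided by the product of magnitudes
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy vectors: imagine an embedding model placed related concepts
# close together in vector space
horse = [0.9, 0.1, 0.0]
pony = [0.85, 0.2, 0.05]
finance = [0.05, 0.1, 0.95]

# "horse" is far more similar to "pony" than to "finance"
print(cosine_similarity(horse, pony) > cosine_similarity(horse, finance))  # True
```

This is the core trick: "horses" and "pony" land near each other in vector space even though they share no characters.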
sqlite-vec makes use of virtual tables, as do most extensions in SQLite. A virtual table is similar to a regular table, but with additional powers:
- Custom Data Sources: The data for a standard table in SQLite is housed in a single db file. For a virtual table, the data can be housed in external sources, for example a CSV file or an API call.
- Flexible Functionality: Virtual tables can add specialized indexing or querying capabilities and support complex data types like JSON or XML.
- Integration with SQLite Query Engine: Virtual tables integrate seamlessly with SQLite’s standard query syntax, e.g. SELECT, INSERT, UPDATE, and DELETE. Ultimately it is up to the writers of the extension to support these operations.
- Use of Modules: The backend logic for how the virtual table will work is implemented by a module (written in C or another language).
The typical syntax for creating a virtual table looks like the following:
CREATE VIRTUAL TABLE my_table USING my_extension_module();
The important part of this statement is my_extension_module(). This specifies the module that will power the backend of the my_table virtual table. In sqlite-vec we will use the vec0 module.
Code Walkthrough
The code for this article can be found here. It is a simple directory with the majority of files being .txt files that we will use as our dummy data. Because I am a physics nerd, the majority of the files pertain to physics, with just a few files relating to other random fields. I will not present the full code in this walkthrough; instead I will highlight the important pieces. Clone my repo and play around with it to investigate the full code. Below is a tree view of the repo. Note that my_docs.db is the single-file database used by SQLite to manage all of our data.
.
├── data
│ ├── cooking.txt
│ ├── gardening.txt
│ ├── general_relativity.txt
│ ├── newton.txt
│ ├── personal_finance.txt
│ ├── quantum.txt
│ ├── thermodynamics.txt
│ └── travel.txt
├── my_docs.db
├── requirements.txt
└── sqlite_rag_tutorial.py
Step 1 is to install the necessary libraries. Below is our requirements.txt file. As you can see, it has only three libraries. I recommend creating a virtual environment with the latest Python version (3.13.1 was used for this article) and then running pip install -r requirements.txt to install the libraries.
# requirements.txt
sqlite-vec==0.1.6
openai==1.63.0
python-dotenv==1.0.1
Step 2 is to create an OpenAI API key if you don’t already have one. We will be using OpenAI to generate embeddings for the text files so that we can perform our vector search.
# sqlite_rag_tutorial.py
import sqlite3
from sqlite_vec import serialize_float32
import sqlite_vec
import os
from openai import OpenAI
from dotenv import load_dotenv

# Load the API key from a .env file and set up the OpenAI client
load_dotenv()
client = OpenAI(api_key=os.getenv('OPENAI_API_KEY'))
Step 3 is to load the sqlite-vec extension into SQLite. We will be using Python and SQL for our examples in this article. Disabling the ability to load extensions immediately after loading your extension is a good security practice.
# Path to the database file
db_path = 'my_docs.db'
# Delete the database file if it exists, so we start fresh
if os.path.exists(db_path):
    os.remove(db_path)

# Connect (this creates a new database file)
db = sqlite3.connect(db_path)
db.enable_load_extension(True)
sqlite_vec.load(db)
db.enable_load_extension(False)
Next we will go ahead and create our virtual table:
db.execute('''
CREATE VIRTUAL TABLE documents USING vec0(
embedding float[1536],
+file_name TEXT,
+content TEXT
)
''')
documents is a virtual table with three columns:

- embedding: a 1536-dimension float vector that will store the embeddings of our sample documents.
- file_name: text that will house the name of each file we store in the database. Note that this column and the following one have a + symbol in front of them in the CREATE statement. This indicates that they are auxiliary fields. Previously in sqlite-vec only embedding data could be stored in the virtual table. However, a recent update allows us to add fields to the table that we don’t actually want embedded. Here we store the content and name of each file in the same table as our embeddings, which lets us easily see which embeddings correspond to which content while sparing us the need for extra tables and JOIN statements.
- content: text that will store the content of each file.
Now that we have our virtual table set up in our SQLite database, we can begin converting our text files into embeddings and storing them in our table:
# Function to get embeddings using the OpenAI API
def get_openai_embedding(text):
    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=text
    )
    return response.data[0].embedding

# Iterate over .txt files in the /data directory
for file_name in os.listdir("data"):
    file_path = os.path.join("data", file_name)
    with open(file_path, 'r', encoding='utf-8') as file:
        content = file.read()

    # Generate embedding for the content
    embedding = get_openai_embedding(content)
    if embedding:
        # Insert file content and embedding into the vec0 table
        db.execute(
            'INSERT INTO documents (embedding, file_name, content) VALUES (?, ?, ?)',
            (serialize_float32(embedding), file_name, content)
        )

# Commit changes
db.commit()
We loop through each of our .txt files, embed the content of each file, and then use an INSERT INTO statement to insert the embedding, file_name, and content into the documents virtual table. A commit statement at the end ensures the changes are persisted. Note that we are using serialize_float32 here from the sqlite-vec library. SQLite itself does not have a built-in vector type, so it stores vectors as binary large objects (BLOBs) to save space and allow fast operations. Internally, serialize_float32 uses Python’s struct.pack() function, which converts Python data into C-style binary representations.
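A rough stdlib-only sketch of what that serialization amounts to (the helper name serialize_f32 is my own; see the sqlite-vec source for the real implementation):

```python
import struct

def serialize_f32(vector):
    # Pack a list of Python floats into a compact C-style array of
    # 32-bit floats -- roughly what sqlite-vec's serialize_float32 does
    return struct.pack(f"{len(vector)}f", *vector)

blob = serialize_f32([0.1, 0.2, 0.3])
print(len(blob))  # 12 -- three floats at 4 bytes each

# The BLOB round-trips back to (approximately) the original floats
restored = struct.unpack("3f", blob)
```

Note the round trip is only approximate: Python floats are 64-bit, so packing them as 32-bit floats loses a little precision.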
Finally, to perform RAG, we use the following code to do a K-Nearest-Neighbors (KNN) style query. This is the heart of vector search.
# Perform a sample KNN query
query_text = "What is general relativity?"
query_embedding = get_openai_embedding(query_text)
if query_embedding:
    rows = db.execute(
        """
        SELECT
            file_name,
            content,
            distance
        FROM documents
        WHERE embedding MATCH ?
        ORDER BY distance
        LIMIT 3
        """,
        [serialize_float32(query_embedding)]
    ).fetchall()

    print("Top 3 most similar documents:")
    top_contexts = []
    for row in rows:
        print(row)
        top_contexts.append(row[1])  # Append the 'content' column
We begin by taking in a query from the user, in this case “What is general relativity?”, and embedding that query using the same embedding model as before. We then perform a SQL operation. Let’s break it down:

- The SELECT statement means the retrieved data will have three columns: file_name, content, and distance. The first two we have already mentioned; distance will be calculated during the SQL operation, more on this in a moment.
- The FROM statement ensures you are pulling data from the documents table.
- The WHERE embedding MATCH ? statement performs a similarity search between all of the vectors in your database and the query vector. The returned data will include a distance column. This distance is a floating point number measuring the dissimilarity between the query and database vectors: the lower the number, the closer the vectors are. sqlite-vec provides a few options for how to calculate this distance.
- ORDER BY distance orders the retrieved vectors from most similar to least similar (lowest distance first).
- LIMIT 3 ensures we only get the three documents nearest to our query embedding vector. You can tweak this number to see how retrieving more or fewer vectors affects your results.
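The MATCH + ORDER BY distance pattern is, conceptually, a brute-force nearest-neighbor search. A stdlib-only sketch of the same idea, using toy 3-dimensional vectors (invented for illustration) and Euclidean (L2) distance, which I believe is sqlite-vec's default metric:

```python
import math

def l2_distance(a, b):
    # Euclidean distance between two equal-length vectors
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Toy stand-ins for the stored document embeddings
docs = {
    "general_relativity.txt": [0.9, 0.1, 0.0],
    "newton.txt": [0.6, 0.4, 0.1],
    "travel.txt": [0.0, 0.2, 0.9],
}

# Toy stand-in for the embedded query "What is general relativity?"
query = [0.85, 0.15, 0.05]

# The moral equivalent of: SELECT ... ORDER BY distance LIMIT 3
ranked = sorted(docs.items(), key=lambda kv: l2_distance(query, kv[1]))
print([name for name, _ in ranked[:3]])
```

The real extension, of course, does this in C against packed BLOBs, which is far faster than a Python loop.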
Given our query of “What is general relativity?”, the following documents were pulled. It did a pretty good job!
Top 3 most similar documents:
(‘general_relativity.txt’, ‘Einstein’s theory of general relativity redefined our understanding of gravity. Instead of viewing gravity as a force acting at a distance, it interprets it as the curvature of spacetime around massive objects. Light passing near a massive star bends slightly, galaxies deflect beams traveling millions of light-years, and clocks tick at different rates depending on their gravitational potential. This groundbreaking theory led to predictions like gravitational lensing and black holes, phenomena later confirmed by observational evidence, and it continues to guide our understanding of the cosmos.’, 0.8316285610198975)
(‘newton.txt’, ‘In classical mechanics, Newton’s laws of motion form the foundation of how we understand the movement of objects. Newton’s first law, often called the law of inertia, states that an object at rest remains at rest and an object in motion continues in motion unless acted upon by an external force. This concept extends into more complex physics problems, where analyzing net forces on objects allows us to predict their future trajectories and behaviors. Over time, applying Newton’s laws has enabled engineers and scientists to design safer vehicles, more efficient machines, and even guide spacecraft through intricate gravitational fields.’, 1.2036118507385254)
(‘quantum.txt’, ‘Quantum mechanics revolutionized our understanding of the microscopic world. Unlike classical particles, quantum entities such as electrons can exhibit both wave-like and particle-like behaviors. Phenomena like quantum superposition suggest that particles can exist in multiple states at once, and the act of measurement often “collapses” these states into one observed outcome. This strange and counterintuitive theory underpins modern technologies like semiconductors and lasers, and it provides a conceptual framework for emerging fields like quantum computing and cryptography.’, 1.251380205154419)
We can then stuff the model’s context window with these three documents and have it attempt to answer our question.
# Prepare the context for the query
context = "\n\n".join(top_contexts)
system_message = "You are a helpful assistant. Use the following context to answer the query."

# Send query and context to OpenAI
try:
    completion = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": system_message},
            {"role": "user", "content": f"Context: {context}\n\nQuery: {query_text}"}
        ]
    )
    print("Response:")
    print(completion.choices[0].message.content)
except Exception as e:
    print(f"Error generating response: {e}")
Here is what the model said:
General relativity is a theory developed by Albert Einstein that redefines our understanding of gravity. Instead of viewing gravity as a force acting at a distance, general relativity interprets it as the curvature of spacetime caused by the presence of mass. According to this theory, massive objects like stars and planets create a distortion in the fabric of spacetime, and this curvature affects the motion of other objects, including light, which can bend when it passes near a massive body.
This is faithfully sticking to the documents we gave the model. Great job, 4o-mini!
Conclusion
sqlite-vec is a project sponsored by the Mozilla Builders Accelerator program, so it has some significant backing behind it. I have to give a big thanks to Alex Garcia, the creator of sqlite-vec, for helping push the SQLite ecosystem forward and making ML possible with this simple database. It is a well-maintained library, with updates coming down the pipeline on a regular basis. As of November 20th, they have even added filtering by metadata! Perhaps I should redo my aforementioned RAG article using SQLite 🤔.
The extension also offers bindings for several popular programming languages, including Ruby, Go, Rust, and more.
The fact that we are able to radically simplify our RAG pipeline to the bare essentials is remarkable. To recap: there is no need for a database service to be spun up and spun down, like Postgres or MySQL. There is no need for API calls to cloud vector database vendors. If you deploy to a server directly via Digital Ocean or Hetzner, you can even avoid the costly and unnecessary complexity associated with managed cloud services like AWS, Azure, or Vercel.
I believe this simple architecture can work for a variety of applications. It is cheaper to use, easier to maintain, and faster to iterate on. Once you reach a certain scale it will likely make sense to migrate to a more robust database such as Postgres with the pgvector extension for RAG capabilities. For more advanced capabilities such as chunking and document cleaning, a framework may be the right choice. But for startups and smaller players, it’s SQLite to the moon.
Have fun trying out sqlite-vec for yourself!
