LLM + RAG: Creating an AI-Powered File Reader Assistant

Introduction

AI is everywhere. 

It is hard not to interact at least once a day with a Large Language Model (LLM). The chatbots are here to stay. They’re in your apps, they help you write better, they compose emails, they read emails…well, they do a lot.

And I don’t think that that is bad. In fact, my opinion is the other way – at least so far. I defend and advocate for the use of AI in our daily lives because, let’s agree, it makes everything much easier.

I don’t have to spend time re-reading a document to find punctuation problems or typos. AI does that for me. I don’t waste time writing that follow-up email every single Monday. AI does that for me. I don’t need to read a huge and boring contract when I have an AI to summarize the main takeaways and action points for me!

These are only some of AI’s great uses. If you’d like to know more use cases of LLMs to make our lives easier, I wrote a whole book about them.

Now, thinking as a data scientist and looking at the technical side, not everything is that bright and shiny. 

LLMs are great for several general use cases that apply to anyone or any company: for example, coding, summarizing, or answering questions about general content created before the training cutoff date. However, when it comes to a specific business application, a single narrow purpose, or something new that didn’t make the cutoff date, the models won’t be that useful out of the box, meaning they will not know the answer. Thus, they will need adjustments.

Training an LLM can take months and millions of dollars. What is even worse is that if we don’t adjust and tune the model to our purpose, we will get unsatisfactory results or hallucinations (when the model produces a response that sounds plausible but is wrong or unsupported by the facts).

So what is the solution, then? Spending a lot of money retraining the model to include our data?

Not really. That’s where Retrieval-Augmented Generation (RAG) becomes useful.

RAG is a framework that combines getting information from an external knowledge base with large language models (LLMs). It helps AI models produce more accurate and relevant responses.

Let’s learn more about RAG next.

What is RAG?

Let me tell you a story to illustrate the concept.

I love movies. There was a time when I knew which movies were competing for Best Picture at the Oscars, and who was up for best actor and actress. And I would certainly know which ones took home the statue that year. But now I am all rusty on that subject. If you asked me who was competing, I would not know. And even if I tried to answer you, I would give you a weak response.

So, to provide you with a quality response, I will do what everybody else does: search for the information online, obtain it, and then give it to you. What I just did is the same idea as RAG: I obtained data from an external source to give you an answer.

When we give the LLM a content store where it can go and retrieve data to augment (increase) what it knows, that is the RAG framework in action, and it lets the model respond more accurately.

Diagram: User prompts and content using LLM + RAG
User prompt about Content C. LLM retrieves external content to aggregate to the answer. Image by the author.

Summarizing, a RAG system:

  1. Uses search algorithms to query external data sources, such as databases, knowledge bases, and web pages.
  2. Pre-processes the retrieved information.
  3. Incorporates the pre-processed information into the prompt sent to the LLM.
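
The three steps above can be sketched in plain Python. This is a toy illustration, not how a production retriever works: the `KNOWLEDGE_BASE` list, the word-overlap scoring, and the `build_prompt` helper are all assumptions made up for this example.

```python
import re

# Hypothetical content store; in a real system this would be a database or vector store.
KNOWLEDGE_BASE = [
    "The return policy allows refunds within 30 days of purchase.",
    "Support is available by email from Monday to Friday.",
    "Orders over 50 dollars ship for free.",
]


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, documents: list[str], k: int = 1) -> list[str]:
    """Step 1: rank documents by how many words they share with the query."""
    query_words = tokenize(query)
    ranked = sorted(
        documents,
        key=lambda doc: len(tokenize(doc) & query_words),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query: str, retrieved: list[str]) -> str:
    """Steps 2 and 3: tidy the retrieved text and place it in the prompt."""
    context = "\n".join(doc.strip() for doc in retrieved)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


question = "What is the return policy for refunds?"
prompt = build_prompt(question, retrieve(question, KNOWLEDGE_BASE))
print(prompt)
```

The string in `prompt` is what would be sent to the LLM, so the model answers from the retrieved text rather than from memory alone.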

Why use RAG?

Now that we know what the RAG framework is, let’s understand why we should be using it.

Here are some of the benefits:

  • RAG enhances factual accuracy by grounding answers in real data.
  • RAG can help LLMs process and consolidate knowledge to create more relevant answers.
  • RAG can help LLMs access additional knowledge bases, such as internal organizational data.
  • RAG can help LLMs create more accurate domain-specific content.
  • RAG can help reduce knowledge gaps and AI hallucinations.

As previously explained, I like to say that with the RAG framework, we are giving the model an internal search engine over the content we want to add to its knowledge base.

Well. All of that is very interesting. But let’s see an application of RAG. We will learn how to create an AI-powered PDF Reader Assistant.

Project

This is an application that allows users to upload a PDF document and ask questions about its content using AI-powered natural language processing (NLP) tools.

  • The app uses Streamlit as the front end.
  • The backend uses LangChain, OpenAI’s gpt-4o-mini model, and FAISS (Facebook AI Similarity Search) for document retrieval and question answering.

Let’s break down the steps for better understanding:

  1. Load a PDF file and split it into chunks of text.
    1. This makes the data optimized for retrieval.
  2. Present the chunks to an embedding tool.
    1. Embeddings are numerical vector representations of data used to capture relationships, similarities, and meanings in a way that machines can understand. They are widely used in Natural Language Processing (NLP), recommender systems, and search engines.
  3. Put those chunks of text and their embeddings in the same vector database for retrieval.
  4. Finally, make the database available to the LLM.
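
To make step 2 more concrete, here is a toy sketch of what comparing embeddings means. Real embedding models produce learned dense vectors with hundreds or thousands of dimensions; the word-count vectors and the `VOCABULARY` list below are assumptions chosen only so the example runs without any external service.

```python
import math

# Hypothetical tiny vocabulary; real embeddings are learned, not word counts.
VOCABULARY = ["contract", "payment", "deadline", "movie", "actor"]


def embed(text: str) -> list[float]:
    """Map text to a vector counting how often each vocabulary word appears."""
    words = text.lower().split()
    return [float(words.count(term)) for term in VOCABULARY]


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Return the cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


query = embed("when is the payment deadline")
chunk_about_contracts = embed("the contract sets a payment deadline")
chunk_about_movies = embed("the actor starred in a movie")

print(cosine_similarity(query, chunk_about_contracts))
print(cosine_similarity(query, chunk_about_movies))
```

The chunk that shares meaning with the query scores higher, and that ranking is what the retrieval step relies on.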

Data preparation

Preparing a content store for the LLM will take some steps, as we just saw. So, let’s start by creating a function that can load a file and split it into text chunks for efficient retrieval.

# Imports
from langchain_community.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter

def load_document(pdf):
    """
    Load a PDF and split it into chunks for efficient retrieval.

    :param pdf: Path to the PDF file to load
    :return: List of chunks of text
    """

    # Load the PDF; PyPDFLoader returns one document per page
    loader = PyPDFLoader(pdf)
    docs = loader.load()

    # Instantiate the text splitter with a chunk size of 500 characters and an
    # overlap of 100 characters, so that context is not lost at chunk boundaries
    text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=100)
    # Split into chunks for efficient retrieval
    chunks = text_splitter.split_documents(docs)

    return chunks
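
To see what chunk size and overlap do, here is a simplified sliding-window splitter in plain Python. It is only an illustration of the two parameters, measured in characters: RecursiveCharacterTextSplitter itself prefers to break on separators such as paragraphs, lines, and spaces, which this sketch does not attempt. The `split_with_overlap` name is hypothetical.

```python
def split_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Cut text into fixed-size character windows that overlap with their neighbour."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
chunks = split_with_overlap(sample, chunk_size=10, chunk_overlap=4)
print(chunks)
# The last 4 characters of one chunk are the first 4 of the next one.
print(chunks[0][-4:], chunks[1][:4])
```

Because neighbouring chunks share some text, a sentence that straddles a boundary still appears whole in at least one chunk.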

Next, we will start building our Streamlit app, and we’ll use that function in the next script.

Web application

We will begin by importing the necessary modules in Python. Most of those will come from the langchain packages.

FAISS stores the embeddings and performs the similarity search used for document retrieval; OpenAIEmbeddings transforms the text chunks into numerical vectors so that their similarity to a query can be calculated; ChatOpenAI is what enables us to interact with the OpenAI API; create_retrieval_chain is what actually does the RAG, retrieving the relevant chunks and passing them to the LLM; create_stuff_documents_chain glues the model and the ChatPromptTemplate together.

Note: You will need to generate an OpenAI key to be able to run this script. If it’s the first time you’re creating your account, you may get some free credits. But if you have had it for some time, it is possible that you will have to add 5 dollars in credits to be able to access OpenAI’s API. An alternative for the embedding step is using Hugging Face’s embeddings.

# Imports
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings
from langchain.chains import create_retrieval_chain
from langchain_openai import ChatOpenAI
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_core.prompts import ChatPromptTemplate
from scripts.secret import OPENAI_KEY
from scripts.document_loader import load_document
import streamlit as st
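
The import from scripts.secret assumes a local module that holds the key. If you would rather not keep the key in a source file, one common alternative is to read it from an environment variable. The sketch below is one way to do that; the `get_openai_key` helper is hypothetical, and `OPENAI_API_KEY` is used as the default name because that is the variable the OpenAI client conventionally looks for.

```python
import os


def get_openai_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Return the API key stored in the given environment variable, or fail loudly."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable before starting the app."
        )
    return key
```

With this helper, `OPENAI_KEY = get_openai_key()` could replace the import from scripts.secret, and the rest of the script would stay the same.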

This first code snippet will create the app title, create a box for file upload, and prepare the file to be passed to the load_document() function.

# Create a Streamlit app
st.title("AI-Powered Document Q&A")

# Load document to streamlit
uploaded_file = st.file_uploader("Upload a PDF file", type="pdf")

# If a file is uploaded, create the TextSplitter and vector database
if uploaded_file:

    # Tell the user that the document is being processed (with a watch emoji)
    st.write("Processing document... :watch:")

    # Streamlit returns an in-memory file, but PyPDFLoader expects a path,
    # so write the uploaded bytes to a temporary file on disk first
    temp_file = "./temp.pdf"
    with open(temp_file, "wb") as file:
        file.write(uploaded_file.getvalue())

    # Load document and split it into chunks for efficient retrieval.
    chunks = load_document(temp_file)
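
A fixed path such as "./temp.pdf" works for a single user, but two sessions uploading at the same time would overwrite each other's file. A possible variation, sketched below with Python's standard tempfile module, gives every upload its own file. The `save_upload_to_temp` helper and the `fake_upload` bytes are assumptions for illustration only.

```python
import os
import tempfile


def save_upload_to_temp(data: bytes, suffix: str = ".pdf") -> str:
    """Write uploaded bytes to a uniquely named temporary file and return its path."""
    with tempfile.NamedTemporaryFile(delete=False, suffix=suffix) as tmp:
        tmp.write(data)
        return tmp.name


# Stand-in for uploaded_file.getvalue(); these bytes are not a real PDF.
fake_upload = b"%PDF-1.4 example bytes"
path = save_upload_to_temp(fake_upload)
try:
    with open(path, "rb") as handle:
        print(handle.read() == fake_upload)
finally:
    os.remove(path)  # clean up once the document has been loaded
```

In the app, the returned path would be passed to load_document() in place of temp_file.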

Machines understand numbers better than text, so in the end, we will have to provide the model with a database of numbers that it can compare and check for similarity when performing a query. That’s where the embeddings will be useful to create the vector_db, in this next piece of code.

    # Generate embeddings
    # Embeddings are numerical vector representations of data, typically used to capture relationships, similarities,
    # and meanings in a way that machines can understand. They are widely used in Natural Language Processing (NLP),
    # recommender systems, and search engines.
    embeddings = OpenAIEmbeddings(openai_api_key=OPENAI_KEY,
                                  model="text-embedding-ada-002")

    # Can also use HuggingFaceEmbeddings
    # from langchain_huggingface.embeddings import HuggingFaceEmbeddings
    # embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")

    # Create vector database containing chunks and embeddings
    vector_db = FAISS.from_documents(chunks, embeddings)
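
Conceptually, a flat vector index answers a query by comparing the query vector against every stored vector and keeping the closest ones. The sketch below does that by brute force with Euclidean distance. It only illustrates the idea and is not FAISS's implementation; the `FlatIndex` class and the 2-dimensional vectors are assumptions for the example.

```python
import math


class FlatIndex:
    """Toy in-memory index: stores (vector, text) pairs and searches them exhaustively."""

    def __init__(self) -> None:
        self.entries: list[tuple[list[float], str]] = []

    def add(self, vector: list[float], text: str) -> None:
        """Store one vector together with the text chunk it represents."""
        self.entries.append((vector, text))

    def search(self, query: list[float], k: int = 2) -> list[tuple[float, str]]:
        """Return the k stored texts closest to the query, as (distance, text) pairs."""
        scored = [(math.dist(query, vector), text) for vector, text in self.entries]
        return sorted(scored)[:k]


index = FlatIndex()
index.add([0.9, 0.1], "chunk about payment terms")
index.add([0.8, 0.3], "chunk about late fees")
index.add([0.1, 0.9], "chunk about office locations")

for distance, text in index.search([1.0, 0.0], k=2):
    print(f"{distance:.3f}  {text}")
```

Libraries like FAISS exist because this exhaustive comparison becomes slow as the number of vectors grows; they offer optimized and approximate search over the same kind of data.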

Next, we create a retriever object to navigate in the vector_db.

    # Create a document retriever
    retriever = vector_db.as_retriever()
    # Create the chat model that will generate the answers
    llm = ChatOpenAI(model_name="gpt-4o-mini", openai_api_key=OPENAI_KEY)

Then, we will create the system_prompt, which is a set of instructions to the LLM on how to answer, and we will create a prompt template, preparing it to be added to the model once we get the input from the user.

    # Create a system prompt
    # It sets the overall context for the model.
    # It influences tone, style, and focus before user interaction starts.
    # Unlike user inputs, a system prompt is not visible to the end user.

    system_prompt = (
        "You are a helpful assistant. Use the given context to answer the question. "
        "If you don't know the answer, say you don't know. "
        "{context}"
    )

    # Create a prompt Template
    prompt = ChatPromptTemplate.from_messages(
        [
            ("system", system_prompt),
            ("human", "{input}"),
        ]
    )

    # Create a chain
    # It takes multiple documents (text data) and "stuffs" them together into the
    # {context} placeholder before passing the prompt to the LLM for processing.

    question_answer_chain = create_stuff_documents_chain(llm, prompt)
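
As a rough picture of what "stuffing" means, the plain-Python sketch below joins the retrieved chunks into one context string and fills a template with it. It flattens the system and human messages into a single string for simplicity, which the real chain does not do, and the `stuff_documents` helper and sample chunks are hypothetical.

```python
# Same wording as the system prompt, with the user question appended.
PROMPT_TEMPLATE = (
    "You are a helpful assistant. Use the given context to answer the question. "
    "If you don't know the answer, say you don't know. "
    "{context}\n\nQuestion: {input}"
)


def stuff_documents(chunks: list[str], question: str) -> str:
    """Join every retrieved chunk into one context string and fill the template."""
    context = "\n\n".join(chunks)
    return PROMPT_TEMPLATE.format(context=context, input=question)


retrieved_chunks = ["Invoices are due in 30 days.", "Late payments incur a 2% fee."]
filled_prompt = stuff_documents(retrieved_chunks, "When are invoices due?")
print(filled_prompt)
```

Since every chunk lands in the same prompt, this approach suits a handful of short chunks; many long chunks could exceed the model's context window.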

Moving on, we create the core of the RAG framework, joining the retriever object and the question-answering chain. The resulting object fetches the relevant documents from a data source (e.g., a vector database) and passes them, together with the question, to the LLM to generate a response.

    # Create the RAG chain
    chain = create_retrieval_chain(retriever, question_answer_chain)
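
The control flow of such a chain can be sketched without any library: retrieve first, then answer, then return everything in one dictionary. The `make_rag_chain` function, the fake retriever, and the fake LLM below are hypothetical stand-ins so the example runs without an API key; the output keys mirror the "input" and "answer" keys used with the chain in this article.

```python
from typing import Callable


def make_rag_chain(
    retrieve: Callable[[str], list[str]],
    answer: Callable[[str, list[str]], str],
) -> Callable[[dict], dict]:
    """Combine a retrieval step and an answering step into one callable."""

    def invoke(inputs: dict) -> dict:
        question = inputs["input"]
        context = retrieve(question)       # 1. fetch the relevant chunks
        reply = answer(question, context)  # 2. generate from question + chunks
        return {"input": question, "context": context, "answer": reply}

    return invoke


# Hypothetical stand-ins for the retriever and the LLM.
def fake_retriever(question: str) -> list[str]:
    return ["The notice period is 60 days."]


def fake_llm(question: str, context: list[str]) -> str:
    return f"Based on {len(context)} chunk(s): {context[0]}"


rag = make_rag_chain(fake_retriever, fake_llm)
result = rag({"input": "What is the notice period?"})
print(result["answer"])
```

Keeping the retrieved context in the result is handy for debugging, because it shows which chunks the answer was based on.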

Finally, we create the variable question for the user input. If this question box is filled with a query, we pass it to the chain, which calls the LLM to process and return the response, which will be printed on the app’s screen.

    # Streamlit input for question
    question = st.text_input("Ask a question about the document:")
    if question:
        # Answer
        response = chain.invoke({"input": question})['answer']
        st.write(response)

Here is a screenshot of the result.

Screenshot of the AI-Powered Document Q&A
Screenshot of the final app. Image by the author.

And this is a GIF for you to see the File Reader AI Assistant in action!

GIF of the File Reader AI Assistant in action
File Reader AI Assistant in action. Image by the author.

Before you go

In this project, we learned what the RAG framework is and how it helps the LLM perform better, including on specific knowledge it was never trained on.

AI can be powered with knowledge from an instruction manual, databases from a company, some finance files, or contracts, and then respond accurately to domain-specific queries without being retrained or fine-tuned. The model’s knowledge is augmented with a content store.

To recap, this is how the framework works:

1️⃣ User Query → Input text is received.

2️⃣ Retrieve Relevant Documents → Searches a knowledge base (e.g., a database, vector store).

3️⃣ Augment Context → Retrieved documents are added to the input.

4️⃣ Generate Response → An LLM processes the combined input and produces an answer.

GitHub repository

https://github.com/gurezende/Basic-Rag

About me

If you liked this content and want to learn more about my work, here is my website, where you can also find all my contacts.

https://gustavorsantos.me

Shape
Shape
Stay Ahead

Explore More Insights

Stay ahead with more perspectives on cutting-edge power, infrastructure, energy,  bitcoin and AI solutions. Explore these articles to uncover strategies and insights shaping the future of industries.

Shape

Equinor, Wellesley Petroleum agree to HPHT exploration

Equinor and Wellesley Petroleum agreed to establish a joint exploration project aimed at increasing high-pressure, high-temperature (HPHT) exploration activity on the Norwegian Continental Shelf (NCS) and contributing to long-term production from existing infrastructure. Equinor will bring regional knowledge, subsurface experience, and infrastructure to the project, while Wellesley will focus on

Read More »

Occidental Petroleum, 1PointFive STRATOS DAC plant nears startup in Texas Permian basin

Occidental Petroleum Corp. and its subsidiary 1PointFive expect Phase 1 of the STRATOS direct air capture (DAC) plant in Texas’ Permian basin to come online in this year’s second quarter. In a post to LinkedIn, 1PointFive said Phase 1 “is in the final stage of startup” and that Phase 2, which incorporates learnings from research and development and Phase 1 construction activities, “will also begin commissioning in Q2, with operational ramp-up continuing through the rest of the year.” Once fully operational, STRATOS is designed to capture up to 500,000 tonnes/year (tpy) of CO2. As part of the US Environmental Protection Agency (EPA) Class VI permitting process and approval, it was reported that STRATOS is expected to include three wells to store about 722,000 tpy of CO2 in saline formations at a depth of about 4,400 ft. The company said a few activities before start-up remain, including ramping up remaining pellet reactors, completing calciner final commissioning in parallel, and beginning CO2 injection. Start-up milestones achieved include: Completed wet commissioning with water circulation. Received Class VI permits to sequester CO2. Ran CO2 compression system at design pressure. Added potassium hydroxide (KOH) to capture CO2 from the atmosphere. Building pellet inventory. Burners tested on calciner.  

Read More »

Brava Energia weighs Phase 3 at Atlanta to extend production plateau

Just 2 months after bringing its flagship Atlanta field onstream with the new FPSO Atlanta, Brazil’s independent operator Brava Energia SA is evaluating a potential third development phase that could add roughly 25 million bbl of reserves and help sustain peak production longer than originally planned. The Phase 3 project, still at an early technical and economic evaluation stage, focuses on the Atlanta Nordeste area; a separate, shallower reservoir discovered in 2006 by Shell’s 9-SHEL-19D-RJS well. According to André Fagundes, vice-president of research (Brazil) at Welligence Energy Analytics, Phase 2 has four wells still to be developed: two expected in 2027 and two in 2029. Phase 3 would involve drilling two additional wells in 2031, bringing total development to 12 producing wells. Until recently, full-field development was understood to comprise 10 wells, but Brava has since updated guidance to reflect a 12-well development concept. Atlanta field upside The primary objective is clear. “We believe its main objective is to extend the production plateau,” Fagundes said. Welligence estimates incremental recovery could reach 25 MMbbl, increasing the field’s overall recovery factor by roughly 1.5%. Lying outside Atlanta’s main Cretaceous reservoir, Atlanta Nordeste represents a genuine upside opportunity, Fagundes explained. The field benefits from strong natural aquifer support, and no water or gas injection is anticipated. Water-handling constraints that affected early production using the Petrojarl I—limited to 11,500 b/d of water treatment—are no longer a bottleneck. FPSO Atlanta can process up to 140,000 b/d of water. Reservoir performance to date has been solid, albeit with difficulties. Recurrent electric submersible pump (ESP) failures and processing limits on the previous FPSO complicated full validation of original reservoir models. With the new 50,000-b/d FPSO in operation since late 2024, reservoir deliverability has become the main constraint. 
Phase 3 wells would also use ESPs and require additional subsea

Read More »

California Resources eyes ‘measured’ capex ramp on way to 12% production growth thanks to Berry buy

@import url(‘https://fonts.googleapis.com/css2?family=Inter:[email protected]&display=swap’); a { color: var(–color-primary-main); } .ebm-page__main h1, .ebm-page__main h2, .ebm-page__main h3, .ebm-page__main h4, .ebm-page__main h5, .ebm-page__main h6 { font-family: Inter; } body { line-height: 150%; letter-spacing: 0.025em; font-family: Inter; } button, .ebm-button-wrapper { font-family: Inter; } .label-style { text-transform: uppercase; color: var(–color-grey); font-weight: 600; font-size: 0.75rem; } .caption-style { font-size: 0.75rem; opacity: .6; } #onetrust-pc-sdk [id*=btn-handler], #onetrust-pc-sdk [class*=btn-handler] { background-color: #c19a06 !important; border-color: #c19a06 !important; } #onetrust-policy a, #onetrust-pc-sdk a, #ot-pc-content a { color: #c19a06 !important; } #onetrust-consent-sdk #onetrust-pc-sdk .ot-active-menu { border-color: #c19a06 !important; } #onetrust-consent-sdk #onetrust-accept-btn-handler, #onetrust-banner-sdk #onetrust-reject-all-handler, #onetrust-consent-sdk #onetrust-pc-btn-handler.cookie-setting-link { background-color: #c19a06 !important; border-color: #c19a06 !important; } #onetrust-consent-sdk .onetrust-pc-btn-handler { color: #c19a06 !important; border-color: #c19a06 !important; } The leaders of California Resources Corp., Long Beach, plan to have the company’s total production average 152,000-157,000 boe/d in 2026, with each quarter expected to be in that range. That output would equate to an increase of more than 12% from the operator’s 137,000 boe/d during fourth-quarter 2025, due mostly to the mid-December acquisition of Berry Corp. Fourth-quarter results folded in 14 days of Berry production and included 109,000 b/d of oil, with the company’s assets in the San Joaquin and Los Angeles basins accounting for 99,000 b/d of that total. 
The company drilled 31 new wells during the quarter and 76 in all of 2025—all in the San Joaquin—but that number will grow significantly to about 260 this year as state officials have resumed issuing permits following the passage last fall of a bill focused on Kern County production. Speaking to analysts after CRC reported fourth-quarter net income of $12 million on $924 million in revenues, president and chief executive officer Francisco Leon and chief financial officer Clio Crespy said the goal is to manage 2026 output decline to roughly 0.5% per quarter while operating four rigs and

Read More »

Petro-Victory Energy spuds São João well in Brazil

Petro-Victory Energy Corp. has spudded the SJ‑12 well at São João field in Barreirinhas basin, on the Brazilian equatorial margin, Maranhão. Drilling and testing SJ‑12 is aimed at proving enough gas can be produced to sell locally. The well forms part of the single non‑associated gas well commitment under a memorandum of understanding signed in 2024 with Enava. São João contains 50.1 bcf (1.4 billion cu m) non‑associated gas resources. Petro‑Victory 100% owns and operates São João field.

Read More »

Opinion Poll: Strait of Hormuz disruptions

Oil & Gas Journal wants to hear your thoughts about how the collaborative strike on Iran by the US and Israel and disruptions through the Strait of Hormuz may impact oil prices.

Read More »

Iran war

You’ll need free site-access membership to view certain articles below. If you are not already registered with Oil & Gas Journal, sign up now for free. For Offshore articles, sign up here for free. New content will be added as it becomes available.
Oil & Gas Journal content
Economics & Markets
When the market opened after the initial strike on Iran, oil prices traded $75/bbl on the Open, a $7/bbl jump from Friday’s High, indicating a higher risk premium as the market… (March 6, 2026)
Broader infrastructure risks are emerging as regional attacks threaten production in Qatar, Saudi Arabia, and Iraq, while Europe and Asia face heightened vulnerability due to … (March 3, 2026)
Despite initial market volatility, oil storage levels and pre-positioned supplies have mitigated immediate price shocks. However, ongoing tensions and insurance issues continue… (March 2, 2026)
About 20 million b/d of

Read More »

Execution, Power, and Public Trust: Rich Miller on 2026’s Data Center Reality and Why He Built Data Center Richness

DCF founder Rich Miller has spent much of his career explaining how the data center industry works. Now, with his latest venture, Data Center Richness, he’s also examining how the industry learns. That thread provided the opening for the latest episode of The DCF Show Podcast, where Miller joined present Data Center Frontier Editor in Chief Matt Vincent and Senior Editor David Chernicoff for a wide-ranging discussion that ultimately landed on a simple conclusion: after two years of unprecedented AI-driven announcements, 2026 will be the year reality asserts itself. Projects will either get built, or they won’t. Power will either materialize, or it won’t. Communities will either accept data center expansion – or they’ll stop it. In other words, the industry is entering its execution phase. Why Data Center Richness Matters Now Miller launched Data Center Richness as both a podcast and a Substack publication, an effort to experiment with formats and better understand how professionals now consume industry information. Podcasts have become a primary way many practitioners follow the business, while YouTube’s discovery advantages increasingly make video versions essential. At the same time, Miller remains committed to written analysis, using Substack as a venue for deeper dives and format experimentation. One example is his weekly newsletter distilling key industry developments into just a handful of essential links rather than overwhelming readers with volume. The approach reflects a broader recognition: the pace of change has accelerated so much that clarity matters more than quantity. The topic of how people learn about data centers isn’t separate from the industry’s trajectory; it’s becoming part of it. Public perception, regulatory scrutiny, and investor expectations are now shaped by how stories are told as much as by how facilities are built. That context sets the stage for the conversation’s core theme. Execution Defines 2026 After

Read More »

Nomads at the Frontier: PTC 2026 Signals the Digital Infrastructure Industry’s Moment of Execution

Each January, the Pacific Telecommunications Council conference serves as a barometer for where digital infrastructure is headed next. And according to Nomad Futurist founders Nabeel Mahmood and Phillip Koblence, the message from PTC 2026 was unmistakable: The industry has moved beyond hype. The hard work has begun. In the latest episode of The DCF Show Podcast, part of our ongoing ‘Nomads at the Frontier’ series, Mahmood and Koblence joined Data Center Frontier to unpack the tone shift emerging across the AI and data center ecosystem. Attendance continues to grow year over year. Conversations remain energetic. But the character of those conversations has changed. As Mahmood put it: “The hype that the market started to see is actually resulting a bit more into actions now, and those conversations are resulting into some good progress.” The difference from prior years? Less speculation. More execution. From Data Center Cowboys to Real Deployments Koblence offered perhaps the sharpest contrast between PTC conversations in 2024 and those in 2026. Two years ago, many projects felt speculative. Today, developers are arriving with secured power, customers, and construction underway. “If 2024’s PTC was data center cowboys — sites that in someone’s mind could be a data center — this year was: show me the money, show me the power, give me accurate timelines.” In other words, the market is no longer rewarding hypothetical capacity. It is demanding delivered capacity. Operators now speak in terms of deployments already underway, not aspirational campuses still waiting on permits and power commitments. And behind nearly every conversation sits the same gating factor. Power. Power Has Become the Industry’s Defining Constraint Whether discussions centered on AI factories, investment capital, or campus expansion, Mahmood and Koblence noted that every conversation eventually returned to energy availability. “All of those questions are power,” Koblence said.

Read More »

Land and Expand: Early 2026 Megaprojects Reflect a Power-First Ethos

Vantage — Lighthouse (Port Washington, Wisconsin) Although the on-site ceremonial groundbreaking occurred in 2025, Vantage Data Centers’ Lighthouse campus in Port Washington, Wisconsin, remained one of the most closely watched AI infrastructure developments entering 2026, with updated local materials posted February 19 reinforcing the project’s scale and timeline. Announced in October 2025 in partnership with OpenAI and Oracle, Lighthouse is positioned as the Midwest anchor site within the companies’ broader Stargate expansion, which targets up to 4.5 gigawatts of additional AI capacity globally. Current plans call for four hyperscale data centers delivering nearly 902 MW of IT load on a site encompassing roughly 672 acres, with construction expected to run through 2028. From a Land and Expand perspective, the project exemplifies the new generation of AI campuses involving large-scale land banking paired with phased delivery designed to stay ahead of hyperscale demand curves. Just as notable is the project’s power and community framework. Vantage is working with WEC Energy Group’s We Energies on a dedicated rate structure under which the developer will underwrite 100% of the power infrastructure investment, a model explicitly designed to shield existing customers from rate increases. The utility partnership also includes plans to enable nearly 2 gigawatts of new zero-emission energy capacity, with approximately 70% allocated to the Lighthouse campus and the remainder supporting broader grid needs. Water and environmental positioning are also central to the project narrative. Lighthouse is designed around a closed-loop liquid cooling system intended to minimize water consumption, alongside local restoration investments aimed at achieving water positivity. Vantage has also committed to preserving significant portions of the site’s natural landscape while pursuing LEED certification for the campus. 
Economically, the development is expected to generate more than 4,000 primarily union construction jobs and over 1,000 long-term operational roles, while Vantage has pledged at

Read More »

7×24 Exchange’s Dennis Cronin on the Data Center Workforce Crisis: The Talent Cliff Is Already Here

The data center industry has spent the past two years obsessing over power constraints, AI density, and supply chain pressure. But according to longtime mission critical leader Dennis Cronin, the sector’s most consequential bottleneck may be far more human. In a recent episode of the Data Center Frontier Show Podcast, Cronin — a founding member of 7×24 Exchange International and board member of the Mission Critical Global Alliance (MCGA) — delivered a stark message: the workforce “talent cliff” the industry keeps discussing as a future risk is already impacting operations today. A Million-Job Gap Emerging Cronin’s assessment reframes the workforce conversation from a routine labor shortage to what he describes as a structural and demographic challenge. Based on recent analysis of open roles, he estimates the industry is currently short between 467,000 and 498,000 workers across core operational positions including facilities managers, operations engineers, electricians, generator technicians, and HVAC specialists. Layer in emerging roles tied to AI infrastructure, sustainability, and cyber-physical security, and the potential demand rises to roughly one million jobs. “The coming talent cliff is not coming,” Cronin said. “It’s here, here and now.” With data center capacity expanding at roughly 30% annually, the workforce pipeline is not keeping pace with physical buildout. The Five-Year Experience Trap One of the industry’s most persistent self-inflicted wounds, Cronin argues, is the widespread requirement for five years of experience in roles that are effectively entry level. The result is a closed-loop hiring dynamic: new workers can’t get hired without experience, they can’t gain experience without being hired, and operators end up poaching from each other. Workers may benefit from the resulting 10–20% salary jumps, but the overall talent pool remains stagnant. “It’s not helping us grow the industry,” Cronin said.
In a market defined by rapid expansion and increasing system complexity, that

Read More »

Powering AI When the Grid Can’t: Inside the New Behind-the-Meter Playbook

The AI infrastructure boom is forcing a hard reset in how the data center industry thinks about power. What was once a relatively straightforward utility procurement exercise is rapidly evolving into a complex, multi-disciplinary strategy problem spanning generation, fuel logistics, finance, and system architecture. That reality framed a recent special edition of The Data Center Frontier Show Podcast, which recast and updated one of the most consequential sessions from the DCF Trends Summit 2025: From Grid to Onsite Powering: Optimizing Energy Behind the Meter for Data Centers. Moderating the discussion was Fengrong Li, Senior Managing Director at FTI Consulting, whose questions and analytical framing shaped the conversation’s direction. With more than 20 years of experience across energy and infrastructure—including expert testimony before the Federal Energy Regulatory Commission (FERC), the Federal Communications Commission (FCC), and multiple state bodies—Li brought a systems-level perspective that pushed the panel well beyond a simple technology tour. Her premise was clear from the outset: the rise of AI is not just increasing data center demand. It is restructuring the entire power delivery paradigm. A Moderator Focused on the System-Level Shift Li’s role went well beyond traditional moderation. 
Drawing on a career that includes 13 years at Siemens focused on grid issues and eight years at Mitsui in commodity trading and infrastructure investment, she constructed the discussion around what she described as “one of the most urgent topics shaping digital infrastructure deployment.” “Onsite power and the rise of co-located, integrated power and AI campuses,” Li told the panel, “are accelerating data centers beyond traditional hubs and changing how they interact with the grid.” Throughout the session, Li repeatedly pushed panelists to connect near-term deployment realities with longer-term structural implications, particularly around redundancy, financing, and regulatory exposure. The result was a grounded look at an industry that is

Read More »

Data Center Jobs: Engineering, Construction, Commissioning, Sales, Field Service and Facility Tech Jobs Available in Major Data Center Hotspots

Each month Data Center Frontier, in partnership with Pkaza, posts some of the hottest data center career opportunities in the market. Here’s a look at some of the latest data center jobs posted on the Data Center Frontier jobs board, powered by Pkaza Critical Facilities Recruiting. Looking for Data Center Candidates? Check out Pkaza’s Active Candidate / Featured Candidate Hotlist Electrical Applications Engineer Pittsburgh, PA This position is also available in: Denver, CO and Andrews, SC Our client is a leading provider and manufacturer of industrial electrical power equipment used in industrial applications for mission critical operations. They help their customers save money by reducing energy and operating costs and provide solutions for modernizing their customer’s existing electrical infrastructure. This company provides cooling solutions to many of the world’s largest organizations and government facilities and enterprise clients, colocation providers and hyperscale companies. This career-growth minded opportunity offers exciting projects with leading-edge technology and innovation as well as competitive salaries and benefits. Electrical Commissioning Engineer Dallas, TX This traveling position is also available in: New York, NY; White Plains, NY; Ashburn, VA; Richmond, VA; Montvale, NJ; Charlotte, NC; Atlanta, GA; Hampton, GA; New Albany, OH; Cedar Rapids, IA; Phoenix, AZ; Salt Lake City, UT; Kansas City, MO; Omaha, NE; Chesterton, IN or Chicago, IL. *** ALSO looking for a LEAD EE and ME CxA Agents and CxA PMs and a Director of CxA Colos in Dallas, TX *** Our client is an engineering design and commissioning company that has a national footprint and specializes in MEP critical facilities design. They provide design, commissioning, consulting and management expertise in the critical facilities space.
They have a mindset to provide reliability, energy efficiency, sustainable design and LEED expertise when providing these consulting services for enterprise, colocation and hyperscale companies. This career-growth minded opportunity offers exciting projects with leading-edge

Read More »

Microsoft will invest $80B in AI data centers in fiscal 2025

And Microsoft isn’t the only one that is ramping up its investments into AI-enabled data centers. Rival cloud service providers are all investing in either upgrading or opening new data centers to capture a larger chunk of business from developers and users of large language models (LLMs). In a report published in October 2024, Bloomberg Intelligence estimated that demand for generative AI would push Microsoft, AWS, Google, Oracle, Meta, and Apple to devote $200 billion between them to capex in 2025, up from $110 billion in 2023. Microsoft is one of the biggest spenders, followed closely by Google and AWS, Bloomberg Intelligence said. Its estimate of Microsoft’s capital spending on AI, at $62.4 billion for calendar 2025, is lower than Smith’s claim that the company will invest $80 billion in the fiscal year to June 30, 2025. Both figures, though, are way higher than Microsoft’s 2020 capital expenditure of “just” $17.6 billion. The majority of the increased spending is tied to cloud services and the expansion of AI infrastructure needed to provide compute capacity for OpenAI workloads. Separately, last October Amazon CEO Andy Jassy said his company planned total capex spend of $75 billion in 2024 and even more in 2025, with much of it going to AWS, its cloud computing division.

Read More »

John Deere unveils more autonomous farm machines to address skill labor shortage

Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More Self-driving tractors might be the path to self-driving cars. John Deere has revealed a new line of autonomous machines and tech across agriculture, construction and commercial landscaping. The Moline, Illinois-based John Deere has been in business for 187 years, yet it’s been a regular as a non-tech company showing off technology at the big tech trade show in Las Vegas and is back at CES 2025 with more autonomous tractors and other vehicles. This is not something we usually cover, but John Deere has a lot of data that is interesting in the big picture of tech. The message from the company is that there aren’t enough skilled farm laborers to do the work that its customers need. It’s been a challenge for most of the last two decades, said Jahmy Hindman, CTO at John Deere, in a briefing. Much of the tech will come this fall and after that. He noted that the average farmer in the U.S. is over 58 and works 12 to 18 hours a day to grow food for us. And he said the American Farm Bureau Federation estimates there are roughly 2.4 million farm jobs that need to be filled annually; and the agricultural work force continues to shrink. (This is my hint to the anti-immigration crowd). John Deere’s autonomous 9RX Tractor. Farmers can oversee it using an app. While each of these industries experiences its own set of challenges, a commonality across all is skilled labor availability. In construction, about 80% of contractors struggle to find skilled labor. And in commercial landscaping, 86% of landscaping business owners can’t find labor to fill open positions, he said. “They have to figure out how to do

Read More »

2025 playbook for enterprise AI success, from agents to evals

2025 is poised to be a pivotal year for enterprise AI. The past year has seen rapid innovation, and this year will see the same. This has made it more critical than ever to revisit your AI strategy to stay competitive and create value for your customers. From scaling AI agents to optimizing costs, here are the five critical areas enterprises should prioritize for their AI strategy this year. 1. Agents: the next generation of automation AI agents are no longer theoretical. In 2025, they’re indispensable tools for enterprises looking to streamline operations and enhance customer interactions. Unlike traditional software, agents powered by large language models (LLMs) can make nuanced decisions, navigate complex multi-step tasks, and integrate seamlessly with tools and APIs. At the start of 2024, agents were not ready for prime time, making frustrating mistakes like hallucinating URLs. They started getting better as frontier large language models themselves improved. “Let me put it this way,” said Sam Witteveen, cofounder of Red Dragon, a company that develops agents for companies, and that recently reviewed the 48 agents it built last year. “Interestingly, the ones that we built at the start of the year, a lot of those worked way better at the end of the year just because the models got better.” Witteveen shared this in the video podcast we filmed to discuss these five big trends in detail. Models are getting better and hallucinating less, and they’re also being trained to do agentic tasks. Another feature that the model providers are researching is a way to use the LLM as a judge, and as models get cheaper (something we’ll cover below), companies can use three or more models to

Read More »

OpenAI’s red teaming innovations define new essentials for security leaders in the AI era

OpenAI has taken a more aggressive approach to red teaming than its AI competitors, demonstrating its security teams’ advanced capabilities in two areas: multi-step reinforcement and external red teaming. OpenAI recently released two papers that set a new competitive standard for improving the quality, reliability and safety of AI models in these two techniques and more. The first paper, “OpenAI’s Approach to External Red Teaming for AI Models and Systems,” reports that specialized teams outside the company have proven effective in uncovering vulnerabilities that might otherwise have made it into a released model because in-house testing techniques may have missed them. In the second paper, “Diverse and Effective Red Teaming with Auto-Generated Rewards and Multi-Step Reinforcement Learning,” OpenAI introduces an automated framework that relies on iterative reinforcement learning to generate a broad spectrum of novel, wide-ranging attacks. Going all-in on red teaming pays practical, competitive dividends It’s encouraging to see competitive intensity in red teaming growing among AI companies. When Anthropic released its AI red team guidelines in June of last year, it joined AI providers including Google, Microsoft, Nvidia, OpenAI, and even the U.S.’s National Institute of Standards and Technology (NIST), which all had released red teaming frameworks. Investing heavily in red teaming yields tangible benefits for security leaders in any organization. OpenAI’s paper on external red teaming provides a detailed analysis of how the company strives to create specialized external teams that include cybersecurity and subject matter experts. The goal is to see if knowledgeable external teams can defeat models’ security perimeters and find gaps in their security, biases and controls that prompt-based testing couldn’t find.
What makes OpenAI’s recent papers noteworthy is how well they define using human-in-the-middle

Read More »