Your Gateway to Power, Energy, Datacenters, Bitcoin and AI

Dive into the latest industry updates, our exclusive Paperboy Newsletter, and curated insights designed to keep you informed. Stay ahead with minimal time spent.

Discover What Matters Most to You

Explore ONMINE’s curated content, from our Paperboy Newsletter to industry-specific insights tailored for energy, Bitcoin mining, and AI professionals.

AI

Lorem Ipsum is simply dummy text of the printing and typesetting industry.

Bitcoin

Lorem Ipsum is simply dummy text of the printing and typesetting industry.

Datacenter

Lorem Ipsum is simply dummy text of the printing and typesetting industry.

Energy

Lorem Ipsum is simply dummy text of the printing and typesetting industry.


Featured Articles

Catio wins ‘coolest tech’ award at VB Transform 2025

Palo Alto-based Catio was awarded “Coolest Technology” at VentureBeat Transform 2025 in San Francisco on Wednesday. Founded in 2023, the company has raised $7 million to date, with a recent $3 million round announced in March. Catio was also a finalist and presented at VB Transform’s Innovation Showcase in 2024. Catio’s AI Copilot for Tech Architecture reframes architecture as a living system—one that can be codified, introspected and intelligently evolved. By combining a real-time architectural map with a multi-agent AI organization, the solution helps engineering teams shift from reactive decision-making to continuous, proactive architecture excellence. VentureBeat spoke with co-founder and CEO Boris Bogatin and product lead Adam Kirsh about their team and the company’s technology following the announcement of winners at Transform. “We’re a team of serial entrepreneurs and tech leaders who’ve all shared a deep personal problem,” Bogatin said. “While finance folks and developers all have tools, CTOs, architects, and developers all plan and optimize stacks on whiteboards and ad hoc spreadsheets. And we’re changing that with Catio.” Catio is far more than a digital whiteboard for CTOs—it’s a reimagining of how architecture is understood, managed and evolved. The platform serves as a digital twin for your tech stack, offering continuous architecture visibility to inform well-informed, data-driven architecture decisions. Designed to address the escalating complexity of modern tech stacks—including cloud infrastructure, container orchestration, monitoring and data pipelines—the platform replaces static diagrams and ad hoc snapshots with an interactive, high-fidelity system model. With Catio, architecture becomes a living, codified system—constantly updated, evaluated and advised by a network of intelligent AI agents.

From static diagrams to living systems

As an AI-driven tech stack copilot for technical leaders

Read More »

Retail Resurrection: David’s Bridal bets its future on AI after double bankruptcy

Inside a new David’s Bridal store in Delray Beach, Florida, a bride-to-be carefully taps images on a 65-inch touchscreen, curating a vision board for her wedding. Behind the scenes, an AI system automatically analyzes her selections, building a knowledge graph that will match her with vendors, recommend products and generate a personalized wedding plan. For the overwhelmed bride facing 300-plus wedding planning tasks, this AI assistant promises to automate the process: suggesting what to do next, reorganizing timelines when plans change and eliminating the need to manually update spreadsheets that inevitably break when wedding plans evolve. That’s the vision David’s Bridal is racing to fully implement with Pearl Planner, its new beta AI-powered wedding planning platform. For the twice-bankrupt retailer, this technology-driven transformation represents a high-stakes bet that AI can accomplish what traditional retail strategies couldn’t: survival in an industry where 15,000 stores are projected to close this year alone. David’s Bridal is hardly alone in the dramatic and ongoing wave of store closures, bankruptcies and disruptions sweeping through the U.S. retail industry since the mid-2010s. In the period dubbed the “retail apocalypse,” there were at least 133 major retail bankruptcies and 57,000 store closures between 2018 and 2024. The company narrowly survived liquidation in its second bankruptcy in 2023 when business development company CION Investment Corporation — which has more than $6.1 billion in assets and a portfolio of 100 companies — acquired substantially all of its assets and invested $20 million in new funding. David’s AI-led transformation is driven from the top down by new CEO Kelly Cook, who originally joined the company as CMO in 2019. Her vision of taking the company from “aisle to algorithm” led her

Read More »

Russian Fuel Flows Decline to Lowest in 8 Months on Baltic Slump

Russia’s oil product exports dropped in June to the lowest in eight months amid extended work at refineries supplying Baltic ports, coupled with efforts to stabilize domestic fuel supplies before the upcoming seasonal surge in agricultural and holiday consumption. Seaborne shipments of refined fuels totaled 2 million barrels a day in the first 20 days of June, according to data compiled by Bloomberg from analytics firm Vortexa Ltd. That’s the lowest monthly tally since October and an 8% decline from both the previous month and June of last year. Flows from Baltic ports recorded the sharpest drop, falling more than 15% from May levels. Russian seaborne oil flows are closely watched by the market as a gauge of the country’s production, since official data has been classified. Crude outflows slid to the lowest since mid-April, led by maintenance-related disruptions at a key Pacific port and compounded by a decline from the Baltic. Oil processing rates have ramped up this month as refineries wrap up seasonal maintenance. However, volumes available for export may be curbed by government initiatives to boost stockpiles to meet growing fuel demand from agricultural activity and summer travel. Diesel exports were largely flat, while flows of refinery feedstocks like vacuum gasoil, used by secondary units like fluid catalytic crackers, jumped this month. Outflows of all other major fuels slumped. Most of the decline in fuel flows was concentrated in the Baltic ports, indicating extended turnarounds at refineries that usually supply these terminals. “Drone strikes earlier this year could have extended the turnaround time for both primary and secondary units,” according to Mick Strautmann, a market analyst at Vortexa. The spike in vacuum gasoil flows out of Ust-Luga in the Baltic suggests more serious disruptions at downstream units in the region, he

Read More »

Oil Steady as OPEC+ Weighs Output Hike

Oil held steady as traders weighed the uncertain status of nuclear talks between the US and Iran against reports that OPEC+ may extend its run of super-sized production increases. West Texas Intermediate edged up to settle above $65 a barrel after swinging between gains and losses. Bloomberg reported that several OPEC delegates, who asked not to be identified, said their countries are ready to consider another 411,000 barrel-a-day increase for August when they convene on July 6, following similarly sized hikes agreed upon in each of the previous three months. While that figure is broadly in line with expectations, “the indications are that the group may go beyond the 411,000 barrel-a-day increase,” said John Kilduff, a partner at Again Capital. “Next, we should hear about the voluntary cuts under-shooting the goal from the group laggards. I expect the ultimate decision to be bearish for prices.” Crude had earlier advanced as much as 1.3% after US Energy Secretary Chris Wright told Bloomberg that sanctions against Iran will remain in place for now, and US President Donald Trump said he dropped plans to ease Iran sanctions. The statement comes just days after the president claimed that Iran and the US would meet for nuclear talks as soon as next week, which Iran denied. Oil still ended the week down roughly 13% — snapping three weeks of gains — after a ceasefire in the Israel-Iran conflict was reached, easing concerns about supply disruptions from a region that pumps about a third of the world’s crude. The focus has largely reverted to fundamental catalysts, including OPEC moves. Russia now also appears more receptive to a fresh output boost, in a reversal of an earlier stance, raising concerns of supply overhang in the second half of the year. Investors have also turned their attention to progress on

Read More »

From pilot to profit: The real path to scalable, ROI-positive AI

This article is part of VentureBeat’s special issue, “The Real Cost of AI: Performance, Efficiency and ROI at Scale.” Read more from this special issue. Three years after ChatGPT launched the generative AI era, most enterprises remain trapped in pilot purgatory. Despite billions in AI investments, the majority of corporate AI initiatives never escape the proof-of-concept phase, let alone generate measurable returns. But a select group of Fortune 500 companies has cracked the code. Walmart, JPMorgan Chase, Novartis, General Electric, McKinsey, Uber and others have systematically moved AI from experimental “innovation theater” to production-grade systems delivering substantial ROI—in some cases, generating over $1 billion in annual business value. Their success isn’t accidental. It’s the result of deliberate governance models, disciplined budgeting strategies and fundamental cultural shifts that transform how organizations approach AI deployment. This isn’t about having the best algorithms or the most data scientists. It’s about building the institutional machinery that turns AI experiments into scalable business assets. “We see this as a pretty big inflection point, very similar to the internet,” Walmart’s VP of emerging technology Desirée Gosby said at this week’s VB Transform event. “It’s as profound in terms of how we’re actually going to operate, how we actually do work.”

The pilot trap: Why most AI initiatives fail to scale

The statistics are sobering. Industry research shows that 85% of AI projects never make it to production, and of those that do, fewer than half generate meaningful business value. The problem isn’t technical—it’s organizational. Companies treat AI as a science experiment rather than a business capability. “AI is already cutting some product-development cycles by about 40 percent, letting companies ship and decide faster than ever,”

Read More »

The rise of prompt ops: Tackling hidden AI costs from bad inputs and context bloat

This article is part of VentureBeat’s special issue, “The Real Cost of AI: Performance, Efficiency and ROI at Scale.” Read more from this special issue.

Model providers continue to roll out increasingly sophisticated large language models (LLMs) with longer context windows and enhanced reasoning capabilities. 

This allows models to process and “think” more, but it also increases compute: The more a model takes in and puts out, the more energy it expends and the higher the costs. 

Couple this with all the tinkering involved with prompting — it can take a few tries to get to the intended result, and sometimes the question at hand simply doesn’t need a model that can think like a PhD — and compute spend can get out of control. 

This is giving rise to prompt ops, a whole new discipline in the dawning age of AI. 

“Prompt engineering is kind of like writing, the actual creating, whereas prompt ops is like publishing, where you’re evolving the content,” Crawford Del Prete, IDC president, told VentureBeat. “The content is alive, the content is changing, and you want to make sure you’re refining that over time.”

The challenge of compute use and cost

Compute use and cost are two “related but separate concepts” in the context of LLMs, explained David Emerson, applied scientist at the Vector Institute. Generally, the price users pay scales based on both the number of input tokens (what the user prompts) and the number of output tokens (what the model delivers). However, they are not charged for behind-the-scenes actions like meta-prompts, steering instructions or retrieval-augmented generation (RAG). 

While longer context allows models to process much more text at once, it directly translates to significantly more FLOPS (a measurement of compute power), he explained. Some aspects of transformer models even scale quadratically with input length if not well managed. Unnecessarily long responses can also slow down processing time and require additional compute and cost to build and maintain algorithms to post-process responses into the answer users were hoping for.
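
To make that quadratic point concrete, here is a back-of-the-envelope sketch (our illustration, not Emerson's) of how the attention-score computation alone grows with input length; the head count and head dimension are arbitrary illustrative values:

```python
# Rough illustration (not a precise FLOP count) of why self-attention cost
# grows quadratically with input length: the attention-score matrix alone
# is seq_len x seq_len per head.
def attention_score_flops(seq_len: int, d_head: int, n_heads: int) -> int:
    """Approximate multiply-accumulate ops for the QK^T score computation."""
    # Each of the seq_len^2 score entries is a dot product of length d_head.
    return 2 * n_heads * seq_len * seq_len * d_head

for n in (1_000, 10_000, 100_000):
    print(f"{n:>7} tokens: ~{attention_score_flops(n, 128, 32):.2e} FLOPs")
```

Going from 10,000 to 100,000 input tokens multiplies that term by 100, not 10, which is why context bloat gets expensive quickly.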

Typically, longer context environments incentivize providers to deliberately deliver verbose responses, said Emerson. Many heavier reasoning models (o3 or o1 from OpenAI, for example) will often provide long responses to even simple questions, incurring heavy computing costs. 

Here’s an example:

Input: Answer the following math problem. If I have 2 apples and I buy 4 more at the store after eating 1, how many apples do I have?

Output: If I eat 1, I only have 1 left. I would have 5 apples if I buy 4 more.

The model not only generated more tokens than it needed to, it buried its answer. An engineer may then have to design a programmatic way to extract the final answer or ask follow-up questions like ‘What is your final answer?’ that incur even more API costs. 

Alternatively, the prompt could be redesigned to guide the model to produce an immediate answer. For instance: 

Input: Answer the following math problem. If I have 2 apples and I buy 4 more at the store after eating 1, how many apples do I have? Start your response with “The answer is”…

Or: 

Input: Answer the following math problem. If I have 2 apples and I buy 4 more at the store after eating 1, how many apples do I have? Wrap your final answer in bold tags (<b></b>).

“The way the question is asked can reduce the effort or cost in getting to the desired answer,” said Emerson. He also pointed out that techniques like few-shot prompting (providing a few examples of what the user is looking for) can help produce quicker outputs. 
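
As a rough illustration of how a constrained prompt can cut output-token spend, the sketch below sends both versions of the question and compares billed completion tokens. It assumes the OpenAI Python SDK with an API key in the environment; the model name is illustrative, and exact counts will vary run to run:

```python
# Sketch: compare output-token spend for a free-form prompt vs. one that
# constrains the answer format. Assumes the OpenAI Python SDK and an
# OPENAI_API_KEY in the environment; the model name is illustrative.
from openai import OpenAI

client = OpenAI()
QUESTION = ("Answer the following math problem. If I have 2 apples and I buy "
            "4 more at the store after eating 1, how many apples do I have?")

def completion_tokens(prompt: str) -> int:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    # usage.completion_tokens is what the provider bills for the output
    return response.usage.completion_tokens

free_form = completion_tokens(QUESTION)
constrained = completion_tokens(QUESTION + ' Start your response with "The answer is".')
print(f"free-form: {free_form} output tokens; constrained: {constrained}")
```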

One danger is not knowing when to use sophisticated techniques like chain-of-thought (CoT) prompting (generating answers in steps) or self-refinement, which directly encourage models to produce many tokens or go through several iterations when generating responses, Emerson pointed out. 

Not every query requires a model to analyze and re-analyze before providing an answer, he emphasized; models can be perfectly capable of answering correctly when instructed to respond directly. Additionally, incorrect API configurations (such as defaulting to OpenAI’s o3, which requires a high reasoning effort, for a simple question) will incur higher costs when a lower-effort, cheaper request would suffice.
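
For the configuration point, here is one way to request less reasoning from an o-series model on a simple question. The reasoning_effort parameter is OpenAI's knob for its reasoning models; treat the model name and its availability as assumptions:

```python
# Sketch: request lower reasoning effort for a simple question rather than
# paying for the default. reasoning_effort applies to OpenAI's o-series
# reasoning models; the model name is illustrative.
from openai import OpenAI

client = OpenAI()
response = client.chat.completions.create(
    model="o3-mini",
    reasoning_effort="low",  # "low" | "medium" | "high"; cheaper for simple asks
    messages=[{"role": "user", "content": "What is 2 + 4 - 1 + 4?"}],
)
print(response.choices[0].message.content)
```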

“With longer contexts, users can also be tempted to use an ‘everything but the kitchen sink’ approach, where you dump as much text as possible into a model context in the hope that doing so will help the model perform a task more accurately,” said Emerson. “While more context can help models perform tasks, it isn’t always the best or most efficient approach.”

Evolution to prompt ops

It’s no big secret that AI-optimized infrastructure can be hard to come by these days; IDC’s Del Prete pointed out that enterprises must be able to minimize the amount of GPU idle time and fill more queries into idle cycles between GPU requests. 

“How do I squeeze more out of these very, very precious commodities?” he noted. “Because I’ve got to get my system utilization up, because I just don’t have the benefit of simply throwing more capacity at the problem.” 

Prompt ops can go a long way towards addressing this challenge, as it ultimately manages the lifecycle of the prompt. While prompt engineering is about the quality of the prompt, prompt ops is where you repeat, Del Prete explained. 

“It’s more orchestration,” he said. “I think of it as the curation of questions and the curation of how you interact with AI to make sure you’re getting the most out of it.” 

Models can tend to get “fatigued,” cycling in loops where the quality of outputs degrades, he said. Prompt ops helps manage, measure, monitor and tune prompts. “I think when we look back three or four years from now, it’s going to be a whole discipline. It’ll be a skill.”

While it’s still very much an emerging field, early providers include QueryPal, Promptable, Rebuff and TrueLens. As prompt ops evolves, these platforms will continue to iterate, improve and provide real-time feedback to give users more capacity to tune prompts over time, Del Prete noted.

Eventually, he predicted, agents will be able to tune, write and structure prompts on their own. “The level of automation will increase, the level of human interaction will decrease, you’ll be able to have agents operating more autonomously in the prompts that they’re creating.”

Common prompting mistakes

Until prompt ops is fully realized, there is ultimately no perfect prompt. Some of the biggest mistakes people make, according to Emerson: 

Not being specific enough about the problem to be solved. This includes how the user wants the model to provide its answer, what should be considered when responding, constraints to take into account and other factors. “In many settings, models need a good amount of context to provide a response that meets users’ expectations,” said Emerson. 

Not taking into account the ways a problem can be simplified to narrow the scope of the response. Should the answer be within a certain range (0 to 100)? Should the answer be phrased as a multiple choice problem rather than something open-ended? Can the user provide good examples to contextualize the query? Can the problem be broken into steps for separate and simpler queries?

Not taking advantage of structure. LLMs are very good at pattern recognition, and many can understand code. While using bullet points, itemized lists or bold indicators (****) may seem “a bit cluttered” to human eyes, Emerson noted, these callouts can be beneficial for an LLM. Asking for structured outputs (such as JSON or Markdown) can also help when users are looking to process responses automatically. 
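
To ground that last point, here is a minimal sketch of requesting machine-parseable JSON instead of free text, using the OpenAI SDK's json_object response format; the model choice and JSON key names are illustrative:

```python
# Sketch: ask for a structured (JSON) answer so downstream code can parse it
# without brittle string matching. Assumes the OpenAI Python SDK; json_object
# response format is supported on recent OpenAI chat models.
import json
from openai import OpenAI

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o-mini",
    response_format={"type": "json_object"},
    messages=[{
        "role": "user",
        "content": ('Return JSON with keys "answer" (a number) and "reasoning" '
                    "(one sentence): If I have 2 apples and I buy 4 more at the "
                    "store after eating 1, how many apples do I have?"),
    }],
)
result = json.loads(response.choices[0].message.content)
print(result["answer"])
```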

There are many other factors to consider in maintaining a production pipeline, based on engineering best practices, Emerson noted. These include: 

Making sure that the throughput of the pipeline remains consistent; 

Monitoring the performance of the prompts over time (potentially against a validation set);

Setting up tests and early warning detection to identify pipeline issues.
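
As a minimal sketch of the second item above, the snippet below scores a fixed prompt template against a tiny labeled validation set and emits an early warning when accuracy drops. Here call_model is a hypothetical stand-in for whatever LLM client is in use, and the threshold is illustrative:

```python
# Minimal sketch: monitor a prompt template against a labeled validation set
# and warn when accuracy falls below a floor. call_model is a hypothetical
# stand-in; wire it to a real LLM client in practice.
def call_model(prompt: str) -> str:
    return "5"  # stub response so the sketch runs end to end

VALIDATION_SET = [
    {"question": "2 apples, buy 4 more after eating 1. How many?", "answer": "5"},
    {"question": "3 apples, eat 2. How many?", "answer": "1"},
]
TEMPLATE = "Answer with only the final number. {question}"
ACCURACY_FLOOR = 0.9  # illustrative early-warning threshold

def validate() -> float:
    correct = sum(
        call_model(TEMPLATE.format(question=ex["question"])).strip() == ex["answer"]
        for ex in VALIDATION_SET
    )
    accuracy = correct / len(VALIDATION_SET)
    if accuracy < ACCURACY_FLOOR:
        print(f"WARNING: prompt accuracy fell to {accuracy:.0%}")
    return accuracy

validate()
```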

Users can also take advantage of tools designed to support the prompting process. For instance, the open-source DSPy can automatically configure and optimize prompts for downstream tasks based on a few labeled examples. While this may be a fairly sophisticated example, there are many other offerings (including some built into tools like ChatGPT, Google and others) that can assist in prompt design. 
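
For a taste of what DSPy automates, here is a hedged sketch based on its documented pattern: declare the task as a signature, then let a few-shot optimizer build the prompt from labeled examples. The exact class and method names may shift between DSPy versions, and the model name is illustrative:

```python
# Sketch of prompt optimization with DSPy, following its documented usage.
# API names (dspy.LM, dspy.Predict, dspy.BootstrapFewShot) are taken from
# DSPy's docs and may differ across versions.
import dspy

dspy.configure(lm=dspy.LM("openai/gpt-4o-mini"))  # model name illustrative

# Declare *what* the task is; DSPy manages the prompt text itself.
qa = dspy.Predict("question -> answer")

trainset = [
    dspy.Example(
        question="If I have 2 apples and buy 4 more after eating 1, how many do I have?",
        answer="5",
    ).with_inputs("question"),
]

# Bootstrap few-shot demonstrations from the labeled examples.
optimizer = dspy.BootstrapFewShot(
    metric=lambda example, pred, trace=None: example.answer == pred.answer
)
optimized_qa = optimizer.compile(qa, trainset=trainset)

print(optimized_qa(question="I have 3 apples and eat 2. How many remain?").answer)
```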

And ultimately, Emerson said, “I think one of the simplest things users can do is to try to stay up-to-date on effective prompting approaches, model developments and new ways to configure and interact with models.” 

Read More »


Can oil and gas solve the AI power dilemma?

Joe Brettell is a partner at Prosody Group. The promise, peril and possibilities of artificial intelligence continue to capture the cultural and business zeitgeist worldwide. Hardly a conference or long-form interview can be held these days without a panelist or pundit commenting on the technology’s implications for their profession. Yet despite being the hottest topic in every circle, AI’s ultimate challenge isn’t technological but physical. After years of breathless speculation and prediction, the issue remains the same: AI needs more energy. Amidst this backdrop, the oil and gas industry faces a similarly fundamental challenge: a shifting production frontier and evolving path to continued growth. After a decade of efficiency-driven growth, the era of easy barrels is waning. Diamondback Energy CEO Travis Stice captured the new reality in a recent letter, warning of the increasingly dim prospects for expanding production amid geological constraints and rising costs. Other energy majors have issued similar cautions, a sharp departure from the boom years of the shale revolution when abundant, low-cost reserves, followed by shareholder-focused production, made the industry a market favorite. Now, with resource intensity rising, global volatility accelerating and economic conditions tightening, the industry is under pressure to find its next value horizon. That horizon may be converging with AI. The pairing makes increasing sense. While initially circling one another warily, major players in energy and technology have become increasingly intertwined. At major gatherings like CERAWeek, energy executives and tech leaders now share the same stage — and increasingly, the same strategic questions. How do we scale the infrastructure to match exponential AI growth? Who will supply the energy to power it? And how do we do so fast enough while dealing with rising environmental, social and regulatory concerns? These challenges come amid a stark reality: AI’s computational appetite isn’t just increasing —

Read More »

Analyst Highlights Natural Gas Contract ‘Implosion’

In an EBW Analytics Group report sent to Rigzone by the EBW team on Friday, Eli Rubin, an energy analyst at the company, highlighted an “implosion over the past week” in the July natural gas contract. “The July natural gas contract rolled off the board at $3.261 yesterday – a 72.8¢ (-18 percent) implosion over the past week as the heat wave failed to sufficiently lift physical cash prices at Henry Hub,” Rubin said in the report. “Yesterday’s 96 billion cubic foot bearish Energy Information Administration (EIA) injection, while largely a make-up for last week’s bullish surprise, further weighed on prices,” Rubin added. In its latest weekly natural gas storage report, which was released on Thursday, the EIA stated that working gas in storage was 2,898 billion cubic feet as of Friday, June 20, 2025, according to EIA estimates. “This represents a net increase of 96 billion cubic feet from the previous week. Stocks were 196 billion cubic feet less than last year at this time and 179 billion cubic feet above the five-year average of 2,719 billion cubic feet,” the EIA said in its report. “At 2,898 billion cubic feet, total working gas is within the five-year historical range,” the EIA added. In the EBW Analytics Group report, Rubin stated that the early to mid July weather outlook is warming and LNG demand is firming and highlighted that “both can develop further in a bullish direction over the next few weeks”. “Supply readings may churn higher near term, though, amid intramonth nomination patterns and concluding maintenance – potentially subduing upside into early next week,” Rubin said in the report. Rubin also pointed out in the report that, “technically, the August contract appeared to find support yesterday, bouncing 12¢ off intraday lows at $3.403”. He highlighted in the report that the

Read More »

European Council, Parliament Reach Deal on Gas Storage Rule Extension

The European Council and Parliament have reached a provisional agreement on a proposal by the European Commission to extend a regulation requiring that natural gas storage facilities be at least 90 percent full before the winter season. The Gas Storage Regulation was adopted in June 2022 at the height of the energy crisis as a cushion against shortages. It will expire at the end of 2025. The proposal seeks to extend it to 2027. “The agreement keeps the existing binding target of 90 percent of gas storage but provides flexibility to reach it anytime between 1 October and 1 December instead of the current 1 November deadline”, a Council statement said. “Council and Parliament agreed that intermediary storage targets are indicative, to give predictability of storage levels while leaving sufficient flexibility for market participants to purchase gas throughout the year when it is more convenient”. “The agreement reached today would help member states to react swiftly to constantly changing conditions and to take advantage of the best purchasing conditions, while ensuring security of gas supply and the correct functioning of the internal market”, the Council added. Both institutions now need to endorse the provisional deal. “The prolongation of gas storage obligations for the next two years will significantly contribute to maintaining the EU’s security of energy supply and gas market stability by incentivizing preparations for the upcoming winter seasons in a coordinated manner across the Union”, the Commission said separately. “The European Commission will work closely with the Member States to ensure an optimal gas storage refilling and the achievement of the target, including by exploring the full potential of the demand aggregation and joint purchasing”, the Commission added. AggregateEU, a mechanism in which gas suppliers compete to book demand placed by companies in the EU and its Energy Community partner countries, had also been initially only meant

Read More »

Thailand’s PTT Signs Cooperation Agreement with Glenfarne Alaska LNG

Glenfarne Alaska LNG LLC said that Thailand’s PTT Public Company Limited has signed a cooperation agreement for strategic participation in the Alaska LNG project. The cooperation agreement defines the process for Alaska LNG and PTT to move toward definitive agreements for partnership on Alaska LNG, including long-term liquefied natural gas (LNG) offtake. The agreement includes the potential procurement of 2 million metric tons per annum (mtpa) of LNG from Alaska LNG over a 20-year term, Glenfarne said in a statement. “Glenfarne and Alaska LNG are pleased that PTT and the Thai government have realized the strategic security, cost, and stability advantages offered by the Alaska LNG project,” Glenfarne Alaska LNG President Adam Prestidge said. “With today’s and previously announced agreements, Alaska LNG has now reserved 50 percent of its available third-party LNG offtake capacity to investment grade counterparties, and the project has overwhelming interest from additional counterparties globally,” Prestidge added. “Recent events in the Middle East once again underscore the significant need for Alaska LNG that comes from a secure, stable, and abundant source without traversing through potentially contested waters,” Glenfarne CEO Brendan Duval said. “This agreement with PTT further symbolizes Alaska LNG’s tremendous momentum, well on its way to becoming a reality that will solve Alaska’s natural gas shortage while providing jobs, business opportunities, and increased economic development for Alaska residents, businesses, and military facilities,” Duval added. The Alaska LNG project consists of an 807-mile, 42-inch pipeline capable of transporting enough natural gas to meet both Alaska’s domestic needs and supply the full 20-mtpa Alaska LNG export facility, the Glenfarne Group LLC subsidiary said. The pipeline will be built in two independent, financially viable phases. Phase one will deliver natural gas approximately 765 miles from the North Slope to the Anchorage region. Phase two adds compression equipment and approximately

Read More »

EIA Fuel Update Shows Increasing USA Gasoline Price

In its latest gasoline fuel update, which was released this week, the U.S. Energy Information Administration (EIA) showed an increasing price trend for U.S. regular gasoline. The update highlighted that the U.S. regular gasoline price averaged $3.108 per gallon on June 9, $3.139 per gallon on June 16, and $3.213 per gallon on June 23. The June 23 price was $0.225 less than the year ago price, the update outlined. Of the five Petroleum Administration for Defense District (PADD) regions highlighted in the EIA’s latest fuel update, the West Coast was shown to have the highest regular gasoline price as of June 23, at $4.162 per gallon. The Gulf Coast was shown in the update to have the lowest regular gasoline price as of June 23, at $2.844 per gallon. A glossary section of the EIA site notes that the 50 U.S. states and the District of Columbia are divided into five districts, with PADD 1 further split into three subdistricts. PADDs 6 and 7 encompass U.S. territories, the site adds. Current Price According to the AAA Fuel Prices website, the average price of regular gasoline in the U.S. is $3.207 per gallon, as of June 27. Yesterday’s average was $3.220 per gallon, the week ago average was $3.217 per gallon, the month ago average was $3.174 per gallon, and the year ago average was $3.503 per gallon, the site showed. The highest recorded average price for regular gasoline was seen on June 14, 2022, at $5.016 per gallon, the AAA Fuel Prices site outlined. GasBuddy’s live ticking average for regular gasoline in the U.S. was $3.199 per gallon as of 7:25 a.m. EST on June 27. The figure was 0.9 cents lower than yesterday’s average, 1.7 cents lower than last week’s average, 5.2 cents higher than last month’s average, and 31.0

Read More »

Petrovietnam, JV Partners Sign PSC for Vietnam Block

Vietnam Oil and Gas Group (Petrovietnam) and its joint venture partners have signed a production sharing contract (PSC) for Block 15-1 on the southern continental shelf of Vietnam. When completed, the project will supply 125 billion standard cubic feet (Bcf) per day of natural gas to the domestic market, Petrovietnam said in a news release. Under the PSC, Petrovietnam subsidiary Oil and Gas Exploration and Production Corporation (PVEP) holds 59 percent in the asset, Perenco holds 19.8 percent, Korea National Oil Corporation (KNOC) holds 11.4 percent, SK Group holds 7.2 percent, and Geopetrol holds 2.6 percent. The project will be operated by the joint venture Cuu Long Joint Operating Company (Cuu Long JOC), which is the second largest oil exporter in the country. The contract signing officially kicks off phase 2B of the $1.3 billion White Lion Project, Petrovietnam said, citing a Perenco representative. Block 15-1, an oil and gas block with one of the country’s largest reserves and one of the earliest to be developed, is located in the Cuu Long basin on the southern continental shelf of Vietnam and includes the Black Lion, Golden Lion, Brown Lion and White Lion oil and gas fields, according to the release. The asset is one of the few blocks in Vietnam that contains large-scale commercial oil and gas fields. The Black Lion, Golden Lion, and Brown Lion fields operate in parallel with the White Lion gas field, “creating an integrated exploitation ecosystem, optimizing both technical infrastructure and operational efficiency,” Petrovietnam said. The entire cluster forms an integrated system in Vietnam covering production, processing, and transportation, with a central processing platform, floating production, storage, and offloading (FPSO) platforms, wellhead platforms, and a pipeline system, the company said. Block 15-1 provides stable crude oil to the domestic market through the Dung Quat Oil Refinery,

Read More »

National Grid, Con Edison urge FERC to adopt gas pipeline reliability requirements

The Federal Energy Regulatory Commission should adopt reliability-related requirements for gas pipeline operators to ensure fuel supplies during cold weather, according to National Grid USA and affiliated utilities Consolidated Edison Co. of New York and Orange and Rockland Utilities. In the wake of power outages in the Southeast and the near collapse of New York City’s gas system during Winter Storm Elliott in December 2022, voluntary efforts to bolster gas pipeline reliability are inadequate, the utilities said in two separate filings on Friday at FERC. The filings were in response to a gas-electric coordination meeting held in November by the Federal-State Current Issues Collaborative between FERC and the National Association of Regulatory Utility Commissioners. National Grid called for FERC to use its authority under the Natural Gas Act to require pipeline reliability reporting, coupled with enforcement mechanisms, and pipeline tariff reforms. “Such data reporting would enable the commission to gain a clearer picture into pipeline reliability and identify any problematic trends in the quality of pipeline service,” National Grid said. “At that point, the commission could consider using its ratemaking, audit, and civil penalty authority preemptively to address such identified concerns before they result in service curtailments.” On pipeline tariff reforms, FERC should develop tougher provisions for force majeure events — an unforeseen occurrence that prevents a contract from being fulfilled — reservation charge crediting, operational flow orders, scheduling and confirmation enhancements, improved real-time coordination, and limits on changes to nomination rankings, National Grid said. FERC should support efforts in New England and New York to create financial incentives for gas-fired generators to enter into winter contracts for imported liquefied natural gas supplies, or other long-term firm contracts with suppliers and pipelines, National Grid said. Con Edison and O&R said they were encouraged by recent efforts such as North American Energy Standard

Read More »

US BOEM Seeks Feedback on Potential Wind Leasing Offshore Guam

The United States Bureau of Ocean Energy Management (BOEM) on Monday issued a Call for Information and Nominations to help it decide on potential leasing areas for wind energy development offshore Guam. The call concerns a contiguous area around the island that comprises about 2.1 million acres. The area’s water depths range from 350 meters (1,148.29 feet) to 2,200 meters (7,217.85 feet), according to a statement on BOEM’s website. Closing April 7, the comment period seeks “relevant information on site conditions, marine resources, and ocean uses near or within the call area”, the BOEM said. “Concurrently, wind energy companies can nominate specific areas they would like to see offered for leasing. “During the call comment period, BOEM will engage with Indigenous Peoples, stakeholder organizations, ocean users, federal agencies, the government of Guam, and other parties to identify conflicts early in the process as BOEM seeks to identify areas where offshore wind development would have the least impact”. The next step would be the identification of specific WEAs, or wind energy areas, in the larger call area. BOEM would then conduct environmental reviews of the WEAs in consultation with different stakeholders. “After completing its environmental reviews and consultations, BOEM may propose one or more competitive lease sales for areas within the WEAs”, the Department of the Interior (DOI) sub-agency said. BOEM Director Elizabeth Klein said, “Responsible offshore wind development off Guam’s coast offers a vital opportunity to expand clean energy, cut carbon emissions, and reduce energy costs for Guam residents”. Late last year the DOI announced the approval of the 2.4-gigawatt (GW) SouthCoast Wind Project, raising the total capacity of federally approved offshore wind power projects to over 19 GW. The project owned by a joint venture between EDP Renewables and ENGIE received a positive Record of Decision, the DOI said in

Read More »

Biden Bars Offshore Oil Drilling in USA Atlantic and Pacific

President Joe Biden is indefinitely blocking offshore oil and gas development in more than 625 million acres of US coastal waters, warning that drilling there is simply “not worth the risks” and “unnecessary” to meet the nation’s energy needs.  Biden’s move is enshrined in a pair of presidential memoranda being issued Monday, burnishing his legacy on conservation and fighting climate change just two weeks before President-elect Donald Trump takes office. Yet unlike other actions Biden has taken to constrain fossil fuel development, this one could be harder for Trump to unwind, since it’s rooted in a 72-year-old provision of federal law that empowers presidents to withdraw US waters from oil and gas leasing without explicitly authorizing revocations.  Biden is ruling out future oil and gas leasing along the US East and West Coasts, the eastern Gulf of Mexico and a sliver of the Northern Bering Sea, an area teeming with seabirds, marine mammals, fish and other wildlife that indigenous people have depended on for millennia. The action doesn’t affect energy development under existing offshore leases, and it won’t prevent the sale of more drilling rights in Alaska’s gas-rich Cook Inlet or the central and western Gulf of Mexico, which together provide about 14% of US oil and gas production.  The president cast the move as achieving a careful balance between conservation and energy security. “It is clear to me that the relatively minimal fossil fuel potential in the areas I am withdrawing do not justify the environmental, public health and economic risks that would come from new leasing and drilling,” Biden said. “We do not need to choose between protecting the environment and growing our economy, or between keeping our ocean healthy, our coastlines resilient and the food they produce secure — and keeping energy prices low.” Some of the areas Biden is protecting

Read More »

Biden Admin Finalizes Hydrogen Tax Credit Favoring Cleaner Production

The Biden administration has finalized rules for a tax incentive promoting hydrogen production using renewable power, with lower credits for processes using abated natural gas. The Clean Hydrogen Production Credit is based on carbon intensity, which must not exceed four kilograms of carbon dioxide equivalent per kilogram of hydrogen produced. Qualified facilities are those whose start of construction falls before 2033. These facilities can claim credits for 10 years of production, starting on the date they are placed in service, according to the draft text on the Federal Register’s portal. The final text is scheduled for publication Friday. Established by the 2022 Inflation Reduction Act, the four-tier scheme gives producers that meet wage and apprenticeship requirements a credit of up to $3 per kilogram of “qualified clean hydrogen”, to be adjusted for inflation. Hydrogen produced with higher lifecycle emissions receives a smaller credit. The scheme will use the Energy Department’s Greenhouse Gases, Regulated Emissions and Energy Use in Transportation (GREET) model to tier production processes for credit computation. “In the coming weeks, the Department of Energy will release an updated version of the 45VH2-GREET model that producers will use to calculate the section 45V tax credit”, the Treasury Department said in a statement announcing the finalization of the rules, a process that it said had considered roughly 30,000 public comments. However, producers may use the GREET model that was the most recent when their facility began construction. “This is in consideration of comments that the prospect of potential changes to the model over time reduces investment certainty”, explained the statement on the Treasury’s website. “Calculation of the lifecycle GHG analysis for the tax credit requires consideration of direct and significant indirect emissions”, the statement said. For electrolytic hydrogen, electrolyzers covered by the scheme include not only those using renewables-derived electricity (green hydrogen) but
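The tiering lends itself to a quick worked example. The sketch below encodes the four 45V tiers as widely reported (credit per kilogram for producers meeting wage and apprenticeship requirements, before inflation adjustment); the exact boundaries and dollar amounts are assumptions here, not taken from this article, and should be checked against the final rule.

```python
# A rough sketch of the four-tier 45V credit structure described above.
# Tier boundaries and amounts are as commonly reported for the statute,
# not quoted from the final rule -- verify before relying on them.
def clean_hydrogen_credit(ci_kg_co2e_per_kg_h2: float) -> float:
    """Return the assumed credit in USD per kg of hydrogen for a given carbon intensity."""
    if ci_kg_co2e_per_kg_h2 < 0.45:
        return 3.00   # cleanest tier, e.g. renewables-powered electrolysis
    if ci_kg_co2e_per_kg_h2 < 1.5:
        return 1.00
    if ci_kg_co2e_per_kg_h2 < 2.5:
        return 0.75
    if ci_kg_co2e_per_kg_h2 <= 4.0:
        return 0.60   # e.g. some abated natural gas pathways
    return 0.0        # above the 4 kg CO2e/kg ceiling, no credit

print(clean_hydrogen_credit(0.3))  # -> 3.0
print(clean_hydrogen_credit(3.5))  # -> 0.6
```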

Read More »

Xthings unveils Ulticam home security cameras powered by edge AI

Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More Xthings announced that its Ulticam security camera brand has a new model out today: the Ulticam IQ Floodlight, an edge AI-powered home security camera. The company also plans to showcase two additional cameras, Ulticam IQ, an outdoor spotlight camera, and Ulticam Dot, a portable, wireless security camera. All three cameras offer free cloud storage (seven days rolling) and subscription-free edge AI-powered person detection and alerts. The AI at the edge means the camera doesn’t have to send footage to an internet-connected data center to figure out what is in front of it. Rather, the AI processing is built into the camera itself, which the company says sets a new standard for value and performance in home security cameras. It can identify people, faces and vehicles. CES 2025 attendees can experience Ulticam’s entire lineup at Pepcom’s Digital Experience event on January 6, 2025, and at the Venetian Expo, Halls A-D, booth #51732, from January 7 to January 10, 2025. These new security cameras will be available for purchase online in the U.S. in Q1 and Q2 2025 at U-tec.com, Amazon, and Best Buy.

The Ulticam IQ Series: smart edge AI-powered home security cameras

The Ulticam IQ Series, which includes IQ and IQ Floodlight, takes home security to the next level with the most advanced AI-powered recognition. Among the very first consumer cameras to use edge AI, the IQ Series can quickly and accurately identify people, faces and vehicles, without uploading video for server-side processing, which improves speed, accuracy, security and privacy. Additionally, the Ulticam IQ Series is designed to improve over time with over-the-air updates that enable new AI features. Both cameras

Read More »

Intel unveils new Core Ultra processors with 2X to 3X performance on AI apps

Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More Intel unveiled new Intel Core Ultra 9 processors today at CES 2025 with up to two to three times the edge AI performance of prior chips. The chips under the Intel Core Ultra 9 and Core i9 labels were previously codenamed Arrow Lake H, Meteor Lake H, Arrow Lake S and Raptor Lake S Refresh. Intel said it is pushing the boundaries of AI performance and power efficiency for businesses and consumers, ushering in the next era of AI computing. In other performance metrics, Intel said the Core Ultra 9 processors are up to 5.8 times faster in media performance, 3.4 times faster in video analytics end-to-end workloads with media and AI, and 8.2 times better in terms of performance per watt than prior chips. Intel hopes to kick off the year better than in 2024. CEO Pat Gelsinger resigned last month without a permanent successor after a variety of struggles, including mass layoffs, manufacturing delays and poor execution on chips, such as gaming bugs in processors launched during the summer.

Intel Core Ultra Series 2

Michael Masci, vice president of product management at the Edge Computing Group at Intel, said in a briefing that AI, once the domain of research labs, is integrating into every aspect of our lives, including AI PCs where the AI processing is done in the computer itself, not the cloud. AI is also being processed in data centers in big enterprises, from retail stores to hospital rooms. “As CES kicks off, it’s clear we are witnessing a transformative moment,” he said. “Artificial intelligence is moving at an unprecedented pace.” The new processors include the Intel Core Ultra 200 H/U/S models, with up to

Read More »

The rise of prompt ops: Tackling hidden AI costs from bad inputs and context bloat

This article is part of VentureBeat’s special issue, “The Real Cost of AI: Performance, Efficiency and ROI at Scale.” Read more from this special issue.

Model providers continue to roll out increasingly sophisticated large language models (LLMs) with longer context windows and enhanced reasoning capabilities. 

This allows models to process and “think” more, but it also increases compute: The more a model takes in and puts out, the more energy it expends and the higher the costs. 

Couple this with all the tinkering involved with prompting — it can take a few tries to get to the intended result, and sometimes the question at hand simply doesn’t need a model that can think like a PhD — and compute spend can get out of control. 

This is giving rise to prompt ops, a whole new discipline in the dawning age of AI. 

“Prompt engineering is kind of like writing, the actual creating, whereas prompt ops is like publishing, where you’re evolving the content,” Crawford Del Prete, IDC president, told VentureBeat. “The content is alive, the content is changing, and you want to make sure you’re refining that over time.”

The challenge of compute use and cost

Compute use and cost are two “related but separate concepts” in the context of LLMs, explained David Emerson, applied scientist at the Vector Institute. Generally, the price users pay scales based on both the number of input tokens (what the user prompts) and the number of output tokens (what the model delivers). However, they are not charged for behind-the-scenes actions like meta-prompts, steering instructions or retrieval-augmented generation (RAG). 

While longer context allows models to process much more text at once, it directly translates to significantly more FLOPS (a measurement of compute power), he explained. Some aspects of transformer models even scale quadratically with input length if not well managed. Unnecessarily long responses can also slow down processing time and add the compute and cost of building and maintaining algorithms to post-process responses into the answer users were hoping for.
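To make the scaling concrete, here is a minimal sketch of how per-request cost grows with input and especially output tokens; the per-token prices are hypothetical placeholders, not any provider’s actual rates.

```python
# A minimal sketch of how per-request LLM cost scales with token counts.
# The prices below are hypothetical placeholders, not real provider rates.
def estimate_cost(input_tokens: int, output_tokens: int,
                  price_in_per_1k: float = 0.005,
                  price_out_per_1k: float = 0.015) -> float:
    """Return the estimated dollar cost of a single request."""
    return (input_tokens / 1000) * price_in_per_1k \
         + (output_tokens / 1000) * price_out_per_1k

# A verbose reasoning-style answer can cost several times a terse one,
# even for the same question.
print(estimate_cost(200, 50))    # terse answer
print(estimate_cost(200, 1500))  # long chain-of-thought answer
```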

Typically, longer context environments incentivize providers to deliberately deliver verbose responses, said Emerson. Many heavier reasoning models (OpenAI’s o3 or o1, for example) will often provide long responses to even simple questions, incurring heavy computing costs. 

Here’s an example:

Input: Answer the following math problem. If I have 2 apples and I buy 4 more at the store after eating 1, how many apples do I have?

Output: If I eat 1, I only have 1 left. I would have 5 apples if I buy 4 more.

The model not only generated more tokens than it needed to, it buried its answer. An engineer may then have to design a programmatic way to extract the final answer or ask follow-up questions like ‘What is your final answer?’ that incur even more API costs. 

Alternatively, the prompt could be redesigned to guide the model to produce an immediate answer. For instance: 

Input: Answer the following math problem. If I have 2 apples and I buy 4 more at the store after eating 1, how many apples do I have? Start your response with “The answer is”…

Or: 

Input: Answer the following math problem. If I have 2 apples and I buy 4 more at the store after eating 1, how many apples do I have? Wrap your final answer in bold tags (**).

“The way the question is asked can reduce the effort or cost in getting to the desired answer,” said Emerson. He also pointed out that techniques like few-shot prompting (providing a few examples of what the user is looking for) can help produce quicker outputs. 
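A minimal, provider-agnostic sketch of this redesign pattern, with a hypothetical `call_model` stub standing in for a real chat client, might look like this:

```python
import re

# `call_model` is a hypothetical stub standing in for whatever
# chat-completion client is actually in use; it returns a canned reply
# so the example runs end to end.
def call_model(prompt: str) -> str:
    return "After eating 1 of my 2 apples and buying 4 more: **5**"

PROMPT = (
    "Answer the following math problem. If I have 2 apples and I buy 4 more "
    "at the store after eating 1, how many apples do I have? "
    "Wrap your final answer in bold tags (**)."
)

def extract_answer(response: str) -> str | None:
    # Parse the bold-tagged value instead of scraping free-form prose.
    match = re.search(r"\*\*(.+?)\*\*", response)
    return match.group(1) if match else None

print(extract_answer(call_model(PROMPT)))  # -> "5"
```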

One danger is not knowing when to use sophisticated techniques like chain-of-thought (CoT) prompting (generating answers in steps) or self-refinement, which directly encourage models to produce many tokens or go through several iterations when generating responses, Emerson pointed out. 

Not every query requires a model to analyze and re-analyze before providing an answer, he emphasized; models can be perfectly capable of answering correctly when instructed to respond directly. Additionally, misconfigured API requests (such as sending a simple query to OpenAI’s o3 with a high reasoning effort) will incur higher costs when a lower-effort, cheaper request would suffice.
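As an illustration of matching configuration to the task, the sketch below assumes the OpenAI Python SDK’s `reasoning_effort` parameter for its reasoning models; model availability and parameter names vary by SDK version, so treat the details as assumptions.

```python
from openai import OpenAI  # assumes the official OpenAI Python SDK

client = OpenAI()

# For simple arithmetic, request low reasoning effort rather than paying
# for a long hidden chain of thought. "o3-mini" availability is assumed.
response = client.chat.completions.create(
    model="o3-mini",
    reasoning_effort="low",  # "high" would cost more for no accuracy gain here
    messages=[{
        "role": "user",
        "content": "If I have 2 apples, eat 1, then buy 4 more, "
                   "how many do I have? Answer directly.",
    }],
)
print(response.choices[0].message.content)
```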

“With longer contexts, users can also be tempted to use an ‘everything but the kitchen sink’ approach, where you dump as much text as possible into a model context in the hope that doing so will help the model perform a task more accurately,” said Emerson. “While more context can help models perform tasks, it isn’t always the best or most efficient approach.”

Evolution to prompt ops

It’s no big secret that AI-optimized infrastructure can be hard to come by these days; IDC’s Del Prete pointed out that enterprises must be able to minimize the amount of GPU idle time and fill more queries into idle cycles between GPU requests. 

“How do I squeeze more out of these very, very precious commodities?” he noted. “Because I’ve got to get my system utilization up, because I just don’t have the benefit of simply throwing more capacity at the problem.” 

Prompt ops can go a long way towards addressing this challenge, as it ultimately manages the lifecycle of the prompt. While prompt engineering is about the quality of the prompt, prompt ops is about how you iterate on and refine prompts over time, Del Prete explained. 

“It’s more orchestration,” he said. “I think of it as the curation of questions and the curation of how you interact with AI to make sure you’re getting the most out of it.” 

Models tend to get “fatigued,” cycling in loops where the quality of outputs degrades, he said. Prompt ops help manage, measure, monitor and tune prompts. “I think when we look back three or four years from now, it’s going to be a whole discipline. It’ll be a skill.”

While it’s still very much an emerging field, early providers include QueryPal, Promptable, Rebuff and TrueLens. As prompt ops evolves, these platforms will continue to iterate, improve and provide real-time feedback to give users more capacity to tune prompts over time, Del Prete noted.

Eventually, he predicted, agents will be able to tune, write and structure prompts on their own. “The level of automation will increase, the level of human interaction will decrease, you’ll be able to have agents operating more autonomously in the prompts that they’re creating.”

Common prompting mistakes

Until prompt ops is fully realized, there is ultimately no perfect prompt. Some of the biggest mistakes people make, according to Emerson: 

Not being specific enough about the problem to be solved. This includes how the user wants the model to provide its answer, what should be considered when responding, constraints to take into account and other factors. “In many settings, models need a good amount of context to provide a response that meets users’ expectations,” said Emerson. 

Not taking into account the ways a problem can be simplified to narrow the scope of the response. Should the answer be within a certain range (0 to 100)? Should the answer be phrased as a multiple choice problem rather than something open-ended? Can the user provide good examples to contextualize the query? Can the problem be broken into steps for separate and simpler queries?

Not taking advantage of structure. LLMs are very good at pattern recognition, and many can understand code. While using bullet points, itemized lists or bold indicators (****) may seem “a bit cluttered” to human eyes, Emerson noted, these callouts can be beneficial for an LLM. Asking for structured outputs (such as JSON or Markdown) can also help when users are looking to process responses automatically. 
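For the structured-output point in particular, a minimal sketch (with an illustrative schema and a hypothetical `call_model` stub) could look like this:

```python
import json

# Asking for JSON so responses can be handled programmatically.
# The schema and the `call_model` stub are illustrative, not from the article.
def call_model(prompt: str) -> str:
    return '{"sentiment": "negative", "confidence": 0.92}'  # canned demo reply

prompt = (
    "Classify the sentiment of this review as positive, negative or neutral. "
    'Respond with JSON only, e.g. {"sentiment": "positive", "confidence": 0.9}.\n\n'
    "Review: The battery died after two days."
)

raw = call_model(prompt)
try:
    result = json.loads(raw)
    print(result["sentiment"])
except json.JSONDecodeError:
    # Structured prompts reduce, but don't eliminate, malformed outputs.
    print("Could not parse model output:", raw)
```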

There are many other factors to consider in maintaining a production pipeline, based on engineering best practices, Emerson noted. These include (a minimal monitoring sketch follows the list): 

Making sure that the throughput of the pipeline remains consistent; 

Monitoring the performance of the prompts over time (potentially against a validation set);

Setting up tests and early warning detection to identify pipeline issues.
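As a sketch of the monitoring practices above, a prompt can be regression-tested against a small validation set on a schedule; the examples and the `call_model` stub here are hypothetical.

```python
# Regression-testing a prompt against a validation set. The `call_model`
# stub and the examples are hypothetical placeholders.
def call_model(prompt: str) -> str:
    return "5"  # canned demo reply

VALIDATION_SET = [
    {"prompt": "2 apples, eat 1, buy 4 more. How many? Reply with a number only.",
     "expected": "5"},
]

def prompt_accuracy() -> float:
    hits = sum(call_model(c["prompt"]).strip() == c["expected"]
               for c in VALIDATION_SET)
    return hits / len(VALIDATION_SET)

# Run on a schedule; alert when accuracy drifts below a threshold.
score = prompt_accuracy()
assert score >= 0.9, f"Prompt regression detected: accuracy={score:.0%}"
```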

Users can also take advantage of tools designed to support the prompting process. For instance, the open-source DSPy can automatically configure and optimize prompts for downstream tasks based on a few labeled examples. While this may be a fairly sophisticated example, there are many other offerings (including features built into tools like ChatGPT and Google’s Gemini) that can assist in prompt design. 
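For illustration, here is a sketch based on DSPy’s documented `Predict` and `BootstrapFewShot` interfaces; module paths and signatures vary across DSPy releases, so treat the specifics as assumptions.

```python
import dspy  # open-source DSPy library

# Configure a backend; the model identifier here is an assumption.
dspy.configure(lm=dspy.LM("openai/gpt-4o-mini"))

# A simple question-answering module.
qa = dspy.Predict("question -> answer")

# A few labeled examples, as the article describes.
trainset = [
    dspy.Example(question="2 apples, eat 1, buy 4 more. Total?", answer="5")
        .with_inputs("question"),
    dspy.Example(question="3 oranges, give away 2. Total?", answer="1")
        .with_inputs("question"),
]

# Optimize the prompt/demonstrations against a simple exact-match metric.
metric = lambda example, pred, trace=None: example.answer == pred.answer
optimized_qa = dspy.BootstrapFewShot(metric=metric).compile(qa, trainset=trainset)

print(optimized_qa(question="10 pears, eat 2. Total?").answer)
```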

And ultimately, Emerson said, “I think one of the simplest things users can do is to try to stay up-to-date on effective prompting approaches, model developments and new ways to configure and interact with models.” 

Read More »

From pilot to profit: The real path to scalable, ROI-positive AI

Join the event trusted by enterprise leaders for nearly two decades. VB Transform brings together the people building real enterprise AI strategy. Learn more This article is part of VentureBeat’s special issue, “The Real Cost of AI: Performance, Efficiency and ROI at Scale.” Read more from this special issue. Three years after ChatGPT launched the generative AI era, most enterprises remain trapped in pilot purgatory. Despite billions in AI investments, the majority of corporate AI initiatives never escape the proof-of-concept phase, let alone generate measurable returns. But a select group of Fortune 500 companies has cracked the code. Walmart, JPMorgan Chase, Novartis, General Electric, McKinsey, Uber and others have systematically moved AI from experimental “innovation theater” to production-grade systems delivering substantial ROI—in some cases, generating over $1 billion in annual business value. Their success isn’t accidental. It’s the result of deliberate governance models, disciplined budgeting strategies and fundamental cultural shifts that transform how organizations approach AI deployment. This isn’t about having the best algorithms or the most data scientists. It’s about building the institutional machinery that turns AI experiments into scalable business assets. “We see this as a pretty big inflection point, very similar to the internet,” Walmart’s VP of emerging technology Desirée Gosby said at this week’s VB Transform event. “It’s as profound in terms of how we’re actually going to operate, how we actually do work.” The pilot trap: Why most AI initiatives fail to scale The statistics are sobering. Industry research shows that 85% of AI projects never make it to production, and of those that do, fewer than half generate meaningful business value. The problem isn’t technical—it’s organizational. Companies treat AI as a science experiment rather than a business capability. “AI is already cutting some product-development cycles by about 40 percent, letting companies ship and decide faster than ever,”

Read More »

Kumo’s ‘relational foundation model’ predicts the future your LLM can’t see

Join the event trusted by enterprise leaders for nearly two decades. VB Transform brings together the people building real enterprise AI strategy. Learn more The generative AI boom has given us powerful language models that can write, summarize and reason over vast amounts of text and other types of data. But when it comes to high-value predictive tasks like predicting customer churn or detecting fraud from structured, relational data, enterprises remain stuck in the world of traditional machine learning.  Stanford professor and Kumo AI co-founder Jure Leskovec argues that this is the critical missing piece. His company’s tool, a relational foundation model (RFM), is a new kind of pre-trained AI that brings the “zero-shot” capabilities of large language models (LLMs) to structured databases. “It’s about making a forecast about something you don’t know, something that has not happened yet,” Leskovec told VentureBeat. “And that’s a fundamentally new capability that is, I would argue, missing from the current purview of what we think of as gen AI.” Why predictive ML is a “30-year-old technology” While LLMs and retrieval-augmented generation (RAG) systems can answer questions about existing knowledge, they are fundamentally retrospective. They retrieve and reason over information that is already there. For predictive business tasks, companies still rely on classic machine learning.  For example, to build a model that predicts customer churn, a business must hire a team of data scientists who spend a considerably long time doing “feature engineering,” the process of manually creating predictive signals from the data. This involves complex data wrangling to join information from different tables, such as a customer’s purchase history and website clicks, to create a single, massive training table. “If you want to do machine learning (ML), sorry, you are stuck in the past,” Leskovec said. Expensive and time-consuming bottlenecks prevent most organizations from
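To illustrate the feature-engineering bottleneck Leskovec describes, here is a minimal pandas sketch of the classic workflow: joining relational tables into one flat training table. The tables and column names are invented for the example.

```python
import pandas as pd

# Manual feature engineering for a churn model: join and aggregate
# relational tables into one flat training table. All names are illustrative.
customers = pd.DataFrame({"customer_id": [1, 2], "signup_days": [400, 30]})
purchases = pd.DataFrame({"customer_id": [1, 1, 2], "amount": [20.0, 35.0, 5.0]})

features = (
    purchases.groupby("customer_id")
    .agg(total_spend=("amount", "sum"), n_orders=("amount", "count"))
    .reset_index()
    .merge(customers, on="customer_id", how="right")
    .fillna(0)
)
# Every new signal (clicks, support tickets, returns...) means more joins and
# aggregations -- the bottleneck a relational foundation model aims to skip.
print(features)
```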

Read More »

OpenAI’s API lead explains how enterprises are already succeeding with its Agents SDK and Responses API

Join the event trusted by enterprise leaders for nearly two decades. VB Transform brings together the people building real enterprise AI strategy. Learn more At VentureBeat’s Transform 2025 conference, Olivier Godement, Head of Product for OpenAI’s API platform, provided a behind-the-scenes look at how enterprise teams are adopting and deploying AI agents at scale. In a 20-minute panel discussion I hosted exclusively with Godement, the former Stripe researcher and current OpenAI API boss unpacked OpenAI’s latest developer tools—the Responses API and Agents SDK—while highlighting real-world patterns, security considerations, and cost-return examples from early adopters like Stripe and Box. For enterprise leaders unable to attend the session live, here are top 8 most important takeaways: Agents Are Rapidly Moving From Prototype to Production According to Godement, 2025 marks a real shift in how AI is being deployed at scale. With over a million monthly active developers now using OpenAI’s API platform globally, and token usage up 700% year over year, AI is moving beyond experimentation. “It’s been five years since we launched essentially GPT-3… and man, the past five years has been pretty wild.” Godement emphasized that current demand isn’t just about chatbots anymore. “AI use cases are moving from simple Q&A to actually use cases where the application, the agent, can do stuff for you.” This shift prompted OpenAI to launch two major developer-facing tools in March: the Responses API and the Agents SDK. When to Use Single Agents vs. Sub-Agent Architectures A major theme was architectural choice. Godement noted that single-agent loops, which encapsulate full tool access and context in one model, are conceptually elegant but often impractical at scale. “Building accurate and reliable single agents is hard. Like, it’s really hard.” As complexity increases—more tools, more possible user inputs, more logic—teams often move toward modular architectures with specialized sub-agents.
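As a sketch of the sub-agent pattern Godement describes, the snippet below uses the open-source Agents SDK’s published `Agent`, `Runner` and `handoffs` interfaces; the agents, instructions and routing logic are illustrative, not OpenAI’s.

```python
# Sub-agent (handoff) architecture with the openai-agents package.
# Agents and instructions here are illustrative examples only.
from agents import Agent, Runner

billing_agent = Agent(
    name="Billing",
    instructions="Answer billing and invoice questions.",
)
support_agent = Agent(
    name="Support",
    instructions="Answer product support questions.",
)
triage_agent = Agent(
    name="Triage",
    instructions="Route the user to the right specialist agent.",
    handoffs=[billing_agent, support_agent],  # modular sub-agents
)

result = Runner.run_sync(triage_agent, "Why was I charged twice this month?")
print(result.final_output)
```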

Read More »

How Highmark Health and Google Cloud are using Gen AI to streamline medical claims and improve care: 6 key lessons

Join the event trusted by enterprise leaders for nearly two decades. VB Transform brings together the people building real enterprise AI strategy. Learn more Among the numerous educational and startlingly insightful panel discussions on AI enterprise integrations featuring industry leaders at VentureBeat’s Transform 2025 conference this week was one led by Google Cloud Platform Vice President and Chief Technology Officer (CTO) Will Grannis and Richard Clarke, Highmark Health’s Senior Vice President and Chief Data and Analytics Officer. That session, “The New AI Stack in Healthcare: Architecting for Multi-Model, Multi-Modal Environments,” delivered a pragmatic look at how the two organizations are collaborating to deploy AI at scale across more than 14,000 employees at the large U.S. healthcare system Highmark Health (based out of Western Pennsylvania). In addition, the collaboration has onboarded all these employees and turned them into active users without losing sight of complexity, regulation, or clinician trust. So, how did Google Cloud and Highmark go about it? Read on to find out. A Partnership Built on Prepared Foundations Highmark Health, an integrated payer-provider system serving over 6 million members, is using Google Cloud’s AI models and infrastructure to modernize legacy systems, boost internal efficiency, and improve patient outcomes. What sets this initiative apart is its focus on platform engineering—treating AI as a foundational shift in how work gets done, not just another tech layer. Richard Clarke, Highmark’s Chief Data and Analytics Officer, emphasized the importance of building flexible infrastructure early. “There’s nothing more legacy than an employment platform coded in COBOL,” Clarke noted, but Highmark has integrated even those systems with cloud-based AI models. The result: up to 90% workload replication without systemic disruption, enabling smoother transitions and real-time insights into complex administrative processes. Google Cloud CTO Will Grannis echoed that success begins with groundwork. “This may take three, four,

Read More »

The Download: how to clean up AI data centers, and weight-loss drugs’ side effects

This is today’s edition of The Download, our weekday newsletter that provides a daily dose of what’s going on in the world of technology.

This battery recycling company is now cleaning up AI data centers

In a sandy industrial lot outside Reno, Nevada, rows of battery packs that once propelled electric vehicles are now powering a small AI data center. Redwood Materials, one of the US’s largest battery recycling companies, showed off this array of energy storage modules, sitting on cinder blocks and wrapped in waterproof plastic, during a press tour at its headquarters on June 26. The event marked the launch of the company’s new business line, Redwood Energy, which will initially repurpose (rather than recycle) batteries with years of remaining life to create renewable-powered microgrids. Such small-scale energy systems can operate on or off the larger electricity grid, providing electricity for businesses or communities. Read the full story.

—James Temple

We’re learning more about what weight-loss drugs do to the body

Weight-loss drugs are this decade’s blockbuster medicines. Drugs like Ozempic, Wegovy, and Mounjaro help people with diabetes get their blood sugar under control and help overweight and obese people reach a healthier weight. And they’re fast becoming a trendy must-have for celebrities and other figure-conscious individuals looking to trim down.

They became so hugely popular so quickly that not long after their approval for weight loss, we saw global shortages of the drugs. Prescriptions have soared over the last five years, but even people who don’t have prescriptions are seeking these drugs out online. We know they can suppress appetite, lower blood sugar, and lead to dramatic weight loss. We also know that they come with side effects, which can include nausea, diarrhea, and vomiting. But we are still learning about some of their other effects. Read the full story.

—Jessica Hamzelou

This article first appeared in The Checkup, MIT Technology Review’s weekly biotech newsletter. To receive it in your inbox every Thursday, and read articles like this first, sign up here.

The must-reads

I’ve combed the internet to find you today’s most fun/important/scary/fascinating stories about technology.

1 The Supreme Court has paved the way to defund Planned Parenthood
By allowing South Carolina to block the organization from its Medicaid program. (WP $)
+ Other red states are likely to follow suit. (CNN)
+ Planned Parenthood may be able to challenge the ban under state law. (Politico)

2 Iran is back online
The country appeared to cut connectivity in a bid to thwart foreign attacks. (Economist $)

3 ICE is using a new facial recognition app
It’s capable of recognizing someone from their fingerprints, too. (404 Media)
+ How a new type of AI is helping police skirt facial recognition bans. (MIT Technology Review)

4 Denmark has a potential solution for malicious deepfakes
By giving its residents copyright to their own body, facial features, and voice. (The Guardian)
+ An AI startup made a hyperrealistic deepfake of me that’s so good it’s scary. (MIT Technology Review)

5 Impossible Foods wants to bring its plant-based burgers to Europe 🍔
After sales started falling in America. (Bloomberg $)
+ Sales of regular old meat are booming in the States. (Vox)

6 The Three Mile Island nuclear plant’s restart is being fast tracked
It’s currently scheduled to start operating a year earlier than anticipated. (Reuters)
+ But bringing the reactor back online is no easy task. (The Register)
+ Why Microsoft made a deal to help restart Three Mile Island. (MIT Technology Review)

7 AI may be making research too easy
New research suggests that using LLMs results in weaker grasps of topics. (WSJ $)
+ It could also be making our thoughts less original. (New Yorker $)

8 Climate tech companies are struggling to weather Trump’s cuts
A lot of startups are expected to fold as a result. (Insider $)
+ The Trump administration has shut down more than 100 climate studies. (MIT Technology Review)

9 Billions of Facebook and Google passwords have been leaked
And people in developing nations are most at risk. (Rest of World)

10 Inside a couples retreat with humans and their AI companions
Chaos ensued. (Wired $)
+ The AI relationship revolution is already here. (MIT Technology Review)

Quote of the day

“[The internet blackout] makes us invisible. And still, we’re here. Still trying to connect with the free world.” —’Amir,’ a student in Iran, tells the Guardian why young Iranians are working to overcome the country’s internet shutdowns.

One more thing

Maybe you will be able to live past 122

How long can humans live? This is a good time to ask the question. The longevity scene is having a moment, thanks to a combination of scientific advances, public interest, and an unprecedented level of investment. A few key areas of research suggest that we might be able to push human life spans further, and potentially reverse at least some signs of aging. Researchers can’t even agree on what the exact mechanisms of aging are and which they should be targeting. Debates continue to rage over how long it’s possible for humans to live—and whether there is a limit at all. But it looks likely that something will be developed in the coming decades that will help us live longer, in better health. Read the full story.

—Jessica Hamzelou

We can still have nice things

A place for comfort, fun and distraction to brighten up your day. (Got any ideas? Drop me a line or skeet ’em at me.)

+ This ancient amphibian skull is pretty remarkable.
+ A new Phantom of the Opera spin-off is coming—but no one really knows what it is.
+ Stop panicking, it turns out Marge Simpson isn’t dead after all.
+ I love these owls in towels 🦉

Read More »

Russian Fuel Flows Decline to Lowest in 8 Months on Baltic Slump

Russia’s oil product exports dropped in June to the lowest in eight months amid extended work at refineries supplying Baltic ports, coupled with efforts to stabilize domestic fuel supplies before the upcoming seasonal surge in agricultural and holiday consumption. Seaborne shipments of refined fuels totaled 2 million barrels a day in the first 20 days of June, according to data compiled by Bloomberg from analytics firm Vortexa Ltd. That’s the lowest monthly tally since October and an 8% decline from both the previous month and June of last year. Flows from Baltic ports recorded the sharpest drop of more than 15% from May levels. Russian seaborne oil flows are closely watched by the market to assess the country’s production since official data has been classified. Crude outflows slid to the lowest since mid-April, led by maintenance-related disruptions at a key Pacific port, compounded by a decline from the Baltic. Oil processing rates have ramped up this month as refineries wrap up seasonal maintenance. However, volumes available for export may be curbed by government initiatives to boost stockpiles to meet growing fuel demand from agricultural activity and summer travel. Diesel exports were largely flat, while flows of refinery feedstocks like vacuum gasoil, used by secondary units like the fluid catalytic crackers, jumped this month. Outflows of all other major fuels slumped. Most of the decline in fuel flows was concentrated in the Baltic ports, indicating extended turnarounds at refineries that usually supply these terminals. “Drone strikes earlier this year could have extended the turnaround time for both primary and secondary units,” according to Mick Strautmann, a market analyst at Vortexa. The spike in vacuum gasoil flows out of Ust-Luga in the Baltic suggests more serious disruptions at downstream units in the region, he

Read More »

Oil Steady as OPEC+ Weighs Output Hike

Oil held steady as traders weighed the uncertain status of nuclear talks between the US and Iran against reports that OPEC+ may extend its run of super-sized production increases. West Texas Intermediate edged up to settle above $65 a barrel after swinging between gains and losses. Bloomberg reported that several OPEC delegates, who asked not to be identified, said their countries are ready to consider another 411,000 barrel-a-day increase for August when they convene on July 6, following similarly sized hikes agreed upon in each of the previous three months. While that figure is broadly in line with expectations, “the indications are that the group may go beyond the 411,000 barrel-a-day increase,” said John Kilduff, a partner at Again Capital. “Next, we should hear about the voluntary cuts under-shooting the goal from the group laggards. I expect the ultimate decision to be bearish for prices.” Crude had earlier advanced as much as 1.3% after US Energy Secretary Chris Wright told Bloomberg that sanctions against Iran will remain in place for now, and US President Donald Trump said he dropped plans to ease Iran sanctions. The statement comes just days after the president claimed that Iran and the US would meet for nuclear talks as soon as next week, which Iran denied. Oil still ended the week down roughly 13% — snapping three weeks of gains — after a ceasefire in the Israel-Iran conflict was reached, easing concerns about supply disruptions from a region that pumps about a third of the world’s crude. The focus has largely reverted to fundamental catalysts, including OPEC moves. Russia now also appears more receptive to a fresh output boost, in a reversal of an earlier stance, raising concerns of supply overhang in the second half of the year. Investors have also turned their attention to progress on

Read More »

Stay Ahead with the Paperboy Newsletter

Your weekly dose of insights into AI, Bitcoin mining, Datacenters and Energy industry news. Spend 3-5 minutes and catch up on 1 week of news.

Smarter with ONMINE

Streamline Your Growth with ONMINE