
Diffbot’s AI model doesn’t guess—it knows, thanks to a trillion-fact knowledge graph

Diffbot, a small Silicon Valley company best known for maintaining one of the world’s largest indexes of web knowledge, announced today the release of a new AI model that promises to address one of the biggest challenges in the field: factual accuracy.

The new model, a fine-tuned version of Meta’s Llama 3.3, is the first open-source implementation of a system known as Graph Retrieval-Augmented Generation, or GraphRAG.

Unlike conventional AI models, which rely solely on vast amounts of preloaded training data, Diffbot’s LLM draws on real-time information from the company’s Knowledge Graph, a constantly updated database containing more than a trillion interconnected facts.

“We have a thesis that eventually general purpose reasoning will get distilled down into about 1 billion parameters,” said Mike Tung, Diffbot’s founder and CEO, in an interview with VentureBeat. “You don’t actually want the knowledge in the model. You want the model to be good at just using tools so that it can query knowledge externally.”
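
A minimal sketch of the pattern Tung describes, assuming a hypothetical lookup interface (the function and field names below are illustrative, not Diffbot’s actual API): a compact model plans a structured query, an external store returns facts with provenance, and the model composes its answer from those facts alone.

```python
# Illustrative "knowledge outside the model" loop; all names are placeholders.
from dataclasses import dataclass

@dataclass
class Fact:
    subject: str
    predicate: str
    value: str
    source_url: str  # provenance, so every claim can be audited

def lookup(subject: str, predicate: str) -> list[Fact]:
    """Query an external, continuously refreshed knowledge store."""
    raise NotImplementedError("wire this to a knowledge graph or search index")

def answer_with_tools(question: str, generate) -> str:
    # `generate` is any text-in/text-out LLM callable, e.g. a small local model.
    plan = generate(f"Name the subject and predicate needed to answer: {question}")
    subject, predicate = (part.strip() for part in plan.split(",", 1))
    facts = lookup(subject, predicate)  # fresh facts, not parametric memory
    context = "\n".join(f"{f.value} (source: {f.source_url})" for f in facts)
    return generate(
        f"Answer using only the facts below and cite the sources.\n{context}\n\nQ: {question}"
    )
```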

How it works

Diffbot’s Knowledge Graph is a sprawling, automated database that has been crawling the public web since 2016. It categorizes web pages into entities such as people, companies, products, and articles, extracting structured information using a combination of computer vision and natural language processing.

Every four to five days, the Knowledge Graph is refreshed with millions of new facts, ensuring it remains up-to-date. Diffbot’s AI model leverages this resource by querying the graph in real time to retrieve information, rather than relying on static knowledge encoded in its training data.

For example, when asked about a recent news event, the model can search the web for the latest updates, extract relevant facts, and cite the original sources. This process is designed to make the system more accurate and transparent than traditional LLMs.

“Imagine asking an AI about the weather,” Tung said. “Instead of generating an answer based on outdated training data, our model queries a live weather service and provides a response grounded in real-time information.”
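
The weather example generalizes to a small dispatch step: route each question to whichever live source can answer it, then constrain the reply to whatever comes back. Below is a hedged sketch with made-up tool names; neither stub reflects Diffbot’s real integrations.

```python
# Hypothetical tool dispatch: pick a live source, fetch fresh evidence,
# then answer only from that evidence. Both tools are stubs.
import datetime

def live_weather(query: str) -> str:
    # Stand-in for a call to a real-time weather service.
    return f"18°C, light rain (fetched {datetime.date.today()})"

def graph_facts(query: str) -> str:
    # Stand-in for a structured knowledge-graph lookup with provenance.
    return "Diffbot Knowledge Graph: refreshed every four to five days (source: example.org)"

TOOLS = {"weather": live_weather, "facts": graph_facts}

def grounded_answer(question: str, generate) -> str:
    choice = generate(f"Reply with 'weather' or 'facts' for: {question}").strip().lower()
    evidence = TOOLS.get(choice, graph_facts)(question)  # fall back to the graph
    return generate(f"Answer from this evidence only and cite it:\n{evidence}\n\nQ: {question}")
```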

How Diffbot’s Knowledge Graph beats traditional AI at finding facts

In benchmark tests, Diffbot’s approach appears to be paying off. The company reports its model achieves an 81% accuracy score on FreshQA, a Google-created benchmark for testing real-time factual knowledge, surpassing both ChatGPT and Gemini. It also scored 70.36% on MMLU-Pro, a more difficult version of a standard test of academic knowledge.

Perhaps most significantly, Diffbot is making its model fully open source, allowing companies to run it on their own hardware and customize it for their needs. This addresses growing concerns about data privacy and vendor lock-in with major AI providers.

“You can run it locally on your machine,” Tung noted. “There’s no way you can run Google Gemini without sending your data over to Google and shipping it outside of your premises.”

Open source AI could transform how enterprises handle sensitive data

The release comes at a pivotal moment in AI development. Recent months have seen mounting criticism of large language models’ tendency to “hallucinate” or generate false information, even as companies continue to scale up model sizes. Diffbot’s approach suggests an alternative path forward — one focused on grounding AI systems in verifiable facts rather than attempting to encode all human knowledge in neural networks.

“Not everyone’s going after just bigger and bigger models,” Tung said. “You can have a model that has more capability than a big model with kind of a non-intuitive approach like ours.”

Industry experts note that Diffbot’s knowledge graph-based approach could be particularly valuable for enterprise applications where accuracy and auditability are crucial. The company already provides data services to major firms including Cisco, DuckDuckGo, and Snapchat.

The model is available immediately through an open source release on GitHub and can be tested through a public demo at diffy.chat. For organizations wanting to deploy it internally, Diffbot says the smaller 8-billion-parameter version can run on a single Nvidia A100 GPU, while the full 70-billion-parameter version requires two H100 GPUs.
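
For teams that want to try a local deployment, standard Hugging Face tooling is the likely path. This is a sketch only: the repository id below is a placeholder rather than Diffbot’s published model name, and the memory comments simply mirror the A100/H100 guidance above.

```python
# Minimal local-inference sketch with Hugging Face transformers.
# "diffbot/example-graphrag-8b" is a hypothetical id; use the name from Diffbot's release.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "diffbot/example-graphrag-8b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # ~16 GB of weights for an 8B model, within one A100
    device_map="auto",           # lets the larger 70B variant shard across two H100s
)

prompt = "Who founded Diffbot? Cite your source."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```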

Looking ahead, Tung believes the future of AI lies not in ever-larger models, but in better ways of organizing and accessing human knowledge: “Facts get stale. A lot of these facts will be moved out into explicit places where you can actually modify the knowledge and where you can have data provenance.”

As the AI industry grapples with challenges around factual accuracy and transparency, Diffbot’s release offers a compelling alternative to the dominant bigger-is-better paradigm. Whether it succeeds in shifting the field’s direction remains to be seen, but it has certainly demonstrated that when it comes to AI, size isn’t everything.
