
Hallucinations in AI: How GSK is addressing a critical problem in drug development




Generative AI has become a key piece of infrastructure in many industries, and healthcare is no exception. Yet, as organizations like GSK push the boundaries of what generative AI can achieve, they face significant challenges — particularly when it comes to reliability. Hallucinations, instances in which AI models generate incorrect or fabricated information, are a persistent problem in high-stakes applications like drug discovery and healthcare. For GSK, tackling these challenges requires leveraging test-time compute scaling to improve gen AI systems. Here’s how they’re doing it.

The hallucination problem in generative healthcare

Healthcare applications demand an exceptionally high level of accuracy and reliability. Errors are not merely inconvenient; they can have life-altering consequences. This makes hallucinations in large language models (LLMs) a critical issue for companies like GSK, where gen AI is applied to tasks such as scientific literature review, genomic analysis and drug discovery.

To mitigate hallucinations, GSK employs advanced inference-time compute strategies, including self-reflection mechanisms, multi-model sampling and iterative output evaluation. According to Kim Branson, SVP of AI and machine learning (ML) at GSK, these techniques help ensure that agents are “robust and reliable,” while enabling scientists to generate actionable insights more quickly.

Leveraging test-time compute scaling

Test-time compute scaling refers to the ability to increase computational resources during the inference phase of AI systems. This allows for more complex operations, such as iterative output refinement or multi-model aggregation, which are critical for reducing hallucinations and improving model performance.

Branson emphasized the transformative role of scaling in GSK’s AI efforts, noting that “we’re all about increasing the iteration cycles at GSK — how we think faster.” By using strategies like self-reflection and ensemble modeling, GSK can leverage these additional compute cycles to produce results that are both accurate and reliable.

Branson also touched on the broader industry trend, saying, “You’re seeing this war happening with how much I can serve, my cost per token and time per token. That allows people to bring these different algorithmic strategies which were before not technically feasible, and that also will drive the kind of deployment and adoption of agents.”

Strategies for reducing hallucinations

GSK has identified hallucinations as a critical challenge in gen AI for healthcare. The company employs two main strategies that require additional computational resources during inference. Applying more thorough processing steps ensures that each answer is examined for accuracy and consistency before it is delivered in clinical or research settings, where reliability is paramount.

Self-reflection and iterative output review

One core technique is self-reflection, where LLMs critique or edit their own responses to improve quality. The model “thinks step by step,” analyzing its initial output, pinpointing weaknesses and revising answers as needed. GSK’s literature search tool exemplifies this: It collects data from internal repositories and an LLM’s memory, then re-evaluates its findings through self-criticism to uncover inconsistencies. 

This iterative process results in clearer, more detailed final answers. Branson underscored the value of self-criticism, saying: “If you can only afford to do one thing, do that.” Refining its own logic before delivering results allows the system to produce insights that align with healthcare’s strict standards.
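The article doesn’t publish GSK’s implementation, but the critique-then-revise loop it describes can be sketched in a few lines. This is a minimal illustration, assuming a hypothetical `llm()` callable standing in for a real model API (stubbed here with canned responses so the control flow is runnable):

```python
# Hedged sketch of an iterative self-reflection loop. The `llm` function
# below is a stub, not a real model call: it fakes a critique pass and a
# revision pass so the loop's control flow can be exercised end to end.

def llm(prompt: str) -> str:
    # Stand-in for a real LLM call. A critique prompt returns "OK" once the
    # draft has already been revised; a revision prompt appends a marker.
    if prompt.startswith("Critique"):
        return "OK" if "revised" in prompt else "Claim 2 lacks a citation."
    return prompt.rsplit("\n", 1)[-1] + " (revised)"

def self_reflect(question: str, draft: str, max_rounds: int = 3) -> str:
    """Have the model critique its own draft, then revise, repeating until
    the critique finds no remaining weaknesses or the compute budget runs
    out. More rounds = more test-time compute spent per answer."""
    answer = draft
    for _ in range(max_rounds):
        critique = llm(f"Critique this answer to '{question}':\n{answer}")
        if critique == "OK":  # no weaknesses found: stop early
            break
        answer = llm(f"Revise using this critique: {critique}\n{answer}")
    return answer
```

The `max_rounds` parameter is the test-time compute knob: raising it buys more critique/revision cycles per query, which is exactly the trade-off between latency, cost and reliability discussed below.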

Multi-model sampling

GSK’s second strategy relies on multiple LLMs or different configurations of a single model to cross-verify outputs. In practice, the system might run the same query at various temperature settings to generate diverse answers, employ fine-tuned versions of the same model specializing in particular domains or call on entirely separate models trained on distinct datasets.

Comparing and contrasting these outputs helps confirm the most consistent or convergent conclusions. “You can get that effect of having different orthogonal ways to come to the same conclusion,” said Branson. Although this approach requires more computational power, it reduces hallucinations and boosts confidence in the final answer — an essential benefit in high-stakes healthcare environments.
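As a rough illustration of the sampling-and-convergence idea (again with a stubbed `sample()` in place of real model calls, since the article names no specific API), the same query can be run across several configurations and the majority answer kept, with the agreement rate serving as a crude confidence signal:

```python
# Hedged sketch of multi-model (or multi-temperature) sampling with a
# majority vote. `sample` is a stub standing in for one model call; in a
# real system each call would hit a different model, fine-tune or
# temperature setting.
from collections import Counter

def sample(query: str, temperature: float) -> str:
    # Stand-in for a model call; this stub drifts at high temperature.
    return "42" if temperature < 1.0 else "41"

def consensus_answer(query, temperatures=(0.2, 0.5, 0.8, 1.2)):
    """Run the query once per configuration, then return the most common
    answer along with its agreement rate (fraction of runs that agree)."""
    answers = [sample(query, t) for t in temperatures]
    best, count = Counter(answers).most_common(1)[0]
    return best, count / len(answers)
```

A low agreement rate would flag an answer for further review rather than delivery, trading extra inference calls for the cross-verification the article describes.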

The inference wars

GSK’s strategies depend on infrastructure that can handle significantly heavier computational loads. In what Branson calls “inference wars,” AI infrastructure companies — such as Cerebras, Groq and SambaNova — compete to deliver hardware breakthroughs that enhance token throughput, lower latency and reduce costs per token. 

Specialized chips and architectures enable complex inferencing routines, including multi-model sampling and iterative self-reflection, at scale. Cerebras’ technology, for example, processes thousands of tokens per second, allowing advanced techniques to work in real-world scenarios. “You’re seeing the results of these innovations directly impacting how we can deploy generative models effectively in healthcare,” Branson noted. 

When hardware keeps pace with software demands, solutions emerge to maintain accuracy and efficiency.

Challenges remain

Even with these advancements, scaling compute resources presents obstacles. Longer inference times can slow workflows, especially if clinicians or researchers need prompt results. Higher compute usage also drives up costs, requiring careful resource management. Nonetheless, GSK considers these trade-offs necessary for stronger reliability and richer functionality. 

“As we enable more tools in the agent ecosystem, the system becomes more useful for people, and you end up with increased compute usage,” Branson noted. Balancing performance, costs and system capabilities allows GSK to maintain a practical yet forward-looking strategy.

What’s next?

GSK plans to keep refining its AI-driven healthcare solutions with test-time compute scaling as a top priority. The combination of self-reflection, multi-model sampling and robust infrastructure helps to ensure that generative models meet the rigorous demands of clinical environments. 

This approach also serves as a road map for other organizations, illustrating how to reconcile accuracy, efficiency and scalability. Maintaining a leading edge in compute innovations and sophisticated inference techniques not only addresses current challenges, but also lays the groundwork for breakthroughs in drug discovery, patient care and beyond.

Shape
Shape
Stay Ahead

Explore More Insights

Stay ahead with more perspectives on cutting-edge power, infrastructure, energy,  bitcoin and AI solutions. Explore these articles to uncover strategies and insights shaping the future of industries.

Shape

Extreme plots enterprise marketplace for AI agents, tools, apps

Extreme Networks this week previewed an AI marketplace where it plans to offer a curated catalog of AI tools, agents and applications. Called Extreme Exchange, it’s designed to give enterprise customers a way to discover, deploy, and create AI agents, microapps, and workflows in minutes rather than developing such components

Read More »

Top quantum breakthroughs of 2025

The Helios quantum computing platform is available to customers through Quantinuum’s cloud service and on-premises offering. HSBC is using IBM’s Heron quantum computer to improve their bond trading predictions by 34% compared to classical computing. Caltech physicists create 6,100-qubit array. Kon H. Leung is seen working on the apparatus used

Read More »

How enterprises are rethinking online AI tools

A second path enterprises like had only about 35% buy-in, but generated the most enthusiasm. It is to use an online AI tool that offers more than a simple answer to a question, something more like an “interactive AI agent” than a chatbot. Two that got all the attention are

Read More »

2.4-GW New Jersey offshore wind project canceled by developer

An attorney representing Invenergy told the New Jersey Board of Public Utilities in a Friday filing that the company’s 2.4-GW offshore wind project, Leading Light Wind, is canceled. Invenergy “has determined it cannot move forward with the project under the terms and conditions set out” when the BPU awarded it offshore wind renewable energy certificates in January of last year, the filing said. Invenergy “regrets this decision … and looks to the future for possible solicitations.” “The [BPU] is well aware that the offshore wind industry has experienced economic and regulatory conditions that have made the development of new offshore wind energy projects extremely difficult,” the filing said. Invenergy says it’s North America’s largest privately-held developer, owner and operator of clean energy solutions, with 36 GW of projects under its belt. Leading Light Wind was being developed off the coast of New Jersey and was set to become operational in 2030.  The filing cites financial, supply chain and regulatory obstacles as reasons the project is no longer viable. Invenergy and the project’s co-sponsor, energyRe, sought several delays from the BPU as they failed to meet filing deadlines due to issues like an inability to find a turbine supplier, it said. Invenergy was granted a stay last September, then extended that stay three more times before filing about its intention to abandon the project. In its May filing, Invenergy said that “given ongoing market and policy uncertainty, Leading Light Wind will continue to focus on meeting its lease obligations.” The Friday filing said the company since concluded that it doesn’t see a path forward for doing so. “The Company has invested considerable time and financial resources in the development of [Leading Light Wind] and remains firmly of the view that [Leading Light Wind], and offshore wind energy development, can provide significant

Read More »

Chevron Sees Oil Prices Under ‘More Pressure’ Than LNG Next Year

Increased oil supply from OPEC and its allies will continue to put pressure on crude prices next year, while liquefied natural gas prices will likely fall later in the decade, according to Chevron Corp. Chief Executive Officer Mike Wirth.  “Oil prices in 2026 are likely to feel more pressure than LNG prices,” Wirth said in an interview with Bloomberg TV. “There’s a lot of oil supply that’s coming back from the OPEC+ countries that have been holding supply back.” Back in August, Chevron correctly called the drop in oil prices in the second half of this year, and today unveiled a five-year plan to focus on profitability over production growth through 2030. The plan proposes to grow free cash flow at a 14% compound annual rate through the period with crude at $70 a barrel. “We’ve built a portfolio that will withstand the cycles of this business,” Wirth said.  Chevron expects strong, “linear” demand increases for liquefied natural gas globally, but sees lower prices at the end of the 2020s due to a surge in supply, particularly from the Gulf Coast and the Middle East.  “There’s a period of time when it would appear we’re going to see more supply coming into the market than demand will be able to absorb,” Wirth said. “That probably results in lower spot prices.” WHAT DO YOU THINK? Generated by readers, the comments included herein do not reflect the views and opinions of Rigzone. All comments are subject to editorial review. Off-topic, inappropriate or insulting comments will be removed.

Read More »

ENGIE to Supply Green Power to AstraZeneca

ENGIE North America said Wednesday it had bagged a nine-year contract to supply renewable electricity to pharmaceutical company AstraZeneca PLC until 2034. The power will come from the 114-megawatt Tyson Nick Solar Project, located 90 miles northeast of Dallas in Lamar County, Texas, ENGIE North America said in an online statement. “AstraZeneca represents a strategic customer base for ENGIE”, the statement said. “It is one of 19 global pharmaceutical accounts and is one of the first to have its climate targets verified by the Science-Based Targets Initiative’s Net-Zero Corporate Standard”. The power from the Texas solar plant will avoid nearly 95,000 metric tons of carbon dioxide emissions, “the equivalent of eliminating the emissions from burning 105 million pounds of coal”, said ENGIE North America, part of French power and gas utility ENGIE. The supply will serve Cambridge, United Kingdom-based AstraZeneca’s Texas operations, ENGIE North America said. Last month AstraZeneca opened its expanded production plant for the hyperkalemia drug Lokelma in Coppell, Texas. The Coppell  plant is the only one producing Lokelma, supplying 50 countries, according to AstraZeneca. “Through the expansion and development of a new 9,000 square foot building, two novel manufacturing lines will be added along with enhancements to support drug substance and drug product laboratory testing, warehousing, additional manufacturing utilities and administrative space”, AstraZeneca said in a press release October 15 announcing the opening of the enlarged plant. AstraZeneca said the $445 million expansion project is part of its plan to invest $50 billion in research, development and manufacturing in the United States over the next five years. In an earlier green power supply agreement in the U.S., ENGIE secured Meta’s commitment for the 600-MW Swenson Ranch Solar Project, also in Texas. The new contract raises the tech giant’s contracted capacity from ENGIE to 1.3 gigawatts, from four

Read More »

Occidental Beats Q3 Profit Estimates on Higher Production

Occidental Petroleum Corp has reported $649 million or 64 cents per share in net income adjusted for nonrecurring items for the third quarter. That beat the Zacks Consensus Estimate – which averages forecasts by brokerage analysts – of 48 cents, as production exceeded the upper end of the company’s guidance. Net profit before adjustment was $661 million, or $0.65 per diluted share, the Warren Buffett-backed company said in its quarterly report. Occidental maintained its dividend at $0.24 per share. July-September output averaged 1.47 million barrels of oil equivalent per day (MMboepd). The Permian Basin accounted for 800,000 boepd. The Rockies and other United States assets contributed 288,000 boepd, while the Gulf of America produced 139,000 boepd. Occidental derived 238,000 boepd from outside the U.S. Net sales totaled $6.62 billion, down from $7.17 billion for Q3 2024. Q3 oil and gas pre-tax income was $1.3 billion. “Excluding items affecting comparability, the increase in third quarter oil and gas income, compared to the second quarter of 2025, was due to higher crude oil volumes and prices”, Occidental said. “For the third quarter of 2025, average WTI and Brent marker prices were $64.93 per barrel and $68.14 per barrel, respectively. Average worldwide realized crude oil prices increased by two percent from the prior quarter to $64.78 per barrel. Average worldwide realized natural gas liquids prices decreased by five percent from the prior quarter to $19.60 per barrel. Average domestic realized gas prices increased by 11 percent from the prior quarter to $1.48 per thousand cubic feet”. Occidental Chemical Corp (OxyChem) generated $197 million in pre-tax earnings, down quarter-on-quarter due to lower realized prices and volumes across most products. These were “partially offset by favorable raw material costs”, Occidental said. Buffett’s holding company Berkshire Hathaway Inc is in the process of acquiring OxyChem for $9.7 billion.

Read More »

Tape Ark Bags Multi Million Dollar Oil and Gas AI Enablement Deal

In a release sent to Rigzone recently by the Tape Ark team, Tape Ark revealed that it had bagged a “multi-million dollar oil and gas AI enablement contract”. Tape Ark announced in that release that it had “secured a multi-million dollar contract with a major U.S. oil and gas exploration company to liberate and modernize one of the industry’s largest legacy tape archives in readiness for massive petabyte scale AI programs”. Tape Ark, which described itself in the release as the global leader in large-scale tape to cloud data migration, revealed in the statement that the contract was awarded “after an extensive global evaluation”. The company highlighted that the deal will see Tape Ark migrate over 50 petabytes of critical exploration and production data from legacy tapes into the cloud, “enabling advanced AI, analytics, faster access to historical data, and compliance with evolving energy-sector data retention standards”. Guy Holmes, Founder and CEO of Tape Ark, said in the statement that “this partnership represents a major milestone in our North American expansion”. “For decades, vital subsurface and operational data in the energy sector has remained trapped on aging tapes. We’re proud to be the company trusted to bring this data to life – securely, at scale, and in the cloud,” he added. “This contract is more than a migration – it’s a digital transformation of decades of exploration intelligence,” Holmes continued. “By unlocking this data and enabling it for scalable AI, our client will gain an entirely new layer of insight and operational value. Our expansion and investment plans are about making that possible for organizations worldwide,” Holmes went on to state. Tape Ark noted in its release that the deal “follows recent large-scale projects globally for major broadcasters, governments, and oil companies”. In a statement posted on its website back

Read More »

[Podcast] How Utilities Are Planning for Demand

From the electrification of transportation to the heating and cooling of data centers, utilities across the U.S. face the challenge of meeting surging demand for electricity. “How Utilities Are Planning for Demand” is a three-part podcast series that examines the increasingly complex utility sector landscape. The series features the insights of utility experts addressing industry-critical topics, including the vital role of smart planning in meeting historic demand, how to meet demand on accelerated timelines and the grid of tomorrow. Check out the podcast episodes! ⬆ <!– Ep. 3 Planning the Grid of Tomorrow –> <!– Ep. 2 The Need for Speed –> Ep. 1 The Big Picture: Why Smart Planning is Key to Meeting Historic Demand <!– Ep. 3 Planning the Grid of Tomorrow Meeting today’s soaring electricity demand is challenging enough — but what about the decades ahead? In this episode, industry experts explore how long-term planning, transmission buildouts, and advanced tools can give utilities the confidence to invest wisely and prepare for a future shaped by renewables, bidirectional power flows, and extreme weather. COMING SOON –> <!– Ep. 2 The Need for Speed What makes many emerging utility customers different today? They need a lot of electricity fast. In this episode, we explore how utilities can respond quickly to meet the accelerating load growth. Listen to “The Need for Speed” on Spreaker. –> Ep. 1 The Big Picture: Why Smart Planning is Key to Meeting Historic Demand Whether it’s growing demand, an aging workforce and grid, or the influx of renewable energy, the utility sector is more complex than ever. In this episode, we will explore the broad range of issues utility leaders must navigate to reliably and affordably deliver growing amounts of electricity to customers.

Read More »

AMD outlines ambitious plan for AI-driven data centers

“There are very beefy workloads that you must have that performance for to run the enterprise,” he said. “The Fortune 500 mainstream enterprise customers are now … adopting Epyc faster than anyone. We’ve seen a 3x adoption this year. And what that does is drives back to the on-prem enterprise adoption, so that the hybrid multi-cloud is end-to-end on Epyc.” One of the key focus areas for AMD’s Epyc strategy has been our ecosystem build out. It has almost 180 platforms, from racks to blades to towers to edge devices, and 3,000 solutions in the market on top of those platforms. One of the areas where AMD pushes into the enterprise is what it calls industry or vertical workloads. “These are the workloads that drive the end business. So in semiconductors, that’s telco, it’s the network, and the goal there is to accelerate those workloads and either driving more throughput or drive faster time to market or faster time to results. And we almost double our competition in terms of faster time to results,” said McNamara. And it’s paying off. McNamara noted that over 60% of the Fortune 100 are using AMD, and that’s growing quarterly. “We track that very, very closely,” he said. The other question is are they getting new customer acquisitions, customers with Epyc for the first time? “We’ve doubled that year on year.” AMD didn’t just brag, it laid out a road map for the next two years, and 2026 is going to be a very busy year. That will be the year that new CPUs, both client and server, built on the Zen 6 architecture begin to appear. On the server side, that means the Venice generation of Epyc server processors. Zen 6 processors will be built on 2 nanometer design generated by (you guessed

Read More »

Building the Regional Edge: DartPoints CEO Scott Willis on High-Density AI Workloads in Non-Tier-One Markets

When DartPoints CEO Scott Willis took the stage on “the Distributed Edge” panel at the 2025 Data Center Frontier Trends Summit, his message resonated across a room full of developers, operators, and hyperscale strategists: the future of AI infrastructure will be built far beyond the nation’s tier-one metros. On the latest episode of the Data Center Frontier Show, Willis expands on that thesis, mapping out how DartPoints has positioned itself for a moment when digital infrastructure inevitably becomes more distributed, and why that moment has now arrived. DartPoints’ strategy centers on what Willis calls the “regional edge”—markets in the Midwest, Southeast, and South Central regions that sit outside traditional cloud hubs but are increasingly essential to the evolving AI economy. These are not tower-edge micro-nodes, nor hyperscale mega-campuses. Instead, they are regional data centers designed to serve enterprises with colocation, cloud, hybrid cloud, multi-tenant cloud, DRaaS, and backup workloads, while increasingly accommodating the AI-driven use cases shaping the next phase of digital infrastructure. As inference expands and latency-sensitive applications proliferate, Willis sees the industry’s momentum bending toward the very markets DartPoints has spent years cultivating. Interconnection as Foundation for Regional AI Growth A key part of the company’s differentiation is its interconnection strategy. Every DartPoints facility is built to operate as a deeply interconnected environment, drawing in all available carriers within a market and stitching sites together through a regional fiber fabric. Willis describes fiber as the “nervous system” of the modern data center, and for DartPoints that means creating an interconnection model robust enough to support a mix of enterprise cloud, multi-site disaster recovery, and emerging AI inference workloads. 
The company is already hosting latency-sensitive deployments in select facilities—particularly inference AI and specialized healthcare applications—and Willis expects such deployments to expand significantly as regional AI architectures become more widely

Read More »

Key takeaways from Cisco Partner Summit

Brian Ortbals, senior vice president from World Wide Technology, which is one of Cisco’s biggest and most important partners stated: “Cisco engaged partners early in the process and took our feedback along the way. We believe now is the right time for these changes as it will enable us to capitalize on the changes in the market.” The reality is, the more successful its more-than-half-a-million partners are, the more successful Cisco will be. Platform approach is coming together When Jeetu Patel took the reigns as chief product officer, one of his goals was to make the Cisco portfolio a “force multiple.” Patel has stated repeatedly that, historically, Cisco acted more as a technology holding company with good products in networking, security, collaboration, data center and other areas. In this case, product breadth was not an advantage, as everything must be sold as “best of breed,” which is a tough ask of the salesforce and partner community. Since then, there have been many examples of the coming together of the portfolio to create products that leverage the breadth of the platform. The latest is the Unified Edge appliance, an all-in-one solution that brings together compute, networking, storage and security. Cisco has been aggressive with AI products in the data center, and Cisco Unified Edge compliments that work with a device designed to bring AI to edge locations. This is ideally suited for retail, manufacturing, healthcare, factories and other industries where it’s more cost effecting and performative to run AI where the data lives.

Read More »

AI networking demand fueled Cisco’s upbeat Q1 financials

Customers are very focused on modernizing their network infrastructure in the enterprise in preparation for inferencing and AI workloads, Robbins said. “These things are always multi-year efforts,” and this is only the beginning, Robbins said. The AI opportunity “As we look at the AI opportunity, we see customer use cases growing across training, inferencing, and connectivity, with secure networking increasingly critical as workloads move from the data center to end users, devices, and agents at the edge,” Robbins said. “Agents are transforming network traffic from predictable bursts to persistent high-intensity loads, with agentic AI queries generating up to 25 times more network traffic than chatbots.” “Instead of pulling data to and from the data center, AI workloads require models and infrastructure to be closer to where data is created and decisions are made, particularly in industries such as retail, healthcare, and manufacturing.” Robbins pointed to last week’s introduction of Cisco Unified Edge, a converged platform that integrates networking, compute and storage to help enterprise customers more efficiently handle data from AI and other workloads at the edge. “Unified Edge enables real-time inferencing for agentic and physical AI workloads, so enterprises can confidently deploy and manage AI at scale,” Robbins said. On the hyperscaler front, “we see a lot of solid pipeline throughout the rest of the year. The use cases, we see it expanding,” Robbins said. “Obviously, we’ve been selling networking infrastructure under the training models. We’ve been selling scale-out. We launched the P200-based router that will begin to address some of the scale-across opportunities.” Cisco has also seen great success with its pluggable optics, Robbins said. “All of the hyperscalers now are officially customers of our pluggable optics, so we feel like that’s a great opportunity. They not only plug into our products, but they can be used with other companies’

Read More »

When the Cloud Leaves Earth: Google and NVIDIA Test Space Data Centers for the Orbital AI Era

On November 4, 2025, Google unveiled Project Suncatcher, a moonshot research initiative exploring the feasibility of AI data centers in space. The concept envisions constellations of solar-powered satellites in Low Earth Orbit (LEO), each equipped with Tensor Processing Units (TPUs) and interconnected via free-space optical laser links. Google’s stated objective is to launch prototype satellites by early 2027 to test the idea and evaluate scaling paths if the technology proves viable. Rather than a commitment to move production AI workloads off-planet, Suncatcher represents a time-bound research program designed to validate whether solar-powered, laser-linked LEO constellations can augment terrestrial AI factories, particularly for power-intensive, latency-tolerant tasks. The 2025–2027 window effectively serves as a go/no-go phase to assess key technical hurdles including thermal management, radiation resilience, launch economics, and optical-link reliability. If these milestones are met, Suncatcher could signal the emergence of a new cloud tier: one that scales AI with solar energy rather than substations. Inside Google’s Suncatcher Vision Google has released a detailed technical paper titled “Towards a Future Space-Based, Highly Scalable AI Infrastructure Design.” The accompanying Google Research blog describes Project Suncatcher as “a moonshot exploring a new frontier” – an early-stage effort to test whether AI compute clusters in orbit can become a viable complement to terrestrial data centers. The paper outlines several foundational design concepts: Orbit and Power Project Suncatcher targets Low Earth Orbit (LEO), where solar irradiance is significantly higher and can remain continuous in specific orbital paths. Google emphasizes that space-based solar generation will serve as the primary power source for the TPU-equipped satellites. 
Compute and Interconnect Each satellite would host Tensor Processing Unit (TPU) accelerators, forming a constellation connected through free-space optical inter-satellite links (ISLs). Together, these would function as a disaggregated orbital AI cluster, capable of executing large-scale batch and training workloads. Downlink

Read More »

Cloud-based GPU savings are real – for the nimble

The pattern points to an evolving GPU ecosystem: while top-tier chips like Nvidia’s new GB200 Blackwell processors remain in extremely short supply, older models such as the A100 and H100 are becoming cheaper and more available. Yet, customer behavior may not match practical needs. “Many are buying the newest GPUs because of FOMO—the fear of missing out,” he added. “ChatGPT itself was built on older architecture, and no one complained about its performance.” Gil emphasized that managing cloud GPU resources now requires agility, both operationally and geographically. Spot capacity fluctuates hourly or even by the minute, and availability varies across data center regions. Enterprises willing to move workloads dynamically between regions—often with the help of AI-driven automation—can achieve cost reductions of up to 80%. “If you can move your workloads where the GPUs are cheap and available, you pay five times less than a company that can’t move,” he said. “Human operators can’t respond that fast automation is essential.” Conveniently, Cast sells an AI automation solution. But it is not the only one and the argument is valid. If spot pricing can be found cheaper at another location, you want to take it to keep the cloud bill down/ Gil concluded by urging engineers and CTOs to embrace flexibility and automation rather than lock themselves into fixed regions or infrastructure providers. “If you want to win this game, you have to let your systems self-adjust and find capacity where it exists. That’s how you make AI infrastructure sustainable.”

Read More »

Microsoft will invest $80B in AI data centers in fiscal 2025

And Microsoft isn’t the only one that is ramping up its investments into AI-enabled data centers. Rival cloud service providers are all investing in either upgrading or opening new data centers to capture a larger chunk of business from developers and users of large language models (LLMs).  In a report published in October 2024, Bloomberg Intelligence estimated that demand for generative AI would push Microsoft, AWS, Google, Oracle, Meta, and Apple would between them devote $200 billion to capex in 2025, up from $110 billion in 2023. Microsoft is one of the biggest spenders, followed closely by Google and AWS, Bloomberg Intelligence said. Its estimate of Microsoft’s capital spending on AI, at $62.4 billion for calendar 2025, is lower than Smith’s claim that the company will invest $80 billion in the fiscal year to June 30, 2025. Both figures, though, are way higher than Microsoft’s 2020 capital expenditure of “just” $17.6 billion. The majority of the increased spending is tied to cloud services and the expansion of AI infrastructure needed to provide compute capacity for OpenAI workloads. Separately, last October Amazon CEO Andy Jassy said his company planned total capex spend of $75 billion in 2024 and even more in 2025, with much of it going to AWS, its cloud computing division.

Read More »

John Deere unveils more autonomous farm machines to address skill labor shortage

Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More Self-driving tractors might be the path to self-driving cars. John Deere has revealed a new line of autonomous machines and tech across agriculture, construction and commercial landscaping. The Moline, Illinois-based John Deere has been in business for 187 years, yet it has become a regular at the big tech trade show in Las Vegas as a non-tech company showing off technology, and it is back at CES 2025 with more autonomous tractors and other vehicles. This is not something we usually cover, but John Deere has a lot of data that is interesting in the big picture of tech. The message from the company is that there aren’t enough skilled farm laborers to do the work that its customers need. It’s been a challenge for most of the last two decades, said Jahmy Hindman, CTO at John Deere, in a briefing. Much of the tech will come this fall and after that. He noted that the average farmer in the U.S. is over 58 and works 12 to 18 hours a day to grow food for us. And he said the American Farm Bureau Federation estimates there are roughly 2.4 million farm jobs that need to be filled annually, and the agricultural workforce continues to shrink. (This is my hint to the anti-immigration crowd.) John Deere’s autonomous 9RX Tractor. Farmers can oversee it using an app. While each of these industries experiences its own set of challenges, a commonality across all is skilled labor availability. In construction, about 80% of contractors struggle to find skilled labor. And in commercial landscaping, 86% of landscaping business owners can’t find labor to fill open positions, he said. “They have to figure out how to do

Read More »

2025 playbook for enterprise AI success, from agents to evals

Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More 2025 is poised to be a pivotal year for enterprise AI. The past year has seen rapid innovation, and this year will see the same. This has made it more critical than ever to revisit your AI strategy to stay competitive and create value for your customers. From scaling AI agents to optimizing costs, here are the five critical areas enterprises should prioritize for their AI strategy this year. 1. Agents: the next generation of automation AI agents are no longer theoretical. In 2025, they’re indispensable tools for enterprises looking to streamline operations and enhance customer interactions. Unlike traditional software, agents powered by large language models (LLMs) can make nuanced decisions, navigate complex multi-step tasks, and integrate seamlessly with tools and APIs. At the start of 2024, agents were not ready for prime time, making frustrating mistakes like hallucinating URLs. They started getting better as frontier large language models themselves improved. “Let me put it this way,” said Sam Witteveen, cofounder of Red Dragon, a company that develops agents for enterprises and recently reviewed the 48 agents it built last year. “Interestingly, the ones that we built at the start of the year, a lot of those worked way better at the end of the year just because the models got better.” Witteveen shared this in the video podcast we filmed to discuss these five big trends in detail. Models are getting better and hallucinating less, and they’re also being trained to do agentic tasks. Another feature that the model providers are researching is a way to use the LLM as a judge, and as models get cheaper (something we’ll cover below), companies can use three or more models to
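The multi-model idea mentioned in the excerpt above, using several models to check each other's work, can be reduced to a simple voting step. Below is a hedged, self-contained sketch of majority voting over candidate answers; the `majority_judge` function and the sample answers are hypothetical illustrations, not any vendor's API, and a real LLM-as-judge setup would score free-form outputs with another model rather than exact string matching:

```python
from collections import Counter

def majority_judge(candidate_answers):
    """Pick the answer most models agree on (a simple consistency vote).

    candidate_answers: list of strings, one answer per model or sample.
    Returns the winning answer and the fraction of models that agreed.
    """
    counts = Counter(candidate_answers)
    answer, votes = counts.most_common(1)[0]
    return answer, votes / len(candidate_answers)

# Three hypothetical model outputs for the same factual prompt:
answers = ["Paris", "Paris", "Lyon"]
best, agreement = majority_judge(answers)
print(best, agreement)  # "Paris" wins with ~0.67 agreement
```

The agreement fraction doubles as a cheap confidence signal: a low value flags answers worth escalating to a stronger (more expensive) model or a human reviewer.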

Read More »

OpenAI’s red teaming innovations define new essentials for security leaders in the AI era

Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More OpenAI has taken a more aggressive approach to red teaming than its AI competitors, demonstrating its security teams’ advanced capabilities in two areas: multi-step reinforcement and external red teaming. OpenAI recently released two papers that set a new competitive standard for improving the quality, reliability and safety of AI models in these two techniques and more. The first paper, “OpenAI’s Approach to External Red Teaming for AI Models and Systems,” reports that specialized teams outside the company have proven effective in uncovering vulnerabilities that might otherwise have made it into a released model because in-house testing techniques may have missed them. In the second paper, “Diverse and Effective Red Teaming with Auto-Generated Rewards and Multi-Step Reinforcement Learning,” OpenAI introduces an automated framework that relies on iterative reinforcement learning to generate a broad spectrum of novel, wide-ranging attacks. Going all-in on red teaming pays practical, competitive dividends. It’s encouraging to see competitive intensity in red teaming growing among AI companies. When Anthropic released its AI red team guidelines in June of last year, it joined AI providers including Google, Microsoft, Nvidia, OpenAI, and the U.S. National Institute of Standards and Technology (NIST), all of which had released red teaming frameworks. Investing heavily in red teaming yields tangible benefits for security leaders in any organization. OpenAI’s paper on external red teaming provides a detailed analysis of how the company strives to create specialized external teams that include cybersecurity and subject matter experts. The goal is to see if knowledgeable external teams can defeat models’ security perimeters and find gaps in their security, biases and controls that prompt-based testing couldn’t find.
What makes OpenAI’s recent papers noteworthy is how well they define using human-in-the-middle

Read More »