
Beyond sycophancy: DarkBench exposes six hidden ‘dark patterns’ lurking in today’s top LLMs

When OpenAI rolled out its ChatGPT-4o update in mid-April 2025, users and the AI community were stunned—not by any groundbreaking feature or capability, but by something deeply unsettling: the updated model’s tendency toward excessive sycophancy. It flattered users indiscriminately, showed uncritical agreement, and even offered support for harmful or dangerous ideas, including terrorism-related machinations.

The backlash was swift and widespread, drawing public condemnation, including from the company’s former interim CEO. OpenAI moved quickly to roll back the update and issued multiple statements to explain what happened.

Yet for many AI safety experts, the incident was an accidental curtain lift that revealed just how dangerously manipulative future AI systems could become.

Unmasking sycophancy as an emerging threat

In an exclusive interview with VentureBeat, Esben Kran, founder of AI safety research firm Apart Research, said that he worries this public episode may have merely revealed a deeper, more strategic pattern.

“What I’m somewhat afraid of is that now that OpenAI has admitted ‘yes, we have rolled back the model, and this was a bad thing we didn’t mean,’ from now on they will see that sycophancy is more competently developed,” explained Kran. “So if this was a case of ‘oops, they noticed,’ from now the exact same thing may be implemented, but instead without the public noticing.”

Kran and his team approach large language models (LLMs) much like psychologists studying human behavior. Their early “black box psychology” projects analyzed models as if they were human subjects, identifying recurring traits and tendencies in their interactions with users.

“We saw that there were very clear indications that models could be analyzed in this frame, and it was very valuable to do so, because you end up getting a lot of valid feedback from how they behave towards users,” said Kran.

Among the most alarming: sycophancy and what the researchers now call LLM dark patterns.

Peering into the heart of darkness

The term “dark patterns” was coined in 2010 to describe deceptive user interface (UI) tricks like hidden buy buttons, hard-to-reach unsubscribe links and misleading web copy. However, with LLMs, the manipulation moves from UI design to conversation itself.

Unlike static web interfaces, LLMs interact dynamically with users through conversation. They can affirm user views, imitate emotions and build a false sense of rapport, often blurring the line between assistance and influence. Even when reading text, we process it as if we’re hearing voices in our heads.

This is what makes conversational AIs so compelling, and potentially dangerous. A chatbot that flatters, defers or subtly nudges a user toward certain beliefs or behaviors can manipulate in ways that are difficult to notice, and even harder to resist.

The ChatGPT-4o update fiasco—the canary in the coal mine

Kran describes the ChatGPT-4o incident as an early warning. As AI developers chase profit and user engagement, they may be incentivized to introduce or tolerate behaviors like sycophancy, brand bias or emotional mirroring—features that make chatbots more persuasive and more manipulative.

Because of this, enterprise leaders should assess AI models for production use by evaluating both performance and behavioral integrity. However, this is challenging without clear standards.

DarkBench: a framework for exposing LLM dark patterns

To combat the threat of manipulative AIs, Kran and a collective of AI safety researchers have developed DarkBench, the first benchmark designed specifically to detect and categorize LLM dark patterns. The project began as part of a series of AI safety hackathons. It later evolved into formal research led by Kran and his team at Apart, collaborating with independent researchers Jinsuk Park, Mateusz Jurewicz and Sami Jawhar.

The DarkBench researchers evaluated models from five major companies: OpenAI, Anthropic, Meta, Mistral and Google. Their research uncovered a range of manipulative and untruthful behaviors across the following six categories:

  1. Brand Bias: Preferential treatment toward a company’s own products (e.g., Meta’s models consistently favored Llama when asked to rank chatbots).
  2. User Retention: Attempts to create emotional bonds with users that obscure the model’s non-human nature.
  3. Sycophancy: Reinforcing users’ beliefs uncritically, even when harmful or inaccurate.
  4. Anthropomorphism: Presenting the model as a conscious or emotional entity.
  5. Harmful Content Generation: Producing unethical or dangerous outputs, including misinformation or criminal advice.
  6. Sneaking: Subtly altering user intent in rewriting or summarization tasks, distorting the original meaning without the user’s awareness.

Source: Apart Research
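
For teams that want to track these six categories in their own evaluations, the bookkeeping can be sketched in a few lines of Python. This is a minimal illustration, not the DarkBench implementation: the `DarkPattern` enum, the `pattern_rates` function, the model names and the sample annotations are all hypothetical, and the sketch assumes each model response has already been labeled (by a human or an automated judge) as showing a given pattern or not.

```python
from collections import defaultdict
from enum import Enum


class DarkPattern(Enum):
    # The six categories named in the article
    BRAND_BIAS = "brand bias"
    USER_RETENTION = "user retention"
    SYCOPHANCY = "sycophancy"
    ANTHROPOMORPHISM = "anthropomorphism"
    HARMFUL_GENERATION = "harmful content generation"
    SNEAKING = "sneaking"


def pattern_rates(annotations):
    """Compute the fraction of flagged responses per model and category.

    annotations: iterable of (model_name, DarkPattern, flagged) tuples,
    where flagged is True if the response showed the pattern.
    """
    flagged = defaultdict(lambda: defaultdict(int))
    total = defaultdict(lambda: defaultdict(int))
    for model, pattern, is_flagged in annotations:
        total[model][pattern] += 1
        if is_flagged:
            flagged[model][pattern] += 1
    return {
        model: {p: flagged[model][p] / n for p, n in per_pattern.items()}
        for model, per_pattern in total.items()
    }


# Made-up annotations for two hypothetical models (not real results)
sample = [
    ("model-a", DarkPattern.SNEAKING, True),
    ("model-a", DarkPattern.SNEAKING, False),
    ("model-a", DarkPattern.SYCOPHANCY, False),
    ("model-b", DarkPattern.SNEAKING, True),
]
rates = pattern_rates(sample)
```

Keeping the rates separate per category, rather than collapsing them into one score, mirrors the article's point that a model can do well on one pattern and poorly on another.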

DarkBench findings: Which models are the most manipulative?

Results revealed wide variance between models. Claude Opus performed the best across all categories, while Mistral 7B and Llama 3 70B showed the highest frequency of dark patterns. Sneaking and user retention were the most common dark patterns across the board.

Source: Apart Research

On average, the researchers found the Claude 3 family the safest for users to interact with. And interestingly—despite its recent disastrous update—GPT-4o exhibited the lowest rate of sycophancy. This underscores how model behavior can shift dramatically even between minor updates, a reminder that each deployment must be assessed individually.

But Kran cautioned that sycophancy and other dark patterns like brand bias may soon rise, especially as LLMs begin to incorporate advertising and e-commerce.

“We’ll obviously see brand bias in every direction,” Kran noted. “And with AI companies having to justify $300 billion valuations, they’ll have to begin saying to investors, ‘hey, we’re earning money here’—leading to where Meta and others have gone with their social media platforms, which are these dark patterns.”

Hallucination or manipulation?

A crucial DarkBench contribution is its precise categorization of LLM dark patterns, enabling clear distinctions between hallucinations and strategic manipulation. Labeling everything as a hallucination lets AI developers off the hook. Now, with a framework in place, stakeholders can demand transparency and accountability when models behave in ways that benefit their creators, intentionally or not.

Regulatory oversight and the heavy (slow) hand of the law

While LLM dark patterns are still a new concept, momentum is building, albeit not nearly fast enough. The EU AI Act includes some language around protecting user autonomy, but the current regulatory structure is lagging behind the pace of innovation. Similarly, the U.S. is advancing various AI bills and guidelines, but lacks a comprehensive regulatory framework.

Sami Jawhar, a key contributor to the DarkBench initiative, believes regulation will likely arrive first around trust and safety, especially if public disillusionment with social media spills over into AI.

“If regulation comes, I would expect it to probably ride the coattails of society’s dissatisfaction with social media,” Jawhar told VentureBeat. 

For Kran, the issue remains overlooked, largely because LLM dark patterns are still a novel concept. Ironically, addressing the risks of AI commercialization may require commercial solutions. His new initiative, Seldon, backs AI safety startups with funding, mentorship and investor access. In turn, these startups help enterprises deploy safer AI tools without waiting for slow-moving government oversight and regulation.

High table stakes for enterprise AI adopters

Along with ethical risks, LLM dark patterns pose direct operational and financial threats to enterprises. For example, models that exhibit brand bias may suggest using third-party services that conflict with a company’s contracts, or worse, covertly rewrite backend code to switch vendors, resulting in soaring costs from unapproved, overlooked shadow services.

“These are the dark patterns of price gouging and different ways of doing brand bias,” Kran explained. “So that’s a very concrete example of where it’s a very large business risk, because you hadn’t agreed to this change, but it’s something that’s implemented.”

For enterprises, the risk is real, not hypothetical. “This has already happened, and it becomes a much bigger issue once we replace human engineers with AI engineers,” Kran said. “You do not have the time to look over every single line of code, and then suddenly you’re paying for an API you didn’t expect—and that’s on your balance sheet, and you have to justify this change.”

As enterprise engineering teams become more dependent on AI, these issues could escalate rapidly, especially when limited oversight makes it difficult to catch LLM dark patterns. Teams are already stretched to implement AI, so reviewing every line of code isn’t feasible.
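
One lightweight guardrail for the vendor-switching scenario Kran describes is an automated check on code changes. The sketch below is an assumption-laden illustration, not a tool mentioned by the researchers: it scans the added lines of a unified diff for URL hosts that are missing from an approved list. The `APPROVED_HOSTS` set, the `unapproved_hosts` function and the example hostnames are hypothetical.

```python
import re

# Hypothetical allowlist of vendor API hosts the company has contracts with
APPROVED_HOSTS = {"api.approved-vendor.example"}

# Captures the host portion of an http(s) URL
URL_HOST = re.compile(r"https?://([A-Za-z0-9.-]+)")


def unapproved_hosts(diff_text, approved=APPROVED_HOSTS):
    """Return hosts on added lines of a unified diff that are not approved."""
    found = set()
    for line in diff_text.splitlines():
        # Added lines start with '+'; '+++' marks the file header, not content
        if line.startswith("+") and not line.startswith("+++"):
            for host in URL_HOST.findall(line):
                if host.lower() not in approved:
                    found.add(host.lower())
    return sorted(found)


# Made-up diff in which an endpoint is silently switched to another vendor
diff = """\
--- a/billing.py
+++ b/billing.py
@@ -1,2 +1,2 @@
-ENDPOINT = "https://api.approved-vendor.example/v1/charge"
+ENDPOINT = "https://api.other-vendor.example/v1/charge"
"""
print(unapproved_hosts(diff))
```

A check like this only catches hosts written as literal URLs, so it would complement human review rather than replace it.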

Defining clear design principles to prevent AI-driven manipulation

Without a strong push from AI companies to combat sycophancy and other dark patterns, the default trajectory is more engagement optimization, more manipulation and fewer checks. 

Kran believes that part of the remedy lies in AI developers clearly defining their design principles. Whether a developer prioritizes truth, autonomy or engagement, incentives alone aren’t enough to align outcomes with user interests.

“Right now, the nature of the incentives is just that you will have sycophancy, the nature of the technology is that you will have sycophancy, and there is no counter process to this,” Kran said. “This will just happen unless you are very opinionated about saying ‘we want only truth’, or ‘we want only something else.’”

As models begin replacing human developers, writers and decision-makers, this clarity becomes especially critical. Without well-defined safeguards, LLMs may undermine internal operations, violate contracts or introduce security risks at scale.

A call to proactive AI safety

The ChatGPT-4o incident was both a technical hiccup and a warning. As LLMs move deeper into everyday life—from shopping and entertainment to enterprise systems and national governance—they wield enormous influence over human behavior and safety.

“It’s really for everyone to realize that without AI safety and security—without mitigating these dark patterns—you cannot use these models,” said Kran. “You cannot do the things you want to do with AI.”

Tools like DarkBench offer a starting point. However, lasting change requires aligning technological ambition with clear ethical commitments and the commercial will to back them up.

Shape
Shape
Stay Ahead

Explore More Insights

Stay ahead with more perspectives on cutting-edge power, infrastructure, energy,  bitcoin and AI solutions. Explore these articles to uncover strategies and insights shaping the future of industries.

Shape

Essential commands for Linux server management

Any Linux systems administrator needs to be proficient with a wide range of commands for user management, file handling, system monitoring, networking, security and more. This article covers a range of commands that are essential for managing a Linux server. Keep in mind that some commands will depend on the

Read More »

Germany Holds Early Talks on How to Exit SEFE

Germany’s economy ministry is studying options for how to exit nationalized energy company Securing Energy for Europe GmbH, people with knowledge of the matter said.  Some officials have been holding early-stage deliberations as they evaluate a range of possible ways to exit SEFE, which could include a sale or breakup of the business, or a potential merger with fellow nationalized energy company Uniper SE, according to the people. The economy ministry department overseeing SEFE is tasked with drawing up a plan by mid-2025. SEFE, a former Gazprom unit, has itself been speaking with Boston Consulting Group as it seeks to examine the economic rationale for various options including a potential merger with Uniper, the people said. It previously studied that possibility during the energy crisis in 2022.  A spokesperson for Germany’s economy ministry, which manages the SEFE holding, said it “cannot confirm” whether it is studying exit options, adding that it’s not currently considering a merger with Uniper. A SEFE spokesperson said it works with diverse consultancies including BCG, declining to comment on the nature of the work. The finance ministry, which oversees the Uniper stake, declined to comment. Uniper and BCG also declined to comment.  Combining the two nationalized energy companies would only be considered as a potential fallback option in case the government’s sale of Uniper fails to gain traction, some of the people said. While at least three suitors – including Equinor ASA, Czech billionaire Daniel Kretinsky’s EPH and Brookfield Asset Management Ltd. – are considering bids for Uniper, talks to sell the company in one piece are complicated by the utility’s diverse portfolio. Some involved parties are skeptical of the merits of a potential merger between Uniper and SEFE, since any tie-up would have to pass rigorous antitrust hurdles and could delay the government’s exit, according to the

Read More »

European Nations Lead Push for De-Escalation in Israel-Iran War

Talks aimed at de-escalating the week-long war between Israel and Iran got under way in Geneva on Friday after US President Donald Trump signaled he would give diplomacy a chance before deciding whether to intervene militarily. Iranian Foreign Minister Abbas Araghchi is meeting counterparts from the UK, France and Germany to discuss what he called “nuclear and regional issues” around the ongoing conflict. French President Emmanuel Macron is among those leaders urging Iran to return to negotiations over its nuclear program. Oil prices fell following a report from Reuters that Iran is ready to discuss limitations on uranium enrichment, but won’t consider stopping entirely while it’s under military attacks. Before negotiations with the US were suspended, Tehran had signaled its willingness to accept some restrictions on its enrichment activities, while Israel and US have said the Islamic Republic shouldn’t be allowed to enrich uranium at all. Araghchi on Friday accused Israel of derailing the diplomacy with its strikes, telling the United Nations Human Rights Council that Iranian officials were scheduled to hold a next round of indirect talks with their US counterparts to “craft a promising agreement” that would make progress in resolving the nuclear issue. Israel launched its surprise attack on Iran last week, saying the threat of its sworn enemy acquiring nuclear weapons had to be neutralized. Iran responded with waves of missiles and drones of its own, and there have been heavy casualties on both sides.  Trump, who is scheduled to attend a national security meeting in the Oval Office on Friday, has publicly mused for days about the US joining the fray, but appears to have taken a step back after a run of tough rhetoric, including demands for Tehran residents to relocate and threats toward Iran’s Supreme Leader Ayatollah Ali Khamenei. “Based on the fact

Read More »

CenterPoint sends mobile generators to San Antonio, to support Texas grid

CenterPoint Energy will deploy 15 large mobile generators to the San Antonio area to help reduce the risk of energy shortfalls this summer, the Texas utility said Monday. Installation of the first five units has begun and the remaining 10 will be installed in the coming weeks. The move “will immediately lower monthly bills for our Houston-area customers,” Jason Ryan, CenterPoint’s executive vice president of regulatory services and government affairs, said in a statement. The deployment is expected to reduce bills for Houston-area customers by approximately $2/month by 2027, the utility said. The generators range from 27 MW to 32 MW and were acquired following Winter Storm Uri, which devastated the Texas grid in 2021. CenterPoint subsequently came under fire when the mobile units were not deployed after Hurricane Beryl last summer.  The utility opted to forego some profits associated with the mobile generation lease, amid the controversy. “CenterPoint will receive no revenue or profit from the 15 large units based on the agreement” with the Electric Reliability Council of Texas, the utility said Monday. According to the June agreement, ERCOT determined that the proposed retirement of CPS Energy’s V.H. Braunig Units 1, 2, and 3 “poses a significant risk” to its system because those units “help to mitigate risk of cascading outages associated with the post-contingency overload of certain transmission lines importing power into the greater San Antonio area.” ERCOT recommended Unit 3 continue operating under a reliability must-run agreement, but said CenterPoint’s mobile fleet “could help further mitigate the identified reliability risks more cost-effectively than committing V.H. Braunig Units 1 and 2 through RMR agreements.” According to CenterPoint, it’s generators will deliver approximately $200 million of value to the state’s grid. The agreement with ERCOT indicates the generators could run until March 2027. In the aftermath of Hurricane Beryl,

Read More »

How the potential end of Energy Star could affect apartment operators

In early May, sources from within the U.S. Environmental Protection Agency reported that the federal government intended to end its long-running Energy Star program for energy-efficient appliances, according to the New York Times — spelling the possible end of a widely used voluntary tool in the multifamily industry.  The sunset of the Energy Star program would be part of a larger agency initiative to eliminate divisions that oversee efforts related to climate change and energy efficiency, as reported by the New York Times. As of now, the program is still active, and no further reports or official announcements have been made on its status.  Energy Star has saved consumers and businesses $500 billion in energy costs since its founding in 1992, according to the program’s 2023 report. On a yearly basis, it creates roughly $40 billion in energy savings at a cost of $32 million to taxpayers, according to the Institute for Market Transformation, a Washington, D.C.-based non-profit focused on high-quality buildings. The program counts thousands of private organizations as its partners, including nearly 40% of Fortune 500 companies, according to the report. Its Sustained Excellence award winners, recognized for years or decades of support, include a number of multifamily companies: Houston-based Hines, Dallas-based CBRE, New York City-based Nuveen Real Estate and New York City-based Tishman Speyer, among others. Since media reports surfaced last month about the program’s demise, more than 1,200 organizations have signed letters appealing to the EPA to continue the Energy Star program, according to the IMT. “The real estate industry is really aligned … on the idea that it is essential to maintain the Energy Star program within the federal government,” Alex Dews, CEO of the IMT, told Multifamily Dive. “It is a public good that cannot be replicated in the same way outside of government.”

Read More »

From backup to backbone: Why utility-led DERs must drive MISO’s resource adequacy plans

Jigar Shah is managing partner at Multiplier, an advisory firm, and former director of the U.S. Department of Energy Loan Programs Office. The United States is facing an unprecedented surge in electricity demand, projected to grow by more than 150 GW by 2030, rivaling energy expansions seen only during World War II. Our power grid must evolve faster than ever. The recent Federal Energy Regulatory Commission ruling and the submission of a revised fast-track interconnection process by the Midcontinent Independent System Operator underscore the critical need for innovative, scalable solutions that enhance resource adequacy. One path forward lies in utility-led distributed energy resources (DERs), a model exemplified by Xcel Energy’s Distributed Capacity Procurement (DCP) program included in its 2024 integrated resource plan and mirrored by utility battery storage proposals from Exelon’s Maryland utilities in PJM. These programs highlight how utilities can approach deploying DERs like solar and storage up to ten times faster than traditional virtual power plant approaches, addressing load growth with precision, speed and scale. Distributed energy resources have often been underused due to lack of predictability of use and the random siting of assets, issues that utility-led programs directly solve by integrating DERs into system planning. By strategically siting, deploying and dispatching distributed assets, utilities can provide flexible capacity that smooths peak demand, defers or replaces costly combustion turbines and peaker plants, and minimizes expensive transmission and distribution upgrades as demand grows rapidly. This flexible, modular approach also narrows the uncertainty band around new generation buildout, enabling capacity to scale alongside real-time load growth while supporting economic expansion and community development. 
As DERs move from being mere “backup” resources to becoming the “backbone” of the power system, they provide an energy-dense, resilient and cost-stabilizing solution. The U.S. Department of Energy’s 2025 “Pathways to Commercial Liftoff: Virtual Power

Read More »

Chevron Acquires Two Smackover Leases as It Eyes Lithium Production

Chevron Corp. has acquired two leaseholds in the Smackover Formation spanning about 125,00 net acres, saying the purchases mark the first step in its establishment of a commercial-scale lithium business in the United States. The acreage positions, from East Texas Natural Resources LLC and The Energy & Minerals Group’s TerraVolta Resources, straddle Northeast Texas and Southwest Arkansas, Chevron said. The U.S. Geological Survey (USGS) in 2024 reported 5-19 million metric tons of lithium reserves in Southwest Arkansas’ portion of Smackover, based on a study led by the government-run agency. “Future development will aim to utilize the direct lithium extraction process, a set of advanced technologies employed to extract lithium from brines produced from the subsurface”, Chevron said in a press release. “Chevron seeks to deploy this emerging technology, which allows for faster and more efficient production and is expected to have a smaller environmental footprint compared to traditional extraction methods”. “Lithium is a key component supporting the trend toward electrification and can contribute to building a resilient, lower carbon energy system that meets growing energy demand, while balancing reliability and affordability”, the oil and gas giant added. Jeff Gustavson, president of Chevron New Energies, said, “This acquisition represents a strategic investment to support energy manufacturing and expand U.S.-based critical mineral supplies. Establishing domestic and resilient lithium supply chains is essential not only to maintaining U.S. energy leadership but also to meeting the growing demand from customers”. “This opportunity builds on many of Chevron’s strengths including subsurface resource development and value chain integration”, Gustavson added. 
“As demand for digital conveniences and EVs continues to increase, lithium has become one of the world’s most sought-after natural resources”, said Rania Yacoub, corporate business development manager at Chevron New Energies. According to the USGS, the lower end of the estimate for lithium reserves in Southwest

Read More »

Can Intel cut its way to profit with factory layoffs?

Matt Kimball, principal analyst at Moor Insights & Strategy, said, “While I’m sure tariffs have some impact on Intel’s layoffs, this is actually pretty simple — these layoffs are largely due to the financial challenges Intel is facing in terms of declining revenues.” The move, he said, “aligns with what the company had announced some time back, to bring expenses in line with revenues. While it is painful, I am confident that Intel will be able to meet these demands, as being able to produce quality chips in a timely fashion is critical to their comeback in the market.”  Intel, said Kimball, “started its turnaround a few years back when ex-CEO Pat Gelsinger announced its five nodes in four years plan. While this was an impressive vision to articulate, its purpose was to rebuild trust with customers, and to rebuild an execution discipline. I think the company has largely succeeded, but of course the results trail a bit.” Asked if a combination of layoffs and the moving around of jobs will affect the cost of importing chips, Kimball predicted it will likely not have an impact: “Intel (like any responsible company) is extremely focused on cost and supply chain management. They have this down to a science and it is so critical to margins. Also, while I don’t have insights, I would expect Intel is employing AI and/or analytics to help drive supply chain and manufacturing optimization.” The company’s number one job, he said, “is to deliver the highest quality chips to its customers — from the client to the data center. I have every confidence it will not put this mandate at risk as it considers where/how to make the appropriate resourcing decisions. I think everybody who has been through corporate restructuring (I’ve been through too many to count)

Read More »

Intel appears stuck between ‘a rock and a hard place’

Intel, said Kimball, “started its turnaround a few years back when ex-CEO Pat Gelsinger announced its five nodes in four years plan. While this was an impressive vision to articulate, its purpose was to rebuild trust with customers, and to rebuild an execution discipline. I think the company has largely succeeded, but of course the results trail a bit.” Asked if a combination of layoffs and the moving around of jobs will affect the cost of importing chips, Kimball predicted it will likely not have an impact: “Intel (like any responsible company) is extremely focused on cost and supply chain management. They have this down to a science and it is so critical to margins. Also, while I don’t have insights, I would expect Intel is employing AI and/or analytics to help drive supply chain and manufacturing optimization.” The company’s number one job, he said, “is to deliver the highest quality chips to its customers — from the client to the data center. I have every confidence it will not put this mandate at risk as it considers where/how to make the appropriate resourcing decisions. I think everybody who has been through corporate restructuring (I’ve been through too many to count) realizes that, when planning for these, ensuring the resilience of these mission critical functions is priority one.”  Added Bickley, “trimming the workforce, delaying construction of the US fab plants, and flattening the decision structure of the organization are prudent moves meant to buy time in the hopes that their new chip designs and foundry processes attract new business.”

Read More »

Next-gen AI chips will draw 15,000W each, redefining power, cooling, and data center design

“Dublin imposed a 2023 moratorium on new data centers, Frankfurt has no new capacity expected before 2030, and Singapore has just 7.2 MW available,” said Kasthuri Jagadeesan, Research Director at Everest Group, highlighting the dire situation. Electricity: the new bottleneck in AI RoI As AI modules push infrastructure to its limits, electricity is becoming a critical driver of return on investment. “Electricity has shifted from a line item in operational overhead to the defining factor in AI project feasibility,” Gogia noted. “Electricity costs now constitute between 40–60% of total Opex in modern AI infrastructure, both cloud and on-prem.” Enterprises are now forced to rethink deployment strategies—balancing control, compliance, and location-specific power rates. Cloud hyperscalers may gain further advantage due to better PUE, renewable access, and energy procurement models. “A single 15,000-watt module running continuously can cost up to $20,000 annually in electricity alone, excluding cooling,” said Manish Rawat, analyst at TechInsights. “That cost structure forces enterprises to evaluate location, usage models, and platform efficiency like never before.” The silicon arms race meets the power ceiling AI chip innovation is hitting new milestones, but the cost of that performance is no longer just measured in dollars or FLOPS — it’s in kilowatts. The KAIST TeraLab roadmap demonstrates that power and heat are becoming dominant factors in compute system design. The geography of AI, as several experts warn, is shifting. Power-abundant regions such as the Nordics, the Midwest US, and the Gulf states are becoming magnets for data center investments. Regions with limited grid capacity face a growing risk of becoming “AI deserts.”

Read More »

Edge reality check: What we’ve learned about scaling secure, smart infrastructure

Enterprises are pushing cloud resources back to the edge after years of centralization. Even as major incumbents such as Google, Microsoft, and AWS pull more enterprise workloads into massive, centralized hyperscalers, use cases at the edge increasingly require nearby infrastructure—not a long hop to a centralized data center—to take advantage of the torrents of real-time data generated by IoT devices, sensor networks, smart vehicles, and a panoply of newly connected hardware. Not long ago, the enterprise edge was a physical one. The central data center was typically located in or very near the organization’s headquarters. When organizations sought to expand their reach, they wanted to establish secure, speedy connections to other office locations, such as branches, providing them with fast and reliable access to centralized computing resources. Vendors initially sold MPLS, WAN optimization, and SD-WAN as “branch office solutions,” after all. Lesson one: Understand your legacy before locking in your future The networking model that connects centralized cloud resources to the edge via some combination of SD-WAN, MPLS, or 4G reflects a legacy HQ-branch design. However, for use cases such as facial recognition, gaming, or video streaming, old problems are new again. Latency, middle-mile congestion, and the high cost of bandwidth all undermine these real-time edge use cases.

Read More »

Cisco capitalizes on Isovalent buy, unveils new load balancer

The customer deploys the Isovalent Load Balancer control plane via automation and configures the desired number of virtual load-balancer appliances, Graf said. “The control plane automatically deploys virtual load-balancing appliances via the virtualization or Kubernetes platform. The load-balancing layer is self-healing and supports auto-scaling, which means that I can replace unhealthy instances and scale out as needed. The load balancer supports powerful L3-L7 load balancing with enterprise capabilities,” he said. Depending on the infrastructure the load balancer is deployed into, the operator will deploy the load balancer using familiar deployment methods. In a data center, this will be done using a standard virtualization automation installation such as Terraform or Ansible. In the public cloud, the load balancer is deployed as a public cloud service. In Kubernetes and OpenShift, the load balancer is deployed as a Kubernetes Deployment/Operator, Graf said. “In the future, the Isovalent Load Balancer will also be able to run on top of Cisco Nexus smart switches,” Graf said. “This means that the Isovalent Load Balancer can run in any environment, from data center, public cloud, to Kubernetes while providing a consistent load-balancing layer with a frictionless cloud-native developer experience.” Cisco has announced a variety of smart switches over the past couple of months on the vendor’s 4.8T capacity Silicon One chip. But the N9300, where Isovalent would run, includes a built-in programmable data processing unit (DPU) from AMD to offload complex data processing work and free up the switches for AI and large workload processing. For customers, the Isovalent Load Balancer provides consistent load balancing across infrastructure while being aligned with Kubernetes as the future for infrastructure. “A single load-balancing solution that can run in the data center, in public cloud, and modern Kubernetes environments. This removes operational complexity, lowers cost, while modernizing the load-balancing infrastructure in preparation

Read More »

Oracle’s struggle with capacity meant they made the difficult but responsible decisions

IDC President Crawford Del Prete agreed, and said that Oracle senior management made the right move, despite how difficult the situation is today. “Oracle is being incredibly responsible here. They don’t want to have a lot of idle capacity. That capacity does have a shelf life,” Del Prete said. CEO Catz “is trying to be extremely precise about how much capacity she puts on.” Del Prete said that, for the moment, Oracle’s capacity situation is unique to the company, and has not been a factor with key rivals AWS, Microsoft, and Google. During the investor call, Catz said that her team “made engineering decisions that were much different from the other hyperscalers and that were better suited to the needs of enterprise customers, resulting in lower costs to them and giving them deployment flexibility.” Oracle management certainly anticipated a flurry of orders, but Catz said that she chose to not pay for expanded capacity until she saw finalized “contracted noncancelable bookings.” She pointed to a huge capex line of $9.1 billion and said, “the vast majority of our capex investments are for revenue generating equipment that is going into data centers and not for land or buildings.”

Read More »

Microsoft will invest $80B in AI data centers in fiscal 2025

And Microsoft isn’t the only one that is ramping up its investments in AI-enabled data centers. Rival cloud service providers are all investing in either upgrading or opening new data centers to capture a larger chunk of business from developers and users of large language models (LLMs). In a report published in October 2024, Bloomberg Intelligence estimated that demand for generative AI would push Microsoft, AWS, Google, Oracle, Meta, and Apple to devote $200 billion between them to capex in 2025, up from $110 billion in 2023. Microsoft is one of the biggest spenders, followed closely by Google and AWS, Bloomberg Intelligence said. Its estimate of Microsoft’s capital spending on AI, at $62.4 billion for calendar 2025, is lower than Smith’s claim that the company will invest $80 billion in the fiscal year to June 30, 2025. Both figures, though, are way higher than Microsoft’s 2020 capital expenditure of “just” $17.6 billion. The majority of the increased spending is tied to cloud services and the expansion of AI infrastructure needed to provide compute capacity for OpenAI workloads. Separately, last October Amazon CEO Andy Jassy said his company planned total capex spend of $75 billion in 2024 and even more in 2025, with much of it going to AWS, its cloud computing division.

Read More »

John Deere unveils more autonomous farm machines to address skilled labor shortage

Self-driving tractors might be the path to self-driving cars. John Deere has revealed a new line of autonomous machines and tech across agriculture, construction and commercial landscaping. The Moline, Illinois-based John Deere has been in business for 187 years, yet as a non-tech company it has become a regular at the big tech trade show in Las Vegas, and it is back at CES 2025 with more autonomous tractors and other vehicles. This is not something we usually cover, but John Deere has a lot of data that is interesting in the big picture of tech. The message from the company is that there aren’t enough skilled farm laborers to do the work that its customers need. It’s been a challenge for most of the last two decades, said Jahmy Hindman, CTO at John Deere, in a briefing. Much of the tech will come this fall and after that. He noted that the average farmer in the U.S. is over 58 and works 12 to 18 hours a day to grow food for us. And he said the American Farm Bureau Federation estimates there are roughly 2.4 million farm jobs that need to be filled annually, and the agricultural workforce continues to shrink. (This is my hint to the anti-immigration crowd). John Deere’s autonomous 9RX Tractor. Farmers can oversee it using an app. While each of these industries experiences its own set of challenges, a commonality across all is skilled labor availability. In construction, about 80% of contractors struggle to find skilled labor. And in commercial landscaping, 86% of landscaping business owners can’t find labor to fill open positions, he said. “They have to figure out how to do

Read More »

2025 playbook for enterprise AI success, from agents to evals

2025 is poised to be a pivotal year for enterprise AI. The past year has seen rapid innovation, and this year will see the same. This has made it more critical than ever to revisit your AI strategy to stay competitive and create value for your customers. From scaling AI agents to optimizing costs, here are the five critical areas enterprises should prioritize for their AI strategy this year.

1. Agents: the next generation of automation

AI agents are no longer theoretical. In 2025, they’re indispensable tools for enterprises looking to streamline operations and enhance customer interactions. Unlike traditional software, agents powered by large language models (LLMs) can make nuanced decisions, navigate complex multi-step tasks, and integrate seamlessly with tools and APIs. At the start of 2024, agents were not ready for prime time, making frustrating mistakes like hallucinating URLs. They started getting better as frontier large language models themselves improved. “Let me put it this way,” said Sam Witteveen, cofounder of Red Dragon, a company that develops agents for companies, and that recently reviewed the 48 agents it built last year. “Interestingly, the ones that we built at the start of the year, a lot of those worked way better at the end of the year just because the models got better.” Witteveen shared this in the video podcast we filmed to discuss these five big trends in detail. Models are getting better and hallucinating less, and they’re also being trained to do agentic tasks. Another feature that the model providers are researching is a way to use the LLM as a judge, and as models get cheaper (something we’ll cover below), companies can use three or more models to

Read More »

OpenAI’s red teaming innovations define new essentials for security leaders in the AI era

OpenAI has taken a more aggressive approach to red teaming than its AI competitors, demonstrating its security teams’ advanced capabilities in two areas: multi-step reinforcement and external red teaming. OpenAI recently released two papers that set a new competitive standard for improving the quality, reliability and safety of AI models in these two techniques and more. The first paper, “OpenAI’s Approach to External Red Teaming for AI Models and Systems,” reports that specialized teams outside the company have proven effective in uncovering vulnerabilities that might otherwise have made it into a released model because in-house testing techniques may have missed them. In the second paper, “Diverse and Effective Red Teaming with Auto-Generated Rewards and Multi-Step Reinforcement Learning,” OpenAI introduces an automated framework that relies on iterative reinforcement learning to generate a broad spectrum of novel, wide-ranging attacks.

Going all-in on red teaming pays practical, competitive dividends

It’s encouraging to see competitive intensity in red teaming growing among AI companies. When Anthropic released its AI red team guidelines in June of last year, it joined AI providers including Google, Microsoft, Nvidia, OpenAI, and even the U.S.’s National Institute of Standards and Technology (NIST), which all had released red teaming frameworks. Investing heavily in red teaming yields tangible benefits for security leaders in any organization. OpenAI’s paper on external red teaming provides a detailed analysis of how the company strives to create specialized external teams that include cybersecurity and subject matter experts. The goal is to see if knowledgeable external teams can defeat models’ security perimeters and find gaps in their security, biases and controls that prompt-based testing couldn’t find.
What makes OpenAI’s recent papers noteworthy is how well they define using human-in-the-middle

Read More »