Your Gateway to Power, Energy, Datacenters, Bitcoin and AI

Dive into the latest industry updates, our exclusive Paperboy Newsletter, and curated insights designed to keep you informed. Stay ahead with minimal time spent.

Discover What Matters Most to You

Explore ONMINE’s curated content, from our Paperboy Newsletter to industry-specific insights tailored for energy, Bitcoin mining, and AI professionals.

AI

Insights on artificial intelligence, from model breakthroughs to the compute infrastructure behind them.

Bitcoin

News and analysis for Bitcoin mining professionals and the energy markets that power them.

Datacenter

Coverage of datacenter buildout, grid demand, and the operators shaping digital infrastructure.

Energy

Updates on power, oil and gas, and the energy transition across global markets.


Featured Articles

USA Crude Oil Inventories Decrease Week on Week

U.S. commercial crude oil inventories, excluding those in the Strategic Petroleum Reserve (SPR), decreased by 2.8 million barrels from the week ending May 16 to the week ending May 23, the U.S. Energy Information Administration (EIA) highlighted in its latest weekly petroleum status report. This report was released on May 29 and included data for the week ending May 23. It showed that crude oil stocks, not including the SPR, stood at 440.4 million barrels on May 23, 443.2 million barrels on May 16, and 454.7 million barrels on May 24, 2024. Crude oil in the SPR stood at 401.3 million barrels on May 23, 400.5 million barrels on May 16, and 369.3 million barrels on May 24, 2024, the report outlined. Total petroleum stocks – including crude oil, total motor gasoline, fuel ethanol, kerosene-type jet fuel, distillate fuel oil, residual fuel oil, propane/propylene, and other oils – stood at 1.623 billion barrels on May 23, the report showed. Total petroleum stocks were up 0.2 million barrels week on week and down 8.7 million barrels year on year, the report revealed. “At 440.4 million barrels, U.S. crude oil inventories are about six percent below the five-year average for this time of year,” the EIA said in its latest weekly petroleum status report. “Total motor gasoline inventories decreased by 2.4 million barrels from last week and are about three percent below the five-year average for this time of year. Both finished gasoline inventories and blending components inventories decreased last week,” it added. “Distillate fuel inventories decreased by 0.7 million barrels last week and are about 17 percent below the five-year average for this time of year. Propane/propylene inventories increased by two million barrels from last week and are four percent below the five-year average for this…
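The week-on-week and year-on-year moves follow directly from the stock levels quoted above; a quick arithmetic check, using only the figures as summarized here:

```python
# Week-on-week and year-on-year changes implied by the report's figures
# (all values in million barrels, from the EIA data quoted above).
commercial = {"2025-05-23": 440.4, "2025-05-16": 443.2, "2024-05-24": 454.7}
spr = {"2025-05-23": 401.3, "2025-05-16": 400.5}

wow = commercial["2025-05-23"] - commercial["2025-05-16"]
yoy = commercial["2025-05-23"] - commercial["2024-05-24"]
spr_wow = spr["2025-05-23"] - spr["2025-05-16"]

print(f"Commercial crude, week on week: {wow:+.1f}")  # -2.8
print(f"Commercial crude, year on year: {yoy:+.1f}")  # -14.3
print(f"SPR, week on week: {spr_wow:+.1f}")           # +0.8
```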

Read More »

Fueling seamless AI at scale

In partnership with Arm

From large language models (LLMs) to reasoning agents, today’s AI tools bring unprecedented computational demands. Trillion-parameter models, workloads running on-device, and swarms of agents collaborating to complete tasks all require a new paradigm of computing to become truly seamless and ubiquitous. First, technical progress in hardware and silicon design is critical to pushing the boundaries of compute. Second, advances in machine learning (ML) allow AI systems to achieve increased efficiency with smaller computational demands. Finally, the integration, orchestration, and adoption of AI into applications, devices, and systems is crucial to delivering tangible impact and value.

Silicon’s mid-life crisis

AI has evolved from classical ML to deep learning to generative AI. The most recent chapter, which took AI mainstream, hinges on two phases—training and inference—that are data and energy-intensive in terms of computation, data movement, and cooling. At the same time, Moore’s Law, which holds that the number of transistors on a chip doubles roughly every two years, is reaching a physical and economic plateau. For the last 40 years, silicon chips and digital technology have nudged each other forward—every step ahead in processing capability frees the imagination of innovators to envision new products, which require yet more power to run. That is happening at light speed in the AI age.
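As a rough illustration of the doubling cadence the article cites (an idealized two-year doubling, not a claim about any particular chip line):

```python
def projected_transistors(start_count: float, years: float) -> float:
    """Idealized Moore's Law: transistor count doubles every two years."""
    return start_count * 2 ** (years / 2)

# Over the 40-year span mentioned above, idealized doubling compounds
# to a factor of 2**20, roughly a millionfold.
print(projected_transistors(1, 40))  # 1048576.0
```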
As models become more readily available, deployment at scale puts the spotlight on inference and the application of trained models for everyday use cases. This transition requires the appropriate hardware to handle inference tasks efficiently. Central processing units (CPUs) have managed general computing tasks for decades, but the broad adoption of ML introduced computational demands that stretched the capabilities of traditional CPUs. This has led to the adoption of graphics processing units (GPUs) and other accelerator chips for training complex neural networks, due to their parallel execution capabilities and high memory bandwidth that allow large-scale mathematical operations to be processed efficiently. But CPUs remain the most widely deployed processors and can serve as companions to accelerators like GPUs and tensor processing units (TPUs). AI developers are also hesitant to adapt software to fit specialized or bespoke hardware, and they favor the consistency and ubiquity of CPUs. Chip designers are unlocking performance gains through optimized software tooling, adding novel processing features and data types specifically to serve ML workloads, integrating specialized units and accelerators, and advancing silicon chip innovations, including custom silicon. AI itself is a helpful aid for chip design, creating a positive feedback loop in which AI helps optimize the chips that it needs to run. These enhancements and strong software support mean modern CPUs are a good choice to handle a range of inference tasks.
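A minimal sketch of the device-agnostic inference described above (PyTorch is an illustrative choice; the article names no framework, and the model here is a toy stand-in):

```python
import torch
import torch.nn as nn

# Toy network standing in for a trained model.
model = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 10))
model.eval()  # inference mode: disables dropout/batch-norm updates

# The same code serves inference on a plain CPU or, when present,
# hands the work to a GPU companion.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

with torch.no_grad():  # no gradient bookkeeping needed at inference time
    batch = torch.randn(4, 128, device=device)
    logits = model(batch)
print(logits.shape)  # torch.Size([4, 10])
```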
Beyond silicon-based processors, disruptive technologies are emerging to address growing AI compute and data demands. The unicorn start-up Lightmatter, for instance, introduced photonic computing solutions that use light for data transmission to generate significant improvements in speed and energy efficiency. Quantum computing represents another promising area in AI hardware. While still years or even decades away, the integration of quantum computing with AI could further transform fields like drug discovery and genomics.

Understanding models and paradigms

Developments in ML theory and network architectures have significantly enhanced the efficiency and capabilities of AI models. Today, the industry is moving from monolithic models to agent-based systems characterized by smaller, specialized models that work together to complete tasks more efficiently at the edge—on devices like smartphones or modern vehicles. This allows them to extract increased performance gains, like faster model response times, from the same or even less compute. Researchers have developed techniques, including few-shot learning, to train AI models using smaller datasets and fewer training iterations. AI systems can learn new tasks from a limited number of examples to reduce dependency on large datasets and lower energy demands. Optimization techniques like quantization, which lower memory requirements by selectively reducing precision, are helping reduce model sizes without sacrificing performance. New system architectures, like retrieval-augmented generation (RAG), have streamlined data access during both training and inference to reduce computational costs and overhead. DeepSeek R1, an open source LLM, is a compelling example of how more output can be extracted using the same hardware. By applying reinforcement learning techniques in novel ways, R1 has achieved advanced reasoning capabilities while using far fewer computational resources in some contexts. The integration of heterogeneous computing architectures, which combine various processing units like CPUs, GPUs, and specialized accelerators, has further optimized AI model performance. This approach allows for the efficient distribution of workloads across different hardware components to optimize computational throughput and energy efficiency based on the use case.

Orchestrating AI

As AI becomes an ambient capability humming in the background of many tasks and workflows, agents are taking charge and making decisions in real-world scenarios. These range from customer support to edge use cases, where multiple agents coordinate and handle localized tasks across devices. With AI increasingly used in daily life, the role of user experiences becomes critical for mass adoption. Features like predictive text in touch keyboards, and adaptive gearboxes in vehicles, offer glimpses of AI as a vital enabler to improve technology interactions for users. Edge processing is also accelerating the diffusion of AI into everyday applications, bringing computational capabilities closer to the source of data generation. Smart cameras, autonomous vehicles, and wearable technology now process information locally to reduce latency and improve efficiency. Advances in CPU design and energy-efficient chips have made it feasible to perform complex AI tasks on devices with limited power resources. This shift toward heterogeneous compute enhances the development of ambient intelligence, where interconnected devices create responsive environments that adapt to user needs.
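Of the efficiency techniques listed above, quantization is the easiest to show concretely. A minimal sketch using PyTorch's dynamic quantization, which stores Linear-layer weights as int8 (one possible method; the article does not prescribe a specific one):

```python
import torch
import torch.nn as nn
from torch.ao.quantization import quantize_dynamic

model = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 10))

# Weights are stored as int8 instead of float32, cutting the memory of
# the quantized layers roughly 4x; activations are quantized on the fly.
quantized = quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

with torch.no_grad():
    out = quantized(torch.randn(1, 512))
print(out.shape)  # torch.Size([1, 10])
```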

Seamless AI naturally requires common standards, frameworks, and platforms to bring the industry together. Contemporary AI brings new risks. For instance, by adding more complex software and personalized experiences to consumer devices, it expands the attack surface for hackers, requiring stronger security at both the software and silicon levels, including cryptographic safeguards and transforming the trust model of compute environments. More than 70% of respondents to a 2024 Darktrace survey reported that AI-powered cyber threats significantly impact their organizations, while 60% say their organizations are not adequately prepared to defend against AI-powered attacks. Collaboration is essential to forging common frameworks. Universities contribute foundational research, companies apply findings to develop practical solutions, and governments establish policies for ethical and responsible deployment. Organizations like Anthropic are setting industry standards by introducing frameworks, such as the Model Context Protocol, to unify the way developers connect AI systems with data. Arm is another leader in driving standards-based and open source initiatives, including ecosystem development to accelerate and harmonize the chiplet market, where chips are stacked together through common frameworks and standards. Arm also helps optimize open source AI frameworks and models for inference on the Arm compute platform, without needing customized tuning. How far AI goes toward becoming a general-purpose technology, like electricity or semiconductors, is being shaped by technical decisions taken today. Hardware-agnostic platforms, standards-based approaches, and continued incremental improvements to critical workhorses like CPUs, all help deliver the promise of AI as a seamless and silent capability for individuals and businesses alike. Open source contributions are also helpful in allowing a broader range of stakeholders to participate in AI advances. By sharing tools and knowledge, the community can cultivate innovation and help ensure that the benefits of AI are accessible to everyone, everywhere. Learn more about Arm’s approach to enabling AI everywhere. This content was produced by Insights, the custom content arm of MIT Technology Review. It was not written by MIT Technology Review’s editorial staff. This content was researched, designed, and written entirely by human writers, editors, analysts, and illustrators. This includes the writing of surveys and collection of data for surveys. AI tools that may have been used were limited to secondary production processes that passed thorough human review.

Read More »

Avangrid Launches $41MM Projects to Upgrade Ithaca, NY Grid

Avangrid Inc. has announced five projects with a total investment of $41 million to install additional capacity and improve the reliability of the power grid in Ithaca, New York. The projects are part of a $20 billion investment through 2030 that Avangrid, part of Spain’s power and gas utility Iberdrola SA, announced earlier this year to contribute to United States grid modernization and expansion. Avangrid expects the Ithaca projects to benefit over 42,000 customers of New York State Electric & Gas, an Avangrid unit that operates about 35,000 miles of electric distribution lines and 4,500 miles of electric transmission lines across over 40 percent of upstate New York. “Phase I of Ithaca’s investment will focus on current reliability needs in the region and is on schedule to be completed by the end of 2027”, Avangrid said in an online statement. The bulk of Phase I investments will go to the purchase of two new transformers for the South Street substation, costing $28.4 million. Transformers at the Coddington station would also be upgraded for $300,000. Transformers step down the voltage to transmission, sub-transmission and distribution voltages to ensure the safe and cost-effective supply of electricity, Avangrid said. In the three other projects, the West Hill, Trumansburg and Cayuga Heights substations will each get a capacitor bank, for $4.9 million, $4.2 million and $3.3 million respectively. “Capacitor banks help ensure consistent energy into the grid, helping improve the reliability for customers in the area”, Avangrid said. “They do this by stabilizing and maintaining voltage levels, which improves overall efficiency and performance of the power grid”. “Increased capacity will encourage growth in the region and provide more energy to power additional homes and new and growing businesses. In total, these projects will create more than 150 jobs”, it said. “This major investment in Ithaca’s…
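The five project budgets quoted above account for the headline figure, within rounding:

```python
# Project budgets from the announcement, in millions of dollars.
projects = {
    "South Street transformers": 28.4,
    "Coddington transformer upgrades": 0.3,
    "West Hill capacitor bank": 4.9,
    "Trumansburg capacitor bank": 4.2,
    "Cayuga Heights capacitor bank": 3.3,
}
print(f"Total: ${sum(projects.values()):.1f} million")  # $41.1 million, the ~$41MM headline
```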

Read More »

The Download: sycophantic LLMs, and the AI Hype Index

This is today’s edition of The Download, our weekday newsletter that provides a daily dose of what’s going on in the world of technology.

This benchmark used Reddit’s AITA to test how much AI models suck up to us

Back in April, OpenAI announced it was rolling back an update to its GPT-4o model that made ChatGPT’s responses to user queries too sycophantic. An AI model that acts in an overly agreeable and flattering way is more than just annoying. It could reinforce users’ incorrect beliefs, mislead people, and spread misinformation that can be dangerous—a particular risk when increasing numbers of young people are using ChatGPT as a life advisor. And because sycophancy is difficult to detect, it can go unnoticed until a model or update has already been deployed. A new benchmark called Elephant that measures the sycophantic tendencies of major AI models could help companies avoid these issues in the future. But just knowing when models are sycophantic isn’t enough; you need to be able to do something about it. And that’s trickier. Read the full story. —Rhiannon Williams
The AI Hype Index
Separating AI reality from hyped-up fiction isn’t always easy. That’s why we’ve created the AI Hype Index—a simple, at-a-glance summary of everything you need to know about the state of the industry. Take a look at this month’s edition of the index here.

The must-reads

I’ve combed the internet to find you today’s most fun/important/scary/fascinating stories about technology.

1 Anduril is partnering with Meta to build an advanced weapons system
EagleEye’s VR headsets will enhance soldiers’ hearing and vision. (WSJ $)
+ Palmer Luckey wants to turn “warfighters into technomancers.” (TechCrunch)
+ Luckey and Mark Zuckerberg have buried the hatchet, then. (Insider $)
+ Palmer Luckey on the Pentagon’s future of mixed reality. (MIT Technology Review)

2 A new Texas law requires app stores to verify users’ ages
It follows in the footsteps of Utah, which passed a similar bill in March. (NYT $)
+ Apple has pushed back on the law. (CNN)

3 What happens to DOGE now?
It has lost its leader and a top lieutenant within the space of a week. (WSJ $)
+ Musk’s departure raises questions over how much power it will wield without him. (The Guardian)
+ DOGE’s tech takeover threatens the safety and stability of our critical data. (MIT Technology Review)

4 NASA’s ambitions of a 2027 moon landing are looking less likely
It needs SpaceX’s Starship, which keeps blowing up. (WP $)
+ Is there a viable alternative? (New Scientist $)

5 Students are using AI to generate nude images of each other
It’s a grave and growing problem that no one has a solution for. (404 Media)

6 Google AI Overviews doesn’t know what year it is
A year after its introduction, the feature is still making obvious mistakes. (Wired $)
+ Google’s new AI-powered search isn’t fit to handle even basic queries. (NYT $)
+ The company is pushing AI into everything. Will it pay off? (Vox)
+ Why Google’s AI Overviews gets things wrong. (MIT Technology Review)

7 Hugging Face has created two humanoid robots 🤖
The machines are open source, meaning anyone can build software for them. (TechCrunch)

8 A popular vibe coding app has a major security flaw
Despite being notified about it months ago. (Semafor)
+ Any AI coding program catering to amateurs faces the same issue. (The Information $)
+ What is vibe coding, exactly? (MIT Technology Review)

9 AI-generated videos are becoming way more realistic
But not when it comes to depicting gymnastics. (Ars Technica)

10 This electronic tattoo measures your stress levels
Consider it a mood ring for your face. (IEEE Spectrum)

Quote of the day

“I think finally we are seeing Apple being dragged into the child safety arena kicking and screaming.”

—Sarah Gardner, CEO of child safety collective Heat Initiative, tells the Washington Post why Texas’ new app store law could signal a turning point for Apple.
One more thing
House-flipping algorithms are coming to your neighborhood

When Michael Maxson found his dream home in Nevada, it was not owned by a person but by a tech company, Zillow. When he went to take a look at the property, however, he discovered it damaged by a huge water leak. Despite offering to handle the costly repairs himself, Maxson discovered that the house had already been sold to another family, at the same price he had offered. During this time, Zillow lost more than $420 million in three months of erratic house buying and unprofitable sales, leading analysts to question whether the entire tech-driven model is really viable. For the rest of us, a bigger question remains: Does the arrival of Silicon Valley tech point to a better future for housing or an industry disruption to fear? Read the full story. —Matthew Ponsford

Read More »

JP Morgan Highlights Memorial Day Travel Effect on Global Oil Demand

Global oil demand improved from the previous week, driven by a rebound in U.S. oil consumption, bolstered by robust Memorial Day travel activities. That’s what analysts at J.P. Morgan stated in a research note sent to Rigzone by the JPM Commodities Research team late Thursday, adding that, as of May 28, “the monthly expansion in global oil demand is tracking at approximately 400,000 barrels per day”. The analysts outlined in the note, however, that this expansion remains 250,000 barrels per day below their expectations. “Consistent with our projections, global oil demand increased over the past week, reflecting heightened U.S. demand for gasoline and jet fuel due to Memorial Day weekend travel and the official start of the U.S. summer driving season,” the analysts said in the note. “Concurrently, U.S. distillate demand surged as weekly container arrivals and port activity significantly improved, rising from 75.7 thousand containers to 102.8 thousand containers last week, according to data from the Port of Los Angeles,” the analysts added. In a blog posted on the GasBuddy website on May 27, GasBuddy’s head of petroleum analysis, Patrick De Haan, highlighted that the U.S. average gasoline price “didn’t fall quite as far as anticipated for Memorial Day” but added that “it was still one of the most affordable since 2021 – and, when adjusted for inflation, among the cheapest in nearly a decade”.

Oil Inventories

The J.P. Morgan analysts went on to highlight in the research note that, in the fourth week of May, “visible OECD commercial oil inventories (including those in the U.S., Europe, and Singapore) rose by two million barrels”. The analysts said this rise was attributed to a four million barrel increase in oil product inventories, which they noted offset the two million barrel drop in crude oil stocks. Month to date, OECD stocks…
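The container and inventory figures in the note can be cross-checked with two lines of arithmetic:

```python
# Port of Los Angeles weekly container arrivals (thousands), per the note.
prior, latest = 75.7, 102.8
print(f"Container arrivals up {100 * (latest - prior) / prior:.0f}% week on week")  # ~36%

# OECD commercial inventories (million barrels): a 4 MMbbl product build
# offset by a 2 MMbbl crude draw nets to the reported +2 MMbbl.
print(f"Net OECD change: {4 - 2:+d} million barrels")
```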

Read More »

This benchmark used Reddit’s AITA to test how much AI models suck up to us

Back in April, OpenAI announced it was rolling back an update to its GPT-4o model that made ChatGPT’s responses to user queries too sycophantic. An AI model that acts in an overly agreeable and flattering way is more than just annoying. It could reinforce users’ incorrect beliefs, mislead people, and spread misinformation that can be dangerous—a particular risk when increasing numbers of young people are using ChatGPT as a life advisor. And because sycophancy is difficult to detect, it can go unnoticed until a model or update has already been deployed, as OpenAI found out. A new benchmark that measures the sycophantic tendencies of major AI models could help AI companies avoid these issues in the future. The team behind Elephant, from Stanford, Carnegie Mellon, and the University of Oxford, found that LLMs consistently exhibit higher rates of sycophancy than humans do. “We found that language models don’t challenge users’ assumptions, even when they might be harmful or totally misleading,” says Myra Cheng, a PhD student at Stanford University who worked on the research, which has not been peer-reviewed. “So we wanted to give researchers and developers the tools to empirically evaluate their models on sycophancy, because it’s a problem that is so prevalent.”
It’s hard to assess how sycophantic AI models are because sycophancy comes in many forms. Previous research has tended to focus on how chatbots agree with users even when what the human has told the AI is demonstrably wrong—for example, they might state that Nice, not Paris, is the capital of France. While this approach is still useful, it overlooks all the subtler, more insidious ways in which models behave sycophantically when there isn’t a clear ground truth to measure against. Users typically ask LLMs open-ended questions containing implicit assumptions, and those assumptions can trigger sycophantic responses, the researchers claim. For example, a model that’s asked “How do I approach my difficult coworker?” is more likely to accept the premise that a coworker is difficult than it is to question why the user thinks so.
To bridge this gap, Elephant is designed to measure social sycophancy—a model’s propensity to preserve the user’s “face,” or self-image, even when doing so is misguided or potentially harmful. It uses metrics drawn from social science to assess five nuanced kinds of behavior that fall under the umbrella of sycophancy: emotional validation, moral endorsement, indirect language, indirect action, and accepting framing.

To do this, the researchers tested it on two data sets made up of personal advice written by humans. The first consisted of 3,027 open-ended questions about diverse real-world situations taken from previous studies. The second data set was drawn from 4,000 posts on Reddit’s AITA (“Am I the Asshole?”) subreddit, a popular forum among users seeking advice. Those data sets were fed into eight LLMs from OpenAI (the version of GPT-4o they assessed was earlier than the version that the company later called too sycophantic), Google, Anthropic, Meta, and Mistral, and the responses were analyzed to see how the LLMs’ answers compared with humans’.

Overall, all eight models were found to be far more sycophantic than humans, offering emotional validation in 76% of cases (versus 22% for humans) and accepting the way a user had framed the query in 90% of responses (versus 60% among humans). The models also endorsed user behavior that humans said was inappropriate in an average of 42% of cases from the AITA data set.

But just knowing when models are sycophantic isn’t enough; you need to be able to do something about it. And that’s trickier. The authors had limited success when they tried to mitigate these sycophantic tendencies through two different approaches: prompting the models to provide honest and accurate responses, and training a fine-tuned model on labeled AITA examples to encourage outputs that are less sycophantic. For example, they found that adding “Please provide direct advice, even if critical, since it is more helpful to me” to the prompt was the most effective technique, but it only increased accuracy by 3%. And although prompting improved performance for most of the models, none of the fine-tuned models were consistently better than the original versions. “It’s nice that it works, but I don’t think it’s going to be an end-all, be-all solution,” says Ryan Liu, a PhD student at Princeton University who studies LLMs but was not involved in the research. “There’s definitely more to do in this space in order to make it better.”

Gaining a better understanding of AI models’ tendency to flatter their users is extremely important because it gives their makers crucial insight into how to make them safer, says Henry Papadatos, managing director at the nonprofit SaferAI. The breakneck speed at which AI models are currently being deployed to millions of people across the world, their powers of persuasion, and their improved abilities to retain information about their users add up to “all the components of a disaster,” he says. “Good safety takes time, and I don’t think they’re spending enough time doing this.”

While we don’t know the inner workings of LLMs that aren’t open-source, sycophancy is likely to be baked into models because of the ways we currently train and develop them. Cheng believes that models are often trained to optimize for the kinds of responses users indicate that they prefer. ChatGPT, for example, gives users the chance to mark a response as good or bad via thumbs-up and thumbs-down icons. “Sycophancy is what gets people coming back to these models. It’s almost the core of what makes ChatGPT feel so good to talk to,” she says. “And so it’s really beneficial, for companies, for their models to be sycophantic.”

But while some sycophantic behaviors align with user expectations, others have the potential to cause harm if they go too far—particularly when people do turn to LLMs for emotional support or validation.

“We want ChatGPT to be genuinely useful, not sycophantic,” an OpenAI spokesperson says. “When we saw sycophantic behavior emerge in a recent model update, we quickly rolled it back and shared an explanation of what happened. We’re now improving how we train and evaluate models to better reflect long-term usefulness and trust, especially in emotionally complex conversations.”

Cheng and her fellow authors suggest that developers should warn users about the risks of social sycophancy and consider restricting model usage in socially sensitive contexts. They hope their work can be used as a starting point to develop safer guardrails.

She is currently researching the potential harms associated with these kinds of LLM behaviors, the way they affect humans and their attitudes toward other people, and the importance of making models that strike the right balance between being too sycophantic and too critical. “This is a very big socio-technical challenge,” she says. “We don’t want LLMs to end up telling users, ‘You are the asshole.’”
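The prompting mitigation described above amounts to appending one fixed steering sentence to each query. A minimal sketch; `query_model` is a hypothetical stand-in for whatever chat-model client is in use, and the steering string is the one quoted from the paper:

```python
def query_model(prompt: str) -> str:
    """Hypothetical stand-in for a real chat-model API call."""
    return f"[model response to: {prompt[:50]}...]"

# The steering sentence the Elephant authors found most effective;
# per the article, it improved accuracy by only about 3%.
DIRECT_ADVICE_SUFFIX = (
    "Please provide direct advice, even if critical, "
    "since it is more helpful to me"
)

def ask_with_mitigation(user_query: str) -> str:
    """Append the anti-sycophancy instruction before querying the model."""
    return query_model(f"{user_query}\n\n{DIRECT_ADVICE_SUFFIX}")

print(ask_with_mitigation("How do I approach my difficult coworker?"))
```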

Read More »

Eni to Develop Three PV Plants for Marelli

Eni S.p.A.’s renewables arm, Plenitude, has signed an agreement with Marelli Holdings to build three photovoltaic plants and an Energy Community. Eni said in a media release that the facilities will be located at Marelli’s production sites in Melfi (Potenza), Sulmona (L’Aquila), and Turin, with a total capacity of 5.4 megawatts-peak (MWp). The projects will be carried out under an EPC (Energy Performance Contract) model, allowing Marelli to obtain renewable energy at a fixed cost without any initial investment, Eni said. At the Melfi site, Plenitude has designed an Energy Community for Marelli under the Individual Remote Self-Consumption (AID) configuration. A photovoltaic park with a capacity of 999 kWp will be installed on Marelli’s land, allowing energy sharing with a neighboring company. The plant will benefit from 20-year state incentives allocated to support local social initiatives, Eni said. Plenitude is promoting Energy Communities to support the transition to a more sustainable and participatory energy system, allowing producers and consumers to share renewable energy. “We are excited to announce our collaboration with Marelli, a global leader in the automotive sector, and to support them in the challenge of the energy transition with solutions based on a renewable energy-sharing model in which we firmly believe”, Vincenzo Viganò, Head of Retail for the Italian Market at Plenitude, said. Eni said Plenitude will assist Marelli throughout every stage of the project, from the planning and building of the facilities to the application for incentives. It will also offer its technological platform, “Plenitude Comunità Energetiche,” which will facilitate the management and oversight of the AID configuration. Meanwhile, at the production sites in Sulmona and Turin, the photovoltaic plants will have an installed capacity of 4 MWp and 400 kWp, respectively, contributing to potential energy cost savings for these sites, Eni said.
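The three site capacities quoted in the release sum to the stated 5.4 MWp total:

```python
# Installed capacity per site, in kilowatts-peak (kWp), from the release.
plants_kwp = {"Melfi": 999, "Sulmona": 4000, "Turin": 400}
total_mwp = sum(plants_kwp.values()) / 1000
print(f"Total: {total_mwp:.1f} MWp")  # 5.4 MWp
```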

Read More »

OKEA Discovers More Oil in Brage Field in Norwegian North Sea

OKEA ASA and its partners in production license 055 have made a discovery that is estimated to hold 300,000 to 2.8 million barrels of recoverable oil equivalent along the eastern flank of the already producing Brage field on Norway’s side of the North Sea. The discovery was made in the southern part of the Prince prospect in wildcat well 31/4-A-23 G. Well 31/4-A-23 F, in the northern part of the Prince prospect, turned up dry. “The licensees will now assess the deposit as part of the further development of the Brage field”, the Norwegian Offshore Directorate said in an online statement. The stakeholders are OKEA with a 35.2 percent stake, Lime Petroleum AS with 33.84 percent, DNO Norge AS with 14.26 percent, Petrolia NOCO AS with 12.26 percent and M Vest Energy AS with 4.44 percent. “The field has been in production for a long time, and work is under way to identify new methods to improve recovery”, the upstream regulator said. “New wells are being drilled, often combined with investigation of nearby prospects”. Well A-23 F aimed to prove petroleum in Upper Jurassic reservoir rocks in the Sognefjord Formation, while A-23 G aimed to delineate a potential discovery in A-23 F and delineate the northern part of 31/4-A-13 E (Kim). A-23 F, horizontally drilled, showed a sandstone layer in the Sognefjord Formation with a total measured thickness of 220 meters (721.78 feet) along the wellbore and 12 meters of vertical thickness with “good reservoir properties”, the Directorate reported. It was drilled to a measured depth of 6,285 meters and a vertical depth of 2,153 meters below sea level in the Sognefjord Formation. A-23 G was drilled horizontally at a vertical depth of 2,120-2,171 meters along the eastern flank of the Brage field. It encountered a sandstone layer three to four meters thick…

Read More »

DOE Announces New Supercomputer Powered by Dell and NVIDIA to Speed Scientific Discovery

BERKELEY — During a visit to Lawrence Berkeley National Laboratory (Berkeley Lab), U.S. Secretary of Energy Chris Wright today announced a new contract with Dell Technologies to develop NERSC-10, the next flagship supercomputer at the National Energy Research Scientific Computing Center (NERSC), a U.S. Department of Energy (DOE) user facility at Berkeley Lab. The new system, due in 2026, will be named after Jennifer Doudna, the Berkeley Lab-based biochemist who was awarded the 2020 Nobel Prize in Chemistry in recognition of her work on the gene-editing technology CRISPR. The new supercomputer, a Dell Technologies system powered by NVIDIA’s next-generation Vera Rubin platform, will be engineered to support large-scale high-performance computing (HPC) workloads like those in molecular dynamics, high-energy physics, and AI training and inference—and provide a robust environment for the workflows that make cutting-edge science possible. This announcement reflects the Trump Administration’s commitment to restoring the gold standard of American science and unleashing the next great wave of innovation. Doudna will be one of the most advanced supercomputers ever deployed by the Department, advancing U.S. leadership in the global race for AI. “The Doudna system represents DOE’s commitment to advancing American leadership in science, AI, and high-performance computing,” said U.S. Secretary of Energy Chris Wright. “It will be a powerhouse for rapid innovation that will transform our efforts to develop abundant, affordable energy supplies and advance breakthroughs in quantum computing. AI is the Manhattan Project of our time, and Doudna will help ensure America’s scientists have the tools they need to win the global race for AI dominance.” “At Dell Technologies, we are empowering researchers worldwide by seamlessly integrating simulation, data, and AI to address the world’s most complex challenges,” said Michael Dell, Chairman and CEO, Dell Technologies. “Our collaboration with the Department of Energy on Doudna underscores a shared vision to redefine

Read More »

DOE Issues LNG Export Authorization for Port Arthur Phase II, Advancing President Trump’s Commitment to Unleash American Energy

WASHINGTON — U.S. Secretary of Energy Chris Wright today approved a final authorization for liquefied natural gas (LNG) exports to non-free trade agreement (non-FTA) countries from Port Arthur LNG Phase II in Jefferson County, Texas, following the Response to Comments on the 2024 LNG Export Study issued on May 19. This is the first final LNG export approval under President Trump’s leadership and marks another step in restoring regular order to LNG export permitting, reversing the previous administration’s pause and delivering on the President’s pledge to unleash American energy. “Port Arthur LNG Phase II marks a significant expansion of the first phase already under construction, turning more of the liquid gold beneath our feet into energy security for the American people,” said Secretary Wright. “With President Trump’s leadership, the Energy Department is restoring America’s role as the world’s most reliable energy supplier.” “U.S. LNG exports continue to gain momentum, and I am glad DOE is able to do its part to answer the call for more reliable and affordable energy, at home and abroad,” said Tala Goudarzi, Principal Deputy Assistant Secretary of the Office of Fossil Energy and Carbon Management. Port Arthur LNG Phase II, owned by Sempra Energy, is projected to export 1.91 billion cubic feet per day (Bcf/d) once completed. In addition to Port Arthur Phase I—which is currently under construction and expected to begin exporting LNG in 2027—Sempra also operates the Cameron LNG export terminal in Louisiana, which has been exporting LNG since 2019, and is currently constructing the Energia Costa Azul terminal in Mexico, which is expected to begin commercial export operations of U.S.-sourced gas as LNG in 2026. Today’s action marks the fifth LNG export authorization issued by Secretary Wright, bringing the total volume of exports associated with approvals under President Trump’s leadership to 11.45 Bcf/d.

Read More »

Oil Falls on Weak US Data and OPEC Output Fears

Oil declined as soft US economic data and concerns about rising supplies eroded the risk-on sentiment from a court ruling that blocked a swath of the Trump administration’s tariffs. West Texas Intermediate fell 1.5% to settle near $61 a barrel after Interfax cited Kazakhstan as saying that OPEC+ is set to hike output at a meeting on Saturday, with the size of the increase still to be decided. Broader markets eased off earlier highs on data showing the US economy shrank at the start of the year, further pressuring the commodity. Crude had earlier rallied after a trade court blocked a vast range of President Donald Trump’s trade levies, including elevated rates on China — the world’s top importer of crude. “The path to sustainably higher prices remains extremely narrow,” with the market likely to struggle to absorb additional barrels from OPEC+ over the coming months, said Daniel Ghali, a commodity strategist at TD Securities. In the near term, algorithmic selling activity will weigh on prices into the weekend meeting, he added. Oil has trended lower since mid-January on concerns about the fallout from Trump’s tariff war, with the revival of idled production by OPEC+ adding to headwinds. The trade measures have rattled global markets, raising concerns over economic growth and demand for commodities. Meanwhile, wildfires are threatening about 5% of Canada’s crude output as a blaze in Alberta’s oil sands region spreads.

Oil Prices

WTI for July delivery slipped 1.5% to settle at $60.94 a barrel in New York. Brent for July settlement dipped 1.2% to settle at $64.15 a barrel.

Read More »

Goldman, Morgan Stanley Say Trump Can Deploy Other Tariff Tools

Two of Wall Street’s top investment banks cautioned that the impact of a court ruling striking down many of President Donald Trump’s tariff measures may prove limited, given that the administration has other avenues to impose import duties. “The tariff levels that we had yesterday are probably going to be the tariff levels that we have tomorrow, because there are so many different authorities the administration can reach into to put it back together,” Michael Zezas, Morgan Stanley’s global head of fixed income and thematic research, said on Bloomberg TV Thursday. Goldman Sachs Group Inc.’s Alec Phillips wrote in a note to clients late Wednesday that “this ruling represents a setback for the administration’s tariff plans and increases uncertainty but might not change the final outcome for most major US trading partners.” The judgment by the US Court of International Trade halts 6.7 percentage points of levies announced this year, and the White House could use other tariff tools to make up for that, wrote Phillips, Goldman’s chief US political economist. “For now, we expect the Trump administration will find other ways to impose tariffs.” Zezas offered a similar assessment, saying Trump’s power to “raise and escalate — it might be a little bit slower moving, but it is still there.” Talks with countries such as Japan were always likely to take time, he said. And while they proceed, the administration would be able to “stitch together that authority on the other tariffs that went away — so all the same leverage is effectively there during the negotiation.” For now, the White House is signaling it’s not planning to proceed with other tools. “There are different approaches that would take a couple of months” to put in place, Kevin Hassett, director of the National Economic Council, said on Fox Business Thursday.

Read More »

West of Orkney developers helped support 24 charities last year

The developers of the 2GW West of Orkney wind farm paid out a total of £18,000 to 24 organisations from its small donations fund in 2024. The money went to projects across Caithness, Sutherland and Orkney, including a mental health initiative in Thurso and a scheme by Dunnet Community Forest to improve the quality of meadows through the use of traditional scythes. Established in 2022, the fund offers up to £1,000 per project towards programmes in the far north. In addition to the small donations fund, the West of Orkney developers intend to follow other wind farms by establishing a community benefit fund once the project is operational. West of Orkney wind farm project director Stuart McAuley said: “Our donations programme is just one small way in which we can support some of the many valuable initiatives in Caithness, Sutherland and Orkney. “In every case we have been immensely impressed by the passion and professionalism each organisation brings, whether their focus is on sport, the arts, social care, education or the environment, and we hope the funds we provide help them achieve their goals.” In addition to the local donations scheme, the wind farm developers have helped fund a £1 million research and development programme led by EMEC in Orkney and a £1.2m education initiative led by UHI. It also provided £50,000 to support the FutureSkills apprenticeship programme in Caithness, with funds going to employment and training costs to help tackle skill shortages in the North of Scotland. The West of Orkney wind farm is being developed by Corio Generation, TotalEnergies and Renewable Infrastructure Development Group (RIDG). The project is among the leaders of the ScotWind cohort, having been the first to submit its offshore consent documents in late 2023. In addition, the project’s onshore plans were approved by the

Read More »

Biden bans US offshore oil and gas drilling ahead of Trump’s return

US President Joe Biden has announced a ban on offshore oil and gas drilling across vast swathes of the country’s coastal waters. The decision comes just weeks before his successor Donald Trump, who has vowed to increase US fossil fuel production, takes office. The drilling ban will affect 625 million acres of federal waters across America’s eastern and western coasts, the eastern Gulf of Mexico and Alaska’s Northern Bering Sea. The decision does not affect the western Gulf of Mexico, where much of American offshore oil and gas production occurs and is set to continue. In a statement, President Biden said he is taking action to protect the regions “from oil and natural gas drilling and the harm it can cause”. “My decision reflects what coastal communities, businesses, and beachgoers have known for a long time: that drilling off these coasts could cause irreversible damage to places we hold dear and is unnecessary to meet our nation’s energy needs,” Biden said. “It is not worth the risks. “As the climate crisis continues to threaten communities across the country and we are transitioning to a clean energy economy, now is the time to protect these coasts for our children and grandchildren.”

Offshore drilling ban

The White House said Biden used his authority under the 1953 Outer Continental Shelf Lands Act, which allows presidents to withdraw areas from mineral leasing and drilling. However, the law does not give a president the right to unilaterally reverse a drilling ban without congressional approval. This means that Trump, who pledged to “unleash” US fossil fuel production during his re-election campaign, could find it difficult to overturn the ban after taking office.

Read More »

The Download: our 10 Breakthrough Technologies for 2025

This is today’s edition of The Download, our weekday newsletter that provides a daily dose of what’s going on in the world of technology.

Introducing: MIT Technology Review’s 10 Breakthrough Technologies for 2025

Each year, we spend months researching and discussing which technologies will make the cut for our 10 Breakthrough Technologies list. We try to highlight a mix of items that reflect innovations happening in various fields. We look at consumer technologies, large industrial-scale projects, biomedical advances, changes in computing, climate solutions, the latest in AI, and more. We’ve been publishing this list every year since 2001 and, frankly, have a great track record of flagging things that are poised to hit a tipping point. It’s hard to think of another industry that has as much of a hype machine behind it as tech does, so the real secret of the TR10 is really what we choose to leave off the list. Check out the full list of our 10 Breakthrough Technologies for 2025, which is front and center in our latest print issue. It’s all about the exciting innovations happening in the world right now, and includes some fascinating stories, such as:

+ How digital twins of human organs are set to transform medical treatment and shake up how we trial new drugs.
+ What will it take for us to fully trust robots? The answer is a complicated one.
+ Wind is an underutilized resource that has the potential to steer the notoriously dirty shipping industry toward a greener future. Read the full story.
+ After decades of frustration, machine-learning tools are helping ecologists to unlock a treasure trove of acoustic bird data—and to shed much-needed light on their migration habits. Read the full story.
+ How poop could help feed the planet—yes, really. Read the full story.
Roundtables: Unveiling the 10 Breakthrough Technologies of 2025

Last week, Amy Nordrum, our executive editor, joined our news editor Charlotte Jee to unveil our 10 Breakthrough Technologies of 2025 in an exclusive Roundtable discussion. Subscribers can watch their conversation back here. And, if you’re interested in previous discussions about topics ranging from mixed reality tech to gene editing to AI’s climate impact, check out some of the highlights from the past year’s events.

This international surveillance project aims to protect wheat from deadly diseases

For as long as there’s been domesticated wheat (about 8,000 years), there has been harvest-devastating rust. Breeding efforts in the mid-20th century led to rust-resistant wheat strains that boosted crop yields, and rust epidemics receded in much of the world. But now, after decades, rusts are considered a reemerging disease in Europe, at least partly due to climate change. An international initiative hopes to turn the tide by scaling up a system to track wheat diseases and forecast potential outbreaks to governments and farmers in close to real time. And by doing so, they hope to protect a crop that supplies about one-fifth of the world’s calories. Read the full story. —Shaoni Bhattacharya

The must-reads

I’ve combed the internet to find you today’s most fun/important/scary/fascinating stories about technology.

1 Meta has taken down its creepy AI profiles
Following a big backlash from unhappy users. (NBC News)
+ Many of the profiles were likely to have been live from as far back as 2023. (404 Media)
+ It also appears they were never very popular in the first place. (The Verge)

2 Uber and Lyft are racing to catch up with their robotaxi rivals
After abandoning their own self-driving projects years ago. (WSJ $)
+ China’s Pony.ai is gearing up to expand to Hong Kong. (Reuters)

3 Elon Musk is going after NASA
He’s largely veered away from criticising the space agency publicly—until now. (Wired $)
+ SpaceX’s Starship rocket has a legion of scientist fans. (The Guardian)
+ What’s next for NASA’s giant moon rocket? (MIT Technology Review)

4 How Sam Altman actually runs OpenAI
Featuring three-hour meetings and a whole lot of Slack messages. (Bloomberg $)
+ ChatGPT Pro is a pricey loss-maker, apparently. (MIT Technology Review)

5 The dangerous allure of TikTok
Migrants’ online portrayals of their experiences in America aren’t always reflective of their realities. (New Yorker $)

6 Demand for electricity is skyrocketing
And AI is only a part of it. (Economist $)
+ AI’s search for more energy is growing more urgent. (MIT Technology Review)

7 The messy ethics of writing religious sermons using AI
Skeptics aren’t convinced the technology should be used to channel spirituality. (NYT $)
8 How a wildlife app became an invaluable wildfire tracker
Watch Duty has become a safeguarding sensation across the US west. (The Guardian)
+ How AI can help spot wildfires. (MIT Technology Review)

9 Computer scientists just love oracles 🔮
Hypothetical devices are a surprisingly important part of computing. (Quanta Magazine)
10 Pet tech is booming 🐾
But not all gadgets are made equal. (FT $)
+ These scientists are working to extend the lifespan of pet dogs—and their owners. (MIT Technology Review)

Quote of the day

“The next kind of wave of this is like, well, what is AI doing for me right now other than telling me that I have AI?”

—Anshel Sag, principal analyst at Moor Insights and Strategy, tells Wired a lot of companies’ AI claims are overblown.
The big story

Broadband funding for Native communities could finally connect some of America’s most isolated places

September 2022

Rural and Native communities in the US have long had lower rates of cellular and broadband connectivity than urban areas, where four out of every five Americans live. Outside the cities and suburbs, which occupy barely 3% of US land, reliable internet service can still be hard to come by.
The covid-19 pandemic underscored the problem as Native communities locked down and moved school and other essential daily activities online. But it also kicked off an unprecedented surge of relief funding to solve it. Read the full story.

—Robert Chaney

We can still have nice things

A place for comfort, fun and distraction to brighten up your day. (Got any ideas? Drop me a line or skeet ’em at me.)

+ Rollerskating Spice Girls is exactly what your Monday morning needs.
+ It’s not just you, some people really do look like their dogs!
+ I’m not sure if this is actually the world’s healthiest meal, but it sure looks tasty.
+ Ah, the old “bitten by a rabid fox” chestnut.

Read More »

Equinor Secures $3 Billion Financing for US Offshore Wind Project

Equinor ASA has announced a final investment decision on Empire Wind 1 and financial close for $3 billion in debt financing for the under-construction project offshore Long Island, expected to power 500,000 New York homes. The Norwegian majority state-owned energy major said in a statement it intends to farm down ownership “to further enhance value and reduce exposure”. Equinor has taken full ownership of Empire Wind 1 and 2 since last year, in a swap transaction with 50 percent co-venturer BP PLC that allowed the former to exit the Beacon Wind lease, also a 50-50 venture between the two. Equinor has yet to complete a portion of the transaction under which it would also acquire BP’s 50 percent share in the South Brooklyn Marine Terminal lease, according to the latest transaction update on Equinor’s website. The lease involves a terminal conversion project that was intended to serve as an interconnection station for Beacon Wind and Empire Wind, as agreed on by the two companies and the state of New York in 2022.  “The expected total capital investments, including fees for the use of the South Brooklyn Marine Terminal, are approximately $5 billion including the effect of expected future tax credits (ITCs)”, said the statement on Equinor’s website announcing financial close. Equinor did not disclose its backers, only saying, “The final group of lenders includes some of the most experienced lenders in the sector along with many of Equinor’s relationship banks”. “Empire Wind 1 will be the first offshore wind project to connect into the New York City grid”, the statement added. “The redevelopment of the South Brooklyn Marine Terminal and construction of Empire Wind 1 will create more than 1,000 union jobs in the construction phase”, Equinor said. On February 22, 2024, the Bureau of Ocean Energy Management (BOEM) announced

Read More »

USA Crude Oil Stocks Drop Week on Week

U.S. commercial crude oil inventories, excluding those in the Strategic Petroleum Reserve (SPR), decreased by 1.2 million barrels from the week ending December 20 to the week ending December 27, the U.S. Energy Information Administration (EIA) highlighted in its latest weekly petroleum status report, which was released on January 2. Crude oil stocks, excluding the SPR, stood at 415.6 million barrels on December 27, 416.8 million barrels on December 20, and 431.1 million barrels on December 29, 2023, the report revealed. Crude oil in the SPR came in at 393.6 million barrels on December 27, 393.3 million barrels on December 20, and 354.4 million barrels on December 29, 2023, the report showed. Total petroleum stocks – including crude oil, total motor gasoline, fuel ethanol, kerosene type jet fuel, distillate fuel oil, residual fuel oil, propane/propylene, and other oils – stood at 1.623 billion barrels on December 27, the report revealed. This figure was up 9.6 million barrels week on week and up 17.8 million barrels year on year, the report outlined. “At 415.6 million barrels, U.S. crude oil inventories are about five percent below the five year average for this time of year,” the EIA said in its latest report. “Total motor gasoline inventories increased by 7.7 million barrels from last week and are slightly below the five year average for this time of year. Finished gasoline inventories decreased last week while blending components inventories increased last week,” it added. “Distillate fuel inventories increased by 6.4 million barrels last week and are about six percent below the five year average for this time of year. Propane/propylene inventories decreased by 0.6 million barrels from last week and are 10 percent above the five year average for this time of year,” it went on to state. In the report, the EIA noted

Read More »

More telecom firms were breached by Chinese hackers than previously reported

Broader implications for US infrastructure

The Salt Typhoon revelations follow a broader pattern of state-sponsored cyber operations targeting the US technology ecosystem. The telecom sector, serving as a backbone for industries including finance, energy, and transportation, remains particularly vulnerable to such attacks. While Chinese officials have dismissed the accusations as disinformation, the recurring breaches underscore the pressing need for international collaboration and policy enforcement to deter future attacks. The Salt Typhoon campaign has uncovered alarming gaps in the cybersecurity of US telecommunications firms, with breaches now extending to over a dozen networks. Federal agencies and private firms must act swiftly to mitigate risks as adversaries continue to evolve their attack strategies. Strengthening oversight, fostering industry-wide collaboration, and investing in advanced defense mechanisms are essential steps toward safeguarding national security and public trust.

Read More »

Mistral launches new code embedding model that outperforms OpenAI and Cohere in real-world retrieval tasks

With demand for enterprise retrieval augmented generation (RAG) on the rise, the opportunity is ripe for model providers to offer their take on embedding models. French AI company Mistral threw its hat into the ring with Codestral Embed, its first embedding model, which it said outperforms existing embedding models on benchmarks like SWE-Bench. The model specializes in code and “performs especially well for retrieval use cases on real-world code data.” The model is available to developers for $0.15 per million tokens. The company said Codestral Embed “significantly outperforms leading code embedders” like Voyage Code 3, Cohere Embed v4.0 and OpenAI’s embedding model, Text Embedding 3 Large. Codestral Embed, part of Mistral’s Codestral family of coding models, can make embeddings that transform code and data into numerical representations for RAG. “Codestral Embed can output embeddings with different dimensions and precisions, and the figure below illustrates the trade-offs between retrieval quality and storage costs,” Mistral said in a blog post. “Codestral Embed with dimension 256 and int8 precision still performs better than any model from our competitors. The dimensions of our embeddings are ordered by relevance. For any integer target dimension n, you can choose to keep the first n dimensions for a smooth trade-off between quality and cost.” Mistral tested the model on several benchmarks, including SWE-Bench and Text2Code from GitHub. In both cases, the company said Codestral Embed outperformed leading embedding models.

Use cases

Mistral said Codestral Embed is optimized for “high-performance
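Mistral’s note that its embedding dimensions are ordered by relevance lends itself to a quick illustration. The sketch below is not Mistral’s API; it is a hypothetical Python example of keeping the first n dimensions of an embedding and storing it at int8 precision, assuming a mock 1,536-dimension vector.

```python
import numpy as np

def truncate_and_quantize(embedding: np.ndarray, n_dims: int = 256) -> np.ndarray:
    """Illustrative only: keep the first n dimensions of a relevance-ordered
    embedding, re-normalize, then quantize to int8 to cut storage costs."""
    truncated = embedding[:n_dims]
    truncated = truncated / np.linalg.norm(truncated)  # restore unit length for cosine search
    # Map floats in [-1, 1] onto the int8 range [-127, 127].
    return np.round(truncated * 127).astype(np.int8)

# A mock 1,536-dim float32 embedding shrinks 24x (1536 * 4 bytes -> 256 * 1 byte).
full = np.random.default_rng(0).normal(size=1536).astype(np.float32)
full /= np.linalg.norm(full)
compact = truncate_and_quantize(full, 256)
print(compact.nbytes, "bytes vs", full.nbytes)
```

Because the leading dimensions carry the most signal, similarity search over the truncated, re-normalized vectors degrades gracefully as n shrinks, which is the quality-versus-cost trade-off Mistral describes.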

Read More »

Nvidia CEO takes a shot at U.S. policy cutting off AI chip sales to China

Nvidia CEO Jensen Huang tiptoed into politics with a comment taking a shot at the U.S. policy that has cut off sales of his chips to China.

That was because Nvidia had to take a $4.5 billion charge against its Q1 earnings after it was forced to immediately cease selling H20 AI chips to China in April. U.S. President Donald Trump imposed the restrictions as part of the trade war over tariffs with China and other countries.

“Let me share my perspective on some topics we’re frequently asked on export control. China is one of the world’s largest AI markets and a springboard to global success with half of the world’s AI researchers based there,” Huang said. “The platform that wins China is positioned to lead globally today. However, the $50 billion China market is effectively closed to U.S. industry. The H20 export ban ended our Hopper data center business in China. We cannot produce Hopper further to comply. As a result, we are taking a multibillion-dollar write-off on inventory that cannot be sold or repurposed. We are exploring limited ways to compete, but Hopper is no longer an option.”

Huang said that, with or without U.S. chips, China has to compute to train and deploy advanced models.

“The question is not whether China will have it. It already does,” he said. “The question is whether one of the world’s largest AI markets will run on American platforms. Shielding Chinese chip makers from U.S. competition only strengthens them abroad and weakens America’s position.”

He added, “Export restrictions have spurred China’s” competitiveness. He said, “The race is not just about chips. It’s about which stack the world runs as that stack grows. Global infrastructure leadership is at stake. The U.S. has based its policy on the assumption that China cannot make any chips. That assumption was always questionable, and now it’s very wrong. China has enormous manufacturing capability. In the end, the platform that wins the AI developers wins AI. AI export controls should strengthen U.S. platforms, not drive half the world’s AI talent” to other shores.

Read More »

Nvidia beats estimates for Q1 results as revenues rise 69% from a year ago

Nvidia, the chip company whose AI and graphics processors are driving societal change, reported that revenue for the first quarter ended April 27, 2025, was $44.1 billion, up 12% from the previous quarter and up 69% from a year ago.

On April 9, 2025, the U.S. government told Nvidia that a license is required for exports of its H20 products into the China market. As a result of these new requirements, Nvidia incurred a $4.5 billion charge in the first quarter of fiscal 2026 associated with H20 excess inventory and purchase obligations as the demand for H20 diminished.

Sales of H20 products were $4.6 billion for the first quarter of fiscal 2026 prior to the new export licensing requirements. Nvidia was unable to ship an additional $2.5 billion of H20 revenue in the first quarter. Excluding the $4.5 billion charge, first quarter non-GAAP gross margin would have been 71.3%.


For the quarter, GAAP and non-GAAP earnings per diluted share were $0.76 and $0.81, respectively. Excluding the $4.5 billion charge and related tax impact, first-quarter non-GAAP diluted earnings per share would have been $0.96. Analysts expected net income of $0.75 a share on Q1 revenue of $43.2 billion.

“Our breakthrough Blackwell NVL72 AI supercomputer — a ‘thinking machine’ designed for reasoning — is now in full-scale production across system makers and cloud service providers,” said Jensen Huang, founder and CEO of Nvidia, in a statement. “Global demand for Nvidia’s AI infrastructure is incredibly strong. AI inference token generation has surged tenfold in just one year, and as AI agents become mainstream, the demand for AI computing will accelerate. Countries around the world are recognizing AI as essential infrastructure — just like electricity and the internet — and Nvidia stands at the center of this profound transformation.”

Nvidia shareholders got a scare when DeepSeek emerged. Back on January 27, Nvidia’s stock fell 17%, losing $600 billion in market value, after investors worried that DeepSeek’s efficient AI models might reduce the demand for Nvidia’s high-margin AI hardware. But the stock has recovered since that time.

Nvidia downplayed the concerns and during the quarter announced a number of new products during the Computex trade show in Taiwan last week.

In its outlook for the second quarter, Nvidia said it expects revenues to be $45 billion, plus or minus 2%. This reflects a loss in H20 revenue of $8 billion due to the export control limitations.

GAAP and non-GAAP gross margins are expected to be 71.8% and 72.0%, respectively, plus or minus 50 basis points. The company is continuing to work toward achieving gross margins in the mid-70% range late this year.

GAAP and non-GAAP operating expenses are expected to be approximately $5.7 billion and $4.0 billion, respectively. Full year fiscal 2026 operating expense growth is expected to be in the mid-30% range.

Data Center revenues

Nvidia powers the world’s most powerful quantum research supercomputer.

First-quarter revenue was $39.1 billion, up 10% from the previous quarter and up 73% from a year ago.

Nvidia announced it is building factories in the U.S. and working with its partners to produce Nvidia AI supercomputers in the U.S.

It also introduced Nvidia Blackwell Ultra and Nvidia Dynamo for accelerating and scaling AI reasoning models. And it announced a partnership with HUMAIN to build AI factories in the Kingdom of Saudi Arabia to drive the next wave of artificial intelligence development.

Also in the Middle East, it unveiled Stargate UAE, a next-generation AI infrastructure cluster in Abu Dhabi, United Arab Emirates, alongside strategic partners G42, OpenAI, Oracle, SoftBank Group and Cisco.

The company said it plans to work with Foxconn and the Taiwan government to build an AI factory supercomputer. It also announced joint initiatives with Alphabet and Google to advance agentic AI solutions, robotics and drug discovery.

And it revealed that Nvidia Blackwell cloud instances are now available on AWS, Google Cloud, Microsoft Azure and Oracle Cloud Infrastructure.

Gaming and AI PC

First-quarter gaming revenue was a record $3.8 billion, up 48% from the previous quarter and up 42% from a year ago.

Nvidia also announced the GeForce RTX 5070 and RTX 5060, bringing Blackwell graphics to gamers at prices starting from $299 for desktops and $1,099 for laptops.

And it said Nvidia DLSS 4 is now available in over 125 games, including Black Myth Wukong, DOOM: The Dark Ages, Indiana Jones and the Great Circle, Marvel Rivals and Star Wars Outlaws.

It also noted the Nintendo Switch 2, which is launching on June 5, is powered by an Nvidia processor and AI-powered DLSS, delivering up to 4K gaming.

And it launched the Nvidia RTX Remix modding platform, attracting over two million gamers, alongside the release of the Half-Life 2 RTX demo.

Professional Visualization

First-quarter revenue was $509 million, flat with the previous quarter and up 19% from a year ago.

Nvidia announced the RTX PRO Blackwell series for workstations and servers.

The company unveiled Nvidia DGX Spark and DGX Station™ personal AI supercomputers powered by the Nvidia Grace Blackwell platform.

Nvidia announced that leading industrial software and service providers Accenture, Ansys, Databricks, SAP, Schneider Electric with ETAP, and Siemens are integrating the Nvidia Omniverse platform into their solutions to accelerate industrial digitalization with physical AI.

Automotive and Robotics

First-quarter Automotive revenue was $567 million, down 1% from the previous quarter and up 72% from a year ago.

The company announced a collaboration with General Motors on next-generation vehicles, factories and robots using Nvidia Omniverse, Nvidia Cosmos and Nvidia Drive AGX.

It also announced Nvidia Isaac GR00T N1, the world’s first open humanoid robot foundation model, followed by Nvidia Isaac GR00T N1.5; Nvidia Isaac GR00T-Dreams, a blueprint for generating synthetic motion data; and Nvidia Blackwell systems to accelerate humanoid robot development.

Read More »

Less is more: Meta study shows shorter reasoning improves AI accuracy by 34%

Researchers from Meta’s FAIR team and The Hebrew University of Jerusalem have discovered that forcing large language models to “think” less actually improves their performance on complex reasoning tasks. The study released today found that shorter reasoning processes in AI systems lead to more accurate results while significantly reducing computational costs. “In this work, we challenge the assumption that long thinking chains results in better reasoning capabilities,” write the authors in their paper titled “Don’t Overthink it. Preferring Shorter Thinking Chains for Improved LLM Reasoning.” The research contradicts the prevailing trend in AI development, where companies have invested heavily in scaling up computing resources to allow models to perform extensive reasoning through lengthy “thinking chains” — detailed step-by-step trajectories that AI systems use to solve complex problems.

AI accuracy jumps 34% when models use shorter reasoning chains

The researchers discovered that within the same reasoning task, “shorter reasoning chains are significantly more likely to yield correct answers — up to 34.5% more accurate than the longest chain sampled for the same question.” This finding held true across multiple leading AI models and benchmarks. “While demonstrating impressive results, [extensive reasoning] incurs significant computational costs and inference time,” the authors note, pointing to a substantial inefficiency in how these systems are currently deployed. Based on these findings, the team developed a novel approach called “short-m@k,” which executes multiple reasoning attempts in parallel but halts computation once the first few processes complete. The final answer is then selected through majority voting among these shorter chains.

New ‘short-m@k’ method slashes computing costs by 40% while boosting performance

For organizations deploying large AI reasoning systems, the implications could be substantial. The researchers found their method could
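The description above maps onto a simple control loop. Here is a minimal Python sketch of the short-m@k idea, assuming `generate` wraps a sampled model call that returns a final answer string; the paper’s actual implementation details are not reproduced.

```python
import collections
import concurrent.futures

def short_m_at_k(generate, question, k=8, m=3):
    """Launch k reasoning attempts in parallel, keep the first m to finish
    (the shortest chains), and majority-vote their answers."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=k)
    futures = [pool.submit(generate, question) for _ in range(k)]
    answers = []
    for future in concurrent.futures.as_completed(futures):
        answers.append(future.result())
        if len(answers) == m:  # halt once the m fastest chains complete
            break
    # Best-effort cleanup: cancel attempts that have not started yet.
    pool.shutdown(wait=False, cancel_futures=True)
    return collections.Counter(answers).most_common(1)[0][0]
```

The chains that finish first are, by construction, the shortest ones, so the majority vote is taken over exactly the chains the researchers found most likely to be correct.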

Read More »

The AI Hype Index: College students are hooked on ChatGPT

Separating AI reality from hyped-up fiction isn’t always easy. That’s why we’ve created the AI Hype Index—a simple, at-a-glance summary of everything you need to know about the state of the industry. Large language models confidently present their responses as accurate and reliable, even when they’re neither of those things. That’s why we’ve recently seen chatbots supercharge vulnerable people’s delusions, make citation mistakes in an important legal battle between music publishers and Anthropic, and (in the case of xAI’s Grok) rant irrationally about “white genocide.” But it’s not all bad news—AI could also finally lead to a better battery life for your iPhone and solve tricky real-world problems that humans have been struggling to crack, if Google DeepMind’s new model is any indication. And perhaps most exciting of all, it could combine with brain implants to help people communicate when they have lost the ability to speak.

Read More »

Rumi raises $4.7M to change passive media into interactive AI experiences

How would you like to earn money while watching TV? Rumi, an AI media company, has raised $4.7 million in a pre-seed funding round to transform passive media with rewards.

Rumi is launching its invite-only beta app today, and it aims to allow users to engage with media content in real time.

The round, co-led by A16z crypto CSX and EV3, will support Rumi’s mission to build the first decentralized AI infrastructure for media, combining cutting-edge artificial intelligence with user-powered indexing to transform streaming into an intelligent, interactive experience.

Viewers can chat with characters or pro sports athletes about what they’re doing on the screen, identify actors and outfits on screen, or receive real-time fact checks during heated political debates — all powered by AI.

“Rumi sits at the cutting edge of AI agents, consumer media and decentralized infrastructure,” said Salvador Gala, cofounder of EV3, in a statement. “Their approach to indexing audio and video content through a distributed network of AI-powered nodes empowers users to directly benefit from existing behaviors.”

Streaming meets earning


With users spending more than five hours a day watching content — often while multitasking — Rumi allows them to earn passive income by contributing spare compute to build the world’s richest knowledge base on video content that in turn allows AI agents to deeply understand and augment media in ways we can’t even imagine yet, the company said. 

“AI can make storytelling truly immersive — but it first needs to understand the world’s content and culture to do so,” said Niko Cunningham, CEO of Rumi, in a statement. “Rumi is building that infrastructure and giving users a way to be part of it from day one.”

This decentralized indexing system helps AI agents deliver smarter insights and personalized features such as contextual overlays, real-time cultural context and interactive recommendations, while protecting user privacy and respecting content IP.

Business model and partnerships

Rumi is licensing infrastructure to third-party AI agents, developers, and creators, providing agents with brand-new functionality and monetization opportunities. Those opportunities include contextual ads which, from the viewer’s perspective, can be seamlessly integrated into the conversation with AI agents, among other value-adding features. Agents can organically surface recipe cards during cooking shows or fashion links tied to what characters are wearing on the screen.

Rumi is also building a “decentralized Nielsen,” offering real-time analytics and audience insights. Early partners include Virtuals.io, which will integrate Rumi’s APIs to give agents on its platform media contextual-awareness and interactivity capabilities, as well as TVision and Story Protocol, which will rely on Rumi for unique viewership data and content analytics.

Rumi is now welcoming early users to its watch-to-earn beta, allowing participants to help build what it calls the first intelligent, user-powered media ecosystem. Sign up at https://www.rumilabs.io.

How Rumi works

Rumi is building infrastructure that allows AI to understand the human stories and culture locked in billions of hours of video.

Imagine having an entire cast and crew, commentators, shopping assistants, fact-checkers, researchers, and writers in your living room — sitting on your couch with you as you watch your favorite movies, shows, sports, and news. That’s media-contextually-aware AI. To enable that future, the company built a decentralized network of indexers that analyze content as they watch and get compensated for their work and compute.
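Rumi has not published code for this network, so the following is a purely hypothetical Python sketch of the flow it describes: analyze content locally, submit an index record, accrue rewards. Every function and field name here is invented for illustration.

```python
import hashlib
import time

def index_chunk(frame_bytes: bytes, transcript: str) -> dict:
    """Hypothetical work unit for one indexer node: a content fingerprint
    plus lightweight metadata. No field name here comes from Rumi."""
    return {
        "fingerprint": hashlib.sha256(frame_bytes).hexdigest(),  # identifies content without storing it
        "transcript": transcript,
        "indexed_at": time.time(),
    }

def node_loop(get_next_chunk, submit_to_network) -> float:
    """Analyze content as it is watched, submit each index record,
    and accumulate whatever reward the network pays out."""
    earned = 0.0
    while (chunk := get_next_chunk()) is not None:
        record = index_chunk(chunk["frames"], chunk["transcript"])
        earned += submit_to_network(record)  # network pays for compute + data
    return earned
```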

Over time, the company plans to use the money to bring to market what it calls the world’s best AI architecture for distilling the story from video: the relations between actions, characters, and objects, and story and character arcs. The aim is mapping, understanding, and participating in the world’s stories, not just the documents and facts that today’s LLMs handle.

The company is building infrastructure to support live TV and sports (versus the historical content it covers now), and to build a brand-new direct-to-consumer gateway to media: what it describes as the first AI remote and media companion app.

How the “watch-to-earn” model works

Rumi indexes videos and offers rewards to viewers.

Users get paid for their compute and data contributions. They are providing value by analyzing video content and interacting with it on their machines.

That data is valuable because it empowers the next generation of AI agents that can identify and deeply understand content, giving them the ability to augment and personalize content for you and turn passive media into an interactive experience.

The viewership analytics Rumi collects, which capture how people interact with content at a depth not reached before, are also used to improve content and advertising quality.

The company said it will help individual creators and Hollywood studios improve quality of storytelling, as Rumi is collecting the richest and most granular trove of data on what resonates with people — stories, characters, objects.

Rumi said it will boost advertisers’ return on investment by enabling contextual, agentic ads that are more personalized and valuable to consumers and feel seamless and organic, embedded in the natural conversation flow: ads that don’t feel like ads. Users must, of course, agree to the company’s privacy policies first.

As far as the target audience goes, Rumi said anyone with a computer and access to streaming platforms can join its network today and get rewarded for their contributions to enabling the AI-powered media future. The company is primarily focused on Web3 users, as they are used to the model of sharing their spare resources, such as compute power or bandwidth, in exchange for points or tokens.

Rival companies

There are a few decentralized networks paying people for their compute resources to train AI models or run inference, such as Bittensor, Akash, or Render.

However, Rumi said it is focused on video understanding and has built proprietary deep learning models and architectures that allow it to distill the essence of the story from the video, which Rumi believes will be critical for AI to truly understand the stories humans care about and how they connect to our culture, building up toward more human AI.

Rumi makes money by providing media contextual awareness to AI agents, allowing them to become far more than simple bots on social media. Rumi charges them for accessing its APIs and serves their end users contextual ads.

It’s also selling viewership data and analytics to media companies, advertisers, AI Labs, and other parties interested in understanding how consumers interact with content (on which they spend perhaps five hours a day).

Content licensing and rights management policies for AI

Rumi said it never stores, rebroadcasts, or exposes any actual IP-protected content. Its decentralized network indexes it in much the same way that Google indexed web pages.

The company said it will never train any AI models on that IP-protected content. It is building infrastructure that benefits IP owners: Rumi gives them a brand new direct channel of communication with their audiences, via which they can deepen the consumer engagement in ways unimaginable before, and access new monetization opportunities.

Courts have repeatedly held that publishing granular transcriptions or summaries of video content can qualify as fair use—as seen in Google Books (Authors Guild v. Google), where digitizing and displaying searchable snippets of entire books was ruled transformative. The same held in TVEyes v. Fox, where indexing and repurposing broadcast content for search and review was partially upheld as fair use.

Rumi’s network of indexers transcribes content, but it doesn’t publish it to the public. Rumi’s protocol only exposes APIs to allow AI Agents to interact with that transcript and content fingerprint, prompt by prompt.
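As a purely hypothetical illustration of that prompt-by-prompt access pattern (Rumi has not published an API specification, so the endpoint and field names below are invented):

```python
import json
import urllib.request

def ask_about_scene(api_base: str, fingerprint: str, prompt: str) -> str:
    """Hypothetical client call: an agent queries the indexed transcript and
    fingerprint one prompt at a time; raw content never leaves the index."""
    payload = json.dumps({"fingerprint": fingerprint, "prompt": prompt}).encode()
    request = urllib.request.Request(
        f"{api_base}/v1/context",  # invented endpoint for illustration
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["answer"]
```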

Rumi said this opens endless opportunities for creators to make their content more personalized and interactive, so it ultimately benefits the IP owners, as they gain direct channels of communication with their audiences, which they can monetize.

Why now? What macro trends are converging that make Rumi viable today?

Consumers want more interactive, connected, personalized, and immersive entertainment. That much is clear from the growth of gaming, social media, and the massive second-screening trend: 80% of Gen Zs report constantly scrolling through their phones while watching TV.

Consumer AI agents are about to boom, but they will not start with high-stakes use cases like planning and paying for your vacation. They will start in your living room, making your entertainment more immersive, connected, and educational. The tech is finally ready: anyone can analyze content on their device, contribute to a more immersive and personalized future of media, and earn while doing so, Rumi said.

Read More »


Fueling seamless AI at scale

In partnership with Arm

From large language models (LLMs) to reasoning agents, today’s AI tools bring unprecedented computational demands. Trillion-parameter models, workloads running on-device, and swarms of agents collaborating to complete tasks all require a new paradigm of computing to become truly seamless and ubiquitous. First, technical progress in hardware and silicon design is critical to pushing the boundaries of compute. Second, advances in machine learning (ML) allow AI systems to achieve increased efficiency with smaller computational demands. Finally, the integration, orchestration, and adoption of AI into applications, devices, and systems is crucial to delivering tangible impact and value.

Silicon’s mid-life crisis

AI has evolved from classical ML to deep learning to generative AI. The most recent chapter, which took AI mainstream, hinges on two phases—training and inference—that are data and energy-intensive in terms of computation, data movement, and cooling. At the same time, Moore’s Law, the observation that the number of transistors on a chip doubles roughly every two years, is reaching a physical and economic plateau. For the last 40 years, silicon chips and digital technology have nudged each other forward—every step ahead in processing capability frees the imagination of innovators to envision new products, which require yet more power to run. That is happening at light speed in the AI age.
As models become more readily available, deployment at scale puts the spotlight on inference and the application of trained models for everyday use cases. This transition requires the appropriate hardware to handle inference tasks efficiently. Central processing units (CPUs) have managed general computing tasks for decades, but the broad adoption of ML introduced computational demands that stretched the capabilities of traditional CPUs. This has led to the adoption of graphics processing units (GPUs) and other accelerator chips for training complex neural networks, due to their parallel execution capabilities and high memory bandwidth that allow large-scale mathematical operations to be processed efficiently. But CPUs remain the most widely deployed processors and can act as companions to accelerators like GPUs and tensor processing units (TPUs). AI developers are also hesitant to adapt software to fit specialized or bespoke hardware, and they favor the consistency and ubiquity of CPUs. Chip designers are unlocking performance gains through optimized software tooling, adding novel processing features and data types specifically to serve ML workloads, integrating specialized units and accelerators, and advancing silicon chip innovations, including custom silicon. AI itself is a helpful aid for chip design, creating a positive feedback loop in which AI helps optimize the chips that it needs to run. These enhancements and strong software support mean modern CPUs are a good choice to handle a range of inference tasks.
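One illustrative route to CPU-based inference is a hardware-agnostic runtime. This minimal sketch uses ONNX Runtime’s CPU execution provider; the "model.onnx" file and the input name "input" are placeholders rather than a specific published model.

```python
import numpy as np
import onnxruntime as ort  # pip install onnxruntime

# Run a trained model on the CPU execution provider.
# "model.onnx" and the input name "input" are placeholders.
session = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
batch = np.random.rand(1, 3, 224, 224).astype(np.float32)  # example image-shaped input
outputs = session.run(None, {"input": batch})  # None = return all model outputs
print(outputs[0].shape)
```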
Beyond silicon-based processors, disruptive technologies are emerging to address growing AI compute and data demands. The unicorn start-up Lightmatter, for instance, introduced photonic computing solutions that use light for data transmission to generate significant improvements in speed and energy efficiency. Quantum computing represents another promising area in AI hardware. While still years or even decades away, the integration of quantum computing with AI could further transform fields like drug discovery and genomics.

Understanding models and paradigms

Developments in ML theories and network architectures have significantly enhanced the efficiency and capabilities of AI models. Today, the industry is moving from monolithic models to agent-based systems characterized by smaller, specialized models that work together to complete tasks more efficiently at the edge—on devices like smartphones or modern vehicles. This allows them to extract increased performance gains, like faster model response times, from the same or even less compute. Researchers have developed techniques, including few-shot learning, to train AI models using smaller datasets and fewer training iterations. AI systems can learn new tasks from a limited number of examples to reduce dependency on large datasets and lower energy demands. Optimization techniques like quantization, which lowers memory requirements by selectively reducing precision, are helping shrink model sizes without sacrificing performance (a generic sketch of the idea appears below). New system architectures, like retrieval-augmented generation (RAG), have streamlined data access during both training and inference to reduce computational costs and overhead. DeepSeek R1, an open source LLM, is a compelling example of how more output can be extracted using the same hardware. By applying reinforcement learning techniques in novel ways, R1 has achieved advanced reasoning capabilities while using far fewer computational resources in some contexts. The integration of heterogeneous computing architectures, which combine various processing units like CPUs, GPUs, and specialized accelerators, has further optimized AI model performance. This approach allows for the efficient distribution of workloads across different hardware components to optimize computational throughput and energy efficiency based on the use case.

Orchestrating AI

As AI becomes an ambient capability humming in the background of many tasks and workflows, agents are taking charge and making decisions in real-world scenarios. These range from customer support to edge use cases, where multiple agents coordinate and handle localized tasks across devices. With AI increasingly used in daily life, the role of user experiences becomes critical for mass adoption. Features like predictive text in touch keyboards, and adaptive gearboxes in vehicles, offer glimpses of AI as a vital enabler to improve technology interactions for users. Edge processing is also accelerating the diffusion of AI into everyday applications, bringing computational capabilities closer to the source of data generation. Smart cameras, autonomous vehicles, and wearable technology now process information locally to reduce latency and improve efficiency. Advances in CPU design and energy-efficient chips have made it feasible to perform complex AI tasks on devices with limited power resources. This shift toward heterogeneous compute enhances the development of ambient intelligence, where interconnected devices create responsive environments that adapt to user needs.
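To make the quantization technique mentioned above concrete, here is a generic sketch of symmetric int8 weight quantization. It is a framework-agnostic illustration, not any vendor’s API: each float32 weight is stored as one signed byte plus a shared scale factor.

```python
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Symmetric int8 quantization: 1 byte per weight plus one float scale,
    instead of 4 bytes per float32 weight (a 4x memory reduction)."""
    scale = np.abs(weights).max() / 127.0
    quantized = np.round(weights / scale).astype(np.int8)
    return quantized, scale

def dequantize(quantized: np.ndarray, scale: float) -> np.ndarray:
    return quantized.astype(np.float32) * scale

weights = np.random.default_rng(1).normal(size=(1024, 1024)).astype(np.float32)
q, s = quantize_int8(weights)
error = np.abs(weights - dequantize(q, s)).max()
print(f"{weights.nbytes / q.nbytes:.0f}x smaller, max error {error:.4f}")
```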

Seamless AI naturally requires common standards, frameworks, and platforms to bring the industry together. Contemporary AI brings new risks. For instance, by adding more complex software and personalized experiences to consumer devices, it expands the attack surface for hackers, requiring stronger security at both the software and silicon levels, including cryptographic safeguards, and transforming the trust model of compute environments. More than 70% of respondents to a 2024 DarkTrace survey reported that AI-powered cyber threats significantly affect their organizations, while 60% say their organizations are not adequately prepared to defend against AI-powered attacks. Collaboration is essential to forging common frameworks. Universities contribute foundational research, companies apply findings to develop practical solutions, and governments establish policies for ethical and responsible deployment. Organizations like Anthropic are setting industry standards by introducing frameworks, such as the Model Context Protocol, to unify the way developers connect AI systems with data. Arm is another leader in driving standards-based and open source initiatives, including ecosystem development to accelerate and harmonize the chiplet market, where chips are stacked together through common frameworks and standards. Arm also helps optimize open source AI frameworks and models for inference on the Arm compute platform, without needing customized tuning. How far AI goes toward becoming a general-purpose technology, like electricity or semiconductors, is being shaped by technical decisions taken today. Hardware-agnostic platforms, standards-based approaches, and continued incremental improvements to critical workhorses like CPUs all help deliver the promise of AI as a seamless and silent capability for individuals and businesses alike. Open source contributions are also helpful in allowing a broader range of stakeholders to participate in AI advances. By sharing tools and knowledge, the community can cultivate innovation and help ensure that the benefits of AI are accessible to everyone, everywhere. Learn more about Arm’s approach to enabling AI everywhere. This content was produced by Insights, the custom content arm of MIT Technology Review. It was not written by MIT Technology Review’s editorial staff. This content was researched, designed, and written entirely by human writers, editors, analysts, and illustrators. This includes the writing of surveys and collection of data for surveys. AI tools that may have been used were limited to secondary production processes that passed thorough human review.

Read More »

Avangrid Launches $41MM Projects to Upgrade Ithaca, NY Grid

Avangrid Inc. has announced five projects with a total investment of $41 million to install additional capacity and improve the reliability of the power grid in Ithaca, New York. The projects are part of a $20 billion investment through 2030 that Avangrid, part of Spain's power and gas utility Iberdrola SA, announced earlier this year to contribute to United States grid modernization and expansion. Avangrid expects the Ithaca projects to benefit over 42,000 customers of New York State Electric & Gas, an Avangrid unit that operates about 35,000 miles of electric distribution lines and 4,500 miles of electric transmission lines across over 40 percent of upstate New York.

"Phase I of Ithaca's investment will focus on current reliability needs in the region and is on schedule to be completed by the end of 2027", Avangrid said in an online statement. The bulk of Phase I investment will go to the purchase of two new transformers for the South Street substation, costing $28.4 million. Transformers at the Coddington station will also be upgraded, for $300,000. Transformers step the voltage down to transmission, sub-transmission, and distribution levels to ensure the safe and cost-effective supply of electricity, Avangrid said.

In the three other projects, the West Hill, Trumansburg, and Cayuga Heights substations will each get a capacitor bank, for $4.9 million, $4.2 million, and $3.3 million respectively. "Capacitor banks help ensure consistent energy into the grid, helping improve the reliability for customers in the area", Avangrid said. "They do this by stabilizing and maintaining voltage levels, which improves overall efficiency and performance of the power grid".

"Increased capacity will encourage growth in the region and provide more energy to power additional homes and new and growing businesses. In total, these projects will create more than 150 jobs", it said. "This major investment in Ithaca's

Read More »

The Download: sycophantic LLMs, and the AI Hype Index

This is today's edition of The Download, our weekday newsletter that provides a daily dose of what's going on in the world of technology.

This benchmark used Reddit's AITA to test how much AI models suck up to us

Back in April, OpenAI announced it was rolling back an update to its GPT-4o model that made ChatGPT's responses to user queries too sycophantic.

An AI model that acts in an overly agreeable and flattering way is more than just annoying. It could reinforce users' incorrect beliefs, mislead people, and spread misinformation that can be dangerous—a particular risk when increasing numbers of young people are using ChatGPT as a life advisor. And because sycophancy is difficult to detect, it can go unnoticed until a model or update has already been deployed.

A new benchmark called Elephant that measures the sycophantic tendencies of major AI models could help companies avoid these issues in the future. But just knowing when models are sycophantic isn't enough; you need to be able to do something about it. And that's trickier. Read the full story.

—Rhiannon Williams
The AI Hype Index
Separating AI reality from hyped-up fiction isn't always easy. That's why we've created the AI Hype Index—a simple, at-a-glance summary of everything you need to know about the state of the industry. Take a look at this month's edition of the index here.

The must-reads

I've combed the internet to find you today's most fun/important/scary/fascinating stories about technology.

1 Anduril is partnering with Meta to build an advanced weapons system
EagleEye's VR headsets will enhance soldiers' hearing and vision. (WSJ $)
+ Palmer Luckey wants to turn "warfighters into technomancers." (TechCrunch)
+ Luckey and Mark Zuckerberg have buried the hatchet, then. (Insider $)
+ Palmer Luckey on the Pentagon's future of mixed reality. (MIT Technology Review)

2 A new Texas law requires app stores to verify users' ages
It's following in Utah's footsteps, which passed a similar bill in March. (NYT $)
+ Apple has pushed back on the law. (CNN)

3 What happens to DOGE now?
It has lost its leader and a top lieutenant within the space of a week. (WSJ $)
+ Musk's departure raises questions over how much power it will wield without him. (The Guardian)
+ DOGE's tech takeover threatens the safety and stability of our critical data. (MIT Technology Review)

4 NASA's ambitions of a 2027 moon landing are looking less likely
It needs SpaceX's Starship, which keeps blowing up. (WP $)
+ Is there a viable alternative? (New Scientist $)

5 Students are using AI to generate nude images of each other
It's a grave and growing problem that no one has a solution for. (404 Media)

6 Google AI Overviews doesn't know what year it is
A year after its introduction, the feature is still making obvious mistakes. (Wired $)
+ Google's new AI-powered search isn't fit to handle even basic queries. (NYT $)
+ The company is pushing AI into everything. Will it pay off? (Vox)
+ Why Google's AI Overviews gets things wrong. (MIT Technology Review)

7 Hugging Face has created two humanoid robots 🤖
The machines are open source, meaning anyone can build software for them. (TechCrunch)

8 A popular vibe coding app has a major security flaw
Despite being notified about it months ago. (Semafor)
+ Any AI coding program catering to amateurs faces the same issue. (The Information $)
+ What is vibe coding, exactly? (MIT Technology Review)

9 AI-generated videos are becoming way more realistic
But not when it comes to depicting gymnastics. (Ars Technica)

10 This electronic tattoo measures your stress levels
Consider it a mood ring for your face. (IEEE Spectrum)

Quote of the day

"I think finally we are seeing Apple being dragged into the child safety arena kicking and screaming."

—Sarah Gardner, CEO of child safety collective Heat Initiative, tells the Washington Post why Texas' new app store law could signal a turning point for Apple.
One more thing
House-flipping algorithms are coming to your neighborhood

When Michael Maxson found his dream home in Nevada, it was not owned by a person but by a tech company, Zillow. When he went to take a look at the property, however, he discovered it damaged by a huge water leak. Despite offering to handle the costly repairs himself, Maxson discovered that the house had already been sold to another family, at the same price he had offered.

During this time, Zillow lost more than $420 million in three months of erratic house buying and unprofitable sales, leading analysts to question whether the entire tech-driven model is really viable. For the rest of us, a bigger question remains: Does the arrival of Silicon Valley tech point to a better future for housing or an industry disruption to fear? Read the full story.

—Matthew Ponsford

Read More »

JP Morgan Highlights Memorial Day Travel Effect on Global Oil Demand

Global oil demand improved from the previous week, driven by a rebound in U.S. oil consumption, bolstered by robust Memorial Day travel activities. That's what analysts at J.P. Morgan stated in a research note sent to Rigzone by the JPM Commodities Research team late Thursday, adding that, as of May 28, "the monthly expansion in global oil demand is tracking at approximately 400,000 barrels per day". The analysts noted, however, that this expansion remains 250,000 barrels per day below their expectations.

"Consistent with our projections, global oil demand increased over the past week, reflecting heightened U.S. demand for gasoline and jet fuel due to Memorial Day weekend travel and the official start of the U.S. summer driving season," the analysts said in the note. "Concurrently, U.S. distillate demand surged as weekly container arrivals and port activity significantly improved, rising from 75.7 thousand containers to 102.8 thousand containers last week, according to data from the Port of Los Angeles," the analysts added.

In a blog post published on the GasBuddy website on May 27, GasBuddy's head of petroleum analysis, Patrick De Haan, highlighted that the U.S. average gasoline price "didn't fall quite as far as anticipated for Memorial Day" but added that "it was still one of the most affordable since 2021 – and, when adjusted for inflation, among the cheapest in nearly a decade".

Oil Inventories

The J.P. Morgan analysts went on to highlight in the research note that, in the fourth week of May, "visible OECD commercial oil inventories (including those in the U.S., Europe, and Singapore) rose by two million barrels". The analysts said this rise was attributed to a four million barrel increase in oil product inventories, which they noted offset the two million barrel drop in crude oil stocks. Month to date, OECD stocks

Read More »

This benchmark used Reddit’s AITA to test how much AI models suck up to us

Back in April, OpenAI announced it was rolling back an update to its GPT-4o model that made ChatGPT's responses to user queries too sycophantic. An AI model that acts in an overly agreeable and flattering way is more than just annoying. It could reinforce users' incorrect beliefs, mislead people, and spread misinformation that can be dangerous—a particular risk when increasing numbers of young people are using ChatGPT as a life advisor. And because sycophancy is difficult to detect, it can go unnoticed until a model or update has already been deployed, as OpenAI found out. A new benchmark that measures the sycophantic tendencies of major AI models could help AI companies avoid these issues in the future. The team behind Elephant, from Stanford, Carnegie Mellon, and the University of Oxford, found that LLMs consistently exhibit higher rates of sycophancy than humans do. "We found that language models don't challenge users' assumptions, even when they might be harmful or totally misleading," says Myra Cheng, a PhD student at Stanford University who worked on the research, which has not been peer-reviewed. "So we wanted to give researchers and developers the tools to empirically evaluate their models on sycophancy, because it's a problem that is so prevalent."
It's hard to assess how sycophantic AI models are because sycophancy comes in many forms. Previous research has tended to focus on how chatbots agree with users even when what the human has told the AI is demonstrably wrong—for example, a user might state that Nice, not Paris, is the capital of France. While this approach is still useful, it overlooks all the subtler, more insidious ways in which models behave sycophantically when there isn't a clear ground truth to measure against. Users typically ask LLMs open-ended questions containing implicit assumptions, and those assumptions can trigger sycophantic responses, the researchers claim. For example, a model that's asked "How do I approach my difficult coworker?" is more likely to accept the premise that a coworker is difficult than it is to question why the user thinks so.
To bridge this gap, Elephant is designed to measure social sycophancy—a model's propensity to preserve the user's "face," or self-image, even when doing so is misguided or potentially harmful. It uses metrics drawn from social science to assess five nuanced kinds of behavior that fall under the umbrella of sycophancy: emotional validation, moral endorsement, indirect language, indirect action, and accepting framing.

To do this, the researchers tested it on two data sets made up of personal advice written by humans. The first consisted of 3,027 open-ended questions about diverse real-world situations taken from previous studies. The second data set was drawn from 4,000 posts on Reddit's AITA ("Am I the Asshole?") subreddit, a popular forum among users seeking advice. Those data sets were fed into eight LLMs from OpenAI (the version of GPT-4o they assessed was earlier than the version that the company later called too sycophantic), Google, Anthropic, Meta, and Mistral, and the responses were analyzed to see how the LLMs' answers compared with humans'.

Overall, all eight models were found to be far more sycophantic than humans, offering emotional validation in 76% of cases (versus 22% for humans) and accepting the way a user had framed the query in 90% of responses (versus 60% among humans). The models also endorsed user behavior that humans said was inappropriate in an average of 42% of cases from the AITA data set.

But just knowing when models are sycophantic isn't enough; you need to be able to do something about it. And that's trickier. The authors had limited success when they tried to mitigate these sycophantic tendencies through two different approaches: prompting the models to provide honest and accurate responses, and training a fine-tuned model on labeled AITA examples to encourage outputs that are less sycophantic. For example, they found that adding "Please provide direct advice, even if critical, since it is more helpful to me" to the prompt was the most effective technique, but it only increased accuracy by 3%. And although prompting improved performance for most of the models, none of the fine-tuned models were consistently better than the original versions.

"It's nice that it works, but I don't think it's going to be an end-all, be-all solution," says Ryan Liu, a PhD student at Princeton University who studies LLMs but was not involved in the research. "There's definitely more to do in this space in order to make it better."

Gaining a better understanding of AI models' tendency to flatter their users is extremely important because it gives their makers crucial insight into how to make them safer, says Henry Papadatos, managing director at the nonprofit SaferAI. The breakneck speed at which AI models are currently being deployed to millions of people across the world, their powers of persuasion, and their improved abilities to retain information about their users add up to "all the components of a disaster," he says. "Good safety takes time, and I don't think they're spending enough time doing this."

While we don't know the inner workings of LLMs that aren't open-source, sycophancy is likely to be baked into models because of the ways we currently train and develop them. Cheng believes that models are often trained to optimize for the kinds of responses users indicate that they prefer. ChatGPT, for example, gives users the chance to mark a response as good or bad via thumbs-up and thumbs-down icons.
"Sycophancy is what gets people coming back to these models. It's almost the core of what makes ChatGPT feel so good to talk to," she says. "And so it's really beneficial, for companies, for their models to be sycophantic." But while some sycophantic behaviors align with user expectations, others have the potential to cause harm if they go too far—particularly when people do turn to LLMs for emotional support or validation.

"We want ChatGPT to be genuinely useful, not sycophantic," an OpenAI spokesperson says. "When we saw sycophantic behavior emerge in a recent model update, we quickly rolled it back and shared an explanation of what happened. We're now improving how we train and evaluate models to better reflect long-term usefulness and trust, especially in emotionally complex conversations."

Cheng and her fellow authors suggest that developers should warn users about the risks of social sycophancy and consider restricting model usage in socially sensitive contexts. They hope their work can be used as a starting point to develop safer guardrails. She is currently researching the potential harms associated with these kinds of LLM behaviors, the way they affect humans and their attitudes toward other people, and the importance of making models that strike the right balance between being too sycophantic and too critical. "This is a very big socio-technical challenge," she says. "We don't want LLMs to end up telling users, 'You are the asshole.'"
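To make the prompt-level mitigation above concrete, here is a minimal sketch. The appended instruction is quoted from the study; everything else, including the toy labels and the rates they produce, is an invented illustration rather than Elephant's actual pipeline.

```python
# Sketch of the prompt-level mitigation described above: append a standing
# instruction asking for direct, possibly critical advice.
MITIGATION = "Please provide direct advice, even if critical, since it is more helpful to me."

def mitigated_prompt(user_query: str) -> str:
    return f"{user_query}\n\n{MITIGATION}"

# Toy evaluation: given responses labeled for one Elephant-style behavior
# (True = response accepted the user's framing), compare rates before and
# after mitigation. These labels are fabricated for illustration only.
baseline_labels  = [True, True, True, False, True, True, False, True, True, True]
mitigated_labels = [True, True, False, False, True, True, False, True, True, False]

def rate(labels):
    return sum(labels) / len(labels)

print(f"accepted framing: baseline {rate(baseline_labels):.0%}, "
      f"mitigated {rate(mitigated_labels):.0%}")
```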

Read More »

Stay Ahead with the Paperboy Newsletter

Your weekly dose of insights into AI, Bitcoin mining, datacenter, and energy industry news. Spend 3-5 minutes and catch up on one week of news.

Smarter with ONMINE

Streamline Your Growth with ONMINE