Your Gateway to Power, Energy, Datacenters, Bitcoin and AI
Dive into the latest industry updates, our exclusive Paperboy Newsletter, and curated insights designed to keep you informed. Stay ahead with minimal time spent.
Discover What Matters Most to You

AI:

Bitcoin:

Datacenter:

Energy:
Featured Articles

Vim and GNU Emacs: Claude Code helpfully found zero-day exploits for both
“An attacker who can deliver a crafted file to a victim achieves arbitrary command execution with the privileges of the user running Vim,” Vim maintainers noted in their security advisory. “The attack requires only that the victim opens the file; no further interaction is needed.”
GNU Emacs ‘forever-day’
Surprised, Nguyen then jokingly suggested Claude Code find the same type of flaw in a second text editor, GNU Emacs. Claude Code obliged, finding a zero-day vulnerability, dating back to 2018, in the way the program interacts with the Git version control system that would make it possible to execute malicious code simply by opening a file.
“Opening a file in GNU Emacs can trigger arbitrary code execution through version control (git), most requiring zero user interaction beyond the file open itself. The most severe finding requires no file-local variables at all — simply opening any file inside a directory containing a crafted .git/ folder executes attacker-controlled commands,” he wrote.
One fixed, one not
When notified, Vim’s maintainers quickly fixed their issue, identified as CVE-2026-34714 with a CVSS score of 9.2, in version 9.2.0272.
Unfortunately, addressing the GNU Emacs vulnerability, which is currently without a CVE identifier, isn’t as straightforward. Its maintainers believe it to be a problem with Git, and declined to address the issue; in his post, Nguyen suggests manual mitigations. The vulnerable versions are 30.2 (stable release) and 31.0.50 (development).

Tokenomics: Why IT leaders need to pay attention to AI tokens
“Not only do you know you need the fast throughput, but you need the fast response time, because ultimately that end-to-end operation is going to define how long it takes you to get back your full answer,” he said.
How tokens drive enterprise decisions
Tokens are used as the currency for public AI products, such as text to image or image to video conversions. Within an enterprise, this situation is different.
Token consumption is commonly expressed in cost per million tokens. In enterprise settings, this often translates to two approaches: metered usage models, where departments or applications consume tokens against a defined budget; and enterprise or site licenses, where organizations negotiate volume-based pricing to manage costs at scale.
Some enterprises may allocate token budgets to departments, setting soft or hard limits to control usage. Others may rely on centralized licensing to simplify governance and cost management. Either way, tokenomics becomes a core part of financial planning for AI initiatives.
“Selling” tokens to your employees may seem strange, but Salvadore said employees will not be given a blank check to use enterprise AI applications or public AI services.
“[IT] has to think about how can we do this in a way that’s going to get our organization the capabilities that they need to really make a good use of authentic AI, while at the same time balancing the ability for individual users to be able to get what they need quickly enough, while also balancing cost,” he said.
Virtually all of the public API services providers offer tiered usage. For example, ChatGPT has four pricing plans, from free to Pro, which runs for $200 per month but offers considerably more services than the free version. Through tokenomics, enterprises can buy Pro-level services but limit their use or availability.
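As a rough illustration of the metered model described above, the sketch below tracks one department's token consumption against a budget with a soft and a hard limit. The class name, the limits, and the price per million tokens are all hypothetical; none of them come from the article or from any vendor's pricing.

```python
from dataclasses import dataclass


@dataclass
class TokenBudget:
    """Hypothetical per-department token budget (illustrative only)."""
    soft_limit: int            # tokens; crossing this yields a warning
    hard_limit: int            # tokens; requests that would exceed it are refused
    price_per_million: float   # assumed price in dollars per 1M tokens
    used: int = 0

    def record(self, tokens: int) -> str:
        """Record a request and return 'ok', 'warn', or 'blocked'."""
        if self.used + tokens > self.hard_limit:
            return "blocked"   # refused requests are not added to usage
        self.used += tokens
        return "warn" if self.used > self.soft_limit else "ok"

    def cost(self) -> float:
        """Spend so far, in dollars, at the assumed per-million price."""
        return self.used / 1_000_000 * self.price_per_million


# Example with made-up limits and a made-up price.
budget = TokenBudget(soft_limit=8_000_000, hard_limit=10_000_000,
                     price_per_million=2.50)
statuses = [budget.record(n) for n in (5_000_000, 4_000_000, 2_000_000)]
print(statuses, budget.cost())
```

Here the second request pushes usage past the soft limit and the third is refused outright, which is one plausible way to separate "soft" from "hard" enforcement; a real deployment might instead throttle or require approval.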

Schneider Electric Maps the AI Data Center’s Next Design Era
The coming shift to higher-voltage DC
That internal power challenge led Simonelli to one of the most consequential architectural topics in the interview: the likely transition toward higher-voltage DC distribution at very high rack densities.
He framed it pragmatically. At current density levels, the industry knows how to get power into racks at 200 or 300 kilowatts. But as densities rise toward 400 kilowatts and beyond, conventional AC approaches start to run into physical limits. Too much cable, too much copper, too much conversion equipment, and too much space consumed by power infrastructure rather than GPUs.
At that point, he said, higher-voltage DC becomes attractive not for philosophical reasons, but because it reduces current, shrinks conductor size, saves space, and leaves more room for revenue-generating compute.
“It is again a paradigm shift,” Simonelli said of DC power at these densities. “But it won’t be everywhere.”
That is probably right. The transition will not be universal, and the exact thresholds will evolve. But his underlying point is powerful. As rack densities climb, electrical architecture starts to matter not only for efficiency and reliability, but for physical space allocation inside the rack.
Put differently, power distribution becomes a compute-enablement issue. Distance between accelerators matters, too. The closer GPUs and TPUs can be kept together, the better they perform. If power infrastructure can be compacted, more of the rack can be devoted to dense compute, improving the economics and performance of the system.
That is a strong example of how AI is collapsing traditional boundaries between facility engineering and compute architecture. The two are no longer cleanly separable.
Gas now, renewables over time
On onsite power, Simonelli was refreshingly direct. If the goal is dispatchable onsite generation at the scale now being contemplated for AI facilities, he said, “there really isn’t an alternative
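The higher-voltage DC argument earlier in this piece rests on simple arithmetic: for a fixed load, current is I = P / V, and resistive loss in a given conductor scales with the square of the current. The sketch below works this through for a 400 kW rack. The two bus voltages (400 V and 800 V) are assumptions chosen for illustration, not figures from the interview, and the calculation treats the rack as a plain DC load, ignoring conversion losses and AC effects such as power factor.

```python
def dc_current_amps(power_watts: float, voltage_volts: float) -> float:
    """Current drawn by a DC load: I = P / V."""
    return power_watts / voltage_volts


def relative_conduction_loss(current_a: float, reference_a: float) -> float:
    """I^2 * R loss relative to a reference current, for the same conductor."""
    return (current_a / reference_a) ** 2


RACK_POWER_W = 400_000          # 400 kW, the density mentioned in the text
LOW_V, HIGH_V = 400.0, 800.0    # assumed bus voltages, for illustration only

low_i = dc_current_amps(RACK_POWER_W, LOW_V)
high_i = dc_current_amps(RACK_POWER_W, HIGH_V)
loss_ratio = relative_conduction_loss(high_i, low_i)
print(low_i, high_i, loss_ratio)
```

Under these assumptions, doubling the voltage halves the current (1,000 A to 500 A) and cuts resistive loss in the same conductor to a quarter, which is why smaller conductors become feasible at higher voltage.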

The Download: gig workers training humanoids, and better AI benchmarks
This is today’s edition of The Download, our weekday newsletter that provides a daily dose of what’s going on in the world of technology.
The gig workers who are training humanoid robots at home
When Zeus, a medical student in Nigeria, returns to his apartment from a long day at the hospital, he straps his iPhone to his forehead and records himself doing chores. Zeus is a data recorder for Micro1, which sells the data he collects to robotics firms. As these companies race to build humanoids, videos from workers like Zeus have become the hottest new way to train them. Micro1 has hired thousands of them in more than 50 countries, including India, Nigeria, and Argentina. The jobs pay well locally, but raise thorny questions around privacy and informed consent. The work can be challenging—and weird. Read the full story.
—Michelle Kim
Our readers recently voted humanoid robots the “11th breakthrough” to add to our 2026 list of 10 Breakthrough Technologies. Check out what else officially made the cut.
AI benchmarks are broken. Here’s what we need instead.
For decades, AI has been evaluated based on whether it can outperform humans on isolated problems. But it’s seldom used this way in the real world. While AI is assessed in a vacuum, it operates in messy, complex, multi-person environments over time. This misalignment leads us to misunderstand its capabilities, risks, and impacts. We need new benchmarks that assess AI’s performance over longer horizons within human teams, workflows, and organizations. Here’s a proposal for one such approach: Human–AI, Context-Specific Evaluation.
—Angela Aristidou, professor at University College London and faculty fellow at the Stanford Digital Economy Lab and the Stanford Human-Centered AI Institute.
MIT Technology Review Narrated: can quantum computers now solve health care problems? We’ll soon find out.
In a laboratory on the outskirts of Oxford, a quantum computer built from atoms and light awaits its moment. The device is small but powerful—and also very valuable. Infleqtion, the company that owns it, is hoping its abilities will win $5 million at a competition. The prize will go to the quantum computer that can solve real health care problems that “classical” computers cannot. But there can be only one big winner—if there is a winner at all.
—Michael Brooks
This is our latest story to be turned into an MIT Technology Review Narrated podcast, which we’re publishing each week on Spotify and Apple Podcasts. Just navigate to MIT Technology Review Narrated on either platform, and follow us to get all our new content as it’s released.
The must-reads
I’ve combed the internet to find you today’s most fun/important/scary/fascinating stories about technology.
1 OpenAI just closed the biggest funding round in Silicon Valley history
It raised $122 billion ahead of its blockbuster IPO, which is expected later this year. (WSJ $)
+ It’s also prepping a push to “rethink the social contract.” (Vanity Fair $)
+ Campaigners are urging people to quit ChatGPT. (MIT Technology Review)
2 Iran has threatened to attack 18 US tech companies
It’s eyeing their operations in the Middle East. (Politico)
+ Targets include Nvidia, Apple, Microsoft, and Google. (Engadget)
+ Iran struck AWS data centers earlier this month. (Reuters $)
3 Artemis II is about to fly humans to the Moon. Here’s the science they’ll do
Their experiments will set the stage for future explorers. (Nature)
+ You can watch the launch attempt today. (Engadget)
4 Putin is trying to take full control of Russia’s internet
New outages and blockages are cutting the country off from the world. (NYT $)
+ Can we repair the internet? (MIT Technology Review)
5 A robotaxi outage in China left passengers stranded on highways
Baidu vehicles froze on the streets of Wuhan. (Bloomberg $)
+ Police are blaming a “system failure.” (Reuters $)
6 US government requests for social media user data are soaring
They’ve skyrocketed by 770% in the past decade. (Bloomberg $)
+ Is the Pentagon allowed to surveil Americans with AI? (MIT Technology Review)
7 Tesla has admitted that humans sometimes drive its robotaxis
Remote drivers occasionally control them completely. (Wired $)
8 A satellite-smashing chain reaction could spiral out of control
This data visualization captures the dangers of space collisions. (Guardian)
+ Here’s all the stuff we’ve put into space. (MIT Technology Review)
9 Meta’s smartglasses can turn you into a creep
According to one journalist who wore them for a month. (Guardian)
10 A Claude Code leak has exposed plans for a virtual pet
We could be getting a Tamagotchi for the GenAI era. (The Verge)
Quote of the day
“From now on, for every assassination, an American company will be destroyed.”
—Iran’s Islamic Revolutionary Guard Corps (IRGC) threatens US tech firms in an affiliated Telegram, per CNBC.
One More Thing
How one mine could unlock billions in EV subsidies
In a pine farm north of the tiny town of Tamarack, Minnesota, Talon Metals has uncovered one of America’s densest nickel deposits. Now it wants to begin mining the ore.
Products made from the nickel could net more than $26 billion in subsidies through the Inflation Reduction Act (IRA), which is starting to transform the US economy. To understand how, we tallied up the potential tax credits available. Read the full story to find out what we discovered.
—James Temple
We can still have nice things
A place for comfort, fun and distraction to brighten up your day. (Got any ideas? Drop me a line.)
+ A selfless group of gluttons tried to taste-test every potato chip in the world.
+ Get romantic inspiration from these penguins’ engagement pebbles.
+ Good news: global terrorism has hit a 15-year low.
+ Enjoy endless new views through these windows around the world.

The gig workers who are training humanoid robots at home
When Zeus, a medical student living in a hilltop city in central Nigeria, returns to his studio apartment from a long day at the hospital, he turns on his ring light, straps his iPhone to his forehead, and starts recording himself. He raises his hands in front of him like a sleepwalker and puts a sheet on his bed. He moves slowly and carefully to make sure his hands stay within the camera frame.
Zeus is a data recorder for Micro1, a US company based in Palo Alto, California, that collects real-world data to sell to robotics companies. As companies like Tesla, Figure AI, and Agility Robotics race to build humanoids—robots designed to resemble and move like humans in factories and homes—videos recorded by gig workers like Zeus are becoming the hottest new way to train them.
Micro1 has hired thousands of contract workers in more than 50 countries, including India, Nigeria, and Argentina, where swathes of tech-savvy young people are looking for jobs. They’re mounting iPhones on their heads and recording themselves folding laundry, washing dishes, and cooking. The job pays well by local standards and is boosting local economies, but it raises thorny questions around privacy and informed consent. And the work can be challenging at times—and weird.
Zeus found the job in November, when people started talking about it everywhere on LinkedIn and YouTube. “This would be a real nice opportunity to set a mark and give data that will be used to train robots in the future,” he thought.
Zeus is paid $15 an hour, which is good income in Nigeria’s strained economy with high unemployment rates. But as a bright-eyed student dreaming of becoming a doctor, he finds ironing his clothes for hours every day boring. “I really [do] not like it so much,” he says. “I’m the kind of person that requires … a technical job that requires me to think.”
Zeus, and all the workers interviewed by MIT Technology Review, asked to be referred to only by pseudonyms because they were not authorized to talk about their work.
Humanoid robots are notoriously hard to build because manipulating physical objects is a difficult skill to master. But the rise of large language models underlying chatbots like ChatGPT has inspired a paradigm shift in robotics. Just as large language models learned to generate words by being trained on vast troves of text scraped from the internet, many researchers believe that humanoid robots can learn to interact with the world by being trained on massive amounts of movement data.
Editor’s note: In a recent poll, MIT Technology Review readers selected humanoid robots as the 11th breakthrough for our 2026 list of 10 Breakthrough Technologies.
Robotics requires far more complex data about the physical world, though, and that is much harder to find. Virtual simulations can train robots to perform acrobatics, but not how to grasp and move objects, because simulations struggle to model physics with perfect accuracy. For robots to work in factories and serve as housekeepers, real-world data, however time-consuming and expensive to collect, may be what we need.
Investors are pouring money feverishly into solving this challenge, spending over $6 billion on humanoid robots in 2025. And at-home data recording is becoming a booming gig economy around the world. Data companies like Scale AI and Encord are recruiting their own armies of data recorders, while DoorDash pays delivery drivers to film themselves doing chores. And in China, workers in dozens of state-owned robot training centers wear virtual-reality headsets and exoskeletons to teach humanoid robots how to open a microwave and wipe down the table.
“There is a lot of demand, and it’s increasing really fast,” says Ali Ansari, CEO of Micro1. He estimates that robotics companies are now spending more than $100 million each year to buy real-world data from his company and others like it.
A day in the life
Workers at Micro1 are vetted by an AI agent named Zara that conducts interviews and reviews samples of chore videos. Every week, they submit videos of themselves doing chores around their homes, following a list of instructions about things like keeping their hands visible and moving at natural speed. The videos are reviewed by both AI and a human and are either accepted or rejected. They’re then annotated by AI and a team of hundreds of humans who label the actions in the footage.
Because this approach to training robots is in its infancy, it’s not clear yet what makes good training data. Still, “you need to give lots and lots of variations for the robot to generalize well for basic navigation and manipulation of the world,” says Ansari.
But many workers say that creating a variety of “chore content” in their tiny homes is a challenge. Zeus, a scrappy student living in a humble studio, struggles to record anything beyond ironing his clothes every day. Arjun, a tutor in Delhi, India, takes an hour to make a 15-minute video because he spends so much time brainstorming new chores. “How much content [can be made] in the home? How much content?” he says.
There’s also the sticky question of privacy. Micro1 asks workers not to show their faces to the camera or reveal personal information such as names, phone numbers, and birth dates. Then it uses AI and human reviewers to remove anything that slips through. But even without faces, the videos capture an intimate slice of workers’ lives: the interiors of their homes, their possessions, their routines. And understanding what kind of personal information they might be recording while they’re busy doing chores on camera can be tricky. Reviews of such footage might not filter out sensitive information beyond the most obvious identifiers.
For workers with families, keeping private life off camera is a constant negotiation. Arjun, a father of two daughters, has to wrangle his chaotic two-year-old out of frame. “Sometimes it’s very difficult to work because my daughter is small,” he says. Sasha, a banker turned data recorder in Nigeria, tiptoes around when she hangs her laundry outside in a shared residential compound so she won’t record her neighbors, who watch her in bewilderment.
While the workers interviewed by MIT Technology Review understand that their data is being used to train robots, none of them know how exactly their data will be used, stored, and shared with third parties, including the robotics companies that Micro1 is selling the data to.
For confidentiality reasons, says Ansari, Micro1 doesn’t name its clients or disclose to workers the specific nature of the projects they are contributing to. “It is important that if workers are engaging in this, that they are informed by the companies themselves of the intention … where this kind of technology might go and how that might affect them longer term,” says Yasmine Kotturi, a professor of human-centered computing at the University of Maryland.
Occasionally, some workers say, they’ve seen other workers asking on the company Slack channel if the company could delete their data. Micro1 declined to comment on whether such data is deleted. “People are opting into doing this,” says Ansari. “They could stop the work at any time.”
Hungry for data
With thousands of workers doing their chores differently in different homes, some roboticists wonder if the data collected from them is reliable enough to train robots safely. “How we conduct our lives in our homes is not always right from a safety point of view,” says Aaron Prather, a roboticist at ASTM International. “If those folks are teaching those bad habits that could lead to an incident, then that’s not good data.” And the sheer volume of data being collected makes reviewing it for quality control challenging. But Ansari says the company rejects videos showing unsafe ways of performing a task, while clumsy movements can be useful to teach robots what not to do.
Then there’s the question of how much of this data we need. Micro1 says it has tens of thousands of hours of footage, while Scale AI announced it had gathered more than 100,000 hours.
“It’s going to take a long time to get there,” says Ken Goldberg, a roboticist at the University of California, Berkeley. Large language models were trained on text and images that would take a human 100,000 years to read, and humanoid robots may need even more data, because controlling robotic joints is even more complicated than generating text. “It’s going to take longer than people think,” he says.
When Dattu, an engineering student living in a bustling tech hub in India, comes home after a full day of classes at his university, he skips dinner and dashes to his tiny balcony, cramped with potted plants and dumbbells. He straps his iPhone to his forehead and records himself folding the same set of clothes over and over again.
His family stares at him quizzically. “It’s like some space technology for them,” he says. When he tells his friends about his job, “they just get astounded by the idea that they can get paid by recording chores.”
Juggling his university studies with data recording, as well as other data annotation gigs, takes a toll on him.
Still, “it feels like you’re doing something different than the whole world,” he says.

Microsoft facing CMA probe of its business software portfolio
Smith added that Microsoft recognizes that the CMA “will continue to review and assess additional issues relating to our products and services, including in the business software market. We are committed to working quickly and constructively to address these issues, including by providing all the information the CMA needs to move forward with its reviews.”
A welcome move
Matthew Sinclair, senior director and head of the London office of the Computer & Communications Industry Association (CCIA), a group which represents a cross section of communications and technology firms, described the move by the CMA as “welcome news.”
It will, he said, “avoid overly broad and prescriptive interventions that would have impeded investment and innovation in UK cloud services. The regulator can focus its efforts on action to address specific issues, particularly restrictive software licensing terms for legacy software, which are costing UK users a fortune.”
A resilience and digital sovereignty issue
In response to both CMA decisions, Forrester senior analyst Dario Maisto said, “in times of increasing geopolitical volatility, organizations and authorities are reassessing risks coming from dependencies on foreign providers, to improve their digital sovereignty posture.”
He pointed out, “if we consider that Microsoft and AWS own some 70% of the European and UK public cloud market, we can easily understand how emerging sovereignty concerns add to existing concentration risk in a mix that urges action now more than ever.”
According to Maisto, Microsoft’s case is under even more regulatory scrutiny because European and UK organizations have a strong dependency on its productivity suite, regardless of the infrastructure layer.

Vim and GNU Emacs: Claude Code helpfully found zero-day exploits for both
“An attacker who can deliver a crafted file to a victim achieves arbitrary command execution with the privileges of the user running Vim,” Vim maintainers noted in their security advisory. “The attack requires only that the victim opens the file; no further interaction is needed.” GNU Emacs ‘forever-day’ Surprised, Nguyen then jokingly suggested Claude Code find the same type of flaw in a second text editor, GNU Emacs. Claude Code obliged, finding a zero-day vulnerability, dating back to 2018, in the way the program interacts with the Git version control system that would make it possible to execute malicious code simply by opening a file. “Opening a file in GNU Emacs can trigger arbitrary code execution through version control (git), most requiring zero user interaction beyond the file open itself. The most severe finding requires no file-local variables at all — simply opening any file inside a directory containing a crafted .git/ folder executes attacker-controlled commands,” he wrote. One fixed, one not When notified, Vim’s maintainers quickly fixed their issue, identified as CVE-2026-34714 with a CVSS score of 9.2, in version 9.2.0272. Unfortunately, addressing the GNU Emacs vulnerability, which is currently without a CVE identifier, isn’t as straightforward. Its maintainers believe it to be a problem with Git, and declined to address the issue; in his post, Nguyen suggests manual mitigations. The vulnerable versions are 30.2 (stable release) and 31.0.50 (development).

Tokenomics: Why IT leaders need to pay attention to AI tokens
“Not only do you know you need the fast throughput, but you need the fast response time, because ultimately that end-to-end operation is going to define how long it takes you to get back your full answer,” he said. How tokens drive enterprise decisions Tokens are used as the currency for public AI products, such as text to image or image to video conversions. Within an enterprise, this situation is different. Token consumption is commonly expressed in cost per million tokens. In enterprise settings, this often translates to two approaches: metered usage models, where departments or applications consume tokens against a defined budget; and enterprise or site licenses, where organizations negotiate volume-based pricing to manage costs at scale. Some enterprises may allocate token budgets to departments, setting soft or hard limits to control usage. Others may rely on centralized licensing to simplify governance and cost management. Either way, tokenomics becomes a core part of financial planning for AI initiatives. “Selling” tokens to your employees may seem strange, but Salvadore said employees will not be given a blank check to use enterprise AI applications or public AI services. “[IT] has to think about how can we do this in a way that’s going to get our organization the capabilities that they need to really make a good use of authentic AI, while at the same time balancing the ability for individual users to be able to get what they need quickly enough, while also balancing cost,” he said. Virtually all of the public API services providers offer tiered usage. For example, ChatGPT has four pricing plans, from free to Pro, which runs for $200 per month but offers considerably more services than the free version. Through tokenomics, enterprises can buy Pro-level services but limit their use or availability.

Schneider Electric Maps the AI Data Center’s Next Design Era
The coming shift to higher-voltage DC That internal power challenge led Simonelli to one of the most consequential architectural topics in the interview: the likely transition toward higher-voltage DC distribution at very high rack densities. He framed it pragmatically. At current density levels, the industry knows how to get power into racks at 200 or 300 kilowatts. But as densities rise toward 400 kilowatts and beyond, conventional AC approaches start to run into physical limits. Too much cable, too much copper, too much conversion equipment, and too much space consumed by power infrastructure rather than GPUs. At that point, he said, higher-voltage DC becomes attractive not for philosophical reasons, but because it reduces current, shrinks conductor size, saves space, and leaves more room for revenue-generating compute. “It is again a paradigm shift,” Simonelli said of DC power at these densities. “But it won’t be everywhere.” That is probably right. The transition will not be universal, and the exact thresholds will evolve. But his underlying point is powerful. As rack densities climb, electrical architecture starts to matter not only for efficiency and reliability, but for physical space allocation inside the rack. Put differently, power distribution becomes a compute-enablement issue. Distance between accelerators matters, too. The closer GPUs and TPUs can be kept together, the better they perform. If power infrastructure can be compacted, more of the rack can be devoted to dense compute, improving the economics and performance of the system. That is a strong example of how AI is collapsing traditional boundaries between facility engineering and compute architecture. The two are no longer cleanly separable. Gas now, renewables over time On onsite power, Simonelli was refreshingly direct. If the goal is dispatchable onsite generation at the scale now being contemplated for AI facilities, he said, “there really isn’t an alternative

The Download: gig workers training humanoids, and better AI benchmarks
This is today’s edition of The Download, our weekday newsletter that provides a daily dose of what’s going on in the world of technology. The gig workers who are training humanoid robots at home When Zeus, a medical student in Nigeria, returns to his apartment from a long day at the hospital, he straps his iPhone to his forehead and records himself doing chores. Zeus is a data recorder for Micro1, which sells the data he collects to robotics firms. As these companies race to build humanoids, videos from workers like Zeus have become the hottest new way to train them. Micro1 has hired thousands of them in more than 50 countries, including India, Nigeria, and Argentina. The jobs pay well locally, but raise thorny questions around privacy and informed consent. The work can be challenging—and weird. Read the full story.
—Michelle Kim Our readers recently voted humanoid robots the “11th breakthrough” to add to our 2026 list of 10 Breakthrough Technologies. Check out what else officially made the cut.
AI benchmarks are broken. Here’s what we need instead. For decades, AI has been evaluated based on whether it can outperform humans on isolated problems. But it’s seldom used this way in the real world. While AI is assessed in a vacuum, it operates in messy, complex, multi-person environments over time. This misalignment leads us to misunderstand its capabilities, risks, and impacts. We need new benchmarks that assess AI’s performance over longer horizons within human teams, workflows, and organizations. Here’s a proposal for one such approach: Human–AI, Context-Specific Evaluation. —Angela Aristidou, professor at University College London and faculty fellow at the Stanford Digital Economy Lab and the Stanford Human-Centered AI Institute. MIT Technology Review Narrated: can quantum computers now solve health care problems? We’ll soon find out. In a laboratory on the outskirts of Oxford, a quantum computer built from atoms and light awaits its moment. The device is small but powerful—and also very valuable. Infleqtion, the company that owns it, is hoping its abilities will win $5 million at a competition. The prize will go to the quantum computer that can solve real health care problems that “classical” computers cannot. But there can be only one big winner—if there is a winner at all. —Michael Brooks This is our latest story to be turned into an MIT Technology Review Narrated podcast, which we’re publishing each week on Spotify and Apple Podcasts. Just navigate to MIT Technology Review Narrated on either platform, and follow us to get all our new content as it’s released.
The must-reads I’ve combed the internet to find you today’s most fun/important/scary/fascinating stories about technology. 1 OpenAI just closed the biggest funding round in Silicon Valley history It raised $122 billion ahead of its blockbuster IPO, which is expected later this year. (WSJ $) + It’s also prepping a push to “rethink the social contract.” (Vanity Fair $) + Campaigners are urging people to quit ChatGPT. (MIT Technology Review) 2 Iran has threatened to attack 18 US tech companies It’s eyeing their operations in the Middle East. (Politico) + Targets include Nvidia, Apple, Microsoft, and Google. (Engadget) + Iran struck AWS data centers earlier this month. (Reuters $) 3 Artemis II is about to fly humans to the Moon. Here’s the science they’ll do Their experiments will set the stage for future explorers. (Nature) + You can watch the launch attempt today. (Engadget) 4 Putin is trying to take full control of Russia’s internet New outages and blockages are cutting the country off from the world. (NYT $) + Can we repair the internet? (MIT Technology Review) 5 A robotaxi outage in China left passengers stranded on highways Baidu vehicles froze on the streets of Wuhan. (Bloomberg $) + Police are blaming a “system failure.” (Reuters $) 6 US government requests for social media user data are soaring They’ve skyrocketed by 770% in the past decade. (Bloomberg $) + Is the Pentagon allowed to surveil Americans with AI? (MIT Technology Review)
7 Tesla has admitted that humans sometimes drive its robotaxis Remote drivers occasionally control them completely. (Wired $) 8 A satellite-smashing chain reaction could spiral out of control This data visualization captures the dangers of space collisions. (Guardian) + Here’s all the stuff we’ve put into space. (MIT Technology Review)
9 Meta’s smartglasses can turn you into a creep According to one journalist who wore them for a month. (Guardian) 10 A Claude Code leak has exposed plans for a virtual pet We could be getting a Tamagotchi for the GenAI era. (The Verge) Quote of the day “From now on, for every assassination, an American company will be destroyed.” —Iran’s Islamic Revolutionary Guard Corps (IRGC) threatens US tech firms in an affiliated Telegram, per CNBC. One More Thing How one mine could unlock billions in EV subsidies In a pine farm north of the tiny town of Tamarack, Minnesota, Talon Metals has uncovered one of America’s densest nickel deposits. Now it wants to begin mining the ore.
Products made from the nickel could net more than $26 billion in subsidies through the Inflation Reduction Act (IRA), which is starting to transform the US economy. To understand how, we tallied up the potential tax credits available. Read the full story to find out what we discovered. —James Temple We can still have nice things A place for comfort, fun and distraction to brighten up your day. (Got any ideas? Drop me a line.) + A selfless group of gluttons tried to taste-test every potato chip in the world. + Get romantic inspiration from these penguins’ engagement pebbles. + Good news: global terrorism has hit a 15-year low. + Enjoy endless new views through these windows around the world.

The gig workers who are training humanoid robots at home
When Zeus, a medical student living in a hilltop city in central Nigeria, returns to his studio apartment from a long day at the hospital, he turns on his ring light, straps his iPhone to his forehead, and starts recording himself. He raises his hands in front of him like a sleepwalker and puts a sheet on his bed. He moves slowly and carefully to make sure his hands stay within the camera frame. Zeus is a data recorder for Micro1, a US company based in Palo Alto, California that collects real-world data to sell to robotics companies. As companies like Tesla, Figure AI, and Agility Robotics race to build humanoids—robots designed to resemble and move like humans in factories and homes—videos recorded by gig workers like Zeus are becoming the hottest new way to train them. Micro1 has hired thousands of contract workers in more than 50 countries, including India, Nigeria, and Argentina, where swathes of tech-savvy young people are looking for jobs. They’re mounting iPhones on their heads and recording themselves folding laundry, washing dishes, and cooking. The job pays well by local standards and is boosting local economies, but it raises thorny questions around privacy and informed consent. And the work can be challenging at times—and weird. Zeus found the job in November, when people started talking about it everywhere on LinkedIn and YouTube. “This would be a real nice opportunity to set a mark and give data that will be used to train robots in the future,” he thought.
Zeus is paid $15 an hour, which is good income in Nigeria’s strained economy with high unemployment rates. But as a bright-eyed student dreaming of becoming a doctor, he finds ironing his clothes for hours every day boring. “I really [do] not like it so much,” he says. “I’m the kind of person that requires … a technical job that requires me to think.”
Zeus, and all the workers interviewed by MIT Technology Review, asked to be referred to only by pseudonyms because they were not authorized to talk about their work. Humanoid robots are notoriously hard to build because manipulating physical objects is a difficult skill to master. But the rise of large language models underlying chatbots like ChatGPT has inspired a paradigm shift in robotics. Just as large language models learned to generate words by being trained on vast troves of text scraped from the internet, many researchers believe that humanoid robots can learn to interact with the world by being trained on massive amounts of movement data. Editor’s note: In a recent poll, MIT Technology Review readers selected humanoid robots as the 11th breakthrough for our 2026 list of 10 Breakthrough Technologies. Robotics requires far more complex data about the physical world, though, and that is much harder to find. Virtual simulations can train robots to perform acrobatics, but not how to grasp and move objects, because simulations struggle to model physics with perfect accuracy. For robots to work in factories and serve as housekeepers, real-world data, however time-consuming and expensive to collect, may be what we need. Investors are pouring money feverishly into solving this challenge, spending over $6 billion on humanoid robots in 2025. And at-home data recording is becoming a booming gig economy around the world. Data companies like Scale AI and Encord are recruiting their own armies of data recorders, while DoorDash pays delivery drivers to film themselves doing chores. And in China, workers in dozens of state-owned robot training centers wear virtual-reality headsets and exoskeletons to teach humanoid robots how to open a microwave and wipe down the table. “There is a lot of demand, and it’s increasing really fast,” says Ali Ansari, CEO of Micro1. 
He estimates that robotics companies are now spending more than $100 million each year to buy real-world data from his company and others like it. A day in the life Workers at Micro1 are vetted by an AI agent named Zara that conducts interviews and reviews samples of chore videos. Every week, they submit videos of themselves doing chores around their homes, following a list of instructions about things like keeping their hands visible and moving at natural speed. The videos are reviewed by both AI and a human and are either accepted or rejected. They’re then annotated by AI and a team of hundreds of humans who label the actions in the footage. Because this approach to training robots is in its infancy, it’s not clear yet what makes good training data. Still, “you need to give lots and lots of variations for the robot to generalize well for basic navigation and manipulation of the world,” says Ansari.
But many workers say that creating a variety of “chore content” in their tiny homes is a challenge. Zeus, a scrappy student living in a humble studio, struggles to record anything beyond ironing his clothes every day. Arjun, a tutor in Delhi, India, takes an hour to make a 15-minute video because he spends so much time brainstorming new chores. “How much content [can be made] in the home? How much content?” he says. There’s also the sticky question of privacy. Micro1 asks workers not to show their faces to the camera or reveal personal information such as names, phone numbers, and birth dates. Then it uses AI and human reviewers to remove anything that slips through. But even without faces, the videos capture an intimate slice of workers’ lives: the interiors of their homes, their possessions, their routines. And understanding what kind of personal information they might be recording while they’re busy doing chores on camera can be tricky. Reviews of such footage might not filter out sensitive information beyond the most obvious identifiers. For workers with families, keeping private life off camera is a constant negotiation. Arjun, a father of two daughters, has to wrangle his chaotic two-year-old out of frame. “Sometimes it’s very difficult to work because my daughter is small,” he says. Sasha, a banker turned data recorder in Nigeria, tiptoes around when she hangs her laundry outside in a shared residential compound so she won’t record her neighbors, who watch her in bewilderment. While the workers interviewed by MIT Technology Review understand that their data is being used to train robots, none of them know how exactly their data will be used, stored, and shared with third parties, including the robotics companies that Micro1 is selling the data to.
For confidentiality reasons, says Ansari, Micro1 doesn’t name its clients or disclose to workers the specific nature of the projects they are contributing to. “It is important that if workers are engaging in this, that they are informed by the companies themselves of the intention … where this kind of technology might go and how that might affect them longer term,” says Yasmine Kotturi, a professor of human-centered computing at the University of Maryland.
Occasionally, some workers say, they’ve seen other workers asking on the company Slack channel if the company could delete their data. Micro1 declined to comment on whether such data is deleted. “People are opting into doing this,” says Ansari. “They could stop the work at any time.”
Hungry for data With thousands of workers doing their chores differently in different homes, some roboticists wonder if the data collected from them is reliable enough to train robots safely. “How we conduct our lives in our homes is not always right from a safety point of view,” says Aaron Prather, a roboticist at ASTM International. “If those folks are teaching those bad habits that could lead to an incident, then that’s not good data.” And the sheer volume of data being collected makes reviewing it for quality control challenging. But Ansari says the company rejects videos showing unsafe ways of performing a task, while clumsy movements can be useful to teach robots what not to do. Then there’s the question of how much of this data we need. Micro1 says it has tens of thousands of hours of footage, while Scale AI announced it had gathered more than 100,000 hours. “It’s going to take a long time to get there,” says Ken Goldberg, a roboticist at the University of California, Berkeley. Large language models were trained on text and images that would take a human 100,000 years to read, and humanoid robots may need even more data, because controlling robotic joints is even more complicated than generating text. “It’s going to take longer than people think,” he says. When Dattu, an engineering student living in a bustling tech hub in India, comes home after a full day of classes at his university, he skips dinner and dashes to his tiny balcony, cramped with potted plants and dumbbells. He straps his iPhone to his forehead and records himself folding the same set of clothes over and over again. His family stares at him quizzically. “It’s like some space technology for them,” he says. When he tells his friends about his job, “they just get astounded by the idea that they can get paid by recording chores.” Juggling his university studies with data recording, as well as other data annotation gigs, takes a toll on him. 
Still, “it feels like you’re doing something different than the whole world,” he says.

Microsoft facing CMA probe of its business software portfolio
Smith added that Microsoft recognizes that the CMA “will continue to review and assess additional issues relating to our products and services, including in the business software market. We are committed to working quickly and constructively to address these issues, including by providing all the information the CMA needs to move forward with its reviews.” A welcome move Matthew Sinclair, senior director and head of the London office of the Computer & Communications Industry Association (CCIA), a group which represents a cross section of communications and technology firms, described the move by the CMA as “welcome news.” It will, he said, “avoid overly broad and prescriptive interventions that would have impeded investment and innovation in UK cloud services. The regulator can focus its efforts on action to address specific issues, particularly restrictive software licensing terms for legacy software, which are costing UK users a fortune.” A resilience and digital sovereignty issue In response to both CMA decisions, Forrester senior analyst Dario Maisto said, “in times of increasing geopolitical volatility, organizations and authorities are reassessing risks coming from dependencies on foreign providers, to improve their digital sovereignty posture.” He pointed out, “if we consider that Microsoft and AWS own some 70% of the European and UK public cloud market, we can easily understand how emerging sovereignty concerns add to existing concentration risk in a mix that urges action now more than ever.” According to Maisto, Microsoft’s case is under even more regulatory scrutiny because European and UK organizations have a strong dependency on its productivity suite, regardless of the infrastructure layer.

Trump Administration Keeps Colorado Coal Plant Open to Ensure Affordable, Reliable and Secure Power in Colorado
WASHINGTON—U.S. Secretary of Energy Chris Wright today issued an emergency order to keep a Colorado coal plant operational to ensure Americans maintain access to affordable, reliable and secure electricity. The order directs Tri-State Generation and Transmission Association (Tri-State), Platte River Power Authority, Salt River Project, PacifiCorp, and Public Service Company of Colorado (Xcel Energy), in coordination with the Western Area Power Administration (WAPA) Rocky Mountain Region and Southwest Power Pool (SPP), to take all measures necessary to ensure that Unit 1 at the Craig Station in Craig, Colorado is available to operate. Unit One of the coal plant was scheduled to shut down at the end of 2025 but on December 30, 2025, Secretary Wright issued an emergency order directing Tri-State and the co-owners to ensure that Unit 1 at the Craig Station remains available to operate. “The last administration’s energy subtraction policies threatened America’s energy security and positioned our nation to likely experience significantly more blackouts in the coming years—thankfully, President Trump won’t let that happen,” said Energy Secretary Wright. “The Trump Administration will continue taking action to ensure we don’t lose critical generation sources. Americans deserve access to affordable, reliable, and secure energy to power their homes all the time, regardless of whether the wind is blowing or the sun is shining.” Thanks to President Trump’s leadership, coal plants across the country are reversing plans to shut down. In 2025, more than 17 gigawatts (GW) of coal-power electricity generation were saved. On April 1, once Tri-State and the WAPA Rocky Mountain Region join the SPP RTO West expansion, SPP is directed to take every step to employ economic dispatch to minimize costs to ratepayers. According to DOE’s Resource Adequacy Report, blackouts were on track to potentially increase 100 times by 2030 if the U.S. continued to take reliable

NextDecade contractor Bechtel awards ABB more Rio Grande LNG automation work
NextDecade Corp. contractor Bechtel Corp. has awarded ABB Ltd. additional integrated automation and electrical solution orders, extending its scope to Trains 4 and 5 of NextDecade’s 30-million tonne/year (tpy) Rio Grande LNG (RGLNG) plant in Brownsville, Tex. The orders were booked in third- and fourth-quarters 2025 and build on ABB’s Phase 1 work with Trains 1-3, totaling 17 million tpy. The scope for RGLNG Trains 4 and 5 includes deployment of an integrated control and safety system consisting of a distributed control system, emergency shutdown, and fire and gas systems. An electrical controls and monitoring system will provide unified visibility of the plant’s electrical infrastructure. These two overarching solutions will provide a common automation platform. ABB will also supply medium-voltage drives, synchronous motors, transformers, motor controllers and switchgear. The orders also include local equipment buildings—two for Train 4 and one for Train 5— housing critical control and electrical systems in prefabricated modules to streamline installation and commissioning on site. The solutions being delivered to Bechtel use ABB adaptive execution, a methodology for capital projects designed to optimize engineering work and reduce delivery timelines. Phase 1 of RGLNG is under construction and expected to begin operations in 2027. Operations at Train 4 are expected in 2030 and Train 5 in 2031. ABB’s senior vice-president for the Americas, Scott McCay, confirmed to Oil & Gas Journal at CERAWeek by S&P Global in Houston that the company is doing similar work through Tecnimont for Argent LNG’s planned 25-million tpy plant in Port Fourchon, La.; 10-million tpy Phase 1 and 15-million tpy Phase 2. Argent is targeting 2030 completion for its plant.

Persistent oil flow imbalances drive Enverus to increase crude price forecast
Citing impacts from the Iran war, near-zero flows through the Strait of Hormuz, accelerating global stock draws, and expectations for a muted US production response despite higher prices, Enverus Intelligence Research (EIR) raised its Brent crude oil price forecast. EIR now expects Brent to average $95/bbl for the remainder of 2026 and $100/bbl in 2027, reflecting what it described as a persistent global oil flow imbalance that continues to draw down inventories. “The world has an oil flow problem that is draining stocks,” said Al Salazar, director of research at EIR. “Whenever that oil flow problem is resolved, the world is left with low stocks. That’s what drives our oil price outlook higher for longer.” The outlook assumes the Strait of Hormuz remains largely closed for 3 months. EIR estimates that each month of constrained flows shifts the price outlook by about $10–15/bbl, underscoring the scale of the disruption and uncertainty around its duration. Despite West Texas Intermediate (WTI) prices of $90–100/bbl, EIR does not expect US producers to materially increase output. The firm forecasts US liquids production growth of 370,000 b/d by end-2026 and 580,000 b/d by end-2027, citing drilling-to-production lags, industry consolidation, and continued capital discipline. Global oil demand growth for 2026 has been reduced to about 500,000 b/d from 1.0 million b/d as higher energy prices and anticipated supply disruptions weigh on economic activity. Cumulative global oil stock draws are estimated at roughly 1 billion bbl through 2027, with non-OECD inventories—particularly in Asia—absorbing nearly half of the impact. A 60-day Jones Act waiver may provide limited short-term US shipping flexibility, but EIR said the measure is unlikely to materially affect global oil prices given broader market forces.

Equinor begins drilling $9-billion natural gas development project offshore Brazil
Equinor has started drilling the Raia natural gas project in the Campos basin presalt offshore Brazil. The $9-billion project is Equinor’s largest international investment, its largest project under execution, and marks the deepest water depth operation in its portfolio. The drilling campaign, which began Mar. 24 with the Valaris DS‑17 drillship, includes six wells in the Raia area 200 km offshore in water depths of around 2,900 m. The area is expected to hold recoverable natural gas and condensate reserves of over 1 billion boe. Raia’s development concept is based on production through wells connected to a 126,000-b/d floating production, storage and offloading unit (FPSO), which will treat produced oil/condensate and gas. Natural gas will be transported through a 200‑km pipeline from the FPSO to Cabiúnas, in the city of Macaé, Rio de Janeiro state. Once in operation, expected in 2028, the project will have the capacity to export up to 16 million cu m/day of natural gas, which could represent 15% of Brazil’s natural gas demand, the company said in a release Mar. 24. “While drilling takes place, integration and commissioning activities on the FPSO are progressing well putting us on track towards a safe start of operations in 2028,” said Geir Tungesvik, executive vice-president, projects, drilling and procurement, Equinor. The Raia project is operated by Equinor (35%), in partnership with Repsol Sinopec Brasil (35%) and Petrobras (30%).

Woodfibre LNG receives additional modules as construction advances
Woodfibre LNG LP has received two major modules within a week for its under‑construction, 2.1‑million tonne/year (tpy) LNG export plant near Squamish, British Columbia, advancing construction to about 65% complete. The deliveries include the liquefaction module—the project’s heaviest and most critical process unit—and the powerhouse module, which will serve as the plant’s central power and control hub. The liquefaction module, delivered aboard the heavy cargo vessel Red Zed 1, is the 15th of 19 modules scheduled for installation at the site, the company said in a Mar. 24 release. Weighing about 10,847 metric tonnes and occupying a footprint roughly equivalent to a football field, it is among the largest modules fabricated for the project. Once installed and commissioned, the liquefaction module will cool natural gas to about –162°C, converting it into LNG for export. Shortly after the liquefaction module’s arrival, Woodfibre LNG received the powerhouse module, the 16th module delivered to site. Weighing more than 4,200 metric tonnes, the powerhouse module will function as a power and control system, receiving electricity from BC Hydro and managing and distributing power to the plant’s electric‑drive compressors. The Woodfibre LNG project is designed as the first LNG export plant to use electric‑drive motors for liquefaction, replacing conventional gas‑turbine‑driven compressors. The Siemens electric‑drive system will be powered by renewable hydroelectricity from BC Hydro, eliminating the largest operational source of greenhouse gas emissions typically associated with liquefaction, the company said. The project is being built near the community of Squamish on the traditional territory of the Sḵwx̱wú7mesh Úxwumixw (Squamish Nation) and is regulated in part by the Indigenous government. All 19 modules are expected to arrive on site by spring 2026. Construction is scheduled for completion in 2027. Woodfibre LNG is owned by Woodfibre LNG Ltd. Partnership, which is 70% owned by Pacific Energy Corp.

ExxonMobil begins Turrum Phase 3 drilling off Australia’s east coast
Esso Australia Pty Ltd., a subsidiary of ExxonMobil Corp. and current operator of the Gippsland basin oil and gas fields in Bass Strait offshore eastern Victoria, has started drilling the Turrum Phase 3 project in Australia. This $350-million investment will see the VALARIS 107 jack-up rig drill five new wells into Turrum and North Turrum gas fields within Production License VIC/L03 to support Australia’s east coast domestic gas market. The new wells will be drilled from Marlin B platform, about 42 km off the Gippsland coastline, southeast of Lakes Entrance in water depths of about 60 m, according to a 2025 information bulletin.
Turrum Phase 3, which builds on nearly $1 billion in recent investment across the Gippsland basin, is expected to be online before winter 2027, the company said in a post to its LinkedIn account Mar. 24. In 2025, Esso made a final investment decision to develop the Turrum Phase 3 project targeting underdeveloped gas resources. The Gippsland Basin joint venture is a 50-50 partnership between Esso Australia Resources and Woodside Energy (Bass Strait) and operated by Esso Australia.

Microsoft will invest $80B in AI data centers in fiscal 2025
And Microsoft isn’t the only one that is ramping up its investments in AI-enabled data centers. Rival cloud service providers are all investing in either upgrading or opening new data centers to capture a larger chunk of business from developers and users of large language models (LLMs). In a report published in October 2024, Bloomberg Intelligence estimated that demand for generative AI would push Microsoft, AWS, Google, Oracle, Meta, and Apple to devote $200 billion between them to capex in 2025, up from $110 billion in 2023. Microsoft is one of the biggest spenders, followed closely by Google and AWS, Bloomberg Intelligence said. Its estimate of Microsoft’s capital spending on AI, at $62.4 billion for calendar 2025, is lower than Smith’s claim that the company will invest $80 billion in the fiscal year to June 30, 2025. Both figures, though, are way higher than Microsoft’s 2020 capital expenditure of “just” $17.6 billion. The majority of the increased spending is tied to cloud services and the expansion of AI infrastructure needed to provide compute capacity for OpenAI workloads. Separately, last October Amazon CEO Andy Jassy said his company planned total capex spend of $75 billion in 2024 and even more in 2025, with much of it going to AWS, its cloud computing division.

John Deere unveils more autonomous farm machines to address skilled labor shortage
Self-driving tractors might be the path to self-driving cars. John Deere has revealed a new line of autonomous machines and tech across agriculture, construction and commercial landscaping. The Moline, Illinois-based John Deere has been in business for 187 years, yet as a non-tech company it has become a regular at the big tech trade show in Las Vegas, and it is back at CES 2025 with more autonomous tractors and other vehicles. This is not something we usually cover, but John Deere has a lot of data that is interesting in the big picture of tech. The message from the company is that there aren’t enough skilled farm laborers to do the work that its customers need. It’s been a challenge for most of the last two decades, said Jahmy Hindman, CTO at John Deere, in a briefing. Much of the tech will come this fall and after that. He noted that the average farmer in the U.S. is over 58 and works 12 to 18 hours a day to grow food for us. And he said the American Farm Bureau Federation estimates there are roughly 2.4 million farm jobs that need to be filled annually, and the agricultural workforce continues to shrink. (This is my hint to the anti-immigration crowd). John Deere’s autonomous 9RX Tractor. Farmers can oversee it using an app. While each of these industries experiences its own set of challenges, a commonality across all is skilled labor availability. In construction, about 80% of contractors struggle to find skilled labor. And in commercial landscaping, 86% of landscaping business owners can’t find labor to fill open positions, he said. “They have to figure out how to do

2025 playbook for enterprise AI success, from agents to evals
2025 is poised to be a pivotal year for enterprise AI. The past year has seen rapid innovation, and this year will see the same. This has made it more critical than ever to revisit your AI strategy to stay competitive and create value for your customers. From scaling AI agents to optimizing costs, here are the five critical areas enterprises should prioritize for their AI strategy this year. 1. Agents: the next generation of automation AI agents are no longer theoretical. In 2025, they’re indispensable tools for enterprises looking to streamline operations and enhance customer interactions. Unlike traditional software, agents powered by large language models (LLMs) can make nuanced decisions, navigate complex multi-step tasks, and integrate seamlessly with tools and APIs. At the start of 2024, agents were not ready for prime time, making frustrating mistakes like hallucinating URLs. They started getting better as frontier large language models themselves improved. “Let me put it this way,” said Sam Witteveen, cofounder of Red Dragon, a company that develops agents for companies, and that recently reviewed the 48 agents it built last year. “Interestingly, the ones that we built at the start of the year, a lot of those worked way better at the end of the year just because the models got better.” Witteveen shared this in the video podcast we filmed to discuss these five big trends in detail. Models are getting better and hallucinating less, and they’re also being trained to do agentic tasks. Another feature that the model providers are researching is a way to use the LLM as a judge, and as models get cheaper (something we’ll cover below), companies can use three or more models to

OpenAI’s red teaming innovations define new essentials for security leaders in the AI era
OpenAI has taken a more aggressive approach to red teaming than its AI competitors, demonstrating its security teams’ advanced capabilities in two areas: multi-step reinforcement and external red teaming. OpenAI recently released two papers that set a new competitive standard for improving the quality, reliability and safety of AI models in these two techniques and more. The first paper, “OpenAI’s Approach to External Red Teaming for AI Models and Systems,” reports that specialized teams outside the company have proven effective in uncovering vulnerabilities that might otherwise have made it into a released model because in-house testing techniques may have missed them. In the second paper, “Diverse and Effective Red Teaming with Auto-Generated Rewards and Multi-Step Reinforcement Learning,” OpenAI introduces an automated framework that relies on iterative reinforcement learning to generate a broad spectrum of novel, wide-ranging attacks. Going all-in on red teaming pays practical, competitive dividends It’s encouraging to see competitive intensity in red teaming growing among AI companies. When Anthropic released its AI red team guidelines in June of last year, it joined AI providers including Google, Microsoft, Nvidia, OpenAI, and even the U.S.’s National Institute of Standards and Technology (NIST), which all had released red teaming frameworks. Investing heavily in red teaming yields tangible benefits for security leaders in any organization. OpenAI’s paper on external red teaming provides a detailed analysis of how the company strives to create specialized external teams that include cybersecurity and subject matter experts. The goal is to see if knowledgeable external teams can defeat models’ security perimeters and find gaps in their security, biases and controls that prompt-based testing couldn’t find. 
What makes OpenAI’s recent papers noteworthy is how well they define using human-in-the-middle

Three Aberdeen oil company headquarters sell for £45m
Three Aberdeen oil company headquarters have been sold in a deal worth £45 million. The CNOOC, Apache and Taqa buildings at the Prime Four business park in Kingswells have been acquired by EEH Ventures. The trio of buildings, totalling 275,000 sq ft, were previously owned by Canadian firm BMO. The financial services powerhouse first bought the buildings in 2014 but took the decision to sell them as part of a “long-standing strategy to reduce their office exposure across the UK”. The deal was the largest to take place throughout Scotland during the last quarter of 2024.
Trio of buildings snapped up
London-headquartered EEH Ventures was founded in 2013 and owns a number of residential properties, offices, shopping centres and hotels throughout the UK. All three Kingswells-based buildings were pre-let, designed and constructed by Aberdeen property developer Drum in 2012 on a 15-year lease. The North Sea headquarters of Middle-East oil firm Taqa has previously been described as “an amazing success story in the Granite City”. Taqa announced in 2023 that it intends to cease production from all of its UK North Sea platforms by the end of 2027. Meanwhile, Apache revealed at the end of last year it is planning to exit the North Sea by the end of 2029, blaming the windfall tax. The US firm first entered the North Sea in 2003 but will wrap up all of its UK operations by 2030.
Aberdeen big deals
The Prime Four acquisition wasn’t the biggest Granite City commercial property sale of 2024. American private equity firm Lone Star bought Union Square shopping centre from Hammerson for £111m. Hammerson, who also built the property, had originally been seeking £150m. BP’s North Sea headquarters in Stoneywood, Aberdeen, was also sold. Manchester-based

2025 ransomware predictions, trends, and how to prepare
The Zscaler ThreatLabz research team has revealed critical insights and predictions on ransomware trends for 2025. The latest Ransomware Report uncovered a surge in sophisticated tactics and extortion attacks. As ransomware remains a key concern for CISOs and CIOs, the report sheds light on actionable strategies to mitigate risks.
Top Ransomware Predictions for 2025:
● AI-Powered Social Engineering: In 2025, GenAI will fuel voice phishing (vishing) attacks. With the proliferation of GenAI-based tooling, initial access broker groups will increasingly leverage AI-generated voices, which sound more and more realistic and adopt local accents and dialects, to enhance credibility and success rates.
● The Trifecta of Social Engineering Attacks: Vishing, Ransomware and Data Exfiltration. Additionally, sophisticated ransomware groups, like the Dark Angels, will continue the trend of low-volume, high-impact attacks, preferring to focus on an individual company, stealing vast amounts of data without encrypting files, and evading media and law enforcement scrutiny.
● Targeted Industries Under Siege: Manufacturing, healthcare, education and energy will remain primary targets, with no slowdown in attacks expected.
● New SEC Regulations Drive Increased Transparency: 2025 will see an uptick in reported ransomware attacks and payouts due to new, tighter SEC requirements mandating that public companies report material incidents within four business days.
● Ransomware Payouts Are on the Rise: In 2025, ransom demands will most likely increase due to an evolving ecosystem of cybercrime groups that specialize in designated attack tactics and collaborate under a sophisticated profit-sharing model using Ransomware-as-a-Service.
To combat damaging ransomware attacks, Zscaler ThreatLabz recommends the following strategies.
● Fighting AI with AI: As threat actors use AI to identify vulnerabilities, organizations must counter with AI-powered zero trust security systems that detect and mitigate new threats.
● Advantages of adopting a Zero Trust architecture: A Zero Trust cloud security platform stops

The Download: gig workers training humanoids, and better AI benchmarks
This is today’s edition of The Download, our weekday newsletter that provides a daily dose of what’s going on in the world of technology.
The gig workers who are training humanoid robots at home
When Zeus, a medical student in Nigeria, returns to his apartment from a long day at the hospital, he straps his iPhone to his forehead and records himself doing chores. Zeus is a data recorder for Micro1, which sells the data he collects to robotics firms. As these companies race to build humanoids, videos from workers like Zeus have become the hottest new way to train them. Micro1 has hired thousands of them in more than 50 countries, including India, Nigeria, and Argentina. The jobs pay well locally, but raise thorny questions around privacy and informed consent. The work can be challenging—and weird. Read the full story.
—Michelle Kim
Our readers recently voted humanoid robots the “11th breakthrough” to add to our 2026 list of 10 Breakthrough Technologies. Check out what else officially made the cut.
AI benchmarks are broken. Here’s what we need instead.
For decades, AI has been evaluated based on whether it can outperform humans on isolated problems. But it’s seldom used this way in the real world. While AI is assessed in a vacuum, it operates in messy, complex, multi-person environments over time. This misalignment leads us to misunderstand its capabilities, risks, and impacts. We need new benchmarks that assess AI’s performance over longer horizons within human teams, workflows, and organizations. Here’s a proposal for one such approach: Human–AI, Context-Specific Evaluation.
—Angela Aristidou, professor at University College London and faculty fellow at the Stanford Digital Economy Lab and the Stanford Human-Centered AI Institute.
MIT Technology Review Narrated: can quantum computers now solve health care problems? We’ll soon find out.
In a laboratory on the outskirts of Oxford, a quantum computer built from atoms and light awaits its moment. The device is small but powerful—and also very valuable. Infleqtion, the company that owns it, is hoping its abilities will win $5 million at a competition. The prize will go to the quantum computer that can solve real health care problems that “classical” computers cannot. But there can be only one big winner—if there is a winner at all.
—Michael Brooks
This is our latest story to be turned into an MIT Technology Review Narrated podcast, which we’re publishing each week on Spotify and Apple Podcasts. Just navigate to MIT Technology Review Narrated on either platform, and follow us to get all our new content as it’s released.
The must-reads
I’ve combed the internet to find you today’s most fun/important/scary/fascinating stories about technology.
1 OpenAI just closed the biggest funding round in Silicon Valley history
It raised $122 billion ahead of its blockbuster IPO, which is expected later this year. (WSJ $)
+ It’s also prepping a push to “rethink the social contract.” (Vanity Fair $)
+ Campaigners are urging people to quit ChatGPT. (MIT Technology Review)
2 Iran has threatened to attack 18 US tech companies
It’s eyeing their operations in the Middle East. (Politico)
+ Targets include Nvidia, Apple, Microsoft, and Google. (Engadget)
+ Iran struck AWS data centers earlier this month. (Reuters $)
3 Artemis II is about to fly humans to the Moon. Here’s the science they’ll do
Their experiments will set the stage for future explorers. (Nature)
+ You can watch the launch attempt today. (Engadget)
4 Putin is trying to take full control of Russia’s internet
New outages and blockages are cutting the country off from the world. (NYT $)
+ Can we repair the internet? (MIT Technology Review)
5 A robotaxi outage in China left passengers stranded on highways
Baidu vehicles froze on the streets of Wuhan. (Bloomberg $)
+ Police are blaming a “system failure.” (Reuters $)
6 US government requests for social media user data are soaring
They’ve skyrocketed by 770% in the past decade. (Bloomberg $)
+ Is the Pentagon allowed to surveil Americans with AI? (MIT Technology Review)
7 Tesla has admitted that humans sometimes drive its robotaxis
Remote drivers occasionally control them completely. (Wired $)
8 A satellite-smashing chain reaction could spiral out of control
This data visualization captures the dangers of space collisions. (Guardian)
+ Here’s all the stuff we’ve put into space. (MIT Technology Review)
9 Meta’s smartglasses can turn you into a creep
According to one journalist who wore them for a month. (Guardian)
10 A Claude Code leak has exposed plans for a virtual pet
We could be getting a Tamagotchi for the GenAI era. (The Verge)
Quote of the day
“From now on, for every assassination, an American company will be destroyed.”
—Iran’s Islamic Revolutionary Guard Corps (IRGC) threatens US tech firms in an affiliated Telegram, per CNBC.
One More Thing
How one mine could unlock billions in EV subsidies
In a pine farm north of the tiny town of Tamarack, Minnesota, Talon Metals has uncovered one of America’s densest nickel deposits. Now it wants to begin mining the ore. Products made from the nickel could net more than $26 billion in subsidies through the Inflation Reduction Act (IRA), which is starting to transform the US economy. To understand how, we tallied up the potential tax credits available. Read the full story to find out what we discovered.
—James Temple
We can still have nice things
A place for comfort, fun and distraction to brighten up your day. (Got any ideas? Drop me a line.)
+ A selfless group of gluttons tried to taste-test every potato chip in the world.
+ Get romantic inspiration from these penguins’ engagement pebbles.
+ Good news: global terrorism has hit a 15-year low.
+ Enjoy endless new views through these windows around the world.

The gig workers who are training humanoid robots at home
When Zeus, a medical student living in a hilltop city in central Nigeria, returns to his studio apartment from a long day at the hospital, he turns on his ring light, straps his iPhone to his forehead, and starts recording himself. He raises his hands in front of him like a sleepwalker and puts a sheet on his bed. He moves slowly and carefully to make sure his hands stay within the camera frame. Zeus is a data recorder for Micro1, a US company based in Palo Alto, California that collects real-world data to sell to robotics companies. As companies like Tesla, Figure AI, and Agility Robotics race to build humanoids—robots designed to resemble and move like humans in factories and homes—videos recorded by gig workers like Zeus are becoming the hottest new way to train them. Micro1 has hired thousands of contract workers in more than 50 countries, including India, Nigeria, and Argentina, where swathes of tech-savvy young people are looking for jobs. They’re mounting iPhones on their heads and recording themselves folding laundry, washing dishes, and cooking. The job pays well by local standards and is boosting local economies, but it raises thorny questions around privacy and informed consent. And the work can be challenging at times—and weird. Zeus found the job in November, when people started talking about it everywhere on LinkedIn and YouTube. “This would be a real nice opportunity to set a mark and give data that will be used to train robots in the future,” he thought.
Zeus is paid $15 an hour, which is good income in Nigeria’s strained economy with high unemployment rates. But as a bright-eyed student dreaming of becoming a doctor, he finds ironing his clothes for hours every day boring. “I really [do] not like it so much,” he says. “I’m the kind of person that requires … a technical job that requires me to think.”
Zeus, and all the workers interviewed by MIT Technology Review, asked to be referred to only by pseudonyms because they were not authorized to talk about their work. Humanoid robots are notoriously hard to build because manipulating physical objects is a difficult skill to master. But the rise of large language models underlying chatbots like ChatGPT has inspired a paradigm shift in robotics. Just as large language models learned to generate words by being trained on vast troves of text scraped from the internet, many researchers believe that humanoid robots can learn to interact with the world by being trained on massive amounts of movement data. Editor’s note: In a recent poll, MIT Technology Review readers selected humanoid robots as the 11th breakthrough for our 2026 list of 10 Breakthrough Technologies. Robotics requires far more complex data about the physical world, though, and that is much harder to find. Virtual simulations can train robots to perform acrobatics, but not how to grasp and move objects, because simulations struggle to model physics with perfect accuracy. For robots to work in factories and serve as housekeepers, real-world data, however time-consuming and expensive to collect, may be what we need. Investors are pouring money feverishly into solving this challenge, spending over $6 billion on humanoid robots in 2025. And at-home data recording is becoming a booming gig economy around the world. Data companies like Scale AI and Encord are recruiting their own armies of data recorders, while DoorDash pays delivery drivers to film themselves doing chores. And in China, workers in dozens of state-owned robot training centers wear virtual-reality headsets and exoskeletons to teach humanoid robots how to open a microwave and wipe down the table. “There is a lot of demand, and it’s increasing really fast,” says Ali Ansari, CEO of Micro1. 
He estimates that robotics companies are now spending more than $100 million each year to buy real-world data from his company and others like it.
A day in the life
Workers at Micro1 are vetted by an AI agent named Zara that conducts interviews and reviews samples of chore videos. Every week, they submit videos of themselves doing chores around their homes, following a list of instructions about things like keeping their hands visible and moving at natural speed. The videos are reviewed by both AI and a human and are either accepted or rejected. They’re then annotated by AI and a team of hundreds of humans who label the actions in the footage. Because this approach to training robots is in its infancy, it’s not clear yet what makes good training data. Still, “you need to give lots and lots of variations for the robot to generalize well for basic navigation and manipulation of the world,” says Ansari.
But many workers say that creating a variety of “chore content” in their tiny homes is a challenge. Zeus, a scrappy student living in a humble studio, struggles to record anything beyond ironing his clothes every day. Arjun, a tutor in Delhi, India, takes an hour to make a 15-minute video because he spends so much time brainstorming new chores. “How much content [can be made] in the home? How much content?” he says. There’s also the sticky question of privacy. Micro1 asks workers not to show their faces to the camera or reveal personal information such as names, phone numbers, and birth dates. Then it uses AI and human reviewers to remove anything that slips through. But even without faces, the videos capture an intimate slice of workers’ lives: the interiors of their homes, their possessions, their routines. And understanding what kind of personal information they might be recording while they’re busy doing chores on camera can be tricky. Reviews of such footage might not filter out sensitive information beyond the most obvious identifiers. For workers with families, keeping private life off camera is a constant negotiation. Arjun, a father of two daughters, has to wrangle his chaotic two-year-old out of frame. “Sometimes it’s very difficult to work because my daughter is small,” he says. Sasha, a banker turned data recorder in Nigeria, tiptoes around when she hangs her laundry outside in a shared residential compound so she won’t record her neighbors, who watch her in bewilderment. While the workers interviewed by MIT Technology Review understand that their data is being used to train robots, none of them know how exactly their data will be used, stored, and shared with third parties, including the robotics companies that Micro1 is selling the data to.
For confidentiality reasons, says Ansari, Micro1 doesn’t name its clients or disclose to workers the specific nature of the projects they are contributing to. “It is important that if workers are engaging in this, that they are informed by the companies themselves of the intention … where this kind of technology might go and how that might affect them longer term,” says Yasmine Kotturi, a professor of human-centered computing at the University of Maryland.
Occasionally, some workers say, they’ve seen other workers asking on the company Slack channel if the company could delete their data. Micro1 declined to comment on whether such data is deleted. “People are opting into doing this,” says Ansari. “They could stop the work at any time.”
Hungry for data
With thousands of workers doing their chores differently in different homes, some roboticists wonder if the data collected from them is reliable enough to train robots safely. “How we conduct our lives in our homes is not always right from a safety point of view,” says Aaron Prather, a roboticist at ASTM International. “If those folks are teaching those bad habits that could lead to an incident, then that’s not good data.” And the sheer volume of data being collected makes reviewing it for quality control challenging. But Ansari says the company rejects videos showing unsafe ways of performing a task, while clumsy movements can be useful to teach robots what not to do. Then there’s the question of how much of this data we need. Micro1 says it has tens of thousands of hours of footage, while Scale AI announced it had gathered more than 100,000 hours. “It’s going to take a long time to get there,” says Ken Goldberg, a roboticist at the University of California, Berkeley. Large language models were trained on text and images that would take a human 100,000 years to read, and humanoid robots may need even more data, because controlling robotic joints is even more complicated than generating text. “It’s going to take longer than people think,” he says.
When Dattu, an engineering student living in a bustling tech hub in India, comes home after a full day of classes at his university, he skips dinner and dashes to his tiny balcony, cramped with potted plants and dumbbells. He straps his iPhone to his forehead and records himself folding the same set of clothes over and over again. His family stares at him quizzically. “It’s like some space technology for them,” he says. When he tells his friends about his job, “they just get astounded by the idea that they can get paid by recording chores.” Juggling his university studies with data recording, as well as other data annotation gigs, takes a toll on him.
Still, “it feels like you’re doing something different than the whole world,” he says.

Shifting to AI model customization is an architectural imperative
In partnership with Mistral AI
In the early days of large language models (LLMs), we grew accustomed to massive 10x jumps in reasoning and coding capability with every new model iteration. Today, those jumps have flattened into incremental gains. The exception is domain-specialized intelligence, where true step-function improvements are still the norm. When a model is fused with an organization’s proprietary data and internal logic, it encodes the company’s history into its future workflows. This alignment creates a compounding advantage: a competitive moat built on a model that understands the business intimately. This is more than fine-tuning; it is the institutionalization of expertise into an AI system. This is the power of customization.
Intelligence tuned to context
Every sector operates within its own specific lexicon. In automotive engineering, the “language” of the firm revolves around tolerance stacks, validation cycles, and revision control. In capital markets, reasoning is dictated by risk-weighted assets and liquidity buffers. In security operations, patterns are extracted from the noise of telemetry signals and identity anomalies. Custom-adapted models internalize the nuances of the field. They recognize which variables dictate a “go/no-go” decision, and they think in the language of the industry.
Domain expertise in action
The transition from general-purpose to tailored AI centers on one goal: encoding an organization’s unique logic directly into a model’s weights. Mistral AI partners with organizations to incorporate domain expertise into their training ecosystems. A few use cases illustrate customized implementations in practice:
Software engineering and assisting at scale: A network hardware company with proprietary languages and specialized codebases found that out-of-the-box models could not grasp their internal stack. By training a custom model on their own development patterns, they achieved a step function in fluency. Integrated into Mistral’s software development scaffolding, this customized model now supports the entire lifecycle—from maintaining legacy systems to autonomous code modernization via reinforcement learning. This turns once-opaque, niche code into a space where AI reliably assists at scale.
Automotive and the engineering copilot: A leading automotive company uses customization to revolutionize crash test simulations. Previously, specialists spent entire days manually comparing digital simulations with physical results to find divergences. By training a model on proprietary simulation data and internal analyses, they automated this visual inspection, flagging deformations in real time. Moving beyond detection, the model now acts as a copilot, proposing design adjustments to bring simulations closer to real-world behavior and radically accelerating the R&D loop.
Public sector and sovereign AI: In Southeast Asia, a government agency is building a sovereign AI layer to move beyond Western-centric models. By commissioning a foundation model tailored to regional languages, local idioms, and cultural contexts, they created a strategic infrastructure asset. This ensures sensitive data remains under local governance while powering inclusive citizen services and regulatory assistants. Here, customization is the key to deploying AI that is both technically effective and genuinely sovereign.
The blueprint for strategic customization
Moving from a general-purpose AI strategy to a domain-specific advantage requires a structural rethinking of the model’s role within the enterprise. Success is defined by three shifts in organizational logic.
1. Treat AI as infrastructure, not an experiment. Historically, enterprises have treated model customization as an ad hoc experiment—a single fine-tuning run for a niche use case or a localized pilot. While these bespoke silos often yield promising results, they are rarely built to scale. They produce brittle pipelines, improvised governance, and limited portability. When the underlying base models evolve, the adaptation work must often be discarded and rebuilt from scratch.
In contrast, a durable strategy treats customization as foundational infrastructure. In this model, adaptation workflows are reproducible, version-controlled, and engineered for production. Success is measured against deterministic business outcomes. By decoupling the customization logic from the underlying model, firms ensure that their “digital nervous system” remains resilient, even as the frontier of base models shifts.
2. Retain control of your own data and models. As AI migrates from the periphery to core operations, the question of control becomes existential. Reliance on a single cloud provider or vendor for model alignment creates a dangerous asymmetry of power regarding data residency, pricing, and architectural updates. Enterprises that retain control of their training pipelines and deployment environments preserve their strategic agency. By adapting models within controlled environments, organizations can enforce their own data residency requirements and dictate their own update cycles. This approach transforms AI from a service consumed into an asset governed, reducing structural dependency and allowing for cost and energy optimizations aligned with internal priorities rather than vendor roadmaps.
3. Design for continuous adaptation. The enterprise environment is never static: regulations shift, taxonomies evolve, and market conditions fluctuate. A common failure is treating a customized model as a finished artifact.
In reality, a domain-aligned model is a living asset subject to model decay if left unmanaged.
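To make "model decay" concrete, here is a minimal sketch of one widely used way to notice it: comparing the distribution of a numeric input feature at adaptation time with its distribution in recent traffic, using the Population Stability Index (PSI). This is an illustration only, not a description of any vendor's tooling. The function names, the bin count, and the 0.2 alert threshold (a common rule of thumb) are all assumptions.

```python
import math
from typing import Sequence


def psi(reference: Sequence[float], current: Sequence[float], bins: int = 10) -> float:
    """Population Stability Index between two samples of one numeric feature."""
    lo, hi = min(reference), max(reference)
    # Fall back to a width of 1.0 if every reference value is identical.
    width = (hi - lo) / bins or 1.0

    def shares(sample: Sequence[float]) -> list:
        counts = [0] * bins
        for x in sample:
            idx = int((x - lo) / width)
            # Values outside the reference range are clamped into the edge bins.
            idx = min(max(idx, 0), bins - 1)
            counts[idx] += 1
        # A small floor keeps empty bins from producing log(0).
        return [max(c / len(sample), 1e-6) for c in counts]

    ref, cur = shares(reference), shares(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))


def needs_retraining(reference: Sequence[float], current: Sequence[float],
                     threshold: float = 0.2) -> bool:
    """Hypothetical trigger: flag the model when drift exceeds the threshold."""
    return psi(reference, current) > threshold


# Usage: feature values seen at adaptation time vs. values seen this week.
baseline = [float(x) for x in range(100)]
this_week = [x + 50 for x in baseline]
if needs_retraining(baseline, this_week):
    print("input drift detected: schedule a retraining run")
```

In practice a check like this would run per feature on a schedule, and a positive result would feed whatever retraining or review process the organization already has.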
Designing for continuous adaptation requires a disciplined approach to ModelOps. This includes automated drift detection, event-driven retraining, and incremental updates. By building the capacity for constant recalibration, the organization ensures that its AI does not just reflect its history but evolves in lockstep with its future. This is the stage where the competitive moat begins to compound: the model’s utility grows as it internalizes the organization’s ongoing response to change.
Control is the new leverage
We have entered an era where generic intelligence is a commodity, but contextual intelligence is a scarcity. While raw model power is now a baseline requirement, the true differentiator is alignment—AI calibrated to an organization’s unique data, mandates, and decision logic. In the next decade, the most valuable AI won’t be the one that knows everything about the world; it will be the one that knows everything about you. The firms that own the model weights of that intelligence will own the market.
This content was produced by Mistral AI. It was not written by MIT Technology Review’s editorial staff.

AI benchmarks are broken. Here’s what we need instead.
For decades, artificial intelligence has been evaluated through the question of whether machines outperform humans. From chess to advanced math, from coding to essay writing, the performance of AI models and applications is tested against that of individual humans completing tasks. This framing is seductive: An AI vs. human comparison on isolated problems with clear right or wrong answers is easy to standardize, compare, and optimize. It generates rankings and headlines. But there’s a problem: AI is almost never used in the way it is benchmarked. Although researchers and industry have started to improve benchmarking by moving beyond static tests to more dynamic evaluation methods, these innovations resolve only part of the issue. That’s because they still evaluate AI’s performance outside the human teams and organizational workflows where its real-world performance ultimately unfolds. While AI is evaluated at the task level in a vacuum, it is used in messy, complex environments where it usually interacts with more than one person. Its performance (or lack thereof) emerges only over extended periods of use. This misalignment leaves us misunderstanding AI’s capabilities, overlooking systemic risks, and misjudging its economic and social consequences.
To mitigate this, it’s time to shift from narrow methods to benchmarks that assess how AI systems perform over longer time horizons within human teams, workflows, and organizations. I have studied real-world AI deployment since 2022 in small businesses and health, humanitarian, nonprofit, and higher-education organizations in the UK, the United States, and Asia, as well as within leading AI design ecosystems in London and Silicon Valley. I propose a different approach, which I call HAIC benchmarks—Human–AI, Context-Specific Evaluation.
What happens when AI fails
For governments and businesses, AI benchmark scores appear more objective than vendor claims. They’re a critical part of determining whether an AI model or application is “good enough” for real-world deployment. Imagine an AI model that achieves impressive technical scores on the most cutting-edge benchmarks—98% accuracy, groundbreaking speed, compelling outputs. On the strength of these results, organizations may decide to adopt the model, committing sizable financial and technical resources to purchasing and integrating it.
But then, once it’s adopted, the gap between benchmark and real-world performance quickly becomes visible. For example, take the swathe of FDA-approved AI models that can read medical scans faster and more accurately than an expert radiologist. In the radiology units of hospitals from the heart of California to the outskirts of London, I witnessed staff using highly ranked radiology AI applications. Repeatedly, it took them extra time to interpret AI’s outputs alongside hospital-specific reporting standards and nation-specific regulatory requirements. What appeared as a productivity-enhancing AI tool when tested in a vacuum introduced delays in practice. It soon became clear that the benchmark tests on which medical AI models are assessed do not capture how medical decisions are actually made. Hospitals rely on multidisciplinary teams—radiologists, oncologists, physicists, nurses—who jointly review patients. Treatment planning rarely hinges on a static decision; it evolves as new information emerges over days or weeks. Decisions often arise through constructive debate and trade-offs between professional standards, patient preferences, and the shared goal of long-term patient well-being. No wonder even highly scored AI models struggle to deliver the promised performance once they encounter the complex, collaborative processes of real clinical care. The same pattern emerges in my research across other sectors: When embedded within real-world work environments, even AI models that perform brilliantly on standardized tests don’t perform as promised. When high benchmark scores fail to translate into real-world performance, even the most highly scored AI is soon abandoned to what I call the “AI graveyard.” The costs are significant: Time, effort and money end up being wasted. And over time, repeated experiences like this erode organizational confidence in AI and—in critical settings such as health—may erode broader public trust in the technology as well. 
When current benchmarks provide only a partial and potentially misleading signal of an AI model’s readiness for real-world use, this creates regulatory blind spots: Oversight is shaped by metrics that do not reflect reality. It also leaves organizations and governments to shoulder the risks of testing AI in sensitive real-world settings, often with limited resources and support.
How to build better tests
To close the gap between benchmark and real-world performance, we must pay attention to the actual conditions in which AI models will be used. The critical questions: Can AI function as a productive participant within human teams? And can it generate sustained, collective value? Through my research on AI deployment across multiple sectors, I have seen a number of organizations already moving—deliberately and experimentally—toward the HAIC benchmarks I favor. HAIC benchmarks reframe current benchmarking in four ways:
1. From individual and single-task performance to team and workflow performance (shifting the unit of analysis)
2. From one-off testing with right/wrong answers to long-term impacts (expanding the time horizon)
3. From correctness and speed to organizational outcomes, coordination quality, and error detectability (expanding outcome measures)
4. From isolated outputs to upstream and downstream consequences (system effects)
Across the organizations where this approach has emerged and started to be applied, the first step is shifting the unit of analysis. For example, in one UK hospital system in the period 2021–2024, the question expanded from whether a medical AI application improves diagnostic accuracy to how the presence of AI within the hospital’s multidisciplinary teams affects not only accuracy but also coordination and deliberation. The hospital specifically assessed coordination and deliberation in human teams using and not using AI. Multiple stakeholders (within and outside the hospital) decided on metrics like how AI influences collective reasoning, whether it surfaces overlooked considerations, whether it strengthens or weakens coordination, and whether it changes established risk and compliance practices. This shift is fundamental. It matters a lot in high-stakes contexts where system-level effects matter more than task-level accuracy. It also matters for the economy. It may help recalibrate inflated expectations of sweeping productivity gains that are so far predicated largely on the promise of improving individual task performance. Once that foundation is set, HAIC benchmarking can begin to take on the element of time.
Today’s benchmarks resemble school exams—one-off, standardized tests of accuracy. But real professional competence is assessed differently. Junior doctors and lawyers are evaluated continuously inside real workflows, under supervision, with feedback loops and accountability structures. Performance is judged over time and in a specific context, because competence is relational. If AI systems are meant to operate alongside professionals, their impact should be judged longitudinally, reflecting how performance unfolds over repeated interactions. I saw this aspect of HAIC applied in one of my humanitarian-sector case studies. Over 18 months, an AI system was evaluated within real workflows, with particular attention to how detectable its errors were—that is, how easily human teams could identify and correct them. This long-term “record of error detectability” meant the organizations involved could design and test context-specific guardrails to promote trust in the system, despite the inevitability of occasional AI mistakes.
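One way to picture a long-term “record of error detectability” is as a simple ratio tracked over time: of the AI errors that occurred in a given period, what share did the human team identify? The sketch below is only an illustration under that assumption. The log format, the function name, and the numbers are all hypothetical and do not come from the case study.

```python
from collections import defaultdict

# Hypothetical log entries: (month, whether the human team caught the AI error).
# These values are invented for illustration only.
error_log = [
    (1, False), (1, True), (1, False),
    (2, True), (2, True), (2, False),
    (3, True), (3, True), (3, True),
]


def detectability_by_month(log):
    """Return, for each month, the share of AI errors that humans identified."""
    caught = defaultdict(int)
    total = defaultdict(int)
    for month, was_caught in log:
        total[month] += 1
        caught[month] += int(was_caught)
    return {month: caught[month] / total[month] for month in sorted(total)}


record = detectability_by_month(error_log)
for month, share in record.items():
    print(f"month {month}: {share:.0%} of AI errors were caught")
```

A rising share over successive months would suggest that guardrails and team practices are making mistakes easier to spot. A real evaluation would also need an agreed way of establishing which outputs were errors in the first place.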
A longer time horizon also makes visible the system-level consequences that short-term benchmarks miss. An AI application may outperform a single doctor on a narrow diagnostic task yet fail to improve multidisciplinary decision-making. Worse, it may introduce systemic distortions: anchoring teams too early in plausible but incomplete answers, adding to people’s cognitive workloads, or generating downstream inefficiencies that offset any speed or efficiency gains at the point of the AI’s use. These knock-on effects—often invisible to current benchmarks—are central to understanding real impact. The HAIC approach admittedly promises to make benchmarking more complex, resource-intensive, and harder to standardize. But continuing to evaluate AI in sanitized conditions detached from the world of work will leave us misunderstanding what it truly can and cannot do for us. To deploy AI responsibly in real-world settings, we must measure what actually matters: not just what a model can do alone, but what it enables—or undermines—when humans and teams in the real world work with it. Angela Aristidou is a professor at University College London and a faculty fellow at the Stanford Digital Economy Lab and the Stanford Human-Centered AI Institute. She speaks, writes, and advises about the real-life deployment of artificial-intelligence tools for public good.

The Download: AI health tools and the Pentagon’s Anthropic culture war
This is today’s edition of The Download, our weekday newsletter that provides a daily dose of what’s going on in the world of technology. There are more AI health tools than ever—but how well do they work? In the last few months alone, Microsoft, Amazon, and OpenAI have all launched medical chatbots. There’s a clear demand for these tools, given how hard it is for many people to access advice through the existing medical system—and they could make safe and useful recommendations. But concerns have surfaced about how little external evaluation they undergo before being released to the public. Read the full story to understand what’s at stake.
—Grace Huckins The Pentagon’s culture war tactic against Anthropic has backfired A judge has temporarily blocked the Pentagon from labeling Anthropic a supply chain risk and ordering government agencies to stop using its AI. Her intervention suggests that the feud never needed to reach such a frenzy.
It did so because the government disregarded the existing process for such disputes—and fueled the fire on social media. Find out how it happened and what comes next. —James O’Donnell This story is from The Algorithm, our weekly newsletter giving you the inside track on all things AI. Sign up to receive it in your inbox every Monday. The must-reads I’ve combed the internet to find you today’s most fun/important/scary/fascinating stories about technology. 1 California has defied Trump to impose new AI regulations Governor Newsom signed off on the new standards yesterday. (Guardian) + Firms seeking state contracts will need extra safeguards. (Reuters $) + States are installing guardrails despite Trump’s order to stop. (NYT $) + An AI regulation war is brewing in the US. (MIT Technology Review) 2 Experiments have verified quantum simulations for the first time It’s a breakthrough for quantum computing applications. (Nature) + Which could one day help solve healthcare problems. (MIT Technology Review) 3 The new White House app is a security and privacy nightmare It extensively tracks users and relies on external code. (Gizmodo) + The new app promises “unparalleled access” to Trump. (CNET) + It also invites users to report people to ICE. (The Verge)
4 Big Tech’s $635 billion AI spending faces an energy shock test The Middle East crisis is clouding prospects for growth. (Reuters $) + Here are three big unknowns about AI’s energy burden. (MIT Technology Review) 5 Meta and Google have been accused of breaking child safety rules Australia suspects they flouted a social media ban. (Bloomberg $) + Indonesia is also investigating non-compliance. (Reuters $) 6 Nebius is building a $10 billion AI data center in Finland The company is rapidly expanding Europe’s AI infrastructure. (CNBC) 7 South Korea’s chipmakers’ helium stocks will last until June Beyond that? Who knows. (Reuters $) + Shortages caused by the Iran war threaten the chip industry. (NYT $) 8 Another Starlink satellite has inexplicably exploded SpaceX suffered a similar episode in December. (The Verge) + We went inside Ukraine’s largest Starlink repair shop. (MIT Technology Review) 9 Bluesky’s new AI tool is already its most blocked account—after JD Vance About 83 times as many users have blocked it as have followed it. (TechCrunch) 10 An AI agent banned from Wikipedia has lashed out in angry blogs The bot accused its human editors of “uncivil behavior.” (404 Media) Quote of the day
“Is any of this illegal? Probably not. Is it what you’d expect from an official government app? Probably not either.” —Security researcher Thereallo reviews the White House’s new app. One More Thing
Inside Amsterdam’s high-stakes experiment to create fair welfare AI When Hans de Zwart, a digital rights advocate, saw Amsterdam’s plan to have an algorithm evaluate every welfare applicant for potential fraud, he nearly fell out of his chair. He believed the system had “unfixable problems.” Meanwhile, Paul de Koning, a consultant to the city, was excited. He saw immense potential to improve efficiencies and remove biases.

There are more AI health tools than ever—but how well do they work?
EXECUTIVE SUMMARY Earlier this month, Microsoft launched Copilot Health, a new space within its Copilot app where users will be able to connect their medical records and ask specific questions about their health. A couple of days earlier, Amazon had announced that Health AI, an LLM-based tool previously restricted to members of its One Medical service, would now be widely available. These products join the ranks of ChatGPT Health, which OpenAI released back in January, and Anthropic’s Claude, which can access user health records if granted permission. Health AI for the masses is officially a trend. There’s a clear demand for chatbots that provide health advice, given how hard it is for many people to access it through existing medical systems. And some research suggests that current LLMs are capable of making safe and useful recommendations. But researchers say that these tools should be more rigorously evaluated by independent experts, ideally before they are widely released. In a high-stakes area like health, trusting companies to evaluate their own products could prove unwise, especially if those evaluations aren’t made available for external expert review. And even if the companies are doing quality, rigorous research—which some, including OpenAI, do seem to be—they might still have blind spots that the broader research community could help to fill. “To the extent that you always are going to need more health care, I think we should definitely be chasing every route that works,” says Andrew Bean, a doctoral candidate at the Oxford Internet Institute. “It’s entirely plausible to me that these models have reached a point where they’re actually worth rolling out.”
“But,” he adds, “the evidence base really needs to be there.” Tipping points
To hear developers tell it, these health products are now being released because large language models have indeed reached a point where they can effectively provide medical advice. Dominic King, the vice president of health at Microsoft AI and a former surgeon, cites AI advancement as a core reason why the company’s health team was formed, and why Copilot Health now exists. “We’ve seen this enormous progress in the capabilities of generative AI to be able to answer health questions and give good responses,” he says. But that’s only half the story, according to King. The other key factor is demand. Shortly before Copilot Health was launched, Microsoft published a report, and an accompanying blog post, detailing how people used Copilot for health advice. The company says it receives 50 million health questions each day, and health is the most popular discussion topic on the Copilot mobile app. Other AI companies have noticed, and responded to, this trend. “Even before our health products, we were seeing just a rapid, rapid increase in the rate of people using ChatGPT for health-related questions,” says Karan Singhal, who leads OpenAI’s Health AI team. (OpenAI and Microsoft have a long-standing partnership, and Copilot is powered by OpenAI’s models.) It’s possible that people simply prefer posing their health problems to a nonjudgmental bot that’s available to them 24-7. But many experts interpret this pattern in light of the current state of the health-care system. “There is a reason that these tools exist and they have a position in the overall landscape,” says Girish Nadkarni, chief AI officer at the Mount Sinai Health System. “That’s because access to health care is hard, and it’s particularly hard for certain populations.” The virtuous vision of consumer-facing LLM health chatbots hinges on the possibility that they could improve user health while reducing pressure on the health-care system. 
That might involve helping users decide whether or not they need medical attention, a task known as triage. If chatbot triage works, then patients who need emergency care might seek it out earlier than they would have otherwise, and patients with more mild concerns might feel comfortable managing their symptoms at home with the chatbot’s advice rather than unnecessarily busying emergency rooms and doctor’s offices. But a recent, widely discussed study from Nadkarni and other researchers at Mount Sinai found that ChatGPT Health sometimes recommends too much care for mild conditions and fails to identify emergencies. Though Singhal and some other experts have suggested that its methodology might not provide a complete picture of ChatGPT Health’s capabilities, the study has surfaced concerns about how little external evaluation these tools see before being released to the public. Most of the academic experts interviewed for this piece agreed that LLM health chatbots could have real upsides, given how little access to health care some people have. But all six of them expressed concerns that these tools are being launched without testing from independent researchers to assess whether they are safe. While some advertised uses of these tools, such as recommending exercise plans or suggesting questions that a user might ask a doctor, are relatively harmless, others carry clear risks. Triage is one; another is asking a chatbot to provide a diagnosis or a treatment plan. The ChatGPT Health interface includes a prominent disclaimer stating that it is not intended for diagnosis or treatment, and the announcements for Copilot Health and Amazon’s Health AI include similar warnings. But those warnings are easy to ignore. “We all know that people are going to use it for diagnosis and management,” says Adam Rodman, an internal medicine physician and researcher at Beth Israel Deaconess Medical Center and a visiting researcher at Google.
Medical testing Companies say they are testing the chatbots to ensure that they provide safe responses the vast majority of the time. OpenAI has designed and released HealthBench, a benchmark that scores LLMs on how they respond in realistic health-related conversations—though the conversations themselves are LLM-generated. When GPT-5, which powers both ChatGPT Health and Copilot Health, was released last year, OpenAI reported the model’s HealthBench scores: It did substantially better than previous OpenAI models, though its overall performance was far from perfect. But evaluations like HealthBench have limitations. In a study published last month, Bean—the Oxford doctoral candidate—and his colleagues found that even if an LLM can accurately identify a medical condition from a fictional written scenario on its own, a non-expert user who is given the scenario and asked to determine the condition with LLM assistance might figure it out only a third of the time. If they lack medical expertise, users might not know which parts of a scenario—or their real-life experience—are important to include in their prompt, or they might misinterpret the information that an LLM gives them. Bean says that this performance gap could be significant for OpenAI’s models. In the original HealthBench study, the company reported that its models performed relatively poorly in conversations that required them to seek more information from the user. If that’s the case, then users who don’t have enough medical knowledge to provide a health chatbot with the information that it needs from the get-go might get unhelpful or inaccurate advice. Singhal, the OpenAI health lead, notes that the company’s current GPT-5 series of models, which had not yet been released when the original HealthBench study was conducted, do a much better job of soliciting additional information than their predecessors. 
However, OpenAI has reported that GPT-5.4, the current flagship, is actually worse at seeking context than GPT-5.2, an earlier version. Ideally, Bean says, health chatbots would be subjected to controlled tests with human users, as they were in his study, before being released to the public. That might be a heavy lift, particularly given how fast the AI world moves and how long human studies can take. Bean’s own study used GPT-4o, which came out almost a year ago and is now outdated. Earlier this month, Google released a study that meets Bean’s standards. In the study, patients discussed medical concerns with the company’s Articulate Medical Intelligence Explorer (AMIE), a medical LLM chatbot that is not yet available to the public, before meeting with a human physician. Overall, AMIE’s diagnoses were just as accurate as physicians’, and none of the conversations raised major safety concerns for researchers. Despite the encouraging results, Google isn’t planning to release AMIE anytime soon. “While the research has advanced, there are significant limitations that must be addressed before real-world translation of systems for diagnosis and treatment, including further research into equity, fairness, and safety testing,” wrote Alan Karthikesalingam, a research scientist at Google DeepMind, in an email. Google did recently reveal that Health100, a health platform it is building in partnership with CVS, will include an AI assistant powered by its flagship Gemini models, though that tool will presumably not be intended for diagnosis or treatment.
Rodman, who led the AMIE study with Karthikesalingam, doesn’t think such extensive, multiyear studies are necessarily the right approach for chatbots like ChatGPT Health and Copilot Health. “There’s lots of reasons that the clinical trial paradigm doesn’t always work in generative AI,” he says. “And that’s where this benchmarking conversation comes in. Are there benchmarks [from] a trusted third party that we can agree are meaningful, that the labs can hold themselves to?” The key there is “third party.” No matter how extensively companies evaluate their own products, it’s tough to trust their conclusions completely. Not only does a third-party evaluation bring impartiality, but if there are many third parties involved, it also helps protect against blind spots.
OpenAI’s Singhal says he’s strongly in favor of external evaluation. “We try our best to support the community,” he says. “Part of why we put out HealthBench was actually to give the community and other model developers an example of what a very good evaluation looks like.” Given how expensive it is to produce a high-quality evaluation, he says, he’s skeptical that any individual academic laboratory would be able to produce what he calls “the one evaluation to rule them all.” But he does speak highly of efforts that academic groups have made to bring preexisting and novel evaluations together into comprehensive evaluation suites—such as Stanford’s MedHELM framework, which tests models on a wide variety of medical tasks. Currently, OpenAI’s GPT-5 holds the highest MedHELM score. Nigam Shah, a professor of medicine at Stanford University who led the MedHELM project, says it has limitations. In particular, it only evaluates individual chatbot responses, but someone who’s seeking medical advice from a chatbot tool might engage it in a multi-turn, back-and-forth conversation. He says that he and some collaborators are gearing up to build an evaluation that can score those complex conversations, but that it will take time and money. “You and I have zero ability to stop these companies from releasing [health-oriented products], so they’re going to do whatever they damn please,” he says. “The only thing people like us can do is find a way to fund the benchmark.” No one interviewed for this article argued that health LLMs need to perform perfectly on third-party evaluations in order to be released. Doctors themselves make mistakes—and for someone who has only occasional access to a doctor, a consistently accessible LLM that sometimes messes up could still be a huge improvement over the status quo, as long as its errors aren’t too grave. 
With the current state of the evidence, however, it’s impossible to know for sure whether the currently available tools do in fact constitute an improvement, or whether their risks outweigh their benefits.


Tokenomics: Why IT leaders need to pay attention to AI tokens
“Not only do you know you need the fast throughput, but you need the fast response time, because ultimately that end-to-end operation is going to define how long it takes you to get back your full answer,” he said. How tokens drive enterprise decisions Tokens are used as the currency for public AI products, such as text to image or image to video conversions. Within an enterprise, this situation is different. Token consumption is commonly expressed in cost per million tokens. In enterprise settings, this often translates to two approaches: metered usage models, where departments or applications consume tokens against a defined budget; and enterprise or site licenses, where organizations negotiate volume-based pricing to manage costs at scale. Some enterprises may allocate token budgets to departments, setting soft or hard limits to control usage. Others may rely on centralized licensing to simplify governance and cost management. Either way, tokenomics becomes a core part of financial planning for AI initiatives. “Selling” tokens to your employees may seem strange, but Salvadore said employees will not be given a blank check to use enterprise AI applications or public AI services. “[IT] has to think about how can we do this in a way that’s going to get our organization the capabilities that they need to really make a good use of authentic AI, while at the same time balancing the ability for individual users to be able to get what they need quickly enough, while also balancing cost,” he said. Virtually all of the public API services providers offer tiered usage. For example, ChatGPT has four pricing plans, from free to Pro, which runs for $200 per month but offers considerably more services than the free version. Through tokenomics, enterprises can buy Pro-level services but limit their use or availability.
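The metered approach described above, with cost expressed per million tokens and departments held to soft or hard limits, can be sketched in a few lines. This is a minimal illustration and not any vendor's billing API. The class name, the limits, and the $2.50-per-million rate are all made-up assumptions.

```python
from dataclasses import dataclass


def cost_usd(tokens: int, price_per_million_usd: float) -> float:
    """Convert a token count to dollars at a per-million-token rate."""
    return tokens / 1_000_000 * price_per_million_usd


@dataclass
class DepartmentBudget:
    """Hypothetical per-department token meter with soft and hard limits."""
    name: str
    soft_limit: int  # tokens; crossing it only triggers a warning
    hard_limit: int  # tokens; a request that would cross it is refused
    used: int = 0

    def record(self, tokens: int) -> str:
        """Meter one request and return 'ok', 'warn', or 'denied'."""
        if self.used + tokens > self.hard_limit:
            return "denied"  # refused requests are not added to usage
        self.used += tokens
        return "warn" if self.used > self.soft_limit else "ok"


# Illustrative rate only; real prices vary by provider and model.
PRICE_PER_MILLION_USD = 2.50

marketing = DepartmentBudget("marketing", soft_limit=8_000_000, hard_limit=10_000_000)
statuses = [marketing.record(n) for n in (6_000_000, 3_000_000, 2_000_000)]
spend = cost_usd(marketing.used, PRICE_PER_MILLION_USD)

print(statuses)           # ['ok', 'warn', 'denied']
print(f"${spend:.2f}")    # $22.50
```

The soft limit lets IT flag heavy usage without interrupting work, while the hard limit caps spending. An organization on a centralized site license might track usage the same way purely for reporting, without enforcing either limit.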

Schneider Electric Maps the AI Data Center’s Next Design Era
The coming shift to higher-voltage DC That internal power challenge led Simonelli to one of the most consequential architectural topics in the interview: the likely transition toward higher-voltage DC distribution at very high rack densities. He framed it pragmatically. At current density levels, the industry knows how to get power into racks at 200 or 300 kilowatts. But as densities rise toward 400 kilowatts and beyond, conventional AC approaches start to run into physical limits. Too much cable, too much copper, too much conversion equipment, and too much space consumed by power infrastructure rather than GPUs. At that point, he said, higher-voltage DC becomes attractive not for philosophical reasons, but because it reduces current, shrinks conductor size, saves space, and leaves more room for revenue-generating compute. “It is again a paradigm shift,” Simonelli said of DC power at these densities. “But it won’t be everywhere.” That is probably right. The transition will not be universal, and the exact thresholds will evolve. But his underlying point is powerful. As rack densities climb, electrical architecture starts to matter not only for efficiency and reliability, but for physical space allocation inside the rack. Put differently, power distribution becomes a compute-enablement issue. Distance between accelerators matters, too. The closer GPUs and TPUs can be kept together, the better they perform. If power infrastructure can be compacted, more of the rack can be devoted to dense compute, improving the economics and performance of the system. That is a strong example of how AI is collapsing traditional boundaries between facility engineering and compute architecture. The two are no longer cleanly separable. Gas now, renewables over time On onsite power, Simonelli was refreshingly direct. If the goal is dispatchable onsite generation at the scale now being contemplated for AI facilities, he said, “there really isn’t an alternative

The Download: gig workers training humanoids, and better AI benchmarks
This is today’s edition of The Download, our weekday newsletter that provides a daily dose of what’s going on in the world of technology. The gig workers who are training humanoid robots at home When Zeus, a medical student in Nigeria, returns to his apartment from a long day at the hospital, he straps his iPhone to his forehead and records himself doing chores. Zeus is a data recorder for Micro1, which sells the data he collects to robotics firms. As these companies race to build humanoids, videos from workers like Zeus have become the hottest new way to train them. Micro1 has hired thousands of them in more than 50 countries, including India, Nigeria, and Argentina. The jobs pay well locally, but raise thorny questions around privacy and informed consent. The work can be challenging—and weird. Read the full story.
—Michelle Kim Our readers recently voted humanoid robots the “11th breakthrough” to add to our 2026 list of 10 Breakthrough Technologies. Check out what else officially made the cut.
AI benchmarks are broken. Here’s what we need instead. For decades, AI has been evaluated based on whether it can outperform humans on isolated problems. But it’s seldom used this way in the real world. While AI is assessed in a vacuum, it operates in messy, complex, multi-person environments over time. This misalignment leads us to misunderstand its capabilities, risks, and impacts. We need new benchmarks that assess AI’s performance over longer horizons within human teams, workflows, and organizations. Here’s a proposal for one such approach: Human–AI, Context-Specific Evaluation. —Angela Aristidou, professor at University College London and faculty fellow at the Stanford Digital Economy Lab and the Stanford Human-Centered AI Institute. MIT Technology Review Narrated: can quantum computers now solve health care problems? We’ll soon find out. In a laboratory on the outskirts of Oxford, a quantum computer built from atoms and light awaits its moment. The device is small but powerful—and also very valuable. Infleqtion, the company that owns it, is hoping its abilities will win $5 million at a competition. The prize will go to the quantum computer that can solve real health care problems that “classical” computers cannot. But there can be only one big winner—if there is a winner at all. —Michael Brooks This is our latest story to be turned into an MIT Technology Review Narrated podcast, which we’re publishing each week on Spotify and Apple Podcasts. Just navigate to MIT Technology Review Narrated on either platform, and follow us to get all our new content as it’s released.
The must-reads I’ve combed the internet to find you today’s most fun/important/scary/fascinating stories about technology. 1 OpenAI just closed the biggest funding round in Silicon Valley history It raised $122 billion ahead of its blockbuster IPO, which is expected later this year. (WSJ $) + It’s also prepping a push to “rethink the social contract.” (Vanity Fair $) + Campaigners are urging people to quit ChatGPT. (MIT Technology Review) 2 Iran has threatened to attack 18 US tech companies It’s eyeing their operations in the Middle East. (Politico) + Targets include Nvidia, Apple, Microsoft, and Google. (Engadget) + Iran struck AWS data centers earlier this month. (Reuters $) 3 Artemis II is about to fly humans to the Moon. Here’s the science they’ll do Their experiments will set the stage for future explorers. (Nature) + You can watch the launch attempt today. (Engadget) 4 Putin is trying to take full control of Russia’s internet New outages and blockages are cutting the country off from the world. (NYT $) + Can we repair the internet? (MIT Technology Review) 5 A robotaxi outage in China left passengers stranded on highways Baidu vehicles froze on the streets of Wuhan. (Bloomberg $) + Police are blaming a “system failure.” (Reuters $) 6 US government requests for social media user data are soaring They’ve skyrocketed by 770% in the past decade. (Bloomberg $) + Is the Pentagon allowed to surveil Americans with AI? (MIT Technology Review)
7 Tesla has admitted that humans sometimes drive its robotaxis Remote drivers occasionally control them completely. (Wired $) 8 A satellite-smashing chain reaction could spiral out of control This data visualization captures the dangers of space collisions. (Guardian) + Here’s all the stuff we’ve put into space. (MIT Technology Review)
9 Meta’s smartglasses can turn you into a creep According to one journalist who wore them for a month. (Guardian) 10 A Claude Code leak has exposed plans for a virtual pet We could be getting a Tamagotchi for the GenAI era. (The Verge) Quote of the day “From now on, for every assassination, an American company will be destroyed.” —Iran’s Islamic Revolutionary Guard Corps (IRGC) threatens US tech firms in an affiliated Telegram, per CNBC. One More Thing How one mine could unlock billions in EV subsidies In a pine farm north of the tiny town of Tamarack, Minnesota, Talon Metals has uncovered one of America’s densest nickel deposits. Now it wants to begin mining the ore.
Products made from the nickel could net more than $26 billion in subsidies through the Inflation Reduction Act (IRA), which is starting to transform the US economy. To understand how, we tallied up the potential tax credits available. Read the full story to find out what we discovered. —James Temple We can still have nice things A place for comfort, fun and distraction to brighten up your day. (Got any ideas? Drop me a line.) + A selfless group of gluttons tried to taste-test every potato chip in the world. + Get romantic inspiration from these penguins’ engagement pebbles. + Good news: global terrorism has hit a 15-year low. + Enjoy endless new views through these windows around the world.

The gig workers who are training humanoid robots at home
When Zeus, a medical student living in a hilltop city in central Nigeria, returns to his studio apartment from a long day at the hospital, he turns on his ring light, straps his iPhone to his forehead, and starts recording himself. He raises his hands in front of him like a sleepwalker and puts a sheet on his bed. He moves slowly and carefully to make sure his hands stay within the camera frame. Zeus is a data recorder for Micro1, a US company based in Palo Alto, California, that collects real-world data to sell to robotics companies. As companies like Tesla, Figure AI, and Agility Robotics race to build humanoids—robots designed to resemble and move like humans in factories and homes—videos recorded by gig workers like Zeus are becoming the hottest new way to train them. Micro1 has hired thousands of contract workers in more than 50 countries, including India, Nigeria, and Argentina, where swathes of tech-savvy young people are looking for jobs. They’re mounting iPhones on their heads and recording themselves folding laundry, washing dishes, and cooking. The job pays well by local standards and is boosting local economies, but it raises thorny questions around privacy and informed consent. And the work can be challenging at times—and weird. Zeus found the job in November, when people started talking about it everywhere on LinkedIn and YouTube. “This would be a real nice opportunity to set a mark and give data that will be used to train robots in the future,” he thought.
Zeus is paid $15 an hour, which is good income in Nigeria’s strained economy with high unemployment rates. But as a bright-eyed student dreaming of becoming a doctor, he finds ironing his clothes for hours every day boring. “I really [do] not like it so much,” he says. “I’m the kind of person that requires … a technical job that requires me to think.”
Zeus, and all the workers interviewed by MIT Technology Review, asked to be referred to only by pseudonyms because they were not authorized to talk about their work. Humanoid robots are notoriously hard to build because manipulating physical objects is a difficult skill to master. But the rise of large language models underlying chatbots like ChatGPT has inspired a paradigm shift in robotics. Just as large language models learned to generate words by being trained on vast troves of text scraped from the internet, many researchers believe that humanoid robots can learn to interact with the world by being trained on massive amounts of movement data. Editor’s note: In a recent poll, MIT Technology Review readers selected humanoid robots as the 11th breakthrough for our 2026 list of 10 Breakthrough Technologies. Robotics requires far more complex data about the physical world, though, and that is much harder to find. Virtual simulations can train robots to perform acrobatics, but not how to grasp and move objects, because simulations struggle to model physics with perfect accuracy. For robots to work in factories and serve as housekeepers, real-world data, however time-consuming and expensive to collect, may be what we need. Investors are pouring money feverishly into solving this challenge, spending over $6 billion on humanoid robots in 2025. And at-home data recording is becoming a booming gig economy around the world. Data companies like Scale AI and Encord are recruiting their own armies of data recorders, while DoorDash pays delivery drivers to film themselves doing chores. And in China, workers in dozens of state-owned robot training centers wear virtual-reality headsets and exoskeletons to teach humanoid robots how to open a microwave and wipe down the table. “There is a lot of demand, and it’s increasing really fast,” says Ali Ansari, CEO of Micro1. 
He estimates that robotics companies are now spending more than $100 million each year to buy real-world data from his company and others like it.
A day in the life
Workers at Micro1 are vetted by an AI agent named Zara that conducts interviews and reviews samples of chore videos. Every week, they submit videos of themselves doing chores around their homes, following a list of instructions about things like keeping their hands visible and moving at natural speed. The videos are reviewed by both AI and a human and are either accepted or rejected. They’re then annotated by AI and a team of hundreds of humans who label the actions in the footage.
Because this approach to training robots is in its infancy, it’s not clear yet what makes good training data. Still, “you need to give lots and lots of variations for the robot to generalize well for basic navigation and manipulation of the world,” says Ansari.
But many workers say that creating a variety of “chore content” in their tiny homes is a challenge. Zeus, a scrappy student living in a humble studio, struggles to record anything beyond ironing his clothes every day. Arjun, a tutor in Delhi, India, takes an hour to make a 15-minute video because he spends so much time brainstorming new chores. “How much content [can be made] in the home? How much content?” he says.
There’s also the sticky question of privacy. Micro1 asks workers not to show their faces to the camera or reveal personal information such as names, phone numbers, and birth dates. Then it uses AI and human reviewers to remove anything that slips through. But even without faces, the videos capture an intimate slice of workers’ lives: the interiors of their homes, their possessions, their routines. And understanding what kind of personal information they might be recording while they’re busy doing chores on camera can be tricky. Reviews of such footage might not filter out sensitive information beyond the most obvious identifiers.
For workers with families, keeping private life off camera is a constant negotiation. Arjun, a father of two daughters, has to wrangle his chaotic two-year-old out of frame. “Sometimes it’s very difficult to work because my daughter is small,” he says. Sasha, a banker turned data recorder in Nigeria, tiptoes around when she hangs her laundry outside in a shared residential compound so she won’t record her neighbors, who watch her in bewilderment.
While the workers interviewed by MIT Technology Review understand that their data is being used to train robots, none of them know how exactly their data will be used, stored, and shared with third parties, including the robotics companies that Micro1 is selling the data to.
For confidentiality reasons, says Ansari, Micro1 doesn’t name its clients or disclose to workers the specific nature of the projects they are contributing to. “It is important that if workers are engaging in this, that they are informed by the companies themselves of the intention … where this kind of technology might go and how that might affect them longer term,” says Yasmine Kotturi, a professor of human-centered computing at the University of Maryland.
Occasionally, some workers say, they’ve seen other workers asking on the company Slack channel if the company could delete their data. Micro1 declined to comment on whether such data is deleted. “People are opting into doing this,” says Ansari. “They could stop the work at any time.”
Hungry for data
With thousands of workers doing their chores differently in different homes, some roboticists wonder if the data collected from them is reliable enough to train robots safely. “How we conduct our lives in our homes is not always right from a safety point of view,” says Aaron Prather, a roboticist at ASTM International. “If those folks are teaching those bad habits that could lead to an incident, then that’s not good data.” And the sheer volume of data being collected makes reviewing it for quality control challenging. But Ansari says the company rejects videos showing unsafe ways of performing a task, while clumsy movements can be useful to teach robots what not to do.
Then there’s the question of how much of this data we need. Micro1 says it has tens of thousands of hours of footage, while Scale AI announced it had gathered more than 100,000 hours. “It’s going to take a long time to get there,” says Ken Goldberg, a roboticist at the University of California, Berkeley. Large language models were trained on text and images that would take a human 100,000 years to read, and humanoid robots may need even more data, because controlling robotic joints is even more complicated than generating text. “It’s going to take longer than people think,” he says.
When Dattu, an engineering student living in a bustling tech hub in India, comes home after a full day of classes at his university, he skips dinner and dashes to his tiny balcony, cramped with potted plants and dumbbells. He straps his iPhone to his forehead and records himself folding the same set of clothes over and over again. His family stares at him quizzically. “It’s like some space technology for them,” he says. When he tells his friends about his job, “they just get astounded by the idea that they can get paid by recording chores.” Juggling his university studies with data recording, as well as other data annotation gigs, takes a toll on him.
Still, “it feels like you’re doing something different than the whole world,” he says.

Microsoft facing CMA probe of its business software portfolio
Smith added that Microsoft recognizes that the CMA “will continue to review and assess additional issues relating to our products and services, including in the business software market. We are committed to working quickly and constructively to address these issues, including by providing all the information the CMA needs to move forward with its reviews.”
A welcome move
Matthew Sinclair, senior director and head of the London office of the Computer & Communications Industry Association (CCIA), a group which represents a cross section of communications and technology firms, described the move by the CMA as “welcome news.” It will, he said, “avoid overly broad and prescriptive interventions that would have impeded investment and innovation in UK cloud services. The regulator can focus its efforts on action to address specific issues, particularly restrictive software licensing terms for legacy software, which are costing UK users a fortune.”
A resilience and digital sovereignty issue
In response to both CMA decisions, Forrester senior analyst Dario Maisto said, “in times of increasing geopolitical volatility, organizations and authorities are reassessing risks coming from dependencies on foreign providers, to improve their digital sovereignty posture.” He pointed out, “if we consider that Microsoft and AWS own some 70% of the European and UK public cloud market, we can easily understand how emerging sovereignty concerns add to existing concentration risk in a mix that urges action now more than ever.” According to Maisto, Microsoft’s case is under even more regulatory scrutiny because European and UK organizations have a strong dependency on its productivity suite, regardless of the infrastructure layer.
Stay Ahead with the Paperboy Newsletter
Your weekly dose of insights into AI, Bitcoin mining, Datacenters and Energy industry news. Spend 3-5 minutes and catch up on a week of news.