This is today’s edition of The Download, our weekday newsletter that provides a daily dose of what’s going on in the world of technology.
How to build a better AI benchmark
It’s not easy being one of Silicon Valley’s favorite benchmarks.
SWE-Bench (pronounced “swee bench”) launched in November 2024 as a way to evaluate an AI model’s coding skill. It has since quickly become one of the most popular tests in AI. A SWE-Bench score has become a mainstay of major model releases from OpenAI, Anthropic, and Google—and outside of foundation models, the fine-tuners at AI firms are in constant competition to see who can rise above the pack.
Despite all the fervor, this isn’t exactly a truthful assessment of which model is “better.” Entrants have begun to game the system—which is pushing many others to wonder whether there’s a better way to actually measure AI achievement. Read the full story.
—Russell Brandom
Did solar power cause Spain’s blackout?
At roughly midday on Monday, April 28, the lights went out in Spain. The grid blackout, which extended into parts of Portugal and France, affected tens of millions of people—flights were grounded, cell networks went down, and businesses closed for the day.
Over a week later, officials still aren’t entirely sure what happened, but some have suggested that renewables may have played a role, because just before the outage happened, wind and solar accounted for about 70% of electricity generation. Others, including Spanish government officials, insist that it’s too early to assign blame.
It’ll take weeks to get the full report, but we do know a few things about what happened. Here are a few takeaways that could help our future grid.
—Casey Crownhart
This article is from The Spark, MIT Technology Review’s weekly climate newsletter. To receive it in your inbox every Wednesday, sign up here.
The must-reads
I’ve combed the internet to find you today’s most fun/important/scary/fascinating stories about technology.
1 The Trump administration will repeal some global chip curbs
It’s drawing up new rules that prioritize direct negotiations with various nations. (Bloomberg $)
+ The curbs have always been leaky anyway. (Economist $)
2 India and Pakistan have accused each other of overnight drone attacks
The conflict between the two countries is rapidly escalating. (The Guardian)
+ Pakistan claims to have shot down 25 drones in its airspace. (Reuters)
+ Mass-market military drones have changed the way wars are fought. (MIT Technology Review)
3 The FDA is interested in using AI for drug evaluation
And has met with OpenAI to hear more about how to do it. (Wired $)
+ An AI-driven “factory of drugs” claims to have hit a big milestone. (MIT Technology Review)
4 The US is pushing nations facing its tariffs to adopt Starlink
Government officials in India and other countries have fast tracked approvals. (WP $)
+ India recently announced new rules for satellite internet providers. (Rest of World)
5 Apple is overhauling its Safari browser to focus on AI search
Its search volume is down for the first time in 22 years. (The Verge)
+ Apple exec Eddy Cue thinks AI search will replace traditional search engines. (Bloomberg $)
+ AI means the end of internet search as we’ve known it. (MIT Technology Review)
6 Mark Zuckerberg is betting big on AI chatbots
He’s on a media charm offensive to convince us that AI friends are the future. (WSJ $)
+ The AI relationship revolution is already here. (MIT Technology Review)
7 Students can’t wean themselves off ChatGPT
And experts fear that they’ll emerge into the workforce essentially illiterate. (NY Mag $)
+ Some educators believe that AI highlights how the ways we teach need to change. (MIT Technology Review)
8 We don’t really know how memory works 🧠
But these researchers are doing their best to find out. (Quanta Magazine)
9 The vast majority of the sea depths are still unexplored
What lies beneath is a mystery. (New Scientist $)
+ Meet the divers trying to figure out how deep humans can go. (MIT Technology Review)
10 Pet psychics are taking over TikTok 🔮
But does your furry friend have anything to say?(NYT $)
+ Humans are still better than AI at futuregazing—for now. (Vox)
+ How DeepSeek became a fortune teller for China’s youth. (MIT Technology Review)
Quote of the day
“It’s like living in hell.”
—Elizabeth Martorana, a Virginia resident, describes what it’s like to live in a development zone for Amazon, Microsoft, and Google data centers, Semafor reports.
One more thing
How Antarctica’s history of isolation is ending—thanks to Starlink
“This is one of the least visited places on planet Earth and I got to open the door,” Matty Jordan, a construction specialist at New Zealand’s Scott Base in Antarctica, wrote in the caption to the video he posted to Instagram and TikTok in October 2023.
In the video, he guides viewers through the hut, pointing out where the men of Ernest Shackleton’s 1907 expedition lived and worked.
The video has racked up millions of views from all over the world. It’s also kind of a miracle: until very recently, those who lived and worked on Antarctic bases had no hope of communicating so readily with the outside world.
That’s starting to change, thanks to Starlink, the satellite constellation developed by Elon Musk’s company SpaceX to service the world with high-speed broadband internet. Read the full story.
—Allegra Rosenberg
We can still have nice things
A place for comfort, fun and distraction to brighten up your day. (Got any ideas? Drop me a line or skeet ’em at me.)
+ Does Boston still drink? Not in the same way it used to.
+ Where in the US you should set up camp to stargaze right now.
+ Wow: this New Zealand snail lays eggs from its neck. 🐌
+ Jurassic World Rebirth is coming: and it looks suitably bonkers.