
The two people shaping the future of OpenAI’s research


For the past couple of years, OpenAI has felt like a one-man brand. With his showbiz style and fundraising glitz, CEO Sam Altman overshadows all other big names on the firm’s roster. Even his bungled ouster ended with him back on top—and more famous than ever. But look past the charismatic frontman and you get a clearer sense of where this company is going. After all, Altman is not the one building the technology on which its reputation rests. 

That responsibility falls to OpenAI’s twin heads of research—chief research officer Mark Chen and chief scientist Jakub Pachocki. Between them, they share the role of making sure OpenAI stays one step ahead of powerhouse rivals like Google.

I sat down with Chen and Pachocki for an exclusive conversation during a recent trip the pair made to London, where OpenAI set up its first international office in 2023. We talked about how they manage the inherent tension between research and product. We also talked about why they think coding and math are the keys to more capable all-purpose models; what they really mean when they talk about AGI; and what happened to OpenAI’s superalignment team, which was set up by the firm’s cofounder and former chief scientist Ilya Sutskever to prevent a hypothetical superintelligence from going rogue and which disbanded soon after he quit.

In particular, I wanted to get a sense of where their heads are at in the run-up to OpenAI’s biggest product release in months: GPT-5.

Reports suggest that the firm’s next-generation model will launch in August. OpenAI’s official line—well, Altman’s—is that it will release GPT-5 “soon.” Anticipation is high. The leaps OpenAI made with GPT-3 and then GPT-4 raised the bar of what was thought possible with this technology. And yet delays to the launch of GPT-5 have fueled rumors that OpenAI has struggled to build a model that meets its own—not to mention everyone else’s—expectations.

Altman has been uncharacteristically modest: “[GPT-5] is an experimental model that incorporates new research techniques we will use in future models,” he posted on X—which makes it sound more like a work in progress than another horizon-shifting release. 

But expectation management is part of the job for a company that for the last several years has set the agenda for the industry. And Chen and Pachocki set the agenda inside OpenAI.

Twin peaks 

The firm’s main London office is in St James’s Park, a few hundred meters east of Buckingham Palace. But I met Chen and Pachocki in a conference room in a coworking space near King’s Cross, which OpenAI keeps as a kind of pied-à-terre in the heart of London’s tech neighborhood (Google DeepMind and Meta are just around the corner). OpenAI’s head of research communications, Laurance Fauconnet, sat with an open laptop at the end of the table. 

Chen, who was wearing a maroon polo shirt, is clean-cut, almost preppy. He’s media-trained and comfortable talking to a reporter. (That’s him flirting with a chatbot in the “Introducing GPT-4o” video.) Pachocki, in a black elephant-logo tee, has more of a TV-movie hacker look. He stares at his hands a lot when he speaks.

But the pair are a tighter double act than they first appear. Pachocki summed up their roles. Chen shapes and manages the research teams, he said. “I am responsible for setting the research roadmap and establishing our long-term technical vision.”

“But there’s fluidity in the roles,” Chen said. “We’re both researchers, we pull on technical threads. Whatever we see that we can pull on and fix, that’s what we do.”

Chen joined the company in 2018 after working as a quantitative trader at the Wall Street firm Jane Street Capital, where he developed machine-learning models for futures trading. At OpenAI he spearheaded the creation of DALL-E, the firm’s breakthrough generative image model. He then worked on adding image recognition to GPT‑4 and led the development of Codex, the generative coding model that powers GitHub Copilot.

Pachocki left an academic career in theoretical computer science to join OpenAI in 2017 and replaced Sutskever as chief scientist in 2024. He is the key architect of OpenAI’s so-called reasoning models—especially o1 and o3—which are designed to tackle complex tasks in science, math, and coding. 

When we met they were buzzing, fresh off the high of two back-to-back wins for their company’s technology.

On July 16, one of OpenAI’s large language models came in second in the AtCoder World Tour Finals, one of the world’s most hardcore programming competitions. On July 19, OpenAI announced that one of its models had achieved gold-medal-level results on the 2025 International Math Olympiad, one of the world’s most prestigious math contests.

The math result made headlines, not only because of OpenAI’s remarkable achievement, but because rival Google DeepMind revealed two days later that one of its models had achieved the same score in the same competition. Google DeepMind had played by the competition’s rules and waited for its results to be checked by the organizers before making an announcement; OpenAI had in effect marked its own answers.

For Chen and Pachocki, the result speaks for itself. Anyway, it’s the programming win they’re most excited about. “I think that’s quite underrated,” Chen told me. A gold medal result in the International Math Olympiad puts you somewhere in the top 20 to 50 competitors, he said. But in the AtCoder contest OpenAI’s model placed in the top two: “To break into a really different tier of human performance—that’s unprecedented.”

Ship, ship, ship!

People at OpenAI still like to say they work at a research lab. But the company is very different from the one it was before the release of ChatGPT three years ago. The firm, now valued at $300 billion, is in a race with the biggest and richest technology companies in the world. Envelope-pushing research and eye-catching demos no longer cut it. It needs to ship products and get them into people’s hands—and boy, it does.

OpenAI has kept up a run of new releases—putting out major updates to its GPT-4 series, launching a string of generative image and video models, and introducing the ability to talk to ChatGPT with your voice. Six months ago it kicked off a new wave of so-called reasoning models with its o1 release, soon followed by o3. And last week it released its browser-using agent Operator to the public. It now claims that more than 400 million people use its products every week and submit 2.5 billion prompts a day. 

OpenAI’s incoming CEO of applications, Fidji Simo, plans to keep up the momentum. In a memo to the company, she told employees she is looking forward to “helping get OpenAI’s technologies into the hands of more people around the world,” where they will “unlock more opportunities for more people than any other technology in history.” Expect the products to keep coming.

I asked how OpenAI juggles open-ended research and product development. “This is something we have been thinking about for a very long time, long before ChatGPT,” Pachocki said. “If we are actually serious about trying to build artificial general intelligence, clearly there will be so much that you can do with this technology along the way, so many tangents you can go down that will be big products.” In other words, keep shaking the tree and harvest what you can.

A talking point that comes up with OpenAI folks is that putting experimental models out into the world has been a necessary part of research. The goal was to make people aware of how good this technology had become. “We want to educate people about what’s coming so that we can participate in what will be a very hard societal conversation,” Altman told me back in 2022. The makers of this strange new technology were also curious what it might be for: OpenAI was keen to get it into people’s hands to see what they would do with it.

Is that still the case? They answered at the same time. “Yeah!” Chen said. “To some extent,” Pachocki said. Chen laughed: “No, go ahead.” 

“I wouldn’t say research iterates on product,” said Pachocki. “But now that models are at the edge of the capabilities that can be measured by classical benchmarks and a lot of the long-standing challenges that we’ve been thinking about are starting to fall, we’re at the point where it really is about what the models can do in the real world.”

Like taking on humans in coding competitions. The person who beat OpenAI’s model at this year’s AtCoder contest, held in Japan, was a programmer named Przemysław Dębiak, also known as Psyho. The contest was a puzzle-solving marathon in which competitors had 10 hours to find the most efficient way to solve a complex coding problem. After his win, Psyho posted on X: “I’m completely exhausted … I’m barely alive.”  

Chen and Pachocki have strong ties to the world of competitive coding. Both have competed in international coding contests in the past and Chen coaches the USA Computing Olympiad team. I asked whether that personal enthusiasm for competitive coding colors their sense of how big a deal it is for a model to perform well at such a challenge.

They both laughed. “Definitely,” said Pachocki. “So: Psyho is kind of a legend. He’s been the number one competitor for many years. He’s also actually a friend of mine—we used to compete together in these contests.” Dębiak also used to work with Pachocki at OpenAI.

When Pachocki competed in coding contests he favored those that focused on shorter problems with concrete solutions. But Dębiak liked longer, open-ended problems without an obvious correct answer.

“He used to poke fun at me, saying that the kind of contest I was into will be automated long before the ones he liked,” Pachocki recalled. “So I was seriously invested in the performance of this model in this latest competition.”

Pachocki told me he was glued to the late-night livestream from Tokyo, watching his model come in second: “Psyho resists for now.” 

“We’ve tracked the performance of LLMs on coding contests for a while,” said Chen. “We’ve watched them become better than me, better than Jakub. It feels something like Lee Sedol playing Go.”

Lee is the master Go player who lost a series of matches to DeepMind’s game-playing model AlphaGo in 2016. The results stunned the international Go community and led Lee to give up professional play. Last year he told the New York Times: “Losing to AI, in a sense, meant my entire world was collapsing … I could no longer enjoy the game.” And yet, unlike Lee, Chen and Pachocki are thrilled to be surpassed.   

But why should the rest of us care about these niche wins? It’s clear that this technology—designed to mimic and, ultimately, stand in for human intelligence—is being built by people whose idea of peak intelligence is acing a math contest or holding your own against a legendary coder. Is it a problem that this view of intelligence is skewed toward the mathematical, analytical end of the scale?

“I mean, I think you are right that—you know, selfishly, we do want to create models which accelerate ourselves,” Chen told me. “We see that as a very fast factor to progress.”  

The argument researchers like Chen and Pachocki make is that math and coding are the bedrock for a far more general form of intelligence, one that can solve a wide range of problems in ways we might not have thought of ourselves. “We’re talking about programming and math here,” said Pachocki. “But it’s really about creativity, coming up with novel ideas, connecting ideas from different places.”

Look at the two recent competitions: “In both cases, there were problems which required very hard, out-of-the-box thinking. Psyho spent half the programming competition thinking and then came up with a solution that was really novel and quite different from anything that our model looked at.”

“This is really what we’re after,” Pachocki continued. “How do we get models to discover this sort of novel insight? To actually advance our knowledge? I think they are already capable of that in some limited ways. But I think this technology has the potential to really accelerate scientific progress.” 

I returned to the question about whether the focus on math and programming was a problem, conceding that maybe it’s fine if what we’re building are tools to help us do science. We don’t necessarily want large language models to replace politicians and have people skills, I suggested.

Chen pulled a face and looked up at the ceiling: “Why not?”

What’s missing

OpenAI was founded with a level of hubris that stood out even by Silicon Valley standards, boasting about its goal of building AGI back when talk of AGI still sounded kooky. OpenAI remains as gung-ho about AGI as ever, and it has done more than most to make AGI a mainstream multibillion-dollar concern. It’s not there yet, though. I asked Chen and Pachocki what they think is missing.

“I think the way to envision the future is to really, deeply study the technology that we see today,” Pachocki said. “From the beginning, OpenAI has looked at deep learning as this very mysterious and clearly very powerful technology with a lot of potential. We’ve been trying to understand its bottlenecks. What can it do? What can it not do?”  

At the current cutting edge, Chen said, are reasoning models, which break down problems into smaller, more manageable steps, but even they have limits: “You know, you have these models which know a lot of things but can’t chain that knowledge together. Why is that? Why can’t it do that in a way that humans can?”

OpenAI is throwing everything at answering that question.

“We are probably still, like, at the very beginning of this reasoning paradigm,” Pachocki told me. “Really, we are thinking about how to get these models to learn and explore over the long term and actually deliver very new ideas.”

Chen pushed the point home: “I really don’t consider reasoning done. We’ve definitely not solved it. You have to read so much text to get a kind of approximation of what humans know.”

OpenAI won’t say what data it uses to train its models or give details about their size and shape—only that it is working hard to make all stages of the development process more efficient.

Those efforts make them confident that so-called scaling laws—which suggest that models will continue to get better the more compute you throw at them—show no sign of breaking down.

“I don’t think there’s evidence that scaling laws are dead in any sense,” Chen insisted. “There have always been bottlenecks, right? Sometimes they’re to do with the way models are built. Sometimes they’re to do with data. But fundamentally it’s just about finding the research that breaks you through the current bottleneck.” 
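(For readers who want the shape of that claim: in the published scaling-law literature (for example, Kaplan et al. 2020 and the “Chinchilla” paper by Hoffmann et al. 2022), a model’s loss is typically fitted as a power law in parameter count and training data. A representative form, drawn from that public work rather than from anything OpenAI has confirmed about its own models, is

$$L(N, D) \approx E + \frac{A}{N^{\alpha}} + \frac{B}{D^{\beta}},$$

where $N$ is the number of parameters, $D$ the number of training tokens, $E$ an irreducible loss floor, and $A$, $B$, $\alpha$, $\beta$ fitted constants. Throwing more compute at a model helps only insofar as it pushes $N$ and $D$ further along this curve, which is why Chen frames progress as a matter of clearing whatever bottleneck currently limits that push.)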

The faith in progress is unshakeable. I brought up something Pachocki had said about AGI in an interview with Nature in May: “When I joined OpenAI in 2017, I was still among the biggest skeptics at the company.” He looked doubtful. 

“I’m not sure I was skeptical about the concept,” he said. “But I think I was—” He paused, looking at his hands on the table in front of him. “When I joined OpenAI, I expected the timelines to be longer to get to the point that we are now.”

“There’s a lot of consequences of AI,” he said. “But the one I think the most about is automated research. When we look at human history, a lot of it is about technological progress, about humans building new technologies. The point when computers can develop new technologies themselves seems like a very important, um, inflection point.

“We already see these models assist scientists. But when they are able to work on longer horizons—when they’re able to establish research programs for themselves—the world will feel meaningfully different.”

For Chen, that ability for models to work by themselves for longer is key. “I mean, I do think everyone has their own definitions of AGI,” he said. “But this concept of autonomous time—just the amount of time that the model can spend making productive progress on a difficult problem without hitting a dead end—that’s one of the big things that we’re after.”

It’s a bold vision—and far beyond the capabilities of today’s models. But I was nevertheless struck by how Chen and Pachocki made AGI sound almost mundane. Compare this with how Sutskever responded when I spoke to him 18 months ago. “It’s going to be monumental, earth-shattering,” he told me. “There will be a before and an after.” Faced with the immensity of what he was building, Sutskever switched the focus of his career from designing better and better models to figuring out how to control a technology that he believed would soon be smarter than himself.

Two years ago Sutskever set up what he called a superalignment team that he would co-lead with another OpenAI safety researcher, Jan Leike. The claim was that this team would funnel a full fifth of OpenAI’s resources into figuring out how to control a hypothetical superintelligence. Today, most of the people on the superalignment team, including Sutskever and Leike, have left the company and the team no longer exists.   

When Leike quit, he said it was because the team had not been given the support he felt it deserved. He posted this on X: “Building smarter-than-human machines is an inherently dangerous endeavor. OpenAI is shouldering an enormous responsibility on behalf of all of humanity. But over the past years, safety culture and processes have taken a backseat to shiny products.” Other departing researchers shared similar statements.

I asked Chen and Pachocki what they make of such concerns. “A lot of these things are highly personal decisions,” Chen said. “You know, a researcher can kind of, you know—”

He started again. “They might have a belief that the field is going to evolve in a certain way and that their research is going to pan out and is going to bear fruit. And, you know, maybe the company doesn’t reshape in the way that you want it to. It’s a very dynamic field.”

“A lot of these things are personal decisions,” he repeated. “Sometimes the field is just evolving in a way that is less consistent with the way that you’re doing research.”

But alignment, both of them insist, is now part of the core business rather than the concern of one specific team. According to Pachocki, these models don’t work at all unless they work as you expect them to. There’s also little desire to focus on aligning a hypothetical superintelligence with your objectives when doing so with existing models is already enough of a challenge.

“Two years ago the risks that we were imagining were mostly theoretical risks,” Pachocki said. “The world today looks very different, and I think a lot of alignment problems are now very practically motivated.”

Still, experimental technology is being spun into mass-market products faster than ever before. Does that really never lead to disagreements between the two of them?

“I am often afforded the luxury of really kind of thinking about the long term, where the technology is headed,” Pachocki said. “Contending with the reality of the process—both in terms of people and also, like, the broader company needs—falls on Mark. It’s not really a disagreement, but there is a natural tension between these different objectives and the different challenges that the company is facing that materializes between us.”

Chen jumped in: “I think it’s just a very delicate balance.”  

Shape
Shape
Stay Ahead

Explore More Insights

Stay ahead with more perspectives on cutting-edge power, infrastructure, energy,  bitcoin and AI solutions. Explore these articles to uncover strategies and insights shaping the future of industries.

Shape

Petronas Inks MoU with Microsoft to Advance Malaysia’s AI Ecosystem

Malaysia’s Petroliam Nasional Berhad (Petronas) has signed a memorandum of understanding (MoU) with Microsoft to develop an artificial intelligence (AI)-enabled economy in Malaysia and help advance energy transition efforts in Asia. “At Petronas, innovation goes beyond technology – it is about shaping a future where energy is smarter, cleaner, and sustainable for not only the organizations involved, but also the nation and its people”, Mohd Yusri, Senior Vice President of Projects, Technology and Health, Safety, Security and Environment (PT and HSSE) at Petronas, said.   “By harnessing our joint expertise in innovation and sustainability, we are steadfast in advancing adoption of AI and Cloud capabilities in a manner that promotes sustainable energy practices, in support of Malaysia’s aspirations of building an AI economy with a robust ecosystem in which everyone thrives”, he added. The collaboration aims to develop an ecosystem in Malaysia that empowers organizations to leverage AI for economic growth and social benefits. Focusing on nation-building, these companies will aid Malaysia in establishing regional leadership in AI and cultivating a robust local AI community through joint programs, Petronas said. As part of this MoU, Petronas and Microsoft plan to pursue additional initiatives to enhance AI and energy innovations, utilizing Microsoft’s new Malaysia West cloud region, Petronas said. They will focus on integrating Agentic AI, Microsoft Copilot, data analytics, cloud computing, and cybersecurity, among other technologies, to improve operational efficiency and sustainability throughout Malaysia’s value chain, it said. “As a trusted technology partner to Petronas, we are thrilled to strengthen our collaboration to help advance their digital and AI transformation. With Microsoft’s new cloud region in Malaysia, we are committed to supporting Petronas with secure, scalable, and sustainable cloud solutions that will drive growth and innovation in Malaysia’s energy sector”, Laurence Si, Microsoft Malaysia Managing Director, said. The parties will also

Read More »

Repsol, NEO Energy Complete UK Merger

NEO Energy Group Ltd. and Repsol Resources UK Ltd. have completed their combination, touting the resulting company as one of the biggest producers on the United Kingdom continental shelf (UKCS). The combined entity, NEO NEXT Energy Ltd., is 55 percent owned by European energy investor HitecVision and 45 percent by Repsol E&P Group. Repsol E&P is 75 percent owned by Spanish integrated energy company Repsol SA and 25 percent owned by the United States’ EIG Global Energy Partners. “This equity split reflects the contributions and strategic alignment of both parties in the creation of a market-leading entity in the UKCS, with a projected 2025 production of approximately 130,000 barrels of oil equivalent per day”, Repsol said in a statement online announcing completion. According to the announcement of the deal in March, Repsol Resources UK owns stakes in 48 producing and non-producing oil and gas fields, while NEO’s portfolio in UK waters includes Penguins, Culzean, Gannet, Shearwater, Britannia Area and Elgin Franklin. NEO NEXT will operate 11 production hubs and “substantial undeveloped reserves”, the companies said March. “This combination creates a jointly governed business which will call upon the key strengths of both shareholders”, Francisco Gea, executive managing director for exploration and production at Repsol, said in comments for the closure of the transaction. “Repsol contributes operational capabilities on production, development and decommissioning activities which will be combined with NEO Energy expertise on financial and commercial matters. “We believe this combined business has many more opportunities for profitable growth in the basin and beyond”. NEO NEXT chief executive John Knight said, “Our strategy can be summarized as ‘Resilience, Yield and Growth’: the combined company has much more scale and diversity and opportunities for cost consolidation and portfolio high-grading, giving resilience despite the tough conditions in the UK”. “The benefits of synergies

Read More »

India Tells Refiners to Draw Up Plans for Non-Russian Crude

India told its oil refiners to come up with plans for buying non-Russian crude, a scenario that would have far-reaching consequences for the global oil market if it ultimately meant reduced dealings with Moscow. The government has asked state-owned processors to prepare an outline of where alternate barrels can be sourced and at what volume if Russian flows get stopped, people familiar with the matter said, asking not to be named due to the sensitivity of the matter. One of the people said that the instruction amounted to scenario planning in case Russian crude were to become unavailable. The request came after a social-media post on Wednesday by US President Donald Trump threw the Asian nation’s fuel makers into disarray. Trump said in his post that India would face “penalties” because of ongoing purchases of Russian energy that helped to fund the war in Ukraine.  India is a critical source of demand for Russian oil and Moscow would have to divert millions of barrels a month to China and other buyers if New Delhi were to halt buying. They’ve helped Russian barrels to keep flowing to the world — largely undisturbed — despite wide-ranging western sanctions. So far the Indian government hasn’t set out its position and people with knowledge of the matter said it’s still evaluating the situation and will continue to do so for several more days.  India’s refiners have been racing to buy barrels from elsewhere and there are tentative signs that they might be scaling back Russian cargo purchases.  A senior executive at a major Indian oil refiner said the company would try to source more crude from the Middle East and Africa, while still looking to the government for guidance on how to proceed. The situation was not entirely unexpected, but would increase costs and

Read More »

Oil Dips on Inflation, Geopolitical Jitters

Oil fell as broader markets weakened on worse-than-expected US inflation data and crude traders cashed out after prices reached a six-week high. While West Texas Intermediate slipped 1.1% on Thursday to settle below $70 a barrel, snapping a three day rally, prices are largely still range-bound as traders await clearer signals on balances for supply and demand. “Investors are just being cautious not to overextend the rally until we have more clarity on: one OPEC, two Russia — through the weekend,” as well as the looming Aug. 1 US tariff deadline, said Frank Monkam, head of macro trading at Buffalo Bayou Commodities. US President Donald Trump said he would impose a tariff on India’s exports and a penalty for its energy purchases from Russia from Aug. 1, the latest in a series of comments in which he expressed his anger at the lack of ceasefire in Ukraine. While the market impact of disrupting Indian purchases could be significant, as Moscow would have to find new buyers if it loses one of its largest customers, the relatively muted price movements offer a sign that there’s little expectation Trump will follow through for now. It’s the latest sign of an oil market that increasingly only reacts when there is a meaningful disruption to supply. While Trump has repeatedly threatened steps that might hurt output in producer nations from Venezuela to Iran and Russia since taking office, there’s so far not been a substantial hit to global supply, even when the US bombed Iran’s nuclear facilities. India’s refiners are seeking clarity from the government in New Delhi. A senior executive at a major processor said his company would try and source more crude from the Middle East and Africa, while also looking to the government for guidance on how it should proceed. “Finding

Read More »

NYPA’s updated renewables plan would more than double capacity to 7 GW

NYPA’s updated renewables plan would more than double capacity to 7 GW | Utility Dive Skip to main content An article from Dive Brief The New York Power Authority’s draft plan includes new renewable and energy storage projects totaling more than 3.8 GW. Published July 31, 2025 An aerial view of solar panels at the Sutter Greenworks Solar Site on Sept. 19, 2021, in Calverton, New York. A draft renewable energy plan issued on July 29, 2025, by the New York Power Authority calls for adding 7 GW of solar, wind and energy storage in the state. Bruce Bennett via Getty Images Dive Brief: The New York Power Authority on Tuesday published a draft of its Renewables Updated Strategic Plan, calling for 7 GW of solar, wind and energy storage — more than doubling the total energy capacity outlined in its initial plan released in January. New York lawmakers in 2023 expanded NYPA’s authority to develop, own and operate renewable energy resources. Officials at the public power utility say they are using the new authority to continue the state’s clean energy transition at a time when federal policy is shifting away from renewables. “There has never been a more critical time for NYPA to move expeditiously as we contend with expiring federal tax credits and associated increased competition for equipment and installers,” President and CEO Justin Driscoll said in a statement. Dive Insight: Advocates say public pressure for more clean energy led to NYPA expanding its renewables plan, and the timing is particularly acute given headwinds to solar and wind coming from the White House. “Instead of cutting deals with Trump or gutting New York’s climate mandates the way he is federally, [New York Democratic Gov. Kathy] Hochul must ensure NYPA leads the nation on lowering energy bills, slashing pollution, creating good green

Read More »

USA and Pakistan Sign Trade Deal to Boost Oil Reserves, Market Ties

The US sealed a trade deal with Pakistan as their officials wrapped up talks in Washington, agreeing to develop oil reserves. The agreement involves a reduction of the so called reciprocal tariffs, especially on Pakistani exports, according to a statement by Pakistan’s finance ministry on Thursday. No details on tariffs were shared by either side. The agreement will spur US investments in Pakistan’s infrastructure, besides deepening market ties between the partners, the ministry said.  US President Donald Trump said in a post on Truth Social that the two countries will “work together on developing their massive oil reserves”, adding that officials are now selecting the company that will anchor the partnership. Relations between Islamabad and Washington have been showing signs of easing after prolonged tensions, with President Trump welcoming Pakistan’s army chief, Field Marshal Asim Munir, for rare talks at the White House in June.  Pakistan, which lists the US as one of its top export destinations, had offered to boost American imports, particularly cotton and soybean. The South Asian nation sold over $5 billion worth of goods to the US as of 2024, and imported about $2.1 billion. The US has also expressed interest in sunrise sectors such as crypto currencies. Pakistan plans to legalize and regulate digital assets as the field gains traction in key Asian markets following Trump’s pro-crypto agenda, Bloomberg News reported. WHAT DO YOU THINK? Generated by readers, the comments included herein do not reflect the views and opinions of Rigzone. All comments are subject to editorial review. Off-topic, inappropriate or insulting comments will be removed.

Read More »

Data center survey: AI gains ground but trust concerns persist

Cost issues: 76% Forecasting future data center capacity requirements: 71% Improving energy performance for facilities equipment: 67% Power availability: 63% Supply chain disruptions: 65% A lack of qualified staff: 67% With respect to capacity planning, there’s been a notable increase in the number of operators who describe themselves as “very concerned” about forecasting future data center capacity requirements. Andy Lawrence, Uptime’s executive director of research, said two factors are contributing to this concern: ongoing strong growth for IT demand, and the often-unpredictable demand that AI workloads are creating. “There’s great uncertainty about … what the impact of AI is going to be, where it’s going to be located, how much of the power is going to be required, and even for things like space and cooling, how much of the infrastructure is going to be sucked up to support AI, whether it’s in a colocation, whether it’s in an enterprise or even in a hyperscale facility,” Lawrence said during a webinar sharing the survey results. The survey found that roughly one-third of data center owners and operators currently perform some AI training or inference, with significantly more planning to do so in the future. As the number of AI-based software deployments increases, information about the capabilities and limitations of AI in the workplace is becoming available. The awareness is also revealing AI’s suitability for certain tasks. According to the report, “the data center industry is entering a period of careful adoption, testing, and validation. Data centers are slow and careful in adopting new technologies, and AI will not be an exception.”

Read More »

Micron unveils PCIe Gen6 SSD to power AI data center workloads

Competitive positioning With the launch of the 9650 SSD PCIe Gen 6, Micron competes with Samsung and SK Hynix enterprise SSD offerings, which are the dominant players in the SSD market. In December last year, SK Hynix announced the development of PS1012 U.2 Gen5 PCIe SSD, for massive high-capacity storage for AI data centers.  The PM1743 is Samsung’s PCIe Gen5 offering in the market, with 14,000 MBps sequential read, designed for high-performance enterprise workloads. According to Faruqui, PCIe Gen6 data center SSDs are best suited for AI inference performance enhancement. However, we’re still months away from large-scale adoption as no current CPU platforms are available with PCIe 6.0 support. Only Nvidia’s Blackwell-based GPUs have native PCIe 6.0 x16 support with interoperability tests in progress. He added that PCIe Gen 6 SSDs will see very delayed adoption in the PC segment and imminent 2025 2H adoption in AI, data centers, high-performance computing (HPC), and enterprise storage solutions. Micron has also introduced two additional SSDs alongside the 9650. The 6600 ION SSD delivers 122TB in an E3.S form factor and is targeted at hyperscale and enterprise data centers looking to consolidate server infrastructure and build large AI data lakes. A 245TB variant is on the roadmap. The 7600 PCIe Gen5 SSD, meanwhile, is aimed at mixed workloads that require lower latency.

Read More »

AI Deployments are Reshaping Intra-Data Center Fiber and Communications

Artificial Intelligence is fundamentally changing the way data centers are architected, with a particular focus on the demands placed on internal fiber and communications infrastructure. While much attention is paid to the fiber connections between data centers or to end-users, the real transformation is happening inside the data center itself, where AI workloads are driving unprecedented requirements for bandwidth, low latency, and scalable networking. Network Segmentation and Specialization Inside the modern AI data center, the once-uniform network is giving way to a carefully divided architecture that reflects the growing divergence between conventional cloud services and the voracious needs of AI. Where a single, all-purpose network once sufficed, operators now deploy two distinct fabrics, each engineered for its own unique mission. The front-end network remains the familiar backbone for external user interactions and traditional cloud applications. Here, Ethernet still reigns, with server-to-leaf links running at 25 to 50 gigabits per second and spine connections scaling to 100 Gbps. Traffic is primarily north-south, moving data between users and the servers that power web services, storage, and enterprise applications. This is the network most people still imagine when they think of a data center: robust, versatile, and built for the demands of the internet age. But behind this familiar façade, a new, far more specialized network has emerged, dedicated entirely to the demands of GPU-driven AI workloads. In this backend, the rules are rewritten. Port speeds soar to 400 or even 800 gigabits per second per GPU, and latency is measured in sub-microseconds. The traffic pattern shifts decisively east-west, as servers and GPUs communicate in parallel, exchanging vast datasets at blistering speeds to train and run sophisticated AI models. The design of this network is anything but conventional: fat-tree or hypercube topologies ensure that no single link becomes a bottleneck, allowing thousands of

Read More »

ABB and Applied Digital Build a Template for AI-Ready Data Centers

Toward the Future of AI Factories The ABB–Applied Digital partnership signals a shift in the fundamentals of data center development, where electrification strategy, hyperscale design and readiness, and long-term financial structuring are no longer separate tracks but part of a unified build philosophy. As Applied Digital pushes toward REIT status, the Ellendale campus becomes not just a development milestone but a cornerstone asset: a long-term, revenue-generating, AI-optimized property underpinned by industrial-grade power architecture. The 250 MW CoreWeave lease, with the option to expand to 400 MW, establishes a robust revenue base and validates the site’s design as AI-first, not cloud-retrofitted. At the same time, ABB is positioning itself as a leader in AI data center power architecture, setting a new benchmark for scalable, high-density infrastructure. Its HiPerGuard Medium Voltage UPS, backed by deep global manufacturing and engineering capabilities, reimagines power delivery for the AI era, bypassing the limitations of legacy low-voltage systems. More than a component provider, ABB is now architecting full-stack electrification strategies at the campus level, aiming to make this medium-voltage model the global standard for AI factories. What’s unfolding in North Dakota is a preview of what’s coming elsewhere: AI-ready campuses that marry investment-grade real estate with next-generation power infrastructure, built for a future measured in megawatts per rack, not just racks per row. As AI continues to reshape what data centers are and how they’re built, Ellendale may prove to be one of the key locations where the new standard was set.

Read More »

Amazon’s Project Rainier Sets New Standard for AI Supercomputing at Scale

Supersized Infrastructure for the AI Era

As AWS deploys Project Rainier, it is scaling AI compute to unprecedented heights while laying down a decisive marker in the escalating arms race for hyperscale dominance. With custom Trainium2 silicon, proprietary interconnects, and vertically integrated data center architecture, Amazon joins Microsoft, with its Project Stargate, and Google, with its TPUv5 clusters, in a trio of tech giants rapidly redefining the future of AI infrastructure.

But Rainier represents more than just another high-performance cluster. It arrives at a moment when the size, speed, and ambition of AI infrastructure projects have entered uncharted territory. Consider the past several weeks alone: On June 24, AWS detailed Project Rainier, calling it "a massive, one-of-its-kind machine" and noting that "the sheer size of the project is unlike anything AWS has ever attempted." The New York Times reports that the primary Rainier campus in Indiana could include up to 30 data center buildings. Just two days later, Fermi America unveiled plans for the HyperGrid AI campus in Amarillo, Texas, on a sprawling 5,769-acre site with potential for 11 gigawatts of power and 18 million square feet of AI data center capacity. And on July 1, Oracle projected $30 billion in annual revenue from a single OpenAI cloud deal, tied to the Project Stargate campus in Abilene, Texas.

As Data Center Frontier founder Rich Miller has observed, the dial on data center development has officially been turned to 11. Once an aspirational concept, the gigawatt-scale campus is now materializing, 15 months after Miller forecast its arrival. "It's hard to imagine data center projects getting any bigger," he notes. "But there's probably someone out there wondering if they can adjust the dial so it goes to 12." Against this backdrop, Project Rainier represents not just financial investment but architectural intent. Like Microsoft's Stargate buildout in

Read More »

Google and CTC Global Partner to Fast-Track U.S. Power Grid Upgrades

On June 17, 2025, Google and CTC Global announced a joint initiative to accelerate the deployment of high-capacity power transmission lines using CTC's U.S.-manufactured ACCC® advanced conductors. The collaboration seeks to relieve grid congestion by rapidly upgrading existing infrastructure, enabling greater integration of clean energy, improving system resilience, and unlocking capacity for hyperscale data centers. The effort represents a rare convergence of corporate climate commitments, utility innovation, and infrastructure modernization aligned with the public interest.

As part of the initiative, Google and CTC issued a Request for Information (RFI) with responses due by July 14. The RFI invites utilities, state energy authorities, and developers to nominate transmission line segments for potential fast-tracked upgrades. Selected projects will receive support in the form of technical assessments, financial assistance, and workforce development resources.

While advanced conductor technologies like ACCC® can significantly improve the efficiency and capacity of existing transmission corridors, technological innovation alone cannot resolve the grid's structural challenges. Building new or upgraded transmission lines in the U.S. often requires complex permitting from multiple federal, state, and local agencies, and frequently faces legal opposition, especially from communities invoking Not-In-My-Backyard (NIMBY) objections. Today, the average timeline to construct new interstate transmission infrastructure stretches between 10 and 12 years, an untenable lag in an era when grid reliability is under increasing stress. In 2024, the Federal Energy Regulatory Commission (FERC) reported that more than 2,600 gigawatts (GW) of clean energy and storage projects were stalled in the interconnection queue, waiting for sufficient transmission capacity. The consequences affect not only industrial sectors like data centers but also residential areas vulnerable to brownouts and peak load disruptions.

What is the New Technology?

At the center of the initiative is CTC Global's ACCC® (Aluminum Conductor Composite Core) advanced conductor, a next-generation overhead transmission technology engineered to boost grid
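To see why reconductoring an existing corridor is attractive, the sketch below compares the thermal capacity of a line before and after swapping in an advanced conductor. The 230 kV voltage and the ampacity figures are illustrative assumptions for a generic line, not CTC's published ratings; the point is only that a higher-ampacity conductor on existing towers raises deliverable MVA without new rights-of-way.

```python
import math

def line_capacity_mva(line_voltage_kv: float, ampacity_a: float) -> float:
    """Apparent-power capacity of a three-phase line: S = sqrt(3) * V_LL * I."""
    return math.sqrt(3) * line_voltage_kv * ampacity_a / 1000.0

if __name__ == "__main__":
    voltage_kv = 230.0
    legacy_ampacity = 900.0    # illustrative rating for a conventional steel-core conductor
    advanced_ampacity = 1800.0 # illustrative rating for a composite-core replacement
    before = line_capacity_mva(voltage_kv, legacy_ampacity)
    after = line_capacity_mva(voltage_kv, advanced_ampacity)
    print(f"Existing conductor:      {before:,.0f} MVA")
    print(f"Reconductored corridor:  {after:,.0f} MVA")
    print(f"Capacity gain on the same towers and right-of-way: {after / before:.1f}x")
```

Because the upgrade reuses existing structures and corridors, it can sidestep much of the decade-long permitting path described above, which is what makes fast-tracked reconductoring a meaningful lever for congested regions.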

Read More »

Microsoft will invest $80B in AI data centers in fiscal 2025

And Microsoft isn't the only one ramping up its investments in AI-enabled data centers. Rival cloud service providers are all investing in upgrading existing data centers or opening new ones to capture a larger chunk of business from developers and users of large language models (LLMs). In a report published in October 2024, Bloomberg Intelligence estimated that demand for generative AI would push Microsoft, AWS, Google, Oracle, Meta, and Apple to devote a combined $200 billion to capex in 2025, up from $110 billion in 2023. Microsoft is one of the biggest spenders, followed closely by Google and AWS, Bloomberg Intelligence said. Its estimate of Microsoft's capital spending on AI, at $62.4 billion for calendar 2025, is lower than Microsoft president Brad Smith's claim that the company will invest $80 billion in the fiscal year to June 30, 2025. Both figures, though, are far higher than Microsoft's 2020 capital expenditure of "just" $17.6 billion. The majority of the increased spending is tied to cloud services and the expansion of AI infrastructure needed to provide compute capacity for OpenAI workloads. Separately, last October Amazon CEO Andy Jassy said his company planned total capex of $75 billion in 2024 and even more in 2025, with much of it going to AWS, its cloud computing division.

Read More »

John Deere unveils more autonomous farm machines to address skilled labor shortage

Self-driving tractors might be the path to self-driving cars. John Deere has revealed a new line of autonomous machines and tech across agriculture, construction and commercial landscaping. Moline, Illinois-based John Deere has been in business for 187 years, yet the non-tech company has become a regular at the big tech trade show in Las Vegas and is back at CES 2025 with more autonomous tractors and other vehicles. This is not something we usually cover, but John Deere has a lot of data that is interesting in the big picture of tech.

The message from the company is that there aren't enough skilled farm laborers to do the work that its customers need. It's been a challenge for most of the last two decades, said Jahmy Hindman, CTO at John Deere, in a briefing. Much of the tech will come this fall and after that. He noted that the average farmer in the U.S. is over 58 and works 12 to 18 hours a day to grow food for us. And he said the American Farm Bureau Federation estimates there are roughly 2.4 million farm jobs that need to be filled annually, and the agricultural workforce continues to shrink. (This is my hint to the anti-immigration crowd.)

John Deere's autonomous 9RX Tractor. Farmers can oversee it using an app.

While each of these industries experiences its own set of challenges, a commonality across all is skilled labor availability. In construction, about 80% of contractors struggle to find skilled labor. And in commercial landscaping, 86% of landscaping business owners can't find labor to fill open positions, he said. "They have to figure out how to do

Read More »

2025 playbook for enterprise AI success, from agents to evals

2025 is poised to be a pivotal year for enterprise AI. The past year has seen rapid innovation, and this year will see the same. This has made it more critical than ever to revisit your AI strategy to stay competitive and create value for your customers. From scaling AI agents to optimizing costs, here are the five critical areas enterprises should prioritize for their AI strategy this year.

1. Agents: the next generation of automation

AI agents are no longer theoretical. In 2025, they're indispensable tools for enterprises looking to streamline operations and enhance customer interactions. Unlike traditional software, agents powered by large language models (LLMs) can make nuanced decisions, navigate complex multi-step tasks, and integrate seamlessly with tools and APIs. At the start of 2024, agents were not ready for prime time, making frustrating mistakes like hallucinating URLs. They started getting better as frontier large language models themselves improved. "Let me put it this way," said Sam Witteveen, cofounder of Red Dragon, a company that develops agents for other companies and recently reviewed the 48 agents it built last year. "Interestingly, the ones that we built at the start of the year, a lot of those worked way better at the end of the year just because the models got better." Witteveen shared this in the video podcast we filmed to discuss these five big trends in detail. Models are getting better and hallucinating less, and they're also being trained to do agentic tasks. Another feature that the model providers are researching is a way to use the LLM as a judge, and as models get cheaper (something we'll cover below), companies can use three or more models to
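The LLM-as-a-judge pattern mentioned above is straightforward to prototype. The sketch below is a minimal illustration, not a production design: call_model is a hypothetical stand-in for whatever provider client you use, and the model names in the usage comment are placeholders.

```python
from collections import Counter

def call_model(model: str, prompt: str) -> str:
    """Placeholder for a real provider API call (e.g. a chat-completion request).
    Hypothetical; wire this up to your own client before use."""
    raise NotImplementedError

def judge_with_ensemble(task_prompt: str, candidate_answer: str, judge_models: list[str]) -> str:
    """Ask several cheaper models to grade one candidate answer and
    return the majority verdict ('pass' or 'fail')."""
    rubric = (
        "You are grading an answer. Reply with exactly 'pass' or 'fail'.\n"
        f"Task: {task_prompt}\nAnswer: {candidate_answer}"
    )
    votes = [call_model(model, rubric).strip().lower() for model in judge_models]
    return Counter(votes).most_common(1)[0][0]

# Example usage (assumes call_model is connected to a real API; model names are placeholders):
# verdict = judge_with_ensemble("Summarize the Q3 report", draft_answer, ["model-a", "model-b", "model-c"])
```

The economics matter here: majority voting across three inexpensive judges only becomes practical as per-token prices fall, which is exactly the cost trend the article points to.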

Read More »

OpenAI’s red teaming innovations define new essentials for security leaders in the AI era

OpenAI has taken a more aggressive approach to red teaming than its AI competitors, demonstrating its security teams' advanced capabilities in two areas: multi-step reinforcement learning and external red teaming. OpenAI recently released two papers that set a new competitive standard for improving the quality, reliability and safety of AI models in these two techniques and more.

The first paper, "OpenAI's Approach to External Red Teaming for AI Models and Systems," reports that specialized teams outside the company have proven effective at uncovering vulnerabilities that in-house testing techniques may have missed and that might otherwise have made it into a released model. In the second paper, "Diverse and Effective Red Teaming with Auto-Generated Rewards and Multi-Step Reinforcement Learning," OpenAI introduces an automated framework that relies on iterative reinforcement learning to generate a broad spectrum of novel, wide-ranging attacks.

Going all-in on red teaming pays practical, competitive dividends

It's encouraging to see competitive intensity in red teaming growing among AI companies. When Anthropic released its AI red team guidelines in June of last year, it joined AI providers including Google, Microsoft, Nvidia, OpenAI, and even the U.S. National Institute of Standards and Technology (NIST), all of which had already released red teaming frameworks. Investing heavily in red teaming yields tangible benefits for security leaders in any organization. OpenAI's paper on external red teaming provides a detailed analysis of how the company strives to create specialized external teams that include cybersecurity and subject matter experts. The goal is to see if knowledgeable external teams can defeat models' security perimeters and find gaps in their security, biases and controls that prompt-based testing couldn't find.

What makes OpenAI's recent papers noteworthy is how well they define using human-in-the-middle
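To make the automated approach concrete, here is a heavily simplified sketch of an iterative attack-generation loop: propose variants, score them with a reward that mixes attack success and novelty, and keep the best seeds for the next round. It illustrates the general idea only, not OpenAI's framework; the generator, the safety classifier, and the reward shaping are all toy placeholders.

```python
import random
from dataclasses import dataclass

@dataclass
class Attack:
    prompt: str
    reward: float = 0.0

def response_is_unsafe(response: str) -> bool:
    """Placeholder safety grader: flags a policy violation in the target's output.
    In a real setup this would be a trained classifier, not a keyword check."""
    return "UNSAFE" in response

def generate_variants(seed_prompt: str, n: int) -> list[str]:
    """Placeholder attacker. In the paper's setting this role is played by an
    RL-trained generator; here we just produce trivially perturbed strings."""
    return [f"{seed_prompt} [variant {random.randint(0, 9999)}]" for _ in range(n)]

def red_team_round(seed_attacks: list[Attack], query_target) -> list[Attack]:
    """One iteration: expand each seed, score the variants, keep the top performers.
    Reward = attack success plus a small bonus for prompts not seen before (a crude diversity proxy)."""
    seen_prompts = {attack.prompt for attack in seed_attacks}
    scored: list[Attack] = []
    for seed in seed_attacks:
        for prompt in generate_variants(seed.prompt, n=4):
            response = query_target(prompt)
            reward = 1.0 if response_is_unsafe(response) else 0.0
            reward += 0.1 if prompt not in seen_prompts else 0.0
            scored.append(Attack(prompt, reward))
    return sorted(scored, key=lambda attack: attack.reward, reverse=True)[: len(seed_attacks)]
```

Even this toy loop captures why the approach scales: the attacker improves against whatever grader it is given, so the quality of the reward model, not human imagination, becomes the limiting factor.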

Read More »