Stay Ahead, Stay ONMINE

Inworld AI showcases AI case studies as they move to production

The current AI ecosystem wasn’t built with game developers in mind. While impressive in controlled demos, today’s AI technologies expose critical limitations when transitioning to production-ready games, said Kylan Gibbs, CEO of Inworld AI, in an interview with GamesBeat. Right now, AI deployment is being slowed because game developers are dependent on black-box APIs with unpredictable pricing and shifting terms, leading to a loss of autonomy and stalled innovation, he said. Players are left with disposable “AI-flavored” demos instead of sustained, evolving experiences. At the Game Developers Conference 2025, Inworld isn’t going to showcase technology for technology’s sake. Gibbs said the company is demonstrating how developers have overcome these structural barriers to ship AI-powered games that millions of players are enjoying right now. Their experiences highlight why so many AI projects fail before launch and more importantly, how to overcome these challenges. “We’ve seen a transition over the last few years at GDC. Overall, it’s a transition from demos and prototypes to production,” Gibbs said. “When we started out, it was really a proof of concept. ‘How does this work?’ The use case is pretty narrow. It was really just characters and non-player characters (NPCs), and it was a lot of focus on demos.” Now, Gibbs said, the company is focused on production with partners and large scale deployments and actually solving problems. Getting AI to work in production Inworld AI is working with partners like Nvidia and Streamlabs on AI. Earlier large language models (LLMs) were too costly to put in games. That’s because it could cost a lot of money to send a user’s query to AI out across the web to a datacenter, using valuable graphics processing unit (GPU) time. It sent the answer back, often so slowly that the user noticed the delay. One of the things that has helped with AI costs now is that the AI processing has been restructured, with tasks moving from the server to the client-side logic. However, that can only really happen if the user has a good machine with a good AI processor/GPU. Inference tasks can be done on the local machines, while harder machine learning problems may have to be done in the cloud, Gibbs said. “Where I think we’re at today is we actually have proof that the stuff works at huge scale in production, and we have the right tools to be able to do that. And that’s been a great and exciting transition at the same time, because we’ve now been focusing on that we’ve been able to actually uncover regarding the root challenges in the AI ecosystem,” Gibbs said. “When you’re in the prototyping demo mindset, a lot of things work really well, right? A lot of these tools like OpenAI, Anthropic are great for demos but they do not work when you go into massive, multi-million users at scale.” Gibbs said Inworld AI is focusing on solving the bigger problems at GDC. Inworld AI is sharing the real challenges it has encountered and showing what can work in production. “There are some very real challenges to making that work, and we can’t solve it all on our own. We need to solve it as an ecosystem,” Gibbs said. “We need to accept and stop promoting AI as this panacea, a plug and play solution. We have solved the problems with a few partners.” Gibbs is looking forward to the proliferation of AI PCs. “If you bring all the processing onto onto the local machine, then a lot of that AI becomes much more affordable,” Gibbs said. The company is providing all the backend models and efforts to contain costs. I noted that Mighty Bear Games, headed by Simon Davis, is creating games with AI agents, where the agents play the game and humans help craft the perfect agents. “Companions are super cool. You’ll see multi-agent simulation experiences, like doing dynamic crowds. If you’re if you are focused on a character based experience, you can have primary characters or background characters,” Gibbs said. “And actually getting background characters to work efficiently is really hard because when people look at things like the Stanford paper, it’s about simulating 1,000 agents at once. We all know that games are not built like that. How do you give a sense of millions of characters at scale, while also doing a level-of-detail system, so you’re maximizing the depth of each agent as you get closer to it.” AI skeptics? AI livestreams I asked Gibbs what he thought about the stat in the GDC 2025 survey, which showed that more game developers are skeptical about AI in this year’s survey compared to a year ago. The numbers showed 30% had a negative sentiment on AI, compared to 18% the year before. That’s going in the wrong direction. “I think that we’ve got to this point where everybody realizes that the future of their careers will have AI in it. And we are at a point before where everybody was happy just to follow along with OpenAI’s announcements and whatever their friends were doing on LinkedIn,” Gibbs said. People were likely turned off after they took tools like image generators with text prompts and these didn’t work so well in prodction. Now, as they move into production, they’re finding that it doesn’t work at scale. And so it takes better tools geared to specific users for developers, Gibbs said. “We should be skeptical, because there are real challenges that no one is solving. And unless we voice that skepticism and start really pressuring the ecosystem, it’s not going to change,” Gibbs said. The problems include cloud lock-in and unpredictable costs; performance and reliability issues; and a non-evolving AI. Another problem is controlling AI agents effectively so they don’t go off the rails. When players are playing in a game like Fortnite, getting a response in milliseconds is critical, Gibbs said. AI in games can be a compelling experience, but making it work with cost efficiency at scale requires solving a lot of problems, Gibbs said. As for the changes AI is bringing, Gibbs said, “There’s going to be a fundamental architecture change in how we build user-facing AI apps.” Gibbs said, “What happens is studios are building with tools and then they get a few months from production and they’re like, ‘Holy crap! This doesn’t work. We need to completely change our architecture.’” That’s what Inworld AI is working on and it will be announced in the future. Gibbs predicts that many AI tools will be quickly outdated within a matter of months. That’s going to make planning difficult. He also predicts that the capacity of third-party cloud providers will break under the strain. “Will that code actually work when you have four million users funneling through it?,” Gibbs said. “What we’re seeing is a lot of people having to go back and rework their entire code base from Python to C++ as they get closer to production.” Summary of partner demos Streamlabs’ architecture for bringing AI into workflow. At GDC, Inworld will be showcasing several key partner demos that highlight how studios of all sizes are successfully implementing AI. These include: Streamlabs: Intelligent Streaming Agent provides real-time commentary and production assistance. Wishroll: Showing off Status, a social media simulation game with unique AI-driven personalities. Little Umbrella: The Last Show, a web-based party game with witty AI hosting. Nanobit: Winked, a mobile chat game with persistent, evolving relationship building. Virtuos: Giving developers full control over AI character behaviors for a more immersive storytelling experience. Additionally, Inworld will feature two Inworld-developed technology showcases: On-device Demo: A cooperative game running seamlessly on-device across multiple hardware platforms. Realistic Multi-agent Simulation: Multi-agent simulation demonstrating realistic social behaviors and interactions. The critical barriers blocking AI games from production and real dev solutions  Kylan Gibbs is cofounder of Inworld AI and a speaker at our recent GamesBeat Next event. Below are seven of the key challenges that consistently prevent AI-powered games from making the leap from promising prototype to shipped product. Here’s how studios of all sizes used Inworld to break through these barriers and deliver experiences enjoyed by millions.  The real-time wall: Streamlabs Intelligent Agent The developer problem: Non-production ready cloud AI introduces response delays that break player immersion. Unoptimized cloud dependencies result in AI response times of 800 milliseconds to 1,200 milliseconds, making even the simplest interactions feel sluggish. All intelligence remains server-side, creating single points of failure and preventing true ownership, yet most developers can find few alternatives beyond this cloud-API-only AI workflow that locks them into perpetual dependency architectures. The Inworld solution: The Logitech G’s Streamlabs Intelligent Streaming Agent is an AI-driven co-host, producer, and technical sidekick that observes game events in real time, providing commentary during key moments, assisting with scene transitions, and driving audience engagement—letting creators focus on content without getting bogged down in production tasks. “We tried building this with standard cloud APIs, but the 1-2 second delay made the assistant feel disconnected from the action,” said the Streamlabs team. “Working with Inworld, we achieved 200 millisecond response times that make the assistant feel present in the moment.” Behind the scenes, the Inworld Framework orchestrates the assistant’s multimodal input processing, contextual reasoning, and adaptive output. By integrating seamlessly with third-party models and the Streamlabs API, Inworld makes it easy to interpret gameplay, chat, and voice commands, then deliver real-time actions—like switching scenes or clipping highlights. This approach saves developers from writing custom pipelines for every new AI model or event trigger. This isn’t just faster—it’s the difference between an assistant that feels alive versus one that always seems a step behind the action. The success tax: The Last Show The Last Show The developer problem: Success should be a cause for celebration, not a financial crisis. Yet, for AI-powered games, linear or even increasing unit costs mean expenses can quickly spiral out of control as user numbers grow. Instead of scaling smoothly, developers are forced to make emergency architecture changes, when they should be doubling down on success. The Inworld solution: Little Umbrella, the studio behind Death by AI, was no exception. While the game was an instant hit–reaching 20 million players in just two months – the success nearly bankrupted the studio. “Our cloud API costs went from $5K to $250K in two weeks,” shares their technical director. “We had to throttle user acquisition—literally turning away players—until we partnered with Inworld to restructure our AI architecture.” For their next game, they decided to flip the script, building with cost predictability and scalability in mind from day one. Introducing The Last Show, a web-based party game where an AI host generates hilarious questions based on topics chosen or customized by players. Players submit answers, vote for their favorites, and the least popular response leads to elimination – all while the AI host delivers witty roasts. The Last Show marks their comeback, engineered from the ground up to maintain both quality and cost predictability at scale. The result? A business model that thrives from success rather than being threatened by it. The quality-cost paradox: Status How can you be popular? Status knows. The developer problem: Better AI quality often correlates with higher costs, forcing developers into an impossible decision: deliver a subpar player experience or face unsustainable costs. AI should enhance gameplay, not become an economic roadblock. The Inworld solution: Wishroll’s Status (ranking as high as No. 4 in the App Store Lifestyle category) immerses players in a fictional world where they can roleplay as anyone they imagine—whether a world-famous pop star, a fictional character, or even a personified ChatGPT. Their goal is to amass followers, develop relationships with other celebrities, and complete unique milestones. The concept struck a chord with gamers and by the time the limited access beta launched in October 2024, Status had taken off. TikTok buzz drove over 100,000 downloads with many gamers getting turned away, while the game’s Discord community ballooned from a modest 100 users to 60,000 within a few days. Only two weeks after their public beta launch in February 2025, Status surpassed a million users.  “We were spending $12 to $15 per daily active user with top-tier models,” said CEO Fai Nur, in a statement. “That’s completely unsustainable. But when we tried cheaper alternatives, our users immediately noticed the quality drop and engagement plummeted.” Working with Inworld’s ML Optimization services, Wishroll was able to cut AI costs by 90% while improving quality metrics. “We saw how Inworld solved similar problems for other AI games and thought, ‘This is exactly what we need,’” explained Fai. “We could tell Inworld had a lot of experience and knowledge on exactly what our problem was – which was optimizing models and reducing costs.” “If we had launched with our original architecture, we’d be broke in days,” Fai explained. “Even raising tens of millions wouldn’t have sustained us beyond a month. Now we have a path to profitability.” The agent control problem: Partnership with Virtuos The developer problem: Even with sustainable performance benchmarks met, complex narrative games still require sophisticated control over AI agents’ behaviors, memories, and personalities to deliver deeply immersive and engaging experiences to gamers. Traditional approaches either lead to unpredictable interactions or require prohibitively complex scripting, making it nearly impossible to create believable characters with consistent personalities. The Inworld solution: Inworld is partnering with Virtuos, a global game development powerhouse known for co-developing some of the biggest triple-A titles in the industry like Marvel’s Midnight Suns and Metal Gear Solid Delta: Snake Eater. With deep expertise in world-building and character development, Virtuos immediately saw the need for providing developers with precise control over the personalities, behaviors, and memories of AI-driven NPCs. This ensures storytelling consistency and players’ choices to dynamically influence the narrative’s direction and outcome. Inworld’s suite of generative AI tools provides the cognitive core that brings these characters to life while equipping developers with full customization capabilities. Teams can fine-tune AI-driven characters to stay true to their narrative arcs, ensuring they evolve logically and consistently within the game world. With Inworld’s tools, Virtuos can focus on what they do best–creating rich, immersive experiences. “At Virtuos, we see AI as a way to enhance the artistry of game developers and accurately bring their visions to life,” said Piotr Chrzanowski, CTO at Virtuos, in a statement. “By integrating AI, we enable developers to add new dimensions to their creations, enriching the gaming experience without compromising quality. Our partnership with Inworld opens the door to gameplay experiences that weren’t possible before.” A prototype showcasing the best of both teams is in the works, and interested media are invited to stop by the Virtuos booth at C1515 for a private demo. The immersive dialogue challenge: Winked The developer problem: Nanobit’s Winked is a mobile interactive narrative experience where players build relationships through dynamic, evolving conversations, including direct messages with core characters. To meet player expectations, the player-facing AI-driven dialogue had to exceed what was possible even with frontier models — offering more personal, emotionally nuanced, and stylistically unique interactions. Yet, achieving the level of quality was beyond the capabilities of off-the-shelf models, and the high costs of premium AI solutions made scalability a challenge.  The Inworld solution: Using Inworld Cloud, Nanobit trained and distilled a custom AI model tailored specifically for Winked. This model delivered superior dialogue quality–more organic, personal, and contextually aware than off-the-shelf solutions—while keeping costs a fraction of traditional cloud APIs. The AI integrated seamlessly into Winked’s core game loops, enhancing user engagement while maintaining financial viability. Beyond improving player immersion, this AI-driven dialogue system remembers past conversations and carries the storyline forward, providing the player with relationships that evolve as chats progress. This in turn encourages players to engage in longer conversations and return more frequently as they grow closer to characters. The multi-agent orchestration challenge: Realistic multi-agent simulation The developer problem: Creating living, believable worlds requires coordinating multiple AI agents to interact naturally with each other and the player. Developers struggle to create social dynamics that feel organic rather than mechanical, especially at scale. The Inworld solution: Our Realistic Multi-agent Simulation demonstrates how to effectively orchestrate multiple AI agents into cohesive, living worlds using Inworld. By implementing sophisticated agent coordination systems, contextual awareness, and shared environmental knowledge, this simulation creates believable social dynamics that emerge naturally rather than through scripted behaviors. Whether forming spontaneous crowds around exciting in-game events, reacting to shared group emotes, or engaging in multi-character conversations, these autonomous agents showcase how proper agent orchestration enables emergent, lifelike behaviors at scale. This technical demonstration underscores the potential for deep player immersion and sustained engagement by bringing social hubs to life—where multiple characters interact with consistent personalities, mutual awareness, and collective response patterns that create the feeling of a truly living world. The hardware fragmentation challenge: On-device Demo The developer problem: AI features optimized for high-end devices fail on mainstream hardware, forcing developers to either limit their audience or compromise their vision. AI vendors also obscure critical capabilities required for on-device inference (distilled models, deep fine-tuning and distillation, runtime model adaptation) to maintain control and protect recurring revenue. The Inworld solution: While on-device is the key to a more scalable future of AI and games, AI hardware in gaming doesn’t have a one-size-fits-all solution. Ensuring consistent performance and accessibility for users on various devices can easily drive up complexity and cost. To achieve scalability, AI solutions must adapt seamlessly across diverse hardware configurations. Our on-device demo showcases an AI-powered cooperative gameplay running seamlessly across three hardware configurations: Nvidia GeForce RTX 5090 AMD Radeon RX 7900 XTX Tenstorrent Quietbox This demo isn’t about theoretical compatibility; it’s about achieving consistent performance across diverse hardware, allowing developers to target the full spectrum of gaming devices without sacrificing quality. The development difference: Going beyond prototypes The gap between prototype and production is where most AI game projects collapse. While out-of-the-box plugins are useful for prototyping, they break under real-world conditions: Latency collapse: Cloud-dependent tools see response times balloon under load, breaking immersion and even gameplay Cost explosion: Per-token pricing creates financial cliff edges that make scaling unpredictable Reliability bottlenecks: Each external API call introduces a new potential point of failure Quality consistency: AI performance varies dramatically between test and production environments “We’ve watched incredible AI game prototypes die in the transition to production for four years now,” says Evgenii Shingarev, VP of Engineering at Inworld, in a statement. “The pattern is always the same: impressive demo, enthusiastic investment, then the slow realization that the economics and technical architecture don’t support real-world deployment.” At Inworld, we’ve worked relentlessly to close this prototype-to-production gap, developing solutions that address the real-world challenges of shipping and scaling AI-powered games—not just showcasing impressive demos. At GDC, Inworld is excited to share experiences that don’t just make it to launch, but thrive at scale, said Gibbs. The company’s booth is at C1615. Instead of talking about the future of gaming with AI, we’ll show the real systems solving real problems, developed by teams who have faced the same challenges you’re encountering, Gibbs said. The path from AI prototype to production is challenging, but with the right approach and partners who understand what it takes to ship AI experiences that players love, it’s absolutely achievable, Gibbs said. Session with Jim Keller of Tenstorrent: Breaking down AI’s unsustainable economics: Jim Keller, now head of Tenstorrent, is a legendary hardware engineer who headed important processor projects at companies such as Apple, AMD and Intel. He will be on a GDC panel with Inworld CEO Kylan Gibbs for a candid examination of AI’s broken economic model in gaming and the practical path forward: “Current AI infrastructure is economically unsustainable for games at scale,” said Keller, in a statement. “We’re seeing studios adopt impressive AI features in development, only to strip them back before launch once they calculate the true cloud costs at scale.” Gibbs said he is looking forward to talking with Keller on stage about Tenstorrent, which aims to serve AI applications at scale for less than 100 times the cost. The session will explore concrete solutions to these economic barriers: Dramatically cheaper model and hardware options Local inference strategies that eliminate API dependency Practical hybridization approaches that optimize for cost, performance, and quality Active learning systems that improve ROI over time Drawing on Keller’s deep hardware expertise from Tenstorrent, AMD, Apple, Intel, and Tesla and Inworld’s expertise in real-time, user-facing AI, we’ll explore how to blend on-device compute with large-scale cloud resources under one architectural umbrella. Attendees will gain candid insights into what actually matters when bringing AI from theory into practice, and how to build a sustainable AI pipeline that keeps costs low without sacrificing creativity or performance. Session details: Thursday, March 20, 9:30 a.m. – 10:30 a.m. West Hall, Room #2000 For more details, visit the GDC page Session with Microsoft: AI innovation for game experiences Gibbs will also join Microsoft’s Haiyan Zhang and Katja Hofmann to explore how AI can drive the next wave of dynamic game experiences. This panel bridges research and practical implementation, addressing the critical challenges developers face when moving from prototypes to production. The session showcases how our collaborative approach solves industry-wide barriers preventing AI games from reaching players – focusing on proven patterns that overcome the reliability, quality, and cost challenges most games never survive. I asked how Gibbs could convince a game developer that AI is a train they can get on, and that it’s not a train coming right at them. “Unfortunately, there’s lots of other partners that we weren’t able to share publicly. A lot of the triple-A’s [are quiet]. It’s happening, but it requires a lot of work. We’re starting to engage with developers where the requirements are being creative. If they have a game that they’re planning on launching in the next year or two years, and they don’t have a clear line of sight on how to do that efficiently at scale or cost, we can work with them on that,” Gibbs said. “There is a fundamentally different ways that it can be structured and integrated into games. And we’re going to have a lot more announcements this year as we’re trying to make them more self serve.” Session details: Monday, March 17, 10:50 a.m. to 11:50 a.m. West Hall, Room #3011 For more details, visit the GDC page

The current AI ecosystem wasn’t built with game developers in mind. While impressive in controlled demos, today’s AI technologies expose critical limitations when transitioning to production-ready games, said Kylan Gibbs, CEO of Inworld AI, in an interview with GamesBeat.

Right now, AI deployment is being slowed because game developers are dependent on black-box APIs with unpredictable pricing and shifting terms, leading to a loss of autonomy and stalled innovation, he said. Players are left with disposable “AI-flavored” demos instead of sustained, evolving experiences.

At the Game Developers Conference 2025, Inworld isn’t going to showcase technology for technology’s sake. Gibbs said the company is demonstrating how developers have overcome these structural barriers to ship AI-powered games that millions of players are enjoying right now. Their experiences highlight why so many AI projects fail before launch and more importantly, how to overcome these challenges.

“We’ve seen a transition over the last few years at GDC. Overall, it’s a transition from demos and prototypes to production,” Gibbs said. “When we started out, it was really a proof of concept. ‘How does this work?’ The use case is pretty narrow. It was really just characters and non-player characters (NPCs), and it was a lot of focus on demos.”

Now, Gibbs said, the company is focused on production with partners and large scale deployments and actually solving problems.

Getting AI to work in production

Inworld AI is working with partners like Nvidia and Streamlabs on AI.

Earlier large language models (LLMs) were too costly to put in games. That’s because it could cost a lot of money to send a user’s query to AI out across the web to a datacenter, using valuable graphics processing unit (GPU) time. It sent the answer back, often so slowly that the user noticed the delay.

One of the things that has helped with AI costs now is that the AI processing has been restructured, with tasks moving from the server to the client-side logic. However, that can only really happen if the user has a good machine with a good AI processor/GPU. Inference tasks can be done on the local machines, while harder machine learning problems may have to be done in the cloud, Gibbs said.

“Where I think we’re at today is we actually have proof that the stuff works at huge scale in production, and we have the right tools to be able to do that. And that’s been a great and exciting transition at the same time, because we’ve now been focusing on that we’ve been able to actually uncover regarding the root challenges in the AI ecosystem,” Gibbs said. “When you’re in the prototyping demo mindset, a lot of things work really well, right? A lot of these tools like OpenAI, Anthropic are great for demos but they do not work when you go into massive, multi-million users at scale.”

Gibbs said Inworld AI is focusing on solving the bigger problems at GDC. Inworld AI is sharing the real challenges it has encountered and showing what can work in production.

“There are some very real challenges to making that work, and we can’t solve it all on our own. We need to solve it as an ecosystem,” Gibbs said. “We need to accept and stop promoting AI as this panacea, a plug and play solution. We have solved the problems with a few partners.”

Gibbs is looking forward to the proliferation of AI PCs.

“If you bring all the processing onto onto the local machine, then a lot of that AI becomes much more affordable,” Gibbs said.

The company is providing all the backend models and efforts to contain costs. I noted that Mighty Bear Games, headed by Simon Davis, is creating games with AI agents, where the agents play the game and humans help craft the perfect agents.

“Companions are super cool. You’ll see multi-agent simulation experiences, like doing dynamic crowds. If you’re if you are focused on a character based experience, you can have primary characters or background characters,” Gibbs said. “And actually getting background characters to work efficiently is really hard because when people look at things like the Stanford paper, it’s about simulating 1,000 agents at once. We all know that games are not built like that. How do you give a sense of millions of characters at scale, while also doing a level-of-detail system, so you’re maximizing the depth of each agent as you get closer to it.”

AI skeptics?

AI livestreams

I asked Gibbs what he thought about the stat in the GDC 2025 survey, which showed that more game developers are skeptical about AI in this year’s survey compared to a year ago. The numbers showed 30% had a negative sentiment on AI, compared to 18% the year before. That’s going in the wrong direction.

“I think that we’ve got to this point where everybody realizes that the future of their careers will have AI in it. And we are at a point before where everybody was happy just to follow along with OpenAI’s announcements and whatever their friends were doing on LinkedIn,” Gibbs said.

People were likely turned off after they took tools like image generators with text prompts and these didn’t work so well in prodction. Now, as they move into production, they’re finding that it doesn’t work at scale. And so it takes better tools geared to specific users for developers, Gibbs said.

“We should be skeptical, because there are real challenges that no one is solving. And unless we voice that skepticism and start really pressuring the ecosystem, it’s not going to change,” Gibbs said.

The problems include cloud lock-in and unpredictable costs; performance and reliability issues; and a non-evolving AI. Another problem is controlling AI agents effectively so they don’t go off the rails.

When players are playing in a game like Fortnite, getting a response in milliseconds is critical, Gibbs said. AI in games can be a compelling experience, but making it work with cost efficiency at scale requires solving a lot of problems, Gibbs said.

As for the changes AI is bringing, Gibbs said, “There’s going to be a fundamental architecture change in how we build user-facing AI apps.”

Gibbs said, “What happens is studios are building with tools and then they get a few months from production and they’re like, ‘Holy crap! This doesn’t work. We need to completely change our architecture.’”

That’s what Inworld AI is working on and it will be announced in the future. Gibbs predicts that many AI tools will be quickly outdated within a matter of months. That’s going to make planning difficult. He also predicts that the capacity of third-party cloud providers will break under the strain.

“Will that code actually work when you have four million users funneling through it?,” Gibbs said. “What we’re seeing is a lot of people having to go back and rework their entire code base from Python to C++ as they get closer to production.”

Summary of partner demos

Streamlabs’ architecture for bringing AI into workflow.

At GDC, Inworld will be showcasing several key partner demos that highlight how studios of all sizes are successfully implementing AI. These include:

  • Streamlabs: Intelligent Streaming Agent provides real-time commentary and production assistance.
  • Wishroll: Showing off Status, a social media simulation game with unique AI-driven personalities.
  • Little Umbrella: The Last Show, a web-based party game with witty AI hosting.
  • Nanobit: Winked, a mobile chat game with persistent, evolving relationship building.
  • Virtuos: Giving developers full control over AI character behaviors for a more immersive storytelling experience.

Additionally, Inworld will feature two Inworld-developed technology showcases:

  • On-device Demo: A cooperative game running seamlessly on-device across multiple hardware platforms.
  • Realistic Multi-agent Simulation: Multi-agent simulation demonstrating realistic social behaviors and interactions.

The critical barriers blocking AI games from production and real dev solutions 

Kylan Gibbs is cofounder of Inworld AI and a speaker at our recent GamesBeat Next event.
Kylan Gibbs is cofounder of Inworld AI and a speaker at our recent GamesBeat Next event.

Below are seven of the key challenges that consistently prevent AI-powered games from making the leap from promising prototype to shipped product. Here’s how studios of all sizes used Inworld to break through these barriers and deliver experiences enjoyed by millions. 

The real-time wall: Streamlabs Intelligent Agent

The developer problem: Non-production ready cloud AI introduces response delays that break player immersion. Unoptimized cloud dependencies result in AI response times of 800 milliseconds to 1,200 milliseconds, making even the simplest interactions feel sluggish.

All intelligence remains server-side, creating single points of failure and preventing true ownership, yet most developers can find few alternatives beyond this cloud-API-only AI workflow that locks them into perpetual dependency architectures.

The Inworld solution: The Logitech G’s Streamlabs Intelligent Streaming Agent is an AI-driven co-host, producer, and technical sidekick that observes game events in real time, providing commentary during key moments, assisting with scene transitions, and driving audience engagement—letting creators focus on content without getting bogged down in production tasks.

“We tried building this with standard cloud APIs, but the 1-2 second delay made the assistant feel disconnected from the action,” said the Streamlabs team. “Working with Inworld, we achieved 200 millisecond response times that make the assistant feel present in the moment.”

Behind the scenes, the Inworld Framework orchestrates the assistant’s multimodal input processing, contextual reasoning, and adaptive output. By integrating seamlessly with third-party models and the Streamlabs API, Inworld makes it easy to interpret gameplay, chat, and voice commands, then deliver real-time actions—like switching scenes or clipping highlights. This approach saves developers from writing custom pipelines for every new AI model or event trigger.

This isn’t just faster—it’s the difference between an assistant that feels alive versus one that always seems a step behind the action.

The success tax: The Last Show

The Last Show

The developer problem: Success should be a cause for celebration, not a financial crisis. Yet, for AI-powered games, linear or even increasing unit costs mean expenses can quickly spiral out of control as user numbers grow. Instead of scaling smoothly, developers are forced to make emergency architecture changes, when they should be doubling down on success.

The Inworld solution: Little Umbrella, the studio behind Death by AI, was no exception. While the game was an instant hit–reaching 20 million players in just two months – the success nearly bankrupted the studio.

“Our cloud API costs went from $5K to $250K in two weeks,” shares their technical director. “We had to throttle user acquisition—literally turning away players—until we partnered with Inworld to restructure our AI architecture.”

For their next game, they decided to flip the script, building with cost predictability and scalability in mind from day one. Introducing The Last Show, a web-based party game where an AI host generates hilarious questions based on topics chosen or customized by players. Players submit answers, vote for their favorites, and the least popular response leads to elimination – all while the AI host delivers witty roasts.

The Last Show marks their comeback, engineered from the ground up to maintain both quality and cost predictability at scale. The result? A business model that thrives from success rather than being threatened by it.

The quality-cost paradox: Status

How can you be popular? Status knows.

The developer problem: Better AI quality often correlates with higher costs, forcing developers into an impossible decision: deliver a subpar player experience or face unsustainable costs. AI should enhance gameplay, not become an economic roadblock.

The Inworld solution: Wishroll’s Status (ranking as high as No. 4 in the App Store Lifestyle category) immerses players in a fictional world where they can roleplay as anyone they imagine—whether a world-famous pop star, a fictional character, or even a personified ChatGPT. Their goal is to amass followers, develop relationships with other celebrities, and complete unique milestones.

The concept struck a chord with gamers and by the time the limited access beta launched in October 2024, Status had taken off. TikTok buzz drove over 100,000 downloads with many gamers getting turned away, while the game’s Discord community ballooned from a modest 100 users to 60,000 within a few days. Only two weeks after their public beta launch in February 2025, Status surpassed a million users. 

“We were spending $12 to $15 per daily active user with top-tier models,” said CEO Fai Nur, in a statement. “That’s completely unsustainable. But when we tried cheaper alternatives, our users immediately noticed the quality drop and engagement plummeted.”

Working with Inworld’s ML Optimization services, Wishroll was able to cut AI costs by 90% while improving quality metrics. “We saw how Inworld solved similar problems for other AI games and thought, ‘This is exactly what we need,’” explained Fai. “We could tell Inworld had a lot of experience and knowledge on exactly what our problem was – which was optimizing models and reducing costs.”

“If we had launched with our original architecture, we’d be broke in days,” Fai explained. “Even raising tens of millions wouldn’t have sustained us beyond a month. Now we have a path to profitability.”

The agent control problem: Partnership with Virtuos

The developer problem: Even with sustainable performance benchmarks met, complex narrative games still require sophisticated control over AI agents’ behaviors, memories, and personalities to deliver deeply immersive and engaging experiences to gamers. Traditional approaches either lead to unpredictable interactions or require prohibitively complex scripting, making it nearly impossible to create believable characters with consistent personalities.

The Inworld solution: Inworld is partnering with Virtuos, a global game development powerhouse known for co-developing some of the biggest triple-A titles in the industry like Marvel’s Midnight Suns and Metal Gear Solid Delta: Snake Eater. With deep expertise in world-building and character development, Virtuos immediately saw the need for providing developers with precise control over the personalities, behaviors, and memories of AI-driven NPCs. This ensures storytelling consistency and players’ choices to dynamically influence the narrative’s direction and outcome.

Inworld’s suite of generative AI tools provides the cognitive core that brings these characters to life while equipping developers with full customization capabilities. Teams can fine-tune AI-driven characters to stay true to their narrative arcs, ensuring they evolve logically and consistently within the game world. With Inworld’s tools, Virtuos can focus on what they do best–creating rich, immersive experiences.

“At Virtuos, we see AI as a way to enhance the artistry of game developers and accurately bring their visions to life,” said Piotr Chrzanowski, CTO at Virtuos, in a statement. “By integrating AI, we enable developers to add new dimensions to their creations, enriching the gaming experience without compromising quality. Our partnership with Inworld opens the door to gameplay experiences that weren’t possible before.”

A prototype showcasing the best of both teams is in the works, and interested media are invited to stop by the Virtuos booth at C1515 for a private demo.

The immersive dialogue challenge: Winked

The developer problem: Nanobit’s Winked is a mobile interactive narrative experience where players build relationships through dynamic, evolving conversations, including direct messages with core characters. To meet player expectations, the player-facing AI-driven dialogue had to exceed what was possible even with frontier models — offering more personal, emotionally nuanced, and stylistically unique interactions. Yet, achieving the level of quality was beyond the capabilities of off-the-shelf models, and the high costs of premium AI solutions made scalability a challenge. 

The Inworld solution: Using Inworld Cloud, Nanobit trained and distilled a custom AI model tailored specifically for Winked. This model delivered superior dialogue quality–more organic, personal, and contextually aware than off-the-shelf solutions—while keeping costs a fraction of traditional cloud APIs. The AI integrated seamlessly into Winked’s core game loops, enhancing user engagement while maintaining financial viability.

Beyond improving player immersion, this AI-driven dialogue system remembers past conversations and carries the storyline forward, providing the player with relationships that evolve as chats progress. This in turn encourages players to engage in longer conversations and return more frequently as they grow closer to characters.

The multi-agent orchestration challenge: Realistic multi-agent simulation

The developer problem: Creating living, believable worlds requires coordinating multiple AI agents to interact naturally with each other and the player. Developers struggle to create social dynamics that feel organic rather than mechanical, especially at scale.

The Inworld solution: Our Realistic Multi-agent Simulation demonstrates how to effectively orchestrate multiple AI agents into cohesive, living worlds using Inworld. By implementing sophisticated agent coordination systems, contextual awareness, and shared environmental knowledge, this simulation creates believable social dynamics that emerge naturally rather than through scripted behaviors.

Whether forming spontaneous crowds around exciting in-game events, reacting to shared group emotes, or engaging in multi-character conversations, these autonomous agents showcase how proper agent orchestration enables emergent, lifelike behaviors at scale. This technical demonstration underscores the potential for deep player immersion and sustained engagement by bringing social hubs to life—where multiple characters interact with consistent personalities, mutual awareness, and collective response patterns that create the feeling of a truly living world.

The hardware fragmentation challenge: On-device Demo

The developer problem: AI features optimized for high-end devices fail on mainstream hardware, forcing developers to either limit their audience or compromise their vision. AI vendors also obscure critical capabilities required for on-device inference (distilled models, deep fine-tuning and distillation, runtime model adaptation) to maintain control and protect recurring revenue.

The Inworld solution: While on-device is the key to a more scalable future of AI and games, AI hardware in gaming doesn’t have a one-size-fits-all solution. Ensuring consistent performance and accessibility for users on various devices can easily drive up complexity and cost. To achieve scalability, AI solutions must adapt seamlessly across diverse hardware configurations.

Our on-device demo showcases an AI-powered cooperative gameplay running seamlessly across three hardware configurations:

  • Nvidia GeForce RTX 5090
  • AMD Radeon RX 7900 XTX
  • Tenstorrent Quietbox

This demo isn’t about theoretical compatibility; it’s about achieving consistent performance across diverse hardware, allowing developers to target the full spectrum of gaming devices without sacrificing quality.

The development difference: Going beyond prototypes

The gap between prototype and production is where most AI game projects collapse. While out-of-the-box plugins are useful for prototyping, they break under real-world conditions:

  • Latency collapse: Cloud-dependent tools see response times balloon under load, breaking immersion and even gameplay
  • Cost explosion: Per-token pricing creates financial cliff edges that make scaling unpredictable
  • Reliability bottlenecks: Each external API call introduces a new potential point of failure
  • Quality consistency: AI performance varies dramatically between test and production environments

“We’ve watched incredible AI game prototypes die in the transition to production for four years now,” says Evgenii Shingarev, VP of Engineering at Inworld, in a statement. “The pattern is always the same: impressive demo, enthusiastic investment, then the slow realization that the economics and technical architecture don’t support real-world deployment.”

At Inworld, we’ve worked relentlessly to close this prototype-to-production gap, developing solutions that address the real-world challenges of shipping and scaling AI-powered games—not just showcasing impressive demos. At GDC, Inworld is excited to share experiences that don’t just make it to launch, but thrive at scale, said Gibbs. The company’s booth is at C1615.

Instead of talking about the future of gaming with AI, we’ll show the real systems solving real problems, developed by teams who have faced the same challenges you’re encountering, Gibbs said.

The path from AI prototype to production is challenging, but with the right approach and partners who understand what it takes to ship AI experiences that players love, it’s absolutely achievable, Gibbs said.

Session with Jim Keller of Tenstorrent: Breaking down AI’s unsustainable economics:

Jim Keller, now head of Tenstorrent, is a legendary hardware engineer who headed important processor projects at companies such as Apple, AMD and Intel. He will be on a GDC panel with Inworld CEO Kylan Gibbs for a candid examination of AI’s broken economic model in gaming and the practical path forward:

“Current AI infrastructure is economically unsustainable for games at scale,” said Keller, in a statement. “We’re seeing studios adopt impressive AI features in development, only to strip them back before launch once they calculate the true cloud costs at scale.”

Gibbs said he is looking forward to talking with Keller on stage about Tenstorrent, which aims to serve AI applications at scale for less than 100 times the cost.

The session will explore concrete solutions to these economic barriers:

  • Dramatically cheaper model and hardware options
  • Local inference strategies that eliminate API dependency
  • Practical hybridization approaches that optimize for cost, performance, and quality
  • Active learning systems that improve ROI over time

Drawing on Keller’s deep hardware expertise from Tenstorrent, AMD, Apple, Intel, and Tesla and Inworld’s expertise in real-time, user-facing AI, we’ll explore how to blend on-device compute with large-scale cloud resources under one architectural umbrella. Attendees will gain candid insights into what actually matters when bringing AI from theory into practice, and how to build a sustainable AI pipeline that keeps costs low without sacrificing creativity or performance.

Session details:

  • Thursday, March 20, 9:30 a.m. – 10:30 a.m.
  • West Hall, Room #2000
  • For more details, visit the GDC page

Session with Microsoft: AI innovation for game experiences

Gibbs will also join Microsoft’s Haiyan Zhang and Katja Hofmann to explore how AI can drive the next wave of dynamic game experiences. This panel bridges research and practical implementation, addressing the critical challenges developers face when moving from prototypes to production.

The session showcases how our collaborative approach solves industry-wide barriers preventing AI games from reaching players – focusing on proven patterns that overcome the reliability, quality, and cost challenges most games never survive.

I asked how Gibbs could convince a game developer that AI is a train they can get on, and that it’s not a train coming right at them.

“Unfortunately, there’s lots of other partners that we weren’t able to share publicly. A lot of the triple-A’s [are quiet]. It’s happening, but it requires a lot of work. We’re starting to engage with developers where the requirements are being creative. If they have a game that they’re planning on launching in the next year or two years, and they don’t have a clear line of sight on how to do that efficiently at scale or cost, we can work with them on that,” Gibbs said. “There is a fundamentally different ways that it can be structured and integrated into games. And we’re going to have a lot more announcements this year as we’re trying to make them more self serve.”

Session details:

  • Monday, March 17, 10:50 a.m. to 11:50 a.m.
  • West Hall, Room #3011
  • For more details, visit the GDC page
Shape
Shape
Stay Ahead

Explore More Insights

Stay ahead with more perspectives on cutting-edge power, infrastructure, energy,  bitcoin and AI solutions. Explore these articles to uncover strategies and insights shaping the future of industries.

Shape

IonQ, Alice & Bob roll out quantum breakthroughs

“And therefore, does bring us closer to escape velocity,” he added. “When this will happen, and we’ll be able to present an Alice & Bob’s logical qubit under threshold depends on a variety of factors, but we can openly say that this is the current big work happening in the

Read More »

Bad data in, bad data out: Protecting your investment in ADMS

Congratulations! Your utility has successfully implemented a cutting-edge ADMS application. Your GIS team has spent months working closely with the implementation team to clean and correct the data within the GIS application. The teams have validated voltage and phasing, eliminated loops, resolved open points, populated missing attribution with default values,

Read More »

EPA to end environmental justice programs, monitoring tools

Dive Brief: The Trump administration announced Wednesday it will shut down all environmental justice offices and officially end other EJ-related initiatives, a move that will impact how waste and recycling industries measure and track their environmental impact on neighboring communities. The closures include the EPA’s Office of Environmental Justice and

Read More »

Oil Rises as China Plans Stimulus, USA Ramps Up Pressure on Iran

Oil rose for a second session on optimistic economic signals from the two biggest consumers of crude, while US attacks on Yemen’s Iran-backed Houthis revived concerns about a wider confrontation in the Middle East.  West Texas Intermediate advanced 0.6% to settle below $68 a barrel after US retail sales showed a modest slowdown, rather than a precipitous drop as some had projected. Meanwhile, China plans measures to stabilize stock and real estate markets, lift wages and boost the nation’s birth rate, state-run news agency Xinhua reported. Potentially rekindling the market’s geopolitical risk premium, US President Donald Trump said in a social media post that the administration will view maritime attacks by the Houthi militia as equivalent to direct affronts by Tehran. That followed Defense Secretary Pete Hegseth’s comments on Sunday that US strikes on Houthi sites will be “unrelenting” until the group stops targeting vessels in the Red Sea. The geopolitical tensions “could easily shift some major market shorts back to the sidelines,” said Dennis Kissler, senior vice president for trading at BOK Financial Securities. US crude’s front-month contract is being met with resistance at a short-term moving average of $68.56, he said.  At least one fund placed an options bet equivalent to 20 million barrels that would profit if a flare-up in the Middle East pushes Brent’s June contract — currently trading near $71 — toward $100. Still, crude has fallen more than $10 a barrel from this year’s high in January, driven by Trump’s escalating trade war, an OPEC+ decision to increase supply and a possible end to the war in Ukraine that would return Russian barrels to the market. The US president may speak to Russian leader Vladimir Putin this week as Washington pushes for a deal to end the three-year conflict. However, futures remain in a bullish backwardation structure, with shorter-term contracts at a higher price than longer-dated

Read More »

DOE Approves Loan Disbursement for Palisades Nuclear Plant

WASHINGTON– U.S. Department of Energy (DOE) Secretary Chris Wright today announced the release of the second loan disbursement to Holtec for the Palisades Nuclear Plant. Today’s action releases $56,787,300 of the up to $1.52 billion loan guarantee to Holtec for the Palisades project, which will provide 800MW of affordable, reliable baseload power in Michigan. “Unleashing American energy dominance will require leveraging all energy sources that are affordable, reliable and secure – including nuclear energy,” said Secretary Wright. “Today’s action is yet another step toward advancing President Trump’s commitment to increase domestic energy production, bolster our security and lower costs for the American people.” The Palisades Nuclear Plant will be America’s first restart of a commercial nuclear reactor that ceased operations, subject to U.S. Nuclear Regulatory Commission (NRC) licensing approvals. The project is projected to support or retain up to 600 high-quality jobs in Michigan––many of them filled by workers who had previously been at the plant for over 20 years. Today’s disbursement is Holtec’s second disbursement of funds from the Loan Programs Office (LPO) since the announcement of its financial loan close in September 2024. LPO funds go toward the plant restart and ensuring the plant is NRC compliant. This announcement highlights DOE’s commitment to advance President Trump’s agenda of unleashing affordable, reliable, and secure energy through investing in projects across the country that support American jobs, bolster domestic supply chains, and strengthen America’s position as a world energy leader. ###

Read More »

Israel, Azerbaijan Step Up Alliance With Gas Exploration Deal

Azerbaijan will on Monday sign agreements to explore for natural gas in Israeli waters, highlighting a key strategic alliance between the two countries amid turbulence in the region. A consortium of Azeri state company Socar, BP Plc and Israel’s NewMed Energy LP will get the right to explore in one offshore block, in a signing overseen by Azerbaijan Economy Minister Mikayil Jabbarov and Israeli Energy Minister Eli Cohen, according to the Israeli energy ministry. The agreement gives Socar another foothold in important Israeli assets after the company bought a 10% stake in the Tamar gas field earlier this year. The deal comes at a time when Israel has been trying to deepen ties with Azerbaijan to help counter neighboring Iran. Israel and Azerbaijan are also dependent on each other for energy and defense equipment. Baku has maintained relations with Prime Minister Benjamin Netanyahu’s government as the war in the Middle East has pitted Israel against Iran and the militant groups it backs. The Socar consortium won the exploration rights in October 2023, but the Israel-Hamas war that kicked off that month delayed the signing of the contract. The companies will now have three years to conduct seismic surveys in the block to study the possibility of the presence of gas reserves.       Jabbarov’s visit to Jerusalem is the first for an Azerbaijani minister since the start of the war. Israel was the sixth-biggest buyer of oil from Azerbaijan last year, with sales totaling $713 million, according to a report in caliber.az, which cited data from the State Customs Committee.  The new exploration licenses will be for the so-called Cluster I, an area covering some 1,700 square kilometers in the northern part of Israel’s economic waters. The area “has hardly been explored in the past in terms of natural

Read More »

The Hitachi Interview: Laura Fleming, country managing director, UK and Ireland

In an exclusive interview with Energy Voice, Hitachi Energy country managing director for UK and Ireland, Laura Fleming, explains how the UK must aim to transition – not switch – away from oil and gas, and the immediate priorities to achieve Labour’s clean power vision. In the UK, Hitachi Energy is heavily focused on the wind sector and enabling the flow of electricity from wind farms to locations where it can find end users. Hitachi provides the connection for Dogger Bank Wind Farm and the Shetland Grid, among others. Energy Voice: From your previous experience working on offshore interconnector projects, what are the biggest priorities and opportunities for the UK right now? Laura Fleming: The main priority for the UK right now is to ensure that each GW of renewable energy is matched by investment in the grid. Growth in grid capacity is the key to unlocking growth in renewables. Grid capacity must move in lockstep with the growth of renewables. Rapid investment in a more sustainable, flexible and secure energy system is vital to the UK achieving clean power by 2030, kickstarting economic growth and achieving cheaper power. At Hitachi Energy, we are playing our part by working with the government, our customers and partners to deliver electricity networks that will enable the UK to become a clean energy superpower. Investment in grid capacity will allow the UK to capture the enormous growth opportunity from a net-zero grid. To deliver this, the priority should be on delivering the Transmission Acceleration Action Plan and the Clean Power 2030 Action Plan with rapid unblocking of grid connections that risk holding back renewable energy projects. As I mentioned, the UK has the opportunity to capture the enormous growth opportunity from the net-zero grid. To do this, we need incentives to support and

Read More »

North America Cuts 35 Rigs Week on Week

North America dropped 35 rigs week on week, according to Baker Hughes’ latest North America rotary rig count, which was released on March 14. Although the total U.S. rig count remained unchanged week on week, Canada’s total rig count dropped by 35 during the same period, taking the total North America rig count down to 791, comprising 592 rigs from the U.S. and 199 from Canada, the count outlined. Of the total U.S. rig count of 592, 576 rigs are categorized as land rigs, 14 are categorized as offshore rigs, and two are categorized as inland water rigs. The total U.S. rig count is made up of 487 oil rigs, 100 gas rigs, and five miscellaneous rigs, according to the count, which revealed that the U.S. total comprises 530 horizontal rigs, 50 directional rigs, and 12 vertical rigs. Week on week, the U.S. land rig count, offshore rig count, and inland water rig count remained unchanged, the count highlighted. The U.S. gas rig count decreased by one, its oil rig count increased by one, and its miscellaneous rig count remained unchanged, week on week, the count showed. Baker Hughes’ count revealed that the U.S. horizontal rig count decreased by one week on week, while the country’s directional rig count increased by one and its vertical rig count remained unchanged during the period. A major state variances subcategory included in the rig count showed that, week on week, New Mexico dropped three rigs and Oklahoma added two rigs. A major basin variances subcategory included in Baker Hughes’ rig count showed that the Permian basin dropped three rigs, the Eagle Ford basin dropped one rig, the Granite Wash basin added two rigs, and the Williston basin added one rig, week on week. Canada’s total rig count of 199 is made up of

Read More »

Goldman Cuts Oil Forecasts on Slow USA Growth, OPEC+ Policy

Goldman Sachs Group Inc. cut its oil price forecasts, as tariffs reduce the outlook for US growth while OPEC and its allies boost output. The move follows a drop in crude prices from this year’s high in January on plentiful supply, a weak demand outlook from top importer China and an escalating international trade war.  “While the $10 a barrel sellof since mid-January is larger than the change in our base case fundamentals, we reduce by $5 our December 2025 forecast for Brent to $71,” Goldman analysts including Daan Struyven said in the note dated Sunday. “The medium-term risks to our forecast remain to the downside given potential further tariff escalation and potentially longer OPEC+ production increases.” Some of the world’s biggest oil traders have turned increasingly bearish, with the likes of Vitol Group and Gunvor Group forecasting oversupply. The International Energy Agency said last week that demand is being eroded by the escalating trade war and the pledge by the Organization of Petroleum Exporting Countries and its allies to increase shipments, forecasting a surplus of 600,000 barrels this year — or about 0.6% of daily global consumption. However, Goldman Sachs said it expects prices to recover “modestly” in the coming months as US economic growth remains resilient for now, and Washington’s sanctions regime is showing no immediate signs of easing. Other geopolitical risks remain, including the latest US order to attack sites in Yemen controlled by the Houthis as they continue menacing Red Sea shipping. Oil demand will rise 900,000 barrels a day in January, 18% less than a previous forecast, Goldman said. Brent will trade in a range of $65 to $80 a barrel, and average $68 next year, the bank said. WHAT DO YOU THINK? Generated by readers, the comments included herein do not reflect the views and opinions of Rigzone. All comments

Read More »

Enterprises reevaluate virtualization strategies amid Broadcom uncertainty

This dilemma of whether to absorb the Broadcom price hikes or embark on the arduous and risky journey of untangling from the VMware ecosystem is triggering a broader C-level conversation around virtualization strategy. “For enterprises navigating this uncertainly, the challenge isn’t just finding a replacement for VMware. IT shops of all sizes see Broadcom’s actions as an opportunity to rethink their approach to virtualization, cloud strategy and IT modernization,” says Steve McDowell, chief analyst at NAND Research. Elliot says that server virtualization has been taken for granted for a long time, and the Broadcom-driven wake-up call is forcing organizations to reevaluate their virtualization strategies at the board level. “That kind of strategic conversation hasn’t happened for years. Customers are saying, ‘What can we do as this platform emerges from VMware. How do we think about this relative to our multi-cloud strategy and private cloud and the efficiencies we can gain? Let’s talk about risk reduction. Let’s talk about platform strategy.’ This is an opportunity to identify business value. It’s triggering this plethora of swim lanes.” Check the waters before diving in While there are multiple alternatives to the VMware platform, none of them are as good from a feature perspective, and there’s a risk associated with moving off a tried-and-true platform. In estimating the cost of a large-scale VMware migration, Gartner cautions: “VMware’s server virtualization platform has become the point of integration for its customers across server, storage and network infrastructure in the data center. Equally, it is a focus of IT operational duties including workload provisioning, backup and disaster recovery. Migrating from VMware’s server virtualization platform would require untangling many aspects of these investments.” It would take a midsize enterprise at least two years to untangle much of its dependency upon VMware, and it could take a large enterprise

Read More »

5 alternatives to VMware vSphere virtualization platform

Nutanix – which is actively courting disgruntled VMware customers – provides storage services that aggregate storage in a global pool that enables any VM to access and consume storage resources. Features include compression, deduplication, high-availability and snapshots. Enterprises running high-performance databases often require external storage arrays, and Nutanix has addressed that need by certifying storage with SAP HANA and Oracle RAC. (Read more: Cisco, Nutanix strengthen joint HCI package) 4. Scale Computing Platform Scale provides an all-in-one hardware and software package that includes all software licenses. Software features offered at no additional charge include high-availability clustering, built-in disaster recovery, replication and software-defined storage. Scale also offers a tool to automate migrations off vSphere, a centralized management feature for HCI clusters, and the ability to mix and match dissimilar hardware appliances in a cluster. In addition, all storage is pooled. Last summer, Scale Computing said in a quarterly earnings announcement that sales have taken off, thanks in part to Broadcom’s changes to VMware sales operations.  5. VergeIO VergeIO takes HCI to the next level with something it calls ultraconverged infrastructure (UCI). This means VergeIO can not only virtualize the normal stack of compute, networking and storage, it can also implement multi-tenancy, creating multiple virtual data centers (VDCs). Each VDC has its own compute, network, storage, management and VergeOS assigned to it. Enterprises can manage and use each VDC much like the virtual private clouds offered by the hyperscalers. VergeIO says this model creates greater workload density, which means lower costs, improved availability, and simplified IT.

Read More »

IBM laying foundation for mainframe as ultimate AI server

“It will truly change what customers are able to do with AI,” Stowell said. IBM’s mainframe processors The next generation of processors is expected to continue a long history of generation-to-generation improvements, IBM stated in a new white paper on AI and the mainframe. “They are projected to clock in at 5.5 GHz. and include ten 36 MB level 2 caches. They’ll feature built-in low-latency data processing for accelerated I/O as well as a completely redesigned cache and chip-interconnection infrastructure for more on-chip cache and compute capacity,” IBM wrote.  Today’s mainframes also have extensions and accelerators that integrate with the core systems. These specialized add-ons are designed to enable the adoption of technologies such as Java, cloud and AI by accelerating computing paradigms that are essential for high-volume, low-latency transaction processing, IBM wrote.  “The next crop of AI accelerators are expected to be significantly enhanced—with each accelerator designed to deliver 4 times more compute power, reaching 24 trillion operations per second (TOPS),” IBM wrote. “The I/O and cache improvements will enable even faster processing and analysis of large amounts of data and consolidation of workloads running across multiple servers, for savings in data center space and power costs. And the new accelerators will provide increased capacity to enable additional transaction clock time to perform enhanced in-transaction AI inferencing.” In addition, the next generation of the accelerator architecture is expected to be more efficient for AI tasks. “Unlike standard CPUs, the chip architecture will have a simpler layout, designed to send data directly from one compute engine, and use a range of lower- precision numeric formats. These enhancements are expected to make running AI models more energy efficient and far less memory intensive. As a result, mainframe users can leverage much more complex AI models and perform AI inferencing at a greater scale

Read More »

VergeIO enhances VergeFabric network virtualization offering

VergeIO is not, however, using an off-the-shelf version of KVM. Rather, it is using what Crump referred to as a heavily modified KVM hypervisor base, with significant proprietary enhancements while still maintaining connections to the open-source community. VergeIO’s deployment profile is currently 70% on premises and about 30% via bare-metal service providers, with a particularly strong following among cloud service providers that host applications for their customers. The software requires direct hardware access due to its low-level integration with physical resources. “Since November of 2023, the normal number one customer we’re attracting right now is guys that have had a heart attack when they got their VMware renewal license,” Crump said. “The more of the stack you own, the better our story becomes.” A 2024 report from Data Center Intelligence Group (DCIG) identified VergeOS as one of the top 5 alternatives to VMware. “VergeIO starts by installing VergeOS on bare metal servers,” the report stated. “It then brings the servers’ hardware resources under its management, catalogs these resources, and makes them available to VMs. By directly accessing and managing the server’s hardware resources, it optimizes them in ways other hypervisors often cannot.” Advanced networking features in VergeFabric VergeFabric is the networking component within the VergeOS ecosystem, providing software-defined networking capabilities as an integrated service rather than as a separate virtual machine or application.

Read More »

Podcast: On the Frontier of Modular Edge AI Data Centers with Flexnode’s Andrew Lindsey

The modular data center industry is undergoing a seismic shift in the age of AI, and few are as deeply embedded in this transformation as Andrew Lindsey, Co-Founder and CEO of Flexnode. In a recent episode of the Data Center Frontier Show podcast, Lindsey joined Editor-in-Chief Matt Vincent and Senior Editor David Chernicoff to discuss the evolution of modular data centers, the growing demand for high-density liquid-cooled solutions, and the industry factors driving this momentum. A Background Rooted in Innovation Lindsey’s career has been defined by the intersection of technology and the built environment. Prior to launching Flexnode, he worked at Alpha Corporation, a top 100 engineering and construction management firm founded by his father in 1979. His early career involved spearheading technology adoption within the firm, with a focus on high-security infrastructure for both government and private clients. Recognizing a massive opportunity in the data center space, Lindsey saw a need for an innovative approach to infrastructure deployment. “The construction industry is relatively uninnovative,” he explained, citing a McKinsey study that ranked construction as the second least-digitized industry—just above fishing and wildlife, which remains deliberately undigitized. Given the billions of square feet of data center infrastructure required in a relatively short timeframe, Lindsey set out to streamline and modernize the process. Founded four years ago, Flexnode delivers modular data centers with a fully integrated approach, handling everything from site selection to design, engineering, manufacturing, deployment, operations, and even end-of-life decommissioning. Their core mission is to provide an “easy button” for high-density computing solutions, including cloud and dedicated GPU infrastructure, allowing faster and more efficient deployment of modular data centers. The Rising Momentum for Modular Data Centers As Vincent noted, Data Center Frontier has closely tracked the increasing traction of modular infrastructure. Lindsey has been at the forefront of this

Read More »

Last Energy to Deploy 30 Microreactors in Texas for Data Centers

As the demand for data center power surges in Texas, nuclear startup Last Energy has now announced plans to build 30 microreactors in the state’s Haskell County near the Dallas-Fort Worth Metroplex. The reactors will serve a growing customer base of data center operators in the region looking for reliable, carbon-free energy. The plan marks Last Energy’s largest project to date and a significant step in advancing modular nuclear power as a viable solution for high-density computing infrastructure. Meeting the Looming Power Demands of Texas Data Centers Texas is already home to over 340 data centers, with significant expansion underway. Google is increasing its data center footprint in Dallas, while OpenAI’s Stargate has announced plans for a new facility in Abilene, just an hour south of Last Energy’s planned site. The company notes the Dallas-Fort Worth metro area alone is projected to require an additional 43 gigawatts of power in the coming years, far surpassing current grid capacity. To help remediate, Last Energy has secured a 200+ acre site in Haskell County, approximately three and a half hours west of Dallas. The company has also filed for a grid connection with ERCOT, with plans to deliver power via a mix of private wire and grid transmission. Additionally, Last Energy has begun pre-application engagement with the U.S. Nuclear Regulatory Commission (NRC) for an Early Site Permit, a key step in securing regulatory approval. According to Last Energy CEO Bret Kugelmass, the company’s modular approach is designed to bring nuclear energy online faster than traditional projects. “Nuclear power is the most effective way to meet Texas’ growing energy demand, but it needs to be deployed faster and at scale,” Kugelmass said. “Our microreactors are designed to be plug-and-play, enabling data center operators to bypass the constraints of an overloaded grid.” Scaling Nuclear for

Read More »

Microsoft will invest $80B in AI data centers in fiscal 2025

And Microsoft isn’t the only one that is ramping up its investments into AI-enabled data centers. Rival cloud service providers are all investing in either upgrading or opening new data centers to capture a larger chunk of business from developers and users of large language models (LLMs).  In a report published in October 2024, Bloomberg Intelligence estimated that demand for generative AI would push Microsoft, AWS, Google, Oracle, Meta, and Apple would between them devote $200 billion to capex in 2025, up from $110 billion in 2023. Microsoft is one of the biggest spenders, followed closely by Google and AWS, Bloomberg Intelligence said. Its estimate of Microsoft’s capital spending on AI, at $62.4 billion for calendar 2025, is lower than Smith’s claim that the company will invest $80 billion in the fiscal year to June 30, 2025. Both figures, though, are way higher than Microsoft’s 2020 capital expenditure of “just” $17.6 billion. The majority of the increased spending is tied to cloud services and the expansion of AI infrastructure needed to provide compute capacity for OpenAI workloads. Separately, last October Amazon CEO Andy Jassy said his company planned total capex spend of $75 billion in 2024 and even more in 2025, with much of it going to AWS, its cloud computing division.

Read More »

John Deere unveils more autonomous farm machines to address skill labor shortage

Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More Self-driving tractors might be the path to self-driving cars. John Deere has revealed a new line of autonomous machines and tech across agriculture, construction and commercial landscaping. The Moline, Illinois-based John Deere has been in business for 187 years, yet it’s been a regular as a non-tech company showing off technology at the big tech trade show in Las Vegas and is back at CES 2025 with more autonomous tractors and other vehicles. This is not something we usually cover, but John Deere has a lot of data that is interesting in the big picture of tech. The message from the company is that there aren’t enough skilled farm laborers to do the work that its customers need. It’s been a challenge for most of the last two decades, said Jahmy Hindman, CTO at John Deere, in a briefing. Much of the tech will come this fall and after that. He noted that the average farmer in the U.S. is over 58 and works 12 to 18 hours a day to grow food for us. And he said the American Farm Bureau Federation estimates there are roughly 2.4 million farm jobs that need to be filled annually; and the agricultural work force continues to shrink. (This is my hint to the anti-immigration crowd). John Deere’s autonomous 9RX Tractor. Farmers can oversee it using an app. While each of these industries experiences their own set of challenges, a commonality across all is skilled labor availability. In construction, about 80% percent of contractors struggle to find skilled labor. And in commercial landscaping, 86% of landscaping business owners can’t find labor to fill open positions, he said. “They have to figure out how to do

Read More »

2025 playbook for enterprise AI success, from agents to evals

Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More 2025 is poised to be a pivotal year for enterprise AI. The past year has seen rapid innovation, and this year will see the same. This has made it more critical than ever to revisit your AI strategy to stay competitive and create value for your customers. From scaling AI agents to optimizing costs, here are the five critical areas enterprises should prioritize for their AI strategy this year. 1. Agents: the next generation of automation AI agents are no longer theoretical. In 2025, they’re indispensable tools for enterprises looking to streamline operations and enhance customer interactions. Unlike traditional software, agents powered by large language models (LLMs) can make nuanced decisions, navigate complex multi-step tasks, and integrate seamlessly with tools and APIs. At the start of 2024, agents were not ready for prime time, making frustrating mistakes like hallucinating URLs. They started getting better as frontier large language models themselves improved. “Let me put it this way,” said Sam Witteveen, cofounder of Red Dragon, a company that develops agents for companies, and that recently reviewed the 48 agents it built last year. “Interestingly, the ones that we built at the start of the year, a lot of those worked way better at the end of the year just because the models got better.” Witteveen shared this in the video podcast we filmed to discuss these five big trends in detail. Models are getting better and hallucinating less, and they’re also being trained to do agentic tasks. Another feature that the model providers are researching is a way to use the LLM as a judge, and as models get cheaper (something we’ll cover below), companies can use three or more models to

Read More »

OpenAI’s red teaming innovations define new essentials for security leaders in the AI era

Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More OpenAI has taken a more aggressive approach to red teaming than its AI competitors, demonstrating its security teams’ advanced capabilities in two areas: multi-step reinforcement and external red teaming. OpenAI recently released two papers that set a new competitive standard for improving the quality, reliability and safety of AI models in these two techniques and more. The first paper, “OpenAI’s Approach to External Red Teaming for AI Models and Systems,” reports that specialized teams outside the company have proven effective in uncovering vulnerabilities that might otherwise have made it into a released model because in-house testing techniques may have missed them. In the second paper, “Diverse and Effective Red Teaming with Auto-Generated Rewards and Multi-Step Reinforcement Learning,” OpenAI introduces an automated framework that relies on iterative reinforcement learning to generate a broad spectrum of novel, wide-ranging attacks. Going all-in on red teaming pays practical, competitive dividends It’s encouraging to see competitive intensity in red teaming growing among AI companies. When Anthropic released its AI red team guidelines in June of last year, it joined AI providers including Google, Microsoft, Nvidia, OpenAI, and even the U.S.’s National Institute of Standards and Technology (NIST), which all had released red teaming frameworks. Investing heavily in red teaming yields tangible benefits for security leaders in any organization. OpenAI’s paper on external red teaming provides a detailed analysis of how the company strives to create specialized external teams that include cybersecurity and subject matter experts. The goal is to see if knowledgeable external teams can defeat models’ security perimeters and find gaps in their security, biases and controls that prompt-based testing couldn’t find. What makes OpenAI’s recent papers noteworthy is how well they define using human-in-the-middle

Read More »