Zencoder unveils its next-generation AI coding and unit testing agents today, positioning the San Francisco-based company as a formidable challenger to established players like GitHub Copilot and newcomers like Cursor.
The company, founded by former Wrike CEO Andrew Filev, integrates its AI agents directly into popular development environments including Visual Studio Code and JetBrains IDEs, alongside deep integrations with JIRA, GitHub, GitLab, Sentry, and more than 20 other development tools.
“We started with the thesis that transformers are powerful computing building blocks, but if you put them in a more agentic environment, you can get much more out of them,” said Filev in an exclusive interview with VentureBeat. “By agentic, I mean two key things: first, giving the AI feedback so it can improve its work, and second, equipping it with tools. Just like human intelligence, AI becomes significantly more capable when it has the right tools at its disposal.”
Why developers won’t need to abandon their favorite IDEs for AI assistance
Several AI coding assistants have emerged in the past year, but Zencoder’s approach distinguishes itself by operating within existing workflows rather than requiring developers to switch platforms.
“Our main competitor is Cursor. Cursor is its own development environment versus we deliver the same very powerful agentic capabilities, but within existing development environments,” Filev told VentureBeat. “For some developers, it doesn’t really matter. But for some developers, they either want or have to stick to their existing environments.”
This distinction matters particularly for enterprise developers working in Java and C#, languages for which specialized IDEs like JetBrains’ IntelliJ and Rider offer more robust support than generalized environments.
How Zencoder’s AI agents are beating state-of-the-art benchmarks by double-digit margins
The company claims significant performance advantages over competitors, backed by results on standard industry benchmarks. According to Filev, Zencoder’s agents can solve 63% of issues on the SWE-Bench Verified benchmark, placing it among the top three performers despite using a more practical single-trajectory approach rather than running multiple parallel attempts like some research-focused systems.
“Our agent is distinctive because we’re focused on building the best pipeline for real-world developer use,” Filev said. “What makes our approach special is that our agent operates on what we call a single track, single trajectory basis. For a single trajectory agent to successfully resolve 63% of these complex issues is remarkably impressive.”
Even more notable, the company reports approximately 30% success on the newer SWE-Bench Multimodal benchmark, which Filev claims is double the previous best result of less than 15%. On OpenAI’s recently introduced SWE-Lancer IC Diamond benchmark, Zencoder reports more than 30% success — over 20% better than OpenAI’s own best result.
The secret sauce: ‘Repo Grokking’ technology that understands your entire codebase
Zencoder’s performance stems from its proprietary “Repo Grokking” technology, which analyzes and interprets large codebases to provide critical context to the AI agents.
“All of these agents have distinct capabilities shaped by the language models embedded within them,” Filev explained. “Whether it’s a frontier model or an open source model, the LLM by itself knows nothing about your specific project in the vast majority of scenarios. It can only work with the context that’s provided to it.”
Zencoder’s approach combines multiple techniques beyond simple AI embeddings for semantic search. “It uses traditional full text search, it uses custom re-ranker, it uses LLM, it uses synthetic information. So it does a lot of things to build the best understanding of the customer repositories,” Filev said.
This contextual understanding helps the system avoid a common criticism of AI coding assistants—that they introduce more problems than they solve by misunderstanding project structures or dependencies.
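To make the mix of techniques Filev describes more concrete, here is a minimal sketch of hybrid retrieval over a toy repository: a keyword match stands in for full-text search, a bag-of-words cosine similarity stands in for embedding-based semantic search, and a simple path-based boost stands in for a custom re-ranker. This is not Zencoder's implementation, which is proprietary; every function name, weight, and file in the example is a hypothetical chosen for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric/underscore tokens."""
    return re.findall(r"[a-z0-9_]+", text.lower())


def keyword_score(query, doc):
    """Full-text stand-in: fraction of query terms that appear in the document."""
    q_terms = set(tokenize(query))
    d_terms = set(tokenize(doc))
    return len(q_terms & d_terms) / len(q_terms) if q_terms else 0.0


def cosine_score(query, doc):
    """Embedding stand-in: cosine similarity of bag-of-words count vectors."""
    q, d = Counter(tokenize(query)), Counter(tokenize(doc))
    dot = sum(q[t] * d[t] for t in q)
    norm = math.sqrt(sum(v * v for v in q.values())) * math.sqrt(
        sum(v * v for v in d.values())
    )
    return dot / norm if norm else 0.0


def rerank(query, candidates):
    """Hypothetical re-ranker: boost files whose path mentions a query term."""
    q_terms = set(tokenize(query))

    def boosted(item):
        path, score = item
        bonus = 0.5 if q_terms & set(tokenize(path)) else 0.0
        return score + bonus

    return sorted(candidates, key=boosted, reverse=True)


def retrieve(query, repo, top_k=2):
    """Blend both scores (assumed 50/50 weights), shortlist, then re-rank."""
    scored = [
        (path, 0.5 * keyword_score(query, text) + 0.5 * cosine_score(query, text))
        for path, text in repo.items()
    ]
    shortlist = sorted(scored, key=lambda item: item[1], reverse=True)[:top_k]
    return [path for path, _ in rerank(query, shortlist)]


# Toy repository invented for this example.
REPO = {
    "billing/invoice.py": "def total(invoice): return sum(line.amount for line in invoice.lines)",
    "auth/login.py": "def login(user, password): return check_password(user, password)",
    "auth/session.py": "def refresh(session): session.expires = now() + TTL",
}

if __name__ == "__main__":
    print(retrieve("login password check", REPO))
```

In this toy setup, the query "login password check" ranks auth/login.py first, and the files returned would be the context handed to the language model. A production system would presumably replace each stand-in with a real search index, an embedding model, and a learned re-ranker.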
‘Coffee Mode’: How developers can finally take breaks while AI writes their unit tests
Perhaps the most attention-grabbing feature is what Zencoder calls “Coffee Mode,” which allows developers to step away while the AI agents work autonomously.
“You can literally hit that button and go grab a coffee, and the agent will do that work by itself,” Filev told VentureBeat. “As we like to say in the company, you can watch forever the waterfall, the fire burning, and the agent working in coffee mode.”
The feature can be applied to both writing code and generating unit tests — with the latter proving particularly valuable since many developers prefer creating new features over writing test coverage.
“I’ve not seen a developer who’s like, ‘Oh my God, I want to write a bunch of tests for my code,’” Filev said. “They typically like creating stuff, and test is kind of supporting the creation, rather than the process of creation.”
Zencoder’s launch comes at a critical moment when developers and companies are navigating how to effectively integrate AI coding tools into existing workflows. The industry landscape includes skeptics who point to AI’s limitations in producing production-ready code and enthusiasts who overestimate its capabilities.
“There’s a lot of right now, a lot of emotion, pent up emotion on the AI side of things,” Filev observed. “You see people in both camps, like one of them saying, ‘hey, it’s the best thing since sliced bread, I’m gonna vibe code my next Salesforce.’ And then you have the naysayers that are trying to prove that they’re still the smartest kids on the block… trying to find the scenarios where it breaks.”
Filev advocates a more measured approach, viewing AI coding tools as sophisticated instruments requiring proper skill to utilize effectively. “It is a tool. It is a sophisticated tool, very powerful tool. And so engineers need to build skills around using that. It’s not yet to the point where it’s a replacement for an engineer in at least large, complex enterprise projects.”
The roadmap: Production-ready AI code generation with built-in security checks
Looking ahead, Zencoder plans to continue improving its agents’ performance on benchmarks while expanding support across more programming languages and focusing on production-ready code generation with built-in testing and security checks.
“What you will see through the rest of the year, a big chunk of it will be focused on making sure that the software that we create for you and with you, you have some confidence in it,” Filev said. “We want to make sure that that code is reviewed by AI or by your CI/CD tools, that hosted code is tested either by your CI/CD or by AI, that you know there are no obvious security vulnerabilities.”
Filev predicts dramatic changes in the software development landscape before the end of 2025: “I am confident that the software industry will look very different by the end of this year, and that this whole category will take another turn… Before the calendar ends, so in the next nine months, we will see another generation of AI coding assistants, AI coding agents.”
The company offers three pricing tiers: a free basic version, a $19 per user per month Business tier with advanced coding and testing features, and an Enterprise tier at $39 per user per month that includes premium support and compliance features.
For an industry still debating whether AI will replace developers or merely augment them, Zencoder’s approach suggests a third path: AI that meets developers where they are, helps them skip the tedious parts, and lets them enjoy their coffee in peace.
