Zencoder’s ‘Coffee Mode’ is the future of coding: Hit a button and let AI write your unit tests

Zencoder unveils its next-generation AI coding and unit testing agents today, positioning the San Francisco-based company as a formidable challenger to established players like GitHub Copilot and newcomers like Cursor.

The company, founded by former Wrike CEO Andrew Filev, integrates its AI agents directly into popular development environments including Visual Studio Code and JetBrains IDEs, and offers deep integrations with Jira, GitHub, GitLab, Sentry, and more than 20 other development tools.

“We started with the thesis that transformers are powerful computing building blocks, but if you put them in a more agentic environment, you can get much more out of them,” said Filev in an exclusive interview with VentureBeat. “By agentic, I mean two key things: first, giving the AI feedback so it can improve its work, and second, equipping it with tools. Just like human intelligence, AI becomes significantly more capable when it has the right tools at its disposal.”
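
To make Filev's two ingredients concrete, the following is a minimal sketch of a feedback-plus-tools loop. It is not Zencoder's implementation: `toy_model` is a hypothetical stand-in for a language model call, and `run_tests` is a hypothetical tool that executes a candidate and reports what went wrong.

```python
from typing import Callable, Dict, Optional, Tuple


def run_tests(source: str) -> Tuple[bool, str]:
    """Hypothetical tool: execute candidate code and run one check against it."""
    namespace: dict = {}
    try:
        exec(source, namespace)
        result = namespace["add"](2, 3)
    except Exception as exc:
        return False, f"error: {exc!r}"
    if result != 5:
        return False, f"add(2, 3) returned {result}, expected 5"
    return True, "all checks passed"


def toy_model(feedback: str) -> str:
    """Stand-in for an LLM: the first draft is buggy, the revision uses feedback."""
    if "expected 5" in feedback:
        return "def add(a, b):\n    return a + b\n"
    return "def add(a, b):\n    return a - b\n"


def run_agent(
    model: Callable[[str], str],
    tools: Dict[str, Callable[[str], Tuple[bool, str]]],
    max_steps: int = 3,
) -> Tuple[Optional[str], int]:
    """Propose, check with a tool, and feed the result back until it passes."""
    feedback = ""
    for step in range(1, max_steps + 1):
        candidate = model(feedback)
        ok, feedback = tools["run_tests"](candidate)
        if ok:
            return candidate, step
    return None, max_steps


if __name__ == "__main__":
    code, steps = run_agent(toy_model, {"run_tests": run_tests})
    print(f"accepted after {steps} step(s):\n{code}")
```

In this toy run the first draft fails the check, the failure message is passed back to the model, and the second draft is accepted. A real agent would swap in an actual model call and a real test runner.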

Why developers won’t need to abandon their favorite IDEs for AI assistance

Several AI coding assistants have emerged in the past year, but Zencoder’s approach distinguishes itself by operating within existing workflows rather than requiring developers to switch platforms.

“Our main competitor is Cursor. Cursor is its own development environment versus we deliver the same very powerful agentic capabilities, but within existing development environments,” Filev told VentureBeat. “For some developers, it doesn’t really matter. But for some developers, they either want or have to stick to their existing environments.”

This distinction matters particularly for enterprise developers working in Java and C#, languages for which specialized IDEs like JetBrains’ IntelliJ and Rider offer more robust support than generalized environments.

How Zencoder’s AI agents are beating state-of-the-art benchmarks by double-digit margins

The company claims significant performance advantages over competitors, backed by results on standard industry benchmarks. According to Filev, Zencoder’s agents can solve 63% of issues on the SWE-Bench Verified benchmark, placing it among the top three performers despite using a more practical single-trajectory approach rather than running multiple parallel attempts like some research-focused systems.

“Our agent is distinctive because we’re focused on building the best pipeline for real-world developer use,” Filev said. “What makes our approach special is that our agent operates on what we call a single track, single trajectory basis. For a single trajectory agent to successfully resolve 63% of these complex issues is remarkably impressive.”
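
A rough way to see why a single-trajectory score is the stricter measure: if each attempt were independent and succeeded with probability p, the chance that at least one of k attempts succeeds is 1 - (1 - p)^k. The sketch below is only that arithmetic. The independence assumption, and the assumption that a system could reliably pick out the successful attempt, are simplifications that do not come from Zencoder or the benchmark authors.

```python
def at_least_one_success(p: float, k: int) -> float:
    """Probability that at least one of k independent attempts succeeds."""
    if not 0.0 <= p <= 1.0 or k < 1:
        raise ValueError("p must be in [0, 1] and k must be at least 1")
    return 1.0 - (1.0 - p) ** k


if __name__ == "__main__":
    # Illustrative only: 0.63 is the single-trajectory rate quoted in the article.
    for attempts in (1, 2, 3):
        print(attempts, round(at_least_one_success(0.63, attempts), 3))
```

Under these idealized assumptions, a system that is allowed several attempts per issue can report a noticeably higher solve rate than the same system limited to one, which is why the two kinds of results are not directly comparable.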

Even more notable, the company reports approximately 30% success on the newer SWE-Bench Multimodal benchmark, which Filev claims is double the previous best result of less than 15%. On OpenAI’s recently introduced SWE-Lancer IC Diamond benchmark, Zencoder reports more than 30% success — over 20% better than OpenAI’s own best result.

The secret sauce: ‘Repo Grokking’ technology that understands your entire codebase

Zencoder’s performance stems from its proprietary “Repo Grokking” technology, which analyzes and interprets large codebases to provide critical context to the AI agents.

“All of these agents have distinct capabilities shaped by the language models embedded within them,” Filev explained. “Whether it’s a frontier model or an open source model, the LLM by itself knows nothing about your specific project in the vast majority of scenarios. It can only work with the context that’s provided to it.”

Zencoder’s approach combines multiple techniques beyond simple AI embeddings for semantic search. “It uses traditional full text search, it uses custom re-ranker, it uses LLM, it uses synthetic information. So it does a lot of things to build the best understanding of the customer repositories,” Filev said.
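
The general idea of mixing retrieval signals can be sketched in a few lines. This is an illustration of hybrid search in general, not a description of Repo Grokking: the bag-of-words cosine score stands in for a real embedding model, the path-based re-ranker is a toy heuristic, and every function name here is hypothetical.

```python
import math
import re
from collections import Counter
from typing import Dict, List


def tokenize(text: str) -> List[str]:
    return re.findall(r"[a-z0-9_]+", text.lower())


def keyword_score(query: str, doc: str) -> int:
    """Full-text style score: occurrences of query terms in the document."""
    counts = Counter(tokenize(doc))
    return sum(counts[term] for term in set(tokenize(query)))


def cosine_score(query: str, doc: str) -> float:
    """Stand-in for embedding similarity: cosine over bag-of-words vectors."""
    q, d = Counter(tokenize(query)), Counter(tokenize(doc))
    dot = sum(q[t] * d[t] for t in q)
    norm = math.sqrt(sum(v * v for v in q.values())) * math.sqrt(
        sum(v * v for v in d.values())
    )
    return dot / norm if norm else 0.0


def rerank(query: str, candidates: List[str], docs: Dict[str, str]) -> List[str]:
    """Toy re-ranker: prefer files whose path mentions a query term."""
    terms = set(tokenize(query))

    def key(path: str):
        return (len(terms & set(tokenize(path))), cosine_score(query, docs[path]))

    return sorted(candidates, key=key, reverse=True)


def hybrid_search(query: str, docs: Dict[str, str], top_k: int = 2) -> List[str]:
    """Union the top hits from both retrievers, then re-rank the merged list."""
    by_keyword = sorted(
        docs, key=lambda p: keyword_score(query, docs[p]), reverse=True
    )[:top_k]
    by_vector = sorted(
        docs, key=lambda p: cosine_score(query, docs[p]), reverse=True
    )[:top_k]
    candidates = list(dict.fromkeys(by_keyword + by_vector))
    return rerank(query, candidates, docs)[:top_k]


if __name__ == "__main__":
    repo = {
        "auth/login.py": "def login(user, password): check password hash",
        "billing/invoice.py": "def create_invoice(order): total amount",
        "README.md": "project overview",
    }
    print(hybrid_search("login password", repo, top_k=1))
```

The design point is that each retriever catches things the other misses: exact identifiers favor full-text search, paraphrased intent favors semantic similarity, and a re-ranker decides which of the merged candidates the model actually sees.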

This contextual understanding helps the system avoid a common criticism of AI coding assistants—that they introduce more problems than they solve by misunderstanding project structures or dependencies.

‘Coffee Mode’: How developers can finally take breaks while AI writes their unit tests

Perhaps the most attention-grabbing feature is what Zencoder calls “Coffee Mode,” which allows developers to step away while the AI agents work autonomously.

“You can literally hit that button and go grab a coffee, and the agent will do that work by itself,” Filev told VentureBeat. “As we like to say in the company, you can watch forever the waterfall, the fire burning, and the agent working in coffee mode.”

The feature can be applied to both writing code and generating unit tests — with the latter proving particularly valuable since many developers prefer creating new features over writing test coverage.

“I’ve not seen a developer who’s like, ‘Oh my God, I want to write a bunch of tests for my code,’” Filev said. “They typically like creating stuff, and test is kind of supporting the creation, rather than the process of creation.”

Zencoder’s launch comes at a critical moment when developers and companies are navigating how to effectively integrate AI coding tools into existing workflows. The industry landscape includes skeptics who point to AI’s limitations in producing production-ready code and enthusiasts who overestimate its capabilities.

“There’s a lot of right now, a lot of emotion, pent up emotion on the AI side of things,” Filev observed. “You see people in both camps, like one of them saying, ‘hey, it’s the best thing since sliced bread, I’m gonna vibe code my next Salesforce.’ And then you have the naysayers that are trying to prove that they’re still the smartest kids on the block… trying to find the scenarios where it breaks.”

Filev advocates a more measured approach, viewing AI coding tools as sophisticated instruments requiring proper skill to utilize effectively. “It is a tool. It is a sophisticated tool, very powerful tool. And so engineers need to build skills around using that. It’s not yet to the point where it’s a replacement for an engineer in at least large, complex enterprise projects.”

The roadmap: Production-ready AI code generation with built-in security checks

Looking ahead, Zencoder plans to continue improving its agents’ performance on benchmarks while expanding support across more programming languages and focusing on production-ready code generation with built-in testing and security checks.

“What you will see through the rest of the year, a big chunk of it will be focused on making sure that the software that we create for you and with you, you have some confidence in it,” Filev said. “We want to make sure that that code is reviewed by AI or by your CI/CD tools, that hosted code is tested either by your CI/CD or by AI, that you know there are no obvious security vulnerabilities.”

Filev predicts dramatic changes in the software development landscape before the end of 2025: “I am confident that the software industry will look very different by the end of this year, and that this whole category will take another turn… Before the calendar ends, so in the next nine months, we will see another generation of AI coding assistance, AI coding agents.”

The company offers three pricing tiers: a free basic version, a $19 per user per month Business tier with advanced coding and testing features, and an Enterprise tier at $39 per user per month that includes premium support and compliance features.

For an industry still debating whether AI will replace developers or merely augment them, Zencoder’s approach suggests a third path: AI that meets developers where they are, helps them skip the tedious parts, and lets them enjoy their coffee in peace.


MAD Chairs: A new tool to evaluate AI

This is a linkpost for https://arxiv.org/abs/2503.20986v2

IBM Cloud speeds AI workloads with Intel Gaudi 3 accelerators

For businesses that need more control over their AI development, IBM says they can deploy IBM watsonx.ai software with the Intel Gaudi 3-based virtual server on IBM Cloud VPC in Q2 2025. IBM watsonx.ai includes an end-to-end AI development studio, AI developer toolkit and full AI lifecycle management for developing AI services

Nvidia’s Blackwell raises the bar with new MLPerf Inference V5.0 results

Fifteen partners, including Cisco, Fujitsu, Hewlett Packard Enterprise, Dell Technologies, Oracle, and Google Cloud, were involved in the latest round of MLPerf testing, which, he noted, was the largest number of Nvidia partners submitting to the benchmark in any given round. When asked about his overall impression on the latest

Palo Alto VP Jordi Botifoll: ‘You can’t play with cybersecurity’

Palo Alto has boosted this effort in recent years with the integration of precision artificial intelligence that includes machine learning and deep learning techniques, in addition to generative AI tools. “Our strategy is to ensure that the time to detect an attack and the time to resolve it (if it

District Judge’s Gulf Ruling Has Major Implications on Future Lease Sales

A U.S. district judge’s recent ruling against the sale of Gulf oil and gas drilling rights has major implications on future sales, according to Ellen R. Wald, the President of Transversal Consulting. “It means that environmental groups can bring lawsuits against the Interior Department just because they don’t agree with a certain aspect of their environmental impact statement,” Wald told Rigzone. The Transversal Consulting President outlined to Rigzone that interpretations of the National Environmental Policy Act, a law from 1970, have come to include “things like energy market forecasts for 30 years in the future” and said the Trump administration has an opportunity to change this. Wald pointed out to Rigzone that the administration can do this “either by overhauling the judiciary, by changing the National Environmental Policy Act to be more specific about what must be included in an environmental impact statement so that it is not open to interpretation by the judiciary, or by taking the case to higher level courts (potentially even to the Supreme Court) where they could rule that the lower court was wrong in its interpretation of the National Environmental Policy Act”. “Lower courts are supposed to follow the rulings made by higher courts and they should dismiss future cases made on these grounds,” Wald said. “Ultimately if the Supreme Court rules on it, then the issue should be put to rest. Otherwise this issue will continue to plague oil and gas development in the United States,” Wald added. Wald told Rigzone that current activity in the Gulf will not be impacted by the ruling, “only activity on the leases in question”. “It isn’t clear whether companies started drilling operations or not or whether any injunctions were issued,” Wald said. The Transversal Consulting President went on to tell Rigzone that she doesn’t see how the Trump administration can get around the ruling other

UK government to profit from nuclear fusion fund

The UK government has invested £20 million into a UK nuclear fusion investment fund, which is expected to leverage up to £100m of private investment. Energy secretary Ed Miliband said: “This government is taking back control of Britain’s energy by driving for clean homegrown power through our Plan for Change. “Fusion has the potential to provide us with energy security, whilst attracting the best technologies to our shores and training up the next generation of British scientists and engineers. We are backing both nuclear and fusion power, and today we take a step forward in growing this exciting industry.” The Department for Energy Security and Net Zero (DESNZ) said on Thursday that it is investing capital to unlock private sector investment and help the nascent sector scale up. It said government will receive a share of any returns made by the partnership. The department argued that successful deployment of nuclear fusion energy would be “globally transformative” and would allow the UK to export the technology to a global market that is expected to be worth trillions of pounds. It said the funding was allocated from the government’s existing research and development budget for the year 2024 to 2025. The government’s cornerstone investment in nuclear fusion will effectively “kickstart” the Starmaker One investment fund, a private vehicle set up to enable start-ups and businesses to commercialise and grow. It is the first early-stage fusion energy venture capital fund to be formed outside the US and the first of its kind to partner with the government as an investor. The energy department said it has invested £410m, unveiled in January, into UK nuclear fusion research to spur collaboration with other countries to drive economic growth through developing clean and “unlimited” power. The fusion fund is structured as a limited partnership, in which the UK government is

EIP Raises Stake in Eni Renewables Arm to 10 Percent

Energy Infrastructure Partners (EIP) has injected about EUR 209 million ($230.42 million) in additional capital into Eni Plenitude SpA Società Benefit, increasing its stake to 10 percent. Including EUR 588 million paid March 2024, EIP’s investment in Eni SpA’s renewables arm now totals about EUR 800 million. “The transaction confirms a post-money equity value of Plenitude of around EUR 8 billion and an enterprise value of over EUR 10 billion”, Italian state-backed integrated energy company Eni said in an online statement. EIP partner Tim Marahrens said, “Our increased commitment to Plenitude reflects our confidence in its unique integrated model, which combines renewable generation, retail energy solutions and e-mobility at scale”. “Over the past year, Plenitude has demonstrated its ability to exceed targets and capitalize on the accelerating energy transition”, Marahrens added. Plenitude’s installed generation capacity from renewable sources rose to 4 GW last year, meeting a goal Eni outlined in its 2024-27 plan published March 14, 2024. Plenitude plans to reach over 8 GW of installed renewable energy capacity by 2027, and 15 GW by 2030. Currently it is active in over 15 countries. In Europe, it counts more than 10 million energy and energy solutions clients, as well as over 21,000 electric vehicle charging points, Eni said. Recently KKR & Co. Inc. completed the purchase of a 25 percent stake in another Eni company, biofuels developer Enilive. That is to be raised to 30 percent after the conclusion of a later deal. “The overall proceeds for Eni group, after accounting for cash adjustments and other items, amount to 2.967 billion euros [$3.2 billion], including a capital increase in Enilive of 500 million euros to support the company’s growth plan”, Eni said in a press release March 6. “Enilive, with its integrated business model, represents a prime example of the progress of the business satellite

Breaking the silence: Mental health in the workplace

IntrospeXion recognises that mental health is not just a personal struggle but a universal challenge, affecting every industry. Remote personnel, in particular, experience extreme psychological stress due to isolation, high-risk environments and limited access to support. While physical safety is a priority across all sectors, mental health remains a silent crisis that requires urgent attention. The harsh reality: What the data tells us Recent assessments conducted by IntrospeXion and many other industry experts, have revealed alarming trends within the Energy sector in the North Sea: 75% of all suicides are men. Increase in cases of domestic abuse for remote personnel, predominantly emotional and psychological in nature. 25% of reported domestic abuse cases involve male victims. Offshore workers are 15x more likely to attempt suicide. Burnout among young workers: 1/3 of Gen-Z workers (aged 18-24) took time off in 2024 due to stress, indicating a generational divide in resilience and coping with workplace pressures. 67% reported fatigue and sleep disruption, impacting cognitive function and overall wellbeing. Extrapolating these figures across the 40,000 offshore workers in the North Sea, the mental health crisis in this sector is undeniable. Yet, support structures remain inadequate, and stigma prevents many from seeking help. The role of industry leaders HR departments and industry leaders must take proactive ownership of offshore mental health. Traditional Employee Assistance Programs (EAPs) have just a 4% uptake, proof that conventional approaches are failing this workforce. IntrospeXion: Delivering real change IntrospeXion was founded by an industry psychologist to bridge the gap between mental health needs and meaningful intervention. Their team of qualified Psychologists provide onshore and offshore mental health solutions, which include: On-site Psychological Support: Routine mental health check-ins with qualified professionals. 
Crisis Intervention Training: Equipping managers with the skills to spot early warning signs and respond effectively. Cognitive Behavioural Stress Inoculation

BP, Repsol Start Production at Cypre Gas Project in Trinidad

BP PLC said Monday the Cypre field in Trinidad and Tobago is now producing and is expected to deliver about 250 million standard cubic feet a day (MMscfd) of gas at peak. The project, located 78 kilometers (48.47 miles) off the southeast coast of Trinidad in a water depth of about 80 meters (262.47 feet) according to BP, is part of the East Mayaro Block. Cypre is wholly owned by BP Trinidad and Tobago LLC (BPTT), a joint venture owned 70 percent by Britain’s BP and 30 percent by Spain’s Repsol SA. Cypre is one of 10 projects that BP aims to start up between 2025 and 2027. “Production from Cypre will make a significant contribution towards the 250,000 barrels of oil equivalent per day combined peak net production expected from these 10 projects”, it said in an online statement. BPTT’s third subsea development, Cypre will have 7 wells tied back to BPTT’s existing Jupiter platform. Phase 1, consisting of 4 wells, was completed at the end of 2024. “The second phase is expected to commence in the second half of this year”, BP said. BPTT president David Campbell said, “Cypre is another key milestone in bpTT’s strategy to maximize production from our shallow water acreage using existing infrastructure”. “The project not only reinforces our commitment to maintaining production but also plays a crucial role in satisfying our existing gas supply commitments”, Campbell added. “Cypre represents a significant investment in the country’s energy sector”. William Lin, BP executive vice president for gas and low-carbon energy, said, “The second of 10 major projects across our global portfolio that we expect to start up by 2027, Cypre is also the first of a series of projects we will be bringing online in Trinidad to deliver gas to the nation and add value for

USA Crude Oil Inventories Rise More Than 6MM Barrels Week on Week

U.S. commercial crude oil inventories, excluding those in the Strategic Petroleum Reserve (SPR), increased by 6.2 million barrels from the week ending March 21 to the week ending March 28, the U.S. Energy Information Administration (EIA) highlighted in its latest weekly petroleum status report. That report was released on April 2 and included data for the week ending March 28. The EIA report showed that crude oil stocks, not including the SPR, stood at 439.8 million barrels on March 28, 433.6 million barrels on March 21, and 451.4 million barrels on March 29, 2024. Crude oil in the SPR stood at 396.4 million barrels on March 28, 396.1 million barrels on March 21, and 363.6 million barrels on March 29, 2024, the report outlined. Total petroleum stocks – including crude oil, total motor gasoline, fuel ethanol, kerosene type jet fuel, distillate fuel oil, residual fuel oil, propane/propylene, and other oils – stood at 1.605 billion barrels on March 28, the report showed. Total petroleum stocks were up 5.6 million barrels week on week and up 27.2 million barrels year on year, the report revealed. “At 439.8 million barrels, U.S. crude oil inventories are about four percent below the five year average for this time of year,” the EIA said in its report. “Total motor gasoline inventories decreased by 1.6 million barrels from last week and are two percent above the five year average for this time of year. Finished gasoline inventories increased and blending components inventories decreased last week,” it added. “Distillate fuel inventories increased by 0.3 million barrels last week and are about six percent below the five year average for this time of year. Propane/propylene inventories increased by 1.0 million barrels from last week and are eight percent below the five year average for this time of year,”

New MLCommons benchmarks to test AI infrastructure performance

The latest release also broadens its scope beyond chatbot benchmarks. A new graph neural network (GNN) test targets datacenter-class hardware and is designed for workloads like fraud detection, recommendation engines, and knowledge graphs. It uses the RGAT model based on a graph dataset containing over 547 million nodes and 5.8 billion edges. Judging performance Analysts suggest that these benchmarks will make it easier to judge the performance of various hardware chips and clusters based on documented models. “As every chipmaker seeks to prove that its hardware is good enough to support AI, we now have a standard benchmark that shows the quality of question support, math, and coding skills associated with hardware,” said Hyoun Park, CEO and Chief Analyst at Amalgam Insights. Chipmakers can now compete not just on traditional speeds and feeds, but in mathematical skill and informational accuracy. This benchmark provides a rare opportunity to add new performance standards on cross-vendor hardware, Park added. “The latency in terms of how quickly tokens are delivered and the time for the user to see the response is the deciding factor,” said Neil Shah, partner and co-founder at Counterpoint Research. “This is where players such as NVIDIA, AMD, and Intel have to get the software right to help developers optimize the models and bring out the best compute performance.” Benchmarking and buying decisions Independent benchmarks like those from MLCommons play a key role in helping buyers evaluate system performance, but relying on them alone may not provide the full picture.

Potential Nvidia chip shortage looms as Chinese customers rush to beat US sales ban

Will it lead to shortages? The US first placed export controls on chips sent to China in October 2022 as a means to slow the country’s technological advances. It blocked the sale of Nvidia’s A100 and H100 chips, leading the company to develop the less powerful A800 and H800 chips for the market; they were also subsequently banned. There was a surge in demand for the H20 following the arrival of Chinese startup DeepSeek’s ultra low-cost, open-source AI model in January. And while the H20 is reported to be 15 times slower than Nvidia’s newest Blackwell chips sold elsewhere in the world, it was designed specifically by Nvidia to comply with the further US export controls introduced in October 2023. It is being used by Chinese companies for training, although it’s billed as an inference chip, explained Matt Kimball, VP and principal analyst for datacenter compute and storage at Moor Insights & Strategy. Should Nvidia choose to focus its efforts on manufacturing more of the chips, Kimball said he doesn’t think it will impact supply in the US and Europe, as Blackwell is the main product sold in those markets and H20 is an N-1 Hopper architecture chip. “If you take this a step further and ask whether this large order slows down the production of chips destined for the US and Europe, I’d say the answer is no, as the Hopper family is built on a different process node than the Blackwell family,” he said. Still, Kimball noted, “supply chain management is difficult, especially for smaller organizations that are put to the back of the line as hyperscalers with multibillion dollar orders are first in line for the newest [chips].”

European cloud group invests to create what it dubs “Trump-proof cloud services”

But analysts have questioned whether the Microsoft move truly addresses those European business concerns. Phil Brunkard, executive counselor at Info-Tech Research Group UK, said, commenting on last month’s announcement of the EU Data Boundary for the Microsoft Cloud, “Microsoft says that customer data will remain stored and processed in the EU and EFTA, but doesn’t guarantee true data sovereignty.” And European companies are now rethinking what data sovereignty means to them. They are moving beyond having it refer to where the data sits to focusing on which vendors control it, and who controls them. Responding to the new Euro cloud plan, another analyst, IDC VP Dave McCarthy, saw the effort as “signaling a growing European push for data control and independence.” “US providers could face tougher competition from EU companies that leverage this tech to offer sovereignty-friendly alternatives. Although €1 million isn’t a game-changer on its own, it’s a clear sign Europe wants to build its own cloud ecosystem—potentially at the expense of US market share,” McCarthy said. “For US providers, this could mean investing in more EU-based data centers or reconfiguring systems to ensure European customers’ data stays within the region. This isn’t just a compliance checkbox. It’s a shift that could hike operational costs and complexity, especially for companies used to running centralized setups.” Adding to the potential bad news for US hyperscalers, McCarthy said that there was little reason to believe that this trend would be limited to Europe. “If Europe pulls this off, other regions might take note and push for similar sovereignty rules. US providers could find themselves adapting to a patchwork of regulations worldwide, forcing a rethink of their global strategies,” McCarthy said. “This isn’t just a European headache, it’s a preview of what could become a broader challenge.”

Talent gap complicates cost-conscious cloud planning

The top strategy so far is what one enterprise calls the “Cloud Team.” You assemble all your people with cloud skills, and your own best software architect, and have the team examine current and proposed cloud applications, looking for a high-level approach that meets business goals. In this process, the team tries to avoid implementation specifics, focusing instead on the notion that a hybrid application has an agile cloud side and a governance-and-sovereignty data center side, and what has to be done is push functionality into the right place. The Cloud Team supporters say that an experienced application architect can deal with the cloud in abstract, without detailed knowledge of cloud tools and costs. For example, the architect can assess the value of using an event-driven versus transactional model without fixating on how either could be done. The idea is to first come up with approaches. Then, developers could work with cloud providers to map each approach to an implementation, and assess the costs, benefits, and risks. Ok, I lied about this being the top strategy—sort of, at least. It’s the only strategy that’s making much sense. The enterprises all start their cloud-reassessment journey on a different tack, but they agree it doesn’t work. The knee-jerk approach to cloud costs is to attack the implementation, not the design. What cloud features did you pick? Could you find ones that cost less? Could you perhaps shed all the special features and just host containers or VMs with no web services at all? Enterprises who try this, meaning almost all of them, report that they save less than 15% on cloud costs, a rate of savings that means roughly a five-year payback on the costs of making the application changes…if they can make them at all. Enterprises used to build all of

Lightmatter launches photonic chips to eliminate GPU idle time in AI data centers

“Silicon photonics can transform HPC, data centers, and networking by providing greater scalability, better energy efficiency, and seamless integration with existing semiconductor manufacturing and packaging technologies,” Jagadeesan added. “Lightmatter’s recent announcement of the Passage L200 co-packaged optics and M1000 reference platform demonstrates an important step toward addressing the interconnect bandwidth and latency between accelerators in AI data centers.” The market timing appears strategic, as enterprises worldwide face increasing computational demands from AI workloads while simultaneously confronting the physical limitations of traditional semiconductor scaling. Silicon photonics offers a potential path forward as conventional approaches reach their limits. Practical applications For enterprise IT leaders, Lightmatter’s technology could impact several key areas of infrastructure planning. AI development teams could see significantly reduced training times for complex models, enabling faster iteration and deployment of AI solutions. Real-time AI applications could benefit from lower latency between processing units, improving responsiveness for time-sensitive operations. Data centers could potentially achieve higher computational density with fewer networking bottlenecks, allowing more efficient use of physical space and resources. Infrastructure costs might be optimized by more efficient utilization of expensive GPU resources, as processors spend less time waiting for data and more time computing. These benefits would be particularly valuable for financial services, healthcare, research institutions, and technology companies working with large-scale AI deployments. Organizations that rely on real-time analysis of large datasets or require rapid training and deployment of complex AI models stand to gain the most from the technology. 
“Silicon photonics will be a key technology for interconnects across accelerators, racks, and data center fabrics,” Jagadeesan pointed out. “Chiplets and advanced packaging will coexist and dominate intra-package communication. The key aspect is integration, that is companies who have the potential to combine photonics, chiplets, and packaging in a more efficient way will gain competitive advantage.”

Silicon Motion rolls SSD kit to bolster AI workload performance

The kit utilizes the PCIe Dual Ported enterprise-grade SM8366 controller with support for PCIe Gen 5 x4 NVMe 2.0 and OCP 2.5 data center specifications. The 128TB SSD RDK also supports NVMe 2.0 Flexible Data Placement (FDP), a feature that allows advanced data management and improved SSD write efficiency and endurance. “Silicon Motion’s MonTitan SSD RDK offers a comprehensive solution for our customers, enabling them to rapidly develop and deploy enterprise-class SSDs tailored for AI data center and edge server applications.” said Alex Chou, senior vice president of the enterprise storage & display interface solution business at Silicon Motion. Silicon Motion doesn’t make drives, rather it makes reference design kits in different form factors that its customers use to build their own product. Its kits come in E1.S, E3.S, and U.2 form factors. The E1.S and U.2 forms mirror the M.2, which looks like a stick of gum and installs on the motherboard. There are PCI Express enclosures that hold four to six of those drives and plug into one card slot and appear to the system as a single drive.

Microsoft will invest $80B in AI data centers in fiscal 2025

And Microsoft isn’t the only one that is ramping up its investments in AI-enabled data centers. Rival cloud service providers are all investing in either upgrading or opening new data centers to capture a larger chunk of business from developers and users of large language models (LLMs). In a report published in October 2024, Bloomberg Intelligence estimated that demand for generative AI would push Microsoft, AWS, Google, Oracle, Meta, and Apple to devote $200 billion between them to capex in 2025, up from $110 billion in 2023. Microsoft is one of the biggest spenders, followed closely by Google and AWS, Bloomberg Intelligence said. Its estimate of Microsoft’s capital spending on AI, at $62.4 billion for calendar 2025, is lower than Smith’s claim that the company will invest $80 billion in the fiscal year to June 30, 2025. Both figures, though, are far higher than Microsoft’s 2020 capital expenditure of “just” $17.6 billion. The majority of the increased spending is tied to cloud services and the expansion of AI infrastructure needed to provide compute capacity for OpenAI workloads. Separately, last October Amazon CEO Andy Jassy said his company planned total capex spend of $75 billion in 2024 and even more in 2025, with much of it going to AWS, its cloud computing division.

John Deere unveils more autonomous farm machines to address skilled labor shortage

Self-driving tractors might be the path to self-driving cars. John Deere has revealed a new line of autonomous machines and tech across agriculture, construction and commercial landscaping. Moline, Illinois-based John Deere has been in business for 187 years, yet it has become a regular at the big tech trade show in Las Vegas as a non-tech company showing off technology, and it is back at CES 2025 with more autonomous tractors and other vehicles. This is not something we usually cover, but John Deere has a lot of data that is interesting in the big picture of tech.

The message from the company is that there aren’t enough skilled farm laborers to do the work that its customers need. It’s been a challenge for most of the last two decades, said Jahmy Hindman, CTO at John Deere, in a briefing. Much of the tech will come this fall and after that. He noted that the average farmer in the U.S. is over 58 and works 12 to 18 hours a day to grow food for us. And he said the American Farm Bureau Federation estimates there are roughly 2.4 million farm jobs that need to be filled annually, and the agricultural work force continues to shrink. (This is my hint to the anti-immigration crowd.)

John Deere’s autonomous 9RX Tractor. Farmers can oversee it using an app.

While each of these industries experiences its own set of challenges, a commonality across all is skilled labor availability. In construction, about 80% of contractors struggle to find skilled labor. And in commercial landscaping, 86% of landscaping business owners can’t find labor to fill open positions, he said. “They have to figure out how to do

2025 playbook for enterprise AI success, from agents to evals

2025 is poised to be a pivotal year for enterprise AI. The past year has seen rapid innovation, and this year will see the same. This has made it more critical than ever to revisit your AI strategy to stay competitive and create value for your customers. From scaling AI agents to optimizing costs, here are the five critical areas enterprises should prioritize for their AI strategy this year.

1. Agents: the next generation of automation

AI agents are no longer theoretical. In 2025, they’re indispensable tools for enterprises looking to streamline operations and enhance customer interactions. Unlike traditional software, agents powered by large language models (LLMs) can make nuanced decisions, navigate complex multi-step tasks, and integrate seamlessly with tools and APIs.

At the start of 2024, agents were not ready for prime time, making frustrating mistakes like hallucinating URLs. They started getting better as frontier large language models themselves improved.

“Let me put it this way,” said Sam Witteveen, cofounder of Red Dragon, a company that develops agents for companies, and that recently reviewed the 48 agents it built last year. “Interestingly, the ones that we built at the start of the year, a lot of those worked way better at the end of the year just because the models got better.” Witteveen shared this in the video podcast we filmed to discuss these five big trends in detail.

Models are getting better and hallucinating less, and they’re also being trained to do agentic tasks. Another feature that the model providers are researching is a way to use the LLM as a judge, and as models get cheaper (something we’ll cover below), companies can use three or more models to

OpenAI’s red teaming innovations define new essentials for security leaders in the AI era

OpenAI has taken a more aggressive approach to red teaming than its AI competitors, demonstrating its security teams’ advanced capabilities in two areas: multi-step reinforcement and external red teaming. OpenAI recently released two papers that set a new competitive standard for improving the quality, reliability and safety of AI models in these two techniques and more.

The first paper, “OpenAI’s Approach to External Red Teaming for AI Models and Systems,” reports that specialized teams outside the company have proven effective in uncovering vulnerabilities that might otherwise have made it into a released model because in-house testing techniques may have missed them.

In the second paper, “Diverse and Effective Red Teaming with Auto-Generated Rewards and Multi-Step Reinforcement Learning,” OpenAI introduces an automated framework that relies on iterative reinforcement learning to generate a broad spectrum of novel, wide-ranging attacks.

Going all-in on red teaming pays practical, competitive dividends

It’s encouraging to see competitive intensity in red teaming growing among AI companies. When Anthropic released its AI red team guidelines in June of last year, it joined AI providers including Google, Microsoft, Nvidia, OpenAI, and even the U.S.’s National Institute of Standards and Technology (NIST), which all had released red teaming frameworks.

Investing heavily in red teaming yields tangible benefits for security leaders in any organization. OpenAI’s paper on external red teaming provides a detailed analysis of how the company strives to create specialized external teams that include cybersecurity and subject matter experts. The goal is to see if knowledgeable external teams can defeat models’ security perimeters and find gaps in their security, biases and controls that prompt-based testing couldn’t find.

What makes OpenAI’s recent papers noteworthy is how well they define using human-in-the-middle
