Your Gateway to Power, Energy, Datacenters, Bitcoin and AI
Dive into the latest industry updates, our exclusive Paperboy Newsletter, and curated insights designed to keep you informed. Stay ahead with minimal time spent.
Discover What Matters Most to You

AI:
Lorem Ipsum is simply dummy text of the printing and typesetting industry.

Bitcoin:
Lorem Ipsum is simply dummy text of the printing and typesetting industry.

Datacenter:
Lorem Ipsum is simply dummy text of the printing and typesetting industry.

Energy:
Lorem Ipsum is simply dummy text of the printing and typesetting industry.
Featured Articles

Meta announces its Superintelligence Labs Chief Scientist: former OpenAI GPT-4 co-creator Shengjia Zhao
Meta has appointed Shengjia Zhao, a former OpenAI researcher and co-creator of GPT-4, as Chief Scientist of its newly created Meta Superintelligence Labs (MSL). The announcement was made Friday by Mark Zuckerberg on Threads, who noted that Zhao will lead the lab’s scientific agenda alongside him and Alexandr Wang, the former CEO of Scale AI whom Meta recently brought on board as Chief AI Officer.

“I am very excited to take up the role of chief scientist for meta super-intelligence labs. Looking forward to building asi [artificial superintelligence] and aligning it to empower people with the amazing team here. Let’s build!” Zhao wrote in his own Threads post.

“Artificial superintelligence” is a nebulous term used in the AI industry to describe systems more powerful and capable than any that exist today, beyond even the smartest humans, and therefore difficult to control.

Zhao’s strong commercial AI background
Zhao, who previously worked at OpenAI, played a key role in the development of foundational models like GPT-4 and GPT-4o, according to arXiv system cards and research papers listing him as a co-author. He is also known for his academic work on generative models and fair representations, with widely cited papers in venues like NeurIPS, ICML, and ICLR.

Zhao joins Meta amid a high-stakes hiring blitz across the AI industry. Over the past few months, Meta has

Supertanker Hauling Saudi Diesel Heads to Europe
A supertanker carrying a cargo of diesel from the Middle East is en route to the fuel-starved European market, reflecting supply tightness in the region. The VLCC Nissos Keros loaded about 2 million barrels of ultra-low sulfur diesel from Saudi Arabia’s Jubail terminal and is currently signaling France, where it is due to arrive Aug. 30, according to Kpler and ship-tracking data compiled by Bloomberg. The vessel, which usually transports crude oil, was reconfigured to carry diesel.

Cargoes of the fuel would typically be carried on smaller tankers, but with freight rates elevated after the latest attacks on shipping in the Red Sea, operators have an incentive to clean up dirty tankers to haul products instead and reap the economies of scale.

Europe’s diesel market remains under pressure, driven by a combination of lower refinery output, the costly rerouting of imports to replace shunned Russian supplies, and sanctions-related uncertainty. The arrival of a large shipment may provide temporary relief, but dependence on long-haul imports continues to expose the European market to spikes in freight costs and supply volatility.

New AI architecture delivers 100x faster reasoning than LLMs with just 1,000 training examples
Singapore-based AI startup Sapient Intelligence has developed a new AI architecture that can match, and in some cases vastly outperform, large language models (LLMs) on complex reasoning tasks, all while being significantly smaller and more data-efficient. The architecture, known as the Hierarchical Reasoning Model (HRM), is inspired by how the human brain uses distinct systems for slow, deliberate planning and fast, intuitive computation. The model achieves impressive results with a fraction of the data and memory required by today’s LLMs. This efficiency could have important implications for real-world enterprise AI applications where data is scarce and computational resources are limited.

When faced with a complex problem, current LLMs largely rely on chain-of-thought (CoT) prompting, breaking problems down into intermediate text-based steps and essentially forcing the model to “think out loud” as it works toward a solution. While CoT has improved the reasoning abilities of LLMs, it has fundamental limitations. In their paper, researchers at Sapient Intelligence argue that “CoT for reasoning is a crutch, not a satisfactory solution. It relies on brittle, human-defined decompositions where a single misstep or a misorder of the steps can derail the reasoning process entirely.”
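For context, chain-of-thought prompting amounts to wrapping a question so the model emits intermediate steps before its answer, then parsing the answer back out. A minimal sketch (the prompt template and answer format are illustrative, not from Sapient's paper):

```python
def build_cot_prompt(question: str) -> str:
    """Wrap a question so the model 'thinks out loud':
    it must emit intermediate steps before the final answer."""
    return (
        f"Question: {question}\n"
        "Let's think step by step, then give the final answer "
        "on a line starting with 'Answer:'."
    )

def extract_answer(model_output: str) -> str:
    """Pull the final answer out of the step-by-step transcript."""
    for line in reversed(model_output.splitlines()):
        if line.startswith("Answer:"):
            return line.removeprefix("Answer:").strip()
    return ""

prompt = build_cot_prompt("What is 17 * 24?")
reply = "17 * 24 = 17 * 20 + 17 * 4 = 408\nAnswer: 408"
print(extract_answer(reply))  # → 408
```

The fragility Sapient's researchers describe lives exactly here: if the model misorders or botches one of the intermediate text steps, the final `Answer:` line inherits the error.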

Oil Slips on Stronger Dollar, Trade Doubts
Oil fell as the dollar strengthened and conviction waned that the US will reach agreements with key trade partners ahead of a deadline next week. West Texas Intermediate crude slid more than 1% to settle near $65 a barrel after President Donald Trump said the US has a 50-50 chance of striking a trade deal with Europe, a contrast to the optimism the bloc’s diplomats expressed this week. Trump also said most tariff rates are essentially settled now. The effective US tariff rate is at its highest in a century, by some estimates, a potential threat to energy demand.

In another headwind, Trump indicated he had no plans to fire Federal Reserve Chair Jerome Powell, boosting the dollar and making commodities priced in the currency less attractive.

Crude has remained in a holding pattern this month but is down for the year as increased supply from OPEC+ adds to concerns of a looming glut. The group will next meet on Aug. 3 to decide on production levels. On Thursday, one member, Venezuela, was given a production reprieve by a US decision to let Chevron resume pumping oil in the country.

“We expect crude to slowly sell off this fall, driven by steady acceleration of stock builds, softening physical markets, reduced refinery margin support and continued de-escalation of geopolitically driven supply risk,” Macquarie Group analysts including Vikas Dwivedi wrote in a note.

Oil Prices
WTI for September delivery fell 1.3% to settle at $65.16 a barrel. Brent for September settlement slipped 1.1% to $68.44 a barrel.

CoSyn: The open-source tool that’s making GPT-4V-level vision AI accessible to everyone
Researchers at the University of Pennsylvania and the Allen Institute for Artificial Intelligence have developed a groundbreaking tool that allows open-source AI systems to match or surpass the visual understanding capabilities of proprietary models like GPT-4V and Gemini 1.5 Flash, potentially reshaping the competitive landscape between open and closed AI development.
The tool, called CoSyn (Code-Guided Synthesis), addresses a critical bottleneck in AI development: the scarcity of high-quality training data for teaching machines to understand complex visual information like scientific charts, medical diagrams, and financial documents. Rather than scraping millions of images from the internet — a practice fraught with copyright and ethical concerns — CoSyn leverages the coding abilities of existing language models to generate synthetic training data.
“We have, we lack of such data to train the model. We lack of data, like documents, charts with rich annotations to train a vision language model to do question answering over those images,” explained Yue Yang, a recent Penn Engineering Ph.D. graduate and co-first author of the research, during an exclusive interview with VentureBeat. “Those images actually are more challenging to annotate, compared to natural photos, like a picture of a dog of a cat of a house.”
The breakthrough comes as enterprises increasingly seek AI systems capable of understanding and reasoning about complex visual information — capabilities essential for everything from automated document processing to AI agents that can navigate digital interfaces independently. The work was conducted during Yang’s internship with the PRIOR team at the Allen Institute for AI and supported by the Office of the Director of National Intelligence, Intelligence Advanced Research Projects Activity, and the Defense Advanced Research Projects Agency.
How synthetic data generation solves AI’s biggest training challenge
The challenge of training AI to understand text-rich images has long plagued the field. Unlike natural photographs, scientific figures, charts, and documents require extensive annotation work that is both time-consuming and expensive. Traditional approaches have relied on harvesting images and their alt-text descriptions from the internet, but this method produces training data that is often superficial and legally problematic.
CoSyn takes a fundamentally different approach by recognizing that most text-rich images are originally created through code — Python scripts generate charts, LaTeX renders mathematical equations, HTML creates web interfaces. The research team’s insight was to reverse this process: use language models’ proven coding abilities to generate the underlying code, then execute that code to create realistic synthetic images.
“One intuition is actually those images like charts documents. We render them from programs from code, like we use Python to generate charts. We use, like latex or word to write our documents,” Yang said. “So how about we go through the reverse way, like we generated the code because the text only language model has been proved very good at writing code.”
Chris Callison-Burch, a computer science professor at Penn who co-advised the research, described the approach in simpler terms: “This is like taking a student who’s great at writing and asking them to teach someone how to draw, just by describing what the drawing should look like. We’re essentially transferring the strengths of open-source AI from text to vision.”
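The reverse process described above can be sketched in a few lines: have a language model emit plotting code, execute it off-screen, and keep the rendered image together with an annotation derived from the same data. A minimal illustration with Matplotlib, using a fixed bar-chart template and random data as a stand-in for the LLM-written code (the chart, question, and seed are hypothetical, not from the CoSyn pipeline):

```python
import io
import random
import matplotlib
matplotlib.use("Agg")          # render off-screen, no display needed
import matplotlib.pyplot as plt

def synthesize_chart_pair(seed: int):
    """Produce one synthetic (image, Q&A annotation) training pair.

    In CoSyn the plotting code itself is written by a language model;
    this fixed bar-chart template just stands in for that step.
    """
    rng = random.Random(seed)
    labels = ["Q1", "Q2", "Q3", "Q4"]
    values = [rng.randint(10, 100) for _ in labels]

    fig, ax = plt.subplots()
    ax.bar(labels, values)
    ax.set_title("Revenue by quarter")
    buf = io.BytesIO()
    fig.savefig(buf, format="png")   # the rendered training image
    plt.close(fig)

    # Because the code generated the data, the ground truth is free:
    annotation = {
        "question": "Which quarter had the highest revenue?",
        "answer": labels[values.index(max(values))],
    }
    return buf.getvalue(), annotation

image_bytes, annotation = synthesize_chart_pair(0)
print(annotation["answer"])
```

The key property is that the annotation never has to be written by a human: the same program that draws the chart knows the answer to any question about it.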
CoSyn-trained models outperform GPT-4V and Gemini on key benchmarks
The results are striking. Using their synthetic dataset of 400,000 images and 2.7 million instruction pairs, models trained with CoSyn achieved state-of-the-art performance among open-source systems and surpassed proprietary models on seven benchmark tests measuring text-rich image understanding.
On average, their 7-billion parameter model scored 80.9% across the benchmark suite, outperforming the previous best open-source model (Llama 3.2 11B) by 3.9 percentage points. More remarkably, even their “zero-shot” model—trained without any examples from the evaluation datasets—outperformed most open and closed models, demonstrating the transferability of capabilities learned from synthetic data.
CoSyn-trained models outperformed GPT-4V and Gemini 1.5 Flash across seven text-rich image understanding benchmarks. (Credit: github.io/cosyn)
In one particularly compelling demonstration, the researchers created a new benchmark called NutritionQA, consisting of 100 questions about nutrition label photographs. Using just 7,000 synthetically generated nutrition labels for training, their model outperformed others trained on millions of real images. “Despite being trained on millions of images, we observe that open-source VLMs are not data-efficient and perform poorly on this novel task compared to GPT-4V,” the researchers wrote in their paper.
Yang emphasized the significance: “Those big packs, they have so many resources to collecting data to run a lot of experiments, and I but I think open source models, we can give access to people, the model weights, the data we trained, or even the code, the training script, everything people can developers can build upon.”
Real companies are already using vision AI for quality control and automation
The technology is already finding real-world applications across industries. Callison-Burch cited an example from one of his teaching assistants whose company uses vision-language models for cable installation quality assurance: “They have the workers on site who are doing the installation take photographs of the processes they’re doing it, and they use that to automatically validate that each step has been followed properly.”
This type of specialized visual understanding could transform numerous enterprise workflows, from automated document processing in financial services to quality control in manufacturing. The ability to train models on specific visual tasks using synthetic data means companies can develop AI systems tailored to their particular needs without the massive data collection efforts traditionally required.
For enterprise decision makers, the research suggests a shift in how to approach AI data strategies. “I think synthetic data is a very promising way to remove the effort for human annotation. It costs less money, and it will just automatically generate large scale data, and also can avoid some copyright issues,” Yang noted.
The persona-driven approach that makes AI training data more diverse
One of CoSyn’s key innovations is its approach to ensuring data diversity. To prevent the repetitive outputs common in AI-generated content, the system employs what the researchers call a “persona-driven mechanism.” Each time CoSyn generates a synthetic example, it pairs the request with a randomly sampled persona—a short description like “a sci-fi novelist constantly bouncing off ideas for new alien worlds” or “a chemistry teacher preparing lab materials.”
“Every time we generate one piece of synthetic data, we will pair it with a randomly sampled persona,” Yang explained. “This will diversify the content and styles of the examples we generated, because, like, if I provide the persona of like a PhD student, it will generate something more scientific or more about, something about academia.”
This approach enables the system to generate content across nine different categories: charts, documents, math problems, tables, diagrams, vector graphics, music sheets, electrical circuits, and chemical structures. The researchers used 11 different rendering tools, from Python’s Matplotlib for charts to LaTeX for mathematical expressions, supported by 20 specialized generation pipelines.
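In code, the persona-driven mechanism reduces to sampling a persona and a target category and composing them into the generation request. A sketch under stated assumptions: the persona and category lists here are tiny illustrative stand-ins (the first two personas are quoted from the paper), and the prompt wording is hypothetical:

```python
import random

# Toy stand-ins; CoSyn's actual persona pool is far larger.
PERSONAS = [
    "a sci-fi novelist constantly bouncing off ideas for new alien worlds",
    "a chemistry teacher preparing lab materials",
    "a financial analyst tracking quarterly earnings",
]
CATEGORIES = ["chart", "document", "math problem", "table", "diagram",
              "vector graphic", "music sheet", "electrical circuit",
              "chemical structure"]

def build_generation_prompt(rng: random.Random) -> str:
    """Pair a random persona with a target category so repeated
    calls to the language model yield diverse rendering code."""
    persona = rng.choice(PERSONAS)
    category = rng.choice(CATEGORIES)
    return (f"You are {persona}. Write Python code that renders "
            f"a realistic {category} this persona might create.")

rng = random.Random(42)
print(build_generation_prompt(rng))
```

Re-sampling the persona on every call is what breaks the repetitiveness the researchers observed: the same category prompt yields very different content depending on who is nominally asking for it.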
Why this breakthrough could level the playing field between open source and Big Tech
The implications for the broader AI industry are significant. Major technology companies like OpenAI and Google have invested billions in developing their proprietary vision-language capabilities, creating systems whose training methods and data sources remain trade secrets. CoSyn offers a path for open-source alternatives to compete without requiring similar resource investments.
“Open source models still like, like behind those closed source models, but with all the efforts, all the resources from the open source community, everyone, like, we’ve had more efforts. We have more like energy, like from, from everyone. So I think finally we can catch up,” Yang said.
The commitment to openness extends beyond just releasing the model. The complete CoSyn codebase, the 400,000-image dataset, and all training scripts are publicly available, enabling researchers and companies worldwide to build upon the work. “From the academia side, like a lot of research is built upon openness, like we need all access to the data, code, everything to discover new findings to support our claims in the papers,” Yang emphasized.
This transparency addresses growing concerns about the black-box nature of proprietary AI systems. “If you only rely on the APIs for like open AI, this may not be reliable to prove your like scientific discoveries, because they may just. Something in the back end you never know,” Yang noted.
Beyond static image understanding, CoSyn is pioneering capabilities crucial for the next generation of AI agents—systems that can autonomously navigate digital interfaces and perform complex tasks. The researchers developed synthetic “pointing data” that teaches models exactly where to click on screenshots, a fundamental requirement for web-based automation.
Using 65,000 synthetic screenshots with click annotations, their model achieved state-of-the-art performance on ScreenSpot, a benchmark for click prediction, outperforming systems trained on 1.3 million real screenshots. “We only use like several 100k synthetic screenshot, we can outperform previous model on millions of screenshots,” Yang said.
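The pointing-data idea follows the same code-guided logic: because each screenshot is rendered from a known layout, the ground-truth click coordinates come for free. A toy illustration (the UI elements, coordinates, and instruction wording are hypothetical, not CoSyn's actual renderer):

```python
import random

def synthesize_pointing_example(rng: random.Random) -> dict:
    """Create one synthetic click-prediction example.

    When the screenshot is rendered from code, every element's
    bounding box is known, so the target click point is free.
    """
    # Hypothetical UI layout: name -> (x, y, width, height)
    elements = {
        "Submit": (420, 300, 120, 40),
        "Cancel": (560, 300, 120, 40),
        "Search": (40, 20, 300, 32),
    }
    name = rng.choice(list(elements))
    x, y, w, h = elements[name]
    target = (x + w // 2, y + h // 2)   # click the element's center
    return {"instruction": f"Click the '{name}' button",
            "click": target}

example = synthesize_pointing_example(random.Random(0))
print(example["instruction"], example["click"])
```

Scaled up to tens of thousands of rendered screenshots, pairs like this are what teach a model to map an instruction to pixel coordinates, the ScreenSpot-style skill described above.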
This capability is essential as the industry moves toward AI agents that can perform knowledge work autonomously. “There’s sort of like two prevailing models and how you might go about implementing agents,” Callison-Burch explained. One approach uses specialized APIs, while the other relies on agents that “literally just use web browsing capabilities in the same way that you and I do.”
The vision-based approach, enabled by technologies like CoSyn, could prove more versatile: “You’re not just calling up software function, which is relatively straightforward, but you actually have to, like, take screenshots of the current state of the web browser. Reason about where to click, navigate your mouse to that location to click.”
How synthetic data sidesteps the growing copyright crisis in AI training
The synthetic data approach also provides a potential solution to mounting legal challenges around AI training data. With ongoing litigation over whether training on copyrighted materials constitutes fair use, synthetic data generation offers an alternative path that sidesteps many intellectual property concerns.
Callison-Burch, who testified before Congress on AI and copyright in 2023, sees synthetic data as complementary to, rather than replacing, real-world training data: “I don’t think that synthetic data eliminates the need for having wide amounts of diverse training data like that’s still a core element to training AI systems, but it does allow you to extend their capabilities in really remarkable ways.”
The approach demonstrates how existing knowledge can be transferred to new applications without directly using copyrighted materials. “The underlying thing that we’re relying on here is that a large language model can write code. That’s something that it learned from its original data. We’re now applying that to a totally different application, which is the creation of new training data that is unlike any of the data that it was trained on.”
The current limits of synthetic data and what comes next
Despite its promise, synthetic data generation faces important limitations. “One limitation is it may inherit the biases from the model that generates such synthetic data,” Yang acknowledged. The system can also struggle with diversity: “If you prompt a large network to generate some data among different runs, it may generate similar data.”
The current research focuses on text-rich images rather than natural photographs, limiting its immediate applicability to some domains. “What about some real photos, like some other natural images? It is hard to generate synthetic data for those domains, or even like medical images, chest X-rays,” Yang noted, though she indicated ongoing efforts to extend the approach to medical imaging.
Looking ahead, Yang expects synthetic data generation to become standard practice: “In the future, in two or three years, synthetic data will be a very important component to teach models different capabilities.” However, she emphasized that optimal results will likely require combining synthetic and real-world data: “Real-world data will reflect some real-world distributions. Synthetic data can be large scale, can be more controllable.”
Early adoption signals suggest the technology is already influencing industry practices. “I heard like companies, like meta, some teams also, like all Amazon, they are trying to using our data to train their model,” Yang revealed during the interview.
For startups and smaller companies, the cost advantages could be particularly significant. “For some startups, it is cheaper to host, their host open model on their server, rather than just calling the APIs, which is less controllable,” Yang noted.
The research team’s decision to make everything open source reflects a broader philosophy about AI development. As Yang prepares to join the Allen Institute full-time after completing her Ph.D., the commitment to open science remains central to their mission. “Currently, those vision language models are quite brittle. It just needs the right data to get the right capabilities,” she said. “If you find the right data, you can improve models capability on it, and it will benefit the society.”
The vision for AI that acts, not just describes
As the research moves from academic laboratories to real-world applications, the implications extend far beyond improved benchmark scores. Yang and her colleagues are already looking toward applications that could transform how people with disabilities interact with technology, from AI that understands sign language for the hearing impaired to systems that can describe complex medical images for those with visual impairments.
“I have an idea to let the model to know how to understand the sign language or those people with hearing difficulties,” Yang said, describing potential future applications. “If you find the right data, you can improve models capability on it, and it will benefit the society.”
Callison-Burch sees even broader possibilities, particularly in robotics and scientific discovery: “Synthetic data opens up many possible applications that we don’t have naturally occurring data for. So one that Yang has also worked on at the Allen Institute is the notion of creating simulated training data for robots.”
The work represents more than just a technical achievement—it’s a demonstration that open-source AI development can compete with the well-funded efforts of major technology companies through innovative approaches to fundamental challenges. As Yang noted in reflecting on her decision to join the Allen Institute rather than accept higher-paying offers from companies like Meta: “I think it’s still a very early stage of those multimodal models, and there are not much resources, open resources, or knowledge to share to the community.”
The message is clear: in the race to build AI that can truly see and understand the world, the advantage may not always go to those with the deepest pockets, but to those with the most creative solutions.

Eni Profit Tops Estimates
Eni SpA reported profit that beat analyst estimates as proceeds from asset sales and sweeping cost cuts helped counter a weak oil market. While crude prices were lower in the second quarter, weighing on earnings at other European oil companies, Eni has been buoyed by a cost-reduction program introduced earlier this year, while asset disposals brought down debt.

Adjusted net income fell 25% from a year earlier to €1.13 billion ($1.3 billion), the Italian energy company said Friday in a statement. That exceeded the €932.6 million average estimate of analysts surveyed by Bloomberg. Eni said it is now targeting €3 billion of cost cuts this year, up from €2 billion previously. The company has also reaped billions of euros by offloading stakes in its renewables arm and mobility division, and is in talks to sell half of its carbon capture unit.

“The combination of divestments set to come through this year, ongoing ‘self-help,’ as well as the additional cash flow from new ramp-ups sets Eni up for a strong second half of 2025 and 2026,” RBC Europe Ltd. analyst Biraj Borkhataria said in a note. He expects “growing free cash flow and a more resilient balance sheet than we’ve seen for many years.”

The shares rose as much as 0.6% at the open in Milan, before trading little changed as of 9:08 a.m. local time. Eni confirmed plans for shareholder returns this year. It expects free cash flow before working capital of about €11.5 billion at $70-a-barrel crude, up from previous guidance of €11 billion. The company also raised its forecast for annual earnings from its gas division to €1 billion from €800 million. Net debt shrank to €29.1 billion at the end of June.

Meta announces its Superintelligence Labs Chief Scientist: former OpenAI GPT-4 co-creator Shengjia Zhao
Want smarter insights in your inbox? Sign up for our weekly newsletters to get only what matters to enterprise AI, data, and security leaders. Subscribe Now Meta has appointed Shengjia Zhao, a former OpenAI researcher and co‑creator of GPT‑4, as the Chief Scientist of its newly created Meta Superintelligence Labs (MSL). The announcement was made Friday by Mark Zuckerberg on Threads, noting Zhao will lead the lab’s scientific agenda alongside him and Alexandr Wang, the former CEO of Scale AI who Meta recently brought onboard as Chief AI Officer. “I am very excited to take up the role of chief scientist for meta super-intelligence labs. Looking forward to building asi [artificial superintelligence] and aligning it to empower people with the amazing team here. Let’s build!” Zhao wrote in his own Threads post. “Artificial superintelligence” is a nebulous term used in the AI industry to describe systems more powerful and capable than any today, beyond even the smartest humans, making them difficult to control. The AI Impact Series Returns to San Francisco – August 5 The next phase of AI is here – are you ready? Join leaders from Block, GSK, and SAP for an exclusive look at how autonomous agents are reshaping enterprise workflows – from real-time decision-making to end-to-end automation. Secure your spot now – space is limited: https://bit.ly/3GuuPLF Zhao’s strong commercial AI background Zhao, who previously worked at OpenAI, played a key role in the development of foundational models like GPT-4 and GPT-4o, according to arXiv system cards and research papers listing him as a co-author. He’s also known for his academic work on generative models and fair representations, with widely cited papers in venues like NeurIPS, ICML, and ICLR. Zhao joins Meta amid a high-stakes hiring blitz across the AI industry. Over the past few months, Meta has

Supertanker Hauling Saudi Diesel Heads to Europe
A supertanker carrying a cargo of diesel from the Middle East is en route to the fuel-starved European market, reflecting supply tightness in the region. The VLCC Nissos Keros loaded about 2 million barrels of ultra-low sulfur diesel from Saudi Arabia’s Jubail terminal and is currently signaling France where it’s due to arrive Aug. 30, according to Kpler and ship-tracking data compiled by Bloomberg. The vessel, which usually transports crude oil, was re-configured to carry diesel. Cargoes of the fuel would typically be carried on smaller tankers, but with freight rates elevated after the latest attacks on shipping in the Red Sea, operators have an incentive to clean up dirty tankers to haul products instead and reap the economies of scale. Europe’s diesel market remains under pressure, driven by a combination of lower refinery output, costly rerouting of imports to replace shunned Russian supplies and sanctions-related uncertainty. The arrival of a large shipment may provide temporary relief, but dependence on long-haul imports continues to expose the European market to spikes in freight costs and supply volatility. WHAT DO YOU THINK? Generated by readers, the comments included herein do not reflect the views and opinions of Rigzone. All comments are subject to editorial review. Off-topic, inappropriate or insulting comments will be removed.

New AI architecture delivers 100x faster reasoning than LLMs with just 1,000 training examples
Singapore-based AI startup Sapient Intelligence has developed a new AI architecture that can match, and in some cases vastly outperform, large language models (LLMs) on complex reasoning tasks, all while being significantly smaller and more data-efficient.The architecture, known as the Hierarchical Reasoning Model (HRM), is inspired by how the human brain utilizes distinct systems for slow, deliberate planning and fast, intuitive computation. The model achieves impressive results with a fraction of the data and memory required by today’s LLMs. This efficiency could have important implications for real-world enterprise AI applications where data is scarce and computational resources are limited.When faced with a complex problem, current LLMs largely rely on chain-of-thought (CoT) prompting, breaking down problems into intermediate text-based steps, essentially forcing the model to “think out loud” as it works toward a solution.While CoT has improved the reasoning abilities of LLMs, it has fundamental limitations. In their paper, researchers at Sapient Intelligence argue that “CoT for reasoning is a crutch, not a satisfactory solution. It relies on brittle, human-defined decompositions where a single misstep or a misorder of the steps can derail the reasoning process entirely.”

Oil Slips on Stronger Dollar, Trade Doubts
Oil fell as the dollar strengthened and conviction waned that the US will reach agreements with key trade partners ahead of a deadline next week. West Texas Intermediate crude slid more than 1% to settle near $65 a barrel after President Donald Trump said the US has a 50-50 chance of striking a trade deal with Europe, a contrast to the optimism the bloc’s diplomats expressed this week. Trump also said most tariff rates are essentially settled now. The effective US tariff rate is at the highest in a century, by some estimates, a potential threat to energy demand. In another headwind, Trump indicated he had no plans to fire Federal Reserve Chair Jerome Powell, boosting the dollar and making the commodities priced in the currency less attractive. Crude has remained in a holding pattern this month, but is down for the year as increased supply from OPEC+ adds to concerns of a looming glut. The group will next meet on Aug. 3 to decide on production levels. On Thursday, one member, Venezuela, was given a production reprieve by a US decision to let Chevron resume pumping oil in the country. “We expect crude to slowly sell off this fall, driven by steady acceleration of stock builds, softening physical markets, reduced refinery margin support and continued deescalation of geopolitically driven supply risk,” Macquarie Group analysts including Vikas Dwivedi wrote in a note. Oil Prices WTI for September delivery fell 1.3% to settle at $65.16 a barrel. Brent for September settlement slipped 1.1% to $68.44 a barrel. What do you think? We’d love to hear from you, join the conversation on the Rigzone Energy Network. The Rigzone Energy Network is a new social experience created for you and all energy professionals to Speak Up about our industry, share knowledge, connect with

CoSyn: The open-source tool that’s making GPT-4V-level vision AI accessible to everyone
Researchers at the University of Pennsylvania and the Allen Institute for Artificial Intelligence have developed a groundbreaking tool that allows open-source AI systems to match or surpass the visual understanding capabilities of proprietary models like GPT-4V and Gemini 1.5 Flash, potentially reshaping the competitive landscape between open and closed AI development.
The tool, called CoSyn (Code-Guided Synthesis), addresses a critical bottleneck in AI development: the scarcity of high-quality training data for teaching machines to understand complex visual information like scientific charts, medical diagrams, and financial documents. Rather than scraping millions of images from the internet — a practice fraught with copyright and ethical concerns — CoSyn leverages the coding abilities of existing language models to generate synthetic training data.
“We lack such data to train the model: documents and charts with rich annotations for training a vision language model to do question answering over those images,” explained Yue Yang, a recent Penn Engineering Ph.D. graduate and co-first author of the research, during an exclusive interview with VentureBeat. “Those images are actually more challenging to annotate, compared to natural photos, like a picture of a dog, a cat or a house.”
The breakthrough comes as enterprises increasingly seek AI systems capable of understanding and reasoning about complex visual information — capabilities essential for everything from automated document processing to AI agents that can navigate digital interfaces independently. The work was conducted during Yang’s internship with the PRIOR team at the Allen Institute for AI and supported by the Office of the Director of National Intelligence, Intelligence Advanced Research Projects Activity, and the Defense Advanced Research Projects Agency.
How synthetic data generation solves AI’s biggest training challenge
The challenge of training AI to understand text-rich images has long plagued the field. Unlike natural photographs, scientific figures, charts, and documents require extensive annotation work that is both time-consuming and expensive. Traditional approaches have relied on harvesting images and their alt-text descriptions from the internet, but this method produces training data that is often superficial and legally problematic.
CoSyn takes a fundamentally different approach by recognizing that most text-rich images are originally created through code — Python scripts generate charts, LaTeX renders mathematical equations, HTML creates web interfaces. The research team’s insight was to reverse this process: use language models’ proven coding abilities to generate the underlying code, then execute that code to create realistic synthetic images.
“One intuition is that those images, like charts and documents, are rendered from programs, from code: we use Python to generate charts, and LaTeX or Word to write our documents,” Yang said. “So how about we go the reverse way: we generate the code, because text-only language models have proven to be very good at writing code.”
Chris Callison-Burch, a computer science professor at Penn who co-advised the research, described the approach in simpler terms: “This is like taking a student who’s great at writing and asking them to teach someone how to draw, just by describing what the drawing should look like. We’re essentially transferring the strengths of open-source AI from text to vision.”
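The render-from-code idea can be illustrated with a minimal, dependency-free sketch. Note the assumptions: CoSyn uses a language model to write the rendering code, whereas here the renderer, the data ranges, and the question template are fixed, invented stand-ins; the point is only that an image produced by code comes with its ground-truth answer for free.

```python
import random

def render_bar_chart_svg(labels, values, title):
    """Render a simple bar chart as SVG markup: the "image" is produced
    entirely by code, so its contents are known exactly."""
    width, height, bar_w = 320, 210, 40
    max_v = max(values)
    parts = [f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height}">',
             f'<text x="10" y="20">{title}</text>']
    for i, (label, v) in enumerate(zip(labels, values)):
        bar_h = int(150 * v / max_v)          # scale bars to the tallest value
        x = 20 + i * (bar_w + 15)
        parts.append(f'<rect x="{x}" y="{180 - bar_h}" width="{bar_w}" height="{bar_h}"/>')
        parts.append(f'<text x="{x}" y="200">{label}</text>')
    parts.append('</svg>')
    return '\n'.join(parts)

def make_synthetic_example(rng):
    """Create one (image, question, answer) training triple; the ground-truth
    answer costs nothing because we generated the underlying data ourselves."""
    labels = ['Q1', 'Q2', 'Q3', 'Q4']
    values = [rng.randint(10, 90) for _ in labels]
    svg = render_bar_chart_svg(labels, values, 'Revenue by quarter')
    answer = labels[values.index(max(values))]
    return {'image_svg': svg,
            'question': 'Which quarter shows the highest revenue?',
            'answer': answer}

example = make_synthetic_example(random.Random(0))
```

In the full pipeline the code itself would be LLM-written and then executed, but the payoff is the same: annotation comes from the generating program, not from human labelers.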
CoSyn-trained models outperform GPT-4V and Gemini on key benchmarks
The results are striking. Using their synthetic dataset of 400,000 images and 2.7 million instruction pairs, models trained with CoSyn achieved state-of-the-art performance among open-source systems and surpassed proprietary models on seven benchmark tests measuring text-rich image understanding.
On average, their 7-billion parameter model scored 80.9% across the benchmark suite, outperforming the previous best open-source model (Llama 3.2 11B) by 3.9 percentage points. More remarkably, even their “zero-shot” model—trained without any examples from the evaluation datasets—outperformed most open and closed models, demonstrating the transferability of capabilities learned from synthetic data.
CoSyn-trained models outperformed GPT-4V and Gemini 1.5 Flash across seven text-rich image understanding benchmarks. (Credit: github.io/cosyn)
In one particularly compelling demonstration, the researchers created a new benchmark called NutritionQA, consisting of 100 questions about nutrition label photographs. Using just 7,000 synthetically generated nutrition labels for training, their model outperformed others trained on millions of real images. “Despite being trained on millions of images, we observe that open-source VLMs are not data-efficient and perform poorly on this novel task compared to GPT-4V,” the researchers wrote in their paper.
Yang emphasized the significance: “Those big labs have so many resources for collecting data and running a lot of experiments, but with open-source models we can give people access to the model weights, the data we trained on, even the code and the training scripts: everything that developers can build upon.”
Real companies are already using vision AI for quality control and automation
The technology is already finding real-world applications across industries. Callison-Burch cited an example from one of his teaching assistants whose company uses vision-language models for cable installation quality assurance: “They have the workers on site who are doing the installation take photographs of the process as they’re doing it, and they use that to automatically validate that each step has been followed properly.”
This type of specialized visual understanding could transform numerous enterprise workflows, from automated document processing in financial services to quality control in manufacturing. The ability to train models on specific visual tasks using synthetic data means companies can develop AI systems tailored to their particular needs without the massive data collection efforts traditionally required.
For enterprise decision makers, the research suggests a shift in how to approach AI data strategies. “I think synthetic data is a very promising way to remove the effort for human annotation. It costs less money, and it will just automatically generate large scale data, and also can avoid some copyright issues,” Yang noted.
The persona-driven approach that makes AI training data more diverse
One of CoSyn’s key innovations is its approach to ensuring data diversity. To prevent the repetitive outputs common in AI-generated content, the system employs what the researchers call a “persona-driven mechanism.” Each time CoSyn generates a synthetic example, it pairs the request with a randomly sampled persona—a short description like “a sci-fi novelist constantly bouncing off ideas for new alien worlds” or “a chemistry teacher preparing lab materials.”
“Every time we generate one piece of synthetic data, we pair it with a randomly sampled persona,” Yang explained. “This diversifies the content and styles of the examples we generate, because if I provide the persona of, say, a Ph.D. student, it will generate something more scientific, something more about academia.”
This approach enables the system to generate content across nine different categories: charts, documents, math problems, tables, diagrams, vector graphics, music sheets, electrical circuits, and chemical structures. The researchers used 11 different rendering tools, from Python’s Matplotlib for charts to LaTeX for mathematical expressions, supported by 20 specialized generation pipelines.
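The persona-sampling step described above can be sketched in a few lines. Assumptions: the prompt template and the third persona are invented for illustration; only the two quoted personas and the nine content categories come from the article.

```python
import random

# The first two personas are quoted from the article; the third is an
# invented example of the same flavor.
PERSONAS = [
    "a sci-fi novelist constantly bouncing off ideas for new alien worlds",
    "a chemistry teacher preparing lab materials",
    "a financial analyst who lives in quarterly dashboards",
]

# The nine content categories CoSyn generates, per the article.
CATEGORIES = ["chart", "document", "math problem", "table", "diagram",
              "vector graphic", "music sheet", "electrical circuit",
              "chemical structure"]

def build_generation_prompt(rng):
    """Pair each generation request with a randomly sampled persona so that
    repeated runs produce diverse content and styles."""
    persona = rng.choice(PERSONAS)
    category = rng.choice(CATEGORIES)
    return (f"You are {persona}. Write code that renders a realistic {category} "
            f"this person might create, then write Q&A pairs about it.")

prompt = build_generation_prompt(random.Random(7))
```

Because the persona is resampled on every call, two otherwise identical requests for a “chart” yield very different charts, which is exactly the repetition problem the mechanism targets.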
Why this breakthrough could level the playing field between open source and Big Tech
The implications for the broader AI industry are significant. Major technology companies like OpenAI and Google have invested billions in developing their proprietary vision-language capabilities, creating systems whose training methods and data sources remain trade secrets. CoSyn offers a path for open-source alternatives to compete without requiring similar resource investments.
“Open-source models are still behind those closed-source models, but with all the efforts and all the resources of the open-source community, from everyone, we have more energy. So I think finally we can catch up,” Yang said.
The commitment to openness extends beyond just releasing the model. The complete CoSyn codebase, the 400,000-image dataset, and all training scripts are publicly available, enabling researchers and companies worldwide to build upon the work. “On the academic side, a lot of research is built upon openness: we need access to the data, the code, everything, to discover new findings and to support the claims in our papers,” Yang emphasized.
This transparency addresses growing concerns about the black-box nature of proprietary AI systems. “If you only rely on the APIs of, say, OpenAI, that may not be reliable enough to prove your scientific discoveries, because they may just change something in the back end and you never know,” Yang noted.
Beyond static image understanding, CoSyn is pioneering capabilities crucial for the next generation of AI agents—systems that can autonomously navigate digital interfaces and perform complex tasks. The researchers developed synthetic “pointing data” that teaches models exactly where to click on screenshots, a fundamental requirement for web-based automation.
Using 65,000 synthetic screenshots with click annotations, their model achieved state-of-the-art performance on ScreenSpot, a benchmark for click prediction, outperforming systems trained on 1.3 million real screenshots. “We only used several hundred thousand synthetic screenshots, and we can outperform previous models trained on millions of screenshots,” Yang said.
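The same code-guided trick extends to pointing data, as a rough sketch shows. Assumptions: the button names, layout scheme, and coordinates below are invented for illustration; the point is that when a screenshot is laid out by code, the pixel position of every element is known exactly and can serve as a click-target label.

```python
import random

def make_pointing_example(rng):
    """Lay out a tiny synthetic "screenshot" in code, so every element's
    bounding box is known and yields an exact click coordinate for free."""
    names = ["Submit", "Cancel", "Settings", "Help"]
    rng.shuffle(names)
    buttons, y = [], 40
    for name in names[:3]:                       # stack three buttons vertically
        buttons.append({"label": name, "x": 120, "y": y, "w": 100, "h": 30})
        y += 50
    target = rng.choice(buttons)
    # The supervision label is the center of the target's bounding box.
    cx = target["x"] + target["w"] // 2
    cy = target["y"] + target["h"] // 2
    return {"layout": buttons,
            "instruction": f'Click the "{target["label"]}" button.',
            "click": (cx, cy)}

example = make_pointing_example(random.Random(1))
```

Each example pairs a rendered screen with an instruction and the exact pixel to click, which is the (instruction, coordinate) supervision a click-prediction benchmark like ScreenSpot evaluates.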
This capability is essential as the industry moves toward AI agents that can perform knowledge work autonomously. “There’s sort of like two prevailing models and how you might go about implementing agents,” Callison-Burch explained. One approach uses specialized APIs, while the other relies on agents that “literally just use web browsing capabilities in the same way that you and I do.”
The vision-based approach, enabled by technologies like CoSyn, could prove more versatile: “You’re not just calling up software function, which is relatively straightforward, but you actually have to, like, take screenshots of the current state of the web browser. Reason about where to click, navigate your mouse to that location to click.”
How synthetic data sidesteps the growing copyright crisis in AI training
The synthetic data approach also provides a potential solution to mounting legal challenges around AI training data. With ongoing litigation over whether training on copyrighted materials constitutes fair use, synthetic data generation offers an alternative path that sidesteps many intellectual property concerns.
Callison-Burch, who testified before Congress on AI and copyright in 2023, sees synthetic data as complementary to, rather than replacing, real-world training data: “I don’t think that synthetic data eliminates the need for having wide amounts of diverse training data like that’s still a core element to training AI systems, but it does allow you to extend their capabilities in really remarkable ways.”
The approach demonstrates how existing knowledge can be transferred to new applications without directly using copyrighted materials. “The underlying thing that we’re relying on here is a large language model. Can write code that’s something that it learned from its original data. We’re now applying that to a totally different application, which is creation of new training data that is unlike any of the data that it was trained on.”
The current limits of synthetic data and what comes next
Despite its promise, synthetic data generation faces important limitations. “One limitation is it may inherit the biases from the model that generates such synthetic data,” Yang acknowledged. The system can also struggle with diversity: “If you prompt a large language model to generate some data, it may generate similar data among different runs.”
The current research focuses on text-rich images rather than natural photographs, limiting its immediate applicability to some domains. “What about real photos, other natural images? It is hard to generate synthetic data for those domains, or even medical images like chest X-rays,” Yang noted, though she indicated ongoing efforts to extend the approach to medical imaging.
Looking ahead, Yang expects synthetic data generation to become standard practice: “In the future, in two or three years, synthetic data will be a very important component to teach models different capabilities.” However, she emphasized that optimal results will likely require combining synthetic and real-world data: “Real-world data will reflect real-world distributions. Synthetic data can be large-scale and more controllable.”
Early adoption signals suggest the technology is already influencing industry practices. “I heard that companies like Meta, and also some teams at Amazon, are trying to use our data to train their models,” Yang revealed during the interview.
For startups and smaller companies, the cost advantages could be particularly significant. “For some startups, it is cheaper to host an open model on their own servers, rather than just calling the APIs, which is less controllable,” Yang noted.
The research team’s decision to make everything open source reflects a broader philosophy about AI development. As Yang prepares to join the Allen Institute full-time after completing her Ph.D., the commitment to open science remains central to their mission. “Currently, those vision language models are quite brittle: it just needs the right data to get the right capabilities,” she said. “If you find the right data, you can improve a model’s capability on it, and it will benefit society.”
The vision for AI that acts, not just describes
As the research moves from academic laboratories to real-world applications, the implications extend far beyond improved benchmark scores. Yang and her colleagues are already looking toward applications that could transform how people with disabilities interact with technology, from AI that understands sign language for the hearing impaired to systems that can describe complex medical images for those with visual impairments.
“I have an idea to let the model understand sign language for people with hearing difficulties,” Yang said, describing potential future applications. “If you find the right data, you can improve a model’s capability on it, and it will benefit society.”
Callison-Burch sees even broader possibilities, particularly in robotics and scientific discovery: “Synthetic data opens up many possible applications that we don’t have naturally occurring data for. One that Yang has also worked on at the Allen Institute is the notion of creating simulated training data for robots.”
The work represents more than just a technical achievement—it’s a demonstration that open-source AI development can compete with the well-funded efforts of major technology companies through innovative approaches to fundamental challenges. As Yang noted in reflecting on her decision to join the Allen Institute rather than accept higher-paying offers from companies like Meta: “I think it’s still a very early stage of those multimodal models, and there are not much resources, open resources, or knowledge to share to the community.”
The message is clear: in the race to build AI that can truly see and understand the world, the advantage may not always go to those with the deepest pockets, but to those with the most creative solutions.

Eni Profit Tops Estimates
Eni SpA reported profit that beat analyst estimates as proceeds from asset sales and sweeping cost cuts helped counter a weak oil market. While crude prices were lower in the second quarter — weighing on earnings at other European oil companies — Eni has been buoyed by a cost-reduction program introduced earlier this year, while asset disposals brought down debt. Adjusted net income fell 25% from a year earlier to €1.13 billion ($1.3 billion), the Italian energy company said Friday in a statement. That exceeded the €932.6 million average estimate of analysts surveyed by Bloomberg. Eni said it’s now targeting €3 billion of cost cuts this year, up from €2 billion previously. The company has also reaped billions of euros by offloading stakes in its renewables arm and mobility division, and is in talks to sell half of its carbon capture unit. “The combination of divestments set to come through this year, ongoing ‘self-help,’ as well as the additional cash flow from new ramp-ups sets Eni up for a strong second half of 2025 and 2026,” RBC Europe Ltd. analyst Biraj Borkhataria said in a note. He expects “growing free cash flow and a more resilient balance sheet than we’ve seen for many years.” The shares rose as much as 0.6% at the open in Milan, before trading little changed as of 9:08 a.m. local time. Eni confirmed plans for shareholders’ returns this year. It expects free cash flow before working capital of about €11.5 billion at $70-a-barrel crude, up from previous guidance of €11 billion. The company also raised its forecast for annual earnings from its gas division to €1 billion from €800 million. Net debt shrank to €29.1 billion at the end of June.

Supertanker Hauling Saudi Diesel Heads to Europe
A supertanker carrying a cargo of diesel from the Middle East is en route to the fuel-starved European market, reflecting supply tightness in the region. The VLCC Nissos Keros loaded about 2 million barrels of ultra-low sulfur diesel from Saudi Arabia’s Jubail terminal and is currently signaling France where it’s due to arrive Aug. 30, according to Kpler and ship-tracking data compiled by Bloomberg. The vessel, which usually transports crude oil, was re-configured to carry diesel. Cargoes of the fuel would typically be carried on smaller tankers, but with freight rates elevated after the latest attacks on shipping in the Red Sea, operators have an incentive to clean up dirty tankers to haul products instead and reap the economies of scale. Europe’s diesel market remains under pressure, driven by a combination of lower refinery output, costly rerouting of imports to replace shunned Russian supplies and sanctions-related uncertainty. The arrival of a large shipment may provide temporary relief, but dependence on long-haul imports continues to expose the European market to spikes in freight costs and supply volatility.

Oil Slips on Stronger Dollar, Trade Doubts
Oil fell as the dollar strengthened and conviction waned that the US will reach agreements with key trade partners ahead of a deadline next week. West Texas Intermediate crude slid more than 1% to settle near $65 a barrel after President Donald Trump said the US has a 50-50 chance of striking a trade deal with Europe, a contrast to the optimism the bloc’s diplomats expressed this week. Trump also said most tariff rates are essentially settled now. The effective US tariff rate is at the highest in a century, by some estimates, a potential threat to energy demand. In another headwind, Trump indicated he had no plans to fire Federal Reserve Chair Jerome Powell, boosting the dollar and making the commodities priced in the currency less attractive. Crude has remained in a holding pattern this month, but is down for the year as increased supply from OPEC+ adds to concerns of a looming glut. The group will next meet on Aug. 3 to decide on production levels. On Thursday, one member, Venezuela, was given a production reprieve by a US decision to let Chevron resume pumping oil in the country. “We expect crude to slowly sell off this fall, driven by steady acceleration of stock builds, softening physical markets, reduced refinery margin support and continued deescalation of geopolitically driven supply risk,” Macquarie Group analysts including Vikas Dwivedi wrote in a note. WTI for September delivery fell 1.3% to settle at $65.16 a barrel. Brent for September settlement slipped 1.1% to $68.44 a barrel.

BP to Exit $36B Australian Green Hydrogen Hub
BP Plc will exit its role in a massive green hydrogen production facility planned in Australia as the British oil major refocuses on the fossil fuels that drive its profits. The company told its partners in the Australian Renewable Energy Hub that it plans to leave the project as both operator and equity holder, according to a statement from a BP spokesperson. It’s the latest setback for green hydrogen, a fuel once touted as a key way for Big Oil to profit from the energy transition that has so far proved too costly for mass production and consumption. The AREH project company will take over as operator over the coming months with support from founding partner InterContinental Energy, according to an AREH spokesperson. BP’s decision to exit the project doesn’t reflect the opportunity the hub presents to decarbonize the Pilbara and support the creation of a green iron industry, they said. BP’s entry into the project – once estimated to cost about $36 billion – came at a time when the company sought to rapidly build up a business in low-carbon energy and shrink its oil business. But after years of stock under-performance compared with its peers and the departure of the plan’s architect – Chief Executive Officer Bernard Looney – BP has refined its strategy to focus more squarely on profits than green goals. The company is far from alone in leaving its ambitions for green hydrogen behind. Scores of companies that once saw the fuel as the next big thing in energy have cut back plans as hoped-for cost declines failed to materialize. Also on Thursday, Fortescue Ltd. said it would abandon plans for a $550 million Arizona Hydrogen Project in the US and a $150 million PEM50 Project in Gladstone, Australia – resulting in a pretax writedown of $150 million. Meanwhile, Woodside

Chevron to Cut Positions as Part of Hess Integration
Chevron will “consolidate or eliminate some positions” as part of its integration with Hess Corporation, a Chevron spokesperson told Rigzone. “Chevron completed the merger with Hess Corporation on July 18,” the spokesperson said. “We are working quickly to integrate the Hess workforce and are focused on maintaining safe and reliable operations throughout the transition period,” the spokesperson added. “As part of the integration, we will consolidate or eliminate some positions. As required by the WARN Act, Chevron has provided notice of a planned workforce reduction to appropriate state and local government representatives for Downtown Houston and North Dakota,” the spokesperson went on to state. When asked by Rigzone to confirm how many positions will be affected, the Chevron spokesperson said, “for the WARN Notices issued on July 21, Chevron anticipates a reduction of approximately 575 employees in Downtown Houston and 70 employees in North Dakota”. The spokesperson told Rigzone that “these are difficult decisions which … [the company does] not make lightly”. “We understand the impact this news may have on employees, their families and the communities where we operate,” the spokesperson said. “Our priority is to support our employees through this transition. We are offering severance benefits and outplacement support,” the Chevron representative added. In a statement posted on its website on July 18, Chevron announced that it had completed its acquisition of Hess Corporation following the satisfaction of all necessary closing conditions, including a favorable arbitration outcome regarding Hess’ offshore Guyana asset. “This merger of two great American companies brings together the best in the industry,” Chevron Chairman and CEO Mike Wirth said in that statement. “The combination enhances and extends our growth profile well into the next decade, which we believe will drive greater long-term value to shareholders,” he added. 
In this statement, former Hess Corporation CEO

Coal- and gas-fired power plants have a new best friend: data centers
Abbe Ramanan is a project director at Clean Energy Group. In 2020, the Virginia Assembly passed the Virginia Clean Economy Act, a law that required the state’s largest utility, Dominion Energy, to generate all its electricity from renewable resources by 2045. However, Dominion has found a useful loophole to get around the law’s requirements — data centers. Virginia hosts the largest data center market in the world, and is home to at least 150 hyperscale data centers, with more being proposed. In its recent integrated resource plan, Dominion cited projected energy demand from these data centers as a key reason to delay retiring existing power plants, including the Clover Power Station, a coal-powered peaker plant in Halifax County, a disproportionately low-income region. In addition to delaying peaker retirements, Dominion has proposed building new gas-powered generation, including a 1-GW peaker plant in Chesterfield, a community that already shoulders an undue environmental burden from existing natural gas- and coal-fired generation. Similar stories have played out across the country as data centers become increasingly ubiquitous, particularly in the Southeast. Utilities in Virginia, Georgia, North Carolina and South Carolina have proposed building 20,000 MW of new gas power plants by 2040. Data centers driving the projected load growth are being used to justify this buildout. In Virginia, Georgia and South Carolina, data centers are responsible for at least 65% of projected load growth. Data centers are also delaying the retirement of fossil fuel power plants nationwide, with at least 17 fossil fuel generators originally scheduled for closure now delaying retirement. This new gas buildout, as well as the delayed retirement of fossil fuel generators, overwhelmingly harms Black and brown communities, who face higher energy and environmental burdens. The gas bonanza is especially concerning because the projected demand from data centers could be

Microsoft will invest $80B in AI data centers in fiscal 2025
And Microsoft isn’t the only one that is ramping up its investments into AI-enabled data centers. Rival cloud service providers are all investing in either upgrading or opening new data centers to capture a larger chunk of business from developers and users of large language models (LLMs). In a report published in October 2024, Bloomberg Intelligence estimated that demand for generative AI would push Microsoft, AWS, Google, Oracle, Meta, and Apple to devote a combined $200 billion to capex in 2025, up from $110 billion in 2023. Microsoft is one of the biggest spenders, followed closely by Google and AWS, Bloomberg Intelligence said. Its estimate of Microsoft’s capital spending on AI, at $62.4 billion for calendar 2025, is lower than Smith’s claim that the company will invest $80 billion in the fiscal year to June 30, 2025. Both figures, though, are way higher than Microsoft’s 2020 capital expenditure of “just” $17.6 billion. The majority of the increased spending is tied to cloud services and the expansion of AI infrastructure needed to provide compute capacity for OpenAI workloads. Separately, last October Amazon CEO Andy Jassy said his company planned total capex spend of $75 billion in 2024 and even more in 2025, with much of it going to AWS, its cloud computing division.

John Deere unveils more autonomous farm machines to address skilled labor shortage
Self-driving tractors might be the path to self-driving cars. John Deere has revealed a new line of autonomous machines and tech across agriculture, construction and commercial landscaping. The Moline, Illinois-based John Deere has been in business for 187 years, yet it’s been a regular as a non-tech company showing off technology at the big tech trade show in Las Vegas and is back at CES 2025 with more autonomous tractors and other vehicles. This is not something we usually cover, but John Deere has a lot of data that is interesting in the big picture of tech. The message from the company is that there aren’t enough skilled farm laborers to do the work that its customers need. It’s been a challenge for most of the last two decades, said Jahmy Hindman, CTO at John Deere, in a briefing. Much of the tech will come this fall and after that. He noted that the average farmer in the U.S. is over 58 and works 12 to 18 hours a day to grow food for us. And he said the American Farm Bureau Federation estimates there are roughly 2.4 million farm jobs that need to be filled annually; and the agricultural work force continues to shrink. (This is my hint to the anti-immigration crowd). John Deere’s autonomous 9RX Tractor. Farmers can oversee it using an app. While each of these industries experiences its own set of challenges, a commonality across all is skilled labor availability. In construction, about 80% of contractors struggle to find skilled labor. And in commercial landscaping, 86% of landscaping business owners can’t find labor to fill open positions, he said. “They have to figure out how to do

2025 playbook for enterprise AI success, from agents to evals
2025 is poised to be a pivotal year for enterprise AI. The past year has seen rapid innovation, and this year will see the same. This has made it more critical than ever to revisit your AI strategy to stay competitive and create value for your customers. From scaling AI agents to optimizing costs, here are the five critical areas enterprises should prioritize for their AI strategy this year. 1. Agents: the next generation of automation AI agents are no longer theoretical. In 2025, they’re indispensable tools for enterprises looking to streamline operations and enhance customer interactions. Unlike traditional software, agents powered by large language models (LLMs) can make nuanced decisions, navigate complex multi-step tasks, and integrate seamlessly with tools and APIs. At the start of 2024, agents were not ready for prime time, making frustrating mistakes like hallucinating URLs. They started getting better as frontier large language models themselves improved. “Let me put it this way,” said Sam Witteveen, cofounder of Red Dragon, a company that develops agents for companies, and that recently reviewed the 48 agents it built last year. “Interestingly, the ones that we built at the start of the year, a lot of those worked way better at the end of the year just because the models got better.” Witteveen shared this in the video podcast we filmed to discuss these five big trends in detail. Models are getting better and hallucinating less, and they’re also being trained to do agentic tasks. Another feature that the model providers are researching is a way to use the LLM as a judge, and as models get cheaper (something we’ll cover below), companies can use three or more models to
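The LLM-as-judge pattern mentioned above can be sketched in a few lines. Assumptions: the judge functions here are invented stand-in stubs (in practice each would be a call to a different, cheaper judge model), and majority voting is one common way, not the only way, to combine their verdicts.

```python
from collections import Counter

def judge_with_ensemble(candidate_answer, judges):
    """LLM-as-a-judge sketch: ask several judge models whether a candidate
    answer is acceptable, then take a majority vote across their verdicts."""
    votes = [judge(candidate_answer) for judge in judges]
    verdict, count = Counter(votes).most_common(1)[0]
    return verdict, count / len(votes)   # verdict plus agreement ratio

# Stub judges standing in for three different (cheap) judge models.
judges = [
    lambda a: "pass" if "refund" in a else "fail",   # checks a policy keyword
    lambda a: "pass" if len(a) > 20 else "fail",     # checks minimal detail
    lambda a: "pass" if a.endswith(".") else "fail", # checks basic formatting
]
verdict, agreement = judge_with_ensemble(
    "You can request a refund within 30 days of purchase.", judges)
```

Cheaper models make it economical to run three or more judges per output, and the agreement ratio doubles as a confidence signal for routing low-agreement cases to human review.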

OpenAI’s red teaming innovations define new essentials for security leaders in the AI era
OpenAI has taken a more aggressive approach to red teaming than its AI competitors, demonstrating its security teams’ advanced capabilities in two areas: multi-step reinforcement learning and external red teaming. OpenAI recently released two papers that set a new competitive standard for improving the quality, reliability and safety of AI models in these two techniques and more. The first paper, “OpenAI’s Approach to External Red Teaming for AI Models and Systems,” reports that specialized teams outside the company have proven effective in uncovering vulnerabilities that might otherwise have made it into a released model because in-house testing techniques may have missed them. In the second paper, “Diverse and Effective Red Teaming with Auto-Generated Rewards and Multi-Step Reinforcement Learning,” OpenAI introduces an automated framework that relies on iterative reinforcement learning to generate a broad spectrum of novel, wide-ranging attacks. Going all-in on red teaming pays practical, competitive dividends It’s encouraging to see competitive intensity in red teaming growing among AI companies. When Anthropic released its AI red team guidelines in June of last year, it joined AI providers including Google, Microsoft, Nvidia, OpenAI, and even the U.S. National Institute of Standards and Technology (NIST), all of which had released red teaming frameworks. Investing heavily in red teaming yields tangible benefits for security leaders in any organization. OpenAI’s paper on external red teaming provides a detailed analysis of how the company strives to create specialized external teams that include cybersecurity and subject matter experts. The goal is to see if knowledgeable external teams can defeat models’ security perimeters and find gaps in their security, biases and controls that prompt-based testing couldn’t find.
What makes OpenAI’s recent papers noteworthy is how well they define using human-in-the-middle

Three Aberdeen oil company headquarters sell for £45m
Three Aberdeen oil company headquarters have been sold in a deal worth £45 million. The CNOOC, Apache and Taqa buildings at the Prime Four business park in Kingswells have been acquired by EEH Ventures. The trio of buildings, totalling 275,000 sq ft, were previously owned by Canadian firm BMO. The financial services powerhouse first bought the buildings in 2014 but decided to sell them as part of a “long-standing strategy to reduce their office exposure across the UK”. The deal was the largest to take place in Scotland during the last quarter of 2024.

Trio of buildings snapped up

London-headquartered EEH Ventures was founded in 2013 and owns a number of residential properties, offices, shopping centres and hotels throughout the UK. All three Kingswells-based buildings were pre-let, designed and constructed by Aberdeen property developer Drum in 2012 on a 15-year lease. The Aberdeen headquarters of Taqa. Image: supplied by CBRE. The North Sea headquarters of Middle East oil firm Taqa has previously been described as “an amazing success story in the Granite City”. Taqa announced in 2023 that it intends to cease production from all of its UK North Sea platforms by the end of 2027. Meanwhile, Apache revealed at the end of last year that it is planning to exit the North Sea by the end of 2029, blaming the windfall tax. The US firm first entered the North Sea in 2003 but will wrap up all of its UK operations by 2030.

Aberdeen big deals

The Prime Four acquisition wasn’t the biggest Granite City commercial property sale of 2024. American private equity firm Lone Star bought Union Square shopping centre from Hammerson for £111m. Aberdeen city centre. Image: Shutterstock. Hammerson, which also built the property, had originally been seeking £150m. BP’s North Sea headquarters in Stoneywood, Aberdeen, was also sold. Manchester-based

2025 ransomware predictions, trends, and how to prepare
Zscaler ThreatLabz research team has revealed critical insights and predictions on ransomware trends for 2025. The latest Ransomware Report uncovered a surge in sophisticated tactics and extortion attacks. As ransomware remains a key concern for CISOs and CIOs, the report sheds light on actionable strategies to mitigate risks.

Top Ransomware Predictions for 2025:

● AI-Powered Social Engineering: In 2025, GenAI will fuel voice phishing (vishing) attacks. With the proliferation of GenAI-based tooling, initial access broker groups will increasingly leverage AI-generated voices that sound ever more realistic, adopting local accents and dialects to enhance credibility and success rates.
● The Trifecta of Social Engineering Attacks: Vishing, ransomware and data exfiltration. Sophisticated ransomware groups, like the Dark Angels, will continue the trend of low-volume, high-impact attacks: focusing on an individual company, stealing vast amounts of data without encrypting files, and evading media and law enforcement scrutiny.
● Targeted Industries Under Siege: Manufacturing, healthcare, education and energy will remain primary targets, with no slowdown in attacks expected.
● New SEC Regulations Drive Increased Transparency: 2025 will see an uptick in reported ransomware attacks and payouts due to new, tighter SEC requirements mandating that public companies report material incidents within four business days.
● Ransomware Payouts Are on the Rise: In 2025, ransom demands will most likely increase due to an evolving ecosystem of cybercrime groups specializing in designated attack tactics, and collaboration by groups that have entered a sophisticated profit-sharing model using Ransomware-as-a-Service.

To combat damaging ransomware attacks, Zscaler ThreatLabz recommends the following strategies:

● Fighting AI with AI: As threat actors use AI to identify vulnerabilities, organizations must counter with AI-powered zero trust security systems that detect and mitigate new threats.
● Advantages of adopting a Zero Trust architecture: A Zero Trust cloud security platform stops

New AI architecture delivers 100x faster reasoning than LLMs with just 1,000 training examples
Singapore-based AI startup Sapient Intelligence has developed a new AI architecture that can match, and in some cases vastly outperform, large language models (LLMs) on complex reasoning tasks, all while being significantly smaller and more data-efficient. The architecture, known as the Hierarchical Reasoning Model (HRM), is inspired by how the human brain utilizes distinct systems for slow, deliberate planning and fast, intuitive computation. The model achieves impressive results with a fraction of the data and memory required by today’s LLMs. This efficiency could have important implications for real-world enterprise AI applications where data is scarce and computational resources are limited.

When faced with a complex problem, current LLMs largely rely on chain-of-thought (CoT) prompting, breaking down problems into intermediate text-based steps, essentially forcing the model to “think out loud” as it works toward a solution. While CoT has improved the reasoning abilities of LLMs, it has fundamental limitations. In their paper, researchers at Sapient Intelligence argue that “CoT for reasoning is a crutch, not a satisfactory solution. It relies on brittle, human-defined decompositions where a single misstep or a misorder of the steps can derail the reasoning process entirely.”

CoSyn: The open-source tool that’s making GPT-4V-level vision AI accessible to everyone
Researchers at the University of Pennsylvania and the Allen Institute for Artificial Intelligence have developed a groundbreaking tool that allows open-source AI systems to match or surpass the visual understanding capabilities of proprietary models like GPT-4V and Gemini 1.5 Flash, potentially reshaping the competitive landscape between open and closed AI development.
The tool, called CoSyn (Code-Guided Synthesis), addresses a critical bottleneck in AI development: the scarcity of high-quality training data for teaching machines to understand complex visual information like scientific charts, medical diagrams, and financial documents. Rather than scraping millions of images from the internet — a practice fraught with copyright and ethical concerns — CoSyn leverages the coding abilities of existing language models to generate synthetic training data.
“We have, we lack of such data to train the model. We lack of data, like documents, charts with rich annotations to train a vision language model to do question answering over those images,” explained Yue Yang, a recent Penn Engineering Ph.D. graduate and co-first author of the research, during an exclusive interview with VentureBeat. “Those images actually are more challenging to annotate, compared to natural photos, like a picture of a dog of a cat of a house.”
The breakthrough comes as enterprises increasingly seek AI systems capable of understanding and reasoning about complex visual information — capabilities essential for everything from automated document processing to AI agents that can navigate digital interfaces independently. The work was conducted during Yang’s internship with the PRIOR team at the Allen Institute for AI and supported by the Office of the Director of National Intelligence, Intelligence Advanced Research Projects Activity, and the Defense Advanced Research Projects Agency.
How synthetic data generation solves AI’s biggest training challenge
The challenge of training AI to understand text-rich images has long plagued the field. Unlike natural photographs, scientific figures, charts, and documents require extensive annotation work that is both time-consuming and expensive. Traditional approaches have relied on harvesting images and their alt-text descriptions from the internet, but this method produces training data that is often superficial and legally problematic.
CoSyn takes a fundamentally different approach by recognizing that most text-rich images are originally created through code — Python scripts generate charts, LaTeX renders mathematical equations, HTML creates web interfaces. The research team’s insight was to reverse this process: use language models’ proven coding abilities to generate the underlying code, then execute that code to create realistic synthetic images.
“One intuition is actually those images like charts documents. We render them from programs from code, like we use Python to generate charts. We use, like latex or word to write our documents,” Yang said. “So how about we go through the reverse way, like we generated the code because the text only language model has been proved very good at writing code.”
Chris Callison-Burch, a computer science professor at Penn who co-advised the research, described the approach in simpler terms: “This is like taking a student who’s great at writing and asking them to teach someone how to draw, just by describing what the drawing should look like. We’re essentially transferring the strengths of open-source AI from text to vision.”
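The reverse-rendering loop described above can be sketched in a few lines of Python. Everything here is an illustrative assumption rather than CoSyn’s actual code: the “LLM-written” renderer is hard-coded as a string, and a tiny SVG bar chart stands in for a Matplotlib figure. The shape of the pipeline is the point: generate rendering code, execute it to get an image, and pair the image with a question whose answer is known exactly because the generator knows the underlying data.

```python
# Hypothetical stand-in for the LLM call: in CoSyn, a language model is
# prompted to write the rendering program itself.
def generate_chart_code(values):
    # Returns Python source that draws a tiny SVG bar chart.
    return f"""
values = {values}
bar_width = 20
parts = ['<svg xmlns="http://www.w3.org/2000/svg" width="200" height="100">']
for i, v in enumerate(values):
    parts.append(f'<rect x="{{i * bar_width}}" y="{{100 - v}}" width="18" height="{{v}}" />')
parts.append('</svg>')
svg = ''.join(parts)
"""

def synthesize_example(values):
    code = generate_chart_code(values)   # step 1: "LLM" writes rendering code
    namespace = {}
    exec(code, namespace)                # step 2: execute the code to render the image
    # step 3: attach a question-answer pair that is correct by construction,
    # since the generator knows the data behind the image
    return {
        "image": namespace["svg"],
        "question": "Which bar is the tallest?",
        "answer": str(values.index(max(values))),
    }

example = synthesize_example([30, 80, 55])
```

Because the annotation is derived from the same data the code renders, the training pair never needs a human labeler, which is the cost advantage the researchers describe.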
CoSyn-trained models outperform GPT-4V and Gemini on key benchmarks
The results are striking. Using their synthetic dataset of 400,000 images and 2.7 million instruction pairs, models trained with CoSyn achieved state-of-the-art performance among open-source systems and surpassed proprietary models on seven benchmark tests measuring text-rich image understanding.
On average, their 7-billion parameter model scored 80.9% across the benchmark suite, outperforming the previous best open-source model (Llama 3.2 11B) by 3.9 percentage points. More remarkably, even their “zero-shot” model—trained without any examples from the evaluation datasets—outperformed most open and closed models, demonstrating the transferability of capabilities learned from synthetic data.
CoSyn-trained models outperformed GPT-4V and Gemini 1.5 Flash across seven text-rich image understanding benchmarks. (Credit: github.io/cosyn)
In one particularly compelling demonstration, the researchers created a new benchmark called NutritionQA, consisting of 100 questions about nutrition label photographs. Using just 7,000 synthetically generated nutrition labels for training, their model outperformed others trained on millions of real images. “Despite being trained on millions of images, we observe that open-source VLMs are not data-efficient and perform poorly on this novel task compared to GPT-4V,” the researchers wrote in their paper.
Yang emphasized the significance: “Those big packs, they have so many resources to collecting data to run a lot of experiments, and I but I think open source models, we can give access to people, the model weights, the data we trained, or even the code, the training script, everything people can developers can build upon.”
Real companies are already using vision AI for quality control and automation
The technology is already finding real-world applications across industries. Callison-Burch cited an example from one of his teaching assistants whose company uses vision-language models for cable installation quality assurance: “They have the workers on site who are doing the installation take photographs of the processes they’re doing it, and they use that to automatically validate that each step has been followed properly.”
This type of specialized visual understanding could transform numerous enterprise workflows, from automated document processing in financial services to quality control in manufacturing. The ability to train models on specific visual tasks using synthetic data means companies can develop AI systems tailored to their particular needs without the massive data collection efforts traditionally required.
For enterprise decision makers, the research suggests a shift in how to approach AI data strategies. “I think synthetic data is a very promising way to remove the effort for human annotation. It costs less money, and it will just automatically generate large scale data, and also can avoid some copyright issues,” Yang noted.
The persona-driven approach that makes AI training data more diverse
One of CoSyn’s key innovations is its approach to ensuring data diversity. To prevent the repetitive outputs common in AI-generated content, the system employs what the researchers call a “persona-driven mechanism.” Each time CoSyn generates a synthetic example, it pairs the request with a randomly sampled persona—a short description like “a sci-fi novelist constantly bouncing off ideas for new alien worlds” or “a chemistry teacher preparing lab materials.”
“Every time we generate one syntax data, we will appear with a randomly sampled persona,” Yang explained. “This will diversify the content and styles of the examples we generated, because, like, if I provide the persona of like a PhD student, it will generate something more scientific or more about, something about academia.”
This approach enables the system to generate content across nine different categories: charts, documents, math problems, tables, diagrams, vector graphics, music sheets, electrical circuits, and chemical structures. The researchers used 11 different rendering tools, from Python’s Matplotlib for charts to LaTeX for mathematical expressions, supported by 20 specialized generation pipelines.
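The persona-sampling step can be sketched as follows. The three-persona pool and the prompt wording are invented stand-ins, not the paper’s actual template; CoSyn samples from a far larger persona set.

```python
import random

# Illustrative persona pool; CoSyn draws from a much larger collection.
PERSONAS = [
    "a sci-fi novelist constantly bouncing off ideas for new alien worlds",
    "a chemistry teacher preparing lab materials",
    "a financial analyst summarizing quarterly results",
]

def build_generation_prompt(category, rng=random):
    # Pairing each request with a randomly sampled persona steers the
    # generator toward different content and styles, reducing the
    # repetition common in AI-generated data.
    persona = rng.choice(PERSONAS)
    return (
        f"You are {persona}. "
        f"Write code that renders a realistic {category} "
        "you might create in your work."
    )

prompt = build_generation_prompt("chart")
```

Passing a seeded `random.Random` instance as `rng` makes the sampling reproducible, which helps when regenerating a dataset.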
Why this breakthrough could level the playing field between open source and Big Tech
The implications for the broader AI industry are significant. Major technology companies like OpenAI and Google have invested billions in developing their proprietary vision-language capabilities, creating systems whose training methods and data sources remain trade secrets. CoSyn offers a path for open-source alternatives to compete without requiring similar resource investments.
“Open source models still like, like behind those closed source models, but with all the efforts, all the resources from the open source community, everyone, like, we’ve had more efforts. We have more like energy, like from, from everyone. So I think finally we can catch up,” Yang said.
The commitment to openness extends beyond just releasing the model. The complete CoSyn codebase, the 400,000-image dataset, and all training scripts are publicly available, enabling researchers and companies worldwide to build upon the work. “From the academia side, like a lot of research is built upon openness, like we need all access to the data, code, everything to discover new findings to support our claims in the papers,” Yang emphasized.
This transparency addresses growing concerns about the black-box nature of proprietary AI systems. “If you only rely on the APIs for like open AI, this may not be reliable to prove your like scientific discoveries, because they may just. Something in the back end you never know,” Yang noted.
Beyond static image understanding, CoSyn is pioneering capabilities crucial for the next generation of AI agents—systems that can autonomously navigate digital interfaces and perform complex tasks. The researchers developed synthetic “pointing data” that teaches models exactly where to click on screenshots, a fundamental requirement for web-based automation.
Using 65,000 synthetic screenshots with click annotations, their model achieved state-of-the-art performance on ScreenSpot, a benchmark for click prediction, outperforming systems trained on 1.3 million real screenshots. “We only use like several 100k synthetic screenshot, we can outperform previous model on millions of screenshots,” Yang said.
This capability is essential as the industry moves toward AI agents that can perform knowledge work autonomously. “There’s sort of like two prevailing models and how you might go about implementing agents,” Callison-Burch explained. One approach uses specialized APIs, while the other relies on agents that “literally just use web browsing capabilities in the same way that you and I do.”
The vision-based approach, enabled by technologies like CoSyn, could prove more versatile: “You’re not just calling up software function, which is relatively straightforward, but you actually have to, like, take screenshots of the current state of the web browser. Reason about where to click, navigate your mouse to that location to click.”
How synthetic data sidesteps the growing copyright crisis in AI training
The synthetic data approach also provides a potential solution to mounting legal challenges around AI training data. With ongoing litigation over whether training on copyrighted materials constitutes fair use, synthetic data generation offers an alternative path that sidesteps many intellectual property concerns.
Callison-Burch, who testified before Congress on AI and copyright in 2023, sees synthetic data as complementary to, rather than replacing, real-world training data: “I don’t think that synthetic data eliminates the need for having wide amounts of diverse training data like that’s still a core element to training AI systems, but it does allow you to extend their capabilities in really remarkable ways.”
The approach demonstrates how existing knowledge can be transferred to new applications without directly using copyrighted materials. “The underlying thing that we’re relying on here is a large language model. Can write code that’s something that it learned from its original data. We’re now applying that to a totally different application, which is creation of new training data that is unlike any of the data that it was trained on.”
The current limits of synthetic data and what comes next
Despite its promise, synthetic data generation faces important limitations. “One limitation is it may inherit the biases from the model that generates such synthetic data,” Yang acknowledged. The system can also struggle with diversity: “If you prompt a large network to generate some data among different runs, it may generate similar data.”
The current research focuses on text-rich images rather than natural photographs, limiting its immediate applicability to some domains. “What about some real photos like some other like natural images? It is hard to generate synthetic data for those two males, or even like medical images, chest X rays,” Yang noted, though she indicated ongoing efforts to extend the approach to medical imaging.
Looking ahead, Yang expects synthetic data generation to become standard practice: “In the future, in two or three years, and even for nothing, editor has been a very important component to teach model different capabilities.” However, she emphasized that optimal results will likely require combining synthetic and real-world data: “Real world data will reflect some real world distributions. Single data can be large scale. Can be more controllable.”
Early adoption signals suggest the technology is already influencing industry practices. “I heard like companies, like meta, some teams also, like all Amazon, they are trying to using our data to train their model,” Yang revealed during the interview.
For startups and smaller companies, the cost advantages could be particularly significant. “For some startups, it is cheaper to host, their host open model on their server, rather than just calling the APIs, which is less controllable,” Yang noted.
The research team’s decision to make everything open source reflects a broader philosophy about AI development. As Yang prepares to join the Allen Institute full-time after completing her Ph.D., the commitment to open science remains central to their mission. “Currently, those vision language models are quite brittle. It just needs the right data to get the right capabilities,” she said. “If you find the right data, you can improve models capability on it, and it will benefit the society.”
The vision for AI that acts, not just describes
As the research moves from academic laboratories to real-world applications, the implications extend far beyond improved benchmark scores. Yang and her colleagues are already looking toward applications that could transform how people with disabilities interact with technology, from AI that understands sign language for the hearing impaired to systems that can describe complex medical images for those with visual impairments.
“I have an idea to let the model to know how to understand the sign language or those people with hearing difficulties,” Yang said, describing potential future applications. “If you find the right data, you can improve models capability on it, and it will benefit the society.”
Callison-Burch sees even broader possibilities, particularly in robotics and scientific discovery: “Synthetic data opens up many possible applications that we don’t have naturally occurring data for. So one that Yang has also worked on at the Allen Institute is that. Ocean of creating simulated training data for robots.”
The work represents more than just a technical achievement—it’s a demonstration that open-source AI development can compete with the well-funded efforts of major technology companies through innovative approaches to fundamental challenges. As Yang noted in reflecting on her decision to join the Allen Institute rather than accept higher-paying offers from companies like Meta: “I think it’s still a very early stage of those multimodal models, and there are not much resources, open resources, or knowledge to share to the community.”
The message is clear: in the race to build AI that can truly see and understand the world, the advantage may not always go to those with the deepest pockets, but to those with the most creative solutions.

It’s Qwen’s summer: new open source Qwen3-235B-A22B-Thinking-2507 tops OpenAI, Gemini reasoning models on key benchmarks
If the AI industry had an equivalent to the recording industry’s “song of the summer” — a hit that catches on in the warmer months here in the Northern Hemisphere and is heard playing everywhere — the clear honoree for that title would go to Alibaba’s Qwen Team. Over just the past week, the frontier model AI research division of the Chinese e-commerce behemoth has released not one, not two, not three, but four (!!) new open source generative AI models that offer record-setting benchmarks, besting even some leading proprietary options. Last night, Qwen Team capped it off with the release of Qwen3-235B-A22B-Thinking-2507, its updated reasoning large language model (LLM), which takes longer to respond than a non-reasoning or “instruct” LLM, engaging in “chains of thought” or self-reflection and self-checking that hopefully result in more correct and comprehensive responses on more difficult tasks. Indeed, the new Qwen3-Thinking-2507, as we’ll call it for short, now leads or closely trails top-performing models across several major benchmarks. As AI influencer and news aggregator Andrew Curran wrote on X: “Qwen’s strongest reasoning model has arrived, and it is at the frontier.” On the AIME25 benchmark — designed to evaluate problem-solving ability in mathematical and logical contexts — Qwen3-Thinking-2507 posts a score of 92.3, within a hair of both OpenAI’s o4-mini (92.7) and Gemini-2.5

The Download: saving the US climate programs, and America’s AI protections are under threat
This is today’s edition of The Download, our weekday newsletter that provides a daily dose of what’s going on in the world of technology. How nonprofits and academia are stepping up to salvage US climate programs Nonprofits are trying to preserve a US effort to modernize greenhouse-gas measurements, amid growing fears that the Trump administration’s dismantling of federal programs will obscure the nation’s contributions to climate change. The Data Foundation, a Washington, DC, nonprofit, is fundraising for an initiative that will coordinate efforts among nonprofits, technical experts, and companies to improve the accuracy and accessibility of climate emissions information. It will build on an effort to improve the collection of emissions data that former president Joe Biden launched in 2023—and which President Trump nullified on his first day in office.
The new greenhouse-gas coalition is one of a growing number of nonprofit and academic groups that have spun up or shifted focus to keep essential climate monitoring and research efforts going amid the Trump administration’s assault on environmental funding, staffing, and regulations. Read the full story. —James Temple
America’s AI watchdog is losing its bite

Most Americans encounter the Federal Trade Commission only if they’ve been scammed: it handles identity theft, fraud, and stolen data. During the Biden administration, the agency went after AI companies for scamming customers with deceptive advertising or harming people by selling irresponsible technologies. With the announcement of President Trump’s AI Action Plan, that era may now be over. The new plan suggests that the Trump administration believes the agency’s previous actions went too far, and that it would be reviewing all FTC actions taken under the Biden administration. The move is the latest in the administration’s evolving attack on the agency, which provides a significant route of redress for people harmed by AI in the US. It’s likely to result in faster deployment of AI with fewer checks on accuracy, fairness, or consumer harm. Read the full story. —James O’Donnell

Trump’s AI Action Plan is a distraction

—Asad Ramzanali is the director of artificial intelligence and technology policy at the Vanderbilt Policy Accelerator.
On Wednesday, President Trump issued three executive orders, delivered a speech, and released an action plan, all on the topic of continuing American leadership in AI. This flurry of actions made for glitzy press moments, including an hour-long speech from the president and onstage signings. But while the tech industry cheered these announcements (which will swell their coffers), they obscured the fact that the administration is currently decimating the very policies that enabled America to become the world leader in AI in the first place. Read the full story.

The deadly saga of the controversial gene therapy Elevidys

It has been a grim few months for the Duchenne muscular dystrophy (DMD) community. There had been some excitement when, a couple of years ago, a gene therapy for the disorder was approved by the US Food and Drug Administration for the first time. That drug, Elevidys, has now been implicated in the deaths of two teenage boys. The drug’s approval was always controversial—there was a lack of evidence that it actually worked, for starters. But the agency that once rubber-stamped the drug has now turned on its manufacturer, Sarepta Therapeutics. In a remarkable chain of events, the FDA asked the company to stop shipping the drug on July 18. Sarepta refused to comply. In the days since, the company has acquiesced. But its reputation has already been hit. And the events have dealt a devastating blow to people desperate for treatments that might help them, their children, or other family members with DMD. Read the full story. —Jessica Hamzelou

A version of this article first appeared in The Checkup, MIT Technology Review’s weekly biotech newsletter. To receive it in your inbox every Thursday, and read articles like this first, sign up here.
The must-reads
I’ve combed the internet to find you today’s most fun/important/scary/fascinating stories about technology.

1 Corporate America is paying the price for Trump’s tariffs
US businesses are absorbing the costs—for now. (WSJ $)
+ Inflation is likely to hit hard in the fall. (Vox)
+ Sweeping tariffs could threaten the US manufacturing rebound. (MIT Technology Review)

2 GPT-5 is reportedly launching next month
After some unexpected setbacks. (The Verge)

3 Meta is hosting ads crowdfunding for IDF drones
A watchdog has identified more than 100 ads seeking donations for the army. (The Guardian)

4 AI is helping researchers to combat long covid and ME
A new platform spots biological markers of the conditions in patients. (FT $)
+ Scientists are finding signals of long covid in blood. They could lead to new treatments. (MIT Technology Review)

5 Demand is surging for banned chip repair expertise in China
While most of Nvidia’s chips aren’t allowed in the country, there’s a booming industry for fixing them once they break. (Reuters)
+ Nvidia’s chips are being smuggled in through the black market. (FT $)

6 ChatGPT can offer up instructions for self harm and devil worship
It guides users through self-mutilation rituals, despite violating its own policies. (The Atlantic $)

7 We have more steel than we can possibly use
But countries are worried about the optics of ceasing production. (NYT $)
+ This startup just hit a big milestone for green steel production. (MIT Technology Review)

8 Internet age checks are coming
A swath of child protection laws are forcing a profound shift across the web. (Wired $)
+ Child online safety laws will actually hurt kids, critics say. (MIT Technology Review)

9 An Italian rocket maker wants to conduct launches in the US
Avio SpA is keen to launch flights from Wallops Island in Virginia. (Bloomberg $)
+ Rivals are rising to challenge the dominance of SpaceX. (MIT Technology Review)

10 This app allows women to check a potential date’s history
But men say there’s no recourse to address false posts about them. (WP $)
Quote of the day

“If OpenAI’s ChatGPT or Google’s Gemini had responded that it was trained to appeal to the left, congressional Republicans would have been outraged and opened an investigation. Instead, they were silent.”

—Senator Ed Markey urges the CEOs of major tech companies to fight Donald Trump’s anti-woke AI order, Ars Technica reports.

One more thing

The great AI consciousness conundrum

AI consciousness isn’t just a devilishly tricky intellectual puzzle; it’s a morally weighty problem with potentially dire consequences that philosophers, cognitive scientists, and engineers alike are currently grappling with.

Fail to identify a conscious AI, and you might unintentionally subjugate a being whose interests ought to matter. Mistake an unconscious AI for a conscious one, and you risk compromising human safety and happiness for the sake of an unthinking, unfeeling hunk of silicon and code.

Over the past few decades, a small research community has doggedly attacked the question of what consciousness is and how it works. The effort has yielded real progress. And now, with the rapid advance of AI technology, these insights could offer our only guide to the untested, morally fraught waters of artificial consciousness. Read the full story.

—Grace Huckins

We can still have nice things

A place for comfort, fun and distraction to brighten up your day. (Got any ideas? Drop me a line or skeet ’em at me.)

+ Crank it up to 11! Spinal Tap is back, baby.
+ The tale of New York’s first real architectural firm.
+ Crazy Train as played by a class of children on the xylophone is a real delight.
+ Here’s how to easily add some cheap and accessible superfoods to zhuzh up your daily diet.

The deadly saga of the controversial gene therapy Elevidys
It has been a grim few months for the Duchenne muscular dystrophy (DMD) community. There had been some excitement when, a couple of years ago, a gene therapy for the disorder was approved by the US Food and Drug Administration for the first time. That drug, Elevidys, has now been implicated in the deaths of two teenage boys.

The drug’s approval was always controversial—there was a lack of evidence that it actually worked, for starters. But the agency that once rubber-stamped the drug has now turned on its manufacturer, Sarepta Therapeutics. In a remarkable chain of events, the FDA asked the company to stop shipping the drug on July 18. Sarepta refused to comply. In the days since, the company has acquiesced. But its reputation has already been hit. And the events have dealt a devastating blow to people desperate for treatments that might help them, their children, or other family members with DMD.

DMD is a rare genetic disorder that causes muscles to degenerate over time. It’s caused by a mutation in a gene that codes for a protein called dystrophin. That protein is essential for muscles—without it, muscles weaken and waste away. The disease mostly affects boys, and symptoms usually start in early childhood.

At first, affected children usually start to find it hard to jump or climb stairs. But as the disease progresses, other movements become difficult too. Eventually, the condition might affect the heart and lungs. The life expectancy of a person with DMD has recently improved, but it is still only around 30 or 40 years. There is no cure. It’s a devastating diagnosis.

Elevidys was designed to replace missing dystrophin with a shortened, engineered version of the protein. In June 2023, the FDA approved the therapy for eligible four- and five-year-olds. It came with a $3.2 million price tag.
The approval was celebrated by people affected by DMD, says Debra Miller, founder of CureDuchenne, an organization that funds research into the condition and offers support to those affected by it. “We’ve not had much in the way of meaningful therapies,” she says. “The excitement was great.”

But the approval was controversial. It came under an “accelerated approval” program that essentially lowers the bar of evidence for drugs designed to treat “serious or life-threatening diseases where there is an unmet medical need.” Elevidys was approved because it appeared to increase levels of the engineered protein in patients’ muscles. But it had not been shown to improve patient outcomes: It had failed a randomized clinical trial.

The FDA approval was granted on the condition that Sarepta complete another clinical trial. The topline results of that trial were described in October 2023 and were published in detail a year later. Again, the drug failed to meet its “primary endpoint”—in other words, it didn’t work as well as hoped.

In June 2024, the FDA expanded the approval of Elevidys. It granted traditional approval for the drug to treat people with DMD who are over the age of four and can walk independently, and another accelerated approval for those who can’t. Some experts were appalled at the FDA’s decision—even some within the FDA disagreed with it.

But things weren’t so simple for people living with DMD. I spoke to some parents of such children a couple of years ago. They pointed out that drug approvals can help bring interest and investment to DMD research. And, above all, they were desperate for any drug that might help their children. They were desperate for hope.

Unfortunately, the treatment does not appear to be delivering on that hope. There have always been questions over whether it works. But now there are serious questions over how safe it is. In March 2025, a 16-year-old boy died after being treated with Elevidys. 
He had developed acute liver failure (ALF) after having the treatment, Sarepta said in a statement. On June 15, the company announced a second death—a 15-year-old who also developed ALF following Elevidys treatment. The company said it would pause shipments of the drug, but only for patients who are not able to walk.
The following day, Sarepta held an online presentation in which CEO Doug Ingram said that the company was exploring ways to make the treatment safer, perhaps by treating recipients with another drug that dampens their immune systems. But that same day, the company announced that it was laying off 500 employees—36% of its workforce. Sarepta did not respond to a request for comment.

On June 24, the FDA announced that it was investigating the risks of serious outcomes “including hospitalization and death” associated with Elevidys, and “evaluating the need for further regulatory action.”

There was more tragic news on July 18, when there were reports that a third patient had died following a Sarepta treatment. This patient, a 51-year-old, hadn’t been taking Elevidys but was enrolled in a clinical trial for a different Sarepta gene therapy designed to treat limb-girdle muscular dystrophy. The same day, the FDA asked Sarepta to voluntarily pause all shipments of Elevidys. Sarepta refused to do so.

The refusal was surprising, says Michael Kelly, chief scientific officer at CureDuchenne: “It was an unusual step to take.” After significant media coverage, including reporting that the FDA was “deeply troubled” by the decision and would use its “full regulatory authority,” Sarepta backed down a few days later. On July 21, the company announced its decision to “voluntarily and temporarily” pause all shipments of Elevidys in the US.

Sarepta says it will now work with the FDA to address safety and labeling concerns. But in the meantime, the saga has left the DMD community grappling with “a mix of disappointment and concern,” says Kelly. Many are worried about the risks of taking the treatment. Others are devastated that they are no longer able to access it. Miller says she knows of families who have been working with their insurance providers to get authorization for the drug. “It’s like the rug has been pulled out from under them,” she says. 
Many families have no other treatment options. “And we know what happens when you do nothing with Duchenne,” she says. Others, particularly those with teenage children with DMD, are deciding against trying the drug, she adds. The decision over whether to take Elevidys was already a personal one based on several factors, says Kelly. People with DMD and their families deserve clear and transparent information about the treatment in order to make that decision.
The FDA’s decision to approve Elevidys was made on limited data, says Kelly. But as things stand today, over 900 people have been treated with Elevidys. “That gives the FDA… an opportunity to look at real data and make informed decisions,” he says. “Families facing Duchenne do not have time to waste,” Kelly says. “They must navigate a landscape where hope is tempered by the realities of medical complexity.” A version of this article first appeared in The Checkup, MIT Technology Review’s weekly biotech newsletter. To receive it in your inbox every Thursday, and read articles like this first, sign up here.

Meta announces its Superintelligence Labs Chief Scientist: former OpenAI GPT-4 co-creator Shengjia Zhao
Meta has appointed Shengjia Zhao, a former OpenAI researcher and co-creator of GPT-4, as Chief Scientist of its newly created Meta Superintelligence Labs (MSL). The announcement was made Friday by Mark Zuckerberg on Threads, who noted that Zhao will lead the lab’s scientific agenda alongside him and Alexandr Wang, the former CEO of Scale AI whom Meta recently brought onboard as Chief AI Officer.

“I am very excited to take up the role of chief scientist for meta super-intelligence labs. Looking forward to building asi [artificial superintelligence] and aligning it to empower people with the amazing team here. Let’s build!” Zhao wrote in his own Threads post.

“Artificial superintelligence” is a nebulous term used in the AI industry to describe systems more powerful and capable than any today, beyond even the smartest humans, making them difficult to control.

Zhao’s strong commercial AI background

Zhao, who previously worked at OpenAI, played a key role in the development of foundational models like GPT-4 and GPT-4o, according to arXiv system cards and research papers listing him as a co-author. He’s also known for his academic work on generative models and fair representations, with widely cited papers in venues like NeurIPS, ICML, and ICLR.

Zhao joins Meta amid a high-stakes hiring blitz across the AI industry.

Supertanker Hauling Saudi Diesel Heads to Europe
A supertanker carrying a cargo of diesel from the Middle East is en route to the fuel-starved European market, reflecting supply tightness in the region. The VLCC Nissos Keros loaded about 2 million barrels of ultra-low sulfur diesel from Saudi Arabia’s Jubail terminal and is currently signaling France, where it’s due to arrive Aug. 30, according to Kpler and ship-tracking data compiled by Bloomberg.

The vessel, which usually transports crude oil, was reconfigured to carry diesel. Cargoes of the fuel would typically be carried on smaller tankers, but with freight rates elevated after the latest attacks on shipping in the Red Sea, operators have an incentive to clean up dirty tankers to haul products instead and reap the economies of scale.

Europe’s diesel market remains under pressure, driven by a combination of lower refinery output, costly rerouting of imports to replace shunned Russian supplies, and sanctions-related uncertainty. The arrival of a large shipment may provide temporary relief, but dependence on long-haul imports continues to expose the European market to spikes in freight costs and supply volatility.

New AI architecture delivers 100x faster reasoning than LLMs with just 1,000 training examples
Singapore-based AI startup Sapient Intelligence has developed a new AI architecture that can match, and in some cases vastly outperform, large language models (LLMs) on complex reasoning tasks, all while being significantly smaller and more data-efficient.

The architecture, known as the Hierarchical Reasoning Model (HRM), is inspired by how the human brain utilizes distinct systems for slow, deliberate planning and fast, intuitive computation. The model achieves impressive results with a fraction of the data and memory required by today’s LLMs. This efficiency could have important implications for real-world enterprise AI applications where data is scarce and computational resources are limited.

When faced with a complex problem, current LLMs largely rely on chain-of-thought (CoT) prompting, breaking down problems into intermediate text-based steps, essentially forcing the model to “think out loud” as it works toward a solution.

While CoT has improved the reasoning abilities of LLMs, it has fundamental limitations. In their paper, researchers at Sapient Intelligence argue that “CoT for reasoning is a crutch, not a satisfactory solution. It relies on brittle, human-defined decompositions where a single misstep or a misorder of the steps can derail the reasoning process entirely.”
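The slow-planner/fast-worker split can be pictured as two nested update loops running at different rates. The toy sketch below illustrates only that nesting idea, with made-up integer states; it is not Sapient's architecture, whose actual modules are learned recurrent networks described in their paper.

```python
# Toy illustration of a two-timescale update scheme: a fast "low-level"
# loop takes several steps for every single "high-level" update, which
# then consumes the low-level result. States are plain integers here
# purely to make the nesting visible; this is a schematic, not HRM.

def hierarchical_step(high_state, low_state, n_low_steps=4):
    for _ in range(n_low_steps):        # fast, fine-grained updates
        low_state = low_state + 1
    high_state = high_state + low_state  # slow update consumes the result
    return high_state, low_state

high, low = 0, 0
for _ in range(3):                       # three slow "planning" cycles
    high, low = hierarchical_step(high, low)
# The low state advances 4 steps per cycle; the high state updates once
# per cycle, aggregating the low-level work.
```

The point of the nesting is that deliberate, coarse decisions happen far less often than the fine-grained computation they steer.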

Oil Slips on Stronger Dollar, Trade Doubts
Oil fell as the dollar strengthened and conviction waned that the US will reach agreements with key trade partners ahead of a deadline next week. West Texas Intermediate crude slid more than 1% to settle near $65 a barrel after President Donald Trump said the US has a 50-50 chance of striking a trade deal with Europe, a contrast to the optimism the bloc’s diplomats expressed this week. Trump also said most tariff rates are essentially settled now. The effective US tariff rate is at the highest in a century, by some estimates, a potential threat to energy demand.

In another headwind, Trump indicated he had no plans to fire Federal Reserve Chair Jerome Powell, boosting the dollar and making commodities priced in the currency less attractive.

Crude has remained in a holding pattern this month, but is down for the year as increased supply from OPEC+ adds to concerns of a looming glut. The group will next meet on Aug. 3 to decide on production levels. On Thursday, one member, Venezuela, was given a production reprieve by a US decision to let Chevron resume pumping oil in the country.

“We expect crude to slowly sell off this fall, driven by steady acceleration of stock builds, softening physical markets, reduced refinery margin support and continued deescalation of geopolitically driven supply risk,” Macquarie Group analysts including Vikas Dwivedi wrote in a note.

Oil Prices

WTI for September delivery fell 1.3% to settle at $65.16 a barrel. Brent for September settlement slipped 1.1% to $68.44 a barrel.

CoSyn: The open-source tool that’s making GPT-4V-level vision AI accessible to everyone
Researchers at the University of Pennsylvania and the Allen Institute for Artificial Intelligence have developed a groundbreaking tool that allows open-source AI systems to match or surpass the visual understanding capabilities of proprietary models like GPT-4V and Gemini 1.5 Flash, potentially reshaping the competitive landscape between open and closed AI development.
The tool, called CoSyn (Code-Guided Synthesis), addresses a critical bottleneck in AI development: the scarcity of high-quality training data for teaching machines to understand complex visual information like scientific charts, medical diagrams, and financial documents. Rather than scraping millions of images from the internet — a practice fraught with copyright and ethical concerns — CoSyn leverages the coding abilities of existing language models to generate synthetic training data.
“We have, we lack of such data to train the model. We lack of data, like documents, charts with rich annotations to train a vision language model to do question answering over those images,” explained Yue Yang, a recent Penn Engineering Ph.D. graduate and co-first author of the research, during an exclusive interview with VentureBeat. “Those images actually are more challenging to annotate, compared to natural photos, like a picture of a dog of a cat of a house.”
The breakthrough comes as enterprises increasingly seek AI systems capable of understanding and reasoning about complex visual information — capabilities essential for everything from automated document processing to AI agents that can navigate digital interfaces independently. The work was conducted during Yang’s internship with the PRIOR team at the Allen Institute for AI and supported by the Office of the Director of National Intelligence, Intelligence Advanced Research Projects Activity, and the Defense Advanced Research Projects Agency.
How synthetic data generation solves AI’s biggest training challenge
The challenge of training AI to understand text-rich images has long plagued the field. Unlike natural photographs, scientific figures, charts, and documents require extensive annotation work that is both time-consuming and expensive. Traditional approaches have relied on harvesting images and their alt-text descriptions from the internet, but this method produces training data that is often superficial and legally problematic.
CoSyn takes a fundamentally different approach by recognizing that most text-rich images are originally created through code — Python scripts generate charts, LaTeX renders mathematical equations, HTML creates web interfaces. The research team’s insight was to reverse this process: use language models’ proven coding abilities to generate the underlying code, then execute that code to create realistic synthetic images.
“One intuition is actually those images like charts documents. We render them from programs from code, like we use Python to generate charts. We use, like latex or word to write our documents,” Yang said. “So how about we go through the reverse way, like we generated the code because the text only language model has been proved very good at writing code.”
Chris Callison-Burch, a computer science professor at Penn who co-advised the research, described the approach in simpler terms: “This is like taking a student who’s great at writing and asking them to teach someone how to draw, just by describing what the drawing should look like. We’re essentially transferring the strengths of open-source AI from text to vision.”
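The reverse-rendering idea can be sketched in a few lines of Python. This is an illustrative stand-in, not CoSyn's actual pipeline: the helper names, the fixed chart template, and the question wording are all invented for the example, and in the real system a language model writes the plotting code rather than a hard-coded template.

```python
import random

def generate_chart_code(categories, values):
    """Return Matplotlib source of the kind an LLM might emit for a bar chart.

    A fixed template stands in for LLM generation in this sketch.
    """
    return (
        "import matplotlib\n"
        "matplotlib.use('Agg')  # headless rendering\n"
        "import matplotlib.pyplot as plt\n"
        f"plt.bar({categories!r}, {values!r})\n"
        "plt.savefig('synthetic_chart.png')\n"
    )

def generate_qa_pair(categories, values):
    """Derive a grounded question/answer directly from the chart data.

    Because the data is known, the answer needs no human annotation.
    """
    top = categories[values.index(max(values))]
    return {"question": "Which category has the highest value?", "answer": top}

random.seed(0)
categories = ["A", "B", "C"]
values = [random.randint(1, 10) for _ in categories]

code = generate_chart_code(categories, values)
qa = generate_qa_pair(categories, values)
# Executing `code` would render the chart image; the (image, question,
# answer) triple then becomes one synthetic training example.
```

The key property this sketch shares with the described approach is that the annotation is free: since the image is produced from known data, the correct answer is computed, not labeled by hand.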
CoSyn-trained models outperform GPT-4V and Gemini on key benchmarks
The results are striking. Using their synthetic dataset of 400,000 images and 2.7 million instruction pairs, models trained with CoSyn achieved state-of-the-art performance among open-source systems and surpassed proprietary models on seven benchmark tests measuring text-rich image understanding.
On average, their 7-billion parameter model scored 80.9% across the benchmark suite, outperforming the previous best open-source model (Llama 3.2 11B) by 3.9 percentage points. More remarkably, even their “zero-shot” model—trained without any examples from the evaluation datasets—outperformed most open and closed models, demonstrating the transferability of capabilities learned from synthetic data.
CoSyn-trained models outperformed GPT-4V and Gemini 1.5 Flash across seven text-rich image understanding benchmarks. (Credit: github.io/cosyn)
In one particularly compelling demonstration, the researchers created a new benchmark called NutritionQA, consisting of 100 questions about nutrition label photographs. Using just 7,000 synthetically generated nutrition labels for training, their model outperformed others trained on millions of real images. “Despite being trained on millions of images, we observe that open-source VLMs are not data-efficient and perform poorly on this novel task compared to GPT-4V,” the researchers wrote in their paper.
Yang emphasized the significance: “Those big packs, they have so many resources to collecting data to run a lot of experiments, and I but I think open source models, we can give access to people, the model weights, the data we trained, or even the code, the training script, everything people can developers can build upon.”
Real companies are already using vision AI for quality control and automation
The technology is already finding real-world applications across industries. Callison-Burch cited an example from one of his teaching assistants whose company uses vision-language models for cable installation quality assurance: “They have the workers on site who are doing the installation take photographs of the processes they’re doing it, and they use that to automatically validate that each step has been followed properly.”
This type of specialized visual understanding could transform numerous enterprise workflows, from automated document processing in financial services to quality control in manufacturing. The ability to train models on specific visual tasks using synthetic data means companies can develop AI systems tailored to their particular needs without the massive data collection efforts traditionally required.
For enterprise decision makers, the research suggests a shift in how to approach AI data strategies. “I think synthetic data is a very promising way to remove the effort for human annotation. It costs less money, and it will just automatically generate large scale data, and also can avoid some copyright issues,” Yang noted.
The persona-driven approach that makes AI training data more diverse
One of CoSyn’s key innovations is its approach to ensuring data diversity. To prevent the repetitive outputs common in AI-generated content, the system employs what the researchers call a “persona-driven mechanism.” Each time CoSyn generates a synthetic example, it pairs the request with a randomly sampled persona—a short description like “a sci-fi novelist constantly bouncing off ideas for new alien worlds” or “a chemistry teacher preparing lab materials.”
“Every time we generate one syntax data, we will appear with a randomly sampled persona,” Yang explained. “This will diversify the content and styles of the examples we generated, because, like, if I provide the persona of like a PhD student, it will generate something more scientific or more about, something about academia.”
This approach enables the system to generate content across nine different categories: charts, documents, math problems, tables, diagrams, vector graphics, music sheets, electrical circuits, and chemical structures. The researchers used 11 different rendering tools, from Python’s Matplotlib for charts to LaTeX for mathematical expressions, supported by 20 specialized generation pipelines.
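The persona-sampling step is simple enough to show as a toy sketch. The personas and the "chart" category below come from the article; the prompt template and function names are assumptions for illustration, not CoSyn's actual prompts.

```python
import random

# Toy illustration of a persona-driven mechanism: each generation request
# is paired with a randomly sampled persona so that repeated requests for
# the same category produce prompts that differ in topic and style.

PERSONAS = [
    "a sci-fi novelist constantly bouncing off ideas for new alien worlds",
    "a chemistry teacher preparing lab materials",
    "a PhD student writing up experimental results",
]

def build_prompt(category, rng):
    """Pair one request with one randomly sampled persona."""
    persona = rng.choice(PERSONAS)
    return f"You are {persona}. Write rendering code for a {category}."

rng = random.Random(42)  # seeded for reproducibility in this sketch
prompts = [build_prompt("chart", rng) for _ in range(3)]
# Three requests for a "chart" now carry different personas, steering the
# generated content toward different subject matter.
```

The design point is that diversity is injected at the prompt level, before any generation happens, which is cheaper than filtering repetitive outputs afterwards.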
Why this breakthrough could level the playing field between open source and Big Tech
The implications for the broader AI industry are significant. Major technology companies like OpenAI and Google have invested billions in developing their proprietary vision-language capabilities, creating systems whose training methods and data sources remain trade secrets. CoSyn offers a path for open-source alternatives to compete without requiring similar resource investments.
“Open source models still like, like behind those closed source models, but with all the efforts, all the resources from the open source community, everyone, like, we’ve had more efforts. We have more like energy, like from, from everyone. So I think finally we can catch up,” Yang said.
The commitment to openness extends beyond just releasing the model. The complete CoSyn codebase, the 400,000-image dataset, and all training scripts are publicly available, enabling researchers and companies worldwide to build upon the work. “From the academia side, like a lot of research is built upon openness, like we need all access to the data, code, everything to discover new findings to support our claims in the papers,” Yang emphasized.
This transparency addresses growing concerns about the black-box nature of proprietary AI systems. “If you only rely on the APIs for like open AI, this may not be reliable to prove your like scientific discoveries, because they may just. Something in the back end you never know,” Yang noted.
Beyond static image understanding, CoSyn is pioneering capabilities crucial for the next generation of AI agents—systems that can autonomously navigate digital interfaces and perform complex tasks. The researchers developed synthetic “pointing data” that teaches models exactly where to click on screenshots, a fundamental requirement for web-based automation.
Using 65,000 synthetic screenshots with click annotations, their model achieved state-of-the-art performance on ScreenSpot, a benchmark for click prediction, outperforming systems trained on 1.3 million real screenshots. “We only use like several 100k synthetic screenshot, we can outperform previous model on millions of screenshots,” Yang said.
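Why synthetic screenshots make click annotation cheap: when the interface is rendered from code, every element's bounding box is already known, so the ground-truth click point can be computed instead of hand-labeled. A minimal sketch, with hypothetical element names and coordinates (the real pipeline renders actual screenshots):

```python
# Derive click targets from known layout geometry. Each element maps to
# the center of its bounding box, given as (x, y, width, height).

def click_targets(elements):
    """Return the center point of each element's bounding box."""
    targets = {}
    for name, (x, y, w, h) in elements.items():
        targets[name] = (x + w // 2, y + h // 2)
    return targets

# Bounding boxes come from the rendering code, not from annotation.
elements = {
    "submit_button": (100, 400, 120, 40),
    "search_box": (20, 10, 300, 30),
}

labels = click_targets(elements)
# A pair like ("Click the submit button", labels["submit_button"]) then
# becomes one synthetic pointing-data training example.
```

Scaled up over many generated layouts, this yields large volumes of exact click supervision at essentially zero labeling cost.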
This capability is essential as the industry moves toward AI agents that can perform knowledge work autonomously. “There’s sort of like two prevailing models and how you might go about implementing agents,” Callison-Burch explained. One approach uses specialized APIs, while the other relies on agents that “literally just use web browsing capabilities in the same way that you and I do.”
The vision-based approach, enabled by technologies like CoSyn, could prove more versatile: “You’re not just calling up software function, which is relatively straightforward, but you actually have to, like, take screenshots of the current state of the web browser. Reason about where to click, navigate your mouse to that location to click.”
How synthetic data sidesteps the growing copyright crisis in AI training
The synthetic data approach also provides a potential solution to mounting legal challenges around AI training data. With ongoing litigation over whether training on copyrighted materials constitutes fair use, synthetic data generation offers an alternative path that sidesteps many intellectual property concerns.
Callison-Burch, who testified before Congress on AI and copyright in 2023, sees synthetic data as complementary to, rather than replacing, real-world training data: “I don’t think that synthetic data eliminates the need for having wide amounts of diverse training data like that’s still a core element to training AI systems, but it does allow you to extend their capabilities in really remarkable ways.”
The approach demonstrates how existing knowledge can be transferred to new applications without directly using copyrighted materials. “The underlying thing that we’re relying on here is a large language model. Can write code that’s something that it learned from its original data. We’re now applying that to a totally different application, which is creation of new training data that is unlike any of the data that it was trained on.”
The current limits of synthetic data and what comes next
Despite its promise, synthetic data generation faces important limitations. “One limitation is it may inherit the biases from the model that generates such synthetic data,” Yang acknowledged. The system can also struggle with diversity: “If you prompt a large network to generate some data among different runs, it may generate similar data.”
The current research focuses on text-rich images rather than natural photographs, limiting its immediate applicability to some domains. “What about some real photos like some other like natural images? It is hard to generate synthetic data for those two males, or even like medical images, chest X rays,” Yang noted, though she indicated ongoing efforts to extend the approach to medical imaging.
Looking ahead, Yang expects synthetic data generation to become standard practice: “In the future, in two or three years, and even for nothing, editor has been a very important component to teach model different capabilities.” However, she emphasized that optimal results will likely require combining synthetic and real-world data: “Real world data will reflect some real world distributions. Single data can be large scale. Can be more controllable.”
Early adoption signals suggest the technology is already influencing industry practices. “I heard like companies, like meta, some teams also, like all Amazon, they are trying to using our data to train their model,” Yang revealed during the interview.
For startups and smaller companies, the cost advantages could be particularly significant. “For some startups, it is cheaper to host, their host open model on their server, rather than just calling the APIs, which is less controllable,” Yang noted.
The research team’s decision to make everything open source reflects a broader philosophy about AI development. As Yang prepares to join the Allen Institute full-time after completing her Ph.D., the commitment to open science remains central to their mission. “Currently, those vision language models are quite brittle. It just needs the right data to get the right capabilities,” she said. “If you find the right data, you can improve models capability on it, and it will benefit the society.”
The vision for AI that acts, not just describes
As the research moves from academic laboratories to real-world applications, the implications extend far beyond improved benchmark scores. Yang and her colleagues are already looking toward applications that could transform how people with disabilities interact with technology, from AI that understands sign language for the hearing impaired to systems that can describe complex medical images for those with visual impairments.
“I have an idea to let the model to know how to understand the sign language or those people with hearing difficulties,” Yang said, describing potential future applications. “If you find the right data, you can improve models capability on it, and it will benefit the society.”
Callison-Burch sees even broader possibilities, particularly in robotics and scientific discovery: “Synthetic data opens up many possible applications that we don’t have naturally occurring data for. So one that Yang has also worked on at the Allen Institute is that. Ocean of creating simulated training data for robots.”
The work represents more than just a technical achievement—it’s a demonstration that open-source AI development can compete with the well-funded efforts of major technology companies through innovative approaches to fundamental challenges. As Yang noted in reflecting on her decision to join the Allen Institute rather than accept higher-paying offers from companies like Meta: “I think it’s still a very early stage of those multimodal models, and there are not much resources, open resources, or knowledge to share to the community.”
The message is clear: in the race to build AI that can truly see and understand the world, the advantage may not always go to those with the deepest pockets, but to those with the most creative solutions.

Eni Profit Tops Estimates
Eni SpA reported profit that beat analyst estimates as proceeds from asset sales and sweeping cost cuts helped counter a weak oil market. While crude prices were lower in the second quarter — weighing on earnings at other European oil companies — Eni has been buoyed by a cost-reduction program introduced earlier this year, while asset disposals brought down debt.

Adjusted net income fell 25% from a year earlier to €1.13 billion ($1.3 billion), the Italian energy company said Friday in a statement. That exceeded the €932.6 million average estimate of analysts surveyed by Bloomberg. Eni said it’s now targeting €3 billion of cost cuts this year, up from €2 billion previously. The company has also reaped billions of euros by offloading stakes in its renewables arm and mobility division, and is in talks to sell half of its carbon capture unit.

“The combination of divestments set to come through this year, ongoing ‘self-help,’ as well as the additional cash flow from new ramp-ups sets Eni up for a strong second half of 2025 and 2026,” RBC Europe Ltd. analyst Biraj Borkhataria said in a note. He expects “growing free cash flow and a more resilient balance sheet than we’ve seen for many years.”

The shares rose as much as 0.6% at the open in Milan, before trading little changed as of 9:08 a.m. local time. Eni confirmed plans for shareholder returns this year. It expects free cash flow before working capital of about €11.5 billion at $70-a-barrel crude, up from previous guidance of €11 billion. The company also raised its forecast for annual earnings from its gas division to €1 billion from €800 million. Net debt shrank to €29.1 billion at the end of June.
Stay Ahead with the Paperboy Newsletter
Your weekly dose of insights into AI, Bitcoin mining, datacenters, and energy industry news. Spend 3-5 minutes and catch up on a week of news.