How Meta leverages generative AI to understand user intent

Meta — parent company of Facebook, Instagram, WhatsApp, Threads and more — runs one of the biggest recommendation systems in the world.

In two recently released papers, its researchers have revealed how generative models can be used to better understand and respond to user intent. 

Framing recommendation as a generative problem opens up new approaches that are richer in content and more efficient than classic techniques. This can benefit any application that needs to retrieve documents, products, or other kinds of objects.

Dense vs generative retrieval

The standard approach to building recommendation systems is to compute, store, and retrieve dense representations of items. For example, to recommend items to users, an application must train a model that can compute embeddings for both users and items. It must then create a large store of item embeddings.

At inference time, the recommendation system tries to understand the user’s intent by finding one or more items whose embeddings are similar to the user’s. This approach requires ever more storage and compute as the number of items grows, because every item embedding must be stored and every recommendation requires comparing the user embedding against the entire item store.

Dense retrieval (source: arXiv)
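To make the storage and lookup pattern concrete, here is a minimal sketch of dense retrieval in Python, assuming random vectors stand in for a trained user/item embedding model; the names and dimensions are illustrative only.

```python
# Minimal dense-retrieval sketch. In practice the embeddings would come from a
# trained model; random vectors are used here only to show the storage and
# lookup pattern the article describes.
import numpy as np

rng = np.random.default_rng(0)
EMBED_DIM = 64
NUM_ITEMS = 100_000  # storage grows linearly with the catalog

item_embeddings = rng.normal(size=(NUM_ITEMS, EMBED_DIM)).astype(np.float32)
item_embeddings /= np.linalg.norm(item_embeddings, axis=1, keepdims=True)

def recommend(user_embedding: np.ndarray, k: int = 10) -> np.ndarray:
    """Return the indices of the k items most similar to the user embedding."""
    user_embedding = user_embedding / np.linalg.norm(user_embedding)
    scores = item_embeddings @ user_embedding  # compares against the entire item store
    return np.argsort(-scores)[:k]

user_vec = rng.normal(size=EMBED_DIM).astype(np.float32)
print(recommend(user_vec, k=5))
```

Every query touches the full embedding store, which is why cost scales with catalog size.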

Generative retrieval is a more recent approach that tries to understand user intent and make recommendations by predicting the next item in a sequence instead of searching a database. Generative retrieval does not require storing item embeddings and its inference and storage costs remain constant as the list of items grows.

The key to making generative retrieval work is to compute “semantic IDs” (SIDs), which encode contextual information about each item. Generative retrieval systems like TIGER work in two phases. First, an encoder model is trained to create a unique embedding value for each item based on its description and properties. These embedding values become the SIDs and are stored along with the item.
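As a toy illustration of this first phase, the sketch below turns an item’s content embedding into a hierarchical SID by residual quantization against a set of codebooks. In a production system the encoder and codebooks would be learned; the random codebooks here are placeholders, not the actual method.

```python
# Toy residual quantization: map a content embedding to a tuple of codeword
# indices (the semantic ID). Real systems learn the codebooks; these are random
# placeholders used only to show the mechanics.
import numpy as np

rng = np.random.default_rng(1)
EMBED_DIM, CODEBOOK_SIZE, NUM_LEVELS = 64, 256, 3

# One codebook per quantization level (learned in practice).
codebooks = rng.normal(size=(NUM_LEVELS, CODEBOOK_SIZE, EMBED_DIM)).astype(np.float32)

def semantic_id(item_embedding: np.ndarray) -> tuple[int, ...]:
    """Quantize an item embedding into a hierarchical semantic ID."""
    residual = item_embedding.copy()
    sid = []
    for level in range(NUM_LEVELS):
        distances = np.linalg.norm(codebooks[level] - residual, axis=1)
        index = int(np.argmin(distances))
        sid.append(index)
        residual -= codebooks[level][index]  # quantize what is left at the next level
    return tuple(sid)

item_vec = rng.normal(size=EMBED_DIM).astype(np.float32)
print(semantic_id(item_vec))  # items whose SIDs share a prefix are semantically close
```

Because SIDs are discrete tokens, they can be fed directly to a sequence model in the second phase.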

Generative retrieval (source: arXiv)

In the second phase, a transformer model is trained to predict the next SID in an input sequence. The list of input SIDs represents the user’s past interactions with items, and the model’s prediction is the SID of the item to recommend. Generative retrieval reduces the need to store and search across individual item embeddings. It also enhances the ability to capture deeper semantic relationships within the data and provides other benefits of generative models, such as modifying the temperature to adjust the diversity of recommendations.
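The decoding step can be sketched as sampling the next SID token from the model’s output distribution, with temperature trading off accuracy against diversity. The transformer itself is stubbed out with random logits here, so this only shows the sampling mechanics mentioned above.

```python
# Temperature-controlled sampling of the next SID token. The logits would come
# from the trained transformer; random values stand in for them here.
import numpy as np

rng = np.random.default_rng(2)
SID_VOCAB = 256  # size of one SID token position

def sample_next_sid_token(logits: np.ndarray, temperature: float = 1.0) -> int:
    """Sample a token; higher temperature yields more diverse recommendations."""
    scaled = logits / max(temperature, 1e-6)
    probs = np.exp(scaled - scaled.max())
    probs /= probs.sum()
    return int(rng.choice(len(probs), p=probs))

logits = rng.normal(size=SID_VOCAB)  # stand-in for the transformer's output
print(sample_next_sid_token(logits, temperature=0.7))  # conservative, sticks close to the top item
print(sample_next_sid_token(logits, temperature=1.5))  # more exploratory recommendations
```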

Advanced generative retrieval

Despite its lower storage and inference costs, generative retrieval suffers from some limitations. For example, it tends to overfit to the items it has seen during training, which means it has trouble dealing with items that were added to the catalog after the model was trained. In recommendation systems, this is often referred to as “the cold start problem,” which pertains to users and items that are new and have no interaction history. 

To address these shortcomings, Meta has developed a hybrid recommendation system called LIGER, which combines the computational and storage efficiencies of generative retrieval with the robust embedding quality and ranking capabilities of dense retrieval.

During training, LIGER uses both similarity scores and next-token prediction objectives to improve the model’s recommendations. During inference, LIGER selects several candidates through the generative mechanism and supplements them with a few cold-start items, which are then ranked based on the embeddings of the candidates.

LIGER combines generative and dense retrieval (source: arXiv)
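A hedged sketch of this hybrid inference flow is shown below: generative candidates plus cold-start items, ranked with dense embeddings. The candidate generator and the embeddings are stubbed with random data, and the function names are illustrative rather than taken from the paper.

```python
# Hybrid (LIGER-style) inference sketch: combine generatively retrieved candidates
# with cold-start items, then rank the pool with dense embeddings. All data and
# the candidate generator are stand-ins.
import numpy as np

rng = np.random.default_rng(3)
EMBED_DIM = 64

item_embeddings = {i: rng.normal(size=EMBED_DIM) for i in range(1_000)}
cold_start_items = list(range(1_000, 1_020))   # new items with no interaction history
for item in cold_start_items:                  # they still get content-based embeddings
    item_embeddings[item] = rng.normal(size=EMBED_DIM)

def generate_candidates(user_history: list[int], k: int) -> list[int]:
    """Stub for the generative (next-SID) retrieval stage."""
    return [int(i) for i in rng.choice(1_000, size=k, replace=False)]

def hybrid_recommend(user_history: list[int], user_embedding: np.ndarray,
                     k_generative: int = 50, k_final: int = 10) -> list[int]:
    candidates = generate_candidates(user_history, k_generative) + cold_start_items
    scores = {item: float(item_embeddings[item] @ user_embedding) for item in candidates}
    return sorted(scores, key=scores.get, reverse=True)[:k_final]

user_embedding = rng.normal(size=EMBED_DIM)
print(hybrid_recommend(user_history=[3, 14, 159], user_embedding=user_embedding))
```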

The researchers note that “the fusion of dense and generative retrieval methods holds tremendous potential for advancing recommendation systems” and as the models evolve, “they will become increasingly practical for real-world applications, enabling more personalized and responsive user experiences.”

In a separate paper, the researchers introduce a novel multimodal generative retrieval method named Multimodal Preference Discerner (Mender), a technique that enables generative models to pick up implicit preferences from users’ interactions with different items. Mender builds on top of the SID-based generative retrieval methods described above and adds a few components that can enrich recommendations with user preferences.

Mender uses a large language model (LLM) to translate user interactions into specific preferences. For example, if the user has praised or complained about a specific item in a review, the model will summarize it into a preference about that product category. 

The main recommender model is trained to condition on both the sequence of user interactions and the user preferences when predicting the next semantic ID in the input sequence. This gives the recommender model the ability to generalize, perform in-context learning, and adapt to user preferences without being explicitly trained on them.
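The data flow can be sketched as follows; both the LLM that summarizes reviews into preferences and the preference-conditioned recommender are hypothetical placeholders, shown only to make the two conditioning signals explicit.

```python
# Mender-style conditioning sketch: an LLM distills a review into a textual
# preference, and the recommender consumes both that preference and the user's
# SID interaction history. Both calls are placeholders, not Meta's interfaces.

def llm_summarize_preference(review: str, category: str) -> str:
    """Placeholder for the LLM that turns a review into a preference statement."""
    return f"Prefers {category} with the qualities described in: {review[:60]}"

def build_recommender_input(interaction_sids: list[tuple[int, ...]],
                            preferences: list[str]) -> dict:
    """Assemble the two conditioning signals the generative recommender consumes."""
    return {
        "preference_text": " | ".join(preferences),
        "sid_tokens": [token for sid in interaction_sids for token in sid],
    }

review = "The noise cancellation on these headphones is fantastic, but they feel heavy."
preferences = [llm_summarize_preference(review, category="headphones")]
history = [(17, 203, 88), (17, 41, 5)]  # SIDs of previously consumed items

print(build_recommender_input(history, preferences))  # a real model would predict the next SID from this
```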

“Our contributions pave the way for a new class of generative retrieval models that unlock the ability to utilize organic data for steering recommendation via textual user preferences,” the researchers write.

Mender recommendation framework (source: arXiv)

Implications for enterprise applications

The efficiency provided by generative retrieval systems can have important implications for enterprise applications. These advancements translate into immediate practical benefits, including reduced infrastructure costs and faster inference. The technology’s ability to maintain constant storage and inference costs regardless of catalog size makes it particularly valuable for growing businesses.

The benefits extend across industries, from e-commerce to enterprise search. Generative retrieval is still in its early stages and we can expect applications and frameworks to emerge as it matures.
