Stay Ahead, Stay ONMINE

Introducing MASK: A Benchmark for Measuring Honesty in AI Systems

This is a linkpost for https://www.mask-benchmark.ai/

Explore More Insights

Stay ahead with more perspectives on cutting-edge power, infrastructure, energy, bitcoin and AI solutions. Explore these articles to uncover strategies and insights shaping the future of industries.

Takeaways from Cisco’s AI Summit

Software development is in an absolute frenzy right now, Scott said. “You have very, very senior people, the best coders you’ve ever met in your life, who are just completely overwhelmed trying to keep up with the rate of progress that’s happening right now.”

Optimizing AI development for agents or humans?

Eying AI factories, Nvidia buys bigger stake in CoreWeave

Nvidia continues to throw its sizable bank account around, this time making a $2 billion investment in GPU cloud service provider CoreWeave. The company says the investment reflects Nvidia’s “confidence in CoreWeave’s business, team and growth strategy as a cloud platform built on Nvidia infrastructure.” CoreWeave is not the only

AI, security tailwinds signal promising 2026 for Cisco

A big component of AI in communications is agentic agents talking to employees and customers, and bringing trust to the system is where Cisco should shine. It builds and runs its own infrastructure, which is secure by design. Cisco has relationships with governments all over the world, and between Webex

BOEM Announces ‘Major Step’ Toward Expanding Offshore Energy

In a statement posted on its website, the Bureau of Ocean Energy Management (BOEM) announced “another major step toward expanding offshore energy development pursuant to the One Big Beautiful Bill Act”. BOEM noted in the statement that it has released the Final Notice of Sale for Lease Sale Big Beautiful Gulf 2 (BBG2), which it said is the second of 30 Gulf of America lease sales required by the One Big Beautiful Bill Act. The final notice of sale will be published in the Federal Register on February 5, BOEM revealed in the statement, outlining that this satisfies the requirement for the notice to be published at least 30 days prior to the scheduled lease sale date on March 11. Lease Sale BBG2 proposes to offer approximately 15,066 unleased blocks covering about 80.4 million acres on the U.S. Outer Continental Shelf in the Gulf of America, BOEM highlighted in the statement, adding that the blocks are located three to 231 miles offshore and span water depths from nine feet to more than 11,100 feet. BOEM noted in the statement that certain areas will be excluded from the sale, “including blocks subject to the Sept. 8, 2020, presidential withdrawal; blocks adjacent to or beyond the U.S. Exclusive Economic Zone in the Eastern Gap; blocks within the boundaries of the Flower Garden Banks National Marine Sanctuary; [and] any block which received a bid in Lease Sale BBG1”. In the statement, BOEM Acting Director Matt Giacona said, “Lease Sale BBG2 is a key step in advancing BOEM’s offshore oil and gas program in the Gulf of America”. “Following the strong industry response to Lease Sale BBG1, this proposed sale aims to ensure continued investment in the U.S. Outer Continental Shelf and support American energy independence,” he added. BOEM noted in its statement that the Gulf

Pemex Slashes Debt to 11 Year Low

Mexico’s Petroleos Mexicanos reduced its debt to the lowest level in 11 years, a hopeful sign for the struggling state-owned oil company as it seeks to reverse a decades-long production decline and revive its money-losing refining business. The company cut its total debt to roughly $84.5 billion, according to a company presentation, after receiving more than $40 billion of support last year from Mexico’s finance ministry through debt purchases and cash injections. Pemex also made about 390.2 billion pesos ($22.7 billion) of payments to partners in 2025, Chief Executive Officer Victor Rodriguez said in a press briefing Wednesday, another indication the company is making headway in whittling down its large debts to service providers. The company’s crude oil output has dropped by about 50% from its peak more than two decades ago. For years, Pemex has struggled to bring new discoveries online as production fell at many of its most prolific fields. Mexican President Claudia Sheinbaum is seeking a turnaround by attracting more private investment to the nation’s aging oil and gas fields, with the aim of making Pemex self-sufficient by 2027. Sheinbaum said Wednesday that government financial assistance for the company may continue next year. Pemex also is taking steps to improve efficiency at its aging refineries, which have been hit by accidents and outages in recent years. Mexico’s flagship Dos Bocas plant is now producing about 300,000 barrels of fuel per day, Sheinbaum said, lifting total output at the country’s domestic refineries to around 1.2 million barrels a day. Separately, Pemex has drawn criticism from US President Donald Trump over its oil sales to Cuba. Sheinbaum said Mexico is holding diplomatic talks to keep supplying oil to the communist nation on humanitarian grounds. Mexico sold nearly $500 million of crude to the island in 2025, she said Wednesday.

Shell’s Profit Falls

(Update) February 5, 2026, 9:57 AM GMT: Article updated with shares, analyst and investor reaction and context throughout. Shell Plc profits slumped in the fourth quarter, undershooting expectations as lower crude prices, a weak oil-trading performance and a struggling chemicals business overshadowed stronger refining margins. Europe’s largest oil company took on more debt, maintained its quarterly share buyback of $3.5 billion and raised its dividend even as volatile oil prices pressure its plan to boost investor returns through 2030. Gearing, the ratio of net debt to equity, rose, challenging the firm’s ability to organically stick with its level of share repurchases through this year. Investors are increasingly focused on Shell’s growth outlook after Chief Executive Officer Wael Sawan cut costs and shed underperforming assets. His goal to close a large valuation gap with Exxon Mobil Corp. and Chevron Corp. has become harder this year after the shares of the US rivals soared, buoyed by strong production from low-cost oil fields in Guyana, the Permian Basin and Kazakhstan. Shell shares fell as much as 2.6% on Thursday, outpacing declines of peers BP Plc and TotalEnergies SE. Shell’s stock was the best performer among the world’s top five oil majors in dollar terms last year, but since mid-November the gains have fizzled and so far in 2026 it’s been lagging its peers. Still, it has outperformed European rivals during Sawan’s three-year tenure. Shell’s adjusted net income of $3.26 billion for the quarter was down 11% from a year earlier and lower than the average analyst estimate of $3.51 billion. A slight production rise, in line with expectations, was unable to lift earnings. The London-based company’s 2% year-on-year production growth in the quarter pales in comparison to that of the Americans. Chevron increased output by 20% in the fourth quarter

USA Expected to Produce Almost 1/4 of Global Oil Output

The U.S. is projected to produce almost a quarter of the world’s petroleum and other liquid fuels output in 2026 in the U.S. Energy Information Administration’s (EIA) latest short term energy outlook. According to the EIA’s January STEO, which was published last month, the EIA sees U.S. petroleum and other liquid fuels production averaging 23.88 million barrels per day in 2026. The EIA forecasts in the STEO that global production this year will come in at 107.65 million barrels per day, putting the U.S. contribution at 22.18 percent of the overall figure. A quarterly breakdown included in the EIA’s latest STEO outlined that the EIA sees U.S. petroleum and other liquid fuels production making up 22.18 percent of the world’s total petroleum and liquid fuels production in the first quarter of this year, 22.25 percent in the second quarter, 22.15 percent in the third quarter, and 22.14 percent in the fourth quarter. The U.S. share of global petroleum and other liquid fuels production is projected to drop to 21.84 percent in 2027 in the EIA’s latest STEO, which forecasts that U.S. output will average 23.63 million barrels per day and that global production will average 108.18 million barrels per day. Back in 2025, the U.S. share of global petroleum and other liquid fuels production came in at 22.22 percent, the EIA’s January STEO showed. U.S. output averaged 23.62 million barrels per day in 2025 and global production came in at 106.28 million barrels per day, the STEO highlighted. OPEC+ is projected to contribute 44.93 million barrels per day, or 41.73 percent, to the 2026 total, and 45.07 million barrels per day, or 41.66 percent, to the 2027 total, the EIA pointed out in its STEO. OPEC+ contributed 43.80 million barrels per day, or 41.21 percent, to the 2025 total, the

Oil Ends Higher Amid Rising Middle East Risks

Oil edged higher as traders parsed conflicting reports on the status of nuclear talks between the US and Iran, clouding the outlook on whether Washington will proceed with military strikes against the major oil producer. West Texas Intermediate rose 3.1% to settle above $65 a barrel. Prices pared gains in post-settlement trading as Iranian Foreign Minister Abbas Araghchi confirmed in a social media post that negotiations will be held in Oman on Friday. The commodity surged earlier on reports that the US told Iran it will not agree to Tehran’s demands to change the location and format of talks planned for Friday, Axios said, citing two US officials. Adding to bullish momentum, US President Donald Trump said that Iran’s Supreme Leader Ayatollah Ali Khamenei “should be very worried” in an interview with NBC. Traders have been closely monitoring the risk of possible US military intervention in Iran, which could disrupt key shipping lanes as well as the country’s roughly 3.3 million barrels-per-day oil production. Doubts over whether talks surrounding Iran’s nuclear program would proceed as planned have intensified since Tuesday, when US and Iranian forces appeared to square off in the sea and air. An Iranian drone approached an American aircraft carrier in the Arabian Sea and was shot down just hours after a US-flagged oil tanker was hailed by small armed ships in the Strait of Hormuz off Iran’s coast. Concern over a potential conflict in the Middle East, a source of about a third of the world’s crude, helped lift prices last month despite signs of a growing oversupply. It has also kept the cost of bullish options high relative to bearish ones for the longest stretch in more than a year. “Geopolitical tensions are really driving it,” Equinor Chief Financial Officer Torgrim Reitan said in a Bloomberg

Holtum Says LNG Projects Need New Financing Playbook

Trafigura Group Chief Executive Officer Richard Holtum said the liquefied natural gas industry needs a “bit of innovation” when it comes to financing projects. “I feel sorry for LNG projects in the US,” Holtum said on a panel at the LNG 2026 conference in Doha. “They would only get bank financing when they show that they’ve sold 80%-90% of their volume on long-term projects.” LNG developers are scrambling to fully finance their projects to export more of the fuel, with the next wave of production from terminals under construction in the US and Qatar due to enter the market over the next decade. In the US, several projects, including Delfin LNG off the coast of Louisiana, are working to finalize their debt and equity commitments. The current approach to LNG financing contrasts with oil, where banks are more comfortable with the inherent value of the commodity, the CEO said. “Whilst if you’re project financing some oil exploration, it’s simply the bank takes a view that, okay, oil is worth $40, $50, $60, $70, whatever it is, it doesn’t matter, they take a view, that’s what it’s worth in the long term,” he said. A similar model for LNG, where project financing is based on a long-term price forecast for the fuel, doesn’t seem to be developing, Holtum said. Still, global LNG capacity is set to rise by about 50% by the end of the decade, the biggest build-out in the industry’s history, led by the US. The current arrangement, where 90% of LNG is sold to utilities under long-term contracts, risks running into difficulties because many companies and countries have made net-zero commitments, according to the Trafigura CEO. “If they have made those commitments, signing a 20-year contract or a 25-year contract that starts in 2030 is inherently problematic,” Holtum

Azure outage disrupts VMs and identity services for over 10 hours

After multiple infrastructure scale-up attempts failed to handle the backlog and retry volumes, Microsoft ultimately removed traffic from the affected service to repair the underlying infrastructure without load. “The outage didn’t just take websites offline, but it halted development workflows and disrupted real-world operations,” said Pareekh Jain, CEO at EIIRTrend & Pareekh Consulting.

Cloud outages on the rise

Cloud outages have become more frequent in recent years, with major providers such as AWS, Google Cloud, and IBM all experiencing high-profile disruptions. AWS services were severely impacted for more than 15 hours when a DNS problem rendered the DynamoDB API unreliable. In November, a bad configuration file in Cloudflare’s Bot Management system led to intermittent service disruptions across several online platforms. In June, an invalid automated update disrupted Google Cloud’s identity and access management (IAM) system, resulting in users being unable to use Google to authenticate on third-party apps. “The evolving data center architecture is shaped by the shift to more demanding, intricate workloads driven by the new velocity and variability of AI. This rapid expansion is not only introducing complexities but also challenging existing dependencies. So any misconfiguration or mismanagement at the control layer can disrupt the environment,” said Neil Shah, co-founder and VP at Counterpoint Research.

Preparing for the next cloud incident

This is not an isolated incident. For CIOs, the event only reinforces the need to rethink resilience strategies. In the immediate aftermath of a hyperscale dependency failure, waiting is not a recommended strategy; CIOs should instead stabilize, prioritize, and communicate, Jain said.

“First, stabilize by declaring a formal cloud incident with a single incident commander, quickly determining whether the issue affects control-plane operations or running workloads, and freezing all non-essential changes such as deployments and infrastructure updates.”

Intel sets sights on data center GPUs amid AI-driven infrastructure shifts

Supply chain reliability is another underappreciated advantage. Hyperscalers want a credible second source, but only if Intel can offer stable, predictable roadmaps across multiple product generations. However, the company runs into a major constraint at the software layer. “The decisive bottleneck is software,” Rawat said. “CUDA functions as an industry operating standard, embedded across models, pipelines, and DevOps. Intel’s challenge is to prove that migration costs are low, and that ongoing optimization does not become a hidden engineering tax.” For enterprise buyers, that software gap translates directly into switching risk. Tighter integration of Intel CPUs, GPUs, and networking could improve system-level efficiency for enterprises and cloud providers, but the dominance of the CUDA ecosystem remains the primary barrier to switching, said Charlie Dai, VP and principal analyst at Forrester. “Even with strong hardware integration, buyers will hesitate without seamless compatibility with mainstream ML/DL frameworks and tooling,” Dai added.

8 hot networking trends for 2026

Recurring license fees may have dissuaded enterprises from adopting AIOps in the past, but that’s changing, Morgan adds: “Over the past few years, vendors have added features and increased the value of those licenses, including 24×7 support. Now, by paying the equivalent of a fraction of a network engineer’s salary in license fees, a mid-sized enterprise can reduce hours spent on operations and level-one support in order to allocate more of their valuable networking experts’ time to AI projects. Every enterprise’s business case will be different, but with networking expertise in high demand, we predict that in 2026, the labor savings will outweigh the additional license costs for the majority of mid-to-large sized enterprises.”

2. AI boosts data center networking investments

Enterprise data centers, which not so long ago were on the endangered species list, have made a remarkable comeback, driven by the reality that many AI workloads need to be hosted on premises for privacy, security, regulatory, latency or cost reasons. The global market for data center networking technologies was estimated at around $46 billion in 2025 and is projected to reach $103 billion by the end of 2030, an annual growth rate of nearly 18%, according to BCC Research: “The data center networking technologies market is rapidly changing due to increasing use of AI-powered solutions across data centers and sectors like telecom, IT, banking, financial services, insurance, government and commercial industries.” McKinsey predicts that global demand for data center capacity could nearly triple by 2030, with about 70% of that demand coming from AI workloads. McKinsey says both training and inference workloads are contributing to data center growth, with inference expected to become the dominant workload by 2030.

3. Private clouds roll in

Clearly, the hyperscalers are driving most of the new data center construction, but enterprises are

Cisco: Infrastructure, trust, model development are key AI challenges

“The G200 chip was for the scale out, because what’s happening now is these models are getting bigger where they don’t just fit within a single data center. You don’t have enough power to just pull into a single data center,” Patel said. “So now you need to have data centers that might be hundreds of kilometers apart, that operate like an ultra-cluster that are coherent. And so that requires a completely different chip architecture to make sure that you have capabilities like deep buffering and so on and so forth… You need to make sure that these data centers can be scaled across physical boundaries.”  “In addition, we are reaching the physical limits of copper and optics, and coherent optics especially are going to be extremely important as we go start building out this data center infrastructure. So that’s an area that you’re starting to see a tremendous amount of progress being made,” Patel said. The second constraint is the AI trust deficit, Patel said. “We currently need to make sure that these systems are trusted by the people that are using them, because if you don’t trust these systems, you’ll never use them,” Patel said. “This is the first time that security is actually becoming a prerequisite for adoption. In the past, you always ask the question whether you want to be secure, or you want to be productive. And those were kind of needs that offset each other,” Patel said. “We need to make sure that we trust not just using AI for cyber defense, but we trust AI itself,” Patel said. The third constraint is the notion of a data gap. AI models get trained on human-generated data that’s publicly available on the Internet, but “we’re running out,” Patel said. “And what you’re starting to see happen

How Robotics Is Re-Engineering Data Center Construction and Operations

Physical AI: A Reusable Robotics Stack for Data Center Operations

This is where the recent collaboration between Multiply Labs and NVIDIA becomes relevant, even though the application is biomanufacturing rather than data centers. Multiply Labs has outlined a robotics approach built on three core elements:

Digital twins using NVIDIA Isaac Sim to model hardware and validate changes in simulation before deployment.

Foundation-model-based skill learning via NVIDIA Isaac GR00T, enabling robots to generalize tasks rather than rely on brittle, hard-coded behaviors.

Perception pipelines, including FoundationPose and FoundationStereo, that convert expert demonstrations into structured training data.

Taken together, this represents a reusable blueprint for data center robotics.

Applying the Lesson to Data Center Environments

The same physical-AI techniques now being applied in lab and manufacturing environments map cleanly onto the realities of data center operations, particularly where safety, uptime, and variability intersect.

Digital-twin-first deployment

Before a robot ever enters a live data hall, it needs to be trained in simulation. That means modeling aisle geometry, obstacles, rack layouts, reflective surfaces, and lighting variation, along with “what if” scenarios such as blocked aisles, emergency egress conditions, ladders left in place, or spill events. Simulation-first workflows make it possible to validate behavior and edge cases before introducing any new system into a production environment.

Skill learning beats hard-coded rules

Data centers appear structured, but in practice they are full of variability: temporary cabling, staged parts, mixed-vendor racks, and countless human exceptions. Foundation-model approaches to manipulation are designed to generalize across that messiness far better than traditional rule-based automation, which tends to break when conditions drift even slightly from the expected state.

Imitation learning captures tribal knowledge

Many operational tasks rely on tacit expertise developed over years in the field, such as how to manage stiff patch cords, visually confirm latch engagement, or stage a

Applied Digital CEO Wes Cummins On the Hard Part of the AI Boom: Execution

Designing for What Comes After the Current AI Cycle

Applied Digital’s design philosophy starts with a premise many developers still resist: today’s density assumptions may not hold. “We’re designing for maximum flexibility for the future: higher density power, lower density power, higher voltage delivery, and more floor space,” Cummins said. “It’s counterintuitive because densities are going up, but we don’t know what comes next.” That choice, to allocate more floor space even as rack densities climb, signals a long-view approach. Facilities are engineered to accommodate shifts in voltage, cooling topology, and customer requirements without forcing wholesale retrofits. Higher-voltage delivery, mixed cooling configurations, and adaptable data halls are baked in from the start. The goal is not to predict the future perfectly, Cummins stressed, but to avoid painting infrastructure into a corner.

Supply Chain as Competitive Advantage

If flexibility is the design thesis, supply chain control is the execution weapon. “It’s a huge advantage that we locked in our MEP supply chain 18 to 24 months ago,” Cummins said. “It’s a tight environment, and more timelines are going to get missed in 2026 because of it.” Applied Digital moved early to secure long-lead mechanical, electrical, and plumbing components, well before demand pressure fully rippled through transformers, switchgear, chillers, generators, and breakers. That foresight now underpins the company’s ability to make credible delivery commitments while competitors confront procurement bottlenecks. Cummins was blunt: many delays won’t stem from poor planning, but from simple unavailability.

From 100 MW to 700 MW Without Losing Control

The past year marked a structural pivot for Applied Digital. What began as a single, 100-megawatt “field of dreams” facility in North Dakota has become more than 700 MW under construction, with expansion still ahead. “A hundred megawatts used to be considered scale,” Cummins said. “Now we’re at 700

Microsoft will invest $80B in AI data centers in fiscal 2025

And Microsoft isn’t the only one that is ramping up its investments in AI-enabled data centers. Rival cloud service providers are all investing in either upgrading or opening new data centers to capture a larger chunk of business from developers and users of large language models (LLMs). In a report published in October 2024, Bloomberg Intelligence estimated that demand for generative AI would push Microsoft, AWS, Google, Oracle, Meta, and Apple to devote $200 billion between them to capex in 2025, up from $110 billion in 2023. Microsoft is one of the biggest spenders, followed closely by Google and AWS, Bloomberg Intelligence said. Its estimate of Microsoft’s capital spending on AI, at $62.4 billion for calendar 2025, is lower than Smith’s claim that the company will invest $80 billion in the fiscal year to June 30, 2025. Both figures, though, are way higher than Microsoft’s 2020 capital expenditure of “just” $17.6 billion. The majority of the increased spending is tied to cloud services and the expansion of AI infrastructure needed to provide compute capacity for OpenAI workloads. Separately, last October Amazon CEO Andy Jassy said his company planned total capex spend of $75 billion in 2024 and even more in 2025, with much of it going to AWS, its cloud computing division.

John Deere unveils more autonomous farm machines to address skill labor shortage

Self-driving tractors might be the path to self-driving cars. John Deere has revealed a new line of autonomous machines and tech across agriculture, construction and commercial landscaping. The Moline, Illinois-based John Deere has been in business for 187 years, yet it has become a regular at the big tech trade show in Las Vegas as a non-tech company showing off technology, and it is back at CES 2025 with more autonomous tractors and other vehicles. This is not something we usually cover, but John Deere has a lot of data that is interesting in the big picture of tech. The message from the company is that there aren’t enough skilled farm laborers to do the work that its customers need. It’s been a challenge for most of the last two decades, said Jahmy Hindman, CTO at John Deere, in a briefing. Much of the tech will come this fall and after that. He noted that the average farmer in the U.S. is over 58 and works 12 to 18 hours a day to grow food for us. And he said the American Farm Bureau Federation estimates there are roughly 2.4 million farm jobs that need to be filled annually, and the agricultural workforce continues to shrink. (This is my hint to the anti-immigration crowd.)

John Deere’s autonomous 9RX Tractor. Farmers can oversee it using an app.

While each of these industries experiences its own set of challenges, a commonality across all is skilled labor availability. In construction, about 80% of contractors struggle to find skilled labor. And in commercial landscaping, 86% of landscaping business owners can’t find labor to fill open positions, he said. “They have to figure out how to do

2025 playbook for enterprise AI success, from agents to evals

2025 is poised to be a pivotal year for enterprise AI. The past year has seen rapid innovation, and this year will see the same. This has made it more critical than ever to revisit your AI strategy to stay competitive and create value for your customers. From scaling AI agents to optimizing costs, here are the five critical areas enterprises should prioritize for their AI strategy this year.

1. Agents: the next generation of automation

AI agents are no longer theoretical. In 2025, they’re indispensable tools for enterprises looking to streamline operations and enhance customer interactions. Unlike traditional software, agents powered by large language models (LLMs) can make nuanced decisions, navigate complex multi-step tasks, and integrate seamlessly with tools and APIs. At the start of 2024, agents were not ready for prime time, making frustrating mistakes like hallucinating URLs. They started getting better as frontier large language models themselves improved. “Let me put it this way,” said Sam Witteveen, cofounder of Red Dragon, a company that develops agents for companies, and that recently reviewed the 48 agents it built last year. “Interestingly, the ones that we built at the start of the year, a lot of those worked way better at the end of the year just because the models got better.” Witteveen shared this in the video podcast we filmed to discuss these five big trends in detail. Models are getting better and hallucinating less, and they’re also being trained to do agentic tasks. Another feature that the model providers are researching is a way to use the LLM as a judge, and as models get cheaper (something we’ll cover below), companies can use three or more models to

OpenAI’s red teaming innovations define new essentials for security leaders in the AI era

OpenAI has taken a more aggressive approach to red teaming than its AI competitors, demonstrating its security teams’ advanced capabilities in two areas: multi-step reinforcement and external red teaming. OpenAI recently released two papers that set a new competitive standard for improving the quality, reliability and safety of AI models in these two techniques and more. The first paper, “OpenAI’s Approach to External Red Teaming for AI Models and Systems,” reports that specialized teams outside the company have proven effective in uncovering vulnerabilities that might otherwise have made it into a released model because in-house testing techniques may have missed them. In the second paper, “Diverse and Effective Red Teaming with Auto-Generated Rewards and Multi-Step Reinforcement Learning,” OpenAI introduces an automated framework that relies on iterative reinforcement learning to generate a broad spectrum of novel, wide-ranging attacks.

Going all-in on red teaming pays practical, competitive dividends

It’s encouraging to see competitive intensity in red teaming growing among AI companies. When Anthropic released its AI red team guidelines in June of last year, it joined AI providers including Google, Microsoft, Nvidia, OpenAI, and even the U.S.’s National Institute of Standards and Technology (NIST), which all had released red teaming frameworks. Investing heavily in red teaming yields tangible benefits for security leaders in any organization. OpenAI’s paper on external red teaming provides a detailed analysis of how the company strives to create specialized external teams that include cybersecurity and subject matter experts. The goal is to see if knowledgeable external teams can defeat models’ security perimeters and find gaps in their security, biases and controls that prompt-based testing couldn’t find.

What makes OpenAI’s recent papers noteworthy is how well they define using human-in-the-middle
