Rongchai Wang
Mar 05, 2026 00:55
New benchmark evaluates AI agents' ability to detect, patch, and exploit smart contract vulnerabilities. GPT-5.3-Codex scores 72.2% on exploit tasks.
OpenAI and crypto venture firm Paradigm have launched EVMbench, a benchmark that measures how well AI agents can find, fix, and exploit vulnerabilities in Ethereum smart contracts. The announcement comes as AI-powered security tools race to protect the $100 billion-plus locked in DeFi protocols.
The benchmark draws on 120 curated high-severity vulnerabilities pulled from 40 real security audits, mostly from Code4rena competitions. It also includes vulnerability scenarios from security evaluations of Tempo, a Layer 1 blockchain built for stablecoin payments.
Three Ways to Break Smart Contracts
EVMbench tests AI agents across three distinct modes. In Detect mode, agents audit contract repositories and are scored on finding known vulnerabilities. Patch mode requires agents to fix vulnerable code without breaking existing functionality. Exploit mode is the most aggressive: agents must execute actual fund-draining attacks against contracts deployed on a sandboxed blockchain.
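EVMbench's published harness defines the exact scoring; as a rough illustration only, Detect-mode scoring can be thought of as recall against the benchmark's known vulnerability set. The function and identifiers below are hypothetical sketches, not taken from EVMbench.

```python
# Hypothetical sketch of Detect-mode scoring: compare an agent's reported
# findings against the benchmark's set of known vulnerabilities.
# All names and data shapes here are illustrative, not from EVMbench.

def detect_score(reported: set[str], known: set[str]) -> float:
    """Fraction of known vulnerabilities the agent found (recall)."""
    if not known:
        return 0.0
    return len(reported & known) / len(known)

# Example: the agent reports two findings, one of which matches a known issue.
known_vulns = {"reentrancy-withdraw", "unchecked-transfer", "oracle-staleness"}
agent_report = {"reentrancy-withdraw", "integer-overflow"}
print(detect_score(agent_report, known_vulns))  # 1 of 3 known issues found
```

A recall-style metric like this also explains the failure mode described below: an agent that stops after its first confirmed bug caps its score, no matter how good that one finding is.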
The results show how quickly AI capabilities are advancing in this area. GPT-5.3-Codex running via Codex CLI hit a 72.2% success rate on exploit tasks. That is more than double the 31.9% score from GPT-5, which launched just six months prior.
Interestingly, AI agents perform better at attacking than defending. The exploit setting has a clear objective: keep iterating until you drain the funds. Detection and patching proved harder. Agents often stopped after finding one bug instead of auditing exhaustively, and maintaining full contract functionality while removing subtle vulnerabilities remained challenging.
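The "clear objective" of the exploit setting can be expressed as a simple success predicate. A minimal sketch, assuming a harness that reads balances from the sandboxed chain before and after the attack; the function name and arguments are hypothetical, not from EVMbench's actual tooling:

```python
# Hypothetical sketch of an Exploit-mode success check. A real harness would
# query balances from the sandboxed node over JSON-RPC; plain integers (in wei)
# stand in for those queries here.

def exploit_succeeded(target_before: int, target_after: int,
                      attacker_gain: int) -> bool:
    """An exploit counts as a success if it drained funds from the target
    contract AND the attacker actually captured value (not just burned gas)."""
    drained = target_after < target_before
    return drained and attacker_gain > 0

# Example: target held 100 ETH-equivalent, now holds 0; attacker netted 95
# after costs, so the attempt counts as a successful drain.
print(exploit_succeeded(100, 0, 95))   # True
print(exploit_succeeded(100, 100, 0))  # False: nothing was drained
```

A binary, self-checking objective like this is easy to iterate against, which is one plausible reason exploit scores outpace detection and patching, where "done" is much harder for the agent to verify on its own.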
Real Limitations Worth Noting
OpenAI acknowledged EVMbench does not capture the full difficulty of real-world contract security. Heavily deployed protocols like Uniswap or Aave undergo far more scrutiny than audit competition code. The benchmark also can't verify whether an agent finds legitimate vulnerabilities that human auditors missed; it only checks against known issues.
The exploit setting runs on a clean local Anvil instance rather than forked mainnet state, and timing-dependent attacks fall outside its scope. For now, the benchmark covers single-chain environments only.
$10M for Defensive Research
Alongside EVMbench, OpenAI committed $10 million in API credits specifically for defensive security research. The company is expanding its Aardvark security research agent to more users and partnering with open-source maintainers on free codebase scanning.
The timing matters. As AI agents get better at exploiting contracts, the window between vulnerability discovery and exploitation shrinks. Protocol teams that aren't using AI-assisted auditing will increasingly find themselves at a disadvantage against attackers who are.
OpenAI released EVMbench's tasks, tooling, and evaluation framework publicly. For DeFi builders and security researchers, it is both a measuring stick and a warning about where AI capabilities are headed.
Image source: Shutterstock
