Daily Brief: OpenAI Launches EVMbench Benchmark for AI Agents on Smart Contract Vulnerabilities
OpenAI Launches EVMbench Benchmark for AI Agents on Smart Contract Vulnerabilities
OpenAI dropped EVMbench today, a new benchmark to test AI agents on spotting, exploiting, and fixing vulnerabilities in Ethereum smart contracts. Curated from 40 audit repos with 120 high-severity issues, it uses programmatic grading for objectivity. Developed with Paradigm, it exposed gaps: top models like GPT-5.3-Codex hit 72.2% on exploits but faltered on detection and patching.[1][2][3]
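To make "programmatic grading" a bit more concrete, here is a minimal Python sketch of how per-mode scores could be tallied from automated pass/fail checks. The mode names, the sample results, and the pass_rates helper are all hypothetical illustrations, not EVMbench's actual harness or data.

```python
from collections import defaultdict

# Hypothetical graded results: (mode, passed). The three modes mirror the
# capabilities described above; the outcomes are made up for illustration.
results = [
    ("detect", True), ("detect", False), ("detect", False),
    ("patch", True), ("patch", False),
    ("exploit", True), ("exploit", True), ("exploit", False),
]

def pass_rates(results):
    """Return the fraction of passed tasks for each mode."""
    totals = defaultdict(int)
    passed = defaultdict(int)
    for mode, ok in results:
        totals[mode] += 1
        passed[mode] += int(ok)
    return {mode: passed[mode] / totals[mode] for mode in totals}

rates = pass_rates(results)
for mode, rate in rates.items():
    print(f"{mode}: {rate:.1%}")
```

In this framing, a headline number like the 72.2% above would simply be the share of tasks in one mode that an automated check marked as passed, with no human judgment in the loop.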
OpenAI's putting $10M into cybersecurity research, signaling serious commitment to agentic AI in blockchain. OpenAI's announcement post racked up over 6k likes, with crypto and AI folks praising the real-world testing rigor.
This could slash the billions lost to exploits annually, pushing agents toward reliable security roles.
World Labs Raises $1B Led by NVIDIA, AMD for Spatial Intelligence AI Models
Fei-Fei Li's World Labs just closed a massive $1B round on February 18, led by NVIDIA and AMD, with Fidelity also participating and Autodesk chipping in $200M. Building on their 2024 $230M seed, they're targeting frontier models for 3D spatial intelligence: think perception, interaction, and integration into design tools.[4][5][6]
It's a VC pivot from LLMs to embodied AI, betting big on "world models" for real-world apps like architecture. The post got 856 likes, with VCs like @zeinatab highlighting the chip giants' backing.
Saudi HUMAIN Invests $3B in xAI Series E Ahead of SpaceX Acquisition
Saudi PIF-backed HUMAIN poured $3B into xAI's $20B Series E in early February, right before SpaceX acquired xAI. HUMAIN's stake converts to SpaceX shares, positioning it as a major player in the $1.25T combined entity and extending its existing datacenter ties.[7][8]
This underscores the Middle East's AI gold rush and Musk's ecosystem pull. HUMAIN's Tareq Amin shared the pride on X, sparking buzz on strategic alliances.
Scout AI Unveils Fury Autonomous Vehicle Orchestrator for Military Applications
Scout AI launched Fury today after a year of dev: the first agentic system turning natural language missions into coordinated actions for drone and ground vehicle fleets. U.S. Army demo showed autonomous strikes with battle damage assessment; they've locked in contracts for ISVs.[10][11][12]
Video from @adcock_colby hit 651 likes, fueling talk on AI agents in kinetic ops. It's a notable step forward for defense robotics.
India AI Impact Summit 2026 Draws Nordic Leaders from Sweden and Finland
At New Delhi's AI Impact Summit (Feb 16-20), Sweden's Deputy PM Ebba Busch and Finland's PM Petteri Orpo met Modi to push AI ties, sovereign tech, ethics, and India-EU FTA. Busch lauded Modi's vision for sustainable Nordic-India collab.[13][14][15]
Diplomatic buzz on X included MEAIndia welcoming Orpo, a prime opening for Nordic firms eyeing global partnerships.
What This Means For Your Business
Agent benchmarks like EVMbench highlight the gap between hype and reliability in multi-agent systems—perfect for Up North AI's quality & trust reviews and outcome engineering. As models struggle with patching, businesses need judgment to orchestrate secure workflows, whether in blockchain or beyond. Our MCP/A2A expertise turns raw agent power into dependable teams, echoing "Code is free. Judgment isn't."
Funding surges (World Labs, xAI) and defense plays like Fury scream demand for multi-agent orchestration. Spatial AI and physical fleets call for agent workforce design that scales from sims to real ops. Nordic leaders at the India AI Impact Summit open doors for EU-Asia pilots in ethical AI.
Key takeaway: Invest in vetted agent systems now—benchmarks prove raw models won't cut it alone. Up North AI engineers outcomes that deliver.
Sources
- https://cdn.openai.com/evmbench/evmbench.pdf
- https://cryptobriefing.com/ai-security-benchmarking-system
- https://www.coindesk.com/tech/2026/02/18/sam-altman-s-openai-unveils-evmbench-to-test-whether-ai-can-keep-crypto-s-smart-contracts-safe
- https://finance.yahoo.com/news/ai-pioneer-fei-fei-lis-202957884.html
- https://techcrunch.com/2026/02/18/world-labs-lands-200m-from-autodesk-to-bring-world-models-into-3d-workflows
- https://www.thedeepview.com/articles/world-labs-raises-usd1b-as-vcs-look-beyond-llms
- https://finance.yahoo.com/news/saudi-arabia-humain-invests-3-123558006.html
- https://www.semafor.com/article/02/18/2026/saudis-humain-invests-3b-in-elon-musks-xai
- https://www.prnewswire.com/news-releases/scout-ai-introduces-fury-autonomous-vehicle-orchestrator-302691787.html
- https://scoutco.ai/
- https://www.wired.com/story/ai-lab-scout-ai-is-using-ai-agents-to-blow-things-up
- https://www.newindiaabroad.com/english/news/swedish-finnish-uk-leaders-arrive-in-delhi-for-ai-impact-summit
- https://m.economictimes.com/news/newsblogs/ai-impact-summit-2026-delhi-live-updates-day-2-announcements-narendra-modi-bharat-mandapam-french-president-macron-visit-india/liveblog/128449060.cms
- https://www.livemint.com/technology/ai-summit-delhi-2026-live-updates-ai-impact-summit-day-3-bharat-mandapam-india-narendra-modi-delhi-expo-18-february-2026-11771376974825.html