AI News: May 7, 2026 — IBM Think Wraps, Claude Sonnet 4.6 Leads ClawBench, Anthropic JV Confirmed
Source: AIToolsRecap.com | Published: May 7, 2026
Overview
- IBM Think 2026 concluded its four-day run in Boston with multiple general-availability product launches.
- Claude Sonnet 4.6 achieved a top score of 33.3% on ClawBench, the first agent benchmark evaluated against live production websites.
- Anthropic’s $1.5 billion private equity joint venture structure was formally confirmed, with major contributions from Blackstone, Hellman & Friedman, and Goldman Sachs.
IBM Think 2026 — Final Day Highlights
IBM’s annual Think conference closed May 7 in Boston with GA releases for several previewed products:
- IBM Sovereign Core — Now GA. Embeds governance policy at the infrastructure runtime level for regulated, cross-border environments.
- IBM Bob — Launched in tiers: Pro, Pro+, Ultra, and Enterprise SaaS. An end-to-end software development partner covering code generation, testing, security, and deployment across the full SDLC. Unlike point-in-time coding assistants, Bob operates across the entire application lifecycle.
- Next-Gen IBM watsonx Orchestrate — Full release for multi-agent orchestration. Enables enterprises to build, deploy, and manage thousands of agents built by different teams across an organization.
- IBM Docling for watsonx — Document intelligence platform that converts documents into structured Markdown, JSON, and HTML for RAG workflows.
- OpenRAG on watsonx.data — Open agentic retrieval framework shipped alongside the conference close.
Claude Sonnet 4.6 Tops ClawBench
Researchers from UBC and Vector Institute published ClawBench, a new evaluation framework for real-world AI agents:
- Scale: 153 tasks across 144 live production websites in 15 categories, including completing purchases, booking appointments, and submitting job applications.
- Live-Site Execution: Unlike prior sandbox benchmarks, ClawBench operates on real production sites, intercepting only the final submission request to keep evaluation safe.
- Top Score: Claude Sonnet 4.6 scored 33.3%, the highest among all frontier models tested.
- Behavioral Capture: Records five layers of data per run:
  - Session replays
  - Screenshots
  - HTTP traffic
  - Agent reasoning traces
  - Browser actions
- Evaluation: Scored by an agentic evaluator that produces step-level diagnostics.
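The live-site approach described above, where every request reaches the real site except the final submission, can be illustrated with a minimal sketch. This is not ClawBench's actual code: `RunRecord`, `is_final_submission`, `handle_request`, and the `shop.example` URLs are all hypothetical names chosen for illustration, and matching on method, host, and path is an assumption about how a final submission might be identified.

```python
from dataclasses import dataclass, field
from urllib.parse import urlsplit


@dataclass
class RunRecord:
    """Hypothetical container for the five capture layers listed above."""
    session_replay: list = field(default_factory=list)
    screenshots: list = field(default_factory=list)
    http_traffic: list = field(default_factory=list)
    reasoning_trace: list = field(default_factory=list)
    browser_actions: list = field(default_factory=list)


def is_final_submission(method: str, url: str,
                        submit_method: str, submit_url: str) -> bool:
    """Return True when a request matches the task's final submission endpoint.

    Assumption: the endpoint is identified by HTTP method, host, and path.
    """
    req, target = urlsplit(url), urlsplit(submit_url)
    return (
        method.upper() == submit_method.upper()
        and req.netloc.lower() == target.netloc.lower()
        and req.path.rstrip("/") == target.path.rstrip("/")
    )


def handle_request(record: RunRecord, method: str, url: str,
                   submit_method: str, submit_url: str) -> str:
    """Log every request and block only the final submission."""
    if is_final_submission(method, url, submit_method, submit_url):
        decision = "blocked"
    else:
        decision = "forwarded"
    record.http_traffic.append(
        {"method": method, "url": url, "decision": decision}
    )
    return decision


record = RunRecord()
# Ordinary navigation is forwarded to the live site.
handle_request(record, "GET", "https://shop.example/cart",
               "POST", "https://shop.example/checkout")
# The final purchase request is stopped before it reaches the site.
handle_request(record, "POST", "https://shop.example/checkout",
               "POST", "https://shop.example/checkout")
```

Logging the blocked request alongside the forwarded ones keeps the full HTTP trace available to an evaluator, even though the last step never has a real-world effect.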
Anthropic $1.5B Joint Venture — Structure Confirmed
The full structure of Anthropic’s private equity joint venture is now confirmed:
- Vehicle Size: $1.5 billion
- Major Contributions:
  - Anthropic: ~$300 million
  - Blackstone: ~$300 million
  - Hellman & Friedman: ~$300 million
  - Goldman Sachs: $150 million
- Additional Participants: Apollo Global Management, General Atlantic, Leonard Green, GIC, and Sequoia Capital.
- Operating Model: A forward-deployed enterprise services firm that embeds Claude directly into the operations of PE-backed portfolio companies.
- Key Quote: CFO Krishna Rao said the structure exists because enterprise demand for Claude is “significantly outpacing any single delivery model.”
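The contribution figures listed above can be tallied with a quick calculation. This is a rough sketch: three of the four amounts are reported as approximate, and it assumes the $1.5 billion vehicle size is the sum of all commitments, so the remainder attributed to the additional participants is an inference, not a reported number.

```python
# Amounts in millions of USD, as listed above (three are approximate).
VEHICLE_SIZE_M = 1500
named_contributions = {
    "Anthropic": 300,
    "Blackstone": 300,
    "Hellman & Friedman": 300,
    "Goldman Sachs": 150,
}

named_total = sum(named_contributions.values())
# Assumption: whatever is not itemized comes from the additional participants.
remainder = VEHICLE_SIZE_M - named_total

print(f"Named contributions: ~${named_total}M")          # ~$1050M
print(f"Implied remainder:   ~${remainder}M")            # ~$450M
for name, amount in named_contributions.items():
    print(f"{name}: {amount / VEHICLE_SIZE_M:.0%} of the vehicle")
```

On these figures the four named contributors account for roughly 70% of the vehicle, leaving about $450 million across the five additional participants.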