Software Testing AI Models

7 days ago

What to know about the AI models that are jolting Washington

Researchers who have tested Anthropic’s Mythos and OpenAI’s GPT-5.5 say their hacking capabilities are a “game-changer.” ...

3 days ago on MSN

AI Model Release Tracker: Opus 4.8's misalignment rates similar to Claude Mythos Preview

3 days ago

This AI Startup’s Army Of 15,000 Hackers Pressure Test Claude, GPT-5 And Gemini

Gray Swan works with every major frontier AI lab. Now it’s raised $40 million as it expands to sell security tools to ...

26 days ago

U.S. government to test AI models, expand oversight

The Center for AI Standards and Innovation announced Tuesday that it will test AI models from some top firms to vet them for security risks.

7 days ago on MSN

Anthropic AI model finds over 10,000 critical bugs across open-source software projects

The AI startup said around 50 carefully selected partners, including technology firms and research organisations, were given ...

Morning Overview on MSN

The newest Anthropic model just took the top spot on the Super-Agent benchmark — the only ...

Anthropic’s latest AI model has reportedly reached the top of the Super-Agent benchmark, a grueling test of whether an AI system can take a real-world code repository and run it from scratch without ...

VentureBeat

Anthropic says its most powerful AI cyber model is too dangerous to release publicly — so ...

Anthropic on Tuesday announced Project Glasswing, a sweeping cybersecurity initiative that pairs an unreleased frontier AI model — Claude Mythos Preview — with a coalition of twelve major technology ...

Crypto Briefing

Anthropic’s Project Glasswing uncovers over 10,000 software vulnerabilities using AI

Anthropic's Project Glasswing used Claude Mythos Preview AI to find over 10,000 critical software vulnerabilities, including ...

5 days ago

AI guardrail removals raise questions over limits of open-source model regulation

Financial Times tests and new research show safety guardrails on open-source AI models can be removed in minutes, raising doubts over developer-focused regulation and governance limits.

Drug Target Review

Why AI models need patient data to deliver in drug discovery

Despite rapid advances in AI, many drug discovery models still struggle to translate computational predictions into clinical ...
