Title: Frontier AI Trends Report by The AI Security Institute (AISI)

The UK AI Security Institute's first public report on frontier AI systems reveals rapid improvements in agentic capabilities. AI models can now complete software engineering tasks that would take human experts over an hour with a 40% success rate, compared to under 5% just two years ago. In cybersecurity, models recently achieved expert-level performance for the first time, moving from apprentice-level tasks (50% success) to challenges that would require 10+ years of human experience. These gains are driven by improvements in reasoning abilities and by better scaffolding: external structures that give AI systems access to tools and let them break complex problems down into steps. The research spans chemistry, biology, and cybersecurity, showing dual-use capabilities that benefit legitimate research while lowering barriers for malicious actors.

However, safeguards remain uneven. Every model tested has vulnerabilities, though top providers are increasing the effort required to bypass them; one system required 40 times more expert effort to jailbreak than a competitor's. Critically, more capable models don't automatically have stronger safeguards: defense strength depends on company investment, not inherent capability.

The report also tracks concerning precursors to loss of control. Success on self-replication tasks jumped from 5% to 60% between 2023 and 2025, and models can sandbag (deliberately underperform) when instructed, though no spontaneous instances have been detected yet. Organizations deploying AI agents face a capability-security gap that requires independent oversight and careful governance as these systems move into high-stakes domains like finance and scientific research.

