Agent Threat Rules: New AI Agent Security Detection Format
Summary
Agent Threat Rules, or ATR, is a new open detection format designed to counter security threats against AI agents. These agents are used in coding assistants and other frameworks, and their access can lead to prompt injection or credential theft. Here's the thing: ATR rules are written in YAML and declare attack patterns, input fields, and test cases. A reference engine in TypeScript and a Python wrapper called pyATR evaluate these rules. They are both under the MIT license. What's interesting is the project includes over 400 rules covering categories like prompt injection and skill compromise. It draws inspiration from established rule standards like Sigma and YARA. The system shows varied performance. For example, it achieves 98.0% recall against NVIDIA garak’s in-the-wild jailbreak corpus but only 1.3% against AdvBench. The maintainer notes that while individual rules pass tests, paraphrased attacks can still be missed. The bottom line: Four organizations, including Microsoft and Cisco, already use or have integrated ATR into their tools. This initiative aims to provide a standardized way to detect AI agent security threats, which is crucial as AI adoption grows.
This is an AI-generated audio summary. Always check the original source for complete reporting.