Microsoft Launches AI Red Teaming Agent to Automate Adversarial Safety Testing

Azure Support

May 19, 2026 · Updated Jun 12, 2026

Microsoft released the AI Red Teaming Agent in preview to automate the discovery of security risks in generative AI applications. The tool simulates adversarial attacks to measure vulnerabilities and generate safety scorecards, shifting safety engineering from manual probing to autonomous evaluation.

Microsoft launched the AI Red Teaming Agent in preview within Azure AI Foundry, providing an automated way to probe generative AI systems for vulnerabilities. The tool acts as an adversarial agent that executes multi-step attacks, such as jailbreaks (techniques used to bypass model constraints), to identify where an attacker could get past a model's safety guardrails.

Availability: Public preview
Underlying framework: PyRIT (Python Risk Identification Tool)
Core metric: Attack Success Rate (ASR)
Execution modes: Azure AI Foundry portal or local SDK
Attack types: Jailbreaks, prompt injections, and more

Manual red teaming is increasingly impractical at scale, a challenge highlighted by Microsoft Research's red teaming of interconnected agent networks. This update addresses the need for Cloudflare's architectural defense model by providing a standardized, autonomous method to measure the Attack Success Rate (ASR), the share of adversarial attempts that successfully bypass a system's safeguards.
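As a rough illustration of the metric, the sketch below computes an Attack Success Rate from a list of per-attack outcomes. The function name and the sample numbers are hypothetical and are not taken from the Azure AI Evaluation SDK or PyRIT; it only shows the arithmetic implied by the definition above.

```python
def attack_success_rate(outcomes):
    """Return the fraction of attack attempts that succeeded.

    `outcomes` is a sequence of booleans: True when an attack got past
    the safeguards, False when it was blocked. An empty sequence yields 0.0.
    """
    outcomes = list(outcomes)
    if not outcomes:
        return 0.0
    return sum(outcomes) / len(outcomes)


# Hypothetical run: 3 of 20 attacks bypassed the safeguards.
outcomes = [True] * 3 + [False] * 17
asr = attack_success_rate(outcomes)
print(f"ASR: {asr:.0%}")  # 3 / 20 = 15%
```

A lower ASR means fewer attacks succeeded, so a team tracking this number over time would want it to fall or stay flat as the application changes.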

You can run the agent through the Azure AI Foundry portal or locally using the Azure AI Evaluation SDK. It leverages the open-source PyRIT framework to generate safety scorecards and detailed logs for continuous improvement. The feature is currently available in public preview for Azure customers.

View the full update on learn.microsoft.com

Azure Support (@AzureSupport) · May 18

Strengthen your AI systems with the AI Red Teaming Agent! 🔐 Proactively uncover risks in generative AI ⚡ Simulate adversarial attacks to detect vulnerabilities 🛠️ Measure attack success rates and generate safety scorecards 📊 Monitor, log, and continuously improve AI safety with automated evaluations Start here 👉 https://t.co/kgdroGH3Ak #AzureAI #ResponsibleAI #AIsecurity #GenAI

Still wondering? A few quick answers below.

What is the Microsoft AI Red Teaming Agent?

The AI Red Teaming Agent is an automated security tool within Azure AI Foundry designed to identify risks in generative AI systems. It acts as an adversarial agent that simulates attacks such as jailbreaks and prompt injections, which are techniques used to bypass a model's safety filters and instructions, to proactively uncover vulnerabilities.

How does the AI Red Teaming Agent work?

The agent automates the red teaming process by simulating adversarial probing against a target AI application. It generates attack-response pairs that are evaluated to calculate an Attack Success Rate. This quantitative metric helps developers measure how often a system fails to block harmful inputs, allowing for data-driven safety improvements and the generation of scorecards.
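To make the scorecard idea concrete, here is a minimal sketch that groups evaluated attack-response pairs by risk category and reports a success rate for each. The `AttackResult` record, the category names, and the strategy labels are all assumptions made for illustration; the agent's actual output schema may differ.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class AttackResult:
    """One attack-response pair plus the evaluator's verdict (hypothetical schema)."""
    risk_category: str
    strategy: str
    prompt: str
    response: str
    succeeded: bool  # True if the response was judged harmful


def build_scorecard(results):
    """Group results by risk category and compute the success rate per group."""
    counts = defaultdict(lambda: {"attempts": 0, "successes": 0})
    for r in results:
        counts[r.risk_category]["attempts"] += 1
        counts[r.risk_category]["successes"] += int(r.succeeded)
    return {
        category: {**c, "asr": c["successes"] / c["attempts"]}
        for category, c in sorted(counts.items())
    }


# Placeholder data; real prompts and responses would come from a scan.
results = [
    AttackResult("violence", "baseline", "<attack prompt>", "<refusal>", False),
    AttackResult("violence", "jailbreak", "<attack prompt>", "<harmful reply>", True),
    AttackResult("hate_unfairness", "baseline", "<attack prompt>", "<refusal>", False),
    AttackResult("hate_unfairness", "prompt_injection", "<attack prompt>", "<refusal>", False),
]

for category, row in build_scorecard(results).items():
    print(f"{category:<16} attempts={row['attempts']} asr={row['asr']:.0%}")
```

Breaking the rate down by category, rather than reporting one overall number, shows which kinds of harmful input the system handles worst.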

Is the AI Red Teaming Agent available to everyone?

The AI Red Teaming Agent is currently available in public preview for Azure customers. Users can access the tool directly through the Azure AI Foundry portal or run it locally on their own hardware using the Azure AI Evaluation SDK. This dual approach allows for both cloud-based monitoring and local development testing.

What framework does the AI Red Teaming Agent use?

The tool is built on Microsoft's open-source Python Risk Identification Tool, also known as PyRIT. By leveraging this framework, the agent can simulate complex adversarial behaviors and automate the evaluation of content risks. This integration lets the agent produce structured logs and automated evaluations, taking over work that was previously done through manual security testing.

Who can use the AI Red Teaming Agent?

The agent is designed for developers and security professionals building generative AI applications on the Azure platform. It is particularly useful for teams that need to scale their safety testing beyond manual efforts. Users can integrate the agent into their existing workflows to continuously monitor, log, and improve the safety of their AI models.

Every HeadsUpAI update is written based on its original source and reviewed before it's published.

Keep reading

Microsoft Security Previews MDASH to Automate Vulnerability Discovery With 100 Agents

Microsoft demonstrated MDASH, a multi-model agentic scanning harness that uses over 100 specialized AI agents to autonomously find and validate exploitable code vulnerabilities. The system shifts security from manual triage to an automated pipeline that can prove exploits and suggest fixes directly in the developer workflow.

Microsoft Research · May 1

Microsoft Research Identifies Four Critical Risks in Interconnected AI Agent Networks

Microsoft Research red-teamed a network of over 100 autonomous agents to uncover vulnerabilities that only appear during agent-to-agent interactions. These network-level risks, including self-propagating worms and manufactured consensus, suggest that individual agent safety is insufficient for securing interconnected ecosystems.

NVIDIA · May 21

NVIDIA Launches Verified Agent Skills to Secure Autonomous AI Capabilities

NVIDIA released a verification framework for agent skills that uses automated scanning and cryptographic signing to ensure AI instructions are safe and authentic. While previous security focused on isolating the agent's environment, this shift brings governance directly to the capabilities an agent learns and executes.

Replit · Apr 21

Replit Launches Security Agent to Perform Deep Code Audits in Minutes

Replit launched the Security Agent, a specialized tool that performs full pre-launch audits by mapping application architecture and identifying exploitable vulnerabilities. It uses a hybrid approach to filter out 90% of false positives, securing AI-generated code without the weeks of manual review typically required by security engineers.
