Red teaming these applications happens at two distinct layers. Each layer calls for a different kind of testing, and neither can be skipped.
Layer 1: Base Model Red Teaming (GPT-4 / Underlying LLM)
For both Bing Chat and M365 Copilot, Microsoft’s AI Red Team tests the underlying model (GPT-4 or Phi) independently, before it is integrated into any product. This layer focuses on:
- Intrinsic model safety and alignment
- Harmful content capability scoping (what the model can produce)
- Jailbreak susceptibility at the raw model level
- Multi-turn manipulation (Crescendo, Skeleton Key, CCA techniques)
This is done once at the model level and feeds safety improvements back into the model training and fine-tuning pipeline.
Layer 2: Application-Level Red Teaming
This is where M365 Copilot and Bing Chat diverge significantly from a generic chatbot, and where the most critical, product-specific red teaming is required.
For Bing Chat (now Microsoft Copilot web)
Bing Chat’s red teaming scope encompasses the entire search-grounded experience, not just the chat interface. Key focus areas:
Before Bing Chat launched, Microsoft ran hundreds of hours of adversarial probing by dedicated RAI experts — on top of the base GPT-4 red teaming — specifically targeting search-grounded failure modes.
For M365 Copilot (Teams, Word, Outlook, SharePoint)
M365 Copilot is significantly more complex and requires Tier 2–3 red teaming (Standard → Deep Agentic) because it has read/write access to sensitive enterprise data and can take actions — not just answer questions.
Unique risk surface of M365 Copilot:
For Copilot Studio Agents (Custom M365-Embedded Agents)
Copilot Studio agents require Tier 3: Deep Agentic Red Teaming — the highest non-regulated tier — because they can act with real write permissions on enterprise systems.
Documented real-world attacks (Tenable, Datadog — 2025):
- A Copilot Studio travel-booking agent was coerced via prompt injection to reveal payment card records and set booking prices to $0 by abusing its update action
- CoPhish attack: Exploiting Copilot Studio demo pages and OAuth login flows to harvest OAuth tokens and achieve tenant compromise
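The first attack above comes down to an agent's write action accepting attacker-influenced values. A minimal sketch of a red-team regression check for that failure mode follows. It assumes a hypothetical `update_booking` tool, a hypothetical `MIN_PRICE` rule, and a hypothetical `PolicyViolation` error; none of these names come from Copilot Studio or from the cited reports.

```python
from dataclasses import dataclass


@dataclass
class Booking:
    booking_id: str
    price: float


class PolicyViolation(Exception):
    """Raised when an agent-requested write breaks a business rule."""


# Hypothetical price floor (assumption); a real system would derive it from pricing data.
MIN_PRICE = 1.00


def update_booking(booking: Booking, new_price: float) -> Booking:
    """Hypothetical write action exposed to the agent.

    The check runs outside the model, so an injected instruction that
    persuades the agent to request a $0 price is still rejected here.
    """
    if new_price < MIN_PRICE:
        raise PolicyViolation(f"price {new_price} is below the allowed minimum")
    return Booking(booking.booking_id, new_price)


def attempt_injected_update(booking: Booking, injected_price: float) -> bool:
    """Simulate the agent obeying an injected instruction.

    Returns True if the write went through, False if the guard blocked it.
    """
    try:
        update_booking(booking, injected_price)
        return True
    except PolicyViolation:
        return False


if __name__ == "__main__":
    trip = Booking("BK-1", 499.0)
    print(attempt_injected_update(trip, 0.0))    # False: the $0 write is blocked
    print(attempt_injected_update(trip, 450.0))  # True: a legitimate change passes
```

In a real engagement the injected value would arrive through the agent's conversation or a connector payload rather than a direct function call; the sketch only shows the assertion a tester would want to hold after each attack attempt.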
Required red teaming for Copilot Studio agents:
Tooling recommended for Copilot Studio red teaming:
- Azure AI Foundry AI Red Teaming Agent — directly integrates with Copilot Studio for automated safety scans
- PyRIT — for multi-turn attack strategy execution (Crescendo, TAP)
- Microsoft Defender Advanced Hunting — for detecting agent misconfigurations post-deployment via Community Hunting Queries
- Manual adversarial testing — always required for agentic write-action scenarios; automation alone is insufficient
Summary: Recommended Tiers by Application Type
| Application | Red Teaming Tier | Key Unique Risks |
| --- | --- | --- |
| Bing Chat / Copilot web | Tier 2 (Standard) | Indirect injection via web content, search-grounded misinformation |
| M365 Copilot (read-only) | Tier 2 (Standard) | Data oversharing, EchoLeak-style injection via documents/email |
| M365 Copilot (with plugins/connectors) | Tier 2–3 | Exfiltration chains, cross-tenant data bleed, OAuth misuse |
| Copilot Studio agents (read/write) | Tier 3 (Deep Agentic) | Action abuse, connector prompt injection, shadow agents, CoPhish |
| Copilot Studio agents (regulated data: healthcare, finance) | Tier 4 | Full adversarial + regulatory conformity |
The core principle Microsoft applies: the more real-world action a Copilot application can take and the more sensitive data it can access, the deeper the red teaming must go — from passive content safety testing all the way to full agentic adversarial simulation.
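The tier recommendations in this summary can be read as a small decision rule. The sketch below is one possible encoding, not an official Microsoft procedure: the `AppProfile` fields and the `recommended_tier` function are hypothetical names, and the rule covers only the combinations listed in the summary (regulated data without write actions, for example, is not addressed there and is not treated specially here).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AppProfile:
    """Hypothetical description of a Copilot-style application."""
    has_write_actions: bool        # can it change enterprise systems?
    has_connectors: bool           # are plugins or connectors attached?
    handles_regulated_data: bool   # e.g. healthcare or finance data


def recommended_tier(app: AppProfile) -> str:
    """Map an application profile to the tier named in the summary."""
    if app.has_write_actions and app.handles_regulated_data:
        return "Tier 4"
    if app.has_write_actions:
        return "Tier 3 (Deep Agentic)"
    if app.has_connectors:
        return "Tier 2-3"
    return "Tier 2 (Standard)"


if __name__ == "__main__":
    read_only_copilot = AppProfile(False, False, False)
    studio_agent = AppProfile(True, True, False)
    print(recommended_tier(read_only_copilot))  # Tier 2 (Standard)
    print(recommended_tier(studio_agent))       # Tier 3 (Deep Agentic)
```

Checking the most severe condition first keeps the rule monotonic: adding a capability to a profile can raise the recommended tier but never lower it.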