These applications fall into two distinct layers — each requiring different red teaming, and both are required.

Layer 1: Base Model Red Teaming (GPT-4 / Underlying LLM)

For both Bing Chat and M365 Copilot, Microsoft’s AI Red Team red teams the underlying GPT-4 or Phi model independently before it is integrated into any product. This layer focuses on:

  • Intrinsic model safety and alignment
  • Harmful content capability scoping (what the model can produce)
  • Jailbreak susceptibility at the raw model level
  • Multi-turn manipulation (Crescendo, Skeleton Key, CCA techniques)

This is done once at the model level and feeds safety improvements back into the model training and fine-tuning pipeline.

Layer 2: Application-Level Red Teaming

This is where M365 Copilot and Bing Chat diverge significantly from a generic chatbot — and where the most critical, unique red teaming is required.

For Bing Chat (now Microsoft Copilot web)

Bing Chat’s red teaming scope encompasses the entire search-grounded experience, not just the chat interface. Key focus areas:

Attack Vector

What Is Tested

Indirect prompt injection via web content

Malicious web pages injecting instructions into search-grounded responses

Grounded hallucination / misinformation

AI citing poisoned or adversarial web sources as authoritative

Jailbreaks through search context

Using retrieved documents to smuggle unsafe content

Persona manipulation

Attempts to get Copilot to adopt harmful personas via multi-turn dialogue

RAI (Responsible AI) harms

Bias, hate speech, self-harm content triggered via search topics

Before Bing Chat launched, Microsoft ran hundreds of hours of adversarial probing by dedicated RAI experts — on top of the base GPT-4 red teaming — specifically targeting search-grounded failure modes.

For M365 Copilot (Teams, Word, Outlook, SharePoint)

M365 Copilot is significantly more complex and requires Tier 2–3 red teaming (Standard → Deep Agentic) because it has read/write access to sensitive enterprise data and can take actions — not just answer questions.

Unique risk surface of M365 Copilot:

Risk Category

Description

Red Teaming Focus

Data oversharing / over-retrieval

Copilot accesses everything a user can access via Microsoft Graph — surfacing files users forgot existed

Test cross-document data leakage; verify permission boundary enforcement

Indirect prompt injection (EchoLeak / CVE-2025-32711)

Zero-click exploit: malicious content hidden in a SharePoint doc or email injects instructions into Copilot's context

Inject adversarial content into SharePoint, OneDrive, Teams messages, emails; verify Copilot doesn't execute injected instructions

Sensitive data exfiltration

Attackers embed hidden instructions in documents to have Copilot summarize and transmit sensitive data to an attacker-controlled endpoint

Test document-grounded exfiltration chains

Cross-user data bleed

Copilot responses leaking data between users in the same tenant

Verify tenant isolation and session boundaries

Privilege escalation via Copilot

Using Copilot to discover misconfigured SharePoint permissions or excessive access

Red team Copilot as a "living off the land" discovery tool for an attacker with initial access

For Copilot Studio Agents (Custom M365-Embedded Agents)

Copilot Studio agents require Tier 3: Deep Agentic Red Teaming — the highest non-regulated tier — because they can act with real write permissions on enterprise systems.

Documented real-world attacks (Tenable, Datadog — 2025):

  • A Copilot Studio travel-booking agent was coerced via prompt injection to reveal payment card records and set booking prices to $0 by abusing its update action
  • CoPhish attack: Exploiting Copilot Studio demo pages and OAuth login flows to harvest OAuth tokens and achieve tenant compromise

Required red teaming for Copilot Studio agents:

Test Category

What to Test

Agent misconfiguration

Agents shared org-wide without authentication, overprivileged connectors, agents persisting after owner departs

Prompt injection via connector data

Inject adversarial instructions into SharePoint, Dataverse, Exchange, or external API responses that the agent reads

Unauthorized action execution

Can the agent be tricked into write, delete, or update operations beyond its intended scope?

OAuth/token abuse

Test whether agent OAuth grants can be harvested or misused via CoPhish-style flows

Shadow AI agent discovery

Identify agents built by business teams outside security review that have excessive Graph permissions

Human-in-the-loop bypass

Verify that high-risk actions require confirmation and cannot be bypassed via conversation manipulation

  • Azure AI Foundry AI Red Teaming Agent — directly integrates with Copilot Studio for automated safety scans
  • PyRIT — for multi-turn attack strategy execution (Crescendo, TAP)
  • Microsoft Defender Advanced Hunting — for detecting agent misconfigurations post-deployment via Community Hunting Queries
  • Manual adversarial testing — always required for agentic write-action scenarios; automation alone is insufficient

Application

Red Teaming Tier

Key Unique Risks

Bing Chat / Copilot web

Tier 2 (Standard)

Indirect injection via web content, search-grounded misinformation

M365 Copilot (read-only)

Tier 2 (Standard)

Data oversharing, EchoLeak-style injection via documents/email

M365 Copilot (with plugins/connectors)

Tier 2–3

Exfiltration chains, cross-tenant data bleed, OAuth misuse

Copilot Studio agents (read/write)

Tier 3 (Deep Agentic)

Action abuse, connector prompt injection, shadow agents, CoPhish

Copilot Studio agents (regulated data — healthcare, finance)

Tier 4

Full adversarial + regulatory conformity

The core principle Microsoft applies: the more real-world action a Copilot application can take and the more sensitive data it can access, the deeper the red teaming must go — from passive content safety testing all the way to full agentic adversarial simulation.

References

  1. https://www.zdnet.com/article/microsofts-red-team-has-monitored-ai-since-2018-here-are-five-big-insights/
  2. https://news.microsoft.com/zh-tw/features/microsoft-ai-red-team-building/
  3. https://www.scworld.com/perspective/an-inside-look-at-microsofts-ai-red-team
  4. https://azure.github.io/PyRIT/
  5. https://www.linkedin.com/posts/markrussinovich_last-year-i-shared-our-discovery-of-the-crescendo-activity-7306036636562137089-sBtd
  6. https://www.microsoft.com/en-us/msrc/blog/2025/07/how-microsoft-defends-against-indirect-prompt-injection-attacks
  7. https://www.promptfoo.dev/blog/prompt-injection/
  8. https://concentric.ai/too-much-access-microsoft-copilot-data-risks-explained/
  9. https://www.linkedin.com/posts/errino_microsoftcopilot-cybersecurity-cio-activity-7452418699161948160-S-wz
  10. https://arxiv.org/html/2509.10540v1
  11. https://www.linkedin.com/posts/seandendle_microsoft-365-copilot-arbitrary-data-exfiltration-activity-7386521100907814914-SYc-
  12. https://www.hornetsecurity.com/en/blog/sharepoint-hacking-using-copilot/
  13. https://www.levelblue.com/blogs/levelblue-blog/trustwave-spiderlabs-red-team-flight-tests-microsoft-copilot
  14. https://windowsforum.com/threads/copilot-studio-risks-no-code-ai-agents-expose-new-attack-surface.393875/
  15. https://www.valencesecurity.com/saas-security-terms/microsoft-copilot-studio-security
  16. https://www.microsoft.com/en-us/security/blog/2026/02/12/copilot-studio-agent-security-top-10-risks-detect-prevent/?wt.md_id=AZ-MVP-5004796
  17. https://learn.microsoft.com/en-us/answers/questions/5621948/with-the-red-team-sdk-can-we-test-only-safety-risk
  18. https://www.linkedin.com/pulse/copilot-studio-risk-evaluations-ai-red-teaming-agent-estevan-dt9wf
  19. https://www.lewisdoes.dev/blog/microsoft-copilot-for-microsoft-365-extensibility-options/
  20. https://www.microsoft.com/en-us/security/blog/2023/08/07/microsoft-ai-red-team-building-future-of-safer-ai/?msockid=18290294b2206337069616d3b30c629e
  21. https://community.powerplatform.com/forums/thread/details/?threadid=f348cfc5-842e-f111-88b4-7ced8dcd2411
  22. https://www.reddit.com/r/microsoft_365_copilot/comments/1jluhki/copilot_info_security_when_using_teams/
  23. https://learn.microsoft.com/en-us/microsoft-365/copilot/extensibility/
  24. https://learn.microsoft.com/fr-fr/training/modules/introduction-ai-security-testing/1-what-is-ai-red-teaming
  25. https://helloitsliam.com/2026/01/06/secure-plugins-agents-and-graph-connectors-in-copilot/
  26. https://learn.microsoft.com/en-us/microsoft-365/copilot/security-microsoft-365-copilot
  27. https://learn.microsoft.com/en-us/microsoft-365/copilot/extensibility/samples
  28. https://learn.microsoft.com/ja-jp/training/modules/introduction-ai-security-testing/1-what-is-ai-red-teaming
  29. https://www.reddit.com/r/devops/comments/1qdr4hg/how_big_of_a_risk_is_prompt_injection_for/
  30. https://windowsforum.com/threads/copilot-for-exchange-server-on-premises-architectures-security-and-pilot-readiness.385907/
  31. https://www.lasso.security/blog/microsoft-copilot-security-concerns

Discussion

Comments are powered by Giscus / GitHub Discussions. They appear here once configured — see Configure Giscus in the project README and update GISCUS in src/consts.ts.