
Copilot Is Everywhere: How Microsoft's AI Sprawl Is Creating the Biggest eDiscovery Blind Spot Since BYOD

April 8, 2026

Microsoft shipped Copilot to 400 million users. Most litigation teams haven't updated their legal holds to account for the hidden Exchange folders, fragmented SharePoint containers, and new .loop files that Copilot generates across dozens of surfaces.

By Claude and Gemini with Sid Newby | April 2026

Microsoft shipped Copilot to 400 million users. Then somebody in legal asked where the data goes.

That question — deceptively simple, operationally devastating — landed on my desk three times last month from three different Am Law 200 firms. Each one had rolled out Microsoft 365 Copilot to their enterprise. Each one had updated their legal hold templates sometime in the last eighteen months. None of them had accounted for the fact that a single Copilot prompt now generates discoverable artifacts across Exchange hidden folders, SharePoint containers, OneDrive paths, and a new category of .loop files that most litigation hold workflows have never heard of. One firm's general counsel put it plainly: "We told our custodians to preserve everything in their mailbox. We didn't realize the mailbox had grown invisible rooms."

The Copilot eDiscovery problem isn't theoretical. It's a preservation emergency hiding behind a productivity feature — and most litigation teams are sleepwalking into it.


The Copilot sprawl nobody mapped

Here is what happened while legal departments were debating whether to adopt AI: Microsoft embedded it everywhere.

Count the surfaces. Microsoft 365 Copilot runs inside Word, Excel, PowerPoint, Outlook, Teams, OneNote, Loop, Whiteboard, Stream, Forms, Planner, and SharePoint. Copilot Chat (the free browser-based version, formerly Bing Chat Enterprise) works without an M365 license. Security Copilot operates inside Microsoft Defender and Purview. Copilot Studio lets enterprises build custom Copilot agents. Copilot in Fabric runs inside Power BI and the data analytics stack. Copilot in Edge summarizes web pages. Windows Copilot sits in the operating system itself. And as of February 2026, Microsoft Purview's retention system now tracks interactions from ChatGPT Enterprise, Google Gemini, and DeepSeek alongside Microsoft's own Copilot products.[1]

That is not a product. That is an ecosystem — and every node in it generates ESI that your legal hold probably doesn't cover.


Figure 1: Microsoft's Copilot ecosystem spans dozens of surfaces across productivity, security, analytics, and the operating system itself — each generating unique discoverable artifacts.

Lighthouse Global's April 2026 analysis called it "steering the Microsoft Copilot fleet" and warned that organizations "cannot differentiate by M365 application" when setting retention policies.[2] You cannot tell Purview to keep Word Copilot prompts for seven years and delete Teams Copilot prompts after thirty days. It's all or nothing. For litigation teams accustomed to granular hold configurations — preserve this custodian's email but not their voicemail, hold Teams chats but not Yammer — that blunt instrument is a governance regression.


Where the bodies are buried: the hidden mailbox architecture

Every Copilot interaction — every prompt you type, every response the AI returns — gets stored in a hidden folder inside the user's Exchange Online mailbox.[1] Not the regular Inbox. Not Sent Items. A hidden folder that "isn't designed to be directly accessible to users or administrators," according to Microsoft's own documentation.[1]

This is the same storage mechanism that Exchange uses for Teams private channel messages and cloud-based Teams users. The mailbox has a RecipientTypeDetails attribute of UserMailbox, which means compliance administrators can search it with eDiscovery tools — but only if they know to look for it.[1]

When a retention policy kicks in and messages expire, they move to another hidden folder called SubstrateHolds — a subfolder within the Recoverable Items folder. Items sit in SubstrateHolds for at least one day before a timer job permanently deletes them, and that timer job runs every one to seven days.[1] The practical implication: deleted Copilot data can persist in a searchable state for up to eight days after the retention period expires, creating a window where you might find data you thought was gone — or where opposing counsel might find data you forgot existed.


Figure 2: The Copilot data lifecycle flows through hidden Exchange folders and SubstrateHolds before permanent deletion — with legal holds suspending the process at the final gate.

The critical point for litigation: if a mailbox is subject to Litigation Hold, a delay hold, or an eDiscovery hold, permanent deletion from SubstrateHolds is suspended entirely.[1] The items stay. That's the same behavior as regular Exchange items — which means Copilot data is covered by existing Litigation Hold mechanisms. The hold works. But only if you placed the hold on the right mailbox in the first place, and only if you know that Copilot data lives there at all.
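The lifecycle and hold mechanics above can be sketched as a small model. This is purely an illustration of the timing described in Microsoft's documentation, not a Purview API; the function and hold names are hypothetical.

```python
from datetime import date, timedelta

# Model of the SubstrateHolds lifecycle described above (hypothetical names,
# not a Purview API). Expired items dwell at least one day in SubstrateHolds,
# then a timer job that runs every 1-7 days performs permanent deletion --
# unless a Litigation Hold, delay hold, or eDiscovery hold suspends deletion.

MIN_DWELL = timedelta(days=1)      # minimum stay in SubstrateHolds
TIMER_MAX_GAP = timedelta(days=7)  # timer job runs every one to seven days

def deletion_window(retention_expiry: date, holds: set):
    """Return the (earliest, latest) possible permanent-deletion dates,
    or None when any hold suspends deletion entirely."""
    if holds & {"litigation", "delay", "ediscovery"}:
        return None  # hold in place: items remain in SubstrateHolds
    earliest = retention_expiry + MIN_DWELL
    latest = retention_expiry + MIN_DWELL + TIMER_MAX_GAP
    return earliest, latest

# No hold: data can stay searchable for up to 8 days past retention expiry.
window = deletion_window(date(2026, 3, 1), holds=set())
```

Run against a held mailbox, the same call returns `None`: the deletion step never fires, which is exactly why the hold has to land on the right mailbox in the first place.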

Redgrave LLP's March 2026 analysis put it starkly: organizations are "unknowingly accumulating vast Copilot artifacts without corresponding policies."[3] Unless IT explicitly disabled Copilot data capture, "every prompt, response, referenced file, and linked document" is searchable and preservable through eDiscovery. Microsoft's strategy is to make Copilot the "primary gateway for productivity" — and the compliance infrastructure is playing catch-up.[3]


The cloud attachment problem: one prompt, a hundred files

A Copilot prompt doesn't exist in isolation. When a user asks Copilot to "summarize the Q3 litigation budget," Copilot reaches into SharePoint, OneDrive, and the Microsoft Graph to find relevant documents. It generates cloud attachments — hyperlinked file references embedded in the Copilot response.[3] Those referenced files aren't copies stored inside the hidden mailbox folder. They're pointers to the originals, which live wherever they always lived: SharePoint document libraries, OneDrive folders, Teams file tabs.

This is where eDiscovery collections get expensive. A single custodian with active Copilot usage might generate hundreds of prompts per week. Each prompt can reference multiple files. In a Premium eDiscovery collection, those cloud attachments can be pulled into the review set along with the prompts and responses.[4] Suddenly your collection isn't a custodian's Copilot interactions — it's a custodian's Copilot interactions plus every document Copilot ever touched on their behalf.
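The expansion effect is easy to quantify with back-of-the-envelope arithmetic. The figures below are illustrative assumptions, not benchmarks from any real matter:

```python
# Rough sizing of a Copilot collection (illustrative assumptions, not benchmarks).
def copilot_collection_estimate(custodians: int,
                                prompts_per_week: int,
                                weeks: int,
                                refs_per_prompt: float) -> dict:
    prompts = custodians * prompts_per_week * weeks
    return {
        # each prompt pairs with an AI response in the hidden mailbox folder
        "prompt_response_items": prompts * 2,
        # referenced SharePoint/OneDrive files pulled in as cloud attachments
        "cloud_attachments": int(prompts * refs_per_prompt),
    }

# One active custodian, 200 prompts/week, a 26-week relevant period,
# roughly 3 referenced files per prompt:
est = copilot_collection_estimate(1, 200, 26, 3.0)
# est -> {"prompt_response_items": 10400, "cloud_attachments": 15600}
```

One custodian, one review set, roughly twenty-six thousand items before a single email is collected.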

Reed Smith's 2026 analysis flagged the "version shared" feature as particularly treacherous: Microsoft's documentation says you need it enabled to preserve referenced files, but testing across tenants shows "inconsistent results."[4] Some firms see the referenced files pulled in cleanly. Others get broken links. The tooling is not yet reliable enough for defensible preservation — and most organizations don't know that until they're in the middle of a collection.

| Copilot Artifact Type | Storage Location | Discoverable? | Legal Hold Behavior |
| --- | --- | --- | --- |
| User prompts | Hidden Exchange folder | Yes, via Purview eDiscovery | Litigation Hold suspends deletion |
| AI responses | Hidden Exchange folder | Yes, via Purview eDiscovery | Litigation Hold suspends deletion |
| Cloud attachments (referenced files) | SharePoint / OneDrive | Yes, if collected | Separate hold required on source location |
| Copilot Pages (.loop files) | SharePoint containers | Yes, but not auto-included in Litigation Hold | Manual hold configuration needed |
| Meeting summaries (Teams Copilot) | Exchange / Teams backend | Partial; excludes transcripts in transcriptionless mode | Follows Teams retention |
| Audit logs | Unified Audit Log | Yes; retained a minimum of 6 months | Not subject to Litigation Hold |

Table 1: Copilot artifacts fragment across multiple storage locations, each with different discovery and preservation behaviors. Source: Microsoft Learn,[1] Redgrave LLP,[3] Reed Smith.[4]
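Table 1's preservation logic reduces to a coverage check: given the locations you actually placed under hold, which artifacts are left exposed? The sketch below mirrors the table; the labels are illustrative, not Purview identifiers.

```python
# Map each Copilot artifact type to the storage location whose hold covers it
# (labels mirror Table 1; they are illustrative, not Purview identifiers).
ARTIFACT_LOCATIONS = {
    "user_prompts": "mailbox",                 # hidden Exchange folder
    "ai_responses": "mailbox",                 # same hidden folder
    "cloud_attachments": "sharepoint_onedrive",
    "copilot_pages": "sharepoint_container",   # .loop files: NOT covered by a mailbox hold
}

def preservation_gaps(holds_applied: set) -> list:
    """Return the artifact types left unpreserved by the holds in place."""
    return [a for a, loc in ARTIFACT_LOCATIONS.items() if loc not in holds_applied]

# A mailbox-only Litigation Hold misses every SharePoint-resident artifact:
gaps = preservation_gaps({"mailbox"})
# gaps -> ["cloud_attachments", "copilot_pages"]
```

The point of the exercise: "place the custodian on hold" is a location question, not a custodian question.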


Copilot Pages: the gap nobody patched

Copilot Pages are a new artifact category introduced with Microsoft 365 Copilot. They use the .loop file format and are stored in SharePoint containers. And the critical gap: Copilot Pages and Copilot Notebooks are not automatically included when a user is placed on Litigation Hold.[5]

Read that again. You place a custodian on Litigation Hold. Their email is preserved. Their Teams chats are preserved. Their Copilot prompts and responses — stored in the hidden Exchange folder — are preserved. But the Copilot Pages they created? Those live in SharePoint containers, and they require a separate, manual hold configuration.[5]

Microsoft's own compliance documentation acknowledges this: the summary of governance capabilities for Copilot Pages states that Litigation Hold support requires explicit configuration beyond the standard mailbox hold.[5] For a litigation team that thought "place the custodian on hold" covered everything, this is a preservation gap with real sanctions exposure.

The nBold analysis of Copilot Knowledge Pages in March 2026 added another wrinkle: when a user clicks "Edit in Pages" on a Copilot response, the data transforms from an Exchange-stored interaction into a SharePoint-stored .loop file.[6] It moves from one retention policy jurisdiction to another. The Copilot retention policy that was keeping the original prompt-response pair alive no longer applies. Now you need a SharePoint retention policy covering that specific container — and if you don't have one, that artifact is on borrowed time.
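The handoff nBold describes is effectively a state transition: the artifact changes store, format, and governing policy in one user click. A minimal model, with hypothetical field names rather than any Microsoft schema:

```python
# "Edit in Pages" transforms an Exchange-stored Copilot interaction into a
# SharePoint-stored .loop file -- and changes which retention policy governs it.
# (Hypothetical model with made-up field names, not a Microsoft API.)

def edit_in_pages(interaction: dict) -> dict:
    page = dict(interaction)  # the original interaction record is left behind
    page.update({
        "store": "sharepoint_container",
        "format": ".loop",
        "governing_policy": "sharepoint_retention",  # Copilot retention no longer applies
    })
    return page

original = {"store": "exchange_hidden_folder", "format": "message",
            "governing_policy": "copilot_retention"}
page = edit_in_pages(original)
# Without a SharePoint retention policy on the target container,
# the new artifact is on borrowed time.
```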


The Purview retention overhaul: better, but still behind

To Microsoft's credit, the retention infrastructure has improved significantly. As of February 2026, Purview retention policies now support separate locations for different Copilot products:[1]

Microsoft Copilot experiences: Microsoft's own products, such as Microsoft 365 Copilot and Security Copilot.

Enterprise AI apps: enterprise-grade AI services connected to the tenant, such as ChatGPT Enterprise.

Other AI apps: additional third-party tools, including Google Gemini and DeepSeek.

That last category is remarkable. Microsoft is now offering to retain and manage interactions with competitor AI products — ChatGPT, Gemini, DeepSeek — inside Purview's compliance framework.[1] Whether organizations trust Microsoft to govern their employees' use of competing AI tools is a question for another day. The eDiscovery implication is immediate: if your organization enables Purview collection policies for those third-party apps, those interactions become searchable ESI alongside your Copilot data.


Figure 3: Copilot-generated ESI fragments across at least five distinct storage locations, each requiring separate preservation strategies.

But the separation is still coarse. You can set different retention for "Microsoft 365 Copilot" versus "Security Copilot" — but you still cannot distinguish between Copilot prompts generated in Word versus Copilot prompts generated in Teams within the M365 Copilot location.[2] For organizations with nuanced retention schedules — keep client-facing communications for seven years, delete internal collaboration after two — that granularity gap means overretention or underretention of Copilot artifacts, neither of which is defensible.


What your legal hold template is missing

Most legal hold notices were last updated to account for Teams and Slack. Some added "AI-generated content" as a line item after the hallucination cases hit the press. Almost none address the specific mechanics of Copilot preservation. Here is what needs to change:

1. Name the hidden mailbox explicitly

Your hold notice should tell custodians — and more importantly, your IT team — that Copilot data resides in hidden Exchange folders that are not visible in Outlook. The hold must apply to the user's Exchange mailbox, which captures Copilot prompts and responses automatically. But custodians should know this data exists so they don't delete Copilot chat histories thinking it's transient.

2. Hold SharePoint containers for Copilot Pages

A mailbox-only Litigation Hold does not cover Copilot Pages or Notebooks. Your IT team needs to identify which SharePoint containers store .loop files for each custodian and apply separate holds to those locations. This is not optional — it's a preservation obligation.

3. Map cloud attachment sources

When a custodian uses Copilot to work with documents, those referenced files may need preservation. Your hold should identify the custodian's OneDrive and relevant SharePoint sites, alongside their mailbox. The "version shared" feature, if enabled, creates additional copies that may need separate retention treatment.

4. Account for third-party AI app interactions

If your organization has Purview collection policies capturing ChatGPT, Gemini, or DeepSeek interactions, those are now ESI too. Your hold template should ask custodians whether they use third-party AI tools for work purposes and include those data sources in scope.

5. Test before you litigate

Reed Smith's analysis delivered perhaps the most sobering recommendation: "you cannot assume that based on what we're saying, it's actually going to work that way."[4] Test your Copilot collection in Purview before you need it for real. Run a practice eDiscovery search. Verify that prompts, responses, and cloud attachments actually appear in the review set. The time to discover that your Copilot preservation has gaps is during a tabletop exercise, not during a meet-and-confer.


The BYOD parallel — and why this time is worse

When bring-your-own-device policies swept through enterprises in 2012-2015, litigation teams faced a similar scramble: new data sources, unclear preservation obligations, incomplete collection tools. It took years of case law, sanctions motions, and vendor innovation before BYOD preservation became routine.

Copilot is the same pattern with higher stakes. BYOD added new devices to the ESI map. Copilot adds new data types that are generated continuously, stored in hidden locations, fragmented across multiple services, and governed by retention policies that change faster than legal departments can track them. Microsoft updated its Copilot retention documentation in February 2026 — the third significant revision in twelve months.[1] Your legal hold template from January is already out of date.

The volume question alone should keep litigation support teams awake. A single active Copilot user generates dozens of prompts per day. Multiply by thousands of custodians across an enterprise, and the math is staggering. Every prompt creates a discoverable artifact. Every response creates another. Every referenced document is a potential cloud attachment pulled into the collection. Redgrave noted that organizations are preserving broadly because "identifying truly relevant items at preservation stage is difficult" — the same relevance triage problem that made email review expensive, now replicated across a data type that most review platforms have only recently begun to support.[3]


What courts haven't decided yet — and what they will

There is no case law directly addressing Copilot preservation standards. No court has ruled on whether a party's failure to preserve Copilot prompts constitutes spoliation. No judge has issued an adverse inference instruction because meeting summaries generated by Teams Copilot were lost when a custodian left the organization.

That will change. The volume of Copilot-generated ESI is growing exponentially. Microsoft reported 400 million monthly active Copilot users in January 2026.[7] As those interactions accumulate, they will become relevant in litigation — because they show what a custodian knew, when they knew it, and what AI told them to do about it.

Reed Smith's analysis framed the relevance question precisely: a court might find that a Copilot response is "unreliable" as an accurate summary of the law — but the interaction itself remains "relevant to demonstrating user intent or knowledge."[4] If a corporate officer asked Copilot to summarize the company's exposure on a pending claim, that prompt is relevant regardless of whether Copilot's answer was accurate. The question isn't whether Copilot got it right; it's what the officer was thinking about, and when.

The organizations that will be best positioned are those treating Copilot data like any other ESI source: mapped, held, collected, and reviewed. The ones that will face sanctions are those that assumed Copilot interactions were ephemeral convenience data — the digital equivalent of scribbling notes on a napkin. They are not. They are searchable, preservable, discoverable artifacts stored in Microsoft's compliance infrastructure. The napkin has a SubstrateHolds folder.


References


[1] Microsoft Learn, "Learn about retention for Copilot and AI apps." Updated February 2026.
[2] Lighthouse Global, "Steering the Microsoft Copilot Fleet." Lighthouse eDiscovery blog, April 2026.
[3] Redgrave LLP, "Redgrave Legal 365: Microsoft's Strategic Shift Has Major Implications for Information Governance and eDiscovery." March 2026.
[4] Reed Smith LLP, "AI for Legal Departments: Managing eDiscovery and Data Retention Risks with Microsoft Copilot." 2026.
[5] Microsoft Learn, "Summary of governance, lifecycle, and compliance capabilities for Copilot Pages and Copilot Notebooks." 2026.
[6] nBold, "Microsoft 365 Copilot Knowledge Pages: eDiscovery, Retention, and Lifecycle Guidance." March 2026.
[7] Microsoft, "Copilot Retention, Auditing and eDiscovery: A deep dive." Microsoft Community Hub, 2026.
