By Sid Newby | April 2026
In twenty-plus years of litigation technology work, I've watched the definition of "electronically stored information" expand from email archives to Slack channels, from network shares to cloud drives, from text messages to ephemeral communications. Each expansion caught the industry off guard, triggered a wave of spoliation motions, and eventually forced new governance frameworks. But what's happening right now -- the silent, exponential growth of AI-generated data across every corner of enterprise technology -- is different in kind, not just degree. Every chatbot conversation, every AI meeting transcript, every auto-generated summary is creating discoverable records that most organizations don't know exist, can't reliably preserve, and haven't thought to include in their litigation hold protocols. And a landmark federal ruling just confirmed what many of us feared: the privilege frameworks we've relied on for decades weren't built for this.
The new ESI frontier: AI-generated data is everywhere
Let's start with the scope of the problem. In 2026, generative AI isn't a niche tool used by a handful of early adopters -- it's infrastructure. Microsoft Copilot is embedded in Office 365 and Teams. Google Gemini is integrated into Workspace. Zoom AI Companion automatically generates meeting summaries. Slack AI creates channel digests and thread summaries. Salesforce Einstein drafts customer communications. And across every industry, employees are using ChatGPT, Claude, Perplexity, and dozens of other AI platforms to draft emails, analyze contracts, summarize depositions, research legal questions, and prepare for meetings.[1]
Every one of these interactions generates data. Specifically, it generates three categories of electronically stored information that fall squarely within discovery obligations:
- Prompts -- the questions, instructions, and context that users feed into AI tools
- Outputs -- the AI-generated responses, summaries, drafts, and analyses
- Activity logs -- metadata documenting who used which tool, when, for how long, and with what inputs[1]
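For collection and preservation planning, it helps to treat these as three distinct record types rather than one undifferentiated blob of "AI data." The sketch below is a minimal, hypothetical data model -- the field names are my own illustrative assumptions, not any platform's actual schema:

```python
# A minimal sketch (not any vendor's schema) of how the three categories of
# AI-generated ESI might be modeled for collection and review. All field
# names here are illustrative assumptions.
from dataclasses import dataclass, field
from datetime import datetime

@dataclass
class Prompt:
    """A user's input to an AI tool: the question, instruction, or pasted context."""
    custodian: str           # who entered the prompt
    platform: str            # e.g., an enterprise copilot or a consumer chatbot
    timestamp: datetime
    text: str                # may embed confidential source material

@dataclass
class Output:
    """The AI-generated response, summary, draft, or analysis."""
    prompt_ref: str          # link back to the originating prompt
    timestamp: datetime
    text: str

@dataclass
class ActivityLogEntry:
    """Metadata about the interaction itself, independent of its content."""
    custodian: str
    platform: str
    session_id: str
    started: datetime
    duration_seconds: int
    metadata: dict = field(default_factory=dict)  # device, workspace, etc.
```

The point of separating the categories is practical: prompts often carry the confidential material employees pasted in, outputs carry the AI's characterizations, and activity logs can establish who used what tool and when -- each raises different preservation and privilege questions.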
This isn't speculative. As K&L Gates noted in a February 2026 analysis, "relevant GenAI Data is discoverable" under the Federal Rules of Civil Procedure, and courts are applying traditional discovery rules under FRCP 26(b)(1) to treat AI-generated material like any other potentially relevant ESI.[1] The question isn't whether this data is discoverable -- it is. The question is whether your organization knows where it all lives, whether your litigation hold protocols capture it, and whether your review workflows can handle it.
For most organizations, the honest answer to all three questions is no.

Figure 1: The three categories of AI-generated ESI now subject to discovery obligations: prompts, outputs, and activity logs.
United States v. Heppner: the ruling that changed everything
On February 10, 2026, Judge Jed S. Rakoff of the U.S. District Court for the Southern District of New York addressed what he called "a question of first impression nationwide" -- whether written exchanges between a criminal defendant and a generative AI platform are protected by attorney-client privilege or the work product doctrine.[2]
The case, United States v. Heppner (No. 25-cr-00506-JSR), involved a fraud defendant who had used Anthropic's Claude to analyze his legal situation, research potential defenses, and prepare materials related to his case -- all without his attorney's direction. When prosecutors sought production of 31 documents memorializing these AI exchanges, the defendant asserted both attorney-client privilege and work product protection.[2]
Judge Rakoff rejected both claims. His reasoning was methodical and, for the litigation technology community, deeply consequential.
The privilege analysis
On attorney-client privilege, the court identified three fatal deficiencies in the defendant's claim:[2][3]
- No attorney-client relationship: Neither the defendant nor the AI platform was a licensed attorney. "That alone disposes of Heppner's claim of privilege," Judge Rakoff wrote.
- No confidentiality: Claude's user policy expressly reserves the right to disclose data to third parties, including governmental authorities. The platform's disclaimer that users have no expectation of confidentiality was, in the court's view, dispositive.
- No purpose of obtaining legal advice: The defendant communicated with Claude "on his own initiative without counsel direction," not for the purpose of obtaining legal advice within the traditional meaning of the privilege.
The work product analysis
The work product doctrine fared no better. Judge Rakoff held that the AI-generated materials were not "prepared by or at the behest of counsel" and did not reflect counsel's trial strategy at the time of their creation.[2] The defendant had acted independently, and the resulting documents bore no imprint of attorney mental processes or litigation strategy.
Why this matters beyond criminal law
The Heppner ruling is binding only in the Southern District of New York, but its influence will be felt nationally. The SDNY is the most closely watched federal trial court in the country, and Judge Rakoff -- a former federal prosecutor and Columbia Law professor with three decades on the bench -- is among the most cited and respected trial judges in America.[3]
More importantly, the ruling articulates a framework that will almost certainly be applied in civil litigation and regulatory enforcement. As Regulatory Oversight noted in its March 2026 analysis, state attorneys general conducting investigations will likely cite Heppner to challenge privilege claims over AI-generated materials, particularly in consumer protection and data privacy enforcement actions.[3]
The message to every litigation team is clear: if your employees, executives, or even your clients are using commercial AI platforms to discuss legal strategy, analyze legal exposure, or prepare litigation-related materials without attorney direction, those communications are almost certainly discoverable.
Warner v. Gilbarco: the split that complicates everything
If Heppner were the only ruling on AI privilege, the path forward would be relatively clear. But on the same day -- February 10, 2026 -- a federal court in Michigan reached a materially different conclusion in Warner v. Gilbarco, Inc.[4]
In Warner, a pro se plaintiff in an employment discrimination case had used ChatGPT to help draft litigation filings and analyze her legal claims. The defendant moved to compel production of the plaintiff's AI chat history, arguing that sharing information with a public AI platform constituted a waiver of work product protection.[4]
The court disagreed. The magistrate judge found that the plaintiff's ChatGPT interactions constituted protected work product, characterizing AI as "a tool, not a person" and holding that disclosure to a public AI platform was not disclosure to an adversary or to a conduit likely to reach an adversary.[4]
The conceptual divergence between these two rulings is significant:[5]
| Factor | Heppner (SDNY) | Warner (E.D. Mich.) |
|---|---|---|
| AI characterization | Potentially agent-like if counsel-directed | Neutral tool/instrument |
| Privilege outcome | No privilege (consumer AI, no counsel) | Work product protected |
| Confidentiality analysis | Platform terms destroy confidentiality | Tool use ≠ disclosure to adversary |
| Key distinguishing fact | Defendant acted without counsel direction | Pro se plaintiff preparing litigation |
| Doctrinal implication | Counsel involvement is critical gateway | Work product survives AI disclosure |
Table 1: Comparison of the two landmark AI privilege rulings from February 2026. Source: Sidley Austin Data Matters analysis.[5]
As Sidley Austin observed in its March 2026 analysis, courts are characterizing AI divergently -- as a neutral instrument in some contexts and as a potentially agent-like professional in others -- and this conceptual tension will shape privilege disputes for years to come.[5]
For litigation teams, the practical takeaway is uncomfortable: whether AI-generated materials are privileged may depend on which jurisdiction you're in, who directed the AI use, and what platform was used. There is no safe harbor, and the law is actively splitting.
The meeting-tool time bomb
While the privilege debate commands headlines, a quieter crisis is unfolding in conference rooms and video calls across every industry. AI-enabled meeting tools -- the transcription engines, notetakers, and summarization features embedded in Zoom, Microsoft Teams, Google Meet, Otter.ai, Fireflies.ai, and dozens of other platforms -- are silently generating the largest new category of discoverable ESI since the advent of corporate email.[6]
The mechanics are straightforward and alarming. Generative AI capabilities embedded directly in videoconferencing platforms now routinely:[6][7]
- Record meetings without explicit per-meeting consent from all participants
- Create transcripts with speaker attribution, timestamps, and keyword tagging
- Auto-generate summaries that extract action items, decisions, and key topics
- Store all of the above in cloud repositories controlled by third-party vendors
And critically, many of these features are enabled by default. When Seyfarth Shaw highlighted AI meeting tools in its 2026 Commercial Litigation Outlook, the firm identified this as one of the most significant emerging risks in the discovery landscape, noting the "sudden proliferation of new, unvetted records that can capture sensitive, strategic, or privileged conversations."[6]
Why this is different from traditional meeting recordings
Organizations have dealt with meeting recordings before. But AI-powered meeting tools create something fundamentally different from a simple audio or video file:
- Searchable text: AI transcripts convert spoken words into indexed, searchable text -- making them far more discoverable and reviewable than raw audio files.
- Speaker attribution: Modern transcription tools identify who said what, creating an attributed record of individual statements that can be used to establish knowledge, intent, or admission.
- Auto-generated summaries: AI summaries extract and characterize what the tool determines to be the most important content -- but these characterizations may be inaccurate, out of context, or incomplete. An AI summary that states "John agreed to delay the product recall" becomes a devastating exhibit even if John's actual words were more nuanced.
- Metadata proliferation: Beyond the transcript itself, these tools generate metadata about meeting participants, duration, topics discussed, follow-up actions assigned, and even sentiment analysis in some cases.
- Third-party storage: Much of this data is stored in vendor cloud infrastructure, raising questions about who controls it, how long it's retained, and whether it's subject to the vendor's own data practices.[8]
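To make the contrast with a raw recording concrete, here is a deliberately simplified, hypothetical representation of what a single AI meeting-tool record can contain. None of these field names come from any actual vendor's export format:

```python
# A hypothetical AI meeting-tool record, illustrating why these artifacts are
# far richer (and riskier) than an audio file. All field names are assumptions.
meeting_record = {
    "meeting_id": "mtg-0042",
    "participants": ["GC (Acme Corp)", "Outside Counsel", "VP Operations"],
    "duration_minutes": 47,
    "storage": "vendor-cloud",               # third-party infrastructure
    "transcript": [
        {"speaker": "VP Operations", "t": "00:12:31",
         "text": "If the data supports it, we could look at timing options."},
        # ... every utterance, attributed and timestamped ...
    ],
    "ai_summary": "John agreed to delay the product recall.",  # lossy characterization
    "action_items": ["John: revisit recall timing"],
    "sentiment": {"VP Operations": "hesitant"},  # some tools add this layer too
}
```

Every layer of that record -- transcript, summary, action items, sentiment tags -- is independently searchable, reviewable, and producible ESI, and the summary layer may characterize statements in ways the speakers never intended.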
The HaystackID eDiscovery team has flagged the retention implications specifically, noting that organizations need to understand not just what AI meeting tools are capturing, but where that data resides, who has access to it, and what retention policies (or lack thereof) govern its lifecycle.[8]
The privilege exposure
The privilege implications of AI meeting tools are particularly acute. Consider a routine scenario: a company's general counsel holds a video call with outside litigation counsel to discuss strategy in a pending case. The meeting takes place on Microsoft Teams. Teams Copilot is enabled by default. The AI generates a transcript and a summary of the call, storing both in the company's Microsoft 365 environment.
That transcript and summary are now ESI. If they're captured by a litigation hold, they'll be collected and reviewed. If opposing counsel requests meeting transcripts and summaries, the producing party will need to assert privilege -- but over AI-generated documents that may not have been created at counsel's direction, may be stored on third-party infrastructure, and may contain AI characterizations of privileged communications that are subtly different from what was actually said.
After Heppner, the privilege analysis for these documents is anything but certain.
The OpenAI precedent: millions of logs, compelled
The discoverability of AI-generated data isn't theoretical. In In re OpenAI, Inc., Copyright Infringement Litigation (S.D.N.Y., December 2, 2025), the court compelled OpenAI to produce millions of GenAI logs -- including user prompts and model responses -- as part of the discovery process in consolidated copyright infringement claims brought by major content creators.[1][9]
The ruling established several important principles:
- AI conversation logs are ESI: The court treated ChatGPT interaction histories as discoverable electronically stored information, applying standard FRCP 26(b)(1) analysis.
- Volume alone doesn't defeat discoverability: Despite the enormous scale of the data, the court ordered production with anonymization protections, rejecting OpenAI's argument that the burden was disproportionate to the needs of the case.
- Proportionality still matters: In a separate ruling in the same consolidated litigation, the magistrate judge denied a New York Times discovery request involving approximately 80,000 entries from OpenAI's internal AI tools, finding that review of those entries would require more than 1,300 hours -- roughly a minute of review per entry -- a disproportionate burden given the data's limited connection to the issues in dispute.[1]
The combined message from these rulings is nuanced but clear: AI-generated data is fully discoverable, proportionality provides meaningful but not absolute limitations, and organizations that generate or host AI interaction data need to be prepared for discovery demands that would have been unimaginable five years ago.

Figure 2: Timeline of landmark court rulings establishing AI-generated content as discoverable ESI, from the OpenAI copyright litigation through the Heppner privilege decision.
The governance gap: what most organizations are missing
The gap between the legal reality and organizational preparedness is staggering. While courts are rapidly establishing that AI-generated content is discoverable ESI subject to preservation obligations, most organizations have done little to adapt their information governance frameworks.
The shadow AI problem
The most immediate challenge is shadow AI -- the use of consumer AI tools by employees outside of any approved or monitored channel. An employee who pastes confidential deposition testimony into ChatGPT for summarization has just created discoverable ESI on a third-party platform that the organization may not even know exists, cannot reliably preserve, and almost certainly hasn't included in its litigation hold notices.[10]
Fisher Phillips identified this as a critical governance gap in its 2026 analysis, noting that even when organizations prohibit unauthorized AI use, employees routinely use personal devices or browser-based tools to access AI platforms for work-related tasks.[10] The result is discoverable data scattered across platforms the organization doesn't control, generated by employees the organization didn't authorize, and stored in jurisdictions the organization may not be aware of.
Preservation obligations in the AI era
The preservation obligations for AI-generated content are substantial and largely unfamiliar to litigation teams accustomed to preserving email, documents, and structured data. As K&L Gates outlined, organizations facing litigation must now:[1]
- Identify AI custodians: Determine which employees and departments use generative AI tools and where the resulting data is stored
- Disable auto-delete: Many AI platforms have default data retention settings that automatically purge conversation histories -- these must be disabled for relevant custodians once a litigation hold is triggered
- Export and preserve: Chat histories, prompts, outputs, and logs must be exported from AI platforms and preserved in a defensible manner
- Capture personal tools: Custodians must disclose use of personal or browser-based AI tools so those sources can be evaluated and potentially preserved
- Avoid selective editing: Custodians should not edit or selectively copy AI data in ways that alter context -- a particularly important obligation given that AI conversation threads can be long, multi-topic, and contextually dependent
- Update litigation hold notices: Standard hold notices that reference "email, documents, and electronic files" almost certainly don't capture AI-generated content with sufficient specificity
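What does this look like operationally? The sketch below shows one hypothetical shape for the AI-preservation step in a hold workflow. The `AIPlatformClient` interface and its methods are invented placeholders -- real platforms differ widely in whether and how they expose export and retention controls, and some offer no programmatic access at all:

```python
# Hypothetical sketch of AI-preservation steps triggered by a litigation hold.
# The client object and its methods are invented placeholders; actual
# preservation mechanics vary by vendor and often require manual steps.
import hashlib
import json
from datetime import datetime, timezone

def preserve_ai_data(client, custodian: str, matter_id: str) -> dict:
    """Suspend auto-deletion and export chat histories for one custodian."""
    # Step 1: disable auto-delete so conversation histories stop purging.
    client.set_retention(custodian, auto_delete=False)

    # Step 2: export complete threads -- never selectively edited excerpts,
    # since AI conversations are long, multi-topic, and context-dependent.
    conversations = client.export_conversations(custodian)

    # Step 3: preserve defensibly, with a hash and chain-of-custody metadata.
    payload = json.dumps(conversations, sort_keys=True).encode()
    return {
        "matter_id": matter_id,
        "custodian": custodian,
        "collected_at": datetime.now(timezone.utc).isoformat(),
        "conversation_count": len(conversations),
        "sha256": hashlib.sha256(payload).hexdigest(),
        "data": conversations,
    }
```

The ordering matters: retention settings get locked down first, because a custodian's conversation history can be purging on a rolling basis while the export is still being arranged.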
The meet-and-confer imperative
Perhaps most importantly, AI-generated content needs to be addressed early in the discovery process. K&L Gates recommends that parties address GenAI data relevance in their initial Rule 26(f) meet-and-confer conferences and incorporate AI-specific protocols into ESI discovery agreements.[1] Waiting until a discovery dispute to figure out the scope and discoverability of AI data is a recipe for sanctions motions and adverse inference instructions.
Practical framework: what litigation teams should do now
Based on the emerging case law and guidance from leading practitioners, here is a practical framework for litigation teams navigating the AI-generated ESI landscape in 2026.
1. Audit your AI footprint
Before you can preserve AI-generated data, you need to know it exists. Conduct a comprehensive inventory of:[5][10]
- Enterprise AI tools: Which AI platforms has the organization licensed? What data do they generate and retain?
- Embedded AI features: Which productivity tools (Office 365, Google Workspace, Zoom, Slack) have AI features enabled? What data do those features generate?
- Consumer AI usage: What shadow AI usage exists? What policies govern (or fail to govern) employee use of consumer AI platforms?
- Vendor data practices: For each AI platform, what does the vendor's privacy policy say about data retention, confidentiality, and disclosure to third parties?
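An inventory doesn't need to be elaborate to be useful. Something as simple as the following structure -- purely illustrative, with made-up entries and retention values -- gives legal and IT a shared map of where AI-generated ESI lives:

```python
# Illustrative AI-footprint inventory. Entries and retention values are
# hypothetical examples, not statements about any vendor's actual practices.
ai_inventory = [
    {
        "tool": "Enterprise chat assistant",
        "category": "enterprise",            # licensed and administered by IT
        "data_generated": ["prompts", "outputs", "activity logs"],
        "retention": "admin-configurable",
        "vendor_can_disclose": False,        # check the contract, not the marketing
    },
    {
        "tool": "Meeting AI companion",
        "category": "embedded",              # feature inside a productivity suite
        "data_generated": ["transcripts", "summaries", "action items"],
        "retention": "enabled by default; verify",
        "vendor_can_disclose": None,         # unknown until terms are reviewed
    },
    {
        "tool": "Consumer chatbot (personal accounts)",
        "category": "shadow",                # outside any approved channel
        "data_generated": ["prompts", "outputs"],
        "retention": "unknown",
        "vendor_can_disclose": True,         # consumer terms often reserve this
    },
]

# Flag the sources legal cannot currently preserve.
unpreservable = [t["tool"] for t in ai_inventory if t["category"] == "shadow"]
```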
2. Update governance policies
Organizations need AI-specific information governance policies that address:[1][10]
- Acceptable use: Which AI tools are approved for which purposes? What types of information may and may not be shared with AI platforms?
- Privilege protection: Any use of AI for legal analysis or litigation preparation must be conducted under attorney direction, using enterprise platforms with contractual confidentiality protections -- not consumer tools with third-party disclosure rights
- Retention and deletion: What are the default retention settings for AI-generated data? When and how is that data deleted? How are those settings modified when a litigation hold is triggered?
- Documentation: Employees should document their use of AI tools for work-related tasks, particularly any use involving sensitive, confidential, or litigation-related information
3. Restructure litigation holds
Standard litigation hold notices are inadequate for AI-generated content. Updated hold notices should specifically reference:[1]
- Generative AI chat histories and conversation logs
- AI-generated meeting transcripts, summaries, and action items
- AI-assisted document drafts and analyses
- Prompts and inputs provided to any AI tool related to the litigation subject matter
- Metadata and activity logs from AI platforms
4. Protect privilege deliberately
After Heppner and Warner, privilege protection for AI-generated content requires deliberate structuring:[5]
- Counsel must direct: Any AI use for legal analysis must be undertaken at the behest of, and under the direction of, counsel
- Enterprise platforms only: Use enterprise AI tools with contractual confidentiality protections, not consumer platforms whose terms of service disclaim confidentiality
- Document the chain: Maintain records showing that AI use was counsel-directed, conducted on confidential platforms, and undertaken for the purpose of obtaining legal advice or preparing for litigation
- Review AI outputs: Don't assume AI-generated legal analysis is privileged simply because it relates to legal matters -- the Heppner framework requires attorney involvement, not just legal subject matter
5. Address AI data in ESI protocols
Update standard ESI protocols and discovery agreements to address:[1]
- The scope of AI-generated data subject to preservation and production
- Search methodologies for AI conversation histories (which may not be structured like traditional documents)
- Production format for AI-generated content (native format, text export, or both)
- Anonymization and protective order requirements for AI data containing sensitive information
- Cost allocation for the collection and review of AI-generated ESI
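On the search-methodology point in particular: AI conversation exports don't arrive as discrete documents, so protocols should anticipate some transformation before review. Below is a minimal sketch of one approach -- splitting an exported thread into per-exchange review units while keeping thread context in metadata. The export structure shown is an assumption for illustration, not any platform's actual format:

```python
# Minimal sketch: converting an exported AI conversation thread into
# per-exchange review units. The input structure is an assumed example of
# what an export might look like, not any platform's actual format.
def thread_to_review_units(thread: dict) -> list[dict]:
    """Pair each prompt with its response, preserving thread-level context."""
    units = []
    messages = thread["messages"]            # alternating user/assistant turns
    for i in range(0, len(messages) - 1, 2):
        units.append({
            "thread_id": thread["thread_id"],   # keeps exchanges linked
            "exchange_no": i // 2 + 1,
            "prompt": messages[i]["text"],
            "response": messages[i + 1]["text"],
            "custodian": thread["custodian"],
        })
    return units

# Hypothetical export shape:
thread = {
    "thread_id": "th-881",
    "custodian": "jdoe",
    "messages": [
        {"role": "user", "text": "Summarize this deposition excerpt: ..."},
        {"role": "assistant", "text": "The witness testified that ..."},
    ],
}
print(thread_to_review_units(thread))
```

Whatever unitization approach the parties adopt, agreeing on it in the ESI protocol -- rather than discovering mid-production that the sides assumed different document boundaries -- avoids a predictable category of dispute.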
What's coming next: the regulatory and judicial horizon
The Heppner and Warner rulings are first-generation decisions in what will be a long and complex evolution of privilege and discovery law in the AI era. Several developments on the horizon will shape the landscape further.
The Hyperlink Rule
Courts are increasingly requiring that legal filings citing judicial opinions, statutes, or regulations include hyperlinks to reputable legal research databases.[11] While this development is primarily a response to the AI hallucination crisis -- lawyers filing briefs with AI-fabricated citations -- it has implications for AI-generated legal analysis more broadly, signaling judicial willingness to impose new procedural requirements in response to AI-driven challenges.
State AI governance laws
California's Transparency in Frontier Artificial Intelligence Act and Texas's Responsible Artificial Intelligence Governance Act took effect on January 1, 2026, though an executive order signed in December 2025 casts doubt on the enforceability of state AI laws.[11] New York enacted an amendment on March 9, 2026, requiring generative AI platforms to display notices that outputs may be inaccurate.[11] These regulatory developments will shape both the AI tools available to legal teams and the disclosure obligations surrounding their use.
The enterprise vs. consumer distinction
The most important doctrinal question going forward is whether courts will consistently distinguish between enterprise AI platforms used under attorney direction with contractual confidentiality protections and consumer AI tools used independently without such protections. The Heppner court's suggestion that counsel-directed AI use might function "akin to a highly trained professional" acting as an agent leaves the door open for privilege protection in enterprise contexts -- but that door hasn't been walked through yet.[5]
Amendment of the Federal Rules
The Judicial Conference's Advisory Committee on Civil Rules is actively studying the intersection of AI and discovery, and amendments to Rule 26 (governing the scope of discovery) and Rule 37(e) (governing sanctions for failure to preserve ESI) that explicitly address AI-generated content are widely anticipated within the next two to three years.[1]
The access to justice dimension
There's one more angle that matters -- and it's one I care about deeply. The AI-generated ESI crisis has profoundly unequal impacts.
Large law firms and corporate legal departments with sophisticated information governance programs, enterprise AI deployments, and dedicated eDiscovery teams will adapt. They'll update their hold notices, audit their AI footprint, restructure their privilege protocols, and address AI data in their ESI agreements. It will be expensive and time-consuming, but they have the resources.
Small firms, solo practitioners, and pro se litigants -- the people who need AI tools the most to level the playing field -- are the ones most exposed. The Warner plaintiff used ChatGPT to help her prosecute an employment discrimination case because she couldn't afford a lawyer. The Heppner defendant used Claude to understand his legal situation because, at the time, he was navigating the criminal justice system. These are exactly the use cases where AI has the most potential to democratize access to justice -- and they're the use cases where privilege and confidentiality protections are most uncertain.[4][2]
As an industry, we need to get this right. That means developing clear, workable privilege frameworks that protect AI-assisted legal work regardless of whether the user can afford an enterprise platform with contractual confidentiality provisions. It means updating the Federal Rules to address AI-generated ESI explicitly, rather than leaving it to ad hoc judicial rulings that create conflicting precedent across circuits. And it means building litigation technology tools that can handle the collection, preservation, review, and production of AI-generated data -- because the volume is only going to grow.
The AI data explosion isn't a future problem. It's a today problem. And the organizations, law firms, and litigation support providers that build the governance frameworks, technical infrastructure, and legal expertise to handle it now will be the ones that thrive in the decade ahead.
The rest will be writing response briefs to sanctions motions.