Does differential privacy solve copyright?

This started as a literature-review presentation I gave on the intersection of copyright law, differential privacy (DP), and generative AI. The short version of the answer to the title question is no: DP doesn’t solve copyright, and most of the technical proposals that claim to are solving a different (related but narrower) problem. The longer version is below, with the slides if you want them.

Download the slides (PDF)

What copyright actually is

US copyright is a utilitarian regime. The constitutional intent — “To promote the Progress of Science and useful Arts” (Art. I, § 8, cl. 8) — is to incentivize the production of new creative work by giving authors a time-limited monopoly on copying, distributing, adapting, displaying, and performing the work. Two consequences worth pinning down before any technical discussion:

Copyright protects expression, not ideas. You can’t copyright the concept of a young wizard at boarding school; you can copyright the specific text of Harry Potter. This is the idea–expression distinction, and it’s the load-bearing wall in everything that follows.
Infringement requires both access and substantial similarity. Independent creation isn’t infringement, even of an identical work. Two painters can paint similar landscapes if neither saw the other’s.

On top of that sits fair use, which carves out exceptions based on four factors: purpose and character of use (transformative vs. substitutive), nature of the original, amount used, and effect on the potential market. Fair use is what lets criticism, parody, scholarship, and (sometimes) machine learning happen at all.

Why generative AI breaks the doctrine

Two centuries of doctrine were built around human-scale copying — printing presses, photocopiers, sample-based music. Generative AI scrambles every part of the framework at once:

Scale. Training corpora ingest billions of copyrighted works; deployment serves millions of outputs per day. Doctrines designed around bounded copying don’t have an obvious story for unbounded ingestion-and-emission.
Inputs and outputs flip the protection logic. A user inputs ideas and questions (not protected). The model outputs expression — but is that expression protected? And whose? The training authors, the prompt author, the model provider, none of the above?
Substantial similarity becomes hard to prove. A model can produce work that’s stylistically derivative without any single output being a verbatim copy of a single training example.
Models don’t need incentives. The constitutional rationale for copyright is to motivate authors. A diffusion model has no need to be motivated. That puts AI generation in a place the doctrine didn’t anticipate.

Lee, Cooper, and Grimmelmann (2024) frame this as a generative AI supply chain: copyright issues can attach at every stage, from data creation through dataset curation, pre-training, fine-tuning, alignment, and deployment to generation. Most of the existing technical proposals act as if the problem lives at only one of those stages.

The technical literature, as a tour

Most of the existing technical work falls into one of two camps: memorization (we can extract training data) and provable copyright protection (we can constrain models so they can’t reproduce specific works). The papers worth knowing:

Memorization. Carlini and collaborators account for most of a ~5-year arc here:

Extracting Training Data from Large Language Models (Carlini et al., 2021) — GPT-2 memorizes and can be prompted to regurgitate training examples verbatim.
Quantifying Memorization Across Neural Language Models (Carlini et al., 2022) — memorization scales with model size and training-data duplication. Bigger models with more-repeated data memorize more.
Extracting Training Data from Diffusion Models (Carlini et al., 2023) — image diffusion models do this too, with verbatim image reconstruction.
Scalable Extraction of Training Data from (Production) Language Models (Nasr et al., 2023) — the same attack works on aligned, deployed models; alignment doesn’t fix it.
How much do language models memorize? (Morris et al., 2025) — proposes new memorization metrics and lands on roughly 3.6 bits per parameter as model capacity for memorization.
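
To get a feel for what a per-parameter capacity figure implies, here is a back-of-the-envelope sketch. It assumes the roughly 3.6 bits per parameter quoted above applies uniformly, which is a simplification; the function name and the example model sizes are illustrative, not taken from the paper.

```python
# Approximate figure quoted above from Morris et al. (2025); treated here as a constant.
BITS_PER_PARAM = 3.6


def memorization_capacity_bytes(num_params: int,
                                bits_per_param: float = BITS_PER_PARAM) -> float:
    """Rough estimate of how much raw information a model could memorize, in bytes."""
    return num_params * bits_per_param / 8


# Hypothetical model sizes, chosen only to show the order of magnitude.
for n in (125_000_000, 1_500_000_000, 7_000_000_000):
    megabytes = memorization_capacity_bytes(n) / 1e6
    print(f"{n:>13,} parameters -> roughly {megabytes:,.0f} MB")
```

Under this assumption a one-billion-parameter model tops out at about 450 MB of memorized content, far less than a typical training corpus, so whatever is memorized is necessarily a small and uneven slice of the data.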

Provable copyright protection. Vyas, Kakade, & Barak (2023) introduced Near Access-Freeness (NAF): a model $p$ is $k_x$-NAF for a copyrighted set $C$ if, for every prompt $x$, the divergence between $p(\cdot \mid x)$ and a “safe” model $\text{safe}_C(\cdot \mid x)$ that never accessed $C$ is at most $k_x$ bits, where the bound $k_x$ may depend on the prompt:

$\Delta\!\left(p(\cdot \mid x) \,\|\, \text{safe}_C(\cdot \mid x)\right) \le k_x$
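
A toy sketch of checking this condition, assuming $\Delta$ is the max-divergence (one of the choices discussed by Vyas et al.) and that each model can be written as an explicit distribution over a handful of outputs per prompt. The function names, the prompt, and the probabilities are all made up for illustration.

```python
import math


def max_divergence_bits(p, q):
    """max over outputs y with p(y) > 0 of log2(p(y) / q(y)).

    Returns infinity if p puts mass on an output that q can never produce.
    """
    worst = -math.inf
    for y, p_y in p.items():
        if p_y == 0.0:
            continue
        q_y = q.get(y, 0.0)
        if q_y == 0.0:
            return math.inf
        worst = max(worst, math.log2(p_y / q_y))
    return worst


def is_naf(model, safe, k_bits):
    """True if the divergence bound holds for every prompt in `model`."""
    return all(max_divergence_bits(model[x], safe[x]) <= k_bits for x in model)


# Hypothetical per-prompt output distributions.
prompt = "describe a wizard school"
model = {prompt: {"novel_text": 0.92, "near_copy": 0.08}}
safe = {prompt: {"novel_text": 0.99, "near_copy": 0.01}}

print(max_divergence_bits(model[prompt], safe[prompt]))
print(is_naf(model, safe, k_bits=4.0), is_naf(model, safe, k_bits=2.0))
```

Read this way, a bound of $k$ bits says that no output, infringing or otherwise, is more than $2^k$ times as likely under the deployed model as under the safe one.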

Compared to standard differential privacy, NAF is relaxed: it separates access from substantial similarity, gives a one-sided upper bound, and only depends on outputs (so it works as a black-box criterion). It also leaves two unresolved questions: a quantitative one (what value of $k$ is small enough?) and a qualitative one (where do we get the safe function?).
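
For reference, the standard definition that comparison is made against, written in notation that is my own rather than the slides’ ($M$ for the randomized training procedure, $D$ and $D'$ for datasets differing in one record, $S$ for any set of possible trained models): $M$ is $\varepsilon$-differentially private if

$\Pr[M(D) \in S] \le e^{\varepsilon} \, \Pr[M(D') \in S] \quad \text{for all neighboring } D, D' \text{ and all } S$

Because the neighbor relation is symmetric, the bound holds in both directions, and it constrains the trained model itself. NAF, by contrast, only bounds $p$ from above relative to the safe model, and only at the level of what is generated for each prompt.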

Two concrete recipes for the safe function:

Leave-one-out. Train one model with the copyrighted item, one without; bound the divergence. Honest but expensive — you need $|C|$ “without” models, one per protected work.
Copy-Protection-Δ. Split the dataset into two non-overlapping shards $D_1, D_2$, train models $q_1$ and $q_2$ on them separately, and combine them at inference via $p(y \mid x) \propto \min(q_1(y \mid x), q_2(y \mid x))$ or $p(y \mid x) \propto \sqrt{q_1(y \mid x)\, q_2(y \mid x)}$. If a copyrighted work appears in only one shard, the model trained on the other shard serves as the safe model for that work, and the combined model’s divergence from it is provably bounded.
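
The combination step can be sketched as follows, assuming that for a fixed prompt each shard model exposes an explicit distribution over a small set of outputs (a real language model would apply this token by token). The name `cp_delta` and the toy probabilities are hypothetical.

```python
import math


def cp_delta(q1, q2, mode="min"):
    """Combine two output distributions for the same prompt.

    Returns the normalized combined distribution and the normalizer Z.
    """
    support = set(q1) | set(q2)
    if mode == "min":
        unnorm = {y: min(q1.get(y, 0.0), q2.get(y, 0.0)) for y in support}
    elif mode == "geometric":
        unnorm = {y: math.sqrt(q1.get(y, 0.0) * q2.get(y, 0.0)) for y in support}
    else:
        raise ValueError(f"unknown mode: {mode}")
    z = sum(unnorm.values())
    if z == 0.0:
        raise ValueError("the two models share no output with positive probability")
    return {y: v / z for y, v in unnorm.items()}, z


# Toy example: q1 was trained on the shard containing the protected work and
# has memorized it; q2 never saw it.
q1 = {"verbatim_copy": 0.60, "paraphrase": 0.30, "unrelated": 0.10}
q2 = {"verbatim_copy": 0.01, "paraphrase": 0.50, "unrelated": 0.49}

p, z = cp_delta(q1, q2, mode="min")
print(p["verbatim_copy"], math.log2(1 / z))
```

For the min rule, every output satisfies $p(y \mid x) \le q_i(y \mid x) / Z$ for both $i$, so the max-divergence from either shard model is at most $\log_2(1/Z)$ bits, and $Z$ equals one minus the total variation distance between $q_1$ and $q_2$. In the toy numbers $Z = 0.41$, which gives a bound of about 1.3 bits: the more the two shard models disagree, the weaker the guarantee.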

Both ideas treat copyright as a probabilistic-output problem. That works as far as it goes, but…

Why privacy and copyright don’t actually align

The most important paper in this space, in my view, is Elkin-Koren, Hacohen, Livni, & Moran (2023), Can Copyright Be Reduced to Privacy? The answer is no, and the reasons matter:

Privacy is over-inclusive. Copyright doesn’t protect everything in a copyrighted work. De minimis uses, fair use, facts, ideas, methods, and unprotected stylistic aspects all sit inside copyrighted works but outside copyright’s reach. A system that protects a copyrighted work’s content — the way DP would — withholds too much from subsequent authors. The doctrine deliberately leaves “breathing room” for downstream creativity, and privacy doesn’t.

Privacy is under-inclusive. Copyrighted expression doesn’t have to come from the input data to count as infringement. A model trained on a dataset that excludes some painting can still produce a substantially similar image to that painting — perhaps because the same expressive content is reachable through other inputs, or because users prompt their way there. DP protections at the dataset level don’t catch any of this.

Privacy protects content. Copyright protects original expression. These look similar in a probability-bound sense and are actually quite different in scope. Privacy treats every record uniformly; copyright treats every element of every work differently: the same painting can have one element that’s protected expression, another that’s an unprotected idea, and a third that’s de minimis and not worth a lawsuit.

NAF improves on raw DP by constraining the generated content rather than the model itself, by accepting soft constraints, and by allowing the safe function to depend on the input. But Elkin-Koren et al.’s core observation is more general: there’s no free lunch. You can pick at most two of: (1) allow learning from copyrighted works, (2) protect the entire set of copyrighted works, (3) maintain model quality. NAF doesn’t escape this trilemma; it just lets you trade more gracefully.

Cohen’s clean-room defense

A more recent paper, Cohen (2025), Blameless Users in a Clean Room: Defining Copyright Protection for Generative Models, sharpens both the criticism of NAF and the constructive proposal:

NAF is not safe. Cohen shows NAF models can be coerced into verbatim reproduction by composing multiple prompts and by using data-dependent prompts. Worse, an NAF model can enable verbatim copying by users who don’t know the underlying data (a “tainted model” failure). So NAF is not a sufficient copyright defense.

Clean-room training as a substitute. Cohen proposes building a copyright dependency graph (which works are derivative of which) and a scrub function that produces a sub-dataset containing no work stemming from a particular copyrighted item. Training the actual model on a “golden” deduplicated subset $D_1$ and a scrubbed subset $D_2$ — had we scrubbed the data, the user would not have copied — bounds the user’s probability of producing substantially-similar output. This solves the under-inclusiveness problem: it catches the case where a user produces infringing output without ever directly accessing the protected item.

This works only if the dataset is “golden” — i.e. each work is deduplicated against its copyright dependency tree. That is non-trivial in practice, but it’s at least a clearer specification of what “clean-room copy protection” should mean.
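
To make the dependency-graph and scrub ideas concrete, here is a minimal sketch. It assumes the graph is already known and is stored as a mapping from each work to the works derived directly from it; in practice deciding what counts as derivative is a legal judgment, and Cohen’s formal definitions may differ in detail. All names and the tiny dataset are hypothetical.

```python
from collections import deque


def stems_from(derivatives, root):
    """Return `root` plus every work derived from it, directly or transitively."""
    seen = {root}
    queue = deque([root])
    while queue:
        work = queue.popleft()
        for child in derivatives.get(work, ()):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def scrub(dataset, derivatives, protected):
    """Keep only the works that do not stem from `protected`."""
    tainted = stems_from(derivatives, protected)
    return [work for work in dataset if work not in tainted]


# Hypothetical dependency graph: work -> works derived directly from it.
derivatives = {
    "novel": {"fan_fiction", "film_script"},
    "film_script": {"review_with_long_quotes"},
}
dataset = ["novel", "fan_fiction", "film_script",
           "review_with_long_quotes", "unrelated_essay"]

print(scrub(dataset, derivatives, "novel"))
print(scrub(dataset, derivatives, "film_script"))
```

The traversal is the easy part. The hard part is the graph itself: a missing edge means a derivative work survives the scrub and the guarantee silently fails.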

So is memorization the right thing to measure?

A common move in litigation is to take Carlini et al.’s memorization papers and use them as evidence of infringement: the model memorized our work, ergo it’s infringing. Carlini himself published a 2025 piece, What my privacy papers (don’t) have to say about copyright and generative AI, pushing back on this. His memorization work shows models sometimes output verbatim training data — that’s a privacy result, not a copyright result.

Cooper and Grimmelmann (2024), The Files are in the Computer, make the legal argument: memorization is neither necessary nor sufficient for copyright infringement. Not necessary, because non-memorized models can still produce substantially similar output. Not sufficient, because memorization is a model-internal property, while infringement is a property of acts — copying, distributing, displaying — performed by legal persons. The model “remembering” a poem is, from copyright’s perspective, only relevant insofar as that act of remembering supports a downstream act of infringement.

This is the gap the technical literature mostly elides. We measure memorization because it’s measurable; we treat memorization-reduction as the goal because the metric exists. That’s a streetlight effect.
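
Part of the reason is how little code a memorization-style metric needs. The sketch below is a toy of my own, not the metric from any of the papers above: it finds the longest run of whitespace-separated tokens that a model output shares verbatim with a training document, using Python’s standard `difflib`.

```python
from difflib import SequenceMatcher


def longest_verbatim_run(output: str, training_doc: str) -> list[str]:
    """Longest contiguous token sequence appearing in both texts."""
    a = output.split()
    b = training_doc.split()
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    match = matcher.find_longest_match(0, len(a), 0, len(b))
    return a[match.a:match.a + match.size]


# Made-up strings for illustration.
training_doc = "it was the best of times it was the worst of times"
output = "she said it was the best of times and left"

run = longest_verbatim_run(output, training_doc)
print(len(run), " ".join(run))
```

The result is a token count. It says nothing about whether the shared span is protected expression, an unprotectable fact or stock phrase, or a de minimis fragment, which is exactly the information an infringement analysis turns on.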

A flowchart that ends with a question mark

Here’s how the implication tree actually shakes out:

Does memorization happen? Yes (Carlini et al., Nasr et al., Morris et al.).
Is that a copyright concern? Probably — sometimes, in some cases.
Can we fix it with differential privacy? Maybe, partially, with utility cost.
Does privacy align with copyright? No. (Elkin-Koren et al., Cohen.)
Okay, how do we actually solve copyright? ?

The terminal question mark is the honest answer. Privacy is a tool we have; copyright is the problem we want to solve; and they are not the same thing. NAF, clean-room training, and the like are real progress on a real subproblem (verbatim reproduction by an honest user), but they don’t speak to the larger questions: training-data sourcing, market dilution, attribution, de minimis uses, fair-use codification.

What “actually solving” it would require

The slide deck closed with a three-way split that I think is roughly right:

Technical work. New metrics oriented specifically toward copyright (not memorization, not privacy). Mitigation, remedy, and compensation primitives — machine unlearning that’s robust to checkpoint-divergence attacks, RAG-time provenance enforcement, well-defined takedown mechanisms.

Judicial work. Fair-use doctrine has to figure out transformativeness and market dilution for AI training. What level of training-data ingestion is permissible, and how do we attribute downstream harms when models with shared lineage produce similar outputs?

Legislative work. Copyright reform that addresses licensing, text-and-data-mining (TDM) exceptions, fair-use codification, AI-specific privacy laws, and the consent/compensation/control questions about training data. None of this gets resolved by better algorithms.

The technical and legal communities have to be in the same room for this. Most of the proposals that frame copyright as a math problem stop at “we have a probability bound” and leave the legal apparatus to catch up. Most of the legal proposals that frame copyright as a doctrinal problem stop at “we need more case law” and leave the technical apparatus undefined. Neither half is wrong; together they’re insufficient.

Two takeaways I want to repeat

If only two things land from this:

Memorization ≠ copyright infringement. Memorization is a model-internal property. Infringement is a property of acts, evaluated against expression — not data. Reducing memorization is a privacy intervention, and at most a partial copyright defense.
Privacy ≠ copyright protection. They’re related — both are about constraining what a model can reveal — but the scopes are different in ways that matter doctrinally. Privacy frameworks don’t admit de minimis, don’t admit fair use, don’t draw the idea–expression line. Pretending they do is what makes “DP solves copyright” a tempting and wrong conclusion.

The slides have the references, citations, and the diagrams I cut for length. If you’d like to argue with any of this — or have pointers I missed — I’d genuinely like to hear it.

References

Carlini et al. (2021). Extracting Training Data from Large Language Models. arXiv:2012.07805
Carlini et al. (2022). Quantifying Memorization Across Neural Language Models. arXiv:2202.07646
Carlini et al. (2023). Extracting Training Data from Diffusion Models. arXiv:2301.13188
Vyas, Kakade, & Barak (2023). On Provable Copyright Protection for Generative Models. arXiv:2302.10870
Henderson et al. (2023). Foundation Models and Fair Use. arXiv:2303.15715
Elkin-Koren, Hacohen, Livni, & Moran (2023). Can Copyright Be Reduced to Privacy? arXiv:2305.14822
Nasr et al. (2023). Scalable Extraction of Training Data from (Production) Language Models. arXiv:2311.17035
Lee, Cooper, & Grimmelmann (2024). Talkin’ ‘Bout AI Generation: Copyright and the Generative-AI Supply Chain. SSRN 4523551
Cooper & Grimmelmann (2024). The Files are in the Computer: On Copyright, Memorization, and Generative AI. SSRN 4803118
Morris et al. (2025). How much do language models memorize? arXiv:2505.24832
Cohen (2025). Blameless Users in a Clean Room: Defining Copyright Protection for Generative Models. arXiv:2506.19881