Machine unlearning: Google Research validates an audit test, but not yet on LLMs

At AISTATS 2026, Google Research presented a statistical test for auditing machine unlearning, that is, the targeted removal of data from an already trained model. The framework, Regularized f-Divergence Kernel Tests, authored by Mónica Ribero, Antonin Schrab and Arthur Gretton, promises to cut the experimental cost of certain audits sharply: on the SVT3 mechanism for differential privacy, it detects violations with a few thousand samples, whereas DP-Auditorium could require millions. Its scope remains limited, however: the published validations cover synthetic benchmarks and high-energy physics datasets, not large language models, even though those are at the center of regulatory tensions around deletion, traceability and data governance.

What the test fixes and what it leaves open

The tool addresses a known flaw in the standard two-sample test based on the maximum mean discrepancy (MMD). Two models retrained from scratch on the same data but with different batch sizes produce distinct distributions, and the two-sample test reads that difference as an unlearning failure: a false alarm. The new test avoids this pitfall by combining a three-sample relative test with automatic selection of the f-divergence (a measure of distance between distributions) best suited to each type of drift.
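To make the contrast concrete, here is a minimal Python sketch of the idea, not the authors' implementation. It uses a plain Gaussian-kernel MMD rather than the regularized f-divergence statistics of the paper, and it assumes that the three samples are the outputs of the unlearned candidate, of a model retrained without the data, and of the original model. The function names (`mmd2_unbiased`, `relative_gap`) and the one-dimensional "model outputs" are hypothetical, and no significance threshold is computed.

```python
import numpy as np

def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian kernel matrix between the rows of a and the rows of b."""
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    return np.exp(-sq_dists / (2.0 * bandwidth**2))

def mmd2_unbiased(x, y, bandwidth=1.0):
    """Unbiased estimate of the squared MMD between samples x and y."""
    m, n = len(x), len(y)
    k_xx = gaussian_kernel(x, x, bandwidth)
    k_yy = gaussian_kernel(y, y, bandwidth)
    k_xy = gaussian_kernel(x, y, bandwidth)
    # Drop the diagonal so the within-sample averages are unbiased.
    mean_xx = (k_xx.sum() - np.trace(k_xx)) / (m * (m - 1))
    mean_yy = (k_yy.sum() - np.trace(k_yy)) / (n * (n - 1))
    return mean_xx + mean_yy - 2.0 * k_xy.mean()

def relative_gap(candidate, retrained, original, bandwidth=1.0):
    """Negative when the candidate sits closer to the retrained reference
    than to the original model (assumed roles of the three samples)."""
    return (mmd2_unbiased(candidate, retrained, bandwidth)
            - mmd2_unbiased(candidate, original, bandwidth))

rng = np.random.default_rng(0)
# Hypothetical 1-D stand-ins for model outputs.
original = rng.normal(loc=1.0, size=(500, 1))   # trained on all the data
retrained = rng.normal(loc=0.0, size=(500, 1))  # retrained without the data
candidate = rng.normal(loc=0.1, size=(500, 1))  # unlearned, with a small benign drift

two_sample = mmd2_unbiased(candidate, retrained)  # nonzero because of the benign drift
gap = relative_gap(candidate, retrained, original)
print("two-sample MMD^2:", two_sample)
print("relative gap:", gap)
```

A two-sample test asks only whether the first number differs from zero, so any benign drift can eventually trigger an alarm once enough samples are collected. The relative comparison asks a different question: whether the candidate is closer to the retrained reference than to the original model.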

The contrast in experimental cost is the main argument. On the SVT3 (Sparse Vector Technique) mechanism in differential privacy, the framework detects violations with a few thousand samples, versus millions for DP-Auditorium, the reference tool published by Google Research in 2024 (arXiv:2307.05608), for a comparable detection rate. The detail matters: the gain is documented on SVT3, not across all differential privacy mechanisms, and the authors specify that no single divergence systematically dominates the others. On the unlearning side, three methods (Selective Synaptic Dampening (SSD), pruning and finetuning) were deemed unable to effectively erase the targeted data, and only the random label technique passed the three-sample relative test. These results hold under the paper's simplified experimental conditions, a limitation the authors acknowledge.
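What "detecting a violation from samples" means can be illustrated with a deliberately simple Python sketch. It is neither the kernel test of the paper nor DP-Auditorium, and it does not implement SVT3: it audits a basic Laplace counting mechanism by comparing the frequency of a single output event on two neighboring datasets. The function names, the event threshold and the `noise_factor` bug are all assumptions made for illustration.

```python
import numpy as np

def laplace_count(true_count, epsilon, rng, size, noise_factor=1.0):
    """Laplace mechanism for a counting query with sensitivity 1.
    noise_factor < 1 simulates a bug that injects too little noise."""
    scale = noise_factor / epsilon
    return true_count + rng.laplace(loc=0.0, scale=scale, size=size)

def epsilon_lower_bound(out_d, out_d_prime, threshold, delta=0.05):
    """Conservative lower bound on the privacy loss for the event
    {output >= threshold}, built from Hoeffding confidence intervals
    (each interval holds with probability at least 1 - delta)."""
    n = min(len(out_d), len(out_d_prime))
    radius = np.sqrt(np.log(2.0 / delta) / (2.0 * n))
    p_d_prime = np.mean(out_d_prime >= threshold) - radius  # lower bound
    p_d = np.mean(out_d >= threshold) + radius              # upper bound
    if p_d_prime <= 0.0:
        return 0.0
    return max(0.0, float(np.log(p_d_prime / p_d)))

rng = np.random.default_rng(0)
claimed_eps = 1.0
n_samples = 5000
results = {}
for label, factor in [("correct", 1.0), ("buggy", 0.5)]:
    # Neighboring datasets: the true count is 0 on D and 1 on D'.
    out_d = laplace_count(0, claimed_eps, rng, n_samples, factor)
    out_d_prime = laplace_count(1, claimed_eps, rng, n_samples, factor)
    results[label] = epsilon_lower_bound(out_d, out_d_prime, threshold=1.0)
    print(label, "violation detected:", results[label] > claimed_eps)
```

A violation is declared only when the lower bound exceeds the claimed epsilon, so the sample size needed depends on how tight the confidence intervals must be. That is the cost the paper's comparison with DP-Auditorium is about, measured there on a much harder mechanism.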

The broader scope, however, remains to be demonstrated. A preprint published in October 2025 (arXiv:2510.16629) establishes that a model can never perfectly forget data by adjusting only its current parameters: a residual imprint of the supposedly erased information remains, a structural obstacle that the test by Ribero et al. measures but does not remove. Feng et al. (CMU, UK AI Security Institute, Oxford), in a May 2025 preprint, consider current unlearning evaluations on large language models inconclusive. At the same time, Chen et al. (LMU Munich, Oxford, Siemens) published an audit framework specific to LLMs, against which the AISTATS 2026 paper does not compare itself.

An obligation to achieve results without an enforceable method

Under GDPR, Article 17 on the right to erasure lets individuals request the deletion of their data and obliges the controller to erase it when the conditions of the article are met. Applied to an AI model, this means having to establish that the relevant data no longer influence the outputs, and here the obligation runs into a technical gray area: how can one demonstrate that the data in question no longer influence the model's behavior?

At the European level, the latest framework does not fill this gap. The GPAI Code of Practice, whose final version was published by the European Commission in July 2025, is a voluntary tool covering transparency, copyright and safety. It helps providers demonstrate compliance with Article 53 of the AI Act, which requires a public summary of the content used for training (Article 53(1)(d), applicable since August 2, 2025). The document, in the version consulted, does not prescribe any method for verifying the effective erasure of data in a model that has already been deployed.

The gap will be closed by tools, not by texts. That is precisely the void that the test by Ribero, Schrab and Gretton seeks to address, by proposing a defensible statistical measure of successful erasure. One hurdle remains: as long as experimental validation does not move beyond synthetic benchmarks and physics models to reach large language models, where erasure requests are concentrated, the chain of evidence expected by a data protection officer remains incomplete.

Stephane Nachez

ActuIA editorial team — news, data and analysis on artificial intelligence for decision-makers.
