MIA
Daubert-eligible · Tier 1
Min-K%++
Zhang et al. · ICLR 2025 · arXiv:2404.02936
Description
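Calibrated membership-inference attack against pretraining data. It normalizes each token's log-likelihood by the mean and standard deviation of log-likelihood under the model's next-token distribution, then averages the lowest-scoring k% of tokens in each block. The paper reports state-of-the-art AUROC across the LLaMA, Pythia, and GPT-NeoX families. The normalization decouples token informativeness from the membership signal, and the resulting block scores yield a p-value under a permutation null.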
Technical signature
in: (model, work_blocks[]) → out: { p_value, auroc, per_block_scores[] }
Adversarial robustness
Robust to paraphrase rewrites at a normalized edit distance ≤ 0.2; degrades against adversarial decoder defenses (rejection sampling).
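To make the 0.2 threshold concrete, here is a minimal sketch of one way to measure how far a paraphrase has drifted from the original block. It assumes the distance is a word-level Levenshtein distance divided by the length of the longer text. That definition, and the function names, are assumptions; the exact metric behind the threshold is not stated here.

```python
from typing import Sequence


def levenshtein(a: Sequence[str], b: Sequence[str]) -> int:
    """Classic dynamic-programming edit distance over two sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def normalized_edit_distance(original: str, rewrite: str) -> float:
    """Word-level edit distance scaled to [0, 1] (assumed normalization)."""
    a, b = original.split(), rewrite.split()
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


def within_robust_range(original: str, rewrite: str, limit: float = 0.2) -> bool:
    """True if the rewrite stays inside the stated robustness range."""
    return normalized_edit_distance(original, rewrite) <= limit
```

Under this assumed definition, replacing one word in a five-word block gives a distance of 0.2, which sits exactly at the edge of the stated range.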
Daubert eligibility
Method is peer-reviewed, has published error rates, and is reproducible from the chain-of-custody hash. Eligible for Tier-A admissibility packaging. Ultimate admissibility is the court's determination.