Confidence is not the absence of caveats.
This page documents how the Litmus Test works, the thresholds it uses, what our internal validation does and does not prove, and where the method is weak. Publishing limits is what separates a forensic instrument from a marketing claim.
How the test works
IPLYR runs a near-verbatim recall (nv-recall) extraction test based on Ahmed, Cooper, Koyejo & Liang (arXiv:2601.02671) and the follow-up extraction work in arXiv:2505.12546. Given a target work W and a model M, the test scores the rate at which M emits verbatim blocks of W under controlled, reproducible prompting.
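To make "the rate at which M emits verbatim blocks of W" concrete, here is a minimal sketch in Python. It is not IPLYR's scoring code. The whitespace tokenization, the function names, and the `min_block_tokens` cutoff are all assumptions chosen for illustration. It only shows the general shape of the measurement: find the longest contiguous run a response shares with the work, then count how many responses clear a length cutoff.

```python
from difflib import SequenceMatcher


def longest_verbatim_block(work_tokens, response_tokens):
    """Length, in tokens, of the longest contiguous run shared by both sequences."""
    matcher = SequenceMatcher(None, work_tokens, response_tokens, autojunk=False)
    match = matcher.find_longest_match(0, len(work_tokens), 0, len(response_tokens))
    return match.size


def verbatim_block_rate(work, responses, min_block_tokens=8):
    """Fraction of responses containing a verbatim block of at least min_block_tokens.

    min_block_tokens is a hypothetical cutoff, not a value taken from this page.
    """
    if not responses:
        return 0.0
    work_tokens = work.split()  # naive whitespace tokenization (assumption)
    hits = sum(
        longest_verbatim_block(work_tokens, r.split()) >= min_block_tokens
        for r in responses
    )
    return hits / len(responses)


# Toy inputs: one response quotes the work, the other does not.
work = "it was the best of times it was the worst of times it was the age of wisdom"
responses = [
    "he said it was the best of times it was the worst of times indeed",
    "a completely unrelated sentence about weather",
]
print(verbatim_block_rate(work, responses))  # 0.5
```

A real protocol would use the model's own tokenizer and fixed seeds and sampling parameters. The sketch leaves those out to keep the core comparison visible.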
- Default threshold: p < 0.001 nv-recall, the Stanford "moderate-risk" floor.
- Strict mode: p < 1e-5, used for evidentiary tier-A reports.
- Binary verdict: EXTRACTION CONFIRMED / NO EXTRACTION DETECTED at the chosen threshold.
- Reproducibility: every prompt, seed, sampling parameter, and response is preserved in the chain-of-custody hash tree.
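The reproducibility item above refers to a chain-of-custody hash tree, but this page does not specify how that tree is built. The sketch below shows one common construction, a Merkle tree over the run records, so the idea is concrete. The record fields, the canonical JSON encoding, and the odd-level padding rule are all assumptions. The leaf and node prefix bytes follow the domain-separation idea used in RFC 6962.

```python
import hashlib
import json


def leaf_hash(record):
    """SHA-256 of a canonical JSON encoding of one prompt/response record."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(b"\x00" + canonical).digest()  # 0x00 marks a leaf


def merkle_root(records):
    """Hex root hash over the records, in order.

    On levels with an odd number of nodes the last node is duplicated,
    a simplification chosen for this sketch only.
    """
    if not records:
        raise ValueError("at least one record is required")
    level = [leaf_hash(r) for r in records]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [
            hashlib.sha256(b"\x01" + level[i] + level[i + 1]).digest()  # 0x01 marks a node
            for i in range(0, len(level), 2)
        ]
    return level[0].hex()


# Hypothetical records: field names are illustrative, not the report schema.
records = [
    {"prompt": "p1", "seed": 1, "temperature": 0.0, "response": "r1"},
    {"prompt": "p2", "seed": 2, "temperature": 0.0, "response": "r2"},
    {"prompt": "p3", "seed": 3, "temperature": 0.0, "response": "r3"},
]
print(merkle_root(records))
```

Changing any prompt, seed, sampling parameter, or response changes the root. That property is what lets a single published hash commit to an entire test run.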
Internal validation (what it proves, what it doesn't)
We have run the test against synthetic memorization corpora and against a Project Gutenberg public-domain control set across major frontier and open-weight models. On those internal corpora, strict mode produced zero false positives.
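A zero-false-positive result bounds the false-positive rate but does not show that the rate is zero. As a hedged illustration, the sketch below computes the standard one-sided upper confidence bound for an event never observed in n independent trials, which is the exact form behind the "rule of three" approximation 3/n. The trial count used here is hypothetical, because this page does not state how many control runs were performed. The independence of trials is also an assumption.

```python
def zero_event_upper_bound(n_trials, confidence=0.95):
    """One-sided upper confidence bound on an event rate when 0 events
    were observed in n_trials independent trials.

    Solves (1 - p) ** n_trials = 1 - confidence for p.
    """
    if n_trials <= 0:
        raise ValueError("n_trials must be positive")
    return 1.0 - (1.0 - confidence) ** (1.0 / n_trials)


# Hypothetical trial count, for illustration only.
n = 1000
print(zero_event_upper_bound(n))  # slightly below 3 / n
```

Read this way, a clean control set supports a statement of the form "the false-positive rate is at most about 3/n with 95% confidence", and only for inputs that resemble the control corpora.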
Where the test is weak
- Heavy paraphrase evasion. Models that have been heavily fine-tuned to paraphrase (~30%+ token-level rewording) can evade verbatim-extraction detection. We disclose this clearly in every report.
- Negative is not "never used." NO EXTRACTION DETECTED means the model did not emit verbatim blocks under our protocol; it is not proof the work was never in the training set.
- Refusal-trained models. Aggressive refusal training reduces extraction surface; we report a jailbreak-resistance score alongside the verdict.
- Multimodal coverage. Image and audio nv-recall analogues are weaker than text. Reports are scoped accordingly.
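The paraphrase weakness listed above follows from how verbatim matching works. The toy sketch below uses placeholder tokens and a crude every-third-token substitution as a stand-in for real paraphrase; both are assumptions, not a model of any actual fine-tuning method. It shows that rewording roughly a third of the tokens at regular intervals leaves no shared run longer than two tokens.

```python
from difflib import SequenceMatcher


def longest_shared_run(a_tokens, b_tokens):
    """Length of the longest contiguous token run present in both sequences."""
    matcher = SequenceMatcher(None, a_tokens, b_tokens, autojunk=False)
    return matcher.find_longest_match(0, len(a_tokens), 0, len(b_tokens)).size


def reword_every_nth(tokens, n=3, placeholder="<reworded>"):
    """Replace every n-th token, a crude stand-in for token-level paraphrase."""
    return [placeholder if (i + 1) % n == 0 else t for i, t in enumerate(tokens)]


original = [f"w{i}" for i in range(30)]  # 30 distinct placeholder tokens
reworded = reword_every_nth(original)    # 10 of 30 tokens replaced

print(longest_shared_run(original, original))  # 30
print(longest_shared_run(original, reworded))  # 2
```

Real paraphrase is less regular than this, so longer runs can survive by chance. The sketch only shows why a verbatim-block detector loses signal as rewording increases.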
What this is not
- Not legal advice. Not a guarantee of case outcome. Not a substitute for counsel's independent judgment.
- Not a claim of prior court acceptance. The method is designed for Daubert / FRE 702 admissibility; admission in a specific case is for the court.
- Not a black box. Every report is reproducible from the chain-of-custody hash.
Available to counsel and opposing experts under MNDA, including the Daubert-factor matrix and prompt-protocol appendix.