Alignment Whack-a-Mole: Finetuning Activates Verbatim Recall of Copyrighted Books in Large Language Models

📰 ArXiv cs.AI

Finetuning large language models can activate verbatim recall of copyrighted books, bypassing safety alignment strategies

Published 26 Mar 2026
Action Steps
  1. Recognize that finetuning can reactivate verbatim recall of copyrighted books in large language models
  2. Do not assume that safety alignment strategies such as RLHF, system prompts, and output filters remain effective after finetuning
  3. Re-evaluate training data and finetuning procedures for verbatim recall of copyrighted text to prevent infringement (a minimal probe is sketched after this list)
  4. Regulatory bodies should consider the implications of finetuning for copyright law and regulation
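To make "verbatim recall" concrete, here is a minimal probe, a sketch rather than the paper's protocol: prompt the model with the opening of a passage and measure how much of the ground-truth continuation it reproduces character-for-character. The model name `gpt2`, the helpers `probe_recall` and `longest_verbatim_prefix`, and the 200-character prefix length are illustrative assumptions, not details taken from the paper.

```python
# Sketch only: probe a causal LM for verbatim recall of a known passage.
# This is NOT the paper's evaluation protocol; names and thresholds below
# are illustrative assumptions.
from transformers import AutoTokenizer, AutoModelForCausalLM

MODEL = "gpt2"  # hypothetical stand-in; the paper concerns larger LLMs


def longest_verbatim_prefix(generated: str, reference: str) -> int:
    """Characters of exact match between the model's continuation and
    the ground-truth continuation (a simple proxy for recall)."""
    n = 0
    for g, r in zip(generated, reference):
        if g != r:
            break
        n += 1
    return n


def probe_recall(passage: str, prefix_chars: int = 200) -> int:
    """Prompt with the opening of `passage`; return how many characters
    of the true continuation the model reproduces verbatim."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL)
    model = AutoModelForCausalLM.from_pretrained(MODEL)

    prompt, truth = passage[:prefix_chars], passage[prefix_chars:]
    inputs = tokenizer(prompt, return_tensors="pt")

    # Greedy decoding: deterministic output makes the probe reproducible.
    out = model.generate(**inputs, max_new_tokens=128, do_sample=False)

    # Keep only the newly generated tokens (drop the echoed prompt).
    continuation = tokenizer.decode(
        out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )
    return longest_verbatim_prefix(continuation, truth)
```

A character-level exact match is a deliberately crude signal; running the same probe on a model before and after finetuning would show whether recall of the passage was reactivated.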
Who Needs to Know This

AI engineers and researchers working on large language models should be aware of this issue, both to ensure compliance with copyright law and to develop safety measures that hold up under finetuning.

Key Insight

💡 Finetuning can undo safety alignment and put large language models out of compliance with copyright law

Share This
🚨 Finetuning can bypass safety measures and reactivate verbatim recall of copyrighted books in LLMs 🚨