Alignment Whack-a-Mole: Code Repository for Research on LLM Copyrighted Book Memorization via Finetuning
Summary
This repository provides the official code for the paper "Alignment Whack-a-Mole: Finetuning Activates Verbatim Recall of Copyrighted Books in Large Language Models." It covers data preprocessing, finetuning scripts, memorization evaluation, and analysis tools. The paper reports that finetuning a large language model can activate verbatim recall of copyrighted book content. The repo ships only partial example files in data/, containing a small subset of excerpts and generations from Cormac McCarthy's "The Road"; full book content and full model generations are left out because the books are copyrighted and the generations contain large portions of verbatim text.
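To make "verbatim recall" concrete, here is a minimal sketch of one way to quantify word-level overlap between a model generation and a reference excerpt. It is an illustration only and is not taken from the repository: the function names, the `min_len` threshold, and the use of Python's standard `difflib` are all assumptions, and the paper's actual memorization metric may differ. The sample strings are placeholders, not book text.

```python
from difflib import SequenceMatcher


def longest_verbatim_span(generation: str, reference: str) -> str:
    """Return the longest contiguous run of words shared by both texts.

    Hypothetical helper for illustration; not the repository's metric.
    """
    gen_words = generation.split()
    ref_words = reference.split()
    # autojunk=False disables difflib's "popular element" heuristic,
    # which would otherwise ignore frequent words in long texts.
    matcher = SequenceMatcher(None, gen_words, ref_words, autojunk=False)
    match = matcher.find_longest_match(0, len(gen_words), 0, len(ref_words))
    return " ".join(gen_words[match.a : match.a + match.size])


def verbatim_coverage(generation: str, reference: str, min_len: int = 5) -> float:
    """Fraction of generation words inside aligned matching blocks of at
    least `min_len` words (blocks come from difflib's in-order alignment).

    The threshold of 5 words is an arbitrary assumption.
    """
    gen_words = generation.split()
    ref_words = reference.split()
    if not gen_words:
        return 0.0
    matcher = SequenceMatcher(None, gen_words, ref_words, autojunk=False)
    covered = sum(
        block.size
        for block in matcher.get_matching_blocks()
        if block.size >= min_len
    )
    return covered / len(gen_words)


# Placeholder texts standing in for a book excerpt and a model output.
reference = "the quick brown fox jumps over the lazy dog near the river bank"
generation = "a story where the quick brown fox jumps over the fence"

print(longest_verbatim_span(generation, reference))
print(round(verbatim_coverage(generation, reference), 3))
```

Splitting on whitespace keeps the sketch dependency-free; a real evaluation would likely normalize punctuation and casing first and might operate on tokenizer output rather than words.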
Key quotes
The official code repo of Alignment Whack-a-Mole: Finetuning Activates Verbatim Recall of Copyrighted Books in Large Language Models
Full book content and model generations are not included because the books are copyrighted and the generations contain large portions of verbatim text.
We provide partial example files in data/ containing a small subset of excerpts and generations from The Road by Cormac McCarthy.
Repository: cauchy221/Alignment-Whack-a-Mole-Code
