Alignment Whack-a-Mole: Code Repository for Research on LLM Copyrighted Book Memorization via Finetuning

1mo ago · 4 min read · en · Code

Score: 95 · Type: press release · Sentiment: neutral

Summary

This repository provides the official code for the paper "Alignment Whack-a-Mole: Finetuning Activates Verbatim Recall of Copyrighted Books in Large Language Models." It includes data preprocessing, finetuning scripts, memorization evaluation, and analysis tools. The research shows that finetuning large language models can trigger verbatim recall of copyrighted book content. The repo ships partial example files in data/ with a small subset of excerpts and generations from Cormac McCarthy's "The Road"; full book content and full model generations are omitted because the books are copyrighted and the generations contain large portions of verbatim text.
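
The summary mentions memorization evaluation without describing the metric. As a rough illustration only, the sketch below measures the longest run of consecutive words that a generation shares with a reference text, using Python's standard difflib module. The function names, the whitespace tokenization, and the sample strings are assumptions made for this example; they are not taken from the repository, whose actual evaluation may differ.

```python
from difflib import SequenceMatcher


def longest_verbatim_span(reference: str, generation: str) -> list[str]:
    """Return the longest run of consecutive words shared by both texts.

    Hypothetical helper for illustration; not the repository's metric.
    """
    ref_words = reference.split()
    gen_words = generation.split()
    # autojunk=False stops difflib from ignoring very frequent words
    # when the inputs are long.
    matcher = SequenceMatcher(None, ref_words, gen_words, autojunk=False)
    match = matcher.find_longest_match(0, len(ref_words), 0, len(gen_words))
    return ref_words[match.a : match.a + match.size]


def verbatim_fraction(reference: str, generation: str) -> float:
    """Share of the generation's words covered by the longest shared span."""
    gen_words = generation.split()
    if not gen_words:
        return 0.0
    return len(longest_verbatim_span(reference, generation)) / len(gen_words)


if __name__ == "__main__":
    # Invented placeholder strings, not book text.
    reference = "the quick brown fox jumps over the lazy dog near the river"
    generation = "a story where the fox jumps over the lazy dog and sleeps"
    print(" ".join(longest_verbatim_span(reference, generation)))
    print(verbatim_fraction(reference, generation))
```

A word-level span is only one possible proxy for verbatim recall. It ignores differences in punctuation and casing unless the texts are normalized first, and it reports a single longest run rather than the total amount of overlapping text.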

Key quotes

The official code repo of Alignment Whack-a-Mole: Finetuning Activates Verbatim Recall of Copyrighted Books in Large Language Models

Full book content and model generations are not included because the books are copyrighted and the generations contain large portions of verbatim text.

We provide partial example files in data/ containing a small subset of excerpts and generations from The Road by Cormac McCarthy.

Snippet from the RSS feed

The official code repo of Alignment Whack-a-Mole: Finetuning Activates Verbatim Recall of Copyrighted Books in Large Language Models - cauchy221/Alignment-Whack-a-Mole-Code
