Google argues AI training on public web data should be protected as fair use in new policy paper
By Murray Stassen
Summary
Google has published a policy paper arguing that training AI models on publicly available web data should be protected as fair use under US copyright law. The company contends that copyright concerns around generative AI are best addressed at the output stage (what the AI generates) rather than the input stage (what data it's trained on). This positions Google against music industry groups like the RIAA and publishers who are suing AI companies over unauthorized use of copyrighted works for training. As a company building its own AI music tools, Google has a direct financial and legal stake in how these copyright questions are resolved.
Key quotes
training AI models on publicly available web data should 'remain protected' by fair use in the US
copyright concerns raised by generative AI are best addressed at the level of outputs, not inputs
The RIAA, music publishers, and independent artists are all fighting AI companies in court over the same question: whether training a model on copyrighted work without permission is fair use