The U.S. Copyright Office seeks public input on how copyright law should address issues raised by AI systems that can generate creative works. Key questions include:
- Should permission/payment to copyright holders be required to train AI models on their works? If so, how to facilitate licensing?
- Can AI creations qualify for copyright despite lacking human authorship?
- Who bears legal liability if AI systems produce infringing content?
- Should AI outputs be labelled to show non-human origin?
- Do new publicity-type rights need to be created to address AI replicating personal attributes?
The rapid advancement of artificial intelligence (AI) technologies like generative models and conversational bots has sparked urgent debate around their relationship with copyright law.
As AI grows more sophisticated in mimicking uniquely human creative endeavours, questions loom about who can claim ownership and legal protections.
Details of Announcement
The U.S. Copyright Office has opened a public comment period to gather perspectives on copyright issues related to artificial intelligence (AI). The Office is seeking input on several key questions:
- How copyright law should address the use of copyrighted works to train AI systems. This includes whether permission and/or compensation of copyright owners should be required, and if so, how licensing could work.
- Whether AI-generated works can qualify for copyright protection given the lack of human authorship. The Office believes current law requires human creation but seeks views on whether the law should change.
- How to allocate legal liability if AI systems produce infringing content. For example, if an AI image mimics a copyrighted work used in training, who bears responsibility?
- Whether AI outputs should be labelled to identify their non-human origin, and if so, how such a system could work.
- Whether new publicity-type rights are needed to address AI replicating personal attributes like an individual's voice or artistic style.
Key dates in the U.S. Copyright Office's artificial intelligence and copyright inquiry:
- August 2022 - The Copyright Office refuses registration for an AI-generated artwork, stating human authorship is required.
- October 2022 - Congress asks the Copyright Office to advise on AI copyright issues.
- March 2023 - The Office launches an AI initiative and issues guidance reiterating the human authorship requirement.
- April to May 2023 - The Office holds public listening sessions on AI and copyright.
- June to July 2023 - The Office hosts public webinars on AI and copyright topics.
- August 24, 2023 - The Office publishes a formal notice of inquiry seeking public comments on multiple AI copyright questions.
- October 18, 2023 - Deadline for initial public comments to the Office.
- November 15, 2023 - Deadline for reply comments.
Over the past year, the Copyright Office has explored emerging issues related to copyright and artificial intelligence through public events, meetings with experts, and guidance. The notice of inquiry marks a key milestone: it formally solicits public input that will help the Office advise Congress on whether legislative changes are needed to address AI technologies. The comment deadlines in October and November 2023 give interested parties the opportunity to provide their perspectives.
Who Owns Creative Works Generated by AI?
A pivotal question is whether AI-produced works should qualify for copyright at all if no human was involved in their creation. This issue came to a head last year when the Copyright Office rejected computer scientist Stephen Thaler's application to register the copyright in an image generated by his AI system. Thaler sued, but a federal judge sided with the Copyright Office, affirming that copyright has always required human authorship.
With advanced generative AI like DALL-E and Midjourney producing remarkably detailed digital images, paintings, and other media from text prompts, the debate continues. "Over the past several years, the Office has begun to receive applications to register works containing AI-generated material," the agency noted. Some argue AI art lacks the human expression or intent required for copyright eligibility. Others counter that programmers' and users' inputs and selections make AI output a collaborative human-machine creation worthy of protection.
Training AI on Copyrighted Data
Another thorny issue is the common practice of feeding vast amounts of copyrighted text, images, video and other data to AI systems during training. Major models like GPT-3 are trained on huge volumes of online books, websites and more. Creators often have not given explicit consent, but the "fair use" doctrine may legally allow this uncompensated use, for example for research purposes.
Nonetheless, high-profile lawsuits have recently emerged. Comedian Sarah Silverman and authors like Christopher Golden sued AI companies for allegedly scraping their copyrighted books and material without permission. Major news organizations have also blocked their content from being copied into AI training datasets over copyright concerns. The Copyright Office now seeks input on which uses of copyrighted data for AI training should be allowed or barred.
Training Data and Generative AI Spark Legal Debates
At the core of recent controversies are two key AI capabilities:
- Large language models that ingest publicly accessible data to improve natural language processing. This data often includes copyrighted books, articles, songs, images, and more.
- Generative AI systems like DALL-E and Midjourney that can autonomously create art, music, and text—potentially mimicking recognizable styles without human direction.
To address growing disputes, the U.S. Copyright Office has opened a public comment period, with reply comments due in mid-November, to gather perspectives on whether and how copyright applies to these AI applications. Specifically, it aims to parse three complex questions:
- Should AI systems require permission to use copyrighted works in training data? If so, how would this be tracked and enforced?
- Can AI creations qualify for copyright given the lack of human authorship? If not, how should the economic impacts on original creators be accounted for?
- How should legal liability be allocated if AI systems infringe on copyright or publicity rights?
According to the Copyright Office, recent applications already seek protection for AI outputs, indicating a need for guidance. But with corporations, researchers, artists, authors, and consumers all holding different interests, consensus will be difficult.
Mounting Lawsuits Highlight Urgent Need for Guardrails
The absence of clear regulations has sparked growing legal action alleging improper use of copyrighted materials:
- Generative art platforms targeted. Artists filed suits against Stability AI (the company behind Stable Diffusion), Midjourney, and DeviantArt for allegedly training models on artwork without consent.
- Language models scrutinized. Comedian Sarah Silverman and authors sued OpenAI and Meta for allegedly scraping books without permission to develop natural language AI like ChatGPT.
- News outlets restrict access. Major publishers blocked OpenAI's web crawler to avoid having their articles copied into training data.
These cases exemplify how existing copyright statutes fail to address AI systems that can ingest and synthesize massive data. Even identifying infringing content within black-box algorithms is hugely difficult. It's thus unsurprising that calls for updated rules are intensifying.
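The crawler blocking mentioned above is usually done through a site's robots.txt file, which well-behaved crawlers consult before fetching pages. The sketch below is only an illustration: the robots.txt content and the news.example.com URL are hypothetical, and it assumes the crawler identifies itself with the user-agent token "GPTBot" (the token OpenAI publishes for its crawler). It uses Python's standard urllib.robotparser module to check which crawlers such a file would turn away.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a news site that opts out of one AI crawler.
# The "GPTBot" token is an assumption about how the crawler identifies
# itself; the rules and URLs here are invented for illustration.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    article = "https://news.example.com/2023/08/story.html"
    print(is_allowed("GPTBot", article))        # False: blocked site-wide
    print(is_allowed("SomeOtherBot", article))  # True: falls under the * rule
```

Note that robots.txt is a voluntary convention: it signals the publisher's wishes but does not technically prevent a crawler from fetching the pages, which is part of why publishers are also looking to copyright law.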
Shared Liability for AI Creations
Given AI's ability to mimic voices, styles and more, the Copyright Office also raised questions around publicity rights and unfair competition laws. If an AI chatbot perfectly imitates a real person's speech patterns without their authorization, who holds responsibility? How about AI art or media that closely parallels a human creator's signature style?
Here the issues grow even more complex, extending to false endorsement. If AI models lift identifying characteristics of real people, legal liability likely extends beyond misuse of training data to the companies deploying the technology. However, apportioning responsibility among data sources, AI developers and users is murky.
Informing AI Regulation
This public comment period is an initial step by the Copyright Office to inform its own policies on registering AI works. But the feedback could also shape congressional action to update IP laws for the era of the algorithm. Lawmakers are accelerating efforts to draft guardrails for AI, and with innovators applying AI in novel ways daily, pressure is mounting for policies that keep pace. By inviting input from diverse AI stakeholders, regulators can craft balanced rules for a technology with immense creative promise and disruptive potential.
How to Submit Comments
For the convenience of participants and to streamline the process, the Copyright Office has adopted the regulations.gov system for the submission and posting of public comments related to this proceeding. All interested parties are encouraged to submit their comments electronically through the regulations.gov platform.
Specific instructions on how to submit your comments using the regulations.gov system can be found on the Copyright Office's official website at https://copyright.gov/policy/artificial-intelligence. This website provides step-by-step guidance on how to navigate the regulations.gov platform and successfully submit your comments.
If electronic submission through regulations.gov is not feasible due to specific circumstances, individuals may contact the Copyright Office to request special instructions for submitting their comments. The Office is committed to ensuring that all stakeholders have an opportunity to participate and will provide guidance on alternate submission methods.
For any further inquiries or assistance related to the submission of comments or any other aspect of this proceeding, please feel free to reach out to:
Assistant to the General Counsel
Email: [email protected]
List of Questions
| Topic | Question |
| --- | --- |
| General Questions: The Office has several general questions about generative AI in addition to the specific topics listed below. Commenters are encouraged to raise any positions or views that are not elicited by the more detailed questions further below. | 1. As described above, generative AI systems have the ability to produce material that would be copyrightable if it were created by a human author. What are your views on the potential benefits and risks of this technology? How is the use of this technology currently affecting or likely to affect creators, copyright owners, technology developers, researchers, and the public? |
| | Does the increasing use or distribution of AI-generated material raise any unique issues for your sector or industry as compared to other copyright stakeholders? |
| | Please identify any papers or studies that you believe are relevant to this Notice. These may address, for example, the economic effects of generative AI on the creative industries or how different licensing regimes do or could operate to remunerate copyright owners and/or creators for the use of their works in training AI models. The Office requests that commenters provide a hyperlink to the identified papers. |
| | Are there any statutory or regulatory approaches that have been adopted or are under consideration in other countries that relate to copyright and AI that should be considered or avoided in the United States? How important a factor is international consistency in this area across borders? |
| Training | If your comment applies only to a specific subset of AI technologies, please make that clear. |
| | 6. What kinds of copyright-protected training materials are used to train AI models, and how are those materials collected and curated? |
| | 6.1. How or where do developers of AI models acquire the materials or datasets that their models are trained on? To what extent is training material first collected by third-party entities (such as academic researchers or private companies)? |
| | 6.2. To what extent are copyrighted works licensed from copyright owners for use as training materials? To your knowledge, what licensing models are currently being offered and used? |
| Transparency & Recordkeeping | 15. In order to allow copyright owners to determine whether their works have been used, should developers of AI models be required to collect, retain, and disclose records regarding the materials used to train their models? Should creators of training datasets have a similar obligation? |
| | 15.1. What level of specificity should be required? |
| | 17. Outside of copyright law, are there existing U.S. laws that could require developers of AI models or systems to retain or disclose records about the materials they used for training? |
| Generative AI Outputs | If your comment applies only to a particular subset of generative AI technologies, please make that clear. |
| Copyrightability | 18. Under copyright law, are there circumstances when a human using a generative AI system should be considered the “author” of material produced by the system? If so, what factors are relevant to that determination? For example, is selecting what material an AI model is trained on and/or providing an iterative series of text commands or prompts sufficient to claim authorship of the resulting output? |
| | 19. Are any revisions to the Copyright Act necessary to clarify the human authorship requirement or to provide additional standards to determine when content including AI-generated material is subject to copyright protection? |
| Infringement | 22. Can AI-generated outputs implicate the exclusive rights of preexisting copyrighted works, such as the right of reproduction or the derivative work right? If so, in what circumstances? |
| | 23. Is the substantial similarity test adequate to address claims of infringement based on outputs from a generative AI system, or is some other standard appropriate or necessary? |
| | 25. If AI-generated material is found to infringe a copyrighted work, who should be directly or secondarily liable: the developer of a generative AI model, the developer of the system incorporating that model, end users of the system, or other parties? |
| Labeling or Identification | 28. Should the law require AI-generated material to be labeled or otherwise publicly identified as being generated by AI? If so, in what context should the requirement apply and how should it work? |
| | 29. What tools exist or are in development to identify AI-generated material, including by standard-setting bodies? How accurate are these tools? What are their limitations? |
| Additional Questions About Issues Related to Copyright | 30. What legal rights, if any, currently apply to AI-generated material that features the name or likeness, including vocal likeness, of a particular person? |
| | 33. With respect to sound recordings, how does section 114(b) of the Copyright Act relate to state law, such as state right of publicity laws? Does this issue require legislative attention in the context of generative AI? |
| | 34. Please identify any issues not mentioned above that the Copyright Office should consider in conducting this study. |