Yale Researchers Propose Copyleft License for AI Transparency

Key Takeaways

  • Introduces a 'copyleft' licensing framework to ensure AI models built on open-source code remain transparent and accessible.
  • Provides a legal mechanism to combat 'open washing' by requiring developers to share model architecture and training data.
  • Empowers the open-source community to maintain control over how their contributions are utilized in generative AI development.

Yale researchers propose ‘copyleft’ rules for generative AI

Researchers at Yale’s Digital Ethics Center have introduced a new licensing framework designed to bring transparency to the rapidly evolving field of generative artificial intelligence. The proposed Contextual Copyleft AI License (CCAI) seeks to address a growing tension between the free and open-source software (FOSS) community and AI developers, who often utilize open-source code to build models without reciprocating the transparency required by FOSS principles.

Extending the copyleft concept

The CCAI is a novel extension of "copyleft" licensing, a concept traditionally used in software to ensure that derivative works remain as free and transparent as the original material. By treating generative AI models as derivative works, the proposed license would require developers who train their models on open-source code to make their resulting architecture and training data freely available.

Lead author Grant Shanklin, a de Vries-Sherif Junior Fellow at the Digital Ethics Center and a rising senior at Yale College, noted that this approach could provide open-source developers with meaningful control over how their contributions are utilized. Shanklin emphasized that the framework would incentivize the formation of a community dedicated to building AI tools that align with the values of the open-source movement, fostering a more open and responsible development environment.

Addressing the transparency gap

The study, published in the International Journal of Law and Information Technology, argues that many AI companies currently benefit from open-source code while keeping their final models proprietary. Claudio Novelli, a de Vries-Sherif Associate Research Scientist at the Digital Ethics Center, pointed out that while some companies claim to be open, their models often remain opaque regarding key components. The researchers suggest that the CCAI would effectively combat "open washing," a practice where organizations label products as open while maintaining restrictive, proprietary controls.

The authors conducted a comprehensive legal and policy analysis, concluding that the CCAI is legally feasible under current copyright law, provided that the training of AI models does not qualify as "fair use." In addition to increasing transparency, the researchers believe the license would encourage innovation by granting open-source developers access to training data that is currently locked within closed, proprietary systems.

Mitigating risks in AI development

Generative AI presents a higher risk profile than traditional software due to its potential to generate deceptive content or amplify malicious activities, such as phishing. The Yale researchers suggest that the CCAI could serve as a vital complement to existing regulatory protections, such as those enacted by the European Union, which aim to prevent AI systems from using manipulative or deceptive techniques to influence human behavior.

By ensuring that models built with open-source software remain fully transparent, the researchers argue, the CCAI framework could help mitigate the risks associated with generative AI.

The study was co-authored by Claudio Novelli; Emmie Hine; Luciano Floridi, the John K. Castle Professor in the Practice of Cognitive Science and founding director of the Digital Ethics Center; and former undergraduate fellow Tyler Schroder.
