
3 min read

13/07/2023

Generative AI and Copyright Issues

Generative artificial intelligence (ChatGPT, for example) has become very promising for researchers and content creators of all kinds. However, it also poses risks that have yet to be fully explored, copyright infringement among them. As AI tools crawl the Internet and other digital sources for information to answer users' queries, the material they collect often belongs to other content creators.

What risks does this pose for those who trust that information, or even use it verbatim, when it turns out to be inaccurate?

Let's focus on the content generated by generative AI (GenAI) and the implications that content creators should consider.

 

Recommended reading:

Generative AI, the Branch of Artificial Intelligence That Will Spark Many Discussions in 2023 

 

 

Copyright: entering unknown territories  

Daniel Restrepo, an associate attorney in the business and finance group at the Fennemore law firm in Phoenix, points out that there are two conflicting interests when it comes to intellectual property (IP) regulation and generative AI.

"Historically, copyright was reserved for human-created content with the political aim of fostering the exchange of new and innovative ideas between the public and culture," However, there is "an enormous interest in promoting and rewarding AI development and machine learning. AI has significant value for companies, government administration and national security." 

The dilemma? "If we do not provide IP rights to the content generated by the AI, specifically to its designers, there is much less incentive to create such software if the result immediately enters the public domain." 

In addition, there are other concerns. As Kennington Groff, a lawyer specializing in intellectual property at Founders Legal in Atlanta, points out:  

"According to recent guidelines provided by the US Copyright Office, there is a potential risk of infringement when AI generates content derived from copyrighted material without proper authorization. As systems collect information and respond to users' questions and concerns, they can inadvertently use copyrighted content belonging to other creators." 

Aaron C. Rice, chair of Founders Legal's entertainment group and the firm's lead attorney in Nashville, notes in an April 2023 blog post (U.S. Copyright Guidelines for Works Containing AI-Generated Material) that the copyright registration process includes the following disclosure requirement:

"When registering a work containing AI-generated material, creators must clarify the use of AI in the registration request. This disclosure helps the Copyright Office to assess the human author's contribution to the work."  

 

Understanding how GenAI actually works

As Arle Lommel, Director of the Data Service at CSA Research in Massachusetts, points out, generative AI does not really work the way many believe. Many people assume that it acts as a giant search engine that retrieves and replays content it has previously stored somewhere.

That's not how it works. Generative AI systems do not store vast amounts of training data. What they do is store statistical representations of that data. This means they cannot simply reproduce something they were trained on; they have to generate something new based on it.
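To make that distinction concrete, here is a minimal sketch in Python. It is a toy illustration of the principle only, not of how any production LLM is built: the "model" keeps nothing but word-pair statistics derived from its training text and samples new sequences from them, so it cannot replay the original documents.

```python
import random
from collections import defaultdict

# Toy training text; a real system would see billions of words.
training_text = "the cat sat on the mat . the dog sat on the rug ."

# "Training": keep only a statistical summary - how often each word follows another.
counts = defaultdict(lambda: defaultdict(int))
words = training_text.split()
for prev, nxt in zip(words, words[1:]):
    counts[prev][nxt] += 1

def generate(start="the", max_words=8):
    """Sample a new word sequence from the learned statistics."""
    out = [start]
    for _ in range(max_words - 1):
        followers = counts.get(out[-1])
        if not followers:
            break
        next_words, weights = zip(*followers.items())
        out.append(random.choices(next_words, weights=weights)[0])
    return " ".join(out)

print(generate())  # e.g. "the dog sat on the mat ." - a new combination, not a lookup
```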

Lommel compares the process to that of a university student who is asked to write an essay based on existing research and then express and synthesize, in their own words, part of that knowledge to reflect their own understanding. "This differs from the student who buys an essay online or copies a Wikipedia article, which clearly constitutes plagiarism."

Addressing plagiarism and ownership of AI-generated content will be extremely difficult, given the way these systems work.  

Lommel acknowledges, "Theoretically, the result is a derivative work." But he adds: "derived from many, many essays, which contribute in an infinitesimal degree to the result. This does not mean that an intelligent legal strategy cannot succeed by finding some infringing use, but the risk is very low."

For written content creators, for example, there are tools such as Grammarly or Turnitin that can be used to identify plagiarism (and the percentage of matching text). There are also tools such as the OpenAI text classifier, which let users paste in text and analyze the probability that it was created by a human or by AI.
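As a purely illustrative sketch, and not how Grammarly, Turnitin, or the OpenAI classifier actually work internally, the snippet below uses Python's standard difflib module to score verbatim overlap between a draft passage and a known source, which is roughly the kind of percentage such overlap checks report. The example strings are hypothetical.

```python
from difflib import SequenceMatcher

def overlap_ratio(draft: str, source: str) -> float:
    """Return a 0-1 similarity score between a draft passage and a known source text."""
    return SequenceMatcher(None, draft.lower(), source.lower()).ratio()

# Hypothetical example strings, used only to show the call.
draft = "Generative AI systems store statistical representations of their training data."
source = "Generative AI systems do not store vast amounts of training data."

print(f"Similarity: {overlap_ratio(draft, source):.0%}")  # a high score flags the passage for review
```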

 

You may also find this interesting:

Beyond ChatGPT: The Future of Large Language Models and AI 

 

 

Necessary or not? 

For those who use Generative AI to create written content, there is a risk greater than plagiarism: that of inaccuracy. Lommel explains the problem of AI inaccuracy as follows: 

"Generative AI carries a real risk because of its fluidity - the result resembles something a competent human could say or create. This exponentially increases the probability of ignoring subtle problems of meaning." 

For example, suppose you use GenAI to translate a manual on how to treat a disease and it makes mistakes in the details that nobody detects because the result seems human. However subtle those mistakes may be, who is responsible if they cause harm? All current GenAI tools explicitly disclaim any warranty of fitness in their results, leaving all responsibility in the hands of the person who uses them.

Simply saying "I thought I was fine" will not eliminate personal responsibility. There have already been lawsuits concerning incorrect content, but it is only a matter of time before a large organization is sued for content produced (using GAI) without sufficient supervision. 

The static nature of these systems also poses an accuracy risk, as they are trained on datasets with a fixed cut-off date. For ChatGPT, it is still 2021.

"GPT-4 warns users about this. The model becomes problematic when systems make claims based on past knowledge that is now known to be untrue. Imagine a case where a system describes someone as a convicted killer, but that person was completely exonerated after the system was trained." 

That risk exists even with content that is not generated by AI, as humans can overlook certain facts that would lead to confusion. 

There are three important things written content creators should do as they experiment with this technology:  

  1. Consider Generative AI as a tool that can provide useful and valuable information to your writing process.

  2. Conduct thorough fact-checks.

  3. Run any content that relies, even to a small extent, on information provided by generative AI through tools that can minimize the risk of inadvertent copyright infringement.