Intellectual Property Concerns in AI-Generated Content: Top 3 Questions Answered

Samuel G. Villegas
5 min read · Sep 7, 2023


When I was pitching the idea for Crystal Docs, many of you raised the same concern about Intellectual Property in AI-generated content: “Okay, great idea, but what about the Intellectual Property risks?”

The three main questions some of you were asking:

  • What happens to my proprietary codebase and data?
  • Can I/my company claim authorship of the generated content?
  • Can someone sue me for generated content that violates someone’s IP?

Let’s address those, and if you have other questions, feel free to reach out or leave them in the comments.

What happens to my proprietary codebase and data when I submit it to an LLM?

I first answered this question by explaining that we apply terms to your codebase and data similar to those of companies and services such as GitHub, GitLab, Bitbucket, Google Docs, Microsoft 365, AWS, or GCP. You own your content and data; the only thing you’re giving us and them is the set of rights needed to host and process the data. In some cases, as with GitHub’s Terms, there is a program that uses your user-generated content and data to train AI models, and you can always opt out to ensure that your codebase and Intellectual Property are protected.

But of course, one size doesn’t fit all. Some of you work for companies that self-host their version control or document management systems, or that have very strict policies on the intellectual property of their codebases. For those situations, that first policy is not a good fit.

To help those companies, we are creating a self-hosted version of our solution so they can take advantage of AI models while staying compliant with their intellectual property policies and keeping content and data secure.

Equally important, whether you’re using Crystal Docs or any of the commonly known tools, you must feel confident that the Intellectual Property of your codebase and data is secure and compliant. Read each tool’s Terms and Conditions to make sure its policies meet your requirements.

In short, with us, you own your data. We just ask you to grant the rights needed to store and process it so we can generate documentation content using our AI models, and if this doesn’t work with your company’s policies, you will still be able to run a self-hosted version of our solution.

Can I/my company claim authorship of the AI-generated content?

This is a complex question, as the answer varies by country and its corresponding regulations. For example, on August 18th, 2023, a US court ruled that AI-generated art cannot be subject to copyright: “Only works with human authors can receive copyrights,” setting a precedent for future cases.

In the EU, lawmakers agreed on a new draft of the AI Act early this year (Apr 2023); it has not yet become law. It started as a proposal two years ago, and the draft now states, “Companies deploying generative AI-based tools will have to disclose any copyrighted material used to develop their systems.” But a clear definition of authorship and ownership of AI-generated content is still missing.

If you’re in Japan, the government has also published a blueprint policy on this type of content. India, in 2019, published a report from the World Intellectual Property Organization as the standard for its regulations but has not updated it or published anything new since. China hasn’t officially published anything.

Whether your software code is a piece of art or an implementation of a proprietary algorithm, you should pay attention to what you can and cannot do with this content. Of course, you could argue that you can claim authorship if you host your own models, if the company hosting the models cedes you proprietary rights over the generated content, or if you change the names of some functions and variables.

At Crystal Docs, we have decided to establish that the generated content you create while using our solution belongs to you or your company, ceding you total proprietary rights over the generated content. This policy could change depending on the regulations, but it’s a stepping stone to ensuring you don’t worry too much about this.

Can someone sue me for AI-generated content that violates someone’s IP?

Imagine you’re asked to develop a new feature on an existing large codebase. You use Crystal Docs to generate the documentation, making it easier to understand what’s happening in that codebase, and then you use that knowledge to generate code ‘inspired’ by your codebase using an AI model. Done: you developed the new feature. Later, feeling proud, you publish the solution and how fast you built it.

Suddenly, another company notices your publication and claims your code/solution resembles their copyrighted code or solutions. The unfortunate reality is, yes, you can face legal action for AI-generated content that unintentionally mimics or replicates another’s copyrighted material.

Given the capacity of AI to analyze vast amounts of data, it may inadvertently create outputs that align closely with existing copyrighted works, even if that wasn’t the original intention. As with the examples in the music or film industries where AI can recreate songs or scenes that seem ‘inspired,’ software code isn’t immune to these parallels.

The AI generation model may be trained on an immense data set, making it challenging to pinpoint direct IP infringement. Nevertheless, the responsibility lies with the users and businesses that use AI generation tools to ensure that the generated content does not violate any existing copyrights or other intellectual property rights.

We continuously work towards refining our AI models to mitigate such risks, but we also urge users to vet generated content thoroughly before publication or distribution.
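As a rough illustration of what a first-pass vetting step could look like, here is a minimal sketch that flags generated code that is textually close to reference snippets you already know about. Everything in it is an assumption made for illustration: the function names (`normalize`, `similarity`, `flag_for_review`), the 0.8 threshold, and the idea of comparing against a local set of references are hypothetical and are not part of Crystal Docs. Textual similarity is only a signal for human review, not a legal test of infringement.

```python
from difflib import SequenceMatcher


def normalize(code: str) -> str:
    """Drop blank lines and surrounding whitespace so formatting alone does not hide similarity."""
    lines = (line.strip() for line in code.splitlines())
    return "\n".join(line for line in lines if line)


def similarity(generated: str, reference: str) -> float:
    """Return a ratio between 0 and 1; 1.0 means the normalized texts are identical."""
    return SequenceMatcher(None, normalize(generated), normalize(reference)).ratio()


def flag_for_review(generated: str, references: dict[str, str], threshold: float = 0.8) -> list[str]:
    """Return the names of reference snippets whose similarity meets the (arbitrary) threshold."""
    return [
        name
        for name, reference in references.items()
        if similarity(generated, reference) >= threshold
    ]


if __name__ == "__main__":
    # Hypothetical reference snippets, e.g. code you know to be under a third-party license.
    references = {
        "same": "def add(a, b):\n        return a + b\n",
        "other": "class Logger:\n    def write(self, msg):\n        print(msg)\n",
    }
    generated = "def add(a, b):\n    return a + b"
    print(flag_for_review(generated, references))
```

Anything this check flags would still need a person to look at it, and a clean result does not prove the generated code is free of IP issues; it only narrows down where to look first.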

As AI evolves, staying informed about the new regulations covering Intellectual Property in AI-generated content is crucial as you use these tools to elevate your software development process.

Dive into the conversation; share your AI experiences or concerns below.
