Bidirectional linking with Contentful

Contentful is a Content Management System (CMS) that is incredibly easy to set up and use. It comes with APIs (JS, Ruby, Java, Python, and more) that allow us to both deliver and manage content. It also provides easy ways to import and export data from other sources such as WordPress.

In one of my journeys recently, I was looking for a CMS that would allow our engineering team to store and manage content via APIs and would also allow non-technical users (subject matter experts on the content itself) to modify the content without having to go through a complicated process. We decided on Contentful and were able to create our desired content structure very easily.

Over time, our content grew. We ended up with 3 content types, each with hundreds of entries, and all of them had many-to-many relationships with each other. How would we establish these relationships in Contentful?

This article deals with how we solved this problem using custom code to do the mappings (and more) for our needs. Some familiarity with Contentful is necessary to understand the examples below.

For the purpose of this article, I’ll simplify this use case with a popular example: Blog Posts and Authors.

Let’s say we have a Blog Post (“My Rails Article On Medium”) and it has one author (“Maximus Mallya”).

So, if we create the above Blog Post and create Maximus Mallya as the author using the Create New Entry and Link option, we can then look up Maximus Mallya and see “My Rails Article…” listed as the associated Blog Post. Bidirectional relationship achieved.

Mission accomplished, right?

[Figure: The Blog Post view showing Maximus as the author]
[Figure: The Author view showing “My Rails Article..” as the associated Blog Post]

However, our content types are complex entries having over 30 properties (descriptions, links, resources etc.) each. These could not all be created up front and had to be created separately. Hence, our need was to link all these entries to each other after they were fully created.

To take the above Blog Post example, let’s say that our blog post has 3 authors (Maximus Mallya, Piper Mallya and Krueger Mallya). What if each of these Author entries had 15 properties and had to be created separately? In such cases we would need to do the following:

  1. Create the Blog Post (“My Rails Article On Medium”)
  2. Create each of the 3 Authors (Maximus, Piper, Krueger). Populate all the properties associated with each Author.
     [Figure: A Blog Post and 3 Authors created]
  3. Link the Blog Post to the 3 authors manually
  4. Visit each Author and link back to the Blog Post (example: Link Krueger to the Blog Post)

As the number of blog posts and authors grows, the above process can become tedious and is prone to human error. Also, what if the same author gets added multiple times?

Solution: We can write a simple program in Ruby (or in any other language) that does the following:

  1. Fetch the Blog Posts and associated Authors and create a hash (blog_post_author_hash)
  2. Fetch the Authors and associated Blog Posts and create another hash (author_blog_post_hash)
  3. Iterate over the blog_post_author_hash and check whether each Blog Post → Author entry has a matching Author → Blog Post entry in author_blog_post_hash. If not, invoke the Content Management API to create a new link from the Author to the Blog Post.

The code to iterate over the blog posts and extract the authors would look like the snippet below.
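
Here is a minimal sketch of such a snippet. It assumes the contentful.rb gem and a Contentful::Client that is already configured with your space id and access token. The content type id blogPost and the field id authors are assumptions made for this example, so substitute your own.

```ruby
# Sketch only. `client` is assumed to be a Contentful::Client from the
# contentful.rb gem (require 'contentful'), for example:
#   client = Contentful::Client.new(space: SPACE_ID, access_token: TOKEN)
# 'blogPost' and :authors are hypothetical ids used for illustration.

# Returns a hash of { blog_post_id => [author_id, ...] }.
def build_blog_post_author_hash(client)
  # include: 2 resolves two levels of linked entries;
  # limit: 300 raises the page size above the default of 100.
  blog_posts = client.entries(content_type: 'blogPost', include: 2, limit: 300)

  blog_posts.each_with_object({}) do |post, hash|
    # A Blog Post with no authors has no :authors key, so fall back to [].
    authors = post.fields[:authors] || []
    hash[post.id] = authors.map(&:id)
  end
end
```

The same approach, with the content type and field swapped, would produce the author_blog_post_hash.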

Tip: In my example above, I am fetching 2 levels of linked data and up to 300 entries per request (the default is 100). This is just to show how to fetch more data than normal; use it with care, since larger responses can hurt performance. As the number of entries grows, use pagination instead.
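
Pagination could look something like the sketch below, which keeps requesting pages with the skip and limit query parameters until a short page comes back. The method name fetch_all_entries and the page_size parameter are my own for this example, and client is assumed to respond to entries(query) the way Contentful::Client does.

```ruby
# Sketch only. Collects all entries of one content type, one page at a time.
# `client` is assumed to behave like Contentful::Client (contentful.rb gem).
def fetch_all_entries(client, content_type, page_size: 100)
  all_entries = []
  skip = 0

  loop do
    page = client.entries(
      content_type: content_type,
      limit: page_size,
      skip: skip,
      order: 'sys.createdAt' # a fixed order keeps pages from overlapping
    ).to_a

    all_entries.concat(page)
    # A page shorter than page_size means there is nothing left to fetch.
    break if page.size < page_size

    skip += page_size
  end

  all_entries
end
```

When the total is an exact multiple of page_size, this makes one extra request that returns an empty page. That keeps the loop simple at the cost of a single call.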

I’ll leave it to the reader to actually create the hashes and do the comparisons. Once you find the mismatches (for example, if Author “Piper Mallya” is not associated with the Blog Post), you can programmatically add a Link to the Blog Post in Piper Mallya’s entry.
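
For readers who want a starting point, the comparison can be done in plain Ruby once both hashes exist. The sketch below assumes each hash maps an entry id to an array of linked entry ids. The method name missing_back_links and the short ids are made up for this example.

```ruby
# Returns [author_id, blog_post_id] pairs where the Blog Post links to the
# Author but the Author does not link back to the Blog Post.
def missing_back_links(blog_post_author_hash, author_blog_post_hash)
  blog_post_author_hash.flat_map do |post_id, author_ids|
    author_ids.uniq
              .reject { |author_id| author_blog_post_hash.fetch(author_id, []).include?(post_id) }
              .map { |author_id| [author_id, post_id] }
  end
end

# Hypothetical data: Piper is listed on the post but does not link back.
blog_post_author_hash = { 'post-1' => %w[maximus piper krueger] }
author_blog_post_hash = {
  'maximus' => ['post-1'],
  'piper'   => [],
  'krueger' => ['post-1']
}

p missing_back_links(blog_post_author_hash, author_blog_post_hash)
# => [["piper", "post-1"]]
```

Calling uniq on the author ids also guards against the same author being added to a Blog Post more than once.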

Let’s say that Piper’s Entry id is 5QJRButh9CqGGo0A6qOWYY and the Blog Post id is 5EpIDhwwbSgMUWuIcEyg2a. With this information, you can use the Content Management API as follows:

  1. Create a new link for this relationship between Author and Blog Post
  2. Fetch the entry for the Author
  3. Extract the array of associated Blog Posts (might be a good time to check for duplicates too)
  4. Add this blog post to the array
  5. Update the Author Entry
  6. Publish the Author Entry
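
The link-building and duplicate-check steps can be sketched without any API calls. Contentful represents a reference to another entry as a Link object with a sys block, and the sketch below builds one and adds it to an Author’s existing array of links. The method names are my own. Fetching, updating and publishing the Author entry are left to whichever Content Management API client you use.

```ruby
# Builds the Link object Contentful uses to reference another entry.
def entry_link(entry_id)
  { 'sys' => { 'type' => 'Link', 'linkType' => 'Entry', 'id' => entry_id } }
end

# Returns a new array of links that includes blog_post_id exactly once.
# `existing_links` is the Author's current array of Blog Post links.
def add_blog_post_link(existing_links, blog_post_id)
  already_linked = existing_links.any? { |link| link.dig('sys', 'id') == blog_post_id }
  return existing_links.dup if already_linked

  existing_links + [entry_link(blog_post_id)]
end

# Piper's entry (5QJRButh9CqGGo0A6qOWYY) currently has no Blog Post links.
piper_links = []
blog_post_id = '5EpIDhwwbSgMUWuIcEyg2a'

updated_links = add_blog_post_link(piper_links, blog_post_id)
p updated_links
# => [{"sys"=>{"type"=>"Link", "linkType"=>"Entry", "id"=>"5EpIDhwwbSgMUWuIcEyg2a"}}]
```

In the Content Management API payload, this array is stored under the Author’s link field for a given locale (fields, then the field id, then the locale). The field id depends on your content model.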

Update: Contentful has an experimental GraphQL implementation that supports back references. You can check it out at: https://github.com/contentful-labs/cf-graphql

In conclusion: By using the simple solution above, we were able to eliminate weeks of extra effort spent mapping the various A → B and B → A relationships, identifying duplicates and unassociated entries (orphans), and running validations. In the future, as CMS linking processes improve, these solutions might not be required.