Our approach to building the world’s first AI coloring pages generator: A Technical and Creative Journey

MWM · Published in Geek Culture · Apr 11, 2023

by Thomas Jacquemin

Try Color Pop AI

Disclaimer. This article represents our initial approach to the feature and should be considered a starting point. Since its publication, our methods and technology have evolved. We are committed to providing regular updates on our progress and improvements as we continue to advance the capabilities of Color Pop AI.

At MWM, we closely follow the latest technological advancements related to creativity. In the field of image generation, the capabilities of “generative models” have increased tremendously in recent years. These AI models, trained on large datasets, capture the essence of the underlying data and become capable of generating new samples. Recent work has coupled these generative capabilities with large language models focused on text comprehension, resulting in a fascinating form of interactivity: the generation of an image can now be conditioned on a sentence written by the user. In other words, the user can steer the generation process, because the model is trained to take the input text into account.
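To give a concrete picture of what “conditioning on a sentence” means, here is a minimal sketch using the openly available CLIP text encoder from Hugging Face. This is an illustrative choice rather than a description of our exact production components: the prompt is turned into a sequence of embedding vectors that the image generator attends to at every denoising step.

```python
# Illustrative only: how a text prompt becomes a conditioning signal.
# Uses the public CLIP text encoder from Hugging Face transformers;
# this is a generic sketch, not MWM's production code.
import torch
from transformers import CLIPTokenizer, CLIPTextModel

tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")
text_encoder = CLIPTextModel.from_pretrained("openai/clip-vit-large-patch14")

prompt = "a DJ cat wearing headphones, coloring book style"
tokens = tokenizer(prompt, padding="max_length", truncation=True, return_tensors="pt")

with torch.no_grad():
    # One embedding per token; a diffusion model attends to these vectors
    # at every denoising step, which is how the text "conditions" the image.
    text_embeddings = text_encoder(**tokens).last_hidden_state

print(text_embeddings.shape)  # e.g. torch.Size([1, 77, 768])
```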

The birth of Color Pop AI

It didn’t take long for a large community of AI enthusiasts to identify the creative power of these new tools, bringing AI-generated art into the spotlight. At MWM, we wanted to join in, and we had already experimented with older frameworks such as CLIP+VQGAN, which produced images recognizable by their abstract and evocative style. Diffusion models, however, brought a new level of speed and generation quality.

While the Machine Learning team worked on the technical specifications required to exploit such models, the idea of applying these technologies to the Color Pop app came from MWM’s designers, who were well aware of these new trends. Using AI to automatically generate coloring drawings was born from the alliance of technical and creative minds.

How it works

We applied these generative models to the creation of coloring book drawings. Our Color Pop app already enables thousands of users every day to express their creativity by coloring drawings from a vast predefined collection. But what if a user wants to color something that doesn’t exist in any digital or physical form, such as a DJ cat or a cat riding a motorbike?

Generative models make it possible for users to create their own drawings and bring their own ideas to life by addressing the AI directly. Over the last few months, we have been working to provide our users with this on-the-fly drawing generation system, which is fully compatible with the existing drawing kit developed by the MWM mobile rendering team. Since each generation is unique, users can now add their creations to their collection, color them, and share them with the world.

The deep learning model we use is called Stable Diffusion, and it has been fine-tuned on a substantial dataset of coloring drawings. When a Color Pop user sends a request to the service, the AI generates four candidate drawings on our servers. A final post-processing step then cleans each image, upscales its resolution, and converts it to the format expected by the custom drawing kit we developed for the mobile application.
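As an illustration of this flow, the sketch below uses the Hugging Face diffusers library with a placeholder model name (“mwm/coloring-pages-sd” is hypothetical) and a deliberately simplified post-processing step; the production service and the drawing-kit conversion are more involved.

```python
# A minimal sketch of the generation flow, assuming the Hugging Face
# diffusers library. "mwm/coloring-pages-sd" is a placeholder name for a
# fine-tuned checkpoint, not a real model id.
import torch
from diffusers import StableDiffusionPipeline
from PIL import Image

pipe = StableDiffusionPipeline.from_pretrained(
    "mwm/coloring-pages-sd",  # placeholder for a fine-tuned Stable Diffusion model
    torch_dtype=torch.float16,
).to("cuda")

prompt = "a cat riding a motorbike, clean line art, coloring book page"
# Four candidate drawings per request, as in the Color Pop service.
images = pipe(prompt, num_images_per_prompt=4, num_inference_steps=30).images

def to_coloring_page(img: Image.Image, size: int = 2048, threshold: int = 200) -> Image.Image:
    """Simplified post-processing: grayscale, binarize to crisp black
    lines on white, and upscale for the mobile drawing kit."""
    gray = img.convert("L")
    binary = gray.point(lambda p: 255 if p > threshold else 0)
    return binary.resize((size, size), Image.LANCZOS)

pages = [to_coloring_page(img) for img in images]
for i, page in enumerate(pages):
    page.save(f"candidate_{i}.png")
```

Generating the four candidates in a single batched pipeline call, rather than four separate generations, amortizes the per-request overhead on the GPU.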

Challenges and solutions

The first challenge was achieving a good level of quality, with a consistent style and well-defined coloring areas: regions must be large enough and clearly outlined to be pleasant to color. The second challenge is that adding an ML feature to an existing application with tens of thousands of daily users is not straightforward. The AI model that generates the drawings is resource-intensive, and specific deployment strategies had to be put in place to quickly deliver four different images per request to users around the world while avoiding unnecessary and costly server overload.

To address these challenges, we trained the model on a large dataset of internal, manually curated artworks, with an emphasis on quality first. The generative model was then guided toward the desired drawing style through further fine-tuning on a large collection of coloring book pages acquired over the years of Color Pop’s existence.
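For readers curious about what such fine-tuning looks like in practice, the condensed sketch below follows the standard text-to-image training objective used in the public diffusers training examples. The base checkpoint, hyperparameters, and the coloring_dataloader placeholder are illustrative, not our actual training setup.

```python
# A condensed fine-tuning sketch on (caption, coloring page) pairs with
# diffusers. The dataloader is a placeholder, and production details
# (EMA, gradient accumulation, multi-GPU, mixed precision) are omitted.
import torch
import torch.nn.functional as F
from diffusers import AutoencoderKL, UNet2DConditionModel, DDPMScheduler
from transformers import CLIPTokenizer, CLIPTextModel

base = "CompVis/stable-diffusion-v1-4"  # illustrative base checkpoint
tokenizer = CLIPTokenizer.from_pretrained(base, subfolder="tokenizer")
text_encoder = CLIPTextModel.from_pretrained(base, subfolder="text_encoder").cuda().eval()
vae = AutoencoderKL.from_pretrained(base, subfolder="vae").cuda().eval()
unet = UNet2DConditionModel.from_pretrained(base, subfolder="unet").cuda().train()
noise_scheduler = DDPMScheduler.from_pretrained(base, subfolder="scheduler")

optimizer = torch.optim.AdamW(unet.parameters(), lr=1e-5)

# `coloring_dataloader` is a placeholder yielding batches of
# {"pixel_values": [B, 3, 512, 512] tensors in [-1, 1], "caption": list of str}.
for batch in coloring_dataloader:
    with torch.no_grad():
        # Encode images into the VAE latent space used by Stable Diffusion.
        latents = vae.encode(batch["pixel_values"].cuda()).latent_dist.sample() * 0.18215
        tokens = tokenizer(batch["caption"], padding="max_length", truncation=True,
                           return_tensors="pt").input_ids.cuda()
        encoder_hidden_states = text_encoder(tokens).last_hidden_state

    # Standard diffusion objective: add noise at a random timestep and train
    # the UNet to predict that noise, conditioned on the caption embeddings.
    noise = torch.randn_like(latents)
    timesteps = torch.randint(0, noise_scheduler.config.num_train_timesteps,
                              (latents.shape[0],), device=latents.device)
    noisy_latents = noise_scheduler.add_noise(latents, noise, timesteps)
    noise_pred = unet(noisy_latents, timesteps, encoder_hidden_states).sample

    loss = F.mse_loss(noise_pred, noise)
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```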

As this project proved to be broad in scope, we also approached it as a way to further strengthen our internal pipelines and processes, including data collection, tracking and validation of experiments related to the models, integration within MWM’s custom services and infrastructures, model serving optimization, infrastructure scalability, and production monitoring.
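On the serving side, one generic pattern for keeping GPU load under control is micro-batching: incoming prompts are queued very briefly so the model runs fewer, larger batches. The asyncio sketch below only illustrates that idea, with a stubbed generate_batch standing in for the actual diffusion call; it is not a description of MWM’s infrastructure.

```python
# Illustrative micro-batching pattern for model serving. `generate_batch`
# is a hypothetical stand-in for the real batched diffusion model call.
import asyncio

MAX_BATCH = 8        # prompts per GPU batch
MAX_WAIT_S = 0.05    # how long to wait for a batch to fill up

def generate_batch(prompts):
    # Placeholder: a real service would run the fine-tuned diffusion
    # model once for the whole batch here.
    return [f"image for: {p}" for p in prompts]

async def batching_worker(queue):
    while True:
        prompt, future = await queue.get()
        batch = [(prompt, future)]
        loop = asyncio.get_running_loop()
        deadline = loop.time() + MAX_WAIT_S
        # Collect more requests until the batch is full or the deadline passes.
        while len(batch) < MAX_BATCH and loop.time() < deadline:
            try:
                batch.append(await asyncio.wait_for(queue.get(), deadline - loop.time()))
            except asyncio.TimeoutError:
                break
        for (_, fut), result in zip(batch, generate_batch([p for p, _ in batch])):
            fut.set_result(result)

async def handle_request(queue, prompt):
    future = asyncio.get_running_loop().create_future()
    await queue.put((prompt, future))
    return await future

async def main():
    queue = asyncio.Queue()
    asyncio.create_task(batching_worker(queue))
    prompts = ["a DJ cat", "a cat riding a motorbike", "a castle made of candy"]
    print(await asyncio.gather(*(handle_request(queue, p) for p in prompts)))

asyncio.run(main())
```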

The future of Color Pop

As we look to the next version of Color Pop, our main goal is to continuously improve the generation results and provide an even better coloring experience for our users. This includes producing cleaner and smoother lines, which we expect to reach by gathering more data and experimenting further with the generative models. We also plan to keep up with the latest research in the field and update our service with better models as they become available.

In terms of product features, we have some exciting ideas in the works. One possibility is to offer users the ability to generate coloring pages based on their own faces, which would add a new level of personalization to the app. We are also exploring an “outpainting” feature, which would let users obtain larger canvases by extending the original drawing beyond its borders. While these ideas are still in the research phase, we are excited about their potential for expanding the possibilities of Color Pop and giving users even more creative tools to work with.
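To make the outpainting idea more concrete, here is a rough sketch built on the publicly available Stable Diffusion inpainting pipeline: the original drawing is pasted onto a wider canvas and the empty region is masked for generation. The model id, file paths, and parameters are illustrative only; this is not the feature as it may eventually ship in Color Pop.

```python
# Rough outpainting sketch using the public diffusers inpainting pipeline.
# Model id and "drawing.png" are illustrative placeholders.
import torch
from diffusers import StableDiffusionInpaintPipeline
from PIL import Image

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-inpainting", torch_dtype=torch.float16
).to("cuda")

# An existing coloring drawing (placeholder path).
original = Image.open("drawing.png").convert("RGB").resize((512, 512))

# Place the original drawing in the left half of a wider canvas...
canvas = Image.new("RGB", (1024, 512), "white")
canvas.paste(original, (0, 0))

# ...and mark the empty right half as the region to generate
# (white = repaint, black = keep).
mask = Image.new("L", (1024, 512), 0)
mask.paste(255, (512, 0, 1024, 512))

extended = pipe(
    prompt="coloring book page, clean line art, continuation of the scene",
    image=canvas,
    mask_image=mask,
    width=1024,
    height=512,
).images[0]
extended.save("extended_drawing.png")
```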

About the author
Thomas Jacquemin, Machine Learning Engineer at MWM


About MWM
We make AI-powered tools that help people to boundlessly explore their potential: https://mwm.io/