AI to Art: Creating Your First Children’s Book

Saswat Panda
9 min read · Nov 14, 2023

Earlier this year, I embarked on a fun creative journey, crafting a custom Mother’s Day book for my wife and our baby daughter using Midjourney and ChatGPT. Now, with the holiday season just around the corner, several friends have asked me to share the process I used. While it’s not a breeze, it’s certainly manageable and fun, even if (like me) you don’t know how to use image editing tools. Crafting my 16-page book was a labor of love, requiring about 16 hours of dedicated effort. The most significant hurdle was achieving a consistent visual style and character art across the pages with Midjourney. You could probably create something in a few hours, but to get an “acceptable” level of consistency, more time might be needed. I’m hoping this post will streamline your creative process and save you a lot of time!

Process Overview

  1. Story: Create a story outline yourself and use ChatGPT to flesh it out
  2. Illustrations: Use Midjourney to generate images with a consistent visual style
  3. Publishing: Combine images to create a PDF and send to print

Budgeting and Resources

  • AI Tools: ChatGPT is free. Midjourney has a basic plan at $10/mo, and you might need to buy more credits depending on usage (I spent ~$20 total for 16 pages)
  • Design Software: Figma (free and optional), PowerPoint (or Google Slides)
  • Printing: I used BookBaby for $99.

Step 1: Story

Devise the story yourself and then get ChatGPT to improve it. I found it difficult for even the best language models to come up with a super-personal story that doesn’t seem contrived.

  1. Craft Your Story: Begin by devising your own narrative.
  2. Enhance with ChatGPT: Use ChatGPT to transform your tale into a book format by:
    • Improving the narrative’s quality.
    • Segmenting the story across individual pages.
    • Refining the writing style for your target audience.
    • Suggesting vivid image descriptions for each page.
  3. Refine and Repeat: Modify your prompts to address any shortcomings in the output. Revisit Step 2 as needed for enhancements.
  4. Curate Your Masterpiece: Compile the finest elements from various ChatGPT iterations to create your polished story.
  5. Iterate for Perfection: Return to Step 2 and repeat the process if further refinements are necessary.
Input for Attempt #1
Output for Attempt #1 (Excerpt)

Step 2: Illustration

Some tips before we get started…

Simplify Your Scenes: Aim for simplicity in your scenes. Achieving a consistent visual style and character art across renderings is challenging, especially as you add more elements and characters. For instance, creating scenes with just one character makes it significantly easier to maintain consistency compared to scenes with three.

Embrace Creative Flexibility: Allow for creative interpretation in your illustrations. Rather than attempting to replicate the exact scene in your mind, explore different artistic expressions. My own experience with Page 1 (see below) is a good example.

Experiment Generously: Don’t hesitate to experiment. On average, I generated about 26 images per page before selecting the final one. It’s all about trial and error to find the perfect fit.

  1. Get Started with Midjourney: Begin by setting up Midjourney and learning the basics of Discord.
  2. Learn the Basics: Invest an hour in understanding Midjourney’s features and capabilities.
  3. Generate Your Illustrations:
    • Focus on finalizing one page (or a list of 2–3 options) before moving to the next.
    • Set a specific time limit for each page. Mine was 45 minutes. This is crucial, especially if you’re a perfectionist, because Midjourney is never perfect.
    • Once you’re satisfied with an image, increase its resolution. Initially, I used a third-party image upscaler, but Midjourney now includes this feature.
    • Don’t forget to save each image as you go along.
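To sanity-check a per-page time limit against the totals mentioned in this post, here is a rough back-of-the-envelope sketch in Python. The page count, the 45-minute cap, and the 26 images per page are the figures from this post; the function name is just for illustration.

```python
def illustration_budget(pages: int, minutes_per_page: int, images_per_page: int) -> tuple[float, int]:
    """Return (hours spent generating images, total images generated)."""
    hours = pages * minutes_per_page / 60
    total_images = pages * images_per_page
    return hours, total_images


# Figures from this post: 16 pages, a 45-minute cap, ~26 images per page.
hours, total_images = illustration_budget(pages=16, minutes_per_page=45, images_per_page=26)
print(f"{hours:.0f} hours, about {total_images} images")  # 12 hours, about 416 images
```

If the cap holds on every page, illustration alone would account for about 12 of the roughly 16 hours the book took. Swap in your own numbers to see how a longer book or a looser cap changes the total.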

Techniques We’ll Use

Seed parameter

A “seed” in software systems is a number used to initialize a pseudo-random number generator. Fed into a deterministic formula, the same seed always produces the same sequence of numbers, while changing the seed even slightly produces an unpredictable, different sequence. In Midjourney, random numbers derived from a seed make each request’s output unique. If you specify a fixed seed parameter, however, identical requests should produce the same (or nearly the same) output. A consistent seed can also yield similar visual styles for slightly varied prompts.
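The general idea of a seed can be illustrated with Python’s built-in random module. This is only a sketch of how seeded pseudo-random generators behave, not of how Midjourney works internally; the function name and seed values are arbitrary.

```python
import random


def sample(seed: int, n: int = 5) -> list[int]:
    """Draw n pseudo-random integers from a generator initialized with `seed`."""
    rng = random.Random(seed)  # independent generator, so global state is untouched
    return [rng.randint(0, 999) for _ in range(n)]


same_a = sample(42)
same_b = sample(42)
different = sample(43)

print(same_a == same_b)   # True: same seed, same sequence
print(same_a, different)  # a neighboring seed gives an unrelated-looking sequence
```

In Midjourney, the equivalent knob is the `--seed` parameter appended to a prompt.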

Character sheets

A character sheet can be a great way to generate multiple poses for a single character. You can then split the sheet into individual images, upload each image to the Discord chat (to get a unique URL for it), and pass that URL into your next /imagine command. This increases the likelihood that the output will be consistent with the character sheet.
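If you would rather split the sheet programmatically than by hand, the crop regions can be computed with a few lines of Python. This is a minimal sketch that assumes the poses sit in a regular grid, which Midjourney does not guarantee; the function name and the 1024×1024, 2×2 example are hypothetical. Each box is a (left, upper, right, lower) tuple, the form that image libraries such as Pillow accept for cropping.

```python
def grid_boxes(width: int, height: int, rows: int, cols: int) -> list[tuple[int, int, int, int]]:
    """Return (left, upper, right, lower) crop boxes for a rows x cols grid, row by row."""
    boxes = []
    for r in range(rows):
        for c in range(cols):
            left = c * width // cols
            upper = r * height // rows
            right = (c + 1) * width // cols
            lower = (r + 1) * height // rows
            boxes.append((left, upper, right, lower))
    return boxes


# Hypothetical sheet: 1024x1024 pixels with four poses in a 2x2 grid.
for box in grid_boxes(1024, 1024, rows=2, cols=2):
    print(box)
```

In practice the poses are rarely perfectly aligned, so expect to nudge the boxes or fall back to cropping by eye.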

Sample images

You can pass one or more image links to your prompt, which will direct the prompt to generate a similar image. One approach for visual consistency is to use the final images from previous scenes. To generate the URLs, simply copy the link of upscaled images (or upload your own image to the Discord chat and click “Copy Link”).

The two images on the left were used as part of the prompt for the image on the right
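When reusing the same reference images and seed across many pages, it can help to assemble the prompt text in one place. The sketch below builds the string you would paste after `/imagine prompt:` in Discord, with image URLs first and parameters last. The helper name, the example.com URLs, the description, and the seed value are all placeholders.

```python
from typing import Optional, Sequence


def build_prompt(description: str, image_urls: Sequence[str] = (), seed: Optional[int] = None) -> str:
    """Assemble prompt text: image URLs, then the description, then parameters."""
    parts = list(image_urls)           # reference images go at the front
    parts.append(description.strip())  # the text description of the scene
    if seed is not None:
        parts.append(f"--seed {seed}")  # parameters go at the end
    return " ".join(parts)


prompt = build_prompt(
    "a toddler running behind a colorful garbage truck, children's book illustration",
    image_urls=["https://example.com/page1.png", "https://example.com/page2.png"],
    seed=1234,
)
print(prompt)
```

Keeping the URLs and seed in variables makes it easy to change only the scene description from page to page.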

Zooming out

If you’re using a similar prompt for each rendering, the zoom level will likely be similar. One approach I’ve found useful is the “Zoom Out” button. Don’t be afraid to (1) Zoom out more than you think you need, (2) Up-res the image, (3) Crop out the sections you don’t need.

Example zoomed-out view on the right. You can then crop as needed so characters are centered/bottom-right/etc.
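The cropping step is simple arithmetic, sketched here in Python. Given where the character sits in the zoomed-out image, the function returns a crop box that places the character at a chosen spot in the frame (centered, bottom-right, and so on). The function name and all pixel values are hypothetical, and the sketch assumes the crop is no larger than the image.

```python
def crop_box(image_size, crop_size, subject_xy, place=(0.5, 0.5)):
    """Return (left, upper, right, lower) so the subject lands at `place` within the crop.

    `place` is a fraction of the crop's width and height: (0.5, 0.5) centers the
    subject, (0.75, 0.75) pushes it toward the bottom-right.
    Assumes crop_size fits inside image_size.
    """
    img_w, img_h = image_size
    crop_w, crop_h = crop_size
    sx, sy = subject_xy
    left = round(sx - place[0] * crop_w)
    upper = round(sy - place[1] * crop_h)
    # Clamp so the crop never extends past the image edges.
    left = max(0, min(left, img_w - crop_w))
    upper = max(0, min(upper, img_h - crop_h))
    return (left, upper, left + crop_w, upper + crop_h)


# Hypothetical 2048x2048 zoomed-out image with the character at (1024, 1200).
print(crop_box((2048, 2048), (1024, 1024), (1024, 1200)))                # (512, 688, 1536, 1712)
print(crop_box((2048, 2048), (1024, 1024), (1024, 1200), (0.75, 0.75)))  # (256, 432, 1280, 1456)
```

Zooming out further than you think you need gives the crop more room to move before it hits an edge.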

Style parameter

Midjourney just released a style parameter that I wish I had access to earlier this year! It seems to really help with maintaining style consistency between sheets and I recommend trying it out.

Sample Pages

In this section, I’ll walk through a couple of the first pages of the book, how the scenes were generated, and the challenges that I faced. Note that I generated roughly 26 images for each page, so don’t be afraid to try many things.

Page 1

“In bustling San Francisco, by the bay so wide, lived one-year-old [XX] with her Amma and Papa, side by side.”

[Illustration: A bright, colorful picture of a tall house in San Francisco, with [XX], Amma, and Papa peeking out from a window.]

I first tried to recreate the illustration exactly as is — an indoor scene with a view of the outdoors. As you can see from some examples below, that didn’t work.

“peeking out from a window” not really working

I decided to make the scene outdoors. You can see that requesting 3 characters “baby, mother, father” just doesn’t work very well.

“baby, mother, father” not so good

After trying a few different things, I arrived at a setup that I liked. However, getting the number, age, skin color, hair, height, gender, and other attributes of the characters right was a real challenge. For example, if I used the term “spectacles”, glasses would show up on any combination of the 3 (or more) rendered characters.

The left-most image is referenced in the two prompts on the right

I finally arrived at one I thought was decent — 3 characters of the right size and gender, but they don’t really look like us. When my wife later saw this, she asked “What’s up with that animal? And the egg???” I have no clue.

Page 1 Complete! But what’s that white blob?

But this process taught me a major lesson — it was time to kick Papa out of the rest of the book.

Page 4

“Then came a trash truck, rumbling and tumbling, [XX] saw it, her favorite, and started running and fumbling.”
[Illustration: [XX] running enthusiastically behind a colorful trash truck, with Amma looking in fear.]

The first attempt was to capture the illustration as-is. Three elements — trash truck, baby, mom — turned out to be too complicated. So I switched to just two. Even then, I always got renderings of a truck chasing a child, not the other way around.

It's not “garbage truck running behind baby” 🤦

The final solution was to create renderings of a rear-facing truck and rear-facing child separately and then combine them into the final image.

Images selected from the two on the left were used to generate the image on the right
Page 4 Complete! But couldn’t fit Amma in 🙁

Page 11

“But oh, it was too much, Amma’s world went awhirl, and down she swooned, our dear girl!”
[Illustration: Amma fainting dramatically, with Shah Rukh Khan looking surprised.]

I’m selecting this page as an example because it was the hardest to get right and required the most finagling outside Midjourney. To get 3 characters (Amma, baby, and Shah Rukh Khan) in the exact poses I wanted, I had to generate the three characters separately on white backgrounds, run each through online background removal software, and then combine them into a single image in Figma (or PowerPoint, Preview, etc.).

First: Iterating towards the right pose for Amma
Second: Getting SRK right
Third: Sprinkle in some cuteness
The final shot combining 3 images together

Step 3: Printing

There are several options here. If you want to publish your book, you could publish directly through Amazon. I wanted to create something private and unique for my family and opted for BookBaby, which I found to be high quality and reasonably priced at $99.

  1. Select Your Publisher: Choose a publisher that aligns with your vision and budget.
  2. Designing the Layout:
    • Use PowerPoint (or a similar tool) to create a layout for your book, where each slide represents a page.
    • Adjust the slide dimensions to match your publisher’s specifications.
    • Crop the images if necessary. In my case, I wanted a square book, which fortunately aligned with the aspect ratio of my images, eliminating the need for cropping.
    • Plan your text layout. I opted for a simple design with text and images on alternating pages.
  3. Create Your Cover and Back Images: Use Midjourney to craft a captivating cover and back image for your book. Add the book title to the cover using an image editor (e.g., PowerPoint or Figma).
  4. Assemble Your Book into a PDF: Compile your pages and cover into a PDF, adhering to your publisher’s format requirements.
  5. Submit Your Masterpiece!
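Before submitting, it is worth checking that your upscaled images have enough pixels for the printed page size. The sketch below shows the arithmetic in Python. The 8-inch square trim, the 0.125-inch bleed, and the 300 DPI default are placeholder assumptions, not BookBaby’s actual specifications, so use the numbers your publisher gives you.

```python
def required_pixels(trim_inches: float, dpi: int = 300, bleed_inches: float = 0.0) -> int:
    """Pixels needed along one side: trim size plus bleed on both edges, times DPI."""
    return round((trim_inches + 2 * bleed_inches) * dpi)


# Placeholder values: an 8-inch square page printed at 300 DPI.
print(required_pixels(8.0))                      # 2400
print(required_pixels(8.0, bleed_inches=0.125))  # 2475
```

If an image falls short of the required size, upscale it before placing it in the layout rather than stretching it on the slide.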

I like using Figma, so here’s how I kept track of selected images as I rendered them:

Storyboard 1: Page breakdown with up to 4 potential renderings
Storyboard 2: Final book before exporting to PDF via PowerPoint

That’s It!

Hope this post inspires you to give it a shot this holiday season and craft a unique children’s book for your loved ones. Please feel free to share thoughts, questions, and tips I may have missed in the comments below!
