Popular Text-to-Image Generation AI Tools: A Comparative Analysis

Vaishnavi R
Version 1
Oct 10, 2023
Robot painting an image of mountains
Image by Bing Image Creator

In an era where imagination meets innovation, AI has opened up an intriguing domain of creativity through image-generation tools. Whether you’re an artist, a designer, or simply curious about the intersection of technology and imagination, you’ll find these AI-powered tools a gateway to artistic expression and endless inspiration.

In this blog, we will analyse some popular AI-driven image-generation tools.

1) DALL-E

The original DALL-E is a 12-billion-parameter transformer language model trained to generate images from natural language descriptions, called “prompts”.

DALL-E, DALL-E 2, and DALL-E 3 models have been developed by OpenAI using deep learning methodologies. In September 2023, OpenAI announced their latest image model, DALL-E 3, capable of understanding “significantly more nuance and detail” than previous iterations.

Screengrab of DALL-E interface

How to use DALL-E 2

  1. Visit DALL·E 2 (openai.com) and click on Try DALL-E and sign up.
  2. Once the page loads, navigate to the top right corner to check and buy the credits.
  3. All new users receive 15 free credits. They have a 1-month validity from the date they are granted and are replenished monthly.
  4. For more information on credits, you can refer to this link: How DALL·E Credits Work | OpenAI Help Centre.
  5. Enter a prompt on the home page and click ‘Generate’.
  6. Wait a few seconds and your AI-generated image will be ready.
  7. You can also upload an image, set the frame, write a prompt, and click ‘Generate’.
  8. Finally, you can choose to download, save to a collection, share, edit, or create more variations easily.

Features:

  1. DALL-E provides the option to edit images.
  2. Inpainting: Lets you erase part of an existing image and use AI to fill the gap with whatever you specify.
  3. Outpainting: Lets you use AI to expand the borders of an existing image.
  4. DALL-E API: Developers can integrate DALL-E directly into their apps through an API. For more details, check this out: DALL·E API now available in public beta (openai.com).
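
As a rough illustration of the API route, the sketch below prepares (but does not send) a request to OpenAI's image generation endpoint using only the Python standard library. The endpoint path and the `prompt`, `n`, and `size` fields follow OpenAI's public REST API as we understand it; the helper name `build_image_request` and the placeholder key are our own assumptions, so check the current API reference before relying on this.

```python
import json
import urllib.request

# Endpoint as we understand it from OpenAI's API reference; verify before use.
OPENAI_IMAGES_URL = "https://api.openai.com/v1/images/generations"


def build_image_request(prompt, api_key, n=1, size="1024x1024"):
    """Prepare a POST request for image generation (hypothetical helper)."""
    payload = {"prompt": prompt, "n": n, "size": size}
    return urllib.request.Request(
        OPENAI_IMAGES_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_image_request(
    "A photo of a happy panda sitting on a tree branch.",
    api_key="YOUR_API_KEY",  # placeholder, not a real key
)
# Sending is left out on purpose; with a valid key it would look like:
# with urllib.request.urlopen(request) as response:
#     result = json.load(response)
```

Keeping request construction separate from sending makes it easy to inspect the payload before spending any credits.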

In Version 1-AI Labs, here’s what we tried with DALL-E:

We experimented with various art styles and keywords in our prompts, including Digital art, Oil painting, Watercolour painting, Drawing, Illustration, Crayon drawing, Memphis art, Acrylic painting, Gouache, Pencil drawing, Street art, Anime art, Manga art style, Cartoon style, 2d, 3d, 4d, CGI, Cinema4d, Colourful, Disney movies, Pixar style, Studio Ghibli, Lighting, Realistic, Photo, Animation, and more.

Points to consider while writing prompts to generate good-quality images:

  • DALL-E works best if the prompt is concise and specific.
  • DALL-E does not reliably render text requested in the prompt; images containing wording came out garbled, so this tool isn’t a good fit for branding.
  • Long and complicated prompts often produced unclear, poor-quality images.
  • Describing multiple characters in the prompt led to low-quality, distorted images, whereas sticking to a single character with specific details produced good results.
  • Mentioning specific objects in the prompt generated nice images.
    For example, “A photo of a happy panda sitting on a tree branch.”
  • Describing the scene or setting as specifically as possible worked well.
    For instance, “An intricate sandcastle on a tropical beach, palm trees, summer sunrise, photorealistic.”
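
The tips above can be turned into a quick pre-flight check. The sketch below is a heuristic of our own, not part of DALL-E: the function name `review_prompt`, the word limit, and the keyword lists are all assumptions chosen for illustration.

```python
import re

MAX_WORDS = 25  # arbitrary threshold for "concise"; tune to taste


def review_prompt(prompt):
    """Return a list of warnings based on the tips above (heuristic only)."""
    warnings = []
    if len(prompt.split()) > MAX_WORDS:
        warnings.append("Prompt is long; consider making it more concise.")
    # Words such as 'text' or 'logo' suggest lettering inside the image.
    if re.search(r"\b(?:text|caption|logo|slogan|saying)\b", prompt, re.IGNORECASE):
        warnings.append("Prompt asks for text in the image; results may be garbled.")
    # Several 'and'/'with' joins often mean multiple characters or subjects.
    if len(re.findall(r"\b(?:and|with)\b", prompt, re.IGNORECASE)) >= 3:
        warnings.append("Prompt may describe several subjects; try a single character.")
    return warnings


print(review_prompt("A photo of a happy panda sitting on a tree branch."))
print(review_prompt("A poster with the text SALE"))
```

The first prompt passes with no warnings, while the second is flagged because it asks for text in the image.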

In our tests, DreamStudio (Stability.ai), Bing Image Creator, and Midjourney all produced better image quality than the DALL-E model.

2) Bing Image Creator

Bing Image Creator is one application of DALL-E and a result of Microsoft’s partnership with OpenAI. It uses DALL-E 3 and is currently free to use.

Screengrab of Bing Image Creator interface

How to use Bing Image Creator?

  1. Sign up for a new Microsoft account or log into your existing Microsoft account.
  2. Visit https://www.bing.com/create, which opens the Image Creator in your browser.
  3. New users are granted 25 boosted generations. Additional boosts can be earned by trading in Microsoft rewards.
  4. Type any text description and click ‘Create’ to generate a set of images.

Features

  • Bing Image Creator uses DALL-E 3. It is available in Microsoft Edge and is very easy to use.
  • This tool leverages AI to generate realistic and diverse images from natural language descriptions.
  • It gives a prompt template to help in writing prompts.

Here’s what we tried with Bing Image Creator:

Adding more detail to the prompt, such as starting with an adjective, describing foreground and background details separately, and ending with a specific art style, produced good images.
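
The prompt structure described above can be sketched as a small helper. This is our own illustration, not a feature of Bing Image Creator; the function name `build_prompt` and its parameters are hypothetical.

```python
def build_prompt(adjective, subject, foreground="", background="", style=""):
    """Compose a prompt: adjective + subject, foreground, background, style."""
    parts = [f"{adjective} {subject}".strip()]
    if foreground:
        parts.append(f"{foreground} in the foreground")
    if background:
        parts.append(f"{background} in the background")
    if style:
        parts.append(style)
    return ", ".join(parts)


prompt = build_prompt(
    adjective="An intricate",
    subject="sandcastle on a tropical beach",
    foreground="seashells",
    background="palm trees at sunrise",
    style="photorealistic",
)
print(prompt)
```

Keeping each part in its own argument makes it easy to vary one keyword at a time between generations.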

Points to consider while writing prompts to generate good-quality images:

  • Bing Image Creator generated amazing images compared with the DALL-E API.
  • However, it generates distorted images when text is included, similar to the issues seen with DALL-E.
  • To get good-quality images, you may need to generate several times, varying the keywords in the prompt.
  • This tool generates amazing images when prompts are well structured.

3) Midjourney

I would say that Midjourney is currently the best image-generation AI tool. It produces the most realistic and visually appealing results.

Screengrab of Midjourney interface on Discord

How to use Midjourney?

  1. The only way to access Midjourney is through the Discord chat app.
  2. Sign up for Discord, then visit the Midjourney website and click “Join the Beta”. Accept Invite to gain access.
  3. Midjourney does not offer free credits; you’ll need to purchase a subscription plan.
  4. To generate images, simply enter ‘/imagine’ in the message box, followed by your prompt, and then press enter.
  5. Finally, you will get 4 variations of images as a result.
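
To make the `/imagine` step concrete, the sketch below assembles the text you would type into Discord. It only builds a string and does not talk to Midjourney. The helper name `imagine_command` is hypothetical, and the `--ar` (aspect ratio) and `--no` (exclude) parameters are included on the assumption that they match Midjourney's current parameter list, which the steps above do not cover.

```python
def imagine_command(prompt, aspect_ratio=None, exclude=None):
    """Build the text for a Midjourney /imagine message (string only)."""
    parts = ["/imagine prompt:", prompt.strip()]
    if aspect_ratio:
        # Assumed Midjourney parameter for aspect ratio, e.g. 16:9.
        parts.append(f"--ar {aspect_ratio}")
    if exclude:
        # Assumed Midjourney parameter for things to leave out.
        parts.append("--no " + ", ".join(exclude))
    return " ".join(parts)


command = imagine_command(
    "a lighthouse on a cliff at dusk, oil painting",
    aspect_ratio="16:9",
    exclude=["text"],
)
print(command)
```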

Features

  • This tool has an image upscaling option for every set of images.
  • Using the ‘Re-roll’ button will re-run the prompt and generate 4 new images.
  • Images can be zoomed out, and a custom zoom option is also available.
  • It allows you to add images as part of a prompt.
  • An official Midjourney API is not yet available. However, third-party APIs exist, or you could use the Discord API.

Here’s what we tried with Midjourney:

Points to consider while writing prompts to generate good-quality images:

  • The more specific and detailed the prompt you write, the better the results will be.
  • Adding adjectives and specifying foreground and background objects generates amazing images.
  • Midjourney offers a variety of settings and options that are handy for styling the images.
  • However, it struggles to generate images containing text, resulting in bad, unclear images.

4) DreamStudio (Stability.ai API)

Stability AI is the company behind the popular web application DreamStudio, and it has released an open-source version of the app. DreamStudio uses Stability AI’s state-of-the-art image generation model, Stable Diffusion, to generate images.

Screengrab of DreamStudio interface

How to use DreamStudio?

  1. To get started, visit DreamStudio’s website and sign up for an account.
  2. New users are granted 25 free credits. These credits are required to use the Stability API as well as DreamStudio.
  3. On the left side of the interface, enter prompts and adjust other controls as needed.
  4. Press ‘Dream’ to generate images.

Features

  • Offers various styles such as Photographic, Digital Art, Comic Book, etc.
  • Lets you enter a negative prompt and adjust width, height, generation steps, and seed values in the advanced settings.
  • Developers can access the Stability AI API at: https://dreamstudio.com/api/

Note: A negative prompt tells the system what you do not want to see in the generated image. For example, if you want an image of a butterfly, you can use negative prompts like “bird, fish, moth” to avoid confusion or unwanted elements.
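
For developers, a negative prompt might be expressed in a request body along the lines below. This is a sketch under assumptions: the `text_prompts` list with a negative `weight` is modelled on Stability's v1 REST text-to-image API as we recall it, and the helper name `build_generation_payload` and default values are our own. Confirm the field names against the current Stability API documentation.

```python
import json


def build_generation_payload(prompt, negative_prompt="", width=1024,
                             height=1024, steps=30, seed=0):
    """Build a text-to-image request body (hypothetical helper, not sent)."""
    text_prompts = [{"text": prompt, "weight": 1.0}]
    if negative_prompt:
        # Assumption: a negative weight marks content to steer away from.
        text_prompts.append({"text": negative_prompt, "weight": -1.0})
    return {
        "text_prompts": text_prompts,
        "width": width,
        "height": height,
        "steps": steps,
        "seed": seed,
    }


payload = build_generation_payload(
    "a butterfly on a flower, macro photo",
    negative_prompt="bird, fish, moth",
)
print(json.dumps(payload, indent=2))
```

The width, height, steps, and seed fields mirror the advanced settings exposed in the DreamStudio interface.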

Here’s what we tried with Stability AI API:

Points to consider while writing prompts to generate good-quality images:

  • The more specific and detailed your prompt is, the better the results.
  • DreamStudio may not produce images as impressive as Midjourney’s.
  • Like the other tools, it tends to struggle with prompts that ask for text in the image, resulting in poor images.

5) Adobe Firefly

Adobe Firefly is a generative AI tool developed by Adobe. This tool allows users to create images, transform text, play with colours, and much more using simple prompts in over 100 languages.

Screengrab of Adobe Firefly interface

How to use Adobe Firefly?

  1. Adobe Firefly is available as a web app at firefly.adobe.com.
  2. Go to the site and click on ‘Get Firefly free’ and then sign up.
  3. Write a text prompt, then click ‘Generate.’
  4. New users get 25 monthly generative credits with the free plan, and these credits reset each month.
  5. The consumption of generative credits depends on the generated output’s computational cost and the value of the generative AI feature used.

Features

  • Generative Fill: Use a brush to remove objects or paint in new ones from text descriptions.
  • Conversational editing: Edit the generated images by using prompts in the conversational editor.
  • Photos mashup: Combine more than 2 photos to generate a new one.
  • An image upload option is available for generating custom-designed images.
  • Offers various text effects, generative recolour, sketch-to-image, upscaling, and many other options.
  • An Adobe Firefly API is not yet available.

Here’s what we tried with Adobe Firefly:

Points to consider while writing prompts to generate good-quality images:

  • Use specific and well-structured prompts. Specifying art styles and adding adjectives gives good-quality images.
  • Text to template: It is possible to generate posters with wording (the wording remains intact) just by entering a prompt (e.g., birthday card). This tool can be used for branding.
  • Adobe Firefly is far better than the DALL-E API and Stability AI API in terms of performance.

Costs:

  • DALL-E: There are slight discounts available for other image-size variations. For more details on pricing, see Pricing (openai.com).
  • Midjourney: See Midjourney Subscription Plans for additional details.
  • DreamStudio: Credit usage scales according to the step count, pixel dimensions, and the compute required to generate the images. For more details check this link: Platform (stability.ai).
  • Adobe Firefly: The Premium plan offers more features, such as 100 monthly credits and no watermarks. It is billed monthly and can be cancelled anytime.

Conclusion

In the rapidly evolving landscape of AI-powered image generation tools, we’ve explored some of the most promising options available as of now. When it comes to selecting an image generation tool, the choice should be based on your unique requirements, budget constraints, and desired level of control.

When it comes to generating high-quality, realistic, and artistically pleasing images that align with the user’s descriptions, Midjourney stands out as the top choice. Based on my experience and evaluation, I recommend Midjourney unless you need an API.

Generating images that contain text is an issue with most of the tools, except for Adobe Firefly. So, if you want to make special posters with custom images and add your brand’s text, Adobe Firefly is a great choice.

As technology continues to advance, we can expect even more exciting developments in the field of AI-generated images, opening new horizons for creativity and innovation.

About the author

Vaishnavi R is a Junior Data Scientist at the Version 1 AI Labs.
