Text-to-Image AI on CML

Dennis Lee
3 min read · Oct 29, 2023


Eager to run your own DALL-E-style platform without the privacy, security, and cost concerns that come with hosted APIs? You can build a private Stable-Diffusion system instead. It responds well to detailed prompts and generates a wide range of high-quality digital images, from imaginative artworks to lifelike visuals. You can also explore diverse text-to-image models through the Stable-Diffusion web interface, courtesy of the open-source AI 🤗 community.

In this article, I’ll show how straightforward it is to deploy Stable-Diffusion on CML (Cloudera Machine Learning). Running CML on an on-premises K8s platform lets artists, designers, marketers, and architects put text-to-image AI to work. With K8s orchestration, CML hosts yet another exciting open-source innovation on the container platform.

It only takes a few simple steps to spin up the popular Stable-Diffusion system on the CML platform:

1. Create a CML project. Select the Python 3.9 kernel and add a GPU-enabled Runtime variant.

2. Start a new session. Select the appropriate Resource Profile.

3. In the CML session, open the >_Terminal Access. Run the following command to clone the git repository of Stable-Diffusion-Webui.

$ git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui

4. Install the required Python libraries as written in its README.md.

$ pip install -r stable-diffusion-webui/requirements.txt

5. Create a new Python file with the content below.

!cd stable-diffusion-webui && python launch.py --port=${CDSW_APP_PORT}

6. Click the ▶️ button to run the file, then click the Custom Web App button when it appears.

That’s it! Your private Text-to-Image platform is now up and running.
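The launch line in step 5 relies on the `!` shell escape of the session's interactive Python. As a rough alternative, the same start-up can be sketched in plain Python. This is only a sketch under assumptions: the helper names `build_launch_command` and `launch` are hypothetical, and it assumes the repository was cloned into `stable-diffusion-webui` under the project root, as in step 3.

```python
import os
import subprocess
import sys
from pathlib import Path

# Assumption: directory created by the git clone in step 3.
WEBUI_DIR = Path("stable-diffusion-webui")


def build_launch_command(port):
    """Return the argument list that starts the web UI on the given port."""
    return [sys.executable, "launch.py", f"--port={port}"]


def launch(webui_dir=WEBUI_DIR):
    """Start the web UI on the port CML assigns to the session (hypothetical helper)."""
    port = os.environ.get("CDSW_APP_PORT")
    if port is None:
        print("CDSW_APP_PORT is not set; is this running inside a CML session?")
        return None
    if not (webui_dir / "launch.py").is_file():
        print(f"launch.py not found under {webui_dir}; clone the repository first")
        return None
    # Blocks until the web UI process exits.
    return subprocess.run(build_launch_command(port), cwd=webui_dir, check=True)


if __name__ == "__main__":
    launch()
```

Outside a CML session, or before the repository is cloned, `launch()` just prints a hint and returns `None` instead of failing halfway.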

With a GPU, generating images from the input text takes less than half a minute, although the exact time depends heavily on the model, canvas size and tuning settings. Here are some fun images it produced, with the associated prompts. They were generated with two models, Realistic_Vision and DreamShaper.

Star Wars battle scene with futuristic robots with Singapore's Garden by the Bay in the background.
Ironman having breakfast on the moon.
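Prompts like these can also be submitted from code rather than typed into the web interface. The sketch below is not part of the setup above. It assumes the web UI was started with the additional `--api` flag, which exposes a `/sdapi/v1/txt2img` endpoint. The function names, the base URL, and the default width, height and steps values are illustrative assumptions, not settings used for the images in this article.

```python
import base64
import json
import urllib.request


def build_txt2img_payload(prompt, width=512, height=512, steps=20):
    """Assemble the JSON body for a txt2img request (illustrative defaults)."""
    return {"prompt": prompt, "width": width, "height": height, "steps": steps}


def decode_images(response_body):
    """Turn the base64 strings in the response's "images" list into raw bytes."""
    return [base64.b64decode(item) for item in response_body.get("images", [])]


def generate(base_url, prompt, **settings):
    """POST the prompt to the web UI and return the generated images as bytes."""
    payload = json.dumps(build_txt2img_payload(prompt, **settings)).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/sdapi/v1/txt2img",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return decode_images(json.load(response))
```

A call such as `generate("http://127.0.0.1:8100", "Ironman having breakfast on the moon.", steps=30)` would return a list of image bytes that can be written to `.png` files. The port here is a placeholder for whatever `CDSW_APP_PORT` holds in your session. A larger canvas or more steps lengthens the generation time, as noted above.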

Finally, I generated a lifelike image featuring the Cloudera text, using the text-to-image functionality together with an extension called ControlNet.

beautiful sea, 3d, realistic, 8k, high resolution

After adjusting a few parameters, it produced the following realistic masterpiece. Interrogating the image with the image-to-text feature describes it as “a picture of the word CLOUDERA written in water with waves around it and a rock formation in the middle, a digital rendering, environmental art” 👏🏽👏🏽👏🏽
