Example of generative AI being used as a craft material to create birds-of-paradise earrings (link to project)

Generative AI as Craft Material

Stefania · Published in Bits and Behavior · 8 min read · Feb 10, 2023


Imagine the Star Trek replicator exists, and you can use it to create or replicate any object. Would you rather print ready-made things, or get parts you can assemble, polish, and customize? Existing generative AI models are the closest approximation of the replicator we currently have, but their results are primarily digital and still require quite a bit of shepherding to materialize in the physical world. It can already be done, though, and we are seeing a proliferation of tools and improvements in this space.

3D objects generated with Autodesk’s CLIP-Forge, Sanghi et al., 2021, arxiv.org/pdf/2110.02624.pdf

How can we support people in using generative AI models for tinkering and making? What are some initial forays into physical crafting with generative AI that can inspire future directions? This article sheds light on how generative AI (in particular, diffusion models) works and how we might ideate, design, and make with it, and reflects on what this all means for creators.

The current state of the art

Recent applications of generative AI, such as DALL·E (1.5 million users) or Midjourney (4 million users), have taken the content-creation world by storm and stimulated our collective imagination to consider AI a new medium for artistic expression. Many of these applications use machine learning models that generate images from a text description, also called a prompt. These large image-generation models are trained on enormous amounts of data, allowing users with no graphics or design training to create high-quality images. While many of you have seen examples of AI-generated images or videos, you may wonder how this technology works and why it has become so popular.

How do diffusion models work?

The core architecture of diffusion models

Many generative AI applications use a diffusion model architecture under the hood. Diffusion models are a type of AI algorithm inspired by non-equilibrium thermodynamics. They add random noise to an input image and then learn to reconstruct a new, similar image from noise. As more noise is added to different samples of the original image (see x1 and x2 in the figure below), the image gets compressed into a low-dimensional representation (z), which is used to create a new image similar to the original one. The process of gradually adding noise is called the forward trajectory (or forward pass), and the process of progressively reconstructing a new image from noise is called the reverse trajectory.

The key insight is that a diffusion model gradually learns the probability distribution of the noise at each step of the reverse trajectory.
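To make the forward trajectory concrete, here is a minimal PyTorch sketch of how noise can be added to an image in closed form. It assumes a simple linear noise schedule; the number of steps, tensor shapes, and schedule values are illustrative, not the settings of any particular model.

```python
import torch

# Minimal sketch of the forward (noising) trajectory with a linear schedule.
# beta_t is the noise variance added at step t; alpha_bar_t is the cumulative
# product that lets us jump from the clean image x0 straight to step t.
T = 1000
betas = torch.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bars = torch.cumprod(alphas, dim=0)

def forward_noise(x0, t):
    """Sample x_t ~ q(x_t | x_0): mix the clean image with Gaussian noise."""
    noise = torch.randn_like(x0)
    return alpha_bars[t].sqrt() * x0 + (1 - alpha_bars[t]).sqrt() * noise

# x0 stands in for any image tensor, e.g. a 3x64x64 picture scaled to [-1, 1].
x0 = torch.rand(3, 64, 64) * 2 - 1
x_half_noised = forward_noise(x0, t=500)  # halfway along the forward trajectory
```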

Training a diffusion model for modeling a 2D Swiss roll. From Sohl-Dickstein et al., 2015, arxiv.org/abs/1503.03585v8.

Another way to think about this is that diffusion models work by destroying training data through the successive addition of noise and then learning to recover the data by reversing this noising process. After training, we can use the diffusion model to generate new, unique data simply by passing sampled noise through the learned denoising process.
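Continuing the sketch above, generation runs the learned denoising in reverse, from pure noise back to an image. This is a schematic DDPM-style loop, where `model` is a placeholder for a trained noise-prediction network and the schedule variables come from the previous snippet.

```python
import torch  # reuses T, betas, alphas, alpha_bars from the sketch above

@torch.no_grad()
def sample(model, shape=(3, 64, 64)):
    """Schematic reverse trajectory: start from noise, denoise step by step."""
    x = torch.randn(shape)                      # pure noise
    for t in reversed(range(T)):
        eps = model(x, t)                       # `model` is a placeholder trained denoiser
        alpha, alpha_bar = alphas[t], alpha_bars[t]
        # Remove the predicted noise, then re-inject a small amount of fresh noise.
        x = (x - (1 - alpha) / (1 - alpha_bar).sqrt() * eps) / alpha.sqrt()
        if t > 0:
            x = x + betas[t].sqrt() * torch.randn_like(x)
    return x                                    # a brand-new sample
```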

To guide the reconstruction trajectory, more recent implementations of diffusion models use text, semantic maps, or other images to condition which image is generated (reconstructed) from the space of all possible options with different probabilities, also known as the latent space (see figure below).

The architecture of unCLIP, from Ramesh et al., 2022, arxiv.org/abs/2204.06125.
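In practice, libraries such as Hugging Face's diffusers wrap this text conditioning for you. A rough sketch of text-conditioned generation follows; the checkpoint name and prompt are examples, and a GPU is assumed.

```python
import torch
from diffusers import StableDiffusionPipeline

# Load a pretrained text-conditioned diffusion model (example checkpoint).
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# The prompt conditions which region of the latent space gets decoded into pixels.
image = pipe("a pair of birds-of-paradise earrings, studio lighting").images[0]
image.save("earrings.png")
```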

As mentioned above, diffusion models have exploded in popularity because they produce state-of-the-art image quality and enable people to create photorealistic images that never existed before, such as hybrid creatures, intricate architectures, new materials, and unique artifacts.

Right: Image from Shai Noy, prompt: “A beautiful dress carved out of dead wood with lichen and mushrooms, on a mannequin. High quality, high resolution, studio lighting”, generated with Imagen; Left: Image from Oren Levantar (tomato house) generated with DALL-E.

What can I make with Generative AI?

You can use generative AI to create images, text, music, games, avatars, user interfaces, or videos. Here are a few platforms that have gained popularity:

Examples of popular Hugging Face Spaces
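Many of these models can also be tried without running anything locally, for example through Hugging Face's hosted inference API. A small sketch, where the model id is just an example of a publicly hosted checkpoint:

```python
from huggingface_hub import InferenceClient

# Call a hosted text-to-image model instead of running one locally.
client = InferenceClient(model="stabilityai/stable-diffusion-2-1")
image = client.text_to_image("a steampunk robot beetle in a garden, studio photo")
image.save("robot_beetle.png")  # the client returns a PIL image
```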

Prompts as a craft material

Most generative AI models use text as input, creating unique opportunities for creators and designers to iterate on their ideas quickly or to collaborate with others. As a result, many communities of practitioners have emerged around these technologies, with people sharing images, prompts, or tricks for achieving specific effects or styles. For example, Midjourney has 4 million users on its Discord, and the company recently shared that people are using the platform for both fun and professional projects (source). There are even secondary markets, such as PromptBase, where creators sell their successful prompts.

For example, “Disney Pixar style Old steampunk cute robot beetle, garden goddess, trending on artstation, sharp focus, studio photo, intricate details, highly detailed, by greg rutkowski” is a prompt referencing the style of artist Greg Rutkowski, shared on the PlaygroundAI platform (link to post and image). Other creators buy such prompts or remix freely shared ones, hoping to achieve the same effects.

An AI-generated image shared with the prompt that was used to generate it, referencing the styles of Pixar and artist Greg Rutkowski, on the Playground AI community.

Sharing prompts alongside the generated artifacts could be seen as similar to the web's "View Source" feature, and could similarly catalyze visual design. Platforms such as Playground AI support straightforward iteration and remixing by letting users share images with all the metadata required to reproduce them (prompt, model id), as shown in the figure above. These features make prompt-based image generation even more accessible and more craftable. Moreover, many generative AI features are becoming available directly in design tools such as Photoshop or Figma, enabling designers to integrate them into their workflows.
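A lightweight way to make your own generations remixable is to record that same metadata alongside each image. This sketch assumes the diffusers pipeline from earlier and a GPU; the field names and file names are my own choices, not a standard format.

```python
import json
import torch
from diffusers import StableDiffusionPipeline

model_id = "runwayml/stable-diffusion-v1-5"          # example checkpoint
pipe = StableDiffusionPipeline.from_pretrained(model_id).to("cuda")

prompt = "Disney Pixar style old steampunk cute robot beetle, studio photo"
seed = 1234                                          # fixing the seed makes the result reproducible
generator = torch.Generator("cuda").manual_seed(seed)

image = pipe(prompt, generator=generator).images[0]
image.save("beetle.png")

# Save everything someone else needs to reproduce or remix this exact image.
with open("beetle.json", "w") as f:
    json.dump({"prompt": prompt, "model_id": model_id, "seed": seed}, f)
```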

Physical crafting

Several examples from the maker communities show that generative AI is starting to be integrated into fabrication and crafting projects, primarily for ideation or generative design.

Ideation

Several makers use generative AI for ideation. For example, they use Midjourney to generate concept boards starting from an object or a concept they like (e.g., shell earrings, Birds of Paradise fashion, a rambutan dress). Then they select an intriguing initial composition and use the AI model to generate many revised iterations based on the original image. With each one, the AI learns more about their end goal, and sometimes suggests its own quirky take on the initial prompt along the way. Many makers run the models' Upscale and Remaster features several times to get a very polished composition before moving on to fabrication. Once they achieve a design they like, they either build a 3D model in CAD tools or use the successful prompt to directly generate 3D renderings in CLIP-Forge or other text-to-3D diffusion models.

Example of Rambutan-inspired dress project (link to project)

Another example of ideation is makers using diffusion models to generate art or drawings that a robot could draw. The tricky part in these projects is picking drawing styles that a machine can successfully paint. The tool of choice for this step is Grasshopper, which generates a topographical model of the input image based on its light/dark values. Lighter areas cause the robot to lift itself upward, away from the page, so makers need to consider which areas they want the robot to draw. Adobe Creative Cloud's Illustrator and Photoshop are often used to adjust the outcomes.
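Outside of Grasshopper, the same light/dark-to-height mapping can be prototyped in a few lines. This is a simplified sketch assuming Pillow and NumPy; the lift range and threshold are arbitrary, and the file name is a placeholder.

```python
import numpy as np
from PIL import Image

# Map image brightness to a pen-height field: dark pixels press the pen down,
# light pixels lift it away from the page.
img = Image.open("drawing_source.png").convert("L")       # grayscale input
brightness = np.asarray(img, dtype=np.float32) / 255.0    # 0 = dark, 1 = light

MAX_LIFT_MM = 5.0                                         # arbitrary lift range
pen_height = brightness * MAX_LIFT_MM

# Treat anything lifted above ~80% of the range as "do not draw here".
draw_mask = pen_height < 0.8 * MAX_LIFT_MM
```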

Generative Design

Makers also use generative AI when they want to quickly explore a design space or various form factors for the same object type. Suppose you want to build a table: you could use CLIP-Forge to generate various 3D table models. Once you pick a table model you like, you could go further and inpaint a section of the table to generate various design options for the legs or the top (example project).

Demo of the inpaint feature in CAD, where designers can select a section of the table and generate various design options for it (example project).
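The same inpainting idea is available today for 2D images. A rough sketch with diffusers' inpainting pipeline follows; the checkpoint, prompt, and file names are examples, and a GPU is assumed. You select a region with a mask and regenerate only that part while the rest stays fixed.

```python
from diffusers import StableDiffusionInpaintPipeline
from PIL import Image

# Example inpainting checkpoint; file names are placeholders.
pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting"
).to("cuda")

table = Image.open("table_render.png")    # the design to keep
mask = Image.open("legs_mask.png")        # white where new legs should be generated

# Only the masked region is re-imagined; the rest of the image stays fixed.
variant = pipe(
    prompt="ornate carved wooden table legs",
    image=table,
    mask_image=mask,
).images[0]
variant.save("table_variant.png")
```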

Many of the text-to-3D rendering models allow for mesh exports. DreamFusion, the newest model in this category, adds further optimization strategies to improve geometry, so the final rendered models have high-quality normals, surface geometry, and depth, and can easily be exported to CAD for 3D printing.
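Once a mesh has been exported, getting it into a printable format can be as simple as a format conversion. A sketch assuming the trimesh library and a placeholder OBJ file:

```python
import trimesh

# Load the mesh exported from a text-to-3D model and convert it for printing.
mesh = trimesh.load("generated_table.obj")   # placeholder file name
mesh.fill_holes()                            # patch small gaps before slicing
mesh.export("generated_table.stl")           # STL is what most slicers expect
```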

What does this mean for creators?

Image generated with Microsoft Designer, Prompt: “a piano playing music by itself, in the distance, professional, head-and-shoulders view, studio lighting, like art deco, neon, in the style of skeuomorphism, intricately designed”

While generative AI models allow anyone to express themselves with images, videos, music, or 3D models, they have been met with mixed reactions in creator communities. When an image generated by AI won an art competition, the artist community reacted strongly against allowing such submissions.

Art historians argue that generative models like DALL·E do not themselves create art; rather, the artists and technologists who apply them as tools are the ones creating art. Art communities such as DeviantArt banned the use of generative models for artifacts posted on their platforms. However, design firms such as IDEO confirmed that they currently use generative AI in their practice to generate more inclusive personas or unique concept boards.

I think the imagery we see emerging in existing communities such as Midjourney's really calls on us to revisit the famous quote from Alan Kay:

“the music is not in the piano”

Maybe we need alternative metaphors. Rather than thinking of these models as paintbrushes or musical instruments, perhaps we can think of them as opinionated design partners that sometimes inspire us to take our creative process in surprising and whimsical directions.

This post is a draft of a longer piece published in the latest issue of Make: Magazine. See the full story here: https://makezine.com/article/craft/fine-art/generative-ai-for-makers-ai-has-truly-arrived-and-its-here-to-help-you-make-and-craft/


Stefania, Bits and Behavior. Ph.D. Residency in AI/ML: Coding & Program Synthesis @Theteamatx, dissertating @UW, alum @mit @msft. https://stefania11.github.io/