What DALL-E 2 Can and Cannot Do

Yashashree Patel
AI Skunks

I started exploring DALL-E 2 a week ago and have since spent countless hours testing its features and generating various artworks. My goal was to assess its performance across different domains and produce impressive visual pieces.

As a result, I’ve compiled a list of my observations on DALL-E, accompanied by relevant examples. I used ChatGPT to write the prompts for those examples.

If you have any specific requests for artwork based on certain prompts or scenes, feel free to share them in the comments section.

What DALL-E 2 excels at

Fashion Design

DALL-E’s expertise extends beyond just creating stunning visuals. It possesses a keen understanding of various clothing styles, including those designed for women. With the help of stylistic direction, DALL-E can generate some truly innovative and beautiful outfits. Interestingly, it appears to excel in producing highly realistic-looking wedding dresses, which may be due to the consistency of the aesthetic of such dresses and the high-quality images available online.

“create a ball gown wedding dress design that embodies grace and sophistication. The dress should feature intricate lace detailing and a full skirt that flows elegantly with every step. The bodice should be form-fitting and accentuate the bride’s curves, with a sweetheart neckline and off-the-shoulder sleeves. The color palette should be soft and romantic, with shades of blush, ivory, and champagne”

Wildlife Photography

DALL-E’s ability to create complex scenes is truly impressive, often resulting in images that are indistinguishable from real-life photographs. I have found myself easily mistaking its creations for actual images while browsing online platforms like Tumblr.

1. “Generate a photograph of a group of giraffes grazing on acacia trees in the African savanna, with a stunning orange and pink sunset in the sky above them” | 2. “Create an image of a majestic elephant standing in a clearing surrounded by lush greenery, with the sun setting in the background casting a warm glow over the scene”

Food Plating

There’s no denying DALL-E’s ability to create beautifully presented dishes that would fit right in at a fancy restaurant. The plating style is spot-on and impressive.

“Create an image of a beautifully plated dessert that incorporates elements of nature, such as leaves and flowers, and features a unique, eye-catching centerpiece made of chocolate or fruit”

Jewelry Design

The close-up photograph created by DALL-E 2 captures the textures and details of the jewelry with exceptional precision, highlighting the exquisite craftsmanship and beauty of the piece.

“Generate a high-resolution image of a silver statement earrings with amethysts and an amber pendant, photographed in extreme close-up to showcase the intricate details and textures of the jewelry. Create a stunning and lifelike representation of this beautiful piece”

Pop Culture

DALL-E has an impressive ability to identify and comprehend a vast assortment of pop culture references, particularly for visual media and literary works that have been adapted into films. Moreover, it is capable of converting recognized references into a diverse range of art styles with remarkable flexibility.
I asked ChatGPT to generate prompts for DALL-E 2 that transform pop culture characters into different art styles.

“Transform Iron Man into a retro movie poster, with vibrant colors, bold typography, and a dynamic illustration of him in his iconic armor”
“Reimagine Darth Vader as a stained glass window, with vibrant colors and intricate details that highlight his menacing presence”
“Imagine Wonder Woman as a classical sculpture, with intricate details of her armor, flowing hair, and confident stance captured in pristine white marble”
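
The three prompts above share a pattern: a character, a target medium, and a few style details. As a minimal sketch of that pattern, the snippet below fills a reusable template for every character and medium pairing. The function name `build_style_prompt` and the template wording are assumptions made for illustration, not the prompts ChatGPT actually produced, and nothing here calls DALL-E itself.

```python
from itertools import product

# Hypothetical template; the wording is an assumption, not ChatGPT's output.
TEMPLATE = "Reimagine {character} as {medium}, with {details}"


def build_style_prompt(character: str, medium: str, details: str) -> str:
    """Fill the template with one character, one medium, and style details."""
    return TEMPLATE.format(character=character, medium=medium, details=details)


characters = ["Iron Man", "Darth Vader", "Wonder Woman"]

# Each target medium is paired with the style details that suit it.
media = {
    "a retro movie poster": "vibrant colors and bold typography",
    "a stained glass window": "vibrant colors and intricate details",
    "a classical sculpture": "pristine white marble and a confident stance",
}

# Pair every character with every medium: 3 x 3 = 9 prompts.
prompts = [
    build_style_prompt(character, medium, details)
    for character, (medium, details) in product(characters, media.items())
]

for prompt in prompts:
    print(prompt)
```

Each printed line can be pasted into DALL-E 2 as-is, which makes it easy to compare how the same character fares across several art styles.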

Art Style Transfer

DALL-E is capable of producing exceptionally high-quality results that are characterized by specific artistic styles. It has the ability to generate charcoal or pencil sketches, imitate the signature painting styles of renowned artists, and even create unconventional pieces such as “medieval illuminated manuscripts”.

1. “A dragon into a character in an illuminated manuscript, with intricate details of its scales, wings, and fiery breath captured in bright colors and bold lines” | 2. “A wizard into a character in a medieval tapestry, riding a giant owl and wielding a staff that glows with magical energy”
1. “Transform Gustav Klimt’s ‘The Kiss’ into a mosaic artwork, with intricate details of the figures’ clothing and hair captured in a shimmering, jewel-like design” | 2. “Reinterpret Edvard Munch’s ‘The Scream’ as a digital illustration, with vibrant colors and bold lines that give a modern and dynamic twist to the haunting image”

Digital Art

DALL-E has the ability to create stunning, imaginative art pieces that exude a fantastical aesthetic, provided that the appropriate prompts are given and some careful selection is made. A few exemplary pieces include:

“Generate an ethereal digital artwork of a celestial being, with delicate, glowing wings and a serene expression that exudes tranquility and grace.”
“Create a photorealistic digital artwork of a dragonfly perched on a lily pad, with the shimmering reflection of the water’s surface and the intricate details of the dragonfly’s wings captured in stunning detail”

The outcomes produced by DALL-E for song lyrics can be unpredictable and vary in quality, but with some persistence and experimentation, it can yield remarkable and amusing representations of poetry and abstract ideas. The versatility of the tool is what makes it enjoyable for me, as it is impossible to anticipate the direction it will take when given a prompt.

1. “The song ‘Purple Rain’ into a colorful, psychedelic digital artwork, with bold hues and swirling patterns that capture the song’s emotional and expressive themes” | 2. “A minimalist digital artwork inspired by the song ‘Somewhere Over the Rainbow’, with a simple, clean design that captures the song’s hopeful and optimistic message”

Future Commercials

I personally have a strong affinity for the artistic output of DALL-E when given the instruction to create works in the style of surrealism, particularly in its surrealistic interpretations of commercials or advertisements. I am so enamored by the artwork that if my online ads were entirely replaced by DALL-E’s surrealistic art pieces, I am confident that I would click on at least half of them.

“Digital artwork for an advertisement promoting a futuristic self-driving car, with a vibrant, neon color palette and unexpected juxtapositions of objects and scenes that convey the car’s innovative and unconventional features”
“Produce a surrealistic digital artwork for an advertisement promoting a high-end, luxury watch brand, with an otherworldly, fantastical vibe and unexpected visual elements that convey the watch’s unparalleled elegance and sophistication”

Fiction Art

“A digital artwork in the style of a book cover for a novel set in a dystopian future, with a dark, moody color palette and surreal, futuristic imagery that captures the novel’s themes of power, control, and rebellion”

What DALL-E 2 flops at

Prompts with 2 characters

While DALL-E can effectively generate images with specific traits for a single character, it may encounter difficulties if the requested trait is not commonly found in images. For instance, if the requested trait is something like “a person with a giant nose,” DALL-E may struggle to produce a realistic image. Additionally, although DALL-E can generate images of multiple generic people in a crowd, it may face challenges when it comes to accurately depicting their facial features. Moreover, distinguishing which traits are meant for a specific character, such as Character A or Character B, can be a challenging task for DALL-E beyond basic distinctions like “a man and a woman.”

“Create a digital artwork of a young blonde girl with pigtails resting in a hammock, while an elderly man with a white beard sits in a rocking chair beside her, reading a book, with a lush forest in the background and soft dappled light streaming through the trees”

Here, DALL-E loses track of which combination of age, gender, and book belongs in which location.

Even when I provide pop culture references for two characters that DALL-E is already familiar with, such as Wonder Woman and Superman, the model struggles to keep them apart. DALL-E seems to recognize that the prompt contains two distinct characters, each with their own set of traits. However, when it generates the image, the traits appear to be reassigned between the characters at random, resulting in a blend that does not accurately depict either one. This behavior suggests that DALL-E may treat the prompt as a list of characters and a separate list of traits, rather than binding each trait to a specific character.

“Produce a digital artwork of Wonder Woman and Superman standing side by side, with Wonder Woman holding her sword and shield, and Superman flexing his muscles and preparing to fly, with a dramatic cityscape in the background”
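
One way to picture the trait-mixing hypothesis is with a toy model. The sketch below is purely illustrative and says nothing about how DALL-E actually works internally: it contrasts a prompt whose traits stay bound to each character with a flattened version where the same traits go into one shared pool and are dealt back out at random. The names `bound`, `flatten`, and `deal_randomly` are hypothetical.

```python
import random

# Traits bound to each character, as the prompt intends.
bound = {
    "Wonder Woman": ["sword", "shield"],
    "Superman": ["flexing muscles", "preparing to fly"],
}


def flatten(scene):
    """Split a scene into a character list and one shared pool of traits."""
    characters = list(scene)
    pool = [trait for traits in scene.values() for trait in traits]
    return characters, pool


def deal_randomly(characters, pool, rng):
    """Deal the pooled traits back out with no memory of who owned them."""
    shuffled = pool[:]
    rng.shuffle(shuffled)
    per_character = len(shuffled) // len(characters)
    return {
        name: shuffled[i * per_character:(i + 1) * per_character]
        for i, name in enumerate(characters)
    }


rng = random.Random(0)  # fixed seed so the run is repeatable
characters, pool = flatten(bound)
mixed = deal_randomly(characters, pool, rng)

print("intended:", bound)
print("mixed:   ", mixed)
```

In this toy version every trait still shows up somewhere in the output, but it may land on the wrong character, which resembles the blended Wonder Woman and Superman images described above.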

Spelling

DALL-E’s spelling is notoriously bad: it rarely gets words right, even by sheer chance. The model does manage to produce recognizable English letters, but its letter order is often closer to a random draw from a bag of Scrabble letters than to actual English spelling. While scaling up the model may eventually lead to better spelling, DALL-E 2 currently cannot spell consistently.

DALL-E’s ability to generate images based on text prompts is incredibly impressive. Its object recognition and understanding of context are very strong, allowing it to consistently include requested objects and place them in reasonable relation to each other. However, the model’s difficulty tracking two different characters, or two non-person objects of the same type, reveals some conceptual limitations. Additionally, its tendency to apply color and texture styling to unintended areas and its struggles with the “edit” and “variations” modes also suggest that the model may have difficulty tracking a set of objects-with-assigned-traits.

Despite these limitations, working with DALL-E feels like communicating with an alien entity that doesn’t quite reason in the same ontology as humans. The model seems to easily understand complex and nuanced prompts, but struggles with simpler concepts that a human child could understand. Overall, DALL-E’s impressive abilities highlight the immense potential of AI and machine learning, but also raise important questions about how we communicate with and understand these systems.

Finally, I would like to encourage anyone with specific requests for artwork prompts to share them in the comments section. Thank you for your time and interest!
