Stability.ai released the Stable Diffusion 2.0 model last weekend, the biggest update since Stable Diffusion 1.4 launched in August. However, the new release caused controversy in the AI art community. Users complained that the distorted anatomy and weird fuzzy textures in its outputs look more like a downgrade than an upgrade; a mess, especially when compared to the eye-pleasing, instantly satisfying results from Midjourney v4.
There is even a funny "conspiracy" theory that Emad and the SD team released a very basic open 2.0 model before updating DreamStudio because they are keeping a better art hypernetwork/model set for their paid service and API sales, one that won't be publicly available in the short term, so the community has to develop its own ways to finetune SD2 for the good stuff for free, hmmm 😅😅
I totally understand the community's frustration and disappointment with first impressions of SD2; I felt it too. Not many of my favorite old prompts survived. But after abandoning the old habits and experimenting with a few sets of new prompts, my confidence and enthusiasm for SD2 were restored. I found many highlights and advantages in this wild newborn model. Not only is it not bad, it is genuinely better. (Cherry-pick warning.)
The prompts and seeds are attached in the image captions; feel free to regenerate them or explore further. Using the same seed should reproduce the results in Diffusers with the 2.0 model, but may not produce the same output in DreamStudio.
CFG scale: 9, Steps: 25, Size: 768 × 1024 or 768 × 960.
All outcomes are pure prompt generation: no init image, no post-editing, and no negative prompt (which might have helped).
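If you want to reproduce these settings locally, here is a minimal sketch using the Diffusers library, assuming the stabilityai/stable-diffusion-2 checkpoint and a CUDA GPU; the prompt and seed below are placeholders for the ones attached to each caption.

```python
# Minimal sketch: reproducing the settings above with Diffusers.
# The prompt and seed are placeholders for the values in the captions.
import torch
from diffusers import StableDiffusionPipeline, EulerDiscreteScheduler

model_id = "stabilityai/stable-diffusion-2"
pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
pipe.scheduler = EulerDiscreteScheduler.from_config(pipe.scheduler.config)
pipe = pipe.to("cuda")

generator = torch.Generator("cuda").manual_seed(42)  # same seed -> same image
image = pipe(
    "transparent crystals on the sea surface at sunset",  # placeholder prompt
    guidance_scale=9,        # CFG scale used throughout this post
    num_inference_steps=25,  # 25 steps instead of the old default of 50
    height=1024,
    width=768,
    generator=generator,
).images[0]
image.save("crystal.png")
```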
The biggest improvements in SD 2.0 are the higher default resolution (from 512 to 768 px), good results in fewer steps (from 50 down to 25, though not faster overall), and a significant gain in image quality and detail richness. In particular, its handling of complex light and shadow, of diffuse and ambient reflections across different surfaces and materials, and its more natural-looking depth of field and perspective exceed every model currently available, I think.
As the three outcomes below of transparent crystals on the sea surface show, the orange sunset light forms beautiful reflections and refractions on the waves and on the surface and interior of the crystal; it behaves differently on the highly transparent crystal body than on the translucent ice cube, and it even produces accurate spherical distortion in the crystal ball's reflections.
The next experiment is the generation of underwater scenes. Rendering underwater scenes and simulating water is a notoriously hard problem in the CG industry, and I was surprised by how well AI generation handles it. Leaving aside the complex lighting and reflection effects through the waves, the horse prancing underwater even shows the effect of buoyancy.
Maybe these outcomes are a bit too HDR, but that is easy to tone down: desaturate in post-processing, remove some modifiers, or use a negative prompt.
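For taming the look at generation time instead of in post, Diffusers accepts a negative_prompt argument. A quick sketch reusing the pipeline from the snippet above; the negative terms here are my own guess, not anything from the original captions:

```python
# Illustrative only: using a negative prompt to pull back the HDR look.
# "pipe" is the pipeline set up in the earlier snippet; the negative
# terms are an assumption, not taken from the post's actual prompts.
image = pipe(
    "wild horse prancing underwater, sun rays through the waves",  # placeholder
    negative_prompt="oversaturated, HDR, high contrast",
    guidance_scale=9,
    num_inference_steps=25,
    height=960,
    width=768,
    generator=torch.Generator("cuda").manual_seed(42),
).images[0]
```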
The fineness and expressive detail of textures from various natural materials are stunning in SD2 too. I tried sand, snow fields, breaking waves, and sea foam.
The change brought by SD2's new OpenCLIP text encoder requires new prompt engineering. Prompts that are too short or too long clearly do not work well in SD2. You cannot get nice results from a few words like "Fire fox chibi" the way you can in Midjourney v4, nor can you stack modifiers and reference artists as before hoping to win the AI art lottery.
You can also stop using "artstation", "500px", and other "prayers to the AI god"; in my tests, adding or omitting them has no effect on the results.
From my experiments, SD2 responds to modifier words more sensitively and accurately than previous versions. This offers higher controllability for refinement and makes targeted prompt design more feasible; we are out of the era of blindfolded alchemy. It is certainly a gift for players who like a challenge.
Here are the four steps of my experiment with generating a liquid dark metal texture.
The first one looks like painted glossy acrylic molding paste under strong lighting, not quite what I expected.
In the second one, I added the modifier "flowing, Ribbon-like shine"; the result is a bit too silky.
In the last two, I added "Solidified lava", which gets much closer to the texture I wanted.
As shown, SD2 responds quite sensitively to these three modifier additions; the changes are obvious.
P.S. I didn't stack a bunch of 3D-engine words in the prompt.
The following three images are my progressive optimization of a black-and-white sand dune photograph, using the same seed. The first try produced a composition I wanted to keep, but the contrast of the sand waves was too harsh, so I added "perfect brightness and contrast balance", and it worked as expected (image 2). But the curves of the sand waves jittered too much, so I added "Extremely artistic curve" (image 3).
Maybe a single experiment is just a lucky coincidence. But it did show me the possibility of fine-grained editing through prompt engineering.
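This workflow is easy to script: keep the seed fixed and append one modifier at a time, so any change in the output is attributable to the new words. A sketch reusing the same pipeline, with placeholder prompt text standing in for my captions:

```python
# Sketch of fixed-seed prompt iteration: the seed stays constant, so
# differences between outputs come only from the added modifiers.
# The prompt text is a placeholder, not the exact caption prompts.
base = "black and white photograph of sand dunes, fine art"
variants = [
    base,
    base + ", perfect brightness and contrast balance",
    base + ", perfect brightness and contrast balance, Extremely artistic curve",
]

for i, prompt in enumerate(variants, start=1):
    generator = torch.Generator("cuda").manual_seed(42)  # same seed every run
    image = pipe(
        prompt,
        guidance_scale=9,
        num_inference_steps=25,
        height=1024,
        width=768,
        generator=generator,
    ).images[0]
    image.save(f"dunes_v{i}.png")
```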
The next experiment is about responsiveness to different artists' styles: six paintings of the same iceberg-landscape subject, with only the artist changed.
Michael Whelan: sci-fi and fantasy artist with a style of vivid colors and elegant, simple compositions.
Bruce Pennington: sci-fi illustrator with a retro style largely characterized by bold, daring colors.
Chesley Bonestell: illustrator renowned for alien landscapes and space subjects, with bold brushstrokes and a highly saturated palette.
Andreas Rocha: digital artist in game and concept design, with a more modern, lighter, and spirited style (I really like using him).
SD2 also shows high responsiveness to an artist's style. The improved predictability makes selecting reference artists and designing prompts purposefully more feasible. It also makes me feel there is no longer any need to use more than three artists in SD2.
Color schemes tested here: pen portraits in the style of Kaethe Butcher. I tried red & blue, yellow & blue, and teal & burnt sienna; the results are unexpectedly precise and artistic.
The facial anatomy in the generated outcomes is accurate: no more doubled faces on tall canvases. I picked the four below from fewer than 20 generations.
An experiment with dry and wet painting media: oil vs. watercolor. The brushstroke and edge rendering characteristic of these two mediums are stunning, as is the simulation of the canvas/paper surface. I really like the depiction of the transparent glass vessels and the copperware.
In the oil painting, the lemon skin shows a convincing cracked texture. In the watercolor (image 4), the simulation of both wet and dry brush techniques is quite something.
I painted watercolors for years, and even I find it hard to tell whether the one below is a scan of an original or AI-generated.
This group also compares the two classic fine-art mediums, oil vs. watercolor, on a landscape theme in the style of Andreas Rocha (although he is a digital painting master).
Another controversy after the release of SD2 is that the community found celebrity faces had been removed from its training set. Portraits generated with celebrity names as keywords are no longer recognizable (yes, you probably won't get Emma Watson or Gal Gadot with a mermaid tail in 2.0, though Obama still seems to work).
But I guess a custom finetune is easy for anyone who needs this feature. For a foundational open model, I personally agree with the SD team's approach of weighing ethical issues and excluding controversial data, and the earlier the better.
I have little interest in regenerating celebrities, but I was happy to try four famous faces from art history in my favorite printmaking style. Surely you can guess who they are.
Printmaking is one of my favorite art styles, but I feel woodcut and etching prints were never done well enough in previous models; they came out looking like mixed pen-and-charcoal drawings or vector illustrations. Printmaking needs a higher abstraction of shape and line, and it is harder to strike an aesthetic balance of color and of light and dark.
dan mumford + aaron horkey is a great combination for printmaking.
“dan mumford + aaron horkey + james jean” is perfect for textured patterns generation.
Tips: Don't add "oil on canvas" to an SD2 prompt; it tends to produce a photo of a framed canvas rather than the painting itself. Use "fine-art oil painting of" instead.
Likewise, do not use "woodcut / etching print of xxx" for printmaking; you tend to get a photo of a physical print. Replace it with "fine-art woodcut printmaking of".
The following are two groups of retro-style science-fiction posters.
Alien landscapes + time travel machines.
Alien landscapes + black obelisk, with awesome atmosphere and well-coordinated light & shadow.
A winter forest in printmaking style. Vivid-color modifiers randomly produce different color schemes. Living in Canada, I find these scenes of late-winter woods at twilight really atmospheric.
I will continue to experiment with the SD 2.0 depth2img and inpainting models, and with custom finetuning, in the coming days, and share more of my experience with you.
If AI generative models are to become specialized tools for serious applications, three functions need to mature: professional guidance of color scheme and composition from sketches (init images), delicate and precise editing for creative iteration, and more customizable fine-tuning at lower cost.
Let me end with a monumental Ansel Adams-style Moonrise Grand Canyon photo; thanks for watching. This was the first successful outcome I got from SD 2.0.
I will keep updating my Twitter thread as I find more tips or awesome results.
Plus, I've just built and launched an app designed specifically for AI artists and enthusiasts: KALOS.art, a simple and powerful way to create a remarkable portfolio gallery, show and sell your digital artworks, prompts, or stories, earn tips (as "power charging"), and obtain digital certificates for free.
You can purchase my works with a commercial license from my KALOS page, or just give me a sweet like 😘.