Stable Diffusion (Prompting and Beyond) p.2

Wassimgouia
9 min read · Aug 16, 2023


Prompting is what separates a professional from an amateur in Stable Diffusion, so in this part I will cover a reliable way to structure a prompt.

It goes like this:

[Style Of Photo] photo of a [Subject], [Important Features], [More Details], [Pose Or Action], [Framing], [Setting/Background], [Lighting], [Camera Angle], [Camera Properties], in style of [Photographer]
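As a minimal sketch of how this template can be filled in, the snippet below joins the fields into one comma-separated prompt. The function name `build_prompt` and its parameter names are hypothetical and only mirror the bracketed slots above; they are not part of Stable Diffusion itself.

```python
# Hypothetical helper: assembles a prompt from the template slots.
def build_prompt(style, subject, features="", details="", pose="", framing="",
                 setting="", lighting="", camera_angle="", camera="",
                 photographer=""):
    # The first slot pair always reads "<style> photo of a <subject>".
    parts = [f"{style} photo of a {subject}", features, details, pose,
             framing, setting, lighting, camera_angle, camera]
    if photographer:
        parts.append(f"in style of {photographer}")
    # Skip empty slots so the prompt has no stray commas.
    return ", ".join(p.strip() for p in parts if p and p.strip())


prompt = build_prompt(
    style="candid",
    subject="middle-aged man",
    features="wearing a hoodie",
    details="looking anxious",
    pose="leaning against a wall",
    framing="upper body",
    setting="misty forest",
    lighting="cinematic lighting",
    camera_angle="low angle",
    camera="shot on Lumix GH5",
    photographer="Yousuf Karsh",
)
print(prompt)
```

Any slot can be left empty; the helper simply leaves it out of the final prompt.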

[Style Of Photo] :

Abstract:

Abstract photography involves creatively capturing imaginative non-representational subjects, utilizing visual components such as color and texture. To produce compelling abstract images in Stable Diffusion, give higher importance to the label (abstract:1.4).

Analog:

Analog photography employs film and conventional development methods to attain a timeless vintage aesthetic characterized by grain, fading, discoloration, light leaks, and textured flaws that are not present in polished digital pictures. It is a good general-purpose choice for a vintage look.

Beauty:

Beauty photography is the art of capturing sophisticated, stylized portraits of models, aimed at accentuating and celebrating cosmetics, allure, and visually pleasing subjects in a romanticized manner. Combine it with a studio lighting cue for better results.

Candid:

Candid photography captures people behaving naturally in public environments, without their awareness, in order to seize genuine moments. Adding an action to your prompt and choosing a public setting makes this style much more effective.

Street Fashion Photography:

Street fashion photography captures unposed images of fashionable individuals in urban environments, showcasing their innovative street-style ensembles and designs in locations beyond traditional studios or fashion runways.

[Subject] :

The subject can be something like “Elderly Woman”, “Young Girl”, “Middle-Aged Man” or “9-Year-Old Boy”, etc.

You can also describe ethnicity and skin tone in the subject description, like this:

“African American”, “Light Skin” or “White Asian”, etc.

[Important Features] :

A feature is what makes the image unique. Let’s take some examples:

Hair: “Dread heads” “Curly short hair”

Clothing: “Wearing a hoodie” “wearing a sundress” “eyepatch”

Expression: “Smiling” or “Sad”

Accessories: “nose piercings” “watch”

[More Details] :

Use adjectives and short phrases to convey personality, like:

“Confident” “Laughing light-heartedly” “looking anxious”

[Pose Or Action] :

The Pose and Action outlined for the subject offer valuable guidance for molding the ultimately generated portrait.

Here’s an example :

“Standing with hands on hips” “leaning against a wall” “jumping in the air” “doing a high kick”

[Framing] :

Framing is how the view is composed.

Let’s take some examples:

Close-up on face:

Including the phrase “close up on face” within a portrait prompt narrows the composition to emphasize facial features, unveiling emotions, identity, and intricate particulars. This technique accentuates expressions and concentrates attention solely on the subject.

Full body:

Incorporating the term “full body” within the prompt encompasses the complete subject, effectively displaying movement, clothing, and surroundings. This instruction is ideal for dynamic postures, accentuating prominent outfits, or setting up a specific scene.

Upper body:

Upper body framing captures the subject from the chest or waist upwards, directing attention to the posture, expression, attire, and hand gestures, while also preserving a degree of surrounding context.

[Setting/Background] :

The environment described in the prompt significantly affects how realistic the generated images appear. When describing a background, offer relevant context without being too detailed.

Let’s take some prompt examples:

“Sunset beach” “Misty forest” “At the edge of a cliff, gazing over a misty valley during dawn.”

[Lighting] :

Lighting plays a crucial role in photography, as it affects the mood, atmosphere, and overall quality of the image. Different types of lighting can create various effects, such as highlighting certain features, creating shadows, or setting a specific tone.

Let’s take some examples:

Candle Light:

Candlelight generates a warm, intimate, flickering glow, casting gentle dancing shadows and highlights. You can prompt this effect by describing a scene lit only by candles.

Cinematic Lighting:

Cinematic lighting employs intense high-contrast keys, backlights, and rims to craft moody portrait lighting reminiscent of Hollywood style. To achieve this effect in Stable Diffusion, you can prompt it by specifying “cinematic lighting” along with strong side key lights or rim lighting.

God Rays:

God rays lighting forms striking beams of light that pass through particles in the air. You can prompt this effect by specifying “radiant god rays” or “light beams streaming through the haze.” This generates an atmospheric and mystical lighting ambiance.

Silhouette Lighting:

Silhouette lighting achieves its effect by positioning the subject in front of a strong backlight. You can prompt this by describing the subject as “silhouetted against the bright sky/window/light,” resulting in a striking darkened outline of the subject against the backdrop.

[Camera Angle] :

The camera angle in photos refers to the specific position from which a photograph is taken in relation to the subject.

Let’s take some examples:

From above / high angle:

When you specify a “high angle” or mention shooting “from above,” it means placing the camera above the subject and pointing it downward. This technique captures the subject from an elevated perspective, highlighting feelings of smallness, vulnerability, or isolation within the scene.

From below / low angle:

When you specify a “low angle” or indicate shooting “from below,” it means positioning the camera beneath the subject and pointing it upwards. This instruction prompts Stable Diffusion to capture the subject from a lower viewpoint, enhancing the perception of power, height, and dominance within the scene.

[Camera Properties] :

Camera properties in photos refer to the various settings and features that a photographer can adjust on their camera to control how an image is captured.

We will look at camera properties in several groups: cinema cameras, digital cameras, retro cameras, film types, lenses, and finally filters and effects.

Let’s start with cinema cameras and take some examples:

Bolex H16 :

The Bolex H16 was a flexible 16mm film camera driven by a hand-wound spring motor. To replicate its vintage 16mm appearance, natural vignette, and handheld footage style in Stable Diffusion, you can prompt it by mentioning “shot on Bolex H16.”

Aaton LTR :

The Aaton LTR 54 was a remarkably versatile Super 16 cinema film camera. To capture its natural perspective, subtle vignette appearance, and film texture in Stable Diffusion, you can prompt it with “shot on Aaton LTR.”

Now for digital cameras, let’s take some examples:

Fujifilm X-T4 :

The Fujifilm X-T4 mirrorless camera generates photos featuring Fujifilm’s renowned color science and film simulation modes. You can mimic these effects in Stable Diffusion by specifying desired traits such as vibrant colors, strong contrast tones, and authentic graininess.

Lumix GH5 :

The Panasonic Lumix GH5 is a Micro Four Thirds mirrorless camera acclaimed for its adaptability in capturing high bitrate 4K video. To creatively replicate its features in Stable Diffusion, you can use prompts like “shot on Lumix GH5” along with tags such as cinematic bokeh, dynamic range, and vibrant colors.

Now for retro cameras, let’s take an example:

Diana F+ :

This plastic toy camera has a devoted fan base thanks to its enchanting soft-focus, vignetted pictures, ideal for achieving a misty retro film style. Apply a weight of 1.6 and add tags such as film grain, dreamy haze, blur, and light leaks. Note that using the high-res fix (“Hires. fix”) will lessen the impact of the effect.

Now for film types, let’s take an example:

Agfa Vista :

Agfa Vista was a budget-friendly color print film recognized for its vivid and exaggerated colors. This film is an excellent choice for generating images with a highly vibrant and saturated visual style.

Now for the lens prompt:

In my own tests, prompts with “50mm lens” and “75mm lens” gave almost identical results. However, specialized lens types (like “8mm fisheye lens” or “wide angle lens”) did produce noticeably different aesthetics.
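If you want to repeat this kind of lens comparison yourself, one approach is to keep the rest of the prompt fixed and change only the lens term. The sketch below just builds the prompt variants; the base prompt is an arbitrary example, and keeping the same seed in your generation tool is an assumption about how you would run it fairly.

```python
# Arbitrary example prompt; only the lens term changes between variants.
base = "photo of a middle-aged man, leaning against a wall, sunset beach"
lenses = ["50mm lens", "75mm lens", "8mm fisheye lens", "wide angle lens"]

# One prompt per lens, identical apart from the final term.
variants = [f"{base}, {lens}" for lens in lenses]

for v in variants:
    print(v)
```

Generating each variant with the same seed and settings makes it easier to see which lens terms actually change the image.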

[Filter & Effects] :

Glitch style :

Glitch style imitates digital images that have been intentionally altered or tampered with, introducing deliberate distortions, artifacts, color shifts, banding, and other digital irregularities.

(infrared filter:1.4) :

Mimics the pronounced color shifts seen in infrared photography, where shades of pink dominate foliage and skies, resulting in a distinct reinterpretation of reality.

[Photographer] :

The photographer part of the prompt is optional. You can name a particular photographer (or other artist) whose style you want to follow, which gives the image its aesthetic and final touch. Let’s take some examples:

Hayao Miyazaki :

Yousuf Karsh :

Ansel Adams :

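Bonus prompting tricks:

Take the phrase “sunny weather” as an example. You can control its weight (how much attention the model pays to it) with brackets. In the AUTOMATIC1111 web UI syntax, each pair of parentheses multiplies the weight by 1.1:

(sunny weather) the weight: 1.1

((sunny weather)) the weight: 1.21

(((sunny weather))) the weight: about 1.33

Or you can set the weight explicitly:

(sunny weather:1.1) the weight: 1.1

(sunny weather:1.21) the weight: 1.21

(sunny weather:1.33) the weight: 1.33

Square brackets do the opposite: each pair divides the weight by 1.1. The weight gets smaller, but it does not become negative:

[sunny weather] the weight: about 0.91

[[sunny weather]] the weight: about 0.83

[[[sunny weather]]] the weight: about 0.75

Or you can set a weight below 1 explicitly. Note that the explicit form uses parentheses, not square brackets:

(sunny weather:0.91) the weight: 0.91

(sunny weather:0.83) the weight: 0.83

(sunny weather:0.75) the weight: 0.75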
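To make the bracket arithmetic concrete, here is a small sketch that computes the weight implied by a bracketed term. It assumes the AUTOMATIC1111 web UI convention, where each pair of parentheses multiplies by 1.1, each pair of square brackets divides by 1.1, and (text:number) sets the weight directly. The function name `effective_weight` is hypothetical and this is not the web UI's own parser.

```python
import re


def effective_weight(token):
    """Return the attention multiplier implied by a single bracketed term."""
    token = token.strip()
    # Explicit form, e.g. "(sunny weather:1.4)".
    explicit = re.fullmatch(r"\(([^()\[\]:]+):([0-9]*\.?[0-9]+)\)", token)
    if explicit:
        return float(explicit.group(2))
    weight = 1.0
    # Peel off one matching outer bracket pair at a time.
    while len(token) >= 2:
        if token[0] == "(" and token[-1] == ")":
            weight *= 1.1
        elif token[0] == "[" and token[-1] == "]":
            weight /= 1.1
        else:
            break
        token = token[1:-1]
    return weight


for term in ["(sunny weather)", "((sunny weather))", "(((sunny weather)))",
             "[sunny weather]", "[[sunny weather]]", "(sunny weather:1.4)"]:
    print(term, round(effective_weight(term), 3))
```

Under these assumptions, nesting brackets never produces a negative weight: square brackets only shrink the weight toward zero.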

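Alternating words:

Alternating words look like this: [dread head | brown hair | short hair]. In the AUTOMATIC1111 web UI syntax, the sampler switches to the next option at every step, so the final image blends the listed terms rather than picking just one.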
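As a rough illustration, assuming the AUTOMATIC1111 web UI behavior where an alternating group cycles through its options one sampling step at a time, the sketch below lists which option would be active at each step. The function name `alternating_schedule` is hypothetical.

```python
def alternating_schedule(expr, steps):
    """List the option that is active at each sampling step."""
    # Drop the outer brackets and split on the "|" separators.
    options = [o.strip() for o in expr.strip("[] ").split("|")]
    # Cycle through the options, one per step.
    return [options[i % len(options)] for i in range(steps)]


schedule = alternating_schedule("[dread head | brown hair | short hair]", 5)
print(schedule)
```

Because the active term changes every step, the result tends to mix the options rather than treat them as synonyms.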

The CFG (Classifier-Free Guidance) Scale:

The CFG Scale is a parameter that controls how closely the generated image matches the text prompt and/or the input image. Lower values give the model more freedom and more creative results, while higher values follow the prompt more strictly. Keep it roughly between 5 and 11: values above 11 or below 5 tend to give unpredictable results.
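For intuition, classifier-free guidance is usually described as combining two noise predictions at each step, one made without the prompt and one made with it: guided = uncond + scale * (cond - uncond). The sketch below applies that formula to toy numbers; the lists stand in for real noise predictions, and the name `apply_cfg` is hypothetical.

```python
def apply_cfg(uncond, cond, cfg_scale):
    """Push the prediction away from 'no prompt' toward 'with prompt'."""
    return [u + cfg_scale * (c - u) for u, c in zip(uncond, cond)]


# Toy values standing in for the model's two noise predictions.
uncond = [0.0, 1.0]
cond = [1.0, 3.0]

print(apply_cfg(uncond, cond, 1.0))  # scale 1 reproduces the prompted prediction
print(apply_cfg(uncond, cond, 7.0))  # larger scale exaggerates the prompt's influence
```

The larger the scale, the further the result is pushed in the direction the prompt suggests, which is why very high values can look overcooked.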
