From Concept to Creation: My First Steps in AI Face Swapping

Piotr Raszkowski
5 min read · Nov 30, 2023

--

I’ve been thinking about how to craft a photo using generative AI without complex setups such as training a dedicated LoRA or a custom model. The goal was to find a quick and simple way. The idea came to me after seeing a post on X by an AI enthusiast and tech fan, https://twitter.com/promptowy, “featuring” two Polish sports journalists.

I concluded that the simplest approach is to generate the image with one of the available services, such as Midjourney, DALL-E, or a local installation of Stable Diffusion (e.g. AUTOMATIC1111), and to swap the face only at the very end.

I decided to test a workflow using Midjourney and an application called FaceFusion for the final face-swapping task.

I would like to make an important note: all images I used were publicly accessible via Twitter, Instagram, or Google. I have provided references to the images used. Both individuals are public figures and frequently appear in the media.

First experiment

My first goal was to replicate an effect similar to the one shown in the X post. The key question was: how do I generate an image of a man holding a guitar? As someone not particularly skilled in art, I recognize that AI is a far better artist than I am, so I decided to consult Midjourney. Its /describe command takes an existing image and suggests prompts that could produce something similar. Let's start with this command and see what Midjourney suggests for a man holding a guitar.

In the output, we find four distinct prompts, each opening a door to different creative possibilities.

We’ll proceed by executing each one to see the varied results they yield.

Prompt 1

a painting of a man with an electric guitar, in the style of speedpainting, vivid portraiture, smokey background, hard edge painter, canon eos 5d mark iv, sharp/prickly, intensely colorful figuration --ar 85:128

Prompt 2

a painting of man playing guitar in front of colorful paint, in the style of jay anacleto, karol bak, caras ionut, studio portrait, sharp/prickly, uhd image, representational --ar 85:128

Prompt 3

a painting of a man playing a guitar, in the style of ross tran, caras ionut, strip painting, tim okamura, colorful portraiture, retro rock, sharp/prickly --ar 85:128

Prompt 4

freddie marshall’s painted guitar, in the style of epic portraiture, aggressive digital illustration, realistic figurative painting, dramatic movement, saturated color fields, celebrity-portraits, smokey background --ar 85:128
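The trailing `ar 85:128` in each prompt is Midjourney's aspect-ratio parameter (written `--ar`). I assume /describe derives it from the proportions of the uploaded reference image. If you want to reuse such a prompt with a different picture, here is a small sketch of how pixel dimensions reduce to that kind of ratio. The 680x1024 size is a hypothetical example, not the size of the image I actually used.

```python
from math import gcd


def aspect_ratio(width: int, height: int) -> str:
    """Reduce pixel dimensions to the smallest whole-number ratio 'W:H'."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    divisor = gcd(width, height)
    return f"{width // divisor}:{height // divisor}"


# Hypothetical dimensions: 680x1024 pixels reduces to 85:128.
print(aspect_ratio(680, 1024))
```

The result can be appended to a prompt as `--ar 85:128`.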

The results were decent but not exceptional. Eager to find the perfect match, I decided to experiment further. After several attempts, I successfully crafted the following image.

After creating the ideal generated image, the next step is face swapping using FaceFusion. For this, we need a source image of our chosen ‘model’ to replace the face in the generated picture. The source image should match your requirements. I used the 4K Stogram app to download an Instagram photo. I looked for a photo with an angle close to my generated image, believing it would make the AI swap more seamless.

The final image features https://www.instagram.com/krzysztof.stanowski/ playing an electric guitar (source image: https://www.instagram.com/p/CylPzSTs2IO).

Second experiment

In my second experiment, I aimed to craft an image from a prompt of my own. My concept was a portrait of the Polish geopolitician Jacek Bartosiak. Below is the prompt that encapsulates my vision for this portrait:

a realistic portrait of a men with round face in black suite pointing with his finger, wearing glasses with black frame, 47 yr old, geopolitician, politician, ultra realistic photography, warm light, finest details, a play of dark and light, 85mm focal length, Canon EOS 5D Mark IV

After numerous attempts, this prompt finally led to the creation of the following generated image.

I searched for a source image on Google and found the official photo of Jacek Bartosiak on his website, available at http://jacekbartosiak.pl/wp-content/uploads/2019/04/jacek-bartosiak.jpg.

The final image isn’t perfect, but it shows potential. I only spent a little time creating and adjusting the images. With more time, the results could be amazing, so there’s a lot of room for improvement.

I also experimented with the DALL-E generator. I asked ChatGPT (GPT-4) to create an image using the same prompt I had given to Midjourney.

The workflow and final thoughts

The workflow for creating these images is quite straightforward, and there is a broad range of tools to choose from for both image generation and the final face swap.

  1. Develop your prompt, either through your own creativity or using tools like the /describe command from Midjourney, especially if you're aiming to replicate an existing image.
  2. Generate your image using tools like Midjourney, DALL-E, or any other capable image generator.
  3. For face swapping, use tools such as FaceFusion or clipdrop.io, and you’re done.
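Since step 1 usually takes many attempts, it can help to produce prompt variants in a batch instead of retyping them. Below is a minimal sketch of that idea. The function name `prompt_variants` and the way I split the prompt into a subject, styles, and lighting are my own assumptions, not anything Midjourney or DALL-E requires. The descriptor strings are taken from the prompts in this post.

```python
from itertools import product
from typing import List, Optional


def prompt_variants(
    subject: str,
    styles: List[str],
    lightings: List[str],
    aspect_ratio: Optional[str] = None,
) -> List[str]:
    """Combine one subject with every style/lighting pair into prompt strings."""
    # Midjourney-style aspect-ratio suffix; leave it out for other generators.
    suffix = f" --ar {aspect_ratio}" if aspect_ratio else ""
    return [
        f"{subject}, {style}, {lighting}{suffix}"
        for style, lighting in product(styles, lightings)
    ]


variants = prompt_variants(
    subject="a realistic portrait of a man in a black suit, wearing glasses",
    styles=["ultra realistic photography", "colorful portraiture"],
    lightings=["warm light", "a play of dark and light"],
    aspect_ratio="85:128",
)
for prompt in variants:
    print(prompt)
```

Each printed line can be pasted into the generator as a separate attempt, which makes it easier to see which descriptor actually changes the result.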

Particularly with my second experiment, I encountered numerous challenges in achieving a satisfactory result. From this experience, I’ve drawn several key conclusions:

  • The size and shape of the head matter. For instance, if the head is more circular in the source image but more oval in the target image, the final result might not be ideal.
  • Additional facial features, such as glasses, can also affect the outcome.
  • It’s probably better when the face has a similar angle and similar proportions in the source and target images.
  • To obtain a good result, experimentation is crucial. It’s essential to create a well-designed target image and then find a suitable source image; not every target image will yield successful results.
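The point about head shape can be turned into a rough pre-check before running a swap. This is only a sketch under my own assumptions: the face boxes are hypothetical `(x, y, width, height)` tuples that you would get from whatever face detector you prefer, and the 0.15 tolerance is an arbitrary starting value, not a threshold taken from FaceFusion.

```python
from typing import Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height) of a detected face


def box_ratio(box: Box) -> float:
    """Width-to-height ratio of a face box; about 1.0 means a rounder face."""
    _, _, width, height = box
    if width <= 0 or height <= 0:
        raise ValueError("box width and height must be positive")
    return width / height


def shape_mismatch(source_box: Box, target_box: Box) -> float:
    """Relative difference between the two ratios; 0.0 means same proportions."""
    source, target = box_ratio(source_box), box_ratio(target_box)
    return abs(source - target) / max(source, target)


def looks_compatible(source_box: Box, target_box: Box, tolerance: float = 0.15) -> bool:
    """True when the face proportions differ by no more than the tolerance."""
    return shape_mismatch(source_box, target_box) <= tolerance


# Hypothetical boxes: a round source face and an oval target face.
round_face = (40, 30, 100, 100)
oval_face = (55, 20, 80, 110)
print(looks_compatible(round_face, oval_face))
```

A check like this only compares proportions. It says nothing about the angle of the face or about glasses, so it complements looking at the two images rather than replacing it.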
