AI Generated Imaging — The Next Step In Computational Photography
For creatives, a great painting, award-winning photograph, or well-directed video takes hard work and effort, and it all starts in the mind of the creative. That talent is what earns a creative recognition in media, arts, and entertainment. But what if a computer could do the creative thinking, and also create the image?
That is now possible with AI (Artificial Intelligence) software. You can tell an app what you want to render, and it will be generated automatically, with nothing else required from the user. Imagine simply typing or saying what you want to create and letting the software do the rest. That would not only disrupt the art and photography industries; it could also create a new form of imaging industry for creatives.
Other techniques include genetic image breeding (using StyleGAN-based techniques) and random image generation, and soon there will be natural-language image editing.
Note: As of writing, most of these techniques are still experimental or limited in use.
It Is So Easy, Anyone Can Do It
Frameworks that combine machine learning and computational imaging let users create their own images from an app. It is now so simple that anyone with the app installed can do it. An example is the DALL-E 2 system from OpenAI. It allows users to speak or type in natural language, which the computer then interprets and converts into a rendered image.
Users merely give a description of the image they want the software to generate. For example, a user can type or say:
“a raccoon astronaut with the cosmos reflecting on the glass of his helmet dreaming of the stars”
The software, using DALL-E 2, generated the following image from that description (note: this is not a human-made illustration):
The rendering was done by the software without any additional human assistance. OpenAI built the system on a neural network, a model "trained" on a large set of reference images. It then uses another AI technique, natural language processing (NLP), to understand the user's description (typed or spoken) and generate the image (this builds on the GPT-3 family of models).
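To make the interaction concrete, here is a minimal, hypothetical sketch of how an app might assemble a description and hand it to such a service. The `build_prompt` helper is illustrative, not part of any real API; the commented request mirrors the shape of the OpenAI Python library's image endpoint at the time of writing, and it requires an API key, so it is not executed here.

```python
# Hypothetical sketch: composing a natural-language description and
# handing it to a text-to-image service. build_prompt is illustrative,
# not a real API.

def build_prompt(subject: str, setting: str, style: str = "digital art") -> str:
    """Compose a plain-English description for the generator."""
    return f"{subject} with {setting}, {style}"

prompt = build_prompt(
    "a raccoon astronaut",
    "the cosmos reflecting on the glass of his helmet",
)

# With the openai package installed and an API key configured, the
# request would look roughly like (not run here):
#   import openai
#   response = openai.Image.create(prompt=prompt, n=1, size="1024x1024")
#   image_url = response["data"][0]["url"]
print(prompt)
```

The heavy lifting (interpreting the description and rendering pixels) happens entirely on the service side; the user-facing part is just plain language.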
What is amazing about this is the accuracy and speed of the generated image. Better still, it requires nothing from the user beyond a description of what they want. Once it becomes a mainstream app integrated with smartphones, anyone will be able to create their own images by typing or speaking. This can be for fun and entertainment, but a more serious use of this technology is commercial content creation (e.g. stock photos or images, memes, thumbnails, etc.).
A creator on YouTube could use the app to generate a thumbnail for their video. Students can use it to generate images that they can use for class reports or projects. To be more creative, content creators can use this feature (instead of stock photo sites) to generate exactly what they are thinking about. This can simply be done from the app installed on a smartphone. The possibilities for many applications begin to open from here.
Generate Your Own Model For A Photoshoot
Are you a photographer looking for models to shoot? What if you could just generate your own model online? As a matter of fact, you already can. The faces in the following photos are not of real people; they were computer generated.
These images were generated online from This Person Does Not Exist. Anyone can generate a "fake" person. The site uses a style-based image generator built on a Generative Adversarial Network (GAN). From a training set of many faces from around the world (different facial types, ethnicities, races, etc.), the software generates a face on the fly. No parameters are needed; the software creates a random face each time the user clicks the generate button.
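The "generate button" flow can be sketched in a few lines. The decoder below is a stand-in random linear map so the example runs self-contained; a real StyleGAN generator is a deep convolutional network trained on a large face dataset, though the 512-dimensional latent vector matches StyleGAN's convention.

```python
# Toy sketch of GAN-style sampling: each "click" draws a random latent
# vector z and decodes it into an image. The decoder here is a random
# linear map standing in for a trained StyleGAN generator.
import numpy as np

LATENT_DIM = 512   # StyleGAN's latent space is 512-dimensional
IMG_SIZE = 16      # toy resolution; real output is e.g. 1024x1024

# Stand-in for weights a real GAN would learn from a face dataset.
weights = np.random.default_rng(0).standard_normal((LATENT_DIM, IMG_SIZE * IMG_SIZE))

def generate_face(seed: int) -> np.ndarray:
    """Sample a fresh latent vector and decode it to a toy 'face'."""
    z = np.random.default_rng(seed).standard_normal(LATENT_DIM)
    pixels = np.tanh(z @ weights / np.sqrt(LATENT_DIM))  # squash to [-1, 1]
    return pixels.reshape(IMG_SIZE, IMG_SIZE)

face_a = generate_face(1)
face_b = generate_face(2)  # a different seed yields a different "face"
```

Because every click samples a fresh latent vector, no two generated faces need ever repeat, which is exactly why no user parameters are required.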
This can be unsettling if you fear it looks so real that it could replace an actual human being. There are even virtual-model applications that render not just the face but also the body (hands, legs, torso, feet, and other details). That means a product retailer could use virtual models in their next ad campaign to save the cost of hiring a real model. Whether this will actually become a big business depends on market preferences and the state of the world.
CM Models has added a line of virtual models to their website. One of their virtual models is named Zoe. You can tell that Zoe is not a real person, since "she" looks very computer generated. The point is that she can be used in virtual worlds, as a model in the emerging metaverse. Zoe can appear sporting versions of popular fashion brands that are also NFTs (Non-Fungible Tokens). Zoe is not likely to appear on the catwalk during fashion week, but at a virtual event in the metaverse sometime soon.
Why favor virtual models over a real person? There are several reasons. One is the lack of an available human model when an ad campaign needs one; the campaign can substitute a virtual model temporarily. If there are more lockdowns due to health restrictions, producers and creative directors might turn further toward virtual photography (i.e. remote photoshoots) and virtual models. The app can even replace both photographer and model, saving time and cost.
Genetic Image Generation
Users of the app ArtBreeder can explore how AI generates unique images of faces, art, and characters. It uses machine learning to generate stunning images, not just of people but of objects and just about anything imaginable. The software generates images based on their distinct features or "genes" (much as in genetic engineering) and can combine these "genes" to create new, unique images.
This app is mostly used for fun. It is also a form of social app, since users can build on each other's images to generate new ones. Images are credited to their respective creators, and the app makes them available for other users to use. Users who generate many images are offered a paid subscription with added benefits (based on tiers), making this an example of a commercial AI app.
Users can combine different images (akin to breeding genetic information) when creating their own. The result can then be tracked with an "ancestry map" that shows the "lineage" of the created image. Users can keep creating as many portraits and styles as they want while networking with other users. It is both creative and collaborative, giving users an interactive experience.
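The "gene" mixing and ancestry idea can be sketched as simple latent-vector interpolation. The vectors below are toy stand-ins; in a real app of this kind, the "genes" are latent codes from a trained GAN.

```python
# Sketch of "breeding" two images: each image is a latent "gene" vector,
# and a child is a weighted blend of its parents. The vectors below are
# toy stand-ins for real GAN latent codes.
import numpy as np

def breed(parent_a: np.ndarray, parent_b: np.ndarray, mix: float = 0.5) -> np.ndarray:
    """Interpolate parent genes: mix=0.0 returns A, mix=1.0 returns B."""
    return (1.0 - mix) * parent_a + mix * parent_b

rng = np.random.default_rng(42)
portrait = rng.standard_normal(8)   # toy "gene" vector for one image
landscape = rng.standard_normal(8)  # toy "gene" vector for another

child = breed(portrait, landscape, mix=0.25)  # leans toward the portrait

# An "ancestry map" is just this call history: record each child's
# parents and mix weight, and the lineage can be traced back.
lineage = {"parents": ("portrait", "landscape"), "mix": 0.25}
```

Recording the parents and mix weight at every breeding step is what makes the lineage traceable all the way back to the original images.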
In art, this can be a way to trace the digital ancestry of a creative's work. It is primarily for digital content creation, not physical real-world content. With this system, artists can be attributed, and even compensated (via royalties) in the future, if someone wants to use their work for commercial purposes.
Voice Command Image Editing
Retouchers, and anyone who uses a photo app on their smartphone, can soon use voice commands to edit their photos. What makes this possible is natural-language integration with imaging software. For example, a user can simply say:

“Brighten photo”

The app will recognize the word “brighten” from the trained vocabulary of the language it supports, and the image will then be brightened. A more advanced user needs some way to control the granularity of the feature. They can say:
“Brighten photo +12”
The app will then adjust the brightness by 12 levels out of, say, 100. There will still be a manual way to brighten the image, but AI features that understand natural language make editing more convenient.
In the true sense of natural language, a user can say what they need however they like. The structure is not fixed by syntax, as in computer code; instead, the software analyzes the vocabulary and the sentence to interpret the command properly. It is like asking a smart-speaker assistant to perform a task.
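As a sketch of that interpretation step, the toy parser below maps a command like “Brighten photo +12” onto a pixel edit. A real app would use an NLP model rather than a regex, and the “levels out of 100” scaling onto the 0–255 pixel range is an assumption for illustration.

```python
# Toy sketch: interpret a spoken/typed command and apply the edit.
# A regex stands in for the NLP "understanding" step; pixel values are
# 0-255 and are clipped after brightening.
import re

def parse_command(text: str):
    """Extract the verb and optional amount from e.g. 'Brighten photo +12'."""
    m = re.search(r"(brighten)\s+photo\s*([+-]\d+)?", text, re.IGNORECASE)
    if not m:
        return None
    amount = int(m.group(2)) if m.group(2) else 10  # default step
    return (m.group(1).lower(), amount)

def brighten(pixels, amount):
    """Map 'amount out of 100' onto the 0-255 pixel range and clip."""
    delta = round(amount * 255 / 100)
    return [min(255, max(0, p + delta)) for p in pixels]

verb, amount = parse_command("Brighten photo +12")
edited = brighten([0, 120, 250], amount)  # +12 of 100 is about +31 of 255
```

Note that the command still works without the number (“Brighten photo” falls back to a default step), which is the convenience the natural-language interface is meant to provide.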
Artists and photographers can view these new apps as tools to add to their arsenal. They can aid image creation, leaving the creativity in the final result to the human. An artist can generate an image they have in mind and then work it further into a finished piece. Photographers can have the app edit their images (in real time) to produce the best possible result, the hallmark of computational photography. We already see this in some smartphones (e.g. the iPhone) that apply AI techniques in image processing.
They can also view AI as a threat to job security. If an average person can generate amazing images on the fly, why pay an artist? Who needs a photographer when you can take your own photos with your smartphone? Many people already take their own portraits or selfies using advanced smartphone camera features. These are disruptions that bring a paradigm shift, and they always force people to adjust.
The truth is, none of these technologies will immediately replace a good artist, photographer, or model. Those creatives have skills that remain in demand, no matter what. AI will be an alternative to the norm, at least until the technology improves. These tools have advantages in speed, cost savings, and convenience; despite those benefits, a creative director will more likely still work with a human model and photographer than use a virtual model for a campaign. Artists will still be in demand because their vision is something no computer can yet replace.
These AI systems are also not at the same level as their human counterparts. They are examples of narrow AI, not a general intelligence that has surpassed the human capacity for creativity. Computers can create impressive images with what is available in AI today, but they are still prone to errors and may not be able to produce the content users are looking for (depending on the data set).