Elevate with AI: Rethinking Expectations, Redefining Success

Roshan Rao
inspiringbrilliance
8 min read · Mar 12, 2024
^ Visual Designer vs AI Generated (Can you tell which one was generated how?)

TL;DR

Every business has a defined visual identity that shapes the brand. Can Generative AI [text-to-image] cater to the visual design needs of the organisation and make the process smoother when off-the-shelf solutions fail to deliver?

This blog captures our journey to find answers to this question. It covers our approach, our experience and experiments with Gen AI, and key takeaways for driving better outcomes with it. These stem from experiments conducted internally for a specific use case within the visual design scope [designing 2D characters]. Driving new possibilities with a different mindset [(re)look at AI as one that is taking baby steps, favour quality over quantity in data sets for a good start, know AI's capabilities to decide which tasks to keep with humans and which to offload to AI, etc.] can lead to better outcomes with decent outputs, and can set teams and the organisation up to strategise right for AI-powered solutions going forward.

Introduction

Text-to-image generation is an established area of Generative AI in terms of its ability to render phenomenal outcomes. In this blog, I'll explore my attempt [as a design practitioner and a solution thinker] to use Generative AI as a new material for design, especially visual or graphic design.

The intent was to solve a problem in the context of a business: how can we scale visual design across the artifacts needed for different organisational needs when the resources to get things done [people and time] are limited?

We will introduce the problem context in detail, describe our approach, and list some of the key learnings we had while trying to solve the problem using Generative AI.

Note: You can jump to the learnings if you are not interested in the problem space. If you are interested in the details of the problem we took up and the outcomes with AI, read on.

The Problem

With a small team of in-house visual design practitioners in the firm, we started to face challenges in generating a higher volume of creative assets in the visual or graphic design space. This created a bottleneck for the business and limited the individuals' and the design team's ability to keep up with increasing demand. The question arose,

‘How do we address the growing need for visual design at scale with a limited set of folks within the organisation?’

The Approach

To overcome this challenge of scaling visual design, and being curious about the possibilities, I turned my attention towards AI technology: GenAI in the ‘text to image’ modality.

My initial impression of the output samples being put out on social media (and websites such as PromptHero) was, like for most of us, that the problem we were facing was a solved one and should be easily overcome with a standard set of tools [products/models/frameworks] available off the shelf. Or so we thought…

The Use case

Our brand identity is composed of different kinds of elements and assets. There are bespoke graphical patterns, icons, characters, colours, etc., which are used in creating posters, brochures, presentations and more.

^ Sample set of Brand assets

The degree of modularity and flexibility each of these elements facilitated varied from ‘use it as-is’ to ‘apply it in principle’ for endless ideation and pushing creative boundaries. The characters, hereafter referred to as BLOBS, stemmed from one such idea for endless possibilities.

They were something that could not be arrived at by applying constraints or rules via coding, or through standard templates, as each character had to be playful and communicate an emotion.

^ Blobs generated by humans/ visual designers

As a visual design exercise, we would individually create these blobs, inspired by different characters or scenarios, for a given intent. Ideation would determine whether a character was needed, and each one meant custom development. This approach proved time-consuming.

Also, personally as a design practitioner, I was unsure if AI could match the work of a skilled visual designer in terms of creativity (how an individual generates innovative ideas and then translates them into artefacts) for the needs of an organisation or a brand.

In essence, our challenge was to help members/visual designers within the organisation create a visual design artifact [a blob], going from a ‘thought’ to an ‘output’ with a few simple prompts [almost like DIY], using GenAI as design material.

Then there was the question of how quickly, effectively and repeatably AI could do it.

First Contact

To start with, I tried out the standard products out there [e.g. DALL·E, Midjourney, etc.] with a standard set of prompts. The experience of generating visual designs along the lines of blobs using these tools and products fell a bit short of expectations.

Any prompt created almost ‘random’-seeming outcomes. There appeared to be a level of misinterpretation: the contextual meaning of what the user put in their prompts (specific to their organisation) and what the generic model understood were quite different. That forced us to unlearn organisational ‘lingo’ and start trying out synonyms, which was a problem in itself as it made everything a hit or a miss.

So, along with another Data Science enthusiast who wanted to tinker with the possibilities of ‘text to image’ [ex-Sahajeevi, Rahul Pradyumna], I decided to take the plunge, custom-train a model and see where it would lead us.

The model we picked was Stable Diffusion (with a DDIM scheduler), and using our own set of sample images shown above (30 samples), we started to train the model and slowly steer the outcome, each time trying to overcome gaps and surprises.
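The post does not include the training or inference code, so purely as an illustration of the setup described here, below is a minimal sketch of loading a Stable Diffusion pipeline and swapping in a DDIM scheduler using the Hugging Face diffusers library. The base checkpoint name is an assumption; the actual checkpoint and fine-tuning recipe used in the experiment are not specified.

```python
# Minimal sketch: a Stable Diffusion pipeline with a DDIM
# (Denoising Diffusion Implicit Models) scheduler via `diffusers`.
# The checkpoint name below is assumed, not the one used in the experiment.
import torch
from diffusers import StableDiffusionPipeline, DDIMScheduler

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",   # assumed base model
    torch_dtype=torch.float16,
)
# Replace the default scheduler with DDIM, reusing the existing config.
pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config)
pipe = pipe.to("cuda")
```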

The Outcome

While the first cut was in the ballpark of what we wanted, a lot still needed to be figured out to achieve satisfactory results each time. Outcomes needed to be more accurate, with limited or no hallucinations.

It was a tango between the design and data science perspectives, to see where we met with success and where the model was throwing surprises. For instance, even though none of the images we trained on showed superheroes with a star on their chest, the model rendered one almost every time.

These were addressed by refining our ‘base prompt set’, inclusive of positive and negative prompts. With steps such as these, taken over the course of the entire journey of this experiment (3–4 weeks), we could successfully establish a POC of what is possible.
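The actual prompt set is not shared in the post; as an illustration only, here is what a ‘base prompt set’ with a fixed positive prefix and a fixed negative prompt could look like. The wording, including the star-on-chest example, is hypothetical.

```python
# Hypothetical 'base prompt set': a positive prefix encoding the brand
# context, and a negative prompt suppressing recurring hallucinations.
BASE_POSITIVE = (
    "a playful 2D blob character, flat vector illustration, "
    "simple rounded shapes, brand colour palette, clean white background"
)
BASE_NEGATIVE = (
    "star on chest, superhero logo, photorealistic, 3d render, "
    "text, watermark, extra limbs"
)

def build_prompt(user_idea: str) -> tuple[str, str]:
    """Combine the user's idea with the base positive/negative prompts."""
    return f"{BASE_POSITIVE}, {user_idea}", BASE_NEGATIVE
```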

Over time the hallucinations started to reduce and we inched closer to outcomes that could actually be used. It was not perfect, but it was ‘usable’ with a call-out that these are AI-generated: humans were not involved, which itself demonstrated what the tech could achieve.

The Output

Below are some samples of the AI-generated blobs at the end of the experiment, after fine-tuning the model, improving the prompt engineering and playing around with the levers.

^ Blobs generated by humans/ Visual designers
^ Blobs generated by model (trained in-house).

Model details: Stable Diffusion pipeline with a DDIM scheduler (Denoising Diffusion Implicit Models), trained for 100 epochs.

The Learnings

Throughout the process, this black box called AI started to open up and helped us reframe our own thinking, expectations and the approach one could take.

Below are some of the key learnings, if you have a specific use case for which off-the-shelf products or models do not seem to be delivering on expectations:

  • AI as an Infant per use case: If you closely observe a baby, each task is an achievement in its first 1–2 years, as it continues to learn and master new tasks with a few inputs from its parents. The same is true for an AI model: it will take its time to learn. At first it might seem to be heading in the direction we want, but it is the details that need sufficient training. What helps is breaking your tasks into distinct molecules and their atoms; the model will need sufficient training from the task level all the way down to the atomic units.
  • Calibrating Expectations: Do not expect AI to do magic right from the start. It will take time, depending on the use case. For a few use cases, existing frameworks may themselves produce satisfactory outcomes. But for the rest, where needs stem from an organisation or business context, it will take effort to tune, fine-tune and re-tune the model to get satisfactory outcomes.
  • Training Data Set: Contrary to popular belief, for certain use cases one can actually work with smaller data sets and still get satisfactory outcomes. Variation seems to be the key: ensuring there are enough differences at each of the atomic units can also get the job done. Quality over quantity should suffice to kick-start.
  • Levers as Equalizers: It turns out model training is only part of the problem. As a user, one really needs to master how much freedom or restriction to give the model ‘each time’ to get satisfactory results. One needs to establish the negative prompts, guidance scale, inference steps and seeds for better results. Fine-tuning these and setting them up right for the future (like etched in stone) can reduce surprises along the way; a sketch of these levers follows this list.
Reference of levers at one’s disposal while using Google Colab
  • UX for non-AI Scientists: It turns out that in spite of setting everything up (model training, levers, negative prompts, an engineered base prompt, etc.), one needs to build a wrapper around the model: a UX and UI for non-data-science folks. This reduces the possibility of scrapping what is already established (not disturbing the sanctity of the tuned model) and therefore helps make the best use of the tool. A minimal wrapper sketch also follows this list.
  • Design & Data Science collaboration: Having a design lens while building the model, and constantly collaborating with a data science individual, is key to being mindful of the do’s and downfalls of the model. Constantly using what was built as an end user was also insightful about the gaps in UX.
  • Trade-off mindset: Approaching the experiment in terms of what heavy lifting the model should do versus what a designer can do faster at each step helps keep pushing possibilities. Choosing which battle is worth losing to win the war has been the strategy. The model does not need to do everything.
  • Not everything needs AI: Think critically about how many atoms it takes to design the visual/artefact; if applying such defined rules can simply get things done faster, then GenAI models are overkill.
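To make the levers concrete, here is a minimal sketch of a generation call that exposes them (negative prompt, guidance scale, inference steps and a fixed seed). It assumes the `pipe`, `BASE_POSITIVE` and `BASE_NEGATIVE` from the earlier sketches; the specific values are illustrative, not the ones used in the experiment.

```python
# Illustrative lever values only; the experiment's actual settings are not published.
import torch

generator = torch.Generator(device="cuda").manual_seed(42)  # fixed seed for repeatability

image = pipe(
    prompt=f"{BASE_POSITIVE}, a blob celebrating a product launch",
    negative_prompt=BASE_NEGATIVE,   # suppress recurring hallucinations
    guidance_scale=7.5,              # how strictly to follow the prompt
    num_inference_steps=50,          # denoising steps for the DDIM scheduler
    generator=generator,
).images[0]

image.save("blob.png")
```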
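And for the UX wrapper, the post does not say which tooling was used; as one possible approach, a small Gradio interface could sit in front of the tuned pipeline so non-data-science colleagues only type an idea, while the model settings stay locked away.

```python
# Hypothetical wrapper using Gradio; reuses `pipe` and `build_prompt` from above.
import gradio as gr

def generate(idea: str):
    prompt, negative = build_prompt(idea)
    return pipe(
        prompt=prompt,
        negative_prompt=negative,
        guidance_scale=7.5,
        num_inference_steps=50,
    ).images[0]

gr.Interface(
    fn=generate,
    inputs=gr.Textbox(label="Describe your blob"),
    outputs=gr.Image(label="Generated blob"),
).launch()
```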

The Conclusion

Over the course of the experiment, I realised that Generative AI for visual rendering does hold immense potential, even when put in the context of a use case very local to an organisation.

While it may not replace practitioners (yet), it can definitely complement their effort or help democratise ‘creative liberty’ to other members within the team. With the right approach and a change in mindset, one can leverage AI for newer possibilities.
