Can Artificial Intelligence Make Us More Creative?

Togo Kida
Jun 1, 2021

As someone in the creative department of an advertising agency, I came up with ideas for a living. Although the creative process is fun, it can also be excruciating. When nothing comes to mind, the experience is horrendous.

My long-standing dream was to overcome this complex yet elusive process of ideation. When ideating alone, I always felt I was at a dead end; when working with others, I tended to get more inspiration, and verbalizing my thoughts would spark new ideas in turn.

I’ve always been interested in digital tools that support people’s creativity, and I thought it might be worthwhile to build a tool that assists ideation.

So for my master’s thesis project, I decided to build such a tool with artificial intelligence.

This entry describes the process I went through while working on this project.

Precedents

The idea of using artificial intelligence as a means of ideation is not entirely novel. Among many precedents, my favorite is Robin Sloan’s “Writing with the Machine” project.

The project consists of an RNN (recurrent neural network) trained on sci-fi novels. By combining this RNN with a text editor, the user can work with the model interactively and have it auto-complete a given sentence.

Since the RNN is trained on sci-fi novels, it is interesting to see the auto-completed sentences exhibit a sci-fi influence. When I saw this project for the first time, I felt the process was much like a brainstorm among several people, where ideas are tossed around. My next thought was whether it would be possible to create an interactive experience that replicates the process of ideation in just this way.

I decided to dig deeper into this thought and see what I could do.

What I Aimed to Achieve in this Project

There is no single best practice for ideation, and people approach it in diverse ways. In this project, however, I settled on words as the starting point of the ideation process. Specifically, I developed an application with several machine learning models that respond to the user's textual input. The user engages with the models' responses, which help drive the ideation process.

Instead of having the user ideate alone, I wanted to realize an ideation process that unfolds in concurrent interaction with another system. On top of what Robin Sloan did for his project, I imagined an interactive system in which the user can choose which of several models to ideate with. Choosing a model thus becomes an essential choice throughout the ideation process: deciding how to manage the models becomes part of the act of thinking up ideas.

Project Structure

Given my circumstances, I built the project with the following tools:

  • Interface design: Figma
  • Interface implementation: p5.js
  • Machine learning implementation: ml5.js

I learned Figma during my summer internship as a product designer in Somerville. For this project, I mainly used it to prototype the graphical interface.

For the interface itself, I used p5.js, which is easy to use and highly versatile. Because I could leverage machine learning from p5.js, log interactions as JSON, and run the software online, it was easy to have test users access the interface while prototyping.
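To give a sense of how little glue code this stack requires, here is a minimal sketch of the p5.js + ml5.js setup — not my exact implementation: the model path and UI wiring are placeholder assumptions, and it uses the charRNN API from the pre-1.0 releases of ml5.js.

```javascript
// Minimal p5.js + ml5.js sketch: a text box, a generate button,
// and a charRNN model that completes whatever the user typed.
// './models/jpop/' is a placeholder path to a trained model folder.
let rnn;
let input;
let output;

function setup() {
  noCanvas();
  rnn = ml5.charRNN('./models/jpop/', () => console.log('model loaded'));
  input = createElement('textarea'); // where the user writes
  output = createDiv('');            // where suggestions appear
  createButton('generate').mousePressed(generate);
}

function generate() {
  const seed = input.value();
  // Ask the model for a 100-character completion of the seed text.
  rnn.generate({ seed, length: 100, temperature: 0.5 }, (err, result) => {
    if (!err) output.html(result.sample);
  });
}
```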

Interface design

I used p5.js to implement the interface after designing with Figma.

Interface that I prototyped.

The user can enter text input on the left, and models output the response on the right. The interface allows the user to compare the input and the output simultaneously.

The user can select models on the left and change the model's parameters on the right of the interface.

Left: Interface to select the model. Right: Interface to change the parameters of the model.

To feed input to a model, the user highlights text and presses the button in the upper-right corner.

Upon pressing the generate button, the selected models will generate text output to complete the highlighted text.

After the output is generated, the user will evaluate the result and continue the writing process by interacting with the models concurrently.
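A sketch of what that generate step can look like, as a variant of the setup above; the `models` registry, the model paths, and the fixed temperature are my own placeholder assumptions, and the selection offsets come from the underlying DOM textarea.

```javascript
// Hypothetical registry of loaded models the user can switch between.
// Assumes both models have finished loading before generate is pressed.
const models = {
  jpop: ml5.charRNN('./models/jpop/'),
  fantasy: ml5.charRNN('./models/fantasy/'),
};
let currentModel = 'jpop'; // set by the model-selection UI

function onGeneratePressed() {
  const ta = input.elt; // underlying DOM textarea of the p5 element
  const seed = ta.value.slice(ta.selectionStart, ta.selectionEnd);
  if (!seed) return; // nothing highlighted
  models[currentModel].generate(
    { seed, length: 100, temperature: 0.5 }, // temperature set in the parameter UI
    (err, result) => {
      if (!err) output.html(seed + result.sample); // completion of the highlight
    }
  );
}
```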

Implementation of the Machine Learning Models

I chose the LSTM-based RNN in ml5.js to implement the machine learning model. I could have used other models such as GPT-3, but I wanted to avoid a situation where the project's outcome is judged merely on the novelty of the underlying neural network. I therefore stuck with a simpler model so that the focus stays on the ideation process, and settled on the RNN.

Model Training

ml5.js comes with an RNN module by default, and I used it to build the RNN. In addition, ml5.js provides a Python script that lets users train their own RNN model on a corpus of their choosing. To build my own model for this project, I collected the following data:

  • JPOP lyrics data
  • Fan-fiction Japanese fantasy novel data
  • Contemporary classic Japanese literature data

Using these data, I trained an RNN model.
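The training itself happens offline; only the exported model folder is loaded in the browser. As a rough sketch of the preparation step, assuming the corpora live in local text files and that the training script is the one from ml5's training-charRNN repository (whose exact invocation I am recalling from its README):

```javascript
// Node.js sketch: concatenate a corpus into the single input.txt
// that the charRNN training script expects. Paths are placeholders.
const fs = require('fs');

const sources = ['corpus/jpop_lyrics_a.txt', 'corpus/jpop_lyrics_b.txt'];
const text = sources.map((f) => fs.readFileSync(f, 'utf8')).join('\n');
fs.writeFileSync('data/jpop/input.txt', text);

// Then, offline (assumed invocation, per the training-charRNN README):
//   python train.py --data_path=./data/jpop/
// Finally, copy the exported model folder into the p5.js project.
```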

However, the more data you use to train a model, the heavier the trained model becomes. Since p5.js doesn't offer GPU support for running machine learning models, this becomes a problem when running a trained model in the browser. To overcome this, I trained several models with different data sizes and parameters to identify the one best suited to this project.

Long story short, I decided to train my machine learning model on the JPOP lyrics data. A different model, trained on the fan-written Japanese fantasy novels, generated a great deal of output about magic spells, swords, and knights. Another model, trained on the contemporary classic Japanese literature, generated text that sounded a hundred years old. As someone coming from a creative background at an advertising agency, I felt neither of those models suited the ideation process.

User Testing

To validate the process of ideating interactively with a machine learning model, I initially considered running in-person user tests; given the circumstances around COVID-19, however, this became difficult. Fortunately, since I built everything with p5.js and ml5.js, I could simply share a URL and have people test the interface without meeting in person.

Given this benefit, I decided to conduct user testing remotely, adding a few changes to the software to enable it.

First, I gathered 30 users to interact with the software I built for this project. All of them work in creative positions in the advertising industry: copywriters, art directors, and creative planners, with a roughly 1:1 female-male ratio.

I gave them 15 minutes each and tasked them with using the interface to work on a piece of creative writing.

While the users interacted with the software, I recorded their sessions as JSON data and analyzed how they interacted.
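Concretely, the log can be as simple as a timestamped character count per keystroke, dumped with p5's saveJSON() at the end of a session. This is a minimal sketch with field names of my own choosing, not my exact logging format:

```javascript
// Minimal interaction log: one entry per keystroke, saved as JSON.
const events = [];

function keyTyped() {
  events.push({
    t: millis(),                  // ms since the sketch started
    length: input.value().length, // current character count
    assisted: false,              // flipped to true when a suggestion is pasted
  });
}

function endSession() {
  saveJSON({ user: 'anonymous-id', events }, 'session.json');
}
```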

For the analysis, I divided the 30 users into the following three groups:

  • Group 1: This group interacted with a machine learning model trained with JPOP lyrics.
  • Group 2: This group did not interact with any model for assistance. (Therefore, same as writing with a simple text editor.)
  • Group 3: Instead of a machine learning model, this group interacted with a dummy model that returns random phrases of JPOP lyrics.

At the end of the experiment, the 30 participants answered a questionnaire, and three independent judges evaluated the participants' creative outputs.

User testing results

It isn't easy to evaluate creative responses objectively, but the data collected through the experiment was intriguing.

First, I looked at the text input of all 30 participants. The x-axis is the time in seconds, and the y-axis is the number of input characters.

Left: group 1, middle: group 2, right: group 3

Usually, as the user keeps typing, the number of characters entered rises steadily. However, in the plots for groups 1 and 3, you may notice occasions where the character count suddenly drops or jumps. These indicate that the user either deleted a large amount of text or pasted in a suggestion generated by the machine learning models.

Personally, I find this fascinating because it demonstrates that the users approached the given task by collaborating with the models through the interface.
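Those deletions and pastes are easy to pick out programmatically from such logs. A sketch of the analysis, assuming the session format above and an arbitrary threshold of my own:

```javascript
// Flag large jumps in character count between consecutive log entries:
// a big negative delta suggests mass deletion, a big positive one a paste.
function findLargeEdits(events, threshold = 20) {
  const edits = [];
  for (let i = 1; i < events.length; i++) {
    const delta = events[i].length - events[i - 1].length;
    if (Math.abs(delta) >= threshold) {
      edits.push({ t: events[i].t, delta });
    }
  }
  return edits;
}

// e.g. findLargeEdits(session.events) → [{ t: 412003, delta: 98 }, ...]
```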

Moreover, comparing against group 2, which had no assistance from the system, the assisted groups 1 and 3 worked on the assignment longer and entered more text.

Left: average time to complete the task by group. Middle: average number of characters entered by group. Right: average characters entered per second by group.

This result seems to arise from interaction with the software; it is intriguing to see the software affecting the user's ideation process.

Looking deeper into the data yields a further insight.

The last plot confirms the difference between groups 1 and 3, which had assistance from the system, and group 2, which had none.

But would there be a difference in output between group 1, assisted by a machine learning model, and group 3, assisted by a dummy model?

Left: group 1, assisted by a machine learning model. Right: group 3, assisted by a dummy model.

In these plots, the x-axis shows how many times the system assisted the user, and the y-axis shows the final character count of the output. For group 1, aided by a machine learning model, the more the model assisted the user, the more the user produced. Group 3, assisted by a dummy model that returns random text, does not show the same trend.

While you can't definitively argue that producing more output guarantees its quality, I think we can draw some insight from this.

Beyond the numbers, I also asked the users: "Did the suggestions from the software help your creative process?"

Left: group 1, which had assistance from machine learning models, right: group 3, which had assistance from a dummy model

Looking at the results, group 1, which had machine learning models on the backend, fared better than group 3, which had a dummy model assisting the user.

Given this, I assumed that group 1 would receive the best final evaluations, but that was not the case. I had three independent judges rate all 30 outputs from the three groups on a scale of 1 to 10; the results are shown below.

Evaluation by three judges among groups. (Average on a scale of 1 to 10)

Group 3, assisted by the dummy model, scored highest, while group 1, helped by the machine learning models, received the lowest marks.

Overall consideration

The final result was surprising. I admit that the results of these three groups are too limited to discuss creativity thoroughly. Still, I believe the experiment offers several insights.

First, in designing a system that collaborates with the user on creative tasks, one thing I should have considered was the impact of latency. The dummy-model system used by group 3 merely displayed random phrases of JPOP lyrics, so its results appeared instantaneously. For group 1, which relied on a machine learning model, this was not the case: the user had to wait a short while for the system to respond with suggestions. This may have impeded the interaction between the user and the system, and therefore led to lower scores in the end.
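Were I to redo the study, I would log that latency explicitly. A sketch of how, reusing names from the earlier sketches; performance.now() is the standard browser timer:

```javascript
// Wrap the generate call to record how long each model takes to respond.
function generateWithTiming(model, options) {
  const start = performance.now();
  model.generate(options, (err, result) => {
    const latencyMs = performance.now() - start;
    events.push({ t: millis(), latencyMs }); // dummy model: near 0 ms; RNN: noticeably more
    if (!err) output.html(result.sample);
  });
}
```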

Another possible explanation is that the suggestions made by the dummy model served as "bad examples" that pushed the user to think in different directions, resulting in output unlike that of a solo ideation process.

When it comes to brainstorming ideas and thinking creatively, there are numerous methods. The tool I developed in this project is by no means exhaustive, and I am sure it has many arguable points.

However, I believe it is meaningful to explore other modes of the creative process with software like the one I prototyped.

The designer's task in the future may include curating the data of AI.

If interactive ideation with AI becomes a valid method for people working in the creative industry, I believe they will ultimately be responsible for how the AI functions and, therefore, for what data is used to construct it. In that sense, they will need to curate the data for that AI. Just as people once explored new ways to create things on computers, I believe another aspect of human creativity will manifest through using AI as a creative medium.

To end this post, I'd like to quote my favorite phrase by Lauren McCarthy, the creator of p5.js.
