Generative AI can Ideate Harder

Gianni Giacomelli
AI monks.io
11 min read · Aug 8, 2023

The world needs more breakthroughs — climate, energy, healthcare, education, and policymaking, just to name a few — and faster.

Breakthroughs often come from combining ideas of very diverse origins. Humans, given their cognitive limits, their incentives, and the sheer complexity of the current state of the art, struggle to integrate that diversity of notions at the scale required. Can generative AI help?

In some domains, like protein structure prediction, it already does: look at AlphaFold, a Google DeepMind AI program that predicts how proteins fold. While awe-inspiring, that is possible only when AI has a good enough model of the world, can run experiments at scale, and can do so largely by itself, so it isn't encumbered by humans' lower processing speed. However, most deployed AI, and especially large language models (LLMs), doesn't know what the world is; it only knows "how the world talks about the world".

But there may be reason for optimism. Let’s start with the following example.

Solving really, really hard problems

You might have heard this story, a classic in innovation-design training (laid out in the HBR article "Are You Solving the Right Problems?" by Thomas Wedell-Wedellsborg): the best way to address passengers' dissatisfaction with the duration of an elevator ride is not, beyond a point, to spend more on engineering; it is to put mirrors on the elevator's walls. To most elevator users, the few seconds lost don't really matter; what matters is the perception of wasted time. And human perception is easier to manipulate than gravity. The result? Billions in costs saved, better safety, projects realized faster.

It is also a solution that LLMs find very hard to come up with by themselves, out of the box, especially when prompted in a very narrow way ("faster elevator ride"), which is how most users prompt most of the time. Herein lies part of the solution: tell the AI to think about radically different ideas that combine solutions from different spaces, to take an expansive perspective on the user (or at least to abstract away from the problem), and, importantly, to apply "lenses" to the problem. Lenses like "think of a solution as if you were the Dalai Lama".

That combination works, especially when (a) done deliberately and recursively in a chain, (b) pulling ideas from very different spaces, (c) using today's increasingly large AI context ("memory") windows, and (d) with human interventions in the right places.

Together, (a) (b) (c) (d) may constitute a big first step in making AI more capable of creativity. Even better, this combination scales well as part of innovation workflows, as AI can sift through orders of magnitude more ideas than traditional teams could do.
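The chain described in (a) through (d) can be sketched in a few lines of code. This is a minimal, illustrative skeleton only: `call_llm` is a hypothetical placeholder for any chat-completion API, stubbed here so the structure of the chain can be inspected without a model behind it.

```python
# Minimal sketch of lens-based, chained ideation (steps (a)-(d) above).
# `call_llm` is a stand-in for a real model call; its stub output simply
# echoes the prompt so the chain is runnable and inspectable as-is.

LENSES = [
    "Answer as if you were the Dalai Lama.",
    "Answer as a behavioral economist focused on perception, not physics.",
    "Answer as an industrial designer borrowing ideas from hospitality.",
]

def call_llm(prompt: str) -> str:
    """Stub. Replace with a real chat-completion request."""
    return f"[idea generated from: {prompt[:60]}...]"

def ideate(problem: str, lenses=LENSES) -> list[str]:
    ideas = []
    for lens in lenses:
        # (a) deliberate chain: first abstract the problem away from its
        # narrow framing, then solve the abstraction under each lens (b)
        reframe = call_llm(f"Restate this problem more abstractly: {problem}")
        idea = call_llm(f"{lens} Propose a radical solution to: {reframe}")
        ideas.append(idea)
    return ideas

ideas = ideate("passengers complain the elevator ride is too slow")
```

In a real workflow, step (d) would slot in here: a human reviews `ideas` and decides which branches to pursue further.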

Yet it is not trivial to get such an output, especially at scale. Anyone who’s used ChatGPT or its predecessors knows that the output is typically well-structured, comprehensive, and convincing, but often relatively bland and unimaginative. And when you increase the so-called “temperature” (randomness, creativity), you frequently get a lot of useless hallucinations. That’s at least with the versions you can easily get in your hands, off the shelf. What they do is represented below.

source: supermind.design

That’s just fine for most users and uses. But for true innovation, you don’t want most ideas to be “OK” — you want a few ideas to have an inordinate potential, even if at the cost of throwing away 99% of the output. How does one get that?

It is about framework-based lenses, filtering, and recombination. All the way down

Everyone, including pundits, was surprised at how much of our world’s functioning is already reflected in our language, which makes LLMs so good. But truly novel ideas do not just come from the prediction of the most likely tokens based on billions of text examples. Truly novel ideas are at least partly the result of understanding how the world works, and the application and porting of those conceptual models to new situations.

That’s where the “Dalai Lama’s elevator” example is interesting. An LLM wouldn’t typically turn to Buddhist wisdom in a conversation about elevators. But we humans can make it do so: make it look at reality, and at the problem statement, through a different lens (one of influencing human perception instead of blindly ramming against physical constraints), and get it to recombine that with a very different field (mindfulness, for instance). There are two significant implications of this.

First, frameworks matter

We, as a human species and society, have embedded complex reasoning, and some understanding of how the world works, into artifacts that AI can mine, not just through syntax and general semantic similarity, but through theories and frameworks.

Lots of world-leading symbolic frameworks are embedded into the semantics of theories (e.g., “Porter’s Five Forces”), authors and artists (e.g., “Andy Warhol”), or into social constructs like people’s roles (e.g., “a medical doctor”). Those embed into semantics (or semiotics and visual styles, in the case of imagery) a representation of these people’s interpretation of the world. They are the results of lengthy research processes, performed by gifted individuals and their teams, that weeded out connections between things that didn’t work. In a way, they are a form of natural selection for ideas, crystallized in carefully crafted text, and propagated by thousands of examples of their use in the media, for instance. Generative AI can read that “DNA code”.

We saw that through the early prompts for Midjourney or DALL·E, things like “paint this like Andy Warhol would”: for a machine, Andy Warhol is a framework, one that Warhol built with his brain and all the stimuli he processed as a person in the world of his day, and his style is an embedding of his symbolic representation of the world. And we see it in the importance of “persona-based prompts” like “you are a helpful innovation consultant with experience in human-centered design, neuroscience, and construction engineering”: around those words sit many others linked to specific applications of the related methods, science, and technology. That is one set of representations of those concepts, with an explanation of how the world works as studied by design, neuroscience, and construction engineering.

Frameworks are creative constraints that have forced us for centuries to explore problems and solutions through a crystalline lens. That “passing through the narrows of the constraint” is often the spark for true innovation. And now, we can do some of that with AI-powered machines.

Second, the recombination of ideas matters

Steve Jobs’s role as chief integrator across disparate disciplines is a good example of the power of recombination: his love of calligraphy, the heightened perception (partially owed to psychedelics) behind his obsessive empathy with human reactions, and his grasp of computer science and the people he gathered around it gave us computers that don’t feel like computers (they feel like art). A related ecosystem of people playing with AI and Gorilla Glass resulted in “no-keyboard keyboards”.

In another telling example, Wikipedia’s Jimmy Wales, inspired by his knowledge of open-source software creation, applied it to knowledge curation, triggering the birth of a non-hierarchically edited encyclopedia and revolutionizing how we look at the curation of knowledge.

These are just two examples: the inception of the most valuable company ever, and of one of the most useful websites ever. (Beyond them, a significant body of research shows that breakthroughs in science come from connecting ideas across fields; among others, look up Matt Clancy’s New Things Under the Sun for a thorough review, or classics like Steven Johnson’s “Where Good Ideas Come From”.)

Of course, these were and are extraordinary people and teams. The good news, though, is that while harnessing very diverse fields is hard for a human, for AI the distance between knowledge items is a computationally tractable problem. Machines don’t suffer from the so-called “burden of knowledge”, which limits the rate at which people can achieve enough competence to contribute something novel to their field. Nor do they have our “working memory” limitations.

Theoretically, generative AI tools have most of what they need already — they just need to be pointed at it.

Enter “AIdea Colliders”

What we need is something like what is described in the following chart. I tentatively call it Deliberate Framing, Recombination, and Filtering (DFRF). It is, in other words, an “Idea Collider”, loosely inspired by particle colliders in physics. The output is “AIdeas” (not a typo), triggered by the AI’s language-reasoning capabilities and the deep knowledge embedded in human theories and frameworks. In an AIdea Collider using DFRF, problems are (1) exploded and explored, and LLM outputs are constrained through theories and frameworks (2) and (3), before (4) the deduced AIdea outputs are recombined (5) and then filtered (6), at scale.

source: supermind.design
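The DFRF loop lends itself to a compact sketch. The skeleton below is illustrative only: generation and scoring are stubbed (a real system would prompt an LLM through each framework and filter with dedicated models), and the framework list and function names are invented for the example.

```python
# Illustrative skeleton of a DFRF pass: constrain generation through
# frameworks, recombine the resulting fragments pairwise, then filter.
# Both `generate` and `score` are stubs standing in for model calls.
from itertools import combinations

FRAMEWORKS = ["TRIZ", "Blue Ocean", "Oblique Strategies"]

def generate(problem: str, framework: str) -> str:
    # stub: a real system would prompt an LLM with the framework as a lens
    return f"{framework}-idea({problem})"

def score(idea: str) -> float:
    # stub: a real filter might use a desirability/feasibility model
    return float(len(idea) % 7)

def collide(problem: str, keep: int = 3) -> list[str]:
    fragments = [generate(problem, f) for f in FRAMEWORKS]            # frame
    pairs = [" + ".join(p) for p in combinations(fragments, 2)]       # recombine
    return sorted(fragments + pairs, key=score, reverse=True)[:keep]  # filter

aideas = collide("slow elevator rides")
```

Note how recombination grows the candidate pool geometrically: three fragments already yield three pairwise combinations, and the filter exists precisely to tame that growth.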

An example of this architecture is MIT Ideator from the Massachusetts Institute of Technology’s Center for Collective Intelligence, an idea-generating machine built by our lab’s team after GPT-3 was released. There, we do away with the traditional chat interface and force the machine to apply a series of lenses to the problem statements, then ask humans to recombine the pieces. The focus of the tool is to help solve problems through a specific design framework called Supermind Design. It has a strong emphasis on exploration of the problem space, and on organizational designs that leverage collective intelligence. A full paper is being released on arXiv, and you can apply to use MIT Ideator and give feedback.

Think [encyclopedia]+[web]+[community]=[Wikipedia]. You would know the answer today, but not before Wikipedia existed. Many other examples exist (you can find some on the supermind.design website), and countless more could be built.

More broadly, idea colliders can be built with frameworks from a very wide range of spaces. Consider some illustrative examples:

  • strategy e.g., Christensen’s disruption, Blue Ocean, PESTEL, Experience Curves, SWOT
  • innovation ideation e.g., Design Thinking activities such as journey mapping, persona analysis, or analogies (also called alternate worlds), Doblin’s Ten Types of Innovation, TRIZ, Lean Startup, Six Hats
  • decision making e.g., Cynefin, logic trees, the Eisenhower Matrix, balanced scorecards, debate techniques
  • operational improvement e.g., Lean Management’s FMEA or RCA, Six Sigma practices, ISO 9001, HAZOP
  • any other management framework e.g., McKinsey 7s, Ray Dalio’s Principles, Ikigai, Agile, Theory X and Theory Y, OODA loops, structured coaching methods, Peter Drucker’s theories and principles
  • industry-specific frameworks e.g., Consumer Products HACCP — Hazard Analysis Critical Control Points
  • and many others, including art (e.g., Brian Eno’s Oblique Strategies, traditionally used for music), psychology (e.g., personality types, Maslow’s pyramid, flow theory), ESG parameters, personal coaching (e.g., Ikigai), or any logical and structured reasoning method (e.g., induction/deduction, Socratic questioning), etc.

And AI output can be fed with interesting, specific example data sourced from diverse and faraway spaces, for instance embedded in vector databases for retrieval-augmented generation. Think about being able to mine arXiv, Patent Office records, Crunchbase startup databases, healthcare guidelines, or climate mitigation practices (including those successfully used in developing countries, or by indigenous groups), among others.
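A toy illustration of that retrieval step: fragments from distant corpora are indexed and the most relevant one is pulled in by similarity. A real system would use learned embeddings and a vector database; here a bag-of-words cosine similarity, with an invented three-document corpus, stands in.

```python
# Toy retrieval-augmented generation: index text fragments from faraway
# spaces and retrieve the closest one to a query. Bag-of-words vectors
# and cosine similarity stand in for real embeddings and a vector DB.
from collections import Counter
import math

CORPUS = {
    "patent": "mirror installation reduces perceived waiting time in lobbies",
    "climate": "drip irrigation practice used by smallholder farmers",
    "arxiv": "attention mechanisms for long-context language models",
}

def embed(text: str) -> Counter:
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def retrieve(query: str) -> str:
    q = embed(query)
    return max(CORPUS, key=lambda k: cosine(q, embed(CORPUS[k])))

best = retrieve("how to reduce perceived waiting time")
```

The retrieved fragment would then be injected into the generation prompt, grounding the AIdea in a concrete, possibly far-afield precedent.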

The recombination and filtering phases will also be crucial. An AIdea Collider can theoretically generate millions of idea fragments and recombine them, geometrically expanding the output. Some of the triaging can be done by machines, for instance through specialized models acting as a digital twin of the typical desirability/feasibility/viability scorecard process, or with composite machines based on an ensemble of models able to critique each other: for instance, a model that helps with fact-checking or identifies similar solutions, or one with stronger ethical skills applied to the output.
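One way to picture that ensemble-of-critics triage: each (stubbed) model scores an idea on one dimension, and only ideas passing every critic survive. The critic logic and the sample ideas below are invented for illustration; real critics would be separate models, not keyword checks.

```python
# Sketch of machine triage via an ensemble of critics: each critic
# judges one dimension (desirability/feasibility/ethics), and an idea
# survives only if every critic approves. Keyword checks stand in for
# what would really be separate scoring models.
CRITICS = {
    "desirability": lambda idea: "perception" in idea,
    "feasibility": lambda idea: "antigravity" not in idea,
    "ethics": lambda idea: "deceive" not in idea,
}

def triage(ideas: list[str]) -> list[str]:
    return [i for i in ideas if all(check(i) for check in CRITICS.values())]

survivors = triage([
    "mirrors change perception of wait time",
    "antigravity lift with perception cues",
    "deceive riders with perception tricks",
])
```

Because every critic is independent, new models (a fact-checker, a similar-solution detector) can be added to the ensemble without touching the rest of the pipeline.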

Part of this work could be done by humans, especially in large networks to distribute the load and harness varied viewpoints. People could “prune” specific branches of the output and give more emphasis to others, for instance with AI providing summaries and mapping the exploration space more visually, like a hyper-scale form of sticky-note clustering that innovation professionals are familiar with. At any rate, it is very likely that, to generate real breakthroughs, AIdeas will often require augmentation and recombination by human experts who possess innovation and domain skills.

Ultimately, we need to be thinking of ideation as a scalable process, with many steps, some of which are recursive, as illustratively depicted below.

source: supermind.design

These steps, like those in any other deliberately engineered business process, will feature a combination of machines and people — what is called a supermind. Not unlike what one would do with people, some of the machines will have the same skills and will possibly only use different lenses, while others will be different (as in the example of the ethical AI filter mentioned before). The space for designing such processes is much vaster than what we traditionally use, as visualized below.

source: supermind.design

Remember the world before the internet, automation, and Wikipedia? Compared to today, the future will be as different as today is compared to that long-gone time.

Start small and fast. But also, this is the time to think big

To recap, the basic workstreams for an Idea Collider are:

  1. Identify relevant frameworks and classify them
  2. Identify interesting data sets
  3. Create chained flows, with the ability for humans to prune intermediate results, and recombine others
  4. Work on the right UI/X to enable frictionless human-machine collaboration
  5. Work on models for filtering
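Workstream 3 above, the chained flow with human pruning, can be sketched as a loop with a pluggable prune step. The expansion logic below is a stand-in for real LLM calls, and the `prune` callback is where a human (or another model) keeps only the promising branches.

```python
# Sketch of a chained flow with human-in-the-loop pruning: each round
# expands every surviving idea into variants, then a prune callback
# (human reviewer or model) keeps a subset before the next round.
def expand(idea: str) -> list[str]:
    # stub: a real system would ask an LLM for variants of the idea
    return [f"{idea}/variant-{i}" for i in range(3)]

def chained_flow(seed: str, steps: int, prune) -> list[str]:
    frontier = [seed]
    for _ in range(steps):
        candidates = [v for idea in frontier for v in expand(idea)]
        frontier = prune(candidates)  # human keeps the promising branches
    return frontier

# e.g. a reviewer who keeps only the first two branches each round
result = chained_flow("elevator", steps=2, prune=lambda xs: xs[:2])
```

Without pruning, the frontier triples every round; the callback is what keeps the geometric expansion tractable and injects human judgment at each step.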

This is just with current technology and a minimal amount of coding. There’s a lot more, but this is already enough for many to start building a real, large-scale Collider, with evolving contemporary AI capabilities such as chaining, larger context windows, and multiple agents (potentially even swarms). For some experimental designs, apart from the MIT work mentioned earlier, look at what Luca Taroni has recently devised based on these concepts.

The future possibilities are manifold. For instance, AI models increasingly use tools, such as OpenAI Plug-Ins, to perform specific tasks, and Colliders could be a new type of plug-in. Individual organizations could build their own — for instance, consulting firms and innovation departments — provided they allocate the right amount of data engineering capacity and design it well.

And that’s even before deliberately mining knowledge graphs, hence adding a layer of signals that maps connections between topics, between people, and between people and topics — ultimately yielding additional ways of exploring the solution space.

All of this is ready for productive experimentation for quite a few use cases. Innovation organizations that use AI-augmented collective intelligence (ACI), for example in the form of Idea Colliders, stand a real chance to accelerate the future by unlocking the breakthroughs we all need.

Start today.


Founder, Supermind.Design. Head of Innovation Design at MIT's Collective Intelligence Design Lab. Former Chief Innovation Officer at Genpact. Advisory boards.