Practical UX Techniques I Learned from Designing and Building an AI-powered Figma Plugin

Philippe Pagé · Published in Bootcamp · 9 min read · Nov 1, 2023

I recently built a Figma plugin that understands your designs more deeply than ChatGPT’s Vision feature, helping GPT-4 respond to questions or generate content based on your specific designs. Where ChatGPT-4V can understand what’s visible, Felix helps LLMs understand what isn’t: exact font sizes, layout, structure, responsiveness, and interactivity.

I wanted to write a full rundown of my process of designing and programming an AI-powered plugin, and share the strategies, tactics, pitfalls, and insights I picked up along the way.

Working with large language models creates both new challenges and opportunities from a design perspective, many of them unique to the nature of LLMs. We’re not just dealing with UIs and traditional data; we’re designing the interaction between people, traditional programs, and these new intelligent systems.

So, where are we going? The landscape of UX and AI is still unfolding, and we’re all learning as we go. As AI models become more complex and their applications more widespread, they will only become more integrated and intertwined with the modern software stack, from LLM services to autonomous programs to even operating systems themselves. The need for UX and product designers to guide these technologies and make them usable, ethical, accessible, and beneficial will only grow.

Information Flows: New Layers

A good design conceals complex data flows, making them feel simple and intuitive. And when developing apps that integrate LLMs, you need to fully understand this new layer of information flows.

Working with LLMs means design decisions can impact both application engineering and prompt engineering, so the system prompt will at times include design considerations as well.

Take text formatting, for example: the system prompt instructs the AI to format its own writing with a technique (Markdown), and the UI renders the result. This means responses don’t come in as a solid wall of text, because the LLM itself breaks them into coherent paragraphs.
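To make this concrete, here’s a minimal sketch of the pattern, not Felix’s actual code: the system prompt asks for Markdown, and the UI renders whatever comes back, assuming a standard renderer like the marked library.

```typescript
import { marked } from "marked";

// The system prompt bakes a formatting instruction into every request,
// so the model structures its own output.
const SYSTEM_PROMPT =
  "You are a design assistant. Format every answer in Markdown, " +
  "using short paragraphs, headings, and lists where they aid readability.";

// Hypothetical render step in the plugin UI: parse the model's Markdown
// into HTML and inject it into the response container.
function renderResponse(markdownText: string): void {
  const container = document.getElementById("response")!;
  container.innerHTML = marked.parse(markdownText) as string;
}
```

The design decision (readable, well-broken-up responses) lives partly in the prompt and partly in the rendering code, which is exactly the dual-layer effect described above.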

From Predefined Use Cases to Open-Ended Prompting

With traditional software design and development, predefined use cases are pretty common and computing processes are deterministic, meaning they strictly follow fixed, predictable rules. But when it comes to AI such as large language models (LLMs), there is an opportunity to create more open-ended use cases due to their inherent stochastic or probabilistic flexibility.

Stochastic flexibility

This means that instead of trying to anticipate and accommodate every possible user need with a static, predetermined flow, we can structure the system in a way that exposes some of the actual prompt engineering to the user. This flexibility makes AI systems more adaptable and user-friendly, as they can dynamically adapt to a wide range of inputs. That said, it also brings with it challenges in terms of usability; as users become more involved in shaping prompts and interactions, the system should compensate for that load with clear guidance.

Sandboxes

The Felix plugin actually started off as a precise tool focused on a specific set of tasks: analyze this design, give feedback on that, generate alternative copy for each field. But I realized its potential wasn’t in carrying out a small set of specific functions; it was in being an open-ended tool. That meant removing hardcoded prompts and exposing the prompt input to the user. Instead of being an appliance with a few features, it would be a sandbox.
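As a rough before-and-after of that shift (the names here are hypothetical, not Felix’s actual code):

```typescript
// Before: an "appliance" with fixed features, each baked into a prompt.
const FEEDBACK_PROMPT = "Give UX feedback on the following design: ";

// After: a "sandbox" that forwards whatever the user writes, paired with
// a serialized snapshot of the current Figma selection.
function buildRequest(userPrompt: string, serializedSelection: string) {
  return [
    { role: "system", content: "You are a design assistant inside Figma." },
    {
      role: "user",
      content: `${userPrompt}\n\nSelected design data:\n${serializedSelection}`,
    },
  ];
}
```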

Emergent Use Cases

This open selection-plus-prompt dynamic makes the tool far more versatile. You can ask anything, about anything, and new use cases emerge over time from whatever combinations of design inputs and prompts users come up with.

Making room for user-facing prompt engineering

AI-integrated apps have to handle user inputs that range from simple chat-style requests to more complex prompt engineering. A text input field with a static height makes it harder for users who are constructing longer prompts to review and modify them, leading to mistakes and a clunky experience.

That’s why I felt it was important that the text box for user input dynamically expands based on the content. By making the text input field expandable, the application enables complex queries without sacrificing usability. It’s an admittedly small but impactful way to improve the user’s experience when interacting with open-ended LLMs.
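Here’s roughly how that auto-expanding input can be wired up in plain DOM code (a sketch; the element ID is an assumption):

```typescript
const input = document.getElementById("prompt") as HTMLTextAreaElement;

input.addEventListener("input", () => {
  input.style.height = "auto"; // reset first so the field can also shrink
  input.style.height = `${input.scrollHeight}px`; // then grow to fit content
});
```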

Clearing the API key hurdle gracefully

Felix beta is a static application with no servers of its own. That means I can run it for free for now while I gather feedback, talk to users, ship fixes and updates, and work on a more advanced version of the app. That’s great for the product’s development, but it also means users need to sign up for and connect their own AI service to Felix. (For those who aren’t familiar, an API key is just a private string you paste into an app, Felix in this case, to connect it to a service like OpenAI’s GPT models.)

Launching with an external API key dependency allowed me to launch Felix early and get feedback, but it also created a bit of a hurdle: not everyone knows what an API key is, or where to get their own.

To handle this, Felix starts on first load with a notice card under the plugin’s name. The card contains a key icon and text prompting you to add an OpenAI API key to get started. The design choice here is about immediate guidance and a clear next step.

Below the API notice, the interface allows local interactions even without a key. The selection title and size estimate default to “Make a selection,” with the subtext, “Take your time, no rush.” The intent is to encourage exploration without pressure. Although you can’t send requests without an API key, you can still see token and cost estimates, making the tool partially functional. The goal behind this is straightforward: minimize initial barriers and let users get a feel for the tool before committing to making requests.

Below the API key input field is a set of external links that take users to the Sign Up, Usage, Payment Limits, and Manage API Keys sections of OpenAI’s platform site. Since those pages are hosted outside the plugin, I needed to make sure they were always accessible.

Next, the key input’s placeholder starts with ‘sk-’, an indicator to include the sk- prefix with the key. If the character count doesn’t match the expected length, an error message appears on submission. After the key is saved, the button label changes to ‘Delete,’ and a “Key Updated” subtext appears below the field. The field is masked except for the last 4 characters, so keys stay protected but remain identifiable. By the time users navigate back to the main screen, the notice will have disappeared and requests are re-enabled.
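A simplified sketch of that key-handling flow might look like this; the length threshold, storage key name, and masking character are assumptions, not Felix’s exact values:

```typescript
// Basic shape checks before accepting a key.
function validateApiKey(key: string): string | null {
  if (!key.startsWith("sk-")) return "OpenAI keys start with 'sk-'.";
  if (key.length < 40) return "That looks too short for an OpenAI key.";
  return null; // looks valid
}

// Mask everything but the last 4 characters for display.
function maskKey(key: string): string {
  return "•".repeat(Math.max(key.length - 4, 0)) + key.slice(-4);
}

// Figma plugins can persist the key locally via clientStorage
// (available in the plugin's main thread, not the UI iframe).
async function saveKey(key: string): Promise<void> {
  await figma.clientStorage.setAsync("openai-api-key", key);
}
```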

Because LLM requests through OpenAI’s API are priced per token, it’s important to avoid sending a request without a prompt, or sending a comprehensive prompt without a selection. To help prevent that, I disabled the send button in those circumstances and added a subtle bounce effect to flag the issue when the button is clicked while either the selection or the prompt is empty.
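In code, that guard might look something like this (a sketch with assumed element IDs; since truly disabled buttons swallow clicks, the check lives in the click handler):

```typescript
declare function sendRequest(prompt: string): void; // hypothetical request helper

function onSendClick(hasSelection: boolean, prompt: string): void {
  if (!hasSelection) {
    bounce(document.getElementById("selection")!); // nudge: pick something first
    return;
  }
  if (prompt.trim() === "") {
    bounce(document.getElementById("prompt")!); // nudge: write a prompt first
    return;
  }
  sendRequest(prompt);
}

// Trigger a short CSS keyframe animation, then clean up the class.
function bounce(el: HTMLElement): void {
  el.classList.add("bounce");
  setTimeout(() => el.classList.remove("bounce"), 300);
}
```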

Hidden Challenge of Costs

Felix’s functionality is based on a specific selection within Figma, so it’s important to display what’s selected clearly. Details are automatically pulled in when the selection changes, and the name of the top-level element is displayed in the UI. If a single element like a frame is selected, its name appears under Selection; when multiple elements are selected, the count of top-level elements is shown instead.
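In a Figma plugin, that labeling can hang off the plugin API’s selection-change event. A minimal sketch of the idea:

```typescript
// Runs in the plugin's main thread; the UI receives the label via postMessage.
figma.on("selectionchange", () => {
  const selection = figma.currentPage.selection;
  const label =
    selection.length === 0
      ? "Make a selection"
      : selection.length === 1
      ? selection[0].name
      : `${selection.length} elements selected`;
  figma.ui.postMessage({ type: "selection", label });
});
```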

One of the more challenging aspects was handling the economics: because AI generations cost money, those costs should be surfaced to the user. I didn’t use a real tokenizer to estimate the size of a request before running it; that seemed overkill for pre-generation selections that may never lead to a generation, and it added development complexity (I’m no expert programmer). Instead, I ran tests with the real tokenizer models to get a rough estimate, plus a range that compensates for the estimate’s variance from the true count.

By measuring the character-to-token compression ratio, and the maximum variance from that ratio, you can produce a range that compensates for its own inaccuracies. This helps users get an idea of the size and cost of a selection per model right when a new selection is made. GPT-4 calls are 10x more expensive than GPT-3.5-Turbo-16k, so it helps to see the implications in terms of prices per selection *and* per model, during use, rather than afterward in a billing statement.
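Here’s a sketch of that estimation approach. The ratio and variance values are placeholders for the ones you’d measure empirically, and the prices are roughly OpenAI’s published per-token rates at the time, included for illustration:

```typescript
const CHARS_PER_TOKEN = 4; // measured average compression ratio (assumed value)
const VARIANCE = 0.15; // maximum observed deviation from that ratio (assumed)

// Turn a character count into a token range that absorbs the estimator's error.
function estimateTokenRange(text: string): { low: number; high: number } {
  const estimate = text.length / CHARS_PER_TOKEN;
  return {
    low: Math.floor(estimate * (1 - VARIANCE)),
    high: Math.ceil(estimate * (1 + VARIANCE)),
  };
}

// Approximate input prices per 1K tokens (illustrative; note the 10x gap).
const PRICE_PER_1K_INPUT_TOKENS: Record<string, number> = {
  "gpt-4": 0.03,
  "gpt-3.5-turbo-16k": 0.003,
};

function estimateCost(tokens: number, model: string): number {
  return (tokens / 1000) * PRICE_PER_1K_INPUT_TOKENS[model];
}
```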

Response Handling

Originally I had descoped streaming due to the relative complexity of handling the stream as it comes in, so I designed a spinner to cover the wait time. But after a few tests it became clear that waiting for the entire response can add delays of 5 to 10 seconds on longer responses. Given our expectation that digital systems of any type respond quickly, it felt broken right up until the message finally arrived. In the end I decided this was basically a showstopper, and that I’d need to add streaming back in, even if it took a bit of extra time.
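For reference, here’s a stripped-down sketch of streaming from OpenAI’s chat completions endpoint with fetch. Real code needs error handling and care with SSE chunks that split mid-line; both are omitted here:

```typescript
declare function appendToResponseUI(text: string): void; // hypothetical UI helper

async function streamCompletion(
  apiKey: string,
  messages: Array<{ role: string; content: string }>
): Promise<void> {
  const res = await fetch("https://api.openai.com/v1/chat/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify({ model: "gpt-4", messages, stream: true }),
  });

  const reader = res.body!.getReader();
  const decoder = new TextDecoder();

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    // Each chunk holds one or more "data: {...}" server-sent-event lines.
    for (const line of decoder.decode(value).split("\n")) {
      if (!line.startsWith("data: ") || line.includes("[DONE]")) continue;
      const delta = JSON.parse(line.slice(6)).choices[0].delta.content;
      if (delta) appendToResponseUI(delta); // paint tokens as they arrive
    }
  }
}
```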

UX involves determining how we process and present data from the back end and its services. For example, the Markdown text formatting technique I mentioned above makes responses easier to read and understand, as it gives the model a say in how the text should be formatted, adding another layer of nuance. It’s a common theme when designing AI-integrated applications that UX solutions impact both traditional programming and prompt engineering.

Guidance and Edge Cases

Error handling isn’t just a necessary afterthought; it’s a chance to guide users toward the best path of use. Stray from that path, and the design should nudge you (gently) in the right direction. For an open-ended, sandbox-like tool, the aim was always to minimize confusion and provide clear paths to resolving errors throughout the discovery process.

The goal was to catch scenarios where users would hit an API error due to an oversized selection, and to suggest selecting less content or changing models before they write out an entire prompt and click send. These cautionary messages are non-blocking: the send button isn’t disabled when the selection is oversized, so users can still call the API and get a response error with the exact token length of their selection and the maximum input size.

Supporting models with different context sizes adds a bit of complexity, but it’s a great example of that nudging. If your selection exceeds the smaller model’s context window, Felix kindly suggests either selecting fewer elements or switching to the model with the larger context window. If you surpass the size limit of the largest model, Felix suggests only trying to select less.
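A sketch of that nudging logic, using the published context limits for the two models at the time:

```typescript
const CONTEXT_LIMITS: Record<string, number> = {
  "gpt-4": 8192,
  "gpt-3.5-turbo-16k": 16384,
};

// Returns a warning message, or null if the selection fits the chosen model.
function contextNudge(estimatedTokens: number, model: string): string | null {
  if (estimatedTokens <= CONTEXT_LIMITS[model]) return null;
  const larger = Object.entries(CONTEXT_LIMITS).find(
    ([, limit]) => estimatedTokens <= limit
  );
  return larger
    ? `This selection won't fit in ${model}. Select fewer elements or switch to ${larger[0]}.`
    : "This selection exceeds every model's context window. Try selecting less.";
}
```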

When users make a massive selection, it’s important that Felix doesn’t just get overloaded. When the selection reaches a certain number of elements, it bails on the calculations to avoid freezing up and essentially says, ‘Whoa, you’re pushing it a bit too hard here, try selecting less.’
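And a sketch of the bail-out itself: stop traversing the selection once a node budget is exceeded. The threshold here is an assumption, not Felix’s real number:

```typescript
const MAX_NODES = 2000; // assumed budget before Felix gives up counting

// Returns the node count, or null if the selection blows past the budget.
function countNodes(nodes: readonly SceneNode[]): number | null {
  let count = 0;
  const stack = [...nodes];
  while (stack.length > 0) {
    const node = stack.pop()!;
    count += 1;
    if (count > MAX_NODES) return null; // bail before the UI freezes
    if ("children" in node) stack.push(...node.children);
  }
  return count;
}
```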

(Of course this can all be improved with an embeddings model, vector database and RAG implementation like Pinecone.io to create a dynamic context window that can handle an entire design file, but that’ll be a next step.)

Closing Thoughts

I think the power of AI is to help us be more effective at what we love to do. AI should be assistive for people, not competitive with us. The goal should ultimately be to augment and assist human capabilities in the process of designing software, websites, apps, or anything else.

I don’t think we have anything to fear from AI offloading menial intellectual tasks; it will free us to focus on the more important things.

As designers, we’re going to be designing not only AI-integrated apps; eventually AI will become a core component of the tech stack, all the way up to operating systems. So we’re better off familiarizing ourselves with the technology and the specifics of designing for it. This is only an introduction to the subject, but I hope it’s helpful to anyone out there.

Twitter: @diet_rams

Website: felixforfigma.com

Felix Plugin Download: Felix AI on Figma Community

