The SMART Principles: Designing Interfaces That LLMs Understand

Howard Zhou
8 min read · Jun 28, 2024

--

1. Why Care About Interface Design?

Imagine building an LLM application capable of calling external services. The success of this application hinges on its ability to accurately perform these calls. If the LLM misunderstands the interface, the entire operation can fail, making the product unreliable and unusable. Therefore, designing interfaces that LLMs can easily comprehend directly impacts the usability and success of the product.

To ensure your interfaces are clear, simple, and effective, let me introduce the SMART principles. These principles are specifically designed for developing Actions (or Plugins) for platforms like GPTs and are equally applicable to direct LLM Function Calling. Let’s dive into each principle in detail.

2. Mastering the SMART Principles

2.1 Simple Inputs

Keeping input parameters simple and straightforward is crucial for LLMs to understand and process commands effectively. Complex inputs can confuse the model, leading to incorrect or incomplete responses. For example, consider an interface for creating calendar events. There are two types of events: all-day events and events with a specific time range. If we use a single interface to handle both types, the input parameters might look like this:

{
  "title": "Meeting", // event title, required
  "description": "Discuss project roadmap", // event description, optional, default null
  "is_full_day": false, // set to true if the user does not give a specific time range
  "start_time": "2023-08-15 13:00", // required if is_full_day is false
  "end_time": "2023-08-15 14:00", // required if is_full_day is false
  "start_date": "2023-08-15", // required if is_full_day is true
  "end_date": "2023-08-15" // required if is_full_day is true; same as start_date for a one-day event
}

This interface increases the complexity of the input parameters, and LLMs often struggle to construct such inputs accurately, even with advanced models like GPT-4.

However, when the interface is split into two separate ones, each handling a specific type of event, the inputs become much simpler and more manageable for the LLM:

For events with a specific time range:

{
  "title": "Meeting", // event title, required
  "description": "Discuss project roadmap", // event description, not required, default null
  "start_time": "2023-08-15 13:00", // start time of the event, in format 2006-01-02 15:04
  "end_time": "2023-08-15 14:00" // end time of the event, in format 2006-01-02 15:04
}

For all-day events:

{
  "title": "Holiday", // event title, required
  "description": "National holiday", // event description, not required, default null
  "start_date": "2023-08-15", // start date of the event, in format 2006-01-02
  "end_date": "2023-08-15" // end date of the event, in format 2006-01-02, same as start_date if only one day
}

By simplifying the inputs in this way, LLMs can better understand and accurately construct the necessary parameters. This approach is particularly important for interfaces designed to be used by LLMs, ensuring they can effectively interact with the system without confusion.
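In a direct Function Calling setup, the same split can be expressed as two single-purpose tool schemas. The sketch below is illustrative: the tool names and the exact wrapper format are my assumptions, not part of Calendar EVA Now's actual definition.

```python
# Two single-purpose tool schemas, as they might be passed to a function-calling
# API. Splitting them keeps each parameter list small and unconditional: no
# "required if is_full_day is false" rules for the model to get wrong.
TOOLS = [
    {
        "name": "create_timed_event",
        "description": "Create an event with a specific start and end time.",
        "parameters": {
            "type": "object",
            "properties": {
                "title": {"type": "string", "description": "Event title."},
                "description": {"type": "string", "description": "Optional details."},
                "start_time": {"type": "string", "description": "Format: 2006-01-02 15:04"},
                "end_time": {"type": "string", "description": "Format: 2006-01-02 15:04"},
            },
            "required": ["title", "start_time", "end_time"],
        },
    },
    {
        "name": "create_all_day_event",
        "description": "Create an all-day event spanning one or more dates.",
        "parameters": {
            "type": "object",
            "properties": {
                "title": {"type": "string", "description": "Event title."},
                "description": {"type": "string", "description": "Optional details."},
                "start_date": {"type": "string", "description": "Format: 2006-01-02"},
                "end_date": {"type": "string", "description": "Format: 2006-01-02"},
            },
            "required": ["title", "start_date", "end_date"],
        },
    },
]
```

Note that every field in each schema is either always required or always optional; the conditional logic from the merged interface has disappeared entirely.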

2.2 Meaningful Strings

Using meaningful strings instead of numeric enums is essential for LLMs to interpret and process data correctly. Numeric values require additional context and mapping, which can complicate the understanding process for LLMs. On the other hand, meaningful strings are self-explanatory and provide clear information about the data they represent.

For example, consider an interface that categorizes events with labels. Using numeric enums might look like this:

{
  "label": 1
}

In this case, the LLM would need additional context to understand that 1 means "work." This adds an extra layer of complexity and increases the chances of misinterpretation. Instead, using meaningful strings makes the data immediately clear:

{
  "label": "work"
}

This approach simplifies the understanding process for LLMs, reducing the need for additional context and mapping. For front-end engineers working with real users, numeric values might not pose an issue, but for LLMs, meaningful strings significantly enhance stability and performance.
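Meaningful strings at the interface boundary do not force you to give up numeric codes internally. A small translation layer keeps both sides happy; the label set and mapping below are hypothetical.

```python
# Hypothetical label mapping: the API accepts meaningful strings from the LLM,
# while the backend is free to keep compact numeric codes internally.
LABEL_CODES = {"work": 1, "personal": 2, "holiday": 3}

def parse_label(payload: dict) -> int:
    """Translate the LLM-facing string label into the internal numeric code."""
    label = payload.get("label")
    if label not in LABEL_CODES:
        raise ValueError(f"unknown label: {label!r}")
    return LABEL_CODES[label]

parse_label({"label": "work"})  # → 1
```

The LLM only ever sees "work", "personal", or "holiday"; the mapping to integers is an implementation detail it never needs to reason about.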

Another important aspect is returning runtime information, such as the server time and the user's time zone, in a human-readable format. Platforms like GPTs do not automatically include this information in the dialogue context, so as a workaround, include it in each API response; it is then carried into the LLM's context, helping the model produce responses that align with the user's expectations. For instance:

{
  "server_time": "2024-06-27 11:20",
  "user_timezone": "Asia/Shanghai"
}

By using meaningful strings and including necessary runtime information, you enhance the clarity and interpretability of the data, making it easier for LLMs to understand and act upon it correctly. This leads to more reliable interactions and a better overall user experience.

2.3 Avoid Headers

When designing interfaces for LLMs on platforms like GPTs, avoid passing parameters in headers, except for Authorization. The GPTs platform typically does not allow any header other than Authorization, so this is a hard constraint we must design around. You can, however, still configure the Authorization parameter correctly using standard authorization mechanisms.

To include the Authorization parameter in your GPTs Action schema, you can use the following configuration in your paths and components:

In paths:

security:
  - BearerAuth: []

In components:

securitySchemes:
  BearerAuth:
    type: http
    scheme: bearer
    bearerFormat: JWT

This setup ensures that the Authorization parameter is correctly handled by the GPTs platform, following standard authorization mechanisms accepted by GPTs.

On other platforms like Coze, you might be able to specify header parameters directly. However, I still recommend not overusing headers to maintain consistency with GPTs and reduce complexity. By keeping parameters within the body or query string, you make the interface simpler and more transparent for LLMs to process, minimizing the risk of misinterpretation and ensuring more reliable and accurate responses.
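On the client side, the pattern looks like this: auth travels in the Authorization header, and everything else goes in the body or query string. A stdlib sketch, with a placeholder URL and token:

```python
import urllib.request

def build_request(url: str, token: str) -> urllib.request.Request:
    """Build a request whose only header parameter is Authorization.
    All other inputs belong in the URL query string or the request body."""
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}"},
    )

req = build_request("https://evanow.chat/api/v1/auth/users/profile", "TOKEN")
```

Because the header carries nothing but the bearer token, the request shape stays identical across GPTs, Coze, and direct Function Calling clients.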

2.4 Responsibility

Ensuring that each interface has a single responsibility is crucial for maintaining clarity and simplicity in your design. This principle, often referred to as the Single Responsibility Principle (SRP), means that each interface or API endpoint should be dedicated to a single, specific task. This not only makes the interface easier for LLMs to understand but also simplifies debugging and maintenance.

For example, consider an API that handles user management. Instead of having a single endpoint that handles multiple actions like creating, updating, and deleting users, you should have separate endpoints for each action:

  • Create User:
    POST /api/users
    {
      "name": "John Doe",
      "email": "john.doe@example.com"
    }
  • Update User:
    PUT /api/users/123
    {
      "email": "new.email@example.com"
    }
  • Delete User:
    DELETE /api/users/123

By following the Single Responsibility Principle, each endpoint has a clear and distinct purpose, making it easier for LLMs to interpret and execute the appropriate commands. This reduces the risk of errors and improves the overall reliability of your application.
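The three endpoints above can be sketched as three single-purpose handlers over an in-memory store. This is illustrative only; a real service would use a database and an HTTP router, but the shape of the responsibilities is the point.

```python
# Hypothetical in-memory user store; each handler does exactly one thing.
users: dict[int, dict] = {}
_next_id = 1

def create_user(name: str, email: str) -> int:
    """POST /api/users — create a user, nothing else."""
    global _next_id
    user_id = _next_id
    _next_id += 1
    users[user_id] = {"name": name, "email": email}
    return user_id

def update_user(user_id: int, **fields) -> None:
    """PUT /api/users/{id} — update fields of an existing user, nothing else."""
    users[user_id].update(fields)

def delete_user(user_id: int) -> None:
    """DELETE /api/users/{id} — remove a user, nothing else."""
    del users[user_id]
```

An LLM choosing between these three functions has an unambiguous decision; a single "manage_user" function with an "action" parameter would force it to make two correlated choices at once.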

2.5 Transparent Descriptions

Clear and concise descriptions are crucial for ensuring that LLMs can understand and execute commands accurately. Each interface and its parameters should be described in straightforward language, avoiding technical jargon or ambiguity.

For example, consider an API parameter that indicates the number of days an event spans overnight. A vague description might look like this:

  • Vague Description: Number of days the event spans.

This description is not necessarily wrong, but it lacks context and detail. Instead, a transparent description should clearly state the purpose of the parameter, how to use it, and provide examples:

  • Transparent Description:
    cross_days:
      minimum: 0
      type: integer
      description: |-
        Number of days the event spans overnight. Use this parameter for events that span multiple days.
        For example, if the event starts at 11 PM and ends at 6 AM the next day, set this to 1.
        Note: This value should be one less than the total number of days for events that span multiple days.
        For example, if an event spans from July 1st to July 3rd (3 days), set this value to 2.

Clear and detailed descriptions help LLMs understand the exact purpose and usage of each parameter, which is crucial for generating accurate and reliable responses. This is especially important when LLMs need to construct or interpret complex queries. By providing thorough and transparent documentation, you reduce the risk of misinterpretation and improve the overall reliability of interactions with the LLM.
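The rule in that description can be captured in a few lines of code, which is also a handy way to sanity-check the documentation against itself. A sketch using the date format from the earlier examples:

```python
from datetime import datetime

def cross_days(start: str, end: str) -> int:
    """Number of nights an event spans: 0 for a same-day event, 1 if it ends
    the next day, i.e. one less than the total number of calendar days."""
    fmt = "%Y-%m-%d %H:%M"
    s = datetime.strptime(start, fmt)
    e = datetime.strptime(end, fmt)
    return (e.date() - s.date()).days

cross_days("2023-08-15 23:00", "2023-08-16 06:00")  # → 1
cross_days("2024-07-01 09:00", "2024-07-03 18:00")  # → 2
```

Both worked examples from the transparent description fall out of the same one-line computation, which is exactly what a good parameter description should make possible.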

3. Building GPTs Actions: An Engineering Approach

To build Actions for GPTs effectively, a structured engineering approach is essential. I'll use Calendar EVA Now as the running example: it is an intelligent agent that helps users manage their schedules through natural language dialogue. Users can query their schedules, create single or recurring events, and adjust plans with simple commands, which makes it an ideal example for illustrating the process.

Step 1: Build a Test Set

The first step is to create a comprehensive test set that covers common user commands. This allows you to test how well the GPT model understands and processes different inputs. Here is a sample test set for Calendar EVA Now:
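Such a test set can be as simple as a list of command/expectation pairs. The commands, action names, and parameters below are hypothetical placeholders, not Calendar EVA Now's actual test set; the "..." values stand in for concrete dates resolved at test time.

```python
# A hypothetical test set: each case pairs a natural-language command with the
# interface and parameters we expect the model to choose.
TEST_SET = [
    {
        "command": "Schedule a meeting tomorrow from 1pm to 2pm",
        "expected_action": "CreateTimedEvent",
        "expected_params": {"title": "Meeting", "start_time": "...", "end_time": "..."},
    },
    {
        "command": "Mark next Monday as a holiday",
        "expected_action": "CreateAllDayEvent",
        "expected_params": {"title": "Holiday", "start_date": "...", "end_date": "..."},
    },
    {
        "command": "What's on my calendar this week?",
        "expected_action": "ListEvents",
        "expected_params": {},
    },
]
```

Running every interface revision against the same fixed set of commands is what turns "the model seems to get it" into a measurable regression check.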

Step 2: Design the Interface Definition

Before implementing the interface, design its definition using the best practices above and express it in the OpenAPI format that GPTs accept. Then run the definition against your test set to see whether the GPT model selects the correct interface and constructs the right parameters. You don't need to complete the implementation yet; focus on the design and adjust it based on the test results. Here's an example:

OpenAPI Definition:

openapi: 3.1.0
info:
  title: evanow
  contact: {}
  version: xxx
servers:
  - url: https://evanow.chat
paths:
  /api/v1/auth/users/profile:
    get:
      description: get current user profile
      operationId: GetUserProfile
      security:
        - BearerAuth: []
      responses:
        "200":
          description: OK
          content:
            application/json:
              schema:
                allOf:
                  - $ref: '#/components/schemas/http.RspData'
                  - type: object
                    properties:
                      data:
                        $ref: '#/components/schemas/internal_handler_http.UserDetail'
        "400":
          description: Bad Request
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/http.RspBase'
components:
  securitySchemes:
    BearerAuth:
      type: http
      scheme: bearer
      bearerFormat: JWT

Step 3: Implement the Interface

Once you have a well-defined and tested interface, proceed with implementing it. This step involves writing the backend code to handle the API requests as defined. Ensure that each endpoint adheres to the single responsibility principle, and thoroughly test the implementation to confirm it works as expected.
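A handler for the GetUserProfile endpoint might be sketched as follows. The envelope field names (code, message, data) and the profile fields are my assumptions about the http.RspData schema, which the definition above references but does not spell out.

```python
# Hypothetical handler for GET /api/v1/auth/users/profile, returning a
# response envelope of the assumed shape {code, message, data}.
def get_user_profile(user_id: int) -> dict:
    """Return the current user's profile wrapped in a response envelope."""
    profile = {
        "id": user_id,
        "name": "John Doe",          # placeholder data; a real handler
        "email": "john.doe@example.com",  # would load this from storage
    }
    return {"code": 0, "message": "OK", "data": profile}
```

Whatever the envelope looks like in your service, keeping it identical across endpoints gives the LLM one consistent shape to parse.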

Iterative Development

When adding new features, repeat the above process. Start by updating your test set with new user commands, design the interface definition, test and adjust it, and finally, implement the interface. This iterative approach ensures that your Actions remain robust and easy for GPTs to understand and use.

By following these steps, you can create efficient and reliable Actions for GPTs, enhancing the overall user experience and functionality of your applications.

Experience Calendar EVA Now for yourself. The homepage includes entry points for both the GPTs and Coze experiences.

Give it a try and see how convenient managing your schedule with natural language commands can be. Imagine the ease of adjusting your plans without the hassle of manually clicking through a calendar — experience a new way of interacting with your schedule effortlessly.

https://www.youtube.com/embed/PLSwmLn7cxs
