It all starts with simple drawings

A Developer’s Journey To the AI and GraphQL Galaxy

The article that helps you tame an LLM into producing a GraphQL query for a specific API

10 min read · Oct 3, 2023


Generative AI has stirred debates. While some might regard it as a fleeting trend, I view it as a game-changer. AI and Large Language Models (LLMs) combined give rise to AI-powered chatbots that hold great potential for improving operations and customer service. Such AI bots are designed to communicate with an entire system via an API. Sounds okay… unless your API is built using GraphQL. What’s the trick, you ask? GraphQL stands out as a potent query language due to its flexibility, versatility, and extensibility. Getting an LLM to produce a perfect and valid GraphQL query for a specific API is challenging. Some even say it’s impossible…

At monday.com, innovation is our mantra. We set off on the AI journey some time ago, and I soon realized that mastering AI feels like discovering a new galaxy far, far away — there are no ready answers, so you need to trust your gut feeling and push the boundaries. And the thrill of making the impossible possible makes my determination surge like the Force within a Jedi.

We started at monday.com with foundational tasks such as “Summary Update” and “Generate tasks”. These tasks, powered by basic prompts, enabled app developers to create the first bespoke AI solutions within the monday.com framework.

Our ambition grew when we decided to create the Implementation Consultant. Think of it as a “droid” that assesses monday.com users’ needs and offers them bespoke boards. Once a user gives the go-ahead, this “droid” seamlessly crafts a GraphQL mutation over the monday.com API and creates the individually tailored solution.

However, using our LLM to generate these GraphQL API calls was like navigating an asteroid field — so tricky! Our quick solution, translating the LLM’s JSON output into GraphQL mutations, felt more like a temporary patch than the ultimate fix.

Determined, I set my sights on a more streamlined solution.

Disclaimer: This is not a production-ready, battle-tested solution. This is my personal exploration of a possible solution.

The Eureka Moment

Initially, I hoped to use a direct approach: pluck out the schema from our API and mesh it into the prompt. So, when someone commands “create a board with status, date, and assignee columns”, the prompt would weave together the query’s specifics, the nuances of GraphQL, and our API’s schema.
But my hopes clashed with reality. We’ve got lots of entity types — over 200! Not to mention countless resolvers, fields and arguments. The result? An overwhelming token count that exceeded our prompt’s limit. Darn.

70k tokens!!!!!!
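
For the curious, this is roughly what that direct approach looks like in code. A sketch only: fetchSchemaJson is a name I made up for illustration, and the endpoint and auth header follow monday.com’s public API conventions:

const { getIntrospectionQuery } = require('graphql');

async function fetchSchemaJson(apiToken) {
  const response = await fetch('https://api.monday.com/v2', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json', 'Authorization': apiToken },
    body: JSON.stringify({ query: getIntrospectionQuery() }),
  });
  const { data } = await response.json();
  // Stringifying the entire schema is what blew past the prompt's token limit.
  return JSON.stringify(data);
}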

My ‘aha!’ moment arrived when I stumbled upon VectorDB and the marvel of similarity search. A spark ignited: Why not feed Vector DB with a bunch of GraphQL query examples from our monday.com API? We could then search for queries that meet the user’s request and make them the core of our prompt.
Luckily, our analytics data storage held a treasure trove of valid GraphQL queries.

Anonymized GraphQL Queries

Yet a challenge remained: each query needed a description, a narrative that would match a user’s request. If a user desires to “create a board with status, date, and assignee columns”, we need a query with a description that echoes this wish.
Fortunately, I didn’t have to painstakingly craft thousands of descriptions by myself: an LLM (via MakerSuite) had my back. The exact prompt I ran appears in the coding section below.


Thus the final piece of the puzzle fell into place. Now let’s see how this solution can be broken down into smaller steps.

Preparation phase:

  1. Retrieve a CSV filled with all valid GraphQL queries.
  2. Execute a script where the LLM acts as a detective, extracting a comprehensive description for each query and storing it in the CSV.
  3. Convert each query description into a vector.
  4. Store the vectorized description, along with the associated GraphQL query as metadata, in the VectorDB.
Preparation phase
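
In code, the whole preparation phase boils down to two calls (both functions are implemented later in this article):

await generateDescription();   // steps 1–2: read the CSV, add an LLM-generated description per query
await fillIndexWithVectors();  // steps 3–4: embed each description and upsert it into Pinecone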

Execution phase:

  1. Transform the user input into a vector.
  2. Search for similarities in the VectorDB using this vector, zeroing in on the top K results that resonate.
  3. Weave these top K results into a prompt that nudges the LLM to produce a valid GraphQL query.
  4. Sit back, relax, and enjoy the magic.
Execution Phase (search)
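
And at runtime, the whole execution phase is a single call (getGraphQLQuery is implemented in “The best part” below):

const { query } = await getGraphQLQuery('create a board with status, date, and assignee columns');
console.log(query);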

Let’s get coding

Before we start, a word of caution: the examples here are basic. Think of them as a backbone missing its muscles: no error handling, no input validation, and so on. So, if you’re contemplating pulling a “Han Solo”, all guns blazing with a direct copy-paste… think twice. Remember, it’s always more thrilling to conjure your own asteroid field of bugs! 😜

As for the eagle-eyed who spotted the use of JavaScript, there are solid reasons behind this decision:

  • JavaScript has an expansive audience, making it one of the most recognized and adopted languages.
  • We’ve leaned heavily on JavaScript at monday.com.
  • While Python shines in data science-specific tasks, outside that niche, there isn’t a compelling advantage to picking Python over JavaScript for our use case.

Where’s the full code? Right here: graphql-assistant repo.

Getting a CSV:

It’s all about the data. I had to choose: manually jot down queries or scoop up historical ones from analytics. I chose the latter (less sweat, more gain). The golden rule: strip away any personal data.
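
A minimal sketch of that stripping step, assuming a simple regex pass is enough for your data (anonymizeQuery is a hypothetical helper; real PII scrubbing deserves more care):

// Replace string literals and long numeric IDs with placeholders
// so no personal data leaks into the dataset.
function anonymizeQuery(query) {
  return query
    .replace(/"[^"]*"/g, '"<value>"')  // string arguments
    .replace(/\b\d{5,}\b/g, '<id>');   // numeric IDs
}

console.log(anonymizeQuery('mutation { create_item (board_id: 1234567890, item_name: "John Doe") { id } }'));
// mutation { create_item (board_id: <id>, item_name: "<value>") { id } }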

On our next mission, getting a description for each query, the LLM gets the spotlight.

Enter MakerSuite. It’s free, powered by the PaLM API, and as accessible as the Mos Eisley Cantina. This is how I charmed the LLM into crafting text in response to a prompt:

// The PaLM API client and auth come from the repo's dependencies.
const { TextServiceClient } = require('@google-ai/generativelanguage');
const { GoogleAuth } = require('google-auth-library');

const prompt = async (promptText, config) => {
  const MODEL_NAME = 'models/text-bison-001';
  const API_KEY = process.env.GOOGLE_PALM_API_KEY;

  const client = new TextServiceClient({
    authClient: new GoogleAuth().fromAPIKey(API_KEY),
  });

  console.log(`Calling MakerSuite API with prompt: ${promptText}`);
  const result = await client.generateText({
    model: MODEL_NAME,
    prompt: {
      text: promptText,
    },
    ...config, // optional overrides, e.g. { temperature: 0.1 }
  });
  console.log('Result:', JSON.stringify(result));
  return result;
};
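
A quick smoke test of the wrapper (assuming GOOGLE_PALM_API_KEY is set in your environment; the response shape matches how it’s consumed later in this article):

const result = await prompt('Say hello in the voice of Yoda.');
console.log(result[0].candidates[0].output);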

MakerSuite offers three models:

  • text-bison-001: for text generation
  • chat-bison-001: for chit-chat interactions
  • embedding-gecko-001: for crafting vectors (embeddings)

I used this prompt:

function createPromptGraphQLDescription(query) {
  return 'You\'re an expert in GraphQL and intimately familiar with the monday.com API. \n' +
    'I have a GraphQL query from the monday.com API that I\'d like you to explain. \n' +
    'Please cover the query\'s purpose, the arguments it accepts, and the output it generates.\n' +
    '\n' + 'Here is the GraphQL query:\n' +
    '\n' + '```graphql\n' +
    `${query}\n` +
    '```'; // close the fenced block so the model sees a complete snippet
}

The full logic of reading the CSV and playing with MakerSuite’s API, row by row, looks like this:

// readCSV, sanitizeForCSV, writeFileAsync, inputFile and outputFile are
// helpers and constants defined in the repo (a sketch of them follows below).
async function generateDescription() {
  const rows = await readCSV(inputFile);
  const outputData = [];

  for (let i = 0; i < rows.length; i++) {
    const row = rows[i];
    const query = row.query;
    try {
      const promptText = createPromptGraphQLDescription(query);
      const result = await makersApi.prompt(promptText);
      const candidates = result[0].candidates;
      // Because the output might have multiple lines and special characters,
      // I had to sanitize it
      const sanitizedDesc = sanitizeForCSV(candidates[0].output);
      const sanitizedQuery = sanitizeForCSV(row.query);
      outputData.push({description: sanitizedDesc, query: sanitizedQuery});
    } catch (error) {
      console.error(`Error processing row with query: ${query}`, '\n', error);
    }
  }

  // Prepend a header row so the next script can address columns by name.
  const csvData = ['description,query']
    .concat(outputData.map((row) => `${row.description},${row.query}`))
    .join('\n');
  try {
    await writeFileAsync(outputFile, csvData);
  } catch (error) {
    console.error('Error writing to output file:', error);
  }
}
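
The script above leans on a few helpers from the repo. In case you want to follow along without it, here is a minimal sketch of what they might look like (the csv-parser package is my assumption; the repo may use a different CSV library):

const fs = require('fs');
const { promisify } = require('util');
const csv = require('csv-parser');

const writeFileAsync = promisify(fs.writeFile);

// Stream the CSV into an array of row objects keyed by header names.
function readCSV(file) {
  return new Promise((resolve, reject) => {
    const rows = [];
    fs.createReadStream(file)
      .pipe(csv())
      .on('data', (row) => rows.push(row))
      .on('end', () => resolve(rows))
      .on('error', reject);
  });
}

// Collapse newlines and escape the value so commas and quotes survive CSV.
function sanitizeForCSV(value) {
  const flattened = String(value).replace(/\r?\n/g, ' ');
  return `"${flattened.replace(/"/g, '""')}"`;
}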

As a result, we’re the proud owners of a CSV containing queries and their descriptions.

Vectorize the descriptions

For this, we use the “gecko” model:

// Reuses the TextServiceClient and GoogleAuth imports from the text-bison wrapper.
const embeddings = async (text) => {
  const MODEL_NAME = 'models/embedding-gecko-001';
  const API_KEY = process.env.GOOGLE_PALM_API_KEY;

  const client = new TextServiceClient({
    authClient: new GoogleAuth().fromAPIKey(API_KEY),
  });

  const result = await client.embedText({
    model: MODEL_NAME,
    text: text,
  });

  // The embedding object holds the vector in its `value` field.
  return result[0].embedding;
};
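
A quick sanity check of the output (using the function above):

const embedding = await embeddings('create a board with a status column');
console.log(embedding.value.length); // 768 numbers, matching the index dimension below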

With vectors (or, for the fancy folk, embeddings) in hand, it’s Pinecone’s time to shine. Why Pinecone? It’s free and simple. Pinecone stores our vectors in what it calls an index — think of it as a table. You can play with this index, querying, erasing, searching… the whole nine yards.

Pinecone

When crafting this index, Pinecone asks for vector dimensions. For the MakerSuite Embeddings model (gecko), this magic number is 768. Just imagine an army of 768 numbers marching out every time “gecko” transforms text into a vector. Euclidean is our chosen distance metric.
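
Creating the index is a one-time operation. A sketch, assuming the Pinecone Node.js SDK (the exact createIndex options vary across SDK versions and plans; newer versions also expect a “spec” describing pod or serverless configuration):

const { Pinecone } = require('@pinecone-database/pinecone');

async function createQueryIndex() {
  const pinecone = new Pinecone(); // reads PINECONE_API_KEY from the environment
  await pinecone.createIndex({
    name: 'monday-com-graphql-query',
    dimension: 768,      // embedding-gecko-001 output size
    metric: 'euclidean', // our chosen distance metric
  });
}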

Next, we transform each description into an embedding. This code walks through our CSV rows:

async function fillIndexWithVectors() {
  const rows = await readCSV(outputFile);
  for (let i = 0; i < rows.length; i++) {
    const row = rows[i];
    const item = {
      id: i, description: row.description, query: row.query,
    };
    try {
      console.log(`Processing row ${i + 1} out of ${rows.length}`);
      await vectorDbApi.addVectors([item]);
    } catch (error) {
      console.error(`Error processing row with values: ${JSON.stringify(item)}`, '\n', error);
    }
  }
  return 'Success';
}

For each row that contains a description and a GraphQL query, we call the embeddings API to turn the description into a vector and then store it, together with the query, in Pinecone via the “upsert” function.

const { Pinecone } = require('@pinecone-database/pinecone');

const addVectors = async (items) => {
  if (!Array.isArray(items)) {
    throw new Error('Items must be an array');
  }
  const pinecone = new Pinecone();
  const records = await Promise.all(
    items.map(async (item) => {
      // Embed the description; the query travels along as metadata.
      const values = (await makersApi.embeddings(item.description)).value;
      return {
        id: item.id.toString(),
        values: values,
        metadata: {
          ...item,
        },
      };
    }),
  );

  const index = pinecone.index('monday-com-graphql-query');
  await index.upsert(records);
};

Now the initial setup is complete. Onwards!

Example of index with vectors
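
If you want to double-check from code that the upserts landed, Pinecone can report index statistics (a sketch; the exact shape of the stats object depends on your SDK version):

const stats = await pinecone.index('monday-com-graphql-query').describeIndexStats();
console.log(stats); // the record count should match the number of CSV rows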

The best part:

Alright, team, the stage is set, the pieces are in place, and it’s time to unleash the Rancor! (Alright, it’s not a Rancor, but you get the idea :)

Step 1: Take what our user writes and turn that into an embedding, then let it run wild in Pinecone:

const queryVectors = async (text, resultSize) => {
  const pinecone = new Pinecone();
  const vector = (await makersApi.embeddings(text)).value;
  const index = pinecone.index('monday-com-graphql-query');
  const results = await index.query({
    topK: resultSize,
    vector,
    includeMetadata: true,
  });
  // Output all results to the console for debugging
  console.log("Got results for search:", JSON.stringify(results));
  // Print the distances of the similar vectors. Check that they make sense and that we have good data.
  console.log("Distances are:", results.matches.map((match) => match.score).join(", "));
  return results;
};

Step 2: Having found our vectors, it’s time to make the actual query out of them, so that it expresses what our user initially sought. Here’s how we do it:

async function getGraphQLQuery(text) {
  const response = await vectorDbApi.queryVectors(text, 6);
  const similarQueries = response.matches.map((vector) => {
    return {
      query: vector.metadata.query, description: vector.metadata.description,
    };
  });

  const prompt = getSearchGraphQLPrompt(text, similarQueries);
  // A low temperature keeps the model close to the supplied examples.
  const result = await makersApi.prompt(prompt, {temperature: 0.1});
  return {
    // cleanFormattingCharacters is a repo helper that strips stray
    // formatting (such as markdown fences) from the model output.
    query: cleanFormattingCharacters(result[0].candidates[0].output),
  };
}

Want to know what the prompt looks like? Feast your eyes:

function getSearchGraphQLPrompt(text, similarQueries) {
  let similarQueriesText = '';
  if (similarQueries.length > 0) {
    similarQueriesText = 'For contextual understanding, consider the ' +
      'following known queries that have similar purpose: \n';
  }
  similarQueries.forEach((query, index) => {
    similarQueriesText += `#### Set ${index + 1}:\n` + `Query: '''\n${query.query}'''\n`;
  });
  return 'Your task: Given the user\'s free-text sentence below, ' +
    'convert it into a valid GraphQL query using the monday.com API. ' +
    'Incorporate any specific parameters like \'board IDs\', \'column ids\', \'column values\', etc., ' +
    'mentioned in the user\'s description into the resulting GraphQL query. ' +
    'Output the query only, nothing else.\n' + '\n' +
    'User\'s Query Description:\n' + `'''${text}'''` + '\n' + `${similarQueriesText}` + '\n' +
    'REMEMBER to return only a valid GraphQL query.\n' + '\n' + 'Your GraphQL query:';
}

And… voilà! That’s the magic trick, folks.

To paint a clearer picture, here’s a nifty before-and-after of our transformation process. And to make it harder, I’ll ask for an API that we only recently introduced at monday.com — the Items Page API:

To Infinity and Beyond!

As you see, the results are nothing short of fantastic. But this is just our opening act. What’s next? Well, let me pull back the curtain a tad:

1. The Art of Refinement: Through my experiments, I’ve realized it might be wiser not to throw every query into the mix. Instead, it’s about crafting a few masterful, diverse GraphQL queries tailored to specific APIs. This select group would address most use cases, ensuring better results.

2. Extra Layers to the Prompt: Imagine our prompt having not just an example of a matching query but also the schema relevant to the API in that example. This extra layer could potentially shoot our accuracy through the roof!

3. A Testing Playbook: Picture a test set comprising “user input” and its corresponding GraphQL query. We run our solution, take both the LLM’s response and the expected query from the test set, convert them into embeddings, and what then? Measure the gap (distance) between the two vectors. A tiny gap? That’s the LLM giving us a high-five with an almost perfect query!
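
Here is a sketch of that measurement, reusing the embeddings wrapper from earlier (euclideanDistance is a hypothetical helper, matching the distance metric of our index):

// Embed both queries, then measure how far apart the vectors are.
function euclideanDistance(a, b) {
  return Math.sqrt(a.reduce((sum, ai, i) => sum + (ai - b[i]) ** 2, 0));
}

async function scoreTestCase(expectedQuery, generatedQuery) {
  const expected = (await makersApi.embeddings(expectedQuery)).value;
  const generated = (await makersApi.embeddings(generatedQuery)).value;
  return euclideanDistance(expected, generated); // smaller = closer to a perfect query
}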

Now, if you thought this was the end of our journey — not so fast! Just as George Lucas surprised us with more Star Wars after Episode 3, I’ve got more up my sleeve. I’m going to share my learnings about GenAI and my journey to AI expertise.

But for today, as the twin suns of Tatooine set, I say ‘good-bye’. May the Code be with you!

What’s the best gift for someone who wrote such a long article? Right, a comment and a bunch of claps :) What did you think? Did you find it useful or inspiring?

