AI-Assisted JS Development

Supporting companies with the help of Brainiac and LangchainJS

Radosław Warisch
Brainly Technology Blog
7 min read · Jun 29, 2023


Photo by @possessedphotography on unsplash.com

In recent times, the emergence of AI-driven tools such as GitHub’s Copilot and OpenAI’s ChatGPT has greatly improved developer productivity. This article addresses a common challenge: integrating AI technology to strengthen code practices and knowledge sharing within an organization.

Motivation

My venture into this domain was sparked by a request from a fellow developer to clarify a concept from the documentation:

Developer: How to add a context into Brainiac module?
Me: You have to add it to contextProviders array in module declaration. We have an example somewhere, let me find it.
Me (3 minutes later): Example is here: <link>
Developer: Thanks!

After a lengthy manual explanation, I used LangchainJS to query GPT-4, feeding the Brainiac documentation in as context. I had been playing with this tool recently, and it seemed like a good way to spread practices more efficiently.

Specifically, for those who are not machine learning experts (like me), LangchainJS is a convenient library offering ready-to-use APIs around machine learning utilities. It allows non-experts like myself to build solutions on top of models like ChatGPT or Bard.

Going back to the results — the output was a comprehensive explanation, delivered in seconds, instead of the usual minutes:

% yarn ask-brainiac "How to add context into Brainiac module?"

To add context to a Brainiac module, you need to provide the context providers in the configuration of the module. Then, render the context providers in the same order as in the definition. Here's an example:

First, define the contextProviders in the module configuration:

    contextProviders: [StoreProvider, AnotherStoreProvider];

Then, render the contexts in the correct order:

    <StoreProvider>
      <AnotherStoreProvider>
        <MyModule />
      </AnotherStoreProvider>
    </StoreProvider>

A working example of a Brainiac module with context can be found here.

After this successful example, I decided to develop more ways to assist Software Engineers at Brainly in understanding our organization’s code practices and, in the future, to provide ways to generate large chunks of high-quality code.

Disclaimers

Before we delve further, there are a few important points to bear in mind:

  1. While this article primarily discusses tools and techniques that lean towards machine learning, it’s worth noting that I am not a machine learning engineer. The insights provided here are drawn from my experiences and are intended to be useful for other non-ML engineers by leveraging LangchainJS.
  2. When feeding your organization’s documentation into the bot, remember that it should not contain sensitive or proprietary data. The code examples used in this context should be generic and not reveal any internal secrets or sensitive information.
  3. There’s ongoing debate and uncertainty around copyright issues relating to bot-generated code. As such, the use of AI-assisted code in production applications should be undertaken with caution and due consideration. Until there’s clearer legislation or guidelines on this matter, it’s recommended to use these tools primarily for educational, experimental, or internal purposes.

Brainiac - AI-friendly JS framework

Brainiac, mentioned in the previous section, is Brainly’s in-house meta-framework based on React and NX. It leverages the NX ecosystem to create a set of libraries that correspond to particular modules, services, and components. I consider Brainiac an excellent base for future enhancement by AI due to two of its features:

  • Highly opinionated and free of unnecessary abstractions — the AI bot’s responses will be more consistent when the guidelines are clear and simple.
  • Rich in automation — when a prompt asks for a step-by-step implementation, the AI bot can refer to the generators’ usage. The bot won’t be forced to waste tokens generating the code itself; it can simply point to running the code generator with the proper arguments.

I encourage you to get familiar with the linked articles about Brainiac; I hope they will influence your organization’s code practices. We plan to open-source Brainiac at an as-yet-undisclosed point in the future.

How it works

Here is an example of a simple command-line interface tool that responds to a query from the terminal using LangchainJS and GPT-4. It is based on the LangchainJS example for RetrievalQAChain.

import { OpenAI } from "langchain/llms/openai";
import { RetrievalQAChain } from "langchain/chains";
import { HNSWLib } from "langchain/vectorstores/hnswlib";
import { OpenAIEmbeddings } from "langchain/embeddings/openai";
import { RecursiveCharacterTextSplitter } from "langchain/text_splitter";
import * as fs from "fs";
import * as path from "path";
import { marked } from "marked";
import TerminalRenderer from "marked-terminal";

const args = process.argv.slice(2);

marked.setOptions({
  renderer: new TerminalRenderer(),
});

function combineFileContents(directoryPath) {
  let combinedContents = "";

  function readFilesRecursively(directory) {
    const files = fs.readdirSync(directory);

    files.forEach((file) => {
      const filePath = path.join(directory, file);
      const stat = fs.statSync(filePath);

      if (stat.isFile()) {
        combinedContents += fs.readFileSync(filePath, "utf8");
        combinedContents += "\n\n"; // Add two newline characters
      } else if (stat.isDirectory()) {
        readFilesRecursively(filePath);
      }
    });
  }

  readFilesRecursively(directoryPath);

  return combinedContents;
}

export const run = async (query = args[0]) => {
  // Initialize the LLM to use to answer the question.
  const model = new OpenAI({ modelName: "gpt-4" });
  // Directory of doc files
  const text = combineFileContents("./docs/brainiac");
  const textSplitter = new RecursiveCharacterTextSplitter({ chunkSize: 1000 });
  const docs = await textSplitter.createDocuments([text]);

  // Create a vector store from the documents.
  const vectorStore = await HNSWLib.fromDocuments(docs, new OpenAIEmbeddings());

  // Create a chain that uses the OpenAI LLM and HNSWLib vector store.
  const chain = RetrievalQAChain.fromLLM(model, vectorStore.asRetriever());
  const res = await chain.call({
    query: query,
  });
  console.log(marked(res.text, { mangle: false, headerIds: false }));
};

if (args[1] === "run") {
  run();
}

The code performs the following tasks:

  1. Accepts a developer prompt as input, in the form yarn ask-brainiac <developers-query>. It requires setting the OpenAI key beforehand with export OPENAI_API_KEY=<your-openai-api-key>.
  2. Reads documentation files from a specified directory and joins them.
  3. Splits the documentation into chunks.
  4. Creates a vector store from the chunks and feeds them as context to GPT-4, returning the generated response.
  5. Displays the response in the terminal using marked, which allows code snippets to be rendered directly in the terminal.

Other applications

There are other, even more accessible ways to deploy this solution. They involve using ConversationalRetrievalQAChain, which works in a chat-like manner instead of as a one-off prompt; a minimal sketch follows the list below.

  • Custom ChatGPT Implementation: A custom ChatGPT-like interface built with Next.js and hosted on GitHub Pages.
  • Communication App Bot: This would require a backend, such as a Next.js API route, and creating a custom Slack/Teams app.
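
To make the chat-like variant concrete, here is a minimal sketch based on LangchainJS’s ConversationalRetrievalQAChain. It reuses the vector store built in the CLI script above, and accumulating the chat history as a plain string is an assumption made for illustration.

import { OpenAI } from "langchain/llms/openai";
import { ConversationalRetrievalQAChain } from "langchain/chains";

// `vectorStore` is the HNSWLib store built in the CLI example above.
export const chatWithBrainiacDocs = async (vectorStore) => {
  const model = new OpenAI({ modelName: "gpt-4" });
  const chain = ConversationalRetrievalQAChain.fromLLM(
    model,
    vectorStore.asRetriever()
  );

  // Keep the conversation history so follow-up questions retain their context.
  let chatHistory = "";

  const ask = async (question) => {
    const res = await chain.call({ question, chat_history: chatHistory });
    chatHistory += `\nQ: ${question}\nA: ${res.text}`;
    return res.text;
  };

  console.log(await ask("How to add context into Brainiac module?"));
  console.log(await ask("Can you show the same example with two providers?"));
};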

Improvements

This solution calls GPT-4 several times, once per documentation piece, so a single query might cost more than $1 in larger organizations. This is still less than the cost of the developer time it saves. However, you can shrink the costs and response time, and improve the accuracy of the results, by:

  • Bot- and Dev-friendly documentation: Ensure your documentation yields accurate bot responses. A unified document structure, syntax, and examples improve results. Moreover, documentation that bots digest well tends to be easily understood by developers too.
  • Prompt Engineering: This field will definitely improve the accuracy of responses. Prompt improvements can be applied either on the developer side or on the script side. The former requires proper training across the entire organization, while the latter will not fit all possible developer queries; a sketch of the script-side approach follows this list.
    Here is a simple example that could tie the answers more closely to the usual requirements:

ACTION: You are a Brainiac framework advisor.
Answer or fulfil ${query} according to RULES,
taking into consideration CONTEXT.

RULES:
1) Consider code structure according to the Brainiac documentation;
memoize callbacks and variables.
2) Act as an expert in React software engineering and someone who is
an expert in Brainiac knowledge.
3) Write code according to the task.
4) Provide an explanation of why the code is created in the way
described in the Brainiac documentation.
5) If you are not sure whether a Brainiac feature should be in the code,
don't use it.
6) Don't include code descriptions that are obvious to
a medium-level software engineer.

CONTEXT:
I am an engineer who is not familiar with Brainiac yet, and I want you
to show me how to write Brainiac code without reading the documentation.
I don't want to remove the code you have generated, so please make sure
the only code you are adding is compliant
with best Brainiac and React practices.
  • Documentation Splitting Logic: Currently, documents are split into chunks of a fixed character count. A better approach would be to split them into paragraphs. This might require custom JS code and possibly leveraging other LangchainJS utilities; a sketch follows below.
  • REML (Relevance Matrix Learning): A machine learning technique for multi-label classification that transforms document pieces and prompts into vectors. The vector with the closest match is selected, saving computational and developer time, and increasing efficiency. Companies like Qdrant provide ready-to-use solutions.
Qdrant database, on github.com/qdrant/qdrant
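
Going back to the Prompt Engineering point, here is a minimal sketch of applying such a template on the script side. It assumes the chain, args, and marked setup from the CLI example above; the helper name withBrainiacRules is made up for illustration, and the RULES and CONTEXT placeholders stand for the full template listed earlier.

// Hypothetical helper that wraps the raw developer query in the
// template shown above before it reaches the chain.
const withBrainiacRules = (query) => `
ACTION: You are a Brainiac framework advisor.
Answer or fulfil ${query} according to RULES,
taking into consideration CONTEXT.

RULES: (rules 1-6 from the template above)

CONTEXT: (context from the template above)
`;

// Inside run(), call the chain with the wrapped query instead of the raw one:
const res = await chain.call({ query: withBrainiacRules(args[0]) });
console.log(marked(res.text, { mangle: false, headerIds: false }));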
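
For the Documentation Splitting Logic point, a paragraph-based split could look like the sketch below. It reuses combineFileContents and the imports from the CLI script, and the blank-line regular expression is an assumption about how the docs separate paragraphs.

import { Document } from "langchain/document";

// Split the combined docs on blank lines instead of a fixed chunk size,
// so each paragraph becomes its own retrievable document.
const text = combineFileContents("./docs/brainiac");
const docs = text
  .split(/\n{2,}/)
  .map((paragraph) => paragraph.trim())
  .filter(Boolean)
  .map((paragraph) => new Document({ pageContent: paragraph }));

// The rest stays the same: build the vector store and chain from `docs`.
const vectorStore = await HNSWLib.fromDocuments(docs, new OpenAIEmbeddings());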

Future ideas

Looking forward, the following applications are being considered and will be discussed in separate articles:

  • Next-Gen Code Generators: Combining AI with traditional code generators can yield generation scripts that accurately add and modify code per business scenario. Brainiac’s opinionated approach simplifies this process. Again, copyright cases should be strongly considered when going in this direction.
  • Pull Request Checks: Implementing GitHub PR code checks in line with the company’s style and direction.
  • Documentation Tests: A simple flow would be to run a specific query, cut the code from the response, and run an Abstract Syntax Tree (AST) check to ensure proper module creation.
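
As a rough illustration of such a documentation test, the sketch below asks the bot a question, extracts the first code block from the markdown answer, and checks that it parses. The regular expression and the choice of @babel/parser for the AST check are assumptions; chain is the RetrievalQAChain from the CLI example above.

import { parse } from "@babel/parser";

// `chain` is the RetrievalQAChain from the CLI example above.
export const testDocsAnswer = async (chain, query) => {
  const res = await chain.call({ query });

  // Grab the first fenced code block from the markdown response.
  const match = res.text.match(/```(?:jsx?|tsx?)?\n([\s\S]*?)```/);
  if (!match) {
    throw new Error(`No code block found in the answer for: ${query}`);
  }

  // Throws if the generated snippet is not syntactically valid JS/JSX/TS.
  parse(match[1], { sourceType: "module", plugins: ["jsx", "typescript"] });
};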
