Coding Week 2: GSoC ’22 with SCoRe Lab

Pranjal Walia
Published in Leopards Lab
Jul 17, 2022

In this post, I’ll talk about how the second week of coding went. You can also read my previous post about week 1.

Previously, we discussed how the SDK is one part of the project and why it needs type-definition files. From this post onward, we’ll focus more on the compiler side, specifically the TypeScript Compiler and its exposed API 🔥.

Let’s begin by understanding how data can be extracted from code using the TypeScript compiler. Files are really just pieces of text, and no compiler processes code in its raw form, i.e. as a file. Instead, the compiler employs its frontend, which is responsible for the analysis and structural verification of source code. The frontend is composed of the lexer, the syntax analyzer (parser), and the semantic analyzer, which together check the text against the language specification. The output of the frontend is an intermediate representation (IR) of the code, the Abstract Syntax Tree (AST). This is how the compiler internally represents whatever code we have written.
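To make the text-to-AST step concrete, here is a minimal sketch using ts.createSourceFile from the typescript package. The input string and the file name example.ts are made up for illustration. The call only parses the text; it does not type-check it.

```typescript
import * as ts from "typescript";

// Hypothetical one-line input; any TypeScript or JavaScript text works.
const sourceText = 'function greet() { console.log("hello"); }';

// Parse the text into an AST without type-checking it.
const sourceFile = ts.createSourceFile(
  "example.ts", // file name, used for diagnostics and language detection
  sourceText,
  ts.ScriptTarget.Latest,
  true // setParentNodes: lets us walk upward from a node as well
);

// The root is a SourceFile node; its statements are the top-level nodes.
const rootKind = ts.SyntaxKind[sourceFile.kind];
const firstStatementKind = ts.SyntaxKind[sourceFile.statements[0].kind];
console.log(rootKind, firstStatementKind);
```

The numeric kind of each node is turned back into a readable name by indexing the ts.SyntaxKind enum, which is the same trick used in the snippets later in this post.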

Compiler Schematics

My work, in a nutshell, is to fetch the output of the frontend, modify it, and use the modified AST to fill in code that completes the implementation of the required functions.

This is where the compiler API comes in. It simplifies a lot of the interaction with TypeScript source files by letting us generate ASTs and work with them according to our own specification. Why is this important? To traverse a tree (say, in level order), we would need to perform a BFS (Breadth First Search), and at each node we would need to modify the structure of the tree to match our specification. Done manually, this can be an extremely complicated and tedious process.
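For a sense of what the manual route involves, here is a small sketch of a level-order (BFS) traversal with an explicit queue. The TreeNode shape and the sample tree are hypothetical stand-ins for real AST nodes, not part of the compiler API.

```typescript
// A hypothetical minimal tree node, standing in for a real AST node.
interface TreeNode {
  label: string;
  children: TreeNode[];
}

// Level-order (breadth-first) traversal using an explicit queue.
function levelOrder(root: TreeNode): string[] {
  const visited: string[] = [];
  const queue: TreeNode[] = [root];
  while (queue.length > 0) {
    const node = queue.shift()!; // take the oldest node first
    visited.push(node.label);
    queue.push(...node.children); // children wait behind the current level
  }
  return visited;
}

const tree: TreeNode = {
  label: "SourceFile",
  children: [
    {
      label: "ClassDeclaration",
      children: [
        { label: "Constructor", children: [] },
        { label: "MethodDeclaration", children: [] },
      ],
    },
    { label: "ExpressionStatement", children: [] },
  ],
};

console.log(levelOrder(tree));
// SourceFile, ClassDeclaration, ExpressionStatement, Constructor, MethodDeclaration
```

Even this toy version has to manage the queue by hand, and it does not yet modify anything. Real AST nodes keep their children in many differently named properties, which is the part the compiler API takes care of.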

Fortunately, the TypeScript Compiler exposes an API that handles this traversal for us through the forEachChild method, which invokes a callback on each child of an AST node (as simple as that 🤯). This makes the compiler API an integral part of the process.
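As a rough sketch of how forEachChild can drive a full traversal, the helper below (walk, a name chosen here for illustration) calls itself on every child and records each node's kind, indented by depth. The input class is made up.

```typescript
import * as ts from "typescript";

// Hypothetical input used only for illustration.
const source = ts.createSourceFile(
  "example.ts",
  "class A { run() {} }",
  ts.ScriptTarget.Latest,
  true
);

// Collect the kind name of every node, indented by its depth in the tree.
const lines: string[] = [];
function walk(node: ts.Node, depth: number): void {
  lines.push(`${"  ".repeat(depth)}${ts.SyntaxKind[node.kind]}`);
  // forEachChild invokes the callback once per direct child of `node`.
  ts.forEachChild(node, child => walk(child, depth + 1));
}

walk(source, 0);
console.log(lines.join("\n"));
```

Note that forEachChild itself only goes one level down; the recursion in walk is what reaches the whole tree.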

Consider this simple piece of code:

class ClassName {
  /**
   *
   * @param {module} alisdk ali SDK
   * @param {object} options SDK options
   */
  constructor(alisdk, accessKeyId, accessKeySecret) {
    this._ali = alisdk;
    this._instance = new this._ali(accessKeyId, accessKeySecret);
    this._sdkClassName = this._instance.SDKClassName;
  }
  function() {
    return new Promise((resolve, reject) => {
      this._sdkClassName
        .SDKFunctionName()
        .then(data => resolve(data))
        .catch(err => reject(err));
    });
  }
}
module.exports = ClassName;

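Internally, the compiler represents code like this as an AST. As a simplified illustration, a small function whose body is a single console.log call has an AST of roughly the following shape (the numeric kind values differ between TypeScript versions):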

{
  kind: 288, // (SyntaxKind.SourceFile)
  statements: [{
    kind: 243, // (SyntaxKind.FunctionDeclaration)
    name: {
      kind: 75, // (SyntaxKind.Identifier)
      escapedText: "..."
    },
    body: {
      kind: 222, // (SyntaxKind.Block)
      statements: [{
        kind: 225, // (SyntaxKind.ExpressionStatement)
        expression: {
          kind: 195, // (SyntaxKind.CallExpression)
          expression: {
            kind: 193, // (SyntaxKind.PropertyAccessExpression)
            name: {
              kind: 75, // (SyntaxKind.Identifier)
              escapedText: "log"
            },
            expression: {
              kind: 75, // (SyntaxKind.Identifier)
              escapedText: "console"
            }
          },
          arguments: [{
            kind: 10, // (SyntaxKind.StringLiteral)
            text: "..."
          }]
        }
      }]
    }
  }]
}

This is where things get complicated: we can traverse the nodes and extract data from them, but the more specialized the data, the more specialized the methods needed to handle it.
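One way to deal with specialized nodes is the family of type-guard functions the compiler API provides, such as ts.isCallExpression and ts.isStringLiteral. The sketch below uses them to pull the string arguments out of console.log calls, matching the shape of the AST shown above. The input text and the names visit and logged are assumptions made for this example.

```typescript
import * as ts from "typescript";

// Hypothetical input with the same shape as the AST shown above.
const file = ts.createSourceFile(
  "example.ts",
  'function greet() { console.log("hello"); }',
  ts.ScriptTarget.Latest,
  true
);

// Collect the string arguments of every console.log(...) call.
const logged: string[] = [];
function visit(node: ts.Node): void {
  if (
    ts.isCallExpression(node) &&
    ts.isPropertyAccessExpression(node.expression) &&
    ts.isIdentifier(node.expression.expression) &&
    node.expression.expression.text === "console" &&
    node.expression.name.text === "log"
  ) {
    for (const arg of node.arguments) {
      // Each guard narrows the node type, so `.text` is safe to read here.
      if (ts.isStringLiteral(arg)) logged.push(arg.text);
    }
  }
  ts.forEachChild(node, visit);
}

visit(file);
console.log(logged);
```

Each guard both checks node.kind and narrows the static type, so properties such as arguments or text become available without casts.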

For example, ts.visitNode applies a visitor function to a single node, while ts.visitEachChild applies it to each direct child of a node. Neither descends the whole tree on its own; the deep traversal comes from having the visitor call visitEachChild on every node it receives. Our aim is to visit each node i and, at every such node, reach all of its children. This can be done as follows:

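import * as ts from "typescript";

// Transformer factory: receives the transformation context and returns
// a function that logs the kind of every node in a source file.
const transformer: ts.TransformerFactory<ts.SourceFile> = context => sourceFile => {
  const visitor = (node: ts.Node): ts.Node => {
    console.log(node.kind, `\t#ts.SyntaxKind.${ts.SyntaxKind[node.kind]}`);
    // Recurse: apply the same visitor to every child of this node.
    return ts.visitEachChild(node, visitor, context);
  };
  return ts.visitNode(sourceFile, visitor) as ts.SourceFile;
};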
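A transformer like this does nothing until something supplies it with a context and a source file. One way to run it, sketched here under the assumption of a made-up input class, is ts.transform, followed by ts.createPrinter to turn the resulting tree back into text. The names logKinds and seenKinds are chosen for this example, and the kinds are collected in an array instead of being logged one by one.

```typescript
import * as ts from "typescript";

const seenKinds: string[] = [];

// A transformer factory: receives the context, returns the per-file transformer.
const logKinds: ts.TransformerFactory<ts.SourceFile> = context => sourceFile => {
  const visitor = (node: ts.Node): ts.Node => {
    seenKinds.push(ts.SyntaxKind[node.kind]);
    return ts.visitEachChild(node, visitor, context);
  };
  return ts.visitNode(sourceFile, visitor) as ts.SourceFile;
};

// Hypothetical input used only for illustration.
const input = ts.createSourceFile(
  "example.ts",
  "class A { run() {} }",
  ts.ScriptTarget.Latest,
  true
);

// ts.transform supplies the context and applies each factory to the file.
const result = ts.transform(input, [logKinds]);
const output = ts.createPrinter().printFile(result.transformed[0]);
result.dispose();

console.log(seenKinds);
console.log(output);
```

Because the visitor returns every node unchanged, the printed output is the same program as the input. Returning a different node from the visitor is how the tree would actually be modified.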

Now, say node i represents a class declaration; its children would include its member functions. Suppose each method is represented by a node j. Then j is a child of i, and j in turn has child nodes representing its parameters, body, return type, etc.

Essentially, in type-definition files we want to find a class, get its functions, and get all the parameters associated with each function. We define interfaces that describe all the data we need to extract from the AST to successfully perform our computation 🔨.

interface FunctionData {
  functionName: string;
  SDKFunctionName: string;
  params: param[];
}
interface param {
  name: string;
  optional: boolean;
  type: string;
  typeName: string | null;
}
interface ClassData {
  className: string;
  functions: FunctionData[];
  serviceName: string;
}

Now comes the actual traversal of the AST and the extraction. Since we already have the class AST, we visit all of its child nodes, and whenever we encounter a node whose syntax kind is MethodDeclaration, we collect it. We then traverse the collected function nodes again to extract their child parameter nodes.

let methods: FunctionData[] = [];

// AST holds the member nodes of the class; functions lists the method names to extract.
AST.forEach(method => {
  if (functions.includes(method.name.text)) {
    const parameters = [];
    method.parameters.forEach(param => {
      // Skip the callback parameter.
      if (param.name.text !== "callback") {
        const parameter = {
          name: param.name.text,
          optional: Boolean(param.questionToken),
          type: SyntaxKind[param.type.kind],
          typeName: null
        };
        // For type references, also record the referenced type's name.
        if (parameter.type === "TypeReference" && param.type.typeName) {
          parameter.typeName = param.type.typeName.text;
        }
        parameters.push(parameter);
      }
    });
    methods.push({
      // name comes from the surrounding code, which this excerpt omits
      functionName: name.toString(),
      SDKFunctionName: method.name.text.toString(),
      params: parameters
    });
  }
});

In the above snippet, we traverse the tree, and whenever we encounter a function, we check whether it is in the target set of functions we want to extract. If so, we traverse further into that node to extract its parameters. In this way, we look up all the required functions.
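For readers who want to try the idea end to end, here is a self-contained sketch of the same extraction built directly on the compiler API. It is not the project's actual code: the type-definition text, the Compute class, the wanted list, and the trimmed-down Param and FunctionData shapes are all assumptions made for this example.

```typescript
import * as ts from "typescript";

interface Param { name: string; optional: boolean; type: string; typeName: string | null; }
interface FunctionData { functionName: string; params: Param[]; }

// Hypothetical type-definition text and target list, for illustration only.
const dts = `
declare class Compute {
  createInstance(options: InstanceOptions, name?: string, callback?: Function): void;
  internalHelper(): void;
}`;
const wanted = ["createInstance"];

const sf = ts.createSourceFile("compute.d.ts", dts, ts.ScriptTarget.Latest, true);
const extracted: FunctionData[] = [];

ts.forEachChild(sf, node => {
  if (!ts.isClassDeclaration(node)) return;
  for (const member of node.members) {
    // Keep only method declarations whose name is in the target list.
    if (!ts.isMethodDeclaration(member) || !ts.isIdentifier(member.name)) continue;
    if (!wanted.includes(member.name.text)) continue;

    const params: Param[] = [];
    for (const p of member.parameters) {
      // Skip destructured parameters and the callback parameter.
      if (!ts.isIdentifier(p.name) || p.name.text === "callback") continue;
      const typeNode = p.type;
      params.push({
        name: p.name.text,
        optional: p.questionToken !== undefined,
        type: typeNode ? ts.SyntaxKind[typeNode.kind] : "unknown",
        typeName:
          typeNode && ts.isTypeReferenceNode(typeNode) && ts.isIdentifier(typeNode.typeName)
            ? typeNode.typeName.text
            : null,
      });
    }
    extracted.push({ functionName: member.name.text, params });
  }
});

console.log(JSON.stringify(extracted, null, 2));
```

Using the type guards instead of reading properties directly means the sketch also copes with parameters that have no type annotation or a non-identifier name, at the cost of a few extra checks.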

Helpful Resources

  1. https://astexplorer.net/ (a lifesaver when dealing with TS/JS ASTs)
  2. https://learning-notes.mistermicheels.com/javascript/typescript/compiler-api/
  3. https://levelup.gitconnected.com/typescript-compiler-and-compiler-api-part-1-4bb0d24a565e

Epilogue 🔚

Over the course of this week, I worked on getting the parsing logic to work and successfully extracted all the required implementations from existing type-definition files (which were created by me 🤪). Going into next week, I aim to write the logic for filling out the target blueprint, where all the extracted data will be written out. Watch out for that post!

Well, this is the end of the line for now. Till next time! Follow me on my socials if you have any questions or want tips and guidance on GSoC or anything in general.

Connect with me: Linkedin, Github, Twitter 😀
