Writing a multi-language online compiler for VS Code Part 2: Compilation Service

Ben Meehan
7 min read · Jun 11, 2023


In Part 1, we looked at how the whole system is structured. In this part, we will write a Golang compilation service that will compile C++ code.

The flow and process are similar for all the languages, so once you get the hang of how C++ is compiled, it should be easy to implement a similar service for other languages.

I’ll write the server itself in Golang, but feel free to use any other back-end language that you like.

UPDATE: Part 3 is now available.

The Core Idea:

The idea is simple. We will have a Golang server that receives the C++ code through a POST request, compiles and executes it by spawning child processes, and returns the output.

What’s not so simple is the security challenges this poses. Since we are allowing users to execute arbitrary C++ code on our server, they could easily crash it within a minute by submitting code with an infinite loop or, even worse, run rm -rf to delete all our files.

The Security Concerns:

In this project, we will be taking care of 4 main security concerns.

  1. Prevent users from submitting very large pieces of code (> 1 MB).
  2. Prevent users from executing code with infinite loops.
  3. Prevent users from creating, reading, or writing files in unauthorized directories.
  4. Prevent users from executing unauthorized shell commands.

There are many more issues to consider, but we will only handle these 4 to keep things simple.

For the first two, we can do some checks in Golang to mitigate them. But for 3 and 4, we have 2 options.

1. Sandboxing:

The most common way to avoid all these issues is a method called sandboxing. You create something like a Docker container that has its own file system and libraries and execute the user code inside it. That way you isolate your server's file system from the place where the unchecked user code is executed.

I did not go with this approach because while it’s safe, it is also a bit complicated to implement and I wanted to keep things simple.

2. Jailing:

The other approach is jailing. In this, we create a restricted OS user with limited permissions and execute the code as that user instead of the root user.

We will cover this approach in detail in Part 3.

It is called jailing because we build a wall around the code by restricting the permissions for it and confining it to accessing only the required files and folders.

Implementation:

Note: The implementation is going to be a bit long and detailed, so if you just want the full code check it out here.

Okay, now that we understand a bit about the security concerns, let’s start by creating a simple Golang server using the built-in net/http package.

package main

import (
	"log"
	"net/http"
)

func main() {
	http.HandleFunc("/compile", handleCompile)
	log.Println("C++ Server listening on port 8080...")
	log.Fatal(http.ListenAndServe(":8080", nil))
}

This just creates a simple HTTP server that listens on port 8080 and calls the handleCompile function for requests to the /compile route. But before we write the handleCompile function, let us define some structs and constants to make things readable.

// Maximum allowed code size in bytes
const MaxCodeSize = 1024 * 1024 // 1 MB

// Restricted user and group ID
const RestrictedUserID = 1000
const RestrictedGroupID = 1000

type CompileRequest struct {
	Code     string `json:"code"`
	Input    string `json:"input"`
	Language string `json:"language,omitempty"`
}

Define these before the main function.

MaxCodeSize tells how large the C++ code in the incoming request can be. Here, it’s 1024 bytes (which is 1 KB) x 1024 = 1 MB.

The RestrictedUserID and RestrictedGroupID define the user and group IDs of the unprivileged user that will execute the code. On many Linux distributions and Docker images, 1000 is the ID given to the first non-root user that is created.

Finally, the CompileRequest struct defines the format of the JSON in the incoming POST request.
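To make the request format concrete, here is a small sketch that encodes a CompileRequest the way a client would before sending it. The encodeRequest helper and the sample values are assumptions made for illustration and are not part of the service itself.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// CompileRequest mirrors the struct used by the service.
type CompileRequest struct {
	Code     string `json:"code"`
	Input    string `json:"input"`
	Language string `json:"language,omitempty"`
}

// encodeRequest is a hypothetical helper that returns the JSON body
// a client would POST to /compile. It returns "" if encoding fails.
func encodeRequest(req CompileRequest) string {
	b, err := json.Marshal(req)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	// Language is left empty, so "omitempty" drops it from the JSON.
	req := CompileRequest{Code: "int main() { return 0; }", Input: "42"}
	fmt.Println(encodeRequest(req))
	// prints {"code":"int main() { return 0; }","input":"42"}
}
```

Because of the omitempty tag, the language field only appears in the JSON when it is set, so clients that send only C++ can leave it out.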

The Handler Function:

When a request hits the handleCompile function, the first step is to decode the body into an instance of the CompileRequest struct.

func handleCompile(w http.ResponseWriter, r *http.Request) {
	// Read the request body
	body, err := ioutil.ReadAll(r.Body)
	if err != nil {
		w.WriteHeader(http.StatusBadRequest)
		// Fprintf with %v keeps the message and the error separated
		fmt.Fprintf(w, "Error occurred during parsing of the request body: %v", err)
		log.Printf("Failed to read request body: %v", err)
		return
	}

	// Parse the JSON request body
	var compileReq CompileRequest
	err = json.Unmarshal(body, &compileReq)
	if err != nil {
		w.WriteHeader(http.StatusBadRequest)
		fmt.Fprintf(w, "Error occurred during parsing of the JSON request body: %v", err)
		log.Printf("Failed to parse JSON request body: %v", err)
		return
	}
}

We are just extracting the JSON from the request body and returning status 400 (Bad Request) if there is an error.

Now, the code will be present in the Code field of the compileReq struct.

The next step is to check for security issue number 1, i.e., whether the code size exceeds 1 MB.

code := []byte(compileReq.Code)

// Check if the code size exceeds the maximum allowed size
if len(code) > MaxCodeSize {
	// ... status 500 error
}

We convert the Code string into a byte slice and then check its length. If it holds more than MaxCodeSize bytes (1024 x 1024 = 1,048,576 bytes, i.e. 1 MB), we return a status 500.

If the code size is fine, the next step is to write this code into some temporary file that can be fed into the g++ compiler.

// Create a temporary file to store the code
tmpFile, err := ioutil.TempFile("", "code-*.cpp")
if err != nil {
	// ... status 500 error
}
defer os.Remove(tmpFile.Name()) // Clean up the temporary file

// Write the code to the temporary file
_, err = tmpFile.Write(code)
if err != nil {
	// ... status 500 error
}

// Close the temporary file
err = tmpFile.Close()
if err != nil {
	// ... status 500 error
}

ioutil.TempFile() creates a temporary file in the default temporary directory (/tmp on most Linux systems) when its first argument is empty, and replaces the ‘*’ in the pattern with a random string. This ensures that the filenames are unique for each request. Make sure that the pattern also ends in the .cpp extension.

We should also make sure that the file is deleted once the response is sent. We do this using defer os.Remove(tmpFile.Name()).

Then we just write the code to the file and close the file.
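The create, write, and close steps can also be bundled into one helper. The sketch below is only an illustration under a few assumptions: writeTempSource and looksLikeSourcePath are hypothetical names, and it uses os.CreateTemp, which newer Go versions (1.16+) offer as the replacement for ioutil.TempFile.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// writeTempSource stores code in a uniquely named .cpp file and
// returns its path plus a cleanup function that deletes the file.
func writeTempSource(code []byte) (string, func(), error) {
	f, err := os.CreateTemp("", "code-*.cpp")
	if err != nil {
		return "", nil, err
	}
	cleanup := func() { os.Remove(f.Name()) }

	if _, err := f.Write(code); err != nil {
		f.Close()
		cleanup()
		return "", nil, err
	}
	if err := f.Close(); err != nil {
		cleanup()
		return "", nil, err
	}
	return f.Name(), cleanup, nil
}

// looksLikeSourcePath reports whether the file name follows the
// "code-<random>.cpp" pattern requested above.
func looksLikeSourcePath(path string) bool {
	name := filepath.Base(path)
	return strings.HasPrefix(name, "code-") && strings.HasSuffix(name, ".cpp")
}

func main() {
	path, cleanup, err := writeTempSource([]byte("int main() { return 0; }"))
	if err != nil {
		fmt.Println("could not write source:", err)
		return
	}
	defer cleanup()
	fmt.Println(looksLikeSourcePath(path))
}
```

Returning the cleanup function lets the caller decide when to delete the file, for example with defer right after the call.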

We will also need to create another file in the same way to store the output. This file will hold the compiled binary (machine code).

// Create a temporary file to store the output
tmpOpFile, err := ioutil.TempFile("", "output-*")
if err != nil {
	// ... status 500 error
}

Note: The next compilation step is unnecessary for interpreted languages like Python and JavaScript. You can skip directly to running the file.

Now, we supply the code file to the g++ compiler, telling it to write the output to the tmpOpFile we created. We use the os/exec package to run g++ as a child process. Note that exec.Command starts the program directly and does not go through a shell such as bash.

// Compile the code using G++
outputFile := tmpOpFile.Name()
cmd := exec.Command("g++", tmpFile.Name(), "-o", outputFile)

compilerOutput, err := cmd.CombinedOutput()
if err != nil {
	// ... status 500 error
}

log.Printf("Compilation successful. Output file: %s", outputFile)

// Close the temporary output file
err = tmpOpFile.Close()
if err != nil {
	// ... status 500 error
}
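When CombinedOutput returns an error, it can be useful to tell apart a compiler that ran and rejected the code from a compiler that could not be started at all, since only the first case is the user's fault. The sketch below shows one way to do that with errors.As and *exec.ExitError. The runCompiler helper and its status strings are assumptions made for this example and do not appear in the original code.

```go
package main

import (
	"errors"
	"fmt"
	"os/exec"
)

// runCompiler runs the given compiler and classifies the result.
// It returns a status string and whatever the compiler printed.
func runCompiler(compiler string, args ...string) (string, []byte) {
	out, err := exec.Command(compiler, args...).CombinedOutput()

	var exitErr *exec.ExitError
	switch {
	case err == nil:
		return "ok", out
	case errors.As(err, &exitErr):
		// The compiler started but exited with a non-zero status,
		// which usually means the submitted code did not compile.
		return "compile error", out
	default:
		// The compiler could not be started, e.g. it is not installed.
		return "compiler unavailable", out
	}
}

func main() {
	// Asking g++ for its version avoids needing a source file here.
	status, _ := runCompiler("g++", "--version")
	fmt.Println("g++:", status)

	// A deliberately non-existent compiler name (hypothetical).
	status, _ = runCompiler("no-such-compiler-for-this-sketch")
	fmt.Println("missing compiler:", status)
}
```

In the "compile error" case the captured output holds the compiler's diagnostics, which is what a user of an online compiler would typically want to see in the response.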

Now, we have a binary generated. All that is left is to execute this binary and capture the output from stdout.

While doing this, we need to limit the time for which the binary can execute to check for infinite loops.

// Create a context with a timeout duration
ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
defer cancel()

// Create a channel to receive the output. It is buffered so that the
// Goroutine can still send and exit after the handler has timed out.
outputChannel := make(chan []byte, 1)

// Run the compiled output in a Goroutine and monitor for timeouts
go func() {
	// CommandContext kills the process once ctx is done
	cmd := exec.CommandContext(ctx, outputFile)

	// Set the user and group ID of the executed program
	cmd.SysProcAttr = &syscall.SysProcAttr{
		Credential: &syscall.Credential{
			Uid: RestrictedUserID,
			Gid: RestrictedGroupID,
		},
	}

	// Set the input for the program
	cmd.Stdin = strings.NewReader(compileReq.Input)

	cmdOutput, err := cmd.CombinedOutput()
	if err != nil {
		log.Printf("Execution error: %s", err)
	}

	// Send the execution output through the channel
	outputChannel <- cmdOutput
}()

// Remove the temporary output file when the handler returns
defer os.Remove(tmpOpFile.Name())

select {
case <-ctx.Done():
	// Execution timed out
	w.WriteHeader(http.StatusInternalServerError)
	fmt.Fprint(w, "Execution timed out")
	log.Println("Execution timed out")
case output := <-outputChannel:
	// Execution completed within the timeout duration
	w.Header().Set("Content-Type", "text/plain")
	w.Write(output)
}

We create a context that times out after 5 seconds and pass it to exec.CommandContext, which kills the process when the context expires. We also supply the file name that needs to be executed along with the user and group IDs that the program will run as.

compileReq.Input has any inputs needed for the C++ code that was sent over in the request. We supply it to the program's stdin.

We execute the code in a separate goroutine and wait. One of two things can happen.

  1. The code executes successfully and gives the output/error.
  2. The code execution crosses 5 seconds and times out.

Based on these cases, we return the corresponding response.

So, that’s it! That is all the code needed to compile the program and respond with its output. Hopefully, you learned something new reading this. In the next part, we will take this code and dockerize it, so that it can be run on any machine.


Ben Meehan

Software Engineer at Razorpay. Sharing knowledge and experiences.