
Node.js Implementation of Image Recognition Using TensorFlow and Express.js

Aidan Tilgner
Published in Better Programming · 8 min read · Jun 1, 2022


The State of Artificial Intelligence

Artificial Intelligence is quite the field, captivating our interests with its astonishing capabilities and complex nature. The technology has come a long way since the 1940s, when the possibility of recreating a brain using electronics was first theorized. For most of its lifetime, though, artificial intelligence, and machine learning with it, has been a fairly exclusive field. That is starting to change.

For most of its history, machine learning has been accessible only to those with extensive computer science education and access to beefy hardware. But now, with the advent of open source platforms like TensorFlow and Keras, anyone with a basic knowledge of coding and a computer to do it on can train ML models and/or utilize the vast collection of existing ones.

This democratization of AI is crucial for the future of the industry, and possibly the human race as a whole. We cannot have a situation where giants of the industry hold sole access to the most powerful tool in the history of the world; it needs to be something that even you or I have the chance to access and utilize.

All of that being said, you might think that image recognition and object detection are quite the endeavor, and that would be true if we were working from scratch in C++. However, I’m here to inform you that even I, a mere JavaScript developer, have managed to implement object detection in an Express server.

Today, I’m going to teach you how to do it yourself.

What We Are Going To Build


So, what exactly are we building today? Well, it’s not that complicated, fortunately; it’s mostly what I said before. We’ll have an Express server with an endpoint that takes an image upload and uses the COCO-SSD model from TensorFlow to make predictions on the objects that exist in the uploaded image. It will then return an array of those predictions as JSON for use by the client.

This is basically a server implementation of a TensorFlow tutorial that walks through building a client-side application that runs object detection on a stream of webcam input. After that, it shows the end user the predictions in a visible manner.

Client-Side vs Server-Side

I’d highly recommend the aforementioned tutorial for a client-side implementation, which can have its own benefits. Running a pre-trained model on the client’s computer increases their privacy, can speed up inference, and lowers your costs, since you don’t have to pay for a server setup. These are definitely worth taking into consideration, but server-side has its benefits as well.

A server-side implementation gives you full control over the hardware that will run the models, allowing you to optimize for bigger, better, and higher-memory models. A Node.js server can run on pretty much any computer, including a Raspberry Pi, which means a centralized server performing model actions can be an asset in IoT integrations.

All that being said, it all depends on the use case and your end user. It’s always a good idea to spend time finding the best implementation for your situation; this will lead to the most scalable, maintainable, and flexible product.

Let’s Build the Server

Anyway, enough of me rambling on about this and that. You’re here to follow a tutorial. Let’s get started.

Requirements

First of all, you’ll want to make sure that your environment is set up correctly. You’ll need Node.js installed for package management and running the code. While no specific version is required, I used Node v14.17.0 and npm v6.14.13 for this tutorial.

Python 2.7 is required to install the TensorFlow package correctly; Python 3+ won’t work. If you have another version installed, don’t fret. Simply go to the downloads page, install the correct version (2.7.16 includes a bug fix, so I used that), and then switch to it using the py -2.7 command (you can check all currently installed versions with the py -0 command).

New Project

Once you have the requirements met, create a new folder for this to live in, and navigate to it.

$ mkdir ./img-recog-server && cd ./img-recog-server

Initialize a new empty npm project.

$ npm init -y

Dependencies

Next, we’ll install the required dependencies. The first two are TensorFlow packages: @tensorflow/tfjs-node, the main package optimized for Node, and @tensorflow-models/coco-ssd, the model that will make the predictions. For the Express server, we’ll need express for the server itself, dotenv to handle environment variables, and busboy to handle image uploads.

$ npm i @tensorflow-models/coco-ssd @tensorflow/tfjs-node express dotenv busboy

At this point, your package.json should look like the following.
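Something like this; the exact version numbers depend on when you install, so treat these as illustrative:

```json
{
  "name": "img-recog-server",
  "version": "1.0.0",
  "main": "index.js",
  "type": "module",
  "scripts": {
    "start": "node index.js"
  },
  "dependencies": {
    "@tensorflow-models/coco-ssd": "^2.2.2",
    "@tensorflow/tfjs-node": "^3.16.0",
    "busboy": "^1.6.0",
    "dotenv": "^16.0.1",
    "express": "^4.18.1"
  }
}
```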

Don’t forget to add "type": "module" to allow for ES6 module syntax.

Once you have your npm project configured correctly, we can get started with the code.

The Code

Now, we’ll make a new index.js file that will be the entry point to our program. Throughout the tutorial, I’ll give you various code snippets, each of which you can add to the end of the file to eventually create a whole program.

$ touch index.js

Enter the file, and we’ll start importing the correct modules.

Here, we’re importing the correct modules and initializing the dotenv package. Now, if we place a .env file in the root of our directory, next to index.js, we can write environment variables and bring them into our code with process.env.[variable_name]. Make a .env file and write the following.

PORT=5000

You can use whatever port you’d like; just change the number. Later on, we’ll use this in our code. First, however, we’ll initialize the model in our code.

// * INIT MODEL
let model = undefined;
// The semicolon above is required: without it, the IIFE's opening
// parenthesis would be parsed as a function call on undefined
(async () => {
  // Load COCO-SSD with the lightweight MobileNet v1 base
  model = await coco_ssd.load({
    base: "mobilenet_v1"
  })
})();

Now, using an asynchronous Immediately Invoked Function Expression (IIFE), we can load the model to our variable. This lets us make sure we’re not trying to use the model before it’s initialized. Now that we have this model initialized, let’s set up our Express server.

const app = express()
const PORT = process.env.PORT || 5000

app.post("/predict", (req, res) => {})

app.listen(PORT, () => {
  console.log("Listening on port " + PORT)
})

Back to the PORT environment variable: here we use it to tell our app instance where to listen. We’re also creating an HTTP POST route that will handle our object recognition. I’ve left it blank for now so it’s easier to explain, but let’s write the actual code that goes inside:

So, this is where it gets complicated. To begin, we check whether the model is initialized; if it isn’t, we let the user know, though it definitely should be after the server has been up for a few seconds.

After our check, we’re using the busboy package to intercept Content-Type: multipart/form-data from the POST request. This is necessary because vanilla Express won’t parse the form-data correctly, and so we use an extension like busboy to create a stream/pipe for the data to come through. Then, when we upload an image, it will feed the image into a buffer, which can then be read by the tf.node.decodeImage method.

TensorFlow will decode any image that has the format BMP, GIF, JPEG, or PNG. This method returns what’s called a tensor, which is basically a multidimensional array that the coco_ssd model has an easier time understanding. That being said, the coco_ssd model also supports ImageData, an HTML Image element, an HTML Canvas element, or an HTML Video element, and will make predictions accordingly.

The model.detect method also takes two other optional arguments, as seen above: after the image, you may define the maximum number of bounding boxes to send back and the minimum score a prediction needs to be included. I set the maximum number of boxes to 3 and the minimum score to 0.25.

Once all of that has run, and the model is finished making its predictions, then we send the array returned from the model.detect method as JSON back to the end user.

And with that, you’re done. You’ve successfully created a server that can take an image upload and return predictions based on it. You can run your server with the following command.

$ node index.js

If something isn’t working and you want to double-check, I’ll leave the complete code below:

Testing

However, we’re not quite done yet. You’ll want to test it out first. So, get your API testing suite of choice out, and we’ll make a quick API request to make sure you’ve got it down.

I use Thunder Client with VSCode because that’s fastest for me, but Postman is another good option. If you’re using Postman, I’d recommend this thread for uploading an image. Otherwise, using Thunder Client is super simple.

Once you have the extension installed, go to the Thunder Client tab on the VSCode sidebar; on Windows, you can also open it with ctrl + shift + r.

Once there, click the blue “New Request” button at the top.

Then, fill out the request information like so.

Make sure you’re making a POST request and using the Form option for the body. Then, check the “Files” box in the upper right, add a file field, name it whatever you’d like, and select an image of the correct format. Once again, we can only use images in the BMP, GIF, JPEG, or PNG formats.

Once you have your request configured, press ctrl + s to save it. Make sure that your server is running, and then fire off the request. You should get back a response much like this one.
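The exact classes and scores will depend on your image, but each prediction follows the COCO-SSD output shape, where bbox is [x, y, width, height] in pixels. A hypothetical response:

```json
[
  {
    "bbox": [16.8, 32.5, 249.1, 416.7],
    "class": "dog",
    "score": 0.88
  },
  {
    "bbox": [310.2, 12.4, 198.6, 402.3],
    "class": "person",
    "score": 0.42
  }
]
```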

Conclusion

Now, you’re finally done. You have a server that you can upload images to and use to run image detection. So, what now? Well, machine learning is a huge field with endless possibilities. Explore the many models available for use with the links below, train your own, and figure out cool ways to add AI to an app. Whatever you want.

As Artificial Intelligence grows and develops, we as developers must too.

Machine learning’s growing accessibility means that we have a unique opportunity to leverage this amazing technology, and the ability to grow and refine our skills with it will be a massive advantage as we move into a more AI-dominated workforce.

The possibilities are endless, and you have the ability to make them realities.

Happy coding, everyone!

References
