Unlocking the Potential of Computer Vision with Webcam Access in SvelteKit

Published in CodeX · 6 min read · Feb 9, 2023

In our previous article, we built a sentiment analysis web application using SvelteKit and vadersentiment. That project showed how well SvelteKit suits AI-powered web applications.

As the demand for AI applications continues to grow, so does the range of things we can build. SvelteKit lets developers simplify the creation of complex AI-powered systems and streamline the development process, and the sentiment analysis app is a good example of how little ceremony that takes.

In the coming articles, we will delve deeper into the capabilities of SvelteKit and explore more AI use cases, so stay tuned.

This brief article will guide you through accessing the webcam stream from SvelteKit, an essential step towards developing computer vision applications such as real-time object detection. SvelteKit makes building such applications straightforward.

Why webcam access is important in CV apps

The webcam is a critical component in computer vision applications, as it provides the necessary input for algorithms to process and analyze visual information. In recent years, the advancement of webcam technology has enabled the development of a wide range of computer vision applications, such as facial recognition, object detection, and gesture recognition, among others.

Access to the webcam is essential for computer vision applications as it enables the creation of real-time visual data streams. This real-time data is crucial for building applications that can respond to changing environments, making them highly dynamic and flexible. For instance, in a real-time object detection application, the webcam is used to capture video frames that are then analyzed by computer vision algorithms to detect and track objects in the scene.

Moreover, the webcam provides a direct and convenient way for users to interact with computer vision applications. For instance, in a video conferencing application, the webcam captures the user’s video (with audio coming from the microphone), allowing them to communicate with others in real time.

Setup

Before we begin creating our app, let’s use pnpm to create a new SvelteKit project:

pnpm create svelte@latest kit-webcam

The code above will prompt you to choose from a few options and generate a basic SvelteKit project based on your selection. The generated project should look something like this:

Once you have selected the options, you will need to install the dependencies for the project, which can be done by running the following command from the project’s directory:

cd kit-webcam
pnpm install

Adding Tailwind CSS for styling

In addition, we can install Tailwind CSS to enhance the styling of our project, because why not. Fortunately, there is a useful tool called “svelte-add” that is designed for exactly this kind of setup.

npx svelte-add@latest tailwindcss

To install these new dependencies, you will need to use the following command:

pnpm install

Now open the project in your preferred editor; mine is VS Code.

Adding the Webcam functionality

Before adding webcam functionality to our SvelteKit application, it helps to understand the underlying technology that makes it possible: the browser's Navigator API.

The Navigator.mediaDevices property on the Navigator API offers access to connected media input devices, such as cameras and microphones, as well as screen sharing. This property is read-only and returns a MediaDevices object, providing a convenient and consistent way for web applications to interact with these devices.

The ability to access media input devices like cameras and microphones is a critical component in many web applications, especially those that involve computer vision. The Navigator.mediaDevices property provides a standardized way for web applications to interact with these devices, ensuring compatibility across different browsers and platforms.
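Because navigator.mediaDevices is only exposed in secure contexts (HTTPS or localhost) and does not exist while SvelteKit renders on the server, it can be worth checking for it before requesting the camera. The following is a minimal sketch of such a check; the helper name canUseWebcam is a hypothetical name chosen for this illustration and is not part of the original project.

```javascript
// Hypothetical helper (not from the original project).
// Returns true when the given navigator-like object exposes getUserMedia.
// `nav` is passed in rather than read from the global so the check is safe
// to call even where `navigator` is missing (for example during SSR).
function canUseWebcam(nav) {
    return Boolean(
        nav &&
        nav.mediaDevices &&
        typeof nav.mediaDevices.getUserMedia === "function"
    );
}

// Possible usage in the browser:
// const supported = canUseWebcam(typeof navigator !== "undefined" ? navigator : undefined);
```

A component could use the result to disable the start button or show a message instead of failing when the API is unavailable.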

With a foundational understanding of the technology behind the webcam, it’s time to add webcam functionality to our SvelteKit application. We will work within the +page.svelte file (src/routes/+page.svelte in a default project), which will hold the code and logic for accessing the webcam.

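<script>
    // Active MediaStream from the webcam (undefined until started)
    let stream;
    // Reference to the <video> element, set via bind:this
    let videoRef;

    // Request camera access and show the stream in the video element
    async function getStream() {
        try {
            stream = await navigator.mediaDevices.getUserMedia({
                video: true,
                audio: false
            });
            videoRef.srcObject = stream;
            // Debug output: the first (video) track of the stream
            console.log(stream.getTracks()[0]);
        } catch (err) {
            console.error(err);
        }
    }

    // Stop every track and detach the stream from the video element
    function stopStream() {
        if (!stream) return; // nothing to stop yet
        stream.getTracks().forEach(track => track.stop());
        videoRef.srcObject = null;
        stream = undefined;
    }
</script>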

The script above defines two functions, getStream and stopStream, which start and stop the webcam stream, respectively. The script also declares two variables, stream and videoRef, which store the webcam stream and a reference to the video element in the markup, respectively.

The getStream function is an asynchronous function that uses the await operator to wait for the completion of the navigator.mediaDevices.getUserMedia function. This function is part of the MediaDevices API exposed through navigator.mediaDevices, and it prompts the user for permission to use the webcam. The request is made using an options object that specifies that the application needs access to the video stream, but not the audio stream.
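The options object passed to getUserMedia can also describe the video you would like, not just whether you want it. The sketch below builds such a constraints object; buildConstraints and its default values are assumptions made for this illustration, while width, height, facingMode and the ideal keyword are standard media track constraints.

```javascript
// Hypothetical helper (not from the original project).
// Builds a video-only constraints object for getUserMedia.
// The defaults mirror the 640x480 video element used in this article.
function buildConstraints({ width = 640, height = 480, facingMode = "user" } = {}) {
    return {
        audio: false,
        video: {
            // "ideal" is a preference: the browser picks the closest match
            width: { ideal: width },
            height: { ideal: height },
            // "user" is the front camera, "environment" the rear one
            facingMode
        }
    };
}

// Possible usage in the browser:
// stream = await navigator.mediaDevices.getUserMedia(buildConstraints({ width: 1280, height: 720 }));
```

Using ideal values keeps the request from failing on cameras that cannot deliver the exact resolution.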

If the request is successful, the stream variable is assigned the resulting stream, and the srcObject property of the video element, referenced by videoRef, is set to the stream.

The stopStream function stops the webcam stream by using the forEach method to loop over the tracks of the stream and stop each track using the track.stop method. The srcObject property of the video element is then set to null, effectively stopping the display of the webcam stream.
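The track-stopping logic described above can be isolated into a small helper that is also safe to call before any stream exists. This is only a sketch; stopTracks is a hypothetical name and is not part of the original project.

```javascript
// Hypothetical helper (not from the original project).
// Stops every track of a MediaStream-like object and returns how many
// tracks were stopped. Does nothing when no stream has been started.
function stopTracks(stream) {
    if (!stream) return 0;
    const tracks = stream.getTracks();
    tracks.forEach((track) => track.stop());
    return tracks.length;
}
```

One place such a helper could be called is Svelte's onDestroy lifecycle function, so the camera is released when the user navigates away from the page.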

The webcam request in getStream is wrapped in a try...catch block, which handles any errors that may occur while accessing the webcam. If an error occurs, it is logged to the console using the console.error method.
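Logging the error is enough for development, but the error's name property also says why the request failed. The sketch below maps the most common getUserMedia error names to readable messages; describeCameraError and the message wording are assumptions made for this illustration.

```javascript
// Hypothetical helper (not from the original project).
// Turns an error thrown by getUserMedia into a user-facing message,
// based on the standard DOMException names the API rejects with.
function describeCameraError(err) {
    switch (err && err.name) {
        case "NotAllowedError":
            return "Camera permission was denied.";
        case "NotFoundError":
            return "No camera was found on this device.";
        case "NotReadableError":
            return "The camera could not be read (it may be in use by another application).";
        case "OverconstrainedError":
            return "No camera satisfies the requested constraints.";
        default:
            return "Could not access the camera.";
    }
}
```

Inside the catch block, the returned string could be stored in a component variable and rendered next to the buttons.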

Finally, the code logs the first track of the stream to the console using the console.log method. This is used for debugging purposes and can be removed in the final implementation.
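If the raw track object is too noisy in the console, a compact summary may be easier to read while debugging. The sketch below relies on the standard kind, label and readyState properties and the getSettings method of a video track; summarizeTrack is a hypothetical helper name.

```javascript
// Hypothetical helper (not from the original project).
// Collects the most useful debugging details of a video track.
function summarizeTrack(track) {
    // getSettings() reports the values the browser actually applied
    const { width, height, frameRate } = track.getSettings();
    return {
        kind: track.kind,
        label: track.label,
        readyState: track.readyState,
        resolution: `${width}x${height}`,
        frameRate
    };
}

// Possible usage in the browser:
// console.table(summarizeTrack(stream.getTracks()[0]));
```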

Designing the UI

To integrate the webcam functionality into our user interface, we will add the following markup to the same +page.svelte file.

<section class="container mx-auto px-4">
    <h1 class="text-4xl text-blue-500 my-4">Webcam Stream Mastery</h1>
    <button class="rounded-sm bg-slate-600 text-white px-4 py-2" on:click={getStream}>Start Stream</button>
    <button class="rounded-sm bg-red-600 text-white px-4 py-2" on:click={stopStream}>Stop Stream</button>

    <video class="mt-4 rounded-sm " width="640" height="480" autoplay={true} bind:this={videoRef} />

</section>

The markup above is a simple webcam stream interface styled with Tailwind CSS. It consists of:

A container section with the title “Webcam Stream Mastery”.
Two buttons, Start Stream and Stop Stream. The Start Stream button calls the getStream function on click, and the Stop Stream button calls the stopStream function.
A video element with a width of 640 pixels and a height of 480 pixels. The element is set to autoplay, and bind:this stores a reference to it in the videoRef variable.

When the Start Stream button is clicked, the getStream function will be executed, which will request access to the user's webcam and start the video stream. The video stream will be displayed in the video element. When the Stop Stream button is clicked, the stopStream function will be executed, which will stop the video stream.

Run the application now with the following command to see the result:

pnpm dev

With that, we have successfully gained access to the webcam using SvelteKit. This article serves as a foundational starting point for exploring more advanced applications that will come later.

Conclusion

In conclusion, the ability to access the webcam is crucial for developing computer vision applications. Using the navigator.mediaDevices API and SvelteKit, we have walked through the basic steps to stream webcam video in a simple user interface. This article should provide a solid foundation for building more complex computer vision applications, and we encourage further exploration and experimentation with the technology.

The code for this project can be found here