Chapter 1. Hello, AI

Published in Wyze · 4 min read · Jul 20, 2019

Hey everyone, I’m Shawn, one of the product managers at Wyze. I work on artificial intelligence (AI) products, aiming to make our products smarter and easier to use. On July 9th, we released our first AI feature — person detection. It’s free for both new and existing customers, and can be quickly enabled via firmware and app upgrades on Wyze Cam v2 and Wyze Cam Pan. We’ve received a ton of feedback from users since launch, so I wanted to write a few posts to answer your questions and explain AI concepts.

This chapter will be an overview of how AI works.

From Wikipedia, Artificial Intelligence refers to “machines (or computers) that mimic ‘cognitive’ functions that humans associate with the human mind, such as ‘learning’ and ‘problem solving’.” The idea can be traced back to the 1950s. However, thanks to the drastic increase in computing power over the past decade, AI has been revived and is now being applied across many industries.

There are many different branches of AI research and application. Training computers to recognize objects is called computer vision. Making computers understand conversations and talk like human beings is called natural language processing. We can also teach a computer to make the best choice by letting it try and fail many times, an approach called reinforcement learning, so that it can ultimately play chess or Go against humans and even beat them!

Now let’s apply these concepts to the Wyze Cam. Before AI was introduced, Wyze Cam could only tell you that it detected motion. It couldn’t tell you what triggered the motion or what happened during the recording. But with AI, Wyze Cam can now start to learn. So, how does that learning process happen?

The world is as new to Wyze Cam as it is to an infant, so we need to teach it how to recognize the world around it. Similar to the human learning process, the key steps are to:

Build fundamental knowledge
Explore environments and find out what works and what doesn’t
Explain the difference between what works and what doesn’t, and learn from the delta
Apply new knowledge to real situations

To lay the foundation for person detection, data scientists and machine learning engineers at Wyze and Xnor.ai first built a mathematical model so that computers could understand the structure of pictures and extract common patterns called “features”. Once the model knew that there were multiple objects in an image, engineers could then teach the model to identify which object is a person. This was done by training the model on publicly available image datasets (e.g., MS COCO and ImageNet).

However, images from the Internet are significantly different from those from a Wyze Cam. Think about a picture taken with a fisheye lens: people look much taller than normal near the edge of the picture, but the same person appears shorter closer to the center of the image. When we deployed the Internet-trained model directly on the Wyze Cam, its accuracy was only around 60%. So, the next step was to improve the accuracy by training the model on a more specific dataset: Wyze Cam footage submitted by our beta testers.

Fortunately, we didn’t have to build a new model from scratch. We re-wired some connections in the existing model and retrained it with the Wyze Cam dataset. We repeated this process several times with varied datasets until the model reached a balance point, where it could maximize accuracy while avoiding overfitting, that is, memorizing the training data rather than learning patterns that carry over to new footage.
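
One common way to find such a balance point is “early stopping”: keep some footage aside that the model never trains on, check accuracy on it after every training round, and stop once that accuracy no longer improves. The sketch below is only an illustration of that general idea, not a description of the actual Wyze or Xnor.ai training pipeline. The function name `pick_balance_point`, the `patience` parameter, and all of the numbers are made up for this example.

```python
def pick_balance_point(history, patience=2):
    """Return the index of the training round to keep.

    history: list of (train_accuracy, validation_accuracy) tuples, one per round.
    patience: how many rounds in a row validation accuracy may fail to
              improve before we stop looking (hypothetical parameter).
    """
    if not history:
        raise ValueError("history must contain at least one round")

    best_round = 0
    best_validation = history[0][1]
    rounds_without_gain = 0

    for round_index, (_train_acc, validation_acc) in enumerate(history[1:], start=1):
        if validation_acc > best_validation:
            # The model got better on footage it has never seen: keep going.
            best_round = round_index
            best_validation = validation_acc
            rounds_without_gain = 0
        else:
            # Training accuracy may still rise here, but held-out accuracy
            # does not. That gap is the typical sign of overfitting.
            rounds_without_gain += 1
            if rounds_without_gain >= patience:
                break

    return best_round


# Made-up accuracies for six training rounds (not real Wyze results).
history = [
    (0.70, 0.68),
    (0.80, 0.76),
    (0.88, 0.81),
    (0.94, 0.80),
    (0.98, 0.78),
    (0.99, 0.77),
]
best = pick_balance_point(history)
print("Keep the model from round", best)
```

In this made-up run, training accuracy keeps climbing in the later rounds while accuracy on the held-out footage slips, so the sketch keeps the earlier round where held-out accuracy peaked.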

After many months of effort, the first version of person detection was ready to launch. But remember, the model hasn’t fully explored the world yet. Although we’ve launched person detection, the Wyze Cam may still fail to recognize a person (called a “false negative”) or wrongly identify an object as a person (called a “false positive”). At the time of writing, we need to keep feeding our model diverse real-world data to expand its horizons and make it more accurate.
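
To make these terms concrete, here is a minimal sketch of how false negatives, false positives, and accuracy could be counted for a detector that answers “person” or “no person” for each clip. The function name `evaluate` and the six example clips are hypothetical and are not real Wyze data.

```python
def evaluate(predictions, labels):
    """Count outcomes for a yes/no person detector.

    predictions, labels: equal-length lists of booleans, where True
    means "there is a person in this clip".
    """
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("need at least one example")

    true_pos = false_pos = true_neg = false_neg = 0
    for predicted, actual in zip(predictions, labels):
        if predicted and actual:
            true_pos += 1    # person present, person reported
        elif predicted and not actual:
            false_pos += 1   # false positive: an object wrongly flagged as a person
        elif not predicted and actual:
            false_neg += 1   # false negative: a person was missed
        else:
            true_neg += 1    # no person present, none reported

    return {
        "true_positive": true_pos,
        "false_positive": false_pos,
        "true_negative": true_neg,
        "false_negative": false_neg,
        "accuracy": (true_pos + true_neg) / len(labels),
    }


# Hypothetical ground truth and detector output for six clips.
labels      = [True, True, True, False, False, False]
predictions = [True, True, False, True, False, False]

result = evaluate(predictions, labels)
print(result)
```

In this toy example the detector misses one person and wrongly flags one clip, so it gets four of the six clips right.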

In short, training our model literally means we need to “train” it. Training consumes both time and data, and the output that we measure and benchmark is accuracy. By nature, the model cannot reach 100% accuracy. Our goal is to make our AI more accurate in less time.

In the next chapter, we’ll talk about the big difference between Wyze Cam’s person detection and other cameras’.

Written by Shawn Niu