Understanding Computer Vision

Self-driving cars and facial recognition technology have recently dominated the machine learning and artificial intelligence conversation. However, these two uses are not the only ways that systems can process images and videos. In broad terms, computer vision is the branch of machine learning that specializes in how computers “see” photos and videos. Categories of computer vision include image classification, object detection, segmentation, summarization, and optical character recognition.

To understand how computer vision is shaping the future of security, business, and day-to-day life, it helps to have a basic understanding of how the process works. Computers “see” in much the same way that humans do, by recognizing patterns: eyes are round, clovers have three (sometimes four) leaves, fish have gills. Within the past five years, technology has advanced enough for computers to recognize patterns in pixels. Our brains use colors, patterns, and context to determine what an image contains, the same way computers use pixels. A computer may recognize a cluster of light blue pixels as the sky or a lake depending on the pixels surrounding the blue cluster. When many labeled pictures are fed into a system, it begins to learn the patterns in the pixels. This process, from beginning to end, is similar to how people process what we see. After all, nobody was able to recognize and convey the message “The large patch of blue above me is the sky” the day they were born. Over years of training and learning context, people come to recognize and point out patterns in what they see, something a trained computer can do in seconds.
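To make the idea of learning from labeled pixels concrete, here is a minimal toy sketch in Python. It is not how production systems work (those typically rely on neural networks trained on very large datasets); it only illustrates the sky-versus-lake example, under the simplifying assumption that a blue patch above green is "sky" and a blue patch below green is "lake". The tiny hand-made images, the color values, and the function names (`features`, `train`, `predict`, `make_image`) are all hypothetical and chosen for illustration.

```python
from statistics import mean

def features(image):
    """Average R, G, B of the top half, then of the bottom half, of a tiny image.

    `image` is a list of rows; each row is a list of (r, g, b) tuples.
    Splitting into halves is a crude stand-in for "context".
    """
    half = len(image) // 2
    feats = []
    for region in (image[:half], image[half:]):
        pixels = [px for row in region for px in row]
        feats.extend(mean(px[c] for px in pixels) for c in range(3))
    return feats

def train(labeled_images):
    """Compute one average feature vector (a centroid) per label."""
    by_label = {}
    for image, label in labeled_images:
        by_label.setdefault(label, []).append(features(image))
    return {label: [mean(col) for col in zip(*vecs)]
            for label, vecs in by_label.items()}

def predict(centroids, image):
    """Return the label whose centroid is closest to the image's features."""
    feats = features(image)

    def squared_distance(label):
        return sum((a - b) ** 2 for a, b in zip(feats, centroids[label]))

    return min(centroids, key=squared_distance)

# Illustrative colors (assumed values, not taken from real photos).
BLUE, LIGHT_BLUE = (90, 160, 230), (120, 180, 240)
GREEN, DARK_GREEN = (60, 140, 60), (40, 110, 50)

def make_image(top, bottom, size=4):
    """Build a size x size image: `top` color in the upper half, `bottom` below."""
    half = size // 2
    return ([[top] * size for _ in range(half)]
            + [[bottom] * size for _ in range(size - half)])

# "Many pictures are labeled": here, just four.
training_set = [
    (make_image(BLUE, GREEN), "sky"),
    (make_image(LIGHT_BLUE, DARK_GREEN), "sky"),
    (make_image(GREEN, BLUE), "lake"),
    (make_image(DARK_GREEN, LIGHT_BLUE), "lake"),
]
centroids = train(training_set)

print(predict(centroids, make_image(LIGHT_BLUE, GREEN)))  # sky
print(predict(centroids, make_image(GREEN, LIGHT_BLUE)))  # lake
```

The same light blue pixels get a different label depending on what surrounds them, which is the point of the sky-or-lake example. A real system would learn far richer features than two average colors, but the loop is the same: labeled examples in, patterns out, predictions on new images.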

Throughout history, new technology has completely transformed the way our planet works in both positive and negative ways: stone tools destroyed hunter-gatherer culture; the combustion engine transports us and also destroys the environment; television can entertain families or, through the complementary invention of political ads, destroy them. Computer vision is no different. Civilians with no criminal record can be recognized in Central Park, or the X-ray image of a lung can reveal whether a person has pneumonia.

Moral issues aside, analyzing a large set of images or videos may take a person hours or weeks, depending on the number of pictures. A computer can analyze the same number of images in seconds. Other times, images can only be categorized by a specialist (for example, a radiologist, molecular biologist, or meteorologist). In this case, training a model to label an image is more efficient than having no understanding of what the image shows at all.

One of the most widely used forms of computer vision is image classification. Image classification can be so accurate that it has rivaled dermatologists in determining whether a photo of a skin lesion is cancerous, potentially allowing people to keep frequent checks on their own skin without the help of a doctor. Computer vision technology can also lead to improved cancer prognosis. Classification clearly has benefits. Skyl has used its platform to determine whether a person has pneumonia using only an X-ray of their lung, a task normally assigned to radiologists.

Another branch of computer vision, optical character recognition (OCR), interprets text in images, including handwriting. In the healthcare industry, this can be used to read and understand prescriptions. I would argue that using OCR to interpret prescriptions is one of the most incredible feats of science and technology since Neil Armstrong landed on the moon in 1969, considering entirely illegible handwriting is a graduation requirement for most medical schools.

However, the uses of computer vision technology are not limited to the healthcare industry. Marketing agencies aim to use computer vision to target ads toward people who are more likely to buy their product based on their interests. By using object detection to find their brand logos (or similar brands’ logos) in photographs on social media, companies can narrow down target demographics. Understanding where their logo is used can give companies a more precise idea of who is interested in their product, or who may become interested if introduced to it. This allows marketing companies to be simultaneously closer to their consumers and farther from those who are not. Object detection can give companies a slightly more specific picture of their brand’s target demographic, making room for slightly more accurately targeted ads.

Despite these uses and a clear path for the future of these industries, not every hospital or marketing agency has made room for computer vision. More often than not, factors such as cost, limited knowledge of computer vision, and lack of direction for a project deter people from even attempting to adopt an automated system for processing images. However, platforms such as Skyl offer a cost-effective way to integrate machine learning technology into any company. Using Skyl, companies are not limited to computer vision ML projects. Any Skyl user can also create Natural Language Processing projects at the same time.

Though the future of computer vision is unclear, we can only assume that it will continue to drastically change the way that the healthcare, marketing, and security industries are run. If you would like to learn more about how computer vision can drive your company into the future, visit our page at Skyl.ai.
