Introduction to Computer Vision

Raghunath D
6 min read · Jan 26, 2019

The goal of computer vision is to understand the story unfolding in a picture.

Why bother learning computer vision?

Well, images are everywhere! Whether it be personal photo albums on your smartphone, public photos on Facebook, or videos on YouTube, we now have more images than ever, and we need methods to analyze, categorize, and quantify their contents.

  • For example, have you recently tagged a photo of yourself or a friend on Facebook? How does Facebook seem to “know” where the faces are in an image?

What is Computer Vision?

Computer vision is a field of computer science that works on enabling computers to see, identify, and process images the way human vision does, and then to produce appropriate output.

If machines could understand what they see, it would open up a wide range of applications.

Applications:

  • Autonomous vehicles (self-driving cars)
  • Robotic surgery and medical diagnosis
  • Robots that can navigate our cluttered world: robotic chefs, farmers, assistants, etc.
  • Intelligent surveillance and drones
  • Creation of Art
  • Improved image and video searching
  • Social and Fun applications — like Instagram filters, etc
  • Improved photography
  • A widely used practical application: face detection and the face-unlock mechanism on our mobile phones.
  • An interesting application: we could build representations of our 3D world using public image repositories like Flickr. We could download thousands and thousands of pictures of Manhattan, taken by citizens with their smartphones and cameras, and then analyze and organize them to construct a 3D representation of the city. We could then virtually navigate the city on our computers. Sounds cool?
  • Computer vision algorithms are also applied to movie analysis, football games, hand gesture recognition (for sign language), license plate reading (just in case you were driving too fast), medicine, surgery, the military, retail, and so on.
  • We even use computer vision in space! NASA’s Mars Rover includes capabilities to model the terrain of the planet, detect obstacles in its path, and stitch together panoramic images.

Medical field:

Computer vision can also be applied to the medical field. For example, we can develop methods to automatically analyze breast histology images for cancer risk factors. Normally, a task like this would require a trained pathologist with years of experience, and it would be extremely time-consuming!

Research demonstrated that computer vision algorithms could be applied to these images and could automatically analyze and quantify cellular structures — without human intervention! Now, we can analyze breast histology images for cancer risk factors much faster.

Of course, computer vision can also be applied to other areas of the medical field. Analyzing X-rays, MRI scans, and cellular structures can all be performed using computer vision algorithms.

Surveillance

Another popular application of computer vision is surveillance. While surveillance tends to have a negative connotation of sorts, there are many different types. One type of surveillance is related to analyzing security videos, looking for possible suspects after a robbery.

But a different type of surveillance can be seen in the retail world. Department stores can use calibrated cameras to track how you walk through their stores and which kiosks you stop at.

On your last visit to your favorite clothing retailer, did you stop to examine the spring’s latest jeans trends? How long did you look at the jeans? What was your facial expression as you looked at the jeans? Did you then pick up a pair and head to the dressing room? These are all types of questions that computer vision surveillance systems can answer.

This list will continue to grow in the coming years. This is an exciting field with endless possibilities.

I repeat, the goal of computer vision is to understand the story unfolding in a picture by enabling computers to understand and label images.

For humans, this task is quite simple. But for computers, it is extremely difficult. Why?

Why is it a difficult task?

Computers see an image as a stream of raw numbers from the camera sensor, typically the color intensities of the red, green, and blue (RGB) components. On top of that, they have to cope with:

  • Camera sensors and lens limitations
  • View point variations
  • Changing lighting conditions
  • Issues of scale
  • Non-rigid deformations
  • Occlusion: partially hidden objects
  • Issues of clutter
  • Object class variations
  • Ambiguous optical illusions
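
To make the "raw numbers" point concrete, here is a minimal sketch in plain Python. The tiny 2x2 image and the `adjust_brightness` helper are made up for illustration; in practice a library such as OpenCV hands you the same kind of numbers as a NumPy array. The sketch shows how a simple lighting change alters every number in the image, even though a human would say the scene is the same.

```python
# A tiny 2x2 "image": each pixel is an (R, G, B) triple of 0-255 intensities.
# These values are invented for illustration only.
image = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (200, 200, 200)],
]

def adjust_brightness(img, factor):
    """Scale every channel by `factor`, clamping results to at most 255."""
    return [
        [tuple(min(255, int(c * factor)) for c in pixel) for pixel in row]
        for row in img
    ]

# Simulate the same scene under dimmer lighting.
darker = adjust_brightness(image, 0.5)

# Count how many pixels ended up with different numbers.
changed = sum(
    1
    for row_a, row_b in zip(image, darker)
    for p, q in zip(row_a, row_b)
    if p != q
)

print(image[1][1])   # the original light-gray pixel
print(darker[1][1])  # the same pixel after dimming
print(changed, "of 4 pixels changed")
```

The computer only ever sees the two grids of numbers; deciding that they show the same scene is exactly the hard part.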

Despite these difficulties, computer vision has had a lot of success stories.

Image Processing vs. Computer Vision

The difference between image processing and computer vision is that in image processing the input is an image and the output is also an image.

In computer vision, the input is an image and the output is some information.
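
The distinction can be sketched with two toy functions in plain Python. Both function names, the grayscale values, and the brightness threshold of 128 are assumptions chosen for illustration, and a real computer vision system would of course extract far richer information than a single label.

```python
def invert(img):
    """Image processing: image in, image out (a photographic negative)."""
    return [[255 - value for value in row] for row in img]

def describe_brightness(img, threshold=128):
    """Computer vision (toy version): image in, information out."""
    pixels = [value for row in img for value in row]
    mean = sum(pixels) / len(pixels)
    return "bright" if mean >= threshold else "dark"

# A small grayscale image: one 0-255 intensity per pixel (made-up values).
gray = [
    [10, 20, 30],
    [40, 50, 60],
]

negative = invert(gray)            # still an image (a grid of numbers)
label = describe_brightness(gray)  # a piece of information (a string)

print(negative)
print(label)
```

Note the return types: `invert` gives back another grid of pixels, while `describe_brightness` gives back a statement about the image.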

What is OpenCV?

OpenCV (Open Source Computer Vision Library) is an open source computer vision and machine learning software library.

Why OpenCV Python?

OpenCV is the most popular Computer Vision library in the world with an estimated 14 million downloads. The following qualities make it an excellent library of choice for building commercial Computer Vision applications.

  1. Highly optimized: OpenCV is written in C/C++ with the goal of building real-time applications. It is highly optimized when compiled with the appropriate options and can utilize multiple cores on your machine. It is also capable of utilizing heterogeneous computing resources (e.g. a GPU) when compiled with OpenCL support.
  2. Open sourced under the BSD license: OpenCV is licensed under the very permissive BSD license. This means you can use it to build commercial applications and do not need to open source your own code. However, there are parts of OpenCV (e.g. the opencv_contrib module) that may or may not be under the BSD license.
  3. Language bindings: It is written in C/C++ with bindings for other languages including Python and Java.
  4. Portability: It supports Linux, Mac, Windows, iOS and Android operating systems.

Raghunath D

Software engineer at Oracle. Data enthusiast interested in computer vision and aspiring machine learning engineer.