Computer Vision: A Short Beginner’s Guide

Hari Lakshman
Published in Analytics Vidhya
5 min read · May 21, 2021

Contents:

  1. Overview
  2. How does CV work?
  3. Applications of CV
  4. Math needed in CV
  5. How to Learn CV — My path….

1. Overview:

In layman’s terms, computer vision (CV) is a branch of computer science that aims to replicate human vision in a computer: it gives a computer, a robot, or any other machine the sense of sight.

Computer vision allows computers to identify and extract information from the objects found in images, videos, and the real world. Visual data can be very difficult for computers to understand. Humans make sense of what we see based on our experiences and memories. We have been training our brains since the day we were born, which puts computers at a disadvantage when it comes to interpreting visual information. That is where the genius of computer vision comes in. With support from artificial intelligence, neural networks, deep learning, parallel computing, and machine learning, computer vision is helping to bridge the gap between computers seeing and computers comprehending what they see.

Image credit: hqsoftwarelab

2. So how does computer vision work?

What computer vision does is simple to state: it understands the image. This extends to videos, since a video is technically a sequence of images (frames). Understanding an image in full is a complex and lengthy problem, so in practice people pick out the specific tasks their application requires and solve only those.

There are several tasks in image understanding. Some are low-level tasks that serve as building blocks for others, while some are high-level tasks. Some of the low-level tasks are:

  • Image cleaning
  • Image segmentation
  • Histogram analysis
  • Image color space translation
  • Image transformation
  • Edge detection, and contour and line approximation, etc.

Some of the high-level tasks (which usually build on the low-level ones) are:
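A few of the low-level tasks listed above can be sketched with plain NumPy. This is only a minimal illustration on a synthetic image, not how OpenCV or any other library implements them. The function names are made up for this example, and the grayscale weights are the commonly used BT.601 luminance weights.

```python
import numpy as np

def to_grayscale(rgb):
    """Color space translation: RGB -> one luminance channel (BT.601 weights)."""
    weights = np.array([0.299, 0.587, 0.114])
    # Round before casting so that pure white maps to exactly 255.
    return np.rint(rgb.astype(np.float64) @ weights).astype(np.uint8)

def histogram(gray, bins=256):
    """Histogram analysis: count how many pixels fall at each intensity level."""
    counts, _ = np.histogram(gray, bins=bins, range=(0, 256))
    return counts

def sobel_edges(gray):
    """Edge detection: gradient magnitude from the two 3x3 Sobel filters."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=np.float64)
    ky = kx.T
    h, w = gray.shape
    # Repeat border pixels so the output has the same size as the input.
    padded = np.pad(gray.astype(np.float64), 1, mode="edge")
    gx = np.zeros((h, w))
    gy = np.zeros((h, w))
    for dy in range(3):
        for dx in range(3):
            window = padded[dy:dy + h, dx:dx + w]
            gx += kx[dy, dx] * window
            gy += ky[dy, dx] * window
    return np.hypot(gx, gy)

# Synthetic 8x8 RGB image: black left half, white right half.
image = np.zeros((8, 8, 3), dtype=np.uint8)
image[:, 4:] = 255

gray = to_grayscale(image)
counts = histogram(gray)
edges = sobel_edges(gray)

# The edge response is non-zero only along the black/white boundary.
print(np.nonzero(edges.sum(axis=0))[0])
```

In a real project you would normally call a library such as OpenCV for these steps; writing them out by hand is mainly useful for seeing what the operations do to the pixel values.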

  • Object detection
  • Object recognition
  • Object segmentation and localization
  • Object tracking
  • Feature extraction
  • Feature and color correction
  • Feature reconstruction and approximation, etc.

All computer vision problems share a similar approach: understanding what is in the scene. We humans do this by looking at colors, shadows, strokes, and so on to recognize different images and what is inside them. Computers follow the same approach, but with much more efficient techniques, and the method is chosen to fit the task. For example, to track moving objects in CCTV footage, you just need to compare consecutive frames and see which pixel values change. Because a CCTV camera is stationary, the background stays constant and its pixel values remain nearly the same; only a moving object produces changing pixel values. The changing region is recognized as the object.
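The CCTV idea above (compare frames and see which pixels change) can be sketched as simple frame differencing. This is a minimal sketch on two synthetic grayscale frames, assuming a perfectly still camera and no noise. The function names and the threshold of 25 are illustrative choices, not values from any particular system.

```python
import numpy as np

def moving_mask(previous_frame, current_frame, threshold=25):
    """Mark pixels whose grayscale value changed by more than `threshold`."""
    # Cast to a signed type first so the subtraction cannot wrap around in uint8.
    diff = np.abs(current_frame.astype(np.int16) - previous_frame.astype(np.int16))
    return diff > threshold

def bounding_box(mask):
    """Return (top, left, bottom, right) of the changed region, or None if nothing moved."""
    rows, cols = np.nonzero(mask)
    if rows.size == 0:
        return None
    return int(rows.min()), int(cols.min()), int(rows.max()), int(cols.max())

# Two synthetic 10x10 grayscale frames from a fixed camera.
background = np.full((10, 10), 50, dtype=np.uint8)
frame_a = background.copy()
frame_b = background.copy()
frame_b[2:5, 6:9] = 200  # a bright 3x3 object appears in the second frame

mask = moving_mask(frame_a, frame_b)
print(bounding_box(mask))  # (2, 6, 4, 8)
```

Real footage has sensor noise and lighting changes, so practical systems usually blur the frames first or maintain a running background model rather than comparing two raw frames.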

But how does the modern approach work? Currently, this is done with a type of neural network called a convolutional neural network (CNN). This is a specific type of network that can recognize the features in an image. These networks are a sequential arrangement of what are called convolutional layers.
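To make "recognizing features" a little more concrete, here is a minimal sketch of the operation a convolutional layer applies: sliding a small kernel over the image and summing the element-wise products. The kernels below are hand-picked for illustration; in a trained network the kernel values are learned from data. The helper name `conv2d` is made up for this example.

```python
import numpy as np

def conv2d(image, kernel):
    """Valid-mode 2D cross-correlation, the core operation of a convolutional layer."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# 5x5 image containing a single vertical line in column 2.
image = np.zeros((5, 5))
image[:, 2] = 1.0

# Hand-picked kernels: one responds to vertical lines, one to horizontal lines.
vertical_kernel = np.array([[-1.0, 2.0, -1.0],
                            [-1.0, 2.0, -1.0],
                            [-1.0, 2.0, -1.0]])
horizontal_kernel = vertical_kernel.T

vertical_response = conv2d(image, vertical_kernel)
horizontal_response = conv2d(image, horizontal_kernel)

print(vertical_response.max())    # 6.0: strong response where the line is centered
print(horizontal_response.max())  # 0.0: no horizontal line in the image
```

Each kernel acts as a detector for one kind of local pattern, which is why stacking many of them lets a network pick up edges first and more complex shapes in later layers.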

This is the simplest visualization of a neural network. You can observe how the layers are arranged and how they are connected to each other.
This is how a convolutional neural network looks in depth: the input is on one side, each layer extracts features from the image, and the output is the label for the image.
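The input-to-label flow described above can be sketched as a tiny forward pass: convolution, ReLU, pooling, a dense layer, and softmax. Everything here is an assumption for illustration: the weights are random and untrained, the label names are hypothetical, and real networks are built with a framework rather than by hand. The prediction is therefore meaningless; only the flow of data through the layers matters.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv2d(image, kernel):
    """Valid-mode 2D cross-correlation (one convolutional filter)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(feature_map, size=2):
    """Keep the largest value in each non-overlapping size x size block."""
    h, w = feature_map.shape[0] // size, feature_map.shape[1] // size
    trimmed = feature_map[:h * size, :w * size]
    return trimmed.reshape(h, size, w, size).max(axis=(1, 3))

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()

labels = ["cat", "dog", "car"]            # hypothetical class names
image = rng.random((8, 8))                # stand-in for a grayscale input image
kernels = rng.standard_normal((4, 3, 3))  # 4 untrained 3x3 filters

# conv (8x8 -> 6x6), ReLU, max pool (6x6 -> 3x3), for each of the 4 filters.
feature_maps = [max_pool(np.maximum(conv2d(image, k), 0.0)) for k in kernels]
features = np.concatenate([fm.ravel() for fm in feature_maps])  # 4 * 9 = 36 values

# Dense layer maps the 36 features to one score per label.
dense_weights = rng.standard_normal((features.size, len(labels)))
dense_bias = np.zeros(len(labels))
probabilities = softmax(features @ dense_weights + dense_bias)

predicted_label = labels[int(np.argmax(probabilities))]
print(predicted_label, probabilities.round(3))
```

Training would adjust `kernels`, `dense_weights`, and `dense_bias` so that the highest probability lands on the correct label for each training image.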

3. Applications of Computer Vision:

There are many industries and areas where CV can be applied. Here are a few important and successful areas where CV is used extensively.

  1. Google Lens
  2. Robotics
  3. Manufacturing industries (defect detection, assembly lines, etc.)
  4. Security systems
  5. VR/AR
  6. Smartphones
  7. Automated vehicles
  8. Medical imaging (MRI reconstruction, automatic pathology, robot-assisted surgery, etc.)

These are only a tiny fraction of the applications of computer vision.

4. Math behind Computer Vision

Getting started with computer vision requires that you are comfortable with maths such as:

  • Linear algebra: matrices, vectors, and singular value decomposition (SVD), and especially how to use them.
  • Numerical optimization: first-order and second-order optimization methods.
  • Probability and statistics: random variables, probability distribution functions, and Bayes’ theorem.
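As one example of using this math rather than just knowing it, the sketch below applies SVD to approximate an image with fewer numbers (a low-rank approximation). The 6x8 "image" is synthetic and deliberately constructed to have rank 2, and the helper name `low_rank` is made up for this illustration.

```python
import numpy as np

# Synthetic 6x8 "image" whose pixel value is (row + 1) + column.
# It is the sum of two outer products, so its rank is 2.
rows = np.arange(1, 7, dtype=np.float64)
cols = np.arange(8, dtype=np.float64)
image = np.outer(rows, np.ones(8)) + np.outer(np.ones(6), cols)

# Thin SVD: image = U @ diag(s) @ Vt, with singular values sorted largest first.
U, s, Vt = np.linalg.svd(image, full_matrices=False)

def low_rank(k):
    """Rebuild the image from only its k largest singular values."""
    return (U[:, :k] * s[:k]) @ Vt[:k]

error_rank_1 = np.linalg.norm(image - low_rank(1))
error_rank_2 = np.linalg.norm(image - low_rank(2))

print(s.round(3))                   # only the first two values are non-negligible
print(error_rank_1 > error_rank_2)  # True: keeping more components lowers the error
```

The same idea, keeping only the largest singular values, underlies simple image compression and dimensionality reduction techniques such as PCA.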

5. How to learn CV — What I did….

The first step was to understand the basics of computer vision and the math behind it.

I started with simple projects: object detection, feature filtering, face detection, face animation, and art with machine learning.

First, I downloaded OpenCV and tried out some sample code, such as static image matching, object detection, and classification, to get hands-on with the code. There was an article on the internet (unfortunately I did not save it, but with a little effort you can find it easily) about an intriguing project using a camera (your phone’s, your laptop’s, or any other). It detects people’s faces and looks up their face features in a database; once a face is successfully matched, it automatically registers the attendance in an Excel sheet. This was my first really cool project with CV.

Whatever subject you are learning, once you get hands-on experience you gain confidence, come up with more innovative ideas, and feel good about it.
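The matching and attendance steps of such a project might look roughly like the sketch below. It is not the code from the article mentioned above. It assumes that some face detector and feature extractor has already turned each face into a numeric vector (an "embedding"); the names, the 3-number embeddings, and the distance threshold of 0.6 are all hypothetical, and a CSV written to an in-memory buffer stands in for the Excel sheet.

```python
import csv
import io
from datetime import date

import numpy as np

def match_face(embedding, database, max_distance=0.6):
    """Return the name whose stored embedding is closest, or None if none is close enough."""
    best_name, best_distance = None, float("inf")
    for name, known_embedding in database.items():
        distance = float(np.linalg.norm(embedding - known_embedding))
        if distance < best_distance:
            best_name, best_distance = name, distance
    return best_name if best_distance <= max_distance else None

def mark_attendance(name, day, writer, already_marked):
    """Write one row per person per run; return True if a row was written."""
    if name is None or name in already_marked:
        return False
    writer.writerow([name, day.isoformat()])
    already_marked.add(name)
    return True

# Hypothetical database of known people and their face embeddings.
database = {
    "asha": np.array([0.10, 0.90, 0.30]),
    "ravi": np.array([0.80, 0.20, 0.50]),
}

sheet = io.StringIO()  # stand-in for the attendance file
writer = csv.writer(sheet)
marked = set()
today = date(2021, 5, 21)

# Embeddings that a camera pipeline might produce for three detected faces.
detected = [
    np.array([0.12, 0.88, 0.31]),  # close to "asha"
    np.array([0.12, 0.88, 0.31]),  # same person seen again in a later frame
    np.array([5.00, 5.00, 5.00]),  # a stranger, far from everyone in the database
]
results = [mark_attendance(match_face(e, database), today, writer, marked) for e in detected]

print(results)           # [True, False, False]
print(sheet.getvalue())  # one row: asha,2021-05-21
```

In a real setup the embeddings would come from a face recognition model applied to camera frames, and the threshold would need tuning so that strangers are not matched to the nearest known person.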

A quick and effective path:

1. Take online courses to gain domain expertise or improve your knowledge.

2. Understand programming patterns and principles, such as object-oriented programming.

3. Use machine learning libraries and frameworks.

4. Read practical ML/DL books.

5. Be aware of cloud services such as GCP, AWS, etc.

6. Understand deep learning fundamentals.

7. Practice makes perfect: experiment and play with it.

Feel free to contact me for guidance and help in robotics, CV, and ML. Looking forward to learning along with you.

Thanks for reading,

Email: harilakshmanrb@gmail.com
