SEEING THROUGH THE EYES OF A MACHINE
Computer Vision is the field that enables machines such as computers or mobile phones to see their surroundings. Serious work on re-creating human vision started back in the 1950s, and we have come a long way since then. Computer vision has already made its way to our mobile phones through various e-commerce and camera apps.
Think of how much more machines could do if they were able to see as accurately as the human eye. The human eye is a complex structure, and the process by which we understand our environment is even more complex. In the same way, making machines see things, figure out what they are seeing and then categorize it is still a pretty tough job.
Computer Vision means performing millions of calculations in the blink of an eye, with almost the same accuracy as the human eye. It is not just about converting a picture into pixels and then trying to make sense of what is in the picture through those pixels; you first have to understand the bigger picture of how to extract information from those pixels and what they represent.
So, how do machines see?
A. Represent colors by numbers: In computing, each color is represented by a number, commonly written as a HEX value that encodes its red, green and blue components. This is how machines are programmed to understand what colors the pixels of an image are made of, whereas humans have an innate ability to tell shades apart.
B. Image segmentation: Computers identify groups of similar colors and then segment the image, i.e. distinguish the foreground from the background. Color gradients are used to find the edges of different objects.
C. Finding corners: After segmentation, the image is searched for certain features, also known as corners. In simple words, algorithms look for lines that meet at an angle and enclose a part of the image with a single color shade. These features are the building blocks that help uncover more detailed information contained in the image.
D. Find textures: Another important step in identifying an image correctly is determining the textures in it. Differences in texture between two objects make it easier for a machine to categorize each object correctly.
E. Make a guess: After completing the above steps, the machine makes a best guess by matching the image against those present in its database.
F. Finally, see the bigger picture! At last, the machine sees the bigger, clearer picture and checks whether its identification was right, according to the algorithmic instructions it was given. Accuracy has improved a lot in recent years, but machines still make mistakes when asked to handle images containing a mix of objects.
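To make steps A and B more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not how any particular vision system is implemented: the 4x4 image is made up, the function names hex_to_rgb, to_gray and edge_mask are hypothetical helpers rather than library calls, and the brightness weights are the commonly used Rec. 601 luma weights, chosen here for convenience. In practice you would likely rely on a library such as OpenCV for this.

```python
def hex_to_rgb(hex_color):
    """Convert a HEX string such as '#FF8000' to an (R, G, B) tuple of ints."""
    value = hex_color.lstrip("#")
    if len(value) != 6:
        raise ValueError("expected a 6-digit HEX color")
    return tuple(int(value[i:i + 2], 16) for i in (0, 2, 4))


def to_gray(rgb):
    """Approximate brightness using Rec. 601 luma weights (an assumption)."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b


def edge_mask(gray, threshold):
    """Mark pixels where brightness changes sharply towards the next pixel."""
    height, width = len(gray), len(gray[0])
    mask = [[False] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Forward differences; treated as zero at the right/bottom border.
            dx = gray[y][x + 1] - gray[y][x] if x + 1 < width else 0.0
            dy = gray[y + 1][x] - gray[y][x] if y + 1 < height else 0.0
            # Gradient magnitude above the threshold counts as an edge.
            mask[y][x] = (dx * dx + dy * dy) ** 0.5 > threshold
    return mask


# A made-up 4x4 "image": dark left half, bright right half.
image = [["#101010", "#101010", "#F0F0F0", "#F0F0F0"] for _ in range(4)]

# Step A: turn every HEX color into numbers (here, a single brightness value).
gray = [[to_gray(hex_to_rgb(pixel)) for pixel in row] for row in image]

# Step B: use the gradient to find where dark meets bright.
edges = edge_mask(gray, threshold=50)

for row in edges:
    print("".join("#" if is_edge else "." for is_edge in row))
```

The printed grid marks the column where the dark region ends and the bright region begins. The threshold of 50 is an arbitrary choice for this toy image; real images usually need smoothing and more careful thresholding before the edges become useful.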
2. Universities That Have Computer Vision Research Groups:
Universities in Canada:
Universities in Europe:
University of Oxford (http://www.robots.ox.ac.uk/~vgg/)
3. If you’re starting out in the field of Computer Vision, below is a list of topics you should know.
A. Beginner level
- Linear Algebra
- Singular Value Decomposition
- Introductory level Pattern Recognition
- Principal Component Analysis
- Kalman filtering
- Fourier Transform
To gain practical knowledge of how things work, especially the algorithms, start learning OpenCV from a Computer Vision perspective:
Tip: When programming in C, C++ or Python, we use the OpenCV library for computer vision. When programming in MATLAB, we use the Computer Vision System Toolbox. Similarly, other open-source libraries are available if you are programming in other languages.
You should also know the keywords and key works in the field. Here are some to learn about:
- SIFT: classic descriptor for general-purpose vision
- HOG: well-known descriptor that is particularly good for human detection
- Viola-Jones: great face detector
- Shape Contexts
- Deformable Part Models
Must-read books include:
Advanced level — Towards Deep Learning
TED Talks to watch:
Online Courses to go for:
- Udacity: Introduction to Computer Vision
- Stanford’s CS231n: Convolutional Neural Networks for Visual Recognition
- University of Central Florida — Prof. Mubarak Shah’s Video lectures
- Apply all your knowledge on concepts and algorithms gained from aforementioned resources to solve a few assignments and do a project on your own.
Advanced Level — Towards Deep Learning
- Geoff Hinton’s Neural Net lectures on Coursera
- Stanford course: Deep Learning for Natural Language Processing
- Stanford course: Convolutional Neural Networks for Visual Recognition
4. Projects Around The Globe
b. Project Tokyo: delivers AI-enabled prototypes that augment awareness of the social, physical and textual environment for people who are blind or have vision impairments.
5. Conversation with Experts
Here are a few excerpts from my conversations with two experts who have found their passion in the field of Computer Vision.
A. Conversation with Prof. Devi Parikh | Visiting Researcher at Facebook AI Research | Assistant Professor at Georgia Tech (Previously at Virginia Tech)
Computer Vision is a subfield of Artificial Intelligence where the goal is to build a computer that replicates the visual intelligence of the human brain. Machine Learning is a generic term for teaching machines anything, but Computer Vision deals specifically with visual data. In Machine Learning, we deal more with statistical tools, whereas Computer Vision can include both statistical and non-statistical tools. For instance, 3D reconstruction tends to use machine learning tools less frequently than, say, image classification and object recognition. Many computer vision tasks have their own needs, for which we develop specific machine learning tools.
For any student who wants to start learning about the field, I’d advise going through researchers’ web pages and picking one problem they find interesting. Most people are working on cutting-edge problems for which standard datasets are publicly available. Students can select a problem, a dataset and a library they want to use, and get their hands dirty.
When taking on master’s or PhD students, what I usually look for is accountability, proactiveness, and determination. Have your basic concepts about the field clear. Try to read research papers. Try to get a sense of the problems at the frontiers of AI that researchers worldwide are working on. And get your hands dirty.
B. Conversation with Richa Agrawal | University of Pennsylvania Alumnus | Computer Vision Research Engineer at Whodat
I graduated from MNIT Jaipur, and while studying there I got in touch with the Robotics group. We did a few projects and went on to participate in a national-level competition at IIT Roorkee. We won the competition, and that boosted my morale. After completing my bachelor’s, I started working at Yahoo. I realized that this was not what I wanted to do and hence went for my master’s at the University of Pennsylvania. I explored different research areas during that time by taking different courses and finally decided on Computer Vision as my main research interest. After graduating, I worked at a startup in the US and was looking for a similar opportunity in India, as the field had started growing here too. At Whodat, a Computer Vision startup based out of Bangalore, we work on Augmented Reality and Visualization. For instance, say you’re planning to buy furniture for your home; you go to a shop and choose a piece after imagining how it would look in your home. After the furniture gets delivered, you realize that it is either too big or too small, but nothing can be done about it now. We are trying to help by building a solution that lets you visualize furniture in your home. This will enable you to make better decisions and purchase items hassle-free.
While studying, there were many times when I was not able to give my best and felt demotivated, but then a piece of advice from a friend came to the rescue. He told me: ‘There are only a few people (less than 0.1%) who are able to make it to this point (doing a master’s abroad, and that too in a technical field like Computer Vision), and you have already proved that you’re one of them. You just need to push a little harder. Only you can do it for yourself; nobody else will do it. And in the end, your learning is what matters the most.’
My suggestion for students who want to get started is to talk to their peers in other colleges and ask what kind of projects they do. Then they can form a team with a leader and start experimenting. I’d also recommend participating in competitions and hackathons. It is highly important to find your interests and go with them instead of working in an area you don’t like. Computer Vision, for instance, is a great area with huge scope for development in India, since all you need in this field is a camera, and cameras have now started reaching even smaller cities. So, the future of Computer Vision is definitely bright.
Apoorva Bhalla | Content & Marketing Fellow at Connectedreams.com