Baby Steps Into Computer Vision

Toni Kendel
Blue Harvest Tech Blog
6 min read · Jan 12, 2022


You have probably noticed that in recent years computer vision has found its way into many different technologies, such as fingerprint and face unlock on your phone or the self-driving vehicles rolling out of Tesla’s factories. It is also used to advance medicine, where computers help detect medical issues in patients’ X-rays.

Given these examples, you might be surprised to hear that a working prototype can be written in a couple of hours. With the entry bar lower than ever, you do not need a thorough understanding of mathematics or a complex programming language to jump in and build.

This article collects some of the sources that guided me as I discovered computer vision. Hopefully it will help you, as those sources helped me, to peek behind the curtain and see how you could take your first step into the world of computer vision.

Once you have set your mind on building something in the field of computer vision, there are some technologies to cover, from programming languages to libraries and tutorials that will help you get around more easily.

Programming Language

To start, you will need some tools to work with, and there are a few different ones for different occasions.

Python

Python is better suited to higher-level work. Since the images you will be working with are NumPy arrays, NumPy and SciPy will come in handy. Python's extensive library support also reduces development time and makes the language more approachable for newcomers to the field.
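To make the point about images being NumPy arrays concrete, here is a minimal sketch that builds a tiny synthetic image instead of loading a file. The image size, the color, and the variable names are made up for illustration. It assumes a height x width x channel layout with 8-bit values, which is how OpenCV's Python bindings represent color images.

```python
import numpy as np

# A 4x6 color image: height 4, width 6, 3 channels, 8-bit values.
image = np.zeros((4, 6, 3), dtype=np.uint8)

# Paint the left half (columns 0..2) pure blue.
# OpenCV stores channels in BGR order, so channel 0 is blue.
image[:, :3] = (255, 0, 0)

# Cropping is plain array slicing: rows 1..2, columns 2..4.
crop = image[1:3, 2:5]

# A simple grayscale conversion: average the three channels per pixel.
gray = image.mean(axis=2).astype(np.uint8)

print(image.shape)  # (4, 6, 3)
print(crop.shape)   # (2, 3, 3)
print(int(gray[0, 0]), int(gray[0, 5]))  # 85 0
```

Because an image is just an array, everything you already know about slicing and arithmetic in NumPy applies directly to pixels.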

C++

OpenCV is written in C++, which makes C++ a natural choice. It is better suited to CPU-intensive, number-crunching tasks, and it offers some advanced functionality that has not been ported to Python yet.

I would recommend Python as your go-to Swiss Army knife for computer vision. Do keep in mind that both languages can use the same computer vision library! Python is a general-purpose language with a large data-analysis ecosystem, so it is widely supported and used in a lot of AI fields and big data projects.

Libraries

Once you have picked your main tool, you will want a library to help create your project.

OpenCV

OpenCV is one of the libraries I would recommend. It offers a wide range of useful functions and algorithms.

Because it requires minimal mathematical knowledge, the learning curve is drastically reduced. As a result, it helps you build a computer vision project while also serving as a stepping stone into the field. It can be used from C++, Python, Java, and C.

YOLO

YOLO, or You Only Look Once, is a very fast object detection tool built on a neural network that processes the whole image in a single pass. Because of that, it learns well and is quite accurate. Its drawback is relatively poor performance when detecting smaller objects.
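Detectors in the YOLO family return candidate bounding boxes with confidence scores, and overlapping candidates for the same object are usually filtered with non-maximum suppression (NMS) based on intersection over union (IoU). The sketch below shows that post-processing idea in plain Python. It is not YOLO's actual code: the `(x1, y1, x2, y2)` box format, the function names, the sample detections, and the 0.5 threshold are all assumptions made for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Return indices of the boxes to keep, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with a kept one.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


# Hypothetical detections: two overlapping boxes on one object, one elsewhere.
boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (100, 100, 140, 140)]
scores = [0.9, 0.75, 0.8]
print(non_max_suppression(boxes, scores))  # [0, 2]
```

The second box overlaps the first with an IoU of about 0.82, so it is dropped in favor of the higher-scoring one, while the distant third box survives.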

Courses

You don’t need much to build your first computer vision prototype. Aside from the aforementioned libraries, the only thing you need to start your journey is knowledge of your chosen programming language.

Introduction

Now that you have your preferred tool and library, you might want to put on your work attire and get down to business. But how do you navigate the vast amount of information in the library?

Computer vision projects often follow the same basic idea: you process an image in order to extract the information you need.
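The idea of processing an image to extract information can be sketched without any file on disk. The example below assumes a grayscale image in which the object of interest is brighter than the background. The image contents and the threshold of 128 are invented for illustration, and a real project would typically use library functions for these steps.

```python
import numpy as np

# Synthetic 8x8 grayscale image: dark background with one bright blob.
image = np.full((8, 8), 20, dtype=np.uint8)
image[2:5, 3:7] = 200  # rows 2..4, columns 3..6

# Step 1: threshold to a boolean mask that separates object from background.
mask = image > 128

# Step 2: extract information from the mask, here the bounding box and area.
rows, cols = np.nonzero(mask)
top, bottom = int(rows.min()), int(rows.max())
left, right = int(cols.min()), int(cols.max())
area = int(mask.sum())

print((left, top, right, bottom))  # (3, 2, 6, 4)
print(area)                        # 12
```

Most beginner projects are variations on these two steps: clean up the image until the interesting part stands out, then measure it.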

Computer Vision for Beginners: Part 1 — Written by Jiwon Jeong, this source guided me through this challenge systematically. It offers thorough descriptions and supporting visuals made especially for beginners.

OpenCV Documentation — If you have questions about specific functions and how they work, the OpenCV docs explain them well.

Examples

PyImageSearch — Elaborate examples on topics revolving around object detection, hand sign tracking, and more.

A Gentle Introduction to Object Recognition With Deep Learning — Excellent introduction to object recognition in general.

OpenCV Text Detection — A great article if you are looking into text detection.

Where to branch out after you have dabbled a bit with CV?

There are a lot of different paths you can take from here, given that the field of artificial intelligence is vast. For some, exploring its subfields might feel like the next step. There are many great online courses, one of which is this course taught by Andrew Ng. If that path interests you, you might also want to consider taking courses in Azure or AWS for data science.

In my case, I started expanding my knowledge of machine learning. As a consequence, I ended up combining machine learning and CV to improve earlier projects. One of those projects, which detected content in user images, initially made use of OpenCV alone. With machine learning added, I managed to train a model on a small test set of images to find the relevant information after the images had been processed with computer vision.

However, if you would like to build expertise on this topic and understand how these algorithms work behind the scenes, I would suggest a different approach. Knowledge of and interest in mathematics and statistics are crucial for forming an in-depth understanding of any AI field, including CV. In order to build something yourself without depending on a framework, you will need to become familiar with things like algebra, probability theory, model selection, the Gaussian distribution, and polynomial fitting, just to name a few.
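As a small taste of the math mentioned here, polynomial fitting can be tried in a few lines with NumPy. The sketch below generates points from a known quadratic and recovers its coefficients with a least-squares fit. The polynomial and the sample points are made up for the example.

```python
import numpy as np

# Sample points from y = 2x^2 - 3x + 1 (coefficients chosen for illustration).
x = np.linspace(-2.0, 2.0, 9)
y = 2.0 * x**2 - 3.0 * x + 1.0

# Least-squares fit of a degree-2 polynomial; coefficients come back
# ordered from the highest power down to the constant term.
coeffs = np.polyfit(x, y, deg=2)

print(np.round(coeffs, 6))
```

Since the data is noise-free, the fitted coefficients should match 2, -3, and 1 up to floating-point error. Adding random noise to `y` is a quick way to see how the fit degrades.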

From my experience, the best recommendation I can give anyone is to get your hands dirty and start working on some prototypes. I believe that is the best way to learn. I started working on projects that required computer vision functionality without any prior experience. Some pointers and mentoring along the way opened my eyes to this newer field and sparked a drive in me to learn more about AI. At some point in the future, I wish to bring my knowledge to the medical field and help other people.

Some of the technologies I have had the pleasure of working with in the AI field are TensorFlow and Keras, both of which have been quite user-friendly so far. In the future, I will write a second article that expands on my journey in the field of AI!

In conclusion, this article has covered the basic and essential tools you will need to start exploring the topic of computer vision.

If you are looking for more resources on a certain topic, please feel free to reach out to me.

