An Intro to OpenCV with Python 3.x

Saurav Saha
Sep 2, 2018 · 6 min read

So hello there, people! Want to enter the world of computer vision, image processing and AI? You need to start off somewhere, don’t you? But before you go on to create a sentient AI capable of remembering people’s faces, you must of course start by learning the basics. OpenCV might be just the thing you are looking for. So, in today’s post we will look at some easy but very handy basics of image processing and manipulation using OpenCV and Python, which might turn out to be crucial in your journey of exploring computer vision. If you are already an image processing expert this post might seem trivial to you, but go through it nonetheless; you might recall a thing or two that you had forgotten while getting lost in the path of life.

As I am a big anime fan, we will use this picture of good old Goku in his Super Saiyan form as our guinea pig for the image processing experiments in this post.

So, first let’s start off by learning to load our image:

Use the function cv2.imread() to read an image. It returns the image as a NumPy array. The image should be in the working directory, or the full path to the image should be given.

The second argument is a flag that specifies how the image should be read.

  • cv2.IMREAD_COLOR : Loads a color image. Any transparency of image will be neglected. It is the default flag.
  • cv2.IMREAD_GRAYSCALE : Loads image in grayscale mode
  • cv2.IMREAD_UNCHANGED : Loads image as such including alpha channel.

But yeah, I know we are lazy, and typing those long flag names isn’t an option. We obviously want something easier, so we can simply use the numbers 1, 0 and -1 respectively for the above flags. So, the next thing we will do is load the image in grayscale.

So, all we need to do is pass 0 as the second argument to the earlier cv2.imread() call to load our image in grayscale, and our code would now look something like this:

gray scaled image

Now you must be wondering what exactly the cv2.waitKey(0) command does. cv2.waitKey() is a keyboard binding function. Its argument is a time in milliseconds. The function waits that many milliseconds for a keyboard event. If you press any key in that time, the program continues. If 0 is passed, it waits indefinitely for a keystroke. It can also be used to detect specific keystrokes, for example whether the key a was pressed.

You might remember me telling you earlier that cv2.imread() returns our image in the form of a NumPy array. So yes, we can look at the shape of that array to find out the resolution of our image. We do this on line 11 of our code, which is print(image.shape).

The output returned by executing print(image.shape)

Doing this, we learn that the resolution of our image is 353 x 500, which means our image is 353 pixels tall and 500 pixels wide, but we also see a third number in the output. This number simply tells us the number of color channels in our image, i.e. 3, for red, green and blue, as ours is a color image. Note that OpenCV stores these channels in blue, green, red (BGR) order rather than RGB. A color image of this kind, sometimes referred to as a truecolor image, is stored as an m-by-n-by-3 data array that defines red, green, and blue color components for each individual pixel. The color of each pixel is determined by the combination of the red, green, and blue intensities stored in each color plane at the pixel’s location.

Enough of this nerdy and technical mumbo-jumbo; let’s move on and learn how to resize our image. We can do so with cv2.resize(). Let’s change the resolution of our image to 150 x 100.

The size of the image can be specified manually, or you can specify a scaling factor. Different interpolation methods are available. The preferable interpolation methods are cv2.INTER_AREA for shrinking, and cv2.INTER_CUBIC (slow) or cv2.INTER_LINEAR for zooming. By default, the interpolation method is cv2.INTER_LINEAR for all resizing purposes. As you can see, we are using cv2.INTER_AREA as our interpolation method.

resized image

Let us now learn how to rotate our image. First, we store the height and width of our image and calculate its center. Then we build a rotation matrix that describes how our image is rotated about its center, using cv2.getRotationMatrix2D(). Its first argument is the center of rotation. The second argument is the angle in degrees by which we want to rotate our image, i.e. 90 degrees in this case; positive angles rotate counter-clockwise. The third parameter is the scaling factor, here 1.0, which means we are not resizing the image in any manner. cv2.warpAffine() then applies this matrix and returns the rotated image that we need.

90 degree rotated image of goku

The next thing we will learn is cropping our image, which is very easy indeed. We simply index the image with the required ranges along the Y and X axes, in that order, to cut out the part of the image that we need.

cropped image
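
Cropping is plain NumPy slicing, so a minimal sketch needs nothing beyond NumPy itself. The co-ordinates below are arbitrary values picked for illustration, and the blank array stands in for the real picture.

```python
import numpy as np

# Stand-in for the loaded picture (353 pixels tall, 500 pixels wide).
image = np.zeros((353, 500, 3), dtype=np.uint8)

# Rows (Y) come first, columns (X) second: image[y1:y2, x1:x2].
y1, y2 = 50, 200
x1, x2 = 100, 300
cropped = image[y1:y2, x1:x2]
print(cropped.shape)  # (150, 200, 3)

# The slice is a view on the original data, so drawing on it would also
# change `image`. Take a copy when you want an independent crop.
is_view = np.shares_memory(cropped, image)
independent = cropped.copy()
is_copy_shared = np.shares_memory(independent, image)
print(is_view, is_copy_shared)  # True False
```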

You can also draw random stuff on your image if you want, using functions such as cv2.line(), cv2.rectangle() and cv2.circle(). All these functions take as their first parameter the image on which the figure is drawn. The last parameter is the thickness of the figure’s outline; if it is given as -1, the whole figure is filled with color. You can’t use -1 for a line, though, for the obvious reason that a line has no interior to fill.

Image Output when we use: cv2.rectangle(image,(50,50),(250,250),(0,0,0),-1) and cv2.circle(image,(70,70),30,(255,0,255),-1)

The last thing covered in this post is writing text on an image and saving it permanently. cv2.putText() is used to write text onto our image. The first parameter is obviously our image, and the next one is the string to be displayed. The third parameter gives the co-ordinates where the text is placed, and the next one is the font style that we will be using. The parameters after that set the font scale, the color of the text and the thickness of its strokes. The parameter cv2.LINE_AA specifies that anti-aliasing is used when drawing the string.

And there it is! You can congratulate yourself: you’ve learnt, or at least revised, a lot of stuff today! Now go ahead and explore the code yourself and try it out on your favorite images. Maybe try your hand at creating those popular Vegeta memes like the one below. Lol!

