Toonify (Cartoonization) Images Using OpenCV and NumPy Python Libraries

Cartoonization turns a picture of a person, vehicle, or object into a humorous sketch or drawing. Many software applications can do this, but they take time to use and are expensive. A simpler way that saves both time and money is to build the effect yourself with the OpenCV and NumPy Python libraries, which takes only a few lines of code. Many people worry about the coding and get stuck at a few steps because they do not understand what a particular function or step does. This basic cartoon-effect project explains each step clearly.

Likhitha kakanuru
Analytics Vidhya
5 min read · Sep 12, 2020


Before getting to the code, we need to know a little about the OpenCV and NumPy Python libraries.

Let’s start with the OpenCV library:

OpenCV:

OpenCV is an open-source computer vision and image processing library originally developed by Intel. It provides simple ways to read and write images, along with many built-in tools for image analysis. It is highly optimized, so computer vision algorithms can run efficiently in real time, and it is available on almost all platforms. Note that OpenCV reads color images in BGR channel order, not RGB.

NumPy:

NumPy is a library for scientific computing in Python. It provides a high-performance multidimensional array object and tools for working with these arrays. A NumPy array is similar to a Python list, and once NumPy is imported we can convert a list to an array with np.array(). Unlike a list, a NumPy array holds elements of a single data type, which we can read from the array's dtype attribute.
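
As a small illustration of these points, the sketch below converts a Python list to a NumPy array and inspects its dtype. The variable names are made up for this example, and the exact integer type reported for the list depends on the platform.

```python
import numpy as np

# A plain Python list converted to a NumPy array.
values = [1, 2, 3, 4]
arr = np.array(values)
print(arr.shape)   # one dimension with four elements
print(arr.dtype)   # an integer type chosen by NumPy (platform dependent)

# Images are usually stored as unsigned 8-bit arrays: height x width x channels.
pixels = np.zeros((2, 3, 3), dtype=np.uint8)
print(pixels.dtype)
```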

Built-in functions required for the project:

— Bilateral Filtering

— Edge Detection

Bilateral Filtering:

Bilateral filtering is a technique for smoothing an image while preserving its edges. It depends on only two parameters, which indicate the size and contrast of the features to preserve. Ordinary smoothing reduces noise and fine detail but also softens edges; bilateral filtering avoids this by averaging only nearby pixels that have similar colors.

Edge Detection:

Edge detection is an image processing technique for finding the boundaries of objects within an image by locating where the brightness changes sharply.

Coming to the coding part, let's start with importing the required resources:

Importing libraries

Next, let's read an image and print its dimensions:

Loads and prints the dimensions of the image

— Downsampling reduces the size of an image, and bilateral filtering smooths it. Both are applied in the later steps.

— cv2.imread("image.jpg") loads the image as a NumPy array, and its shape attribute gives the dimensions as (height, width, channels).

To resize the image, we can use cv2.resize() function as follows:

Resizing the image

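— cv2.resize() changes the image to the requested size, which is passed as (width, height). Shrinking the image reduces the number of pixels that the later steps have to process.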

The next step is to downsample the image using a Gaussian pyramid:

Downsampling image using a Gaussian pyramid

— The Gaussian pyramid is used to downsample images. cv2.pyrDown() blurs the image and then halves its width and height, which reduces the spatial resolution. It is typically used to zoom out of an image.

Next, we will apply small bilateral filters repeatedly instead of one large filter as shown below:

Applying small bilateral filters

— img_color is the input image

— d is the diameter of the pixel neighborhood used during filtering

— sigmaColor is the filter sigma in the color space. The larger the value, the more dissimilar the colors within the pixel neighborhood that will be mixed together

— sigmaSpace is the filter sigma in the coordinate space. The larger the value, the farther apart pixels can be and still influence each other, as long as their colors are similar enough.

Note: Large filters are very slow, so it is recommended to use d=5 for real-time applications and d=9 for offline applications that need heavy noise filtering.

Next, we need to perform upsampling as follows:

Upsampling image

— Upsampling increases the spatial resolution, that is, the size of an image. It is typically used to zoom in on a small region of an image and to eliminate the pixelation that appears when a low-resolution image is displayed on a relatively large frame.

— cvtColor() is used to convert an image from one color space to another. Here img_rgb is the image being converted, in this case to the grayscale image img_gray.

— medianBlur() replaces each pixel with the median of all pixels in the kernel area around it. Here img_gray is the input image and 7 is the kernel size.

— adaptiveThreshold() calculates a separate threshold value for each small region of the image. Here img_blur is the input image and 255 is the maximum value, assigned to pixels that pass the threshold.

— ADAPTIVE_THRESH_MEAN_C sets the threshold for each pixel to the mean of its neighborhood values minus a constant value.

— blockSize sets the size of the neighborhood area. It must be an odd number.

— C is just a constant value that is subtracted from the mean.

The final step is to convert the edge image back to a color image, combine it with the smoothed color image using a bitwise AND, and display the stacked images:

— cv2.bitwise_and() is used to perform image masking. Here img_color is the first input image and img_edge is the second input image.

— np.hstack() joins the images side by side into one array so that they can be displayed in one window.

— cv2.imshow() displays the specified image.

— cv2.waitKey(0) keeps the window open indefinitely until a key is pressed.

A few output images produced by the code are shown below:

To know more about NumPy and other Python libraries you can refer to this link:


Likhitha kakanuru
Analytics Vidhya

Business Analyst | Passionate writer | Likes to write about Technology and real-life Experiences.