Computer Vision for Logo Recognition

Nandan Meda · Published in cverse-ai · 6 min read · Oct 1, 2019

My internship at Couture.ai in the summer of 2019 was a great experience. It ran for eight weeks, from 21st May to 13th July, during which I learned many things: from machine learning and web development to organizational practices such as communication, report writing, and presentations, the internship gave me the opportunity to gain first-hand experience in many areas. The following are brief summaries of the projects I took up during my internship.

1. Generation of a Data Set for a Logo Recognition Machine Learning Program

Using Python, I developed a process that generates a data set for training a machine learning model to predict the brand of a piece of clothing from the logo present on it. The process resizes various brand logos, pastes them onto different images of clothes, and stores the locations of the logos in the images in a CSV file.
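A minimal sketch of this pasting-and-annotation step is shown below, using Pillow. The function name, image sizes, and file names here are illustrative, not the original project's code; the real process iterated over many clothing photos and brand logos.

```python
import csv
import random
from PIL import Image

def paste_logo(cloth, logo, logo_size=(60, 30)):
    """Paste a resized logo at a random position; return its bounding box."""
    logo = logo.convert("RGBA").resize(logo_size)
    # Choose a random top-left corner that keeps the logo inside the image.
    x = random.randint(0, cloth.width - logo.width)
    y = random.randint(0, cloth.height - logo.height)
    cloth.paste(logo, (x, y), logo)  # the logo's alpha channel acts as the mask
    return (x, y, x + logo.width, y + logo.height)

# Stand-in images; in practice these would be loaded from disk.
cloth = Image.new("RGBA", (400, 500), "white")
logo = Image.new("RGBA", (120, 60), "red")

box = paste_logo(cloth, logo)
cloth.convert("RGB").save("tshirt_with_logo.jpg")

# Store the localization in a CSV file, one row per generated image.
with open("annotations.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["image", "x1", "y1", "x2", "y2"])
    writer.writerow(["tshirt_with_logo.jpg", *box])
```

The random placement gives the downstream detector a variety of logo positions to learn from.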

This process was necessary because no open-source data set was available online that provided a localization of the region containing the logo in each image. It is also fully automatic, eliminating the need for manual annotation, which is time-consuming and cumbersome, and the same approach can be reused for other object-localization tasks.

Left: Image of a t-shirt. Right: Image of the same t-shirt after pasting a logo on it using the above process.

2. Study Of Matrix Operations Used In Machine Learning

Linear algebra is a field of mathematics concerned with vectors, matrices, and related operations. It is a key foundation to the field of machine learning, from notations used to describe the operation of algorithms to the implementation of algorithms in code. I learned about a few important matrix operations used in ML, namely Linear Discriminant Analysis, Principal Component Analysis, LU Decomposition, Alternating Least Squares, and Stochastic Gradient Descent.

Dimensionality Reduction Techniques

Dimensionality reduction is, simply, the process of reducing the number of features in your feature set. It is necessary because the more features there are, the greater the chance of overfitting, which results in poor performance on real data.

I learned about two dimensionality reduction techniques, namely Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA). PCA is a method used to reduce the dimensionality of large unlabelled data sets by transforming a large set of variables into a smaller one that contains most of the information of the original set. The goal of LDA is to project a feature space (a dataset of d-dimensional samples) onto a smaller subspace k (where k ≤ d−1) such that the new axes maximize the separation between the multiple classes of the data.

Comparison of PCA and LDA
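The contrast between the two techniques can be seen in a few lines with scikit-learn. The Iris data set here is just a convenient stand-in, not data from the internship: PCA ignores the labels and keeps the directions of maximum variance, while LDA uses the labels to maximize class separation.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = load_iris(return_X_y=True)  # 150 samples, 4 features, 3 classes

# Unsupervised: PCA never sees the class labels.
X_pca = PCA(n_components=2).fit_transform(X)

# Supervised: LDA projects onto at most (n_classes - 1) = 2 axes.
X_lda = LinearDiscriminantAnalysis(n_components=2).fit_transform(X, y)

print(X.shape, X_pca.shape, X_lda.shape)
```

Note the k ≤ d−1 constraint from above: with 3 classes, LDA can produce at most 2 discriminant axes regardless of how many input features there are.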

Matrix Factorisation

Matrix factorisation methods are used to reduce/factorize a matrix into its constituent parts in order to make it easier to perform more complex matrix operations.

The first matrix factorisation method that I learned was Lower–Upper (LU) Decomposition, which factors a matrix as the product of a lower triangular matrix, whose diagonal elements are all equal to 1, and an upper triangular matrix. This method of factorizing a matrix has various applications, such as solving a system of linear equations, finding the inverse of a matrix, and computing its determinant.

Matrix factorisation algorithms are also used in recommender systems to decompose the user-item interaction matrix into the product of two lower dimensionality rectangular matrices. One is the user matrix, where rows represent users and columns are latent factors, while the other matrix is the item matrix, where rows are latent factors and columns represent items. The user and item matrices are obtained by minimizing the error between true rating and predicted rating. I learnt two methods to minimize the error: Alternating Least Square (ALS) and Stochastic Gradient Descent (SGD).

ALS minimizes the loss alternately: it first holds the user matrix fixed and solves a least-squares problem for the item matrix; then it holds the item matrix fixed and solves for the user matrix. With SGD, we take derivatives of the loss function with respect to each variable in the model and update the feature weights one individual sample at a time.

After obtaining the user and item matrices, the predicted rating of an item given by a particular user can be expressed as a dot product of the corresponding user latent vector and item latent vector. With matrix factorisation, less-known items can have latent representations which are as rich as popular items, which improves the recommender’s ability to recommend less-known items.

Example of matrix factorization of movie-ratings data.
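The SGD variant can be sketched in a few lines of NumPy. This is a toy example with made-up ratings, not the internship code: each observed rating contributes one gradient step, and after training the predicted rating is the dot product of the corresponding latent vectors.

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, k = 5, 4, 2  # k = number of latent factors

# Observed ratings as (user, item, rating) triples; values are made up.
ratings = [(0, 1, 5.0), (0, 2, 3.0), (1, 0, 4.0), (2, 3, 1.0), (3, 1, 2.0)]

U = rng.normal(scale=0.1, size=(n_users, k))  # user latent factors
V = rng.normal(scale=0.1, size=(n_items, k))  # item latent factors
lr, reg = 0.05, 0.01  # learning rate and L2 regularization strength

for epoch in range(200):
    for u, i, r in ratings:
        err = r - U[u] @ V[i]                   # error on one sample
        U[u] += lr * (err * V[i] - reg * U[u])  # update one sample at a time
        V[i] += lr * (err * U[u] - reg * V[i])

# Predicted rating = dot product of user and item latent vectors;
# for an observed pair it should land close to the true rating.
pred = U[0] @ V[1]
```

Because every user and item gets a latent vector of the same size k, a rarely rated item's representation is as expressive as a popular one's, which is the property noted above.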

3. Creation of Web Applications for Deploying a Machine Learning Model

Machine learning models need to be deployed so that others can use them. To enable people without a technical background to use a model, it has to be made available to them in a form they are familiar with. For this, the model needs to be encapsulated behind some sort of API that other applications can use to communicate with it.

The brand predicting ML model developed by Couture.ai needed to be deployed over the web so that the client could use it to make predictions. For this, I developed two web applications on which the user can upload images and view the prediction of the ML model. I used Python with Flask for the back-end and AngularJS and JavaScript for the front-end.

Flask is a popular Python web framework made to support the construction of dynamic web applications. AngularJS is an open-source Model-View-Controller framework used to create dynamic views in web applications.

I created two web applications: one where the user could upload a folder of images and another where the user could either click or drag and drop to upload a single image.

In the first web application, the user has the option of uploading a folder containing images. After the user has uploaded the files, the images are displayed on the same webpage using AngularJS and JavaScript. On clicking the Predict button, Flask runs the code of the ML model in order to make predictions on the uploaded images. It then renders the results page where the predictions are displayed.

In the second web application, the user can upload the image by either clicking and selecting the image or by dragging and dropping the image. To provide this functionality in the webpage, I used JavaScript and DropzoneJS, an open source library which helps with drag and drop file uploads. Like the previous application, the predictions are displayed on clicking the Predict button.
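The back-end of both applications follows the same pattern, sketched below. The route name and the `predict_brand` helper are illustrative placeholders, not the original code: Flask receives the uploaded image and returns the model's prediction.

```python
from flask import Flask, request, jsonify

app = Flask(__name__)

def predict_brand(image_bytes):
    # Placeholder: the real application ran the brand-prediction ML model here.
    return "unknown"

@app.route("/predict", methods=["POST"])
def predict():
    file = request.files["image"]  # image uploaded via the web form
    prediction = predict_brand(file.read())
    return jsonify({"filename": file.filename, "brand": prediction})

# Start the development server with `flask run` (or app.run()).
```

The front-end (AngularJS plus DropzoneJS in the second app) simply POSTs the file to this endpoint and renders the JSON response.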

Since I had just completed my second year, this internship introduced me to what it feels like to work in the field of computer science and to how companies function. All in all, it was a great learning experience for me.
