How to Score 99% Accuracy on Digit Recognizer (Kaggle Competition) Using a Genetic Algorithm

Deepak Yadav
Nov 2 · 4 min read

I assume everyone knows about the MNIST dataset. If not, read about it on Wikipedia.

There are two ways to solve this problem:

1- Using classical machine learning algorithms like a KNN classifier. (This is a bad idea: KNN compares each test image with the training images at prediction time, so it is very slow.)

2- Using deep learning algorithms like CNN and RNN. (This is a very good idea because CNNs perform very well on images.)
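To see why KNN is slow at prediction time, here is a minimal pure-Python sketch. It is not from the original notebook; the function name and the toy data are made up for illustration. The point is that every single prediction computes one distance per training sample.

```python
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # One squared distance per training sample: the cost of a single
    # prediction grows with the size of the training set.
    distances = [
        (sum((a - b) ** 2 for a, b in zip(x, query)), label)
        for x, label in zip(train_x, train_y)
    ]
    distances.sort(key=lambda pair: pair[0])
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Toy 2-pixel "images"; real digit images have 784 pixels each.
train_x = [(0, 0), (0, 1), (1, 0), (9, 9), (9, 8), (8, 9)]
train_y = [0, 0, 0, 1, 1, 1]
print(knn_predict(train_x, train_y, (8, 8)))  # prints 1
```

With tens of thousands of training images and 784 pixels per image, this loop runs for every test image, which is where the time goes.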

How to score 99% accuracy?

With a genetic algorithm (a type of optimization algorithm that can optimize almost anything you can score, here the design of a CNN).
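If the idea is new to you, the loop below is a minimal sketch of a genetic algorithm in plain Python. It is a toy, not the library used later in this article: the genome here is just a list of bits and the fitness is the number of ones, whereas a CNN optimizer would encode a network architecture in the genome and use validation accuracy as fitness. The function name `evolve` and all its defaults are made up for illustration.

```python
import random

def evolve(fitness, genome_len=20, pop_size=10, num_generations=30, seed=0):
    """Evolve bit-list genomes toward higher `fitness` and return the best one."""
    rng = random.Random(seed)
    population = [
        [rng.randint(0, 1) for _ in range(genome_len)] for _ in range(pop_size)
    ]
    for _ in range(num_generations):
        # Selection: keep the better half unchanged (elitism).
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            mom, dad = rng.sample(parents, 2)
            # Crossover: first part from one parent, the rest from the other.
            cut = rng.randrange(1, genome_len)
            child = mom[:cut] + dad[cut:]
            # Mutation: flip one random bit of the new child.
            flip = rng.randrange(genome_len)
            child[flip] = 1 - child[flip]
            children.append(child)
        population = parents + children
    return max(population, key=fitness)

best = evolve(fitness=sum)
print(best, sum(best))
```

Because the best genomes survive each generation untouched, the best score can never get worse from one generation to the next.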

I don’t know about genetic algorithms. What can I do?

There is no need to worry. Just brush up on the basics and come back again. There are a lot of prebuilt libraries for implementing genetic algorithms, such as DEAP. In this article we are going to use a prebuilt library made for optimizing CNNs. Cool!

But someone said genetic algorithms take too much time to run and need a very fast machine. I don’t have such a machine!

Necessity is the mother of invention!

We have Google Colab (the poor man’s GPU).

I have never used Google Colab. What can I do?

The quick answer: Google Colab is a Jupyter notebook running on Google’s servers, with GPU and TPU available.

The main mission starts here. Be ready!

There are only three steps to follow:

Step 1- Make a Google account. Click here. (Very easy.)

Step 2- Clone the pre-built genetic algorithm and import it into Google Drive. (Very, very easy.)

Step 3- Run the algorithm. (Very, very, very easy.)

Step 4- Make predictions and submit to Kaggle. (Yes, a fourth step. This is cheating!)

Step 1

Do it yourself.

Step 2

Step 2.1- Clone the pre-built algorithm.

Click here and clone the repo onto your machine.

Step 2.2- Import it into Google Drive (be careful, this one is going to hurt).

Upload the whole folder to your Google Drive. (You know about Google Drive, don’t kid me.)

Click on New->Folder upload->select devol from your machine.

Step 2.3- Open demo.ipynb

devol->example->demo.ipynb (right-click and open with Google Colab)

Step 2.4- Change some parameters in the code

num_generation = 3, pop_size = 10, epochs = 4. (These small values take less time to run.)
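To get a feel for why these values matter, here is a rough back-of-the-envelope estimate of the training budget. It assumes every generation trains a full population of fresh models, which is an upper bound and not necessarily how the library schedules its work; the variable names simply mirror the parameters above.

```python
num_generation = 3   # rounds of evolution
pop_size = 10        # candidate CNNs per generation
epochs = 4           # training epochs per candidate

# Upper-bound estimate: one full training run per candidate per generation.
models_trained = num_generation * pop_size
total_epochs = models_trained * epochs
print(models_trained, total_epochs)  # 30 120
```

Doubling any one of the three numbers doubles the total, so keep them small for a first run.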

Step 2.5- Change the runtime

Runtime->Change runtime type. Set it to Python 3 and GPU, then save.

Step 2.6- Mount the drive

Click MOUNT DRIVE->the code is generated automatically->run it (Ctrl+Enter).
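The generated cell boils down to a call to `drive.mount` from the `google.colab` package. The sketch below wraps that call in a small helper (the helper name `mount_drive` is made up) so that it does nothing harmful when run outside Colab; `/content/drive` is the usual mount point, but treat it as an assumption and use whatever your generated cell shows.

```python
def mount_drive(mount_point="/content/drive"):
    """Mount Google Drive when running inside Colab.

    Returns the mount point, or None when not running in Colab.
    """
    try:
        from google.colab import drive  # only importable inside Colab
    except ImportError:
        print("Not running in Colab; nothing to mount.")
        return None
    drive.mount(mount_point)  # asks you to authorize access the first time
    return mount_point

mounted_at = mount_drive()
```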

Step 2.7- Copy the path of devol and install it.

Navigate to devol, right-click on it, and copy the path.

Install it using pip: !pip install -e /path/of/devol. See the image below.

Step 2.8- Change the directory

Paste the same path here as well.

Important: run all three steps (2.6, 2.7, 2.8) in sequence. See the image below.
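The order matters because the devol path only exists after the drive is mounted. As a rough Python equivalent of the install and change-directory cells, the sketch below checks for the folder first. The path in `DEVOL_PATH` is a hypothetical example; replace it with the path you copied in Step 2.7. The helper names are made up as well.

```python
import os
import subprocess
import sys

# Hypothetical example path: replace with the path you copied in Step 2.7.
DEVOL_PATH = "/content/drive/My Drive/devol"

def install_command(path):
    """Build the editable-install command, equivalent to `!pip install -e <path>`."""
    return [sys.executable, "-m", "pip", "install", "-e", path]

def install_and_enter(path):
    """Install devol from `path`, then make it the working directory."""
    if not os.path.isdir(path):
        print(f"{path} not found; mount the drive (Step 2.6) first.")
        return False
    subprocess.run(install_command(path), check=True)
    os.chdir(path)
    return True

install_and_enter(DEVOL_PATH)
```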

Step 3- Run the algorithm

Step 3.1- Run the script

Run the cells one by one, from the 4th cell downward.

Step 3.2- The result

When the run is finished, best-model.h5 is generated. Download it. (It is used on the Kaggle site.)
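If you prefer to trigger the download from a cell instead of the file browser, Colab provides `files.download`. This is a small sketch with a made-up helper name; it assumes best-model.h5 sits in the current working directory, which may not match where your run saved it.

```python
import os

def download_best_model(filename="best-model.h5"):
    """Send `filename` to the browser when running inside Colab."""
    if not os.path.exists(filename):
        print(f"{filename} not found in {os.getcwd()}")
        return False
    try:
        from google.colab import files  # only importable inside Colab
    except ImportError:
        print("Not running in Colab; copy the file manually.")
        return False
    files.download(filename)
    return True
```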

Step 4- Make predictions and submit to Kaggle.

Step 4.1- Go to the Digit Recognizer (Kaggle competition)

Open a new notebook. For more help, see my Kaggle notebook.

Step 4.2- Add “best-model.h5” and write the code (click here to see the code)
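In case the linked code is not at hand, here is a minimal sketch of what the prediction notebook could look like. It is not the author's notebook. It assumes the model was trained on 28x28x1 images scaled to [0, 1], so check that against the preprocessing in demo.ipynb, and the file paths in the commented usage lines are placeholders. The submission file needs two columns, ImageId (starting at 1) and Label.

```python
import csv

def pixels_to_images(rows):
    """Reshape rows of 784 pixel values (0-255) to 28x28x1 arrays in [0, 1]."""
    import numpy as np  # imported lazily so the CSV helpers need no extras
    return np.asarray(rows, dtype="float32").reshape(-1, 28, 28, 1) / 255.0

def predict_labels(model_path, test_csv):
    """Load the saved model and return one predicted digit per test row."""
    import pandas as pd
    from keras.models import load_model
    model = load_model(model_path)
    images = pixels_to_images(pd.read_csv(test_csv).values)
    return model.predict(images).argmax(axis=1).tolist()

def submission_rows(labels):
    """Build the rows of the submission file: a header, then ImageId and Label."""
    rows = [["ImageId", "Label"]]
    rows.extend([image_id, int(label)] for image_id, label in enumerate(labels, start=1))
    return rows

def write_submission(labels, out_path="submission.csv"):
    """Write the submission CSV that Kaggle expects."""
    with open(out_path, "w", newline="") as f:
        csv.writer(f).writerows(submission_rows(labels))

# Usage in a Kaggle notebook (both paths are placeholders, adjust them):
# labels = predict_labels("path/to/best-model.h5", "path/to/test.csv")
# write_submission(labels)
```

Keeping the CSV writing separate from the model call makes it easy to check the file format before spending time on predictions.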

The END

Thanks for reading. Visit my profiles.

Written by Deepak Yadav, Data Scientist (intern) at Activa Inc.