Transfer Learning made easy

Thibaut Lucas
Analytics Vidhya

--

Every new Computer Vision experiment is the same story

  • Struggle to find open data online to build a Dataset
  • Spend days creating some kind of ground truth
  • Find a pre-trained model suited to my use-case, with the code/checkpoints/weights
  • Launch the training 40 times before finding a way to get rid of those CUDA or TensorFlow errors
  • Finally have a trained model, but be obliged to run a notebook to show the results to my peers (and forget to keep track of metrics and logs)

And that’s only one training run. Now you have to fine-tune your model, compare the results, maybe bring in more data, and you will probably lose some things along the way, to say nothing of the re-usability of your project.

When you think about it, this routine shouldn’t be a data scientist’s everyday life. With the amount of resources that have become available online over the last few years, we don’t need to reinvent the wheel.

We shouldn’t focus on duplicating knowledge but rather on leveraging what has already been done to improve performance

I haven’t even mentioned the whole Data/Ops part, which will be covered in another article, but you get the big picture.

But those 5 points are mandatory to bring AI to life in industry, and the good news is:

You don’t have to care anymore

I will show you how you can leverage the Picsell.ia platform to never lose time again and always keep track of all your projects.

For this purpose, let’s say we want to train a CV model to detect buildings in aerial images from scratch.

1. Data Collection

In a previous article, we gave you a glimpse of our Open Dataset Hub, a place to find public Datasets curated by our users that you can clone freely.

Let’s see if there is a Dataset containing some buildings

Bingo, this is exactly what we need! 993 aerial images containing 55,000 segmented buildings, and it only took 3 minutes to access this Dataset and its ground truth.

Screenshot from the project’s review interface

Here, we were lucky to find an already-annotated Dataset, but thanks to our Optimized Annotation Interface, you could have done it yourself in no time!

2. Architecture Search

Now that we have our clean Dataset, let’s choose an architecture from the Open Model Hub

Yes, it’s the same thing as the Open Dataset Hub, but for pre-trained models

We have chosen SSD_Inception trained on COCO because of its robust architecture and its high training speed.

Let’s attach it to our project (we have renamed it SSD_inception_buildings); now we can see it in our ‘project overview’:

3. Training

And now the fun begins: we will use the power of the Picsell.ia Python SDK (yes, another new feature) to train our model seamlessly and keep track of the results directly in the platform.

First you will need your API Token (in your profile section) and your Project Token (in project settings → Project Token)

Your project token should look like this
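A common pattern in notebooks is to keep those tokens out of the code itself, for instance in environment variables. This is just a convention I like, not something the SDK requires, and the variable names below are purely illustrative:

```python
import os

# Illustrative only: these environment variable names are a personal
# convention, not an official Picsell.ia requirement.
os.environ.setdefault("PICSELLIA_API_TOKEN", "your-api-token-here")
os.environ.setdefault("PICSELLIA_PROJECT_TOKEN", "your-project-token-here")


def get_tokens():
    """Read the API and Project tokens from the environment so they
    never appear in plain text in a shared notebook."""
    return (
        os.environ["PICSELLIA_API_TOKEN"],
        os.environ["PICSELLIA_PROJECT_TOKEN"],
    )


api_token, project_token = get_tokens()
```

This way, you can share or publish the notebook without leaking your credentials.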

Then we will move from the platform to our GitHub and use one of our repos:

You can select the notebook file called “Object_Detection_TF1_easy.ipynb” and click on this little button:

It allows you to open our notebook in a Google Colaboratory session if you don’t have enough hardware to train your model (but you can also clone our repo on your machine if you are lucky enough to own top-notch GPUs)

From now on, I assume that you have opened a Colab session and that our notebook is open in front of you.

The first lines you should see are these

This allows you to clone our GitHub repository directly onto the Colab virtual machine and set up the right versions of TensorFlow and our package.

This wrapper function ‘wraps’ all the heavy lifting done by our Python library; if you are interested in an in-depth overview, you can check out the ‘Object_Detection_TF1.ipynb’ notebook or our documentation.
But for now…

Andrew Ng and his famous quote from Coursera

Now all you have to do is enter your tokens and the name of the model as we chose it earlier, and run the previously declared function.

You should see the magic happen:

As you can see, everything is handled properly by the picsellia module. In a nutshell, what it does is:

  • Download your images and annotations to the machine (in the right format)
  • Smartly split your Dataset into training and testing sets
  • Extract bounding boxes from the masks in the Dataset
  • Get the original checkpoints of the model you chose on the platform
  • Initialize the model with the right parameters so you will not get any weird errors
  • Launch the training from the checkpoints
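Two of those steps are easy to sketch in plain Python. Below is a minimal, illustrative version of a deterministic train/test split and of deriving a bounding box from a segmentation mask — my own sketch of the idea, not the actual picsellia implementation:

```python
import random

import numpy as np


def split_dataset(image_ids, train_ratio=0.8, seed=42):
    """Deterministically shuffle and split image ids into train/test sets."""
    rng = random.Random(seed)
    ids = list(image_ids)
    rng.shuffle(ids)
    cut = int(len(ids) * train_ratio)
    return ids[:cut], ids[cut:]


def mask_to_bbox(mask):
    """Return the tight (xmin, ymin, xmax, ymax) box around the non-zero
    pixels of a binary mask, or None if the mask is empty."""
    ys, xs = np.nonzero(mask)
    if len(xs) == 0:
        return None
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())


# With our 993 images, an 80/20 split gives 794 training / 199 test images.
train_ids, test_ids = split_dataset(range(993))
```

Fixing the seed makes the split reproducible across runs, which matters when you later compare a “v1” and a “v2” of the same experiment.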

4. Visualization

Now that the training is over, the picsellia module takes charge of sending all the important data to the platform so you and your team can track the experiment from there. Here are the last steps performed by the wrapper:

  • Send the checkpoints
  • Send the logs and metrics
  • Send some results from validation
  • Send the weights and make them available in our Serving Engine

And we are done! Now let’s move back to the platform to see what picsellia has done for you:

Screenshot from the dashboard on Picsell.ia

Welcome to your project’s Training Dashboard! As you can see, it gathers all the important information about your data, your training logs and your evaluation metrics, so you will never have to re-run your notebook just to show the results to someone.

And best of all, if you go back to your notebook to continue the training, it will automatically resume from the checkpoints saved earlier and create a “v2” tab in your dashboard with the new information, while you will still have access to the “v1” of your experiment.
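Under the hood, resuming like this boils down to locating the most recent checkpoint before launching the training again. Here is a small sketch of that lookup for TF1-style checkpoint files (model.ckpt-<step>); the real picsellia logic may differ:

```python
import os
import re


def latest_checkpoint(ckpt_dir):
    """Return the prefix of the most recent TF1-style checkpoint
    (model.ckpt-<step>) found in ckpt_dir, or None if there is none."""
    steps = []
    for name in os.listdir(ckpt_dir):
        # Each checkpoint leaves a "model.ckpt-<step>.index" file behind.
        match = re.match(r"model\.ckpt-(\d+)\.index$", name)
        if match:
            steps.append(int(match.group(1)))
    if not steps:
        return None
    return os.path.join(ckpt_dir, f"model.ckpt-{max(steps)}")
```

If this returns a path, training restarts from that prefix; otherwise it falls back to the original pre-trained weights.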

That’s the power of Picsell.ia

You may have noticed that we called a function named ‘send_weights’ in our wrapper; as the name suggests, it sends the final weights of your model to the platform.

That allows you to do several things:

  • Make your model available publicly in our Model Hub
  • Run inference with your model directly on the platform, in the “Playground” section of your project, and annotate more data with it.

As you can see, your model is right there in your model list:

5. Testing

Here is the Playground; now you can simply run inference on the image of your choice:

Not that bad for only 5000 epochs of training.

And voilà! You have seen what you can do with our new platform. We created these tools so you can focus on what’s really important in AI nowadays: Science and Business.

If you want to go further with Picsell.ia, I invite you to read our documentation, try things out, and write to us if you need any help!

We have several communities that you can join to discuss with other users and our data scientists:

I hope you enjoyed the read and that you will find your way through our product. Don’t hesitate to share or like this article if you found it useful!

See you in the next tutorial!
