Hi guys, happy new year! Today we are going to implement the famous Vi(sion) T(ransformer) proposed in AN IMAGE IS WORTH 16X16 WORDS: TRANSFORMERS FOR IMAGE RECOGNITION AT SCALE.
Code is here, an interactive version of this article can be downloaded from here.
ViT will soon be available in my new computer vision library, glasses.
This is a technical tutorial, not your normal Medium post where you find out about the top 5 secret pandas functions to make you rich.
So, before beginning, I highly recommend that you:
- have a look at the amazing The Illustrated Transformer website
- watch Yannic Kilcher's video about ViT
- read Einops…
All the code can be found here. An interactive version of this article can be downloaded from here
Today we are going to use deep learning to create a face unlock algorithm. To complete our puzzle, we need three main pieces.
First of all, we need a way to find a face inside an image. We can use an end-to-end approach called MTCNN (Multi-task Cascaded Convolutional Networks).
Just a little bit of technical background: it is called Cascaded because it is composed of multiple stages, and each stage has its own neural network. …
All the code used in this article is here
Recently, PyTorch has introduced its new production framework to properly serve models, called torchserve.
So, without further ado, let’s present today’s roadmap:
To showcase torchserve, we will serve a fully trained ResNet34 to perform image classification.
Official doc here
The best way to install torchserve is with docker. You just need to pull the image.
You can use the following command to pull the latest image.
docker pull pytorch/torchserve:latest
All the tags are available here
Today we are going to build a semantic browser using deep learning to search in more than 50k papers about the recent COVID-19 disease.
All the code is on my GitHub repo, and a live version of this article is here.
The key idea is to encode each paper in a vector representing its semantic content and then search using cosine similarity between a query and all the encoded documents. This is the same process used by image browsers (e.g. Google Images) to search for similar images.
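As a rough illustration of the search step, here is a minimal sketch in plain Python. The paper names, the three-dimensional vectors and the `search` helper are all made up for the example; a real system would get its vectors from a trained text encoder, which is not shown here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen for this sketch: zero vectors match nothing
    return dot / (norm_a * norm_b)


def search(query_vector, paper_vectors, top_k=2):
    """Return the top_k (title, score) pairs, most similar first."""
    scored = [
        (title, cosine_similarity(query_vector, vector))
        for title, vector in paper_vectors.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical, hand-made "embeddings" standing in for encoded papers.
paper_vectors = {
    "paper_a": [1.0, 0.0, 0.0],
    "paper_b": [0.9, 0.1, 0.0],
    "paper_c": [0.0, 0.0, 1.0],
}
query_vector = [1.0, 0.0, 0.0]  # hypothetical encoded query

results = search(query_vector, paper_vectors)
print(results)
```

With these toy vectors, the query ranks `paper_a` first and `paper_b` second, while `paper_c`, which points in an unrelated direction, is left out of the top two.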
So, our puzzle is composed of three pieces: data, a mapping from papers to vectors and a way to search. …
The template is here
In this article, we present a deep learning template based on PyTorch. This template aims to make it easier for you to start a new deep learning computer vision project with PyTorch. The main features are:
Let’s face it: data scientists are usually not software engineers, and they often end up with spaghetti code, most of the time in one big, unusable Jupyter notebook. With this repo, we propose a clean example of how your code should be split and modularized to make scalability and shareability possible. In this example, we will try to classify Darth Vader and Luke Skywalker. We have 100 images per class, gathered using Google Images. The dataset is here. You just have to extract it in this folder and run main.py. We are fine-tuning resnet18, and it should be able to reach > 90% accuracy in 5 to 10 epochs. …
Today we are going to implement the famous ResNet from Kaiming He et al. (Microsoft Research) in PyTorch. It won 1st place in the ILSVRC 2015 classification task.
ResNet and all its variants have been implemented in my library glasses
Code is here, and an interactive version of this article can be downloaded here. The original paper can be read here (it is very easy to follow), and additional material can be found in this Quora answer.
This is not a technical article and I am not smart enough to explain residual connection better than the original authors. …
There is one famous urban legend about computer vision. Around the 1980s, the US military wanted to use neural networks to automatically detect camouflaged enemy tanks. They took a number of pictures of trees without tanks and then pictures of the same trees with tanks behind them. The results were impressive. So impressive that the army wanted to be sure the net had correctly generalized. They took new pictures of woods with and without tanks and showed them to the network. This time, the model performed terribly: it was not able to discriminate between pictures with tanks behind the trees and pictures with just trees. It turned out that all the pictures without tanks were taken on a cloudy day, while the ones with tanks were taken on a sunny day! …
Updated to PyTorch 1.7
You can find the code here
PyTorch is an open source deep learning framework that provides a smart way to create ML models. Even though the documentation is well made, I find that many people still write bad, disorganized PyTorch code.
Today, we are going to see how to use the three main building blocks of PyTorch: Module, Sequential and ModuleList. We are going to start with an example and iteratively make it better.
All of these classes are contained in torch.nn.
The Module is the main building block. It defines the base class for all neural networks, and you MUST subclass it. …
Three different ways
You can find the Jupyter notebook for this article here
Today we are going to see how to create word embeddings using TensorFlow.
Updated to tf 1.9
Word embedding is a way to represent words by creating a high-dimensional vector space in which similar words are close to each other.
Long story short, neural networks work with numbers, so you can’t just throw words at them. You could one-hot encode all the words, but you would lose the notion of similarity between them.
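To make the point about similarity concrete, here is a small sketch in plain Python (no TensorFlow). The three-word vocabulary and the dense vectors are hand-picked assumptions for illustration only; a trained embedding layer would learn such vectors from data.

```python
def one_hot(index, size):
    """Vector of zeros with a single 1.0 at position `index`."""
    vector = [0.0] * size
    vector[index] = 1.0
    return vector


def dot(a, b):
    """Dot product, used here as a simple similarity score."""
    return sum(x * y for x, y in zip(a, b))


vocab = ["cat", "dog", "car"]  # hypothetical toy vocabulary
one_hot_vectors = {word: one_hot(i, len(vocab)) for i, word in enumerate(vocab)}

# One-hot vectors of two different words are always orthogonal,
# so "cat" looks exactly as unrelated to "dog" as it does to "car".
cat_dog_one_hot = dot(one_hot_vectors["cat"], one_hot_vectors["dog"])
cat_car_one_hot = dot(one_hot_vectors["cat"], one_hot_vectors["car"])

# Hand-picked dense vectors (an assumption, not learned values):
# the two animals are placed close together, the vehicle far away.
dense_vectors = {"cat": [0.9, 0.1], "dog": [0.8, 0.2], "car": [0.1, 0.9]}
cat_dog_dense = dot(dense_vectors["cat"], dense_vectors["dog"])
cat_car_dense = dot(dense_vectors["cat"], dense_vectors["car"])

print(cat_dog_one_hot, cat_car_one_hot)
print(cat_dog_dense > cat_car_dense)
```

In the one-hot case both scores are zero, so no pair of words is closer than any other. With the dense vectors, "cat" scores higher against "dog" than against "car", which is the kind of structure an embedding is meant to capture.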
Almost always, you place your Embedding layer in front of your neural network.
Usually, you have some text files, you extract tokens from the text and you build a vocabulary. Something similar to the following…
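The original snippet is not reproduced here. As a stand-in, the following is a minimal sketch of what a tokenize-then-build-vocabulary step might look like in plain Python. The whitespace tokenizer, the `<unk>` token and the function names are assumptions made for the example, not the article's code.

```python
from collections import Counter


def tokenize(text):
    """Naive tokenizer: lowercase, then split on whitespace."""
    return text.lower().split()


def build_vocab(texts, unk_token="<unk>"):
    """Map each token to an integer id, most frequent tokens first.

    Id 0 is reserved for unknown tokens.
    """
    counts = Counter(token for text in texts for token in tokenize(text))
    vocab = {unk_token: 0}
    for token, _ in counts.most_common():
        vocab[token] = len(vocab)
    return vocab


def encode(text, vocab, unk_token="<unk>"):
    """Turn a sentence into the list of ids an embedding layer would index."""
    return [vocab.get(token, vocab[unk_token]) for token in tokenize(text)]


texts = ["the cat sat", "the dog sat down"]  # hypothetical toy corpus
vocab = build_vocab(texts)
ids = encode("the bird sat", vocab)

print(vocab)
print(ids)
```

The resulting integer ids are what you would feed to an embedding layer, which looks up one vector per id. Words that never appeared in the corpus, such as "bird" here, fall back to the reserved id 0.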
Automatically bind your fields to and from a TensorFlow graph
Wouldn't it be cool to automatically bind class fields to TensorFlow variables in a graph and restore them without manually getting each variable back from it?
The code for this article can be found here, a jupyter-notebook version can be found here
Imagine you have a Model class
Usually, you first build your model and then you train it. After that, you want to get the old variables from the saved graph without rebuilding the whole model from scratch.
<tf.Variable 'variable:0' shape=(1,) dtype=int32_ref>
Now, imagine we have just trained our model and we want to store it. …