Depth estimation with deep neural networks part 2

Mahmoud Selmy
Sep 20, 2018 · 4 min read

This is our second mini-blog about depth estimation. If you haven't read part 1, I strongly recommend reading it first.

Today we will talk about the paper “Deeper Depth Prediction with Fully Convolutional Residual Networks”. It is a really good paper that proved robust when we used it in some fun applications that depend on fine depth (we will talk about this later). We also provide a TensorFlow implementation of the paper.

First, we must admit that image-depth datasets are much scarcer than datasets for popular tasks like classification or object detection, so we should use transfer learning, no matter which architecture we are using. This way we don't spend our precious, rare labeled data learning basic features of the scene that have already been learned in other tasks.

The important question now is which encoder to use (a pre-trained model that converts the image into its basic features). The most important criterion is the receptive field at the last convolutional layer: the larger it is, the better the basic features we can extract from the model.

The Architecture:

Deeper Depth 2016 Architecture

They used ResNet-50 as the encoder because it has a 483*483 receptive field, which is large enough to fully capture the 304*228 input image.

The main contribution is that they used residual up-convolutions instead of fully-connected layers: FC layers are discriminative by nature, so they don't suit a regression problem like depth estimation, and they are also very memory-consuming.

The paper shows a great comparison between different up-sampling blocks. Let's dig into this comparison.


a) Vanilla Up-Convolution: it uses an un-pooling layer (the reverse of pooling: each cell of the input feature map is mapped to a 2*2 block in the output map, where the input value occupies the top-left cell and the other cells are zeros) followed by a conv layer. But the un-pooling greatly weakens the resulting feature map, and it is hard to learn anything useful from such a sparse map.
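To make the zero-insertion concrete, here is a minimal NumPy sketch of 2*2 un-pooling on a single-channel map (the function name is mine, not from the paper):

```python
import numpy as np

def unpool_2x(x):
    # Zero-insertion un-pooling: each input cell becomes the top-left
    # entry of a 2*2 block in the output; the other three cells are zeros.
    h, w = x.shape
    out = np.zeros((2 * h, 2 * w), dtype=x.dtype)
    out[0::2, 0::2] = x
    return out

a = np.array([[1., 2.],
              [3., 4.]])
print(unpool_2x(a))
# [[1. 0. 2. 0.]
#  [0. 0. 0. 0.]
#  [3. 0. 4. 0.]
#  [0. 0. 0. 0.]]
```

Three quarters of the output cells are zero, which is exactly the sparsity the paper complains about.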

b) Vanilla Up-Projection: it is much like the first block, but it adds a projection shortcut to make learning easier, fusing two independent branches to get a denser feature map. The resulting feature map is still poor. The authors were inspired by the projection blocks in ResNet.

c) Fast Up-Convolution: here they propose a nice contribution: splitting the 5*5 conv-filter weights into non-overlapping groups, indicated by different colors and labeled A{3*3}, B{3*2}, C{2*3}, D{2*2} in the figure. Each group produces a separate feature map, and the final feature map is an interleaving of the four outputs.

They call it fast because they found that using this block decreases training time by 15%.
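The interleaving step itself can be sketched in NumPy. Here the four inputs stand in for the outputs of the A, B, C, D filter groups (in a real block each would be a convolution output; this only shows how they are woven together):

```python
import numpy as np

def interleave_2x(a, b, c, d):
    # Weave four same-size maps into one map of twice the resolution:
    # a -> top-left cells, b -> top-right, c -> bottom-left, d -> bottom-right.
    h, w = a.shape
    out = np.empty((2 * h, 2 * w), dtype=a.dtype)
    out[0::2, 0::2] = a  # output of the 3*3 group A
    out[0::2, 1::2] = b  # output of the 3*2 group B
    out[1::2, 0::2] = c  # output of the 2*3 group C
    out[1::2, 1::2] = d  # output of the 2*2 group D
    return out

a = np.full((2, 2), 1.)
b = np.full((2, 2), 2.)
c = np.full((2, 2), 3.)
d = np.full((2, 2), 4.)
print(interleave_2x(a, b, c, d))
# [[1. 2. 1. 2.]
#  [3. 4. 3. 4.]
#  [1. 2. 1. 2.]
#  [3. 4. 3. 4.]]
```

Unlike un-pooling followed by a 5*5 convolution, every output cell carries a computed value, and no multiplications against inserted zeros are wasted.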


d) Fast Up-Projection: they simply applied the new interleaving technique to the up-projection approach.

The loss function: they used the reverse Huber (berHu) loss. It equals the L1 norm when the depth difference is below a certain threshold, so small residuals still receive useful gradients, and switches to a scaled quadratic term above it. Because the threshold is set per batch, a few extreme outliers do not dominate the gradient flow.

The reverse Huber (berHu) loss function

The output: the resulting depth map is good but blurry. This might be an issue in a cool application (we will discuss it later, so stay tuned).


TensorFlow implementation: see our GitHub repo.
