Week 3 — Plant Disease Detection

Sevda Sayan

Published in bbm406f19 · Dec 16, 2019

Theme: Classifying plants as healthy or diseased and predicting the disease of a photographed plant.

Team Members: Fatmanur Turhan, Sevda Sayan, İsmet Seyhan

In our blog post last week, we explained our dataset and architecture clearly. In this post, we will explain our approach. Last week we had decided to use AlexNet or GoogLeNet, but after further research we decided to use VGG16. Let's see what it looks like.

VGG16 is a convolutional neural network model. The model achieves 92.7% top-5 test accuracy on ImageNet, a dataset of over 14 million images belonging to 1000 classes. It improves on AlexNet by replacing the large kernel-sized filters (11×11 and 5×5 in the first and second convolutional layers, respectively) with multiple 3×3 filters stacked one after another.

VGG16 Architecture

The input to the conv1 layer is a fixed-size 224 × 224 RGB image. The image is passed through a stack of convolutional (conv.) layers, where the filters have a very small receptive field: 3×3 (the smallest size that captures the notion of left/right, up/down, and center). In one of the configurations, it also utilizes 1×1 convolution filters, which can be seen as a linear transformation of the input channels (followed by a non-linearity). The convolution stride is fixed to 1 pixel; the spatial padding of the conv. layer input is such that the spatial resolution is preserved after convolution, i.e. the padding is 1 pixel for 3×3 conv. layers. Spatial pooling is carried out by five max-pooling layers, which follow some of the conv. layers (not all conv. layers are followed by max-pooling). Max-pooling is performed over a 2×2 pixel window, with stride 2.

Three Fully-Connected (FC) layers follow a stack of convolutional layers (which has a different depth in different architectures): the first two have 4096 channels each, the third performs 1000-way ILSVRC classification and thus contains 1000 channels (one for each class). The final layer is the soft-max layer. The configuration of the fully connected layers is the same in all networks.
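A minimal sketch of how this architecture can be inspected with torchvision (the model here is untrained and only for illustration, not our final training setup):

```python
import torch
from torchvision import models

# Load the VGG16 architecture without pretrained weights, just to inspect it.
vgg16 = models.vgg16(pretrained=False)
print(vgg16)   # shows the 13 conv. layers, 5 max-pooling layers and 3 FC layers

# A single fake 224 x 224 RGB image produces a 1000-way output, as described above.
x = torch.randn(1, 3, 224, 224)
out = vgg16(x)
print(out.shape)   # torch.Size([1, 1000])
```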

PYTORCH

In this project we will use some functionality of the PyTorch library. Let's see what it is and why we will use it.

PyTorch is a machine learning framework released by Facebook in October 2016. It is open source and is based on the popular Torch library. PyTorch is designed to provide good flexibility and high speed for deep neural network implementation. PyTorch differs from other deep learning frameworks in that it uses dynamic computation graphs: while static computational graphs are defined prior to runtime, dynamic graphs are defined “on the fly” during the forward computation.
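A tiny sketch of what "defined on the fly" means in practice: the computation graph is simply whatever operations actually ran for this particular input, loop and all:

```python
import torch

# The number of loop iterations depends on the data itself; autograd simply
# records whichever operations actually execute in the forward pass.
x = torch.randn(3, requires_grad=True)
y = x
while y.norm() < 10:
    y = y * 2
loss = y.sum()
loss.backward()   # gradients flow back through the graph that was built on the fly
print(x.grad)
```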

Advantages of PyTorch

PyTorch is easy to learn and easy to code. For lovers of object-oriented programming, torch.nn.Module allows for creating reusable code, which is very developer friendly. PyTorch is great for rapid prototyping, especially for small-scale or academic projects.
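For example, a small reusable block in the VGG style could look like this (a sketch with illustrative channel sizes, not the exact VGG16 configuration):

```python
import torch
import torch.nn as nn

# A reusable block: two 3x3 convolutions with ReLU, followed by 2x2 max-pooling.
class ConvBlock(nn.Module):
    def __init__(self, in_channels, out_channels):
        super().__init__()
        self.conv1 = nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(out_channels, out_channels, kernel_size=3, padding=1)
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)

    def forward(self, x):
        x = torch.relu(self.conv1(x))
        x = torch.relu(self.conv2(x))
        return self.pool(x)

# The same block can be stacked to build deeper networks.
model = nn.Sequential(ConvBlock(3, 64), ConvBlock(64, 128))
out = model(torch.randn(1, 3, 224, 224))
print(out.shape)   # torch.Size([1, 128, 56, 56])
```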

It’s a Python-based scientific computing package targeted at two sets of audiences:

  • A replacement for NumPy to use the power of GPUs
  • A deep learning research platform that provides maximum flexibility and speed

Tensors : Similar to NumPy’s ndarrays, with the addition that Tensors can also be used on a GPU to accelerate computing. They support matrix operations such as rand, zeros, and add, much like NumPy. We can also construct a tensor directly from a NumPy array if we need to, and vice versa.
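A few of those operations side by side (a small sketch):

```python
import torch
import numpy as np

a = torch.rand(2, 3)      # random values, like np.random.rand
b = torch.zeros(2, 3)     # all zeros, like np.zeros
c = torch.add(a, b)       # element-wise addition, same as a + b

# Converting in both directions between tensors and NumPy arrays.
n = c.numpy()                              # tensor -> ndarray
t = torch.from_numpy(np.ones((2, 3)))      # ndarray -> tensor
print(c, n, t, sep="\n")
```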

CUDA Tensors : Tensors can be moved onto any device using the .to method, which lets us run our experiments on the GPU.
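For instance (a minimal sketch):

```python
import torch

# Pick the GPU when one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

x = torch.rand(2, 3).to(device)           # move an existing tensor to the device
y = torch.ones(2, 3, device=device)       # or create it on the device directly
z = x + y                                 # the addition runs on that device
print(z.device)
```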

Because of these advantages, we will use PyTorch and the VGG16 model defined in it; a rough sketch of how we plan to adapt it is below. Next week we will train our network on our dataset. See you next week.
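The sketch replaces the 1000-way ImageNet output layer with one sized for our task (NUM_CLASSES is a placeholder; the real number of plant/disease classes depends on our dataset):

```python
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 38   # placeholder, to be replaced by the number of classes in our dataset

# Start from the ImageNet-pretrained VGG16 and swap its 1000-way output layer
# for one that matches our plant disease classes.
model = models.vgg16(pretrained=True)
model.classifier[6] = nn.Linear(4096, NUM_CLASSES)
```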

Thank you for reading…
