Stories by Furkan Kınlı on Medium

[Deep Learning Lab] Episode-5: CIFAR-100

Furkan Kınlı — Mon, 03 Sep 2018 06:38:08 GMT

Let the “Deep Learning Lab” begin!

This is the fifth episode of the “Deep Learning Lab” story series, which contains my individual deep learning works on different cases.


As I already mentioned in Episode 2, for this episode I would like to work on CIFAR-100, which contains 60,000 images in 100 categories. The main aim in this work will be to reach or exceed.. Sorry. Mmm... Actually, to try to show how the model with the best score in the literature is implemented. I’m sure it will be exciting to re-implement the article “Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs)” by Djork-Arné Clevert, Thomas Unterthiner and Sepp Hochreiter, which introduces ELUs and explains how they work in comparison with ReLUs and their newfangled variants (SReLUs, LReLUs etc.).

CIFAR-100

Please let me quickly remind you what the CIFAR data sets are. They are among the most well-known data sets for computer vision tasks, created by Geoffrey Hinton, Alex Krizhevsky and Vinod Nair. CIFAR-10 is a popular starting point for deep learning from scratch since it has only 10 category labels; working with CIFAR-100, however, is less common in the deep learning community since it is not easy to train a -really- deep learning model on it. There are 100 category labels with 600 images each (500 training images and 100 testing images per class). The 100 classes in CIFAR-100 are grouped into 20 super-classes. Each image comes with a “fine” label (the class to which it belongs) and a “coarse” label (the super-class to which it belongs). We will work with the fine labels.

CIFAR-100 Class List

Moreover, it is not possible to get results above 90% as with MNIST-like data sets, so bloggers and tutorial writers -broadly speaking- prefer not to use CIFAR-100, since such results do not make the readers feel like they are changing the world. -But, I do-.

Let me reference the real hero:

Learning Multiple Layers of Features from Tiny Images, Alex Krizhevsky, 2009

University of Toronto, Technical Report

In this episode, I will re-implement the article by Djork-Arné Clevert, Thomas Unterthiner and Sepp Hochreiter, which proposes a new activation function. They claim that using ELUs as the activation function leads to more accurate results, reached faster. Why training runs faster is not our main concern, so I strongly suggest reading the article if you want more information about it.

The main problem with ReLUs -according to the article- is that their mean activation is not zero. In other words, ReLUs are not zero-centered activation functions, and their positive mean acts as a bias shift for the units in the next layer. ELUs, on the other hand, push the mean activation closer to zero because they take negative values for negative inputs, and this makes training converge faster, -it means the model will learn faster-.
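
To make the "mean activation" argument concrete, here is a small NumPy sketch. It is only an illustration under assumptions of my own: the pre-activations are drawn from a standard normal distribution, and α is set to 1. The function names `relu` and `elu` are defined here and are not taken from any library.

```python
import numpy as np

def relu(x):
    # ReLU: max(0, x)
    return np.maximum(x, 0.0)

def elu(x, alpha=1.0):
    # ELU: x for x > 0, alpha * (exp(x) - 1) otherwise.
    # np.minimum keeps exp() from overflowing on large positive inputs.
    return np.where(x > 0, x, alpha * (np.exp(np.minimum(x, 0.0)) - 1.0))

rng = np.random.default_rng(0)                    # fixed seed, illustration only
pre_activations = rng.standard_normal(100_000)    # assumed input distribution

relu_mean = relu(pre_activations).mean()
elu_mean = elu(pre_activations).mean()
print("mean ReLU activation:", relu_mean)
print("mean ELU activation: ", elu_mean)
```

For this toy input, the ELU mean lands noticeably closer to zero than the ReLU mean, which is the effect the article builds on.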

ELU vs. ReLUs

The magic behind ELUs is surprisingly easy to see. ELUs have an exponential term in the formula, and the derivative of an exponential term, as you all know, equals the exponential term itself. In forward propagation, negative pre-activations are passed through a scaled exponential; in back-propagation, the gradient is multiplied by the derivative of the activation function, which for negative inputs is -actually- that same exponential again. The formula can be seen below.

The formula of ELUs and the derivative of ELUs
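
For reference, the definition as I read it from the ELU article, written out in LaTeX. The hyperparameter α > 0 controls the value to which the function saturates for negative inputs (Keras' 'elu' activation uses α = 1 by default).

```latex
f(x) =
\begin{cases}
  x & \text{if } x > 0 \\
  \alpha \, (e^{x} - 1) & \text{if } x \le 0
\end{cases}
\qquad
f'(x) =
\begin{cases}
  1 & \text{if } x > 0 \\
  f(x) + \alpha & \text{if } x \le 0
\end{cases}
```

Note that for x ≤ 0 the derivative f(x) + α is just α·e^x, so the backward pass can reuse the value already computed in the forward pass.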

It seems that ELUs work very well on the CIFAR-100 data set, since this model has the best accuracy score in the literature, -for now-. I prefer to save the benchmark scores for last.

Let me reference the other real heroes:

Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs), Djork-Arné Clevert, Thomas Unterthiner, Sepp Hochreiter, 2016

International Conference on Learning Representations (ICLR) 2016

No more talking. We have a lot of work to do here. It is time for coding.


LET’S GOOOOO!

At this point, I would like to thank MSI Turkey and Tufan Vardar, Digital Marketing Specialist @ MSI Turkey, since they donated a 1080Ti GPU to me to support my academic research and blog posts.

Importing the libraries as we always do.

from __future__ import print_function
import keras
from keras.datasets import cifar100
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Conv2D, MaxPooling2D, ZeroPadding2D
from keras.optimizers import SGD
from keras.regularizers import l2
from keras.callbacks import Callback, LearningRateScheduler, TensorBoard, ModelCheckpoint
from keras.preprocessing.image import ImageDataGenerator
from keras.utils import print_summary, to_categorical
from keras import backend as K
import sys
import os
import numpy as np

Initializing the parameters. In section 4.3 of the article, the parameters have been described.

Mini batch size: 100
Initial learning rate: 0.01
Momentum rate: 0.9
L2 regularization weight decay: 0.0005
Dropout rates for all layers: 0.5.

BATCH_SIZE = 100
NUM_CLASSES = 100
EPOCHS = 165000
INIT_DROPOUT_RATE = 0.5
MOMENTUM_RATE = 0.9
INIT_LEARNING_RATE = 0.01
L2_DECAY_RATE = 0.0005
CROP_SIZE = 32
LOG_DIR = './logs'
MODEL_PATH = './models/keras_cifar100_model.h5'

Thanks to Keras, we can load the data set easily.

(x_train, y_train), (x_test, y_test) = cifar100.load_data()

We need to convert the integer labels in the data set into a categorical (one-hot) matrix structure.

y_train = to_categorical(y_train, NUM_CLASSES)
y_test = to_categorical(y_test, NUM_CLASSES)
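
What this conversion does can be sketched in plain NumPy. The helper `one_hot` below is a hypothetical stand-in written for illustration; it is not the Keras implementation.

```python
import numpy as np

def one_hot(labels, num_classes):
    # labels: integer class ids, shape (n,) or (n, 1) as returned by cifar100.load_data()
    labels = np.asarray(labels).reshape(-1)
    encoded = np.zeros((labels.shape[0], num_classes), dtype="float32")
    # Put a single 1 in the column that matches each label
    encoded[np.arange(labels.shape[0]), labels] = 1.0
    return encoded

example = one_hot([[3], [0], [99]], num_classes=100)
print(example.shape)
```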

We need to normalize the images in the data set.

x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255.0
x_test /= 255.0

(From the article) The data set should be preprocessed with global contrast normalization (sample-wise centering) and ZCA whitening. Additionally, the images should be padded with four zero pixels at all borders (a 2D zero-padding layer at the top of the model). The model should be trained on 32x32 random crops with random horizontal flipping. That’s all for data augmentation.
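
For readers who have not met ZCA whitening before, here is a minimal NumPy sketch of the idea on toy data. The function name `zca_whiten`, the value of `eps` and the random "pixels" are assumptions made for this illustration; in the episode itself the whitening is left to Keras.

```python
import numpy as np

def zca_whiten(flat_images, eps=1e-6):
    # flat_images: (n_samples, n_features); eps is a small stabiliser (assumed value)
    centered = flat_images - flat_images.mean(axis=0)
    cov = centered.T @ centered / centered.shape[0]
    # Eigen-decomposition of the symmetric covariance matrix via SVD
    u, s, _ = np.linalg.svd(cov)
    # Rotate, rescale each direction to unit variance, rotate back
    whitening = u @ np.diag(1.0 / np.sqrt(s + eps)) @ u.T
    return centered @ whitening

rng = np.random.default_rng(0)
mixing = rng.standard_normal((8, 8))
data = rng.standard_normal((2000, 8)) @ mixing    # correlated toy "pixels"
whitened = zca_whiten(data)
cov_after = whitened.T @ whitened / whitened.shape[0]
print(np.round(cov_after, 3))
```

After whitening, the covariance of the features is (numerically) the identity matrix: the features are decorrelated and have unit variance, while the rotation back keeps them as close as possible to the original pixel space.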

The CNN Architecture:

18 convolutional layers arranged in stacks of

(layers x units x receptive fields)

([1×384×3],[1×384×1,1×384×2,2×640×2],[1×640×1,3×768×2],[1×768×1,2×896×2],[1×896×3,2×1024×2],[1×1024×1,1×1152×2],[1×1152×1],[1×100×1])
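
One way to read this notation is to expand it into one entry per layer. The sketch below copies the numbers from the specification above; the names `STACKS` and `expand` are introduced only for this illustration.

```python
# (layers, units, receptive field) per group, one inner list per stack.
# The last entry is the 100-way output layer.
STACKS = [
    [(1, 384, 3)],
    [(1, 384, 1), (1, 384, 2), (2, 640, 2)],
    [(1, 640, 1), (3, 768, 2)],
    [(1, 768, 1), (2, 896, 2)],
    [(1, 896, 3), (2, 1024, 2)],
    [(1, 1024, 1), (1, 1152, 2)],
    [(1, 1152, 1)],
    [(1, 100, 1)],
]

def expand(stack):
    # Turn (layers, units, field) groups into one (units, kernel_size) pair per layer
    return [(units, field) for layers, units, field in stack for _ in range(layers)]

conv_layers = [expand(stack) for stack in STACKS[:-1]]
num_conv_layers = sum(len(stack) for stack in conv_layers)
print("convolutional layers:", num_conv_layers)
for i, stack in enumerate(conv_layers, start=1):
    print("Stack", i, stack)
```

Counting the first seven stacks gives the 18 convolutional layers mentioned above; the eighth entry is the classification layer.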

model = Sequential()
model.add(ZeroPadding2D(4, input_shape=x_train.shape[1:]))
# Stack 1:
model.add(Conv2D(384, (3, 3), padding='same', kernel_regularizer=l2(L2_DECAY_RATE)))
model.add(Activation('elu'))
model.add(MaxPooling2D(pool_size=(2, 2), padding='same'))
model.add(Dropout(INIT_DROPOUT_RATE))

# Stack 2:
model.add(Conv2D(384, (1, 1), padding='same', kernel_regularizer=l2(L2_DECAY_RATE)))
model.add(Conv2D(384, (2, 2), padding='same', kernel_regularizer=l2(L2_DECAY_RATE)))
model.add(Conv2D(640, (2, 2), padding='same', kernel_regularizer=l2(L2_DECAY_RATE)))
model.add(Conv2D(640, (2, 2), padding='same', kernel_regularizer=l2(L2_DECAY_RATE)))
model.add(Activation('elu'))
model.add(MaxPooling2D(pool_size=(2, 2), padding='same'))
model.add(Dropout(INIT_DROPOUT_RATE))

# Stack 3:
model.add(Conv2D(640, (1, 1), padding='same', kernel_regularizer=l2(L2_DECAY_RATE)))
model.add(Conv2D(768, (2, 2), padding='same', kernel_regularizer=l2(L2_DECAY_RATE)))
model.add(Conv2D(768, (2, 2), padding='same', kernel_regularizer=l2(L2_DECAY_RATE)))
model.add(Conv2D(768, (2, 2), padding='same', kernel_regularizer=l2(L2_DECAY_RATE)))
model.add(Activation('elu'))
model.add(MaxPooling2D(pool_size=(2, 2), padding='same'))
model.add(Dropout(INIT_DROPOUT_RATE))

# Stack 4:
model.add(Conv2D(768, (1, 1), padding='same', kernel_regularizer=l2(L2_DECAY_RATE)))
model.add(Conv2D(896, (2, 2), padding='same', kernel_regularizer=l2(L2_DECAY_RATE)))
model.add(Conv2D(896, (2, 2), padding='same', kernel_regularizer=l2(L2_DECAY_RATE)))
model.add(Activation('elu'))
model.add(MaxPooling2D(pool_size=(2, 2), padding='same'))
model.add(Dropout(INIT_DROPOUT_RATE))

# Stack 5:
model.add(Conv2D(896, (3, 3), padding='same', kernel_regularizer=l2(L2_DECAY_RATE)))
model.add(Conv2D(1024, (2, 2), padding='same', kernel_regularizer=l2(L2_DECAY_RATE)))
model.add(Conv2D(1024, (2, 2), padding='same', kernel_regularizer=l2(L2_DECAY_RATE)))
model.add(Activation('elu'))
model.add(MaxPooling2D(pool_size=(2, 2), padding='same'))
model.add(Dropout(INIT_DROPOUT_RATE))

# Stack 6:
model.add(Conv2D(1024, (1, 1), padding='same', kernel_regularizer=l2(L2_DECAY_RATE)))
model.add(Conv2D(1152, (2, 2), padding='same', kernel_regularizer=l2(L2_DECAY_RATE)))
model.add(Activation('elu'))
model.add(MaxPooling2D(pool_size=(2, 2), padding='same'))
model.add(Dropout(INIT_DROPOUT_RATE))

# Stack 7:
model.add(Conv2D(1152, (1, 1), padding='same', kernel_regularizer=l2(L2_DECAY_RATE)))
model.add(Activation('elu'))
model.add(MaxPooling2D(pool_size=(2, 2), padding='same'))
model.add(Dropout(INIT_DROPOUT_RATE))

model.add(Flatten())
model.add(Dense(NUM_CLASSES))
model.add(Activation('softmax'))

This network is very deep. Very. With my resources, one epoch takes about 3 minutes. If we want to reproduce the experiment in this article completely, we have to train the model for 165,000 epochs. It means the training needs at least 40 days -nonstop- to run.

This is insane.


You can check the summary of the model to see how deep it is.

model.summary()

Summary of the model (tail)

Other adjustments to the model:

The learning rate will be decreased by a factor of 10 after 35,000 iterations.
For the following 50,000 iterations, the drop-out rate will be increased for all layers in a stack to (0, 0.1, 0.2, 0.3, 0.4, 0.5, 0), one value per stack.
For the last 40,000 iterations, the drop-out rate will be increased by a factor of 1.5 for all layers.
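
A quick conversion, in case the milestones above are read as mini-batch iterations rather than epochs. That reading is an assumption on my side; the code in this episode passes the same numbers to Keras as epochs. The helper `iterations_to_epochs` is hypothetical and only does the arithmetic.

```python
import math

NUM_TRAIN_IMAGES = 50000   # CIFAR-100 training images
NUM_HELD_OUT = 10000       # 20% validation split used later in this episode
BATCH_SIZE = 100           # mini-batch size from the article

def iterations_to_epochs(iterations, num_images, batch_size=BATCH_SIZE):
    # One iteration is taken to be one mini-batch update (assumption).
    steps_per_epoch = math.ceil(num_images / batch_size)
    return iterations / steps_per_epoch

for milestone in (35000, 50000, 40000):
    full = iterations_to_epochs(milestone, NUM_TRAIN_IMAGES)
    split = iterations_to_epochs(milestone, NUM_TRAIN_IMAGES - NUM_HELD_OUT)
    print(milestone, "iterations ->", full, "epochs (all images),",
          split, "epochs (80% of the images)")
```

Under that reading, each milestone corresponds to tens of epochs rather than tens of thousands.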

We need to use callbacks to make these adjustments. First, we will write the schedulers for learning rate and the drop-out rate.

For the learning rate:

def lr_scheduler(epoch, lr, step_decay=0.1):
    # Multiply the learning rate by 0.1 once, when the 35,000 mark is reached
    return float(lr * step_decay) if epoch == 35000 else lr

For the drop-out rate:

def dr_scheduler(epoch, layers, rate_list=(0.0, .1, .2, .3, .4, .5, 0.0), rate_factor=1.5):
    dropout_layers = [l for l in layers if "dropout" in l.name.lower()]
    if epoch == 85000:
        # Set one drop-out rate per stack, as listed above
        for layer, rate in zip(dropout_layers, rate_list):
            layer.rate = rate
    elif epoch == 135000:
        # Increase every rate by a factor of 1.5, capped at 1
        for layer in dropout_layers:
            layer.rate = min(layer.rate * rate_factor, 1.0)
    return layers
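
The two drop-out phases can be tried out without Keras. In the sketch below the "layers" are simple stand-in objects with a `name` and a `rate` attribute (hypothetical, not the Keras API), and the two helper functions are written only for this illustration of the schedule described in the list above.

```python
from types import SimpleNamespace

STACK_RATES = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.0]   # per-stack rates from the list above
RATE_FACTOR = 1.5

def set_stack_rates(dropout_layers, rates=STACK_RATES):
    # Second phase: give each stack's drop-out layer its own rate
    for layer, rate in zip(dropout_layers, rates):
        layer.rate = rate

def scale_rates(dropout_layers, factor=RATE_FACTOR, cap=1.0):
    # Last phase: multiply every rate by `factor`, never exceeding `cap`
    for layer in dropout_layers:
        layer.rate = min(layer.rate * factor, cap)

# Stand-ins for the seven Dropout layers of the model, all starting at 0.5
layers = [SimpleNamespace(name="dropout_%d" % (i + 1), rate=0.5) for i in range(7)]
set_stack_rates(layers)
after_second_phase = [layer.rate for layer in layers]
scale_rates(layers)
after_last_phase = [round(layer.rate, 2) for layer in layers]
print(after_second_phase)
print(after_last_phase)
```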

Then, we can define our custom callback objects for the learning rate

class StepLearningRateSchedulerAt(LearningRateScheduler):
    def __init__(self, schedule, verbose=0):
        # Initialize the parent scheduler, then keep our own verbosity flag
        super(StepLearningRateSchedulerAt, self).__init__(schedule)
        self.verbose = verbose
    
    def on_epoch_begin(self, epoch, logs=None): 
        if not hasattr(self.model.optimizer, 'lr'):
            raise ValueError('Optimizer must have a "lr" attribute.')
            
        lr = float(K.get_value(self.model.optimizer.lr))
        lr = self.schedule(epoch, lr)
       
        if not isinstance(lr, (float, np.float32, np.float64)):
            raise ValueError('The output of the "schedule" function ' 'should be float.')
        
        K.set_value(self.model.optimizer.lr, lr)

        if self.verbose > 0: 
            print('\nEpoch %05d: LearningRateScheduler reducing learning ' 'rate to %s.' % (epoch + 1, lr))

and the drop-out rate.

class DropoutRateScheduler(Callback):
    def __init__(self, schedule, verbose=0):
        super(DropoutRateScheduler, self).__init__()
        self.schedule = schedule
        self.verbose = verbose
        
    def on_epoch_begin(self, epoch, logs=None):
        if not hasattr(self.model, 'layers'):
            raise ValueError('Model must have a "layers" attribute.')
            
        layers = self.model.layers
        layers = self.schedule(epoch, layers)
        
        if not isinstance(layers, list):
            raise ValueError('The output of the "schedule" function should be list.')
        
        # The schedule changes the Dropout layers in place, so nothing has to be written back to the model
        
        if self.verbose > 0:
            for layer in [l for l in self.model.layers if "dropout" in l.name.lower()]:
                print('\nEpoch %05d: Dropout rate for layer %s: %s.' % (epoch + 1, layer.name, layer.rate))

Let’s get back to the data augmentation methods. After zero-padding the images with four pixels at all borders, we will randomly crop them to 32x32. To achieve this, we need to create a custom generator which takes an ImageDataGenerator iterator as input and yields each batch of images after cropping them. Source: JK Jung’s Blog

First, create a method to crop an image to a certain size.

def random_crop(img, random_crop_size):
    height, width = img.shape[0], img.shape[1]
    dy, dx = random_crop_size
    x = np.random.randint(0, width - dx + 1)
    y = np.random.randint(0, height - dy + 1)
    return img[y:(y+dy), x:(x+dx), :]

Then, apply this method to each image in the batch yielded by the ImageDataGenerator object.

def crop_generator(batches, crop_length, num_channel = 3):
    while True:
        batch_x, batch_y = next(batches)
        batch_crops = np.zeros((batch_x.shape[0], crop_length, crop_length, num_channel))
        for i in range(batch_x.shape[0]):
            batch_crops[i] = random_crop(batch_x[i], (crop_length, crop_length))
        yield (batch_crops, batch_y)
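
The whole "pad, crop, flip" recipe from the article can also be sketched on a single image with NumPy only. This is a stand-alone illustration under my own assumptions (the function name `pad_and_random_crop` and its parameters are hypothetical); it pads first, so the random 32x32 window really moves around.

```python
import numpy as np

def pad_and_random_crop(img, pad=4, crop=32, rng=None):
    # img: (height, width, channels)
    if rng is None:
        rng = np.random.default_rng()
    # Pad `pad` zero pixels on every border, leave the channel axis untouched
    padded = np.pad(img, ((pad, pad), (pad, pad), (0, 0)), mode="constant")
    # Pick the top-left corner of a crop x crop window inside the padded image
    y = rng.integers(0, padded.shape[0] - crop + 1)
    x = rng.integers(0, padded.shape[1] - crop + 1)
    window = padded[y:y + crop, x:x + crop, :]
    # Random horizontal flip with probability 0.5
    return window[:, ::-1, :] if rng.random() < 0.5 else window

image = np.ones((32, 32, 3), dtype="float32")
augmented = pad_and_random_crop(image, rng=np.random.default_rng(0))
print(augmented.shape)
```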

Clevert and his colleagues preferred Stochastic Gradient Descent with momentum to optimize the weights during back-propagation. The momentum term has been set to 0.9, and the Nesterov accelerator for SGD has not been used. I, again, strongly recommend reading this article and this one in order to get more information about the SGD algorithm.

opt = SGD(lr=INIT_LEARNING_RATE, momentum=MOMENTUM_RATE)
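
As a reminder of what the momentum term does, here is a tiny sketch of the classical (non-Nesterov) momentum update on a toy one-dimensional problem. The function `sgd_momentum_step` and the quadratic objective are assumptions made for this illustration only.

```python
def sgd_momentum_step(weight, velocity, grad, lr=0.01, momentum=0.9):
    # Classical momentum: v <- momentum * v - lr * grad ; w <- w + v
    velocity = momentum * velocity - lr * grad
    return weight + velocity, velocity

# Toy objective f(w) = 0.5 * w**2, whose gradient is simply w
w, v = 5.0, 0.0
for _ in range(500):
    w, v = sgd_momentum_step(w, v, grad=w)
print(w)
```

The velocity accumulates past gradients, so the weight keeps moving in a consistent direction and settles at the minimum (w = 0 here) faster than plain gradient descent with the same learning rate would.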

Here is the part that I love. Callbacks! Let’s create the callback objects. The first one is our custom learning rate scheduler, which decreases the learning rate after a certain number of epochs. We also have another custom callback for adjusting the drop-out rates in the stacks. Next, we will record what our model has done during the training process. And lastly, we will save our trained model after each epoch that has a better result than the previous ones.

(Please do not forget to pass them when fitting the model with the generator. I forgot the drop-out scheduler, and it took me one day to realize that I -actually- did not call it. That was one of the most painful experiences in my deep learning life.)

lr_rate_scheduler = StepLearningRateSchedulerAt(lr_scheduler)
dropout_scheduler = DropoutRateScheduler(dr_scheduler)
tensorboard = TensorBoard(log_dir=LOG_DIR, batch_size=BATCH_SIZE)
checkpointer = ModelCheckpoint(MODEL_PATH, monitor='val_loss', verbose=1, save_best_only=True)

That’s all I think. Now, we are ready to compile our model. Categorical cross-entropy has been picked as the loss function since we have 100 category labels in the data set, and we already prepared the labels in the categorical matrix structure. Likewise, we will measure our performance on the validation set with top-1 and top-5 accuracies.

model.compile(optimizer=opt,
              loss='categorical_crossentropy',
              metrics=['accuracy', 'top_k_categorical_accuracy'])
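
If top-k accuracy is new to you: a prediction counts as correct when the true label is among the k highest-scoring classes. A minimal NumPy sketch, with a hypothetical helper `top_k_accuracy` and a made-up three-class example (so k = 2 plays the role of top-5 here):

```python
import numpy as np

def top_k_accuracy(y_true, y_prob, k=5):
    # y_true: integer labels (n,); y_prob: predicted scores (n, num_classes)
    top_k = np.argsort(y_prob, axis=1)[:, -k:]            # indices of the k largest scores
    hits = (top_k == np.asarray(y_true)[:, None]).any(axis=1)
    return hits.mean()

scores = np.array([[0.1, 0.5, 0.4],
                   [0.7, 0.2, 0.1],
                   [0.2, 0.3, 0.5]])
labels = [2, 0, 1]
top1 = top_k_accuracy(labels, scores, k=1)
top2 = top_k_accuracy(labels, scores, k=2)
print(top1, top2)
```

Only the second sample is right at top-1, while all three true labels are within the two best guesses.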

We will use an ImageDataGenerator object to handle the data pre-processing in real time and make sure that the process is randomized. As a reminder, in the article, global contrast normalization (sample-wise centering), ZCA whitening and horizontal flipping are used for preparing and augmenting the data.

datagen = ImageDataGenerator(samplewise_center=True,
                             zca_whitening=True,
                             horizontal_flip=True,
                             validation_split=0.2)

ATTENTION!

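If we use feature-wise centering, feature-wise std normalization or ZCA whitening, we have to fit the generator on the training data, because these methods need statistics computed from the data. Otherwise, these methods do not work. (Sample-wise centering alone does not need this step.)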

datagen.fit(x_train)

Now, we will flow the data using our custom generator object for cropping the images. Here are the flow calls for the training and validation data. Since we defined the training/validation split rate in the ImageDataGenerator object, it is enough to specify the subset as “training” or “validation” in the flow method to split the data.

train_flow = datagen.flow(x_train, y_train, batch_size=BATCH_SIZE, subset="training")
train_flow_w_crops = crop_generator(train_flow, CROP_SIZE)
valid_flow = datagen.flow(x_train, y_train, batch_size=BATCH_SIZE, subset="validation")

WOW! Ready to train, huh?


GO GO GO!!!

model.fit_generator(train_flow_w_crops,
                    epochs=EPOCHS,
                    steps_per_epoch=len(train_flow),  # number of batches in the training subset
                    callbacks=[lr_rate_scheduler, dropout_scheduler, tensorboard, checkpointer],
                    validation_data=valid_flow,
                    validation_steps=len(valid_flow))  # number of batches in the validation subset

Head of epochs


165,000 epochs! COME ON!

As I mentioned earlier, I cannot finish the training process with my resources (by the way, it is a 1080Ti). So, we do not have a model to test at the end of this episode. If you have better GPU/s and never-ending patience for the training (for me it was expected to run at least 40 days -nonstop-), you can go for it -but I won’t-.

test_datagen = ImageDataGenerator(samplewise_center=True,
                                  zca_whitening=True)
# Compute the whitening statistics from the training images, not from the test set
test_datagen.fit(x_train)

test_flow = test_datagen.flow(x_test, y_test, batch_size=BATCH_SIZE)
results = model.evaluate_generator(test_flow, steps=len(x_test) // BATCH_SIZE)

print('Test loss: ' + str(results[0]))
print('Accuracy: ' + str(results[1]))
print('Top-5 Accuracy: ' + str(results[2]))

Here are the results of this model in the article.

Results of this model on CIFAR data sets

As you can see, the top-1 accuracy of this model on CIFAR-100 is 75.72%. According to the article, this was the best result in the literature at the time. For the details of the experiment, please read the article.

You can find the Jupyter Notebook of this episode in my GitHub Repository.

Well, the fifth episode of “Deep Learning Lab” series, CIFAR-100 ends here. Thank you for taking the time with me. For comments and suggestions, please e-mail me. You can also contact me via LinkedIn. Thank you.

fk.

[Deep Learning Lab] Episode-4: Deep Fashion

Furkan Kınlı — Wed, 27 Jun 2018 06:01:01 GMT

Let the “Deep Learning Lab” begin!

This is the fourth episode of the “Deep Learning Lab” story series, which contains my individual deep learning works on different cases.


In this episode, I am pleased to introduce the DeepFashion dataset, a large-scale clothes database with several appealing properties, to deep learning enthusiasts -I mean, you-.

Clip art of DeepFashion dataset

Let’s quickly give more information about the DeepFashion dataset. DeepFashion is an open-source dataset (commercial use is not allowed by the release agreement) which was presented at the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) in 2016 by Ziwei Liu, Ping Luo and their colleagues at The Chinese University of Hong Kong and Shenzhen Institutes of Advanced Technology.

This dataset consists of more than 800,000 RGB images -ranging from well-posed shopping images to unconstrained customer images, including cross-pose and cross-domain images-. Unlike in other well-known datasets, the images are not all the same size, and each image in the dataset is labeled with one of ~50 categories, ~1000 attributes, a bounding box and clothing landmarks. In this work, I just focused on the category classification and bounding box detection tasks using a large subset (~290,000 images) of this well-prepared dataset, which contains clothing categories and attributes in the wild.

Example of categories and attributes

Let me reference the real heroes:

DeepFashion: Powering Robust Clothes Recognition and Retrieval with Rich Annotations, Ziwei Liu et al., 2016

Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2016

ARE WE READY FOR THIS JOURNEY?


LET’S GOOOO!!!

After downloading the dataset from here, we need to prepare the category labels by moving the images into folders so that images with the same label are in the same folder. Besides, we need to split the data into train, validation and test sets as annotated in the paper. But first, importing the libraries -please-.

import shutil
import os
import re
import cv2

# will use them for creating custom directory iterator
import numpy as np
from six.moves import range

# regular expression for splitting by whitespace
splitter = re.compile(r"\s+")
base_path = ''

Then,

def process_folders():
    # Read the relevant annotation file and preprocess it
    # Assumed that the annotation files are under '/data/anno' path
    with open('./data/anno/list_eval_partition.txt', 'r') as eval_partition_file:
        list_eval_partition = [line.rstrip('\n') for line in eval_partition_file][2:]
        list_eval_partition = [splitter.split(line) for line in list_eval_partition]
        list_all = [(v[0][4:], v[0].split('/')[1].split('_')[-1], v[1]) for v in list_eval_partition]

    # Put each image into the relevant folder in train/test/validation folder
    for element in list_all:
        # element = (image path without 'img/', category name, split name)
        target_dir = os.path.join(base_path, element[2], element[1], element[0].split('/')[0])
        # Create split/category/image-folder in one go if it does not exist yet
        os.makedirs(target_dir, exist_ok=True)
        shutil.move(os.path.join(base_path, element[0]),
                    os.path.join(base_path, element[2], element[1], element[0]))

process_folders()
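
To see what the list comprehension in process_folders() extracts, here is the same parsing applied to a single line. The line itself is made up for illustration and only assumes the "<image path> <split>" layout; `parse_partition_line` is a hypothetical helper, not part of the dataset tools.

```python
import re

splitter = re.compile(r"\s+")

def parse_partition_line(line):
    # Assumed layout: "<image path> <split>", separated by whitespace
    image_path, split = splitter.split(line.strip())[:2]
    relative_path = image_path[4:]                       # drop the leading "img/"
    category = image_path.split('/')[1].split('_')[-1]   # last token of the folder name
    return relative_path, category, split

# Made-up example line in the assumed layout
parsed = parse_partition_line("img/Floral_Print_Dress/img_00000001.jpg   train")
print(parsed)
```

The category is taken from the last word of the folder name, so every image in a folder ending with "_Dress" ends up under the "Dress" label.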

We need to extract the bounding box information from the annotation file and to normalize the values of bounding box information by the shape of the related image.

def create_dict_bboxes(list_all, split='train'):
    lst = [(line[0], line[1], line[3], line[2]) for line in list_all if line[2] == split]
    lst = [("".join(line[0].split('/')[0] + '/' + line[3] + '/' + line[1] + line[0][3:]), line[1], line[2]) for line in lst]
    lst_shape = [cv2.imread('./data/' + line[0]).shape for line in lst]
    lst = [(line[0], line[1], (round(line[2][0] / shape[1], 2), round(line[2][1] / shape[0], 2), round(line[2][2] / shape[1], 2), round(line[2][3] / shape[0], 2))) for line, shape in zip(lst, lst_shape)]
    dict_ = {"/".join(line[0].split('/')[2:]): {'x1': line[2][0], 'y1': line[2][1], 'x2': line[2][2], 'y2': line[2][3], 'shape': shape} for line, shape in zip(lst, lst_shape)}
    return dict_

def get_dict_bboxes():
    with open('./data/anno/list_category_img.txt', 'r') as category_img_file, \
            open('./data/anno/list_eval_partition.txt', 'r') as eval_partition_file, \
            open('./data/anno/list_bbox.txt', 'r') as bbox_file:
        list_category_img = [line.rstrip('\n') for line in category_img_file][2:]
        list_eval_partition = [line.rstrip('\n') for line in eval_partition_file][2:]
        list_bbox = [line.rstrip('\n') for line in bbox_file][2:]

        list_category_img = [splitter.split(line) for line in list_category_img]
        list_eval_partition = [splitter.split(line) for line in list_eval_partition]
        list_bbox = [splitter.split(line) for line in list_bbox]

        list_all = [(k[0], k[0].split('/')[1].split('_')[-1], v[1], (int(b[1]), int(b[2]), int(b[3]), int(b[4])))
                    for k, v, b in zip(list_category_img, list_eval_partition, list_bbox)]

        list_all.sort(key=lambda x: x[1])

        dict_train = create_dict_bboxes(list_all)
        dict_val = create_dict_bboxes(list_all, split='val')
        dict_test = create_dict_bboxes(list_all, split='test')

        return dict_train, dict_val, dict_test
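
The normalization step on its own, as a small sketch with made-up numbers. The helper `normalize_bbox` is hypothetical; it mirrors the rounding used above and assumes the (height, width, channels) shape order that cv2.imread returns.

```python
def normalize_bbox(bbox, image_shape):
    # bbox: (x1, y1, x2, y2) in pixels; image_shape: (height, width, channels)
    height, width = image_shape[0], image_shape[1]
    x1, y1, x2, y2 = bbox
    # x coordinates are divided by the width, y coordinates by the height
    return (round(x1 / width, 2), round(y1 / height, 2),
            round(x2 / width, 2), round(y2 / height, 2))

# Made-up box inside a 300 (high) x 200 (wide) image
normalized = normalize_bbox((50, 75, 150, 240), (300, 200, 3))
print(normalized)
```

After this step every coordinate lies between 0 and 1, so the regression target no longer depends on the size of the individual image.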

Too much work for preprocessing, huh?…


And now, we can import Keras-things.

from keras.models import Model
from keras.layers import Dense
from keras.regularizers import l2
from keras.optimizers import SGD
from keras.applications.resnet50 import ResNet50
from keras.preprocessing import image
from keras.preprocessing.image import DirectoryIterator, ImageDataGenerator
from keras.callbacks import ReduceLROnPlateau, ModelCheckpoint, EarlyStopping, TensorBoard
from keras import backend as K

We have to be aware that this dataset contains ~290,000 images. It is not practical to train a ‘deep’ learning model on it from scratch, even if you have a sup-computer -not super :)- with 32 GB RAM and a 1080 Ti GPU.

How do we solve this problem?

I can hear you.

Yes. Transfer learning.

Well, transfer learning is a general machine learning method in which a model developed for a general task is reused as the starting point for a more specific task. This is a very simplified definition of transfer learning. Please look at these links for more information about transfer learning: A Gentle Introduction to Transfer Learning for Deep Learning & CS231n Convolutional Neural Networks for Visual Recognition

In our case, we will use a 50-layer residual network (ResNet50) model pre-trained on ImageNet, but we will not train all layers in this model. After freezing the earlier layers, whose weights represent low-level features such as line detectors and pattern detectors, we will train the layers which represent higher-level features -more specific to the data- by optimizing the loss function with a low learning rate. The benefits:

Fewer parameters to train
Less time for training
Preserving the lower-level feature weights while fine-tuning the data-specific feature weights
Reducing the risk of getting stuck in a poor local minimum of the loss function during the early stage of the training

Just write this code snippet to get pre-trained ResNet50 model in Keras.

model_resnet = ResNet50(weights='imagenet', include_top=False, pooling='avg')

Not including the top?? What does that mean?

The ImageNet dataset has 1000 different labels to categorize the images. Thus, when you train a model on ImageNet, you need to set the number of neurons in the output (softmax) layer to 1000. However, our dataset has ~50 -actually, 46- labels to categorize the images. We should not include the top (output, softmax, last, whatever you would like to call it) layer of ResNet50 in our model, so that we can add a new layer with as many neurons as the dataset needs.

As I mentioned before, we need to freeze some layers in the very first part of the model. Freezing a layer -simply- means making it not trainable in the model.

for layer in model_resnet.layers[:-12]:
    # 6 - 12 - 18 have been tried. 12 is the best.
    layer.trainable = False
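
The slicing `[:-12]` is the only trick here: it selects everything except the last 12 layers. A tiny sketch with stand-in objects (hypothetical, not Keras layers; only the `trainable` flag matters) shows which layers stay trainable.

```python
from types import SimpleNamespace

def freeze_all_but_last(layers, num_trainable):
    # Mark every layer except the last `num_trainable` ones as frozen
    for layer in layers[:-num_trainable]:
        layer.trainable = False
    return layers

# A made-up model with 20 stand-in layers, all trainable at first
layers = [SimpleNamespace(name="layer_%d" % i, trainable=True) for i in range(20)]
freeze_all_but_last(layers, num_trainable=12)
trainable_names = [layer.name for layer in layers if layer.trainable]
print(len(trainable_names), trainable_names[0], trainable_names[-1])
```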

Now, let’s build the category classification branch in the model.

x = model_resnet.output
x = Dense(512, activation='elu', kernel_regularizer=l2(0.001))(x)
y = Dense(46, activation='softmax', name='img')(x)

Then, we will build the bounding box detection branch in the model.

x_bbox = model_resnet.output
x_bbox = Dense(512, activation='relu', kernel_regularizer=l2(0.001))(x_bbox)
x_bbox = Dense(128, activation='relu', kernel_regularizer=l2(0.001))(x_bbox)
bbox = Dense(4, kernel_initializer='normal', name='bbox')(x_bbox)

Finally, we will create our final model by specifying the input and outputs for the branches.

final_model = Model(inputs=model_resnet.input,
                    outputs=[y, bbox])

The summary of our transfer learning model can be printed with:

final_model.summary()

The summary of trainable part of our transfer learning model

It can be seen that the number of trainable parameters in our custom ResNet50-like model is almost 25% of the total number of parameters in the original ResNet50, since we have already frozen the bunch of layers that contain low-level feature information and we will be training just the last 12 layers.

A transfer learning model is hard to optimize. I am -still- working on how the optimization methods affect the training process and the loss function in the transfer learning approach. I will use the Stochastic Gradient Descent (SGD) algorithm to optimize the weights during back-propagation in order to make sure that I am on the safe side. Set the momentum parameter to 0.9 and the nesterov parameter to True. I strongly recommend reading this article to get more information about the SGD algorithm.

opt = SGD(lr=0.0001, momentum=0.9, nesterov=True)

Why do we keep the learning rate so low?

The answer is simple. We want to learn something from the data without destroying the information coming from ImageNet by changing the weights too much. If you use the default learning rate, for example, the loss will drop too fast and the model will start to over-fit the training set.

We are now ready to compile our model. Categorical cross-entropy has been picked as the loss function for the category classification task, and mean squared error as the loss function for the bounding box detection task -you could also pick mean squared logarithmic error-. Likewise, we will measure our performance on the validation set with top-1 and top-5 accuracies for category classification, and mean squared error for bounding box detection.

final_model.compile(optimizer=opt,
                    loss={'img': 'categorical_crossentropy',
                          'bbox': 'mean_squared_error'},
                    metrics={'img': ['accuracy', 'top_k_categorical_accuracy'], # default: top-5
                             'bbox': ['mse']})
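
With two outputs, the value that gets minimized is the sum of the two losses (with equal weights, since no loss weights are given above). A NumPy sketch of that sum on a made-up batch; the helper functions are written for this illustration and are not the Keras implementations.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    # Mean over the batch of -sum(y_true * log(y_pred))
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    return float(np.mean(-np.sum(y_true * np.log(y_pred), axis=1)))

def mean_squared_error(y_true, y_pred):
    return float(np.mean((y_true - y_pred) ** 2))

# Toy batch: two samples, three classes, one normalized bounding box each
labels = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
probs = np.array([[0.8, 0.1, 0.1], [0.25, 0.5, 0.25]])
boxes_true = np.array([[0.2, 0.2, 0.8, 0.8], [0.1, 0.3, 0.6, 0.9]])
boxes_pred = np.array([[0.3, 0.2, 0.8, 0.7], [0.1, 0.3, 0.6, 0.9]])

img_loss = categorical_crossentropy(labels, probs)
bbox_loss = mean_squared_error(boxes_true, boxes_pred)
total_loss = img_loss + bbox_loss   # equal weights assumed
print(img_loss, bbox_loss, total_loss)
```

Because the bounding boxes are normalized to [0, 1], the MSE term is numerically much smaller than the cross-entropy term, which is worth keeping in mind when reading the training logs.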

We have some problems here. How do we load our data without running out of memory? Also, how do we feed an input which contains an image, a category label and a bounding box together? Let’s figure it out!

Loading the data:

If you try to load all images, each at least 100x100 in size, into less than 64 GB of memory, you will run out of memory. The solution is flowing the images in batches from the directory -the ImageDataGenerator class in Keras-; it will load the data and give it to the model batch by batch. Besides, we can augment the data in real time with the help of this method. In the end, we need to create ImageDataGenerator objects for the training and test sets.

train_datagen = ImageDataGenerator(rotation_range=30.,
                                   shear_range=0.2,
                                   zoom_range=0.2,
                                   width_shift_range=0.2,
                                   height_shift_range=0.2,
                                   horizontal_flip=True)
test_datagen = ImageDataGenerator()

Note that if you use feature-wise normalization methods (feature-wise centering or std normalization) in the generator, you have to fit the generator on the data before starting the training.

Manipulating the batch iterator:

This part could be seen as the most challenging part of this episode, but actually it is not. In the ImageDataGenerator class, the method for flowing the data from a directory uses a DirectoryIterator object to iterate over the data in the directory. We have to derive a custom class from the original DirectoryIterator class. For the original one, GO.

class DirectoryIteratorWithBoundingBoxes(DirectoryIterator):
    def __init__(self, directory, image_data_generator, bounding_boxes: dict = None, target_size=(256, 256),
                 color_mode: str = 'rgb', classes=None, class_mode: str = 'categorical', batch_size: int = 32,
                 shuffle: bool = True, seed=None, data_format=None, save_to_dir=None,
                 save_prefix: str = '', save_format: str = 'jpeg', follow_links: bool = False):
        super().__init__(directory, image_data_generator, target_size, color_mode, classes, class_mode, batch_size,
                         shuffle, seed, data_format, save_to_dir, save_prefix, save_format, follow_links)
        self.bounding_boxes = bounding_boxes

    def next(self):
        """
        # Returns
            The next batch.
        """
        with self.lock:
            index_array = next(self.index_generator)
        # The transformation of images is not under thread lock
        # so it can be done in parallel
        batch_x = np.zeros((len(index_array),) + self.image_shape, dtype=K.floatx())
        locations = np.zeros((len(batch_x),) + (4,), dtype=K.floatx())

        grayscale = self.color_mode == 'grayscale'
        # build batch of image data
        for i, j in enumerate(index_array):
            fname = self.filenames[j]
            img = image.load_img(os.path.join(self.directory, fname),
                                 grayscale=grayscale,
                                 target_size=self.target_size)
            x = image.img_to_array(img, data_format=self.data_format)
            x = self.image_data_generator.random_transform(x)
            x = self.image_data_generator.standardize(x)
            batch_x[i] = x

            if self.bounding_boxes is not None:
                bounding_box = self.bounding_boxes[fname]
                locations[i] = np.asarray(
                    [bounding_box['x1'], bounding_box['y1'], bounding_box['x2'], bounding_box['y2']],
                    dtype=K.floatx())
        # build batch of labels
        if self.class_mode == 'sparse':
            batch_y = self.classes[index_array]
        elif self.class_mode == 'binary':
            batch_y = self.classes[index_array].astype(K.floatx())
        elif self.class_mode == 'categorical':
            # 46 is the hard-coded number of category classes
            batch_y = np.zeros((len(batch_x), 46), dtype=K.floatx())
            for i, label in enumerate(self.classes[index_array]):
                batch_y[i, label] = 1.
        else:
            return batch_x

        if self.bounding_boxes is not None:
            return batch_x, [batch_y, locations]
        else:
            return batch_x, batch_y

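The bounding box handling in the code above (the bounding_boxes argument, the locations array and the two-part return value) has been added to the original DirectoryIterator so that the category labels and the bounding box information are delivered together.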

Wuhuuuu!!

As mentioned, the images in the dataset do not all have the same size, so we need to set a target size for the images in the iterator objects.

dict_train, dict_val, dict_test = get_dict_bboxes()

train_iterator = DirectoryIteratorWithBoundingBoxes("./data/img/train", train_datagen, bounding_boxes=dict_train, target_size=(200, 200))

test_iterator = DirectoryIteratorWithBoundingBoxes("./data/img/val", test_datagen, bounding_boxes=dict_val, target_size=(200, 200))
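
One detail worth keeping in mind when fixing a target size: if the bounding boxes are stored in the pixel coordinates of the original image, they have to be rescaled along with the image. Whether `get_dict_bboxes` already takes care of this is not shown here, so the helper below (`scale_bbox`, a hypothetical name) is only a sketch of that rescaling under that assumption.

```python
def scale_bbox(bbox, original_size, target_size=(200, 200)):
    """Rescale a box from original-image pixels to resized-image pixels.

    bbox: dict with keys 'x1', 'y1', 'x2', 'y2' (same layout as in the iterator).
    original_size, target_size: (width, height) tuples.
    """
    scale_x = target_size[0] / float(original_size[0])
    scale_y = target_size[1] / float(original_size[1])
    return {'x1': bbox['x1'] * scale_x, 'y1': bbox['y1'] * scale_y,
            'x2': bbox['x2'] * scale_x, 'y2': bbox['y2'] * scale_y}

# Hypothetical example: a 300x400 image resized to 200x200.
box = {'x1': 30, 'y1': 40, 'x2': 150, 'y2': 360}
scaled = scale_bbox(box, original_size=(300, 400))
print(scaled)
```

The same reasoning applies to geometric augmentations such as shifts and flips: they move the object in the image, so the box has to follow.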

It is time to add some helpful features to our model. First, we will define a learning rate reducer to help the loss get off plateaus. We will also log what our model does during the training process. Next, we will make sure that training stops if the loss on the validation set has not improved for a certain number of epochs. Finally, we will save our trained model after every epoch that gives a better result than the previous best.

lr_reducer = ReduceLROnPlateau(monitor='val_loss',
                               patience=12,
                               factor=0.5,
                               verbose=1)
tensorboard = TensorBoard(log_dir='./logs')
early_stopper = EarlyStopping(monitor='val_loss',
                              patience=30,
                              verbose=1)
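# save_best_only keeps only the checkpoint with the lowest validation loss
checkpoint = ModelCheckpoint('./models/model.h5', monitor='val_loss', save_best_only=True)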

Edit: Thanks to Killian for pointing this out: we have to create a custom generator that yields the batches of images to the model.

def custom_generator(iterator):
    while True:
        batch_x, batch_y = iterator.next()
        yield (batch_x, batch_y)
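
The wrapper matters because `fit_generator` expects a generator that never runs out; the number of batches per epoch is set separately through `steps_per_epoch`. A tiny stand-in iterator (`DummyIterator`, a hypothetical class) shows the behaviour without needing any images:

```python
import itertools

class DummyIterator(object):
    """Stand-in for a Keras iterator: cycles over three fake batches."""
    def __init__(self):
        self._batches = itertools.cycle([('x0', 'y0'), ('x1', 'y1'), ('x2', 'y2')])

    def next(self):
        return next(self._batches)

def custom_generator(iterator):
    while True:
        batch_x, batch_y = iterator.next()
        yield (batch_x, batch_y)

gen = custom_generator(DummyIterator())
# Pull five batches: the generator wraps around instead of raising StopIteration.
first_five = [next(gen) for _ in range(5)]
print(first_five)
```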

It has been a long journey, but we are very close to the end. Keep the faith!

We can start training our model. GO GO GO!!!

final_model.fit_generator(custom_generator(train_iterator),
                          steps_per_epoch=2000,
                          epochs=200, validation_data=custom_generator(test_iterator),
                          validation_steps=200,
                          verbose=2,
                          shuffle=True,
                          callbacks=[lr_reducer, checkpoint, early_stopper, tensorboard],
                          workers=12)

Head epochs of training

Waiting…

20th epoch

Just before early stopping

Well, after early stopping kicked in between the 140th and 145th epochs, we can measure the performance of our model on the test set.

Hold your breath! AND…

test_datagen = ImageDataGenerator()

test_iterator = DirectoryIteratorWithBoundingBoxes("./data/img/test", test_datagen, bounding_boxes=dict_test, target_size=(200, 200))

scores = final_model.evaluate_generator(custom_generator(test_iterator), steps=2000)

print('Multi target loss: ' + str(scores[0]))
print('Image loss: ' + str(scores[1]))
print('Bounding boxes loss: ' + str(scores[2]))
print('Image accuracy: ' + str(scores[3]))
print('Top-5 image accuracy: ' + str(scores[4]))
print('Bounding boxes error: ' + str(scores[5]))

Results

~85% accuracy on top-5 predictions. WOW!

Less than 0.05 error on bounding box regression. WOW!
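
For reference, top-5 accuracy counts a prediction as correct when the true class is anywhere among the five highest-scored classes, which is why it is always at least as high as plain (top-1) accuracy. A minimal sketch in plain Python, with made-up scores over 6 classes instead of the real 46:

```python
def top_k_accuracy(y_true, y_scores, k=5):
    """Fraction of samples whose true class is among the k highest scores."""
    hits = 0
    for true_label, scores in zip(y_true, y_scores):
        # Indices of the k largest scores for this sample.
        top_k = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
        hits += true_label in top_k
    return hits / float(len(y_true))

# Hypothetical scores for 3 samples over 6 classes.
y_true = [2, 0, 5]
y_scores = [
    [0.05, 0.10, 0.50, 0.15, 0.12, 0.08],  # true class ranked 1st
    [0.12, 0.30, 0.25, 0.20, 0.08, 0.05],  # true class ranked 4th
    [0.30, 0.25, 0.20, 0.15, 0.08, 0.02],  # true class ranked last
]
top1 = top_k_accuracy(y_true, y_scores, k=1)
top5 = top_k_accuracy(y_true, y_scores, k=5)
print(top1, top5)
```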

Of course, the results could be improved by adding more augmentation methods and by optimizing the hyperparameters over a certain range, but we are still very close to the results in the paper. It is -definitely- time to celebrate.

Well, the fourth episode of the “Deep Learning Lab” series, DeepFashion, ends here. Thank you for spending your time with me. For comments and suggestions, please e-mail me. You can also contact me via LinkedIn. Thank you.

fk.

[Deep Learning Lab] Episode-3: fer2013

Furkan Kınlı — Fri, 06 Apr 2018 09:18:04 GMT

Let the “Deep Learning Lab” begin!

This is the third episode of the “Deep Learning Lab” story series, which contains my individual deep learning works on different cases.

In this third episode, I would like to work on the fer2013 dataset, which was published at the International Conference on Machine Learning (ICML) 5 years ago, to recognize facial expressions.

Example images from fer2013 dataset

I can almost hear you asking what fer2013 is. fer2013 is an open-source dataset that was first created for an ongoing project by Pierre-Luc Carrier and Aaron Courville, and then shared publicly for a Kaggle competition shortly before ICML 2013. This dataset consists of 35,887 grayscale face images of size 48x48 showing various emotions -7 emotions, all labeled-.

Emotion labels in the dataset:
0: -4953 images- Angry
1: -547 images- Disgust
2: -5121 images- Fear
3: -8989 images- Happy
4: -6077 images- Sad
5: -4002 images- Surprise
6: -6198 images- Neutral

During the competition, 28,709 images and 3,589 images were shared with the participants as the training and public test sets respectively, and the remaining 3,589 images were kept as the private test set to determine the winner of the competition. The dataset was made accessible to everyone after the competition ended.

Let me credit the real heroes:

Challenges in Representation Learning: A report on three machine learning contests, Ian Goodfellow et al., 2013

Université de Montréal, Technical Report

LET’S GET BACK TO 2013…

LET’S GOOOOO!

In the demo part of this story, I used JetBrains PyCharm and OpenCV to capture live frames from the web camera, detect the faces, and recognize the emotions on them.

First and foremost: Importing the libraries

import sys, os
import pandas as pd
import numpy as np
import cv2
from sklearn.model_selection import train_test_split
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Conv2D, MaxPooling2D, BatchNormalization
from keras.losses import categorical_crossentropy
from keras.optimizers import Adam
from keras.regularizers import l2
from keras.callbacks import ReduceLROnPlateau, TensorBoard, EarlyStopping, ModelCheckpoint
from keras.models import load_model

Once you have created a new folder called “Emotion Recognition” in your Google Drive, upload the “fer2013.csv” file to this folder. After that, we define the file paths on the virtual machine with the following code snippet.

BASEPATH = 'drive/Emotion Recognition'
sys.path.insert(0, BASEPATH)
os.chdir(BASEPATH)
MODELPATH = './models/model.h5'

Initializing the parameters.

We will feed the convolutional neural network with batches of 64 images for 100 epochs, and eventually the network will output the probabilities of the 7 different emotions (num_labels) for the faces in the 48x48 images.

num_features = 64
num_labels = 7
batch_size = 64
epochs = 100
width, height = 48, 48

Let’s read our data with the help of “pandas” from the CSV file we just uploaded to Google Drive.

data = pd.read_csv('./fer2013.csv')

Let’s see what it looks like.

data.tail()

The last 5 rows of fer2013 dataset

As you may have noticed at first glance, the images in the CSV file are stored as pixel values on each row, and a little preprocessing of the data is required. (Source for preprocessing)

1. Converting the relevant column element into a list for each row
2. Splitting the string by the space character into a list
3. Numpy ❤
4. Normalizing the image
5. Resizing the image
6. Expanding the channel dimension for each image
7. Converting the labels to a categorical matrix

pixels = data['pixels'].tolist() # 1

faces = []
for pixel_sequence in pixels:
    face = [int(pixel) for pixel in pixel_sequence.split(' ')] # 2
    face = np.asarray(face).reshape(width, height) # 3
    
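    # There is an issue with normalizing the images, so steps 4 and 5 stay commented out until I find a solution.
    # (After step 4 the values lie in [0, 1], so the uint8 cast in step 5 would truncate them to 0 or 1.)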

    # face = face / 255.0 # 4
    # face = cv2.resize(face.astype('uint8'), (width, height)) # 5
    faces.append(face.astype('float32'))

faces = np.asarray(faces)
faces = np.expand_dims(faces, -1) # 6

emotions = pd.get_dummies(data['emotion']).values # 7 (as_matrix() is no longer available in recent pandas versions)
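
To see what these steps do to a single row, here is a self-contained miniature of the same pipeline. The two rows are hypothetical and shrunk to 2x2 "images", and `np.eye` stands in for `pd.get_dummies` so that the sketch only needs NumPy.

```python
import numpy as np

# Two hypothetical rows in the fer2013 layout, shrunk to 2x2 "images".
rows = [
    {'emotion': 3, 'pixels': '0 64 128 255'},
    {'emotion': 6, 'pixels': '10 20 30 40'},
]
side, n_labels = 2, 7

faces = []
for row in rows:
    values = [int(p) for p in row['pixels'].split(' ')]   # string -> list of ints
    faces.append(np.asarray(values).reshape(side, side).astype('float32'))

faces = np.expand_dims(np.asarray(faces), -1)              # add the channel axis
labels = np.eye(n_labels, dtype='float32')[[r['emotion'] for r in rows]]  # one-hot rows

print(faces.shape, labels.shape)
```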

We are now ready to split our data into training, validation and test sets -well, I am sure of that-.

X_train, X_test, y_train, y_test = train_test_split(faces, emotions, test_size=0.1, random_state=42)
X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.1, random_state=41)
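
Two chained 10% splits do not leave exactly 80% for training. The arithmetic below is a sketch that assumes the held-out part is rounded up, which is how I understand `train_test_split` to treat a fractional `test_size`; `split_sizes` is a hypothetical helper, not a scikit-learn function.

```python
import math

def split_sizes(n_samples, test_size):
    """Sizes of the two parts produced by a fractional hold-out split."""
    n_test = int(math.ceil(test_size * n_samples))  # held-out part, rounded up
    return n_samples - n_test, n_test

n_total = 35887
n_train_val, n_test = split_sizes(n_total, 0.1)
n_train, n_val = split_sizes(n_train_val, 0.1)
print(n_train, n_val, n_test)
```

So roughly 81% of the images end up in the training set, 9% in the validation set and 10% in the test set.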

What will the architecture of the model be?

[2 x CONV (3x3)] - MAXP (2x2) - DROPOUT (0.5)
[2 x CONV (3x3)] - MAXP (2x2) - DROPOUT (0.5)
[2 x CONV (3x3)] - MAXP (2x2) - DROPOUT (0.5)
[2 x CONV (3x3)] - MAXP (2x2) - DROPOUT (0.5)
Dense (512) - DROPOUT (0.4)
Dense (256) - DROPOUT (0.4)
Dense (128) - DROPOUT (0.5)

In the first convolutional layer, L2 regularization (0.01) has been added.
A batch normalization layer has been added after every convolutional layer except the first one.
MAXP (2x2) and DROPOUT (0.5) layers have been added to each block of convolutional layers.
“RELU” has been picked as the activation function for all convolutional layers.

model = Sequential()

model.add(Conv2D(num_features, kernel_size=(3, 3), activation='relu', input_shape=(width, height, 1), data_format='channels_last', kernel_regularizer=l2(0.01)))
model.add(Conv2D(num_features, kernel_size=(3, 3), activation='relu', padding='same'))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))
model.add(Dropout(0.5))

model.add(Conv2D(2*num_features, kernel_size=(3, 3), activation='relu', padding='same'))
model.add(BatchNormalization())
model.add(Conv2D(2*num_features, kernel_size=(3, 3), activation='relu', padding='same'))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))
model.add(Dropout(0.5))

model.add(Conv2D(2*2*num_features, kernel_size=(3, 3), activation='relu', padding='same'))
model.add(BatchNormalization())
model.add(Conv2D(2*2*num_features, kernel_size=(3, 3), activation='relu', padding='same'))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))
model.add(Dropout(0.5))

model.add(Conv2D(2*2*2*num_features, kernel_size=(3, 3), activation='relu', padding='same'))
model.add(BatchNormalization())
model.add(Conv2D(2*2*2*num_features, kernel_size=(3, 3), activation='relu', padding='same'))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))
model.add(Dropout(0.5))

model.add(Flatten())

model.add(Dense(2*2*2*num_features, activation='relu'))
model.add(Dropout(0.4))
model.add(Dense(2*2*num_features, activation='relu'))
model.add(Dropout(0.4))
model.add(Dense(2*num_features, activation='relu'))
model.add(Dropout(0.5))

model.add(Dense(num_labels, activation='softmax'))
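
It can help to trace how the 48x48 input shrinks on its way through the blocks above. The helper names below (`conv_out`, `pool_out`) are hypothetical; the sketch only encodes the usual size rules for stride-1 convolutions and 2x2 pooling.

```python
def conv_out(size, kernel=3, padding='valid'):
    """Output side length of a stride-1 convolution."""
    return size if padding == 'same' else size - kernel + 1

def pool_out(size, pool=2):
    """Output side length of non-overlapping max pooling (floor division)."""
    return size // pool

size = 48
size = conv_out(size, padding='valid')   # the first conv has no padding -> 46
size = conv_out(size, padding='same')
size = pool_out(size)                    # 23
for _ in range(3):                       # the three remaining conv blocks
    size = conv_out(size, padding='same')
    size = conv_out(size, padding='same')
    size = pool_out(size)                # 11, then 5, then 2

num_features = 64
flatten_units = size * size * 8 * num_features
print(size, flatten_units)
```

By this count the Flatten layer hands a vector of 2048 values to the first Dense layer.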

Let’s see the total trainable / non-trainable parameters.

model.summary()

The last part of model summary

We are now ready to compile our model. Categorical cross-entropy has been picked as the loss function because we have more than 2 labels and have already prepared the labels as a categorical matrix -I confess, again, I copied it from the previous episodes-.

model.compile(loss=categorical_crossentropy,
              optimizer=Adam(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-7),
              metrics=['accuracy'])

Let’s add some more features to our model.

First, we help the loss get off the “plateaus” by multiplying the learning rate of the optimizer by a certain value (factor) if the loss on the validation set has not improved for a certain number of epochs (patience).

lr_reducer = ReduceLROnPlateau(monitor='val_loss', factor=0.9, patience=3, verbose=1)
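
The logic behind this callback fits in a few lines. The following is a simplified sketch of the idea, not the Keras implementation (it ignores options such as `min_delta`, `cooldown` and `min_lr`), and the loss values are made up:

```python
def reduce_lr_on_plateau(val_losses, lr=0.001, factor=0.9, patience=3):
    """Return the learning rate in effect after each epoch of a simplified plateau schedule."""
    best = float('inf')
    wait = 0
    history = []
    for loss in val_losses:
        if loss < best:
            best, wait = loss, 0          # improvement: reset the counter
        else:
            wait += 1
            if wait >= patience:          # stuck for `patience` epochs
                lr *= factor
                wait = 0
        history.append(lr)
    return history

# Hypothetical validation losses: improvement stalls after the third epoch.
losses = [1.00, 0.90, 0.85, 0.86, 0.87, 0.85, 0.84, 0.84]
history = reduce_lr_on_plateau(losses)
print(history)
```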

We record everything that happens during training in the “logs” folder, so that we can better interpret the results of our model and visually analyze the changes in the loss and the accuracy during training.

For more information on TensorBoard: GO

tensorboard = TensorBoard(log_dir='./logs')

Even if we keep the loss from getting stuck on plateaus, the loss on the validation set may stall in a certain range while the loss on the training set keeps decreasing (in other words, while the model continues to learn something). If we continue to train the model after this point, the only thing the model can do is memorize (over-fit) the training data -I would say there is no chance of escaping the local minimum without a miracle-. This is something we do not want at all.

We stop training the model if the loss on the validation set has not improved for a certain number of epochs (patience).

early_stopper = EarlyStopping(monitor='val_loss', min_delta=0, patience=8, verbose=1, mode='auto')
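
As with the learning rate reducer, the stopping rule itself is simple. This is a hedged, simplified sketch of the idea rather than the Keras source, and the validation curve is made up:

```python
def early_stopping_epoch(val_losses, patience=8, min_delta=0.0):
    """Return the 1-based epoch at which training would stop, or None."""
    best = float('inf')
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best - min_delta:
            best, wait = loss, 0      # improvement: reset the counter
        else:
            wait += 1
            if wait >= patience:      # no improvement for `patience` epochs
                return epoch
    return None

# Hypothetical curve: best value at epoch 3, then no further improvement.
losses = [1.0, 0.8, 0.7] + [0.75] * 10
stop_epoch = early_stopping_epoch(losses, patience=8)
print(stop_epoch)
```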

Finally, we save our model during training as long as it gets a better result than the previous epoch. Thus, we will have the best possible model at the end of the training.

checkpointer = ModelCheckpoint(MODELPATH, monitor='val_loss', verbose=1, save_best_only=True)

We can start training our model. GO GO GO!!!

model.fit(np.array(X_train), np.array(y_train),
          batch_size=batch_size,
          epochs=epochs,
          verbose=1,
          validation_data=(np.array(X_val), np.array(y_val)),  # validate on the held-out validation split, not the test set
          shuffle=True,
          callbacks=[lr_reducer, tensorboard, early_stopper, checkpointer])

Head epochs of training

Tail epochs of training

Before measuring the performance of our model on the test set, let’s see the performances of winners’ models in the relevant Kaggle competition in 2013.

RBM (Yichuan Tang) — 71.162%
UNSUPERVISED (Yingbo Zhou & Chetan Ramaiah) — 69.267%
MAXIM MILAKOV (Maxim Milakov) — 68.821%
RADU+MARIUS+CRISTI (Radu Ionescu & Marius Popescu & Cristian Grozea) — 67.484%

We, again, all hold our breath, AND…

scores = model.evaluate(np.array(X_test), np.array(y_test), batch_size=batch_size)
print("Loss: " + str(scores[0]))
print("Accuracy: " + str(scores[1]))

Loss & Accuracy

We are very close to the performances of the winners of the competition, but we cannot surpass them.

At this point, the most significant problem that I could not solve is that the model starts to memorize the images after a certain number of epochs during training. I have tried different optimizers, different numbers of epochs and batch sizes, different learning rates, and deeper / shallower / less dense model architectures, but the result never improved.

If you have any ideas about how to solve this problem, I would be very pleased to hear them.

The demo below uses our trained model to predict the facial expressions of faces detected by the Haar cascade face detection algorithm:

emotion_dict = {0: "Angry", 1: "Disgust", 2: "Fear", 3: "Happy", 4: "Sad", 5: "Surprise", 6: "Neutral"}

model = load_model(MODELPATH)

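cap = cv2.VideoCapture(0)
# Load the Haar cascade once, instead of reloading it on every frame
face_cascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')

while True:
    ret, frame = cap.read()
    if not ret:  # stop if no frame could be read from the camera
        break

    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    faces = face_cascade.detectMultiScale(gray, 1.3, 5)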

    for (x, y, w, h) in faces:
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 1)
        roi_gray = gray[y:y + h, x:x + w]
        cropped_img = np.expand_dims(np.expand_dims(cv2.resize(roi_gray, (48, 48)), -1), 0)
        cv2.normalize(cropped_img, cropped_img, alpha=0, beta=1, norm_type=cv2.NORM_L2, dtype=cv2.CV_32F)
        prediction = model.predict(cropped_img)
        cv2.putText(frame, emotion_dict[int(np.argmax(prediction))], (x, y), cv2.FONT_HERSHEY_SIMPLEX, 0.8, (0, 255, 0), 1, cv2.LINE_AA)

    cv2.imshow('frame', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()

If you try the demo, you may notice that the model does not work very well in some cases. However, if I asked you to estimate an error rate for this model after trying the demo, you would definitely -and maybe interestingly- name a much smaller error rate than 35% (our model’s).

Examples of me while trying the demo

P.S: I could not act disgusted, or… maybe I did not want to put my disgusted face here. -Kappa-

Well, the third episode of the “Deep Learning Lab” series, fer2013, ends here. Thank you for spending your time with me. For comments and suggestions, please e-mail me. You can also contact me via LinkedIn. Thank you.

fk.

[Deep Learning Lab] Episode-1: Fashion-MNIST

Furkan Kınlı — Mon, 05 Mar 2018 06:01:01 GMT

We are starting the first episode of the “Deep Learning Lab” series, in which I will share my deep learning work.

The dataset I want to work on for the first episode is -no surprise here- the MNIST dataset. However, it is not the MNIST of digits that first comes to mind, but the MNIST of clothes. In other words, Fashion-MNIST -here comes the surprise-.

Fashion-MNIST

As most of you know well, the original MNIST dataset consists of 70,000 handwritten digits in total. 60,000 of them are used to train the model, while 10,000 are set aside to test the performance of the trained model. We can see examples from the dataset in the image below.

Fashion-MNIST is a dataset of clothing images developed by the Zalando Research team as an alternative to the MNIST of handwritten digits that we all know. It has the same physical properties as its ancestor: 60,000 images are used to train the model, and 10,000 images are used to measure the performance of the trained model. The most important reason why I chose this dataset is that the vast majority of deep learning searches on Google will introduce you to MNIST, but you are probably meeting Fashion-MNIST for the first time -am I wrong?-.

Fashion-MNIST Example Images

Let us leave our reference here:

Fashion-MNIST: a Novel Image Dataset for Benchmarking Machine Learning Algorithms. Han Xiao, Kashif Rasul, Roland Vollgraf.

https://arxiv.org/abs/1708.07747

The team summarizes in 3 points why they felt the need to create such a dataset for all researchers interested in machine learning and deep learning:

MNIST is too easy for today’s algorithms to learn. Convolutional neural networks with 2 layers, which can hardly be called deep, reach an accuracy of 99.7%. For classic machine learning algorithms this figure is 97%.
MNIST is overused. Everyone who starts working with machine learning or deep learning uses this dataset first. In a way, it has become a sacred dataset. Recently, Ian Goodfellow (an important researcher on the Google Brain team) urged beginners to stay away from the MNIST dataset.
MNIST is very far from reflecting the problems that need to be solved in today’s computer vision research.
(Han Xiao et al.)

If we have been convincing enough about Fashion-MNIST, let’s start coding.

LET’S GOOOOO!

I use TensorFlow and Keras in my research. In the “Deep Learning Lab” series, I have chosen, and will keep choosing, Keras, which lets even someone with minimal knowledge of the subject understand how the code works -to be honest, you also need to know a bit of Python and follow the literature a little-.

Let’s import the relevant libraries.

from __future__ import print_function
import keras
from keras.datasets import fashion_mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras.utils import print_summary
from keras.optimizers import Adam
from keras.regularizers import l2
import os

Let’s define our parameters.

batch_size = 32 # You can also try 64 or 128
num_classes = 10
epochs = 100 # The loss value starts to plateau after the 93rd epoch
# To save your model:
save_dir = os.path.join(os.getcwd(), 'saved_models')
model_name = 'keras_fashion_mnist_trained_model.h5'

Thanks to Keras, we can obtain our dataset very easily.

(x_train, y_train), (x_test, y_test) = fashion_mnist.load_data()

Since the images in our dataset are grayscale, we need to reshape the data so that it has an explicit channel dimension.

x_train = x_train.reshape(x_train.shape[0], x_train.shape[1], x_train.shape[2], 1)
x_test = x_test.reshape(x_test.shape[0], x_test.shape[1], x_test.shape[2], 1)
input_shape = (28, 28, 1)

We also need to convert the labels in our dataset from a 1-dimensional numpy array into a categorical matrix.

y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)
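
What this conversion produces is easy to reproduce by hand. The function below (`to_one_hot`, a hypothetical name) is a small NumPy sketch of the same idea, not the Keras implementation:

```python
import numpy as np

def to_one_hot(labels, num_classes):
    """Turn integer class ids into a (n_samples, num_classes) 0/1 matrix."""
    labels = np.asarray(labels, dtype='int64').ravel()
    one_hot = np.zeros((labels.shape[0], num_classes), dtype='float32')
    one_hot[np.arange(labels.shape[0]), labels] = 1.0
    return one_hot

# Three made-up labels out of the 10 Fashion-MNIST classes.
encoded = to_one_hot([9, 0, 3], num_classes=10)
print(encoded)
```

Each row contains a single 1 in the column of its class, which is the format that the categorical cross-entropy loss expects.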

That is enough “preprocessing”. It should be :)
Now let’s build our model.

model = Sequential()
model.add(Conv2D(32, (3, 3), padding='same', kernel_regularizer=l2(0.01), input_shape=input_shape))
model.add(Activation('relu'))
model.add(Conv2D(32, (5, 5), kernel_regularizer=l2(0.01)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))

model.add(Conv2D(64, (3, 3), padding='same', kernel_regularizer=l2(0.01)))
model.add(Activation('relu'))
model.add(Conv2D(64, (5, 5), kernel_regularizer=l2(0.01)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))

model.add(Flatten())
model.add(Dense(512))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes))
model.add(Activation('softmax'))

We can see the general scheme of this model as follows:

Our model

I used the Adam (Adaptive Moment Estimation) algorithm to optimize the weights during backprop. I left its parameters at their defaults, as stated in the related paper.

opt = Adam(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=None, decay=0.0, amsgrad=False)
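
For the curious, this is roughly what a single Adam update does with those parameters. It is a sketch of the update rule from the Adam paper for one scalar parameter, minimizing the toy function f(theta) = theta², and it assumes an epsilon of 1e-7 in place of `None`:

```python
import math

def adam_step(theta, grad, m, v, t, lr=0.001, beta_1=0.9, beta_2=0.999, eps=1e-7):
    """One Adam update for a single scalar parameter."""
    m = beta_1 * m + (1 - beta_1) * grad          # first-moment estimate
    v = beta_2 * v + (1 - beta_2) * grad ** 2     # second-moment estimate
    m_hat = m / (1 - beta_1 ** t)                 # bias correction
    v_hat = v / (1 - beta_2 ** t)
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v

# Minimise f(theta) = theta**2 (gradient 2 * theta), starting from theta = 1.
theta, m, v = 1.0, 0.0, 0.0
for t in range(1, 4):
    theta, m, v = adam_step(theta, 2 * theta, m, v, t)
    print(t, theta)
```

Note how each step moves the parameter by about `lr`, almost independently of the size of the gradient.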

It turns out that much “preprocessing” was not enough…
We forgot to normalize the matrices that make up the images in our dataset -LUL-.

x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255

We are now ready to compile our model.
We chose categorical cross-entropy as the loss function because we have more than 2 labels and have prepared our labels as a categorical matrix.

model.compile(loss='categorical_crossentropy',
              optimizer=opt,
              metrics=['accuracy'])

We are ready to train our model. GO GO GO!

model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=epochs,
          validation_data=(x_test, y_test),
          shuffle=True)

Wuhuuu! We learn pretty fast; we are smart, aren’t we?

Epochs 1–5

And our training finishes quickly (thanks to the Tesla K80 provided by Google Colaboratory). Now it is time to measure the performance of our model on the test dataset, which we have never seen before.

Epochs 93–100

To measure the performance, it is enough to run the following code snippet.

scores = model.evaluate(x_test, y_test, verbose=1)
print('Test loss:', scores[0])
print('Test accuracy:', scores[1])

AAAAAND… TA-DAAA!!!

Our accuracy on the test set

Our model correctly predicted 90.52% of the 10,000 test images. For the performances in the literature: Click (you can find them under the Benchmark heading)

Well, the first episode of the “Deep Learning Lab” series, Fashion-MNIST, ends here. Thanks to everyone who contributed to the broadcast and the production. For comments and suggestions, you can e-mail me. You can also reach me through my LinkedIn account.

To read the original version of this post on my personal web page: GO!

fk.

[Deep Learning Lab] Episode-2: CIFAR-10

Furkan Kınlı — Sun, 18 Feb 2018 07:01:01 GMT

Let the “Deep Learning Lab” begin!

This is the second episode of the “Deep Learning Lab” story series, which contains my individual deep learning works on different cases.

I would like to work on the CIFAR datasets in the second episode. CIFAR-10 and CIFAR-100 are two datasets with different numbers of classes -take a hint, it’s all in the names-. We start with CIFAR-10, which is -relatively- easier to work with; working on CIFAR-100 in a different setting is already planned for a later part of the series.

CIFAR-10

Let’s quickly get to know the CIFAR-10 dataset. CIFAR-10 is one of the best-known image datasets; it contains 60,000 images and was created by the first person who should come to your mind in deep learning and his teammates. OFC, I’m talking about Geoffrey Hinton. CIFAR-10 is a labeled subset of the “80 million tiny images” dataset. (G. Hinton, A. Krizhevsky and V. Nair at the Canadian Institute for Advanced Research)

Every image in this dataset has size 32x32x3 (RGB). If you have no idea what the “3” in the third dimension and the “RGB” in the brackets mean, I strongly recommend that you read this article. There are 50,000 images for training a model and 10,000 images for evaluating its performance. The classes and 10 randomly selected images from each class can be seen in the picture below.

Classes and 10 randomly selected images from each class

Let me credit the real hero:

Learning Multiple Layers of Features from Tiny Images, Alex Krizhevsky, 2009

University of Toronto, Technical Report

I am looking forward to creating an accurate deep learning model on the CIFAR-10 dataset. Let’s start coding!

LET’S GOOOOO!

Randomly selected 24 images in CIFAR-10 dataset

The very first move: Importing the libraries

from __future__ import print_function
import keras
from keras.datasets import cifar10
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras.optimizers import SGD
from keras.utils import print_summary, to_categorical
import sys
import os

We need to assign a file path on Google Drive to save the model that we train on Google Colaboratory. First, create a new folder named “cifar10” on Google Drive, and then run the following code snippet on Google Colab.

sys.path.insert(0, 'drive/cifar10')
os.chdir('drive/cifar10')

Initializing the parameters.

We will feed the convolutional neural network with batches of images -each batch contains 64 images- for 100 epochs, and eventually the network will output the probabilities of the 10 different categories (num_classes) that the image may belong to.

batch_size = 64
num_classes = 10
epochs = 100
model_name = 'keras_cifar10_model'
save_dir = '/model/' + model_name

Thanks to Keras, we can load the dataset easily.

(x_train, y_train), (x_test, y_test) = cifar10.load_data()

We also need to convert the labels in the dataset from a 1-dimensional numpy array into a categorical (one-hot) matrix.

y_train = to_categorical(y_train, num_classes)
y_test = to_categorical(y_test, num_classes)

Once bitten, twice shy: we will not forget it this time. We need to normalize the images in the dataset.

x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255.0
x_test /= 255.0
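
The cast to float32 before the division is not optional. A tiny NumPy sketch with three made-up pixel values shows why:

```python
import numpy as np

pixels = np.array([[0, 128, 255]], dtype='uint8')   # raw image values

try:
    raw = pixels.copy()
    raw /= 255.0            # in-place true division cannot store floats in uint8
except TypeError as err:
    print('in-place division on uint8 failed:', type(err).__name__)

scaled = pixels.astype('float32')
scaled /= 255.0             # now every value lies in [0, 1]
print(scaled)
```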

We are now absolutely sure that it is enough for preprocessing -for now, LUL-. It is the time to create our model. For this episode in the series, I would prefer to use the most common neural network model architecture in the literature: [CONV] — [MAXP] -..- [CONV] — [MAXP] — [Dense]

model = Sequential()
model.add(Conv2D(32, (3, 3), padding='same', input_shape=x_train.shape[1:]))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.3))

model.add(Conv2D(64, (3, 3), padding='same'))  # input_shape is only needed on the first layer
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.3))

model.add(Conv2D(128, (3, 3), padding='same'))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.4))

model.add(Flatten())
model.add(Dense(80))
model.add(Activation('relu'))
model.add(Dropout(0.3))
model.add(Dense(num_classes))
model.add(Activation('softmax'))
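
If you want to sanity-check a model summary, the parameter counts can be worked out by hand. The helpers below are hypothetical names; they only encode the usual formulas (weights plus one bias per output unit) for the layers defined above.

```python
def conv2d_params(kernel, in_channels, filters):
    """Weights plus one bias per filter."""
    return (kernel * kernel * in_channels + 1) * filters

def dense_params(in_units, out_units):
    """Weights plus one bias per output unit."""
    return (in_units + 1) * out_units

side = 32
params = conv2d_params(3, 3, 32)
side //= 2        # the 'same' conv keeps 32x32, pooling halves it to 16
params += conv2d_params(3, 32, 64)
side //= 2        # 8
params += conv2d_params(3, 64, 128)
side //= 2        # 4

flat = side * side * 128
params += dense_params(flat, 80) + dense_params(80, 10)
print(flat, params)
```

Most of the parameters sit in the first Dense layer, not in the convolutions.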

The summary of this model can be seen below:

Summary of model

I prefer the Stochastic Gradient Descent algorithm to optimize the weights during backpropagation. I set the momentum parameter to 0.9 and left the others at their defaults. Again, I strongly recommend reading this article to get more information about the SGD algorithm.

opt = SGD(lr=0.01, momentum=0.9, decay=0, nesterov=False)

model.compile(loss='categorical_crossentropy',
              optimizer=opt,
              metrics=['accuracy'])
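
To get a feeling for what the momentum term does, here is a sketch of the classical (non-Nesterov) momentum update for a single scalar weight, again on the toy function f(w) = w². It illustrates the update rule and is not the Keras source:

```python
def sgd_momentum_step(w, grad, velocity, lr=0.01, momentum=0.9):
    """Classical (non-Nesterov) momentum update for a scalar weight."""
    velocity = momentum * velocity - lr * grad
    return w + velocity, velocity

# Minimise f(w) = w**2 (gradient 2 * w), starting from w = 1.
w, velocity = 1.0, 0.0
for step in range(1, 4):
    w, velocity = sgd_momentum_step(w, 2 * w, velocity)
    print(step, w, velocity)
```

The velocity keeps growing while the gradients point in the same direction, so the steps get larger than with plain SGD.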

We’ve done a lot, and only one step remains before we begin training our model. This time, I would like to make a different move. I will split the training dataset (50,000 images) into training (40,000 images) and validation (10,000 images) sets to measure the validation accuracy of our model in a better way. Thus, after each epoch, our neural network model will be evaluated on images that it has never seen during training.

model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=epochs,
          validation_split=0.2,
          shuffle=True)
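Here is a minimal sketch of how validation_split=0.2 carves up the 50.000 training images. As far as I know from the Keras documentation, the validation samples are taken from the end of the arrays before any shuffling; the function name split_for_validation is my own, not a Keras API.

```python
# Which sample indices end up in training vs. validation for validation_split=0.2.
# Assumption: the hold-out part is the last fraction of the data, taken before shuffling.

def split_for_validation(num_samples, validation_split):
    """Return (training indices, validation indices) as two ranges."""
    split_at = int(num_samples * (1.0 - validation_split))
    return range(0, split_at), range(split_at, num_samples)

train_idx, val_idx = split_for_validation(50000, 0.2)
print("training images:", len(train_idx))
print("validation images:", len(val_idx))
print("first validation index:", val_idx[0])
```

Because the split is positional, it only gives a fair validation set if the classes are not sorted inside x_train.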

Well, so far so good. We have started to learn. I think we did, didn’t we?

Epochs 1–5

In contrast to the previous episode, training our model took a long time despite using a powerful GPU -thanks to Google Colab-. After about 4 hours of training, the final epochs looked like this:

Epochs 95–100

Just before measuring the accuracy of our model on the test dataset, I would like to share with you the results that have been achieved on the CIFAR-10 dataset. GO!

We all hold our breath, AND…

scores = model.evaluate(x_test, y_test, verbose=1)
print('Test loss:', scores[0])
print('Test accuracy:', scores[1])

The result:

Our -lovely- model correctly classifies 77.37% of the 10.000 test images.

https://medium.com/media/b52768042f41e827e314f2830888e65c/href

We -really- need to look at the performances in the literature, and I -indeed- don’t believe that we are falling too far behind. I’m proud of this figure, since our model has just 3 convolutional layers with a simple architecture and was trained for only 4 hours.

So what do you think?

What should we do to improve the performance of our model?

https://medium.com/media/06676f03abd136255f7f0259d51dc42b/href

The first answer to this question: train the model for longer.

I’m not a lazy guy -HUH-.

Results:

100 epochs: Accuracy: 77.37%, Loss: 0.670
150 epochs: Accuracy: 78.22%, Loss: 0.646
200 epochs: Accuracy: 77.77%, Loss: 0.670

I won’t continue to train this model anymore -of course-, since the loss stopped improving after the 170th-175th epochs. In other words, our model is starting to overfit. If it is trained further -as is-, its performance on the test dataset will begin to decrease, which is the last thing we want.

Well, what else can we do?

Yes, we can still do something to improve the performance of our model.

For example:

Data augmentation. We can effectively increase the number of images in the dataset with the Keras class named “ImageDataGenerator”, which augments the images by horizontal/vertical flipping, rescaling, rotating, whitening, etc. The more data we have for training, the more accurate the results can be -here, I do not claim that data augmentation always guarantees better accuracy-.
Changing the optimizer. Stochastic Gradient Descent is probably not the most appropriate optimizer for this dataset. A different one may increase the performance -attention! I’m not talking about a definite increase-.
Changing the learning rate. We could decrease the learning rate a bit after the 170th epoch. It is possible to change the learning rate during training with the help of the Keras callbacks named “LearningRateScheduler” and “ReduceLROnPlateau”.
Changing the architecture. If the performance of our model still does not satisfy us, we will have to question the architecture itself. We can make our model resemble more modern networks such as ResNet and VGGNet, or change the activation functions of its layers. Then, we will have a chance to improve the performance.
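The learning-rate idea from the list above can be sketched in plain Python. This is only my own imitation of the "reduce on plateau" logic, not the Keras implementation; the parameter names factor, patience and min_lr mirror those of the ReduceLROnPlateau callback, and the validation losses are made up for illustration.

```python
# Minimal "reduce the learning rate when the loss stops improving" logic.
# Hypothetical sketch; the loss values below are invented, not from my training run.

def reduce_on_plateau(losses, lr, factor=0.5, patience=3, min_lr=1e-5):
    """Return the learning rate that would be used at each epoch."""
    best = float("inf")
    wait = 0
    schedule = []
    for loss in losses:
        schedule.append(lr)
        if loss < best:
            best = loss  # improvement: remember it and reset the counter
            wait = 0
        else:
            wait += 1
            if wait >= patience:  # no improvement for `patience` epochs
                lr = max(lr * factor, min_lr)
                wait = 0
    return schedule

val_losses = [0.90, 0.75, 0.70, 0.71, 0.72, 0.70, 0.66, 0.67]
lr_schedule = reduce_on_plateau(val_losses, lr=0.01)
for epoch, rate in enumerate(lr_schedule, start=1):
    print("epoch", epoch, "lr", rate)
```

In a real Keras run, the callback itself would be passed to model.fit through the callbacks argument instead of computing the schedule by hand.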

Before summing up, I would like to save our model as a file with the extension “.h5”. That way, I can continue training the model later, turn it into software that predicts “real-case” pictures it has never seen before, and share it with other deep learning researchers who would like to use it.

model.save(save_dir + '.h5')

Well, the second episode of “Deep Learning Lab” series, CIFAR-10 ends here. Thank you for taking the time with me. For comments and suggestions, please e-mail me. You can also contact me via LinkedIn. Thank you.

fk.

[Deep Learning Lab] Episode-1: Fashion-MNIST

Furkan Kınlı — Fri, 02 Feb 2018 08:01:07 GMT

Let the “Deep Learning Lab” begin!

This is the first episode of “Deep Learning Lab” story series which contains my individual works for deep learning with different cases.

The dataset that I would like to work on for the first episode is an MNIST dataset -not surprisingly-. However, it is not the MNIST handwritten digit database that first comes to your mind, but an MNIST-like fashion product database: Fashion-MNIST -wow!-.

Fashion-MNIST

The Fashion-MNIST dataset has been developed by the Zalando Research Team as a clothing product database and as an alternative to the original MNIST handwritten digits database. Besides having the same physical characteristics as its ancestor (28x28 grayscale images in 10 classes), it has 60.000 images for training a model and 10.000 images for evaluating its performance. The most significant reason for picking this dataset is that the vast majority of searches about deep learning on Google will introduce you to the original MNIST, but you are now probably meeting Fashion-MNIST for the first time -aren’t you?-.

Example images of Fashion-MNIST

Let me credit the real heroes:

Fashion-MNIST: a Novel Image Dataset for Benchmarking Machine Learning Algorithms. Han Xiao, Kashif Rasul, Roland Vollgraf.

https://arxiv.org/abs/1708.07747

The Zalando team summarizes why they think machine learning and deep learning researchers need such a dataset in these 3 points:

MNIST is too easy. Convolutional nets can achieve 99.7% on MNIST. Classic machine learning algorithms can also achieve 97% easily.
MNIST is overused. In this April 2017 Twitter thread, Google Brain research scientist and deep learning expert Ian Goodfellow calls for people to move away from MNIST.
MNIST can not represent modern computer vision tasks.

(Han Xiao et al.)

If I succeeded in convincing you enough for Fashion-MNIST, let’s start coding.

LET’S GOOOOO!

I prefer to use Tensorflow and Keras for my work. In the “Deep Learning Lab” series, I would like to choose Keras, which gives you an opportunity to understand how the code works even if you have minimal or no knowledge of the subject -well, I won’t lie; you need to know a bit of Python, and also to follow the literature-.

Importing the libraries.

from __future__ import print_function
import keras
from keras.datasets import fashion_mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras.utils import print_summary
from keras.optimizers import Adam
from keras.regularizers import l2
import os

Initializing the parameters.

batch_size = 32 # You can try 64 or 128 if you'd like to
num_classes = 10
epochs = 100 # loss function value will be stabilized after 93rd epoch
# To save the model:
save_dir = os.path.join(os.getcwd(), 'saved_models')
model_name = 'keras_fashion_mnist_trained_model.h5'

Thanks to Keras, we can load the dataset easily.

(x_train, y_train), (x_test, y_test) = fashion_mnist.load_data()

We need to reshape the data: the images in the dataset are grayscale, so the arrays have no channel axis, and Conv2D expects one.

x_train = x_train.reshape(x_train.shape[0], x_train.shape[1], x_train.shape[2], 1)
x_test = x_test.reshape(x_test.shape[0], x_test.shape[1], x_test.shape[2], 1)
input_shape = (28, 28, 1)

We also need to convert the labels from a 1-dim numpy array of class indices into a categorical (one-hot) matrix.

y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)
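To show what the categorical matrix looks like, here is a minimal pure-Python sketch of one-hot encoding. The function to_one_hot is my own stand-in for illustration, it is not keras.utils.to_categorical itself, and the three labels are just examples.

```python
# One-hot encoding of integer class labels, written out by hand.
# Hypothetical helper; the real code above uses keras.utils.to_categorical.

def to_one_hot(labels, num_classes):
    """Turn a list of class indices into a list of one-hot rows."""
    rows = []
    for label in labels:
        row = [0.0] * num_classes
        row[label] = 1.0  # only the position of the true class is set
        rows.append(row)
    return rows

encoded = to_one_hot([9, 0, 3], num_classes=10)
for row in encoded:
    print(row)
```

Each row has exactly one 1.0, which is the form that the categorical_crossentropy loss expects for its targets.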

Enough for preprocessing. It should be :).
Now let’s build our model.

model = Sequential()
model.add(Conv2D(32, (3, 3), padding='same', kernel_regularizer=l2(0.01), input_shape=input_shape))
model.add(Activation('relu'))
model.add(Conv2D(32, (5, 5), kernel_regularizer=l2(0.01)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))

model.add(Conv2D(64, (3, 3), padding='same', kernel_regularizer=l2(0.01)))
model.add(Activation('relu'))
model.add(Conv2D(64, (5, 5), kernel_regularizer=l2(0.01)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))

model.add(Flatten())
model.add(Dense(512))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes))
model.add(Activation('softmax'))

The summary of this model can be seen below:

Our Model
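To follow how the image size shrinks through the model, here is a minimal sketch that traces the spatial dimension layer by layer. It assumes stride-1 convolutions and that a Conv2D without a padding argument uses 'valid' padding (the Keras default, as far as I know); conv_out and pool_out are my own helper names.

```python
# Trace of the spatial size of a 28x28 Fashion-MNIST image through the model above.
# Assumptions: stride-1 convolutions, 'valid' padding where none is given, 2x2 pooling.

def conv_out(size, kernel, padding):
    """Spatial size after a stride-1 convolution."""
    if padding == "same":
        return size
    return size - kernel + 1  # 'valid' padding trims the borders

def pool_out(size, pool):
    """Spatial size after non-overlapping max-pooling."""
    return size // pool

size = 28
trace = [size]
channels = 1
for filters in (32, 64):
    size = conv_out(size, 3, "same")
    trace.append(size)
    size = conv_out(size, 5, "valid")
    trace.append(size)
    size = pool_out(size, 2)
    trace.append(size)
    channels = filters

flat_units = size * size * channels  # what Flatten() hands to Dense(512)
print("sizes:", trace)
print("flatten units:", flat_units)
```

The 5x5 convolutions without padding do a good part of the downsampling here, together with the two pooling layers.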

I used the Adam (Adaptive Moment Estimation) algorithm to optimize the weights during backpropagation. I just left the parameters at the defaults specified in the relevant article.

opt = Adam(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=None, decay=0.0, amsgrad=False)
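For the curious, here is a minimal sketch of a single Adam update in the form given in the Adam article, using the same lr, beta_1 and beta_2 as above. The function adam_step is my own illustration and not the Keras source; eps=1e-7 is an assumption on my side, since the code above passes epsilon=None and lets Keras pick its own default.

```python
import math

# One Adam update for a single scalar weight, following the article's formulas.
# Hypothetical sketch; eps is an assumed value.

def adam_step(w, m, v, grad, t, lr=0.001, beta_1=0.9, beta_2=0.999, eps=1e-7):
    """Return the updated weight and moment estimates at time step t (t >= 1)."""
    m = beta_1 * m + (1 - beta_1) * grad          # moving average of gradients
    v = beta_2 * v + (1 - beta_2) * grad * grad   # moving average of squared gradients
    m_hat = m / (1 - beta_1 ** t)                 # bias correction for the zero init
    v_hat = v / (1 - beta_2 ** t)
    w = w - lr * m_hat / (math.sqrt(v_hat) + eps)
    return w, m, v

# The first step has roughly the same size no matter how large the gradient is.
w_big, _, _ = adam_step(0.0, 0.0, 0.0, grad=250.0, t=1)
w_small, _, _ = adam_step(0.0, 0.0, 0.0, grad=0.004, t=1)
print("step with large gradient:", -w_big)
print("step with small gradient:", -w_small)
```

This normalization by the running gradient magnitude is what makes Adam less sensitive to the scale of the gradients than plain SGD.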

Not enough preprocessing… We forgot to normalize the images in the dataset -LUL-.

x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255

model.compile(loss='categorical_crossentropy',
              optimizer=opt,
              metrics=['accuracy'])

We are ready to train our model. GO GO GO!

model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=epochs,
          validation_data=(x_test, y_test),
          shuffle=True)

Wuhuuu! We learn a bit fast. It is very smart, isn’t it?

Epoch 1–5

Our training has been completed in a couple of shakes (Thanks to Tesla K80 and Google Colaboratory). Now it’s time to measure the performance of our model with the test set.

Epoch 93–100

To evaluate the performance, we only need to run the following code snippet.

scores = model.evaluate(x_test, y_test, verbose=1)
print('Test loss:', scores[0])
print('Test accuracy:', scores[1])

AND… TA TA TAM!!!

Test Accuracy

Our model classified 90.52% of the 10.000 test images correctly. For the performances in the literature: GO! (You can see them under the Benchmark heading)

Well, the first episode of “Deep Learning Lab” series, Fashion-MNIST ends here. Thank you for taking the time with me. For comments and suggestions, please e-mail me. You can also contact me via LinkedIn. Thank you.

fk.

A KÖRS (Curse) Story: not winner but WINNER

Furkan Kınlı — Wed, 05 Jul 2017 21:27:45 GMT

https://medium.com/media/d522ae720bea111be68628d587103a79/href

Dıt dıt dırıt dıt dı dıt dırıttı dı dıt…

This time I wanted to scribble something about the Eurovision Song Contest, which caught my interest with Sertap Erener’s 1st place and which I have followed ever since. Especially over the last 5 years (it is pure coincidence that this matches the date Turkey withdrew from the contest indefinitely), I can say that I have become quite the Eurovision gourmet.

Of course, every year in January-February that year’s songs come out, get listened to and analyzed over and over, the ones I like get listened to even more, and the one I get hooked on gets played all the time; this goes on like an infinite loop until the second week of May. Sometimes it reaches the point of wearing out the people around me, but as the contest date approaches, their excitement grows too.

I pick a favorite song every year without fail, but in recent years I have literally been bringing a curse upon the song I pick. Somehow these songs never win.

The Beginning of the KÖRS: Yohanna - Is It True? (Year: 2009 - Place: 2)

https://medium.com/media/54f57521b8cd6673eca24a585dbf2271/href

We said FAK Lena, GO Manga, but: maNga - We Could Be The Same (Year: 2010 - Place: 2)

https://medium.com/media/dde1a239ae89bcd9d6862cd1e0561213/href

Eurovision’s Very Own Zeljko: Zeljko Joksimovic - Nije Ljubav Stvar (Year: 2012 - Place: 3)

https://medium.com/media/10fe9d968331d2f823941b5e2e8b154b/href

The Rise of the KÖRS: Sanna Nielsen - Undo (Year: 2014 - Place: 3)

https://medium.com/media/eed06149b6eb520e4d82e21b18c56fe6/href

The KÖRS ruined things badly, really badly, not just a little, unbelievably badly, far too much, so badly that we could not stop it, that is how badly it ruined things; it ruined, ruined, ruined; we waited, thinking it would stop at some point, and it ruined even more; we said enough already, and it ruined things completely.

The Peak of the KÖRS, the King of Songs: Il Volo - Grande Amore (Year: 2015 - Place: 3)

https://medium.com/media/ae1d38e448bdc082bec9c2d18b095dbe/href

The Sound of Silence, Our Big Sister Dami: Dami Im - Sound of Silence (Year: 2016 - Place: 2)

https://medium.com/media/d0ee8389d4accef5c6f769384cad3de8/href

This Time We Thought We Had Broken the KÖRS, But…: Francesco Gabbani - Occidentali’s Karma (Year: 2017 - Place: 6)

(It had 141 million unique views. It was a quality song. It led the betting odds by a wide margin for 4 months. At all the pre-parties and in all the reviews, people were saying it would come 1st. He had a gorilla dancing on stage. It did not happen. Thanks to KÖRS)

https://medium.com/media/699d27eb61aabba01929dbd8da5cedbc/href

So that was a KÖRS story. Let’s see which song I will jinx next year. We are waiting. Road to Lisbon 2018.

The Beginning

Furkan Kınlı — Fri, 30 Jun 2017 22:53:10 GMT

Interestingly, although I wrote many posts for news-site-style blogs over a long period, for a while I could not gather the courage to scribble something about myself. So what changed?

An attention-grabbing picture.

First, I should talk about myself. Furkan Kınlı. I am 24.5 years old. I study Computer Science in Engineering at Özyeğin University (I swear, that is what the English page says: Computer Science in Engineering). I try to improve myself in computer vision and machine learning by following the literature as much as I can. So I do not want to describe myself as a “software developer”. Because I am not one.

Every university student knows that, to get a bachelor’s degree, there is an internship requirement/course meant to provide field experience. I, too, need to complete 40 hours of work experience in order to graduate. For the first 4 semesters I had no worries about it, of course. Because I would do it “when the time came”. I am sorry to say that I had apparently smashed a bottle over my own head and passed out without even noticing.

The time came and, with great excitement of course, I prepared a CV consisting of 0 (in words: zero) projects, unremarkable experiences and my hobbies. I opened you-know-which site. After typing entries such as Computer Engineer, Software Engineer, etc. into the search bar, I started looking for an “internship”. Worst case, something would surely be arranged through acquaintances. I am a great big “Computer Engineer” candidate; would I be left without a job? I would surely find some fancy internship. Right? LUL.

Forget an interview invitation, I did not even hear back from most companies. The ones that did reply sounded as if they were all speaking with one voice. “Your application to our company has been reviewed and, as a result of the evaluations, for the relevant position at this time with you bla bla bla”. When the acquaintances suddenly disappeared as well, I could not do my internship in the term I was supposed to and, even worse, my self-confidence was broken. Whenever the aunts and uncles around me asked how things were going, I got upset and withdrew further into myself, because I could not give an answer. Actually, I could not give myself an answer. As a result, I became someone with no self-confidence, who felt insignificant and unsuccessful, whose mental state had deteriorated and who drifted away from the people around him. This went on until what should, under normal conditions, have been my last semester at school (School got extended, thanks diff eq).

The story will change at some point, of course. We are at that point now.

Because of you-know-which course (I am talking about differential equations, of course), a third-year course of mine slipped into my 8th semester, and the instructor of that course changed me and, therefore, everything. Technical presentation. As for the content of the course, the outcomes are things like how to give an effective presentation in English and what to pay attention to while presenting. To pass the course, we have to give two presentations. In the first one, according to the scenario, we try to meet a C-E-O we see at a conference and introduce ourselves; in the second one, we play an entrepreneur and try to get funding by pitching our project to businesspeople. Up to this point, everything is normal. It is actually one of the compulsory courses found at almost every university. The key point here is the instructor’s approach.

Very unlike any other course, this became a life lesson for me, a lesson in personal development and awakening. The self-confidence that Burak Hoca (my instructor) gave me changed my outlook on life. He reminded me of the focus points I needed to work on to improve myself. I still remember the state I was in when I walked out of his first class and told my girlfriend about it. Incredible.

Having become aware of myself, worked on improving myself and regained my self-confidence, I was ready for the applications. I started by preparing a more careful CV that actually described me. Then I decided what I wanted to do and, in line with that, applied to a handful of places. Bingo! First general aptitude tests, then English tests and video interviews…

I was doing well. Now it was my turn. During the interviews, I felt as if I were performing on stage. In my last individual interview, during the presentation, the esteemed HR specialist did not feel the need to ask me any “pressuring” questions, and we chatted for 20 minutes. I knew it would happen this time. And it did.

I got the news that I had been accepted into the Türk Telekom START Internship Program. I had wanted it badly and it happened. I felt that I had earned it. I guess anyone would think so at that moment. At this point, I want to thank Burak Hoca many*infinite times for the value he added to me and for making me aware of myself.

And my internship program starts in 2 days.
Good game. Well played. Well deserved.