Logistic Regression using TensorFlow

Google has open-sourced TensorFlow, a computational-graph-based library for developing production-scale machine learning applications. There are many valuable tutorials on the official website at https://www.tensorflow.org/tutorials/, including installation instructions and tutorials on image recognition with convolutional neural networks, a word2vec implementation, and a seq2seq model for machine translation. Here I am going to share a simple logistic regression example.

We will use logistic regression, a supervised machine learning technique, to predict the labels of handwritten digits. The model will learn from training data labeled with the correct digit and will then be able to predict labels for digits it has not seen.

First we will import the required libraries.

import tensorflow as tf
import numpy as np
import math
from tqdm import tqdm
import matplotlib.pyplot as plt
plt.ion()

To start, we will walk through some basic tensor operations in TensorFlow to get familiar with the API.

Declare tensor variables

x1 = tf.Variable(tf.truncated_normal([5], mean=1, stddev=1./math.sqrt(5)))
x2 = tf.Variable(tf.truncated_normal([5], mean=2, stddev=1./math.sqrt(5)))

Initialize the variables. We use an InteractiveSession, which installs itself as the default session so that the .eval() calls below work without passing a session explicitly.

sess = tf.InteractiveSession()
sess.run(tf.global_variables_initializer())
print(x1.eval())

# Element-wise multiplication of two tensors
sqx1x2 = x1*x2
print(sqx1x2.eval())

# Element-wise natural logarithm
logx1 = tf.log(x1)
print(logx1.eval())

A common operation in logistic regression is the sigmoid function, which squashes any input into the range [0, 1], so its output can be interpreted as a probability.

sigx2 = tf.sigmoid(x2)
print(sigx2.eval())
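As a quick sanity check (not part of the original walkthrough), tf.sigmoid should agree with the formula 1/(1 + e^-x) computed by hand in NumPy:

x2_np = x2.eval()                         # pull the tensor values into NumPy
manual = 1.0/(1.0 + np.exp(-x2_np))       # sigmoid computed by hand
print(np.allclose(sigx2.eval(), manual))  # expect True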

We can also define constants

w1 = tf.constant(0.1)
w2 = tf.constant(0.2)
sess.run(tf.global_variables_initializer())

# A weighted combination passed through a sigmoid, essentially a single neuron
n1 = tf.sigmoid(w1*x1 + w2*x2)
print((w1*x1).eval())
print((w2*x2).eval())
print(n1.eval())

Okay, now we can get into a simple logistic regression using TensorFlow:

First we will load the handwritten digits and their labels from a NumPy archive file.

data = np.load('data_with_labels.npz')
train = data['arr_0']/255.  # scale pixel values to [0, 1]
labels = data['arr_1']

Look at some data

print(train[0])
print(labels[0])
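It is also worth checking the array shapes. Assuming the archive stores 36x36 pixel images (which matches the 1296-dimensional reshape used later), we would expect something like:

print(train.shape)   # e.g. (number_of_images, 36, 36)
print(labels.shape)  # one integer label per image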

We need to convert the labels to one-hot vectors to work with TensorFlow: each integer label becomes a vector of zeros with a one at the index of that label.

def to_onehot(labels, nclasses=5):
    outlabels = np.zeros((len(labels), nclasses))
    for i, l in enumerate(labels):
        outlabels[i, l] = 1
    return outlabels

onehot = to_onehot(labels)
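As a quick illustration with made-up labels, a label of 2 maps to a vector with a one at index 2:

print(to_onehot(np.array([2, 0, 4])))
# [[0. 0. 1. 0. 0.]
#  [1. 0. 0. 0. 0.]
#  [0. 0. 0. 0. 1.]]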

We will split the data into training and test sets.

indices = np.random.permutation(train.shape[0])
test_cnt = int(train.shape[0]*0.1)  # hold out 10% for testing
test_idx, train_idx = indices[:test_cnt], indices[test_cnt:]
test, train = train[test_idx, :], train[train_idx, :]
onehot_test, onehot_train = onehot[test_idx, :], onehot[train_idx, :]
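As a quick sanity check on the split (assuming the 90/10 division above), the two set sizes should add up to the original number of images:

print(train.shape[0], test.shape[0])  # roughly 90% and 10% of the data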

sess = tf.InteractiveSession()

Define the input placeholders. Each image is 36x36 pixels flattened into a vector of length 1296, and there are 5 classes.

X = tf.placeholder("float", [None, 1296])
y_ = tf.placeholder("float", [None, 5])

Define the model parameters

W = tf.Variable(tf.truncated_normal([1296, 5], stddev=1./math.sqrt(1296)))
b = tf.Variable(tf.constant(0.1, shape=[5]))

Initialize

sess.run(tf.global_variables_initializer())

y = tf.nn.softmax(tf.matmul(X,W) + b)

We need to define a loss function to optimize against; minimizing the loss reduces the difference between the predicted and true labels.

# Cross-entropy between the true one-hot labels and predicted probabilities;
# the tiny 1e-50 keeps tf.log away from log(0)
loss = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y + 1e-50), axis=1))
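As an aside, TensorFlow 1.x also provides tf.nn.softmax_cross_entropy_with_logits, which fuses the softmax and the cross-entropy into one numerically stable operation. If we kept the raw logits around instead of the softmax output, the loss could equivalently be written as this sketch:

logits = tf.matmul(X, W) + b  # pre-softmax scores
stable_loss = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=y_))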

Define a training step. Conveniently, TensorFlow provides a built-in optimizer that minimizes the loss by gradient descent.

train_step = tf.train.GradientDescentOptimizer(0.001).minimize(loss)

Define accuracy

correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))

The model is now fully defined, so we can train it. We will run a fixed number of epochs and record the training and test accuracy every 10 epochs.

epochs = 1000  # number of passes over the training data; tune as needed
train_acc = np.zeros(epochs//10)
test_acc = np.zeros(epochs//10)
for i in tqdm(range(epochs)):
    # Record training and test accuracy every 10 epochs
    if i % 10 == 0:
        train_acc[i//10] = accuracy.eval(feed_dict={
            X: train.reshape([-1, 1296]), y_: onehot_train})
        test_acc[i//10] = accuracy.eval(feed_dict={
            X: test.reshape([-1, 1296]), y_: onehot_test})
    # One gradient descent step on the full training set
    train_step.run(feed_dict={X: train.reshape([-1, 1296]), y_: onehot_train})

print(train_acc[-1])
print(test_acc[-1])
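Once trained, the same graph can classify an individual image. Here is a minimal sketch using the first test image (the choice of index 0 is arbitrary):

probs = y.eval(feed_dict={X: test[0].reshape([1, 1296])})  # class probabilities
print(np.argmax(probs, axis=1))   # predicted label
print(np.argmax(onehot_test[0]))  # true label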

We can plot the accuracy curves to compare training and test accuracy and check for overfitting.

plt.plot(train_acc, 'bo')
plt.plot(test_acc, 'rx')

We can also visualize the learned weights for each of the five classes, reshaping each column of W back into a 36x36 image:

f, plts = plt.subplots(5, sharex=True)
for i in range(5):
    plts[i].pcolor(W.eval()[:, i].reshape([36, 36]))

We can also look at the confusion matrix, which shows, for each true class, how often the model predicted each label; off-diagonal entries are misclassifications.

predict = np.argmax(y.eval(feed_dict={
    X: test.reshape([-1, 1296]), y_: onehot_test}), axis=1)
conf = np.zeros([5, 5])
for p, t in zip(predict, np.argmax(onehot_test, axis=1)):
    conf[t, p] += 1  # rows are true labels, columns are predictions
plt.matshow(conf)
plt.colorbar()
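If scikit-learn is available, the same matrix can be computed in a single call; this is an optional alternative, not part of the original code:

from sklearn.metrics import confusion_matrix
print(confusion_matrix(np.argmax(onehot_test, axis=1), predict))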


