Previous
We introduce how to useVariable
, Constant
, Placeholder
, and the operations of tensor. Finally, discuss theActivation
functions.
[Tensorflow] CH1: Getting Started With Tensorflow
The Graph
, Loss
, Optimizer
, Regression
, Classification
was discussed as link below.
[Tensorflow] Ch2: The Tensorflow Way
Discussed the different methods of Linear regression as like Matrix Inverse Method
, Decopmosition Method
, Deming Regression
, Lasso and Ridge Regression
, Elastic Net Regression
, and Logistic Regression.
What is Support Vector Machines (SVM)?
Support vector machine(SVM) is supervised learning models with associated learning algorithms that analyze data used for classification and regression analysis.
Computing the (soft-margin) SVM classifier amounts to minimizing an expression of the form.
Working with a Linear SVM
Create the Graph
and data.
sess = tf.Session()
iris = datasets.load_iris()
x_vals = np.array([[x[0], x[3]] for x in iris.data])
y_vals = np.array([1 if y==0 else -1 for y in iris.target])
Build the setosa
data.
setosa_x = [d[1] for i,d in enumerate(x_vals) if y_vals[i]==1]
setosa_y = [d[0] for i,d in enumerate(x_vals) if y_vals[i]==1]
not_setosa_x = [d[1] for i,d in enumerate(x_vals) if y_vals[i]==-1]
not_setosa_y = [d[0] for i,d in enumerate(x_vals) if y_vals[i]==-1]
Split the training
data and testing
data.
train_indices = np.random.choice(len(x_vals), round(len(x_vals)*0.8), replace=False)
test_indices = np.array(list(set(range(len(x_vals))) - set(train_indices)))
x_vals_train = x_vals[train_indices]
x_vals_test = x_vals[test_indices]
y_vals_train = y_vals[train_indices]
y_vals_test = y_vals[test_indices]
Output the shape of data.
x_vals_train:(120, 2), x_vals_test:(30, 2), y_vals_train:(120,), y_vals_test:(30,)
Create the placeholder
and Variable
.
batch_size = 100
x_data = tf.placeholder(shape=[None, 2], dtype=tf.float32)
y_target = tf.placeholder(shape=[None, 1], dtype=tf.float32)
A = tf.Variable(tf.random_normal(shape=[2,1]))
b = tf.Variable(tf.random_normal(shape=[1,1]))
Declare the model output.
model_output = tf.subtract(tf.matmul(x_data, A), b)
Declare the necessary components for the maximum margin loss.
l2_norm = tf.reduce_sum(tf.square(A))
alpha = tf.constant([0.1])
classification_term = tf.reduce_mean(tf.maximum(0., tf.subtract(1.,tf.multiply(model_output, y_target))))
loss = tf.add(classification_term, tf.multiply(alpha, l2_norm))
Declare the prediction
and accuracy
functions.
prediction = tf.sign(model_output)
accuracy = tf.reduce_mean(tf.cast(tf.equal(prediction, y_target),
tf.float32))
Declare the optimizer.
my_opt = tf.train.GradientDescentOptimizer(0.01)
train_step = my_opt.minimize(loss)
init = tf.initialize_all_variables()
sess.run(init)
Train.
loss_vec = []
train_accuracy = []
test_accuracy = []
for i in range(500):
rand_index = np.random.choice(len(x_vals_train), size=batch_size)
X = x_vals_train[rand_index]
Y = np.transpose([y_vals_train[rand_index]])
sess.run(train_step, feed_dict={x_data: X, y_target: Y})
temp_loss = sess.run(loss, feed_dict={x_data: X, y_target: Y})
loss_vec.append(temp_loss)
train_acc_temp = sess.run(accuracy, feed_dict={x_data: x_vals_train, y_target: np.transpose([y_vals_train])})
train_accuracy.append(train_acc_temp)
test_acc_temp = sess.run(accuracy, feed_dict={x_data: x_vals_test, y_target: np.transpose([y_vals_test])})
test_accuracy.append(test_acc_temp)
Output, 500 epochs
Output: 5000 epochs
Reduction to Linear Regression
Support vector machines can be used to t linear regression. The loss function will similar to
ε is half of the width of the margin.
Create the Graph
and data.
sess = tf.Session()
iris = datasets.load_iris()
x_vals = np.array([x[3] for x in iris.data])
y_vals = np.array([y[0] for y in iris.data])
train_indices = np.random.choice(len(x_vals), round(len(x_vals)*0.8), replace=False)
test_indices = np.array(list(set(range(len(x_vals))) - set(train_indices)))
x_vals_train = x_vals[train_indices]
x_vals_test = x_vals[test_indices]
y_vals_train = y_vals[train_indices]
y_vals_test = y_vals[test_indices]
Declare the batch_size
, placeholder
, and Variable
.
batch_size = 50
x_data = tf.placeholder(shape=[None, 1], dtype=tf.float32)
y_target = tf.placeholder(shape=[None, 1], dtype=tf.float32)
A = tf.Variable(tf.random_normal(shape=[1,1]))
b = tf.Variable(tf.random_normal(shape=[1,1]))
model_output = tf.add(tf.matmul(x_data, A), b)
Create the epsilon
and set 0.5.
epsilon = tf.constant([0.5])
loss = tf.reduce_mean(tf.maximum(0., tf.subtract(tf.abs(tf.subtract(model_
output, y_target)), epsilon)))
Declare the optimizer.
my_opt = tf.train.GradientDescentOptimizer(0.075)
train_step = my_opt.minimize(loss)
init = tf.initialize_all_variables()
sess.run(init)
Train.
train_loss = []
test_loss = []
for i in range(200):
rand_index = np.random.choice(len(x_vals_train), size=batch_
size)
X = np.transpose([x_vals_train[rand_index]])
Y = np.transpose([y_vals_train[rand_index]])
sess.run(train_step, feed_dict={x_data: X, y_target: Y})
temp_train_loss = sess.run(loss, feed_dict={x_data:
np.transpose([x_vals_train]), y_target: np.transpose([y_vals_train])})
train_loss.append(temp_train_loss)
temp_test_loss = sess.run(loss, feed_dict={x_data:np.transpose([x_vals_test]), y_target: np.transpose([y_vals_test])})
test_loss.append(temp_test_loss)
Output
Working with Kernels in TensorFlow
Introduce how to changer kernels and separate non-linear separable data. We will discuss how to implement the Gaussian kernel.
Create the data
(x_vals, y_vals) = datasets.make_circles(n_samples=500, factor=.5,noise=.1)
y_vals = np.array([1 if y==1 else -1 for y in y_vals])
class1_x = [x[0] for i,x in enumerate(x_vals) if y_vals[i]==1]
class1_y = [x[1] for i,x in enumerate(x_vals) if y_vals[i]==1]
class2_x = [x[0] for i,x in enumerate(x_vals) if y_vals[i]==-1]
class2_y = [x[1] for i,x in enumerate(x_vals) if y_vals[i]==-1]
Output
Declare the batch_size
, placeholder
, and Variable
.
batch_size = 250
x_data = tf.placeholder(shape=[None, 2], dtype=tf.float32)
y_target = tf.placeholder(shape=[None, 1], dtype=tf.float32)
prediction_grid = tf.placeholder(shape=[None, 2], dtype=tf.float32)
b = tf.Variable(tf.random_normal(shape=[1,batch_size]))
Create the Gaussian kernel.
gamma = tf.constant(-50.0)
dist = tf.reduce_sum(tf.square(x_data), 1)
dist = tf.reshape(dist, [-1,1])
sq_dists = tf.add(tf.subtract(dist, tf.multiply(2., tf.matmul(x_data,tf.transpose(x_data)))), tf.transpose(dist))
my_kernel = tf.exp(tf.multiply(gamma, tf.abs(sq_dists)))
Declare the dual problem.
model_output = tf.matmul(b, my_kernel)
first_term = tf.reduce_sum(b)
b_vec_cross = tf.matmul(tf.transpose(b), b)
y_target_cross = tf.matmul(y_target, tf.transpose(y_target))
second_term = tf.reduce_sum(tf.multiply(my_kernel, tf.multiply(b_vec_cross,y_target_cross)))
loss = tf.negative(tf.subtract(first_term, second_term))
Create the prediction
and accuracy
function.
rA = tf.reshape(tf.reduce_sum(tf.square(x_data), 1),[-1,1])
rB = tf.reshape(tf.reduce_sum(tf.square(prediction_grid), 1),[-1,1])
pred_sq_dist = tf.add(tf.subtract(rA, tf.multiply(2., tf.matmul(x_data,tf.transpose(prediction_grid)))), tf.transpose(rB))
pred_kernel = tf.exp(tf.multiply(gamma, tf.abs(pred_sq_dist)))
prediction_output = tf.matmul(tf.multiply(tf.transpose(y_target),b),pred_kernel)
prediction = tf.sign(prediction_output-tf.reduce_mean(prediction_output))
accuracy = tf.reduce_mean(tf.cast(tf.equal(tf.squeeze(prediction),tf.squeeze(y_target)), tf.float32))
Declare the optimizer.
my_opt = tf.train.GradientDescentOptimizer(0.001)
train_step = my_opt.minimize(loss)
init = tf.initialize_all_variables()
sess.run(init)
Train.
loss_vec = []
batch_accuracy = []
for i in range(1000):
rand_index = np.random.choice(len(x_vals), size=batch_size)
X = x_vals[rand_index]
Y = np.transpose([y_vals[rand_index]])
sess.run(train_step, feed_dict={x_data: X, y_target: Y})
temp_loss = sess.run(loss, feed_dict={x_data: X, y_target: Y})
loss_vec.append(temp_loss)
acc_temp = sess.run(accuracy, feed_dict={x_data: X,y_target: Y,prediction_grid:X})
batch_accuracy.append(acc_temp)
Output
Implementing a Non-Linear SVM
Create the Graph
and data.
sess = tf.Session()
iris = datasets.load_iris()
x_vals = np.array([[x[0], x[3]] for x in iris.data])
y_vals = np.array([1 if y==0 else -1 for y in iris.target])
class1_x = [x[0] for i,x in enumerate(x_vals) if y_vals[i]==1]
class1_y = [x[1] for i,x in enumerate(x_vals) if y_vals[i]==1]
class2_x = [x[0] for i,x in enumerate(x_vals) if y_vals[i]==-1]
class2_y = [x[1] for i,x in enumerate(x_vals) if y_vals[i]==-1]
Plot the points.
Declare the batch_size
, placeholder
, and Variable
batch_size = 100
x_data = tf.placeholder(shape=[None, 2], dtype=tf.float32)
y_target = tf.placeholder(shape=[None, 1], dtype=tf.float32)
prediction_grid = tf.placeholder(shape=[None, 2], dtype=tf.float32)
b = tf.Variable(tf.random_normal(shape=[1,batch_size]))
Declare the Gaussian kernel.
gamma = tf.constant(-10.0)
dist = tf.reduce_sum(tf.square(x_data), 1)
dist = tf.reshape(dist, [-1,1])
sq_dists = tf.add(tf.subtract(dist, tf.multiply(2., tf.matmul(x_data,tf.transpose(x_data)))), tf.transpose(dist))
my_kernel = tf.exp(tf.multiply(gamma, tf.abs(sq_dists)))
Compute the loss for the dual optimization problem.
model_output = tf.matmul(b, my_kernel)
first_term = tf.reduce_sum(b)
b_vec_cross = tf.matmul(tf.transpose(b), b)
y_target_cross = tf.matmul(y_target, tf.transpose(y_target))
second_term = tf.reduce_sum(tf.multiply(my_kernel, tf.multiply(b_vec_cross,y_target_cross)))
loss = tf.negative(tf.subtract(first_term, second_term))
Create the prediction function.
rA = tf.reshape(tf.reduce_sum(tf.square(x_data), 1),[-1,1])
rB = tf.reshape(tf.reduce_sum(tf.square(prediction_grid), 1),[-1,1])
pred_sq_dist = tf.add(tf.subtract(rA, tf.multiply(2., tf.matmul(x_data,
tf.transpose(prediction_grid)))), tf.transpose(rB))
pred_kernel = tf.exp(tf.multiply(gamma, tf.abs(pred_sq_dist)))
prediction_output = tf.matmul(tf.multiply(tf.transpose(y_target),b),pred_kernel)
prediction = tf.sign(prediction_output-tf.reduce_mean(prediction_output))
accuracy = tf.reduce_mean(tf.cast(tf.equal(tf.squeeze(prediction),tf.squeeze(y_target)), tf.float32))
Declare the optimizer function.
my_opt = tf.train.GradientDescentOptimizer(0.01)
train_step = my_opt.minimize(loss)
init = tf.initialize_all_variables()
sess.run(init)
Train.
loss_vec = []
batch_accuracy = []
for i in range(1000):
rand_index = np.random.choice(len(x_vals), size=batch_size)
X = x_vals[rand_index]
Y = np.transpose([y_vals[rand_index]])
sess.run(train_step, feed_dict={x_data: X, y_target:Y})
temp_loss = sess.run(loss, feed_dict={x_data: X, y_target: Y})
loss_vec.append(temp_loss)
acc_temp = sess.run(accuracy, feed_dict={x_data: X,y_target: Y,prediction_grid:X})
batch_accuracy.append(acc_temp)
Output
Implementing a Multi-Class SVM
Create the Graph
and data.
sess = tf.Session()
iris = datasets.load_iris()
x_vals = np.array([[x[0], x[3]] for x in iris.data])
y_vals1 = np.array([1 if y==0 else -1 for y in iris.target])
y_vals2 = np.array([1 if y==1 else -1 for y in iris.target])
y_vals3 = np.array([1 if y==2 else -1 for y in iris.target])
y_vals = np.array([y_vals1, y_vals2, y_vals3])
class1_x = [x[0] for i,x in enumerate(x_vals) if iris.target[i]==0]
class1_y = [x[1] for i,x in enumerate(x_vals) if iris.target[i]==0]
class2_x = [x[0] for i,x in enumerate(x_vals) if iris.target[i]==1]
class2_y = [x[1] for i,x in enumerate(x_vals) if iris.target[i]==1]
class3_x = [x[0] for i,x in enumerate(x_vals) if iris.target[i]==2]
class3_y = [x[1] for i,x in enumerate(x_vals) if iris.target[i]==2]
Plot the points.
Declare the batch_size
, placeholder
, and Variable
.
batch_size = 50
x_data = tf.placeholder(shape=[None, 2], dtype=tf.float32)
y_target = tf.placeholder(shape=[3, None], dtype=tf.float32)
prediction_grid = tf.placeholder(shape=[None, 2], dtype=tf.float32)
b = tf.Variable(tf.random_normal(shape=[3,batch_size]))
Declare the Gaussian kernel.
gamma = tf.constant(-10.0)
dist = tf.reduce_sum(tf.square(x_data), 1)
dist = tf.reshape(dist, [-1,1])
sq_dists = tf.add(tf.subtract(dist, tf.multiply(2., tf.matmul(x_data,tf.transpose(x_data)))), tf.transpose(dist))
my_kernel = tf.exp(tf.multiply(gamma, tf.abs(sq_dists)))
We will end up with three-dimensional matrices and we will want to broadcast matrix multiplication across the third index.
def reshape_matmul(mat):
v1 = tf.expand_dims(mat, 1)
v2 = tf.reshape(v1, [3, batch_size, 1])
return(tf.matmul(v2, v1))
Copmute the dual loss function.
model_output = tf.matmul(b, my_kernel)
first_term = tf.reduce_sum(b)
b_vec_cross = tf.matmul(tf.transpose(b), b)
y_target_cross = reshape_matmul(y_target)
second_term = tf.reduce_sum(tf.multiply(my_kernel, tf.multiply(b_vec_cross,y_target_cross)),[1,2])
loss = tf.reduce_sum(tf.negative(tf.subtract(first_term, second_term)))
Create the prediction kernel.
rA = tf.reshape(tf.reduce_sum(tf.square(x_data), 1),[-1,1])
rB = tf.reshape(tf.reduce_sum(tf.square(prediction_grid), 1),[-1,1])
pred_sq_dist = tf.add(tf.subtract(rA, tf.multiply(2., tf.matmul(x_data,tf.transpose(prediction_grid)))), tf.transpose(rB))
pred_kernel = tf.exp(tf.multiply(gamma, tf.abs(pred_sq_dist)))
Create the prediction.
prediction_output = tf.matmul(tf.multiply(y_target,b), pred_kernel)
prediction = tf.arg_max(prediction_output-tf.expand_dims(tf.reduce_mean(prediction_output,1), 1), 0)
accuracy = tf.reduce_mean(tf.cast(tf.equal(prediction,tf.argmax(y_target,0)), tf.float32))
Declare the optimizer function.
my_opt = tf.train.GradientDescentOptimizer(0.01)
train_step = my_opt.minimize(loss)
init = tf.initialize_all_variables()
sess.run(init)
Train.
loss_vec = []
batch_accuracy = []
for i in range(500):
rand_index = np.random.choice(len(x_vals), size=batch_size)
X = x_vals[rand_index]
Y = y_vals[:,rand_index]
sess.run(train_step, feed_dict={x_data: X, y_target:Y})
temp_loss = sess.run(loss, feed_dict={x_data: X, y_target: Y})
loss_vec.append(temp_loss)
acc_temp = sess.run(accuracy, feed_dict={x_data: X, y_target: Y, prediction_grid:X})
batch_accuracy.append(acc_temp)
Output
Reference
[1] https://en.wikipedia.org/wiki/Support_vector_machine
[2] https://www.tensorflow.org/
[3] Tensorflow machine learning cookbook