Understanding TensorFlow: Part 1

Series 1: TensorFlow 1.x VS TensorFlow 2.x

dan lee
7 min read · Sep 9, 2021

In this article series, you will get an in-depth understanding of TensorFlow, an open-source framework for distributed numerical computation.

Today we will get started with TensorFlow by defining a simple calculation and computing it with both TensorFlow 1.x and TensorFlow 2.x, to get an idea of how much more convenient TensorFlow 2.x has become compared with TensorFlow 1.x.

After we successfully complete this, we will investigate the eager mode and the autograph mode of TensorFlow 2.x.

Here is the content outline:

  • TensorFlow 1.x VS TensorFlow 2.x
  • Eager mode of TensorFlow 2.x
  • Autograph mode of TensorFlow 2.x

TensorFlow 1.x VS TensorFlow 2.x

TensorFlow has gone through two major phases so far: TensorFlow 1.x and TensorFlow 2.x. Our introduction will focus on TensorFlow 2.x. Before that, it helps to understand the main differences between the two versions, so that we know which TensorFlow 2.x features to focus on and can appreciate how much more convenient TensorFlow 2.x has become compared with TensorFlow 1.x.

Let’s start the comparison with a simple example: performing the same ‘add’ computation in TensorFlow 1.x and in TensorFlow 2.x.

c = a + b

TensorFlow 1.x’s code:

import tensorflow as tf

# build graph
a = tf.constant(3)
b = tf.constant(4)
c = a + b

# use session to run
with tf.Session() as sess:
    result = sess.run(c)
    print(result)

In TensorFlow 1.x, even the simplest ‘add’ computation requires at least two main parts in your code. The first step is to create a static computation graph; in the code above, the ‘#build graph’ block does this job. It doesn’t actually perform any computation, even though it looks like it does. To actually run the computation, you need to open a TensorFlow session. That is the second step: the ‘#use session to run’ block. tf.Session() places the operations onto devices such as CPUs or GPUs and runs them. If there are variables in the computation graph, the session also holds, initializes, and evaluates them.
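To make that variable handling concrete, here is a minimal TF 1.x-style sketch (our own toy example, runnable only under TensorFlow 1.x) where the session has to initialize a variable before anything can be evaluated:

import tensorflow as tf

v = tf.Variable(2)   # declared as a graph node, no value held yet
w = v + 3            # also just a node, nothing is computed here

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())  # the session initializes variables
    print(sess.run(w))                            # => 5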

We can see that the whole process is not intuitive and goes against our usual programming habits. Especially when you want to debug a larger neural network, it can become a disaster. TensorFlow 2.x comes to save us from that.

TensorFlow 2.x’s code:

import tensorflow as tf

a = tf.constant(3.)
b = tf.constant(5.)
c = a + b
print(c)

Yes, it is that easy! To finish the ‘add’ computation, TensorFlow 2.x only needs three lines: the first two create the elements of the ‘add’ computation, and the third executes the ‘add’ operation and returns the result directly. Unlike TensorFlow 1.x, which uses a static graph, TensorFlow 2.x is dynamic: we can check the computation result directly after each line of code. That makes debugging much easier, and it matches our usual programming habits.

Compared with TensorFlow 1.x, TensorFlow 2.x makes the development of ML applications much easier. With tight integration of Keras into TensorFlow, eager execution by default, and Pythonic function execution, TensorFlow 2.x makes the experience of developing applications as familiar as possible for Python developers.

Eager mode of TensorFlow 2.x

TensorFlow 2.x removes the unintuitive pattern of building a computational graph and running it later in a tf.Session. The key to this is eager execution. Eager execution provides an imperative programming environment with Python data structures, immediate error reporting, and Python flow control.

All three of these change TensorFlow’s behavior so that operations are evaluated and their values returned to Python immediately. Concrete values flow naturally through your code, which makes standard Python debugging possible.
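As a small illustration of these points (the tensor below is our own toy example): because eager tensors carry concrete values, ordinary Python if statements and NumPy conversion work on them directly.

import tensorflow as tf
import numpy as np

t = tf.constant([[1., 2.], [3., 4.]])

# Python control flow can branch on the tensor's actual value.
if tf.reduce_sum(t) > 5:
    t = t * 2

# Eager tensors convert to and from NumPy without any session.
arr = t.numpy()                                 # tf.Tensor -> np.ndarray
back = tf.convert_to_tensor(np.ones((2, 2)))    # np.ndarray -> tf.Tensor
print(arr)
print(back)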

Eager execution supports most TensorFlow operations and GPU acceleration. In TensorFlow 2.x, eager execution is enabled by default.
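You can verify the default yourself with one call; in TensorFlow 2.x it returns True without any extra setup:

import tensorflow as tf

print(tf.executing_eagerly())   # => True in TensorFlow 2.x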

Let’s use a simple example to run some TensorFlow operations; you will see that the results are returned immediately.

h = sigmoid(W * x + b)

Please note that the examples in this section are only meant to give you an impression of what TensorFlow 2.x code looks like. The details of each kind of tensor creation and operation will be introduced step by step in the next section.

import tensorflow as tf

x = tf.constant([[1., 2., 3.], [4., 5., 6.]])
w = tf.Variable(tf.random.normal([3, 2]))
b = tf.Variable(tf.random.normal([2]))

z = tf.matmul(x, w) + b
y = tf.nn.sigmoid(z)
print(y)

# Returns (out) =>
tf.Tensor(
[[0.4677037  0.25148553]
 [0.9712641  0.0902136 ]], shape=(2, 2), dtype=float32)

The static computational graph and the session have both disappeared from our code! We can easily check the current result at runtime simply with print(). Of course, you can put a print() right after any operation whenever you want to see an intermediate result; for example, put print(x) right after x = tf.constant([[1., 2., 3.], [4., 5., 6.]]) to check the value of x.

To get a more general impression of how eager execution works, let’s use it to walk through the whole process of training a linear model. The purpose is not to grasp every detail right here and now, but to get you familiar with the TensorFlow 2.x programming style. It doesn’t matter if you can’t fully understand everything in the code; either briefly read it through to get the basic idea, or skip this part and come back after reading more about TensorFlow and models. Both are fine.

First, let’s fake some data for training. In this example, we generate 2000 samples, each a vector with 10 elements. The true parameters that we want the model to recover are w_true and b_true: every element of w_true is set to 3, with shape [10, 5], and every element of b_true is set to 1, with shape [5].

NUM_EXAMPLES = 2000

x = tf.random.normal([NUM_EXAMPLES, 10])
noise = tf.random.normal([NUM_EXAMPLES, 5])

w_true = tf.ones([10, 5])
w_true = w_true * 3
b_true = tf.ones([5])

y = tf.matmul(x, w_true) + b_true + noise

print(f'x sample: {x[0:1]}')
print(f'y sample: {y[0:1]}')

# Returns (out) =>
x sample: [[ 0.7714453   0.96156746  1.2024771  -1.0653574  -0.749367    2.1572323
  -1.1070591  -0.13671888  0.7504565   1.1920955 ]]
y sample: [[14.045872 12.701928 13.449406 12.854302 12.903345]]

Next, build the linear model as a tf.keras model subclass. The model parameters are W and B, and the linear model returns W * x + B.

class Linear(tf.keras.Model):
    def __init__(self):
        super(Linear, self).__init__()
        self.W = tf.Variable(tf.random.normal([10, 5]), name='weight')
        self.B = tf.Variable(tf.random.normal([5]), name='bias')

    def call(self, inputs):
        return tf.matmul(inputs, self.W) + self.B

Third, create the model and the optimizer.

model = Linear()
optimizer = tf.keras.optimizers.SGD(learning_rate=0.01)

Fourth, train the model. After training, we can inspect the model parameters self.W and self.B; their values are listed in the output. Since this is demo code, W and B are not extremely close to w_true and b_true, but we can still see the trend: they bounce around with the target values as their center.

steps = 100
for i in range(steps):
    with tf.GradientTape() as tape:
        # loss
        error = model(x) - y
        loss_value = tf.reduce_mean(tf.square(error))
    grads = tape.gradient(loss_value, [model.W, model.B])
    optimizer.apply_gradients(zip(grads, [model.W, model.B]))
    if i % 20 == 0:
        print(f"Loss at step {i}: {loss_value}")

print(f"W = {model.W.numpy()}, B = {model.B.numpy()}")

# Returns (out) =>
Loss at step 0: 0.9959805607795715
Loss at step 20: 0.9947214722633362
Loss at step 40: 0.993640124797821
Loss at step 60: 0.9927115440368652
Loss at step 80: 0.9919135570526123
W = [[3.013761 2.972824 2.9576807 2.9959006 2.9828606] [2.9921997 2.9522212 2.9901388 2.981263 2.9870505] [2.983197 2.9589314 3.0155313 3.0105908 3.0207992] [3.002246 2.9900434 2.9911227 3.0125127 2.9714382] [2.972854 2.9398937 3.0158775 2.9727569 3.0097048] [3.0214002 2.9796052 3.002326 3.0160155 3.0155087] [2.9566758 2.9727437 2.9714837 2.9215772 2.9489958] [2.9522686 3.001058 2.9785402 2.97918 3.0170207] [2.9413743 2.9621522 2.9582999 2.9600418 2.9680848] [2.970826 2.986206 2.9471984 2.9669378 2.975441 ]], B = [1.0471543 0.98861533 1.0299664 0.9623249 0.96185046]
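If you want a single number summarizing how close the trained parameters are to the targets, a quick check against the w_true and b_true defined earlier (assuming the training code above has just run) is the mean absolute difference:

# Average absolute distance of the trained parameters from the true ones.
print(tf.reduce_mean(tf.abs(model.W - w_true)).numpy())
print(tf.reduce_mean(tf.abs(model.B - b_true)).numpy())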

The final step: save, load, and predict.

# save
model.save_weights('weights')
del model

# load
model = Linear()
model.load_weights('weights')

# predict
x_new = tf.random.normal([1, 10])
print(x_new)
print(model.predict(x_new))

# Returns (out) =>
tf.Tensor(
[[-0.45371628 -1.0948718  -0.5971005  -1.6125325   0.8018683  -0.4925114
  -2.4727316  -0.41909572  0.96381396 -0.81628186]], shape=(1, 10), dtype=float32)
[[-13.022452 -11.254985  -9.975083 -14.200058 -10.875286]]

Autograph mode of TensorFlow 2.x

While eager execution makes development and debugging more interactive, TensorFlow 1.x style graph execution has advantages for distributed training, performance optimizations, and production deployment. To bridge this gap, TensorFlow 2.x introduces functions via the tf.function API.

TensorFlow 2.x can automatically construct graphs when you need them. The way you create a graph in TensorFlow 2.x is to use tf.function, either as a direct call or as a decorator.

It is common to use the tf.function decorator to enable the autograph mode of TensorFlow 2.x. tf.function lets you write graph code using natural Python syntax. When a function is annotated with tf.function, it can still be called like any other function, but it will be compiled into a graph, which means faster execution and better performance on GPU or TPU.

@tf.function
def sigmoid_layer(x, w):
    return tf.nn.sigmoid(tf.matmul(x, w))

x = tf.random.uniform((2, 5))
w = tf.random.uniform((5, 2))
sigmoid_layer(x, w)

# Returns (out) =>
<tf.Tensor: shape=(2, 2), dtype=float32, numpy=
array([[0.79586333, 0.8243984 ],
       [0.83552325, 0.78892213]], dtype=float32)>
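If you want to confirm that a graph really was traced behind the scenes, a tf.function exposes the concrete function it compiled for given inputs; here is a small sketch reusing x and w from above:

# Ask the tf.function for the graph traced for these input shapes and dtypes.
concrete_fn = sigmoid_layer.get_concrete_function(x, w)
print(concrete_fn.graph)                                      # a FuncGraph object
print([op.name for op in concrete_fn.graph.get_operations()])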

If your code uses multiple functions, you don’t need to annotate them all: any function called from an annotated function will also run in graph mode. For example,

def inner_function(x, y, b):
    x = tf.matmul(x, y)
    x = x + b
    return x

@tf.function
def outer_function(x):
    y = tf.constant([[2.0], [3.0]])
    b = tf.constant(4.0)
    return inner_function(x, y, b)

outer_function(tf.constant([[1.0, 2.0]])).numpy()

# Returns (out) =>

array([[12.]], dtype=float32)

You can also decorate a custom tf.keras model by annotating the call function of the model.

class CustomModel(tf.keras.models.Model):
    @tf.function
    def call(self, input_data):
        if tf.reduce_mean(input_data) > 0:
            return input_data
        else:
            return input_data // 2

model = CustomModel()
model(tf.constant([-2, -4]))

# Returns (out) =>

<tf.Tensor: shape=(2,), dtype=int32, numpy=array([-1, -2], dtype=int32)>
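Under the hood, AutoGraph rewrites the Python if/else in call into graph-compatible control flow (a tf.cond). If you are curious, you can print the code it generates from the original Python function:

# Show the Python source that AutoGraph generates from call().
print(tf.autograph.to_code(CustomModel.call.python_function))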

Other series of Understanding TensorFlow:

Series 1: https://medium.com/@Adline125/understanding-tensorflow-series-979e71cc5562

Series 2: https://medium.com/@Adline125/understanding-tensorflow-fcc431891d08

Series 3-1: https://medium.com/@Adline125/understanding-tensorflow-ce18f0e1bbbc

Series 3-2: https://medium.com/@Adline125/understanding-tensorflow-2c6496b71368

Series 4: https://medium.com/@Adline125/understanding-tensorflow-94bdea8e1fd9

