SUMMARY OF TENSORFLOW WITHOUT A PHD
It is a tutorial worth reading; I recommend going through it before proceeding further. This blog post is a summary of what I learned from it.
This codelab uses the MNIST dataset, a collection of 60,000 labeled digits that has kept generations of PhDs busy for almost two decades. You will solve the problem with less than 100 lines of Python / TensorFlow code.
First clone the GitHub repository for better understanding:
$ git clone https://github.com/GoogleCloudPlatform/tensorflow-without-a-phd.git
$ cd tensorflow-without-a-phd/tensorflow-mnist-tutorial
When you launch the initial Python script, you should see a real-time visualisation of the training process:
$ python3 mnist_1.0_softmax.py
Train a neural network:
There are 50,000 training digits in this dataset. We feed 100 of them into the training loop at each iteration so the system will have seen all the training digits once after 500 iterations. We call this an “epoch”.
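In code, the mini-batch loop described above looks roughly like the sketch below. This is a minimal illustration only; load_training_digits() is a hypothetical stand-in for the tutorial's MNIST loader, not part of the repository:

# Illustrative mini-batch loop: 100 digits per iteration, 500 iterations = 1 epoch
import numpy as np

def load_training_digits():
    # Hypothetical stand-in for the real MNIST loader:
    # 50,000 images of 28x28 pixels plus their labels.
    return np.zeros((50000, 28, 28)), np.zeros((50000,), dtype=np.int64)

images, labels = load_training_digits()
batch_size = 100
iterations_per_epoch = len(images) // batch_size   # 50,000 / 100 = 500

for i in range(iterations_per_epoch):
    batch_x = images[i * batch_size:(i + 1) * batch_size]
    batch_y = labels[i * batch_size:(i + 1) * batch_size]
    # train_step(batch_x, batch_y)  # one gradient update per 100-digit batch

# After these 500 iterations the network has seen every training digit once: one epoch.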
To test the quality of the recognition in real-world conditions, we must use digits that the system has NOT seen during training. Otherwise, it could learn all the training digits by heart and still fail at recognising an “8” that I just wrote. The MNIST dataset contains 10,000 test digits. Here you see about 1,000 of them, with all the mis-recognised ones sorted at the top (on a red background). The scale on the left gives you a rough idea of the accuracy of the classifier (% of correctly recognised test digits).

To drive the training, we will define a loss function, i.e. a value representing how badly the system recognises the digits and try to minimise it. The choice of a loss function (here, “cross-entropy”) is explained later. What you see here is that the loss goes down on both the training and the test data as the training progresses: that is good. It means the neural network is learning. The X-axis represents iterations through the learning loop.
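For reference, here is a sketch of how a single-layer softmax model and its cross-entropy loss can be written in TensorFlow 1.x, the style used by the tutorial. The variable names, scaling constant and learning rate below are illustrative assumptions, not copied from the tutorial's code:

import tensorflow as tf  # TensorFlow 1.x style, as used in the tutorial

X  = tf.placeholder(tf.float32, [None, 28, 28, 1])   # input images
Y_ = tf.placeholder(tf.float32, [None, 10])          # correct answers (one-hot)
W  = tf.Variable(tf.zeros([784, 10]))                 # weights
b  = tf.Variable(tf.zeros([10]))                      # biases

# Model: flatten each image into 784 pixels, apply one softmax layer
Y = tf.nn.softmax(tf.matmul(tf.reshape(X, [-1, 784]), W) + b)

# Cross-entropy loss: small when the predicted probabilities match the labels
# (the scaling constant only makes the plotted values easier to read)
cross_entropy = -tf.reduce_mean(Y_ * tf.log(Y)) * 1000.0

# The optimiser nudges W and b a little at every iteration to minimise the loss
train_step = tf.train.GradientDescentOptimizer(0.005).minimize(cross_entropy)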

The accuracy is simply the % of correctly recognised digits. This is computed both on the training and the test set. You will see it go up if the training goes well.
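In TensorFlow this can be expressed in two lines, assuming Y and Y_ are the predicted and correct one-hot label tensors from the sketch above: a digit counts as correctly recognised when the highest predicted probability falls on the correct class.

is_correct = tf.equal(tf.argmax(Y, 1), tf.argmax(Y_, 1))
accuracy   = tf.reduce_mean(tf.cast(is_correct, tf.float32))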


The final two graphs represent the spread of all the values taken by the internal variables, i.e. weights and biases as the training progresses. Here you see for example that biases started at 0 initially and ended up taking values spread roughly evenly between -1.5 and 1.5. These graphs can be useful if the system does not converge well. If you see weights and biases spreading into the 100s or 1000s, you might have a problem.
The bands in the graphs are percentiles. There are 7 bands, so each band covers 100/7 ≈ 14% of all the values.
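As a rough illustration (not the tutorial's plotting code), the band edges can be computed with NumPy percentiles:

import numpy as np

weights = np.random.randn(784 * 10)                           # stand-in for a trained weight tensor
band_edges = np.percentile(weights, np.linspace(0, 100, 8))   # 8 edges -> 7 bands
print(band_edges)   # each consecutive pair of edges encloses ~14% of the values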
Keyboard shortcuts for the visualisation GUI:
1 ......... display 1st graph only
2 ......... display 2nd graph only
3 ......... display 3rd graph only
4 ......... display 4th graph only
5 ......... display 5th graph only
6 ......... display 6th graph only
7 ......... display graphs 1 and 2
8 ......... display graphs 4 and 5
9 ......... display graphs 3 and 6
ESC or 0 .. back to displaying all graphs
SPACE ..... pause/resume
O ......... box zoom mode (then use mouse)
H ......... reset all zooms
Ctrl-S .... save current image
What are “weights” and “biases”? How is the “cross-entropy” computed? How exactly does the training algorithm work? Jump to the next section to find out.