Fast ICA vs Reconstruction ICA vs Orthonormal ICA in TensorFlow / Matlab [Manual Back Prop in TF]

Sep 5, 2018 · 11 min read

While reading the Unsupervised Feature Learning and Deep Learning Tutorial from Professor Andrew Ng, I came across two different methods of performing ICA, and I wanted to compare them to FastICA.

Please note that this post is for me to practice my coding skills, as well as for my future self to review the material in this post.

Data Set

For this post I will be using two different data sets: the Neurofeedback Skull-stripped (NFBS) repository and the Olivetti faces dataset from sklearn.

Dimensionality Reduction by PCA

Before statistically separating our data into independent components, let's first project it onto a lower-dimensional subspace with PCA. The images above show what samples from each data set look like.

Now, before subtracting the mean from each dimension, let's take a look at the mean face as well as the mean MRI brain.

After subtracting the mean from each image, we can see that the images now look like ghosts (especially the face images).

Now let's calculate the covariance matrix for each data set. For each data set, the left image is the manually calculated covariance matrix, while the right image is the covariance matrix computed by np.cov().
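As a sanity check, the manual calculation and np.cov() can be compared directly. This is a minimal sketch with toy random data standing in for the flattened images (the variable names here are my own, not from the original notebook); note that np.cov() treats rows as variables by default, so with examples in rows we pass rowvar=False.

```python
import numpy as np

# Toy stand-in for the image data: rows are examples, columns are pixels.
X = np.random.default_rng(0).standard_normal((10, 5))

# Manual covariance: center each column (pixel), then average outer products
# with the usual n-1 denominator.
Xc = X - X.mean(axis=0)
cov_manual = Xc.T @ Xc / (X.shape[0] - 1)

# np.cov treats rows as variables by default, so flip it with rowvar=False.
cov_numpy = np.cov(X, rowvar=False)
```

The two matrices should agree to floating-point precision, which is a quick way to confirm the manual centering and denominator match NumPy's conventions.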

Now, after performing eigenvalue decomposition on each covariance matrix, we can plot the eigenvalues in decreasing order. (It looks like 50 is a good cutoff.)

The image above shows the top 50 eigenbrains as well as eigenfaces. (Looks scary, lol.)

Using those top 50 eigen images, let's now reconstruct our original data. As seen above, the original images and the reconstructed images are not quite the same.
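The eigendecomposition, cutoff, and reconstruction steps above can be sketched as follows. The toy dimensions here are mine (64 pixels, k = 10 components) rather than the actual 64*64 faces with a 50-component cutoff, but the mechanics are the same:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 64))         # 100 toy "images", 64 pixels each
mean = X.mean(axis=0)
Xc = X - mean                              # center each pixel

# Eigendecomposition of the covariance matrix, sorted in decreasing order.
cov = Xc.T @ Xc / (Xc.shape[0] - 1)
vals, vecs = np.linalg.eigh(cov)
order = np.argsort(vals)[::-1]
vals, vecs = vals[order], vecs[:, order]

# Keep the top-k eigen images, project onto them, and reconstruct.
k = 10
top = vecs[:, :k]                          # (pixels, k)
codes = Xc @ top                           # coordinates in the subspace
X_rec = codes @ top.T + mean               # map back and re-add the mean
```

Because only k of 64 directions are kept, `X_rec` is a lossy approximation of `X`, which is exactly why the reconstructed faces above do not quite match the originals.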

Caution While Performing PCA

There is a good tutorial on performing PCA on this website; in one of its sections, the author mentions that the only difference between the covariance matrix and the scatter matrix is the denominating factor. However, while experimenting I noticed that, due to how np.cov() is implemented, this ratio might break.

As seen above, when we compare the scatter matrix to the covariance matrix, the covariance matrix is cleaner than the scatter matrix. This is mostly because of how np.cov() is implemented: before forming the covariance matrix, it subtracts the mean itself (shifting each variable to zero mean), hence the difference. If we subtract the mean ourselves before forming the scatter matrix, we observe the same covariance matrix.
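A small demonstration of this point, assuming toy data with a non-zero mean: the raw scatter matrix differs from (n-1) times np.cov()'s output, but once both sides use the same centering the expected ratio holds exactly.

```python
import numpy as np

rng = np.random.default_rng(1)
X = np.arange(12, dtype=float).reshape(4, 3) + rng.standard_normal((4, 3))
n = X.shape[0]

scatter_raw = X.T @ X                    # scatter without centering
Xc = X - X.mean(axis=0)                  # center each variable (column)
scatter = Xc.T @ Xc                      # scatter of the centered data
cov = np.cov(X, rowvar=False)            # np.cov centers internally, divides by n-1
```

Only `scatter / (n - 1)` matches `cov`; `scatter_raw / (n - 1)` does not, because np.cov() has already removed the mean. Hence the advice: if you mix manual steps with np.cov(), double-check the intermediate values.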

So, in conclusion, if you wish to follow the numpy steps, it might be a good idea to double-check the intermediate values.

Fast ICA

Since we have performed dimensionality reduction, let's now use FastICA to make the components of our reduced data independent from one another. Additionally, I am going to use log(cosh()) as my contrast function; below is an image of what the log(cosh()) function looks like.

In TensorFlow we can implement FastICA as seen below. (I am following the sklearn FastICA implementation.)
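Since the original code appears as an image, here is a minimal NumPy sketch of the symmetric FastICA update with the log(cosh) contrast (so g = tanh, g' = 1 - tanh²), loosely following the sklearn-style iteration. The function name and the convergence test are my own choices, not taken from the original notebook; the input is assumed to be whitened, with components in rows.

```python
import numpy as np

def fast_ica(X, n_iter=200, tol=1e-6, seed=0):
    """Symmetric FastICA with the log-cosh contrast.
    X: (components, samples), assumed already whitened."""
    n, m = X.shape
    W = np.random.default_rng(seed).standard_normal((n, n))

    def sym_decorrelate(W):
        # W <- (W W^T)^{-1/2} W keeps the unmixing rows orthonormal.
        s, u = np.linalg.eigh(W @ W.T)
        return u @ np.diag(1.0 / np.sqrt(s)) @ u.T @ W

    W = sym_decorrelate(W)
    for _ in range(n_iter):
        gwx = np.tanh(W @ X)                  # g(Wx)
        g_prime = 1.0 - gwx ** 2              # g'(Wx)
        W_new = gwx @ X.T / m - np.diag(g_prime.mean(axis=1)) @ W
        W_new = sym_decorrelate(W_new)
        # Converged when the rows of W stop rotating between iterations.
        if np.max(np.abs(np.abs(np.diag(W_new @ W.T)) - 1.0)) < tol:
            W = W_new
            break
        W = W_new
    return W
```

On a whitened mixture of independent sources, `W @ X` should recover the sources up to permutation and sign, which is the usual ICA indeterminacy.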

Now, after 1,000 iterations, we can see the final result for each independent component.

One thing to note here is the fact that ICA captures local changes rather than global changes.

And when we animate the convergence process of each component, it looks something like above.

Reconstruction ICA

Next, let's see RICA in action. Additionally, if anyone wants to know how to derive back-propagation with respect to the weight W, please see here.

Please note! I am not going to perform dimensionality reduction beforehand; rather, I am only going to perform ZCA whitening.

And as seen above, after whitening the edges become more clear. The right image is the resulting covariance matrix after whitening.
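For reference, ZCA whitening can be sketched in a few lines (this is a hypothetical helper of my own, following the standard eigendecomposition recipe; `eps` regularizes near-zero eigenvalues):

```python
import numpy as np

def zca_whiten(X, eps=1e-5):
    """ZCA whitening. X: (samples, features).
    eps keeps the smallest eigenvalues from blowing up the scaling."""
    X = X - X.mean(axis=0)
    cov = X.T @ X / X.shape[0]
    d, E = np.linalg.eigh(cov)
    W_zca = E @ np.diag(1.0 / np.sqrt(d + eps)) @ E.T   # symmetric whitening matrix
    return X @ W_zca
```

After this transform, the covariance matrix of the output is (up to eps) the identity, which is exactly the near-diagonal matrix shown in the right image above. Unlike PCA whitening, the symmetric form E D^(-1/2) Eᵀ keeps the result as close as possible to the original data, which is why the whitened faces still look like edge-enhanced faces.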

As seen above, we can implement RICA in TensorFlow in a layer-wise fashion. In doing so, I decided not to use the backtracking line search algorithm.
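For clarity, here is a NumPy sketch of the RICA objective from the UFLDL tutorial — a smoothed L1 sparsity penalty on the codes WX plus a reconstruction penalty on WᵀWX − X — together with its analytic gradient. The function name and the eps smoothing constant are my own; the gradient can be checked against finite differences.

```python
import numpy as np

def rica_cost_grad(W, X, lam=0.1, eps=1e-2):
    """RICA objective: lam * sum(sqrt((WX)^2 + eps)) + 0.5 * ||W^T W X - X||^2.
    W: (k, n) filter matrix, X: (n, m) data columns."""
    WX = W @ X
    R = W.T @ WX - X                       # reconstruction residual
    smooth = np.sqrt(WX ** 2 + eps)        # smoothed |WX| so the L1 is differentiable
    cost = lam * smooth.sum() + 0.5 * (R ** 2).sum()
    # d/dW of the sparsity term, plus d/dW of 0.5||W^T W X - X||^2
    grad = lam * (WX / smooth) @ X.T + W @ (X @ R.T + R @ X.T)
    return cost, grad
```

Because the constraint-free RICA objective is unbounded in scale, step-size control matters a lot here, which is one reason skipping backtracking line search can hurt convergence.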

Unfortunately, I was not able to get the algorithm to converge. This may be due to a couple of things: performing dimensionality reduction directly, needing more iterations to converge, or needing the backtracking algorithm to find an optimal step size, etc.

As seen above, using RICA on lower-dimensional images works just fine. The above example used 8-by-8 images with 3 channels, so a 192-dimensional vector for each image. (The images were natural images from the STL data set.) The face images have a dimensionality of 4096, which is 64 * 64.

Orthonormal ICA — Backtracking Line Search

Before learning the details of Orthonormal ICA, it may be a good idea to learn backtracking line search, which finds the optimal step size for updating a given weight. (I just understand it as learning the optimal learning rate.)
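A minimal sketch of backtracking line search with the Armijo sufficient-decrease condition (the function name and default constants are my own; beta shrinks the step, c sets how much decrease counts as "sufficient"):

```python
import numpy as np

def backtracking(f, grad_f, x, alpha0=1.0, beta=0.5, c=1e-4):
    """Armijo backtracking: start with step alpha0 and shrink it by beta
    until f(x - alpha * g) <= f(x) - c * alpha * ||g||^2 holds."""
    g = grad_f(x)
    fx = f(x)
    alpha = alpha0
    while f(x - alpha * g) > fx - c * alpha * (g @ g):
        alpha *= beta
    return alpha
```

In other words, rather than fixing a learning rate, each update tries the largest candidate step first and halves it until the objective actually decreases enough, which matches the "learning the optimal learning rate" intuition above.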

Orthonormal ICA

Finally, let's try out Orthonormal ICA. One thing to note is that Orthonormal ICA is very similar to Reconstruction ICA, but it has a stronger constraint: the weight matrix must be orthonormal, i.e. W Wᵀ is the identity matrix. Please note that I used the available code from this GitHub as well as this GitHub to achieve these results (in Matlab).
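The Matlab code is linked above; as a language-consistent reference, one update step can be sketched in NumPy. Per the UFLDL formulation, the sparsity cost on WX is minimized by gradient descent, and after each step W is projected back onto the orthonormality constraint via symmetric orthogonalization (the function name and constants are mine):

```python
import numpy as np

def orthonormal_ica_step(W, X, lr=0.1, eps=1e-2):
    """One step of orthonormal ICA: descend the smoothed-L1 sparsity cost
    on WX, then project W back onto the constraint W W^T = I.
    W: (k, n) with k <= n, X: (n, m)."""
    WX = W @ X
    grad = (WX / np.sqrt(WX ** 2 + eps)) @ X.T / X.shape[1]
    W = W - lr * grad
    # Symmetric orthonormalization: W <- (W W^T)^{-1/2} W.
    s, u = np.linalg.eigh(W @ W.T)
    return u @ np.diag(1.0 / np.sqrt(s)) @ u.T @ W
```

The projection makes the reconstruction term of RICA unnecessary (WᵀWX = X is implied by the constraint when W is square), which is exactly the "stronger constraint" mentioned above.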

As seen above, when we use 8*8 patches of color images (so 192 dimensions in total), the algorithm is able to learn filters that resemble Gabor filters.

However, when using the same algorithm on higher-dimensional images, the algorithm was not able to converge.

Interactive Codes

For Google Colab, you need a Google account to view the code; also, you can't run read-only scripts in Google Colab, so make a copy in your own playground. Finally, I will never ask for permission to access your files on Google Drive, just FYI. Happy coding!

Final Words

Finally, I want to make this section my personal notes again. Below I attached a link that is good practice for matrix calculus, as well as a good table of matrix-transpose rules.

There is also a good answer on the difference between ICA, FA, and PCA.

How to perform PCA using the covariance matrix in Python can be seen here.

If any errors are found, please email me at jae.duk.seo@gmail.com; if you wish to see the list of all of my writing, please view my website here.

Reference

1. 5.6.1. The Olivetti faces dataset — scikit-learn 0.19.2 documentation. (2018). Scikit-learn.org. Retrieved 31 August 2018, from http://scikit-learn.org/stable/datasets/olivetti_faces.html#olivetti-faces
2. Faces dataset decompositions — scikit-learn 0.19.2 documentation. (2018). Scikit-learn.org. Retrieved 31 August 2018, from http://scikit-learn.org/stable/auto_examples/decomposition/plot_faces_decomposition.html#sphx-glr-auto-examples-decomposition-plot-faces-decomposition-py
3. [duplicate], H. (2018). How to display multiple images in one figure correctly?. Stack Overflow. Retrieved 31 August 2018, from https://stackoverflow.com/questions/46615554/how-to-display-multiple-images-in-one-figure-correctly
4. Face completion with a multi-output estimators — scikit-learn 0.19.2 documentation. (2018). Scikit-learn.org. Retrieved 31 August 2018, from http://scikit-learn.org/stable/auto_examples/plot_multioutput_face_completion.html#sphx-glr-auto-examples-plot-multioutput-face-completion-py
5. tf.enable_eager_execution must be called at program startup. · Issue #18304 · tensorflow/tensorflow. (2018). GitHub. Retrieved 31 August 2018, from https://github.com/tensorflow/tensorflow/issues/18304
6. Eager Execution | TensorFlow. (2018). TensorFlow. Retrieved 31 August 2018, from https://www.tensorflow.org/guide/eager
7. Linear Algebra (scipy.linalg) — SciPy v1.1.0 Reference Guide. (2018). Docs.scipy.org. Retrieved 31 August 2018, from https://docs.scipy.org/doc/scipy/reference/tutorial/linalg.html
8. ZCA whitening (http://ufldl.stanford.edu/wiki/index.php/Implementing_PCA/Whitening). (2018). Gist. Retrieved 31 August 2018, from https://gist.github.com/dmaniry/5170087
9. tf.pow | TensorFlow. (2018). TensorFlow. Retrieved 31 August 2018, from https://www.tensorflow.org/api_docs/python/tf/pow
10. Brownlee, J. (2018). Gentle Introduction to Vector Norms in Machine Learning. Machine Learning Mastery. Retrieved 31 August 2018, from https://machinelearningmastery.com/vector-norms-machine-learning/
11. math what does upside down triangle symbol mean — Google Search. (2018). Google.co.kr. Retrieved 31 August 2018, from https://www.google.co.kr/search?q=math+what+does+upside+down+triangle+symbol+mean&source=lnms&tbm=isch&sa=X&ved=0ahUKEwi6pNeyl5fdAhXGzmEKHU0pAukQ_AUICigB&biw=1600&bih=907#imgrc=HtbVWFH2bEP3OM:
12. RuntimeError: expected Double tensor (got Float tensor) · Issue #2138 · pytorch/pytorch. (2018). GitHub. Retrieved 31 August 2018, from https://github.com/pytorch/pytorch/issues/2138
13. norm, D. (2018). Derivative of \$l_1\$ norm. Mathematics Stack Exchange. Retrieved 31 August 2018, from https://math.stackexchange.com/questions/1646008/derivative-of-l-1-norm
14. Matrix Theorems . (2018). Stattrek.com. Retrieved 31 August 2018, from https://stattrek.com/matrix-algebra/matrix-theorems.aspx
15. analysis?, W. (2018). What is the relationship between independent component analysis and factor analysis?. Cross Validated. Retrieved 1 September 2018, from https://stats.stackexchange.com/questions/35319/what-is-the-relationship-between-independent-component-analysis-and-factor-analy
16. tf.random_uniform | TensorFlow. (2018). TensorFlow. Retrieved 1 September 2018, from https://www.tensorflow.org/api_docs/python/tf/random_uniform
17. Backtracking line search. (2018). En.wikipedia.org. Retrieved 2 September 2018, from https://en.wikipedia.org/wiki/Backtracking_line_search
18. Exercise:Independent Component Analysis — Ufldl. (2018). Ufldl.stanford.edu. Retrieved 2 September 2018, from http://ufldl.stanford.edu/wiki/index.php?title=Exercise:Independent_Component_Analysis&oldid=1298
19. PedroCV/UFLDL-Tutorial-Solutions. (2018). GitHub. Retrieved 2 September 2018, from https://github.com/PedroCV/UFLDL-Tutorial-Solutions/blob/master/Additional_2_Independent_Component_Analysis/orthonormalICACost.m
20. cswhjiang/UFLDL-Tutorial-Exercise. (2018). GitHub. Retrieved 2 September 2018, from https://github.com/cswhjiang/UFLDL-Tutorial-Exercise/blob/master/Exercise11_independent_component_analysis_exercise/ICAExercise.m
21. objects, S. (2018). Shuffling a list of objects. Stack Overflow. Retrieved 3 September 2018, from https://stackoverflow.com/questions/976882/shuffling-a-list-of-objects
22. order, R. (2018). Randomly shuffle data and labels from different files in the same order. Stack Overflow. Retrieved 3 September 2018, from https://stackoverflow.com/questions/43229034/randomly-shuffle-data-and-labels-from-different-files-in-the-same-order/43229113
23. Unsupervised Feature Learning and Deep Learning Tutorial. (2018). Ufldl.stanford.edu. Retrieved 3 September 2018, from http://ufldl.stanford.edu/tutorial/unsupervised/RICA/
24. Unsupervised Feature Learning and Deep Learning Tutorial. (2018). Ufldl.stanford.edu. Retrieved 3 September 2018, from http://ufldl.stanford.edu/tutorial/unsupervised/ExerciseRICA/
25. Exercise:Independent Component Analysis — Ufldl. (2018). Ufldl.stanford.edu. Retrieved 3 September 2018, from http://ufldl.stanford.edu/wiki/index.php?title=Exercise:Independent_Component_Analysis&direction=prev&oldid=1016#Step_4b:_Reconstruction_ICA
26. Implementing a Principal Component Analysis (PCA). (2014). Dr. Sebastian Raschka. Retrieved 4 September 2018, from https://sebastianraschka.com/Articles/2014_pca_step_by_step.html
27. Python, P. (2018). Principal Component Analysis (PCA) in Python. Stack Overflow. Retrieved 5 September 2018, from https://stackoverflow.com/questions/13224362/principal-component-analysis-pca-in-python
28. Decoding Dimensionality Reduction, PCA and SVD. (2015). Big Data Made Simple — One source. Many perspectives.. Retrieved 5 September 2018, from http://bigdata-madesimple.com/decoding-dimensionality-reduction-pca-and-svd/
29. Decomposition, U. (2018). Using Numpy (np.linalg.svd) for Singular Value Decomposition. Stack Overflow. Retrieved 5 September 2018, from https://stackoverflow.com/questions/24913232/using-numpy-np-linalg-svd-for-singular-value-decomposition
30. tf.cosh | TensorFlow. (2018). TensorFlow. Retrieved 5 September 2018, from https://www.tensorflow.org/api_docs/python/tf/cosh
31. [ Achieved Post ] Collection of useful presentation for Independent Component Analysis. (2018). Medium. Retrieved 5 September 2018, from https://medium.com/@SeoJaeDuk/achieved-post-collection-of-useful-presentation-for-independent-component-analysis-8e07426bf095
32. Desmos graph. (2018). Desmos Graphing Calculator. Retrieved 5 September 2018, from https://www.desmos.com/calculator
33. DICOM in Python: Importing medical image data into NumPy with PyDICOM and VTK. (2014). PyScience. Retrieved 5 September 2018, from https://pyscience.wordpress.com/2014/09/08/dicom-in-python-importing-medical-image-data-into-numpy-with-pydicom-and-vtk/
34. scipy.misc.imread, u. (2018). using skimage to replace scipy.misc.imread. Stack Overflow. Retrieved 5 September 2018, from https://stackoverflow.com/questions/49686013/using-skimage-to-replace-scipy-misc-imread
35. Module: io — skimage v0.15.dev0 docs. (2018). Scikit-image.org. Retrieved 5 September 2018, from http://scikit-image.org/docs/dev/api/skimage.io.html#skimage.io.imread
36. animation example code: dynamic_image.py — Matplotlib 2.0.2 documentation. (2018). Matplotlib.org. Retrieved 5 September 2018, from https://matplotlib.org/examples/animation/dynamic_image.html
37. An animated image using a list of images — Matplotlib 2.1.2 documentation. (2018). Matplotlib.org. Retrieved 5 September 2018, from https://matplotlib.org/gallery/animation/dynamic_image2.html
38. An animated image using a list of images — Matplotlib 2.1.2 documentation. (2018). Matplotlib.org. Retrieved 5 September 2018, from https://matplotlib.org/gallery/animation/dynamic_image2.html
39. Unsupervised Feature Learning and Deep Learning Tutorial. (2018). Ufldl.stanford.edu. Retrieved 5 September 2018, from http://ufldl.stanford.edu/tutorial/unsupervised/ICA/
40. NFBS Skull-Stripped Repository. (2018). Preprocessed-connectomes-project.org. Retrieved 5 September 2018, from http://preprocessed-connectomes-project.org/NFB_skullstripped/
41. Faces dataset decompositions — scikit-learn 0.19.2 documentation. (2018). Scikit-learn.org. Retrieved 5 September 2018, from http://scikit-learn.org/stable/auto_examples/decomposition/plot_faces_decomposition.html#sphx-glr-auto-examples-decomposition-plot-faces-decomposition-py

Analytics Vidhya

Analytics Vidhya is a community of Analytics and Data Science professionals. We are building the next-gen data science ecosystem https://www.analyticsvidhya.com

Written by

Jae Duk Seo

https://jaedukseo.me I love to make my own notes my guy, let's get LIT with KNOWLEDGE in my GARAGE
