FAST ICA vs Reconstruction ICA vs Orthonormal ICA in Tensorflow / Matlab [Manual Back Prop in TF]

Jae Duk Seo
Sep 5, 2018 · 11 min read
GIF from this website

While reading the Unsupervised Feature Learning and Deep Learning Tutorial from Professor Andrew Ng, I found two additional methods of performing ICA, and I wanted to compare them to FastICA.

Please note that this post is for me to practice my coding skills, as well as for my future self to review the material in this post.

Data Set

For this post I will be using two different data sets: the Neurofeedback Skull-stripped (NFBS) repository and the Olivetti faces dataset from sklearn.

Dimensionality Reduction by PCA

Before statistically separating our data into independent components, let's first project it onto a lower-dimensional subspace using PCA. The image above shows what samples from each data set look like.

Now, before subtracting the mean from each dimension, let's take a look at the mean face as well as the mean MRI brain.

After subtracting the mean from each image, we can see that the images now look like ghosts (especially the face images).

Now let's calculate the covariance matrix for each data set. For each data set, the left image is the manually calculated covariance matrix, while the right image is the covariance matrix calculated by np.cov().
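For reference, here is a minimal sketch of the manual calculation next to np.cov(); the toy array shapes and names are my own, with images stacked as rows:

```python
import numpy as np

# Toy stand-in for the data: each row is one flattened image.
X = np.random.randn(100, 64)

# Center each dimension (pixel) across the data set.
Xc = X - X.mean(axis=0)

# Manual covariance: (1 / (n - 1)) * Xc^T Xc.
cov_manual = Xc.T @ Xc / (X.shape[0] - 1)

# numpy's version; rowvar=False treats columns as the variables.
cov_np = np.cov(X, rowvar=False)
print(np.allclose(cov_manual, cov_np))  # True
```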

Now, after performing an eigenvalue decomposition of each covariance matrix, we can plot the eigenvalues in decreasing order. (It looks like 50 is a good cut-off.)

The above image shows the top 50 eigenbrains as well as eigenfaces. (Looks scary lol.)

Using those top 50 eigenimages, let's now reconstruct our original data. As seen above, the original images and the reconstructed images are not quite the same.
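Reusing the arrays from the covariance sketch above, a minimal version of the projection and reconstruction step looks like this (the cut-off k comes from the eigenvalue plot):

```python
# Eigendecomposition of the covariance matrix from the sketch above.
eigvals, eigvecs = np.linalg.eigh(cov_manual)

# eigh returns eigenvalues in ascending order, so flip to descending.
order = np.argsort(eigvals)[::-1]
k = 50
top_k = eigvecs[:, order[:k]]             # (n_features, k) eigenimages

# Project the centered data onto the top-k subspace, then map back.
codes = Xc @ top_k                         # (n_samples, k)
X_recon = codes @ top_k.T + X.mean(axis=0)
```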

Caution While Performing PCA

Image from this website

There is a good tutorial on performing PCA from this website; however, in one of the sections the author mentions that the only difference between the covariance matrix and the scatter matrix is the scaling (denominator) factor. While experimenting, I noticed that, due to how np.cov() is implemented, this ratio might not hold.

As seen above, when we compare the scatter matrix to the covariance matrix, we can see that the covariance matrix is clearer than the scatter matrix. This is mostly because of how np.cov() is implemented: before building the covariance matrix, it also subtracts the mean per example (shifting each example to zero mean), hence the difference. If we subtract the mean per example ourselves, we obtain the same covariance matrix.
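A quick demonstration of the mismatch, assuming the images are stacked as rows and fed to np.cov() with its default rowvar=True (so numpy treats each row, i.e., each example, as a variable and centers it):

```python
import numpy as np

X = np.random.randn(100, 64)                # rows are flattened images
Xd = X - X.mean(axis=0)                     # centered per dimension only

# Scatter matrix over examples, scaled like a covariance matrix.
scatter = Xd @ Xd.T / (X.shape[1] - 1)

# np.cov() additionally centers each row (each example),
# so the two generally disagree.
print(np.allclose(scatter, np.cov(Xd)))     # False

# Centering per example ourselves recovers numpy's answer.
Xe = Xd - Xd.mean(axis=1, keepdims=True)
print(np.allclose(Xe @ Xe.T / (X.shape[1] - 1), np.cov(Xd)))  # True
```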

So, in conclusion, if you wish to follow the numpy steps, it might be a good idea to double-check the values in between.

Fast ICA

Image from this website

Since we have performed dimensionality reduction, let's now use FastICA to make the reduced data statistically independent from one another. Additionally, I am going to use log(cosh()) as my contrast (activation) function, and below is an image of how the log(cosh()) function looks. (Its derivative is tanh(), which is what actually appears in the update rule.)

Image from this website

In TensorFlow we can implement FastICA as seen below. (I am following the sklearn FastICA implementation.)
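For readers without the notebook open, here is a minimal NumPy sketch of the parallel (symmetric) FastICA update with the log-cosh contrast, mirroring the steps sklearn takes; the TensorFlow version in my code follows the same logic, and all names here are my own:

```python
import numpy as np

def sym_decorrelation(W):
    # W <- (W W^T)^(-1/2) W keeps every row orthogonal at once.
    s, u = np.linalg.eigh(W @ W.T)
    return u @ np.diag(1.0 / np.sqrt(s)) @ u.T @ W

def fast_ica(X, n_iter=1000, tol=1e-6, seed=0):
    """Parallel FastICA with the log-cosh contrast.
    X: whitened data of shape (n_components, n_samples)."""
    n_comp, n_samples = X.shape
    rng = np.random.RandomState(seed)
    W = sym_decorrelation(rng.randn(n_comp, n_comp))
    for _ in range(n_iter):
        gwtx = np.tanh(W @ X)                   # g = tanh = (log cosh)'
        g_wtx = (1.0 - gwtx ** 2).mean(axis=1)  # E[g'(w^T x)]
        W_new = sym_decorrelation(gwtx @ X.T / n_samples
                                  - g_wtx[:, None] * W)
        # Converged when every row stops rotating.
        lim = np.max(np.abs(np.abs(np.einsum('ij,ij->i', W_new, W)) - 1.0))
        W = W_new
        if lim < tol:
            break
    return W  # estimated sources: W @ X
```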

Now, after 1000 iterations, we can see the final results for each independent component.

One thing to note here is the fact that ICA captures local changes rather than global changes.

And when we animate the convergence process of each component, it looks something like the above.

Reconstruction ICA

Image from this website

Next, let's see RICA in action. Additionally, if anyone wants to know how to derive the back propagation with respect to the weight W, please see here.

Please note! I am not going to perform dimensionality reduction beforehand; rather, I am only going to perform ZCA whitening.
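Since whitening does all the preprocessing work here, a minimal sketch of ZCA whitening may help (the epsilon and names are my own choices, following the UFLDL recipe):

```python
import numpy as np

def zca_whiten(X, eps=1e-5):
    """ZCA whitening: decorrelate and rescale the dimensions while
    staying as close as possible to the original pixel space.
    X: data of shape (n_samples, n_features)."""
    Xc = X - X.mean(axis=0)
    cov = Xc.T @ Xc / Xc.shape[0]
    d, E = np.linalg.eigh(cov)
    # W_zca = E diag(1/sqrt(d + eps)) E^T
    W = E @ np.diag(1.0 / np.sqrt(d + eps)) @ E.T
    return Xc @ W
```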

And as seen above, after whitening, the edges become clearer. The right image is the resulting covariance matrix after whitening (approximately the identity).

As seen above, we can implement RICA in TensorFlow in a layer-wise fashion. In doing so, I decided not to use the backtracking line search algorithm.
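For readers without the notebook, here is a minimal NumPy sketch of the RICA cost and gradient in the UFLDL formulation; the smoothing epsilon, the lambda, and all names are my own choices:

```python
import numpy as np

def rica_cost_grad(W, X, lam=0.05, eps=1e-2):
    """RICA objective (UFLDL formulation):
        J(W) = lam * sum(sqrt((W X)^2 + eps)) + 0.5 * ||W^T W X - X||_F^2
    The smooth sqrt stands in for |.| so the gradient exists everywhere.
    W: (n_components, n_features), X: (n_features, n_samples)."""
    WX = W @ X
    R = W.T @ WX - X                     # reconstruction residual
    smooth = np.sqrt(WX ** 2 + eps)      # smoothed L1 penalty
    cost = lam * smooth.sum() + 0.5 * (R ** 2).sum()
    grad = lam * (WX / smooth) @ X.T + W @ (R @ X.T + X @ R.T)
    return cost, grad
```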

Unfortunately, I was not able to get the algorithm to converge. This may be due to a couple of things: it may have needed dimensionality reduction beforehand, more iterations to converge, or the backtracking algorithm to find an optimal step size, etc.

Image from this website

As seen above, using RICA on lower-dimensional images works just fine. The above example used 8-by-8 images with 3 channels, so a 192-dimensional vector for each image. (The images were natural images from the STL data set.) The face images, in contrast, have a dimensionality of 4096 (64 * 64).

Orthonormal ICA — Backtracking Line Search

Image from this website

Before learning the details of Orthonormal ICA, it may be a good idea to learn backtracking line search, which finds an optimal step size for updating a given weight. (I just understand it as finding the optimal learning rate; a small sketch follows the slides below.)

PPT from this website
PPT from this website
PPT from this website
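Condensing the slides above, a minimal sketch of backtracking line search with the Armijo sufficient-decrease condition looks like this (the constants alpha0, rho, and c are my own typical choices):

```python
import numpy as np

def backtracking_line_search(f, grad_f, x, direction,
                             alpha0=1.0, rho=0.5, c=1e-4):
    """Shrink the step until the Armijo condition holds:
        f(x + a*d) <= f(x) + c * a * grad_f(x)^T d."""
    alpha = alpha0
    fx = f(x)
    slope = grad_f(x) @ direction  # negative for a descent direction
    while f(x + alpha * direction) > fx + c * alpha * slope:
        alpha *= rho
    return alpha

# Example: one step of gradient descent on f(x) = ||x||^2.
f = lambda x: x @ x
grad = lambda x: 2.0 * x
x = np.array([3.0, -4.0])
d = -grad(x)
x_new = x + backtracking_line_search(f, grad, x, d) * d
```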

Orthonormal ICA

Image from this website

Finally, let's try out Orthonormal ICA. One thing to note is that Orthonormal ICA is very similar to Reconstruction ICA; however, it has a stronger constraint, namely that the weight matrix must be orthonormal (W·Wᵀ equals the identity matrix). Please note that I used the available code from this GitHub as well as this GitHub to achieve these results (in Matlab).
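To make the constraint concrete, here is a minimal NumPy sketch of the idea using projected gradient descent with a fixed step size instead of the backtracking search the Matlab code uses; the epsilon, step size, and names are my own assumptions:

```python
import numpy as np

def orthonormalize(W):
    # Project back onto the constraint set: W <- (W W^T)^(-1/2) W.
    s, u = np.linalg.eigh(W @ W.T)
    return u @ np.diag(1.0 / np.sqrt(s)) @ u.T @ W

def orthonormal_ica(X, n_comp, n_iter=500, lr=0.5, eps=1e-2, seed=0):
    """Minimize the smoothed L1 sparsity of W X subject to W W^T = I.
    X: whitened data of shape (n_features, n_samples)."""
    rng = np.random.RandomState(seed)
    W = orthonormalize(rng.randn(n_comp, X.shape[0]))
    for _ in range(n_iter):
        WX = W @ X
        grad = (WX / np.sqrt(WX ** 2 + eps)) @ X.T / X.shape[1]
        W = orthonormalize(W - lr * grad)  # step, then re-project
    return W
```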

As seen above, when we use 8*8 patches of color images (so 192 dimensions in total), the algorithm is able to learn filters that resemble Gabor filters.

However, when using the same algorithm on higher-dimensional images, it was not able to converge.

Interactive Codes

For Google Colab, you need a Google account to view the code. Also, you can't run read-only scripts in Google Colab, so make a copy in your own playground. Finally, I will never ask for permission to access your files on Google Drive, just FYI. Happy coding!

To access the code for FastICA on MRI, please click here.
To access the code for FastICA on face images, please click here.
To access the code for RICA, please click here.
To access the code for Orthonormal ICA, please click here.

Final Words

Finally, I want to make this section my personal notes again. Below I attached a link to good practice material for matrix calculus, as well as a good table of matrix transpose rules.

Image from this website
Image from this website

As well as a good answer on the differences between ICA, FA, and PCA.

Image from this website

How to perform PCA using the covariance matrix in Python can be seen here.

If any errors are found, please email me at jae.duk.seo@gmail.com. If you wish to see the list of all of my writing, please view my website here.

Meanwhile, follow me on my twitter here, and visit my website or my Youtube channel for more content. I also implemented Wide Residual Networks; please click here to view the blog post.

Reference

  1. 5.6.1. The Olivetti faces dataset — scikit-learn 0.19.2 documentation. (2018). Scikit-learn.org. Retrieved 31 August 2018, from http://scikit-learn.org/stable/datasets/olivetti_faces.html#olivetti-faces
  2. Faces dataset decompositions — scikit-learn 0.19.2 documentation. (2018). Scikit-learn.org. Retrieved 31 August 2018, from http://scikit-learn.org/stable/auto_examples/decomposition/plot_faces_decomposition.html#sphx-glr-auto-examples-decomposition-plot-faces-decomposition-py
  3. [duplicate], H. (2018). How to display multiple images in one figure correctly?. Stack Overflow. Retrieved 31 August 2018, from https://stackoverflow.com/questions/46615554/how-to-display-multiple-images-in-one-figure-correctly
  4. Face completion with a multi-output estimators — scikit-learn 0.19.2 documentation. (2018). Scikit-learn.org. Retrieved 31 August 2018, from http://scikit-learn.org/stable/auto_examples/plot_multioutput_face_completion.html#sphx-glr-auto-examples-plot-multioutput-face-completion-py
  5. tf.enable_eager_execution must be called at program startup. · Issue #18304 · tensorflow/tensorflow. (2018). GitHub. Retrieved 31 August 2018, from https://github.com/tensorflow/tensorflow/issues/18304
  6. Eager Execution | TensorFlow. (2018). TensorFlow. Retrieved 31 August 2018, from https://www.tensorflow.org/guide/eager
  7. Linear Algebra (scipy.linalg) — SciPy v1.1.0 Reference Guide. (2018). Docs.scipy.org. Retrieved 31 August 2018, from https://docs.scipy.org/doc/scipy/reference/tutorial/linalg.html
  8. ZCA whitening (http://ufldl.stanford.edu/wiki/index.php/Implementing_PCA/Whitening). (2018). Gist. Retrieved 31 August 2018, from https://gist.github.com/dmaniry/5170087
  9. tf.pow | TensorFlow. (2018). TensorFlow. Retrieved 31 August 2018, from https://www.tensorflow.org/api_docs/python/tf/pow
  10. Brownlee, J. (2018). Gentle Introduction to Vector Norms in Machine Learning. Machine Learning Mastery. Retrieved 31 August 2018, from https://machinelearningmastery.com/vector-norms-machine-learning/
  11. math what does upside down triangle symbol mean — Google Search. (2018). Google.co.kr. Retrieved 31 August 2018, from https://www.google.co.kr/search?q=math+what+does+upside+down+triangle+symbol+mean&source=lnms&tbm=isch&sa=X&ved=0ahUKEwi6pNeyl5fdAhXGzmEKHU0pAukQ_AUICigB&biw=1600&bih=907#imgrc=HtbVWFH2bEP3OM:
  12. RuntimeError: expected Double tensor (got Float tensor) · Issue #2138 · pytorch/pytorch. (2018). GitHub. Retrieved 31 August 2018, from https://github.com/pytorch/pytorch/issues/2138
  13. norm, D. (2018). Derivative of $l_1$ norm. Mathematics Stack Exchange. Retrieved 31 August 2018, from https://math.stackexchange.com/questions/1646008/derivative-of-l-1-norm
  14. Matrix Theorems . (2018). Stattrek.com. Retrieved 31 August 2018, from https://stattrek.com/matrix-algebra/matrix-theorems.aspx
  15. analysis?, W. (2018). What is the relationship between independent component analysis and factor analysis?. Cross Validated. Retrieved 1 September 2018, from https://stats.stackexchange.com/questions/35319/what-is-the-relationship-between-independent-component-analysis-and-factor-analy
  16. tf.random_uniform | TensorFlow. (2018). TensorFlow. Retrieved 1 September 2018, from https://www.tensorflow.org/api_docs/python/tf/random_uniform
  17. Backtracking line search. (2018). En.wikipedia.org. Retrieved 2 September 2018, from https://en.wikipedia.org/wiki/Backtracking_line_search
  18. Exercise:Independent Component Analysis — Ufldl. (2018). Ufldl.stanford.edu. Retrieved 2 September 2018, from http://ufldl.stanford.edu/wiki/index.php?title=Exercise:Independent_Component_Analysis&oldid=1298
  19. PedroCV/UFLDL-Tutorial-Solutions. (2018). GitHub. Retrieved 2 September 2018, from https://github.com/PedroCV/UFLDL-Tutorial-Solutions/blob/master/Additional_2_Independent_Component_Analysis/orthonormalICACost.m
  20. cswhjiang/UFLDL-Tutorial-Exercise. (2018). GitHub. Retrieved 2 September 2018, from https://github.com/cswhjiang/UFLDL-Tutorial-Exercise/blob/master/Exercise11_independent_component_analysis_exercise/ICAExercise.m
  21. objects, S. (2018). Shuffling a list of objects. Stack Overflow. Retrieved 3 September 2018, from https://stackoverflow.com/questions/976882/shuffling-a-list-of-objects
  22. order, R. (2018). Randomly shuffle data and labels from different files in the same order. Stack Overflow. Retrieved 3 September 2018, from https://stackoverflow.com/questions/43229034/randomly-shuffle-data-and-labels-from-different-files-in-the-same-order/43229113
  23. Unsupervised Feature Learning and Deep Learning Tutorial. (2018). Ufldl.stanford.edu. Retrieved 3 September 2018, from http://ufldl.stanford.edu/tutorial/unsupervised/RICA/
  24. Unsupervised Feature Learning and Deep Learning Tutorial. (2018). Ufldl.stanford.edu. Retrieved 3 September 2018, from http://ufldl.stanford.edu/tutorial/unsupervised/ExerciseRICA/
  25. Exercise:Independent Component Analysis — Ufldl. (2018). Ufldl.stanford.edu. Retrieved 3 September 2018, from http://ufldl.stanford.edu/wiki/index.php?title=Exercise:Independent_Component_Analysis&direction=prev&oldid=1016#Step_4b:_Reconstruction_ICA
  26. Implementing a Principal Component Analysis (PCA). (2014). Dr. Sebastian Raschka. Retrieved 4 September 2018, from https://sebastianraschka.com/Articles/2014_pca_step_by_step.html
  27. Python, P. (2018). Principal Component Analysis (PCA) in Python. Stack Overflow. Retrieved 5 September 2018, from https://stackoverflow.com/questions/13224362/principal-component-analysis-pca-in-python
  28. Decoding Dimensionality Reduction, PCA and SVD. (2015). Big Data Made Simple — One source. Many perspectives.. Retrieved 5 September 2018, from http://bigdata-madesimple.com/decoding-dimensionality-reduction-pca-and-svd/
  29. Decomposition, U. (2018). Using Numpy (np.linalg.svd) for Singular Value Decomposition. Stack Overflow. Retrieved 5 September 2018, from https://stackoverflow.com/questions/24913232/using-numpy-np-linalg-svd-for-singular-value-decomposition
  30. tf.cosh | TensorFlow. (2018). TensorFlow. Retrieved 5 September 2018, from https://www.tensorflow.org/api_docs/python/tf/cosh
  31. [ Achieved Post ] Collection of useful presentation for Independent Component Analysis. (2018). Medium. Retrieved 5 September 2018, from https://medium.com/@SeoJaeDuk/achieved-post-collection-of-useful-presentation-for-independent-component-analysis-8e07426bf095
  32. Desmos graph. (2018). Desmos Graphing Calculator. Retrieved 5 September 2018, from https://www.desmos.com/calculator
  33. DICOM in Python: Importing medical image data into NumPy with PyDICOM and VTK. (2014). PyScience. Retrieved 5 September 2018, from https://pyscience.wordpress.com/2014/09/08/dicom-in-python-importing-medical-image-data-into-numpy-with-pydicom-and-vtk/
  34. scipy.misc.imread, u. (2018). using skimage to replace scipy.misc.imread. Stack Overflow. Retrieved 5 September 2018, from https://stackoverflow.com/questions/49686013/using-skimage-to-replace-scipy-misc-imread
  35. Module: io — skimage v0.15.dev0 docs. (2018). Scikit-image.org. Retrieved 5 September 2018, from http://scikit-image.org/docs/dev/api/skimage.io.html#skimage.io.imread
  36. animation example code: dynamic_image.py — Matplotlib 2.0.2 documentation. (2018). Matplotlib.org. Retrieved 5 September 2018, from https://matplotlib.org/examples/animation/dynamic_image.html
  37. An animated image using a list of images — Matplotlib 2.1.2 documentation. (2018). Matplotlib.org. Retrieved 5 September 2018, from https://matplotlib.org/gallery/animation/dynamic_image2.html
  38. Unsupervised Feature Learning and Deep Learning Tutorial. (2018). Ufldl.stanford.edu. Retrieved 5 September 2018, from http://ufldl.stanford.edu/tutorial/unsupervised/ICA/
  39. NFBS Skull-Stripped Repository. (2018). Preprocessed-connectomes-project.org. Retrieved 5 September 2018, from http://preprocessed-connectomes-project.org/NFB_skullstripped/
  40. Faces dataset decompositions — scikit-learn 0.19.2 documentation. (2018). Scikit-learn.org. Retrieved 5 September 2018, from http://scikit-learn.org/stable/auto_examples/decomposition/plot_faces_decomposition.html#sphx-glr-auto-examples-decomposition-plot-faces-decomposition-py
