Matrix factorization works great for building recommender systems. I think it got pretty popular after the Netflix Prize competition. All you need to build one is information about which users bought or rated which items, and you’re good to go. And I was surprised how amazingly simple to build one…


In this post, I’ll share some basic data preparation steps I find myself doing quite often, and I’m sure you do too. …


Sometimes histograms and scatterplots aren’t enough. Here I’ll cover some of the more complicated plots that you might need to use: violin plots, heatmaps and Sankey diagrams. I’ll mostly use Python, and I’ve picked up this data from here; it’s data on the startup investment scene in India. …


In one of the projects that I was a part of, we had to find topics from millions of documents. You can try topic modelling with one of two methods: Non-negative Matrix Factorization (NMF) or LDA. NMF is supposed to be a lot faster than LDA, but LDA’s supposed…


I’m a huge fan of autoencoders. They have a ton of uses. They can be used for dimensionality reduction, like I show here, for image denoising, like I show in this tutorial, and for a lot of other stuff.

Today I’ll use one to build a recommender…


Previously I wrote a sort of tutorial on building a simple autoencoder in TensorFlow. In that tutorial I used the autoencoder for dimensionality reduction. Check it out if you want to. It has a much more detailed explanation of how to build the autoencoder itself. …


Autoencoders can be used to solve a lot of problems. The one I’ll try to solve here is that of dimensionality reduction. This is a pretty common problem in data science. I’ve seen it pop up in a lot of projects that I’ve worked on. If you have structured data…

Soumya Ghosh
