Simple Matrix Factorization example on the Movielens dataset using Pyspark (Soumya Ghosh, Mar 22, 2018): Matrix factorization works great for building recommender systems. I think it got pretty popular after the Netflix prize competition. All…
Basic data preparation in Pyspark: Capping, Normalizing and Scaling (Soumya Ghosh, Mar 21, 2018): In this blog, I’ll share some basic data preparation stuff I find myself doing quite often and I’m sure you do too. I’ll use Pyspark and…
Visualising Indian startup investments using Python: violin plots, heatmaps and sankey diagrams (Soumya Ghosh, Mar 18, 2018): Sometimes histograms and scatterplots aren’t enough. Here I’ll cover some of the more complicated plots that you might need to use: violin…
Topic modelling with Latent Dirichlet Allocation (LDA) in Pyspark (Soumya Ghosh, Mar 17, 2018): In one of the projects that I was a part of, we had to find topics from millions of documents. You can try doing topic modelling using two…
Recommender system on the Movielens dataset using an Autoencoder and Tensorflow in Python (Soumya Ghosh, Mar 17, 2018): I’m a huge fan of autoencoders. They have a ton of uses. They can be used for dimensionality reduction like I show here, they can be used…
Denoising MNIST images using an Autoencoder and Tensorflow in python (Soumya Ghosh, Mar 15, 2018): Previously I had written sort of a tutorial on building a simple autoencoder in tensorflow. In that tutorial I had used the autoencoder for…
Simple Autoencoder example using Tensorflow in Python on the Fashion MNIST dataset (Soumya Ghosh, Mar 11, 2018): Autoencoders can be used to solve a lot of problems. The one I’ll try to solve here is that of dimensionality reduction. This is a pretty…