2016. It was a sunny morning in London, and Nischal and I were roaming around Westminster Bridge.

We were speaking at PyData London that year, and the topic was Deep Learning.

Of all things, there was one thing we were super curious to know: which font is used on the boards of the buses?

Picture courtesy — Wikipedia

This was one of the most legible, crisp, clean fonts we had ever seen! We quickly googled it and found out it was Johnston. That was easy! However, I see a lot of instances where we will not be able to google and find out the font. Given that I’m a deep learning practitioner, why not use it to classify fonts? …


Photo by Thomas Kvistholt on Unsplash

A common problem with using the latest release of any framework is that there are few or no adopters, and the docs are either not updated or point to older versions. We encountered a similar problem while integrating the MongoDB driver with Apache Spark 2.x. The majority of the library docs available as of today work only with Spark 1.5+.

All we wanted to do was create a dataframe by reading a MongoDB collection. After a lot of googling, we figured out there are two libraries that support such an operation:

We decided to go ahead with the official Spark Mongo connector as it looked straightforward. …


Photo by Mika Baumeister on Unsplash

The bane of using bleeding-edge technology is that information about new features in the latest version is scarce or hard to find. We at Unnati use bleeding-edge releases of many data science tools for various research and production systems. In this post we explain how to add external jars to an Apache Spark 2.x application.

In Spark 2.x, we can use the --packages option to pass additional jars to spark-submit. Spark looks through the local ivy2 repository for the jar; if it is missing, Spark pulls the dependency from the central Maven repository.

$SPARK_HOME/bin/spark-submit --packages org.mongodb.spark:mongo-spark-connector_2.10:2.0.0 <py-file>

In the above example, we are adding the MongoDB Spark connector. This works perfectly fine. However, there are scenarios where Spark is used as part of a Python application. In this case, we will use SparkContext to specify the configuration. …
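
As a rough sketch of what configuring this from inside a Python application might look like, the snippet below builds the same package request in two commonly used ways: the `spark.jars.packages` configuration property and the `PYSPARK_SUBMIT_ARGS` environment variable. The helper names `packages_conf` and `submit_args` are hypothetical, and this is not necessarily the approach the full post describes. The pyspark calls are left as comments so the sketch runs without a Spark installation.

```python
import os

# Maven coordinate taken from the spark-submit example above.
MONGO_CONNECTOR = "org.mongodb.spark:mongo-spark-connector_2.10:2.0.0"


def packages_conf(*coordinates):
    """Return Spark settings that request the given Maven packages.

    `spark.jars.packages` is the configuration property behind the
    --packages flag; it takes a comma-separated list of coordinates.
    (Hypothetical helper, for illustration only.)
    """
    return {"spark.jars.packages": ",".join(coordinates)}


def submit_args(*coordinates):
    """Build a PYSPARK_SUBMIT_ARGS value equivalent to --packages.

    (Hypothetical helper, for illustration only.)
    """
    return "--packages {} pyspark-shell".format(",".join(coordinates))


# Option 1: settings to hand to a SparkConf.
conf_settings = packages_conf(MONGO_CONNECTOR)

# Option 2: environment variable read when pyspark launches the JVM.
# It must be set before the first SparkContext is created.
os.environ["PYSPARK_SUBMIT_ARGS"] = submit_args(MONGO_CONNECTOR)

# With pyspark installed, option 1 could be applied like this:
#
#   from pyspark import SparkConf, SparkContext
#   conf = SparkConf().setAppName("mongo-example")  # app name is made up
#   for key, value in conf_settings.items():
#       conf = conf.set(key, value)
#   sc = SparkContext(conf=conf)
```

Only one of the two options is needed in practice; both are shown here because which one is more convenient depends on how the application creates its SparkContext.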


Booking a hotel is a fairly involved decision. You want to ensure that you are making an informed decision. However, unlike most e-commerce purchases, the product on offer here (a hotel) isn’t as standardised (yet), and thus there are a lot of factors to consider.

Some of the commonly asked questions before booking a hotel include -

These are questions where the answer isn’t exactly a straightforward yes or no, but instead lies on a spectrum and needs careful consideration of all the factors involved. That’s why every online hotel booking platform shares copious amounts of information about a hotel: reviews and ratings, pictures, amenities, qualitative descriptions, etc. …


How to fix word spacing?

The word cloud is one of the most common visualizations we see today, especially in social media analytics. Open source libraries like D3JS have made developers' lives easier. With these libraries we can quickly wire up data and get beautiful visualizations. Thanks to Mike Bostock for giving the community D3JS and http://bl.ocks.org. With bl.ocks, we have a plethora of visualizations from the community, open to the public along with their implementations.

This library from Jason Davies, https://github.com/jasondavies/d3-cloud, can help you build a word cloud in 5 minutes or less. A big thank you to Jason for this handy, neat library. …


Image courtesy: http://unsplash.com

What?

Word embedding is a technique for converting words to vectors in a high-dimensional space. In simple terms, in each dimension, we group words based on a particular aspect (gender, colour, etc.) and score the words based on similarity in that space.

For example: “I have a red car, maroon shirt and a grey bicycle”

One of the dimensions can represent colour. Red, maroon and grey are assigned similar scores, while the rest of the words will have very different scores. Another dimension can represent the type of object. Car and bicycle are assigned similar scores because they are both vehicles.

The output of word embedding is simply a high-dimensional matrix. …
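
To make the idea concrete, here is a toy sketch with hand-assigned scores for some of the words in the example sentence. The two dimensions (colour and vehicle) and every number in it are made up for illustration; real embeddings are learned from text and have far more dimensions. Cosine similarity is used here as one common way to compare word vectors.

```python
import math

# Hypothetical 2-dimensional embeddings: [colour score, vehicle score].
# The values are invented for illustration, not learned from data.
embeddings = {
    "red":     [0.9, 0.0],
    "maroon":  [0.8, 0.1],
    "grey":    [0.7, 0.0],
    "car":     [0.1, 0.9],
    "bicycle": [0.0, 0.8],
}

# The "matrix" view: one row per word, one column per dimension.
words = sorted(embeddings)
matrix = [embeddings[word] for word in words]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Two colour words point in nearly the same direction...
print("red vs maroon:", cosine_similarity(embeddings["red"], embeddings["maroon"]))
# ...while a colour word and a vehicle word do not.
print("red vs car:   ", cosine_similarity(embeddings["red"], embeddings["car"]))
print("car vs bicycle:", cosine_similarity(embeddings["car"], embeddings["bicycle"]))
```

With these made-up scores, the colour words end up close to each other and far from the vehicle words, which is the grouping behaviour described above.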


Artificial Neural Networks (ANNs) have totally changed what computers are capable of learning. Though neural networks date back to the 1940s, their applications have grown at an astonishing rate in the last 5–10 years.

Artificial neural networks are modeled on the functioning of the human brain, where the input is converted into output through a series of transformations. Though they are capable of achieving complex tasks, the way they work is fairly straightforward.

Three main concepts which explain the working of neural networks:

Neuron

This is a simple computation unit which takes one or more inputs and produces an output. The function that transforms the input into the output is generally a simple logistic function. …
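
A minimal sketch of such a neuron is shown below. The weighted sum with a bias term is an assumption about how the inputs are combined before the logistic function is applied (the text above only names the logistic function), and the input and weight values are arbitrary.

```python
import math


def logistic(z):
    """Logistic (sigmoid) function: squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron(inputs, weights, bias=0.0):
    """Single neuron: weighted sum of the inputs, then the logistic function.

    The weights and bias are assumed parameters for this sketch; in a real
    network they would be learned during training.
    """
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return logistic(z)


# Arbitrary example values, chosen only for illustration.
output = neuron(inputs=[0.5, -1.0, 2.0], weights=[0.4, 0.3, 0.1])
print(output)
```

Whatever the inputs, the output stays strictly between 0 and 1, which is what makes the logistic function a convenient choice for a unit like this.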


April 26, 2015

Introduction

I started a side project in Scala with a group of friends (noobs in Scala). We chose Scala because it is well known for type safety and functional programming with support for OOP. One of the important parts of the project was talking to a REST API which returned JSON responses.

We began our hunt for efficient JSON parsers for Scala and soon we were flooded with libraries:

With so many options, we were confused! Thanks to this wonderful post from the Ooyala Engineering team for putting up a nice comparison of libraries. Finally, we decided to go ahead with json4s because we found it handy for extracting objects out of the JSON, and also because of its support for Jackson (faster parsing). …


January 22, 2015

Every time I look at the examples page of D3, I’m simply go…

@mbostock has transformed how visualizations are created for the web.

Today I learnt how to use SVG markers with D3. I was using the force layout to analyze graphs, just like this example. But I wanted a directed graph!

Later, I came across another example which had direction. I was happy because a ready-made solution solved the problem. But I soon ran into a problem, as I wanted a custom tree-like structure with every path being directed, i.e. …


September 7, 2014

I always wanted to set up a media server at home for the following reasons:

The easiest solution was to turn my Raspberry Pi into a DLNA server. For this I required a few basic packages and had to configure each.

It was a bit hard to find all of them in a single post, and hence I’m writing this post. …

About

Raghotham Sripadraj

Data Scientist | Technology Enthusiast
