[ The Lord of the Rings: An F# Approach ] An Introduction to the Blogposts

Dec 23, 2017

One of my winter traditions is to re-read The Lord of the Rings book series and binge-watch the movies, as both series have been an integral part of my childhood.

It only made sense for me to apply what I recently learnt in my Data Science classes at the University of Washington [ I am also soon to start a Master’s program in Data Science at the University of Wisconsin ] and the F# I’d picked up this past year to my favorite book and movie series, in the form of a series of 3 blogposts as my contribution to 2017’s FSharp Advent Calendar.

Needless to say, I have been ecstatic to be working on the blogposts from the moment Sergey Tihon gave me the thumbs up to contribute to this year’s FsAdvent Calendar. It’s been an awesome few weeks working on the three blogposts, and I am extremely glad to be sharing my journey!

Blogposts

The three blogposts I am going to be presenting as a part of my contribution are:

1. The Path of the Hobbits

The Path of the Hobbits incorporates data from the Lord of the Rings and Hobbit Movie series to quantitatively answer the question:

Which Movie Series is better: The Lord of the Rings or The Hobbit Series?

2. The Path of the Wizard

The Path of the Wizard uses character-based data to build a model using the K-Nearest Neighbors classification algorithm and Levenshtein distances to:

Try to predict the Race of a Character by the Name.

3. The Path of the King

The Path of the King involves using data from the Lord of the Rings Book series and the Scene-by-Scene Character Interactions from the Movie script to first do some text mining and then to quantitatively answer the question:

Which relationship among the chosen members in The Lord of the Rings was the best one?

All three of the blogposts involve some form of application of Data Science and highlight the process from data acquisition to actual hypothesis formulation in F#.
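
To give a flavour of what this looks like in F#, here is a minimal sketch of the idea behind The Path of the Wizard: classifying a name by majority vote among its nearest neighbours, with Levenshtein distance as the measure of closeness. This is not the code from the blogpost itself. The function names `levenshtein` and `classify` and the tiny hand-written `training` list are assumptions made purely for illustration.

```fsharp
// Illustrative sketch only: the names, labels and function names here are
// assumptions for this example, not the dataset or code used in the blogposts.

/// Levenshtein (edit) distance between two strings, computed row by row.
let levenshtein (a: string) (b: string) : int =
    // Row 0: turning the empty prefix of a into the first j characters of b costs j inserts.
    let mutable previous = Array.init (b.Length + 1) id
    for i in 1 .. a.Length do
        // At this point previous.[j] is the distance between the first (i - 1)
        // characters of a and the first j characters of b.
        let current : int[] = Array.zeroCreate (b.Length + 1)
        current.[0] <- i
        for j in 1 .. b.Length do
            let cost = if a.[i - 1] = b.[j - 1] then 0 else 1
            current.[j] <-
                min (min (current.[j - 1] + 1)   // insertion
                         (previous.[j] + 1))     // deletion
                    (previous.[j - 1] + cost)    // substitution (or match)
        previous <- current
    previous.[b.Length]

/// Predict a label by majority vote among the k training names closest to the query.
/// Assumes the training list is non-empty and k is at least 1.
let classify (k: int) (training: (string * string) list) (query: string) : string =
    training
    |> List.sortBy (fun (name, _) -> levenshtein name query)
    |> List.truncate k
    |> List.countBy snd
    |> List.maxBy snd
    |> fst

// A tiny hand-written training set of (name, race) pairs.
let training =
    [ "Frodo", "Hobbit"; "Bilbo", "Hobbit"; "Samwise", "Hobbit"
      "Legolas", "Elf"; "Elrond", "Elf"; "Galadriel", "Elf"
      "Gimli", "Dwarf"; "Gloin", "Dwarf"; "Thorin", "Dwarf" ]

let predicted = classify 3 training "Drogo"
printfn "Predicted race for Drogo: %s" predicted
```

With only nine names this is a toy, and ties between equally distant names are broken by their order in the list, so a real model would need far more data and some care in choosing k.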

Common Libraries

One of the main reasons I decided to write blogposts on Data Science and F# was to show how easy it now is to conduct data analysis with libraries such as RProvider, Deedle and F# Charting, all a part of FsLab.

Additionally, for the data acquisition, I used FSharp.Data: the HTML Type Provider to easily parse and structure data from different websites, and the CSV Type Provider whenever required.

I mention multiple times in each of the three blogposts that the data acquisition step was probably the most time-consuming one, but what significantly lightened the burden was F#’s simple yet elegant libraries, which did a lot in a few lines of code.

Acknowledgements

This past September, I was fortunate enough to take some time off work and attend the first OpenFsharp conference in San Francisco where I met some of my F# heroes. It was there that I decided that it was time for me to finally give back to this awesome community by contributing to this year’s FSharp Advent.

Apart from Scott Wlaschin’s workshop on Domain Modelling Made Functional, Jamie Dixon’s presentation on using F# for data analysis to improve his son’s Stock Car race ranking really stood out as unique and extremely enticing. Inspired by his application of F# and Data Science, I decided to apply the same to my interest in Tolkien’s mythology.

I also want to thank the company I work at, Susquehanna International Group. I have been lucky enough to have some great mentors who have assisted me through my journey of professional development.

I hope you enjoy reading the blogposts as much as I enjoyed writing them! Happy holidays!