Thoughts on data, technology, startups, and, oftentimes, other things.

Obtaining neighborhood definition polygons from the Wikimapia API and working with them in R

Jul 25, 2016

The famous Japanese puzzle has been around since the 19th century. However, it wasn’t until the late ’90s that a computer program was written…

Muhammad Atef

Feb 15, 2016

Create Basic Sunburst Graphs with ggplot2

Solving polar coordinates’ biggest problem of all time (aka text positioning)

Yahia El Gamal

Feb 28, 2016

The Doorman

I want to tell you the story of our doorman: a diligent, kind-hearted chap who is half Slack, half Raspberry Pi.

Yusuf Saber

Jan 3, 2016

Latest

Muhammad Atef

Jun 20, 2016

Submit Kaggle Solutions From the Command Line Using PhantomJS

Allowing you to submit programmatically, even from your remote machine.

You might have participated in some Kaggle competitions where the dataset was in…

Saher El-Neklawy

Jan 14, 2016

Azure Storage Blob Management

A simple CLI for the terminal-comfortable

6 Lessons Learned at Web Summit 2015

A few weeks ago I was one of over 40,000 people who descended upon Dublin to be part of the Web…

Semi-Automated Text Cleaning in R

An Open Refine to R Compiler

Cleaning real-life textual data is hard. Whether it’s convention inconsistencies, manual data entry mistakes, or a myriad of other reasons, reaching a consistent representation is essential. This…

Muhammad Atef

Dec 22, 2015

Model Calibration

The performance of a trained classification model can be measured in several ways. Accuracy is one important aspect that is…

Mohamed Bassem

Dec 19, 2015

The Distributed R Console

As a data science startup, we write a lot of R scripts. Since we often work with very large amounts of data, our R scripts usually have high CPU and memory usage. Moreover, these R scripts may take hours or even days to finish. It doesn’t, therefore, make sense to run them on our…

Saher El-Neklawy

Dec 17, 2015

Writing your own dplyr functions

dplyr is awesome, like really awesome. The thing I like most about it is how readable it makes data processing code look. In short, there are two primary aspects that make dplyr great for readability (in addition to its great performance, data back-end agnosticism, and…

Defining Your Own R Operators

for better readability and for its sheer awesomeness!

Say you are on your R console, writing some R code that will conquer the world! Let’s say you reached the point where you want to check the non-existence of the…