Anomalies Are All About Context
One of the most contextual data sources at this moment is the mobile phone. Data from apps is one thing, but not many providers let you download your mobile data consumption, text message entries and registered phone calls. So thank you, Mobile Vikings. Anomalies are everywhere in data. No matter what, there’s always something odd about some numbers. Making sense of them is all about context, and that is what this article is about.
If you haven’t read the first part, I recommend starting there.
Small Recap
The data is metadata with details. It contains no text from the text messages, only timestamps and types of interactions over 5.75 years, from 2010 up to October 2015: 46,209 records in total.
Here’s what’s in the data source available:
- The type of interaction (outgoing phone call, incoming phone call, sms sent, sms received, data)
- Timestamp start and finish of phone calls
- The number you called or received a call from
- Type of bundle
- The price of the bundle
| Year | 2010 | 2011 | 2012 | 2013 | 2014 | 2015 |
| --- | --- | --- | --- | --- | --- | --- |
| Records | 8907 | 9154 | 7836 | 6577 | 6431 | 7304 |
| Data | 463 | 302 | 391 | 467 | 2798 | 4681 |
| Incoming calls | 243 | 259 | 525 | 487 | 408 | 351 |
| Outgoing calls | 309 | 272 | 577 | 498 | 489 | 360 |
| Incoming text messages | 4208 | 4529 | 3450 | 2727 | 1488 | 982 |
| Outgoing text messages | 3684 | 3792 | 2893 | 2398 | 1248 | 930 |

In this article, I’ll be focusing on the incoming and outgoing calls, as well as the text messages.
To give you some more context about the data, here is a graph:

What Are Anomalies Exactly
Anomalies are numbers or events that deviate from what is normal or standard. In every dataset, there is always some number of anomalies present over a certain period of time. Generally, the longer the period of time and thus the larger the dataset, the more chance you have of bumping into anomalies.
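One common way to make "deviates from the normal" concrete is a z-score: how many standard deviations a value sits from the mean. The sketch below is only an illustration, not the method used for this article; the daily counts and the threshold of 2 are made-up assumptions.

```python
from statistics import mean, stdev


def find_anomalies(counts, threshold=2.0):
    """Return (index, value, z_score) for values far from the mean.

    `threshold` is the number of sample standard deviations a value
    must deviate before it is flagged; 2.0 is an arbitrary choice.
    """
    mu = mean(counts)
    sigma = stdev(counts)
    if sigma == 0:
        return []  # every value is identical, nothing deviates
    flagged = []
    for index, value in enumerate(counts):
        z_score = (value - mu) / sigma
        if abs(z_score) > threshold:
            flagged.append((index, value, z_score))
    return flagged


# Hypothetical number of text messages per day (not from the real dataset).
daily_counts = [4, 6, 5, 3, 7, 5, 44, 6, 4, 5]

anomalies = find_anomalies(daily_counts)
for index, value, z_score in anomalies:
    print(f"day {index}: {value} messages (z = {z_score:.2f})")
```

Keep in mind that a big spike also pulls the mean and the standard deviation towards itself, so on short series a simple rule like this one can miss smaller deviations.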
Some Reasons Why Anomalies Can Occur
There are a few reasons, and each of them has to be placed in its context. One by one:
- Bad data quality (something went wrong with the script for instance in a particular period of time)
- The weather was extremely nice that day (and so we sold more ice cream than ever before — dixit the iceman)
- We gave all our posters away for free that day (some marketing action spikes traffic to the website)
- The electric bill was particularly low this winter month (problem with the pipes)
- …
Actually, if the anomaly is due to bad data or a fault in measurement, it should be called an outlier.
“In statistics, an outlier is an observation point that is distant from other observations. An outlier may be due to variability in the measurement or it may indicate experimental error; the latter are sometimes excluded from the data set.” (Wikipedia)
Anomalies Can Help You To Recreate Context
One of the hardest things to do in marketing is remembering what has changed in the past. For instance, you’re working on a Facebook campaign and are making changes every day over a period of 30 days. Are you keeping track of every bit of knob twiddling you do to improve the campaign? I don’t. It’s hard. It’s time consuming. So… what do you do? Keep track of the biggest anomalies. The changes that make an impact. The changes that deviate from the standard! Actually, I would dare to state that growth hacking is all about detecting anomalies and making use of them.
Every impactful change has a certain reason. That reason can be translated into context. It could have been more budget, a bigger target group, a better message, maybe a more compelling picture or video that matches the message. Those parameters are all part of the context of the person seeing the advertisement.
Anomalies are a very good way to gain insights, recreate context or get more context about a particular event or item. The data source doesn’t even matter!
A Simple Example With Text Messages
I’ve sorted out my text messages from the first nine months of my 2015 mobile data. This was the result, graphing the text messages my girlfriend and I sent to each other in those months:

When making the same graph on a daily basis, the data shows more detail. A totally different outcome. More interesting!

There are two major spikes in text messages: around New Year (which makes sense) and somewhere in May. It is interesting to see that around the beginning of August the number of text messages doesn’t spike but stretches over a longer period of time. Also, something I was aware of: I generally receive more text messages than I send, except on New Year’s Eve. I remember we celebrated separately that year and met at the end of the evening.
New Year’s Eve data
Interesting to see I’ve been communicating a lot more than I used to, especially starting around 3 am, when only two people of our group were left celebrating New Year. So… I think I kinda got bored and wanted to go home, so I texted her, anticipating her arrival and our departure. I can remember it vividly. Not sure what the text message around 5 am was about, though. Sleepwalking? Not sure.
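The difference between the monthly and the daily view comes down to the size of the bucket you count in. Here is a minimal sketch with invented timestamps (none of them come from the real export): two months with the same total, where only the daily count reveals that one of them hides a spike.

```python
from collections import Counter
from datetime import datetime

# Hypothetical message timestamps, invented for this illustration.
messages = (
    [datetime(2015, 5, 3, 12, 0)]
    + [datetime(2015, 5, 24, 23, minute) for minute in range(8)]  # one busy night
    + [datetime(2015, 5, 30, 9, 0)]
    + [datetime(2015, 6, day, 18, 0) for day in range(1, 11)]  # one message a day
)


def count_by(timestamps, fmt):
    """Count timestamps per bucket, where `fmt` is a strftime pattern."""
    return Counter(ts.strftime(fmt) for ts in timestamps)


per_month = count_by(messages, "%Y-%m")
per_day = count_by(messages, "%Y-%m-%d")

print("per month:", dict(per_month))
busiest_day, busiest_count = per_day.most_common(1)[0]
print("busiest day:", busiest_day, "with", busiest_count, "messages")
```

Both months add up to ten messages, so the monthly graph looks flat, while the daily count singles out the one night where most of the texting happened.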
Outgoing text messages Incoming text messages 00:12:44 00:37:41 00:39:26 00:39:33 00:51:20 00:51:35 00:59:40 01:05:16 01:29:11 01:29:15 02:36:41 02:50:32 02:50:35 02:55:24 03:04:53 03:05:19 03:05:26 03:06:06 & 03:06:37 03:07:15 03:08:58 03:09:25 03:21:51 03:23:44 03:23:49 03:24:03 03:39:07 03:39:16 03:39:33 03:46:37 03:46:52 03:46:55 03:46:56 03:46:57 05:09:21
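To put a number on "especially starting around 3 am", the timestamps listed above can be counted per hour and the gaps between consecutive messages measured. This is a rough sketch that merges both directions into one timeline and ignores who sent what; the variable names are my own.

```python
from collections import Counter
from datetime import datetime

# All timestamps from the listing above, sent and received merged together.
RAW = """
00:12:44 00:37:41 00:39:26 00:39:33 00:51:20 00:51:35 00:59:40
01:05:16 01:29:11 01:29:15 02:36:41 02:50:32 02:50:35 02:55:24
03:04:53 03:05:19 03:05:26 03:06:06 03:06:37 03:07:15 03:08:58
03:09:25 03:21:51 03:23:44 03:23:49 03:24:03 03:39:07 03:39:16
03:39:33 03:46:37 03:46:52 03:46:55 03:46:56 03:46:57 05:09:21
"""

# All messages fall after midnight on the same night, so time of day is enough.
times = sorted(datetime.strptime(t, "%H:%M:%S") for t in RAW.split())

# Number of messages per hour of the night.
per_hour = Counter(t.hour for t in times)

# Seconds between each message and the next one.
gaps = [int((later - earlier).total_seconds())
        for earlier, later in zip(times, times[1:])]

for hour in sorted(per_hour):
    print(f"{hour:02d}:00  {per_hour[hour]} messages")
print("shortest gap:", min(gaps), "seconds")
print("longest gap:", max(gaps), "seconds")
```

The hourly counts make the 3 am burst stand out, and the gaps show how quickly messages followed each other inside that burst.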
Pentecost data
Same thing around the end of May. It was Pentecost and we went out separately. I’ll spare you the details, but the numbers are similar. A lot of texting to meet up at the end of the night: 23 received text messages, 21 sent text messages. Average amount of time between interactions: a few seconds.
End of July data
A full week of more texting than usual: a birthday party and a few separate small events.
So in the end… the spikes are centered around all sorts of mini events (which makes sense, no rocket science there). It is interesting to see what context you’re missing out on. Let’s say you’re a retailer and are aiming for one to one communication… you’ll probably never be able to get all the context you need because of its complexity on such a micro level.
This Is The Moment Data Enhancement Could Make Its Entry
Sometimes… well, any time, enhancing your data is a good thing to do. It makes your data more readable, gives it a much more human feel when it comes to contextualizing and is, above all, very handy for filtering! Just to be clear, this is not big data and hasn’t got anything to do with big data whatsoever.
I feel many people still don’t get big data and what it really is. I could explain it in detail, but this article is a very good explanation of how you’ll eventually get to big data: bottom up, starting with micro data. If you understand this, you’ll understand what big data is all about and how enormously ambitious it is as well. The term big data has been abused so many times… including by myself, I have to admit. The article states “macro analytics”, which might be more accurate. At a micro level, there’s also so much context we’re missing out on to conclude things. People often tend to draw conclusions too easily without having the data that allows them to do so.
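As a rough idea of what such an enhancement could look like: take a bare record and add labels a human would use, such as the weekday, the part of the day and a known event. Everything in this sketch is an assumption for illustration: the record layout, the field names, the part-of-day boundaries and the small event lookup are not part of the Mobile Vikings export.

```python
from datetime import date, datetime

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

# Hypothetical lookup of days I know something about.
KNOWN_EVENTS = {
    date(2015, 1, 1): "New Year",
    date(2015, 5, 24): "Pentecost",
}


def part_of_day(hour):
    """Map an hour (0-23) to a coarse, human-readable label."""
    if hour < 6:
        return "night"
    if hour < 12:
        return "morning"
    if hour < 18:
        return "afternoon"
    return "evening"


def enhance(record):
    """Return a copy of the record with extra context fields added."""
    ts = record["timestamp"]
    enriched = dict(record)
    enriched["weekday"] = WEEKDAYS[ts.weekday()]
    enriched["is_weekend"] = ts.weekday() >= 5
    enriched["part_of_day"] = part_of_day(ts.hour)
    enriched["event"] = KNOWN_EVENTS.get(ts.date())  # None if nothing is known
    return enriched


# A hypothetical record in a made-up layout.
record = {"type": "sms sent", "timestamp": datetime(2015, 1, 1, 3, 6, 6)}
print(enhance(record))
```

With fields like these in place, filtering on "weekend nights" or "days with a known event" becomes a one-liner instead of a timestamp puzzle.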
What Mobile Vikings Could Do With This
I don’t think text mining text messages is allowed, but it would be very interesting to see which topics increase the amount of interaction between people, along with the intent and the sentiment!
For instance, if you’re allowed (and let’s go crazy with this) to read messages from your clients and they have trouble finding each other at an event, you could help them locate each other with directions through text messaging or even a separate app. Or give them a few extra megabytes if you see that their data is almost gone and they still haven’t found each other… I know… it’s a moonshot, but hey… why not.
Originally published at driesbultynck.com.