Mobile Analytics using Python

Calculate your key app usage metrics outside paid license analytics tools

Published in Multiply · 3 min read · Sep 17, 2019

Wondering how to check the output your analytics tool is giving you?
Wondering if you should bother buying one and hooking it up to your app?
I completely empathise with you!

I’m Andreas, a chemical engineer who started his career optimising refinery units and then moved into consulting and data science, working on projects in the utilities, distribution and finance sectors. I have recently taken up the exciting role of sole data scientist at Multiply (an automated financial advice app).

What I have realised during this time (a key lesson, I can assure you) is that no company has a perfect data infrastructure, not even the companies selling data to other companies.

Being tasked with figuring out the validity of my company’s mobile analytics tool is what inspired me to create this series of posts. I’m going to share Python code for calculating key metrics offered by many analytics tools that you would otherwise have to pay for.

What is important to appreciate is that no database was built or optimised to serve your analysis. Each product has its own data requirements and its own production database, which was most probably not designed with you in mind.

Our App Usage Data Architecture

Over-simplified events data flow architecture for our app

Multiply: Native App
Segment: Data integration tool
MixPanel: Our chosen web analytics tool

Advantages of this data architecture:

✅ No need to build and maintain a database to store tracked events

✅ No need to design an ETL process for tracked events

✅ MixPanel’s out-of-the-box functionality

Disadvantages of this data architecture:

❌ User/events (app usage) data sits with a 3rd party

❌ App usage data can only be queried in the language defined by the analytics tool, subject to hourly/daily query caps (JQL, a JavaScript-based query language, for MixPanel; how many business analysts know how to write JQL queries?)

❌ If you have to update the raw app usage data for any reason, you need to familiarise yourself with the analytics tool’s API (if the data were instead stored in a cloud database, the task would be more straightforward)

What made me switch from MixPanel to local analysis?

It was not very clear to me how calculations were done in the background
The exclusion of specific users prior to any analysis using user cohorts was not straightforward (filtering on properties was very limited)
CAC calculations required data from our marketing channels

Key findings that came out of the local analysis

Duplication of users caused by MixPanel
Periodic user activity was hard to calculate in MixPanel
For metrics that do not come out of MixPanel’s standard toolbox, calculating them using Python/SQL is much easier than writing a JQL query
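To give a feel for the first two findings, here is a minimal sketch that collapses duplicated user IDs and then counts daily active users. The event tuples, the event names and the `ALIASES` mapping are all hypothetical and made up for illustration; they are not MixPanel’s schema, and the later posts may approach this differently.

```python
from collections import defaultdict
from datetime import datetime, timezone

# Hypothetical raw events: (user_id, event_name, unix_timestamp_seconds)
events = [
    ("u1", "App Opened", 1568678400),      # 2019-09-17 00:00 UTC
    ("u1-dup", "App Opened", 1568682000),  # same person under a second ID
    ("u2", "App Opened", 1568685600),
    ("u2", "Goal Created", 1568772000),    # 2019-09-18 02:00 UTC
]

# Hypothetical mapping from duplicate IDs to one canonical ID per person
ALIASES = {"u1-dup": "u1"}


def daily_active_users(events, aliases):
    """Return {date: number of distinct canonical users active that day}."""
    users_by_day = defaultdict(set)
    for user_id, _event, ts in events:
        canonical = aliases.get(user_id, user_id)
        day = datetime.fromtimestamp(ts, tz=timezone.utc).date()
        users_by_day[day].add(canonical)
    return {day: len(users) for day, users in sorted(users_by_day.items())}


dau = daily_active_users(events, ALIASES)
for day, count in dau.items():
    print(day, count)
```

Without the alias mapping, the first day would report three active users instead of two, which is the kind of inflation duplicated users can cause.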

Required steps for analysis

To carry out the local analysis, you need to have access to the app usage data or have a local cache on your machine. In my case, I created a local Postgres database and pushed MixPanel’s raw data and daily Facebook Ads Insights to it.
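As a rough sketch of that step, the snippet below loads a newline-delimited JSON export into a local table and runs a first query against it. It makes several assumptions: the export layout (an `event` name with everything else under `properties`), the sample records and the `events` table are hypothetical, and an in-memory SQLite database stands in for Postgres so that the example runs without a server.

```python
import json
import sqlite3

# Hypothetical export: one JSON object per line (assumed layout)
RAW_EXPORT = """\
{"event": "App Opened", "properties": {"distinct_id": "u1", "time": 1568678400}}
{"event": "Goal Created", "properties": {"distinct_id": "u2", "time": 1568772000}}
"""


def load_events(conn, raw_lines):
    """Create the events table if needed and insert one row per exported line."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events ("
        "distinct_id TEXT, event TEXT, time INTEGER, properties TEXT)"
    )
    rows = []
    for line in raw_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines in the export
        record = json.loads(line)
        props = record["properties"]
        # Keep the full properties blob so nothing is lost on the way in
        rows.append(
            (props["distinct_id"], record["event"], props["time"], json.dumps(props))
        )
    conn.executemany("INSERT INTO events VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")  # stand-in for the local Postgres database
inserted = load_events(conn, RAW_EXPORT.splitlines())
events_per_user = conn.execute(
    "SELECT distinct_id, COUNT(*) FROM events "
    "GROUP BY distinct_id ORDER BY distinct_id"
).fetchall()
print(inserted, events_per_user)
```

Once the events sit in a SQL table like this, user exclusions and custom metrics become ordinary `WHERE` and `GROUP BY` clauses.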

What follows

The posts that follow are tutorials (with code) outlining how to calculate and visualise the metrics listed below. I created this series because, for many of the metrics I needed to work out, I could not find a clear answer online.

User acquisition stats
User retention with cohort analysis
User conversion funnel
User journey (sankey) diagram
Customer Acquisition Cost (rolling mean), integrating Facebook Manager ads insights (coming soon)
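Since the CAC post is not yet available, the following is only one possible reading of a rolling CAC, not the method the series will use: total ad spend over a trailing window divided by the new users acquired in that window. The function name, the window length and the sample figures are assumptions made for the sketch.

```python
from collections import deque


def rolling_cac(daily_spend, daily_new_users, window=7):
    """Trailing-window CAC: spend in the window divided by new users in it.

    Returns None for days whose window contains no new users.
    """
    spend_win = deque(maxlen=window)
    users_win = deque(maxlen=window)
    result = []
    for spend, users in zip(daily_spend, daily_new_users):
        spend_win.append(spend)
        users_win.append(users)
        total_users = sum(users_win)
        result.append(sum(spend_win) / total_users if total_users else None)
    return result


# Made-up daily ad spend and daily new users
spend = [100.0, 50.0, 0.0, 150.0]
new_users = [10, 0, 5, 15]
print(rolling_cac(spend, new_users, window=2))  # [10.0, 15.0, 10.0, 7.5]
```

Dividing the windowed sums, rather than averaging daily ratios, avoids division by zero on days with spend but no new users.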

Code snippets are provided within each post, and the whole GitHub repo can be found here:
https://github.com/atsangarides/mobile-analytics

Needless to say, with all of the above implemented, you have the start of your own mobile analytics dashboard.