Insights From the Apple App Store

Introduction
With the myriad applications on app stores, choosing an application for a particular purpose is no easy task. One often finds oneself browsing through reviews and checking comments before making the final decision on what app to download. If downloading can be so ‘not easy’, imagine how much more daunting publishing an application on the app store can be!
A company with a team of competent and ready developers wanting to develop a mobile application will have to ask questions like:
What genre of applications should be built?
Should it be free or paid?
How big should the application be? And many more questions.
Answering these questions will help them make decisions on what to build.
Also, an application is not published on the app store just to be there. It is published to be downloaded and used. It is, however, not strange to see applications with almost no downloads.
The importance of app ratings and reviews cannot be overemphasized. According to a survey conducted by Apptentive, a company which uses customer feedback to help companies increase their app downloads:
92% of the top 100 paid apps have at least a 4 star rating.
98% of the top 100 free apps have at least a 4 star rating.
Having an app rated 4 stars and above is therefore strongly associated with making it into the top 100 of its category, be it paid or free.
In this post, we are going to talk about the Apple App Store based on data obtained from Kaggle. We are going to analyse the data, look at different genres of apps, discuss some factors affecting app ratings, and try to predict if an app is a winner (has a rating of 4 or above) or a loser (has a rating below 4). Here are some questions we shall answer:
1. What app genre generates the most revenue?
2. How are prices of the apps distributed?
3. Can we predict if our app will be a winner or a loser?
Question 1: What app genre generates the most revenue?
To answer this question, we will assume that the number of ratings is equal to the number of downloads of each application. This is only a rough proxy, since not everyone who downloads an app rates it. Analyzing the data, we find that the app store has 23 genres of applications (it could be more or fewer now, but I bet it won’t differ by much).
The pie chart shows us the percentage composition of each genre.

An outstanding 53.7% of applications are games. The entertainment and education categories come next. The genre with the fewest applications is the ‘catalogs’ genre. Not to fear, however: though there are more game applications, games are not the most downloaded. Social media applications are still the most downloaded in the world according to this Business Insider article. So if you think your app is the next WhatsApp or Facebook, or you have a disruptive idea in the catalogs genre, or in a new genre altogether, go for it!
Also, since there are more games than apps of all other genres combined, we can expect the most rated apps to be games. The genres with the highest average rating, however, are productivity, music and business, in that order. The catalogs genre is still at the bottom of the scale for average rating, though.

The most common rating per genre is 4.5, except for a few genres, like finance, that have 0 as their most common rating.
Since the games genre has the most apps, and the productivity genre has the highest average rating, my hypothesis is that either the games genre or the productivity genre generates the most revenue.

From our analysis, we see that the hypothesis made earlier was true. The genre that generates the most revenue is the games genre, followed by the productivity genre.
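Here is a minimal sketch of how such a revenue estimate could be computed, assuming revenue per app is approximated as price times number of ratings (the download proxy stated earlier). The rows and the field names `genre`, `price` and `rating_count` are made up for illustration and are not taken from the dataset.

```python
from collections import defaultdict

# Toy rows, invented for illustration only (not real App Store data).
apps = [
    {"genre": "Games", "price": 0.99, "rating_count": 1000},
    {"genre": "Games", "price": 0.00, "rating_count": 5000},
    {"genre": "Productivity", "price": 4.99, "rating_count": 100},
    {"genre": "Catalogs", "price": 0.00, "rating_count": 10},
]


def estimated_revenue_by_genre(rows):
    """Sum price * rating_count per genre, treating ratings as a download proxy."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["genre"]] += row["price"] * row["rating_count"]
    # Sort genres from highest to lowest estimated revenue.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


ranking = estimated_revenue_by_genre(apps)
for genre, revenue in ranking:
    print(f"{genre}: {revenue:.2f}")
```

Note that under this estimate a free app contributes nothing, no matter how popular it is, so money made from in-app purchases or ads is not counted.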
Question 2: How are prices of the apps distributed?
We already know that we have paid and free apps. Would it not be interesting to know how many people actually pay for apps? Have you, the reader, ever paid for a mobile app?
Well, questions aside, let us provide answers. But before that, one last question: should your killer app be paid or free?
So are there more free or paid apps? Well, our hypothesis is that there are more free apps than paid ones.

On the x-axis, ‘0’ means the app is free, and ‘1’ means it is a paid app. 56.4 percent of the apps are free, while 43.6 percent are paid. Our earlier hypothesis was true: there are indeed more free apps in our dataset than paid apps. The difference is not that large, though (fewer than 1000 apps).
Now let us look at the price distribution.

So most apps cost less than 25 US dollars, and usually much less, given that the average price of all the apps (paid and free) is approximately 1.73 dollars. The average for paid apps only is approximately 3.96 dollars. This should help inform us when deciding on a price, if we decide on a paid app.
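The two averages are linked by the share of paid apps. Free apps add nothing to the sum of prices, so the average over all apps is just the paid-only average scaled by the paid share. A small check using only the figures quoted in this post (the variable names are mine):

```python
# Figures quoted in this post.
paid_share = 0.436       # fraction of apps in the dataset that are paid
mean_price_paid = 3.96   # average price of paid apps, in USD

# Free apps cost 0, so they add nothing to the sum of prices:
# mean over all apps = paid_share * mean over paid apps.
mean_price_all = paid_share * mean_price_paid
print(round(mean_price_all, 2))  # 1.73
```

The result matches the 1.73 dollar average for all apps, so the numbers are consistent with each other.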
The maximum price for an app in our dataset is 299.99 US dollars, and the minimum, as we know, is 0. 299.99 US dollars is small compared to the most expensive apps on the app store now (like CyberTuner, Agro and many more), which cost up to 999.99 US dollars.
If you believe the app you have in mind falls in the category of expensive apps, you could set your price to 999.99 USD too, but please do deliver the promised value.
The most downloaded apps are not the expensive apps, or the games, though. It seems social media still wins. It may not be easy competing with WhatsApp or Facebook, but if economic, geographical and political conditions favor you, head right in. You may just be the next TikTok.
Those who pay for apps seem to be more concerned with ratings than those who don’t, the reason being that the proportion of paid apps that have been rated is higher than that of free apps.

Remember to factor that in when making the decision.
Question 3: Is it possible to predict if an app will be a winner or a loser?
To be succinct, the answer is yes, but not with 100% confidence.
So we built a ‘machine’ called a classifier (it does just that: it classifies items into categories) that helps answer questions with 2 possible outcomes, like yes/no, true/false, or highly rated/not highly rated.
The classifier predicts whether an app will be a winner or a loser with 69% accuracy. It is not the most accurate, but it is not that bad either.
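To make the accuracy figure concrete, here is a small sketch of how accuracy is computed and how it compares with a naive baseline that always predicts the most common class. This post does not say which classifier was used or how it was evaluated, so the ratings, the predictions and the helper names `is_winner` and `accuracy` below are all invented for illustration.

```python
from collections import Counter


def is_winner(rating, threshold=4.0):
    """Label an app a winner when its rating is at or above the threshold."""
    return rating >= threshold


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    matches = sum(t == p for t, p in zip(y_true, y_pred))
    return matches / len(y_true)


# Invented ratings and predictions, for illustration only.
ratings = [4.5, 3.0, 4.0, 2.5, 5.0, 3.5, 4.5, 0.0, 4.0, 4.5]
y_true = [is_winner(r) for r in ratings]
y_pred = [True, False, True, True, True, False, False, False, True, True]

# Baseline: always predict the most common class.
majority = Counter(y_true).most_common(1)[0][0]
baseline_pred = [majority] * len(y_true)

model_acc = accuracy(y_true, y_pred)
baseline_acc = accuracy(y_true, baseline_pred)
print(f"model: {model_acc:.0%}, baseline: {baseline_acc:.0%}")
```

An accuracy number is easier to judge next to such a baseline: a classifier is only useful to the extent that it beats always guessing the majority class.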
Whether an app makes it to the winners list depends on a couple of factors. In more technical language, these factors are called features, and in our case they are just some of the columns in our data.
Some features are more important than others in predicting the rating. We discovered, for example, that most apps show 5 screenshots for display, and that most highly rated apps indeed had 5 screenshots shown for display. Apps with no screenshots are more likely to have a user rating of 0.0.
The most determining criteria, in descending order, are the genre and the number of screenshots shown for display. The number of languages and the price have an effect, but not that much.
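One simple way to see how a single feature relates to ratings is to group apps by that feature and compute the share of winners in each group. The sketch below does this for the number of screenshots. The data pairs and the function name `winner_share_by_group` are invented for illustration, and this is a descriptive check only, not the importance measure used by the classifier.

```python
from collections import defaultdict

# Invented (screenshot_count, user_rating) pairs, for illustration only.
apps = [(5, 4.5), (5, 4.0), (5, 3.5), (0, 0.0), (0, 0.0), (3, 4.0), (3, 3.0)]


def winner_share_by_group(rows, threshold=4.0):
    """Return {group: fraction of apps rated at or above threshold}."""
    counts = defaultdict(lambda: [0, 0])  # group -> [winners, total]
    for group, rating in rows:
        counts[group][0] += int(rating >= threshold)
        counts[group][1] += 1
    return {group: winners / total for group, (winners, total) in counts.items()}


shares = winner_share_by_group(apps)
for screenshots, share in sorted(shares.items()):
    print(f"{screenshots} screenshots: {share:.0%} winners")
```

The same function works for any other feature, such as genre or number of languages, by swapping the first element of each pair.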
Here is how a few features vary with user ratings.


Conclusion
In conclusion, building an app and publishing it on the app store is no easy task. After publishing on the app store, one has to market it and hope for good ratings. It is true that nothing really beats having a good app that delivers on its promise, but drawing insight from data such as this can help one better position oneself for success.
So if you want to build an application, here is what I suggest:
Choose an appropriate genre
Deliver on what you promise (failure to do so generally results in one star ratings and bad reviews)
When publishing, use 5 screenshots to really show what your app is about
Choose the language depending on your market.
And be as backward compatible as possible (support a good number of devices, depending on your market).
Thanks for reading. Hope this information helps someone :)
All the technical details and answers to more questions can be found in this GitHub repository.