Published in Geek Culture

Interpreting insights through solving Business problems using A/B testing Experimentation

Photo by Smartworks Coworking on Unsplash
· Investigating the Data
· Investigate the data
· Validating the results
· Thank you for reading!

Data is everywhere, and many problems can be solved with it. Industries continuously use data to extract actionable insights that generate revenue or profit, which makes data essential for better business decision-making. In this post I walk through a business problem at a social media company called Yammer. We will use the Yammer dataset hosted in Mode Analytics Studio, which makes the data easy to pull.

Yammer is a social network for communicating with coworkers. In this application, individuals can share documents, updates, and ideas by posting them in groups. Yammer is free to use indefinitely, but companies must pay license fees if they want access to administrative controls, including integration with user management systems like ActiveDirectory. Yammer has centralized analytics teams that focus on providing tools and education that make other teams within Yammer more effective at using data.

The Yammer Analytics philosophy is to constantly consider the value of each individual project: which projects to prioritize, and how product analysts evaluate them against core engagement, retention, and growth metrics. In addition, we will work through an A/B test, the kind of experiment used to keep launching better products to users.

Investigating the Data

The product team wanted to know whether a new feature was ready to launch to the public or needed some improvement first.

Yammer interface

The picture shows an improvement to Yammer's core posting interface at the top of the feed, where users type their messages (the "publisher"). The product team ran an A/B test from June 1 through June 30. Users who logged into Yammer were randomly assigned on the backend: some were shown the old version of the publisher (the control group), while others were shown the new version (the treatment group). On July 1, as a product analyst at Yammer Analytics, you build a chart from the data collected in the database. It shows that message posting is 50% higher in the treatment group, which is a huge increase, as in the chart below.
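To be explicit about what "50% higher" means, here is a minimal Python sketch of a relative lift calculation. The two averages are hypothetical numbers chosen only to reproduce a 50% lift; they are not taken from the Yammer data, and `relative_lift` is an illustrative helper name.

```python
def relative_lift(control_mean: float, treatment_mean: float) -> float:
    """Relative change of the treatment metric compared with control."""
    if control_mean == 0:
        raise ValueError("control mean must be non-zero")
    return (treatment_mean - control_mean) / control_mean


# Hypothetical average messages posted per user (illustration only).
control_avg = 2.0
treatment_avg = 3.0

lift = relative_lift(control_avg, treatment_avg)
print(f"Relative lift: {lift:.0%}")  # prints "Relative lift: 50%"
```

A lift on its own says nothing about significance, which is why the result still needs to be validated.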

Investigate the data

You present this chart to the product team. Before deciding whether the result is meaningful enough to justify a public launch, we will use A/B testing methods to check whether the effect of the treatment on users' posting behavior is significant. We will also examine the following hypotheses that could explain the result:

Photo by Andres Siimon on Unsplash
  1. Novelty effect: users tend to try out a new feature simply because it is new, which raises the percentage of users who post often.
  2. Interference between the control and treatment groups: we expect users to be split randomly between the two groups with no interference between them. On a social network this assumption does not always hold, because users in one group can interact with users in the other. Ideally, users should be independent of one another. Interference can produce a higher share of users who post in the treatment group compared to the control group.
  3. The metric is irrelevant or incorrect, for instance if it mostly reflects the traffic each group receives. For example, would a bigger "Post New Message" button really change the way users post? Choosing a metric that ties the experiment to the real-world problem can be tricky.
  4. The test statistic is calculated incorrectly, which would make our result wrong too. Many statistical methods are available, and misunderstanding a specific method can lead to wrong calculations.
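For the fourth hypothesis, it helps to recompute the test statistic by hand. Below is a minimal Python sketch of a two-sample Welch's t-test on per-user message counts. It is an assumption about how the comparison could be done, not the exact calculation from the Mode report. The p-value uses a normal approximation, which is reasonable only for large samples, and the sample data is made up for illustration.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def welch_t_test(control, treatment):
    """Return (t statistic, two-sided p-value) for treatment vs control.

    The p-value comes from the standard normal distribution, which
    approximates the t distribution when both samples are large.
    """
    n_c, n_t = len(control), len(treatment)
    # Standard error of the difference in means, with unequal variances.
    std_err = sqrt(variance(control) / n_c + variance(treatment) / n_t)
    if std_err == 0:
        raise ValueError("both samples have zero variance")
    t_stat = (mean(treatment) - mean(control)) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(t_stat)))
    return t_stat, p_value


# Hypothetical messages posted per user in each group (illustration only).
control_posts = [1, 2, 2, 3, 2, 1, 3, 2]
treatment_posts = [3, 4, 2, 5, 3, 4, 3, 4]

t_stat, p_value = welch_t_test(control_posts, treatment_posts)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```

With real data, each list would hold one value per user, so that users rather than individual events are the unit of analysis.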

After listing a few factors that could explain the increase in posting, we will investigate the data, which consists of four tables: the users table, the events table, the experiments table, and the normal distribution table.

Below are short summaries of the tables as captured in Mode Analytics; you can see the details of the tables here.

Table 1. Yammer_Users

The Yammer_users table consists of columns such as user_id, created_at, company_id, language, activated_at, and state.

Table 2. Yammer_Events

The Yammer_events table consists of columns such as user_id, occurred_at, event_type, event_name, location, device, and user_type.

Table 3. Yammer_Experiments

The Yammer_emails table consists of columns such as user_id, occurred_at, action, and user_type.

The Yammer_normaldistribution table consists of columns such as score and value.
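To show how these tables combine into a per-user metric, here is a minimal Python sketch that joins group assignments with events and averages an event count per user in each group. The in-memory rows are made up. The group labels (`control_group`, `test_group`) and the event names (`send_message`, `login`) are assumptions for illustration, and the group assignment column is not among the columns listed above.

```python
from collections import Counter, defaultdict


def avg_events_per_user(assignments, events, event_name):
    """Average number of `event_name` events per assigned user, by group.

    assignments: iterable of (user_id, group) pairs, one per user.
    events: iterable of (user_id, event_name) pairs.
    Users with no matching events count as zero, so they still
    contribute to the denominator.
    """
    counts = Counter(uid for uid, name in events if name == event_name)
    totals = defaultdict(int)
    users = defaultdict(int)
    for uid, group in assignments:
        totals[group] += counts[uid]  # Counter yields 0 for unseen users
        users[group] += 1
    return {group: totals[group] / users[group] for group in users}


# Hypothetical rows (illustration only).
assignments = [
    (1, "control_group"), (2, "control_group"),
    (3, "test_group"), (4, "test_group"),
]
events = [
    (1, "send_message"), (1, "login"),
    (3, "send_message"), (3, "send_message"),
    (4, "send_message"), (4, "login"),
]

avg_messages = avg_events_per_user(assignments, events, "send_message")
avg_logins = avg_events_per_user(assignments, events, "login")
print(avg_messages)  # {'control_group': 0.5, 'test_group': 1.5}
print(avg_logins)    # {'control_group': 0.5, 'test_group': 0.5}
```

Keeping users with zero events in the denominator matters: dropping them would overstate the average in whichever group has more inactive users.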

Validating the results

Based on the hypotheses above about what caused the 50% higher posting in the treatment group, we will plot a few graphs that represent these factors and suggest actions that support our recommendations to the product team. You can see the full PostgreSQL queries in this report in Mode Analytics Studio.

  1. Average messages posted per user

One of the metrics that captures the core value of Yammer is login frequency, so we look at the average number of logins per user. The chart shows that the average number of logins per user is up, which indicates that users are not only sending more posts but also signing in to Yammer more.

The average number of logins in the control and treatment groups is well balanced, consistent with users exploring Yammer's functions and engaging with their coworkers on the network. This also indicates there is no login problem or bug (such as quick login/logout cycles) in the application.

This chart shows that existing users and new users are mixed together in the same groups. A user who signed up in January is treated the same as a user who signed up a day before the test ended. It is better to split users into cohorts so that new and existing users are separated. This gives a fairer comparison and also makes it possible to test for the novelty effect explained above.

This chart shows that the new users in the control group have had less time to post than existing users. This can bias the comparison, so it is better to analyze the treatment group in a way that ignores new users.
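The cohort idea can be sketched in a few lines of Python. This is a minimal example under stated assumptions: users are labeled "existing" if they activated before the test started and "new" otherwise, the group labels are hypothetical, and the year in the dates is a placeholder because the article gives only the month and day.

```python
from datetime import date


def split_by_cohort(users, test_start):
    """Group user ids by (cohort, experiment group).

    users: iterable of (user_id, group, activated_at) tuples.
    A user is "existing" if activated before test_start, else "new".
    """
    cohorts = {}
    for user_id, group, activated_at in users:
        cohort = "existing" if activated_at < test_start else "new"
        cohorts.setdefault((cohort, group), []).append(user_id)
    return cohorts


# Placeholder year; only "June 1" comes from the article.
TEST_START = date(2014, 6, 1)

# Hypothetical users (illustration only).
users = [
    (1, "control_group", date(2014, 1, 15)),
    (2, "test_group", date(2014, 3, 2)),
    (3, "control_group", date(2014, 6, 29)),
    (4, "test_group", date(2014, 5, 31)),
]

cohorts = split_by_cohort(users, TEST_START)

# Keep only existing users, so new users do not bias the comparison.
existing_only = {
    group: ids for (cohort, group), ids in cohorts.items() if cohort == "existing"
}
print(existing_only)  # {'control_group': [1], 'test_group': [2, 4]}
```

Comparing the posting metric within `existing_only` removes the bias from users who joined late in the test window.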

We have made a few plots to support our hypotheses, and there is much more to explore by querying the data. You can explore further through the courses. Based on these plots, we conclude that interaction effects may be present: interference between users, and users who are not independent across the control and treatment groups, can affect the result of 50% higher posting in the treatment group. Nevertheless, we can segment users in the control and treatment groups, for example by device or by user type (content producers versus readers), to avoid these errors and get better results.

You can check the full results in Mode Analytics Studio in PDF format.

References and further reading:

  1. Build the App Your Customers Want: Beta Test with Amazon A/B Testing Service
  2. Statistical method calculation in A/B testing
  3. Using A/B testing to measure the efficacy of recommendations generated by Amazon Personalize
  4. Udacity A/B testing Free course
  5. Report of A/B testing in Yammer in Mode Analytics Studio
  6. Student’s t-test
  7. Trustworthy Online Controlled Experiments: A Practical Guide to A/B testing

Thank you for reading!

I really appreciate it! 🤗 If you liked the post and would like to see more, consider following me. I post about topics related to machine learning and deep learning. I try to keep my posts simple but precise, always providing visualizations and simulations.
