“Viewers turned Streamers”— Twitch Data Interview Questions for Gamers. 💬🎮🎥

Code with Corgis
Code with Corgis
Published in
6 min readMay 13, 2021

--

Dear 👋💻 Gamers💬🎮,

MAKING LEARNING HOW TO CODE

✧・゚:* CUTE(◕‿◕✿) and INFORMATIVEᕙ(⇀‸↼‶)ᕗ!!!

Interview Question: From users who had their first session as a viewer 👀💬, how many streamer sessions🎏 have they had?

  • Return the user id and number of sessions 🔢 in descending order.

In case there are users with the same number of sessions, order them by ascending user id.

Twitch Session Data Schema 🏗️

CREATE TABLE twitch_sessions (
user_id int,
session_start datetime,
session_end datetime,
session_id int,
session_types varchar
);
  • user_id 🔢: unique id for the users using twitch
  • session_start⏳ : timestamp when the stream session start
  • session_end⌛: timestamp when the stream session end
  • session_id 🔢: unique id for the twitch session
  • session_type: type of users which are streamers🎏 or viewers👀

Data 📊

twitch_sessions.csv

Expected Output ✅

All required columns ✅🏛️ and the rows 🚣

of the solution are shown below:

| user_id | n_sessions | 
| ======= | ========== |
| 1 | 1 |
| 3 | 1 |

Hints in Postgresql 🐘

  • ☝️ First, find users who meet the criteria. 🔍 ✅
  • Proceed with the group by using selected users and streaming in a filter

Using Postgresql 🐘

Step 1: SELECT user id, session type, and rank the user id but ORDER BY on session start time FROM the twitch session table

SELECT user_id, session_type,
rank() OVER (PARTITION BY user_id ORDER BY session_start) streams_order
FROM twitch_sessions -- ⬅ Step 1️⃣

Output for Step 1️⃣:

| user_id   | session_type   | streams_order   |
| :=======: | :============: | :=============: |
| 0 | streamer | 1 |
| 0 | viewer | 2 |
| 0 | streamer | 3 |
| 1 | viewer | 1 |
| 1 | streamer | 2 |
| 2 | streamer | 1 |
| 2 | viewer | 2 |
| 2 | viewer | 3 |
| 3 | viewer | 1 |
| 3 | streamer | 2 |

Step 2: SELECT user id from the subquery, which SELECTs the user id and session type rank by the user id ORDER BY session start

SELECT user_id -- ⬅ Step 2️⃣
FROM (SELECT user_id, session_type,
rank() OVER (PARTITION BY user_id
ORDER BY session_start) streams_order
FROM twitch_sessions) s1 -- ⬅ Step 2️⃣

Output for Step 2️⃣:

| user_id  | 
|:--------:|
| 0 |
| 2 |
| 3 |
| 1 |
| 2 |
| 0 |
| 0 |
| 3 |
| 1 |
| 2 |

Step 3: Filter where streams order is 1, and the session type is a viewer

(SELECT user_id
FROM
(SELECT user_id,
session_type,
rank() OVER (PARTITION BY user_id
ORDER BY session_start) streams_order
FROM twitch_sessions) s1
WHERE streams_order = 1 -- ⬅ Step 3️⃣
AND session_type = 'viewer') -- ⬅ Step 3️⃣

Output for Step 3️⃣:

| user_id  | 
|:--------:|
| 1 |
| 3 |

Step 4: SELECT the user-id FROM the twitch session table where the session type is a streamer, and the user id is in the subquery that contains the ranking.

SELECT user_id                      -- ⬅ Step 4️⃣
FROM twitch_sessions -- ⬅ Step 4️⃣
WHERE session_type = 'streamer' -- ⬅ Step 4️⃣
AND user_id in -- ⬅ Step 4️⃣
(SELECT user_id
FROM
(SELECT user_id,
session_type,
rank() OVER (PARTITION BY user_id
ORDER BY session_start) streams_order
FROM twitch_sessions) s1
WHERE streams_order =1
AND session_type = 'viewer')

Corgified Simplified

SELECT user_id                   -- ⬅ Step 4️⃣
FROM twitch_sessions -- ⬅ Step 4️⃣
WHERE session_type = 'streamer' -- ⬅ Step 4️⃣
AND user_id in (1, 3) -- ⬅ Step 4️⃣

Output for Step 4️⃣:

| user_id  | 
|:--------:|
| 3 |
| 1 |

Step 5: Add GROUP BY and ORDER BY number of sessions in DESCending and the user id in ASCending order

SELECT user_id, count(*) n_sessions
FROM twitch_sessions
WHERE session_type = 'streamer'
AND user_id in
(SELECT user_id
FROM
(SELECT user_id,
session_type,
rank() OVER (PARTITION BY user_id
ORDER BY session_start) streams_order
FROM twitch_sessions) s1
WHERE streams_order =1
AND session_type = 'viewer')
GROUP BY user_id -- ⬅ Step 5️⃣
ORDER BY n_sessions DESC, -- ⬅ Step 5️⃣
user_id ASC -- ⬅ Step 5️⃣

Corgified Simplified

SELECT user_id                   
FROM twitch_sessions
WHERE session_type = 'streamer'
AND user_id in (1, 3)
GROUP BY user_id -- ⬅ Step 5️⃣
ORDER BY n_sessions DESC, -- ⬅ Step 5️⃣
user_id ASC -- ⬅ Step 5️⃣

Output for Step 5️⃣:

| user_id | n_sessions | 
| ======= | ========== |
| 1 | 1 |
| 3 | 1 |

Code Solution 💻 ✅:

SELECT user_id, count(*) n_sessions
FROM twitch_sessions
WHERE session_type = 'streamer'
AND user_id in
(SELECT user_id
FROM
(SELECT user_id,
session_type,
rank() OVER (PARTITION BY user_id
ORDER BY session_start) streams_order
FROM twitch_sessions) s1
WHERE streams_order =1
AND session_type = 'viewer')
GROUP BY user_id
ORDER BY n_sessions DESC,
user_id ASC

Hints in Pandas 🐼

  • ☝️ First, find users who meet the criteria. 🔍 ✅
  • Proceed with the group by using selected users and streaming in a filter

Using Pandas 🐼

Pandas Twitch

Step 1: Convert the session start to a date-time format

import pandas as pdtwitch_sessions['session_start'] = pd.to_datetime(twitch_sessions['session_start'])

Step 2: Store the data frame into a variable called result and locate the session group by user-id extract the session start column and find the minimum value in each row

import pandas as pdtwitch_sessions['session_start'] = pd.to_datetime(twitch_sessions['session_start'])result = twitch_sessions.loc[twitch_sessions.groupby('user_id')['session_start'].idxmin()]

Step 3: Store the result in a variable called the first view, where the session type is a viewer

import pandas as pdtwitch_sessions['session_start'] = pd.to_datetime(twitch_sessions['session_start'])result = twitch_sessions.loc[twitch_sessions.groupby('user_id')['session_start'].idxmin()]first_view = result[result['session_type'] == 'viewer']

Step 4: Check to see if the twitch session data frame user-id is in the first view df and a session type is a streamer group by user-id extract the session id count and convert the number of sessions to frame. Finally, reset the index of the data frame

import pandas as pdtwitch_sessions['session_start'] = pd.to_datetime(twitch_sessions['session_start'])result = twitch_sessions.loc[twitch_sessions.groupby('user_id')['session_start'].idxmin()]first_view = result[result['session_type'] == 'viewer']result = twitch_sessions[(twitch_sessions['user_id'].isin(first_view.user_id)) & (twitch_sessions['session_type'] == 'streamer')].groupby('user_id')['session_id'].count().to_frame('n_sessions').reset_index()

Step 5: Sort the values by the number of sessions and ascending is False, which is descending, and sort the values again by user id with ascending is true

import pandas as pdtwitch_sessions['session_start'] = pd.to_datetime(twitch_sessions['session_start'])result = twitch_sessions.loc[twitch_sessions.groupby('user_id')['session_start'].idxmin()]first_view = result[result['session_type'] == 'viewer']result = twitch_sessions[(twitch_sessions['user_id'].isin(first_view.user_id)) & (twitch_sessions['session_type'] == 'streamer')].groupby('user_id')['session_id'].count().to_frame('n_sessions').reset_index()result.sort_values(by='n_sessions', ascending=False).sort_values(by='user_id')

Code Solution 💻 ✅:

import pandas as pdtwitch_sessions['session_start'] = pd.to_datetime(twitch_sessions['session_start'])result = twitch_sessions.loc[twitch_sessions.groupby('user_id')['session_start'].idxmin()]first_view = result[result['session_type'] == 'viewer']result = twitch_sessions[(twitch_sessions['user_id'].isin(first_view.user_id)) & (twitch_sessions['session_type'] == 'streamer')].groupby('user_id')['session_id'].count().to_frame('n_sessions').reset_index()result.sort_values(by='n_sessions', ascending=False).sort_values(by='user_id')

Comparison PostgreSQL and Pandas Together 🐘🐼

Thank you for reading my Data Journey ❤ ,

Kody the Coding Corgi & Bits the Adorable A.I.

P.S.

If you enjoyed this comic strip and could help you in any way, sign up for our newsletter, or buy me a boba, which means a lot, and send your thoughts and feelings about this work.

Are you interested in collaborating? Follow us on LinkedIn.

D.M. us on Instagram or tweet us on Twitter or connect us on LinkedIn.

Please share this with your data friends, corgis friends, and coding corgis friends so we can make more comics in the future with your support. Thank You!

--

--

Code with Corgis
Code with Corgis

🍑 We make CODING CUTE(◕‿◕✿) and INFORMATIVEᕙ(⇀‸↼‶)ᕗ!