SQL interview Questions For Aspiring Data Scientist — The Histogram

Histograms

Julien Kervizic
Hacking Analytics


The histogram question is a common warm-up question for aspiring data scientists, a bit like FizzBuzz for software engineers.

Building a histogram is something a data scientist does countless times to get a sense of how different variables are distributed, and it is not technically difficult.

Any seasoned data scientist should be able to do this off the top of their head, and aspiring data scientists should be able to show they have the technical knowledge to do it.

Questions

Given an order item table containing the following fields:

order_item:
    order_date DATE,
    order_id INT,
    user_id INT,
    order_item_id INT,
    catalog_item_id INT,
    item_quantity INT,
    item_price INT

1) Provide a histogram of item prices
2) Provide a histogram of orders by order price

Answer 1)

-- One row per distinct item price, with the number of order items at that price
SELECT
    item_price,
    COUNT(1) AS frequency
FROM order_item
GROUP BY item_price

This is traditionally a warm-up question that simply checks whether the candidate knows basic SQL and can actually use a GROUP BY clause.
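
The text above stops at Answer 1, so here is one possible sketch for question 2. It is not the author's answer. It assumes that item_price is a unit price, so that an order's price is the sum of item_quantity * item_price over its rows, and that order_id uniquely identifies an order. The sample rows and the order_totals alias are made up for illustration.

```sql
-- Table layout taken from the question; the rows below are hypothetical sample data.
CREATE TABLE order_item (
    order_date      DATE,
    order_id        INT,
    user_id         INT,
    order_item_id   INT,
    catalog_item_id INT,
    item_quantity   INT,
    item_price      INT
);

INSERT INTO order_item VALUES
    ('2019-01-01', 1, 10, 1, 100, 2, 5),   -- order 1: 2 * 5  = 10
    ('2019-01-01', 1, 10, 2, 101, 1, 10),  -- order 1: 1 * 10 = 10, total 20
    ('2019-01-02', 2, 11, 3, 100, 4, 5),   -- order 2: 4 * 5  = 20, total 20
    ('2019-01-03', 3, 12, 4, 102, 1, 30);  -- order 3: 1 * 30 = 30, total 30

-- Inner query: one row per order with its total price
-- (assumption: item_price is the price of a single unit).
-- Outer query: count how many orders fall on each total price.
SELECT
    order_price,
    COUNT(1) AS frequency
FROM (
    SELECT
        order_id,
        SUM(item_quantity * item_price) AS order_price
    FROM order_item
    GROUP BY order_id
) AS order_totals
GROUP BY order_price
ORDER BY order_price;

-- Result on the sample rows:
-- order_price | frequency
--          20 |         2
--          30 |         1
```

The only step beyond Answer 1 is the aggregation to one row per order before counting. Grouping directly on the order_item rows would count items rather than orders.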

Living at the interstice of business, data and technology | Head of Data at iptiQ by SwissRe | previously at Facebook, Amazon | julienkervizic@gmail.com