# How to clear Data Analyst Interview round 1: part 2

In the last article, I shared the questions that I have been asked in the data analyst interviews first round, I wanted to write a second part since I have given a few more interviews in the last few days and was confounded by some really smart questions that I wanted to share/document.

If you are planning to pursue your career in any field even remotely close to data science, you must be strong with your SQL skills. I would personally recommend you to practice all these questions. If you can solve these questions and from the last article, you will be 100 percent selected in the first round of your interview.

# Question 1

Statement: How many rows will be present in output after using the following joins

1. Left join
2. right join
3. inner join
4. outer join

I will be using pandas to show tables and determine the number of rows, since I find it easier to operate rather than creating new tables in a database.

Tables given

Left Join:

Right Join:

Inner join:

Outer Join:

This was a tricky question since we had a value “x” with a lot of duplicates, I got completely confused while answering this and answered all of them wrong. The trick is that every “x” of table 1 will be joined with 3 “x” of table 2. which is why left join has 7 rows.

# Question 2

Statement: What is the difference between UNION and UNION ALL, also give an output of the following tables after applying union operation on them?

Tables given

Answer: Union will join two tables row-wise but only unique row values are added while union all will add all the rows.

Except the id 7 of table 2, all ids will be displayed.

# Question 3

Table given

Statement: Find out the students whose marks are not increasing over time. (I'm not sure if this was the exact problem statement, but I hope you understood what was asked in this) the table name is “marks”

`select distinct answerfrom (select case     when diff<0 then "id"end as answerfrom (select *,marks-lag(marks,1)over(partition by id) as difffrom marks)a) b`

Explanation: We used window function lag in this. First of all, I will try to find the difference between marks and lag values. lag values are row values of one row above, so I can subtract the value of 1 row above to evaluate that marks are increasing progressively. For illustration check the below photo.

then we will use the “case when” to find if there are any values with diff as less than 0(negative value), when we find the value, we will add the value “id” in the column.

and next, we will output all the distinct id values from the case-when column.

Next few questions that were asked in the interviews were available on coding platforms so i solved them there. All the questions are self-explnatory and test your knowledge of SQL functions.

# Question 4

`select floor(avg(salary)), floor(max(salary))from salarieswhere substr(name,1,1)="a";`

# Question 5

`select name, count(name)from match_datawhere batting_status=0group by nameorder by name;`

# Question 6

`select a.personid, a.name, count(b.personid), sum(c.score)from person ainner join friend bon a.personid=b.personidinner join person con c.personid=b.friendidgroup by a.personidhaving sum(c.score)>100order by a.personid;`