SQL Project Planning

Problem:

You are given a table, Projects, containing three columns: Task_ID, Start_Date and End_Date. It is guaranteed that the difference between the End_Date and the Start_Date is equal to 1 day for each row in the table.

If the End_Date values of the tasks are consecutive, then the tasks are part of the same project. Samantha is interested in finding the total number of different projects completed.

Write a query to output the start and end dates of the projects, listed by the number of days it took to complete each project in ascending order. If more than one project has the same number of completion days, then order by the start date of the project.

Sample Input

Sample Output

`2015-10-28 2015-10-29`
`2015-10-30 2015-10-31`
`2015-10-13 2015-10-15`
`2015-10-01 2015-10-04`

Explanation

The example describes the following four projects:

• Project 1: Tasks 1, 2 and 3 are completed on consecutive days, so they are part of the same project. Thus, the start date of the project is 2015-10-01 and the end date is 2015-10-04, so it took 3 days to complete the project.
• Project 2: Tasks 4 and 5 are completed on consecutive days, so they are part of the same project. Thus, the start date of the project is 2015-10-13 and the end date is 2015-10-15, so it took 2 days to complete the project.
• Project 3: Only task 6 is part of the project. Thus, the start date of the project is 2015-10-28 and the end date is 2015-10-29, so it took 1 day to complete the project.
• Project 4: Only task 7 is part of the project. Thus, the start date of the project is 2015-10-30 and the end date is 2015-10-31, so it took 1 day to complete the project.
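To experiment with the example, the rows can be loaded into a table. The Sample Input table is not reproduced here, so the seven rows below are reconstructed from the Explanation above (each task lasts exactly one day, and the tasks of one project chain end date to start date). The column types are an assumption, and MySQL syntax is assumed.

```sql
-- Hypothetical setup: the column types are assumed, not given in the problem.
CREATE TABLE Projects (
    Task_ID    INT PRIMARY KEY,
    Start_Date DATE NOT NULL,
    End_Date   DATE NOT NULL
);

-- Rows reconstructed from the Explanation: every task spans one day.
INSERT INTO Projects (Task_ID, Start_Date, End_Date) VALUES
    (1, '2015-10-01', '2015-10-02'),  -- Project 1
    (2, '2015-10-02', '2015-10-03'),  -- Project 1
    (3, '2015-10-03', '2015-10-04'),  -- Project 1
    (4, '2015-10-13', '2015-10-14'),  -- Project 2
    (5, '2015-10-14', '2015-10-15'),  -- Project 2
    (6, '2015-10-28', '2015-10-29'),  -- Project 3
    (7, '2015-10-30', '2015-10-31');  -- Project 4
```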

Logic:

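Select every Start_Date that does not appear as an End_Date; these are the dates on which a project begins. Likewise, select every End_Date that does not appear as a Start_Date; these are the dates on which a project finishes. Cross join the two sets, keep only the pairs where Start_Date < End_Date, and for each Start_Date take MIN(End_Date): the earliest project end after a given start is the end of that same project. Finally, order by DATEDIFF(MIN(End_Date), Start_Date) and then by Start_Date. Remember that in MySQL, DATEDIFF takes exactly two expressions separated by a comma, not a minus sign.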
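As a rough illustration of the first step, the two building blocks can be run on their own before they are joined. This is a sketch in MySQL syntax; the expected values in the comments follow from the four projects listed in the Explanation.

```sql
-- Project start dates: a Start_Date that no task ends on.
SELECT Start_Date
FROM Projects
WHERE Start_Date NOT IN (SELECT End_Date FROM Projects)
ORDER BY Start_Date;
-- On the example: 2015-10-01, 2015-10-13, 2015-10-28, 2015-10-30

-- Project end dates: an End_Date that no task starts on.
SELECT End_Date
FROM Projects
WHERE End_Date NOT IN (SELECT Start_Date FROM Projects)
ORDER BY End_Date;
-- On the example: 2015-10-04, 2015-10-15, 2015-10-29, 2015-10-31
```

Note that NOT IN returns no rows if the subquery yields a NULL, so this approach assumes neither date column contains NULLs.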

Solution:

`SELECT Start_Date, MIN(End_Date) FROM (SELECT Start_Date FROM Projects WHERE Start_Date NOT IN (SELECT End_Date FROM Projects)) AS s, (SELECT End_Date FROM Projects WHERE End_Date NOT IN (SELECT Start_Date FROM Projects)) AS e WHERE Start_Date < End_Date GROUP BY Start_Date ORDER BY DATEDIFF(MIN(End_Date), Start_Date), Start_Date;`
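To check the ordering, it can help to lay the same query out over several lines and expose the duration as an extra column. This is only a debugging sketch in MySQL syntax: the aliases Project_End and Duration_Days are hypothetical names added for illustration, and the extra column would need to be dropped again to match the expected two-column output.

```sql
SELECT
    s.Start_Date,
    MIN(e.End_Date)                         AS Project_End,
    DATEDIFF(MIN(e.End_Date), s.Start_Date) AS Duration_Days
FROM
    -- Dates on which a project begins.
    (SELECT Start_Date
     FROM Projects
     WHERE Start_Date NOT IN (SELECT End_Date FROM Projects)) AS s,
    -- Dates on which a project finishes.
    (SELECT End_Date
     FROM Projects
     WHERE End_Date NOT IN (SELECT Start_Date FROM Projects)) AS e
WHERE s.Start_Date < e.End_Date      -- only ends that come after this start
GROUP BY s.Start_Date                -- one row per project
ORDER BY Duration_Days, s.Start_Date;
```

On the example, the Duration_Days column should read 1, 1, 2, 3 from top to bottom, which matches the order of the rows in the Sample Output.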

Credit
