Data Analysis with Excel: Fundamentals Part-2

Abhijit
9 min readNov 29, 2023

--

Welcome to part 2 of Data Analysis with Excel: Fundamentals. This blog is a continuation of part 1 which you can check out here. For this part, we’ll use the same data set that we used for part 1 with the operations that were performed.

Data File:

Part 1 dealt with processing and manipulating text data where we checked functions like LEFT(), RIGHT(), MID(), CLEAN(), LEN(), etc. In this part, we’ll deal with numbers and dates.

‘Text Functions’ as the name suggests only work for the texts and not for the number and dates. In this blog, we’ll look at functions to convert data types. We’ll convert text to numbers, and dates to text, and then we’ll look at how we can convert numbers and texts to dates. All this together will again help us solve the problem of data not being in the correct format.

  1. Converting data with VALUE and TEXT: Continuing our task from part 1, we now want the total from the ‘$ Amount’ column in the ‘Total’.

To do this we’ll use the SUM() function as follows:

In cell V1, type ‘ =SUM( ’, then click in cell Q2, press ctrl+shift, and hit the down arrow which will select the entire column containing the data which will give=SUM(Q2:Q85)

We’ll get:

Now, this is a problem. Output for the numbers should not be zero. To understand the issue, let’s look at the ‘$ Amount’ column. You can see that the data is shifted on the left side of the cell. This hints about a problem being present.

In Excel, the numbers align automatically to the right side of the cell. So, we can assess that the data stored in the ‘$ Amount’ column is in the form of text rather than numbers. Before performing any arithmetic operations on this column, we need to convert them to proper numbers by using the ‘VALUE’ function which converts text that appears in a recognized format to a numeral.

In cell Q2, we already have our formula from the previous calculation in Part-1:

We’ll add the VALUE() function to its output:

We get:

Copy the formula for the rest of the cells in the column:

And so, you can see, correcting the format also corrected the ‘Total’.

In the situation above, we have looked at converting from text to numbers. Next, we’ll be looking at a situation where we have to convert from number to text value.

Let us look at the table in the second sheet ‘MC Invoice Report’.

We’ll extract the month from the column ‘Invoice Date’.

As you can see, the data has been stored as a proper date and hence we cannot apply functions like LEFT, RIGHT, or MID to extract this data.

The function that we are going to use here is the TEXT() function. It works by taking in a numerical value and converting it into text.

Format: =TEXT(value, format_text), where ‘D’ is for day, ‘M’ is for month, and ‘Y’ is for year.

We get:

Copy the formula to other cells to complete the column values.

2. Dates and Basic date functions: Dates are numbers formatted to look like dates and which can be used to perform calculations.

In the ‘MC Invoice Report’ sheet, we need to find a reference date by which we will be doing the calculations. Let’s call this ‘System Date’.

To find the ‘System Date’ we will use the ‘Date and Time’ library from the ‘Formulas’ tab.

From this library, we’ll use the function ‘TODAY()’ to find todays date.

The box that appears here says that the function is volatile. A volatile function recalculates each time there is a change to the workbook.

Finding the ‘Invoice Day’ from ‘Invoice Date’:

Format: =Day(serial_no)

We get:

Let’s do a similar operation to find the month and year in the ‘Over Due By’ and ‘Late Charge’ columns:

We get:

3. Generating Valid dates using the DATE() function: For this part, we’ll look at the sheet ‘Supplier Invoice Statement’.

The dates present in the ‘Invoiced’ and ‘Paid’ columns are not valid dates but just text. We’ll convert them into valid dates. To do this, we’ll use the ‘Date’ function. This function returns the serial number of a particular date.

We’ll do the calculation in the ‘Paid Date’ column. In the column, well use the DATE function for the values in the first cell of the ‘Paid’ column where the month is ‘4’ and the day is ‘17’.

We now have a valid date.

But the date is not the same for every id and so we need to derive dates for all. To determine the date in the correct dates, we’ll use the RIGHT() function as we are extracting from text data.

We get:

After copying the formula for all:

To calculate the ‘Invoice Date’:

We get:

4. Calculating days between two dates (DAYS, NETWORKDAYS, WORKDAY): Let’s do some calculations for the ‘Due Date’ column. Remove the preexisting values.

Let’s say that the due date is 6 days from the ‘Invoice Date’. Now, to calculate the ‘Due Date’ we’ll simply add ‘6’ to the ‘Invoice Date’ as follows:

We get:

Copy the formula to the other cells.

We’ll now do calculations for the ‘Order Due By’ column. Remove the pre-existing data from the column.

To find the ‘Order Due By’ we’ll need to subtract the ‘Payment Date’ by ‘Due Date’ as follows:

We get:

Let us now see on which day the ‘Due Date’ falls on:

To find this, we’ll use the TEXT() function:

We get:

We can see that there is a problem. A lot of these ‘Due Date’ are falling on a Saturday or a Sunday. Which usually are non-working days. In such types of calculations, we often need to account for working days. To do that, there’s a function called ‘WORKDAY’. It returns the serial number of the data before or after a specified number of workdays.

For our sheet, let’s say that the due date is 6 working days after the invoice date as follows:

We get:

We can see that there are no more Sundays or Saturdays in the column.

In some countries, Saturday or Sunday are not always the non-working days. To take care of this, we have another function called ‘WORKDAY.INTL’ or ‘Workday International’. Like ‘WORKDAY’, using parameters to indicate which and how many days are weekend days.

We can choose any one combination as per the organization’s requirement. Or we can also give it a code for [WEEKEND]. It requires seven ‘0’s or ‘1’s. ‘1’ represents a non-workday. ‘0’ represents a workday.

Note: The value starts from Monday.

Apart from off days of the week, there are also holidays due to some festivals or other special events that take place. WORKDAY.INTL takes care of that too.

Since the period is March and April, there is Good Friday, and we have a holiday for that. In 2020, it was celebrated on April 10. We’ll add this using the DATE function within the WORKDAY.INTL as follows:

Sometimes there can be more than one such holiday in a particular duration. Including all of them within a function can be a hassle. The best way to include all of them is by making a list of holidays somewhere and then including the location of the WEEKDAY.INTL function.

A list of holidays is included in the NSW Holidays 2020 sheet.

To include this in the WEEKDAY.INTL function, we’ll copy all these dates and paste in the WEEKDAY.INTL function:

Let’s look at the ‘Over Due By’ column. To find the correct days, we’ll use the ‘NETWORKDAYS’ function as follows:

We get:

Here we end our Part-2 of Excel Fundamentals. You can find the completed solution file here.

Connect with me on LinkedIn.

--

--