How I cleaned my First Dataset Using Microsoft Excel
So, I just dived into the world of data analysis, and let me tell you about cleaning my very first dataset in Microsoft Excel.
Introduction:
In order to ensure accurate and trustworthy findings from data analysis, cleaning dirty or unclean data is an essential step in the process. We will go over step-by-step methods in this guide for efficiently cleaning and organizing data in Microsoft Excel.
Step by Step Process into Cleaning a dirty data.
The fact that the material I used is in a text format must be noted first.
1. As can be seen in the picture below, I opened my Microsoft Excel application first in order to clean my unclean data. I next selected Open, followed by Browse.
2. After that, I selected the "dirty file" and then clicked Open. The image below shows the prompt that appeared after selecting the "open" option.
3 The Delimited option is automatically selected by Excel in the image above. I then clicked on Next, which brought up another prompt and automatically selected the Tab option. Next, I selected the semicolon option, and last, I clicked on the Finish function.
After that, the dirty data is delivered and prepared for cleaning!
In order to know what we will be working with, it is expected of a data analyst to thoroughly examine the data. In this case, the data relates to every employee in the organization.
Now let’s clean up the dirty data properly. I started by changing the names in the first column from upper case to proper case. I changed it to a proper case by clicking on column B first, then using the insert function to insert another column, which brought up an empty column B.
Next, I clicked on the cell I wanted to effect and the proper function (=PROPER(A1)).
Finally, I hit the tab key on my keyboard. The outcome of this is shown in the picture below. We may observe that the name has changed.
I then used autofill to make the necessary changes in the other cells. The content in column B was then pasted into column A using the copy and paste function. Next, I thicked the Value option. Below, you can see pictures
Next, I selected Column C, then I selected Insert to add an additional column, and last I selected Cell C1 and entered the (=value (cell B1)).
Next, I pressed the tab key on my keyboard.This is the result that appears below after I autofill the result.
Then I selected copy and paste special into cell B.I then changed the format to Short date by clicking on the number tab, as you can see below.
The outcome
I cleaned Column C using the same method as Column B, but I replaced the format of Short Date with Currency. The outcome can be seen below.
I proceeded to Column D and made the same case changes to the characters as I had in Column A. Below is the outcome.
I then used “substitute function" to change the abbreviated words to full words. Using =SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(PROPER(D1), “Admin”, “Administration”), “Mktg”, “Marketing”), “R & D”, “Research and Development”) . The result is seen in the picture below.
Regarding Column E, I used the same method. I created a new column, changed the text in column E1 to proper case using the PROPER function, and then affected the remaining text in column E using Autofill. Then, just like I did in the previous columns, we copy and paste special.
At this point, the dataset already appears appealing. Simply glance at it .
With each column, I made a header. Here’s the final product after I highlighted every character and used the Ctrl T function to turn everything into a table. And this below is the Clean Data