Lab Book — Week 11

Converting a PDF file to Excel

I have a set of data inside a PDF file as appendices, but it cannot be read directly in Jupyter (maybe it can, but only in a very complicated way). Ben tried to teach me how to use Python to convert the file into Excel, but when I copied the data into Python it was not in a usable format, so Ben suggested that I copy the data directly into Excel.

When I copied the data, the problem was that the values were not separated into columns as they are in the table. There was only one column, with the values separated by spaces. So I had to find out how to separate them.

What the data looked like at first:

To convert the text to columns, you can use “Text to Columns” under “Data” and choose space as the delimiter. Then it looked like this:

Now it looks much better, but see the “City of Sydney”? Its words were split into separate columns too. There’s no way I am going to fix every name manually, so I had to think of a way to separate the text from the values. Ctrl+Z to go back.

And… I didn’t find one. Actually I did, but it requires writing VBA. Of course I will learn how to do that eventually, but not now. So what I did was add commas around the names so that the Text to Columns function treats each name as one field. Now they look nicely arranged.
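Note for later: this is not what I did in Excel, just a rough sketch of how the same separation might be done in Python. It assumes every row is a name (which may contain spaces) followed only by number-like values separated by spaces. The sample rows, the `split_row` helper and the `NUMBER` pattern are all made up for illustration; only “City of Sydney” comes from the real table, and the numbers are placeholders.

```python
import csv
import io
import re

# Placeholder rows, only to show the shape of the pasted text.
# These are NOT the real appendix values.
raw_text = """City of Sydney 10 20.5 30
Parramatta 1 2 3
"""

# Assumption: a value looks like 12, -3, 1,234 or 20.5.
NUMBER = r"-?\d[\d,]*(?:\.\d+)?"

# Name = everything before the trailing run of space-separated numbers.
ROW = re.compile(rf"^(?P<name>.*?)\s+(?P<values>{NUMBER}(?:\s+{NUMBER})*)\s*$")


def split_row(line):
    """Return [name, value1, value2, ...], or None if the line does not fit."""
    match = ROW.match(line)
    if match is None:
        return None
    return [match.group("name")] + match.group("values").split()


rows = []
for line in raw_text.splitlines():
    if not line.strip():
        continue  # skip blank lines
    parsed = split_row(line)
    if parsed is not None:
        rows.append(parsed)

# Write the rows as CSV text, which Excel can open with the name in one cell.
buffer = io.StringIO()
csv.writer(buffer).writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

One limitation to keep in mind: a name that itself contains a standalone number followed only by numbers (a hypothetical “Zone 2”, say) would lose that number to the values, so rows like that would still need checking by hand.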

