A friend recently asked for some help with a dataset he was given. The dataset included millions of rows and thousands of columns. The problem, he said, was that there were numerous instances of columns that were “duplicated”. Well, I said, that word could mean a bunch of different things…