Swift Solution to Dave Thomas’s Data Munging Kata Part 2

If you haven’t read my solution to Part 1 of Dave’s data munging kata, please check it out first. Part 2 asks us to clean and analyze a dataset of football (you know, soccer) results for the English Premier League.

The file football.dat contains the results from the English Premier League for 2001/2. The columns labeled ‘F’ and ‘A’ contain the total number of goals scored for and against each team in that season (so Arsenal scored 79 goals against opponents, and had 36 goals scored against them). Write a program to print the name of the team with the smallest difference in ‘for’ and ‘against’ goals.

This exercise is very similar to Part 1, and much of the original code can be reused, so this post focuses on the differences between Part 1 and Part 2. [full code on GitHub].

Model

Instead of a WeatherRecord, we’ll need a FootballResult model. When calculating the delta between goals for and goals against, we need the absolute value of their difference. That’s because with the WeatherRecord the max temperature was always higher than the min temperature, but in this case either value can be the higher one.

I also took a different approach in transforming a table full of strings into a model object. For weather records all three values could be modeled as Ints (day of month index, maxTemp, minTemp). For this exercise, we need a teamName: String, goalsScoredFor: Int, and goalsScoredAgainst: Int.

I chose to write a custom init for my struct that accepts three strings and converts the goal strings into Ints. The reason is that it’s the model’s responsibility to transform the values into a specific type. The executor shouldn’t have this responsibility.
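As a rough illustration of the two points above (string-to-Int conversion inside the init, and an absolute-value difference), here is a minimal sketch. The property names come from this post, but making the init failable and the name goalDifference are my assumptions for the example, not necessarily what the GitHub code does. It also assumes the strings are already trimmed, since Int("79 ") with stray whitespace returns nil.

```swift
struct FootballResult {
    let teamName: String
    let goalsScoredFor: Int
    let goalsScoredAgainst: Int

    // Assumption: a failable init, so a non-numeric goal string yields nil
    // instead of crashing. The model owns the String -> Int conversion.
    init?(teamName: String, goalsScoredFor: String, goalsScoredAgainst: String) {
        guard let goalsFor = Int(goalsScoredFor),
              let goalsAgainst = Int(goalsScoredAgainst) else {
            return nil
        }
        self.teamName = teamName
        self.goalsScoredFor = goalsFor
        self.goalsScoredAgainst = goalsAgainst
    }

    // Hypothetical name. abs() is needed because either value can be larger.
    var goalDifference: Int {
        return abs(goalsScoredFor - goalsScoredAgainst)
    }
}

// Arsenal's figures are the ones quoted in the kata description.
let arsenal = FootballResult(teamName: "Arsenal", goalsScoredFor: "79", goalsScoredAgainst: "36")
// Made-up rows to exercise the other two paths.
let conceding = FootballResult(teamName: "Team X", goalsScoredFor: "30", goalsScoredAgainst: "64")
let invalid = FootballResult(teamName: "Team Y", goalsScoredFor: "n/a", goalsScoredAgainst: "36")
```

With this shape the caller only ever passes strings in, and a row that cannot be parsed simply produces nil rather than a half-built model.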

Data Transformer

The first difference is that none of the football data was ‘dirty’, so we could remove the cleanData step. Second, as mentioned in the Model section, we have a custom init method on FootballResult so that we don’t need the .convertStringsToInts() method.
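To make the shortened pipeline concrete, here is a sketch of going from rows of strings straight to an answer. It assumes each row has already been trimmed to exactly three cells (team, goals for, goals against), that the init is failable, and that Swift 4.1 or later is available for compactMap. The function name teamWithSmallestGoalDifference and every sample row except Arsenal's are made up for illustration.

```swift
struct FootballResult {
    let teamName: String
    let goalsScoredFor: Int
    let goalsScoredAgainst: Int

    // Assumption: failable init that does the String -> Int conversion.
    init?(teamName: String, goalsScoredFor: String, goalsScoredAgainst: String) {
        guard let goalsFor = Int(goalsScoredFor),
              let goalsAgainst = Int(goalsScoredAgainst) else { return nil }
        self.teamName = teamName
        self.goalsScoredFor = goalsFor
        self.goalsScoredAgainst = goalsAgainst
    }

    var goalDifference: Int { return abs(goalsScoredFor - goalsScoredAgainst) }
}

// Hypothetical helper: builds models from rows, then picks the smallest delta.
func teamWithSmallestGoalDifference(in rows: [[String]]) -> String? {
    // compactMap drops any row the model refuses to parse.
    let results = rows.compactMap { row -> FootballResult? in
        guard row.count == 3 else { return nil }
        return FootballResult(teamName: row[0],
                              goalsScoredFor: row[1],
                              goalsScoredAgainst: row[2])
    }
    // min(by:) returns nil for an empty array, so the result is optional.
    return results.min(by: { $0.goalDifference < $1.goalDifference })?.teamName
}

// Only the Arsenal figures come from the kata text; the rest are invented.
let sampleRows = [
    ["Arsenal", "79", "36"],
    ["Team B", "40", "44"],
    ["Team C", "50", "30"],
    ["not", "a", "result"]
]
let closestTeam = teamWithSmallestGoalDifference(in: sampleRows)
```

No separate string-to-Int pass is needed here because the conversion happens inside the model, which is the point of moving it into the init.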

In the Weather data set, we only needed the first three columns, so we looped through all the rows and, for each row, used a for i in 0..<3 loop to grab them. I wasn’t happy with this solution and re-worked it here to use flatMap. #somuchcleaner

static func removeUnneededColumns(fromTable table: [[String]]) -> [[String]] {
    return table.flatMap { (row: [String]) in
        return [row[1], row[6], row[8]]
    }
}

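.flatMap() is best known for ‘flattening’ an array of arrays so that [["a", "b"], ["c", "d"]] gets flattened into ["a", "b", "c", "d"]. That isn’t what happens here, though: because the function is declared to return [[String]], nothing is flattened, and each input row simply becomes one shorter row holding the cells of the columns we need to keep. In other words, the call behaves like .map(), and a plain .map() would do the same job. The flatMap variant that can return fewer elements than you passed into it is the one whose closure returns an optional and drops the nils, which was renamed compactMap in Swift 4.1.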
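Since the three methods are easy to mix up, here is a small side-by-side sketch. The row layout (team at index 1, goals for at 6, goals against at 8) follows the indexes used in removeUnneededColumns; the "x" cells are placeholders for columns we don't care about, and the second row is invented. It assumes Swift 4.1 or later.

```swift
// Placeholder table: only the Arsenal name and goal figures come from the kata text.
let table: [[String]] = [
    ["1.", "Arsenal", "x", "x", "x", "x", "79", "-", "36", "x"],
    ["2.", "Team B", "x", "x", "x", "x", "40", "-", "44", "x"]
]

// map: one output row per input row, so the nesting is preserved.
let mapped: [[String]] = table.map { row in [row[1], row[6], row[8]] }

// flatMap with an array-returning closure: the rows are concatenated.
let flattened: [String] = table.flatMap { row in [row[1], row[6], row[8]] }

// compactMap: the closure returns an optional and nil results are dropped,
// so the output can be shorter than the input.
let compacted: [Int] = ["79", "-", "36"].compactMap { Int($0) }

print(mapped)
print(flattened)
print(compacted)
```

For picking columns out of each row, map is the one that keeps the table shape that the rest of the pipeline expects.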

*happy dance*

Other than changing names from weather to football where appropriate, those are about all the differences between Part 1 and Part 2. As Dave suggests, I haven’t peeked at Part 3 of the kata, but I’m guessing we’ll be making something more generic.