Today I experienced something really new. For the first time, I truly enjoyed a lecture that was nothing more than showing some visual content on the screen and discussing it. I mean, no computations. No coding. No theorems and no applications of some complex background knowledge. Just observing and making comments. Insane, right?

When I first read the name of the course, it said “Data Visualization”, Andriy Gazin. (It’s not that it changed when I read it for the second time. I just wanted to emphasise that it was, like, the first impression, you know.)

That’s one of the slides from that lecture, actually.

Exactly the same question appeared in my head.

Nevertheless, the lecture started with some cool graphs. You can see more of them here; it’s Andriy’s blog about infographics and dataviz. But the audience, and especially I, liked this one the most:

You can read more about the ideas behind this graph here.

Everyone was like: ‘Wow, falling stars, falling stars!’ 
I’d like to emphasise something here: almost anything you notice on this graph and consider a rather ‘smart decision’ was done on purpose. The decreasing curves are a metaphor for ‘falling apart’. That’s why you see the ‘years’ axis moved to the top. The strange coordinate grid is there for better perception as well: our data is concentrated on certain levels, not scattered randomly. One colour? Well, why would we need more? As Andriy says, the fewer attributes you use for visualizing your data, the better, because otherwise you just draw the viewer’s attention away.

But what I liked the most was the brainstorming part. We were given a task:

So, how can we best represent just 2 numbers visually? The first suggestions were obvious: segments of different lengths, circles of different areas, dots on the x-axis, two clusters of 42 and 23 dots and so on. I would show you a picture of the whiteboard covered with drawings, but I was too involved, sorry :) Can you think of any other appropriate encodings, by the way? If so, leave your comments below.
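If you want to play with these encodings yourself, here is a minimal sketch of three of them in plain Python. It is my own toy illustration, not something shown in the lecture, and the function names (`bar`, `circle_radius`, `dot_cluster`) are made up for this post. The one detail worth noticing is the circle case: if the area is supposed to carry the value, the radius has to grow with the square root of the number, not with the number itself.

```python
import math


def bar(value, unit="#"):
    """Length encoding: one character per unit of the value."""
    return unit * value


def circle_radius(value):
    """Area encoding: radius of a circle whose area equals the value."""
    return math.sqrt(value / math.pi)


def dot_cluster(value, per_row=10, dot="o"):
    """Count encoding: `value` dots wrapped into rows of `per_row`."""
    rows = [dot * min(per_row, value - start)
            for start in range(0, value, per_row)]
    return "\n".join(rows)


values = [42, 23]  # the two numbers from the task

for v in values:
    print(f"{v:>2} {bar(v)}")

for v in values:
    print(f"{v:>2} -> radius {circle_radius(v):.2f}")

for v in values:
    print(dot_cluster(v))
    print()
```

The bars and the dot clusters let you compare the two numbers almost instantly, while the two radii come out much closer to each other than 42 and 23 are, which hints at why area is a harder encoding to read.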

But why did I name this post ‘overcomplicating’? Well, because here I am, a person who always overcomplicates. It wasn’t enough for me to be the one who suggested the ‘segments’ case, I wanted something ‘oh-my-god-you’re-so-smart’ :) So I raise a hand, and I’m like, ‘two people of ages 42 and 23’. And I’m really waiting for Andriy to say ‘YES, that’s what I wanted to hear!’ But all I receive is this glance, full of bewilderment:

Ok, I’m not saying that I was the only one with crazy ideas :) But the point is, I always act like that!

Bashing geometry with complex coordinates — yep, no problem! Wait, what? No circle in the diagram? Pffff, do you consider that to be an obstacle?

Solving probabilistic problems using generating functions? Never heard of such an approach, I’d rather differentiate this binomial a couple of times to see the series converge. EASY!
(I won’t post another picture, I think you get the idea of what ugly calculations look like.)

So, what I learned from this dataviz workshop was actually not the Grammar of Graphics (and I’m not saying that I didn’t get a general understanding of that), but rather that no one wants you to deliver some brute-force, complicated stuff as a data scientist. You have to redesign the way you think to become an analyst who can actually suggest something pure and simple but at the same time worthy of a ‘genius!’ exclamation. Not every good specialist is born with these skills. So today will be my starting point for developing them.

P.S. Thanks to Andriy for providing useful links. Don’t forget to visit his blog: it has not only fancy graphs but also other interesting stuff.

P.P.S. You’re so gonna fall in love with this. Thank me later.