Learning Beyond Data Science in a Data Science Class

Togo Kida
graphtogo
6 min read · Jan 29, 2020
Things I learned about Data Science in One Semester

In fall 2018, when I entered Harvard as a graduate student, I was struggling a great deal with statistics and probability.

As someone who was initially trained in design and media arts and had been working in the advertising industry, I found statistics and probability entirely new. Why did I push myself to take the class? Because I wanted to take a data science class later on.

After completing my class in statistics and probability, I went on to take my first data science class in fall 2019. Just like in my statistics and probability class, I struggled like hell, but I am so glad that I took it. I had never taken a formal computer science class in an academic environment, and as a newcomer, I would like to write down a few things about the course.

What Motivates a Creative to Study Data Science?

To begin with, why would someone with a creative background take a data science class? I was initially trained in design and media arts as an undergraduate, and I have spent the last seven years in a creative division at an advertising agency. Using digital technologies for creative expression came quite naturally to me.

I have played around with Processing and openFrameworks, both to build prototypes for work and to pursue my own creative expression. However, after reading John Maeda’s recent Design in Tech Report and other articles, it occurred to me that data might open up a new horizon for my creative capabilities. With that insight, I came to understand the importance of getting a formal data science education while in graduate school.

About the Class I Took

In fall 2019, I took two data science classes, and one of them was APCOMP209A (CS109A), which I am writing about here. The class is known as the introductory data science course at Harvard. It requires students to have taken two of the most popular courses at Harvard: CS50 (Introduction to Computer Science) and STAT110 (Introduction to Probability). It’s a legit computer science class.

I wanted to take it during my first semester at Harvard. Still, I lacked the knowledge of probability, which kept me from completing the knowledge assessment assignment required to get into the class. To my dismay, I had to change my plan. Given that circumstance, I took STAT110 first and experienced hell, but after this hell, the knowledge assessment assignment became a piece of cake. I felt awesome.

To master a new field, I realized that it’s crucial to appreciate the work done by those who came before us, and not to ignore the intellectual assets they built. I was officially cleared to take my first data science class, but this also came with great suffering…

The Diversity of Students

Before taking this class, I was feeling slightly insecure. This is a computer science class, and a popular one at that. It must be packed with CS students… could I keep up with them? However, once the class began, my assumptions were proven wrong.

The class was hard for sure, but I was surprised to see so many non-CS students coming from all over the university. The course mixed undergraduate and graduate students, and it was also filled with students from other schools, such as the education school and the business school. People may ask why, but there were several design school students, including me.

Moreover, some students were participating in this class online, listening to the lectures from all over the United States. On the class website, numerous students were exchanging messages to form study groups in their respective cities. The eagerness of the students was almost visible.

I spoke to a few non-CS students, and they were all interested in applying data science skills in their respective fields. I thought this was great. Many say that data science is only meant for experts. However, a crossover like this can enable further innovation, and I think it will be beneficial in the long run.

Let’s Do it Together

The course was structured well. It starts with the very basics and walks you through the paradigms of data science, even if you have never studied it before. You begin with linear regression, then move on to k-NN, logistic regression, decision trees, and random forests, and finally arrive at neural networks.

When I was still working in Japan, I tried to learn deep learning on my own, but I didn’t really “get” it. Looking back, I think this is because I lacked an understanding of the fundamental paradigms of data science. This class enabled me to build the basics from the bottom up.

There were two lectures/labs every week for this class. Assignments were due every one to two weeks. It was hard work, but the same was true of STAT110, so that was nothing new to me. One difference was that this class encouraged students to work in pairs. This pair-programming approach was highly effective.

According to GitHub, I Was Writing Something Almost Every Day

It’s essential to work on the problems on your own for some amount of time, but it is also a great learning experience to share the process with your partner and collaborate further through discussion and trial and error. I learned that we learn better by working together.

Let’s Think About it Together

Another thing that I found unique about this class was the ethics module incorporated into the course. A philosophy researcher at Harvard came to class, and together we discussed how the use of data can lead to prejudice and discrimination, looking at real-world examples.

A Philosophy Researcher Leading the Discussion in Class

I thought this ethics module was practical in the sense that students could understand how discrimination in data happens technically, so the discussion that followed was down to earth, without anything fluffy. Many of the students actively participated in it.

Final Project

The grand finale of the semester was the final project. For mine, I teamed up with three other classmates and worked on a recommendation engine for Spotify. Just like with the paired homework, I found it exhilarating to make progress while having great discussions within the team. Moreover, since the final project ran over a longer period, I found it meaningful to work in something closer to a real-world setting.

Discussion We Had Together as a Team

The members of the project team were also memorable to me. Besides me, the team consisted of two data science master’s students and a neuroscience professor from Tufts University. The neuroscience professor told me that he was on sabbatical leave and was taking numerous classes to brush up his skills during that time. He was older than me, and I was deeply impressed by his dedication to continually learning something new.

After going through the semester, I feel I was able to build a foundation in data science. More than that, I met many impressive figures and had a glimpse of what a data science education looks like in the United States.

Togo Kida
graphtogo

Creative. Marketer. Strategist. Technologist. Formerly at UCLA, Harvard, Dentsu, and Uniqlo. 100 Leading Global Thinkers 2016. Creativity, design & data.