5 Minutes with Arthur Spirling, Deputy Director and Director of Graduate Studies

Learn more about the man behind one of the most competitive M.S. in Data Science programs in the nation, and his vision for our CDS Master’s students

Arthur Spirling joined NYU in 2015 as an Associate Professor of Politics and Data Science, and as a steering committee member of the Moore-Sloan Data Science Environment, a grant housed by CDS.

This fall (2017), he not only continues as our Deputy Director but also begins his tenure as our Director of Graduate Studies for the Master’s in Data Science program. His research focuses on analyzing text-as-data to answer questions in political science, and he is also the co-organizer of the text-as-data speaker series with Professor Sam Bowman. Before coming to NYU, he was the John L. Loeb Associate Professor of the Social Sciences at Harvard University.

1. The M.S. program in Data Science has been rapidly evolving since we launched it in 2013. What are some of the new changes to the program this year?

As with every year of the MSDS, we have expanded our numbers — yet simultaneously become more selective. Our fantastic incoming cohort of 96 were selected from some 1659 applicants, meaning our program is considerably harder to enter than many top law schools or business schools. Every single one of our new students has truly remarkable potential and I know my colleagues are very happy to be working with them.

On the curriculum side, this year we are excited about the addition of ‘tracks’ that allow our students to specialize in certain areas of Data Science, like Big Data, Math and Data, Natural Language Processing and Physics. Next year, we will add Biology.

Ultimately, we want to provide a program structure that links domain knowledge and methods via the course offerings. This enables CDS and its students to place themselves at the center of Data Science as the field’s popularity explodes.

Part of our vision at CDS is that Data Science is not just about learning powerful new methods; it’s also about having a deep understanding of the ways those methods can be used in the ‘real world’ of industry and policy. Our tracks — and the demand for them from students! — are proving a great way to make this vision a reality.

2. In addition to your role as DGS, you also co-organize our popular text-as-data speaker series. Analyzing text-as-data is a major part of your research. How did you start getting involved with analyzing text-as-data? Why do you think it’s becoming such a compelling methodology?

Human beings have been writing things down for around 5000 years, but it’s only very recently in human history that anyone other than the social elite was producing texts. And, even when they did, their writing often wasn’t preserved for future research purposes.

A major change came with the advent of the Internet, social media, and news sites.

Now, literally billions of people write billions of words every day, whether they be online newspapers, product reviews, government reports, or someone commenting on how cute a friend’s baby looks on Facebook.

From a research perspective, we can easily access that information in machine readable form: these huge troves of text data are ready to be analyzed immediately, often in real time.

At the same time, the technology to make older documents machine readable has also advanced remarkably: one can now take, say, government records from World War II, push them through an optical character recognition system, and have quite high quality documents amenable to statistical work. With the explosion of text data have come methods for dealing with these collections, and that symbiotic relationship seems set to continue.

Personally, I became involved in text-as-data because I was studying a particular historical puzzle: the democratization of the UK in the 19th Century. It’s an interesting case because in a relatively short period of time (around 80 years) politicians there embarked on very radical reform, going from a narrow franchise where almost no one could vote, to one where almost everyone could.

This is surprising: generally, elites don’t voluntarily give up power to people poorer and less educated than they are. But, more broadly, I noticed that there was actually a lot we didn’t know about politicians and voters back then: how they interacted with each other, how they spoke in parliament, how they organized policy-making and so on. Simultaneously, I also realized that we had millions of records of speeches from which one could make an inference. So I turned to modern technology to understand these events: it’s been a great experience, and I learned a lot both substantively and in terms of methods!

3. You’ve been at CDS for a while now! What has been your favorite experience so far?

Time flies: it’s my third year! My favorite experience is seeing how MSDS students develop during their time with us, and helping them accomplish their goals professionally.

Our students work very hard: our courses are technically tough, and demand many hours of focus and effort. The students persevere and, more often than not, they land their dream jobs at tech firms, banks, in government, or in academic research institutions. They are rightly proud of how far they have come, and we are proud of them, too.

Interview conducted by Cherrie Kwok
