Data Science Research: Where Do I Begin?

An Undergraduate Student’s Guide to Starting Data Science Research

Varshika Prasanna
NYU Data Science Review
Apr 12, 2022

Photo by Kaleidico on Unsplash

Data science is the cornerstone of research. Researchers use data science every day to derive insights, learn, and make inferences from data.

On February 14th, 2022, the NYU Center for Data Science hosted a “Research 101” panel where undergraduate students heard from experts in the field about how to get into research. So if you’re interested in pursuing research in data science or are confused about where to begin, this one’s for you!

What is Research in Data Science?

Currently, there are at least eight different disciplines that use data science on a regular basis. Prof. Sarah Shugars, CDS Moore-Sloan Faculty Fellow, says that:

Research in data science can be whatever you want it to be.

Regardless of the size of your data, you can answer any question from an interdisciplinary area by learning from others’ approaches.

There are many different types of research in data science. Prof. Jonathan Niles-Weed, Assistant Professor of Mathematics and Data Science, finds that research is an “interplay of application, methods & theory” and prefers to focus on the theory and proofs behind machine learning models.

Yara Kyrychenko, a senior studying Math (Honors) and Psychology, is interested in natural language processing and applying pre-existing methods to answer questions in the social sciences and psychology.

Ultimately, how you pursue research in data science depends on what interests you the most.

What are the must-have skills to get into research in data science?

Prof. Shugars says that today there are many great resources for conducting data science research. With a few lines of code, anyone can access state-of-the-art systems. The skill lies in understanding what they mean.

Sreyas Mohan, a CDS PhD student, said that calculus, probability, and linear algebra are the core mathematical skills required to understand data science models.

Swapneel Mehta, another CDS PhD student, feels that for those interested in an application-oriented perspective on data science research, a corporate setting would help familiarize you with the research process, since the use case would be given to you.

Yara said that a research methods class would help familiarize you with the process of experimental design.

Prof. Niles-Weed, who pursues research on the theoretical side of data science models, said that he looks for familiarity with proof writing and critical thinking in his fellow researchers.

How do you actually begin the process of getting into data science research?

Prof. Niles-Weed recommends the Research Experiences for Undergraduates (REU) program, where he got his start in math research before slowly transitioning to data science. At an REU, undergraduate researchers work on solving a math problem. This preparation can help you if you’re interested in grad school, although it’s not ideal for international students.

Yara recommends the SURE and the AM-SURE programs offered by NYU Courant, where even international students can get funding!

Sreyas recommends taking some graduate classes with project components if you just want to dip your toes into research. Prof. Niles-Weed affirms that taking a professor’s class is the best way to get their attention.

For more general resources, the DURF is always a great option! You can also look at organizations that do AI research, such as ML Collective, fast.ai, Data Science for Social Good, and AI for Good.

What classes at NYU can I take to learn more about research?

Swapneel recommends Causal Inference at Steinhardt (APSTA-GE 2012), and Sreyas recommends Machine Learning (CAS DS-GA 1003).

Causal Inference discusses methods to perform analyses that can answer important policy questions. Machine Learning covers a wide array of machine learning methods and statistical modeling.

What is the most difficult thing about research?

Patience, says Sreyas. Research is a human endeavor.

Unfortunately, sometimes things don’t work out the way you want them to. This is what makes research very frustrating, but also very rewarding.

All in all, research in data science is about the path you want to follow. I hope this article helps you figure out how to dip your toes in and get started!
