On learning — and loving — data science

Junbo (Jake) Zhao tells us about his journey from China to the US and his passion for data

Junbo (Jake) Zhao is an MS in Data Science student in the class of ’16. Prior to studying with us, Jake received a Bachelor’s degree in Electronic Engineering. He also has experience in Computer Vision and Music Information Retrieval. Jake is primarily interested in Large Scale Machine Learning and Big Data.

I was born in Beijing, China, and graduated from Wuhan University with a degree in electrical engineering. In my sophomore year, I got the chance to work as a research assistant at the signal processing lab. My main project was pedestrian detection, which is where I first learned about, and fell in love with, machine learning. Then, as a senior, I got to work at the top university in China, Tsinghua University, on a face recognition project. I also did my first internship at Douban Inc., working on music information retrieval.

With those experiences under my belt, I got into the Center for Data Science at NYU. Soon after I landed in New York, I started seeking out research opportunities. I reached out to Ross Goroshin, a student of Professor Yann LeCun who graduated last year (2015). He was very nice, and spent several hours talking to me about his project. Ross has a research interest in auto-encoders, a family of deep learning models that I also happen to be interested in, since I implemented and experimented with such models during my internship at Douban Inc.

Not long after I got a desk in the CILVR lab open space, I started working on “Stacked What-Where Auto-encoders”, a model that tries to provide a generic architecture able to unify different learning modalities. Together with Michael Mathieu, Ross Goroshin and Professor LeCun, I got the paper accepted as an ICLR 2016 workshop presentation.

I also had the opportunity to collaborate with another of Professor LeCun’s students, Xiang Zhang, on an NLP project. Xiang had the idea of using convolutional neural networks for document classification based on a character-only representation of the corpus. I compared his scheme with traditional NLP approaches. We got the paper into NIPS 2015.

NIPS is one of the top-tier conferences in machine learning. Luckily, I got the opportunity to participate, presenting the poster for our paper with Xiang Zhang and Prof. LeCun. It was an amazing opportunity to discuss ideas and brainstorm with lots of well-known researchers and professors. It also helped us find people with common research interests with whom we could start collaborations. The diversity of thinking on similar topics in machine learning was one of the things that impressed me the most. It really gave me new perspectives.

All in all, I would say my experience at NYU’s Center for Data Science has proved to be a solid next step in my career. Not only is the course structure outstanding from a learning point of view, but the opportunities for research, collaboration and industry exposure make this school the best experience I could have as a data scientist right now.