DSX Fireside Chat with Sri Krishnamurthy — Founder of QuantUniversity + Northeastern University Data Science Professor
We sat down with Sri Krishnamurthy, the founder of QuantUniversity.com (a data and quantitative analysis company) and adjunct data science professor at Northeastern University, to talk about his background in both exciting roles, his experience with data science + machine learning, and his perspective on IBM’s Data Science Experience (DSX).
- We’ll be coming out with Part II of this blog post with a DSX notebook that Sri and his students have been using as part of his data science course!
1. Tell us a bit about you — who are you Sri?
I actually have two roles. My first role is as an entrepreneur: I run QuantUniversity, an analytics advisory that started in 2013. Before QuantUniversity, I led financial engineering at MathWorks for 5 years. From there, I went on to start my own consultancy, where we focus on verticals like finance and energy. Our clients have things like scale and usability in mind but, among all the hype, no one has a good business use case yet for machine learning and artificial intelligence. So what we do is work closely with customers and relay the business value of putting these things into play.
Our customers are asking ‘how do we apply these things?’ So we provide B2B training, and we have taken that experience and grown a B2C approach through 2–3 day workshops. During these workshops, we bring in professionals and train them in these new fields of machine learning for predictive analytics, from both a technology and a business perspective.
We chose Data Science Experience as the platform for these workshops because we don’t want to install everything from scratch just so that participants can get a feel for what it takes to build a Spark or data science application, or to build and share content.
2. These workshops sound really cool! When are the next sessions?
We have two workshops coming up in Boston and New York on deep learning and anomaly detection: the Boston one is on March 27th and 28th, and the New York one is on April 5th and 6th.
Learn more about + register for the upcoming Boston and NYC Deep Learning workshops and Anomaly Detection workshops
We’ll have many more workshops coming out on topics like Spark, cognitive computing, and artificial intelligence. We’re also going to come out with an Analytics Certificate program (as a full summer program) and a Deep Learning certification in the second half of the year.
For more information on QuantUniversity, check out the QuantUniversity website
3. And what about the other role you have Sri?
Other than my professional role, I’m also an adjunct professor of data science at Northeastern University, where I teach a course on cognitive computing and AI. The class trains graduate students to build efficient and practical data science applications, leverage APIs, and structure AI apps in the cloud. In the class, we’re asking students to work on Data Science Experience as well.
To help the students figure out how to actually build things, we give them templates through DSX. DSX allows them to share solutions with the entire class and get feedback, so it helps them collaborate in the classroom and share their analyses.
4. Nice! What are some of your favorite things about data science as a field? What about some frustrating aspects of data science?
Earlier in the data science industry, the main focus was “can we leverage data and look for insights from data?” Now, with the open source revolution, languages like R and Python maturing, and an increasing number of Hadoop + Apache projects, there’s an acceptance that you have to rely on data and leverage all the information you’re collecting. It’s been interesting to see the influence of machine learning in terms of getting insights from important pieces of research and inspiring new business innovation.
Data science is no longer simply done by someone with a PhD, since companies are looking for people who have expertise with large, streaming data sets. There is specialization happening, as well as the opportunity to evolve and place weight on various applications. Major revolutions in new products and services have come from all these developments.
As for the frustrating part: in a lot of schools and universities, courses on applied data science and machine learning simply aren’t offered. This is creating a huge knowledge gap because, even with all the online resources on YouTube and other sites, many students don’t have time to curate that material in order to learn. Thankfully, universities are coming to the realization that students are going into these careers and actually enjoying them. These students then tell recruitment teams and college career service offices about their roles, so universities get feedback about the need for these kinds of courses.
Universities are now bringing in industry faculty who skew towards practitioners — and students love hearing about their real-life experiences and working on how they can realistically apply data in their projects.
5. How has using DSX changed the way you do data science?
With DSX, you can just fire it up without having to spend time installing and configuring multiple different systems separately. For our workshops at universities, we use DSX to teach how to leverage these technologies for real problems, scale things on Apache Spark, and do machine learning on Apache Spark. When we’re putting together these workshops, we don’t want to devote so much time to installing packages. Instead, we can use DSX to focus on what we do best, since the infrastructure to collaborate is all there.
With DSX, we are also able to pull all the content (like course materials and assignments) together for the participants, and then we simply add the participants as collaborators. People loved being able to access everything on DSX because they didn’t want to build things from scratch on laptops with different operating systems and face the inevitable errors. For us as the workshop organizers, DSX helped us make dynamic changes, fix bugs, and let participants easily share their own solutions with the rest of the group.
6. What has DSX been useful for (for you and/or the people you work with)? What about things we can continue improving?
Especially with all the work we want to do around deep learning and training really large networks, the ability to use GPUs would be a major enhancement. I’m also looking forward to seeing Watson Machine Learning and wondering when it will be part of the product!
It would also be great to be able to structure my thought process in the GUI: to have a whole studio of drag + drop nodes, use that as a pipeline, and get REST APIs for coding. One use case that I’m really interested in is being able to publish my notebook as a REST API (and deploy it on Bluemix).
7. Last Question! What is an interesting and UNDERRATED trend that you see in data science?
Cross-validation of models: people just go for the first solution they build. Platforms like DSX help make cross-validation more intuitive (it becomes more parallelizable, and you can leverage multiple instances of Python and multi-processing). Doing things on a laptop is restrictive, but using the cloud lets users scale and tune parameters appropriately. Checking and refreshing the health of models as new data comes in is also highly underrated.
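To make the cross-validation point concrete, here is a minimal sketch in plain Python. It is an illustration only, not code from Sri’s workshops or from DSX: the function names (`k_fold_indices`, `score_fold`, `cross_validate`) and the toy “predict the training mean” model are assumptions made for the example. Each fold is scored independently of the others, which is what makes the technique easy to parallelize.

```python
import random
import statistics


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle the sample indices once and deal them into k disjoint folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and the number of samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def score_fold(task):
    """Fit on everything outside the fold; return mean squared error on the fold."""
    y, test_idx = task
    held_out = set(test_idx)
    train_y = [value for i, value in enumerate(y) if i not in held_out]
    # Toy stand-in for a real model: always predict the training mean.
    prediction = statistics.fmean(train_y)
    return statistics.fmean([(y[i] - prediction) ** 2 for i in test_idx])


def cross_validate(y, k=5, seed=0, map_fn=map):
    """Score every fold; map_fn decides whether folds run serially or in parallel."""
    tasks = [(y, fold) for fold in k_fold_indices(len(y), k, seed)]
    return list(map_fn(score_fold, tasks))


if __name__ == "__main__":
    targets = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
    fold_errors = cross_validate(targets, k=3)
    print("per-fold MSE:", fold_errors)
    print("mean MSE:", statistics.fmean(fold_errors))
```

Because the folds do not depend on each other, the `map_fn` argument can be swapped for a parallel map, for example the `map` method of a `concurrent.futures.ProcessPoolExecutor`, to spread the folds across several Python processes. Comparing the mean error across folds for a few candidate models is a more reliable guide than keeping the first solution that runs.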
Originally published at datascience.ibm.com on March 27, 2017.