“Using College to Start Your Career Right”: Interview with Jesse Steinweg-Woods, Ph.D., Senior Data Scientist at tronc

I would ignore those who say you absolutely need to know big data tools/deep learning right off the bat, because most likely you won’t need them at first to solve many of your company’s problems. — Jesse Steinweg-Woods

Vimarsh Karbhari
Acing AI
7 min read · May 15, 2018

Learning from experts and mentors has been paramount in my career. The goal of Acing AI is to help people get into AI, and that is not possible without insights from experts. This article is the first of many in which I will share my conversations with experts in the field. This week we have Jesse Steinweg-Woods with us. Jesse is a Senior Data Scientist at tronc (Tribune Online Content), where he works on building a recommender system for news articles and on understanding customer behaviour for a variety of online news content. Apart from his stellar experience, his website has some amazing blog articles and book reviews, which I personally liked. I curated a list of questions for Jesse, and I am grateful to him for having spent the time to answer them. Now on to the conversation…

Vimarsh Karbhari (VK): Which three books about AI/ML/DS have you liked the most? What books have had the most impact on your career?

Jesse Steinweg-Woods (JS):

I have many other book reviews at my website: Book Reviews

VK: Which tools (software, hardware, or habits) that you use as a data scientist have the most impact on your work?

JS: GitHub and the Python library pandas.

GitHub because it keeps all of my code in one place and lets me share it with my team members (and use some of their code in my work). It also provides version control, code reviews, and a backup for my code.

Pandas because it allows me to prototype quickly and load data in a variety of formats easily.
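As a rough illustration of that point, the sketch below loads the same small dataset from CSV and from JSON with one call each. The records and the column names (`user_id`, `clicks`) are made up for the example; in practice the inputs would be files or URLs rather than in-memory strings.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for real files.
csv_text = "user_id,clicks\n1,3\n2,5\n3,4\n"
json_text = (
    '[{"user_id": 1, "clicks": 3},'
    ' {"user_id": 2, "clicks": 5},'
    ' {"user_id": 3, "clicks": 4}]'
)

# read_csv and read_json both accept file-like objects as well as paths.
csv_df = pd.read_csv(io.StringIO(csv_text))
json_df = pd.read_json(io.StringIO(json_text))

# A quick prototyping step: total clicks from each source.
csv_total = int(csv_df["clicks"].sum())
json_total = int(json_df["clicks"].sum())
print(csv_total, json_total)
```

Both frames end up with the same columns, so the rest of a prototype does not need to care which format the data arrived in.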

VK: Can you share the data science failures, projects, or experiments that you have learned from the most?

JS: Bugs in the recommender system I helped build showed me the importance of logging in machine learning systems that are used in a production setting.
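A minimal sketch of what such logging might look like, using Python's standard `logging` module. The scoring function, the feature names (`recency`, `popularity`), and the request IDs are hypothetical stand-ins, not the system described in the interview.

```python
import logging

logger = logging.getLogger("recommender")


def score_article(features):
    """Stand-in model: weighted sum of two hypothetical features."""
    return 0.7 * features["recency"] + 0.3 * features["popularity"]


def predict_with_logging(request_id, features):
    """Score one request and log the inputs and output for later debugging."""
    try:
        score = score_article(features)
    except Exception:
        # Record the traceback together with the input that caused it.
        logger.exception("request=%s failed features=%r", request_id, features)
        raise
    logger.info("request=%s features=%r score=%.3f", request_id, features, score)
    return score


if __name__ == "__main__":
    logging.basicConfig(
        level=logging.INFO,
        format="%(asctime)s %(name)s %(levelname)s %(message)s",
    )
    predict_with_logging("r1", {"recency": 0.9, "popularity": 0.4})
```

Logging the inputs next to the score is the design choice that matters here: when a recommendation looks wrong, the log shows what the model actually received.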

I also had a project at my last company where the project didn’t work because my assumptions about how the data was being collected were totally flawed.

In the churn model I designed, negative examples weren’t being generated properly and were introducing a bias in the model. I didn’t detect this until I compared the underlying distributions of the features. If a model isn’t working properly in real life, run this comparison to see whether it picked up some sort of bias during training.
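One way to compare feature distributions, sketched here under stated assumptions, is the two-sample Kolmogorov-Smirnov statistic: the largest gap between the empirical cumulative distributions of a feature in the training data and in live data. The feature names and the 0.3 threshold are arbitrary choices for illustration, and this is not necessarily the check Jesse used.

```python
import bisect


def ks_statistic(sample_a, sample_b):
    """Largest gap between the empirical CDFs of two samples (0 = identical, 1 = disjoint)."""
    a, b = sorted(sample_a), sorted(sample_b)
    gap = 0.0
    # The largest gap between two step functions occurs at an observed value.
    for x in a + b:
        cdf_a = bisect.bisect_right(a, x) / len(a)
        cdf_b = bisect.bisect_right(b, x) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


def drifted_features(train_rows, live_rows, threshold=0.3):
    """Return features whose training and live distributions differ by more than threshold."""
    flagged = {}
    for name in train_rows[0]:
        stat = ks_statistic(
            [row[name] for row in train_rows],
            [row[name] for row in live_rows],
        )
        if stat > threshold:
            flagged[name] = round(stat, 3)
    return flagged


# Hypothetical rows: one feature shifts between training and live data, one does not.
train = [{"days_since_login": d, "sessions": d % 5} for d in range(10)]
live = [{"days_since_login": d + 20, "sessions": d % 5} for d in range(10)]
print(drifted_features(train, live))
```

A feature that gets flagged is a prompt to look at how the training examples were generated, not proof of a bug on its own.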

VK: If you were to write a book, what would its title be? What main topics would you cover in it?

JS: “Using College to Start Your Career Right.” I don’t think I handled college quite right the first time. Universities don’t focus enough on teaching students career planning skills, and I feel this is sorely lacking in higher education. I would like to help solve this problem with a book from my own perspective.

VK: In terms of time, money, or energy, what are the best investments you have made that have given you compounded rewards in your career? It can be anything: a book, project, conference, or meetup.

JS:

  1. Designing my own website and pet projects. I learned so much about data science and software engineering doing this.
  2. Following top data scientists and machine learning academics on Twitter. You learn a lot that way about new publications, novel applications, or just funny anecdotes others in the field share.
  3. Networking with other data scientists. This provides a great way to share ideas and new discoveries, along with tips from others that can help you in your own projects.

VK: What are some absurd ideas around data science experiments/projects that are not intuitive to people looking from outside in?

JS: Sometimes people don’t really understand how machine learning works. Those unfamiliar with it think the computer is “thinking for itself,” when all we are really doing is trying to generate a program that can infer something from past examples and apply it to future situations.

VK: In the last year, what has improved your work life which could benefit others?

JS: Try to minimize meetings unless they are absolutely essential. I find I am more productive this way.

VK: What advice would you give to someone starting in this field? What advice should they ignore?

JS: Aim for low-hanging fruit/easy wins first. Sometimes data scientists try to take on projects that are too complicated when simpler ones could be finished in far less time and generate results for their organization much sooner.

I would ignore those who say you absolutely need to know big data tools/deep learning right off the bat, because most likely you won’t need them at first to solve many of your company’s problems.

VK: What bad recommendations are given in data science, in your opinion?

JS: People who claim you can become a data scientist in just six months without a closely related background are probably not correct in most cases. The field is very broad and there is a lot to learn. I also don’t like it when people mix up data scientist/data analyst roles. They are not the same thing in my opinion. Both serve different needs and have different skill sets.

VK: How do you decide when to say no to experiments or projects?

JS: I try to weigh the impact a project could deliver to the organization against how much time I think it would take. I prioritize the projects that can deliver the most impact in the least amount of time. It also depends on what data is available, however, as you may need data that isn’t yet available to do the project.

VK: Do you ever feel overwhelmed by the amount of data, the size of an experiment, or a data problem? If yes, what do you do to clear your mind?

JS: When starting out with a new database, especially if it is poorly documented (and they usually are, unfortunately), it can be incredibly easy to be overwhelmed. I try to work through one table at a time inside the database and figure out what the columns mean and whether they may be of value to me. I also try to figure out how the tables are related to each other and how to join them together. This requires patience, but you eventually get through it.
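The table-by-table walk described above can be sketched with Python's built-in `sqlite3` module. The two-table schema (`users`, `visits`) is a hypothetical stand-in for a real, undocumented database, and other database engines expose the same information through different catalog queries.

```python
import sqlite3


def describe_database(conn):
    """Map each table name to its columns and declared foreign keys."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    summary = {}
    for table in tables:
        # PRAGMA table_info rows: (cid, name, type, notnull, default, pk)
        columns = [row[1] for row in conn.execute(f"PRAGMA table_info('{table}')")]
        # PRAGMA foreign_key_list rows: (id, seq, table, from, to, ...)
        links = [
            (row[3], row[2], row[4])
            for row in conn.execute(f"PRAGMA foreign_key_list('{table}')")
        ]
        summary[table] = {"columns": columns, "foreign_keys": links}
    return summary


# Hypothetical schema for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (id INTEGER PRIMARY KEY, signup_date TEXT);
    CREATE TABLE visits (
        id INTEGER PRIMARY KEY,
        user_id INTEGER REFERENCES users(id),
        article TEXT
    );
    """
)
for table, info in describe_database(conn).items():
    print(table, info)
```

Each foreign key comes back as (local column, referenced table, referenced column), which is a first hint at how to join the tables. Poorly documented databases often lack declared keys, so matching column names by eye is still part of the work.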

VK: How do you think about presenting your hypothesis/outcomes once you have reached a solution/finding?

JS: I try to put myself in the other person’s shoes and ask what would be the best way of getting them to understand a project outcome. This can be especially difficult if the other person isn’t very quantitative by nature, so in this case you really have to narrow down the outcomes of your experiment or project to the absolutely essential parts.

VK: What is the role of intuition in your day-to-day job and in making big decisions at work?

JS: Intuition definitely helps when deciding what features may be good in a model. It also helps in deciding which projects are worth doing. Unfortunately, both of these only get better with experience.

VK: In your opinion, what is the ideal organizational placement for a data team?

JS: It honestly depends on the team and company. I think Type B data scientists (more focused on software engineering) need to be closely paired with engineering. Type A data scientists (more focused on analysis) need to be closely paired with product or the CEO.

VK: If you could redo your career today, what would you do?

JS: I probably would have either taken more software engineering classes during college or done an internship that had a larger software engineering component during undergrad than I did. I had to learn a lot of software engineering in a very short period of time. While that was exciting and I eventually caught up, it was also challenging. I think it would have been less so if I had more of a familiarity with software engineering best practices before I started to transition into data science out of academia. Software engineering in academia is very different from industry.

VK: What are your filters to reduce bias in an experiment?

JS: Make sure the distributions in your control and treatment groups are as similar to each other as possible, and that both groups are representative of the population you care about. Randomization is a good way to help reduce bias.
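A minimal sketch of both ideas: random assignment, followed by a balance check on one covariate using the standardized mean difference. The `weekly_visits` values are invented, and the 0.1 cutoff mentioned afterwards is a common rule of thumb rather than something from the interview.

```python
import random
import statistics


def randomize(units, seed=0):
    """Shuffle units with a fixed seed and split them into control and treatment groups."""
    shuffled = list(units)
    random.Random(seed).shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def standardized_mean_difference(control, treatment):
    """Difference in group means divided by the pooled standard deviation."""
    diff = statistics.mean(treatment) - statistics.mean(control)
    pooled_var = (statistics.variance(control) + statistics.variance(treatment)) / 2
    if pooled_var == 0:
        # Both groups are constant: balanced only if the constants match.
        return 0.0 if diff == 0 else float("inf")
    return diff / pooled_var ** 0.5


# Hypothetical covariate: weekly visits per user before the experiment starts.
weekly_visits = [1, 2, 2, 3, 3, 3, 4, 4, 5, 6, 7, 9]
control, treatment = randomize(weekly_visits, seed=42)
print(control, treatment, standardized_mean_difference(control, treatment))
```

An absolute value below roughly 0.1 is often read as acceptable balance on that covariate. With groups this small, randomization can still leave a visible imbalance by chance, which is one reason to check rather than assume.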

VK: When you hire data scientists, data engineers, or ML engineers, what are the top three technical/non-technical skills you are looking for?

JS: Assuming I wanted to hire a data scientist (more focused on building products, similar to a machine learning engineer):

  • Strong knowledge of machine learning
  • Decent software engineering skills
  • Good communicator

VK: What online blogs/people do you follow for getting advice/learning more about DS?

JS: I like datatau.com a lot for keeping up to date on things. Twitter is also great if you know who to follow. I prefer to follow a mix of leading researchers in academia and top data scientists at companies. This allows me to get practical tips for projects along with new ideas/tools from research groups. If you want an easy way to get started, just see who I follow on Twitter and branch out from there as you find your own interests. My handle is @jmsteinw.

A video featuring Jesse which might be very helpful to all readers:

How to become a Data Scientist in 2017?

I once again want to thank Jesse for sharing his knowledge with us!