Want to become a Data Scientist?

Jaskaran kaur cheema
SFU Professional Computer Science
9 min read · Mar 15, 2019

Then read the journeys of three successful data scientists for inspiration and lessons on how to excel in the data science industry. ✌️

By: Fatemeh Renani, Mohammad Mazraeh, Jaskaran Kaur Cheema

Infographic: Jaskaran Kaur Cheema

“Torture the data, it will confess to anything” - Ronald Coase

Because data is now generated at an enormous scale, the modern business marketplace is becoming a data-driven environment. Decisions are made on the basis of facts, trends, and analysis drawn from data. Moreover, automation and machine learning are becoming core components of IT strategies. As a result, the role of Data Scientists and Data Engineers is becoming increasingly important.

In this blog post, we recount the journeys of three Data Scientists who have different educational backgrounds and career paths but have each successfully carved out a niche in the data science industry.

We hope that their journeys will inspire you to excel in the data science industry.

MANROOP KAUR, Data Engineer, ICBC

Manroop Kaur is a Data Engineer at ICBC Vancouver. She is a graduate of SFU’s Professional Master of Science in Computer Science program, specializing in Big Data.

Can you tell us about ICBC and your current role?

ICBC was established to provide basic insurance and manage claims, which remain the core of the company’s business. At present, the company is working on the Rate Affordability Action Plan (RAAP), a project that will fundamentally change its business model to create a sustainable auto insurance system with more affordable and fair rates for all. As part of this project, I work as a Data Engineer with the Claims and Driver Licensing teams in the Information Management department.

What convinced you to venture into the Big Data field?

While working at Tech Mahindra, I heard about a project in which data was being moved from a traditional database to Hadoop. That was the first time I came across big data terminology, and I started exploring it by reading online articles. Since I already wanted to further my education, I decided to venture into this field. SFU’s Professional Master’s program was a perfect fit, so I applied and was accepted.

Can you describe your career journey after enrolling in the Big Data program?

While at SFU, I did my co-op at WorkSafeBC. My work focused on text analysis, advanced analytics, and applying machine learning algorithms. After that I applied to ICBC, and I have now been working there as a Data Engineer for a year.

Are there any courses you would recommend for succeeding in this program?

I believe the Big Data program at SFU is structured so well that if you diligently complete the assignments in Programming Lab 1 and 2, you do not need any other course.

Can you describe one of your most interesting projects?

I remember a project during my internship on detecting the likelihood that a claim is fraudulent. We analyzed claim data from the past five years. We held regular meetings with field investigators to learn about the red flags they look for, and then analyzed the data using those red flags. This project taught me that in an academic setting we focus on obtaining high accuracy, but in real-life problems accuracy can have a different definition. The model the data science team was building would be considered successful if it could detect even 40 out of 500 claims that were actually fraudulent.
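To make the point about accuracy concrete, here is a minimal Python sketch. All numbers other than the 40 and 500 are invented for illustration, and reading the figure as "40 of 500 truly fraudulent claims detected" is an assumption on our part, not something stated in the interview. It compares a hypothetical model against a baseline that never flags any claim.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = fraud)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(tp, fp, fn, tn):
    """Share of all claims that were labelled correctly."""
    return (tp + tn) / (tp + fp + fn + tn)


def recall(tp, fn):
    """Share of truly fraudulent claims that were caught."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def precision(tp, fp):
    """Share of flagged claims that were truly fraudulent."""
    return tp / (tp + fp) if (tp + fp) else 0.0


# Hypothetical data set: 10,000 claims, 500 of them fraudulent (assumption).
y_true = [1] * 500 + [0] * 9500

# Baseline that never flags a claim.
never_flag = [0] * 10000

# Hypothetical model: catches 40 frauds, misses 460, raises 60 false alarms.
model = [1] * 40 + [0] * 460 + [1] * 60 + [0] * 9440

base_counts = confusion_counts(y_true, never_flag)
model_counts = confusion_counts(y_true, model)

base_accuracy = accuracy(*base_counts)
model_accuracy = accuracy(*model_counts)
model_recall = recall(model_counts[0], model_counts[2])
model_precision = precision(model_counts[0], model_counts[1])

print("baseline accuracy:", base_accuracy)
print("model accuracy:", model_accuracy)
print("model recall:", model_recall)
print("model precision:", model_precision)
```

Under these made-up numbers the do-nothing baseline scores slightly higher on accuracy than the model, yet only the model surfaces any real fraud. That is one way a business can reasonably define success differently from plain accuracy.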

Is there an interesting lesson you have learned from working in this field?

When I started learning data science, I used to get very excited about applying ML algorithms and seeing my model’s output, without spending much time analyzing or cleaning the data. Later, I realized that data plays a vital role and that preparing it takes 90% of the time. But since a model’s performance depends on the data fed to it, the preparation time is worth the effort.

How do you reflect on your decision to enroll in this program?

I think the decision to pursue a Master’s degree in Big Data at SFU has proved worth the time and resources I invested in it. It not only gave me an education in line with industry requirements but also helped me secure a good job.

Any advice for people who want to venture into this field?

I think focusing on one domain, rather than trying to do everything in data science, and updating your skills regularly will lead to a successful career.

HAMED KARIMI, Principal Data Scientist, Find Innovations Lab Inc.

Hamed Karimi is a physicist by education. He holds a PhD in physics from UBC and worked as a postdoctoral researcher in UBC’s Computer Science department. He previously worked at 1QBit as a data scientist and optimization researcher, and he has been Principal Data Scientist at FIND Innovation Labs Inc. since July 2017.

Could you please tell me a little about FIND Innovation Labs and your position?

At FIND Innovation Labs, we help small and medium-sized retailers achieve faster inventory turns and reduce markdowns by creating smart promotions. We build a platform where retailers can view their custom map. We apply various machine learning algorithms, NLP methods, and unsupervised learning to identify customer profiles. By implementing a sophisticated recommender system, we help retailers target customers with the appropriate promotion at the right time.

How would you compare positions in academia and industry?

In industry you work on real-world problems. Although you may not do profound research, you see the immediate outcome of your work, which is really satisfying and encouraging. In academia you do very profound research, but it may not have a real-world application even in the near future.

How would you compare these three types of data scientist: self-taught, graduate of a professional degree in data science, and computer science graduate?

  • Self-taught data scientists are very motivated and usually quick learners, since they learn out of their own desire.
  • On the other hand, since specialized degrees are designed around real-world challenges and applications, their graduates are one step ahead of self-taught people.
  • Graduate students in computer science, and even machine learning researchers, are very focused on research, so to transition to industry they need to learn the applications and the tools.

Do you see this field as an ongoing learning path (updating your skill set)?

Data science is a fast-developing field, so learning has to be part of your daily work. You have to stay on top of new tools, research articles, applications, and algorithms; otherwise you become outdated in just a couple of months.

What skill set does your company look for in potential candidates during the hiring process?

Basic requirements: proficiency in Python, a relevant education (sciences, math, physics), and experience working on machine learning projects.

Personality: self-driven when it comes to research. Start-ups are very fast-paced environments, so data scientists need to explore different ideas and look at problems from different perspectives. Research experience is useful for that.

Any words for people who wish to venture into the data science field?

I think it is crucial for new data scientists to have the right expectations of the job and to know the reality. As a data scientist you might spend 80–90 percent of your time collecting data, cleaning data, and developing a baseline. Training the algorithm is the very last step of the process.

RAMTIN SERAJ, Lead Machine Learning Engineer, Axiom Zen

Ramtin Seraj is the lead machine learning engineer in the so-called distributed hierarchies of responsibilities at Axiom Zen. He is responsible for architecting data solutions and making sure everything works end-to-end. He completed his bachelor’s and master’s degrees in computer science with a focus on machine learning and NLP.

What should computing science students learn, beyond the courses they take at university, to become good data scientists or engineers?

The first important thing is communication. Writing documentation, preparing presentations, and storytelling are essential skills that are sometimes missing from curricula.

Even if you are very good at training models and finding patterns in data, it is very important to frame your work and convert it into actionable items. It’s also really important to learn how to present your findings in a way that everybody can understand.

How can one be a successful data scientist at work?

Based on my experience, there are a few important rules:

  • First, you should fully understand the product and the market before searching for patterns. It’s really important to understand how things work and what the logic behind them is. Even if some patterns are trivial, you should try to find them in the data to make sure your ETL is working well and you have a good understanding of the system. You should fully understand the problem you’re trying to solve.
  • If you think that, as a data scientist, your job will be to train a model on top of well-prepared, clean data sets, with a baseline to compare your model against, you’re wrong.
  • Doing data science mostly means doing a lot of data cleaning and figuring out how to aggregate your data. Most of the time there is no baseline to compare your results against, so you should make sure to validate your data from different angles.
  • And lastly, be prepared to fail at finding a pattern more than 90% of the time.
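As a rough illustration of validating data from several angles, the Python sketch below runs a few simple checks over a batch of records. The field names (`order_id`, `amount`, `customer`), the sample rows, and the specific checks are hypothetical and chosen only for this example; they are not taken from the interview.

```python
def validate_records(records, required_fields, id_field="order_id"):
    """Return simple data-quality counts for a list of dict records."""
    seen_ids = set()
    report = {"rows": 0, "missing": 0, "duplicates": 0, "negative_amount": 0}
    for row in records:
        report["rows"] += 1
        # A row counts as "missing" if any required field is absent or empty.
        if any(row.get(field) in (None, "") for field in required_fields):
            report["missing"] += 1
        # The same identifier appearing twice often points to an ETL problem.
        row_id = row.get(id_field)
        if row_id in seen_ids:
            report["duplicates"] += 1
        seen_ids.add(row_id)
        # A trivial domain rule: amounts should never be negative.
        amount = row.get("amount")
        if isinstance(amount, (int, float)) and amount < 0:
            report["negative_amount"] += 1
    return report


# Hypothetical sample batch with deliberately planted problems.
sample = [
    {"order_id": 1, "amount": 20.0, "customer": "a"},
    {"order_id": 2, "amount": None, "customer": "b"},
    {"order_id": 2, "amount": -5.0, "customer": "c"},
    {"order_id": 3, "amount": 12.5, "customer": ""},
]

report = validate_records(sample, ["order_id", "amount", "customer"])
print(report)
```

Checks like these are deliberately trivial. The idea is that if even the obvious rules fail, the pipeline or your understanding of the system needs attention before any modelling starts.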

What characteristics should one have to be able to successfully lead a data science-based project?

You need to have some experience with everything you can think of and be prepared to do whatever it takes for the team.

As I mentioned before, 90% of the time you will be cleaning the data and dealing with noise, sparsity, and the regulations that apply to the data. You need to build the skills to deal with these problems. Data science is a toolbox: sometimes a high-performance tool does not scale well, and sometimes you need a low-latency service and have to sacrifice some accuracy for it. As a team lead, you should be able to deal with these challenges and know the tools in your toolbox.

Have you ever wanted to build your own company? Have you ever tried? Why or why not?

Everyone I know in this industry has thought about starting his or her own company at some point, but the reality is that building a company is really challenging and requires a perfect team to succeed. I’ve seen many startups, even ones with a great team at the start, suffer from a lack of resources, and people leave companies because of financial problems. I have the opportunity to build a company inside my current company, and I see Axiom Zen as a home for a group of talented people with diverse backgrounds whom I can learn from every day.

What do you think about applying data science to cryptocurrency and blockchain data?

There are many interesting data science challenges in blockchain. One example is gas price estimation for the Ethereum network. To run code that changes the state of a smart contract, you have to pay for gas, and for each transaction you can set a gas price. If you set a high gas price, you may lose a lot of money, and if you set a low one, there is a chance your transaction will never be mined. Many factors affect the gas price, and predicting the right gas price for the next transaction is one of the interesting problems you see these days.
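To give a feel for the trade-off, here is a deliberately naive Python sketch that suggests a gas price as a percentile of prices paid by recently mined transactions. This is only an illustrative heuristic under our own assumptions, not the approach used at Axiom Zen and not a real predictor. The function name, the default percentile of 60, and the sample prices are all hypothetical.

```python
def suggest_gas_price(recent_prices_gwei, percentile=60):
    """Suggest a gas price using the nearest-rank percentile of recent prices.

    A higher percentile means paying more for a better chance of being mined;
    a lower percentile is cheaper but riskier.
    """
    if not recent_prices_gwei:
        raise ValueError("need at least one recent gas price")
    if not isinstance(percentile, int) or not 1 <= percentile <= 100:
        raise ValueError("percentile must be an integer from 1 to 100")
    ordered = sorted(recent_prices_gwei)
    # Nearest-rank method with integer arithmetic: rank = ceil(p * n / 100).
    rank = (percentile * len(ordered) + 99) // 100
    return ordered[rank - 1]


# Hypothetical gas prices (in gwei) of recently mined transactions.
recent = [10, 12, 15, 20, 22, 25, 30, 40, 60, 100]

cheap = suggest_gas_price(recent, percentile=10)    # low cost, may never be mined
default = suggest_gas_price(recent)                 # middle-of-the-road choice
urgent = suggest_gas_price(recent, percentile=90)   # likely mined, but expensive

print(cheap, default, urgent)
```

A real estimator would have to account for the many other factors mentioned above, such as how congested the network is at the moment, which is what makes the problem interesting.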

Jaskaran kaur cheema
SFU Professional Computer Science

Grad Student at SFU | Data Science | Machine Learning| Data Visualization