My Trajectory to Data Science:

Xuan Zhang
Walmart Global Tech Blog
Apr 27, 2022

How I Developed the Skillsets for a Career

While some data scientists have degrees in related fields such as statistics or computer science, many enter the field with training in other disciplines. The switch is neither easy nor quick, and people may get discouraged along the way. In this article, I would like to share my trajectory to data science. It may be useful to people who plan to start a similar journey.

My Background

During my undergraduate and graduate programs, I dealt with a lot of data, specifically geographical data. I majored in Geographical Information Science (GIS), which discovers locational insights and visualizes spatial data. I learned some Python, limited to predefined geoprocessing in GIS software. For data analysis, I also learned R statistical packages from online materials. While I enjoyed doing research, it took quite a while to get preliminary results, and the feedback loop was usually slow. On one occasion, friends working in the data science industry told me how rewarding they found their work, which encouraged me to pursue a career as a professional data scientist. I had the passion and the ability to draw insights from big data, far beyond a geospatial perspective. I also liked the fast pace of problem solving and the value I could bring to the table. That was when I started the transition, and it took roughly three years to go from square one to joining Walmart Global Tech as a full-time data scientist!

The journey

There are different domains in data science, and all require a solid statistical background and coding skills. Focusing on building these foundations, I started my journey in summer 2018.

Year 1: The first sip

I kicked off my journey with an undergraduate course on basic Python and data science concepts. The course provided a systematic way of learning data structures and solving programming problems. It turned data science from a vague buzzword into a concrete discipline that I could describe. Starting with an easy course boosted my confidence. Later that year, I took Applied Linear Models, which introduced the underlying theory and mathematics and built my foundation for data science.

Skills data scientists need

Meanwhile, I regularly talked to my data scientist friends and read job descriptions. The skills most in demand are strong statistics, machine learning, database systems, data structures and algorithms, domain knowledge, storytelling, and, last but not least, good communication. They are the building blocks of a successful data science career. To meet the technical requirements, I took what was available at my university: Database Management and Advanced Topics in Machine Learning (ML). The former taught me databases and Structured Query Language (SQL), whereas the latter provided a holistic overview of cutting-edge ML algorithms. Both gave me hands-on experience through projects. For example, two classmates and I worked on an image recognition project to detect flooded roads. It drew on my geospatial domain knowledge and married it with state-of-the-art algorithms. That was when I felt I could bring a unique perspective to data science projects.

Year 2: Building foundations

During the second summer of the journey, I realized that I needed more training before putting my skills to the test. As I had moved thousands of miles away from campus, I turned to books and online courses. The book An Introduction to Statistical Learning with Applications in R guided me through a variety of statistical models with exercises. In fact, I read it three times, and it benefited me more each time. Additionally, I took the Statistics with R specialization, which comprises five courses. These courses, especially the first four, are easy to follow and come with vivid use cases.

Meanwhile, I proactively applied for internships and bootcamps to build my resume and seize future opportunities. With over fifty applications and referrals, I landed interviews with Google and Facebook as well as a few bootcamps. I was lucky enough to get an offer from a bootcamp. However, I skipped it and continued my self-learning journey because of the pandemic. During that period, I studied a well-designed Udemy course, Python for DS & ML Bootcamp, which offers broad exposure to data manipulation, visualization, algorithms with use cases, and a taste of cutting-edge tools such as AWS, TensorFlow, and Spark. Occasionally, I practiced data science coding on LeetCode, which offers over a thousand coding questions on algorithms, databases, and more.

With a solid theoretical foundation, I started to work on projects to gain practical experience. It is always good practice to think through different scenarios and learn how to tell a story. When it comes to ML algorithms, I built linear regression models and decision trees extensively for my own research. To gain experience in classification and time series analysis, I turned to Kaggle projects, which trained me in data manipulation, parameter tuning, performance evaluation, model selection, and the reasoning behind them. While working on these projects, I read a time series book, Forecasting: Principles and Practice, and learned state-of-the-art forecasting tools such as Prophet and Kats. The learning process benefited my research in return. As time series analysis was rarely used with geographical data, I added that flavor to my research as well.

Year 3: The final sprint

While COVID-19 was still raging, I prepared for graduation and applied for industry jobs. Friends helped me with job referrals, and I reached out to professionals on LinkedIn about job openings. There were limited opportunities for new graduates because companies had hiring freezes in place to mitigate the impact of the pandemic. Those were dark days: no interviews came, but rejections kept landing in my inbox. As that was out of my control, the only thing I could do was spend time and effort improving my skills. I therefore took the classic Machine Learning course by Andrew Ng, which I wanted to know inside and out. This course walked through important algorithms and their reasoning step by step. I also did LeetCode exercises daily and went through Python and SQL materials from W3Schools and HackerRank. Whenever a job called for a particular background, I took a week or two to pick up new skills, such as causal inference or A/B testing, or to revisit my notes. Gradually, I built up my muscles and waited for chances to test them.

After the vaccine came out, I was occasionally offered interviews. I prepared for the technical and behavioral parts and tried to understand the company and its domain as much as I could. Interviews helped me improve the way I talked about my projects, especially to people without a similar background. The process also showed me that I should pay special attention to the business problem, or the pain point. When tackling problems, each company emphasizes different aspects: cost, customer experience, timeline, and so on. We should train ourselves to think like data scientists working on these real cases. A friend suggested that I take notes on interview questions. She said that, regardless of the result, the experience could pinpoint what interviewers were interested in. I did that every time, and eventually I could sometimes predict the questions and knew how to impress interviewers. While there was still no good news, I was getting closer. Near the end of April, I passed two interview processes with flying colors and received offers. I then passed more, and in mid-May I received the offer from the team that I liked the most and canceled my remaining interviews. That is where my old trajectory ended and the new one began.

New journey starts

Conclusion

It was not a short adventure, yet it was a rewarding one. There was frustration and sometimes hopelessness. I asked myself how I could find a data science job while students with related degrees were struggling in the market. I had no answer at that time. Now, I consider my background in interpreting and visualizing things spatially a bonus. This applies to you as well: your uniqueness could be an indispensable part of future opportunities. Nine months after joining Walmart Global Tech, I find it exciting to tackle real-world problems and leverage my geospatial expertise.

There may be more people thinking about switching gears to data science. My suggestion would be to learn from the people you want to become: their skillsets, mindset, practice, domain knowledge, and so on. I am grateful to my friends for their consistent guidance and tips. I also hope you find my story useful on your voyage to data science.

Author’s LinkedIn: linkedin.com/in/xuanzhanguga
