On Becoming a Data Practitioner

Abpencarnacion
Asia-Pacific Youth Data Society
Oct 7, 2020

Hi! I’m Alyssa Encarnacion, a Data Specialist and part-time MS Statistics student from the Philippines. My childhood love of airplanes and my undergraduate degree in economics made me an aspiring transport economist, and that is where my data journey began.

Starting a career in data without any computer science background is daunting, but it’s actually quite common among practitioners in the field. Many of us come from previous professions or unrelated degree programs, with the goal of exploring the opportunities data science can provide.

Although I’ve been working in data for only a little over a year, I want to share some lessons and key aspects of my data journey so far for those who may be starting out the same way I did. Hopefully, this article will help you become an effective data practitioner.

On the Process of Learning to Code

Coding is an essential part of data science, so anyone who isn’t yet up to speed will inevitably have to catch up. I had a very rough start in learning how to code compared to my colleagues, who grasped it quite well. My slow pace led to a lot of backlogs, and teammates had to come in and revise code that I’d just copy-pasted from elsewhere the night before. I thought I had to memorize everything, and I felt frustrated whenever I couldn’t.

That was how I navigated my first three months of learning Python, until our team introduced pair programming as part of our sprints. It was intimidating to be paired with much more experienced teammates (especially in the sessions where they had to clean and troubleshoot my code), but it allowed us to learn from each other’s thought processes and become more creative in how we approached our tasks. Once I caught on, learning how to code became easier.

The key takeaway here is that coding is more about the thought process than it is about memorizing code. While it’s easy to search for how to troubleshoot a certain error, planning how to approach a task as efficiently and effectively as possible is an entirely different exercise. Mastering the latter will add more value to one’s data science career.

On Using the Tools

Next is using tools. Note how I simply put ‘tools’ in the heading without specifying whether they should be the right ones or the best ones, because of course we should always use the correct and most effective tools for our work. Many data practitioners assume that because Excel, PowerPoint, and the like are inferior to Python and similar tools, all analytics and visualizations have to be done with the more advanced tools. Admittedly, I held that same view when I first started out.

For six months, I was assigned as a short-term analyst to a client, and my tasks ranged from data visualization to quantitative reviews and even forecasting. As a fresh graduate at the time, I insisted on using more advanced tools such as Tableau and R in my work. That insistence often made things difficult, because my tasks took longer than they would have in Microsoft Office, which was what the client was accustomed to. After many tries, I realized that I could use their tools while still applying best practices in data analysis and data management, and eventually I began to feel proud of my work even if it wasn’t created in Python. One of my most challenging yet gratifying projects during that time was a forecasting tool built entirely in Excel.

One key takeaway here is to be honest about and open to the tools that you can use. Just because something is done in MS Office doesn’t mean you can’t apply basic concepts of data cleaning or best practices in data visualization. Besides, you may be surprised by the vast capabilities of other tools, which can complement the more popular ones used by data scientists.

On Putting the Role into Context

Connected to using our tools, I would like to emphasize the need to use the correct and most effective tools while also considering what our stakeholders use. Our role in data is to provide meaningful insights and improvements to their ways of working, but we must not forget to always align with our stakeholders. There’s no point in creating a solution if they as our end-users can’t benefit from it anyway.

From a business perspective, it always helps to collaborate with stakeholders to understand how the data you pull and analyze will be used. Close collaboration will also help in validating your findings and digging deeper into further questions or use cases that can be explored. Understanding the contexts of these stakeholders allows us to generate more nuanced findings and process improvements that will serve them well.

More than Just Numbers

Before I end this article, I would like to point out that being a data practitioner is a huge responsibility in itself, because the data we deal with aren’t just numbers; they are representations of real people, behaviors, and outcomes, all of which interact with the world around them. While data science can uncover those representations, it’s ultimately up to us practitioners to use those representations well.

Ultimately, we crunch the numbers and create the charts because someone — whether it be a high-profile client or the general public — trusted us to do so.

Our goal, then, should be to provide what they need in a manner that they will understand, while also ensuring that what we provide will be used for the benefit of others. So long as we accomplish that, we can call ourselves effective data practitioners.
