Helping all our colleagues use Python

Patrick von Glehn
Published in CarePay
May 19, 2022 · 2 min read

This month the CarePay data team is launching a company-wide introduction to programming with Python. The training has two aims: a) giving colleagues who have never tried programming in any language a chance to give it a bash, and b) empowering them to improve their own workflows with automation and reproducible data analysis and visualization.

We’ll be basing the training on the excellent introduction to Python course from Kaggle, focusing on basic Python syntax and concepts as well as how to work with tabular data (think Excel) in Jupyter notebooks, an interactive environment that mixes programming and note-taking.
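To give a flavour of what working with tabular data in Python can look like, here is a minimal sketch that uses only the standard library. The data, the column names, and the function `total_visits_per_clinic` are all invented for illustration; in practice the rows would come from a file exported from a spreadsheet, and a notebook user would often reach for a dedicated data library instead.

```python
import csv
import io
from collections import defaultdict

# Invented example data standing in for a CSV file exported from a spreadsheet.
RAW = """clinic,month,visits
North,Jan,120
North,Feb,135
South,Jan,80
South,Feb,95
"""


def total_visits_per_clinic(csv_text):
    """Sum the 'visits' column for each value in the 'clinic' column."""
    totals = defaultdict(int)
    # DictReader uses the header row to turn each line into a dict.
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["clinic"]] += int(row["visits"])
    return dict(totals)


totals = total_visits_per_clinic(RAW)
for clinic, visits in sorted(totals.items()):
    print(f"{clinic}: {visits}")
```

Run in a notebook cell, this prints one total per clinic. Changing the data or the grouping column and re-running the cell is the kind of quick experiment the training is meant to encourage.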

Why and how we use Python in the data team

Although the core backend language at CarePay is Java, in the data team we use Python for many of our projects including data engineering pipelines and other automation flows, machine learning, data analytics, and visualization.

One of the main reasons we love Python is its outstanding collection of open-source data science and engineering packages.

We use the Python package Apache Airflow, originally developed at Airbnb, to automate and schedule tasks such as making daily operational database extracts, loading this operational data into our data warehouse, transforming and enriching it into dimensionally modeled schemas, and sending out weekly reports.

dbt is another key Python package in our stack. It is fast becoming the industry-standard tool for handling the T in ELT (Extract, Load, Transform), thanks to the ease with which it lets you build workflows of complex, interdependent data transformations using SQL templating and automatic DAG (directed acyclic graph) generation.
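The idea behind automatic DAG generation can be illustrated with a short sketch. This is not how dbt is implemented; it only shows the principle that, when SQL templates refer to each other with `{{ ref('...') }}`, those references are enough to work out a valid run order. The model names, the SQL, and the helper `build_run_order` are hypothetical, and the sketch assumes Python 3.9 or later for `graphlib`.

```python
import re
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical models: name -> SQL template. Names and SQL are invented.
MODELS = {
    "stg_claims": "select * from raw.claims",
    "stg_providers": "select * from raw.providers",
    "fct_claims": (
        "select * from {{ ref('stg_claims') }} c "
        "join {{ ref('stg_providers') }} p on c.provider_id = p.id"
    ),
    "weekly_report": "select * from {{ ref('fct_claims') }}",
}

# Matches {{ ref('model_name') }} and captures the model name.
REF_PATTERN = re.compile(r"\{\{\s*ref\(\s*'([^']+)'\s*\)\s*\}\}")


def build_run_order(models):
    """Return the model names ordered so that dependencies come first."""
    # Each model maps to the set of models it references (its predecessors).
    graph = {name: set(REF_PATTERN.findall(sql)) for name, sql in models.items()}
    return list(TopologicalSorter(graph).static_order())


run_order = build_run_order(MODELS)
print(run_order)
```

The two staging models may come out in either order, since neither depends on the other, but both always precede `fct_claims`, which in turn precedes `weekly_report`. A circular reference would make `TopologicalSorter` raise a `CycleError`, which is why the graph has to be acyclic.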

Python, especially when run in the interactive Jupyter notebook environment, has become one of the two dominant tools for exploratory data analysis and visualization (the other being R). The rapid feedback loop of programming in interactive notebooks makes iteratively exploring datasets a breeze.

We’re sure our colleagues will enjoy getting their hands dirty with a bit of coding and hope some of them will be able to put their new skills to use in their own work.

Do you want to join our Data team or are you looking for something else? Explore our job opportunities here!

Patrick von Glehn is a Data Scientist working for CarePay in global healthcare.