Marc-Olivier Arsenault

Published in Towards Data Science

·Apr 23, 2021

PR Reviews for SQL code

This is my personal guide to how I review my peers' SQL code. As I mentioned before, at Shopify, all our work is peer reviewed. This includes dashboards and SQL code. One of the most used tools for data scientists at Shopify is Mode. …

Sql

10 min read


Published in Towards Data Science

·Jan 29, 2021

Warning systems on data warehouse

The story of how and why I built Whistleblower, a system that allows us to set warnings on any table in our environment. For the past couple of years, a bunch of others at Shopify and I have been looking for a smart way to set warnings on specific tables. Why warnings? The reason is quite simple: we spend a lot of time building quality front room datasets, as I explain in this previous…

Editors Pick

6 min read


Published in Shopify Data

·Jun 9, 2020

How to thrive in the face of disruption: Tips from Shopify’s Data Team

Shopify’s Data Science & Engineering Foundations. We currently face a global pandemic. People are in pain around the world. Daily life has been disrupted. Many face monumental financial damage or unemployment. …

Data Science

9 min read


Published in Towards Data Science

·Oct 28, 2019

Using Data Science to save money on my next trip to Mexico

How I am using basic data work to make sure I get a good price on my trip. It has been 4 years since my wife and I took a vacation in a sunny place. Last time, for our honeymoon, we spent some quality time in Mexico. We enjoyed 10 days in a very nice all-inclusive resort in Riviera Maya. Since then, a house, two kids, a new…

AWS Lambda

6 min read


Published in Towards Data Science

·Apr 29, 2019

Spark & AI Summit 2019

My review of the latest Spark and AI Summit, hosted in San Francisco on April 24 and 25, 2019. The latest edition of the Spark conference was held last week. It was my first time attending the conference. …

Big Data

4 min read


Published in Towards Data Science

·Apr 17, 2019

Spark JOIN using REGEX

A more technical post about how I ended up efficiently joining two datasets with regex using a custom UDF in Spark. Context: For the past couple of months I have been struggling with this small problem. …

Regex

4 min read


Published in Towards Data Science

·Sep 27, 2018

The data science pyramid

Let’s not start with data science this time. Let’s start with psychology. I am far from having any competence in this domain, but I remember being presented with Maslow’s hierarchy of needs in high school. The best I can describe it is the different stages humans must go through to…

Data Science

4 min read


Published in Towards Data Science

·Apr 27, 2018

This is what I really do as a Data Scientist

Data science is getting very popular and many people are trying to jump on the bandwagon, and this is GREAT. But many assume that data science, machine learning, or any other buzzword you care to plug in here, is just plugging data into some scikit-learn libraries. Here is what the actual job is. To bring…

Machine Learning

5 min read


Published in Towards Data Science

·Feb 15, 2018

Lossless Triplet loss

A more efficient loss function for Siamese NN. At work, we are working with Siamese neural networks (NN) for one-shot training on telecom data. Our goal is to create a NN that can easily detect failures in telecom operators' networks. To do so, we are building this N-dimensional encoding to describe the actual status of the…

Machine Learning

7 min read


Published in Towards Data Science

·Nov 22, 2017

KOLMOGOROV–SMIRNOV TEST

A needed tool in your data science toolbox. Lately, at work, we had to do a lot of unsupervised classification. We basically had to distinguish N classes from a sample population. …

Machine Learning

6 min read


Data Science Lead at Shopify. Personal blog available at http://coffeeanddata.ca

Following
  • TDS Editors
  • Jeremie Harris
  • Mike Solty
  • Min Fang
  • Julie Zhuo
