3 Reasons Why I’m Ditching SSIS for Python

Joshua Feierman
Towards Data Science
4 min read · Oct 14, 2019
Photo by Chris Ried on Unsplash

I’ve been using the Microsoft SQL Server technology stack for more than a decade, and while I remain extremely bullish about it, I’ve lately changed my tune on one key component: SQL Server Integration Services, or SSIS for short. SSIS is a very powerful tool for building extract, transform, and load (ETL) workflows, and it can work with a wide range of data sources and formats. I’ve mostly seen it used to load data into or out of SQL Server, but that certainly isn’t its only use.

I’ve authored more than my share of SSIS packages over the years, and I still feel it’s a tremendous tool to have in your arsenal. In large enterprises with strict standards around technology usage, it may be the only one available. Even so, for the reasons I’ll outline below, I now prefer Python for most, if not all, of my ETL needs. This is especially true when Python is combined with two libraries built for manipulating and analyzing data at scale, namely Dask and Pandas.
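As a rough illustration of what a small ETL job can look like in Python, here is a minimal sketch that uses only the standard library. The sample data, the `orders` table, and the function names are all hypothetical, and an in-memory SQLite database stands in for SQL Server. In practice, Pandas or Dask would likely replace the hand-written transform loop.

```python
import csv
import io
import sqlite3

# Hypothetical sample input; a real job would read a file or query a source system.
RAW_CSV = """order_id,customer,amount
1,alice,10.50
2,bob,
3,carol,7.25
"""


def extract(text):
    """Extract: parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop rows with a missing amount and convert types."""
    cleaned = []
    for row in rows:
        if not row["amount"]:
            continue  # skip incomplete records
        cleaned.append(
            (int(row["order_id"]), row["customer"].title(), float(row["amount"]))
        )
    return cleaned


def load(rows, conn):
    """Load: write the cleaned rows into a SQLite table (stand-in target)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()


def run_etl(text, conn):
    """Run all three stages and return (row count, total amount) loaded."""
    load(transform(extract(text)), conn)
    return conn.execute("SELECT COUNT(*), SUM(amount) FROM orders").fetchone()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    print(run_etl(RAW_CSV, conn))
```

Each stage is a plain function, so any one of them can be tested or swapped out without touching the others.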

Python is free and open source

Python is an open source language, and its license and intellectual property are managed by the Python Software Foundation. The language and a huge number of its packages are available free of charge, and you can contribute to the underlying source code…
