Introducing PySurvival

sfotso
Apr 12, 2019

Heads up, we’ve moved! If you’d like to continue keeping up with the latest technical content from Square please visit us at our new home https://developer.squareup.com/blog

PySurvival is an open-source Python package for Survival Analysis modeling.


Today, we’re excited to introduce PySurvival, a Python package for Survival Analysis modeling.

This article is the first installment in a four-part series of tutorials designed to show how to make the most of the package. You can also find these tutorials on the official website:


What is PySurvival?

PySurvival is an open-source Python package for Survival Analysis modeling, the modeling concept used to analyze or predict when an event is likely to happen. It is built on top of the most commonly used machine learning packages: NumPy, SciPy, and PyTorch.

PySurvival makes it easy to move between theoretical knowledge of Survival Analysis and detailed tutorials on how to conduct a full analysis, as well as build and use a model. The package contains:


Installation

If you have already installed a working version of gcc, the easiest way to install PySurvival is with pip.

pip install pysurvival

The complete installation steps can be found here.


Introduction to Survival Analysis

What is Survival Analysis?

Survival analysis is used to analyze or predict when an event is likely to happen. It originated in medical research, but its use has greatly expanded to many different fields. For instance:

Censoring: why can't regression models be used?

The real strength of Survival Analysis is its capacity to handle situations where the event has not happened yet. To illustrate this, let’s take the example of two customers of a company and follow their active/churn status between January 2018 and April 2018:

  • customer A started doing business prior to the time window, and as of April 2018, is still a client of the company
  • customer B also started doing business before January 2018, but churned in March 2018

Here, we have an explicit depiction of the event for customer B. For customer A, however, all we know is that they had not churned by the end of the January 2018 to April 2018 time window. This situation is called censoring.
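As a minimal sketch, the two customers above could be encoded for a survival model as a duration plus an event indicator (1 = churn observed, 0 = censored). The window boundaries, the exact churn day for customer B, and the choice to measure durations from the start of the window are assumptions made for illustration. The function name is hypothetical and not part of PySurvival.

```python
from datetime import date

# Assumed boundaries of the observation window (January 2018 to April 2018).
WINDOW_START = date(2018, 1, 1)
WINDOW_END = date(2018, 4, 30)

# churn_date is None when no churn was observed inside the window.
customers = {
    "A": None,                # still a client at the end of the window
    "B": date(2018, 3, 15),   # hypothetical day; the text only says "March 2018"
}

def to_survival_record(churn_date):
    """Return (duration_in_days, event); event is 1 if churn was observed, 0 if censored."""
    if churn_date is None or churn_date > WINDOW_END:
        # Censored: we only know the customer survived at least this long.
        return (WINDOW_END - WINDOW_START).days, 0
    return (churn_date - WINDOW_START).days, 1

records = {name: to_survival_record(d) for name, d in customers.items()}
print(records)
```

Customer A is kept in the dataset with an event indicator of 0 rather than being dropped, which is the information a plain regression model would have no way to use.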

One might be tempted to use a regression model to predict when events are likely to happen. But to do that, one would need to disregard censored samples, which would result in a loss of important information. Fortunately, Survival models are able to take censoring into account and incorporate the uncertainty, so that instead of predicting the time of an event, we predict the probability that an event happens at a particular time.
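To make the idea of "incorporating censoring" concrete, here is a small sketch of the Kaplan-Meier estimator, a standard non-parametric way to estimate the probability of surviving past a given time. It is written in plain Python for illustration and is not PySurvival's implementation. The durations and event flags are made-up data.

```python
def kaplan_meier(durations, events):
    """Return [(t, S(t))] at each distinct time where an event was observed."""
    n_at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(set(durations)):
        # Events observed exactly at time t.
        observed = sum(1 for d, e in zip(durations, events) if d == t and e == 1)
        # Everyone leaving the risk set at t (events and censored alike).
        leaving = sum(1 for d in durations if d == t)
        if observed > 0:
            survival *= 1.0 - observed / n_at_risk
            curve.append((t, survival))
        n_at_risk -= leaving
    return curve

# Hypothetical data: durations in months, event = 1 (churned) or 0 (censored).
durations = [2, 3, 3, 5, 6]
events = [1, 1, 0, 1, 0]

curve = kaplan_meier(durations, events)
for t, s in curve:
    print(f"S({t}) = {s:.2f}")

# A naive alternative that discards censored samples entirely.
naive_mean = sum(d for d, e in zip(durations, events) if e == 1) / sum(events)
print(f"mean duration of churned customers only: {naive_mean:.2f}")
```

Censored customers never trigger a drop in the curve, but they still count in the at-risk denominator until the time they leave observation. The naive mean, by contrast, ignores the two customers who were still active and so understates how long customers tend to stay.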
