PyData London 2016 — Part II

Data Reply
May 2, 2018

This is the second and final part of my blog post on the PyData London 2016 conference. In the first part I covered the main points from the talks on statistics and machine learning. Now we will dive into the presentations on Python itself and other highlights.

Break time at the conference.

The conference was attended and organised by many active committers, and here’s the news they had to share:

  • Yet another array library, DyND, was presented by I. Zaid. DyND aims to solve issues found in NumPy, e.g. the treatment of missing data, fixed-length strings, and primitive types. At the moment DyND works only with Python and C++, but the list should eventually grow (R, Julia, and JavaScript were mentioned in the presentation).
  • Bandicoot, a recent side project at MIT for mining mobile data, was presented by L. Rocher, one of the authors of the toolkit. Bandicoot aims to make researchers’ lives easier by providing utilities for reproducibility, management of data quality (metrics on missing data, wrong data types, etc.), data visualisation, anonymisation, machine learning, and even feature extraction.
  • What’s New in High-Performance Python? G. Markall answered this question. In the first part of the talk he explained the key steps in optimising a Python program (profiling, understanding, and action) and demonstrated profiling tools (the Python profiler, kernprof, and VTune). He then reviewed new features in the Numba library (parallel CUDA and multicore functionality, JIT classes, support for CFFI, the @generated_jit decorator, and a couple more).
  • Aspiring data visualisation gurus attended the tutorial on Bokeh, an interactive web dashboard library comparable to R’s Shiny that is gaining more and more attention. Compatibility with Python, R, and Julia makes it a really attractive choice for a single dashboarding tool in your pocket.
  • Bqplot is a lightweight visualisation library for the Jupyter notebook, recently open-sourced by Bloomberg and presented at the conference by S. Corlay (one of the authors). Points worth mentioning include support for interactive plots, a variety of visualisation types, and an API which enables custom mouse interactions with your graph. Oh, and it’s based on the grammar of graphics (just like R’s ggplot2).
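
To make the "profiling, understanding, and action" steps from the High-Performance Python talk a bit more concrete, here is a minimal sketch using only the standard library's cProfile and pstats modules. It is not taken from the talk: the function names (sum_of_squares_loop, sum_of_squares_formula, profile_call) are made up for illustration, and the "action" step simply swaps a loop for a closed-form formula.

```python
import cProfile
import io
import pstats


def sum_of_squares_loop(n):
    """Deliberately naive version: appends to a list, then sums it."""
    squares = []
    for i in range(n):
        squares.append(i * i)
    return sum(squares)


def sum_of_squares_formula(n):
    """Same result via the closed form for 0^2 + 1^2 + ... + (n-1)^2."""
    return (n - 1) * n * (2 * n - 1) // 6


def profile_call(func, *args):
    """Step 1 (profiling): run func under cProfile.

    Returns the function result and a text report sorted by cumulative time.
    """
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(5)
    return result, buffer.getvalue()


if __name__ == "__main__":
    # Step 2 (understanding): read the report to see where the time goes.
    result, report = profile_call(sum_of_squares_loop, 200_000)
    print(report)

    # Step 3 (action): replace the hot spot and check the answer is unchanged.
    assert result == sum_of_squares_formula(200_000)
```

In the report, the many calls to list.append inside sum_of_squares_loop are the kind of hot spot you would then try to remove. For line-by-line timings, a tool such as kernprof goes one level deeper than this function-level view.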

Good Ol’ Python

Of course, more established Python features were also revisited. One of the conference’s starting points was a Beginner Bootcamp, which set the scene for anyone unfamiliar with data analysis in Python or even programming in general. Another must-see talk for beginners was Pandas from the Inside, which gave dozens of examples of why you should start using pandas. It was also relevant for Python veterans who want to understand how NumPy is used under the hood and how to speed up DataFrame operations…
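
As a small illustration of the "NumPy under the hood" point, here is a sketch of my own rather than an example from the talk. It assumes pandas and NumPy are installed, uses a made-up three-row DataFrame, and the helper names total_with_loop and total_vectorised are hypothetical. It compares a row-by-row loop with a vectorised column operation.

```python
import numpy as np
import pandas as pd

# A tiny, made-up DataFrame used purely for illustration.
df = pd.DataFrame({"price": [10.0, 20.0, 30.0], "quantity": [1, 2, 3]})

# With the default numeric dtypes, a column can be viewed as a NumPy array.
prices = df["price"].to_numpy()


def total_with_loop(frame):
    """Row-by-row version: simple, but every row is handled in Python."""
    total = 0.0
    for _, row in frame.iterrows():
        total += row["price"] * row["quantity"]
    return total


def total_vectorised(frame):
    """Column-wise version: the multiplication and sum run on whole arrays."""
    return float((frame["price"] * frame["quantity"]).sum())


if __name__ == "__main__":
    print(type(prices))
    print(total_with_loop(df), total_vectorised(df))
```

Both functions return the same total. On larger frames the vectorised form is usually the faster of the two, because the arithmetic is done on whole arrays instead of one Python object per row.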

Originally published at www.datareply.co.uk.