This article reviews some common options for parallelizing Python code, including process-based parallelism, specialized libraries, IPython Parallel, and Ray.
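
As a rough illustration of the process-based option mentioned above, here is a minimal sketch that uses only the standard library's `concurrent.futures.ProcessPoolExecutor`. The function names `count_primes` and `count_primes_parallel` and the prime-counting workload are hypothetical and chosen for illustration; they are not taken from the article.

```python
from concurrent.futures import ProcessPoolExecutor


def count_primes(limit: int) -> int:
    """Count primes below `limit` by trial division (deliberately CPU-bound)."""
    count = 0
    for n in range(2, limit):
        # n is prime if no d in [2, sqrt(n)] divides it evenly.
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def count_primes_parallel(limits, max_workers=None):
    """Evaluate count_primes for each limit in separate worker processes.

    Illustrative helper (an assumption, not from the article). Results are
    returned in the same order as `limits`.
    """
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(count_primes, limits))
```

When run as a script, a call such as `count_primes_parallel([10_000, 20_000, 30_000])` should sit under an `if __name__ == "__main__":` guard so that worker processes can import the module safely. Separate processes sidestep the global interpreter lock for CPU-bound work, at the cost of pickling arguments and results between processes.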

An introductory tutorial on reinforcement learning with OpenAI Gym, RLlib, and Google Colab

This tutorial will use reinforcement learning (RL) to help balance a virtual CartPole. The video above from PilcoLearner shows the results of using RL in a real-life CartPole environment.
  • What is…

Thoughts and Theory

Microsoft researchers have developed FLAML (Fast Lightweight AutoML), which can now use Ray Tune for distributed hyperparameter tuning, scaling FLAML’s resource-efficient, easily parallelizable algorithms across a cluster.

CFO, one of FLAML’s algorithms, tuning the number of leaves and the number of trees for XGBoost. The two heatmaps show the loss and cost distribution of all configurations. The black dots are the points evaluated in CFO. Black dots connected by lines are points that yield better loss performance when evaluated (image by authors).
  • The need for economical AutoML methods
  • Economical AutoML with FLAML
  • How to scale FLAML’s optimization algorithms with Ray Tune

The need for economical AutoML methods

Ray makes parallel and distributed computing work more like you would hope (image source)
  • Why should you parallelize and distribute with Ray?
  • How to get started with Ray
  • Trade-offs in distributed computing (compute cost, memory, I/O, etc.)

Why should you parallelize and distribute with Ray?

A goal of Modin is to allow data scientists to use the same code for small (kilobytes) and large (terabytes) datasets. Image by Devin Petersohn.

How to get started with Modin

Ray is a popular framework for distributed Python that can be paired with PyTorch to rapidly scale machine learning applications.

What is Ray?

Resources (dark blue) that scikit-learn can utilize for single core (A), multicore (B), and multinode training (C)
  • Changing your optimization function (solver)
  • Using different hyperparameter optimization techniques (grid search, random search, early stopping)
  • Parallelize or distribute your training with joblib and Ray

Changing your optimization algorithm (solver)

Learn how to visualize decision trees using matplotlib and Graphviz

Image from my Understanding Decision Trees for Classification (Python) Tutorial.
  • How to Fit a Decision Tree Model using Scikit-Learn
  • How to Visualize Decision Trees using Matplotlib
  • How…

Michael Galarnyk
