XGBoost for Machine Learning in a Python Windows Environment

Buddhika Sameera
Sep 8, 2018 · 4 min read

Extreme Gradient Boosting, better known as XGBoost, is an optimized distributed gradient boosting system designed to be highly efficient, flexible and portable. It is a recent implementation of boosted trees and one of the machine learning algorithms that yields strong results on supervised learning problems. XGBoost implements machine learning algorithms under the gradient boosting framework and provides parallel tree boosting (also known as GBDT or GBM) that solves many data science problems quickly and accurately. The same code runs on major distributed environments (Hadoop, SGE, MPI) and can solve problems beyond billions of examples. The most recent version integrates naturally with dataflow frameworks (e.g. Flink and Spark). Original paper on XGBoost.

If you are a machine learning enthusiast who works on your own, one of the very first things you may struggle with when using XGBoost is its installation on your operating system. On Windows in particular, I would say the process is not well optimized. Hence I thought of sharing my experience and two methods that worked for me for installing XGBoost on Windows 10.

Method 1:

Requirement : Microsoft Visual C++ Redistributable for Visual Studio 2017

  1. Download the .whl file from the link below. Make sure it matches your PC's operating system and Python version (for example, if your OS is 64-bit and your Python version is 3.6, choose the file ending in cp36-cp36m-win_amd64.whl): https://www.lfd.uci.edu/~gohlke/pythonlibs/#xgboost
  2. Open a command prompt, cd to the folder where the file was downloaded, and run pip install followed by the name of the downloaded .whl file.
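If you are unsure which file to pick in step 1, the short sketch below prints the two parts of the wheel name that have to match your interpreter. The helper name wheel_tag_hint is made up for this illustration. Note that it is the bitness of the Python interpreter that matters: a 32-bit Python on a 64-bit OS still needs a win32 wheel.

```python
import struct
import sys


def wheel_tag_hint(major, minor, pointer_bits):
    """Return the (python tag, platform tag) pair to look for in a Windows wheel name."""
    python_tag = "cp{}{}".format(major, minor)
    platform_tag = "win_amd64" if pointer_bits == 64 else "win32"
    return python_tag, platform_tag


if __name__ == "__main__":
    # struct.calcsize("P") is the pointer size in bytes: 8 on 64-bit Python, 4 on 32-bit.
    bits = struct.calcsize("P") * 8
    print(wheel_tag_hint(sys.version_info.major, sys.version_info.minor, bits))
```

For Python 3.6 on a 64-bit interpreter this gives cp36 and win_amd64, which matches the example file name in step 1.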

If you are lucky enough, your XGBoost Windows environment is ready now!
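One way to check whether the installation worked is to ask Python if it can find and import the package. The snippet below is a minimal sketch; the helper name installed_version is hypothetical, and it assumes the package exposes a __version__ attribute (it falls back to "unknown" otherwise).

```python
import importlib
import importlib.util


def installed_version(package_name):
    """Return the package's __version__ string, or None if the package cannot be found.

    Errors raised while importing a package that was found (for example a
    missing DLL) are not caught, so they stay visible.
    """
    if importlib.util.find_spec(package_name) is None:
        return None
    module = importlib.import_module(package_name)
    return getattr(module, "__version__", "unknown")


if __name__ == "__main__":
    version = installed_version("xgboost")
    if version is None:
        print("xgboost is not installed in this environment")
    else:
        print("xgboost", version)
```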

If not, you can use the method below.

Method 2: From GitHub using Git Bash

Requirements:

  1. Download Git for Windows and install Git Bash. Once installed, you can open it from the Start menu; it will be the terminal for the XGBoost installation. Run the commands below to go to your user folder and download XGBoost from GitHub.

A1828@A1828-NB-02 MINGW64 ~

$ cd /c/Users/A1828/

$ git clone --recursive https://github.com/dmlc/xgboost

$ cd xgboost

$ git submodule init

$ git submodule update

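2. Install the full-fledged 64/32-bit compiler provided with MinGW-W64. Once downloaded, install it with the following configuration (the folder name in step 3 reflects the settings I used: version 8.1.0, architecture x86_64, threads posix, exception seh, build revision 0).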

3. Once installed, add the bin folder path (C:\Program Files\mingw-w64\x86_64-8.1.0-posix-seh-rt_v6-rev0\mingw64\bin) to the system environment variables.

(Control Panel\System and Security\System -> Advanced system settings -> System Properties -> Environment Variables)

4. Exit the Git Bash terminal, launch it again, and try the command below:

$ which mingw32-make

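This should return the location of mingw32-make inside the bin folder added in step 3.

Then run the commands below to alias make to mingw32-make and to move to the directory where we downloaded XGBoost, ready to start compiling:

$ alias make='mingw32-make'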

$ cd /c/Users/A1828/xgboost
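The lookup that which mingw32-make performs in step 4 can also be sketched in Python with the standard library's shutil.which, which is handy if you want to run the same check from a notebook. The helper name find_tool is made up for this example.

```python
import shutil


def find_tool(name, search_path=None):
    """Return the full path of an executable found on the search path, or None.

    When search_path is None, shutil.which uses the PATH environment variable.
    """
    return shutil.which(name, path=search_path)


if __name__ == "__main__":
    location = find_tool("mingw32-make")
    if location is None:
        print("mingw32-make was not found; check the PATH entry from step 3")
    else:
        print("mingw32-make found at", location)
```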

5. Compile each submodule by running the commands below:

$ cd dmlc-core

$ make -j4

$ cd ../rabit

$ make lib/librabit_empty.a -j4

$ cd ..

$ cp make/mingw64.mk config.mk

$ make -j4

6. Open the Conda prompt: type Anaconda in the Start menu search bar and run it as administrator. Once the CLI is open, follow the instructions below to complete the installation.

Go to the folder that contains the XGBoost Python package:

> cd C:\Users\A1828\xgboost\python-package

(base) C:\Users\A1828\xgboost\python-package>python setup.py install

7. Add the runtime libraries to the OS environment PATH variable.

Open a Python Jupyter notebook and run the code below:

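import os

# Folder with the MinGW-w64 runtime DLLs (the bin folder from step 3; adjust if yours differs)
mingw_path = 'C:\\Program Files\\mingw-w64\\x86_64-8.1.0-posix-seh-rt_v6-rev0\\mingw64\\bin'

# Prepend it so the XGBoost library can locate the MinGW runtime DLLs
os.environ['PATH'] = mingw_path + ';' + os.environ['PATH']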
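Running the cell above several times keeps adding the same folder to PATH. A small variation, sketched below, adds the folder only when it is not already listed. The helper name prepend_to_path is hypothetical, and the optional env argument exists only so the function can be tried on a plain dictionary instead of the real environment.

```python
import os


def prepend_to_path(directory, env=None):
    """Put directory at the front of env['PATH'] unless it is already listed there."""
    env = os.environ if env is None else env
    entries = [p for p in env.get("PATH", "").split(os.pathsep) if p]
    # normcase makes the comparison case-insensitive on Windows and is a no-op elsewhere.
    known = {os.path.normcase(p) for p in entries}
    if os.path.normcase(directory) not in known:
        entries.insert(0, directory)
    env["PATH"] = os.pathsep.join(entries)
    return env["PATH"]


if __name__ == "__main__":
    # Demonstration on a throwaway dictionary, so the real PATH is left untouched.
    demo_env = {"PATH": os.pathsep.join(["first", "second"])}
    print(prepend_to_path("mingw_bin", demo_env))
    print(prepend_to_path("mingw_bin", demo_env))  # second call changes nothing
```

Note that from Python 3.8 onwards, Windows no longer uses PATH to resolve the DLL dependencies of extension modules; os.add_dll_directory is the documented replacement there. This article was written against Python 3.6, so treat the PATH approach as specific to older versions.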

Now you are good to run your own XGBoost in Python.

Let's test it in a Windows Python 3 development environment by running a classification model on the famous Iris data set.
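A minimal sketch of such a test is shown here. It is not the original test code from this article: it assumes scikit-learn is installed alongside XGBoost (it supplies the Iris data and the train/test split), and the parameter values and the function name train_iris_classifier are illustrative choices only.

```python
import importlib.util


def train_iris_classifier(seed=7):
    """Train an XGBoost classifier on Iris and return its accuracy on held-out data."""
    # Imported here so a missing package only surfaces when the function is called.
    from sklearn.datasets import load_iris
    from sklearn.metrics import accuracy_score
    from sklearn.model_selection import train_test_split
    from xgboost import XGBClassifier

    features, labels = load_iris(return_X_y=True)
    x_train, x_test, y_train, y_test = train_test_split(
        features, labels, test_size=0.3, random_state=seed, stratify=labels
    )
    # Illustrative settings, not tuned values.
    model = XGBClassifier(n_estimators=50, max_depth=3, learning_rate=0.1)
    model.fit(x_train, y_train)
    return accuracy_score(y_test, model.predict(x_test))


if __name__ == "__main__":
    missing = [n for n in ("xgboost", "sklearn") if importlib.util.find_spec(n) is None]
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("Held-out accuracy: {:.3f}".format(train_iris_classifier()))
```

If the script prints an accuracy instead of an import or DLL error, the XGBoost installation is working.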

I hope this article helps you set up your XGBoost environment on Windows. I am trying my best to spare time to share my experiences, and I am planning to write my next article on how to run XGBoost in a PySpark cluster.

KIT,

Buddhika.

Data Scientist : Axiata Analytics Centre

www.linkedin.com/in/buddhika-sameera-gamage-a4799b14
