TDS Archive

An archive of data science, data analytics, data engineering, machine learning, and artificial intelligence writing from the former Towards Data Science Medium publication.


Gradient Boosting in Python from Scratch

11 min read · Mar 29, 2022


Evolution of beaks. Image licence: CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/deed.en); original image: https://commons.wikimedia.org/wiki/File:Charles_Darwin,_Journal_of_Researches..._Wellcome_L0026712.jpg

The aim of this article is to explain every part of the popular and often mysterious gradient boosting algorithm using Python code and visualizations. Gradient boosting is the key component of competition-winning algorithms such as CatBoost, AdaBoost, and XGBoost, so knowing what boosting is, what a gradient is, and how the two are linked in one algorithm is a must for any modern machine learning practitioner.

The implementation and animations of gradient boosting for regression in Python can be accessed in my repo: https://github.com/Eligijus112/gradient-boosting

The main picture of the article depicts the process of evolution: how, over a long period of time, the beak size of a bird species adapts to its surroundings (https://en.wikipedia.org/wiki/Darwin%27s_finches).

Just as animals adapt in various ways to new conditions in their habitats, machine learning algorithms adapt to the data environment we put them in. The main idea behind gradient boosting is that its engine is a simple, low-accuracy algorithm that learns from its own previous mistakes: each new weak learner is fitted to the errors left by the ensemble so far.
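The full implementation lives in the repository linked above; as a minimal illustration of the idea, the sketch below (my own simplified version, not the author's code) boosts one-feature decision stumps for squared-error regression. For squared-error loss, the negative gradient of the loss with respect to the current prediction is simply the residual, so each round fits a stump to the residuals and adds a learning-rate-scaled step to the ensemble prediction.

```python
import numpy as np


def fit_stump(x, residuals):
    """Find the threshold split of a 1-D feature that best fits the
    residuals in the squared-error sense.

    Returns (threshold, left_value, right_value)."""
    best_t, best_lv, best_rv, best_sse = None, residuals.mean(), residuals.mean(), np.inf
    for t in np.unique(x):
        left, right = residuals[x <= t], residuals[x > t]
        if len(left) == 0 or len(right) == 0:
            continue
        lv, rv = left.mean(), right.mean()
        # Sum of squared errors of a piecewise-constant fit at this split
        sse = ((left - lv) ** 2).sum() + ((right - rv) ** 2).sum()
        if sse < best_sse:
            best_t, best_lv, best_rv, best_sse = t, lv, rv, sse
    return best_t, best_lv, best_rv


def predict_stump(x, stump):
    t, lv, rv = stump
    return np.where(x <= t, lv, rv)


def gradient_boost(x, y, n_rounds=100, lr=0.1):
    """Additively fit stumps to the residuals, which for squared-error
    loss equal the negative gradient of the loss."""
    # Round 0: start from a constant prediction, the mean of the targets
    f = np.full_like(y, y.mean(), dtype=float)
    stumps = []
    for _ in range(n_rounds):
        stump = fit_stump(x, y - f)       # fit the current residuals
        f += lr * predict_stump(x, stump)  # damped additive update
        stumps.append(stump)
    return f, stumps
```

With enough rounds, the ensemble's training error drops well below that of the constant baseline, which is exactly the "learning from previous mistakes" behaviour described above; a small learning rate keeps each individual stump from overcorrecting.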



Written by Eligijus Bujokas

A person who tries to understand the world through data and equations