Machine Learning used to build a Diversified Portfolio: K-Means Clustering
Introduction
In this article, we will explore K-Means Clustering:
- What is K-Means Clustering?
- Algorithm
- K-Means Clustering Application: Building a diversified portfolio
Jupyter Notebooks are available on Google Colab and GitHub.
This project relies on the Python scientific-computing libraries imported below.
import time
import kneed
import requests
import numpy as np
import pandas as pd
from tqdm import tqdm
import seaborn as sns
import ipywidgets as widgets
from scipy.stats import mstats
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from datetime import datetime, timedelta
from requests.adapters import HTTPAdapter
from requests.exceptions import ConnectionError
from urllib3.util.retry import Retry
What is K-Means Clustering?
K-Means Clustering is a form of unsupervised machine learning (ML), and one of the simplest and most popular techniques of that kind. Unsupervised algorithms work on data points represented as feature vectors; the points carry no labels or classes. Our goal is to discover hidden patterns and group the data points sensibly, based on the similarity of their features. Each group of data points is a cluster, and each cluster has a center, called its centroid.
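To make the idea of clusters and centers concrete, here is a minimal sketch using scikit-learn's KMeans on made-up data. The two synthetic blobs, the choice of two clusters, and all variable names are assumptions for illustration only; they are not the portfolio data used later in the article.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical toy data: two well-separated groups of unlabeled 2-D points.
rng = np.random.default_rng(0)
blob_a = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(50, 2))
blob_b = rng.normal(loc=[5.0, 5.0], scale=0.5, size=(50, 2))
X = np.vstack([blob_a, blob_b])  # shape (100, 2), no labels attached

# n_clusters=2 is an assumption that matches how the toy data was built.
# n_init=10 reruns the algorithm from 10 random starts and keeps the best fit.
model = KMeans(n_clusters=2, n_init=10, random_state=0)

# fit_predict assigns every point to the cluster with the nearest center.
labels = model.fit_predict(X)

# One center per cluster, expressed in the same feature space as X.
centers = model.cluster_centers_

print("cluster sizes:", np.bincount(labels))
print("cluster centers:")
print(centers)
```

The cluster numbering (0 or 1) is arbitrary, so only the grouping itself is meaningful, not which group receives which number. On real data the number of clusters is usually unknown in advance and has to be chosen.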