K-means Clustering Analysis

Steps of Cluster Analysis

  1. Creating the data matrix,
  2. Obtaining similarity or distance matrices,
  3. Determining the clustering method and forming clusters,
  4. Interpretation of results.
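The four steps above can be sketched on a toy example. This is only an illustration under stated assumptions: the six points are made up, and the choice of Euclidean distance and two clusters is arbitrary. Note that scikit-learn's K-Means works on the data matrix directly; the distance matrix in step 2 is computed here only to show what that step produces.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import pairwise_distances

# 1. Data matrix: rows are observations, columns are features (made-up values).
X = np.array([
    [1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
    [8.0, 8.0], [8.2, 7.9], [7.8, 8.1],
])

# 2. Distance matrix: Euclidean distance between every pair of observations.
D = pairwise_distances(X, metric="euclidean")

# 3. Clustering method and clusters: K-Means with k=2 (an arbitrary choice here).
model = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
labels = model.labels_

# 4. Interpretation: look at the mean of each feature per cluster.
for cluster in range(2):
    print(cluster, X[labels == cluster].mean(axis=0))
```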

- K-Means

Let’s Get to Know the Data Set

# BUSINESS PROBLEM
# RFM, a rule-based customer segmentation method, is to be compared
# with K-Means, a machine learning method, for customer segmentation.
# DATA SET
# The Online Retail II data set contains the sales of a UK-based
# online store between 01/12/2009 and 09/12/2011.
# VARIABLES
# InvoiceNO – Invoice number
# If this code starts with C, the transaction was cancelled.
# StockCode – Product code
# Unique number for each product.
# Description – Product name
# Quantity – Product quantity
# Indicates how many units of each product on the invoice were sold.
# InvoiceDate – Invoice date
# UnitPrice – Unit price on the invoice (Sterling)
# CustomerID – Unique customer number
# Country – Country name

Let’s import the necessary libraries.
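The original code cell did not survive here, so the following is a minimal sketch of the imports a walkthrough like this one typically needs. The exact set of libraries and the display options are assumptions, not the author's original cell.

```python
import datetime as dt

import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import MinMaxScaler

# Display settings (assumed): show every column and two decimals in printed tables.
pd.set_option("display.max_columns", None)
pd.set_option("display.float_format", lambda x: "%.2f" % x)
```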

Read the data set.
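A possible way to read and pre-clean the data is sketched below. The helper names `load_online_retail` and `drop_unusable_rows` are hypothetical, the file name and sheet name in the comment are assumptions, and the column names follow the variable list above (the columns in your copy of the file may be named differently). The four sample rows are made up so the cleaning step can be tried without the file.

```python
import pandas as pd


def load_online_retail(path, sheet_name=0):
    """Read one sheet of the Excel file (path and sheet name are assumptions)."""
    return pd.read_excel(path, sheet_name=sheet_name)


def drop_unusable_rows(df):
    """Drop rows without a customer number and cancelled invoices (code starts with C)."""
    df = df.dropna(subset=["CustomerID"])
    cancelled = df["InvoiceNO"].astype(str).str.startswith("C")
    return df[~cancelled].copy()


# With the real file this might look like:
# df = load_online_retail("online_retail_II.xlsx", sheet_name="Year 2010-2011")

# Made-up rows with the same columns, used here instead of the real file.
sample = pd.DataFrame({
    "InvoiceNO": ["536365", "C536379", "536381", "536390"],
    "StockCode": ["85123A", "D", "22139", "21754"],
    "Description": ["item a", "discount", "item b", "item c"],
    "Quantity": [6, -1, 23, 3],
    "InvoiceDate": pd.to_datetime(
        ["2010-12-01", "2010-12-01", "2010-12-02", "2010-12-03"]
    ),
    "UnitPrice": [2.55, 27.50, 4.25, 5.95],
    "CustomerID": [17850.0, 14527.0, 15311.0, None],
    "Country": ["United Kingdom"] * 4,
})

cleaned = drop_unusable_rows(sample)
print(cleaned[["InvoiceNO", "Quantity", "CustomerID"]])
```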

Let’s take a look at the numeric variables in the data set.
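One way to do this is `describe()` with a few extra percentiles, which makes extreme values easy to spot. The small frame below is made up for illustration, and the percentile choices are an assumption.

```python
import pandas as pd

# Made-up rows; with the real data you would call describe() on the loaded frame.
df = pd.DataFrame({
    "Quantity": [6, 8, 2, 32, 480],
    "UnitPrice": [2.55, 3.39, 7.65, 1.69, 0.85],
    "Country": ["United Kingdom"] * 5,
})

# describe() keeps only the numeric columns by default; .T puts one variable per row.
summary = df.describe(percentiles=[0.01, 0.25, 0.50, 0.75, 0.99]).T
print(summary)
```

A large gap between the 99% column and the max column is a hint that a variable has outliers.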

Replace Outliers with Thresholds
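The idea is to cap extreme values at a lower and an upper limit instead of deleting the rows. The sketch below is one common way to do it, not necessarily the original code: the function names are hypothetical, and the 1% / 99% quantiles and the 1.5 multiplier are assumptions you may want to tune.

```python
import pandas as pd


def outlier_thresholds(df, col, q_low=0.01, q_high=0.99, k=1.5):
    """Return (low, up) limits derived from two quantiles of the column."""
    low_q = df[col].quantile(q_low)
    high_q = df[col].quantile(q_high)
    spread = high_q - low_q
    return low_q - k * spread, high_q + k * spread


def replace_with_thresholds(df, col, **kwargs):
    """Return a copy of df where values outside the limits are set to the limits."""
    low, up = outlier_thresholds(df, col, **kwargs)
    out = df.copy()
    out[col] = out[col].clip(lower=low, upper=up)
    return out


# Made-up column: values 1..100 plus one extreme value.
df = pd.DataFrame({"Quantity": list(range(1, 101)) + [10000]})
capped = replace_with_thresholds(df, "Quantity")
print(df["Quantity"].max(), capped["Quantity"].max())
```

Working on a copy keeps the raw data available for comparison; in an RFM setting the same call would typically be applied to the quantity and price columns.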

Calculating RFM Metrics
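A minimal sketch of the calculation, assuming the column names from the variable list above: Recency is the number of days since the customer's last invoice, Frequency is the number of distinct invoices, and Monetary is the total amount spent. The function name `rfm_table`, the analysis date, and the sample rows are all assumptions made for illustration.

```python
import pandas as pd


def rfm_table(df, today_date):
    """Build one row per customer with Recency, Frequency and Monetary."""
    df = df.copy()
    df["TotalPrice"] = df["Quantity"] * df["UnitPrice"]
    rfm = df.groupby("CustomerID").agg(
        Recency=("InvoiceDate", lambda d: (today_date - d.max()).days),
        Frequency=("InvoiceNO", "nunique"),
        Monetary=("TotalPrice", "sum"),
    )
    # Customers with no positive spend carry no information for segmentation.
    return rfm[rfm["Monetary"] > 0]


# Made-up transactions: customer 1 has two invoices, customer 2 has one invoice with two lines.
transactions = pd.DataFrame({
    "InvoiceNO": ["A1", "A2", "B1", "B1"],
    "CustomerID": [1, 1, 2, 2],
    "InvoiceDate": pd.to_datetime(
        ["2011-12-01", "2011-12-05", "2011-11-11", "2011-11-11"]
    ),
    "Quantity": [2, 1, 4, 1],
    "UnitPrice": [5.0, 3.0, 2.5, 1.0],
})

# Analysis date (assumed): two days after the last date covered by the data set.
today_date = pd.Timestamp("2011-12-11")
rfm = rfm_table(transactions, today_date)
print(rfm)
```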

Standardization for K-Means
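K-Means is distance-based, so a variable measured in thousands (Monetary) would dominate one measured in single digits (Frequency) unless the variables are brought to a common scale. The sketch below uses `MinMaxScaler` to map each column to the range 0 to 1; that choice is an assumption, and `StandardScaler` would be a reasonable alternative. The RFM values are made up.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Made-up RFM table, one row per customer.
rfm = pd.DataFrame(
    {
        "Recency": [6, 30, 120, 365],
        "Frequency": [12, 1, 3, 1],
        "Monetary": [4300.0, 11.0, 250.0, 40.0],
    },
    index=pd.Index([1, 2, 3, 4], name="CustomerID"),
)

# Rescale every column to [0, 1] and keep the customer index and column names.
scaler = MinMaxScaler(feature_range=(0, 1))
scaled = pd.DataFrame(
    scaler.fit_transform(rfm), index=rfm.index, columns=rfm.columns
)
print(scaled)
```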

So, how many clusters should the customers be divided into?
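A common heuristic for this question is the elbow method: fit K-Means for several values of k, record the within-cluster sum of squared distances (`inertia_`), and look for the k after which the curve flattens. The sketch below uses synthetic points with three well-separated groups in place of the scaled RFM table; the data, the range of k, and the random seeds are all assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic stand-in for scaled features: three groups of 50 points each.
rng = np.random.default_rng(0)
true_centers = np.array([[0.1, 0.1], [0.5, 0.9], [0.9, 0.2]])
X = np.vstack([c + rng.normal(scale=0.03, size=(50, 2)) for c in true_centers])

# Sum of squared distances to the closest center, for each candidate k.
ssd = {}
for k in range(1, 8):
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    ssd[k] = model.inertia_

for k, value in ssd.items():
    print(k, round(value, 3))
```

On this data the drop from k=2 to k=3 is large and the values barely change afterwards, which points to three clusters. On real customer data the elbow is usually less sharp, so the final choice also depends on how many segments the business can act on.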

Visualizing Clusters and Marking Their Centers
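One way to draw the clusters is a scatter plot colored by cluster label, with the fitted centers (`cluster_centers_`) drawn on top using a larger marker. The sketch below again uses synthetic two-dimensional points; treating the two axes as scaled Recency and Frequency is an assumption made only for the axis labels.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
from sklearn.cluster import KMeans

# Synthetic stand-in for two scaled features.
rng = np.random.default_rng(0)
true_centers = np.array([[0.1, 0.1], [0.5, 0.9], [0.9, 0.2]])
X = np.vstack([c + rng.normal(scale=0.03, size=(50, 2)) for c in true_centers])

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
labels = kmeans.labels_
centers = kmeans.cluster_centers_

fig, ax = plt.subplots(figsize=(6, 5))
# Observations, colored by the cluster they were assigned to.
ax.scatter(X[:, 0], X[:, 1], c=labels, s=30, cmap="viridis")
# Cluster centers, marked with large red crosses.
ax.scatter(centers[:, 0], centers[:, 1], c="red", marker="X", s=200,
           label="Cluster centers")
ax.set_xlabel("Recency (scaled)")
ax.set_ylabel("Frequency (scaled)")
ax.legend()
```

With three RFM variables, a single two-dimensional plot shows only one pair at a time, so it can help to repeat the plot for each pair of variables.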


Data Scientist. / Writing about Data Science, Statistics./ https://www.linkedin.com/in/zeyland/

Zeynep
