My First Data Modeling Project: Design For Global Superstore Sales

Nnamdi Samuel
Art of Data Engineering
6 min readAug 2, 2023
Photo by Nathália Rosa on Unsplash

The global superstore is a fictitious company that specializes in the sales of office supplies, furniture, and products of technology. It operates as an online marketplace and delivers products globally. It has also collected sales data from 2011 to 2014.

The purpose of this project

The main aim of this endeavor for the global superstore is to increase sales and, consequently, generate higher profits. This project’s central goal is to accomplish this objective. It intends to achieve this by creating a structured database design that includes a central table of facts and the required dimension tables to establish connections between different elements. This will enable meaningful comparisons and analysis.

Objectives

  1. To eliminate or reduce data redundancy as much as possible
  2. To optimize data usage
  3. To reduce data entry errors that may affect the data’s validity
  4. Carry out analysis showing vital information between entities
  5. Improve data consistency

Description of entities in the overall design

  1. Customers: This contains the customers’ information such as their full names, contact numbers, and email addresses.
  2. Shipping: This contains the shipping date, the shipping mode, and the cost of shipping.
  3. Orders: This contains the quantity of the items delivered, the total cost of the items delivered, the order priority, and the discount as related to the items.
  4. Product: This contains information about the products such as the name, available amount in stock, their categories, and sub-categories.
  5. DeliveryAddress: Here lies the information of the location where the products are delivered such as the street, the postcode, city, state, and the country.

Description of entities in the final design

  1. Sales: This is the fact table consisting of the price of the items, the costs, the shipping cost, and the quantity supplied.
  2. Products: This involves the products’ names, categories, sub-categories, and descriptions.
  3. Location: This contains the continents, countries, and cities to which the items are delivered.
  4. Time: This contains the period of delivery — the exact day of the year, the year, the quarter of the year, and the month of the year.

Relationship between entities

  1. A customer can make many orders and many orders may be made by one customer.
  2. Each shipping may have many orders made by customers and many orders can be conveyed through one shipping.
  3. One address can have many shippings and many shippings can have one address.
  4. A product can have many orders and many orders can have one product.
  5. Many sales can be made in one location and one location can have many sales.
  6. Many sales can be made at a particular time
  7. Many sales can be made by selling one product and a product can have many sales.

Initial ERD design

Final ERD Design:

The initial ER Diagram is modified to the final ER Diagram by eliminating redundancies and refining the entities.

DDL FOR THE OVERALL DESIGN:

-- MySQL Workbench Forward Engineering

SET @OLD_UNIQUE_CHECKS=@@UNIQUE_CHECKS, UNIQUE_CHECKS=0;
SET @OLD_FOREIGN_KEY_CHECKS=@@FOREIGN_KEY_CHECKS, FOREIGN_KEY_CHECKS=0;
SET @OLD_SQL_MODE=@@SQL_MODE, SQL_MODE='ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_ENGINE_SUBSTITUTION';

-- -----------------------------------------------------
-- Schema Mangata_gallo
-- -----------------------------------------------------

-- -----------------------------------------------------
-- Schema Mangata_gallo
-- -----------------------------------------------------
CREATE SCHEMA IF NOT EXISTS `Mangata_gallo` DEFAULT CHARACTER SET utf8 ;
USE `Mangata_gallo` ;

-- -----------------------------------------------------
-- Table `Mangata_gallo`.`Customers`
-- -----------------------------------------------------
CREATE TABLE IF NOT EXISTS `Mangata_gallo`.`Customers` (
`CustomerID` INT NOT NULL AUTO_INCREMENT,
`FullName` VARCHAR(255) NOT NULL,
`ContactNumber` VARCHAR(45) NOT NULL,
`Email` VARCHAR(255) NOT NULL,
PRIMARY KEY (`CustomerID`))
ENGINE = InnoDB;


-- -----------------------------------------------------
-- Table `Mangata_gallo`.`DeliveryAddress`
-- -----------------------------------------------------
CREATE TABLE IF NOT EXISTS `Mangata_gallo`.`DeliveryAddress` (
`AddressID` INT NOT NULL,
`Street` VARCHAR(255) NOT NULL,
`PostCode` VARCHAR(45) NOT NULL,
`City` VARCHAR(45) NOT NULL,
`State` VARCHAR(45) NOT NULL,
`Country` VARCHAR(45) NOT NULL,
PRIMARY KEY (`AddressID`))
ENGINE = InnoDB;


-- -----------------------------------------------------
-- Table `Mangata_gallo`.`Shipping`
-- -----------------------------------------------------
CREATE TABLE IF NOT EXISTS `Mangata_gallo`.`Shipping` (
`Ship ID` INT NOT NULL,
`Ship Date` DATE NOT NULL,
`ShipMode` VARCHAR(45) NOT NULL,
`AddressID INT` INT NOT NULL,
`ShipCost` DECIMAL NOT NULL,
PRIMARY KEY (`Ship ID`),
INDEX `Address_fk_id_idx` (`AddressID INT` ASC) VISIBLE,
CONSTRAINT `Address_fk_id`
FOREIGN KEY (`AddressID INT`)
REFERENCES `Mangata_gallo`.`DeliveryAddress` (`AddressID`)
ON DELETE NO ACTION
ON UPDATE NO ACTION)
ENGINE = InnoDB;


-- -----------------------------------------------------
-- Table `Mangata_gallo`.`Products`
-- -----------------------------------------------------
CREATE TABLE IF NOT EXISTS `Mangata_gallo`.`Products` (
`ProductID` INT NOT NULL,
`ProductName` VARCHAR(255) NOT NULL,
`AmountInStock` INT NOT NULL,
`Price` DECIMAL NOT NULL,
`Category` VARCHAR(45) NOT NULL,
`Subcategory` VARCHAR(45) NOT NULL,
PRIMARY KEY (`ProductID`))
ENGINE = InnoDB;


-- -----------------------------------------------------
-- Table `Mangata_gallo`.`Orders`
-- -----------------------------------------------------
CREATE TABLE IF NOT EXISTS `Mangata_gallo`.`Orders` (
`OrderID` INT NOT NULL,
`CustomerID` INT NOT NULL,
`ProductID` INT NOT NULL,
`DeliveryID` INT NOT NULL,
`Quantity` INT NOT NULL,
`TotalCost` DECIMAL NOT NULL,
`OrderPriority` VARCHAR(45) NOT NULL,
`Discount` DECIMAL NOT NULL,
`Ship` INT NOT NULL,
PRIMARY KEY (`OrderID`),
INDEX `customer_fk_id_idx` (`CustomerID` ASC) VISIBLE,
INDEX `product_fk_id_idx` (`ProductID` ASC) VISIBLE,
INDEX `Shipping_fk_id_idx` (`Ship` ASC) VISIBLE,
CONSTRAINT `customer_fk_id`
FOREIGN KEY (`CustomerID`)
REFERENCES `Mangata_gallo`.`Customers` (`CustomerID`)
ON DELETE CASCADE
ON UPDATE CASCADE,
CONSTRAINT `product_fk_id`
FOREIGN KEY (`ProductID`)
REFERENCES `Mangata_gallo`.`Products` (`ProductID`)
ON DELETE CASCADE
ON UPDATE CASCADE,
CONSTRAINT `Shipping_fk_id`
FOREIGN KEY (`Ship`)
REFERENCES `Mangata_gallo`.`Shipping` (`Ship ID`)
ON DELETE NO ACTION
ON UPDATE NO ACTION)
ENGINE = InnoDB;


SET SQL_MODE=@OLD_SQL_MODE;
SET FOREIGN_KEY_CHECKS=@OLD_FOREIGN_KEY_CHECKS;
SET UNIQUE_CHECKS=@OLD_UNIQUE_CHECKS;

DDL FOR THE STAR SCHEMA DESIGN:

-- MySQL Workbench Forward Engineering

SET @OLD_UNIQUE_CHECKS=@@UNIQUE_CHECKS, UNIQUE_CHECKS=0;
SET @OLD_FOREIGN_KEY_CHECKS=@@FOREIGN_KEY_CHECKS, FOREIGN_KEY_CHECKS=0;
SET @OLD_SQL_MODE=@@SQL_MODE, SQL_MODE='ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_ENGINE_SUBSTITUTION';

-- -----------------------------------------------------
-- Schema Mangata_StarSchema
-- -----------------------------------------------------

-- -----------------------------------------------------
-- Schema Mangata_StarSchema
-- -----------------------------------------------------
CREATE SCHEMA IF NOT EXISTS `Mangata_StarSchema` DEFAULT CHARACTER SET utf8 ;
USE `Mangata_StarSchema` ;

-- -----------------------------------------------------
-- Table `Mangata_StarSchema`.`Products`
-- -----------------------------------------------------
CREATE TABLE IF NOT EXISTS `Mangata_StarSchema`.`Products` (
`ProductKey` INT NOT NULL,
`ProductName` VARCHAR(255) NOT NULL,
`Category` VARCHAR(45) NOT NULL,
`Subcategory` VARCHAR(45) NOT NULL,
`Description` VARCHAR(255) NOT NULL,
PRIMARY KEY (`ProductKey`))
ENGINE = InnoDB;


-- -----------------------------------------------------
-- Table `Mangata_StarSchema`.`Time`
-- -----------------------------------------------------
CREATE TABLE IF NOT EXISTS `Mangata_StarSchema`.`Time` (
`TimeKey` INT NOT NULL,
`FullDate` DATE NOT NULL,
`Year` INT NOT NULL,
`Quarter` VARCHAR(45) NOT NULL,
`Month` INT NOT NULL,
`Event` VARCHAR(255) NOT NULL,
PRIMARY KEY (`TimeKey`))
ENGINE = InnoDB;


-- -----------------------------------------------------
-- Table `Mangata_StarSchema`.`Location`
-- -----------------------------------------------------
CREATE TABLE IF NOT EXISTS `Mangata_StarSchema`.`Location` (
`LocationKey` INT NOT NULL,
`Continent` VARCHAR(45) NOT NULL,
`Country` VARCHAR(45) NOT NULL,
`City` VARCHAR(45) NOT NULL,
PRIMARY KEY (`LocationKey`))
ENGINE = InnoDB;


-- -----------------------------------------------------
-- Table `Mangata_StarSchema`.`Sales`
-- -----------------------------------------------------
CREATE TABLE IF NOT EXISTS `Mangata_StarSchema`.`Sales` (
`Saleskey` INT NOT NULL,
`Price` DECIMAL NOT NULL,
`Cost` DECIMAL NOT NULL,
`Shipping` DECIMAL NOT NULL,
`Quantity` INT NOT NULL,
`Location` INT NOT NULL,
`ProductKey` INT NOT NULL,
`TimeKey` INT NOT NULL,
PRIMARY KEY (`Saleskey`),
INDEX `product_fk_idx` (`ProductKey` ASC) VISIBLE,
INDEX `time_fk_idx` (`TimeKey` ASC) VISIBLE,
INDEX `location_fk_idx` (`Location` ASC) VISIBLE,
CONSTRAINT `product_fk`
FOREIGN KEY (`ProductKey`)
REFERENCES `Mangata_StarSchema`.`Products` (`ProductKey`)
ON DELETE NO ACTION
ON UPDATE NO ACTION,
CONSTRAINT `time_fk`
FOREIGN KEY (`TimeKey`)
REFERENCES `Mangata_StarSchema`.`Time` (`TimeKey`)
ON DELETE NO ACTION
ON UPDATE NO ACTION,
CONSTRAINT `location_fk`
FOREIGN KEY (`Location`)
REFERENCES `Mangata_StarSchema`.`Location` (`LocationKey`)
ON DELETE NO ACTION
ON UPDATE NO ACTION)
ENGINE = InnoDB;


SET SQL_MODE=@OLD_SQL_MODE;
SET FOREIGN_KEY_CHECKS=@OLD_FOREIGN_KEY_CHECKS;
SET UNIQUE_CHECKS=@OLD_UNIQUE_CHECKS;

Visualizations:

Finally, with the available data, I used Tableau to analyze and make visualizations limited to the United States.

Thank you!

If you found this article interesting, please follow me on this platform. You can reach out to me on LinkedIn for any discussion.

--

--

Nnamdi Samuel
Art of Data Engineering

Data Engineer💥Voracious Reader and a Writer || Chemical Engineer