Published in The Startup

Data Pre-processing: Refining Gold

  1. Data Acquisition (Ingredients)
  2. Importing the required libraries (Kitchen)
  3. Importing the Data Set (Kitchen ingredients)
  4. Handling the missing data (Checking the ingredients)
  5. Encoding the categorical data (Do all ingredients go together?)
  6. Splitting the data set
  7. Feature Scaling (Setting up all the ingredients on the dough)

Data Acquisition

Importing the required libraries

import numpy as np               # numerical arrays
import pandas as pd              # reading and handling the data set
import matplotlib.pyplot as plt  # plotting
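
Importing the Data Set

dataset = pd.read_csv('Data.csv')
X = dataset.iloc[:, :-1].values  # every column except the last: the features
Y = dataset.iloc[:, -1].values   # the last column: the dependent variable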
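
Handling the missing data

from sklearn.impute import SimpleImputer
# Replace each NaN with the mean of its column
Imputer = SimpleImputer(missing_values=np.nan, strategy='mean')
Imputer.fit(X[:, 1:3])                    # learn the means of columns 1 and 2
X[:, 1:3] = Imputer.transform(X[:, 1:3])  # fill the gaps in those columns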
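
To make the 'mean' strategy concrete, here is a minimal plain-Python sketch of the same idea: average the values that are present in a column, then write that average into the gaps. The helper name impute_column_mean and the toy rows are made up for illustration; this is not how SimpleImputer is implemented, only the arithmetic it performs.

```python
import math
from statistics import mean

def impute_column_mean(rows, col):
    """Return a copy of rows with NaN entries in column `col` replaced by the column mean."""
    observed = [r[col] for r in rows if not math.isnan(r[col])]
    fill = mean(observed)  # mean of the values that are actually present
    return [
        r[:col] + [fill if math.isnan(r[col]) else r[col]] + r[col + 1:]
        for r in rows
    ]

# Hypothetical toy rows: [age, salary], with one gap in each column
rows = [
    [44.0, 72000.0],
    [27.0, float("nan")],
    [float("nan"), 54000.0],
    [38.0, 61000.0],
]

# Fill column 0 (age) first, then column 1 (salary)
filled = impute_column_mean(impute_column_mean(rows, 0), 1)
for row in filled:
    print(row)
```

Each gap ends up holding the mean of the observed values in its own column, and the original rows are left unchanged.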
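
Encoding the categorical data

from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder
# One-hot encode column 0 and pass the remaining columns through unchanged
ct = ColumnTransformer(transformers=[('encoder', OneHotEncoder(), [0])], remainder='passthrough')
X = np.array(ct.fit_transform(X))
from sklearn.preprocessing import LabelEncoder
# Encode the dependent variable as integers
le = LabelEncoder()
Y = le.fit_transform(Y)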
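
As a rough illustration of what the two encoders produce, the sketch below does both by hand in plain Python. The function names one_hot and label_encode and the sample values are assumptions for this example only. Categories are sorted alphabetically here, which is also the order scikit-learn's encoders use when they infer categories from the data.

```python
def one_hot(values):
    """Map each value to a 0/1 vector with a single 1 at its category's position."""
    categories = sorted(set(values))
    index = {c: i for i, c in enumerate(categories)}
    rows = []
    for v in values:
        row = [0.0] * len(categories)
        row[index[v]] = 1.0
        rows.append(row)
    return categories, rows

def label_encode(values):
    """Map each distinct value to an integer code."""
    classes = sorted(set(values))
    index = {c: i for i, c in enumerate(classes)}
    return classes, [index[v] for v in values]

# Hypothetical sample values for a feature column and a target column
countries = ["France", "Spain", "Germany", "Spain"]
purchased = ["No", "Yes", "No", "Yes"]

categories, encoded = one_hot(countries)
classes, labels = label_encode(purchased)

print(categories)  # ['France', 'Germany', 'Spain']
print(encoded)     # [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
print(classes)     # ['No', 'Yes']
print(labels)      # [0, 1, 0, 1]
```

One-hot encoding suits a feature column because it does not imply an order between categories, while a plain integer code is enough for a two-class target.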
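
Splitting the data set

from sklearn.model_selection import train_test_split
# Hold out 20% of the rows for testing; random_state makes the split reproducible.
# Y (capital) is the encoded dependent variable from the previous step.
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size = 0.2, random_state = 1)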
  • X_train: features for the training data
  • X_test: features for the test data
  • y_train: dependent variable for the training data
  • y_test: dependent variable for the test data
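
The idea behind the split can be sketched in a few lines of plain Python: shuffle the row indices with a fixed seed, cut off a share for testing, and use the same indices for features and labels so the pairs stay aligned. The function simple_train_test_split and the toy data are hypothetical, and the shuffle will not reproduce the exact rows that scikit-learn's train_test_split selects.

```python
import math
import random

def simple_train_test_split(X, y, test_size=0.2, seed=1):
    """Shuffle row indices once, then slice them into a test part and a train part."""
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)    # fixed seed gives a reproducible split
    n_test = math.ceil(len(X) * test_size)  # test row count, rounded up
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    X_train = [X[i] for i in train_idx]
    X_test = [X[i] for i in test_idx]
    y_train = [y[i] for i in train_idx]
    y_test = [y[i] for i in test_idx]
    return X_train, X_test, y_train, y_test

# Hypothetical toy data: 10 rows, label is the parity of the first feature
X = [[i, i * 10] for i in range(10)]
y = [i % 2 for i in range(10)]

X_train, X_test, y_train, y_test = simple_train_test_split(X, y)
print(len(X_train), len(X_test))
```

With 10 rows and test_size=0.2, eight rows go to training and two to testing, and every row keeps its own label.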
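
Feature Scaling

from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
# Scale the columns from index 3 onward; the first three columns are left as they are
# (the one-hot output, if column 0 holds three categories)
X_train[:, 3:] = sc.fit_transform(X_train[:, 3:])  # learn mean and std from the training data
X_test[:, 3:] = sc.transform(X_test[:, 3:])        # reuse the training mean and std on the test data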
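
Standardization rescales each value as z = (x - mean) / std. The sketch below, with made-up numbers and hypothetical helper names, shows why the scaler is fitted on the training data only: the test values are transformed with the training mean and standard deviation, so nothing about the test set leaks into the fit. It uses the population standard deviation, which is what StandardScaler uses as well.

```python
from statistics import mean, pstdev

def fit_standardizer(column):
    """Learn the mean and population standard deviation of a column."""
    return mean(column), pstdev(column)

def standardize(column, mu, sigma):
    """Apply z = (x - mu) / sigma to every value."""
    return [(x - mu) / sigma for x in column]

# Hypothetical ages in a training set and a test set
train_age = [44.0, 27.0, 30.0, 38.0, 35.0]
test_age = [48.0, 50.0]

mu, sigma = fit_standardizer(train_age)           # fit on training data only
train_scaled = standardize(train_age, mu, sigma)
test_scaled = standardize(test_age, mu, sigma)    # reuse the training statistics

print(train_scaled)
print(test_scaled)
```

The scaled training column has mean 0 and standard deviation 1 by construction. The scaled test column generally does not, and that is expected.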

Aishwar Govil

Data Science || Machine Learning || IOT || Artificial Intelligence