Classifying Breast Cancer with 98.18% Accuracy Using Keras

The dataset I’m using in this project is
Breast Cancer Wisconsin (Original) Data Set
by
Dr. William H. Wolberg (physician)
University of Wisconsin Hospitals
Madison, Wisconsin, USA
Creating, reshaping, scaling and splitting the data
A link to the data can be found here.
Imports
import numpy as np
from sklearn import preprocessing, model_selection
import pandas as pd
Reading the data
df = pd.read_csv('https://archive.ics.uci.edu/ml/machine-learning-databases/breast-cancer-wisconsin/breast-cancer-wisconsin.data', header=None)  # the raw file has no header row
Reshaping
Adding feature columns to the dataframe
df.columns = ['id','clump_thickness','unif_cell_size','unif_cell_shape','marg_adhesion','single_epith_size','bare_nuclei','bland_chrom','norm_nucleoli','mitoses','class']
Dropping the id column because it has no correlation with the class
df.drop(['id'], inplace=True, axis=1)
Replacing missing values ('?') with -99999 so they act as outliers
df.replace('?', -99999, inplace=True)
Mapping the class values to binary; in this data, 2 means benign and 4 means malignant
df['class'] = df['class'].map(lambda x: 1 if x == 4 else 0)
Final dataframe
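As a quick sanity check, the lambda above sends 4 (malignant) to 1 and everything else to 0; a toy sketch on a hypothetical Series, not the real data:

```python
import pandas as pd

# Hypothetical stand-in for the 'class' column: 2 = benign, 4 = malignant
toy = pd.Series([2, 4, 2, 4, 4])

# Same mapping as above: malignant (4) -> 1, benign (2) -> 0
mapped = toy.map(lambda x: 1 if x == 4 else 0)

print(mapped.tolist())  # [0, 1, 0, 1, 1]
```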

Scaling the data
Creating X (features) and y (classes)
X = np.array(df.drop(['class'], axis=1))
y = np.array(df['class'])
Creating a scaler instance
scaler = preprocessing.MinMaxScaler()
Finally, scaling the data
X = scaler.fit_transform(X)
Splitting the data
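The MinMaxScaler fitted above rescales each feature column so its minimum maps to 0 and its maximum to 1. A minimal sketch on hypothetical toy data:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical toy feature matrix: two columns with different ranges
toy = np.array([[1.0, 10.0],
                [5.0, 20.0],
                [9.0, 30.0]])

scaled = MinMaxScaler().fit_transform(toy)
print(scaled)
# [[0.  0. ]
#  [0.5 0.5]
#  [1.  1. ]]
```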
X_train, X_test, y_train, y_test = model_selection.train_test_split(
X, y, test_size=0.2)
Creating the model and training
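With test_size=0.2 as above, train_test_split reserves 20% of the samples for testing; a toy illustration with hypothetical arrays:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical data: 10 samples, 2 features each
X_toy = np.arange(20).reshape(10, 2)
y_toy = np.arange(10)

X_tr, X_te, y_tr, y_te = train_test_split(X_toy, y_toy, test_size=0.2)

print(len(X_tr), len(X_te))  # 8 2
```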
Usual imports
from __future__ import print_function
import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout
import tensorflow as tf
Creating the model
Creating the model instance
model = Sequential()
Adding layers to the model
model.add(Dense(9, activation='sigmoid', input_shape=(9,)))
model.add(Dense(27, activation='sigmoid'))
model.add(Dropout(0.25))
model.add(Dense(54, activation='sigmoid'))
model.add(Dropout(0.25))
model.add(Dense(27, activation='sigmoid'))
model.add(Dropout(0.25))
model.add(Dense(1, activation='sigmoid'))
Compiling the model
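As a rough size check on the stack of layers above: a Dense layer with n_in inputs and n_out units holds n_in * n_out weights plus n_out biases, and Dropout layers add no parameters. Counting them in plain Python (an illustration, not a Keras API call):

```python
# Parameters of one Dense layer: a weight per input-output pair, plus a bias per unit
def dense_params(n_in, n_out):
    return n_in * n_out + n_out

# Layer widths of the model above: 9 -> 9 -> 27 -> 54 -> 27 -> 1
widths = [9, 9, 27, 54, 27, 1]
per_layer = [dense_params(a, b) for a, b in zip(widths, widths[1:])]

print(per_layer)       # [90, 270, 1512, 1485, 28]
print(sum(per_layer))  # 3385
```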
model.compile(optimizer=keras.optimizers.Adam(), loss=keras.losses.mean_squared_logarithmic_error)
I’m using Adam as the optimizer and mean squared logarithmic error as the loss function.
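For reference, mean squared logarithmic error is the mean of the squared differences between log(1 + y_true) and log(1 + y_pred); a minimal numpy sketch of the formula, with made-up probabilities:

```python
import numpy as np

def msle(y_true, y_pred):
    # mean of (log(1 + y_true) - log(1 + y_pred))^2
    return np.mean((np.log1p(y_true) - np.log1p(y_pred)) ** 2)

# Hypothetical labels and predicted probabilities
y_true = np.array([0.0, 1.0, 1.0, 0.0])
y_pred = np.array([0.1, 0.9, 0.8, 0.2])

print(round(msle(y_true, y_pred), 4))  # 0.014
```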
Training the model
model.fit(X_train, y_train, batch_size=30, epochs=2000, verbose=1, validation_data=(X_test, y_test))
Output:
Epoch 2000/2000
558/558 [==============================] - 0s 320us/step - loss: 0.0104 - val_loss: 0.0182
Evaluating results
loss = model.evaluate(X_test, y_test, verbose=1, batch_size=30)
print("Final result is {}".format(100 - loss*100))
Output:
Final result is 98.18395614690546
That is a final result of 98.18% (strictly, 100 minus the loss expressed as a percentage, rather than a thresholded classification accuracy).
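To measure a thresholded classification accuracy instead, one option is to binarize the sigmoid outputs at 0.5 and compare to the labels. A hypothetical sketch, with y_prob standing in for model.predict(X_test):

```python
import numpy as np

# Hypothetical sigmoid outputs and true labels,
# standing in for model.predict(X_test) and y_test
y_prob = np.array([0.05, 0.92, 0.40, 0.81, 0.10])
y_true = np.array([0, 1, 1, 1, 0])

# Threshold at 0.5 to get hard class predictions
y_pred = (y_prob >= 0.5).astype(int)

print(np.mean(y_pred == y_true))  # 0.8 (4 of 5 correct)
```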
