Week 2: Breast Cancer Detection Data Analysis

Yahya Koçak
Published in bbm406f19
Dec 11, 2019

We are Yahya Koçak and Girayhan Yıldırım. This is the second blog post about our project, breast cancer detection. This week, we describe and analyze the data we will use.

Data Information

Our data was generated by observing the cell characteristics of patients examined by William H. Wolberg over a period of 2 years. The measured properties were scaled to values between 0 and 1 for classification. Each sample falls into one of two classes: benign (B) or malignant (M).
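One common way to bring a property into the range 0 to 1 is min-max scaling. The sketch below assumes that is the kind of scaling meant here, which the dataset description does not confirm. The function name `min_max_scale` and the sample values are made up for illustration.

```python
def min_max_scale(values):
    """Rescale a list of numbers linearly so the smallest becomes 0 and the largest 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so there is no spread to rescale.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Made-up sample values, not taken from the real dataset.
radius_sample = [10.0, 12.5, 15.0, 20.0]
scaled = min_max_scale(radius_sample)
print(scaled)
```

Each property would be scaled separately, so a large-valued property such as area does not dominate a small-valued one such as smoothness.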

Our dataset has 32 attributes: id, diagnosis, radius_mean, texture_mean, perimeter_mean,
area_mean, smoothness_mean, compactness_mean, concavity_mean,
concave points_mean, symmetry_mean, fractal_dimension_mean,
radius_se, texture_se, perimeter_se, area_se, smoothness_se,
compactness_se, concavity_se, concave points_se, symmetry_se,
fractal_dimension_se, radius_worst, texture_worst,
perimeter_worst, area_worst, smoothness_worst,
compactness_worst, concavity_worst, concave points_worst,
symmetry_worst, fractal_dimension_worst
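The attribute names above follow a pattern: ten base measurements, each recorded with the suffixes mean, se and worst, plus id and diagnosis. As a quick check, the short sketch below rebuilds the list from that pattern and confirms the count. The helper name `build_column_names` is our own and not part of any library.

```python
# Ten base measurements; each one appears with three suffixes in the dataset.
BASE_FEATURES = [
    "radius", "texture", "perimeter", "area", "smoothness",
    "compactness", "concavity", "concave points", "symmetry",
    "fractal_dimension",
]
SUFFIXES = ["mean", "se", "worst"]


def build_column_names():
    """Return the attribute names in the order listed above."""
    columns = ["id", "diagnosis"]
    for suffix in SUFFIXES:
        for base in BASE_FEATURES:
            columns.append(f"{base}_{suffix}")
    return columns


columns = build_column_names()
print(len(columns))  # 2 + 10 * 3 = 32
```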

Data Analysis

The number of samples in our data set is 569.

Number of Benign and Malignant
Number of Benign: 357
Number of Malignant: 212
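Counts like these can be obtained by tallying the diagnosis column. The sketch below is one minimal way to do it with the standard library. It assumes the data sits in a CSV file with a `diagnosis` column. The helper names and the tiny sample rows are made up, and only the totals 357 and 212 come from our data.

```python
import csv
from collections import Counter


def count_diagnoses(rows):
    """Count how many rows carry each diagnosis label ('B' or 'M')."""
    return Counter(row["diagnosis"] for row in rows)


def count_from_csv(path):
    """Same count, reading rows from a CSV file that has a 'diagnosis' column."""
    with open(path, newline="") as f:
        return count_diagnoses(csv.DictReader(f))


# Tiny made-up sample to show the call; the real file has 569 rows.
sample_rows = [
    {"id": "1", "diagnosis": "M"},
    {"id": "2", "diagnosis": "B"},
    {"id": "3", "diagnosis": "B"},
]
sample_counts = count_diagnoses(sample_rows)
print(sample_counts["B"], sample_counts["M"])

# Class shares computed from the counts reported above.
benign, malignant = 357, 212
total = benign + malignant
benign_pct = round(100 * benign / total, 1)
malignant_pct = round(100 * malignant / total, 1)
print(total, benign_pct, malignant_pct)
```

The two classes are not evenly balanced: about 63% of the samples are benign and about 37% are malignant.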

Visualization of dataset

Visualization of features 3–12
Visualization of features 13–22
Visualization of features 23–32
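The three figures split the measurements into groups of ten. Assuming the numbers in the captions are 1-based column positions (id is column 1 and diagnosis is column 2), the sketch below shows which attributes land in each figure. The helper `columns_in_range` is hypothetical and the plotting step itself is left out.

```python
BASE_FEATURES = [
    "radius", "texture", "perimeter", "area", "smoothness",
    "compactness", "concavity", "concave points", "symmetry",
    "fractal_dimension",
]
# Column order as listed in the dataset description.
ALL_COLUMNS = ["id", "diagnosis"] + [
    f"{base}_{suffix}" for suffix in ("mean", "se", "worst") for base in BASE_FEATURES
]


def columns_in_range(columns, first, last):
    """Return the columns at 1-based positions first..last, inclusive."""
    return columns[first - 1:last]


figure_groups = {
    "3-12": columns_in_range(ALL_COLUMNS, 3, 12),
    "13-22": columns_in_range(ALL_COLUMNS, 13, 22),
    "23-32": columns_in_range(ALL_COLUMNS, 23, 32),
}
for caption, names in figure_groups.items():
    print(caption, names[0], "...", names[-1])
```

Under that assumption, the first figure covers the mean values, the second the se values and the third the worst values.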

See you soon!
