Splitting CSV Into Train And Test Data

Published in the ML blog

3 min read · May 29, 2017

Hello everyone!

When working with a dataset, a machine learning algorithm goes through two stages: training and testing. A common choice is to use 80% of the data for training and hold back the remaining 20% for testing.
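To make the 80/20 idea concrete, here is a minimal sketch in plain Python that shuffles a list of rows and cuts it into two parts. The helper name split_rows, the seed value, and the toy rows are assumptions made for illustration; this is not how sklearn implements its split internally.

```python
import random


def split_rows(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of `rows` and return (train, test) lists.

    `split_rows` is a hypothetical helper for illustration only.
    """
    shuffled = list(rows)                   # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)   # fixed seed makes the split repeatable
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


rows = list(range(100))                     # stand-in for 100 rows of a dataset
train, test = split_rows(rows)
print(len(train), len(test))                # 80 20
```

Shuffling before cutting matters because many CSV files are sorted (by date, by label, and so on), and taking the last 20% of a sorted file would give a test set that does not resemble the training data.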

To implement an ML algorithm successfully, you need to be clear about how to split the data into training and testing sets, and this short post covers exactly that.

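We will start by installing the packages we need: pandas and scikit-learn (the package that provides sklearn), for example with pip install pandas scikit-learn.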

We will use pandas to import the dataset and sklearn for its train_test_split() function, which splits the data into the two parts.

Next, we will start our program by importing the packages needed for the process.
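As a rough sketch of where this is heading, the imports and the split itself might look like the following. The column names and the in-memory CSV text are made up so the example runs on its own; with a real file you would pass its path to pd.read_csv() instead. The random_state value is also an arbitrary choice.

```python
import io

import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for a CSV file on disk: 10 rows, 3 made-up columns.
csv_text = "feature_a,feature_b,label\n" + "\n".join(
    f"{i},{i * 2},{i % 2}" for i in range(10)
)

# Load the CSV into a DataFrame (pass a file path here for a real dataset).
df = pd.read_csv(io.StringIO(csv_text))

# Hold back 20% of the rows for testing; random_state makes the shuffle repeatable.
train_df, test_df = train_test_split(df, test_size=0.2, random_state=42)

print(train_df.shape, test_df.shape)  # (8, 3) (2, 3)
```

If you want the two parts back as CSV files, each DataFrame can be written out with to_csv(), passing index=False to leave out the row index.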


Written by Nishank Sharma