Splitting CSV Into Train And Test Data

Nishank Sharma
Published in the ML blog
3 min read, May 29, 2017


Hola everyone!

When working with datasets, a machine learning algorithm works in two stages: the training stage and the testing stage. The data is commonly split 80%-20%, with 80% used for training and 20% held back for testing.
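To make the 80%-20% idea concrete, here is a minimal sketch in plain Python. It is only an illustration: the 100 made-up rows, the fixed seed, and the variable names are all assumptions, not part of any dataset used in this post.

```python
import random

# Stand-in for 100 dataset rows (hypothetical data, just row numbers)
rows = list(range(100))

# Shuffle a copy so the split is not biased by the original row order;
# the fixed seed only makes this sketch repeatable
rng = random.Random(0)
shuffled = rows[:]
rng.shuffle(shuffled)

# The first 80% of the shuffled rows go to training, the rest to testing
split_at = int(len(shuffled) * 0.8)
train_rows = shuffled[:split_at]
test_rows = shuffled[split_at:]

print(len(train_rows), len(test_rows))
```

Every row ends up in exactly one of the two parts, which is the property any train-test split needs.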

To successfully implement an ML algorithm, you need to be clear about how to split the data into training and testing sets, and this short post is about exactly that.

We will start by installing the packages needed.
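A typical way to install both packages is with pip. This is a sketch that assumes pip is available in your Python environment; note that sklearn is published on PyPI under the name scikit-learn.

```shell
# Assumes pip belongs to the Python environment you will run the code in
pip install pandas scikit-learn
```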

We will be using pandas to import the dataset we will be working on and sklearn for the train_test_split() function, which will be used for splitting the data into the two parts.
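As a quick feel for what train_test_split() does, here is a small sketch on a plain list rather than a real dataset. The ten sample values and the random_state value are assumptions chosen for illustration; test_size=0.2 asks for a 20% test portion.

```python
from sklearn.model_selection import train_test_split

# Ten made-up samples standing in for dataset rows
samples = list(range(10))

# test_size=0.2 keeps 20% for testing; random_state makes the shuffle repeatable
train_part, test_part = train_test_split(samples, test_size=0.2, random_state=42)

print(len(train_part), len(test_part))
```

The function shuffles by default, so the test portion is a random pick of rows rather than simply the last ones.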

Next, we will start our program by importing the packages needed for the process.
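A minimal sketch of how such a program might begin is shown here. The CSV content, column names, and variable names are hypothetical; the text is read from an in-memory string only so the example runs on its own, and in practice you would pass the path of your own CSV file to pd.read_csv().

```python
import io

import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical CSV content; replace io.StringIO(csv_text) with your file path
csv_text = """feature_a,feature_b,label
1,10,0
2,20,1
3,30,0
4,40,1
5,50,0
6,60,1
7,70,0
8,80,1
9,90,0
10,100,1
"""

# Load the CSV into a DataFrame
dataset = pd.read_csv(io.StringIO(csv_text))

# Split the rows 80%-20%; random_state is an arbitrary seed for repeatability
train_df, test_df = train_test_split(dataset, test_size=0.2, random_state=42)

print(train_df.shape, test_df.shape)
```

Both parts keep all the original columns; only the rows are divided between them.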


Hello, I’m Nishank. I design beautiful, usable and enjoyable interfaces.