What happens when our data is not a time-series, but still have a time dimension which is very important? This is a Python solution for time-based cross-validation with all required inputs and an output matches scikit-learn methods. — Training and evaluating machine learning models usually require a training set and a test set. In most cases, train and test splitting is done randomly by taking 20% of the data as test data, unseen by the model and using the rest for training.