Chris Kim, Model Maker

I find and make models.
I think they’re models with pretty interesting looks and traits. I train them. I test them. Even though I try to make the best models, they sometimes don’t score very highly. You have probably guessed already, but I’m talking about predictive models.
I wrote a function that lets me choose, from an original dataset, the features, target, NLP vectorizer, and classifier. Given these inputs, it returns an accuracy score computed against the actual target values. Then, using a helper function, I get a dataframe that lists which vectorizer and classifier were used, sorted in descending order by accuracy score. All this work is to find out which model and parameters give the most accurate predictions.
I used these functions in my last project, Data Scientist Seeking Data Scientist. These two functions work well for analyzing text data to see whether NLP can help predict classes.
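The two functions described above can be sketched roughly as follows. This is a minimal illustration, not the project code: the names `score_model` and `rank_models`, the single text column as the only feature, the 75/25 train/test split, and the toy dataset are all assumptions made for the example.

```python
import pandas as pd
from sklearn.base import clone
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB


def score_model(df, feature, target, vectorizer, classifier, random_state=42):
    """Vectorize one text column, fit the classifier, return test accuracy.

    Hypothetical helper: assumes `feature` names a single text column.
    """
    X_train, X_test, y_train, y_test = train_test_split(
        df[feature], df[target],
        test_size=0.25, stratify=df[target], random_state=random_state,
    )
    # Fit the vectorizer on training text only, then reuse it on the test text.
    X_train_vec = vectorizer.fit_transform(X_train)
    X_test_vec = vectorizer.transform(X_test)
    classifier.fit(X_train_vec, y_train)
    return accuracy_score(y_test, classifier.predict(X_test_vec))


def rank_models(df, feature, target, vectorizers, classifiers):
    """Score every vectorizer/classifier pair and sort by accuracy, best first."""
    rows = []
    for vec in vectorizers:
        for clf in classifiers:
            rows.append({
                "vectorizer": type(vec).__name__,
                "classifier": type(clf).__name__,
                # clone() gives each pairing a fresh, unfitted copy.
                "accuracy": score_model(df, feature, target, clone(vec), clone(clf)),
            })
    return (
        pd.DataFrame(rows)
        .sort_values("accuracy", ascending=False)
        .reset_index(drop=True)
    )


# Toy data, invented purely for illustration.
toy = pd.DataFrame({
    "text": [
        "build machine learning models in python",
        "train neural networks on large datasets",
        "deploy predictive models to production",
        "analyze data with pandas and sql",
        "tune hyperparameters for better accuracy",
        "write feature engineering pipelines",
        "manage client accounts and sales targets",
        "lead the marketing campaign for retail",
        "negotiate contracts with new vendors",
        "plan quarterly sales and budgets",
        "coordinate events and customer outreach",
        "handle customer support and billing",
    ],
    "label": ["ds"] * 6 + ["other"] * 6,
})

results = rank_models(
    toy, "text", "label",
    vectorizers=[CountVectorizer(), TfidfVectorizer()],
    classifiers=[LogisticRegression(max_iter=1000), MultinomialNB()],
)
print(results)
```

Each row of `results` is one vectorizer/classifier pairing, with the most accurate pairing at the top. On a dataset this small the scores mean very little; the point is only the shape of the workflow.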
Things to improve on: construct the function as a pipeline or as an object. I’ve done some pipelining and object-oriented coding, but it is not as natural to me as writing a straight function. There are great examples at the bottom of Scikit-Learn’s website that I will need to go back and read through again. For those interested, here is my code:
I promise to update this blog once I rewrite my code in a pipeline/class. Hopefully, promising this makes me accountable!
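As a rough idea of what a pipeline version might look like, here is a minimal sketch using scikit-learn's `Pipeline` and `GridSearchCV`. It is an assumption about one possible direction, not the promised rewrite: the step names `vectorizer` and `classifier`, the candidate models, and the toy texts are made up for illustration, and it ranks pairings by 3-fold cross-validated accuracy rather than by a single train/test split.

```python
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

# Toy data, invented purely for illustration.
texts = [
    "build machine learning models in python",
    "train neural networks on large datasets",
    "deploy predictive models to production",
    "analyze data with pandas and sql",
    "tune hyperparameters for better accuracy",
    "write feature engineering pipelines",
    "manage client accounts and sales targets",
    "lead the marketing campaign for retail",
    "negotiate contracts with new vendors",
    "plan quarterly sales and budgets",
    "coordinate events and customer outreach",
    "handle customer support and billing",
]
labels = ["ds"] * 6 + ["other"] * 6

# The initial steps are placeholders; the grid below swaps whole steps in and out.
pipeline = Pipeline([
    ("vectorizer", CountVectorizer()),
    ("classifier", MultinomialNB()),
])

# Naming a step as a grid key tells GridSearchCV to try each estimator in that slot.
param_grid = {
    "vectorizer": [CountVectorizer(), TfidfVectorizer()],
    "classifier": [LogisticRegression(max_iter=1000), MultinomialNB()],
}

search = GridSearchCV(pipeline, param_grid, scoring="accuracy", cv=3)
search.fit(texts, labels)

# One row per vectorizer/classifier pairing, best mean accuracy first.
ranking = (
    pd.DataFrame(search.cv_results_)[
        ["param_vectorizer", "param_classifier", "mean_test_score"]
    ]
    .sort_values("mean_test_score", ascending=False)
    .reset_index(drop=True)
)
print(ranking)
```

The appeal of this design is that the vectorizer is refit inside every cross-validation fold, so no test text leaks into the vocabulary, and `search.best_estimator_` is a fitted pipeline ready to call `predict` on raw text.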
