Awesome read!
Matt Ross

Matt,

It sounds like what you are describing is feature selection. The method you described seems like it would work for understanding feature importance; however, I’ve never tried it myself. I also wonder what the computational cost of that method would be, and whether there are similar methods that could get you the same information more cheaply.

A simple form of feature selection is calculating the correlation between each input feature and the target. Doing this for every input/target pair would give you a good idea of which features are essential and which are less so. I believe this would yield results similar to what you propose doing.
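
As a minimal sketch, assuming a NumPy feature matrix X (samples by features) and a target vector y (the data here is just a toy stand-in), scoring features by correlation could look like this:

```python
import numpy as np

def correlation_scores(X, y):
    """Return the absolute Pearson correlation of each column of X with y."""
    scores = []
    for j in range(X.shape[1]):
        # np.corrcoef returns a 2x2 correlation matrix; the off-diagonal
        # entry is the correlation between this feature and the target.
        r = np.corrcoef(X[:, j], y)[0, 1]
        scores.append(abs(r))
    return np.array(scores)

# Toy data standing in for a real dataset: y depends on features 0 and 2.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 3 * X[:, 0] + 0.5 * X[:, 2] + rng.normal(size=200)
print(correlation_scores(X, y))  # columns 0 and 2 should score highest
```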

Going up the complexity ladder, there is exhaustive feature search, where you try every possible combination of features and check the effect on overall accuracy; the combination that gives you the highest accuracy wins. If you tracked the change in accuracy while adding and removing features, you would get what you originally wanted, but this method is not very efficient.
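
To make the idea concrete, here is a rough sketch of an exhaustive search using scikit-learn; the estimator (logistic regression) and the cross-validation setup are just illustrative choices, not part of the method itself:

```python
from itertools import combinations

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def exhaustive_search(X, y):
    """Try every non-empty feature subset; return the best one and its score."""
    best_subset, best_score = None, -np.inf
    for k in range(1, X.shape[1] + 1):
        for subset in combinations(range(X.shape[1]), k):
            # Cross-validated accuracy using only the selected columns.
            score = cross_val_score(
                LogisticRegression(max_iter=1000), X[:, list(subset)], y, cv=5
            ).mean()
            if score > best_score:
                best_subset, best_score = subset, score
    return best_subset, best_score
```

With n features this loop evaluates 2^n - 1 subsets, which is exactly why the approach stops being practical beyond a handful of features.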

Finally, my favorite approach: Minimum Redundancy Maximum Relevance (mRMR) feature selection attempts to do precisely what the name suggests. Sadly, there is not a lot of easy-to-understand material on this method online *idea for a new blog post maybe*. “Thoughtful Machine Learning with Python” by Matthew Kirk has an excellent chapter on feature selection, and specifically on mRMR feature selection.
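
For a flavor of how it works, here is a rough greedy sketch of the common “relevance minus mean redundancy” criterion, built on scikit-learn’s mutual information estimators. To be clear, this is my own illustrative sketch, not the implementation from Kirk’s book, and it assumes a classification target:

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif, mutual_info_regression

def mrmr_select(X, y, k):
    """Greedily pick k features maximizing relevance minus average redundancy."""
    # Relevance: mutual information between each feature and the class label.
    relevance = mutual_info_classif(X, y, random_state=0)
    selected, remaining = [], list(range(X.shape[1]))
    for _ in range(k):
        best_j, best_score = None, -np.inf
        for j in remaining:
            if selected:
                # Redundancy: average mutual information between the candidate
                # feature and the features already chosen.
                redundancy = np.mean([
                    mutual_info_regression(X[:, [j]], X[:, s], random_state=0)[0]
                    for s in selected
                ])
            else:
                redundancy = 0.0
            score = relevance[j] - redundancy
            if score > best_score:
                best_j, best_score = j, score
        selected.append(best_j)
        remaining.remove(best_j)
    return selected
```

Each round, the feature that tells you the most about the target while overlapping least with what you have already picked gets added, which is the intuition behind the name.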

I enjoyed reading “Under The Hood of Neural Network Forward Propagation — The Dreaded Matrix Multiplication.” I’m looking forward to the next article.
