100 Days of ML — Day 15 — Your First NFL Neural Network in 13 Lines of Code
So I revisited my very terrible neural network from Day 10, researching things on StackOverflow and StackExchange and stacks of pancakes at local diners (which didn’t have much in the way of AI, but man, I really needed the carbs after five days on Keto).
In that article, I mentioned that the neural network’s problems essentially came down to either the lack of a hidden layer or the lack of normalization. I went back to some old projects, got confused by my own old code, and decided not to add a hidden layer. I went with normalization instead. Here it is:
from numpy import exp, array, random, dot, linalg

# Inputs: first downs, passing yards, rushing yards for four games
input_raw = array([[27, 138, 94], [16, 160, 138], [30, 247, 182], [20, 269, 65]])
# Outputs: points scored in those same four games
output_raw = array([[8, 20, 24, 9]]).T
# Test game: Philly's first game
test_raw = array([18, 119, 113])

# Normalize by dividing everything by its Euclidean norm
training_set_inputs = input_raw / linalg.norm(input_raw)
training_set_outputs = output_raw / linalg.norm(output_raw)
test = test_raw / linalg.norm(test_raw)

random.seed(1)
synaptic_weights = 2 * random.random((3, 1)) - 1

for iteration in range(10000):
    output = 1 / (1 + exp(-(dot(training_set_inputs, synaptic_weights))))
    synaptic_weights += dot(training_set_inputs.T, (training_set_outputs - output) * output * (1 - output))

# Denormalize the prediction by multiplying back by the output norm
print((1 / (1 + exp(-(dot(test, synaptic_weights))))) * linalg.norm(output_raw))
To normalize the data, you throw in one of NumPy’s functions: linalg.norm(). Despite the scary name, all it does is compute the Euclidean length of your data (the square root of the sum of the squares), and dividing by that length shrinks every number down into sigmoid-friendly territory between 0 and 1.
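Here’s a quick sanity check on one row of the input data, just to show what that function spits out (in the script above I call it on the whole 4x3 matrix at once, which gives one norm for the entire table):

from numpy import array, linalg

# One row of the input data: first downs, passing yards, rushing yards
row = array([27.0, 138.0, 94.0])

# linalg.norm() is just the Euclidean length: sqrt(27**2 + 138**2 + 94**2), about 169
print(linalg.norm(row))

# Dividing by that length squashes every value into the 0-1 range
print(row / linalg.norm(row))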
The input data (which I don’t think I mentioned last time) was from the first two Cowboys games and the first two Redskins games. The test data is from Philly’s first game. It breaks down like this:
— Number of First Downs
— Number of Passing Yards
— Number of Rushing Yards
Here’s what my neural network predicted for 18 first downs, 119 passing yards, and 113 rushing yards:
[ 25.49805554]
At the end, my neural network said Philly should have scored 25.5 points. They actually scored 18. My little neural network that could was off by a touchdown. And it was over, which would get me bounced on The Price is Right, and it does mean there are issues I’ll address in a moment.
I tried Arizona’s first game. They had 16 first downs, 225 passing yards, and 74 rushing yards. And my neural network said:
[ 7.42979666]
They actually scored 12. I’m encouraged because at least I’m not pumping out probabilities anymore and I’m within a touchdown, but in my head, I’m getting yelled at by the offensive coordinator for missing my assignment.
Here are the issues I probably have:
— Not enough iterations: I did okay here. 10,000 should be enough for a dataset this size, but maybe it could have used more. Having said that, maybe what it really needed was:
— More data. I gave my neural network a sample size of 4. That’s the equivalent of a teenager who knows everything after reading “A Brief History of Time”. I really need to up the ante: wait until every team has played five games and use all of them. I didn’t even have a validation set, for crying out loud. Also, I need to optimize this next thing:
— Better features. My features are okay, but there is some correlation between yardage and first downs, both rushing and passing. I’d probably be better off with features that aren’t so tangled up with each other. At the very least, I should choose 3–8 more features for my neural network to munch on.
— Make sure my math is right. As stated, normalizing data means taking every value in the list/vector, subtracting the average of the list/vector, and dividing by the standard deviation. The inverse is to multiply your data by the standard deviation and add the average back: normalized = (x - mean) / std, and x = normalized * std + mean. (Note that this is not what dividing by linalg.norm() does, so it’s a prime suspect.) There’s a quick NumPy sketch of the round trip after this list.
I also didn’t scale, so that might be hurting a few things.
— Finally, I still want that hidden layer. Even with features that correlate, a hidden layer can account for those interactions and correct for them. There’s a rough sketch of that below too.
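Here’s a minimal NumPy sketch of that normalize/denormalize round trip, using the four actual scores from above. This is the mean-and-standard-deviation version I just described, not the linalg.norm() trick my script actually uses:

from numpy import array

scores = array([8.0, 20.0, 24.0, 9.0])  # points scored in the four training games

mean, std = scores.mean(), scores.std()

normalized = (scores - mean) / std  # subtract the average, divide by the standard deviation
restored = normalized * std + mean  # the inverse: multiply by the std, add the average back

print(normalized)
print(restored)  # should match the original scores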
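And since I keep bringing it up, here’s a rough sketch of what the same network might look like with a hidden layer bolted on. Fair warning: I haven’t run this against real data yet, and the choice of four hidden neurons is a guess, not gospel:

from numpy import exp, array, random, dot, linalg

# Same training data as before
input_raw = array([[27, 138, 94], [16, 160, 138], [30, 247, 182], [20, 269, 65]])
output_raw = array([[8, 20, 24, 9]]).T

X = input_raw / linalg.norm(input_raw)
y = output_raw / linalg.norm(output_raw)

random.seed(1)
weights_hidden = 2 * random.random((3, 4)) - 1  # 3 inputs -> 4 hidden neurons
weights_output = 2 * random.random((4, 1)) - 1  # 4 hidden neurons -> 1 output

for iteration in range(10000):
    hidden = 1 / (1 + exp(-dot(X, weights_hidden)))        # forward pass, hidden layer
    output = 1 / (1 + exp(-dot(hidden, weights_output)))   # forward pass, output layer

    output_delta = (y - output) * output * (1 - output)    # error at the output
    hidden_delta = dot(output_delta, weights_output.T) * hidden * (1 - hidden)  # error pushed back to the hidden layer

    weights_output += dot(hidden.T, output_delta)
    weights_hidden += dot(X.T, hidden_delta)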
I’ll follow up in the middle of October with stronger data sets and better practices.
Jimmy Murray is a Florida-based comedian who studied Marketing and Film before finding himself homeless. Resourceful, he taught himself coding, which led to a ton of opportunities in many fields, the most recent of which is coding away his podcast editing. His entrepreneurial skills and love of automation have led to a sheer love of all things related to AI.
Resources:
https://en.wikipedia.org/wiki/Softmax_function
https://stackoverflow.com/questions/16171151/how-to-handle-real-numbers-in-a-neural-network
https://stats.stackexchange.com/questions/97477/denormalizing-data
https://stackoverflow.com/questions/21030391/how-to-normalize-an-array-in-numpy
https://stackoverflow.com/questions/32888108/denormalization-of-predicted-data-in-neural-networks
#100DaysOfML
#ArtificialIntelligence
#MachineLearning
#DeepLearning