Capstone Countdown…
The final week of my Data Science Immersive class starts today, but in reality, this week, and the eleven that preceded it, mark much more of a beginning than an end. This course has opened a new professional path for me and my classmates, an exciting path in a challenging and interesting field. Before that path continues, however, the capstone project must be completed.
If anyone taking this class ever reads this blog somewhere down the road, I can tell you this: do not wait until the last minute to work on these projects. I am able to enjoy this week because I have been working tirelessly in the evenings and on the weekends to make sure I would not have my back against the wall with the deadline looming.
My project is coming along. It doesn’t do nearly everything I wanted it to do, but that’s OK. I wanted it to pick baseball games with some degree of accuracy, and it does, just with much lower accuracy than I had hoped. And I still have four days to improve it!
One of the things I have come to grips with over the last six weeks or so is that I’m not going to learn everything I need to know about Python, predictive modeling, or the tools and websites involved, during the twelve weeks of this class. And I’ve grown to be OK with that.
When I compare what I know today to what I knew on April 23, the day before the class started, I am amazed at the progress. These ‘Immersive’ classes present an incredible amount of information in a very short time period. It took some time, and a Shakespeare reading from our professor, but I no longer feel bad about not being able to understand everything completely, immediately.
And I would relay that message to any future student in one of these immersive classes who reads this. As long as you are giving the process everything you have, whatever progress you are making is still progress, even if there are model parameters you don’t understand or Python code you cannot remember.
Regarding my project, I have now tried hundreds of parameter combinations in several different models, including Logistic Regression, Random Forests, and a Support Vector Classifier. So far, I have been unable to get any of the models to pick the winner in much more than about 57% of games, and even that figure is not achieved on a consistent basis. But the dozens and dozens of hours spent on the project have not been a waste, nor will the dozens more I will spend over the next few days be, regardless of the final results.
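One likely reason a figure like 57% is hard to reproduce from run to run is plain sampling noise: accuracy measured on a limited set of games carries a margin of error. Here is a minimal sketch of that idea using only the standard library. The function name `accuracy_interval` and the test-set sizes are my own illustrative assumptions, not numbers or code from the project, and the normal approximation is only a rough guide.

```python
import math


def accuracy_interval(correct, total, z=1.96):
    """Rough normal-approximation interval for a classifier's accuracy.

    `correct` is the number of games picked correctly out of `total`.
    z=1.96 corresponds to an approximate 95% interval.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = correct / total
    se = math.sqrt(p * (1 - p) / total)  # standard error of a proportion
    return max(0.0, p - z * se), min(1.0, p + z * se)


# Hypothetical test-set sizes, chosen only for illustration.
for total in (100, 400, 2400):
    correct = round(0.57 * total)
    low, high = accuracy_interval(correct, total)
    print(f"{total:>5} games: accuracy {correct / total:.3f}, "
          f"roughly {low:.3f} to {high:.3f}")
```

Under these assumptions, the interval for a 100-game test set stretches below 50 percent, so a model that looks like a 57% picker on one split could look like a coin flip on another. Larger test sets narrow the range, which is one argument for judging each parameter combination on as many held-out games as possible.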
I have thought about my dad a great deal while I have worked on this project. He taught me to love baseball and sports statistics. He also taught me to do things the right way. I have not always followed that advice, but when I have, the results have been solid.
These are my thoughts as week twelve begins.
