Data Science / Machine Learning Resources

I maintain this list of data science and machine learning resources for myself and for new entrants to the field. Feedback is appreciated.

Ground-level material:

  1. Data Science for Business : one of the few books that starts with business problems and shows how ML/DS can be applied. Good place to start in the field.
  2. Introduction to Information Retrieval : by Chris Manning et al. : blends machine learning and software engineering to build information retrieval / search systems.
  3. fastml.com : blog that explains and experiments with a wide range of machine learning models; a good place to survey ML ideas.
  4. blog.echen.me : detailed exploration of specific data science problems / areas
  5. Bayesian Methods for Hackers : groundwork for Bayesian analysis
  6. Primer on neural networks : by Andrej Karpathy
  7. NLP and neural networks : by Yoav Goldberg

Various topics:

  1. Propensity modeling: http://blog.echen.me/2014/08/15/propensity-modeling-causal-inference-and-discovering-drivers-of-growth/
  2. A/B testing pitfalls: https://www.quora.com/When-should-A-B-testing-not-be-trusted-to-make-decisions
  3. Causal impact modeling: http://multithreaded.stitchfix.com/blog/2016/01/13/market-watch

Deeper material:

  1. Elements of Statistical Learning : Frequentist take on a variety of machine learning techniques, a good reference book
  2. Bayesian Data Analysis : first-level Bayesian analyses
  3. Linear Dynamical Systems : video lectures by Stephen Boyd on linear algebra, oriented around practical applications rather than heavy theory.
  4. Pattern Recognition and Machine Learning : Bayesian view of machine learning
  5. Recommender Systems Handbook : a good introduction and a good reference

Software Engineering materials:

  1. Paper on technical debt in machine learning systems

Other materials:

  1. Stock compensation: https://blog.wealthfront.com/new-college-grad-stock-compensation/