Learn R and/or Python
Most people will recommend starting with R, but I'd say you can pick either R or Python, especially if you've already been working with Python. I switched to Python some time ago, and when I need R there's a Python module, 'RPy', that lets me do (almost) all of it from Python.
- R: Getting Started with Data Science — by Mark Steadman and Dallin Akagi
- Python: Getting Started with Data Science — by Mark Steadman and Dallin Akagi
Read Tan's data mining book
- Xiong, Hui, Pang-Ning Tan, and Vipin Kumar. "Mining strong affinity association patterns in data sets with skewed support distribution." Data Mining, 2003. ICDM 2003. Third IEEE International Conference on. IEEE, 2003.
- Slides: http://www-users.cs.umn.edu/~kumar/dmbook/index.php#item4
Work on Data Science competitions
Read DataTau [http://www.datatau.com/news]
Some of my bookmarks
- http://www.edii.uclm.es/~useR-2013/Tutorials/kuhn/user_caret_2up.pdf
- http://www.johnmyleswhite.com/notebook/2009/02/25/text-processing-in-r/
- http://in.pycon.org/2011/static/files/talks/11/Introduction_To_ML_Partial_2.pdf
- http://www.pytables.org/docs/LargeDataAnalysis.pdf
- http://caret.r-forge.r-project.org/training.html
- http://scikit-learn.org/stable/
- https://github.com/JohnLangford/vowpal_wabbit/wiki
- [Enigma](http://enigma.io)
- http://www.data.gov.in/catalogs/?filter=catalog_type%3Acatalog_type_raw_data&sort=updated%20desc
- http://www.inf.ed.ac.uk/teaching/courses/dme/html/datasets0405.html
- http://ghdx.healthmetricsandevaluation.org/global-burden-disease-study-2010-gbd-2010-data-downloads
- http://www.indiawaterportal.org/datafinder
- http://www.healthmetricsandevaluation.org/tools/data-visualizations
- https://visualisingadvocacy.org/
- http://setosa.io/#/
- http://olihb.com/
- http://dl.dropbox.com/u/7586336/blogger/Cambridge_R_googleVis_with_knitr_and_RStudio_May_2012.html#(1)
- http://lamages.blogspot.co.uk/2012/09/interactive-web-graphs-with-r-overview.html
- http://visual.ly/
- http://blog.ajantriks.net/
- http://geohackers.in/tutorials/
- https://github.com/geohacker/indicwiki
- http://cis-india.org/openness/blog/indic-wikipedia-visualisation-project-visualising-basic-parameters
- http://www.thisisthegreenroom.com/2009/choropleths-in-r/
- http://blog.revolutionanalytics.com/
- http://www.cs171.org/#!index.md