I have been getting a really large number of messages on social media (LinkedIn, Twitter) from people asking how to become a data scientist. I think it’s high time I penned down all my thoughts in the form of an article. This article aims to tell you, from beginning to end, how to learn machine learning and (a lot of) the stuff around it, and get that prestigious data scientist job.
Beginnings are always the toughest. Before you get started with learning actual machine learning or deep learning, there are a few prerequisites that I feel every data scientist should know.
- Linear Algebra: Before you actually start your machine learning journey, make sure you know the basics of linear algebra. Things like matrices, vectors and matrix multiplication shouldn’t make you uncomfortable; if they do, I would suggest you learn them first. If you can’t find the right source to learn from, don’t worry, look below.
- Calculus: Another really important (and also really fun) thing to learn is calculus. I feel that, to begin with, differential calculus is enough to learn a good amount of machine learning/deep learning. You should be well versed with derivatives, slopes of lines, and functions like sigmoid, tanh, etc.
- Python!: Very important, since machine learning isn’t only math-heavy but coding-heavy as well. You should be fine not only with the basics of Python coding but also with the important (read: mandatory) libraries like NumPy, SciPy, pandas, Matplotlib and scikit-learn.
- Statistics: Last but not least, you should be comfortable with a bit of statistics. Basics like mean, median, mode and standard deviation should be enough to get you going.
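To give a feel for the level these prerequisites are at, here is a minimal sketch that touches three of them: a matrix multiplication, the sigmoid function with its derivative, and a few summary statistics. It deliberately uses only Python's standard library so the arithmetic stays visible; in practice you would reach for NumPy instead. The helper names (`matmul`, `sigmoid_derivative`, `numerical_derivative`) and the sample numbers are my own, made up for illustration.

```python
import math
import statistics


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    if len(a[0]) != len(b):
        raise ValueError("inner dimensions must match")
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def sigmoid(x):
    """Logistic sigmoid: squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_derivative(x):
    """Analytic derivative of the sigmoid: s(x) * (1 - s(x))."""
    s = sigmoid(x)
    return s * (1.0 - s)


def numerical_derivative(f, x, h=1e-5):
    """Central-difference estimate of the slope of f at x."""
    return (f(x + h) - f(x - h)) / (2 * h)


# Linear algebra: a 2x2 matrix product.
A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
product = matmul(A, B)  # [[19, 22], [43, 50]]

# Calculus: the analytic slope should agree with the numerical estimate.
slope_analytic = sigmoid_derivative(0.3)
slope_numeric = numerical_derivative(sigmoid, 0.3)

# Statistics: mean, median, mode and (population) standard deviation.
sample = [2, 4, 4, 5, 7]
summary = {
    "mean": statistics.mean(sample),
    "median": statistics.median(sample),
    "mode": statistics.mode(sample),
    "stdev": statistics.pstdev(sample),
}
```

If none of this looks scary, you are probably ready to move on; if it does, that is a good hint about which prerequisite to revisit first.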
OK, phew, too much, right? Now that you’ve learnt all that, what next?
Now let’s get started with learning actual machine learning. This is where the authority on the subject, and my favourite machine learning teacher, Andrew Ng, comes into the picture.
Start with the world’s favourite course on machine learning, Coursera’s Machine Learning.
This course will get you started on machine learning, and on linear algebra as well.
Unfortunately, when this course was developed, MATLAB was the favourite language for machine learning. The trend has now totally changed: it’s Python.
My advice to all of you (which I followed myself as well) is to learn everything from the course videos, then implement the course assignments and each week’s learnings in Python yourself. To get started, you will easily find a lot of standard implementations of machine learning algorithms in Python on the Internet; if not, worry not, look further below. 😃
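As an illustration of what "implementing it in Python yourself" can look like, here is a minimal sketch of linear regression with one feature, trained by batch gradient descent. This is not the course's own assignment code; the function names, the toy data, the learning rate and the iteration count are all assumptions I picked for the example.

```python
def predict(theta0, theta1, x):
    """Hypothesis for one feature: h(x) = theta0 + theta1 * x."""
    return theta0 + theta1 * x


def cost(theta0, theta1, xs, ys):
    """Mean squared error cost, halved (the usual 1/(2m) convention)."""
    m = len(xs)
    return sum((predict(theta0, theta1, x) - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)


def gradient_descent(xs, ys, alpha=0.05, iterations=2000):
    """Batch gradient descent: update both parameters from all examples each step."""
    theta0, theta1 = 0.0, 0.0
    m = len(xs)
    for _ in range(iterations):
        errors = [predict(theta0, theta1, x) - y for x, y in zip(xs, ys)]
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        # Simultaneous update of both parameters.
        theta0 -= alpha * grad0
        theta1 -= alpha * grad1
    return theta0, theta1


# Toy data generated from y = 1 + 2x, so the fit should recover those values.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
theta0, theta1 = gradient_descent(xs, ys)
final_cost = cost(theta0, theta1, xs, ys)
```

Writing the loop by hand like this is slow compared to a vectorised NumPy version, but it makes every step of the update rule explicit, which is the whole point while you are learning.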
So, once you have started this awesome journey of machine learning, you will need some help along the way. I am gonna try to cover those topics.
A machine learning library? Now what’s that?
Once you start coding Prof. Andrew’s teachings, you will realise that to code all the models or algorithms you need some standard Python library for off-the-shelf code. There are a lot on the market: TensorFlow, PyTorch, Keras, Theano and so on. I personally use TensorFlow (my favourite, also because I have made some contributions to it 😜). To find off-the-shelf, ready-made code using TensorFlow, covering a step-by-step pandas intro and machine learning models like linear regression, logistic regression and neural nets, start from here. I am sure you will make your way ahead 😄
But how do I learn TensorFlow?
Hmm, interesting. Ideally you should be learning TensorFlow through the code labs I shared above, but if you still feel the need to learn TensorFlow formally, feel free to do this course (right from Google!).
Just this one machine learning course isn’t sufficient!
Yup, that’s 100% true. This course alone will barely get you started. After it, jump into an awesome ocean called deep learning. Again, our awesome professor Andrew Ng comes to our rescue.
With his fairly recent specialisation on Coursera, here, he has brought us awesome stuff again: this time, a 5-course series to take you inside awesome deep learning.
OK, that’s fine, but what about real-world problems?
Yes, I understand that at some point you will feel: all that’s cool, but where are the problems? Real ones! Where do I practise my skills? Where do I get my hands dirty? Well, worry not, our saviour is here: the mighty Kaggle.
Kaggle’s got everything, starting from the basics, to data sets and kernels (you will know what those are soon).
OK, what after all this? I need a job!!
Sorry to burst your bubble, but all this usually isn’t sufficient to get a data scientist job 😐. Companies love to hire generalists, so you need to have a number of other skills, like:
- Data Engineering: Before a data scientist actually does some “science” on the data, they need to have the data, right? Getting data from multiple sources into a standard system is what data engineering is. Big data, Apache Spark, Hive, etc. come into the picture here.
- SQL: Oh yes, to query databases you must know the language of databases, called SQL, or Structured Query Language.
- Cloud Architectures: Yes, now that so many machine learning applications are built on the cloud, you should know how applications are architected: what is done, why it’s done and how it’s done.
- Math (really intense): Not to scare you or anything, but I have actually had interviews where I was asked questions on probability, information theory, graphs, calculus and a few more topics. That’s rare, but it happens!
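To make the SQL point from the list above a little more concrete, here is a minimal sketch of the kind of query you should be able to write without thinking: group rows and aggregate them. It uses Python's built-in `sqlite3` module with an in-memory database; the `employees` table, its columns and its rows are entirely made up for the example.

```python
import sqlite3

# An in-memory database: nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, team TEXT, salary REAL)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [
        ("Asha", "data", 120.0),
        ("Ben", "data", 100.0),
        ("Chen", "platform", 90.0),
    ],
)

# Headcount and average salary per team, in alphabetical team order.
rows = conn.execute(
    "SELECT team, COUNT(*) AS headcount, AVG(salary) AS avg_salary "
    "FROM employees GROUP BY team ORDER BY team"
).fetchall()
conn.close()

for team, headcount, avg_salary in rows:
    print(team, headcount, avg_salary)
```

The same `SELECT ... GROUP BY` pattern carries over to most SQL databases, even though the connection code will differ from one system to another.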
Before I consider this article finished for now, I would like to add some general practices that I feel every deep learning engineer and data scientist (employed or unemployed) should follow:
1. Never stop learning! Yes, no matter whether you have gotten a job or not, you should not stop learning. I learn every day! Currently I am religiously studying this awesome book, mandatory for every data scientist: The Deep Learning Book.
2. Start implementing papers: Once you’ve reached a certain level, I would suggest you start implementing state-of-the-art research papers in Python or C++. This cool website will give you the resources to do so: Papers with Code.
3. Start following the giants on social media: Yes, most importantly, follow the leaders of the field, Andrew Ng, Ian Goodfellow, Yoshua Bengio and many others, on social media (LinkedIn, Twitter). This will keep you posted on the latest updates in the field, and you will also get to know their views :)