a) That was a faux pas on my part.
b) Using R doesn’t qualify someone to be a data scientist?
R and SAS are two other common tools of a data scientist. Even then, there are still more tools for modeling. Personally, my experience is with R, Python and MATLAB. I've never been a fan of the last one.
But it really does depend on where you work.
In some companies, the term data scientist is used as a hiring technique. People like to be called a data scientist, so a company may take a data analyst role and rename it. It makes the candidate feel better... until they find out what work they will be doing.
In other companies, it is more of a research position. This would call for running experiments or doing analysis either on live populations or on previously collected data. This is probably what most people assume a data scientist is.
The reason I picked data scientist vs. data engineer is that those job titles are much easier to distinguish, even across companies. I don’t try to pin down the difference between data scientist and statistician, because some will say there is a difference and others will say there isn’t.
c) Is the difference whether someone can handle big data or not?
I don’t mean in the sense of having to manage a Hadoop cluster. Even data engineers don’t often have to manage that kind of work. I just think that, when it comes to being practical, understanding how much compute your model might take is an important step. For instance, I find it important to think about how to reduce the population size, or sample it, so that a model doesn’t block other processes from running.
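To make the sampling point concrete, here is a rough Python sketch of the idea: cap the number of rows with a random sample, then time the fit at a few sample sizes to see how the cost grows before running on everything. The names `fit_model`, `timed_fit` and `downsample` are made up for illustration, and `fit_model` is only a stand-in for whatever expensive model you actually run.

```python
import random
import time


def fit_model(rows):
    # Hypothetical stand-in for an expensive fit: mean absolute pairwise
    # difference, which does O(n^2) work on n rows.
    total = 0.0
    for a in rows:
        for b in rows:
            total += abs(a - b)
    return total / (len(rows) ** 2)


def timed_fit(rows):
    # Return the fit result together with the wall-clock seconds it took.
    start = time.perf_counter()
    result = fit_model(rows)
    return result, time.perf_counter() - start


def downsample(population, max_rows, seed=0):
    # Simple random sample without replacement; a population that is
    # already small enough is returned unchanged (as a new list).
    if len(population) <= max_rows:
        return list(population)
    return random.Random(seed).sample(population, max_rows)


# Synthetic data, purely for illustration.
rng = random.Random(42)
population = [rng.gauss(0.0, 1.0) for _ in range(20000)]

# Time the fit on growing samples to get a feel for how the cost scales.
for n in (250, 500, 1000):
    sample = downsample(population, n)
    _, seconds = timed_fit(sample)
    print(f"n={n}: {seconds:.3f}s")
```

If the time roughly quadruples each time the sample doubles, as it should for this quadratic stand-in, that is a hint that running on the full population would tie up the machine for a long while, and a smaller sample may be the practical choice.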