An Introduction to Data Science

Vaishnavi Ajmera
Published in VLearn Together
6 min read · Jun 6, 2020

When we start learning anything new, it helps to know a few things about it first: what it is, what its features are, and what its scope and domain are. So I thought I would give a brief introduction to Data Science.

Data is a word we hear almost everywhere, in every field. Every second we generate some type of data: using social media, buying products online or offline, watching videos, learning, creating, and doing almost anything on our smartphones, which also record our daily activities. According to EMC, there will be around 40 trillion gigabytes (40 zettabytes) of data by 2020. We use this data in many ways, for personal growth as well as for the advancement of the world.

This is where Data Science comes in. There are many definitions from different people, but after reading and analyzing them, I find that their essence can be used to define it as follows:

Data Science is the language for expressing or communicating with data. It is the art of answering questions using data, i.e., using data to generate useful insights and to find the trends hidden in it. Data Science is more about data than about science.

In recent years, the resources needed to become a Data Scientist have become ubiquitous. We have tons of data, a wide range of algorithms, and plenty of open-source software, and we can now store enormous amounts of data at very low cost. So there has never been a better time to be a data scientist. It is no wonder that Harvard Business Review called data scientist “the sexiest job of the 21st century”.

A Data Scientist is someone who takes data, cleans it, analyzes it, and tells fascinating stories through it. In short, a data scientist is a person who finds solutions to problems using data.
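To make the "clean it, analyze it" part concrete, here is a minimal sketch using only the Python standard library. The records, the field names, and the helper functions `clean` and `average_by_category` are all made up for illustration; real projects usually rely on dedicated data libraries and far messier data.

```python
from collections import defaultdict
from statistics import mean

# Made-up raw records: some amounts are missing or malformed
raw = [
    {"category": "books", "amount": "12.50"},
    {"category": "books", "amount": ""},
    {"category": "games", "amount": "40"},
    {"category": "games", "amount": "n/a"},
    {"category": "books", "amount": "7.50"},
    {"category": "games", "amount": "60"},
]

def clean(records):
    """Keep only the records whose amount parses as a number."""
    cleaned = []
    for rec in records:
        try:
            amount = float(rec["amount"])
        except ValueError:
            continue  # drop rows with missing or malformed amounts
        cleaned.append({"category": rec["category"], "amount": amount})
    return cleaned

def average_by_category(records):
    """Group cleaned records by category and average their amounts."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec["category"]].append(rec["amount"])
    return {cat: mean(vals) for cat, vals in groups.items()}

cleaned = clean(raw)
summary = average_by_category(cleaned)
print(summary)
```

The "story" here is tiny (which category has the higher average purchase), but the shape is the usual one: take raw data, discard or repair what is unusable, then summarize what remains.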

What makes a person a Data Scientist?

To be a Data Scientist, a person should have three qualities: being curious, argumentative, and judgemental. Apart from these qualities, you should consider which industry you want to work in, since data scientists work in many different ones. If you want to work as a data scientist for an IT or web-based firm, you need one set of skills; if you want to work in the health industry, you need a different set. So it is up to you to figure out which field interests you and where you have a competitive advantage. Maybe it is health, finance, retail, computers, film, or something else. Once you have figured that out, you can acquire the related skills, platforms, and tools. After gaining proficiency in these, you can use this knowledge to solve real-world problems and show the world what you can do with data.

The Domains of Expertise in Data Science

Data science is a very broad field that covers a multitude of skills and technologies. The fields that come under Data Science are as follows:

  1. Data Engineering: This is the aspect of Data Science that focuses on the practical side of collecting data and validating it, including where it comes from. It also includes organizing the data so that it is in a usable form.
  2. Data Analysis and Visualization (Data Mining): Data Mining is the process of extracting insights from data, using certain methodologies and techniques, in order to make smart decisions. Data Visualization is the way to present these insights using visual tools and methods. The applications of data mining depend on the industry we are working in.
  3. Database Management: A Database Management System keeps data tracked accurately and regularly, since even a minute detail or change in the data can be useful, and it maintains the links among data. Roles such as Data Specialist and Database Administrator come under Database Management.
  4. Business Intelligence: Business Intelligence helps businesses grow by finding patterns and trends in their historical data. It uses data to drive change, eliminate inefficiencies, and adapt quickly to changes in the market or in supply. BI tools surface useful insights through reports, summaries, dashboards, graphs, charts, and maps, giving users detailed intelligence about the state of the business.
  5. Machine Learning: Machine Learning is the field of study that gives computers the ability to learn without being explicitly programmed. It is a methodology for making predictions from data, which in turn helps us make better decisions. Once the data has been processed and organized by Database Administrators and Business Analysts, it is handed to ML Engineers to make predictions. Machine Learning is mainly classified into Supervised and Unsupervised Learning. First, a model is trained on the available data; then it is used to make predictions on new or unseen data. Various metrics are used to measure the accuracy or quality of the model.
  6. Deep Learning: Deep Learning is the branch of Machine Learning that uses artificial neural networks, whose design is inspired by the brain, to learn from large amounts of data. It lets machines solve more complex problems that are hard to solve with basic machine learning algorithms. An important property of neural networks is that they tend to produce better results with more data, bigger models, and more computation.
  7. Natural Language Processing: Natural Language Processing is the processing of natural language, i.e., raw data generated by humans, such as text or speech. It draws on linguistics and is used to convert speech to text or text to speech, or to extract knowledge from that raw data. There are many applications in this field, such as the voice assistants Amazon’s Alexa and Apple’s Siri.
  8. Computer Vision: Computer Vision is the branch of Data Science that trains computers to understand visual data. It helps computers see and gain a high-level understanding of digital images or videos, much as we humans see through our eyes and identify different things.
  9. Reinforcement Learning: Reinforcement Learning is the domain of Machine Learning that lets computers learn a good strategy from experimental trials and the relatively simple feedback they receive. It can be thought of as learning from experience. An example is how we pick up a game by trying moves for some time and seeing which ones pay off.
  10. Cloud Computing: Cloud Computing is the on-demand delivery of computing services such as servers, storage, databases, networking, and software. Because so much data is available today, processing and analyzing it takes a large amount of resources. These can be rented from cloud service providers, which offer mobility, ease of use, and affordability, and which let us use distributed computing to increase computation speed and performance.
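The train-then-predict workflow described under Machine Learning can be sketched in a few lines. This is only an illustration under stated assumptions: the dataset is tiny and made up, and the model is a hand-written 1-nearest-neighbour classifier (the function names `predict` and `accuracy` are hypothetical) chosen so that the example runs with nothing but the Python standard library (3.8 or later, for `math.dist`).

```python
import math

# Made-up labelled training data: (height_cm, weight_kg) -> label
train = [
    ((150.0, 50.0), "small"),
    ((155.0, 55.0), "small"),
    ((160.0, 58.0), "small"),
    ((180.0, 85.0), "large"),
    ((185.0, 90.0), "large"),
    ((190.0, 95.0), "large"),
]

# Held-out examples the model never saw during training
test = [
    ((152.0, 52.0), "small"),
    ((188.0, 92.0), "large"),
    ((158.0, 57.0), "small"),
]

def predict(train_rows, point):
    """Return the label of the training example closest to `point`."""
    nearest = min(train_rows, key=lambda row: math.dist(row[0], point))
    return nearest[1]

def accuracy(train_rows, test_rows):
    """Fraction of test examples whose predicted label matches the true label."""
    correct = sum(predict(train_rows, x) == y for x, y in test_rows)
    return correct / len(test_rows)

print(predict(train, (183.0, 88.0)))  # prediction for a new, unseen point
print(accuracy(train, test))          # a simple quality metric on held-out data
```

Because the data is labelled, this is supervised learning. "Training" a nearest-neighbour model just means storing the examples; most other models fit parameters in that step, but the split between known data, new data, and a metric stays the same.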

Applications of Data Science

You can find applications of Data Science all around you in today’s world. In fact, you are probably using many of them without knowing that they rely on data science, and many of them use your data to make your life easier. There are thousands of applications in this field. Some of the most important ones are:

Speech recognition, image detection, the recommender systems of e-commerce websites, Netflix, and Instagram, disease detection and medicine recommendation in healthcare, internet search, gaming, augmented reality, fraud detection, market basket analysis, and more.
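To give a feel for one of these, market basket analysis at its simplest counts which products are bought together. The sketch below is a rough illustration with made-up baskets and a hypothetical helper `pair_counts`; real systems use association-rule algorithms on far larger datasets.

```python
from collections import Counter
from itertools import combinations

# Made-up shopping baskets, one set of items per purchase
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "eggs"},
    {"bread", "butter", "eggs"},
]

def pair_counts(transactions):
    """Count how many baskets contain each pair of items."""
    counts = Counter()
    for basket in transactions:
        # sorted() gives each pair a fixed order, so (a, b) and (b, a) are one key
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return counts

counts = pair_counts(baskets)
top_pair, top_count = counts.most_common(1)[0]
support = top_count / len(baskets)  # share of baskets containing the pair
print(top_pair, top_count, support)
```

A shop could use a result like this to place frequently paired items together or to suggest one when the other is added to the cart.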

Data Science is a fast-emerging field in today’s world. Almost every person is associated with it, directly or indirectly, and it is hard to imagine today’s world without it. We are generating lots of data, and we have the ability to use this data for our own betterment. Many experiments are happening all over the world that build on our ability to analyze data and design experiments. Their results are being used to roll out large-scale efforts that provide relief, credit, opportunities, and resources to those who have been disenfranchised in the past, giving them a chance to share in the prosperity, happiness, and health of the rest of humankind. So we can say that the future of Data Science, as well as our own, is going to be bright.
