Learning DATA SCIENCE
This material brings together a set of posts related to Machine Learning. We start with concepts, scenarios and predictions for Artificial Intelligence (AI), along with some basic concepts of Statistics. Next, we show some tools that help us day to day when working with Machine Learning. We also give a brief description of Data and Big Data, including Non-Relational Databases (NoSQL).
With this foundation in place, we move on to the main subject, Machine Learning, with several materials detailing algorithms, techniques and libraries. We give greater focus to Classification Algorithms and Natural Language Processing (NLP) and, last but not least, to the Metrics that can be applied.
Then we talk about Data Science, an area that has been growing and tends to keep growing: what it is, what a Data Scientist does, the tools used, and a post with several videos showing the techniques and tools used in the daily life of a Data Scientist.
Finally, we look at some applications that use Machine Learning.
That’s it. I hope you like this compilation of posts and that it helps in some way with your studies! This post will be constantly updated, and I count on everyone’s feedback so that we can improve this material even more. If you want to suggest articles for me to add here, please do; the idea is for this to be a source of study material.
BEFORE YOU BEGIN
Some traits of professionals in this area:
- Curious
- Love to learn (always learning)
- Problem solver (this point is very important: everything we do is meant to solve some situation or problem, so focusing on it is crucial. It all starts with understanding the problem; knowing what you really need to do to solve it is an excellent start.)
CONCEPTS, SCENARIOS, FORECASTS…
Articles on Machine Learning concepts, AI predictions and scenarios.
Analyst predictions 2022: The future of data management
THE AI INDEX REPORT (Measuring trends in Artificial Intelligence)
THE AGE OF AI (ORIGINAL SERIES — YOUTUBE)
Very good series explaining the advances of AI
MATERIALS ABOUT ARTIFICIAL INTELLIGENCE, MACHINE LEARNING, STATISTICS AND MORE…
MATHEMATICS, PROBABILITY AND STATISTICS
Material focused on Mathematics, Probability and Statistics: an excellent foundation…
Estatidados — Master Thiago Marques Statistics Community
Probability Learning I : Bayes’ Theorem
Probability Scoring Methods in Python
Common Probability Distributions: The Data Scientist’s Crib Sheet
Probabilistic Model Selection with AIC, BIC, and MDL
High-performance mathematical paradigms in Python
A Simple Introduction to Complex Stochastic Processes
MATHEMATICS COURSE — KHAN ACADEMY
( Pre-Calculus | Differential Calculus | Integral Calculus | Differential Equations | Multivariable Calculus | Linear Algebra | Statistics and Probability | Advanced Statistics (AP®︎ Statistics) )
TOOLS
Some of the most used tools in this universe…
7 FREE DATA ANALYSIS TOOLS YOU SHOULD KNOW
TOP 27 FREE SOFTWARE FOR TEXT ANALYSIS, TEXT MINING AND TEXT ANALYTICS
SCIKIT-LEARN AND KERAS (CHEAT SHEET)
SCIKIT-LEARN — USER GUIDE
- Challenge: Masters of Scikit-Learning (with Mario Filho)
DATA and BIG DATA
Data Concepts, where to find data sources, Big Data, NOSQL and also SQL.
DATA
Data are the raw material of information, that is, information that has not yet been processed. Data carry one or more meanings that, in isolation, cannot convey a message or represent knowledge.
DATA SOURCES: WHERE TO FIND THEM?
HOW TO BECOME A COMPANY DRIVEN BY FACTS AND DATA
MCKINSEY: COMPETING IN A DATA-DRIVEN WORLD
DATA-DRIVEN MUST BE A CULTURE AND NOT A PROJECT.
IN 60 SECONDS (what happens on the Internet in 60 seconds, with historical data since 2016)
KNOW THE POWER OF BIG DATA — INFOGRAPHIC
EXTRACTING BUSINESS VALUE FROM THE 4 V’S OF BIG DATA
BIG DATA: HOW TO TURN A DATABASE INTO STRATEGY
NOSQL
Non-Relational Databases
SQL stands for “Structured Query Language”. It was created so that communication with a database could be simple, agile, and easy for developers to understand and learn.
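To give a feel for what "communicating with a database" in SQL looks like, here is a minimal sketch using Python's built-in sqlite3 module. The table name, column names and values are hypothetical and exist only for illustration; any relational database would accept a very similar query.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
# The "posts" table and its columns are hypothetical, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, topic TEXT, views INTEGER)")
conn.executemany(
    "INSERT INTO posts (topic, views) VALUES (?, ?)",
    [("machine learning", 120), ("nlp", 80), ("machine learning", 50)],
)

# SQL is declarative: we describe the result we want (total views per topic),
# not the steps needed to compute it.
rows = conn.execute(
    "SELECT topic, SUM(views) AS total FROM posts GROUP BY topic ORDER BY total DESC"
).fetchall()
conn.close()

print(rows)
```

The query reads almost like a sentence, which is a large part of why SQL is considered easy to learn.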
MACHINE LEARNING
Everything you need to know to get started in Machine Learning, check it out…
A LITTLE HISTORY… A Brief History of Machine Learning
WHAT YOU NEED TO KNOW ABOUT MACHINE LEARNING…
HOW MACHINE LEARNING EVOLVED OVER THE PERIOD
DIFFERENCE BETWEEN DATA MINING AND MACHINE LEARNING
FUNDAMENTALS OF MACHINE LEARNING ALGORITHMS (WITH PYTHON AND R CODE)
YOUR FIRST MACHINE LEARNING PROJECT IN PYTHON (STEP BY STEP)
Machine Learning Tutorial with the Titanic dataset, by Mario Filho (video series)
MACHINE LEARNING YEARNING (Excellent book)
14 DIFFERENT TYPES OF LEARNING IN MACHINE LEARNING
- Supervised Learning
- Unsupervised Learning
- Reinforcement Learning
- Semi-Supervised Learning
- Self-Supervised Learning
- Multi-Instance Learning
- Inductive Learning
- Deductive Inference
- Transductive Learning
- Multi-Task Learning
- Active Learning
- Online Learning
- Transfer Learning
- Ensemble Learning
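To make the first two types in this list more concrete, here is a small sketch in plain Python that contrasts supervised and unsupervised learning. The data, the labels and the function names are made up for illustration; real projects would normally use a library such as scikit-learn instead of hand-written code like this.

```python
# Supervised learning: every example comes with a label, and we predict the
# label of a new point. Here we use a 1-nearest-neighbour rule.
# The (feature, label) pairs below are hypothetical.
labelled = [(1.0, "small"), (1.5, "small"), (8.0, "large"), (9.0, "large")]

def predict_1nn(x):
    """Return the label of the labelled example closest to x."""
    nearest = min(labelled, key=lambda pair: abs(pair[0] - x))
    return nearest[1]

# Unsupervised learning: same features, but no labels. We only look for
# structure, here with a tiny one-dimensional k-means using k = 2.
# Assumes the input has at least two distinct values.
def two_means(values, iterations=10):
    """Split values into two groups around two centres."""
    c1, c2 = min(values), max(values)  # start the centres at the extremes
    for _ in range(iterations):
        g1 = [v for v in values if abs(v - c1) <= abs(v - c2)]
        g2 = [v for v in values if abs(v - c1) > abs(v - c2)]
        c1, c2 = sum(g1) / len(g1), sum(g2) / len(g2)  # move centres to the group means
    return g1, g2

prediction = predict_1nn(7.2)
groups = two_means([x for x, _ in labelled])
print(prediction, groups)
```

In the supervised case the answer ("small" or "large") was given to the algorithm in advance; in the unsupervised case the two groups emerge from the numbers alone, and it is up to us to interpret them.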
AN ESSENTIAL GUIDE FOR NUMPY TO MACHINE LEARNING IN PYTHON
7 TECHNIQUES FOR DIMENSIONALITY REDUCTION
DATA REPRESENTATION IN MACHINE LEARNING
BEST LIBRARIES FOR PYTHON AND NATURAL LANGUAGE PROCESSING
Classification Algorithms, Regression, Neural Networks, Clustering …
DECISION TREES AND RANDOM FORESTS FOR CLASSIFICATION AND REGRESSION
INTRODUCTION TO THE K-NEAREST NEIGHBOUR ALGORITHM (PYTHON CODE)
SVM ALGORITHM (SUPPORT VECTOR MACHINE) FROM EXAMPLES AND CODE (PYTHON AND R)
NLP — Natural Language Processing
It is a field of Artificial Intelligence that gives machines the ability to read, understand and extract meaning from human languages.
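As a first taste of how a machine can start to "read" text, here is a minimal bag-of-words sketch using only the Python standard library. The sentence and the very simple tokenization rule are assumptions made for illustration; the libraries listed below handle tokenization far more carefully.

```python
import re
from collections import Counter

def bag_of_words(text):
    """Lowercase the text, split it into word tokens and count each token."""
    # Deliberately naive tokenizer: runs of letters and apostrophes only.
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tokens)

# Hypothetical example sentence.
counts = bag_of_words("Machines read text. Machines extract meaning from text!")
print(counts.most_common(2))
```

Turning text into counts like these is one of the simplest ways to give an algorithm numbers to work with, and many classic NLP techniques build on this idea.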
BEST LIBRARIES FOR PYTHON AND NATURAL LANGUAGE PROCESSING
TOP 10 POSTS +1 ABOUT NLP 2019…
YOUR GUIDE TO NATURAL LANGUAGE PROCESSING (NLP)
NATURAL LANGUAGE PROCESSING WITH DEEP LEARNING (TALK — VIDEO)
7 DEEP LEARNING APPLICATIONS FOR NATURAL LANGUAGE PROCESSING
ADVANCED NATURAL LANGUAGE PROCESSING (NLP) FOR ENTERPRISE DOMAINS
RECOMMENDATION SYSTEMS IN PRACTICE
PROTOTYPING A RECOMMENDER SYSTEM STEP BY STEP, PART 1: KNN ITEM-BASED COLLABORATIVE FILTERING
AN INTRODUCTION TO TOPIC MODELING USING LATENT SEMANTIC ANALYSIS (IN PYTHON)
THE AMAZING POWER OF WORD VECTORS
WORD EMBEDDING — VISUAL INSPECTOR
INTRODUCTION TO WORD EMBEDDINGS
DEEP LEARNING BOOK
Book in Portuguese on Deep Learning, provided by the Data Science Academy.
LINEAR ALGEBRA FROM THE DEEP LEARNING BOOK BY GOODFELLOW, I., BENGIO, Y., AND COURVILLE, A. (2016)
METRICS
Most used model evaluation metrics in Machine Learning
CROSS VALIDATION: CONCEPT AND EXAMPLE IN R
INTERPRETING MACHINE LEARNING MODELS (EN)
EVALUATION OF THE CLASSIFICATION MODEL
ROC CURVE EXPLAINED IN AN IMAGE
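As a quick illustration of the kind of metrics covered in the materials above, here is a sketch that computes accuracy, precision, recall and F1 for a binary classifier by hand. The true labels and predictions are invented for the example; in practice you would usually rely on a library implementation.

```python
# Hypothetical ground truth and model predictions (1 = positive, 0 = negative).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

pairs = list(zip(y_true, y_pred))
tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

accuracy = (tp + tn) / len(pairs)   # share of all predictions that are correct
precision = tp / (tp + fp)          # of the predicted positives, how many are right
recall = tp / (tp + fn)             # of the real positives, how many were found
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of the two

print(accuracy, precision, recall, f1)
```

Which metric matters most depends on the problem: when false negatives are costly (a missed disease, for example), recall usually deserves more attention than accuracy.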
DATA SCIENCE
Data science is a term that escapes any single complete definition, which makes it difficult to use, especially if the goal is to use it correctly. Most articles and publications use the term loosely, with the assumption that it is universally understood. However, data science (its methods, goals, and applications) evolves with time and technology. Data science 25 years ago referred to collecting and cleaning data sets and applying statistical methods to that data. By 2018, data science had grown into a field that encompasses data analytics, predictive analytics, data mining, business intelligence, machine learning, and more.
More definitions, job roles, methods, packages, videos and much more about Data Science below…
DEFINE DATA SCIENCE: WHAT, WHERE AND HOW IS DATA SCIENCE
COMPARISON — JOBS IN DATA SCIENCE
10 MACHINE LEARNING METHODS EVERY DATA SCIENTIST SHOULD KNOW
5 PYTHON PACKAGES A DATA SCIENTIST CAN’T LIVE WITHOUT
45 TECHNIQUES USED BY DATA SCIENTISTS
12 ALGORITHMS EVERY DATA SCIENTIST SHOULD KNOW
TOP USED DATA SCIENCE LIBRARIES FOR PYTHON, R AND SCALA
DATA SCIENCE RESOURCES: CHEAT SHEETS (R, PYTHON…)
[CHEAT SHEET] PYTHON BASICS FOR DATA SCIENCE
LEARN DATA SCIENCE WITH DATA MINING
Video Collection on Data Science (Algorithms, Graphs, Tips, Models in Production)
DATA VISUALIZATION
Below is a recommendation for a book that shows how to build better charts and more attractive dashboards. What is the point of good data analysis if we don’t know how to present it in a clean and clear way? Recommended reading: Storytelling with Data: A Guide to Data Visualization for Business Professionals
“Storytelling with Data is admirably well written, a masterful display of rare art in the business world. Cole Nussbaumer Knaflic possesses a unique skill, a gift, in telling stories using data. At JPMorgan Chase, she helped improve our ability to explain complicated analyses to executive management and the regulators we work with. Cole’s book brings her talents together in an easy-to-read guide, with excellent examples anyone can learn from to spur smarter decision making.”
― Mark R. Hillis , JPM Chase’s Chief Mortgage Risk Officer
Some tools you can use for Data visualization:
- Microsoft Power BI
- Tableau
- Google Data Studio
Useful links (tips from Prof. Grimaldo)
- R visual gallery: https://www.r-graph-gallery.com/
- Python Visual Gallery: https://python-graph-gallery.com/
- Sites to visit:
- d3js.org (various charts and scripts)
- https://datavizcatalogue.com/ (models)
- https://depictdatastudio.com/charts/ (templates)
- https://github.com/ft-interactive/chart-doctor/tree/master/visual-vocabulary (templates)
Storytelling:
- https://www.data-to-viz.com/
- https://www.visualcapitalist.com/
- https://howmuch.net/
- History of Data Visualization: http://euclid.psych.yorku.ca/SCS/Gallery/milestone/historia_infografia.pdf
Example of what not to do…
MACHINE LEARNING APPLICATIONS
Check out some applications that use Machine Learning. Do you know of any others? Have you built one? Send it in the comments and I will add it here!
LUPPAR NEWS-REC (INTELLIGENT NEWS RECOMMENDER)
DETECTION OF PARKINSON’S DISEASE THROUGH VOICE RECORDINGS
PREDICTIVE MODELS OF ENEM 2015 ESSAY SCORES
APPLICATIONS OF NEURAL NETWORKS AND GENETIC ALGORITHMS (GOOGLE’S DINOSAUR AND FLAPPY BIRD)
GENERATING FORECAST GRAPHICS USING R TO PREDICT MEDALS IN THE OLYMPICS
TEXT ANALYTICS WITH R, PRACTICAL EXAMPLE: ANALYZING TWITTER FOOTBALL DATA
5 APPLICATIONS OF ARTIFICIAL INTELLIGENCE IN MEDICINE
ARTIFICIAL INTELLIGENCE LEARNS CHEMISTRY TO PREDICT HOW TO MAKE MEDICINES
HOW TO GET STARTED IN THE AREA OF DATA SCIENCE
This is a widely discussed topic, and several questions come up, such as:
- What course should I take to become a data scientist?
- Are there job openings in the market?
- How do I become more visible and compete for a data science position?
- How do I earn BRL 20,000.00 or more per month, as I saw on television?
Here are some personal comments regarding the above questions:
- There is no single course that will make you a data scientist; there are several good courses in this area that will help you on your way. The tip is to take a course and absorb as much content as possible, write down the points you need to improve (for example: programming, statistics, linear algebra…), and then research and study those points on your own (if you prefer, take specific courses in those areas);
- Yes, there are many openings in the market, both for Data Scientists and for Data Engineers. The Data Engineer is the one who lays the foundation for the Data Scientist. It is an area that is also growing a lot and, in my view, companies with large Artificial Intelligence projects should start with this role before bringing in a Data Scientist (this will vary from company to company, depending on the size of the projects and their goals);
- If you don’t have one yet, build a portfolio (on GitHub, for example, or on a blog) covering the work you have already done. It can be university work, personal projects and so on (always try to take a project from beginning to end, including deployment). In other words, show your potential! This will make you more visible to the market;
- Don’t be fooled: this is not quite the salary usually paid, especially here in Brazil. Of course there are large institutions that pay this amount or even more, but they are the exception and not the rule! The tip is: study, and you can get to that salary!
More information at: Getting Started in Data Science.
That’s it, folks. I hope you liked it, and I count on everyone’s support in sending suggestions for materials that we can add here to make this material even better!