Due to the current stay-at-home policy, I am working from home. While browsing the internet for COVID-19 news articles, I became curious about how some websites manage to fetch the latest news and articles from so many other websites.
I thought, fantastic! Do they build a separate crawler for each website to get the latest news? That would take a lot of work.
Or maybe there is a free API for retrieving the news? …
Why do we need to learn how to extract data from a WebSocket?
Isn’t requests already able to scrape almost every kind of website?
First of all, let me give a brief introduction to WebSocket, and you will see why we need to learn how to scrape it.
The WebSocket protocol provides a persistent, real-time connection between a browser and a server. Websites that show live data use it to push updates from the server to the page as they happen, so you can watch the ever-changing data without reloading.
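To make the idea of a persistent connection concrete, here is a minimal sketch in Python. It does not touch the network: `FakeConnection` is a hypothetical stand-in for an open WebSocket, `read_updates` is a helper name I made up, and the price ticks are invented for illustration. The point is only that one open connection keeps delivering messages through repeated `recv()` calls, instead of one request per response.

```python
import json
from collections import deque


class FakeConnection:
    """Hypothetical stand-in for an open WebSocket connection.

    It only mimics a blocking recv() call; no network is involved.
    """

    def __init__(self, frames):
        self._frames = deque(frames)

    def recv(self):
        # A real connection would block here until the server pushes data.
        return self._frames.popleft()


def read_updates(conn, limit):
    """Collect `limit` JSON messages pushed over a single open connection."""
    updates = []
    for _ in range(limit):
        updates.append(json.loads(conn.recv()))
    return updates


# Made-up price ticks, purely for illustration.
conn = FakeConnection([
    '{"symbol": "BTC", "price": 9100.5}',
    '{"symbol": "BTC", "price": 9101.0}',
])
updates = read_updates(conn, limit=2)
print(updates)
```

Against a real site, you would swap `FakeConnection` for a real client. As far as I know, the third-party `websocket-client` package offers `create_connection(url)`, which returns an object with a similar `recv()` method, but treat that as an assumption to check against its documentation.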
You might ask, what kinds of websites normally use WebSocket?
I know of a few specific kinds that do: for instance, live betting, cryptocurrency, and stock market websites. …
Back when I was studying at university, I badly wanted to go into the data science field.
I kept studying a lot of statistics in the hope that I could ace my data science interviews.
However, I kept failing.
After that, I spent some time trying to work out whether there were any common things that most hiring managers look for.
I realized that I did not actually have a good data science project to showcase.
Hence, I spent some time crafting a data science project, and thought I was going to succeed this time. …