
Pitch day
So I pitched the project idea to the professors and other students. It went well. There is one big ‘but’.
The whole thing really started as a way to learn new things, such as hybrid apps and web scraping.
The feedback I got was that I had to do good research into what can and cannot be scraped, privacy- and copyright-wise.
I also had to make a tighter plan. I’m not much of a planner for school projects. I just start early and spread the work evenly over the available weeks. I make sure to test everything along the way so I’m prepared for stumbling blocks, and that way I never have to rush to reach a deadline. It works for me, but I understand that when you work together on projects, or in the real world, this is not always ideal. So a tighter plan it is.
So I researched the legal restrictions on scraping, and they were right. A LOT of websites don’t allow automated scripts to scrape data, no matter what you’re planning on using it for. There are a lot of things you can do about that though, such as:
- Leaving a link to a page in which you say who you are and why you are scraping the site.
- Limiting the rate of automated requests to a website to one every 10–15 seconds.
- Checking their robots.txt file to see whether you are allowed to scrape.
- Etc …
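To make the three points above a bit more concrete, here is a minimal Python sketch of how they could fit together. It is not the code for this project, just an illustration using only the standard library. The bot name, the info page URL in `USER_AGENT`, the `PoliteFetcher` class and the 10-second default are all my own assumptions; putting a `+URL` in the User-Agent header is a common convention for the "who am I" link, not a rule.

```python
import time
import urllib.request
from urllib.robotparser import RobotFileParser

# Hypothetical values: replace with your own bot name and info page.
USER_AGENT = "MyLearningBot/0.1 (+https://example.com/about-my-scraper)"
MIN_DELAY_SECONDS = 10  # assumed: lower end of the 10-15 second range


def build_robot_parser(robots_txt: str) -> RobotFileParser:
    """Parse the text of a robots.txt file that was already downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


class PoliteFetcher:
    """Checks robots.txt rules and spaces requests out over time."""

    def __init__(self, parser, min_delay=MIN_DELAY_SECONDS,
                 clock=time.monotonic, sleep=time.sleep):
        self.parser = parser
        self.min_delay = min_delay
        self.clock = clock    # injectable so the delay logic can be tested
        self.sleep = sleep
        self.last_request = None

    def allowed(self, url: str) -> bool:
        """True if the parsed robots.txt rules permit fetching this URL."""
        return self.parser.can_fetch(USER_AGENT, url)

    def wait_turn(self) -> None:
        """Sleep until at least min_delay seconds have passed since the last request."""
        if self.last_request is not None:
            remaining = self.min_delay - (self.clock() - self.last_request)
            if remaining > 0:
                self.sleep(remaining)
        self.last_request = self.clock()

    def fetch(self, url: str) -> bytes:
        """Fetch a page, but only if allowed and only after waiting our turn."""
        if not self.allowed(url):
            raise PermissionError(f"robots.txt disallows {url}")
        self.wait_turn()
        request = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
        with urllib.request.urlopen(request, timeout=30) as response:
            return response.read()
```

In use, you would download the site's robots.txt once, pass its text to `build_robot_parser`, and then call `fetch` for every page. Keep in mind that robots.txt only tells you what the site asks of crawlers; the site's terms of use can still forbid scraping.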
I’m going to build this whole thing anyway for the sole purpose of learning the technology. I’m going to be testing on websites that specifically allow it, and I’m planning on implementing all the things listed above.
Source:
Web Scraping and crawling are perfectly legal, Right? — https://benbernardblog.com/web-scraping-and-crawling-are-perfectly-legal-right/

