I have used Beautiful Soup. You can of course combine Scrapy and Beautiful Soup: I could easily have used Beautiful Soup inside of parse_dir_contents for the code.
Beautiful Soup strictly does parsing, while Scrapy handles a lot of other things such as requesting pages and following links. The next tutorial I am making is on combining Scrapy with Beautiful Soup.
Taken from elsewhere:
Scrapy is a web-spider or web-scraper framework. You give Scrapy a root URL to start crawling, then you can specify constraints on how many URLs you want to crawl and fetch, etc. It is a complete framework for web scraping or crawling.
BeautifulSoup is a parsing library. It does not download pages itself, but paired with an HTTP client such as urllib or requests it lets you pull out specific parts of a page without any hassle. It only handles the contents of the one URL that you give it and then stops. It does not crawl unless you manually put it inside a loop with your own stopping criteria.
In simple words, with Beautiful Soup you can build something similar to Scrapy. Beautiful Soup is a library while Scrapy is a complete framework.
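To make the "put it inside a loop" idea concrete, here is a minimal sketch of a hand-rolled crawler built around Beautiful Soup. The function name crawl, the fetch parameter, and the max_pages limit are my own assumptions for illustration; fetch is any callable that takes a URL and returns its HTML, so the downloading part stays separate from the parsing.

```python
from collections import deque
from urllib.parse import urljoin

from bs4 import BeautifulSoup


def crawl(start_url, fetch, max_pages=10):
    """Breadth-first crawl that records each visited page's <title>.

    `fetch` is a caller-supplied function mapping a URL to HTML text;
    `max_pages` is the stopping criterion for the loop.
    """
    seen = {start_url}
    queue = deque([start_url])
    titles = {}
    while queue and len(titles) < max_pages:
        url = queue.popleft()
        soup = BeautifulSoup(fetch(url), "html.parser")
        titles[url] = soup.title.get_text(strip=True) if soup.title else ""
        # Queue every link not seen before, resolved against the current URL.
        for a in soup.find_all("a", href=True):
            next_url = urljoin(url, a["href"])
            if next_url not in seen:
                seen.add(next_url)
                queue.append(next_url)
    return titles
```

In real use, fetch could wrap urllib.request.urlopen or requests.get. Everything Scrapy gives you for free (politeness delays, retries, robots.txt handling, concurrency) would still have to be added by hand, which is the practical difference between the library and the framework.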