Jul 26, 2017 · 1 min read
Thank you for this extensive comparison! We found Apache Nutch to be the best match for our use case. Documentation is the biggest pain point, but if you dive deep into the source code, it's fairly easy to get things running and get the most out of it, since most use cases are already handled by Nutch plugins, even though they are not documented. I wrote about our outcomes here: https://blog.smartive.ch/replace-google-gsa-with-a-custom-search-engine-and-crawler-813838691a2
