Story of a web crawler

Once upon a time a crawler was born. It discovered the web, and since then it could never get enough of it. It would crawl a billion pages a month, but it also became a collector, storing HTML pages and the URLs they came from. It learned manners when visiting websites and got trained in fighting off hackers and traps. Essentially, it came to stand for four qualities:
Scalability | Robustness | Efficiency | Extensibility

And it could be put to work on jobs like these:
1. Search engine indexing, e.g., Googlebot
2. Web mining, or detective work
3. Web monitoring to catch cheaters
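To ground the story, here is a minimal sketch of the crawl loop it describes: fetch a page, store its HTML keyed by URL, extract new links, and consult robots.txt before visiting a host (the crawler's "manners"). It uses only Python's standard library; the user agent name, page limit, courtesy delay, and link-extraction regex are illustrative assumptions, not details from the story.

```python
# A minimal crawl loop, standard library only. All names (agent string,
# limits, helpers) are illustrative; a real crawler adds distributed
# fetching, URL deduplication at scale, trap detection, and durable storage.
import re
import time
import urllib.request
from collections import deque
from urllib.parse import urljoin, urlparse
from urllib.robotparser import RobotFileParser

AGENT = "story-bot"          # hypothetical user agent name
_robots_cache = {}           # one robots.txt lookup per host

def allowed(url):
    """The crawler's 'manners': honor robots.txt before fetching."""
    host = "{0.scheme}://{0.netloc}".format(urlparse(url))
    if host not in _robots_cache:
        rp = RobotFileParser(host + "/robots.txt")
        try:
            rp.read()
        except OSError:
            rp = None        # robots.txt unreachable; treat as permissive here
        _robots_cache[host] = rp
    rp = _robots_cache[host]
    return rp is None or rp.can_fetch(AGENT, url)

def crawl(seed, max_pages=10):
    """BFS over discovered links; returns {url: html} (the 'collector')."""
    seen, frontier, store = {seed}, deque([seed]), {}
    while frontier and len(store) < max_pages:
        url = frontier.popleft()
        if not allowed(url):
            continue
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                html = resp.read().decode("utf-8", errors="replace")
        except OSError:
            continue         # robustness: a dead link must not kill the crawl
        store[url] = html
        for raw in re.findall(r'href="([^"#]+)"', html):
            link = urljoin(url, raw)
            if link.startswith("http") and link not in seen:
                seen.add(link)
                frontier.append(link)
        time.sleep(1)        # courtesy delay between requests
    return store

if __name__ == "__main__":
    pages = crawl("https://example.com")
    print(f"stored {len(pages)} pages")
```

Swapping the in-memory deque and dict for a message queue and a content store is roughly what turns this toy into the scalable, extensible system the story celebrates.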