How to Make a Web Crawler in Swift 🕷
More often than not, my App Suite is thirsty for information that can be found around the Web.
Unless it’s something that takes a couple of minutes, I like to automate these kinds of tasks as much as possible (it saves hours of work!).
Building A Web Crawler in Swift
Since I needed to extract some information from the Internet, this time I’ve decided to give Swift a try.
There are two main ways to write scripts in Swift:
- by following Hector Matos’s awesome guide here;
- by using John Sundell’s Marathon (the README.md is all you need).
It turns out that coding a Swift script is no different from coding a new Swift class, function, etc. I felt right at home the whole time.
Also, building a Web Crawler in Swift is incredibly easy.
I believe the script is pretty self-explanatory:
you input a starting web page, the maximum number of webpages to crawl, and the word you’re interested in.
In my case, I didn’t need these inputs to change, but if you do, Hector Matos explains how to write a script that takes arguments here.
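As a rough illustration of taking the three inputs from the command line, here is a minimal sketch based on the standard library's `CommandLine.arguments`. It is not necessarily the approach from Hector Matos's guide, and the `CrawlerOptions` type, the `parseOptions` function, and the argument order are all hypothetical names chosen for this example.

```swift
import Foundation

// Hypothetical container for the three inputs the script needs.
struct CrawlerOptions {
    var startURL: URL
    var maxPages: Int
    var word: String
}

/// Parses `[scriptName, startURL, maxPages, word]`.
/// Returns nil when an argument is missing or malformed.
func parseOptions(_ arguments: [String]) -> CrawlerOptions? {
    guard arguments.count == 4,
          let url = URL(string: arguments[1]), url.scheme != nil,
          let maxPages = Int(arguments[2]), maxPages > 0
    else { return nil }
    return CrawlerOptions(startURL: url, maxPages: maxPages, word: arguments[3])
}

// CommandLine.arguments always starts with the script's own name.
if let options = parseOptions(CommandLine.arguments) {
    print("Crawling \(options.startURL) (max \(options.maxPages) pages) for \"\(options.word)\"")
} else {
    print("Usage: crawler <start-url> <max-pages> <word>")
}
```

Keeping the parsing in a function that receives the argument array makes it easy to try out in a Playground, where there are no real command-line arguments.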
Once launched, the script will start crawling the web from the given page and jump around all the links it can find.
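To make the flow above concrete, here is a minimal sketch of a crawl loop. It is an illustration written for this explanation, not the original script: `extractLinks`, `crawl`, and the injected `fetch` closure are hypothetical names, links are found with a simple regular expression rather than a real HTML parser, and pages are visited breadth-first, which is an assumption about the order.

```swift
import Foundation

/// Extracts http(s) links from `href="..."` attributes in `html`.
/// A regular expression is a simplification; a real HTML parser is more robust.
func extractLinks(from html: String, relativeTo base: URL) -> [URL] {
    guard let regex = try? NSRegularExpression(pattern: "href=\"([^\"]+)\"",
                                               options: [.caseInsensitive]) else { return [] }
    let range = NSRange(html.startIndex..., in: html)
    return regex.matches(in: html, options: [], range: range).compactMap { match -> URL? in
        guard let hrefRange = Range(match.range(at: 1), in: html),
              let url = URL(string: String(html[hrefRange]), relativeTo: base)?.absoluteURL,
              let scheme = url.scheme, scheme == "http" || scheme == "https"
        else { return nil }
        return url
    }
}

/// Visits at most `maxPages` pages starting at `start`, following every new link,
/// and records the pages whose content contains `word` (case-insensitive).
/// `fetch` returns a page's HTML, or nil if the page could not be loaded.
func crawl(from start: URL, maxPages: Int, word: String,
           fetch: (URL) -> String?) -> (visited: [URL], matches: [URL]) {
    var queue = [start]
    var seen: Set<String> = [start.absoluteString] // avoids visiting a page twice
    var visited: [URL] = []
    var matches: [URL] = []

    while !queue.isEmpty && visited.count < maxPages {
        let page = queue.removeFirst()
        visited.append(page)
        guard let html = fetch(page) else { continue }
        if html.range(of: word, options: .caseInsensitive) != nil {
            matches.append(page)
        }
        for link in extractLinks(from: html, relativeTo: page)
        where seen.insert(link.absoluteString).inserted {
            queue.append(link)
        }
    }
    return (visited, matches)
}
```

Passing the page loader in as a closure keeps the sketch free of networking code: a real script would load each page over the network inside `fetch`, while a Playground can feed it a dictionary of fake pages.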
If you want a deeper explanation, please jump to the comments section 👇🏻 where I analyze each component of the script in detail.
In conclusion, I’d say that, depending on what you want to do and with the right tools, Swift is certainly a viable option for scripting!
A Note About Semaphores
If you must handle asynchronicity (like in my case, due to the loading of webpages), you need something that keeps your script alive until the asynchronous work has finished: otherwise the script reaches its last line and exits first.
Shameless plug: if you want to know more about Semaphores, make sure to read my previous article here.
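Here is a minimal sketch of the idea using `DispatchSemaphore`. The `loadPage` function is a hypothetical stand-in for an asynchronous page load (it only simulates a delay on a background queue), and `loadPageAndWait` is likewise a name invented for this example.

```swift
import Foundation

// Hypothetical stand-in for an asynchronous page load: it calls
// `completion` from a background queue after a short delay.
func loadPage(named name: String, completion: @escaping (String) -> Void) {
    DispatchQueue.global().asyncAfter(deadline: .now() + 0.1) {
        completion("<html>\(name)</html>")
    }
}

/// Blocks the calling thread until the asynchronous load has finished.
func loadPageAndWait(named name: String) -> String {
    let semaphore = DispatchSemaphore(value: 0)
    var result = ""

    loadPage(named: name) { html in
        result = html
        semaphore.signal() // the asynchronous work is done
    }

    // Without this call, a script would reach its end and exit
    // before the completion handler ever runs.
    semaphore.wait()
    return result
}

print(loadPageAndWait(named: "start"))
```

The semaphore starts at zero, so `wait()` blocks until the completion handler calls `signal()`. This only works because the handler runs on a different queue than the one that is waiting.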
Scripts and Xcode Playgrounds
If your script allows it, I’d suggest using Xcode Playgrounds during development.
This way, testing is much faster, because you don’t need to keep switching between your terminal and your code editor.
Once ready, you can quickly turn it into a real script.
For this post I’ve made a small repository, Selenops 🕷 (a spider that flies, yay! 😱), on GitHub.
In there you’ll find the very same script written as an Xcode Playground, as a Marathon Script, and as a standard “Command Line Tool” Script.
Federico is a Bangkok-based Software Engineer with a strong passion for Swift, Minimalism, Design, and iOS Development.