Using BigQuery SQL To Prettify Your Ugliest URLs And Extract Insights

Learn BigQuery SQL techniques to extract only the most relevant elements of your ugliest URLs.

Zach Quinn
Learning SQL

--

Making Sense of Messy URLs

Often discounted as mere metadata, URLs you scrape, ingest or create contain a lot of relevant information that is often ignored simply because they’re too “ugly.” It’s true that some URLs can be off-putting. But more often, URLs contain information that doesn’t just make for interesting analysis; it can actively drive traffic and, by extension, revenue for your clients or yourself.

Take, for instance, the utm paramter you’ve undoubtedly seen in paragraph-long URLs. For digital marketers, this is a useful tool because it tells us what source pushed you to our site, landing page or affiliate link. Though, if you’re learning SQL and not interested in digital advertising, I, admittedly, may have lost you by now.

If you don’t care about gathering precise data to drive traffic you should care about parsing URLs because URLs represent some of the more complex STRING type data you’ll encounter. The functions you’ll use to extract the “good stuff” from URLs are applicable to a lot of STRING operations in SQL.

Therefore, my intent isn’t to bore you with a bunch of functions exclusive to parsing URL information. Instead, I hope to share a few approaches that can be transferrable when working with longer…

--

--

Zach Quinn
Learning SQL

Journalist—>Sr. Data Engineer; new stories every Monday.