Tools to scrape leads for Enterprise and B2B Outbound Marketing 🕵️💰

This article explains the best sources and tools for scraping leads, based on our startup screening operation and our work with portfolio companies at Paua Ventures.

While only 16% of marketers say outbound practices provide the highest quality leads for sales (HubSpot, 2017), outbound lead generation is still one of the most promising tools in the marketing and sales mix of an early-stage Enterprise- or DeepTech-focused startup for a variety of reasons:

  • Quickly validate assumptions about different potential clients, target industries and buying personas
  • Start customer conversations immediately, while inbound marketing can take months to take off
  • The volume of leads depends directly on your own effort
  • And best of all, hundreds of leads can be generated at (next to) no cost.
From Conference Website to Lead List (Sample: Hannover Messe and DataMiner)

1. Identifying the source 🔍

Potential customer leads can be found in structured data anywhere on the web. To identify the right sources, a hypothesis-driven approach provides the best results. The source you scrape should already pre-qualify leads to the highest extent possible.

If a company is sending someone to an expensive conference, it is more likely to have a budget for a software solution. Companies that frequently attend free meetups, on the other hand, might have an urgent pain point to solve, which can speed up the sales cycle.

Sources should display data in a structured format (list or table) with as many data points or filtering options as possible. Simple WordPress-built websites are usually the easiest to scrape, while most large data providers have anti-scraping provisions in place. The more contact information a source provides, the better. However, it is not a problem to enrich data using free tools from nothing more than a list of company names.
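To illustrate why a table is such a convenient source, here is a minimal sketch that turns an HTML table into a list of lead records using only the Python standard library. The sample table, the column names and the helper names (`TableParser`, `table_to_leads`) are invented for illustration and do not come from any real trade fair website. Real pages are usually messier and may need a more forgiving parser.

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collects the cell text of every <tr> in an HTML document."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Collapse whitespace so "Acme \n Robotics" becomes "Acme Robotics".
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None
            self._cell = None


def table_to_leads(html):
    """Use the first row as column names and every other row as one lead."""
    parser = TableParser()
    parser.feed(html)
    if not parser.rows:
        return []
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


# Hypothetical exhibitor table, invented for illustration only.
SAMPLE_HTML = """
<table>
  <tr><th>Company</th><th>Country</th><th>Hall</th></tr>
  <tr><td>Acme Robotics</td><td>Germany</td><td>17</td></tr>
  <tr><td>Example Sensors</td><td>France</td><td>9</td></tr>
</table>
"""

leads = table_to_leads(SAMPLE_HTML)
for lead in leads:
    print(lead)
```

Each record keeps every column the source offers, which is what makes later filtering (by country, hall, industry and so on) possible.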

Sample sources:

___________________________________________________________________

Goal: Start a campaign targeting a specific industry

Source: Exhibitor or Attendee List of Trade Fairs

  • Attendee list of the 2018 Legal Recruiting Summit: Link
  • French Hardware Producers at the Embedded World: Link

___________________________________________________________________

Goal: Start a campaign targeting a specific customer geography

Source: Member lists of local Industry or Business Associations

  • Chemical Companies from the German state of Hesse: Link
  • Wineries in Napa Valley: Link

___________________________________________________________________

Goal: Start a campaign for customers that use a specific software

Source: Job advertisements requiring a certain skill set

  • Manchester-based companies requiring SAP HANA: Link
  • US-based manufacturers requiring AutoCAD: Link

___________________________________________________________________

Other useful sources include B2B marketplaces, company directories, LinkedIn groups, university alumni databases and financial databases.

2. Scraping Tools 🔨

Data providers implement a variety of anti-scraping measures to protect their data and deter bots and scrapers. Regardless of how the data is protected or structured, there are two basic approaches to extracting it:

1. Client-based scrapers:

Chrome plug-ins or local scripts that basically scrape “what is displayed”
✔️ No data-science expertise needed to get started
✔️ Most tools are free to use for B2B volumes
❌ Manual effort required in handling exports
🔝 Recommended for extracting fewer than 500 rows

2. Cloud-based scrapers:

SaaS solutions that automate scraping tasks to the highest extent possible
✔️ Ability to build complex scraping workflows, including detail pages
✔️ Exports can be automated and regular updates can be scheduled
❌ Results cannot be checked while the extractor is running
❌ Paid subscription required (limited trials available)
🔝 Recommended for extracting more than 500 rows and for any recurring tasks

Based on our work with portfolio companies and on the scraping we do ourselves to identify startups, these are the tools we recommend to get started. Please note that free enrichment tools are covered in this article.

Data Miner

Type: Client | Price: 500 pages free, USD 20+/month after | Website: Link
✔️ Crowdsourced “recipes” to scrape most common websites (e.g. Angel.co)
✔️ Batch features, easy export to cloud to create batches
✔️ Works well if cloud-based tools are blocked by the website owner
❌ Manual effort required, doesn’t work on very complex sites

Import.io

Type: Cloud | Price: 7 days free, USD 1,999+/year after | Website: Link
✔️ Works well for job boards like Indeed, Jobfluent
✔️ No technical skills required to get started
✔️ Automation/scheduling features, import into Google Sheets/JSON
❌ High price, URL-based usage limitations (not related to actual results)

Grepsr

Type: Hybrid | Price: 1000 pages free, USD 20+/month after | Website: Link
✔️ Combines UX of a Chrome plug-in with powerful cloud scraping
✔️ Handles high volumes very well
❌ Doesn't handle complex pages, results only visible after run
❌ Limited ability for troubleshooting

dexi.io

Type: Cloud | Price: Free trial, USD 119+/month after | Website: Link
✔️ Customizable to perform highly complex tasks
✔️ Allows interaction with the website (clicks, screenshots, ...)
❌ Not intuitive to use, data-science knowledge needed

Scraper

Type: Client | Price: Free and Open Source | Website: Link
✔️ Free for any volume
✔️ Easy “scrape similar” function to build simple lists in no time
❌ Troubleshooting requires knowledge of XPath syntax and jQuery

SelectorGadget can help you find the right jQuery selector.
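For readers who have not seen XPath before, the sketch below shows the basic idea: one path expression matches every element that shares the same structure, so a single expression is enough to pull a whole list. It uses Python's built-in `xml.etree.ElementTree`, which supports only a limited XPath subset and requires well-formed markup, so it is a stand-in for illustration and not what the Scraper extension runs internally. The snippet, class names and URLs are invented.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed snippet of an exhibitor list (invented for illustration).
SNIPPET = """
<ul id="exhibitors">
  <li class="exhibitor"><a href="/exhibitor/1">Acme Robotics</a><span>Hall 17</span></li>
  <li class="exhibitor"><a href="/exhibitor/2">Example Sensors</a><span>Hall 9</span></li>
  <li class="ad"><a href="/sponsor">Sponsored listing</a></li>
</ul>
"""

root = ET.fromstring(SNIPPET.strip())

# ".//li[@class='exhibitor']/a" reads: any <li> below the root whose class
# attribute equals "exhibitor", then its direct <a> child.
links = root.findall(".//li[@class='exhibitor']/a")

# One (name, link) pair per matched element; the sponsored entry is skipped.
rows = [(a.text, a.get("href")) for a in links]
for row in rows:
    print(row)
```

When a tool returns too many or too few rows, the fix is usually to tighten or loosen the attribute filter in the square brackets.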

Build your own:

Type: Client | Price: Free
✔️ Solve individual scraping challenges with custom code
❌ Requires Python knowledge and time to build for every use case

For building individual scrapers, Hackernoon has compiled this useful tutorial to get started. As they mention, there are several open-source repositories like Scrapy.
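To give a feel for the effort involved, here is a minimal sketch of a self-built scraper. It deliberately uses only the Python standard library instead of Scrapy, and it is not taken from the tutorial mentioned above. The CSS class `exhibitor-link`, the domain `fair.example`, the user agent string and all function names are assumptions made up for this example. The `fetch` function is defined but not called, so the sketch runs offline against a sample page.

```python
import csv
import io
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import Request, urlopen


def fetch(url):
    """Download a page as text. Not called below, so the sketch runs offline."""
    request = Request(url, headers={"User-Agent": "lead-list-sketch/0.1"})
    with urlopen(request, timeout=10) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


class LinkCollector(HTMLParser):
    """Collects (link text, absolute URL) for every <a> with a given CSS class."""

    def __init__(self, base_url, css_class):
        super().__init__()
        self.base_url = base_url
        self.css_class = css_class
        self.leads = []
        self._href = None   # URL of the link currently being read
        self._text = []     # text fragments of that link

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "a" and self.css_class in classes and attrs.get("href"):
            self._href = urljoin(self.base_url, attrs["href"])
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            name = " ".join("".join(self._text).split())
            self.leads.append((name, self._href))
            self._href = None


def extract_leads(html, base_url, css_class="exhibitor-link"):
    parser = LinkCollector(base_url, css_class)
    parser.feed(html)
    return parser.leads


def to_csv(leads):
    """Serialize the leads as CSV text, ready to import into a spreadsheet."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["company", "detail_url"])
    writer.writerows(leads)
    return buffer.getvalue()


# Hypothetical list page, invented for illustration only.
SAMPLE_PAGE = """
<div><a class="exhibitor-link" href="/exhibitor/1">Acme Robotics</a></div>
<div><a class="exhibitor-link" href="/exhibitor/2">Example Sensors</a></div>
<div><a class="nav" href="/imprint">Imprint</a></div>
"""

leads = extract_leads(SAMPLE_PAGE, "https://fair.example")
csv_text = to_csv(leads)
print(csv_text)
```

For a real source you would pass `fetch(url)` to `extract_leads` and adapt the CSS class to the page. That per-site adaptation is the time cost mentioned above.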

Important Notice: Pricing is updated as of April 2018. Please check your local privacy laws as well as terms & conditions of the data owner before scraping data and handling personal information. Also consider local e-mail advertisement regulations before sending out commercial messages. Some useful resources to stay compliant:

  • https://en.wikipedia.org/wiki/Email_spam_legislation_by_country (Global)
  • https://www.ftc.gov/tips-advice/business-center/guidance/can-spam-act-compliance-guide-business (US)
  • https://certified-senders.org/ (DACH)
  • https://www.gov.uk/marketing-advertising-law/direct-marketing (UK)

If you are a B2B software company looking for venture funding at the Seed or Series A stage, please get in touch. Looking forward to your feedback and to hearing about your favorite scraping tools!

INSIDE THE PAUA

Ad-lib Thoughts from the Paua Family

Written by Björn Müller

Entrepreneur + ex VC, currently @Lufthansa Innovation Hub
