How to bulk convert Salesforce leads using email/web domains to match existing accounts

Hank Holiday
15 min readApr 2, 2020

--

Keeping your data nice and clean helps out the whole team.

There’s a very common Salesforce data management process that most folks come across sooner or later. You have a bunch of leads (hundreds, thousands, tens of thousands) and you want to convert them into accounts and contacts with a few key limitations:

  • If there is already a contact with the same email you want to merge the converted lead with it. This is important so that you end up with a merged activity and campaign membership history.
  • If there isn’t a contact but there is an account for this company then you want the newly created contact to belong to the existing account.
  • If there isn’t an existing contact or account then you want to create new ones.

Salesforce has built-in support for converting and merging leads, but the built-in tool forces you to point-and-click your way through a manual interface one lead at a time. If you’ve got thousands of leads to get through you need something more scalable.

First I should address the obvious solution: You could just buy a third-party solution. There are a ton of tools and AppExchange packages out there which will help you convert leads and match them with existing accounts and contacts. I’ve used many of them and some are pretty good. But they all come with their own problems: additional expense, complex UIs, more failure points in your data pipeline, granting third-party access to your data, etc. Plus in come cases they just feel like overkill. Do you really need to install another AppExchange package to do something that can be solved using Salesforce’s built-in features?

So to summarize, if what you’re looking for is an out-of-the-box way to clean up your leads, contacts & accounts, this article isn’t that. If that’s what you’re going for, just search the AppExchange for “bulk lead conversion”.

The process below is my own method for cleaning up a big database of leads, contacts & accounts using only free resources and built-in Salesforce functionality. At a high level the process involves establishing a set of website and company name key fields to use to match leads to accounts, then periodically running an apex script to match and convert your leads. Along the way I will share some specific techniques and tools for solving common problems (like getting websites for large account lists and converting company names into easily matchable keys).

Even if you find this process to be a poor fit for your own environment, understanding what an end-to-end manual process looks like can help you improve your own processes and better prioritize potential automation projects.

Prerequisites

  • Your Salesforce user account will need to have permission to customize Salesforce as well as permission to use both the data loader and the developer console.
  • You’ll need to download and install the Apex Dataloader.
  • You’ll need a tool to edit spreadsheets / produce CSV files. (Excel, Google Sheets, Numbers, etc)

The Strategy

In order to match leads with accounts you will want some sort of unique company identifier to help you do the matching. You can use the company name, of course. But names are finicky since they can be typed in so many different ways (“Acme Inc”, “Acme, Inc.”, “Acme”, etc). You can use external identifiers like DUNS codes, but those are not available for all accounts and they require lots of work to populate on a consistent basis.

My favorite unique company identifier is web domain. The domain is the part of the base website URL that leaves off any extraneous prefixes like “www”. Examples include “adobe.com”, “acme.co.uk” and“legislature.colorado.gov”. It’s not perfect, but it’s easy to verify and populate in bulk. In most of the Salesforce environments that I setup I try to use the web domain as the primary way to de-dupe account records and match leads with accounts.

If that strategy doesn’t seem like it will work in your use case then you can keep reading, but keep in mind that you’ll need to come up with some sort of unique company identifier otherwise your work is going to get really difficult really quickly. Matching leads to accounts by company name is simply not practical for large databases.

The Setup

Actually running the lead conversion process is actually quite simple and is outlined in the next section. But most of the work to be done is in establishing a reliable set of fields which can be used to uniquely identify accounts and match them with leads. This section will walk you through the process of creating and populating those fields.

Side note: The fields you’ll build out in this section are useful for more than just lead conversion. Once created, they can help you do tons of different things. You can use them to augment your data from third-party providers, synchronize your Salesforce information with other business systems, automatically grant customers community users access to cases based on their email address and more.

Part 1: Create Custom Fields to Assist with Matching

First create two custom formula fields called “Email Domain”: one on the Contact object and one on the Lead object. They will be text formulas with the formula value below. The formula extracts the part of the email to the right of the at sign and then lowercases it.

lower(trim(right(Email, len(Email) - find("@", Email))))

Next create create a custom formula field called “Company Name Key” on the Lead object using the formula value below. It’s basically just removing special characters and common suffixes like “inc” and “llc”. Salesforce formula fields don’t support regex so this ends up being pretty ugly. But hey it works! You can, of course, add additional substitutions as needed for your use case.

Formatting Note: I recommend using indentation in Salesforce formulas to help keep them more readable (though it doesn’t help that much for formulas as deeply nested as this one). I tend to copy and paste my formulas into a code text editor like VS Code or Sublime when I need to make changes. I recommend setting the tabs to two spaces and the syntax to Javascript.

substitute(
substitute(
substitute(
substitute(
substitute(
substitute(
substitute(
substitute(
substitute(
substitute(
substitute(
substitute(
substitute(
substitute(
lower(Company), " ", ""
), ",", ""
), ".", ""
), "-", ""
), "'", ""
), "(", ""
), ")", ""
), "\\", ""
), "&", ""
), "/", ""
), "incorporated", ""
), "inc", ""
), "llc", ""
), "ltd", ""
)

Now create a similar custom formula field on the Account object called “Account Name Key” using the formula value below. This should be exactly the same formula as the one you used on the Lead object except it uses the “Name” field instead of the “Company” field.

substitute(
substitute(
substitute(
substitute(
substitute(
substitute(
substitute(
substitute(
substitute(
substitute(
substitute(
substitute(
substitute(
substitute(
lower(Name), " ", ""
), ",", ""
), ".", ""
), "-", ""
), "'", ""
), "(", ""
), ")", ""
), "\\", ""
), "&", ""
), "/", ""
), "incorporated", ""
), "inc", ""
), "llc", ""
), "ltd", ""
)

You can validate that the above formulas work by creating some reports or by adding them to your page layouts.

Finally create a single-line text field (255 characters) on the Account object called “Web Domain”. There’s no way to make this into a formula that works 100% of the time, so you’ve got a couple of options, outlined in Part 3 of this section.

Part 2: Add Websites to Your Accounts

You want to be sure that as many of your accounts have websites as possible. As mentioned earlier, company names often make inexact match keys. So the more accounts that have websites, the more accurate your account matching will be. If a significant number of your Salesforce account records are missing websites, you’re going to want to fix that before proceeding.

Here some strategies for back-populating Account websites:

  • Coworker Crowdsourcing: The simplest strategy is to dump all of your accounts to a Google spreadsheet (run an accounts report, filter for only those with blank websites, be sure to include the account id, export the report as a CSV, import it into Google and convert it to a Google sheet). Add a column to the Google sheet for website and then invite your team to collaboratively populate out the website column. (One person takes rows 1 to 100, another takes 101 to 200, etc.) When you’re done use the data loader to update and you’re done! This approach is stupidly simple but it certainly works.
  • Contact Emails: If your accounts have child contacts with email addresses then you can use the contact email domains as a data source. Just run an accounts and contacts report, include the account id and the new Email Domain formula you created. Export the report to CSV, import it into your spreadsheet tool, remove duplicate rows based on the account id and then use the data loader to update the accounts, setting the “Website” field on the account to equal the “Email Domain” field from the contact. (You don’t need to add the “http://” or anything, it’ll all work out fine.)
  • Company Name to Website Vendors: There are tons of different websites out there that will help you convert company names to URLs. Some of them are great, some are terrible, some are cheap some are expensive. Generally speaking the tools with easy-to-use UIs are the more expensive ones. You can save a lot of money by using an API-based tool and then using a script to submit the requests and process the results. Even putting together a comprehensive list of the vendors in this space would be pretty tough, so instead the next two bullet points cover a couple of my favorite vendor-based tools to use.
  • Clearbit: Clearbit is an awesome data company. They’ve got lots of paid tools, but they also have some free tools. Of particular relevance is their free Company Name to Domain API. If you sign up for a free API key you can use their API to convert names to web domains. Since they are primarily an API-focused company they don’t have an actual UI, but I’ve built out a simple UI which you can use here: Clearbit Console. This is a great option, but coverage can be extremely spotty depending on what sort of companies you have in your database.
  • Google Search Results APIs (SERP): There are tons of tools out there called “SERP APIs” (Search Engine Result Page APIs). These are essentially Google scraping platforms which allow you to programmatically search Google and then get the results as a well-organized JSON object (Google doesn’t offer this type of service themselves because they don’t want bots to have unfettered access to their search results). This is hands down the best option in terms of coverage. In my experience you get 99%+ coverage even for fairly small organizations. Plus you can get lots of other information from Google’s Knowledge Graph (like address, phone number, etc). Google basically knows everything and you’re just piggy backing on top of them with this strategy. Most of the platforms allow 100 or so API requests on their free tier. Paid tiers start at $20 or $30 per month. I’ve built out a simple UI which you can use to search one of these APIs called “serpstack”: Top Google Result Searcher. You’ll need to setup your own API key to use the tool.

Part 3: Convert Account Websites to Web Domains

Now most of your accounts should have values in their “Website” fields, but they are still missing “Web Domain” field values. What we need do is convert the fully-qualified website strings (like “https://www.acme.com/en/index.html”) into simplified web domain strings (like “acme.com”).

You have a couple of options for turning website URLs into web domains:

  • Option A: Do it manually. Export your account websites to a spreadsheet and then copy and paste them into a domain extraction tool and use the data loader to save them back to Salesforce. This strategy is simple and fast but the downside is that you’ll have to keep doing it every time you want to do a big lead conversion. I’ve created a simple Domain Extractor Tool which you can use. There are a number of other tools out there too if you Google “url to domain extractor”.
  • Option B: Create an Apex trigger. If you’re going to rely on the “Web Domain” field going forward, the best way to keep it up to date is to create an Apex trigger which automatically populates the account web domain field whenever a new account is inserted or when the website changes on an existing account. The exact code to make that work is something I’ll leave out of this article, but I’ve created a Github Gist with some well-tested Apex regex that can do the extraction itself: ExtractingWebDomainsInSalesforceApex.cls.

The Conversion Process

Now that we’ve created some matching fields, we have everything in place to build out a fairly solid lead-to-contact-and-account matching process that looks something like this:

  1. Try to match Lead Email to existing Contact Email
  2. Try to match Lead Email Domain to existing Account Web Domain
  3. Try to match Lead Company Name Key to existing Account Name Key
  4. If no match, create a new Account and Contact

So how do we actually execute this process? Well this is where you could choose to use an out-of-the-box bulk lead conversion solution. And if you do choose this path, all of the work we’ve done so far is still valuable because it will make it easier to get more consistent matches between your leads and your accounts.

But what I describe in this section is an alternate option involving a single Apex script. It’s simple, it’s fast and it’s free. But it involves code and therefore there is an element of danger involved.

So before I get to the what, let me first explain the why.

Wait, why aren’t we using an out-of-the-box solution?

This would all be pretty simple if we could just use the data loader.

If we could convert leads with the data loader we would just dump everything into a spreadsheet, match leads to contacts and accounts with VLOOKUPs and then be off to the races! Unfortunately we cannot use the Apex Dataloader to convert leads. You can supposedly ask Salesforce to enable a special feature which makes this possible, but that comes with its own problems (like field mappings not coming over during conversion).

As I mentioned before there are many apps and AppExhange packages which will bulk convert your leads and can even be configured to do so on a scheduled basis. There are tons of options out there and new tools are coming out every year. So what’s the problem?

  1. Some of these tools can be fairly expensive. And the cheap / free ones tend to be… less than awesome.
  2. Most of the tools out there are really complicated to setup and use. They tend to have UIs that look like they were designed in the 90s. Lots of clicking and tabbing through fields and multi-part wizard navigation.
    Why does the UX matter? Bad tool UX leads to bad data and embarrassing mistakes. At one of my clients they routinely had private customer data being irreversibly merged with Sales data for an entire year due to a poorly configured AppExchange data management package (which was wholly due to the tool being annoyingly difficult to configure). At another client they ended up sending the wrong marketing emails to thousands of people due to a single typo in a boolean condition two pages down on the third wizard screen out of five. They had to send apology emails to thousands of customers. And it happened three times before the root cause was finally resolved.
  3. Many of these data management tools involve granting a third party access to your data. Not only does that open up potential compliance risks but in some cases (like with HIPAA-restricted data) this makes the procurement process WAY more complicated.
  4. The last reason is admittedly subjective. It doesn’t feel like bulk converting leads should require extra tools or extra money. Salesforce is already an expensive platform and it has a built-in enterprise-grade development language called Apex. And the thing is, converting a lead with Apex is easy. Here’s what it looks like:
Database.LeadConvert lc = new Database.LeadConvert();lc.setLeadId(lead.id);
lc.setConvertedStatus('Qualified');
lc.setAccountId(account.Id); // optional
lc.setContactId(contact.Id); // optional
Database.LeadConvertResult lcResult = Database.convertLead(lc);

Of course, these data management platforms are doing much more than just converting leads. They provide things like automation, reliability, usability, logging, enterprise integrations, support, etc. But still, if your underlying goal is essentially just the conversion of lead records, then it is at least worth understanding alternate solutions which rely on a simpler toolbox.

Running the Lead Conversion Script

Ok, we’ve got our leads, accounts and contacts. The matching fields are created and we’re ready to convert some leads. We’ve decided that we’re going to do this thing using only built-in Salesforce functionality. So how does this work?

  1. Build a lead query. First you need to build a SOQL query which defines exactly which leads you want to convert. This is the most important part. If you mess this part up then you might end up converting a lot of leads that you don’t want to convert. I recommend using a tool like SoqlX to build out a query which returns only the leads which you want to convert.
  2. Customize the script below. Once you’ve crafted your query, copy the code below into a notepad or code editor of some sort (like VS Code or Sublime). Follow the instructions in the “SETTINGS” section at the top to paste in your query and specify all the other options. It’s probably a good idea to save a copy of your modified script somewhere for future reference.
  3. Run the script for a small test batch. Start off with a small “batchSize” value of 1 or 2 in the SETTINGS section. Open the Salesforce developer console (Lightning: Gear Icon > Developer Console, Classic: Click your name > Developer Console). Select the Debug menu and then “Open Execute Anonymous Window”. Paste the script code into the window. Check the “Open Log” checkbox and then click the “Execute” button.
  4. Verify the results. After the script runs, open the debug log and check the “Debug Only” checkbox in the gray bar along the bottom of the log panel. Check the query count message towards the top to make sure the total number of leads in the query matches your expectations. Scroll to the bottom and you should see the conversion results. You can copy a lead id or email and then use that to query for the corresponding records in your query tool of choice or just search for it in the Salesforce UI. Confirm that everything looks right. Rerun the script with another few records and follow the same process to manually check that they look right.
  5. Up the batch size and re-run the script. Once you feel confident in the results, set the batch size to 50 and continue re-running the script until you’ve converted all of your leads.
  6. That’s it! You can re-run this process on a regular basis. But you’ll need to be sure to keep updating the Web Domain fields on the accounts before each run in order to keep your match accuracy up.

The Lead Conversion Script Code

The Apex script is embedded below and also accessible at the Github link here: BulkLeadConversionScript_SalesforceApex.cls.

Some things to be aware of:

  • Running apex code in a dev console in production has risks. There is no “undo” button. If you somehow snuck a delete statement into the code below you could potentially create a lot of pain quite quickly. But that said, I’ve used this script in production for actual workloads and it’s pretty reliable. If you stick to modifying the “SETTINGS” section up top then nothing bad will happen.
  • You will need to customize the SETTINGS section before running the script. This script won’t magically work in your environment without a few tweaks. You will need to build out a SOQL query to correctly identify your leads and you will need to specify the fields to use for matching leads with accounts. And there are some other settings too. Read through all of the comments up top and in the SETTINGS section and make sure you understand them before running the script.

Additional Notes on Account Web Domains

That’s it! Obviously you may not end up using the above process in its entirety. But if you do end up deciding to use the “Web Domain” as a key part of your data management strategy going forward, here are a few tips which might prove useful.

  • What about companies without websites? You can certainly just leave the web domain blank for those and fallback on other matching strategies. But in some cases you may end up deciding to make the “Website” field required to ensure that you never have accounts without an id. In that case one idea is to just make up a fake web domain based on the company name. So “John’s Flower Shop” might become “johns-flower-shop.nosite”. You can even create a workflow rule or an apex trigger to automatically populate the website field when new accounts/leads are created without one. You can follow a logic similar to the logic in the Account Name Key formula, except maybe replacing spaces with dashes to increase readability.
  • What about companies with multiple websites? There are two ways to deal with that: One is by allowing your web domain field to contain a comma-delimited list of domains. Then you’ll need to update the script to match leads to accounts based on partial matches instead of full matches. The other way to deal with this is to create child accounts underneath the parent account for each different website. Often times this is actually a more accurate representation of the organizational hierarchy anyway.
  • What else can I do with web domains? This article focused on matching leads with accounts, but the “Web Domain” field is also helpful for finding and merging duplicate accounts. If you do a good job of managing your database, duplicate accounts shouldn’t crop up very often and using Salesforce’s built-in account merging functionality is probably sufficient. Another note to the wise: Be very careful about setting up any sort of tools which automatically merge duplicate accounts. Once merged, there is no “undo” button. Depending on how things are setup, a mistyped website can end up resulting in permanently messed up activity histories and other irreversible mutations.

--

--

Hank Holiday

Meta-founder of Quorum1. Serial entrepreneur, engineer, designer & Salesforce architect. Interested in learning, governance, economics & complex systems.