Data Loading I: As Sexy As It Sounds

Jenn R Kant
4 min readAug 13, 2018

--

I recently achieved a Data Architecture and Management Designer certification from Salesforce, (stay that three times fast). There are a number of tidbits I picked up in my studying and wanted to share. This is post 1 of 3. Data Loading I: As Sexy As It Sounds; Data Loading II: Choose Your Weapon; Data Loading III: The Bulk API, It’s For Admins Too!

As a Salesforce solution architect, implementation consultant, or the million other names my peers and I go by, I am no stranger to data loading. Historic data migrations from sunsetting databases or brand new data, scoured from emails and texts, I’ve seen it all. Nearly every implementation has some sort of data that needs to be inserted or updated or both (upserted), and how you approach it can be the difference between a 3am data load or reasonable bedtime. There’s a lot of room to make mistakes, and I’ve made most of them.

xkcd, “Machine Learning”

Clean Your Data

As a consultant, I try and be as clear with my client about this item as early as possible: “If poop goes in, then poop comes out.” There are a lot of tools that can make poop less …poopy, but it’s waste water reclamation, and that’s a ton of extra steps and a lot of extra money.

  • Consulting Coach #1: I have conversations about data migration right alongside discovery sessions about business processes. It needs to be at the front of the project. If you wait until later to talk about it, people get sad and grumpy. I also have the conversation about who is going to maintain all that data after the project is over, but that’s for another post.

A client’s budget and their threshold for busywork dictate where they land on the spectrum of a smart data cleaning solution vs. elbow grease.

  • Consulting Coach #2- Unfortunately, if your client chooses the elbow grease route, the best bet is for the client to do that themselves. They know the data more intimately than you can write logic rules for. Most consultants build this into their SOWs. I will never forget hours of 17,000+ manually cleansed addresses that I will never get back because of a poorly written SOW and not knowing how to pivot. (Look at the baby consultant playing with Excel! Aw!)
  • Consulting Coach #3- If your client is cleaning their own data, look at your project timeline and add at least 2 weeks. Even with a utilization rate of 80%, no one can clean data for 6 hours straight without making mistakes.
  • Consulting Coach #4- Throw out the above guidelines, buy an enterprise level tool (e.g. DemandTools) for your consulting practice and give the usage away to your clients. Reducing the headache of data cleaning can be the difference in a one-time or returning client.

There are some old-school database science buzzwords and acronyms I had to learn for the DAMD Certification. “ETL” was one of them. ETL stands for Extract, Transform, Load. It refers to the process of database transformation, particularly in storing or moving data.

  • Extract- Download data from your old CRM in the format of a csv.
  • Transform- Open and manipulate data in Excel to clean or match with other records. Prepare any lookup or master detail columns using the VLOOKUP tool; make sure your picklist values are compliant (case-sensitive!).
  • Load- Push your shiny, clean records into Salesforce using a tool like the Import Wizard or Data Loader.

There is another acronym, made possible by these new-fangled technologies like Salesforce, “ELT”. That’s right: Extract, Load, THEN Transform.

  • Extract- Download data from your old CRM in the format of a csv.
  • Load- Push the unchanged data into Salesforce.
  • Transform- Prior-configured automation (process builders, workflow rules, and triggers) cleans the data. Validation rules don’t help you here as they don’t actually do any transforming. They would have clobbered you over the head in your Load step if you forgot to disable them.

Then there’s a third take on this that is kind of a combination of both. This is something like DemandTools. You extract your data, load it into a third-party “cleaning” tool where you have configured with your fuzzy and exact logic. Once it’s clean it gets pushed into the system you indicated at the start of the load process, like, hmm, say… Salesforce. These are called “ELTLPWYFT Tools.” Extract, Load, Transform, Load, Party-With-Your-Free-Time-Tools. Just kidding, I made that up. I don’t have free time.

I suspect that people who actually went to school for this can probably speak more elegantly about it. What I needed to know for DAMD, though, about what Wikipedia tells me is a 1970’s CS concept, is that “ETL Tools” is a thing, and that thing looks a lot like DemandTools, but not Data Loader.

Those kind of tools are very powerful, but they are also fairly costly and not accessible for a lot of Salesforce clients. For the sake of the rest of the post, we’ll assume you are broke and can’t afford the Cadillac, but remember, on the exam, unless told otherwise, answer questions as if you had the enterprise level, money-is-no-object budget.

(I mention DemandTools a lot. That is simply because it is the most often recommended. I don’t really have a horse in this race. None of my words here are sponsored. I feel obligated to state, however, that my company does partner with DemandTools’ parent company.)

Read more in the next post: “Data Loading II: Choose Your Weapon.”

--

--

Jenn R Kant

Salesforce Platform Enthusiast, Consultant, Homebrewer & Artist. Seeking jokes, drinks, & clients. The only opinions I have are my own. (She/Her)