Data Loading II: Choose Your Weapon

Jenn R Kant
6 min readAug 13, 2018

--

I recently achieved a Data Architecture and Management Designer certification from Salesforce, (stay that three times fast). There are a number of tidbits I picked up in my studying and wanted to share. This is post 2 of 3. Data Loading I: As Sexy As It Sounds; Data Loading II: Choose Your Weapon; Data Loading III: The Bulk API, It’s For Admins Too!

Chose Your Weapon

We’ll say we have clean data, but a lot of it, and it doesn’t live in Salesforce.

At a high level, there are 3 categories of getting data into Salesforce: manual entry, a data loading tool, or using the APIs.

I’m not manually entering 10,000 records. My personal threshold for manually entering records is very low. If there are more than 10 fields, I’ll fire up a tool for a single record. Some of this is dictated by personal preference, but I’m all about working in the most efficient way possible.

As a “frequent flier” when loading data, I skip the Import Wizard altogether. It’s overly “clicky” to me, and I can’t do any of the extra party tricks clients pay me for. If I really think about it, I don’t know anyone who actually uses that thing. There’s a line somewhere between a sales person that enters their own leads and a sales person who has an assistant do it. Frankly, I’d rather have an assistant enter the data.

Import Wizard aside, there are a large variety of other tools on the market. Some are more extravagant than others, but more often than not, I find myself reaching for Ol’ Reliable. That’s right the good, old fashioned, totally free, Salesforce Data Loader. There was a time when I relied on Jitterbit, but something broke one time in a release, and I never got around to fixing it.

  • Consulting Coach #4- Your preference of data loader is like your preference of OS. Mac vs PC, Apple vs Android, Data Loader vs Jitterbit. They can all to an okay job, but shine if you have a really specific application they are built for. Choose your team wisely. (#teammystic #teamapple #teamjacob #teamwubanub)

So I’ve got my data, and I’m loading it into Salesforce, and I notice it’s going super slow, and is resulting in tons of lock errors errors. The message looks something like “Unable to obtain exclusive access to this record.” Here’s the tips I’ve picked up to get moving faster and with less errors.

What is causing the locks? A record is being updated more than once. That could be caused by one of a few things.

First, the record could be on your csv twice. For example, if you have a list of contacts, and were updating their addresses and had Tony Stark on there twice you could get an error depending on the placement of the items. More often than not, however, the lock is caused by multiple child records. If Tony, and Steve, and Bruce are all part of the same account, they could cause locks.

Let’s take a step back. The way data loader processes records is by taking a batch at a time, queuing them up, and performing whatever action (insert, update, etc) on that set of records. It then moves on to the next set. The default batch size is 200 records. An interesting thing to note is this is also the MAXIMUM batch size and the documentation on Data Loader actually says, “We recommend a value between 50 and 100.” Not sure why they wouldn’t set the default to the recommendation, but anyway, you can update that in the Settings if you’re in troubleshooting-mode.

Each batch consumes one API call. This is ones of many actions that count toward your global API limits. The number of batches/API calls are calculated by the total number of records divided by the batch size, then rounded up to the next whole number.

When we update records, we are not just updating that single row of data, but also, any parent records. Updating a contact, for example, also “updates” the master records, the account. This is true of the child side of any Master-Detail or Lookup relationship. If the child record is updated it “locks” the parent record as well. The lock stays for the duration of the processing of that record change. Any workflow rules, process builder, or triggers execute. If those automations cause other records to change, more automation might kick off, each adding processing time. Sharing rules are recalculated. If you have a lot of sharing rules on the child or parent object this can add significant processing time as well.

A whole lot happens when you create (insert) or edit (update) a record. You’re probably familiar with it if you took the Salesforce Admin exam, but it is worth reviewing more detailed documentation of Salesforce Order of Execution. This is like in elementary math class where you learned that: 4 + 2 * (3 + 2) equals 14, not 20, because parenthesis execute go first, and then multiplication and division before addition or subtraction. Validation rules fire before workflow rules so if the validation doesn’t pass, the system doesn’t have to roll back the changes from the workflow rule. The more automated your Salesforce org, the more likely you are to run into record locks related to Order of Execution.

Fortunately, we are not powerless to help with this issue.

If we go back to our totally randomly selected set of sample data of Tony, Steve, and Bruce (#teammarvel) we can see that Tony, Steve, and Bruce all belong to the same account. We can also see that if we have used the default batch size for data loader, 200, those contacts are in two separate batches.

The thing is, Salesforce has a special skill at handling this issue if the child records that are updating the same parent are in the same batch. Split across two batches we have greatly increased the likelihood of a lock, but if instead we order our data by the parent id, we are able to address the issue by using the tool’s own build in ability to manage it.

Granted this data is generous for the example. Your sibling records might still reach across multiple batches and cause locks, or you might have more than 200 children of a single parent being updated. However, much of the time, this can at least address the bulk of locked records.

Special considerations should be made when you are data loading objects with multiple parent objects. On task, for example, there are both the “Name” (whoid) and “Related To” (whatid) lookups. Which one do you sort by? That depends on your data contents and your architectural model.

There is at least one more way to cause a lock worth nothing. A favorite scapegoat of young admins everywhere: if you try and update a child record at the same time that another user is updating that record, its parent, or even its sibling, you could cause a lock. It is a much less likely scenario, but is possible. If you do find that your users run into record locks in the UI, this is probably indicative of an underlying architecture issue. Any one parent record should not have more than 10,000 child records. As you get closer to that number, you start to see performance degradation. For example, a common solution of grouping “orphan contacts” under one single bucket account could cause issues if that is the final home of those contacts. Another common example of this is in B2B support organizations. If AW Computing uses Salesforce to support their client, Universal Containers, and multiple case are being worked for Universal Containers at the same time, edits to those cases could encounter a lock derived from the Account relationship.

--

--

Jenn R Kant

Salesforce Platform Enthusiast, Consultant, Homebrewer & Artist. Seeking jokes, drinks, & clients. The only opinions I have are my own. (She/Her)