Some Lessons Learned on Core Data

Michael Gachet
BPXL Craft
Oct 28, 2015


I’ve been using Core Data since iOS 3.1. I was working on my first application: a weather app providing past weather data as well as long-range forecasting for the United States. I did not really need to use Core Data, but I wanted to gain some experience with that framework. Since then I’ve used Core Data in about 90 percent of the projects I’ve worked on.

Over the years I made many mistakes with Core Data, most of which were related to using multiple contexts and concurrency. Before diving into the things I’ve learned along the way, let’s start with a short primer on Core Data.

What is Core Data?

You read all sorts of things on the Internet about Core Data, mostly depending on the past experience of whoever writes about it.

If I were to describe what Core Data is to me, I would say it’s an Apple framework which allows you to:

  1. Define entities with properties (called attributes) and relationships with other entities (one-to-one, one-to-many, or many-to-many). This is what you do when you create your Core Data model in Xcode.
  2. Create, delete, retrieve, and, more generally, manage instances of those entities (called managed objects) at runtime.
  3. Persist those managed objects to memory or disk.
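The three points above can be sketched in a few lines of Swift. This is only an illustration under stated assumptions: the City entity, its name attribute, and the helper functions are hypothetical, the model is built in code rather than in Xcode's model editor, and the store is an in-memory one. It uses current Swift API names.

```swift
import CoreData

// 1. Define an entity with one attribute (hypothetical "City" with a "name" string).
func makeModel() -> NSManagedObjectModel {
    let name = NSAttributeDescription()
    name.name = "name"
    name.attributeType = .stringAttributeType
    name.isOptional = true

    let city = NSEntityDescription()
    city.name = "City"
    city.properties = [name]

    let model = NSManagedObjectModel()
    model.entities = [city]
    return model
}

// 3. Persist to memory: attach an in-memory store, then hand out a context.
func makeContext() throws -> NSManagedObjectContext {
    let coordinator = NSPersistentStoreCoordinator(managedObjectModel: makeModel())
    _ = try coordinator.addPersistentStore(ofType: NSInMemoryStoreType,
                                           configurationName: nil, at: nil, options: nil)
    let context = NSManagedObjectContext(concurrencyType: .privateQueueConcurrencyType)
    context.persistentStoreCoordinator = coordinator
    return context
}

// 2. Create, save, and retrieve managed objects at runtime.
func saveAndFetchCityNames(_ names: [String]) throws -> [String] {
    let context = try makeContext()
    var result: [String] = []
    var failure: Error?
    context.performAndWait {
        do {
            for cityName in names {
                let city = NSEntityDescription.insertNewObject(forEntityName: "City", into: context)
                city.setValue(cityName, forKey: "name")
            }
            try context.save()
            let request = NSFetchRequest<NSManagedObject>(entityName: "City")
            request.sortDescriptors = [NSSortDescriptor(key: "name", ascending: true)]
            result = try context.fetch(request).compactMap { $0.value(forKey: "name") as? String }
        } catch {
            failure = error
        }
    }
    if let failure = failure { throw failure }
    return result
}
```

Calling saveAndFetchCityNames(["Seattle", "Boston"]) should return the two names sorted alphabetically.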

The Managed Objects …

At runtime the static model entities are used to create instances of NSManagedObject subclasses which are commonly referred to as managed objects. Those managed objects do not exist in a vacuum; they have to exist within a context that is an instance of the NSManagedObjectContext class.

… Live in a Context …

The context in which managed objects live can be regarded as a scratch pad in which objects are loaded, modified, created, deleted, and saved.

When several objects exist within a context they define an object graph in that context. The context is then responsible for managing all objects in this object graph:

  • Ensuring the relationships are properly updated whenever an object is added to or deleted from the graph.
  • Fetching objects “from storage.”
  • Saving objects “to disk.”

At runtime, any managed object instance lives within a single context.
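To make the scratch pad idea concrete, here is a small sketch, assuming a hypothetical City entity and an in-memory store. It inserts an object, checks which context owns it, and then discards the pending change with rollback().

```swift
import CoreData

// Hypothetical in-memory stack with a single "City" entity that has a "name" string.
func makeContext() throws -> NSManagedObjectContext {
    let name = NSAttributeDescription()
    name.name = "name"
    name.attributeType = .stringAttributeType
    name.isOptional = true

    let city = NSEntityDescription()
    city.name = "City"
    city.properties = [name]

    let model = NSManagedObjectModel()
    model.entities = [city]

    let coordinator = NSPersistentStoreCoordinator(managedObjectModel: model)
    _ = try coordinator.addPersistentStore(ofType: NSInMemoryStoreType,
                                           configurationName: nil, at: nil, options: nil)
    let context = NSManagedObjectContext(concurrencyType: .privateQueueConcurrencyType)
    context.persistentStoreCoordinator = coordinator
    return context
}

func scratchPadDemo() throws -> (ownedByContext: Bool, dirtyAfterInsert: Bool, dirtyAfterRollback: Bool) {
    let context = try makeContext()
    var owned = false
    var dirtyAfterInsert = false
    var dirtyAfterRollback = true
    context.performAndWait {
        let city = NSEntityDescription.insertNewObject(forEntityName: "City", into: context)
        city.setValue("Portland", forKey: "name")
        owned = city.managedObjectContext === context   // the object lives in exactly this context
        dirtyAfterInsert = context.hasChanges           // the scratch pad now holds an unsaved insert
        context.rollback()                              // throw the pending changes away
        dirtyAfterRollback = context.hasChanges
    }
    return (owned, dirtyAfterInsert, dirtyAfterRollback)
}
```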

… But There Can Be More Than One Context in an App

If you think of contexts as scratch pads, you can immediately see why having multiple contexts could be a good thing.

  • One context contains all the objects that are visible to the user in the main UI.
  • Another context could be used to parse and persist objects downloaded from a server. Since it’s not the UI context, the newly created objects would not appear immediately in the UI, and their retrieval would not block the UI. You could also discard that context entirely if the download or parsing failed.
  • Yet another context could be used to create a new object with the option to discard it if the creation process is canceled by the user.

It is rare to develop a Core Data application with only one context.

The Core Data Stack

There are more pieces of the Core Data puzzle we have not mentioned so far: the persistent stores and persistent store coordinator. Together with the context and managed objects they form the Core Data stack.

The persistent stores are used to store the data. There can be more than one, even though on iOS you are likely to only have one. The persistent store coordinator is used to manage access to the persistent stores. It is also the holder of the one model used for all stores and manages all contexts connected to it. You can find more information on this topic in this presentation by fellow Pixel Paul Goracke.
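As a rough illustration of how these pieces fit together, the sketch below wires up one model, one coordinator, one in-memory store, and two contexts. The CoreDataStack type, the City entity, and makeStack() are hypothetical names chosen for this example; this is not the stack from the presentation mentioned above.

```swift
import CoreData

// Hypothetical container for the pieces of the stack.
struct CoreDataStack {
    let coordinator: NSPersistentStoreCoordinator
    let uiContext: NSManagedObjectContext
    let backgroundContext: NSManagedObjectContext
}

func makeStack() throws -> CoreDataStack {
    // The model: one hypothetical "City" entity with a "name" string attribute.
    let name = NSAttributeDescription()
    name.name = "name"
    name.attributeType = .stringAttributeType
    name.isOptional = true
    let city = NSEntityDescription()
    city.name = "City"
    city.properties = [name]
    let model = NSManagedObjectModel()
    model.entities = [city]

    // The coordinator holds the one model and manages access to the stores.
    let coordinator = NSPersistentStoreCoordinator(managedObjectModel: model)

    // A single store, in memory here; an SQLite store on disk is the usual choice on iOS.
    _ = try coordinator.addPersistentStore(ofType: NSInMemoryStoreType,
                                           configurationName: nil, at: nil, options: nil)

    // Two contexts connected to the same coordinator.
    let uiContext = NSManagedObjectContext(concurrencyType: .mainQueueConcurrencyType)
    uiContext.persistentStoreCoordinator = coordinator
    let backgroundContext = NSManagedObjectContext(concurrencyType: .privateQueueConcurrencyType)
    backgroundContext.persistentStoreCoordinator = coordinator

    return CoreDataStack(coordinator: coordinator,
                         uiContext: uiContext,
                         backgroundContext: backgroundContext)
}
```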

What’s Important to Know About Contexts

The Core Data concurrency model requires you to specify the concurrency type of a context when you create it. As of iOS 9, there are two (non-deprecated) concurrency types:

  • The NSMainQueueConcurrencyType used by the main UI context in an application.
  • The NSPrivateQueueConcurrencyType used for a context running all operations on a private queue. Background contexts are typically initialized like this.

When you are dealing with managed objects from the main UI context on the main thread, you can manipulate managed objects directly as you would any other object in an application.

In all other cases (accessing a managed object from the main context on another thread, or making changes to a managed object belonging to a PrivateQueue context), the concurrency model requires that you perform all operations on a managed object on its context’s queue, using one of two methods: performBlock or performBlockAndWait.
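Here is a minimal sketch of both methods on a PrivateQueue context. In current Swift the two methods are spelled perform and performAndWait; they correspond to performBlock and performBlockAndWait. The City entity and the helper names are hypothetical, and the semaphore exists only so that this example can wait for the asynchronous block before counting.

```swift
import CoreData

// Hypothetical in-memory stack with a single "City" entity that has a "name" string.
func makeCoordinator() throws -> NSPersistentStoreCoordinator {
    let name = NSAttributeDescription()
    name.name = "name"
    name.attributeType = .stringAttributeType
    name.isOptional = true
    let city = NSEntityDescription()
    city.name = "City"
    city.properties = [name]
    let model = NSManagedObjectModel()
    model.entities = [city]
    let coordinator = NSPersistentStoreCoordinator(managedObjectModel: model)
    _ = try coordinator.addPersistentStore(ofType: NSInMemoryStoreType,
                                           configurationName: nil, at: nil, options: nil)
    return coordinator
}

func insertCityThenCount(named cityName: String) throws -> Int {
    // The concurrency type is chosen when the context is created.
    let background = NSManagedObjectContext(concurrencyType: .privateQueueConcurrencyType)
    background.persistentStoreCoordinator = try makeCoordinator()

    // Asynchronous: perform returns immediately; the block runs later on the context's queue.
    let done = DispatchSemaphore(value: 0)
    background.perform {
        let city = NSEntityDescription.insertNewObject(forEntityName: "City", into: background)
        city.setValue(cityName, forKey: "name")
        try? background.save()
        done.signal()
    }
    done.wait()

    // Synchronous: performAndWait blocks the caller until the block has run.
    var count = 0
    background.performAndWait {
        let request = NSFetchRequest<NSManagedObject>(entityName: "City")
        count = (try? background.count(for: request)) ?? 0
    }
    return count
}
```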

A Word on Parent-Child Contexts

Another important point that you need to get a good grip on is that contexts can be chained such that one is the child of another. If and when you do this, you have to realize that:

  1. Changes saved in a child context are pushed to the parent, not to the persistent store. The parent context behaves as the child’s persistent store.
  2. Changes saved in a parent are not pushed down to the child.
  3. Changes saved in a sibling are not pushed to other siblings.

This means that changes only propagate one level up in a context hierarchy. If the context you are saving is several levels away from the persistent store, you will need to save each parent context in the hierarchy until you reach the context that saves to the persistent store to actually persist your changes.
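The one-level-up behavior can be observed with a sketch like the following. It assumes a hypothetical City entity and an in-memory store, and it uses a third context (called probe here) attached directly to the coordinator purely to look at what has reached the store.

```swift
import CoreData

// Hypothetical in-memory stack with a single "City" entity that has a "name" string.
func makeCoordinator() throws -> NSPersistentStoreCoordinator {
    let name = NSAttributeDescription()
    name.name = "name"
    name.attributeType = .stringAttributeType
    name.isOptional = true
    let city = NSEntityDescription()
    city.name = "City"
    city.properties = [name]
    let model = NSManagedObjectModel()
    model.entities = [city]
    let coordinator = NSPersistentStoreCoordinator(managedObjectModel: model)
    _ = try coordinator.addPersistentStore(ofType: NSInMemoryStoreType,
                                           configurationName: nil, at: nil, options: nil)
    return coordinator
}

func parentChildDemo() throws -> (storeCountAfterChildSave: Int, storeCountAfterParentSave: Int) {
    let coordinator = try makeCoordinator()

    let parent = NSManagedObjectContext(concurrencyType: .privateQueueConcurrencyType)
    parent.persistentStoreCoordinator = coordinator
    let child = NSManagedObjectContext(concurrencyType: .privateQueueConcurrencyType)
    child.parent = parent                       // the parent acts as the child's store

    // Used only to count what has actually reached the persistent store.
    let probe = NSManagedObjectContext(concurrencyType: .privateQueueConcurrencyType)
    probe.persistentStoreCoordinator = coordinator
    func storeCount() -> Int {
        var count = -1
        probe.performAndWait {
            let request = NSFetchRequest<NSManagedObject>(entityName: "City")
            count = (try? probe.count(for: request)) ?? -1
        }
        return count
    }

    var failure: Error?
    child.performAndWait {
        let city = NSEntityDescription.insertNewObject(forEntityName: "City", into: child)
        city.setValue("Denver", forKey: "name")
        do { try child.save() } catch { failure = error }    // pushed one level up, to the parent
    }
    if let failure = failure { throw failure }
    let afterChildSave = storeCount()

    parent.performAndWait {
        do { try parent.save() } catch { failure = error }   // the parent writes to the store
    }
    if let failure = failure { throw failure }
    return (afterChildSave, storeCount())
}
```

The first count should be 0 and the second 1: the object only reaches the store once the parent has saved as well.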

“There’s something very important I forgot to tell you.” — “What?” — “Don’t cross the streams.” — “Why?” — “It would be bad.” — Ghostbusters (1984)

“All that is left to us is Honor.” — Core Data famous last words

Managed objects and contexts are like that; never pass a managed object directly from one context to another. It would be bad, as in, “I’ve been chasing this random crash for two days” bad.

And it is very easy to do. Fortunately, in iOS 8 the Core Data team at Apple added a launch argument, -com.apple.CoreData.ConcurrencyDebug 1, which you set in your Xcode scheme and which makes it easier to test against concurrency issues.

This should be your default for every single Core Data project you work on

Use it! Always! Make it part of your default Xcode template. It will save you hours of trouble down the line. It strictly enforces the Core Data concurrency model. With this option enabled, the app stops whenever you violate the concurrency rules, with the debugger halted in +[NSManagedObjectContext __Multithreading_Violation_AllThatIsLeftToUsIsHonor__], which is where the “famous last words” quoted earlier come from.

[Screenshot: Core Data multithreading model violation crash]

A Few Good Rules to Follow

Rule #1: No matter which thread you are on, you are guaranteed that the following two properties are safe to access on a managed object:

  • objectID, which gives you the unique ID of this managed object.
  • managedObjectContext, which gives you the context this managed object lives in.

Accessing a managed object’s custom attributes, for reading or writing, must be done on the thread of the context managing that object. Failure to do so will violate the Core Data multithreading contract. You will only have your honor left!

Access to the managed object context also needs to be thread-safe. For instance in iOS 9 (and probably for prior versions as well), the method hasChanges on the context must be accessed on the context’s own thread.

Rule #2: A managed object created on the main UI context can only be manipulated directly if it is on the main thread. Any attempt to manipulate this object on another thread may crash the application.

Rule #3: A managed object belonging to a PrivateQueue context can only be manipulated on that context’s queue using either of the methods:

  • performBlockAndWait, which will run whatever is within that block synchronously on the context’s queue.
  • performBlock, which will run whatever is within that block asynchronously on the context’s queue.

One interesting fact: The method performBlockAndWait will likely run on the same thread it was called on, even if called on a PrivateQueue context.

Rule #4: The main UI context can also use the two block methods to run code on its own queue (which is on the main thread). This is useful in cases where the main UI context must update some objects following an operation on another thread.

What these four rules mean in practice is that if you have a managed object and you don’t know whether the thread you are on is the one this object was created on, you can safely do this:
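A minimal sketch of that pattern follows. It assumes a hypothetical name attribute on the object, and readName and rename are illustrative helper names. It uses the current Swift spellings perform and performAndWait for performBlock and performBlockAndWait.

```swift
import CoreData

// Synchronous read from any thread: hop onto the object's own context first.
func readName(of object: NSManagedObject) -> String? {
    // managedObjectContext is one of the two properties that are safe on any thread.
    guard let context = object.managedObjectContext else { return nil }
    var result: String?
    context.performAndWait {
        // Custom attributes are only touched on the context's queue.
        result = object.value(forKey: "name") as? String
    }
    return result
}

// Asynchronous write from any thread; completion is called on the context's queue.
func rename(_ object: NSManagedObject, to newName: String, completion: @escaping () -> Void) {
    guard let context = object.managedObjectContext else {
        completion()
        return
    }
    context.perform {
        object.setValue(newName, forKey: "name")
        completion()
    }
}
```

Keep in mind that performAndWait blocks the calling thread until the block has run, so the asynchronous form is usually the better default when the caller does not need the result right away.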

You will read or hear that just accessing custom properties is OK. It may work in practice, but the framework does not guarantee it and the app may crash. When the Core Data concurrency debug launch argument (-com.apple.CoreData.ConcurrencyDebug 1) is turned on in Xcode, the app will crash if you access most properties on a managed object from a thread that is not the one the managed object was created on. That includes all your custom properties. Such access is regarded as a violation of the Core Data multithreading API contract. In addition to objectID and managedObjectContext, the following properties seem to be safely accessible from any thread:

  • fault, faultingState, hasChanges, and entity.

The following properties are not thread-safe:

  • custom properties, inserted, deleted, updated, changedValues, and hasFaultForRelationshipNamed.

Rule #5: In 99.9 percent of the cases there is only one safe way to pass a managed object from one context to another: using its objectID and using the context confinement methods described above.
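A sketch of the hand-over by ID might look like this. It assumes a hypothetical name attribute, and readName(of:using:) is an illustrative helper. The method the article calls objectWithID is spelled object(with:) in current Swift.

```swift
import CoreData

// Reads "name" for an object owned by one context, through another context.
// The object itself never crosses over; only its objectID does.
func readName(of object: NSManagedObject, using otherContext: NSManagedObjectContext) -> String? {
    let objectID = object.objectID            // safe to read on any thread
    var result: String?
    otherContext.performAndWait {
        // Fast: returns an object right away, possibly a fault, without checking the store.
        let local = otherContext.object(with: objectID)
        // Reading an attribute fires the fault; this is where a missing row would blow up.
        result = local.value(forKey: "name") as? String
    }
    return result
}
```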

Safe (but not safest) way to retrieve a managed object by ID

The method objectWithID used in this code is fast but unsafe. If the object no longer exists in the persistent store, accessing it will crash.

The method existingObjectWithID(_:error:) is slower but much safer. It guarantees that the returned object exists in the store. That is why it is slower: it performs a round-trip to the store to retrieve the object. The many hard-to-debug crashes I have witnessed over the years have convinced me that the safety of this latter method far outweighs the performance cost when dealing with multiple contexts. So I would encourage you to write this instead:
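A sketch of the safer variant, under the same assumptions (a hypothetical name attribute and an illustrative helper name). In current Swift, existingObjectWithID is spelled existingObject(with:) and reports failure by throwing.

```swift
import CoreData

// Reads "name" through another context, returning nil if the object cannot be loaded.
func readNameIfPresent(_ objectID: NSManagedObjectID, in context: NSManagedObjectContext) -> String? {
    var result: String?
    context.performAndWait {
        do {
            // Throws instead of handing back a fault for an object that cannot be loaded.
            let local = try context.existingObject(with: objectID)
            result = local.value(forKey: "name") as? String
        } catch {
            // The object is gone or cannot be fetched: report it as missing.
            result = nil
        }
    }
    return result
}
```

The caller gets nil for an object that was deleted in the meantime, rather than a crash at some later point when the fault fires.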

A safer but slower way to retrieve a managed object by ID

However, there are two cases where you can’t use the method above.

The first case is when your object does not have a permanent object ID. In this case you need to consider your Core Data stack configuration. In the snippet above:

  • If the main context is the parent of the background context, retrieving the object by ID will work. The main context acts as the background context’s store. The background context therefore has access to that temporary ID.
  • If the main context is not the parent of the background context, retrieving the object by ID will fail. The temporary ID only exists in the main context and not in the store. The background context therefore has no access to that ID. In that case, you will need to obtain a permanent object ID for the objects in the main context using the API obtainPermanentIDsForObjects(_:error:) before you can retrieve them in the background context.
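Swapping a temporary ID for a permanent one could be sketched as follows. The helper name permanentID(for:) is hypothetical, and obtainPermanentIDsForObjects is spelled obtainPermanentIDs(for:) in current Swift. Note that a permanent ID on its own does not put the object's data in the store; a sibling context can only load the object after the owning context has saved.

```swift
import CoreData

// Returns a permanent objectID for the object, converting a temporary one if needed.
func permanentID(for object: NSManagedObject) throws -> NSManagedObjectID {
    guard let context = object.managedObjectContext else { return object.objectID }
    var failure: Error?
    context.performAndWait {
        // Objects that were inserted but never saved still carry a temporary ID.
        guard object.objectID.isTemporaryID else { return }
        do {
            try context.obtainPermanentIDs(for: [object])
        } catch {
            failure = error
        }
    }
    if let failure = failure { throw failure }
    return object.objectID
}
```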

The second case is when contexts are connected to two different persistent store coordinators. This goes beyond the scope of this article. For more information on this topic, refer to the upcoming objc.io book on Core Data.

Get Started With Core Data

If this is all new to you, the Apple documentation is a good place to start.

Paul Goracke has given several presentations on Core Data:

  • Core Data Potpourri (video and pdf)
  • Care and Feeding of Your Core Data Performance (pdf)
  • Multiple Persistent Stores in Core Data (pdf)

To set up your Core Data stack, the Apple sample Core Data application is not that great. I would refer you to “My Core Data Stack” by Marcus Zarra, which provides a much better alternative to the default Apple code.

Finally, there is a book coming out really soon written by Florian Kugler and Daniel Eggert from objc.io which is very current and extensively covers some advanced Core Data topics.
