Speech Recognition and Core Data in iOS 10

We’re always excited to play with new APIs that arrive with each iOS release. Apple introduced a new Speech Recognition API that lets us tap into the server-side speech-processing capabilities that power Siri, and Core Data picked up significant changes this year that make it much easier and safer to work with. To experiment with both, we’ll build a mashup: a contrived example app called “Tell Core Data” that lets users add colors to a table with voice commands.

Speech Recognition

The iOS platform has had speech-to-text in the system keyboard since 2011. The new Speech Recognition APIs expose that functionality decoupled from the keyboard so we can programmatically process dictation results. This gives us the ability to do pattern-matching on what a user is dictating. My first programming experiences were inspired by text-based adventure games like “Tass Times in Tonetown” in which you issue text commands like “GO EAST” and “WEAR JUMPSUIT” — it would have been fun to be able to tap into speech-to-text processing for my games.

The first step to using the Speech Recognition API is to add a couple of entries to Info.plist declaring that the app requires permission to send data to Apple’s servers. We also add

import Speech

to bring in the Speech module. There’s a little bit of setup required to initialize speech recognition. Take a look at the SpeechService class in the example project to see how it works; it’s based on Apple’s SpeakToMe sample.
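
For reference, the two entries in question are the usage-description keys NSSpeechRecognitionUsageDescription and, because we capture audio from the microphone, NSMicrophoneUsageDescription. Here is a minimal authorization sketch; the print statements are placeholders for whatever UI updates your app needs:

import Speech

// Ask for speech recognition permission up front. The strings you provide in
// Info.plist for NSSpeechRecognitionUsageDescription and
// NSMicrophoneUsageDescription are shown in the system prompts.
SFSpeechRecognizer.requestAuthorization { authStatus in
    // The callback can arrive on a background queue, so hop to main before
    // touching UI such as enabling a record button.
    DispatchQueue.main.async {
        switch authStatus {
        case .authorized:
            print("Speech recognition authorized")
        default:
            print("Speech recognition unavailable")
        }
    }
}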

The basic flow: first we request authorization with SFSpeechRecognizer.requestAuthorization. When the user presses a button to begin speaking, we send off a series of calls, beginning with initializing an AVAudioSession. Then we create an SFSpeechAudioBufferRecognitionRequest and provide it to an SFSpeechRecognitionTask. Finally, we start an AVAudioEngine.
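
The SpeechService in the example project follows that sequence; here is a condensed sketch of what it looks like, written in Swift 3 against the iOS 10 SDK. The class shape and names (startRecording, resultHandler) are illustrative rather than the project’s exact API:

import AVFoundation
import Speech

class SpeechService {
    private let speechRecognizer = SFSpeechRecognizer(locale: Locale(identifier: "en-US"))
    private let audioEngine = AVAudioEngine()
    private var recognitionRequest: SFSpeechAudioBufferRecognitionRequest?
    private var recognitionTask: SFSpeechRecognitionTask?

    func startRecording(resultHandler: @escaping (String) -> Void) throws {
        // 1. Configure and activate the shared audio session for recording.
        let audioSession = AVAudioSession.sharedInstance()
        try audioSession.setCategory(AVAudioSessionCategoryRecord)
        try audioSession.setActive(true, with: .notifyOthersOnDeactivation)

        // 2. Create a buffer-based recognition request that reports partial
        //    results so we get live text while the user is still speaking.
        let request = SFSpeechAudioBufferRecognitionRequest()
        request.shouldReportPartialResults = true
        recognitionRequest = request

        // 3. Hand the request to a recognition task and consume the text.
        recognitionTask = speechRecognizer?.recognitionTask(with: request) { result, error in
            if let result = result {
                resultHandler(result.bestTranscription.formattedString)
            }
            // Teardown on error or on the final result is omitted for brevity.
        }

        // 4. Feed microphone audio into the request and start the engine.
        //    (inputNode is optional in the iOS 10 SDK; later SDKs make it non-optional.)
        guard let inputNode = audioEngine.inputNode else { return }
        let recordingFormat = inputNode.outputFormat(forBus: 0)
        inputNode.installTap(onBus: 0, bufferSize: 1024, format: recordingFormat) { buffer, _ in
            request.append(buffer)
        }
        audioEngine.prepare()
        try audioEngine.start()
    }
}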

After that series of calls we have fast, live speech recognition we can consume as text in the callback from the speech recognition task. In our example app, when a user says a color like “red”, we dispatch that command to Core Data to make changes in the persistent store and then refresh our table view. Here’s our example after calling out a series of colors.

[Screenshot: the Tell Core Data table after dictating a series of colors]

This new speech-processing capability is another data point in a trend of Apple giving us more power to do interesting things. As fun as it is to play with speech recognition, the new Core Data APIs are far more germane to the apps we build today, and Core Data is getting some really terrific updates this year.

Core Data

Core Data is getting much better about how it handles multiple concurrent requests with regard to locking. A typical Core Data stack uses multiple Managed Object Contexts (MOCs), each attached to a Persistent Store Coordinator (PSC). Usually a main-queue MOC is used for reading, and writes are done on background-queue MOCs. Because Core Data’s concurrency model has required locking at the PSC level to coordinate multiple requests, apps that download large payloads and write them in the background have benefitted from maintaining two independent stacks with independent PSCs. The big change this year is that the lock on the PSC moves down to the SQLite layer, which results in a much more responsive Core Data stack and may eliminate the cases where two stacks are needed.

Another exciting change this year is the introduction of NSPersistentContainer, which wraps up an NSManagedObjectModel, NSPersistentStoreCoordinator, and NSManagedObjectContext in one object. This makes setup much easier than building and attaching each of those components separately. Setting up an NSPersistentContainer couldn’t be easier:

let container = NSPersistentContainer(name: "DataModel")
container.loadPersistentStores { (storeDescription, error) in
    if let error = error as NSError? {
        // Error handling, e.g. log the failure or surface it to the user
    }
}

When we use an NSPersistentContainer, we get support for the common workflow of using a main-queue MOC for fetches and private-queue MOCs for doing updates in the background. For the main-queue context, look for the aptly named viewContext property. Private-queue contexts are provided as a block parameter when you call the handy performBackgroundTask() method.

container.performBackgroundTask { (privateQueueMOC) in
    // Use the private-queue MOC...
}
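
To make that concrete, here is a hedged sketch of how the example app’s color commands might be written on a background context. The Color entity and its name attribute are illustrative stand-ins; the example project’s actual model may differ:

import CoreData

// Illustrative only: assumes the data model defines a "Color" entity with a
// String attribute called "name".
func add(colorName: String, to container: NSPersistentContainer) {
    container.performBackgroundTask { moc in
        let color = NSEntityDescription.insertNewObject(forEntityName: "Color", into: moc)
        color.setValue(colorName, forKey: "name")
        do {
            try moc.save()
        } catch {
            print("Background save failed: \(error)")
        }
    }
}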

There’s a philosophy shift behind this year’s Core Data revisions: we get really good conventions and sane defaults so that we do less configuration ourselves. NSPersistentContainer exemplifies that philosophy, as do other new additions such as automaticallyMergesChangesFromParent, which relieves us of writing code to handle merge notifications. NSPersistentContainer will simplify Core Data usage for many apps, and for those cases where it fits we’re looking forward to using it. Other Core Data features we’re happy to see are Swift generics usage in Core Data types, and UICollectionViewDataSourcePrefetching support in NSFetchedResultsController.
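
As a rough sketch of two of those conveniences together, assuming the container from the setup snippet above is in scope and reusing the hypothetical Color entity from the earlier example: we opt the view context into automatic merging, then run a generically typed fetch request that needs no casting of its results.

import CoreData

// Let background saves flow into the UI context without manual merge code.
container.viewContext.automaticallyMergesChangesFromParent = true

// NSFetchRequest is now generic over its result type. With Xcode's generated
// NSManagedObject subclasses you would write Color.fetchRequest() and get an
// NSFetchRequest<Color>; here we stick to NSManagedObject to stay model-agnostic.
let request = NSFetchRequest<NSManagedObject>(entityName: "Color")
request.sortDescriptors = [NSSortDescriptor(key: "name", ascending: true)]

do {
    let colors = try container.viewContext.fetch(request)
    print("Fetched \(colors.count) colors")
} catch {
    print("Fetch failed: \(error)")
}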

Check out the example app if you’d like to read through more of the implementation of the new Core Data stack. That wraps up our mashup of new Speech Recognition and Core Data.

This article is part of our Welcome to iOS 10 series.
