Optical Text Recognition in iOS 13

Brian Telintelo
Atomic Robot
Jun 10, 2019

At WWDC 2019, the iOS Vision framework got a big upgrade: it now supports optical character recognition that can extract raw text from an image! An app can now easily take advantage of this capability. All the processing happens right on the device, which keeps processing time fast. Let's look at how easy it is to do this in Xcode 11 and iOS 13.

Let's assume I have a recipes app and I want to share one of my favorite recipes with other users. My recipe is in a book that I have at home. How cool would it be if, instead of just posting an image of the recipe, I could extract the text? That would make my recipe searchable by ingredient and would make it easier to convert the recipe into a readable format.

Take a look at the picture of my recipe. The quality is not horrible, but it's not perfect either. Let's see how we can use the Vision framework to extract the text, and whether the output is accurate.

Using Xcode 11 with a minimum deployment target of iOS 13, let's check out an example. In the following code, I am using the new Apple Vision framework to turn my image into text. I have a UITextView in a view controller that I am going to use to display the text that the Vision framework finds in the recipe.
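Here's a minimal sketch of that code; the class name, the textView outlet, and the "recipe" image name are placeholders for my own setup:

import UIKit
import Vision

class RecipeViewController: UIViewController {

    @IBOutlet weak var textView: UITextView!

    override func viewDidLoad() {
        super.viewDidLoad()
        if let image = UIImage(named: "recipe") {
            recognizeText(in: image)
        }
    }

    private func recognizeText(in image: UIImage) {
        guard let cgImage = image.cgImage else { return }

        // The completion handler receives the recognized text observations.
        let request = VNRecognizeTextRequest { [weak self] request, error in
            guard let observations = request.results as? [VNRecognizedTextObservation] else { return }

            // Each observation is one line of text; take the top candidate for each.
            let lines = observations.compactMap { $0.topCandidates(1).first?.string }

            // UIKit updates must happen on the main thread.
            DispatchQueue.main.async {
                self?.textView.text = lines.joined(separator: "\n")
            }
        }

        // Perform the request off the main thread so the UI stays responsive.
        DispatchQueue.global(qos: .userInitiated).async {
            let handler = VNImageRequestHandler(cgImage: cgImage, options: [:])
            try? handler.perform([request])
        }
    }
}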

The above code makes a VNRecognizeTextRequest, which is part of the Vision framework. The results of this request are recognized text observations: lines of text extracted from the image. For display purposes, I add each line of text to the textView so it shows up in the simulator. Here is the resulting output from the recipe.

Caramel Oatmeal Chewies13/4 cups quick or old-fashioned oats1.%4 cups all-purpose flour, divided3/4 cup packed brown sugar1/2 teaspoon baking soda14 teaspoon salt (optional)3/4 cup (11 sticks) butter or margarine, melted2 cups (12-ounce package) NESTLE TOLL HOUSESemi-Sweet Chocolate Morsels1 cup chopped nuts1 cup caramel ice cream toppingPREHEAT oven to 350°F. Grease bottom of 13X9-inchbaking pan.COMBINE oats, 1'7 cups flour, brown sugar, baking sodaand salt in large bowl. Stir in butter; mix well. Reserve 1 cupoat mixture; press remaining oat mixture onto bottom ofprepared baking pan.BAKE for 12 TO 15 minutes or until golden brown. Sprinklewith morsels and nuts. Mix caramel topping with remainingflour in small bowl; drizzle over morsels to within 1/4 inch ofpan edges. Sprinkle with reserved oat mixture.BAKE for 18 to 22 minutes or until golden brown. Cool in panon wire rack; refrigerate until firm.Makes about 2'7 dozen bars

Pretty accurate! The recognizer did fail on a few things, like 13/4 instead of 1 3/4, but for the most part everything else was recognized correctly. Notice also that the text comes back line by line, so the original formatting is lost. This means the text won't wrap properly without some additional logic to restore it. From what I can tell, there is nothing in the Vision framework that will help with formatting; another library or my own logic would be needed to add it back.
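That said, each observation does carry a normalized boundingBox, which gives a starting point for reconstructing layout yourself. A rough sketch, which would go inside the completion handler above:

// boundingBox is normalized to [0, 1] with the origin at the bottom-left,
// so sorting by descending minY orders the lines top to bottom.
let ordered = observations.sorted { $0.boundingBox.minY > $1.boundingBox.minY }
for observation in ordered {
    let line = observation.topCandidates(1).first?.string ?? ""
    print(observation.boundingBox, line)
}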

There are a number of options that can be set on the VNRecognizeTextRequest, starting with the recognition level:

request.recognitionLevel = .accurate
request.recognitionLevel = .fast
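A few other properties on the request are worth a look too; the values below are just examples:

// Apply language-model-based correction to the raw results
request.usesLanguageCorrection = true

// Restrict recognition to particular languages
request.recognitionLanguages = ["en-US"]

// Bias recognition toward domain-specific words, like ingredient names
request.customWords = ["NESTLE", "TOLL HOUSE"]

// Skip text smaller than this fraction of the image height
request.minimumTextHeight = 0.01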

In my testing, a recognition level of .fast returned almost instantly, while .accurate took about three seconds. Interestingly, on this image the .accurate level actually got some things wrong that .fast got right. Because the fast level returns so quickly, it would be a good fit for use cases where the camera is in use and you want real-time text recognition.

For cases where a large amount of text might take some time to parse, I can show a progress bar while the request executes on the background thread. VNRecognizeTextRequest has a progress handler for exactly this:

request.progressHandler = { request, progress, error in
    // progress runs from 0.0 to 1.0 on a background thread; hop to the main
    // thread to update the UI (progressView: an assumed UIProgressView outlet)
    DispatchQueue.main.async { progressView.progress = Float(progress) }
}

Overall, the Vision framework is really cool, and it's exciting how easy it is to add to an app. I'm curious what developers will do with this in the future!

If you found this helpful, check out our other articles at https://medium.com/atomic-robot or get in touch at https://atomicrobot.com/contact/ if you would like to work with us on your next project.
