USING PYTHON IN PRE-BUILD PHASE IN XCODE

Originally published at my blog here.

Ilia Kosynkin
Sep 9, 2018

Preamble

Pre- and post-build events are a very useful feature of Xcode, yet quite underestimated and under-researched (in my opinion). Usually, the whole interaction with this feature in an average project is to press the button to add a new “Run Script” phase and copy-paste a script from a library’s documentation so that the library works (I, for instance, usually integrate R.swift straight away). However, since you can run pretty much any command from a “Run Script” phase, the range of tasks that can be done from there is much wider.

In this article, I would like to show how to integrate a simple Python script that checks the code and raises a compiler warning (or error) in case there is a problem.

Important note: this article mainly concentrates on creating the Python script and integrating it, rather than on the Swift code. In case you have questions related to the Swift code, make sure to ask them in the comment section!

TL;DR:

If you’re not particularly interested in digging into the details, already have a script to integrate, or just want to know how to throw custom warnings/errors in the “Run Script” phase of Xcode, here is a compressed explanation especially for you:

  • To run a Python script from Xcode, put the following in a “Run Script” phase: python "{script_path}"
  • To throw a custom error/warning, print it to stdout in the following format: “{path}:{line}: (error|warning): {message}”. Here “path” is the path to the file to show the warning/error in, “line” is the line to show it at, “error” or “warning” specifies whether it is an error or a warning, and “message” is the message to be shown.
  • If your script prints an error, it must finish with a non-zero exit code. In Python, call sys.exit(code) to finish the script with an error code. If you don’t, Xcode will generate an additional error.
  • In case you would rather just play with the code, check out the little app I wrote to illustrate the material in the article: GitHub.
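The output format from the list above can be sketched in a few lines of Python. The helper name “format_diagnostic”, the file path and the message are hypothetical and only serve as an illustration:

```python
def format_diagnostic(path, line, kind, message):
    """Build one line in the "{path}:{line}: {kind}: {message}" shape."""
    if kind not in ("error", "warning"):
        raise ValueError("kind must be 'error' or 'warning'")
    return "{0}:{1}: {2}: {3}".format(path, line, kind, message)


# Hypothetical file, line and message, for illustration only.
print(format_diagnostic("/path/to/NetWebcomic.swift", 12, "warning",
                        "property 'date' is missing in AppWebcomic"))
```

Note that the word printed between the line number and the message is either “error” or “warning”, without parentheses.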

Idea outline

There are plenty of reasons to want to implement additional code checks: starting from searching for TODOs in the code base and finishing with more complex rules to enforce (like forcing test-coverage for example).

In this article, I want to show how to implement a check for the model’s layers. For this purpose, we will integrate a simple REST endpoint, XKCD, which returns the current webcomic from the XKCD site.

Let’s start with a small glossary to make sure that we are all on the same page here:

  • Model’s layer: an abstract concept that groups all classes whose objects are used within a certain part of an application. For example, DB layer objects are used in operations with the database (it doesn’t matter whether it’s CoreData or Realm). A model’s layer doesn’t have an “in code” representation; it’s more of an abstract aggregation.
  • Model’s entity: an abstract business model that describes the properties we want to use. For example, from XKCD we have Webcomic, which we want to operate on. That’s an entity. NetWebcomic is the representation of that entity within the Net layer.

We will use three model layers for this purpose:

  • Net: objects from this layer are responsible for parsing data from the Internet (API, web scraping, etc.)
  • DB: the database layer; its objects represent the data stored in the local database.
  • App: objects from this layer are used in the business logic (for presenting data in the UI, CRUD operations, etc.)

Let’s also define a few assumptions that will make the task much easier, while not breaking practical usability and usefulness of the example:

  1. All entities (in our case only one, Webcomic) must have a representation (class) in every layer.
  2. The representation (class) name of an entity in a layer must be in the following format: “{prefix}{entity_name}”. So the representation of Webcomic in the Net layer will be “NetWebcomic”.
  3. All representations (classes) must be in one folder.

The rule that we want to enforce upon all entities is the following: the representations of an entity in every layer should have the same properties (i.e. the same names, though the types can differ). If a class has a property that the others lack, or misses a property that the others have, our script must conveniently print a warning or throw an error when the project is built. The idea behind this rule is to make sure that a change in any layer is reflected in the other layers as well. That comes in handy if you added a property in one layer, but got distracted and forgot to add the counterpart properties in the others.

That pretty much sums up the outline. Before continuing to the implementation details, I would like to present the reasons why I picked Python over other options.

Why Python?

Programmers who have done things with the “Run Script” phase may be wondering about the choice of Python, since Bash is usually used for simple tasks.

It actually doesn’t really matter what you use: you can run any language or program from the “Run Script” phase. However, Python has the following advantages over other options:

  • It’s there out of the box: at the time of writing, macOS comes with Python 2.7 installed, so you can use it without requiring your colleagues to install additional packages/programmes.
  • It’s a popular language with a lot of battle-tested libraries and a wide community to help you out.

If you feel more comfortable with JS or Ruby or any other language — you can apply things explained in this article to those languages as well (to a certain extent, obviously).

Representations

Let me start off by defining the actual classes for the layers.

Net

Nothing too special here. It is an almost direct mapping of the JSON from the endpoint, except for the “date” property, which is built from the “day”, “month” and “year” fields of the JSON (comic.date = “(day)-(month)-(year)”).

DB

I used CoreData as the database framework; you can use Realm instead. Also, note that the type of “date” changed. That is not really relevant for us, as long as the property “date” exists in the class.

App

The class that is used in the business logic of the application.

Python script

Now let’s dig into the actual Python code that will parse our models’ code and check it. I will split the code into sections to make it easier to understand. The code is supposed to work with Python 2.7 without any additional libraries; however, it should be easy to adapt it for Python 3.

Define and parse CLI arguments

The “argparse” package is the built-in solution for parsing command-line arguments. We create an ArgumentParser object with a description of what our script does.

Next, we add the command-line arguments:

  • Path: can be supplied via “-p=” or “--path=” and will end up in the “path” property of the object with parsed arguments.
  • Error: if this flag is set, the script will throw an error instead of a warning if an inconsistency is found.

The function “parse_args()” collects all defined arguments into an object (“args” in our case), which exposes them as properties whose names are defined by the “dest” argument (so the path will be in args.path).
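A minimal sketch of such a parser could look like this. The description text, the help strings, the “-e”/“--error” flag names and making the path required are my assumptions for illustration; only “-p”/“--path” and the “dest” names come from the description above:

```python
import argparse


def build_parser():
    """Create the command-line parser for the validation script."""
    parser = argparse.ArgumentParser(
        description="Checks that every model layer declares the same properties.")
    # The value ends up in args.path because of dest="path".
    parser.add_argument("-p", "--path", dest="path", required=True,
                        help="folder that contains the model classes")
    # Assumed flag names; store_true makes args.error False unless the flag is set.
    parser.add_argument("-e", "--error", dest="error", action="store_true",
                        help="report inconsistencies as errors, not warnings")
    return parser


# An explicit argument list is passed here so the sketch runs on its own;
# the real script would call parse_args() without arguments.
args = build_parser().parse_args(["--path=Models", "--error"])
print(args.path)
print(args.error)
```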

Parse models

Since we assume that all models are in one folder, we just iterate over the files in the path supplied by the command-line argument and parse all “.swift” files.

First, I declared a few variables that help us parse the data and store the result. The variables class_re and var_re are precompiled regex patterns: the first finds classes within a file and the second extracts variables from the body of those classes.
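As a rough sketch, the two patterns could look like the following. These exact expressions are illustrative assumptions rather than the ones from the test app: they expect the closing brace of a class at the very start of a line and a declaration that fits on one line, and they would also pick up local variables with explicit types inside methods. The group order is chosen so that a class match is (name, parent, body) and a variable match has the whole declaration at position 0, the name at position 4 and the type at position 5:

```python
import re

# Assumption: a class starts with "class Name" (optionally ": Parent, ...")
# and ends at the first "}" that sits at the very start of a line.
class_re = re.compile(
    r"\bclass\s+(\w+)\s*(?::\s*([\w.]+)[^{]*)?\{(.*?)\n\}",
    re.DOTALL,
)

# Positions in a match: 0 whole declaration, 1 attribute, 2 access modifier,
# 3 var/let keyword, 4 name, 5 type (may carry trailing spaces).
var_re = re.compile(
    r"((@\w+[ \t]+)?((?:private|fileprivate|internal|public|open)[ \t]+)?"
    r"\b(var|let)[ \t]+(\w+)[ \t]*:[ \t]*([^\n={]+))"
)

# Hypothetical Swift source, used only to exercise the patterns.
SAMPLE = """
class NetWebcomic: Codable {
    let title: String
    var date: String
    var num: Int = 0
}

class DBWebcomic: NSManagedObject {
    @NSManaged var date: Date?
}
"""

for name, parent, body in class_re.findall(SAMPLE):
    names = [match[4] for match in var_re.findall(body)]
    print("{0} (parent: {1}) -> {2}".format(name, parent, names))
```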

The dictionary “classes” is a more convenient way of storing the parsed data. Its layout is nested: for every entity, we store each representation under its prefix; each prefix entry stores the path to that particular representation and its variables; each variable stores its type and the line in the source code to throw the warning/error at.
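To make the layout concrete, here is how the dictionary might look after parsing two representations of Webcomic. The paths, types and line numbers are made up for illustration, and the key names “type” and “line” are my assumption; “__path__” is the key used later in the article:

```python
# Hypothetical paths, types and line numbers, for illustration only.
classes = {
    "Webcomic": {                       # entity name
        "Net": {                        # layer prefix
            "__path__": "/path/to/models/NetWebcomic.swift",
            "title": {"type": "String", "line": 4},
            "date": {"type": "String", "line": 5},
        },
        "DB": {
            "__path__": "/path/to/models/DBWebcomic.swift",
            "title": {"type": "String?", "line": 6},
            "date": {"type": "Date?", "line": 7},
        },
    },
}

# Looking up where the "date" property of the Net representation lives.
net = classes["Webcomic"]["Net"]
print("{0}:{1}".format(net["__path__"], net["date"]["line"]))
```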

Parsing itself is rather trivial: we iterate over the files and directories at the path with os.walk, which yields the root, directories and files for the current root (os.walk recursively goes over the whole hierarchy at the path). Then we iterate over the files and build the full path to every file in the current root (os.walk returns file names, not paths).

Then we make sure that the file we’re working on is actually a “.swift” file (since there might be files in other languages and storyboard/xib files there, which we’re not interested in); if it’s not, we continue to the next file.

If the file is actually a Swift file, we read its content via the codecs package, which lets us decode it as UTF-8 (in Python 2.7 the built-in “open” function has no encoding argument).
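The three steps above (walk, filter, read) can be sketched as a small generator. The function name “iter_swift_sources” is hypothetical; os.walk and codecs.open are the standard-library calls mentioned in the text:

```python
import codecs
import os


def iter_swift_sources(path):
    """Yield (full_path, content) for every ".swift" file under path."""
    for root, _dirs, files in os.walk(path):
        for name in files:
            # Skip storyboards, xibs and files in other languages.
            if not name.endswith(".swift"):
                continue
            # os.walk gives bare file names, so build the full path.
            full_path = os.path.join(root, name)
            with codecs.open(full_path, "r", "utf-8") as handle:
                yield full_path, handle.read()
```

It could then be used as “for full_path, content in iter_swift_sources(args.path): ...”.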

Then we parse the content of the file with the class_re regex, which gives us all matched occurrences of classes within the file. Every match contains the name of the class in position 0 and its content in position 2 (position 1 contains the parent class name, or an empty string if there is no parent).

Then we try to parse the class name. Remember that we assume that every representation’s name is in the format “{prefix}{entity_name}”, so classes with names that don’t match this format don’t interest us. Extraction is done via a helper function, “extract”.

This function returns a tuple of prefix and entity name if the class name matches the format, and None if it doesn’t. Note that we’re using the global variable “prefixes” from the outer scope. This way, we can easily adjust the supported prefixes.
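A possible shape of that helper is sketched below. The prefix list matches the three layers used in this article; the extra check that the entity name starts with an uppercase letter is my own assumption, added so that a class such as “Application” is not mistaken for an App-layer representation:

```python
# Supported layer prefixes; adjust this list to add or remove layers.
prefixes = ["Net", "DB", "App"]


def extract(class_name):
    """Return (prefix, entity_name) or None if the name has no known prefix."""
    for prefix in prefixes:
        rest = class_name[len(prefix):]
        # Assumed extra rule: the entity name must start with a capital letter.
        if class_name.startswith(prefix) and rest[:1].isupper():
            return prefix, rest
    return None


print(extract("NetWebcomic"))
print(extract("ViewController"))
```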

In the loop, we check if “extract” returned a tuple (not None) and, if it did, we proceed to store the parsed information.

First, we make sure that “classes” has the entity name in it (we add it if it’s not present).

Then we check for the prefix in the same way, adding the “__path__” variable during initialisation, so we always have the path to throw the warning/error at.

And finally, we parse the content of the class with the var_re regex. It returns the list of variables. We’re interested in positions 0, 4 and 5. Position 0 has the whole matched variable definition, which is used to find the line number in the source code with the helper function “line_for_string”. Position 4 contains the name of the variable and position 5 contains its type.
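The storing steps can be sketched as follows. The function name “store_representation” and the (name, type, line) triples are hypothetical; in the script the triples would be built from positions 4 and 5 of each var_re match plus the line found for position 0:

```python
def store_representation(classes, entity, prefix, path, variables):
    """Record one representation; variables is a list of (name, type, line)."""
    # Add the entity if it is not present yet.
    layers = classes.setdefault(entity, {})
    # Add the prefix the same way, keeping the file path under "__path__".
    representation = layers.setdefault(prefix, {"__path__": path})
    for name, var_type, line in variables:
        representation[name] = {"type": var_type.strip(), "line": line}
    return classes


# Hypothetical values, for illustration only.
classes = {}
store_representation(classes, "Webcomic", "Net", "/path/to/NetWebcomic.swift",
                     [("title", "String", 2), ("date", "String ", 3)])
print(sorted(classes["Webcomic"]["Net"]))
```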

The “line_for_string” function is rather trivial: we split the content into lines, iterate over those lines and return index + 1 (Xcode counts lines from 1, not from 0) if there is a match (or None if there is no match).
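A minimal version of that helper might look like this. The parameter names are assumptions, and a “match” is taken to mean that the string occurs somewhere in the line:

```python
def line_for_string(content, string):
    """Return the 1-based number of the first line containing string."""
    for index, line in enumerate(content.splitlines()):
        if string in line:
            return index + 1  # Xcode counts lines from 1
    return None


source = "class NetWebcomic {\n    let title: String\n    var date: String\n}"
print(line_for_string(source, "var date: String"))
```

One consequence of searching the whole file is that, if two classes in the same file contain an identical declaration, the first occurrence wins.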

That sums up the parsing section; let’s move on to the generation of warnings/errors.

Validation and warnings/errors generation

Now that we have all the necessary data prepared in memory, we can validate it and throw a warning/error if there is an inconsistency.

Let me walk through the whole code step by step.

The variable “had_error” is there just to indicate whether at least one inconsistency was found during the check.

We iterate over “classes”, getting the entity name and the dictionary associated with it. In the second loop we generate unique pairs of prefixes with the help of the built-in itertools package. This ensures that each pair of layers is checked only once, so we won’t do the same work twice (and throw the warning/error twice as well).

In the next step, we ensure that the first object (first keys) has at least as many keys as the second. This ensures that an excess variable is going to get spotted.

Then we iterate over the first keys (the key is the variable name), and if the variable name starts with “__” we ignore it (variables in the format “__{name}__” are assumed to be internal variables of the script and are not taken into account). Then we iterate over the second keys and check whether there is a match of names. If there is a match, we set the flag “passed” to True and break the loop. The flag “passed” indicates whether a match was found.

After the loop, if a match wasn’t found, we actually throw the error or warning. The process is quite easy: we just print out the data in the following format: “{path}:{line}: (error|warning): {message}”. Yep, that’s it: Xcode monitors what the script prints to stdout and, if it matches the format, conveniently shows the error or warning in the UI.
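The steps above can be sketched as one function that follows the same order: pair up the prefixes, put the larger representation first, skip internal keys, and report names without a match. The function name “find_inconsistencies”, the message wording and the sample data are my assumptions; the inner loop with the “passed” flag is condensed into a membership test, which gives the same result:

```python
import itertools


def find_inconsistencies(classes, as_error=False):
    """Return Xcode-style diagnostics for properties missing in a layer."""
    kind = "error" if as_error else "warning"
    diagnostics = []
    for entity, layers in classes.items():
        # combinations() yields every pair of prefixes exactly once.
        for first, second in itertools.combinations(sorted(layers), 2):
            # Keep the representation with more keys on the left side.
            if len(layers[first]) < len(layers[second]):
                first, second = second, first
            first_vars, second_vars = layers[first], layers[second]
            for name, info in first_vars.items():
                if name.startswith("__"):  # internal keys such as "__path__"
                    continue
                passed = name in second_vars  # same result as the inner loop
                if not passed:
                    diagnostics.append(
                        "{0}:{1}: {2}: {3}{4} has no counterpart of '{5}'".format(
                            first_vars["__path__"], info["line"], kind,
                            second, entity, name))
    return diagnostics


# Hypothetical parsed data: the App layer lacks the "date" property.
classes = {"Webcomic": {
    "Net": {"__path__": "/path/to/NetWebcomic.swift",
            "title": {"type": "String", "line": 2},
            "date": {"type": "String", "line": 3}},
    "App": {"__path__": "/path/to/AppWebcomic.swift",
            "title": {"type": "String", "line": 2}},
}}

diagnostics = find_inconsistencies(classes)
had_error = bool(diagnostics)
for diagnostic in diagnostics:
    print(diagnostic)
```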

Wrapping up

The last step that we need to perform is to exit the script correctly. It’s a rather optional step; however, doing so will prevent an Xcode error from happening. To exit the script correctly you need to call sys.exit with the appropriate exit code.

The function sys.exit finishes the process with the exit code supplied as an argument (in our case -1 if there was an error and we treat inconsistencies as errors, and 0 otherwise).

The reason for this is that Xcode will throw an error if the “Run Script” phase printed errors but finished with a 0 (normal) exit code.
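In code, the exit logic described above could be sketched like this. The helper names “exit_code” and “finish” are hypothetical; “had_error” and the error flag correspond to the variable and the command-line flag from the earlier sections:

```python
import sys


def exit_code(had_error, treat_as_error):
    """Return -1 only when an inconsistency was reported as an error."""
    return -1 if (had_error and treat_as_error) else 0


def finish(had_error, treat_as_error):
    """End the script with the matching exit code."""
    sys.exit(exit_code(had_error, treat_as_error))
```

The script would then end with a call such as “finish(had_error, args.error)”.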

Calling the script

Now that we have our script ready, we need to call it in the “Run Script” phase. Just create a new standard “Run Script” phase and put the following line in it:

Note that it assumes that the script is called “model_validator.py”, the target is named “python-integration” and the model layers are located in the “models” folder.

The PROJECT_DIR variable contains the path to the project (note: not to the target).

Let’s run it!

Seems to be working just nicely!

Test app

Note that I left out the details of the Swift implementation (like the database, fetching data from the API and so on). I think it is more beneficial to concentrate on the script, as it shows how to parse classes and work with the parsed data. In case you would like to play with the test app yourself and see how it works, check out the GitHub page.

Conclusion

I hope that this article will encourage at least some developers to use the “Run Script” phase feature of Xcode to improve their code and workflow. What I’ve described is just a very small part of what is possible.

I wish you good luck and hope to see you in the next articles!

