From Zero to Apache Airflow Contribution — Part 2

Rafael Bottega
May 10, 2020

This is part 2 of a guide to making your first Apache Airflow contribution. If you haven’t read part 1 yet, I recommend starting there.

In part 2 we will walk through all the steps to find an issue, make changes, test them, create a PR and get it merged into the Apache Airflow project on GitHub 🎉.

Find issues that you can work on

The first step is to find something to fix, update or introduce. As it is your first contribution, I recommend picking something quick and easy, like a documentation update.

So let’s have a look at what is open in the GitHub issues list (here). You can also look at the Jira tickets (here), but for simplicity let’s stick with the GitHub issues for now. Let’s filter the issues list by the label good-first-issue, which marks simple issues to try to solve (here it is for you). Another useful filter is area:docs, for issues about updating documentation.

Nice! I found one that looks simple. It says that the documentation is misleading: the example code matches neither the explanation nor the DAG tree view image.

Let’s investigate! 🤔

I found that the documentation text had already been updated, so I left a comment on the issue pointing to the PRs that fixed what it was complaining about. The tree view image, however, had not been updated, so let’s make a PR to update it.

Create local branch

The first thing to do is create a local branch for the changes. Let’s name it with the number of the issue I am working on plus a short description. From the master branch, run the command below, changing only the branch name:

git checkout -b 8246-concepts-last-run-only-docs-fix

Here comes the hard work: implement the changes needed to resolve the issue. Before pushing them to GitHub, there are a few tests and checks to run.

Build the documentation locally

As this is a documentation fix, the next step is to build the docs and see whether they really look the way they should. It is especially important to check that the headers, code blocks and links are fine. To build the documentation, run the following Breeze command:

./breeze build-docs

This command transforms all the reStructuredText found in the docs folder into HTML files in the docs/_build/html folder. After the command finishes, you can open those files in your browser, starting with the index: docs/_build/html/index.html. OK, the changes in the documentation look good, so the next step is to run the static checks.

Run static checks

The static checks run several tasks, such as adding licenses to all files, fixing ends of files, trimming trailing whitespace, linting, sorting imports in Python files, and many more. The command below does not check every file in the repository, only the changes staged in your branch. Run git status to see which files will be checked, then run the static checks:

./breeze static-check all
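To make "staged changes" concrete, here is a minimal sketch in a throwaway repository, not the Airflow clone. The file names docs/concepts.rst and notes.txt are stand-ins I picked for illustration. The idea is that only files added with git add show up as staged, and those are the ones the check looks at.

```shell
# Throwaway repository so nothing here touches your real Airflow clone.
work="$(mktemp -d)"
git init -q "$work/repo"
cd "$work/repo"
git config user.name "demo"
git config user.email "demo@example.com"
git commit -q --allow-empty -m "initial commit"

# Two changes: one we want checked, one we leave alone (both names are illustrative).
mkdir -p docs
echo "Concepts" > docs/concepts.rst
echo "scratch notes" > notes.txt

# Stage only the documentation file.
git add docs/concepts.rst

# List exactly what is staged; notes.txt is not in this list.
staged_files="$(git diff --cached --name-only)"
echo "$staged_files"

# In the real Airflow clone, this is the point where you would run:
#   ./breeze static-check all
```

In your own branch, git status shows the same information in its "Changes to be committed" section.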

For some checks, like sorting imports, the static check will automatically apply the fix to the files in your branch and alert you that it did so.

Excellent, now you are ready to commit your changes and push them to GitHub 🙌.

Commit and create a draft PR

Before you commit your code and push it to your origin Airflow repository, you need to make sure your branch has the latest version of the master branch. I explained this in part 1, but let’s do it again so we always remember this important step. On your branch, merge the updated upstream master:

git fetch upstream
git merge upstream/master

Merging upstream master before you commit and push is important because it helps avoid Pull Request test failures caused by an outdated branch.
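If you want to see how far behind upstream your branch is, one option is to count the commits with git rev-list. The sketch below simulates this with two throwaway local repositories: "upstream" stands in for the Apache Airflow repository and "fork" for your clone. Those directory names and the commit messages are made up for the demo.

```shell
work="$(mktemp -d)"

# Stand-in for the upstream Airflow repository.
git init -q "$work/upstream"
cd "$work/upstream"
git config user.name "demo"
git config user.email "demo@example.com"
git commit -q --allow-empty -m "initial commit"
git branch -M master

# Stand-in for your local clone, with the remote named "upstream".
git clone -q --origin upstream "$work/upstream" "$work/fork"
cd "$work/fork"
git checkout -q -b 8246-concepts-last-run-only-docs-fix

# Meanwhile, somebody else's change lands on upstream master.
git -C "$work/upstream" commit -q --allow-empty -m "another contributor's change"

# Fetch, then count the commits on upstream/master that the branch lacks.
git fetch -q upstream
behind_before="$(git rev-list --count HEAD..upstream/master)"

# Merge as in the commands above, then count again.
git merge -q upstream/master
behind_after="$(git rev-list --count HEAD..upstream/master)"

echo "behind before merge: $behind_before, after merge: $behind_after"
```

A count of zero after the merge means the branch already contains everything on upstream master.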

Now the branch is ready to be committed and pushed to origin. To create more descriptive commits, you can commit each change separately, with a message explaining what was changed. After that, push to origin and you are ready to create the PR.

git commit -m "update the tree view of dag"
git push origin 8246-concepts-last-run-only-docs-fix

The PR will be created in the upstream repository, so go to the Pull Requests page of Airflow (here). You will see a yellow message offering to open a pull request for your branch. After clicking it, you will need to write a title and a description for your PR. Use a brief and direct title, and follow the rules shown in the description template. Read the Pull Request Guidelines for extra information.

Basically, you are going to describe what this PR is for and reference the issue you are solving. You can link a Pull Request to an issue using keywords (documentation here) like Resolves or Fix, followed by # and the issue number. Tick the checkboxes you agree with; all boxes need to be ticked before the PR can be merged.

Before creating the actual PR, let’s create it as a draft first, for two reasons. First, as soon as the PR is created, GitHub Actions will start testing the changes, so it may raise errors and you will need to add another commit. Second, you may still need to add more changes to this branch. A draft helps the reviewers check your code only when it is ready. Your PR will look like this:

Ready to submit the PR

Great, all the tests are green, which means your code passed all the tests required to be merged. So now let’s change it from a draft into a regular PR by clicking the Ready for review button.

Now the process of merging these changes into the main repository has started, and you are one step away from your first open-source contribution to Apache Airflow. This last step will happen “automagically”, because now one of the repository committers needs to review, approve and merge your PR.

The review may come back with requested changes. If that happens, go back to your branch, apply the requested changes, run the static checks, commit, push, make sure all the GitHub tests pass and wait for another review.

There is not much you can do now; it can take hours or days. One way to request a review is to post your PR in the #development channel of the Apache Airflow community on Slack (here).

Approved and merged

Nice, someone reviewed the changes, approved and merged the Pull Request. Job is done! 🎉

You will see that after the PR is merged, the issue is closed as well. Now you can delete your branch; there is nothing else to do there.
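For reference, deleting the branch means removing it both locally and from origin. Here is a minimal sketch using a throwaway bare repository as a stand-in for your origin, so it is safe to run anywhere. In the demo I use git branch -D (forced), because the scratch branch was never really merged into master. In a real clone, git branch -d is the safer first try.

```shell
work="$(mktemp -d)"
branch="8246-concepts-last-run-only-docs-fix"

# Throwaway "origin" and local clone (stand-ins for your fork on GitHub and your laptop).
git init -q --bare "$work/origin.git"
git init -q "$work/clone"
cd "$work/clone"
git config user.name "demo"
git config user.email "demo@example.com"
git remote add origin "$work/origin.git"
git commit -q --allow-empty -m "initial commit"
git branch -M master
git push -q origin master

# Recreate the situation from this article: a pushed feature branch.
git checkout -q -b "$branch"
git commit -q --allow-empty -m "update the tree view of dag"
git push -q origin "$branch"

# Clean up: leave the branch, delete it locally, then delete it on origin.
git checkout -q master
git branch -D "$branch"
git push -q origin --delete "$branch"

# Both lookups should now come back empty.
local_left="$(git branch --list "$branch")"
remote_left="$(git ls-remote --heads origin "$branch")"
```

GitHub also offers a button on the merged PR page to delete the branch on origin, which covers the remote half of this cleanup.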

Here we conclude our quest. Over these two parts, we learned how to configure our MacBook to be ready to make a change to the Apache Airflow repository, how to find an issue to solve, and finally how to submit that change so it can be included in the next Airflow version.

Congrats! Now go and add this to your LinkedIn 😉, it will look great there. I hope to see you contributing to this and other repositories soon.

Bye 👋
