Conducto for CI/CD

Easy Error Resolution

Matt Jachowski
Published in Conducto
Apr 21, 2020 · 6 min read

Anyone who has spent time with complex CI/CD pipelines has spent a lot of that time resolving errors in them. Bugs are just a reality when you are trying to implement a complex system. Conducto makes it as easy as possible to resolve the three types of errors we think you are most likely to encounter: flaky errors, specification errors, and errors that require debugging.

We think that our thoughtful approach to error surfacing and handling will save you a ton of time and make you more productive.

Explore our live demo, view the source code, or clone the demo and run it for yourself.

git clone https://github.com/conducto/demo.git
cd demo/cicd
python error_resolution.py --local

Alternatively, download the zip archive here.

Flaky Errors

Sometimes your pipeline has a flaky test that periodically fails for no good reason. You really should fix it, but you do not want it to block you now. You have two options: you can Reset the node to try again, or you can Skip the node to ignore the error and move on.

This is the flaky error example from our demo with the Reset and Skip buttons boxed in yellow.

Reset

If the test passes 80% of the time and fails 20% of the time, and you just want to run it again to give it a chance to pass, click the Reset button in the toolbar to re-run the node. If it passes, then great, your pipeline will continue on.
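To get a feel for how many resets are worth trying, the arithmetic can be sketched in Python. This is only a rough model: it assumes every run is independent and passes with the same fixed probability, which real flaky tests do not always satisfy. The function name is hypothetical and is not part of Conducto.

```python
def chance_of_passing(pass_rate: float, attempts: int) -> float:
    """Probability that at least one of `attempts` independent runs passes."""
    if not 0.0 <= pass_rate <= 1.0:
        raise ValueError("pass_rate must be between 0 and 1")
    if attempts < 0:
        raise ValueError("attempts must not be negative")
    # All attempts fail with probability (1 - pass_rate) ** attempts.
    return 1.0 - (1.0 - pass_rate) ** attempts


# The original run counts as attempt 1; each Reset adds one more attempt.
for attempts in (1, 2, 3):
    print(attempts, round(chance_of_passing(0.8, attempts), 3))
```

Under these assumptions, a single Reset lifts the chance of getting past an 80% test from 80% to 96%. If a node keeps failing well beyond that, the failure is probably not just flakiness.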

After clicking Reset, the node still fails, as seen in the timeline.

Skip

If the test keeps failing even after a few resets, skip the node instead. Select the errored test2 node and click the Skip button in the toolbar to let your pipeline continue to the deploy node. Alternatively, select the errored parent test node and click Skip, which marks all of its subnodes as skipped and likewise lets your pipeline continue to the deploy node.

After skipping the errored test2 node, the pipeline is able to continue to the deploy node.
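The two ways of skipping can be pictured with a toy tree model. This sketch is an illustration only: the `Node` class, its `state` values, and the `has_error` rule are assumptions made for the example, not Conducto's actual node implementation.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Toy stand-in for a pipeline node (hypothetical, not Conducto's class)."""
    name: str
    state: str = "pending"  # one of: pending, done, error, skipped
    children: List["Node"] = field(default_factory=list)

    def skip(self) -> None:
        """Mark this node and every subnode as skipped."""
        self.state = "skipped"
        for child in self.children:
            child.skip()

    def has_error(self) -> bool:
        """A parent is errored if any subnode is; a leaf uses its own state."""
        if self.children:
            return any(child.has_error() for child in self.children)
        return self.state == "error"


test2 = Node("test2", state="error")
test = Node("test", children=[Node("test1", state="done"), test2])

print(test.has_error())  # the errored test2 makes the parent errored
test2.skip()             # skip only the failing leaf
print(test.has_error())  # the parent no longer blocks the deploy node
```

Calling `test.skip()` instead would mark test1 as skipped too, which mirrors the difference between skipping the single errored node and skipping its parent.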

Specification Errors

You are going to make typos or forget things like environment variables when you write a pipeline specification; that is just human. In Conducto, you can quickly fix errors like these: select the errored node, click the Modify button in the toolbar, fix the offending parameter, then click the Reset button to immediately re-run the node.

Note that these fixes are isolated to the live instance of the pipeline, and do not modify anything in the pipeline script. You need to port your fixes to the pipeline script so that future runs do not suffer from the same errors.

Fix an Environment Variable

In the demo, we made a typo in the name of an environment variable. You can fix the error by selecting either the errored env_error node or its specification_error parent node, clicking the Modify button, then correcting the typo: CRATCH_DIR -> SCRATCH_DIR.

Correct the typo, CRATCH_DIR -> SCRATCH_DIR, in the Modify modal.

After clicking Update, you can verify that you see the expected diff in the right hand node pane.

Verify that the change you made is correct by viewing the Execution Parameters diff.

Finally, click Reset and you will see the node complete successfully.
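When you port a fix like this back to your pipeline script, a small pre-flight check can catch the same class of typo before the pipeline runs. The sketch below uses only the Python standard library. The function name and the `/tmp/scratch` value are hypothetical, and this is not a Conducto feature.

```python
import difflib
from typing import Dict, Iterable, List


def find_env_typos(required: Iterable[str], provided: Dict[str, str]) -> List[str]:
    """Return one message per required variable that is missing from `provided`."""
    messages = []
    for name in required:
        if name in provided:
            continue
        # Look for a provided name that is suspiciously close to the missing one.
        guesses = difflib.get_close_matches(name, list(provided), n=1)
        hint = f" (did you mean to rename {guesses[0]}?)" if guesses else ""
        messages.append(f"{name} is not set{hint}")
    return messages


# The demo's typo: the node expects SCRATCH_DIR but the spec sets CRATCH_DIR.
for message in find_env_typos(["SCRATCH_DIR"], {"CRATCH_DIR": "/tmp/scratch"}):
    print(message)
```

Running a check like this at the top of a pipeline script turns a runtime node failure into an immediate, readable message.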

Fix a Command

In the next node, we made a typo in the command. You can fix that error by selecting the errored command_error node, clicking the Modify button, then correcting the typo: lss -> ls.

Correct the typo in the command, lss -> ls, in the Modify modal.

After clicking Update, you can verify that you see the expected diff in the right hand node pane.

Verify that the change you made is correct by viewing the Execution Parameters diff.

Finally, click Reset and you will see the node complete successfully.
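A command typo such as lss can also be caught early with a quick lookup of the program name. This is a minimal sketch under stated assumptions: it only inspects the first word of the command, it does not understand shell builtins or pipelines, and it checks the PATH of the machine where it runs, which may differ from the node's container. The function name is hypothetical.

```python
import shlex
import shutil
from typing import Optional


def missing_executable(command: str) -> Optional[str]:
    """Return the command's program name if it cannot be found on PATH."""
    words = shlex.split(command)
    if not words:
        return None
    program = words[0]
    # shutil.which returns None when no matching executable is found.
    return None if shutil.which(program) else program


for command in ("lss -l", "ls -l"):
    problem = missing_executable(command)
    print(command, "->", f"cannot find {problem!r}" if problem else "looks runnable")
```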

Errors Requiring Debugging

Sometimes you have a real issue that you need to debug. You can use debug mode by clicking the empty bug icon or live debug mode by clicking the lightning bug icon.

You can choose to debug with a snapshot of your code or live debug with your local code mounted directly into your debug container.

Debug Mode

Debug mode gives you a shell in a container with the node’s command and execution environment, including environment variables and a copy of your code. You can immediately reproduce the exact results you see in your pipeline. You can modify command, environment, and code in this container. Any changes are discarded when you exit this shell, so you must manually port your fixes back to your local code.

Live Debug Mode

Live debug mode gives you the same shell as debug mode, but also mounts your local code so that you can edit code outside of the shell with your own editor. The mount works in both directions: any changes you make inside the live debug container persist on your local host even after you exit the shell, so you can commit your fixes to your repo right away.

Debug Example

In this example, you should use live debug mode. Click the lightning bug in the upper right hand corner of the node pane to get a command copied to your clipboard. Paste that command into a local shell. Run the command to immediately reproduce the error reported by the pipeline.

Now, since the live debug container mounts the code from your local filesystem, you can edit and debug using your own editor and debug environment. Test your fix by re-running the command in the live debug container.

A debug container works the same way, but the code is copied into the container and has no connection to your local machine. So, you must edit and debug entirely within the debug shell.
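The difference between the two modes comes down to copy versus mount, which can be mimicked with plain directories. The sketch below is an analogy only: it uses temporary directories rather than containers, and the function names and the `app.py` file are hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def edit_in_snapshot(source_dir: Path, filename: str, new_text: str) -> None:
    """Mimic debug mode: edit a throwaway copy of the code, then discard it."""
    with tempfile.TemporaryDirectory() as scratch:
        snapshot = Path(scratch) / "code"
        shutil.copytree(source_dir, snapshot)
        (snapshot / filename).write_text(new_text)
    # The copy is deleted on exit, so source_dir is unchanged.


def edit_in_mount(source_dir: Path, filename: str, new_text: str) -> None:
    """Mimic live debug mode: edit the original directory directly."""
    (source_dir / filename).write_text(new_text)


with tempfile.TemporaryDirectory() as repo:
    repo_path = Path(repo)
    (repo_path / "app.py").write_text("print('bug')\n")

    edit_in_snapshot(repo_path, "app.py", "print('fixed')\n")
    print("after snapshot edit:", (repo_path / "app.py").read_text().strip())

    edit_in_mount(repo_path, "app.py", "print('fixed')\n")
    print("after mounted edit:", (repo_path / "app.py").read_text().strip())
```

The first edit never reaches the local file, which is why fixes made in debug mode have to be ported back by hand. The second edit lands directly in the local file, as it would in live debug mode.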

A live debug session starts with a command that you paste into a shell. In the debug container you can cat the command, execute it to immediately reproduce the error, and re-run it to test your fix once you have debugged it in your own local editor.

Once you have fixed the code, click Rebuild Image so that the pipeline can see the updated code. Once the image is rebuilt, click Reset to re-run the node and see it complete successfully. As a shortcut, you can click Rebuild and Reset in the upper right-hand corner of the node pane to do both in one step.

Rebuild the image then Reset to re-run the node in one step by clicking Rebuild and Reset, which is conveniently the default button displayed in the yellow box.

You can view the history of each run of a node in the node pane timeline. Select any row in the timeline to see the Command, Execution Parameters, Stdout, and Stderr for that run of the node. Here, we can see the output of the first run that errored, and the second run that was successful.

Toggle between different rows in the timeline to see Command, Execution Parameters, Stdout, and Stderr for different runs of a node.

We are developers who know the pain of re-creating execution environments and debugging in fragile setups. So, we built Conducto to make error resolution as quick and easy as possible. We hope that you will find debugging in Conducto to be a breath of fresh air. Check out Rapid and Painless Debugging to see us applying these techniques to our actual internal CI/CD pipeline.

If you have not yet, get started with Conducto now. Local mode is always free and is limited only by the CPU and memory on your machine. Cloud mode gives you immediate scale. Use the full power of Python to write pipelines with ease. And enjoy easy error resolution.
