Introducing Tailor - an engineering automation toolbox
Tailor is a workflow system developed as an automation toolbox for engineers. Its core capability is running heavy engineering tools while storing data in a way that keeps it accessible to the engineer.
As engineers, we apply Python scripting at an increasing rate. Python and its large ecosystem of open-source, high-quality scientific libraries have made it easy to start writing scripts for automation, customized calculations, number crunching, plotting and automated report generation. This enables faster iterations and more computations for each project.
However, the administration of scripts, files and servers was, at least in our case, chaos! Results for each iteration ended up in a new folder, typically named newest2 and old_v1. We had all these powerful scripts lying around, but the knowledge of how to run them only existed in the heads of a few dedicated engineers. To distribute a job to servers for crunching we had to manually move files and scripts.
A battle-hardened system
Tailor is developed by the engineers at Entail. Our background is in the design and optimization of large construction projects, such as bridges and floating oil rigs. Even though every project is unique, the tasks involved in solving the engineering problems are often similar. With frequent design iterations, we needed a system to organize the tasks into automated workflows.
Why not use an existing tool? For an engineering tool to work it must be lean: easy to set up and use, with a powerful Python API. It must also run on Windows to support typical commercial engineering software. Apache Airflow is an example of a modern workflow/pipeline tool built around Linux. With Windows Subsystem for Linux you can in principle make Airflow work, but for most engineers this workaround is too cumbersome and in practice not an option.
So we made a workflow system for engineers. By engineers. A fully managed service that takes on the heavy burden of running a workflow system for you. Tailor orchestrates your engineering workflow with a simple workflow scripting language, controlling how each task in the workflow is executed and which inputs and outputs it processes. Add automatic piping of files and data, and you can quickly automate your work. Iterations now run like a breeze.
Some core concepts
To explain how the system works, we will now define a few core concepts and building blocks.
DAG: A directed acyclic graph (DAG) is the blueprint for how computations in a workflow are performed, i.e., it defines the order of, and relationships between, the tasks. The figure below visualizes a DAG.
Task: A typical engineering workflow consists of several steps. In Tailor, each step is defined as a task. A task executes a function.
Workflow: A workflow is the execution of a DAG with defined inputs and files.
Worker: A worker asks the workflow database for tasks it can execute and does the actual computation. It can run either on your local machine or on a (cloud) server.
The magic: All the nitty-gritty details of task distribution, parallelization, data piping, file piping and storage are handled by our workflow service so that you don’t have to.
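As a conceptual illustration of how a DAG, its tasks and a worker fit together, here is a minimal plain-Python sketch. The names and data structures are hypothetical, not the pyTailor API: a DAG is executed by repeatedly picking up tasks whose upstream dependencies have all finished, which is exactly the question a worker asks the workflow database.

```python
from collections import deque

def run_dag(tasks, deps):
    """Execute tasks in an order that respects the DAG's edges.

    tasks: dict mapping task name -> callable
    deps:  dict mapping task name -> list of upstream task names
    """
    # Count unfinished upstream tasks for each task.
    indegree = {name: len(deps.get(name, [])) for name in tasks}
    downstream = {name: [] for name in tasks}
    for name, parents in deps.items():
        for parent in parents:
            downstream[parent].append(name)

    # Tasks with no dependencies are immediately ready.
    ready = deque(name for name, d in indegree.items() if d == 0)
    order = []
    while ready:
        # In Tailor, a worker would pick up a ready task here and run it.
        name = ready.popleft()
        tasks[name]()
        order.append(name)
        for child in downstream[name]:
            indegree[child] -= 1
            if indegree[child] == 0:
                ready.append(child)
    if len(order) != len(tasks):
        raise ValueError("cycle detected: not a DAG")
    return order
```

Because only the dependency counts matter, independent tasks become ready at the same time, which is what makes parallel execution across several workers possible.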
A typical engineering problem
Let’s have a look at how it works.
Say you’re an engineer working on space exploration. You are involved in the design of a rocket landing barge that should give decent operability in waves anywhere in the world. In practical terms, the less the barge moves in waves, the easier it is to land the rocket on it. To calculate the dynamic response of the barge, you have developed a set of Python functions that perform your calculations in four steps:
- Create a geometrical model of a barge based on the width, assuming constant length and volume. The geometry parametrization is done through Rhino’s Python API.
- Calculate hydrodynamic properties. The wave interaction solver WAMIT is executed for each Rhino geometry.
- Run simulations to calculate the motion of the rocket during and after landing. The time domain simulations are performed through the OrcaFlex Python API with hydrodynamic data from WAMIT.
- Post-process the results to find worldwide operability. Worldwide wave statistics are applied to the OrcaFlex results.
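The four steps above can be sketched as an ordinary Python pipeline. All function names and formulas here are hypothetical, toy stand-ins for the Rhino, WAMIT and OrcaFlex calls; they only illustrate how the outputs of one step feed the next:

```python
# Hypothetical stand-ins for the Rhino / WAMIT / OrcaFlex steps.

def create_geometry(width, volume=1000.0, length=80.0):
    """Step 1: parameterize a barge of constant length and volume."""
    draft = volume / (length * width)  # simple box approximation
    return {"width": width, "length": length, "draft": draft}

def calc_hydrodynamics(geometry):
    """Step 2: stand-in for the wave-interaction (WAMIT) run."""
    # Toy model: added mass grows with waterplane width squared.
    return {"added_mass": geometry["width"] ** 2 * geometry["length"] / 100.0}

def simulate_landing(hydro, wave_height):
    """Step 3: stand-in for an OrcaFlex time-domain simulation."""
    # Toy response: motion grows with wave height, shrinks with added mass.
    return wave_height / (1.0 + hydro["added_mass"] / 1000.0)

def operability(motions, limit=0.5):
    """Step 4: fraction of wave conditions in which landing is feasible."""
    ok = sum(1 for m in motions if m <= limit)
    return ok / len(motions)

def run_barge(width, wave_heights):
    geom = create_geometry(width)
    hydro = calc_hydrodynamics(geom)
    motions = [simulate_landing(hydro, h) for h in wave_heights]
    return operability(motions)
```

In Tailor, each of these functions would become a task in the DAG, and the dictionaries passed between them would be piped automatically as workflow data.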
For each barge, the rocket landing was simulated in 128 different wave conditions. Without a workflow system, processing this many simulations is time-consuming, and storing the results according to company procedures is cumbersome. With Tailor, it does not matter whether you run 128 or 1024 simulations: the system handles the hard work and you can focus on the physics. The workflow is now organized; inputs, outputs and files are all accessible through the Python client (pyTailor) and the Tailor Web App. The figure below shows a graphical representation of the workflow (limited to two barges and two wave conditions for visualization purposes).
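Fanning out over barges and wave conditions is embarrassingly parallel work. Tailor's workers and data piping handle this distribution for you; the underlying pattern, though, can be sketched with the standard library alone (the simulate function below is a hypothetical stand-in for one landing simulation):

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product

def simulate(case):
    """Hypothetical stand-in for one OrcaFlex landing simulation."""
    barge_width, wave_height = case
    # Toy response: wider barge -> smaller motion.
    return {"width": barge_width, "wave": wave_height,
            "motion": wave_height / barge_width}

def run_all(widths, wave_heights, max_workers=8):
    """Run every (barge, wave condition) combination in parallel."""
    cases = list(product(widths, wave_heights))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(simulate, cases))
```

With 8 barge widths and 128 wave conditions, `run_all` would produce 1024 result records; a workflow system additionally persists and indexes each of them, which is the part this sketch leaves out.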
You now have access to an indexed data set, easily accessible on the workflow object in Python or through the web interface. This means you can extract data for further post-processing, share the data in a consistent way with other parties, and store it for future use.
Your workflow is complete. Let’s look at some results. We were interested in the loss of operability of the barge as a landing platform, i.e., the share of time during which barge motion in waves prohibits safe landing of the rocket. The figure below shows this loss of operability for two different barge geometries in various ocean environments. As it turns out, the key parameter governing the operability was the width of the barge.
How was this achieved?
Workflow programming in Tailor is not rocket science! With the Python client pyTailor, you can use PythonTasks to call Python functions, use BranchTasks to achieve parallelization, and then use a DAG to define how these tasks relate to each other:
The Inputs, Outputs and Files objects are helper objects for parameterization. When we write kwargs=inputs.barge, we specify that the keyword arguments are parameterized and shall be looked up under the ’barge’ name in the workflow’s inputs when the task is executed. The concept of parameterization becomes clearer when we see how inputs, outputs and files are defined when we run a Workflow below.
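Conceptually, this kind of name-based lookup is deferred binding: the task stores only a name, and the actual values are resolved from the workflow's inputs at execution time. A minimal sketch with hypothetical classes (not pyTailor's actual implementation):

```python
# Hypothetical sketch of name-based parameterization: the task stores the
# *name* of its keyword arguments, and resolution happens at execution time
# against the inputs of the workflow run.

class Task:
    def __init__(self, function, kwargs_name):
        self.function = function
        self.kwargs_name = kwargs_name  # e.g. "barge"

    def execute(self, workflow_inputs):
        # Look up the keyword arguments under the stored name.
        kwargs = workflow_inputs[self.kwargs_name]
        return self.function(**kwargs)

def build_barge(width, length):
    """Hypothetical task function."""
    return f"barge {width} x {length}"

# Plays the role of kwargs=inputs.barge: nothing is bound yet.
task = Task(build_barge, "barge")
```

Because the binding is deferred, the same task definition can be reused across workflow runs with different inputs, which is what makes the DAG a reusable blueprint.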
The DAG defined in the code is visualized below during a workflow run with five barge inputs.
We now have a parameterized DAG, the blueprint for running the workflow. With this DAG, we can now run a Workflow with specified inputs and files.
Once the workflow has started, it can be monitored from the Tailor Web App. The figure below shows how the workflow can be found in the list of workflows. When the workflow is selected, the workflow files appear on the right side for inspection and direct download.
By clicking on the Details link, the definitions, inputs, outputs and files can be inspected for the workflow and for each task. If a task fails for any reason, the error message is shown; the task can then be reset, its inputs changed, and the task run again.
But there’s more…
The example above illustrates some of the benefits for you as an engineer to automate your daily work. But there’s more to Tailor than meets the eye.
You don’t even have to code: The workflow can be made available in the Tailor Web App, so that colleagues who do not use Python can adjust the inputs and files and run a new workflow with the same DAG.
Collaborate: The workflow and all associated data are directly available to all approved members of your project, so you can collaborate directly on the data set, with everyone having simultaneous access to the workflow. If you later revise the barge design and rerun the dynamic-response workflow, your colleagues can simply point to your new workflow instead of the old one and reuse their processing methods. You achieve true collaboration on the engineering process without spending time and resources on data management.
Reuse and revisit: Your workflow is stored in a database and your inputs, definitions and results are all accessible. Use this as a good starting point for the next project, or as a basis for evaluating if your design iterations take you in the right direction. Or relish that all the required data is stored for you to later train that machine learning algorithm you’re developing in your spare time.