Optimize process simulation with Python-Aspen integration

How to boost R&D experimentation by integrating Aspen Plus with Python to automate, calibrate and optimize process simulations with a data-driven approach

Published in Eni digiTALKS, Jul 27, 2022 (8 min read)

A digital twin of a power plant. Image credit: GE Power

Being a data scientist and having to tackle problems with very little data available seems like a paradox.

However, especially in research areas, the experimental database is initially limited, and in the industrial world we generally rely on process simulators, which need to be accurately calibrated to replicate what happens experimentally.

In this article we show how a Data Scientist can integrate a process simulator (specifically Aspen Plus) with Python in order to:

Accurately calibrate the process simulation
Leverage process simulation as a black-box model and implement optimization algorithms to tune the process parameters involved
Generate synthetic data to develop fully data-driven models

Process Simulation

Since its beginnings in the 1970s, process simulation has undergone considerable development. Today, it is possible to model and simulate very extensive processes or even process networks with complex behaviour of substances accurately in steady-state and dynamic mode. This includes not only conventional chemical processes, but also many special processes, for example from bio- or polymer technology. In the chemical industry, process simulations support the entire life cycle of a chemical process from development, design and construction to optimization of operation. (Dechema, 2021)

Applications include design, debottlenecking, engineering studies, design audits, control system check-out, process and dynamic simulation, operator training simulators, pipeline management systems, production management systems and digital twins.

Examples of software used to simulate the material and energy balances of chemical process plants are Aveva PRO/II, Schlumberger OLGA and the Aspen Suite.

The latter includes Aspen Plus, one of the most widely used process simulators and the one we explore in this article.

Figure 1 — Example of Process Simulation in Aspen Plus

There are three main limitations of this tool:

Slowness: the more complex the process you want to simulate, the longer it takes to produce the results
Memory/Storage usage: while running a simulation, Aspen Plus takes up a significant percentage of the available RAM and produces a large number of temporary files
Customization: it is very difficult, if not impossible, to implement specific add-ons that extend the functionality of the software

What advantages could a Data Scientist gain by using Aspen Plus as a black-box model?

Run several simulations in parallel, reducing the time it takes to complete a submitted task
Implement different types of algorithms (e.g., optimizers) to better calibrate simulations or optimize process parameters for a future plant design

Is it possible? Yes, it is! Let’s see how.

Control Aspen from Python

Run a Simulation Process

The PyWin32 library makes the features of the Win32 application programming interface available in Python. The library can be installed via pip:

pip install pywin32

This library gives easy access to the Windows Component Object Model (COM), which makes it possible to control Microsoft applications from Python.

For example, it is possible to open Excel through this script:

import win32com.client as win32

excel = win32.gencache.EnsureDispatch('Excel.Application')
excel.Visible = True
_ = input("Press ENTER to quit:")
excel.Application.Quit()

To open Aspen Plus and run a specific simulation file, we can simply:

Replace Excel.Application with Apwn.Document as the EnsureDispatch argument in the previous snippet
Load a specific simulation with the InitFromArchive2 function, which only needs the simulation file path as its argument
Run the simulation with the Engine.Run2 function

The Python snippet looks like this:

import win32com.client as win32

# path: full path of the *.apw or *.bkp simulation file
aspen = win32.gencache.EnsureDispatch('Apwn.Document')
aspen.InitFromArchive2(path)
aspen.Engine.Run2()

Note that this script has been tested with Aspen Plus versions V10 to V12, loading *.apw or *.bkp simulation files.

Manage input/output of a Simulation Process using the Variable Explorer

Ok, we are now able to run a specific Aspen Plus simulation file from Python!

To use it as a black-box model we need two more things:

Fully control the inputs of a simulation; for instance, we may need to change the reboiler temperature, the number of stages of a distillation column and so on.
Retrieve the results of a simulation run to evaluate the performance of a specific input configuration

Aspen Plus includes, under the Customize tab, the Variable Explorer that displays a tree-view, like the classic File Explorer in Windows.

Each user input is represented as a variable (or leaf) within the tree, and includes extra auxiliary information (e.g., prompts, labels, help-tips and fields) that are not always visible.

In the example below we can see that, following the path \Data\Blocks\FLASH\Input\TEMP, the temperature of a flash is set to 150 °F.

The tree-structure of a simulation file can be easily accessed using the following instruction:

self.aspen.Tree.FindNode(tree_path).Value

Here tree_path is the path of the specific object inside the tree (e.g., \Data\Blocks\FLASH\Input\TEMP). The same node can also be written in dot notation with the Application.Tree prefix.

For instance, for the flash temperature mentioned above we use Application.Tree.Data.Blocks.Flash.Input.Temp.
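A practical detail when writing tree paths in Python is that the backslashes must stay literal, which a raw string takes care of. The sketch below is not part of the original code: to_tree_path and read_value are hypothetical helper names, the conversion assumes that node names contain no dots, and read_value assumes that FindNode returns None for a path that does not exist. A small stand-in object replaces Aspen Plus so that the example runs anywhere.

```python
from types import SimpleNamespace

# Raw string: the backslashes are kept as they are.
FLASH_TEMP = r"\Data\Blocks\FLASH\Input\TEMP"


def to_tree_path(dotted):
    """Convert the dot notation into the backslash path used by FindNode."""
    prefix = "Application.Tree."
    if not dotted.startswith(prefix):
        raise ValueError(f"expected a path starting with {prefix!r}")
    # Assumption: no node name contains a dot.
    return "\\" + dotted[len(prefix):].replace(".", "\\")


def read_value(aspen, tree_path):
    """Return the value stored at tree_path, with a clear error if missing."""
    node = aspen.Tree.FindNode(tree_path)
    if node is None:  # assumption: a missing node comes back as None
        raise KeyError(f"no node at {tree_path!r}")
    return node.Value


# Stand-in for the COM object so the sketch runs without Aspen Plus.
# With the real application, `aspen` would come from EnsureDispatch.
fake_nodes = {FLASH_TEMP: SimpleNamespace(Value=150.0)}
fake_aspen = SimpleNamespace(Tree=SimpleNamespace(FindNode=fake_nodes.get))

path = to_tree_path("Application.Tree.Data.Blocks.FLASH.Input.TEMP")
print(path, read_value(fake_aspen, path))
```

Failing early with the offending path is usually easier to debug than an attribute error raised later on a missing node.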

To set a variable inside the Variable Explorer to a specific value, we need to perform the following operations:

Get a reference to the specific tree node
Get the node measurement unit (a numerical code; a mapping table gives the code for each unit of measurement)
Get the node basis (Mole, Mass, …)
Use the SetValueUnitAndBasis function to set the new value, according to the specific node measurement unit and basis.

# access the specific node
node = self.aspen.Tree.FindNode(tree_path)
# get the measurement unit of the node (codified)
node_unit = node.AttributeValue(ATTRNAME_MAP['Unit'])
# get the node basis (Mole, Mass, …)
node_basis = node.AttributeValue(ATTRNAME_MAP['Basis'])
node.SetValueUnitAndBasis(Value=value, unitcol=node_unit, basis=node_basis)

Here we go!

Now we can set specific inputs in a simulation file once it is opened inside Aspen Plus, run the simulation, and get the outputs.
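Put together, these three steps turn the simulation into a function from inputs to outputs. The following is a minimal sketch rather than the implementation used in this project: evaluate is a hypothetical name, the output path is made up for illustration, and inputs are set by plain assignment to Value on the assumption that this keeps the unit and basis already stored in the node (SetValueUnitAndBasis, shown above, could be used instead). FakeAspen is a stand-in with a made-up formula so that the example runs without Aspen Plus.

```python
from types import SimpleNamespace

IN_TEMP = r"\Data\Blocks\FLASH\Input\TEMP"
OUT_RESULT = r"\Data\Streams\VAPOR\Output\EXAMPLE"  # hypothetical output path


def evaluate(aspen, inputs, outputs):
    """Set the inputs, run the simulation and return the requested outputs.

    inputs:  dict mapping tree path -> new value
    outputs: dict mapping result name -> tree path
    """
    for path, value in inputs.items():
        # Assumption: assignment keeps the node's current unit and basis.
        aspen.Tree.FindNode(path).Value = value
    aspen.Engine.Run2()
    return {name: aspen.Tree.FindNode(path).Value
            for name, path in outputs.items()}


class FakeAspen:
    """Stand-in for the COM object; its 'run' applies a made-up formula."""

    def __init__(self):
        self._nodes = {
            IN_TEMP: SimpleNamespace(Value=150.0),
            OUT_RESULT: SimpleNamespace(Value=None),
        }
        self.Tree = SimpleNamespace(FindNode=self._nodes.get)
        self.Engine = SimpleNamespace(Run2=self._run)

    def _run(self):
        self._nodes[OUT_RESULT].Value = 0.01 * self._nodes[IN_TEMP].Value


print(evaluate(FakeAspen(), {IN_TEMP: 200.0}, {"result": OUT_RESULT}))
```

With the real application, the first argument would be the object returned by EnsureDispatch after the simulation file has been loaded.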

Python-Aspen integration: what can we do?

Let’s go back to data science stuff: what can we do with this powerful Python access to Aspen Plus?

Figure 4 — Parallel Computing (https://www.verifyrecruitment.com/blog/the-future-of-parallel-computing/)

Automation

First of all, we can automate the execution of multiple simulations. For example, if we want to perform a sensitivity analysis on a specific variable, we can launch one simulation after another, each with a different value of that input, and collect the results (in terms of a specific output variable).

What is the advantage of this Python-Aspen integration over the sensitivity analysis feature included in Aspen Plus? We can split the runs across several processes, so that different Aspen simulations run on different cores!

This can be done using the subprocess Python library; see its documentation for details.
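A minimal sketch of this idea with the subprocess module is shown below. It is an illustration under stated assumptions, not the code used in this project: the worker is a stand-in that applies a toy formula instead of opening Aspen Plus, and run_sweep, split_evenly and WORKER_CODE are hypothetical names. Each worker receives its share of the input values as JSON and prints its results as JSON.

```python
import json
import subprocess
import sys

# Stand-in worker. In a real setup this script would open Aspen Plus,
# set the input, run the simulation and read the output variable.
WORKER_CODE = """
import json, sys
values = json.loads(sys.argv[1])
results = [{"input": v, "output": (v - 3.0) ** 2} for v in values]
print(json.dumps(results))
"""


def split_evenly(values, n_chunks):
    """Split a list into n_chunks interleaved sub-lists of similar size."""
    return [values[i::n_chunks] for i in range(n_chunks)]


def run_sweep(values, n_workers=2):
    """Evaluate every input value, using one OS process per chunk."""
    chunks = [c for c in split_evenly(list(values), n_workers) if c]
    # Start all workers before waiting, so that they run concurrently.
    procs = [
        subprocess.Popen(
            [sys.executable, "-c", WORKER_CODE, json.dumps(chunk)],
            stdout=subprocess.PIPE,
            text=True,
        )
        for chunk in chunks
    ]
    results = []
    for proc in procs:
        out, _ = proc.communicate()
        if proc.returncode != 0:
            raise RuntimeError(f"worker failed with code {proc.returncode}")
        results.extend(json.loads(out))
    # Restore the order of the input values.
    return sorted(results, key=lambda row: row["input"])


if __name__ == "__main__":
    for row in run_sweep([1.0, 2.0, 3.0, 4.0], n_workers=2):
        print(row)
```

Separate processes are a natural fit here because each worker would drive its own Aspen Plus instance; the number of workers is then limited by the available cores and RAM.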

Calibration

Let’s imagine that we want to simulate a new technology that is in the experimental stage.

A common problem is to calibrate the simulation so that it replicates the experimental process as closely as possible; for example, we may have to calibrate the value of the equilibrium constant of a particular reaction.

How can we do this in an automatic way?

If we have a subset of experimental tests, we can combine the Aspen simulation, used as a black-box model, with an optimization algorithm to find the best value of this specific constant.

We must perform the following actions:

Retrieve the equilibrium constant inside the Aspen variable tree (it could be composed of different parameters to be varied)
Determine the search space of the constant (or any specific parameter)
Choose the best optimization technique based on the type of problem and the type of data at hand (we suggest testing Particle Swarm Optimization, PSO, when the dataset is small)
Identify the output variable that can be used to compare the experimental results with the simulation outputs
Define a tolerance threshold to “accept” the calibration result
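The steps above can be sketched as follows. This is a minimal illustration, not the calibration code used in this project: simulate is a toy stand-in for an Aspen run, the "experimental" points are synthetic and generated from a known constant (TRUE_K) purely so that the result can be checked, and pso_minimize is a deliberately small one-dimensional PSO with commonly used coefficient values chosen here as an assumption.

```python
import random

TRUE_K = 2.5  # used only to generate the synthetic "experiments" below


def simulate(k_eq, feed):
    """Toy stand-in for an Aspen run: amount of product at equilibrium."""
    return feed * k_eq / (1.0 + k_eq)


# Synthetic experimental tests: (feed, measured output) pairs.
EXPERIMENTS = [(feed, simulate(TRUE_K, feed)) for feed in (1.0, 2.0, 3.0)]


def calibration_error(k_eq):
    """Sum of squared differences between simulation and experiments."""
    return sum((simulate(k_eq, feed) - measured) ** 2
               for feed, measured in EXPERIMENTS)


def pso_minimize(objective, lower, upper, n_particles=20, n_iter=100, seed=0):
    """Minimal one-dimensional particle swarm optimizer on [lower, upper]."""
    rng = random.Random(seed)
    inertia, c_personal, c_global = 0.7, 1.5, 1.5
    pos = [rng.uniform(lower, upper) for _ in range(n_particles)]
    vel = [0.0] * n_particles
    best_pos = pos[:]                           # best position per particle
    best_val = [objective(p) for p in pos]
    g_idx = min(range(n_particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g_idx], best_val[g_idx]  # best of the swarm
    for _ in range(n_iter):
        for i in range(n_particles):
            r1, r2 = rng.random(), rng.random()
            vel[i] = (inertia * vel[i]
                      + c_personal * r1 * (best_pos[i] - pos[i])
                      + c_global * r2 * (g_pos - pos[i]))
            # Keep the particle inside the search space.
            pos[i] = min(max(pos[i] + vel[i], lower), upper)
            val = objective(pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i], val
                if val < g_val:
                    g_pos, g_val = pos[i], val
    return g_pos, g_val


TOLERANCE = 1e-6  # acceptance threshold on the calibration error
k_hat, error = pso_minimize(calibration_error, lower=0.1, upper=10.0)
print(f"k_eq = {k_hat:.4f}, error = {error:.2e}, accepted = {error < TOLERANCE}")
```

In a real calibration each call to the objective is a full simulation run, so the number of particles and iterations directly sets the computing time.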

Once the simulation is well calibrated, the automation process described above can be used to generate a synthetic dataset of simulated data, obtained by varying the input parameters within their search spaces.

The more experimental data available to calibrate the simulation, the higher the quality of the synthetic dataset generated. If the calibration is of very good quality, the generated dataset could become the training dataset for a “fully data-driven” model.
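As an illustration of how such a dataset could be produced, the sketch below evaluates a black-box function on a regular grid over the search spaces and writes the rows as CSV. It rests on assumptions: simulate is a toy stand-in with a made-up formula in place of a calibrated Aspen run, the input names and bounds are invented, and a full grid is only one possible sampling strategy.

```python
import csv
import io
import itertools


def simulate(temperature, pressure):
    """Toy stand-in for a calibrated Aspen run (made-up formula)."""
    return 0.002 * temperature - 0.05 * pressure + 0.5


def linspace(lower, upper, n):
    """Return n evenly spaced values from lower to upper, both included."""
    step = (upper - lower) / (n - 1)
    return [lower + i * step for i in range(n)]


def generate_dataset(search_spaces, points_per_input=5):
    """Evaluate the black box on a full grid over the search spaces."""
    names = list(search_spaces)
    grids = [linspace(lo, hi, points_per_input)
             for lo, hi in search_spaces.values()]
    rows = []
    for combo in itertools.product(*grids):
        row = dict(zip(names, combo))
        row["output"] = simulate(**row)  # inputs only, output added after
        rows.append(row)
    return rows


def to_csv(rows):
    """Serialize the rows as CSV text, one column per input plus the output."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0]))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical search spaces: name -> (lower bound, upper bound).
spaces = {"temperature": (100.0, 200.0), "pressure": (1.0, 5.0)}
dataset = generate_dataset(spaces, points_per_input=3)
print(to_csv(dataset))
```

The grid size grows exponentially with the number of inputs, so random or space-filling sampling may be preferable when many parameters are varied.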

Optimization

Similarly, we can modify the previous Calibration procedure to optimize any input parameter by specifying a certain objective function:

Retrieve the parameters to be varied inside the Aspen variable tree
Determine the search space of the parameters
Choose the best optimization technique based on the type of problem and the type of data at hand
Identify the objective function to be maximized/minimized
Define a tolerance threshold to “accept” the optimization result
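A small sketch of this procedure for a case with one continuous and one integer parameter is given below. It is an illustration only: simulate and its formulas are made up, the parameter names and bounds are hypothetical, and plain random search is used here as a simple placeholder for whichever optimization technique suits the problem.

```python
import random


def simulate(reflux_ratio, n_stages):
    """Toy stand-in for an Aspen run: returns (purity, cost), made up."""
    purity = 1.0 - 1.0 / (1.0 + 0.3 * reflux_ratio * n_stages)
    cost = 0.02 * n_stages + 0.05 * reflux_ratio
    return purity, cost


def objective(params):
    """Objective to maximize: purity minus a cost penalty."""
    purity, cost = simulate(**params)
    return purity - cost


def sample(search_space, rng):
    """Draw one candidate; integer bounds give integer parameters."""
    point = {}
    for name, (lo, hi) in search_space.items():
        if isinstance(lo, int) and isinstance(hi, int):
            point[name] = rng.randint(lo, hi)   # e.g. number of stages
        else:
            point[name] = rng.uniform(lo, hi)   # continuous parameter
    return point


def random_search(search_space, n_trials=500, seed=0):
    """Keep the best of n_trials randomly drawn candidates."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        candidate = sample(search_space, rng)
        score = objective(candidate)
        if score > best_score:
            best_params, best_score = candidate, score
    return best_params, best_score


# Hypothetical search space: name -> (lower bound, upper bound).
SPACE = {"reflux_ratio": (1.0, 5.0), "n_stages": (5, 40)}
best_params, best_score = random_search(SPACE)
print(best_params, round(best_score, 4))
```

Only the objective and the search space change with respect to the calibration case; the black-box call and the loop around it stay the same.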

Benefits and future developments

The Python-Aspen integration can be very useful in enterprise contexts (such as R&D) where the lack of available data prevents data scientists from directly developing classic data-driven models.

Moreover, because it relies only on the Aspen Plus simulation tree structure, the proposed solution is not technology-specific: it is context-independent and can therefore be applied in any business area that uses Aspen Plus as its simulation software.

The ability to quickly calibrate process simulations allows the data scientist to generate a synthetic dataset and perform several analyses (e.g., predictive models, root-cause analysis).

On the other hand, the business process modeler would benefit from having an “enhanced process simulator” at their disposal.
What’s more, by installing it on a remote virtual machine and building a frontend capable of receiving tasks from the user, running simulations and reporting the results, the user would also avoid running process simulations on a personal PC. This is not a minor point: Aspen Plus users know that memory consumption and the generation of temporary files can make their device almost unusable.

To complete this Python-Aspen integration, compatibility with all versions of Aspen Plus should first be tested (as mentioned above, it has so far been tested only with V10 to V12).

It should also be verified whether models created with other software in the Aspen suite that is commonly used in industry, such as Hysys and Aspen Custom Modeler (ACM), can be run in the same way.

References

Dechema. (2021). Process Simulation — Fit for the future. ProcessNet, 2–3.
Mavenvale. (2021). Chemical Process Modeling Software.