Simulation Optimization Representing A Mixture of Methods

Simulation Optimization — Operational Methods

Includes a playground of scientific code in C++ and Python for performance optimization and code demonstration

Aswin Vijayakumar.
Analytics Vidhya

--

Introduction

This article collects code samples from various sources, such as Intel MKL and the SceneNet deep learning framework, that use numerical and stochastic methods of optimization. The C++ code repositories presented here contain sample code whose performance can be enhanced using OpenMP (OMP) or OpenMPI (MPI).

Instructions for running code using OMP:

The repository is built to execute with OMP in order to make use of parallelization and vectorization techniques.

cmake .. && make all
./app --intervals 4 --iterations 8 --condition_factor 2 --multiplier 1000000

First, install cmake on Linux:

sudo apt-get install cmake

or clone the repository and check out the relevant branch:

git clone https://github.com/aswinvk28/cpp-performance-math-problems cpp_math_problems
cd cpp_math_problems
git checkout navier-stokes-autodiff

Instructions for running code using MPI:

Please amend the code in the repository to execute it using the MPI programming model, then launch the command-line app with mpiexec. To execute the command-line app, run the shell script given in the repository.

./run_project.sh

mpiexec -n 4 ./app --intervals 4 --iterations 8 --condition_factor 2 --multiplier 1000000
1. You need to create an implementation function:
double * navier_stokes(double * u2, double u0, const double dt, const double dx, const double p, const double alpha, int length, double * model)

2. Against the reference implementation function:

double * navier_stokes_ref(double * u1, double u0, const double dt, const double dx, const double p, const double alpha, int length, double * model)

The implementation function executes at each interval of the given vector of values, just like the reference implementation. The reference implementation is, in turn, compared against a hand-written model.
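
Below is a minimal Python sketch of this interval-wise comparison. The repository's actual functions are C++; the function bodies, parameter values, and damping arithmetic here are illustrative assumptions, not the repository's code.

import numpy as np

def navier_stokes(u2, u0, dt, dx, p, alpha, model):
    # hypothetical per-cell update: linear damping against the model
    return p * u2 * (1.0 - alpha * dt / dx) + (1.0 - p) * u0 * model

def navier_stokes_ref(u1, u0, dt, dx, p, alpha, model):
    # reference variant: the same arithmetic in a different association order
    return (p * u1) * (1.0 - alpha * dt / dx) + u0 * model * (1.0 - p)

length, intervals = 100, 4
u = np.linspace(0.2, 0.1, length)      # units-per-cell vector
model = np.ones(length)               # hand-written model placeholder
step = length // intervals
for lo in range(0, length, step):
    out = navier_stokes(u[lo:lo+step], 0.2, 1e-3, 1e-6, 0.5, 1e-4, model[lo:lo+step])
    ref = navier_stokes_ref(u[lo:lo+step], 0.2, 1e-3, 1e-6, 0.5, 1e-4, model[lo:lo+step])
    print(lo, np.max(np.abs(out - ref)))   # per-interval agreement check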

Initial Hypothesis:

The combinatorial equation of the linearly estimated model is:

Initial Hypothesis

A linear damping factor is applied to the original units-per-cell vector so that it satisfies our estimated model. This encodes our assumption that the observed units per cell satisfy the estimated model.

Preliminary Analysis:

In the preliminary analysis, we take a macroscopic approach, which leads us to the optimum curve for the units per cell. We need to represent that function using a power series formula, as posed by the problem below:

In a power series equation, the accuracy score is the mean squared error, as in polynomial regression, measured using error vectors dropped from the original data points to the predicted curve. In a Support Vector Machine (SVM), these are either functional margins or geometric margins. Since we assume that each point in a dispersion relation transfers mass to the others, a flow function and an analytic complex variable will solve our problem. We can choose our phase in the power series equation and represent it as:

A₀ + A₁ cos(θ) + A₂ cos(2θ) + A₃ cos(3θ) + A₄ cos(4θ) + A₅ cos(5θ) + …

Derivative:

−A₁ sin(θ) − 2 A₂ sin(2θ) − 3 A₃ sin(3θ) − 4 A₄ sin(4θ) − 5 A₅ sin(5θ) − …

This makes our function analytic in every direction and creates a flow function that a classifier such as an SVM can predict. There exists another set of power series coefficients, representing the same series with a shifted phase.

B₀ + B₁ sin(θ) + B₂ sin(2θ) + B₃ sin(3θ) + B₄ sin(4θ) + B₅ sin(5θ) + …

Derivative:

B₁ cos(θ) + 2 B₂ cos(2θ) + 3 B₃ cos(3θ) + 4 B₄ cos(4θ) + 5 B₅ cos(5θ) + …
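
As a quick illustration, here is a small numpy sketch evaluating both phase-shifted series and their term-wise derivatives; the coefficients Aₙ and Bₙ are arbitrary placeholders, not fitted values.

import numpy as np

A = np.array([1.0, 0.5, 0.25, 0.125, 0.0625, 0.03125])   # A0..A5, arbitrary
B = np.array([1.0, 0.5, 0.25, 0.125, 0.0625, 0.03125])   # B0..B5, arbitrary
n = np.arange(len(A))
theta = np.linspace(0.0, 2.0 * np.pi, 100)
nt = np.outer(n, theta)                    # grid of n * theta values

f_cos  = A @ np.cos(nt)                    # A0 + A1 cos(t) + A2 cos(2t) + ...
df_cos = -(n * A) @ np.sin(nt)             # -A1 sin(t) - 2 A2 sin(2t) - ...
f_sin  = B[0] + B @ np.sin(nt)             # n = 0 sine row is zero, so B0 is added explicitly
df_sin = (n * B) @ np.cos(nt)              # B1 cos(t) + 2 B2 cos(2t) + ...
print(f_cos[:3], df_cos[:3], f_sin[:3], df_sin[:3])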

Let us represent our “Flow Model” as:

f(x, t) : the state space model

Cost Function:

The Cost Function towards Fourier Transform

When you perform gradient descent on the “Reference Implementation Function”, the analysis context is set as discrete. The intervals vary in size, which produces a model conformance check.

Hence our resulting coefficients form a Fourier transform; in fact, it is an inverse STFT (inverse short-time Fourier transform) back to the time domain.

Popular Python packages that can perform STFT are:

  • numpy
  • librosa
  • scipy
  • tensorflow
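
For example, a minimal STFT/inverse-STFT round trip with scipy, one of the packages listed above, using a synthetic two-tone signal:

import numpy as np
from scipy.signal import stft, istft

fs = 1000.0                                  # sample rate in Hz
t = np.arange(0, 1.0, 1.0 / fs)
x = np.sin(2 * np.pi * 50 * t) + 0.5 * np.sin(2 * np.pi * 120 * t)

f, tau, Z = stft(x, fs=fs, nperseg=128)      # time-frequency coefficients
_, x_rec = istft(Z, fs=fs, nperseg=128)      # inverse STFT back to the time domain
print(np.allclose(x, x_rec[:len(x)], atol=1e-8))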


This article from Intel explains the consistency of floating-point arithmetic in C++ compilers. FP (floating-point) accuracy must be consistent across GPUs and operating systems. Because floating-point addition and multiplication are not strictly associative, the order of operations may need to be serialized for results to be reproducible across threads and vector units (VU). The problems demonstrated here show the usage of OMP (OpenMP) and OpenMPI for evaluating a computational model. The computational models selected for the exercises are in C++ and Python. The problems demonstrated are:

- Monte Carlo Diffusion as Dispersion Relation,

- Lattice Boltzmann Method,

- Design of Experiments and

- Complex Analysis.

Results obtained by executing the code for the given parameters:

Mode 1: When quantisation factor is set to a default fraction

qf = ( MAX. ESTIMATED VALUE ) * 2 / 3

You can adjust the Max. estimated value of the model as Force Term and assess the outliers based on the chosen fraction of the max. estimated value.

./app \
--intervals 4 \
--iterations 8 \
--condition_factor 2 \
--multiplier 1
Concordance, Reliability, Residual Error, Precision

Code Examples

./app \
--intervals 3 \
--iterations 7 \
--condition_factor 2 \
--multiplier 1 \
--quantisation_factor 2e-2
Concordance, Reliability, Residual Error, Precision

The change in interval lengths does change the precision, reliability, and concordance. In interpolating our hypothesized units-per-cell parameter, which is calibrated for

Length = 100

a spline interpolation is used, with control points

α and 1 - α

Setting the goal

Precision

As you can see, the precision varies between the two models, and this is because of:

--intervals 4 --iterations 8   being used in model1
--intervals 3 --iterations 7   being used in model2

When performance optimization is performed, each interval (of length 2⁴ and 2³ respectively) is taken for our computational model. We take the MAE (mean absolute error) at each interval to finalize our result.
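
A small sketch of this per-interval MAE computation, with estimate and reference standing in for the two model outputs (the data here is synthetic):

import numpy as np

def interval_mae(estimate, reference, intervals):
    # split both series into equal intervals and score each one separately
    est_chunks = np.array_split(estimate, intervals)
    ref_chunks = np.array_split(reference, intervals)
    return [np.mean(np.abs(e - r)) for e, r in zip(est_chunks, ref_chunks)]

reference = np.linspace(0.2, 0.1, 100)
estimate = reference + np.random.normal(0.0, 1e-3, 100)
print(interval_mae(estimate, reference, intervals=4))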

Reliability

The values of reliability vary between the two models because the intervals change as the precision changes. Reliability is defined here as the probability of failure over N experiments.

Reliability

Residual

Over an interval, the model value changes by small deviations. These deviations, squared or taken as-is and normalized by the model value, give the residual property of the model in that interval.

Residual Error

Also represented as a condition number, it takes an N-norm in the numerator and then multiplies it by the inverse of the matrix. For large intervals it is a relevant statistic because it admits a Taylor series expansion.

Concordance

With the definition of the model, the force term represents ρ (∂u/∂t), and the derivative of the force term, V(τ) = ( 1 / ρ ) ( ∂F/∂x ), represents the intervals.

The concordance statistic here takes into consideration a numeric value and a complex argument. The numeric value corresponds to ( δx / x ) and the complex argument corresponds to δz / z, where the complex number is the product of the contour integral of the flow function and the complex derivative. The flow function taken here is the force term.

Inference from Statistical Analysis

It is our interval analysis that is key to our understanding of model parameters such as precision, residual error, condition numbers, reliability model and concordance from the model.

As discussed above in “Preliminary Analysis”, the power series parameter, θ, over the interval from `a` to `b` with interval Length L = ( b - a ), is represented as:

θ = 2π n x / L

In the Taylor series expansion, the kᵗʰ derivative will contain a normalization parameter nᵏ, because the spline interpolation with control points α and 1 − α takes the same form at every derivative:

f(k) = α fᵢ + (1 − α) fᵢ₊₁
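
A minimal sketch of this control-point interpolation, with θ = 2πnx/L as above and an arbitrary α:

import numpy as np

def control_point_interp(f, alpha):
    # f_k = alpha * f_i + (1 - alpha) * f_{i+1} for consecutive samples
    return alpha * f[:-1] + (1.0 - alpha) * f[1:]

L = 100
x = np.arange(L)
theta = 2.0 * np.pi * 1 * x / L      # theta = 2*pi*n*x/L with n = 1
f = np.cos(theta)
print(control_point_interp(f, alpha=0.5)[:5])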

Here, using the Taylor series expansion for deviations, we arrive at a birthday function studied by Ramanujan. The birthday problem asks how many trials a person makes before finding another person with the same birthday in a year containing N days. The birthday problem is described at https://en.wikipedia.org/wiki/Birthday_problem

Birthday problem normalized by interval length

The birthday problem is relevant to the derivation of a set of Ramanujan’s distributions such as

  • Ramanujan Q-distribution, and
  • Ramanujan R-distribution
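
For reference, a short sketch of the Ramanujan Q-function and its link to the birthday problem: Q(n) sums the probabilities that the first k draws from n equally likely days are all distinct, and 1 + Q(n) is the expected number of draws before a repeat.

import math

def ramanujan_q(n):
    # sums P(first k draws from n days are all distinct), k = 1..n
    total, term = 0.0, 1.0
    for j in range(1, n + 1):
        total += term
        term *= (1.0 - j / n)
    return total

q = ramanujan_q(365)
print(q, 1.0 + q)                     # Q(365) ~ 23.6; expected draws ~ 24.6
print(math.sqrt(math.pi * 365 / 2))   # asymptotically Q(n) ~ sqrt(pi*n/2) - 1/3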

In fact, I can say where the Ramanujan R-distribution is useful. In the case of our macroscopic problem, where we iterate over the intervals and find that our birthday problem can create a generating function of the deviations of recorded observations, the Ramanujan R-distribution is a probabilistic problem over permutation matrices.

If we make use of our problem here, where we hypothetically apply Optical Flow Phase-Based methods to every derivative of our flow function, the resulting data wrangling stage has two options:

Firstly, to consider the coefficients Aₙ as an ordered set of real-valued variables, and the derivatives as k unordered sets of complex-valued functions, representing a matrix of complex derivatives and coefficients, in order to solve a graph problem.

Let's discuss some reinforcement learning here. If you have a dataset and need to score your findings, possibly with a Monte Carlo Tree Search (MCTS) algorithm, you may search through the dataset and hierarchically cluster it based on your chosen algorithm or model. The state space iterated this way will be huge. The permutation matrix used below maps the inputs of the model to the outputs, from the first factor graph to the second factor graph, as explained below:

Matrix representing the Ramanujan R problem

Once they are mapped, the state space of N × 5 × 5 possibilities (there are 5 factors here) is reduced to 2 × 5 × 5.

Secondly, to consider the coefficients Aₙ as part of a time-series continuum; randomization is performed to choose the complex-valued state space by matching the k world traces taken in pairs. In this case the jᵗʰ derivative may be matched against an iᵗʰ derivative.

The paper here, https://arxiv.org/pdf/1506.02335.pdf, describes a multi-match algorithm that matches a random permutation in a state space of rank (N+k), matching k elements to k specific elements.

“The Ramanujan R-distribution here shows the probability that matches can be performed with no two columns of coefficients repeating their structure throughout, in order to solve the problem by independently considering the flow function derivatives as a permutation matrix problem.”

“The binomial distribution is represented as the product of the Ramanujan Q-distribution and the Ramanujan R-distribution.”

The significance of the interval length is high because, whether it is a momentum conservation model, kᵗʰ derivatives, energy, or Hamiltonian mechanics, the same spline interpolation is used for this macroscopic problem.

The autodiff package helped me deduce derivatives; it supports both forward and reverse mode differentiation.

@Courtesy: https://autodiff.github.io/

As given in the paper https://arxiv.org/pdf/1801.06397.pdf, the results obtained from analyzing datasets with optical flow methods are concluded to have two facets:

(1) Diversity, and

(2) Realism.

As stated in this paper, a network trained on specialized data generalizes worse to other datasets than a network trained on diverse data. It also states that most of the learning task can be accomplished via simplistic data and data augmentation.

There are three operational simulation optimization methods I would like to discuss here; they show a variety of techniques for arriving at an optimal solution.

A list of computational models:

Monte Carlo (MC) Diffusion as a Dispersion relation

In Monte Carlo Diffusion, the probability of diffusion can be written as the sum of the probabilities of air moving into and out of a selected point.

Problem: the given problem is a computational model with a small interval of time and small displacements happening over a period of time, when an input velocity of air is supplied.

With input air supplied, the mass flux can be written as:

Mass Flux (Volumetric) Difference

With this linear equation, if it is taken computationally forward with the kinetic energy (K.E.) of the dynamics of air, a relation between space and time is obtained:

Navier-Stokes relation for momentum conservation at isotropic external pressure, in the absence of any shear force across walls, with an input velocity time series

Our variables of the model are:

  • Number of units per cell
  • Probability of movement
  • Dimensionless constant of rate of air density
  • Input time series of velocity of fluid

In the derivation of the Navier-Stokes model, the equations used are Fick's law of diffusive flux and the Gibbs free energy. The dimensionless parameter of the rate of air density is assumed constant during the computation. The equation assumes synchronicity between the units-per-cell parameter and the dimensionless constant of the rate of air density. The major drawback of the model is that it distributes linearly from 1 to 0 and never considers bounce-back or shear stress on the walls analyzed within a unit segment.

There is dispersion of particles due to the large input signal.
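A minimal Monte Carlo sketch of this diffusion process is given below; all parameters (cell count, particle count, movement probability) are illustrative assumptions, not the repository's values.

import numpy as np

rng = np.random.default_rng(42)
cells, particles, steps = 100, 10000, 200
p_move = 0.5                                 # probability of movement

pos = np.full(particles, cells // 2)         # all mass starts at the centre cell
for _ in range(steps):
    moving = rng.random(particles) < p_move
    direction = rng.choice([-1, 1], size=particles)
    pos = np.clip(pos + moving * direction, 0, cells - 1)

units_per_cell = np.bincount(pos, minlength=cells)
print(units_per_cell.argmax(), units_per_cell.max())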

Simulation Optimization model:

Stochastic Optimization (SO)

With stochastic gradient descent using the TensorFlow library, the parameters of the computational model are determined within a unit segment length. An input velocity time series of t⁵ is used here for the Navier-Stokes relation.

Input Velocity Time Series

As worked out in this repository:

The computed velocity is based on:

Computed Velocity From the Input Time Series

This implies the velocity is linearly reduced from approximately 0.2 to approximately 0.1. The total number of significant points is taken to be 100. The other parameters in the equation are described below:

1. The points of significant velocity observations are plotted as a scatter plot:

points of significant velocity observations

2. The initial velocity is:

Initial Velocity

3. The probability is:

The Probability

4. The alpha value is:

The Alpha Value

The displacement and time parameters of the computational model are:

dt = 1e-3

dx = 1e-6
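
A hedged TensorFlow sketch of this stochastic-optimization step follows; the model form, learning rate, and initial values are illustrative assumptions, not the repository's code. It fits a probability and a damping factor to a velocity series falling from 0.2 to 0.1, with the t⁵ input series from above.

import numpy as np
import tensorflow as tf

t = np.linspace(0.0, 1.0, 100).astype(np.float32)
u_obs = (0.2 - 0.1 * t).astype(np.float32)     # velocity falling from 0.2 to 0.1
v_in = (t ** 5).astype(np.float32)             # input velocity time series t^5

p = tf.Variable(0.5)                           # probability of movement
alpha = tf.Variable(0.1)                       # damping factor
opt = tf.keras.optimizers.SGD(learning_rate=0.05)

for _ in range(500):
    with tf.GradientTape() as tape:
        u_pred = p * u_obs[0] * (1.0 - alpha * t) + (1.0 - p) * v_in
        loss = tf.reduce_mean(tf.square(u_pred - u_obs))
    grads = tape.gradient(loss, [p, alpha])
    opt.apply_gradients(zip(grads, [p, alpha]))

print(p.numpy(), alpha.numpy(), loss.numpy())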

By calibrating the results from Monte Carlo Diffusion, we always get the same pattern for the volumetric relation of units per cell:

volumetric relation of units per cell

Gaussian Kurtosis Explanation

This is quite expected, as our original model showed this result for the momentum conservation model: the kurtosis of the original Navier-Stokes relation is higher than that of the volumetric relation of units per cell of the computational model.

Kurtosis of the original Navier-Stokes relation

To observe the entire cuboid of air within our bounded context at the same instant, the interval where this pattern repeats itself must be identified. It can be shown that such intervals are bound to exist due to a sizing problem. I believe it is possible to show such intervals of kurtosis for the speed of sound, 343 m/s, calibrated within every cuboid of a designated space.

A list of computational models:

Lattice Boltzmann Method (LBM)

In the Lattice Boltzmann method, we discretize the fluid into 8 velocity vectors and 1 rest velocity at an instant of time. Using the Lattice Boltzmann method, the speed of sound is determined by:

Lattice Boltzmann Method Velocity of Sound

The energy of the Lattice Boltzmann model depends on density, velocity, and the velocity vectors, with the speed of sound as an additional parameter. Here's a demonstration of the Lattice Boltzmann Method applied to an obstacle.

Lattice Boltzmann Method (LBM)
Please take time to observe the changes in the LBM

Code Examples

Directions for Lattice Boltzmann Method
Energy Equation for the Lattice Boltzmann Method

The energy of the Lattice Boltzmann model is dependent on the input/output direction, the density or velocity acting on the particle, and the velocity vectors acting from each direction.

Simulation Optimization model:

Evolutionary Strategies (ES)

Directions for Lattice Boltzmann Method

The model, based on the input time series discussed before, evolves in a combinatorial fashion as described below:

The code sample for LBM is provided below:

A fraction of 1 / 9 is assigned to the orthogonal directions, a fraction of 1 / 36 to the diagonal directions, and a fraction of 4 / 9 to the rest velocity, as sketched below.
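
A minimal D2Q9 sketch using exactly these weights, with the lattice speed of sound c_s² = 1/3 in lattice units; the equilibrium form is the standard second-order expansion, not code taken from the repository.

import numpy as np

w = np.array([4/9,
              1/9, 1/9, 1/9, 1/9,            # E, N, W, S
              1/36, 1/36, 1/36, 1/36])       # NE, NW, SW, SE
c = np.array([[0, 0],
              [1, 0], [0, 1], [-1, 0], [0, -1],
              [1, 1], [-1, 1], [-1, -1], [1, -1]])
cs2 = 1.0 / 3.0                              # lattice speed of sound squared

def equilibrium(rho, u):
    # standard second-order equilibrium distribution for each direction
    cu = c @ u
    usq = u @ u
    return rho * w * (1.0 + cu / cs2 + 0.5 * (cu / cs2) ** 2 - 0.5 * usq / cs2)

print(equilibrium(1.0, np.array([0.05, 0.0])).sum())   # mass conserved, ~1.0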

A list of computational models:

Design of Experiments

The design of experiments package, which can be installed using:

pip install --user pyDOE2

is an array of techniques constructed to simulate a given world trace. The experiments it can create are: Factorial Designs, Response-Surface Designs and Randomized Designs.

Factorial Designs

  1. General Full-Factorial (fullfact)
  2. 2-Level Full-Factorial (ff2n)
  3. 2-Level Fractional-Factorial (fracfact)
  4. Plackett-Burman (pbdesign)

Response-Surface Designs

  1. Box-Behnken (bbdesign)
  2. Central-Composite (ccdesign)

Randomized Designs

  1. Latin-Hypercube (lhs)

Code Examples
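
Since the embedded gist is not reproduced here, the following is a short pyDOE2 sketch generating one design from each family listed above; the factor counts are arbitrary.

from pyDOE2 import fullfact, ff2n, pbdesign, bbdesign, ccdesign, lhs

print(fullfact([2, 3]))        # general full-factorial with 2 and 3 levels
print(ff2n(3))                 # 2-level full-factorial on 3 factors
print(pbdesign(3))             # Plackett-Burman screening design
print(bbdesign(3))             # Box-Behnken response-surface design
print(ccdesign(2))             # central-composite response-surface design
print(lhs(4, samples=5))       # Latin-Hypercube, 5 samples in 4 dimensions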

Simulation Optimization model:

Response Surface Methodology

Using Response Surface Methodology, a series of regression models is fit against the provided input parameters. Using the Design of Experiments computational model, one can run a set of experiments that interpolate data based on a desired set of characteristics, such as an L2 norm or a flow characteristic used for modeling the state space.

In this case, the MNIST digits and EMNIST characters are simulated from ground-truth data using two extreme intervals, the MNIST digits 7 and 5, by the method of Randomized Designs (Latin Hypercube). The interpolating characteristic used here is the product of the experiment matrix and the difference between the experiment value and the ground truth, with a degree of 1.

MNIST digits transformed from 7 to 5 using Design of Experiments

The code repository for design of experiments is given below:
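A hedged sketch of the interpolation idea only (the repository itself is not reproduced here): Latin-Hypercube samples act as mixing weights between two ground-truth images, with random arrays standing in for the MNIST digits 7 and 5.

import numpy as np
from pyDOE2 import lhs

digit_7 = np.random.rand(28, 28)     # placeholder for the MNIST "7" image
digit_5 = np.random.rand(28, 28)     # placeholder for the MNIST "5" image

weights = np.sort(lhs(1, samples=10).ravel())   # stratified weights in [0, 1]
frames = [(1 - a) * digit_7 + a * digit_5 for a in weights]
print(len(frames), frames[0].shape)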

A list of computational models:

Circulation and Flux in Complex Analysis

In Complex Analysis, analytic complex variables are represented as described below:

analytic complex variables

If the complex function is harmonic, it is represented as:

complex function is harmonic

Suppose there is a flow function, defined by:

the flow function

The circulation and flux along designated contour lines are defined as:

circulation and flux along designated contour lines
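
Numerically, with the conjugate-velocity convention w = u − iv, the contour integral of w dz has the circulation as its real part and the flux as its imaginary part. A small sketch for w = 1/z (a point source) around the unit circle:

import numpy as np

n = 2000
theta = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
z = np.exp(1j * theta)                  # unit-circle contour
dz = 1j * z * (2.0 * np.pi / n)         # dz = i z dtheta along the circle

w = 1.0 / z                             # conjugate velocity u - i v (a point source)
integral = np.sum(w * dz)
print(integral.real)                    # circulation, ~0 here
print(integral.imag)                    # flux, ~2*pi here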

Simulation Optimization model:

Optical Flow Analysis using Phase-Based Methods

Using phase-based methods, the spatial and temporal properties of an image are extracted. Optical flow analyses are a group of methods that can perform image segmentation, determine constraints in a video, and so on. Here is a repository that applies optical flow techniques to predict the differential relation in an image. The optical flow derivatives are found using the SceneNet framework. The script transforms the image to a defined size and applies a horizontal and a vertical field of view to it. The input is just the depth map normalized by 1e-3.


How is this generated?

Take the product of the complex conjugate of a 2-channel 2D band-pass signal and the optical flow derivatives as a complex number, which gives you a real part, an imaginary part, and an arctan2 component. To represent the optical flow derivatives as a spatial component, consider them using moving images and then resolve them into the X and Y axes.
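
A hedged numpy sketch of that product, with random arrays standing in for the band-pass response and the flow derivatives:

import numpy as np

rng = np.random.default_rng(0)
h, w = 64, 64

# stand-in 2-channel band-pass response (even/odd quadrature pair as one complex array)
band = rng.standard_normal((h, w)) + 1j * rng.standard_normal((h, w))
# stand-in optical-flow derivatives resolved into X and Y axes
flow = rng.standard_normal((h, w)) + 1j * rng.standard_normal((h, w))

product = np.conj(band) * flow
real_part, imag_part = product.real, product.imag
phase = np.arctan2(imag_part, real_part)        # the arctan2 component
print(phase.shape, float(phase.min()), float(phase.max()))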

The code repository for optical flow analysis is given below:

Conclusion

In summary, combinatorics analyses an algorithm, generalizes a problem in terms of the sizing of the state space and model complexity, and solves the problem of state-space representation. With complex analysis, the temporal and spatial components, or the complexity of the data, can be established. The method of Design of Experiments makes use of social graphs, enabling simulation of data in a standardized approach. Monte Carlo methods are sampling techniques applied to many use cases; they are powerful techniques that combine with other problem-solving approaches such as Hamiltonian mechanics (MCMC with Hamiltonian).

There is a paper on the variation of gradients of deep learning networks that is quite interesting to read, as it explains activation patterns for several nodes within the neural network. It also discusses how the largest variations within a deep neural network happen when the layers with fewer nodes change their activation pattern. It is observed that deep neural networks (DNNs) in a deep learning problem achieve their complexity and invariance simultaneously.


Project, technical details and standards for Computer Vision and Data Science. Contact: aswinkvj@klinterai.com.