A Never-Ending Journey

A journey is a story of experiences: solving new problems, an encounter with a new spectrum of things that move unpredictably. A journey is always phenomenal, making you realise that there are always bad roads and good roads.

I am on a never-ending journey now — watching people do cool stuff and create innovative things. I am trying to mimic them to learn how things can be done differently. I am learning new meanings now. This part of the journey is interesting. My journey with the open-source project Tellurium has been quite fascinating. I should thank my guide, Kyle Medley, for helping me discover this beautiful world. In this blog, I shall share a brief overview of the things I tried with Tellurium.

The cool technologies I used in this project include Apache Spark, Apache Hadoop, Apache Zeppelin, Apache Livy, and Docker.

Week 1 — Week 2

Parameter Scan: We have created a new module, distributed_parameter_scaning, which lets users provide multiple models along with the simulations to run for each model; all the models are run in parallel in a distributed environment, and the results are collected into an array/graph. Interested in knowing a little more? Read my previous blog. You can also check my pull request related to Parameter Scan.
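The core idea — run one simulation per parameter value in parallel, then collect the results into one array — can be sketched locally with Python's multiprocessing. This is not the actual Tellurium module; `simulate` here is a hypothetical stand-in (a simple discrete decay model) for a real SBML simulation.

```python
from multiprocessing import Pool

def simulate(k):
    """Hypothetical stand-in for one model simulation:
    discrete decay x[t+1] = x[t] * (1 - k), starting from x[0] = 1."""
    x, trace = 1.0, []
    for _ in range(5):
        trace.append(x)
        x *= (1.0 - k)
    return k, trace

if __name__ == "__main__":
    # Scan the decay parameter k over several values in parallel,
    # then collect (parameter, trace) pairs into one result list.
    k_values = [0.1, 0.2, 0.3, 0.4]
    with Pool(4) as pool:
        results = pool.map(simulate, k_values)
    for k, trace in results:
        print(k, trace[-1])
```

In the real module the worker runs on a Spark cluster instead of a local process pool, but the map-then-collect shape is the same.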

Week 3 — Week 4

Parameter Estimation: To estimate a particular parameter, we now have a new module that also runs in a distributed environment. To run it, a user provides the model (SBML/Antimony) and bounds on the parameter(s) to estimate. Internally, the module uses differential evolution, with the sum of squared errors as the objective. We have tested this on the Immigration-Death model and the Lotka-Volterra model and presented a poster at Beacon 2017.
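To make the estimation step concrete, here is a minimal, dependency-free sketch of differential evolution minimizing a sum-of-squared-errors objective. The model is a hypothetical stand-in (a one-parameter discrete decay) rather than a real SBML simulation, and this hand-rolled optimizer is only an illustration of the technique, not the module's actual code.

```python
import random

def model(k, n=10):
    # Stand-in "simulation": discrete decay x[t+1] = x[t] * (1 - k), x[0] = 1.
    x, out = 1.0, []
    for _ in range(n):
        out.append(x)
        x *= (1.0 - k)
    return out

def sse(k, observed):
    # Objective: sum of squared errors between simulation and data.
    return sum((m - o) ** 2 for m, o in zip(model(k, len(observed)), observed))

def differential_evolution(objective, bounds, pop_size=20, F=0.8, CR=0.9,
                           generations=100, seed=0):
    """Minimal differential evolution over a single bounded parameter."""
    rng = random.Random(seed)
    lo, hi = bounds
    pop = [rng.uniform(lo, hi) for _ in range(pop_size)]
    scores = [objective(p) for p in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Mutate three distinct other members; crossover with rate CR.
            a, b, c = rng.sample([p for j, p in enumerate(pop) if j != i], 3)
            trial = a + F * (b - c) if rng.random() < CR else pop[i]
            trial = min(max(trial, lo), hi)        # clamp to bounds
            s = objective(trial)
            if s < scores[i]:                      # greedy selection
                pop[i], scores[i] = trial, s
    best = min(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best]

if __name__ == "__main__":
    observed = model(0.3)   # synthetic "experimental" data with true k = 0.3
    k_hat, err = differential_evolution(lambda k: sse(k, observed), (0.0, 1.0))
    print(k_hat, err)
```

In the distributed version, the expensive part — evaluating the objective, i.e. running the simulation for each candidate parameter set — is what gets farmed out to the cluster.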

Below are the commit links for Parameter Estimation.

And here is the Pull Request link for Parameter Estimation.

Week 5 — Week 6

Sensitivity Analysis: This describes how sensitive the model output is to a small change in a parameter's value. Like the previous two modules, sensitivity analysis is a new module in which users provide SBML/Antimony models and a custom simulator, giving them the freedom to define their own pre-simulation and simulation steps. Along with that, users provide bounds for the parameters of interest; these parameters are varied across simulations (in a distributed environment). The results of sensitivity analysis are categorised as follows:

a) Metrics → Compute the mean, standard deviation, or variance for each of the parameters.

For example, the sensitivity of PP_K with respect to r1b_k2, r8a_a8, and r10a_a10 individually. The final output will return the average of getCC('PP_K', 'r1b_k2'), getCC('PP_K', 'r8a_a8'), and getCC('PP_K', 'r10a_a10') individually.

b) Bins → Given the bin sizes for each parameter, the final result reports how many values fall into each of the bins provided.

For example, the user provides bins of different ranges for each parameter. The final output depicts how many times the getCC('PP_K', 'r1b_k2'), getCC('PP_K', 'r8a_a8'), and getCC('PP_K', 'r10a_a10') values fall into each bin.

c) Everything → Print the results of every simulation run.
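The three output modes above can be sketched over a list of sensitivity values. The values here are hypothetical stand-ins for getCC(...) results collected across the distributed simulations, and `summarize` is an illustrative helper, not the module's actual API.

```python
import statistics

def summarize(values, mode="metrics", bins=None):
    """Summarize one parameter's sensitivity values in one of three modes:
    'metrics' (mean/stdev/variance), 'bins' (counts per range), or
    'everything' (raw result of every simulation run)."""
    if mode == "metrics":
        return {"mean": statistics.mean(values),
                "stdev": statistics.stdev(values),
                "variance": statistics.variance(values)}
    if mode == "bins":
        # bins is a list of (low, high) ranges; count values in each range.
        return [sum(low <= v < high for v in values) for low, high in bins]
    return list(values)

values = [0.12, 0.18, 0.25, 0.31, 0.40]   # stand-in getCC(...) results
print(summarize(values))
print(summarize(values, "bins", bins=[(0.0, 0.2), (0.2, 0.4), (0.4, 0.6)]))
```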

Below are the commits for Sensitivity Analysis

This link has the pull request for Sensitivity Analysis.

Below is the list of pull requests that provide distributed-computation functionality for Tellurium.

Week 7 — Week 9

Experimenting with Apache Livy

With Livy, we are trying to decouple the client interaction from the Spark cluster: by integrating with Livy, users can still run their jobs from any system.

There is a wiki page, https://github.com/sys-bio/tellurium/wiki/Livy-Instructions, that describes the work done with Livy.
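Livy exposes a REST API for exactly this kind of decoupling: the client creates a session (`POST /sessions`) and then submits code against it (`POST /sessions/{id}/statements`). A minimal sketch with only the standard library is below; the server address is an assumption, and the actual request/response handling in our wrapper may differ.

```python
import json
from urllib import request

LIVY_URL = "http://localhost:8998"   # assumed Livy server address

def session_payload(kind="pyspark"):
    # Body for POST /sessions: ask Livy to start a PySpark session.
    return {"kind": kind}

def statement_payload(code):
    # Body for POST /sessions/{id}/statements: the code to run on the cluster.
    return {"code": code}

def post(path, payload):
    # Send a JSON payload to the Livy REST API (requires a running server).
    req = request.Request(LIVY_URL + path,
                          data=json.dumps(payload).encode(),
                          headers={"Content-Type": "application/json"})
    return json.load(request.urlopen(req))

# Example (needs a live Livy server):
# session = post("/sessions", session_payload())
# post("/sessions/%d/statements" % session["id"], statement_payload("1 + 1"))
```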

We have also built a wrapper (as we are still experimenting with Livy) that helps users communicate with our Spark clusters. Here is a brief overview of how it can be done:

  1. Every consumer needs to register with us.
  2. For every registered consumer, we shall create a user.
  3. Every customer needs to send their public key, or we can share a password with them for authentication.
  4. They can then use the wrapper to connect to the server and transfer scripts from their local system.
  5. There are several types of files they may send:
     i) Code, like that of a Zeppelin notebook
     ii) An SBML XML file
     iii) Additional Python helper files (e.g. a custom simulator in the case of sensitivity analysis)

The diagram above makes the usage of the wrapper clearer.

This is how the user can communicate:

import distribtellurium as dte

This imports all the required scripts that allow the client to run jobs on the cluster.

distribtellurium provides a method, add_file, that ships local code to the Spark cluster. Depending on the type of file, there is an extra parameter (run=True), which the client sets on only one file: the script to be executed.

distrib_work = dte()

distrib_work.add_file(filename="sensitivity_test.py", run=True)

If there are any additional files, the client can call the same method without the run parameter, or with run=False.

distrib_work.add_file(filename="huang-ferrell-96.xml")

distrib_work.add_file(filename="custom_simulator.py")

Finally, the client calls the start method, which runs "sensitivity_test.py" on the cluster and returns the results locally to the client.

distrib_work.start()

Week 10

Apache Zeppelin was integrated with Apache Livy so that users can run their Spark jobs through Zeppelin, which is connected to the Livy server running on the cluster.

Week 11— Week 12

Dockerization

A Docker image containing Apache Spark, Apache Zeppelin (connected to the Spark cluster), and the latest Tellurium build is on its way. With this, we can scale to a cluster of any size. Here is the link to the Docker repo.

A video demonstrating the above will be added soon to make the whole process easier to understand.

Thanks for reading!