GROMACS on Google Cloud

Nobuhisa Mizue
Google Cloud - Community
Oct 23, 2021

Context

In my previous article, I ran AlphaFold as a container on Google Cloud. Google Cloud Life Sciences is a flexible, inexpensive, and fast way to run workloads that process large amounts of data and demand a lot of computing power.

Using the same approach as in the previous article, this time I’ll show you how to run GROMACS on Google Cloud.

GROMACS is open source software for molecular dynamics (MD). It simulates the physical movement of atoms and molecules and is widely used in research institutes and in drug discovery.

Keeping a high-spec server running just for GROMACS is costly. If you build a GROMACS container image once, you can launch it on demand whenever you need it, which speeds up heavy MD processing and avoids paying for idle capacity.

Steps

There are two work steps:

  1. Create GROMACS container image
  2. Run GROMACS

Let’s go through the steps in order.
Please note that some steps have been omitted as I don’t intend to make this article a complete procedural document.

Create GROMACS container image

Enabling APIs

Enable the following APIs.

  • Genomics API
  • Google Cloud Life Sciences API
  • Google Container Registry API
  • Compute Engine API
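If you prefer the command line to the Cloud Console, the same APIs can probably be enabled with gcloud. The sketch below only prints the command so you can review it first. The service names are my assumption of how the four APIs above map to "gcloud services enable", and enable_apis_cmd is a hypothetical helper name, not something from the original steps.

```shell
# Service names assumed to correspond to the four APIs listed above.
APIS="genomics.googleapis.com lifesciences.googleapis.com containerregistry.googleapis.com compute.googleapis.com"

# Hypothetical helper: print the command instead of running it.
# $1 = project ID
enable_apis_cmd() {
  echo "gcloud services enable ${APIS} --project $1"
}
```

Calling enable_apis_cmd with your project ID prints the full command, which you can then paste into Cloud Shell.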

You also need a GROMACS run input file (a TPR file). For this article, you can download benchMEM.tpr from the GROMACS Benchmark web site.

Create a VM for temporary work

Create a VM with 64 GB of memory (e2-standard-16). Then SSH into the VM and install Docker.
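The VM can also be created from Cloud Shell rather than the console. This is a minimal sketch that prints the gcloud commands for review. The instance name, the zone, and the debian-12 image family are my assumptions (the Docker steps in this article use Docker's Debian repository, so a Debian image seems the natural fit), and both helper names are hypothetical.

```shell
# Hypothetical helpers that print gcloud commands for review.
# $1 = instance name, $2 = zone (both are placeholders you choose).
create_vm_cmd() {
  echo "gcloud compute instances create $1 --zone=$2 --machine-type=e2-standard-16 --image-family=debian-12 --image-project=debian-cloud"
}

ssh_vm_cmd() {
  echo "gcloud compute ssh $1 --zone=$2"
}
```

Once you have an SSH session on the VM, install Docker with the commands that follow.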

sudo apt-get update
sudo apt-get install \
apt-transport-https \
ca-certificates \
curl \
gnupg \
lsb-release
curl -fsSL https://download.docker.com/linux/debian/gpg | sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg
echo \
"deb [arch=amd64 signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/debian \
$(lsb_release -cs) stable" | sudo tee \
/etc/apt/sources.list.d/docker.list > /dev/null
sudo apt-get update
sudo apt-get install docker-ce docker-ce-cli containerd.io

Allow your non-root user to run docker commands by adding it to the docker group.

sudo gpasswd -a $(whoami) docker

Then log out of the VM and log back in so that the group change takes effect.
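After logging back in, a quick check like the one below can confirm that the group change took effect. It is only a sketch: has_group is a hypothetical helper that looks for a name in the space-separated list printed by "id -nG".

```shell
# Return success if group $2 appears in the space-separated list $1.
has_group() {
  case " $1 " in
    *" $2 "*) return 0 ;;
    *) return 1 ;;
  esac
}

# Report whether the current user is in the docker group.
if has_group "$(id -nG)" docker; then
  echo "docker group: ok"
else
  echo "docker group: missing (log out and back in, or re-run gpasswd)"
fi
```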

Create a Dockerfile

Create a “Dockerfile” with the following contents. Please note that the latest version of GROMACS is 2021.3 as of October 20th, 2021.

FROM ubuntu:18.04
# Copy the contents of the current directory into the container.
ADD . .
# Install Gromacs and its dependencies.
RUN apt -y update \
&& apt install -y wget g++ libxml2-dev openmpi-bin openmpi-doc libopenmpi-dev \
&& wget https://cmake.org/files/v3.21/cmake-3.21.2-linux-x86_64.sh \
&& sh cmake-3.21.2-linux-x86_64.sh --skip-license --prefix=/usr \
&& wget ftp://ftp.gromacs.org/pub/gromacs/gromacs-2021.3.tar.gz \
&& tar xvf gromacs-2021.3.tar.gz \
&& rm gromacs-2021.3.tar.gz \
&& mkdir gromacs-2021.3/build \
&& cd gromacs-2021.3/build \
&& cmake .. -DGMX_BUILD_OWN_FFTW=ON -DCMAKE_CXX_COMPILER=/usr/bin/g++ -DGMX_THREAD_MPI=on \
&& make -j \
&& make install

Build a container image

docker build -t gromacs_2021.3 .

Confirm that a container image was created.

docker images
REPOSITORY       TAG      IMAGE ID       CREATED          SIZE
gromacs_2021.3   latest   8a882c6ccdda   50 seconds ago   951MB
ubuntu           18.04    39a8cfeef173   4 weeks ago      63.1MB

Run the docker container locally and verify that GROMACS works fine.
To check the GROMACS version, execute the “gmx -version” command as follows.

docker run -it gromacs_2021.3 bash
root@1234567890:/# gromacs-2021.3/build/bin/gmx -version
:-) GROMACS - gmx, 2021.3 (-:
GROMACS is written by:
Andrey Alekseenko Emile Apol Rossen Apostolov
Paul Bauer
Herman JC Berendsen Par Bjelkmar Christian Blau Viacheslav

A. Lemkul Viveca Lindahl Magnus Lundborg Erik Marklund

Erik Lindahl, and David van der Spoel
Copyright © 1991–2000, University of Groningen, The Netherlands.
Copyright © 2001–2019, The GROMACS development team at
Uppsala University, Stockholm University and
the Royal Institute of Technology, Sweden.
Check out http://www.gromacs.org for more information.
GROMACS is free software; you can redistribute it and/or modify it
under the terms of the GNU Lesser General Public License
as published by the Free Software Foundation; either version 2.1
of the License, or (at your option) any later version.
GROMACS: gmx, version 2021.3

Tag the container image and push to the container registry

docker tag gromacs_2021.3 gcr.io/<PROJECT_ID>/gromacs_2021.3:v1
gcloud docker -- push gcr.io/<PROJECT_ID>/gromacs_2021.3:v1
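Because the same image path appears in the tag, push, and later dsub commands, it can help to build it in one place. The helper below is a hypothetical convenience, not part of the original steps. As far as I know, "gcloud docker" is deprecated in newer Cloud SDK releases, so the commented lines show the alternative of configuring Docker credentials with "gcloud auth configure-docker" and then using plain "docker push". Treat that as an assumption about your environment.

```shell
# Hypothetical helper: build the Container Registry path for an image.
# $1 = project ID, $2 = image name, $3 = tag
image_uri() {
  echo "gcr.io/$1/$2:$3"
}

# Example use (commented out because it needs Docker and gcloud credentials):
#   IMAGE="$(image_uri my-project gromacs_2021.3 v1)"
#   docker tag gromacs_2021.3 "$IMAGE"
#   gcloud auth configure-docker   # let docker authenticate to gcr.io
#   docker push "$IMAGE"
```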

The VM is no longer needed at this point, so you can delete it.
The remaining steps can be done from Cloud Shell.

Install dsub

dsub is a command-line tool for submitting batch jobs to the cloud. If you are not familiar with it, see the dsub documentation.

pip install dsub

Create a bucket

Create a Google Cloud Storage (GCS) bucket. GROMACS execution logs and input/output data are passed through this bucket.

gsutil mb gs://<PROJECT_ID>-gromacs
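If you want the bucket in the same region as the compute, "gsutil mb" accepts a location with -l. The sketch below prints such a command. Both helper names are hypothetical, and us-central1 is simply chosen to match the zones passed to dsub later in this article.

```shell
# Hypothetical helper: derive the bucket URI used in this article.
# $1 = project ID
bucket_uri() {
  echo "gs://$1-gromacs"
}

# Print a bucket-creation command pinned to a region.
# $1 = project ID, $2 = region
make_bucket_cmd() {
  echo "gsutil mb -l $2 $(bucket_uri "$1")"
}
```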

Run Hello world (optional)

Let’s output “Hello World” to confirm that the dsub command works. This is an optional step.

dsub \
--project <PROJECT_ID> \
--zones "us-central1-*" \
--logging gs://<PROJECT_ID>-gromacs/logs \
--command='echo "Hello World" > "${OUTPUT}"' \
--output OUTPUT=gs://<PROJECT_ID>-gromacs/hello_world.txt \
--subnetwork default \
--wait

Confirm that “hello_world.txt” and a logs folder have been created in the bucket.

Run GROMACS

Copy the “benchMEM.tpr” file to the GCS bucket you created. You can use Cloud Console to upload it.
Now you are ready to run GROMACS. Execute the command below. As the --command option shows, it runs cd and cp, followed by “gmx mdrun”, and then archives the results with the tar command.

dsub --project <PROJECT_ID> \
--zones "us-central1-*" \
--logging gs://<PROJECT_ID>-gromacs/logs \
--image=gcr.io/<PROJECT_ID>/gromacs_2021.3:v1 \
--command='cd /; cp ${INPUT} /benchMEM.tpr; /gromacs-2021.3/build/bin/gmx mdrun -nsteps 50000 -deffnm /benchMEM; tar cvzf $OUTPUT benchMEM.*' \
--input INPUT=gs://<PROJECT_ID>-gromacs/benchMEM.tpr \
--output OUTPUT=gs://<PROJECT_ID>-gromacs/benchMEM.tar.gz \
--subnetwork default \
--preemptible \
--min-cores=8
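A note on how the --command string works: dsub copies the --input file onto the worker and sets INPUT to its local path, and OUTPUT is a local path that dsub uploads to the bucket when the command exits. If the one-liner gets hard to read, the same steps could be kept in a script and passed with dsub's --script option instead. The sketch below is one way to write it. The function name and the GMX and WORKDIR overrides are hypothetical additions of mine, included only so the steps can be tried outside the container.

```shell
# Steps from the --command string, wrapped in a subshell function so that
# "cd" and "set -eu" do not leak into the calling shell.
# GMX and WORKDIR are hypothetical overrides for testing outside the container.
run_benchmark() (
  set -eu
  cd "${WORKDIR:-/}"
  # INPUT is the local path of the TPR file that dsub copied from the bucket.
  cp "$INPUT" benchMEM.tpr
  "${GMX:-/gromacs-2021.3/build/bin/gmx}" mdrun -nsteps 50000 -deffnm benchMEM
  # OUTPUT is the local path that dsub uploads when the job finishes.
  tar czf "$OUTPUT" benchMEM.*
)

# When using this file with "dsub --script", add a final line that calls
# run_benchmark.
```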

The dsub command returns immediately and displays the job ID; the job itself runs in the background and takes about 15 minutes to finish.
To check the job status, execute the following command.

dstat --provider google-v2 \
--project <PROJECT_ID> \
--jobs '<JOB_ID>' \
--status '*' \
--format json

While the job runs, its status changes as follows.

VM starting (awaiting worker checkin)

Pulling "gcr.io/google.com/cloudsdktool/cloud-sdk:294.0.0-slim"

Started running "user-command"

Success

A VM is created temporarily during the job, but it’s automatically deleted when the GROMACS process is completed.

[Screenshot: a GCE VM is launching]

After the job is finished, the following files are created under the bucket.

[Screenshot: Google Cloud Storage]

You can see log files under the logs folder.

gs://<PROJECT_ID>-gromacs/logs/<JOB_ID>-stderr.log
gs://<PROJECT_ID>-gromacs/logs/<JOB_ID>-stdout.log
gs://<PROJECT_ID>-gromacs/logs/<JOB_ID>.log
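To read a log without opening the console, you can build its path from the naming pattern above and pass it to "gsutil cat". This is a small sketch, and log_uri is a hypothetical helper. In my experience gmx mdrun writes most of its progress output to stderr, so that log is usually the first one to check.

```shell
# Hypothetical helper: build a log path following the naming shown above.
# $1 = project ID, $2 = job ID, $3 = "stdout" or "stderr" (omit for the main log)
log_uri() {
  if [ -n "${3:-}" ]; then
    echo "gs://$1-gromacs/logs/$2-$3.log"
  else
    echo "gs://$1-gromacs/logs/$2.log"
  fi
}

# Example use (commented out because it needs gsutil and a finished job):
#   gsutil cat "$(log_uri my-project my-job-id stderr)"
```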

If you unpack benchMEM.tar.gz, you will see that it contains the following files.

[Screenshot: files in benchMEM.tar.gz]
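To inspect the archive yourself, you can download it and list its contents without extracting it. This is a minimal sketch. The list_archive name is hypothetical, and the bucket name in the commented example is a placeholder.

```shell
# List the contents of a .tar.gz archive without extracting it.
# $1 = path to the archive
list_archive() {
  tar tzf "$1"
}

# Example use (commented out because it needs gsutil and your bucket):
#   gsutil cp gs://my-project-gromacs/benchMEM.tar.gz .
#   list_archive benchMEM.tar.gz
```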

Conclusion

As in the previous AlphaFold article, I used dsub, this time to run the MD tool GROMACS. The steps are almost the same as last time, so they should be easy to follow. Once you’ve created the container image, all you have to do is put the data in your GCS bucket and run the gmx mdrun command with dsub.
With Google Cloud, you can process large amounts of life science data with large amounts of compute at high speed and gain new insights quickly, so please give it a try.

Customer Engineer at Google Cloud. All views and opinions are my own.