OakVar v2.5.43 is out. Exciting developments.

Ryangguk Kim
Published in OakVar
5 min read · Sep 2, 2022

It’s been about a month since OakVar 2.5.0 was announced. OakVar has been steadily updated during that time, with some exciting developments. Below is the status of OakVar as of v2.5.43.

If you haven’t heard of OakVar, it is an open-source genomic variant analysis platform. It aims to 1) empower researchers and the public so that genomics is more accessible to more people and 2) be the most convenient and powerful platform for genomic variant analysis. For more information, see https://github.com/rkimoakbioinformatics/oakvar.

New Result Viewer

The result viewer has been updated to a new look. Please take a look at the screenshots below.

OakVar new result viewer summary tab
OakVar new result viewer filters tab
OakVar new result viewer variants tab

Section headers are tidy on the left side, and control buttons are minimal on the right side. Moreover, the 100k-variant limit has finally been lifted. The result viewer can now handle millions of variants, and pagination has been introduced to better handle a possibly enormous number of variants. The number of variants loaded on each page can be configured in the OakVar user configuration file (gui_result_pagesize field) or directly on the variant table.
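To illustrate what a page size means in practice, here is a minimal Python sketch of paging through a table of variants with SQLite. It is not the result viewer's actual code: the toy variant table, its columns, and the fetch_page helper are assumptions made up for this example, and only the gui_result_pagesize name comes from the configuration field mentioned above.

```python
import sqlite3


def fetch_page(conn, page_no, page_size):
    """Return one page of rows (1-based page number) from a toy variant table."""
    offset = (page_no - 1) * page_size
    cur = conn.execute(
        "SELECT uid, chrom, pos FROM variant ORDER BY uid LIMIT ? OFFSET ?",
        (page_size, offset),
    )
    return cur.fetchall()


# Build a small in-memory table with 25 hypothetical variants.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE variant (uid INTEGER PRIMARY KEY, chrom TEXT, pos INTEGER)")
conn.executemany(
    "INSERT INTO variant (uid, chrom, pos) VALUES (?, ?, ?)",
    [(i, "chr1", 1000 + i) for i in range(1, 26)],
)

gui_result_pagesize = 10  # stands in for the value read from the user configuration
first_page = fetch_page(conn, 1, gui_result_pagesize)
last_page = fetch_page(conn, 3, gui_result_pagesize)
print(len(first_page), len(last_page))
```

With 25 rows and a page size of 10, the third page holds only the remaining 5 rows, which is why only one page of variants has to be in memory at a time.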

OakVar new result viewer reports tab

Also, a new report tab has been added, where report files in different formats can be generated and downloaded. This way, you don’t have to go back to the job submission page to get the report files for the job you are exploring.

More updates to the result viewer are coming, and the job submission and web store pages will get big updates as well. If you have any suggestions for the next step or any feedback on the new result viewer, please feel free to write to me.

GENCODE v41

OakVar’s default mapper module, gencode, has been updated with GENCODE v41.

More Stable Module Installation

In the past, if a connection issue occurred while installing a module as big as tens of GB, users would end up with a truncated file and the installation process would be aborted. This was quite a headache for some users. With the latest update of OakVar, big modules have been republished in smaller parts, and ov module install works with those chunks. If an internet connection problem interrupts the download of a module, OakVar is smart enough to resume from where the problem occurred.
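The idea of resuming a download that was split into parts can be sketched in a few lines of Python. This is only an illustration of the general technique, not OakVar's implementation: the part names, the download_in_parts helper, and the simulated flaky_fetch function are all hypothetical, and no network access is involved.

```python
import tempfile
from pathlib import Path


def download_in_parts(part_names, fetch_part, dest_dir):
    """Fetch each part unless it is already on disk; return the names fetched now."""
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    fetched = []
    for name in part_names:
        final_path = dest_dir / name
        if final_path.exists():
            continue  # completed in an earlier run, so skip it
        tmp_path = dest_dir / (name + ".partial")
        tmp_path.write_bytes(fetch_part(name))  # fetch_part may raise on a dropped connection
        tmp_path.rename(final_path)  # only complete parts get the final name
        fetched.append(name)
    return fetched


# Simulate a connection that drops on the second fetch attempt.
parts = ["data.part000", "data.part001", "data.part002"]
payload = {name: name.encode() for name in parts}
calls = {"n": 0}


def flaky_fetch(name):
    calls["n"] += 1
    if calls["n"] == 2:
        raise ConnectionError("simulated dropped connection")
    return payload[name]


workdir = tempfile.mkdtemp()
try:
    download_in_parts(parts, flaky_fetch, workdir)
except ConnectionError:
    pass  # the first run stops after one complete part

resumed = download_in_parts(parts, flaky_fetch, workdir)
print(resumed)
```

The second call skips the part that finished before the failure and fetches only the remaining ones, so an interruption costs at most one part rather than the whole module.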

Module developers who want to take advantage of this new feature can find instructions on registering big modules at the OakVar Store at https://docs.oakvar.com/register.

GitHub Connectivity

You can now host your modules on GitHub to share them with the world, since ov module install now accepts a GitHub URL as an argument. Let’s say you have a custom module hosted on the dev branch of your GitHub repository organization/repository in the following folder:

https://github.com/organization/repository/
    oakvar_modules/
        annotators/
            awesomeannotator/
                awesomeannotator.py
                awesomeannotator.yml
                awesomeannotator.md
                data/
                    awesomeannotator.sqlite

Your colleagues can install the awesomeannotator module with the following command.


ov module install \
https://github.com/organization/repository/\
tree/dev/oakvar_modules/annotators/awesomeannotator

This will download the content of the awesomeannotator folder in the dev branch of the repository and install it as the awesomeannotator annotator module in your OakVar system.
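To make the structure of such a URL concrete, here is a small Python sketch that splits a GitHub folder URL into repository, branch, and folder. The parse_github_tree_url helper is hypothetical and is not how ov module install is known to work internally; it also assumes the branch name contains no slash.

```python
from urllib.parse import urlparse


def parse_github_tree_url(url):
    """Split a GitHub 'tree' URL into owner, repo, branch, folder, and module name."""
    parts = urlparse(url).path.strip("/").split("/")
    # Expected shape: owner / repo / "tree" / branch / folder...
    if len(parts) < 5 or parts[2] != "tree":
        raise ValueError(f"not a GitHub folder URL: {url}")
    owner, repo, _, branch = parts[:4]
    return {
        "owner": owner,
        "repo": repo,
        "branch": branch,  # assumes the branch name has no '/' in it
        "folder": "/".join(parts[4:]),
        "module_name": parts[-1],  # the last folder is taken as the module name
    }


info = parse_github_tree_url(
    "https://github.com/organization/repository/"
    "tree/dev/oakvar_modules/annotators/awesomeannotator"
)
print(info)
```

For the example above, the branch is dev, the folder is oakvar_modules/annotators/awesomeannotator, and the module name is awesomeannotator.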

VCF2VCF

ov run --vcf2vcf mode got updates as well.

  • Conversion of genome assembly by liftover will be performed if input genome assembly is not GRCh38.
  • INFO field in the output will escape space characters to prevent downstream errors.
  • Full length sequence ontology terms are written in the output.
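As an illustration of the INFO field escaping, here is a minimal Python sketch. The exact escape sequence OakVar uses is not stated here, so the sketch assumes percent-encoding (a space becomes %20); the extra handling of ";" and "=", the escape_info_value and build_info helpers, and the OV_SO and OV_NOTE keys are all assumptions for the example.

```python
def escape_info_value(value):
    """Percent-encode characters that would break a VCF INFO field."""
    # '%' goes first so that the '%' introduced by later replacements is not escaped again.
    replacements = [("%", "%25"), (" ", "%20"), (";", "%3B"), ("=", "%3D")]
    for raw, encoded in replacements:
        value = value.replace(raw, encoded)
    return value


def build_info(annotations):
    """Join key=value pairs into a single INFO string with escaped values."""
    return ";".join(
        f"{key}={escape_info_value(str(val))}" for key, val in annotations.items()
    )


# Hypothetical annotation keys and values, for illustration only.
info = build_info({"OV_SO": "missense variant", "OV_NOTE": "a=b; 50%"})
print(info)
```

After escaping, the only ";" and "=" characters left in the string are the real field and key delimiters, so a downstream parser can split the INFO column safely.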

Multiple Genome Assemblies

With the latest version of OakVar’s VCF format converter module, OakVar now tries to detect the genome assembly of each input file in the VCF format, unless a genome assembly is given to ov run with the -l option. This means that VCF format input files in different genome assemblies can be combined and analyzed together with OakVar. However, this automatic detection of genome assembly is still an experimental feature, since there is no standard way to denote genome assembly in VCF files.
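To show why detection can only be a best guess, here is a Python sketch of one possible heuristic: scan the VCF header for lines that often mention the reference and look for well-known assembly names. This is not the converter module's actual logic; the guess_assembly function, the header keys it inspects, and the name patterns are assumptions for the example.

```python
import re

# Patterns are checked in order; each maps several common spellings to one label.
ASSEMBLY_PATTERNS = [
    ("hg38", re.compile(r"grch38|hg38", re.IGNORECASE)),
    ("hg19", re.compile(r"grch37|hg19|b37", re.IGNORECASE)),
    ("hg18", re.compile(r"ncbi36|hg18", re.IGNORECASE)),
]


def guess_assembly(vcf_lines):
    """Return a best-guess assembly label from VCF header lines, or None."""
    for line in vcf_lines:
        if not line.startswith("##"):
            break  # meta-information lines are over
        key = line[2:].split("=", 1)[0].lower()
        if key not in ("reference", "assembly", "contig"):
            continue
        for name, pattern in ASSEMBLY_PATTERNS:
            if pattern.search(line):
                return name
    return None


header_a = [
    "##fileformat=VCFv4.2",
    "##reference=file:///ref/GRCh37/human_g1k_v37.fasta",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
]
header_b = ["##fileformat=VCFv4.2", "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO"]
print(guess_assembly(header_a), guess_assembly(header_b))
```

When the header carries no hint, as in the second example, the function returns None, which is the case where giving the assembly explicitly with the -l option is the safer choice.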

Flexible Module Location

ov run and ov module commands can now use a path to a module directory instead of a module name. This means more flexibility and convenience in using OakVar modules. For example, let’s say that you are developing a new version of an existing module, namely acmgguide, and that the module’s old version is at /mnt/oakvar/modules/annotators/acmgguide. If you are developing its new version at /home/ubuntu/oakvarmodules/acmgguide, you can do A/B tests with

# generate a result file with the current version.
ov run input.vcf -a acmgguide -t vcf -n current_acmgguide
# generate a result file with the new version.
ov run input.vcf -a /home/ubuntu/oakvarmodules/acmgguide -t vcf -n new_acmgguide
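Once both runs have finished, the two VCF outputs can be compared to see which variants the new version annotates differently. The following Python sketch is one hypothetical way to do that, not a feature of OakVar: the load_variant_info and diff_annotations helpers are made up for this example, and the sample lines (including the ACMG key) are invented data.

```python
def load_variant_info(vcf_lines):
    """Map (chrom, pos, ref, alt) to the INFO column of each VCF data line."""
    records = {}
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        cols = line.rstrip("\n").split("\t")
        chrom, pos, _id, ref, alt = cols[:5]
        records[(chrom, pos, ref, alt)] = cols[7] if len(cols) > 7 else ""
    return records


def diff_annotations(old_lines, new_lines):
    """Return the variants present in both outputs whose INFO column differs."""
    old = load_variant_info(old_lines)
    new = load_variant_info(new_lines)
    return sorted(key for key in old.keys() & new.keys() if old[key] != new[key])


# Invented sample lines standing in for the two result files.
current = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tG\t.\t.\tACMG=PVS1",
    "chr2\t200\t.\tC\tT\t.\t.\tACMG=PM2",
]
new = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tG\t.\t.\tACMG=PVS1",
    "chr2\t200\t.\tC\tT\t.\t.\tACMG=PM2,PP3",
]
changed = diff_annotations(current, new)
print(changed)
```

In real use, the two lists would be replaced by open file handles on whatever output files the two ov run commands produced.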

Developer Guide

The documentation at https://docs.oakvar.com now has a developer guide, which continues to grow. If you want to develop a custom OakVar module for your project, this is the place to check out and get help. OakVar’s architecture, tips on making modules, and more are explained there. Of course, you can always reach out to me with your questions.

More Convenient Deployment

OakVar has been updated for easier deployment. Initial setup of OakVar through environment variables has been improved with the OV_ROOT_DIR environment variable. For example,

pip install oakvar
export OV_ROOT_DIR=/mnt/oakvar
ov system setup

will create a new OakVar setup at /mnt/oakvar, or if the folder already exists, connect to it. If you want each node to install selected modules in its own storage but have all nodes share the same location for storing job results, you can do something like this.

pip install oakvar
export OV_JOBS_DIR=/mnt/shared/oakvar/jobs
ov system setup
ov module install module_1 module_2 …

For more on using environment variables to set up OakVar, as well as on using setup files, see https://docs.oakvar.com/install_system.

Need a Storage Space for Your Module?

Thanks to the generous sponsors who recently subscribed to the monthly sponsorship through GitHub, OakVar now has more space to store modules. If you have developed an OakVar module and need storage space for publishing it to the OakVar Store, please reach out to me.

And of course, there have been many miscellaneous bug fixes and improvements overall.

I hope you will feel the same excitement as I do regarding the current state of OakVar and its future possibilities. If you haven’t tried OakVar yet, please give it a shot and let me know what you think. Below are relevant links.

Repository: https://github.com/rkimoakbioinformatics/oakvar
Documentation: https://docs.oakvar.com
Medium OakVar channel: https://medium.com/oakvar
GitHub Sponsors: https://github.com/sponsors/rkimoakbioinformatics

Ryangguk

I am CEO of a bioinformatics startup, Oak Bioinformatics, LLC.