OakVar v2.5 is here!

Ryangguk Kim
Published in OakVar
Jul 25, 2022

I’m excited to announce the release of OakVar v2.5 today. This release comes with big news.

First of all, if you are not familiar with OakVar, it is a genomic variant interpretation platform. It reads genomic variants in VCF and other formats, annotates each variant using diverse annotation sources such as ClinVar, gnomAD, and COSMIC, and generates reports in formats such as VCF, Excel, and CSV. It also has a graphical user interface for visualizing annotated variants. You can check it out at https://oakvar.com.

OakVar Store

Now to the news. First, the OakVar store! You can now publish your new OakVar modules to the OakVar store with the OakVar command-line interface. For example, let’s say you made an awesome OakVar annotation module named awesome and want to share it with the world. You can do this in three steps: pack the module, upload the packed files, and register them.

>ov module pack awesome

This will create one or two files, depending on whether your module has a data folder in it or not. Your module’s code will be packed into awesome__<version>__code.zip, where version is the version number defined in the awesome.yml file in your module’s directory. If your module has a data subdirectory, awesome__<version>__data.zip will be created as well.
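As a rough sketch of the naming rule above, the following shell function prints the zip file names you should expect after packing. The function name expected_pack_files is made up for illustration, and it assumes that the module's yml file has a top-level "version:" line and that the data folder is named data. It does not call ov at all.

```shell
# Print the zip file names that packing is expected to produce for a module.
# Hypothetical helper, not part of OakVar.
# Assumption: <module_dir>/<name>.yml has a top-level "version: X" line.
expected_pack_files() {
    module_dir=$1
    name=$(basename "$module_dir")
    # Take the first top-level version line and strip any quotes around the value.
    version=$(sed -n 's/^version:[[:space:]]*//p' "$module_dir/$name.yml" | head -n 1 | tr -d "\"'")
    echo "${name}__${version}__code.zip"
    # A data zip is only expected when the module has a data subdirectory.
    if [ -d "$module_dir/data" ]; then
        echo "${name}__${version}__data.zip"
    fi
}
```

Calling expected_pack_files path/to/awesome after ov module pack awesome gives you a quick checklist of the files to upload.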

Then, upload these zip files to a location where people can download them. Using their URLs,

>ov store register awesome --code-url ... --data-url ...

will register your module in the OakVar store. --data-url is needed only if your module produced a data zip file.
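To make the registration step concrete, here is a minimal sketch that builds the register command from a module name, a version, and the place where you uploaded the zip files. The function name register_command and the example.com URL are assumptions for illustration; only the --code-url and --data-url options come from the command above. The function prints the command instead of running it, so you can review it first.

```shell
# Build the "ov store register" command for zip files that are already uploaded.
# Hypothetical helper, not part of OakVar.
# base_url is wherever you uploaded the files; has_data is "yes" if a data zip exists.
register_command() {
    name=$1
    version=$2
    base_url=$3
    has_data=$4
    cmd="ov store register $name --code-url $base_url/${name}__${version}__code.zip"
    # --data-url is added only when the module produced a data zip file.
    if [ "$has_data" = "yes" ]; then
        cmd="$cmd --data-url $base_url/${name}__${version}__data.zip"
    fi
    echo "$cmd"
}
```

For example, register_command awesome 1.0.0 https://example.com/modules yes prints a command with both URLs, which you can then paste into your terminal.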

This way, you have total control of your module’s publication. You can just delete the module zip files from where you stored them, and the OakVar store will automatically deregister those deleted versions. If you move the module zip files to new locations, you can just register them again with the new URLs.

VCF2VCF

VCF2VCF is a new workflow introduced in OakVar v2.5. With the --vcf2vcf option,

ov run input.vcf --vcf2vcf -n annotated

will generate annotated.vcf with annotated variants. This workflow bypasses the generation of annotated.sqlite, which is a usual by-product of ov run. The advantage of --vcf2vcf is speed. Depending on the number of samples in the input, this option can make annotation faster by an order of magnitude or more. For example, mapping the variants on chromosome 20 of the 1000 Genomes Project data took about 10 minutes with --vcf2vcf on our test system, and about 15 minutes with ClinVar annotation added. The disadvantage is the lack of the database file, which is needed to visualize the result with ov gui. However, stay tuned, since we are working on improving this aspect.
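Since the speed-up depends on the number of samples, it can help to check how many samples your input has before choosing a workflow. The sketch below is not an OakVar feature. It is a small helper, with the made-up name vcf_sample_count, that reads the #CHROM header line of an uncompressed VCF file and counts the columns after the nine fixed ones (CHROM through FORMAT).

```shell
# Count the sample columns in an uncompressed VCF file.
# Hypothetical helper, not part of OakVar.
# The #CHROM header line has 8 fixed columns, plus FORMAT when genotypes
# are present; every column after those is one sample.
vcf_sample_count() {
    awk -F '\t' '/^#CHROM/ { n = NF - 9; if (n < 0) n = 0; print n; exit }' "$1"
}
```

Running vcf_sample_count input.vcf on a sites-only file with no genotype columns prints 0.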

Easy setup

Setting up OakVar is now easier than ever. After installing it with pip install oakvar, just issue

ov system setup

This will do all the steps necessary to set up OakVar on your system. This command is recommended after updating OakVar to a new version as well. We aim to make this command the one-stop place for all setup needs.
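If you update OakVar often, the two steps can be wrapped in one small shell function. This is only a convenience sketch: the function name upgrade_oakvar is made up, and it assumes that pip and ov are on your PATH in the environment where OakVar is installed.

```shell
# Upgrade OakVar with pip, then rerun setup as recommended after an update.
# Hypothetical helper, not part of OakVar.
upgrade_oakvar() {
    # Run setup only if the upgrade itself succeeded.
    pip install --upgrade oakvar && ov system setup
}
```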

ov system setup can receive a setup JSON file as well, making it easy to deploy to local and cluster nodes. It even recognizes environment variables as setup config values, which will be useful in Docker deployment. Details of ov system setup are described at https://docs.oakvar.com/install.

Documentation

This release comes with brand-new documentation. https://docs.oakvar.com will be the central location for all documentation on OakVar, including an initial release of the developer guide on OakVar modules at https://docs.oakvar.com/devguide.

Hopefully you are by now interested in trying this new version of OakVar. Head to https://docs.oakvar.com/install for instructions and try it for yourself!

OakVar GitHub: https://github.com/rkimoakbioinformatics/oakvar

If you have any questions or feedback, feel free to write to me.

I am CEO of a bioinformatics startup, Oak Bioinformatics, LLC.