Self-hosted data for WEBKNOSSOS

Norman Rzepka · Published in WEBKNOSSOS · Feb 28, 2023

Sharing and working with large 3D image data can be very tedious without the right software. WEBKNOSSOS makes it easy. Free accounts on webknossos.org include some storage space to get you started. However, if you already have storage resources of your own, you may not want to upgrade just for storage. With its OME-Zarr support, WEBKNOSSOS can access datasets that are stored externally. In this tutorial, we’ll explain how to convert data into OME-Zarr and how to set up a static file server for use with WEBKNOSSOS.

Using webknossos.org instead of a self-hosted WEBKNOSSOS instance has the benefit that we maintain the server, install updates frequently, and back up your annotations. Also, you can upgrade to paid features of WEBKNOSSOS at any time.

Architecture diagram of WEBKNOSSOS with internal storage and remote storage (e.g. your own server and cloud storage)

Getting your data ready

WEBKNOSSOS supports a range of chunked file formats for external access including OME-Zarr, N5 and Neuroglancer precomputed. We strongly recommend OME-Zarr, because it is best supported. Other file formats, such as TIFF, CZI, IMS or HDF5, need to be converted to OME-Zarr. There are multiple tools available to do that, e.g. bioformats2raw or NGFF-Converter. If you are savvy in Python, you can use the “webknossos” package:

import webknossos as wk

ds = wk.Dataset.from_images(
    "path/to/tiff/stack",
    "dataset_wkw",
    voxel_size=(4, 4, 40),
    data_format="zarr",
)
ds.compress()
ds.downsample()

The Python library also has features for creating layers and mags (resolution levels) from arbitrary numpy arrays. You can easily integrate that into your existing scripts. Make sure to check out the examples in the documentation.

Set up your own storage server

You can also use your own server as storage for WEBKNOSSOS. You need a server that is publicly reachable from the Internet and has a (sub)domain name attached to it. It needs to run a web server with HTTPS support. We also recommend enabling basic auth to prevent unauthorized access to your data.

In this tutorial, we show how to set up Caddy on an Ubuntu server. Caddy is a great choice because it includes automatic HTTPS configuration and is easy to use. Other servers such as Apache, nginx, or Traefik are also great options, and there are many tutorials available on how to set them up.

First, you need to install Caddy as explained in the documentation: https://caddyserver.com/docs/install. Next, you need to assign a folder from where the data is served. In our example, we’ll use “/opt/webknossos”. Go ahead and create that folder.

Now we need to configure Caddy. Its configuration file is usually located at “/etc/caddy/Caddyfile”, although depending on your Linux distribution it might be somewhere else. Copy the following content into your Caddyfile. Change the domain name in the first line and generate your own password hash for basic auth using the “caddy hash-password” command.

example.cloud.scm.io {
    root * /opt/webknossos
    file_server browse
    basicauth * {
        webknossos $2a$14$uGab5vbFo/VH1Jubz39/yOW57uQlPmhT//mbGvT85dDn.xIiqRJam
    }
}

Reload the Caddy service using “sudo systemctl reload caddy”. After a few seconds, Caddy should serve your data.
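To check that basic auth is working, you can request a file over HTTPS. A minimal sketch in Python, where the domain, user name, and password are placeholders for your own values:

```python
import base64
import urllib.request

def basic_auth_header(user: str, password: str) -> str:
    # HTTP Basic auth sends "Basic " + base64("user:password").
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return f"Basic {token}"

# Placeholder URL and credentials; substitute your own.
req = urllib.request.Request(
    "https://example.cloud.scm.io/l4_sample/.zgroup",
    headers={"Authorization": basic_auth_header("webknossos", "my-password")},
)
# With valid credentials, urllib.request.urlopen(req) should return the
# .zgroup JSON; without the Authorization header, the server should
# answer with HTTP 401.
```

WEBKNOSSOS sends the same header when you enter the credentials for a remote dataset.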

Now you are ready to add your data to this folder. If you don’t have data of your own, you can download and extract the following dataset to test your setup: https://static.webknossos.org/data/l4_sample.zarr.zip

The folder structure should look something like this:

/opt/webknossos/
└── l4_sample
├── .zgroup
├── color
│ ├── .zattrs
│ ├── .zgroup
│ ├── 1
│ ├── 2-2-1
│ ├── 4-4-1
│ ├── 8-8-2
│ └── 16-16-4
├── datasource-properties.json
└── segmentation
├── .zattrs
├── .zgroup
├── 1
├── 2-2-1
├── 4-4-1
├── 8-8-2
└── 16-16-4
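If you want to double-check a dataset before pointing WEBKNOSSOS at it, a small helper can verify that the expected Zarr entries exist. The list below reflects the example layout above and is not exhaustive; adjust it for your own layers:

```python
from pathlib import Path

# Entries taken from the example layout; adjust for your own layers.
REQUIRED_ENTRIES = [
    ".zgroup",
    "datasource-properties.json",
    "color/.zattrs",
    "color/.zgroup",
]

def missing_entries(dataset_dir: str) -> list[str]:
    # Return the required entries that are absent from the dataset folder.
    root = Path(dataset_dir)
    return [entry for entry in REQUIRED_ENTRIES if not (root / entry).exists()]
```

For the l4_sample dataset above, `missing_entries("/opt/webknossos/l4_sample")` should return an empty list once everything is extracted.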

Your server is now fully prepared to serve data to WEBKNOSSOS. Head over to your account on webknossos.org and add your dataset. Go to “Add Dataset” > “Add Remote Dataset” and enter the URL to your dataset, e.g. https://example.cloud.scm.io/l4_sample. Click “Add Layer” and voilà: your data should be imported and ready to be visualized, annotated, and shared.

Cloud storage

If you don’t want to manage your own storage servers, you can also buy storage from one of the many cloud providers. Amazon S3 is the most popular choice, but can be quite pricey. Especially costs for egress traffic add up quickly. While Google Cloud storage is a popular alternative, pricing is in the same ballpark. Cheaper alternatives include Backblaze B2, Cloudflare R2, and Scaleway Object Storage.

To use cloud storage, simply create an account with the provider of your choice, create a storage bucket, upload some datasets, and fetch the access credentials. Now you can import the data into WEBKNOSSOS using the bucket URL, e.g. s3://webknossos-zarr/demodata/l4_sample, and the corresponding credentials.

To learn more about updates to WEBKNOSSOS, follow us on Twitter or Mastodon. If you haven’t already, go to webknossos.org and sign up for a free account.
