Humio 1.1: Introducing Views & Repos
Today we’re releasing a new major update to Humio, with tons of fixes and one new big feature: Views & Repositories.
As a Humio user you already know our concept of a ‘dataspace’. With this release, we’re abolishing that notion of dataspaces and replacing it with two separate entities: Repositories (which contain data) and Views (which only show data). Don’t worry, with the upgrade all your dataspaces have been converted to repositories and you can still use all the same APIs and UI interactions as before.
About the Names
There are a few really hard problems in computer science, and one of them is naming things. The names View and Repo have been the subject of much debate here at Humio, and a good way to think of these two words is to compare them with the names Table and View as used in the context of a relational database. In a database, tables contain the data and views contain a read-only result of a query, potentially joining multiple tables. Likewise in Humio: a repository contains data, and a view is a filtered aggregate across repositories. Data contained in Humio is not as structured as it would be in a relational database, so it doesn’t make sense to call it a table. Thus, the more generic concept of a repository.
Play with Views!
To get started, you can just click the ‘Add’ button on the front page, which will bring you to this choice between a green pill, and a purple pill:
Choosing the green pill lets you create a view. Views and repos have separate access control (each has its own list of users who can see it), and you need to be an admin of a repository to be able to create a view that references that repository.
Example: Restricting Access
Say you have a repository that contains all your AD logs from your entire organization, but you have decentralized help-desk operations in different parts of the company. Maybe it is a problem that the various help-desks can see everyone else’s AD activity.
Now you can create a view of just the AD logs that pertain to part of your organization, and only assign access rights to personnel in the local help-desk team so that they only have access to the AD logs for their own users. It might look like this:
Then you can go to the settings of the east-AD-helpdesk view, and add just the users who should be allowed to see event data from AD5.prod. Those users do not need to have access to the underlying repository to be able to access the view.
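The access model in this example can be sketched in a few lines of Python. This is only an illustrative model, not Humio's implementation or API: the class names, the host field, the repository name ad-logs and the user names are all assumptions made up for the sketch.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set

Event = Dict[str, str]


@dataclass
class Repository:
    """Holds the data and has its own access list (hypothetical model)."""
    name: str
    events: List[Event] = field(default_factory=list)
    users: Set[str] = field(default_factory=set)


@dataclass
class View:
    """Shows a filtered part of a repository, with a separate access list."""
    name: str
    repo: Repository
    event_filter: Callable[[Event], bool]
    users: Set[str] = field(default_factory=set)

    def search(self, user: str) -> List[Event]:
        # Only the view's own access list is checked, not the repository's.
        if user not in self.users:
            raise PermissionError(f"{user} may not read view {self.name}")
        return [e for e in self.repo.events if self.event_filter(e)]


# Hypothetical repository with AD logs from the whole organization.
ad_logs = Repository(
    name="ad-logs",
    events=[
        {"host": "AD5.prod", "msg": "login ok"},
        {"host": "AD7.prod", "msg": "login failed"},
    ],
    users={"central-admin"},
)

# The view only exposes events from AD5.prod (the field name is assumed).
east_view = View(
    name="east-AD-helpdesk",
    repo=ad_logs,
    event_filter=lambda e: e["host"] == "AD5.prod",
    users={"east-helpdesk-alice"},
)

visible = east_view.search("east-helpdesk-alice")
```

In this sketch the help-desk user is not in the repository's user list at all, yet can still read the filtered events through the view.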
Previously in Humio, it was an issue that you were not able to mix data with different retention settings. Now you can create a view that merges data from two separate repositories, each with its own retention setting. Beware, though, that when you get into the ‘tail’ of the view, not all repositories contribute data: the repository with the shorter retention has already dropped its events from that part of the time range.
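The ‘tail’ effect can be illustrated with a small Python model. It is a sketch under assumed numbers, not a description of how Humio stores data: the repository names, the retention values and the event ages are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Repo:
    name: str
    retention_days: int
    event_ages_days: List[int]  # age of each ingested event, in days

    def retained(self) -> List[int]:
        # Events older than the retention setting are no longer available.
        return [a for a in self.event_ages_days if a <= self.retention_days]


def view_search(repos: List[Repo], min_age: int, max_age: int) -> List[Tuple[str, int]]:
    """Return (repo name, event age) pairs for events in the age window."""
    return [
        (repo.name, age)
        for repo in repos
        for age in repo.retained()
        if min_age <= age <= max_age
    ]


# Two hypothetical repositories that received events at the same times
# but keep them for different lengths of time.
audit = Repo("audit", retention_days=365, event_ages_days=[1, 20, 200])
debug = Repo("debug", retention_days=30, event_ages_days=[1, 20, 200])

recent = view_search([audit, debug], 0, 30)   # both repos contribute
tail = view_search([audit, debug], 31, 365)   # only "audit" contributes
```

A search over the last 30 days returns events from both repositories, while a search over the older part of the range only sees the repository with the longer retention.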
More about Views
The expressions that limit which part of a repo is visible to a view can contain filter expressions, which can also rewrite events. For instance, you can hide information such as social security numbers or credit card information in the view by using the replace function in the filter expression, or you can rewrite field names using rename, or simply reassign them using := so that they are similar across different data sources.
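To make the idea of rewriting events concrete, here is a minimal Python sketch of masking and renaming. It does not use Humio's replace or rename functions and says nothing about their syntax; the helper names, the field names and the deliberately simple card-number pattern are assumptions for illustration only.

```python
import re
from typing import Dict

Event = Dict[str, str]

# Deliberately simple pattern for the example: 16 digits in groups of four,
# optionally separated by a space or a dash.
CARD_PATTERN = re.compile(r"\b(?:\d{4}[ -]?){3}\d{4}\b")


def mask_cards(event: Event, field_name: str = "message") -> Event:
    """Return a copy of the event with card-like numbers replaced."""
    out = dict(event)
    if field_name in out:
        out[field_name] = CARD_PATTERN.sub("****", out[field_name])
    return out


def rename_field(event: Event, old: str, new: str) -> Event:
    """Return a copy of the event with one field renamed, if present."""
    out = dict(event)
    if old in out:
        out[new] = out.pop(old)
    return out


# The stored event is left untouched; only what the view shows is rewritten.
raw = {"message": "paid with 4111 1111 1111 1111", "usr": "bob"}
shown = rename_field(mask_cards(raw), "usr", "user")
```

Both helpers return copies, which mirrors the point that a view changes what is shown without changing the data held in the repository.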
Many elements are local to views: files, saved queries, dashboards. These are not shared across views or repositories, so a view can also be used as a place to keep such changes local.
Views do not make queries more expensive or slower. In general they should reduce the workload for on-prem customers because less data ends up being searched. As always, however, the best way to make queries run fast is to use tags (#tag=value), preferably as early as possible in the query expression. Since the filter expressions used in defining a view are essentially prepended to the queries, it is particularly beneficial to put the tag filters in the view definition.
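The prepending can be pictured with a tiny Python helper. This is a sketch of the idea only: the function name is hypothetical, and joining the two parts with a pipe is an assumption about how the combination behaves, not a statement about Humio's internals.

```python
def effective_query(view_filter: str, user_query: str) -> str:
    """Combine a view's filter with a user's query, filter first."""
    view_filter = view_filter.strip()
    user_query = user_query.strip()
    if not view_filter:
        return user_query
    if not user_query:
        return view_filter
    # Assumption: the view filter acts as the first stage of the pipeline.
    return f"{view_filter} | {user_query}"


# A tag filter in the view definition ends up at the very start of the query.
query = effective_query("#host=AD5.prod", "loglevel=ERROR | count()")
```

With the tag filter in the view definition, every query run in the view starts with it, which is exactly where a tag filter does the most good.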
So What About Repositories?
As noted, all existing dataspaces have been converted to repositories, which contain all the same saved queries, files and dashboards that you already had. And for simple use cases, you can just keep doing everything in a repository like you used to.
Once you start creating views, however, you’ll notice that some things are missing from a view’s settings menu: Parsers, Ingest Tokens, Retention Settings, i.e., everything having to do with data ingest, parsing and storage. The distinctive feature of repos is that they contain the data, and so this is also where you have to go to configure these aspects.
Other Things in Humio 1.1.0
We’ve also made a wealth of other improvements that most people have probably missed. Maybe you’ll notice that it runs up to 30% faster on long searches. We’ve removed a lot of small bottlenecks in the core of the search engine, and altogether we find that it runs much more smoothly.
One final note for on-prem installations: Updating to 1.1 is not reversible, so you cannot roll back safely. The release notes have some provisions for how to back up your pre-upgrade state to be able to do a lossy rollback; i.e., you can do a partial rollback, losing certain changes that happened since the update.