Document Level Security in Elasticsearch — Part 1

Alon Aizenberg
9 min read · Jun 5, 2019

TL;DR

This is a simple hands-on guide to setting up and using the Document Level Security (DLS) feature in Open Distro for Elasticsearch. We will use only Docker and curl. If you want to skip the explanations, jump straight to the hands-on part, “Getting to work”.

For a more advanced, production-ready setup, see the second part of this blog post.

Why Open Distro for Elasticsearch?

Recently Amazon Web Services took the open source version of Elasticsearch and published it, with a few new plugins, under the name “Open Distro for Elasticsearch”. I will not go into the details of why Amazon did this, or whether it was an ethical move. I will also not discuss why an organisation would want to use Open Distro for Elasticsearch rather than the original distribution from Elastic. There are many discussions about these topics, to which I have little to add. The goal of this post is simply to explain how to configure and use the Document Level Security feature in Elasticsearch. Open Distro for Elasticsearch has a free, open source implementation of this feature, and that is the main reason I’m using Open Distro for this post. In general there is very little information about the DLS feature out there, and I hope that this post and the follow-up post will help others struggling to understand and use it.

Elasticsearch security background

Historically, Elasticsearch has come with a completely free, open source core plus a few extensions for which you need to pay for a licence. One of the most used non-free extensions is the security extension (now part of what is called “Elastic Stack Features”, formerly packaged as X-Pack). The security extension combines several very important features, such as encrypted communications, role-based access control, authentication and authorization, document level security and many more (Elastic publishes a list of the paid features of the official distribution). As you can understand, some of these features are basic necessities for anyone running the product in production, and hence this was one of the ways Elastic sold licences to users. Recently, however, some of the security features in Elasticsearch were opened up and are now offered for free by Elastic (see, for example, the announcement in which Elastic opened encrypted communications, role-based access control and additional features). I assume this happened because of the competition between the Elastic and Amazon distributions (in Open Distro all security extensions are fully open under the permissive Apache 2.0 licence). One feature that is still not free in the official distribution, but is free in Open Distro for Elasticsearch, is document level security. Now let’s see what document level security is, how it works and how to use it.

What is Document Level Security

Document level security (DLS) is a feature that lets you apply different content filter rules based on end-user roles. With DLS we can configure two different roles with different content filters on a particular Elasticsearch index, and so restrict users with a specific role to only the portion of the documents in that index that their role allows.

For example, suppose we have two user types in a system: managers and employees. Managers should be able to read content published specifically for managers as well as content published for employees. Employees, on the other hand, can read only content published for employees. We can build this type of search system with the DLS feature.

This can also be used for more complex setups with multiple groups of users, where each user is a member of one or more groups.
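
To make the managers and employees example concrete, here is a minimal Python sketch of the idea. It is a toy model, not Elasticsearch: the `audience` field, the role names, the sample titles and the `ROLE_FILTERS` table are all hypothetical, and only illustrate how a per-role filter narrows what each user can read.

```python
# Toy model of document level security: each role carries a content filter,
# and a search only ever sees documents that pass the caller's filter.
# All names and sample data here are illustrative assumptions.

DOCUMENTS = [
    {"title": "Quarterly bonus plan", "audience": "managers"},
    {"title": "Office opening hours", "audience": "employees"},
]

# Each role maps to the set of audiences it is allowed to read.
ROLE_FILTERS = {
    "manager": {"managers", "employees"},
    "employee": {"employees"},
}


def visible_documents(role, documents=DOCUMENTS):
    """Return the titles of the documents a user with `role` may read."""
    allowed = ROLE_FILTERS[role]
    return [doc["title"] for doc in documents if doc["audience"] in allowed]


print(visible_documents("manager"))
print(visible_documents("employee"))
```

The point of the model is that the filter belongs to the role, not to the query: both users run the same search and simply get different result sets.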

Getting to work

Let’s see how to set this up with Open Distro for Elasticsearch (it generally works in a very similar fashion with the regular Elasticsearch distribution).

1. Basic Elasticsearch node setup with docker

Let’s start a cluster with one node running Open Distro for Elasticsearch from the official Docker images, as explained in the official documentation.

Execute the following command to start the Elasticsearch node:

docker run -p 9200:9200 -p 9600:9600 -e "discovery.type=single-node" amazon/opendistro-for-elasticsearch:0.9.0

Wait for the node to start fully, then check that Elasticsearch is running by executing a simple curl command:

curl -XGET https://localhost:9200 -u admin:admin --insecure

You should see the output:

{
  "name" : "KtSUmCQ",
  "cluster_name" : "odfe-cluster",
  "cluster_uuid" : "KXnLKkmISsSWDxGEW209qQ",
  "version" : {
    "number" : "6.7.1",
    "build_flavor" : "oss",
    "build_type" : "tar",
    "build_hash" : "2f32220",
    "build_date" : "2019-04-02T15:59:27.961366Z",
    "build_snapshot" : false,
    "lucene_version" : "7.7.0",
    "minimum_wire_compatibility_version" : "5.6.0",
    "minimum_index_compatibility_version" : "5.0.0"
  },
  "tagline" : "You Know, for Search"
}
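
For readers who would rather make the same check from code, the sketch below shows roughly what the two curl flags do, using only Python's standard library. `-u admin:admin` sends HTTP Basic credentials, and `--insecure` skips TLS certificate verification (this assumes the node is serving a certificate your machine does not trust, which is why plain verification would fail). The helper names are made up for illustration and are not part of any Elasticsearch client.

```python
import base64
import json
import ssl
import urllib.request


def build_request(url, user, password):
    """Attach an HTTP Basic Authorization header, like curl's -u user:password."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})


def insecure_context():
    """TLS context that skips certificate checks, like curl's --insecure."""
    ctx = ssl.create_default_context()
    ctx.check_hostname = False  # must be disabled before verify_mode is relaxed
    ctx.verify_mode = ssl.CERT_NONE
    return ctx


def cluster_info(url="https://localhost:9200", user="admin", password="admin"):
    """Fetch the root endpoint and return the parsed JSON (needs a running node)."""
    request = build_request(url, user, password)
    with urllib.request.urlopen(request, context=insecure_context(), timeout=5) as resp:
        return json.load(resp)


# Usage, once the container is up:
#   print(cluster_info()["version"]["number"])
```

Skipping certificate verification is acceptable for a local demo only; it should not be carried over to a real deployment.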

2. Create index and index some content

Now let’s index some documents. First, let’s create a simple index:

curl -XPUT "https://localhost:9200/content" -u admin:admin --insecure -H 'Content-Type: application/json' -d'
{
  "settings": {
    "index": {
      "number_of_shards": 1,
      "number_of_replicas": 1
    }
  },
  "mappings": {
    "_doc": {
      "properties": {
        "title": {
          "type": "text"
        },
        "content": {
          "type": "text"
        },
        "groups": {
          "type": "keyword"
        }
      }
    }
  }
}'

Response should look like this:

{
  "acknowledged": true,
  "shards_acknowledged": true,
  "index": "content"
}

Note that we have a very simple mapping with just three fields: a title, the content and a list of groups. In addition, each document gets an _id field generated for us by Elasticsearch.

Now let’s index 3 documents into the “content” index using Pulp Fiction quotes. We will use the bulk API for this:

curl -XPOST "https://localhost:9200/_bulk" -u admin:admin --insecure -H 'Content-Type: application/json' -d'
{"index":{"_index":"content","_type":"_doc"}}
{"title":"Ezekiel 25-17","content":"The path of the righteous man is beset on all sides by the iniquities of the selfish and the tyranny of the evil men. Blessed is he who, in the name of charity and goodwill, shepherds the weak through the valley of darkness, for he is truly his brothers keeper, and the finder of lost children. And I will strike down upon thee with great vengeance and furious anger those who attempt to poison and destroy my brothers. And you will know my name is the Lord when I lay my vengeance upon thee!","groups":["groupA","groupB","groupC"]}
{"index":{"_index":"content","_type":"_doc"}}
{"title":"A Royale with Cheese","content":"Alright, well you can walk into a movie theater in Amsterdam and buy a beer. And I dont mean just like in no paper cup, Im talking about a glass of beer. And in Paris, you can buy a beer at McDonalds. And you know what they call a, uh, a Quarter Pounder with Cheese in Paris?","groups":["groupC"]}
{"index":{"_index":"content","_type":"_doc"}}
{"title":"About Foot Massages","content":"You know, Im getting kinda tired. I could use a foot massage myself.","groups":["groupB","groupC"]}
'

Response should look like this:

{
  "took": 52,
  "errors": false,
  "items": [
    {
      "index": {
        "_index": "content",
        "_type": "_doc",
        "_id": "zPZkJmsBxSvmw4TP5eSN",
        "_version": 1,
        "result": "created",
        "_shards": {
          "total": 2,
          "successful": 1,
          "failed": 0
        },
        "_seq_no": 0,
        "_primary_term": 1,
        "status": 201
      }
    },
    {
      "index": {
        "_index": "content",
        "_type": "_doc",
        "_id": "zfZkJmsBxSvmw4TP5eSN",
        "_version": 1,
        "result": "created",
        "_shards": {
          "total": 2,
          "successful": 1,
          "failed": 0
        },
        "_seq_no": 1,
        "_primary_term": 1,
        "status": 201
      }
    },
    {
      "index": {
        "_index": "content",
        "_type": "_doc",
        "_id": "zvZkJmsBxSvmw4TP5eSN",
        "_version": 1,
        "result": "created",
        "_shards": {
          "total": 2,
          "successful": 1,
          "failed": 0
        },
        "_seq_no": 2,
        "_primary_term": 1,
        "status": 201
      }
    }
  ]
}

As you can see, we indexed 3 documents. The groups field of the first document is ["groupA","groupB","groupC"], the second has only one group, ["groupC"], and the third has two: ["groupB","groupC"]. End users will query these documents, and next we will set up the configuration that lets each end user, with their respective groups, find only the documents they are entitled to see.
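
The body of the bulk request used above is newline-delimited JSON: every document is preceded by an action line naming the target index, and the body has to end with a newline. As a rough sketch of that layout, the Python below builds such a body from a list of documents. The `build_bulk_body` helper is hypothetical, and the `content` field of the documents is left out for brevity.

```python
import json


def build_bulk_body(index, docs, doc_type="_doc"):
    """Build a bulk request body: one action line followed by one source line per document."""
    lines = []
    for doc in docs:
        # Action line: tells Elasticsearch to index the next line into `index`.
        lines.append(json.dumps({"index": {"_index": index, "_type": doc_type}}))
        # Source line: the document itself, serialized on a single line.
        lines.append(json.dumps(doc))
    # The bulk body must be terminated by a newline.
    return "\n".join(lines) + "\n"


docs = [
    {"title": "A Royale with Cheese", "groups": ["groupC"]},
    {"title": "About Foot Massages", "groups": ["groupB", "groupC"]},
]
body = build_bulk_body("content", docs)
print(body)
```

Generating the body this way avoids hand-written brace mismatches in the action lines, which the bulk endpoint rejects.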

3. Set up basic DLS with a local user, a role and a role mapping

Now that we have a running Elasticsearch instance with a few indexed documents, let’s see how document level security works. We will start with a very basic setup in which users are managed in the local storage of the Elasticsearch security plugin. This is enough for our explanation, but not very useful in a real-world scenario. For a more realistic setup, using JWTs that represent authenticated users, see my second blog post on the Document Level Security feature.

So let’s create a user:

curl -XPUT "https://localhost:9200/_opendistro/_security/api/internalusers/quentin" -u admin:admin --insecure -H 'Content-Type: application/json' -d'
{
  "password": "tarantino",
  "attributes": {
    "groups": "groupA\", \"groupB"
  }
}'

Response should look like this:

{
  "status": "CREATED",
  "message": "'quentin' created."
}

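We created a user named “quentin” with a custom attribute called groups. Note the escaped quotes: the attribute is stored as a single string, groupA", "groupB, written so that once it is placed between the quotes of the terms array in the role’s DLS query it reads as two strings, groupA and groupB. These will be used later as the end user’s permissions on the documents in the index “content”.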

Now we will create a role (called permissinedReadRole) and attach a DLS configuration to it. The role will be assigned to users, and the DLS configuration will be applied to the API calls made by any user who has it.

curl -XPUT "https://localhost:9200/_opendistro/_security/api/roles/permissinedReadRole" -u admin:admin --insecure -H 'Content-Type: application/json' -d'
{
  "indices": {
    "content": {
      "*": [
        "SEARCH"
      ],
      "_dls_": "{\"terms_set\": {\"groups\": {\"terms\": [\"${attr.internal.groups}\"], \"minimum_should_match_script\": {\"source\": \"1\"}}}}"
    }
  }
}'

Response will look like:

{
  "status": "CREATED",
  "message": "'permissinedReadRole' created."
}

Looking at the configuration above, the _dls_ string sets up a terms_set query on the groups field of the “content” index. It takes the internal user’s groups attribute (${attr.internal.groups}) and uses it as the list of terms whenever the user executes a SEARCH API query on the “content” index. The minimum_should_match_script of 1 means a document matches as soon as at least one of the user’s groups appears in its groups field.
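
It can help to see what the DLS query looks like once the user attribute has been filled in. The sketch below is a guess at the mechanism, assuming the placeholder is replaced by plain text substitution (which is what the escaped quotes inside the user's groups attribute suggest). `render_dls` is a hypothetical stand-in, not the plugin's actual code.

```python
import json

# The _dls_ value from the role definition, with the JSON string escaping removed.
DLS_TEMPLATE = (
    '{"terms_set": {"groups": {"terms": ["${attr.internal.groups}"], '
    '"minimum_should_match_script": {"source": "1"}}}}'
)

# The single string stored in the "groups" attribute of the user quentin.
GROUPS_ATTRIBUTE = 'groupA", "groupB'


def render_dls(template, groups_attribute):
    """Assumed behaviour: replace the placeholder with the raw attribute text."""
    return template.replace("${attr.internal.groups}", groups_attribute)


# After substitution the text is valid JSON with a two-element terms array.
query = json.loads(render_dls(DLS_TEMPLATE, GROUPS_ATTRIBUTE))
print(query["terms_set"]["groups"]["terms"])
```

Under that assumption, the quotes embedded in the attribute are what turn one stored string into two separate terms in the final query.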

Now we only need to map the user (quentin) to the role:

curl -XPUT "https://localhost:9200/_opendistro/_security/api/rolesmapping/permissinedReadRole" -u admin:admin --insecure -H 'Content-Type: application/json' -d'
{
  "users": [
    "quentin"
  ]
}'

Response will look like:

{
  "status": "CREATED",
  "message": "'permissinedReadRole' created."
}

We are almost done; the only thing left is to test this setup. We expect that when the user “quentin” executes a SEARCH API call on the index “content”, he gets back only the documents that match the list of groups we assigned to him when we created the user. As this user has the groups groupA and groupB, we expect to find the first document and the last one, but not the second one (which is readable only by users who have groupC in their groups list).
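
The expected result can be checked on paper with a small Python model of the filter. This is only a sketch of the "at least one group in common" rule applied to the titles and groups of the three documents indexed earlier. It does not call Elasticsearch, and the function names are invented for the example.

```python
# Titles and groups of the three documents indexed earlier.
DOCUMENTS = [
    {"title": "Ezekiel 25-17", "groups": ["groupA", "groupB", "groupC"]},
    {"title": "A Royale with Cheese", "groups": ["groupC"]},
    {"title": "About Foot Massages", "groups": ["groupB", "groupC"]},
]


def terms_set_matches(doc_values, terms, minimum_should_match=1):
    """True when at least `minimum_should_match` of `terms` occur in the field values."""
    return len(set(doc_values) & set(terms)) >= minimum_should_match


def visible_titles(user_groups, documents=DOCUMENTS):
    """Titles of the documents that pass the group filter for this user."""
    return [d["title"] for d in documents if terms_set_matches(d["groups"], user_groups)]


print(visible_titles(["groupA", "groupB"]))  # the groups assigned to quentin
print(visible_titles(["groupC"]))            # a hypothetical user in groupC only
```

A user whose only group is groupC would see all three documents, because every document lists groupC.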

Let’s make the query:

curl -XGET https://localhost:9200/content/_search -u quentin:tarantino --insecure

We should get back only 2 of the 3 documents, so the response will look like this (the generated _id values will differ on your machine):

{
  "took": 3,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 1,
    "hits": [
      {
        "_index": "content",
        "_type": "_doc",
        "_id": "xSrwHGsBZEOqa8cU7en1",
        "_score": 1,
        "_source": {
          "title": "Ezekiel 25-17",
          "content": "The path of the righteous man is beset on all sides by the iniquities of the selfish and the tyranny of the evil men. Blessed is he who, in the name of charity and goodwill, shepherds the weak through the valley of darkness, for he is truly his brothers keeper, and the finder of lost children. And I will strike down upon thee with great vengeance and furious anger those who attempt to poison and destroy my brothers. And you will know my name is the Lord when I lay my vengeance upon thee!",
          "groups": [
            "groupA",
            "groupB",
            "groupC"
          ]
        }
      },
      {
        "_index": "content",
        "_type": "_doc",
        "_id": "xyrwHGsBZEOqa8cU7en1",
        "_score": 1,
        "_source": {
          "title": "About Foot Massages",
          "content": "You know, Im getting kinda tired. I could use a foot massage myself.",
          "groups": [
            "groupB",
            "groupC"
          ]
        }
      }
    ]
  }
}

To see that a different user gets different results, run the same query as the admin user, who has no DLS configured and hence should get back all 3 documents:

curl -XGET https://localhost:9200/content/_search -u admin:admin --insecure

4. Cleanup — Drop all Elasticsearch resources we created:

We created 4 resources: an index with 3 documents, a local user, a role and a role mapping. Let’s drop them all to have a clean system again:

curl -XDELETE "https://localhost:9200/content" -u admin:admin --insecure
curl -XDELETE "https://localhost:9200/_opendistro/_security/api/internalusers/quentin" -u admin:admin --insecure
curl -XDELETE "https://localhost:9200/_opendistro/_security/api/roles/permissinedReadRole" -u admin:admin --insecure
curl -XDELETE "https://localhost:9200/_opendistro/_security/api/rolesmapping/permissinedReadRole" -u admin:admin --insecure

Conclusion

We have learned what Document Level Security is in the context of Elasticsearch, and how to configure it in a very basic setup. In the next part of this blog post I will explain a more realistic setup where your users are not managed in the Elasticsearch internal user store. Specifically, we will look at a situation where your end users live in some other system that handles end-user authentication. That system generates a JWT for each authenticated user, containing that user’s groups. In this scenario we need to configure Open Distro for Elasticsearch to accept JWTs generated by the authentication system and to let users execute search requests authenticated with the JWT. We will see how to set this up, including all the configuration needed for the Document Level Security feature.

In case you have any questions you can reach me on Twitter @alonaizenberg

Alon Aizenberg

Development / Product manager — @alonaizenberg on Twitter.