Document Level Security in Elasticsearch — Part 2

Alon Aizenberg
Jun 19, 2019

Introduction

In the first part of this series on the Elasticsearch Document Level Security feature, we showed the simplest possible configuration of the feature using Open Distro for Elasticsearch. In this part, we will explore a more realistic, production-like scenario. We will assume that end-user permissions are stored in a non-standard user store, for example any backend that does not have an LDAP-based API. Then we will see how to set up, from start to finish, a search system that respects end users' permissions when they search for content.

What you need to know

  1. Docker installed on your machine + basic docker knowledge. We will use docker extensively for almost everything in this guide.
  2. Basic Elasticsearch knowledge.
  3. Basic curl knowledge.
  4. We will also use openssl to generate RSA key pairs, and some Node.js-based tools to generate test JWT tokens, but you are not required to have any knowledge of or experience with these tools.

A note about the setup and versions

This guide was tested on OSX, with docker version 18.09.1 and Open Distro for Elasticsearch docker image version 0.9.0. I assume it will work almost without changes on Linux, on other docker versions, and on vanilla Elasticsearch 6.x with the Open Distro security plugin installed, but these setups were not tested.

I used docker for almost every task in this tutorial: as a container for running openssl to generate RSA key pairs, as a nodejs and npm runtime to generate JWT tokens, and for a few additional tasks. You can execute these commands without docker by installing openssl, nodejs and the other tools locally. I prefer using docker, as it keeps my machine clean and lean :).

Intro

In the first part of the series about Document Level Security (DLS) in Elasticsearch, we saw how to do a very simple setup of Elasticsearch with DLS enabled, using the Open Distro for Elasticsearch security plugin and local users.

In this blog post, we will look at a more realistic, production-like setup.

The scenario we are talking about is content search over data in a single Elasticsearch index, where not all users are equal: some users must be able to find content that should not be visible to other users.

To do this, each document in the index must carry its own permission set. For example, consider the following Elasticsearch mapping:

"mappings": {
  "_doc": {
    "properties": {
      "title": {
        "type": "text"
      },
      "content": {
        "type": "text"
      },
      "groups": {
        "type": "keyword"
      }
    }
  }
}

Here each document in the index has a special field called groups, which holds a list of group names. An end user will be allowed to find only the documents that have 1 or more group names matching the groups in that user's own list.

The data model is therefore an index of documents, each with a set of group names, and on the other side end users, each with a list of groups they are members of.

We will use some quotes from the iconic movie Pulp Fiction to test Document Level Security in this blog post. Our data will look like this:

Document level security — testing data setup

In the above setup, we have 4 documents and 3 users. The user Quentin will be able to find documents 1, 2 and 3. The user Jules will be able to find documents 2 and 3. The user Vincent will be able to find only document 1. Note that document 4 will not be returned to any of our 3 users, as none of them is a member of groupD.
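The matching rule can be modelled in a few lines of Python. This is only a sketch of the rule, not of how Elasticsearch evaluates it. The document groups and the groups of Quentin and Jules are taken from the commands later in this post; Vincent's group is an assumption, inferred from the fact that he can find only document 1.

```python
# Test data: document id -> groups stored on the document.
DOCUMENT_GROUPS = {
    1: {"groupA", "groupC"},
    2: {"groupB"},
    3: {"groupA", "groupB"},
    4: {"groupD"},
}

# User -> groups carried in the user's JWT.
# Vincent's membership is an inference, not a value stated in the text.
USER_GROUPS = {
    "Quentin": {"groupA", "groupB"},
    "Jules": {"groupB"},
    "Vincent": {"groupC"},
}


def visible_documents(user):
    """Return the sorted ids of documents sharing at least one group with the user."""
    groups = USER_GROUPS[user]
    return sorted(
        doc_id
        for doc_id, doc_groups in DOCUMENT_GROUPS.items()
        if doc_groups & groups  # non-empty intersection means "1 or more groups match"
    )


for user in USER_GROUPS:
    print(user, visible_documents(user))
```

Document 4 never shows up, because no user in this model holds groupD.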

Document permissions will be stored in the index, and user permissions will be stored in the user's JWT token.

In such a system, the JWT tokens will be generated by an authentication service, which authenticates the user, queries some backend for the user's permissions (for example, the list of groups the user is a member of), and finally generates the JWT token with the user's group list and signs it with a private key.

Then this JWT can be used to access Elasticsearch, and execute search queries with respect to user permissions. It could look like this:

In the above diagram green numbered circles represent an offline setup by the system administrator, and the blue numbered circles represent the end user interactions with the system:

  1. A setup step where the system administrator generates a public/private key pair, configures the authentication service to sign tokens with the private key, and configures the security plugin of Elasticsearch to verify tokens with the public key.
  2. The end user authenticates to the authentication service with, for example, SAML2, basic authentication, or any other authentication method. The authentication service then queries a backend for the user's groups and generates a JWT token.
  3. The token is then returned to the end user.
  4. The end user makes a search query using the JWT from step 3 as a means of authentication. The security plugin verifies the token with the public key, and then executes the query with respect to the user's groups taken from the token.
  5. Finally, results are returned to the end user, containing only the data this user is allowed to see.
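As a rough illustration of step 2, here is a sketch of the claims an authentication service might put into the token. The helper name build_claims and its parameters are hypothetical; the claim names and the 36000 second lifetime mirror the test tokens generated later in this post. Signing the token with the private key (RS256) is left to a JWT library and is not shown.

```python
import time


def build_claims(user, groups, lifetime_seconds=36000, now=None):
    """Assemble the JWT payload for one user (hypothetical helper, unsigned)."""
    issued_at = int(time.time()) if now is None else int(now)
    return {
        "iat": issued_at,
        "exp": issued_at + lifetime_seconds,
        "sub": user,              # read by the plugin through subject_key
        "iss": "localhost",
        "roles": "frontEndUser",  # read by the plugin through roles_key
        # Group names are joined with '" , "' so that the value fits inside
        # the quoted placeholder of the DLS query used in Step #3.
        "groups": '" , "'.join(groups),
    }


print(build_claims("Quentin", ["groupA", "groupB"]))
```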

To set this up, and test everything we will use docker and curl. This is going to be a bit long, so stay with me ;)

Note: the step numbers below do not correspond to the step numbers in the diagram above.

Step #1 — Generate public/private key pair

The first thing we need to do is generate an RSA key pair. We need it to simulate working with JSON Web Tokens (JWTs). JWT tokens are signed with either a symmetric key or an asymmetric key pair. In this example we will use asymmetric keys. Let’s use docker and openssl to generate a private and public RSA key pair:

docker run --rm -i -v /tmp:/tmp alpine /bin/sh -c  "apk add openssl; openssl genrsa -out /tmp/private.pem 2048; openssl rsa -in /tmp/private.pem -outform PEM -pubout -out /tmp/public.pem"

This generates a 2048-bit RSA private key and the matching public key, and saves them to the files /tmp/private.pem and /tmp/public.pem respectively on your host machine.

Step #2 — Configure and start a docker container with Open Distro for Elasticsearch

We need to configure Elasticsearch to accept JWT as a means of authentication. To do this, we will configure the security plugin of Open Distro for Elasticsearch to accept JWT tokens and validate them against the public key we generated in step #1. First, create a folder called odfe-dls and, in this folder, a file called config.yml:

opendistro_security:
  dynamic:
    http:
      anonymous_auth_enabled: false
      xff:
        enabled: false
        internalProxies: '192\.168\.0\.10|192\.168\.0\.11' # regex pattern
        remoteIpHeader: 'x-forwarded-for'
        proxiesHeader: 'x-forwarded-by'
    authc:
      basic_internal_auth_domain:
        http_enabled: true
        transport_enabled: true
        order: 4
        http_authenticator:
          type: basic
          challenge: true
        authentication_backend:
          type: intern
      jwt_auth_domain:
        enabled: true
        http_enabled: true
        transport_enabled: true
        order: 0
        http_authenticator:
          type: jwt
          challenge: false
          config:
            signing_key: |-
              -----BEGIN PUBLIC KEY----- MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAvbfMJLb9U7KSKjg86TmgqfMJegAuMrK54G5Dn75zjM+jmcENCyIYNGP3w9/r+xEGe79A/JSnTZUF1rv/A2/P6DYfrrpemNd4fuabGPpoyd9msswDmWb3ic75Khxzq8HmcCxNHxs+bn172Qlj2f0oIoL2xA7k3VIAvCfzHvYwCZEyHtmdLnldFB4A+dXgfjLEHlKFU83iueh96JCKN9rY/SFfT7YB97WmTggcwkiUFW8mUuCIFZIsB5S57qbs20dWJ6ikwdMNitrx700zAdINTf/1S4UykKqs8N7OahVisYPHSaQFz+KCTE4m9GGL0VLuK0usXWeb1QeuYjLtCi5hGQIDAQAB
              -----END PUBLIC KEY-----
            jwt_header: "Authorization"
            jwt_url_parameter: null
            roles_key: "roles"
            subject_key: "sub"
        authentication_backend:
          type: noop

Replace the content of signing_key above with the content of the /tmp/public.pem file from your host machine.

Note: the format of the public key is very important. It must be exactly as seen in the above example, with the full key string on one line without any line breaks. The indentation of the yml file is also very important. If you misconfigure the key, you will see errors in the Elasticsearch node logs at startup time (and periodically), for example:

odfe-node1    | [2019-06-10T13:32:27,216][ERROR][c.a.d.a.h.j.HTTPJwtAuthenticator] [BZtQ-aM] Error creating JWT authenticator: io.jsonwebtoken.io.DecodingException: Illegal base64 character: '
odfe-node1 | '. JWT authentication will not work
odfe-node1 | io.jsonwebtoken.io.DecodingException: Illegal base64 character: '
odfe-node1 | '
odfe-node1 | at io.jsonwebtoken.io.Base64.ctoi(Base64.java:206) ~[jjwt-api-0.10.5.jar:?]
odfe-node1 | at io.jsonwebtoken.io.Base64.decodeFast(Base64.java:255) ~[jjwt-api-0.10.5.jar:?]

And when you try to authenticate with a JWT token, you will see in the Elasticsearch node logs:

odfe-node1    | [2019-06-11T10:54:11,028][ERROR][c.a.d.a.h.j.HTTPJwtAuthenticator] [BZtQ-aM] Missing Signing Key. JWT authentication will not work
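To avoid the line-break error shown above, the base64 body of the PEM file can be collapsed into a single line before pasting it into config.yml. The helper below is a minimal sketch; the function name and the shortened sample key are made up for illustration. How the BEGIN and END marker lines are arranged around the body should still follow the config example above.

```python
def pem_body_single_line(pem_text):
    """Join the base64 lines between the BEGIN/END markers into one string."""
    lines = [line.strip() for line in pem_text.splitlines()]
    # Drop empty lines and the "-----BEGIN ...-----" / "-----END ...-----" markers.
    body = [line for line in lines if line and not line.startswith("-----")]
    return "".join(body)


# Hypothetical, shortened sample; a real key from /tmp/public.pem is much longer.
sample_pem = """-----BEGIN PUBLIC KEY-----
MIIBIjANBgkqhkiG9w0BAQEF
AAOCAQ8AMIIBCgKCAQEAvbfM
-----END PUBLIC KEY-----
"""

print(pem_body_single_line(sample_pem))
```

With a real key, the input would come from the generated file, for example pem_body_single_line(open("/tmp/public.pem").read()).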

Now we will use docker to start a new Elasticsearch node and mount the security configuration file (config.yml) to the correct place in the docker container by running from the odfe-dls directory:

docker run --rm -v $(pwd)/config.yml:/usr/share/elasticsearch/plugins/opendistro_security/securityconfig/config.yml -p 9200:9200 -p 9600:9600 -e "discovery.type=single-node" amazon/opendistro-for-elasticsearch:0.9.0

Note that this command starts a new temporary docker container. Any data indexed into this container will be lost once the container is stopped.

Now that the Elasticsearch node is started, we need to run a command in the Elasticsearch container to load the new security configuration from the config.yml file. To do so, we will use the docker exec command. First we need to find the container id of the Elasticsearch node:

docker ps -n 1

This will print the container id of the last container you started with docker.

Now let’s reload the configuration:

docker exec <CONTAINER-ID> /usr/share/elasticsearch/plugins/opendistro_security/tools/securityadmin.sh -f /usr/share/elasticsearch/plugins/opendistro_security/securityconfig/config.yml -icl -nhnv -cacert /usr/share/elasticsearch/config/root-ca.pem -cert /usr/share/elasticsearch/config/kirk.pem -key /usr/share/elasticsearch/config/kirk-key.pem -t config

Replace <CONTAINER-ID> with your container id.

You should see the following output:

Open Distro Security Admin v6
Will connect to localhost:9300 ... done
Elasticsearch Version: 6.7.1
Open Distro Security Version: 0.9.0.0
Connected as CN=kirk,OU=client,O=client,L=test,C=de
Contacting elasticsearch cluster 'elasticsearch' and wait for YELLOW clusterstate ...
Clustername: odfe-cluster
Clusterstate: GREEN
Number of nodes: 1
Number of data nodes: 1
.opendistro_security index already exists, so we do not need to create one.
Populate config from /usr/share/elasticsearch
Will update 'security/config' with /usr/share/elasticsearch/plugins/opendistro_security/securityconfig/config.yml
SUCC: Configuration for 'config' created or updated
Done with success

Now we are almost ready. All that is left to do is create the index, index the test data, create a role with Document Level Security enabled, and create a role mapping so that all users with JWT tokens assume this role.

Step #3 — Prepare the data and the security configuration

The next steps are almost identical to the steps for setting up the data and the roles configuration in the first part of the series, but in some cases there are slight differences. So let’s start:

First let’s create the index and the mapping:

curl -XPUT "https://localhost:9200/content" -u admin:admin --insecure -H 'Content-Type: application/json' -d'
{
  "settings": {
    "index": {
      "number_of_shards": 1,
      "number_of_replicas": 1
    }
  },
  "mappings": {
    "_doc": {
      "properties": {
        "title": {
          "type": "text"
        },
        "content": {
          "type": "text"
        },
        "groups": {
          "type": "keyword"
        }
      }
    }
  }
}'

Now index 4 documents:

curl -XPOST "https://localhost:9200/_bulk" -u admin:admin --insecure -H 'Content-Type: application/json' -d'
{"index":{"_index":"content","_type":"_doc", "_id" : "1"}
{"title":"Ezekiel 25-17","content":"The path of the righteous man is beset on all sides by the iniquities of the selfish and the tyranny of the evil men. Blessed is he who, in the name of charity and goodwill, shepherds the weak through the valley of darkness, for he is truly his brothers keeper, and the finder of lost children. And I will strike down upon thee with great vengeance and furious anger those who attempt to poison and destroy my brothers. And you will know my name is the Lord when I lay my vengeance upon thee!","groups":["groupA","groupC"]}
{"index":{"_index":"content","_type":"_doc", "_id" : "2"}
{"title":"A Royale with Cheese","content":"Alright, well you can walk into a movie theater in Amsterdam and buy a beer. And I dont mean just like in no paper cup, Im talking about a glass of beer. And in Paris, you can buy a beer at McDonalds. And you know what they call a, uh, a Quarter Pounder with Cheese in Paris?","groups":["groupB"]}
{"index":{"_index":"content","_type":"_doc", "_id" : "3"}
{"title":"About Foot Massages","content":"You know, Im getting kinda tired. I could use a foot massage myself.","groups":["groupA","groupB"]}
{"index":{"_index":"content","_type":"_doc", "_id" : "4"}
{"title":"A please would be nice","content":"Get it straight, gentlemen: Im not here to say please, Im here to tell you what to do. And if self-preservation is an instinct that you possess, you would better do it and do it quick. If my help is not appreciated, lots of luck, gentlemen.","groups":["groupD"]}
'
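The body of the bulk request above is newline-delimited JSON: one action line followed by one source line per document, with a newline at the very end. If you prefer to generate it rather than type it, a small sketch could look like this. The helper name bulk_body is hypothetical, and the sample documents leave out the content field to keep the example short.

```python
import json


def bulk_body(index, docs):
    """Build an NDJSON _bulk body: an action line, then a source line, per document."""
    lines = []
    for doc_id, source in docs.items():
        action = {"index": {"_index": index, "_type": "_doc", "_id": doc_id}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(source))
    # The bulk API expects the last line to end with a newline as well.
    return "\n".join(lines) + "\n"


# Two of the four test documents, without the long "content" field.
docs = {
    "2": {"title": "A Royale with Cheese", "groups": ["groupB"]},
    "4": {"title": "A please would be nice", "groups": ["groupD"]},
}

body = bulk_body("content", docs)
print(body, end="")
```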

We will use curl to create a role with DLS setup, to filter by the groups field in the index, using values located in our JWT tokens:

curl -XPUT "https://localhost:9200/_opendistro/_security/api/roles/permissinedReadRole" -u admin:admin --insecure -H 'Content-Type: application/json' -d'
{
  "indices": {
    "content": {
      "*": [
        "SEARCH"
      ],
      "_dls_": "{\"terms_set\": {\"groups\": {\"terms\": [\"${attr.jwt.groups}\"], \"minimum_should_match_script\": {\"source\": \"1\"}}}}"
    }
  }
}'
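The DLS query above wraps the ${attr.jwt.groups} placeholder in a single pair of quotes, which is why the groups claim in Step #4 is written as groupA" , "groupB. The sketch below shows the effect, assuming the plugin replaces the placeholder with the claim value as plain text. That assumption is what the quoting trick relies on; this is not the plugin's actual code.

```python
import json

# The DLS query from the role definition, with the JSON string escaping removed.
DLS_TEMPLATE = (
    '{"terms_set": {"groups": {"terms": ["${attr.jwt.groups}"], '
    '"minimum_should_match_script": {"source": "1"}}}}'
)


def render_dls(template, groups_claim):
    """Substitute the claim value as plain text (assumed behavior) and parse the result."""
    return json.loads(template.replace("${attr.jwt.groups}", groups_claim))


# Quentin's claim value expands into two separate terms.
query = render_dls(DLS_TEMPLATE, 'groupA" , "groupB')
print(query["terms_set"]["groups"]["terms"])

# Jules has a single group, so no quoting trick is needed.
print(render_dls(DLS_TEMPLATE, "groupB")["terms_set"]["groups"]["terms"])
```

The minimum_should_match_script with source "1" means that a document matches as soon as at least 1 of the terms appears in its groups field.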

And the role mapping, which will map our users to the role we created above (permissinedReadRole):

curl -XPUT "https://localhost:9200/_opendistro/_security/api/rolesmapping/permissinedReadRole" -u admin:admin --insecure -H 'Content-Type: application/json' -d'
{
"backendroles" : [ "frontEndUser"]
}'

This role mapping is very important: it assigns every user whose JWT token contains the role frontEndUser to the role created above (with DLS), called permissinedReadRole, in Elasticsearch.
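Conceptually, the mapping works like the lookup below. This is a simplified sketch that only considers backend roles. Treating the roles claim as a comma-separated list is an assumption of the sketch, and the names ROLES_MAPPING and resolve_roles are made up for illustration.

```python
# Security role -> backend roles that are mapped to it (from the role mapping above).
ROLES_MAPPING = {"permissinedReadRole": {"frontEndUser"}}


def resolve_roles(roles_claim, mapping=ROLES_MAPPING):
    """Return the security roles a token gets from its 'roles' claim."""
    # Assumption: the claim is a comma-separated list of backend roles.
    backend_roles = {r.strip() for r in roles_claim.split(",") if r.strip()}
    return sorted(
        role for role, backends in mapping.items() if backends & backend_roles
    )


print(resolve_roles("frontEndUser"))
print(resolve_roles("someOtherRole"))
```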

Step #4 — Generate Json Web Tokens for our 3 users

Now let’s generate a JWT token and sign it with the private key from the file /tmp/private.pem:

docker run --rm -i -v /tmp:/tmp node /bin/bash -c "npm install -g jwtgen; jwtgen -a RS256 -p /tmp/private.pem -c "sub=Quentin" -c "iss=localhost" -c "roles=frontEndUser" -c 'groups=groupA\" , \"groupB' -e 36000 > /tmp/quentin-token.txt"

This generates a new file called /tmp/quentin-token.txt. The content of this file will look like this:

eyJ0eXAiOiJKV1QiLCJhbGciOiJSUzI1NiJ9.eyJpYXQiOjE1NjAzMzYzMDUsImV4cCI6MTU2MDM3MjMwNSwic3ViIjoiUXVlbnRpbiIsImlzcyI6ImxvY2FsaG9zdCIsInJvbGVzIjoiZnJvbnRFbmRVc2VyIiwiZ3JvdXBzIjoiZ3JvdXBBXCIgLCBcImdyb3VwQiJ9.psilLfme8HUG2z_AhQBuhm5OXm6BiD8qcNzPUt9rNuWWlToaNU53TPFaT9uOH3YE4YVJvlwUmHRl-5lI80TcqMRfn1blzKXRBSpbvCzTFVXh81Q-vWiZqEjY-c__NrdVqs2Rvnik6u7l90gzA9Z8b7I0KImH279hH26ADxAIuRTJ51_adA-x0p0xb7DstuY2JO1fjMnEYzIwA1D-HQ6CB2CjM7j5fNHH-BEnMc42NxfCVbca1KzeiqPjn0RErVmxpjg-C2UqXsidLsGO0zEo5idZqBQ5VGKL5kpitd2QTM58pb2Evd6BV7XdwbC_SkHz15lG0IKsYH1h3huFqxj_0Q

If you want to decode the content of the file, use jwt.io online decoder, or use jwt-cli:

npm install -g jwt-cli
cat /tmp/quentin-token.txt | jwt

If you do not have nodejs/npm and do not want to install them on your local machine, you can use docker to run the jwt-cli parser like this:

docker run --rm -i -v /tmp:/tmp node /bin/bash -c "npm install -g jwt-cli; cat /tmp/quentin-token.txt | jwt"
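Another option is a few lines of Python, since the payload is just base64url-encoded JSON. The sketch below decodes the claims without verifying the signature, so it is only suitable for inspection. The helper names are hypothetical, and the demo token is a stand-in with a dummy signature, not a token produced by jwtgen.

```python
import base64
import json


def b64url_decode(segment):
    """Decode a base64url segment, restoring the padding that JWTs strip."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def b64url_encode(data):
    """Encode bytes as base64url without padding, as used in JWTs."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def decode_jwt_payload(token):
    """Return the claims of a JWT WITHOUT verifying its signature."""
    _header_b64, payload_b64, _signature_b64 = token.strip().split(".")
    return json.loads(b64url_decode(payload_b64))


# Stand-in token with a dummy signature, built only for this demonstration.
claims = {"sub": "Quentin", "roles": "frontEndUser", "groups": 'groupA" , "groupB'}
header = {"typ": "JWT", "alg": "RS256"}
token = ".".join([
    b64url_encode(json.dumps(header).encode("utf-8")),
    b64url_encode(json.dumps(claims).encode("utf-8")),
    "dummy-signature",
])

print(decode_jwt_payload(token))
```

For the real token, pass the file content instead: decode_jwt_payload(open("/tmp/quentin-token.txt").read()).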

Step #5 — Test that everything works

Now that everything is created, we only need to check that it works as expected. To do this we will call the search API with curl, authenticate with the JWT generated in the steps above, and verify that we get back only 3 of the 4 indexed documents:

curl -XGET -H "Authorization: Bearer eyJ0eXAiOiJKV1QiLCJhbGciOiJSUzI1NiJ9.eyJpYXQiOjE1NjAzMzYzMDUsImV4cCI6MTU2MDM3MjMwNSwic3ViIjoiUXVlbnRpbiIsImlzcyI6ImxvY2FsaG9zdCIsInJvbGVzIjoiZnJvbnRFbmRVc2VyIiwiZ3JvdXBzIjoiZ3JvdXBBXCIgLCBcImdyb3VwQiJ9.psilLfme8HUG2z_AhQBuhm5OXm6BiD8qcNzPUt9rNuWWlToaNU53TPFaT9uOH3YE4YVJvlwUmHRl-5lI80TcqMRfn1blzKXRBSpbvCzTFVXh81Q-vWiZqEjY-c__NrdVqs2Rvnik6u7l90gzA9Z8b7I0KImH279hH26ADxAIuRTJ51_adA-x0p0xb7DstuY2JO1fjMnEYzIwA1D-HQ6CB2CjM7j5fNHH-BEnMc42NxfCVbca1KzeiqPjn0RErVmxpjg-C2UqXsidLsGO0zEo5idZqBQ5VGKL5kpitd2QTM58pb2Evd6BV7XdwbC_SkHz15lG0IKsYH1h3huFqxj_0Q" https://localhost:9200/content/_search --insecure

Replace the Bearer base64 encoded token above with the contents of the file /tmp/quentin-token.txt.

And you should see as expected only 3 documents:

{
  "took": 5,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 3,
    "max_score": 1,
    "hits": [
      {
        "_index": "content",
        "_type": "_doc",
        "_id": "1",
        "_score": 1,
        "_source": {
          "title": "Ezekiel 25-17",
          "content": "The path of the righteous man is beset on all sides by the iniquities of the selfish and the tyranny of the evil men. Blessed is he who, in the name of charity and goodwill, shepherds the weak through the valley of darkness, for he is truly his brothers keeper, and the finder of lost children. And I will strike down upon thee with great vengeance and furious anger those who attempt to poison and destroy my brothers. And you will know my name is the Lord when I lay my vengeance upon thee!",
          "groups": [
            "groupA",
            "groupC"
          ]
        }
      },
      {
        "_index": "content",
        "_type": "_doc",
        "_id": "2",
        "_score": 1,
        "_source": {
          "title": "A Royale with Cheese",
          "content": "Alright, well you can walk into a movie theater in Amsterdam and buy a beer. And I dont mean just like in no paper cup, Im talking about a glass of beer. And in Paris, you can buy a beer at McDonalds. And you know what they call a, uh, a Quarter Pounder with Cheese in Paris?",
          "groups": [
            "groupB"
          ]
        }
      },
      {
        "_index": "content",
        "_type": "_doc",
        "_id": "3",
        "_score": 1,
        "_source": {
          "title": "About Foot Massages",
          "content": "You know, Im getting kinda tired. I could use a foot massage myself.",
          "groups": [
            "groupA",
            "groupB"
          ]
        }
      }
    ]
  }
}

Now let’s create the token for the user Jules. This user has only 1 group, groupB. As there are 2 documents with groupB (2 and 3), a search query with Jules's token will return 2 documents out of 4.

Let’s generate a token for Jules the same way we generated the token for Quentin:

docker run --rm -i -v /tmp:/tmp node /bin/bash -c "npm install -g jwtgen; jwtgen -a RS256 -p /tmp/private.pem -c "sub=Jules" -c "iss=localhost" -c "roles=frontEndUser" -c "groups=groupB" -e 36000 > /tmp/jules-token.txt"

And the token in the file /tmp/jules-token.txt will look similar to this:

eyJ0eXAiOiJKV1QiLCJhbGciOiJSUzI1NiJ9.eyJpYXQiOjE1NjAzMzk2MDAsImV4cCI6MTU2MDM3NTYwMSwic3ViIjoiSnVsZXMiLCJpc3MiOiJsb2NhbGhvc3QiLCJyb2xlcyI6ImZyb250RW5kVXNlciIsImdyb3VwcyI6Imdyb3VwQiJ9.pi5p9DePfhfhK91qTSQR4RCO9l8cbWHXJXYSR8ecLtQCxxT5g0G35qmFvur02lsLrwRlnvDif-qRWM0YKRgXmZgbpxhrxdR_SSaxo-wtqOUYgVrWfVg5yQ0OncIWx9k6diGQfut1m5CSN_CL2v4tS58tmKCkxdN7dO7s8s7xZK9aGX-c0LC2rjoYHEjQoPZjj2EW5FzhMkdIILdv1gUuL2IWMTaAB9SlYJ6dHdwODE5houb3An5kxer-ulbf3IJdK-Es2FBuximI6J1vWwBJ5YPWp6SxCRNauAmRFKnIjHmn30wkj49fOsFEsQEo7JRtte9aj7FsvHglT2NJJ2vtUg

Now let’s make a search call on behalf of Jules, expecting to find only 2 documents (2 and 3):

curl -XGET -H "Authorization: Bearer eyJ0eXAiOiJKV1QiLCJhbGciOiJSUzI1NiJ9.eyJpYXQiOjE1NjAzMzk2MDAsImV4cCI6MTU2MDM3NTYwMSwic3ViIjoiSnVsZXMiLCJpc3MiOiJsb2NhbGhvc3QiLCJyb2xlcyI6ImZyb250RW5kVXNlciIsImdyb3VwcyI6Imdyb3VwQiJ9.pi5p9DePfhfhK91qTSQR4RCO9l8cbWHXJXYSR8ecLtQCxxT5g0G35qmFvur02lsLrwRlnvDif-qRWM0YKRgXmZgbpxhrxdR_SSaxo-wtqOUYgVrWfVg5yQ0OncIWx9k6diGQfut1m5CSN_CL2v4tS58tmKCkxdN7dO7s8s7xZK9aGX-c0LC2rjoYHEjQoPZjj2EW5FzhMkdIILdv1gUuL2IWMTaAB9SlYJ6dHdwODE5houb3An5kxer-ulbf3IJdK-Es2FBuximI6J1vWwBJ5YPWp6SxCRNauAmRFKnIjHmn30wkj49fOsFEsQEo7JRtte9aj7FsvHglT2NJJ2vtUg" https://localhost:9200/content/_search --insecure

And indeed this is what we will get:

{
  "took": 4,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 1,
    "hits": [
      {
        "_index": "content",
        "_type": "_doc",
        "_id": "2",
        "_score": 1,
        "_source": {
          "title": "A Royale with Cheese",
          "content": "Alright, well you can walk into a movie theater in Amsterdam and buy a beer. And I dont mean just like in no paper cup, Im talking about a glass of beer. And in Paris, you can buy a beer at McDonalds. And you know what they call a, uh, a Quarter Pounder with Cheese in Paris?",
          "groups": [
            "groupB"
          ]
        }
      },
      {
        "_index": "content",
        "_type": "_doc",
        "_id": "3",
        "_score": 1,
        "_source": {
          "title": "About Foot Massages",
          "content": "You know, Im getting kinda tired. I could use a foot massage myself.",
          "groups": [
            "groupA",
            "groupB"
          ]
        }
      }
    ]
  }
}

We will skip this exercise for our last user, Vincent, but you can try it out: generate a token for Vincent and check that Vincent can find only 1 document (with id=1). You can also play around and generate a token that yields no results at all.

Conclusion

We have seen in this tutorial how to use Document Level Security in Elasticsearch. If you have any questions, you can reach me on Twitter @alonaizenberg

Alon Aizenberg

Development / Product manager — @alonaizenberg on Twitter.