Using PostgreSQL row level security (RLS) to authorize READ queries for your application’s users.

joseph bartels
5 min readJun 3, 2018

--

Photo by Marvin Ronsdorf on Unsplash

Row Level Security was introduced in v9.5 of PostgreSQL. It filters query results based on user defined row policies. You can read more here . Some key points are that RLS is set up per table using 2 sets of rules — one for CREATE/ UPDATE/ DELETE and another ruleset for READ.

Cool. But how does RLS impact performance, if at all?

The scenario

This will be a comparison using RLS to filter the rows a user can read vs. plain ol’ Table JOINs with a simple privilege table. I hope to answer the question of whether RLS is suitable as the primary privilege system in an app with thousands of users.

Here are the tables we’ll be working with:

page_photo has ~ 100,000 rows. (~100 photos * 1,000 pages)

page has 1,000 rows. (~ 50% public)

page_user has 10,000 rows where each user gets 10 pages. (10 pages * 1,000 users)

RLS does not apply to a sudo user by default so we need to create a ROLE with some basic privileges:

Case 1: page WHERE public = TRUE

Without RLS:

With RLS:

RLS loses this round.

When relying on RLS row filtering the query is 1.7x slower and when including WHERE public = true with and without RLS appear equivalent (~8.5ms).

CASE 2: page WHERE public = TRUE with partial INDEX

Let’s redo, but with a partial INDEX on public = true which is the USING condition for the RLS policy.

Without RLS:

With RLS:

Using a partial INDEX on public IS true speeds the RLS query up to match its non-RLS equivalent. Nice!

Case 3: JOIN vs RLS Privileges

Let’s do something more interesting and grab all the page_photos that a user has privileges to see.
i.e. we’ll query for all the page_photos that belong to a page where the user has a corresponding entry in the page_user table.

Without RLS:

Not great, but let’s try that with an index:

10x faster. cool! I wonder if we can do better…

22x faster than original! Now we’re cooking with gas.

With RLS:

Now this is where RLS gets interesting because our READ rule needs to “simulate” the same access privileges that the table JOIN did above ^.

Aside: I’ll be leaving the indexes in place 👆

There are a couple of ways I can think to implement RLS in this situation:

A) create a function with this API: can_access_page(page_id, user_id)

Applying this function to the USING expression for an RLS policy on page_photo table means you’re passing in page_photo.page_id and probably some LOCAL / SESSION variable, perhaps from a SET app.current_user = 1.

Bear in mind that RLS filters ALLL Teh Thingz (rows) — so if USING or WITH CHECK expressions execute a function that triggers a db lookup, then lots of queries may be firing solely to dwindle the universe of rows down before even running the main query (please correct me if this is inaccurate).

or…

B) query once to get all page_ids that the user has access to:

SELECT page_id from page_user WHERE user_id = 1;

and then, armed with this information, create a Function with an API like this: bigint_contains(page_ids, page_id)

which, in pseudo code may be page_ids.contains(page_photo.page_id)

Method A seems easier (but less performant). Let’s try that for now. The function:

And the RLS policy:

Keep in mind that if app.user_id does not exist, the function will heroically fail and throw an error.

And here’s how one might implement those pieces using a transaction:

Wow. Not impressive you might say, but let’s consider it like this:

It takes about 0.7ms to execute can_access_page and there are 100,000 page_photo rows. So if we were in 3rd grade we might conclude that filtering should take 100,000*0.7ms = 70,000ms or 70 seconds. But it does not take 70 seconds — it only takes 1.6 seconds. So we’re clearly experiencing optimizations here. Even though the total elapsed time is unimpressive I would not be quick to call this an RLS fail though; more likely it’s just not an ideal use-case for RLS — especially considering the non-RLS alternative is significantly faster at < 2ms.

Let’s try method B which does not involve doing a db lookup while enforcing the RLS Policy.

And to get an Array of page_ids the user can access:

There is one problem with this approach… I was unsuccessful in applying the results of this query to app.user_page_ids.

Apparently SET app.user_page_ids = ”some query” does not work. I even tried stuffing the query into a function and setting it that way — no luck. If you have the solution speak up! For now we’ll just set it manually:

And the RLS policy:

All together:

That’s 2x faster than method A but still a longshot from the non-RLS alternative.

Conclusion

update (June 2019): Take this conclusion 👇 with a grain of salt and check out Ethan Resnick’s comment for ideas on getting RLS performing as good as the non-RLS equivalents.

— -

I would not use RLS as the main privilege system for db lookups when the number of rows that are likely to be filtered out, for any given query, is a substantial amount of the total rows in that table.

This opinion does not extend to using a RLS Policy’s WITH CHECK expression, which only operates on the row being inserted/deleted/updated. In which case I expect much better performance since we’re dealing with 1 row at a time. I’ll have to dig into that use-case specifically though.

I hope it does not turn you off to RLS and that you explore its implementation for yourself — and to share what you’ve learned. There isn’t a large amount of information (yet) when it comes to RLS best practices or good use-cases — but please leave links in the comments if you come across a good resource!

For those interested. I’ve included the EXPLAIN ANALYZE for some of the queries:

--

--