PostgreSQL Row Level Security

Written — Updated
  • RLS effectively lets you add extra WHERE conditions to any SQL query. This can go in both directions — it applies to SELECT, UPDATE, INSERT, and DELETE.
  • The advantage of this is that you don't need to remember to add these checks into every query in your application. Likewise for when you are updating your auth model — just change the relevant policies and it all takes effect.
  • Syntax and Basic Usage

    • CREATE POLICY name ON table_name
          [ FOR { ALL | SELECT | INSERT | UPDATE | DELETE } ]
          [ TO { role_name | PUBLIC | CURRENT_ROLE | CURRENT_USER | SESSION_USER } [, ...] ]
          [ USING ( using_expression ) ]
          [ WITH CHECK ( check_expression ) ]
    • The USING clause applies on reads, and the WITH CHECK clause applies to writes. If both USING and WITH CHECK could apply to the policy, WITH CHECK can be omitted to use the same clause specified in USING.
    • RLS uses a "default deny" policy. You must add extra checks to get any data out of it.
    • Types of operations and which they use:
      • SELECT - USING
        • The USING clause determines if you can see the row to update it, and the WITH CHECK applies to what you're trying to write to the row.
      • DELETE - USING
        • This means that you may need a separate policy for DELETE unless you want the user to be able to delete anything they can read.
    • These clauses can be pretty much anything that can go in a WHERE clause, including function calls. Note that USING (true) is the way to make an RLS policy that always allows access (e.g. for admin roles).
    • Permissive and Restrictive

      • RLS policies can be permissive (the default) or restrictive. For the set of policies that apply to an operation, at least one permissive policy must pass, and all of the restrictive policies must pass.
      • In SQL terms, this is (permissive1 OR permissive2) AND restrictive1 AND restrictive2.
      • Note that there must be at least one permissive policy for anything to work.
    • RLS policies and normal role GRANTs will both be checked.
      • If you have DELETE granted but the RLS denies a DELETE, then you can't delete it.
      • If RLS allows you to delete a row but your role doesn't have the permission GRANTed on the table, then you can't delete it.
    • A very simple policy would be something like this.
    • CREATE POLICY p ON thetable
      	USING (user_id = current_user);
      -- Or to force RLS for table owners too.
    • As in the above, RLS must be explicitly enabled on the table.
    • This would let you select any rows that match the current user, and not modify the table at all.
    • Policy names apply per table, so you can reuse the same name across tables.
  • Managing Users and Roles

    • Most sample uses of RLS check against current_user — that is, actual database users and roles. But this sucks.
    • There are a lot of reasons that you don't want to coopt your database's role management to manage your app's users as well.
      • Roles are global in the entire Postgres system, not scoped to a particular database.
      • There's no good built-in place to store metadata about a database user, so you still need a users table anyway.
      • GRANTing permissions becomes a lot more complicated and error-prone.
      • Using current_user in your policy checks requires you to set security_invoker = on on any view so that its checks permissions as your DB user, instead of the user that created the view. This option is new in PostgreSQL v15.
    • Fortunately you don't have to. PostgreSQL allows you to set runtime variables with almost any name you want, and use these in policies. The only restriction is that the name must have a period, such as myapp.user, to separate its namespace from the main runtime settings.
    • SET LOCAL myapp.user = '12345';
      SET LOCAL myapp.roles = 'aa,bb,cc';
      SELECT current_setting('myapp.user'), current_setting('myapp.roles');
    • When doing this, be sure to use SET LOCAL and do everything inside a transaction, or add configuration to your Postgres connection pool to clear these variables when the connection is recycled. This will prevent user and role settings from leaking across requests.
    • One good way to set these variables is to provide a stored procedure in the database, which will do the appropriate queries and set the relevant variables. This allows you to use database migrations for these queries, so that the policies, the database tables, the queries to set them, and the variable values all stay in sync.
  • Role-Based Access Control

    • The fastest way is to just have a column for each permission type that holds the user or roles with that permission. Then you can just set a variable with the current roles, as above, and have the policy check it directly against the relevant column in each row.
    • A GIN index on the role array columns will make it fast to check.
    • This works well for a simple system but there are a bunch of potential reasons to not do it. You may need a more expressive system with multiple types of permissions, or maybe you have objects with many distinct roles and worry about the implications of putting too many IDs into each row, or maybe it just chafes at your normalized database design sense.
    • In that case, you'll probably have something like a table of object permissions, which links each role to what it can do to an object. An RLS policy can handle that as well:
    • CREATE POLICY userdata ON the_objects
      	TO webapp_dbrole
      	USING (
            owner = current_setting('myapp.user') OR
            EXISTS (
          	SELECT 1
          	FROM role_permissions rp
            		rp.object_id =
              	AND rp.role_id = any(
                    regexp_split_to_array(current_setting('myapp.roles'), ','))
          WITH CHECK (
            owner = current_setting('myapp.user') OR
            EXISTS (
          	SELECT 1
              FROM role_permissions rp
      	    	rp.object_id =
              	AND rp.role_id = any(
                    regexp_split_to_array(current_setting('myapp.roles'), ','))
            		AND rp.action IN ('write', 'admin')
      CREATE POLICY delete_userdata ON the_objects
      	AS restrictive
          FOR DELETE
      	TO webapp_dbrole
          USING (
            owner = current_setting('myapp.user') OR
            EXISTS (
          	SELECT 1
              FROM role_permissions rp
      	    	rp.object_id =
              	AND rp.role_id = any(
                    regexp_split_to_array(current_setting('myapp.roles'), ','))
            		AND rp.action IN ('write', 'admin')
    • Note that the second policy for DELETE has the same USING clause as the first policy's WITH CHECK clause. It is also set as restrictive, which ensures that the DELETE policy must be passed; just passing the all-operations policy is not enough.
    • With a setup like this, any list operation that does something naive like SELECT * FROM the_objects ORDER BY created DESC LIMIT 50 will probably end up needing to sequential scan the objects table. But you can write those in a way that is more aware of the permissions model to regain performance.
    • SELECT DISTINCT ON ( the_objects.*
      	FROM role_permissions rp
          JOIN the_objects ON rp.object_id ON
          WHERE rp.role_id = any(regexp_split_to_array(current_setting('myapp.roles'), ','))
          ORDER BY created DESC
          LIMIT 50
    • While we give up the simplicity of RLS here, we can retain it for any other operations where we know in advance exactly which objects we want to look at. Also, the security guarantees of RLS still apply here, so even if the query has an error it can't return any rows that it should not return.
  • Pitfalls and Considerations

    • Infinite Recursion

      • A table can not reference itself, directly or indirectly, in an RLS policy. This will appear to work when you create the policy, but fail at query time with an infinite recursion error. This can come up most often when writing RLS policies on the user/role/etc. tables themselves.
      • Sometimes the best way to work around this is a view. As mentioned earlier, views run by default in the context of the user that created the view, so you can use them to bypass RLS in specific cases.
      • Of course, this needs to be used with care. A good practice is for all of these views to do checks similar to what you have in your RLS policies.
    • Performance

      • Each RLS policy implicitly adds more WHERE conditions, and this can cause performance issues, especially since these extra WHERE conditions are not obvious when writing the queries in the client application. EXPLAIN will help here, but note that you must run the queries as a "normal" user. Superusers always bypass RLS.
      • Performance can also be an issue on list operations, as noted above. Writing these differently to avoid a sequential scan can help.
    • Management

      • Stored procedures and similar methods of computing in the database have become less popular nowadays, and for good reason. While the performance can be nice, it is much more difficult to manage these, use source control, and so on. The same applies for RLS policies,
      • You can pay a big price if you use RLS policies for too many things. Too many conflicting policies for different needs can get hard to manage and kill performance. I would recommend using this mainly for actual security purposes, and enforcing business logic some other way.
      • RLS is not supported on materialized views as of the current version 15.
    • Security Holes

      • The DELETE operations feels like a write, but it uses the USING clause, so you could inadvertently set up a policy in which a user can delete any row they can read, even if they can't update it. Set up a separate policy for DELETE and make sure to set it restrictive.
      • If you're using parent and child tables, RLS policies only apply to the one that is queried. An RLS policy that exists only on a child table will not be used when querying the parent table.
      • RLS with SET or SET LOCAL will not work properly with pgbouncer in statement pooling mode; you will likely return rows for the wrong users and it may only happen in production when multiple people are hitting the app at once. You must use session pooling mode, or use transaction pooling mode with transactions in your application when accessing data, to ensure that the SET variables only apply to the queries you're about to run.
      • You must make sure that the variables for the current user, role, etc. do not get leaked between sessions. This can be best done by unsetting them in the "recycle" function of your postgres connection pool, or by always using SET LOCAL and doing everything within a transaction.
      • RLS will not apply if you connect as a superuser. It will not apply if you connect as the same user that owns the table, unless you use ALTER TABLE thetable FORCE ROW LEVEL SECURITY.
  • Previous Approaches for Security

    • These were described in a talk at the PGConfig 2015 conference.
    • Security Barriers and Views

      • Postgres has security barrier views which allow you to lock down a table and add extra permissions to a view for actual access. The security barrier helps prevent pushdown on functions past the securing WHERE clauses, which would let functions potentially see rows that they were not supposed to see.
      • Postgres also has a feature that would actually let you insert a row into a view, when the view is so simple that it's obvious how to do so on the underlying table. Security barriers also have a WITH CHECK option to add security on these rows, to make sure that any row inserted would actually be visible in the view the user was inserting to.
    • MultiSchema

      • Each tenant gets a schema (or database) that has its own set of tables. The application then chooses the schema based on the customer.
      • This makes migrations a hassle. Your migration system needs to be aware of this, and if you don't always make changes through the migration system, then it's easy to for the schemas to get out of sync.
      • Any query that might need to query across tenants (internal analytics, data migrations, etc.) is a nightmare to write. Partitioned tables can help there.
  • References

Thanks for reading! If you have any questions or comments, please send me a note on Twitter.