5 tips for GDPR compliance & protecting user data in Elixir/Phoenix


Most applications store at least some personal user data in the form of name and email address. Unfortunately, while the majority of developers know that passwords should be securely stored, personal user data is often neglected. The result is rising cases of identity theft and other fraud, due to stolen personal information — which is why regulations like the GDPR are much needed.

Whether you’re part of a business organisation preparing for GDPR, or just a developer who cares about their users’ privacy & security, below is a list of 5 tips that will help you start treating user data the right way. Hope it’s useful ✌️

1. Note down and review the data you collect

Any information that can be used to identify an individual is classified as Personally Identifiable Information (or PII for short). It is also referred to as “personal data” throughout the article.

Common PII stored by applications includes:

  • Person’s name(s)
  • Email address
  • Postal address
  • Phone number(s)
  • Payment information (bank account/card details)

Other (maybe not so common) info might be location coordinates (latitude and longitude), social security or passport numbers, and so on.

Regardless of what language or framework you’re using, this is the first and most important step: examine all of your data, note down the PII you collect, where it is stored, and review it regularly.

Are you collecting info that you’re not using anymore? Easy — drop the column from your database. Note that if you have backups, you will have to process those to remove the information there as well.
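In a Phoenix project, dropping an unused column is a regular Ecto migration. The sketch below assumes a hypothetical phone_number string column on the users table and a hypothetical migration name, so adjust both to your own schema. Passing the column type to remove keeps the migration reversible, although a rollback only restores the column, not the data that was in it.

```elixir
defmodule MyApp.Repo.Migrations.RemovePhoneNumberFromUsers do
  use Ecto.Migration

  def change do
    alter table(:users) do
      # Hypothetical column name. Giving the type (:string) lets Ecto
      # re-create the (empty) column if the migration is rolled back.
      remove :phone_number, :string
    end
  end
end
```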

Do you have duplicate PII, e.g. storing the same user email address in different tables or databases? Try using UIDs instead: fewer columns to manage is better.

Very often we don’t really need all the user data we store, so this is a good opportunity for some cleanup.

2. Exclude parameters from Phoenix’s logger

By default, Phoenix logs all HTTP parameters, but it comes with a rule that filters out the password and password_confirmation params.

You can specify your own fields in config.exs or prod.exs like so:

config :phoenix, :filter_parameters, [
  "password",
  "password_confirmation",
  "first_name",
  "last_name",
  "email"
]

The setting should take precedence over services like Honeybadger or Appsignal (used for monitoring traffic/exceptions), so none of that data will end up on 3rd party websites. However, entries recorded before this change will stay as they are, so make sure you remove those completely.
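If listing every sensitive field feels error-prone, Phoenix also accepts an allow-list form of the same setting, where every parameter is filtered except the keys you name. A minimal sketch, with the kept keys chosen purely for illustration:

```elixir
# In config.exs or prod.exs, replacing the list form of the setting.
# Only "id" and "page" (illustrative keys) are logged as-is;
# every other parameter value is filtered out of the logs.
config :phoenix, :filter_parameters, {:keep, ["id", "page"]}
```

The trade-off is that new, harmless parameters are hidden from the logs until you add them to the list, which is usually the safer default when personal data is involved.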

3. Prevent personal data from leaking in logs/exceptions

Very often, personal information is leaked by accident in debugging statements, server logs or runtime exceptions.

Thankfully, it is really easy to prevent this from happening. Assuming you have a user.ex module for your Ecto schema, all you have to do is implement the Inspect protocol:

defmodule MyApp.User do
  use MyAppWeb, :model

  # Blacklist sensitive fields.
  defimpl Inspect do
    @sensitive_fields [
      :first_name,
      :last_name,
      :email
    ]

    def inspect(user, opts) do
      user
      |> Map.drop(@sensitive_fields)
      |> Inspect.Any.inspect(opts)
    end
  end

  ... ✂️ ...

If an exception happens and something tries to print a %User{} struct, those fields will now be dropped from the struct before being printed.
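On Elixir 1.8 or newer, a similar result can be had without a hand-written implementation by deriving the Inspect protocol with the except option. The sketch below uses a plain struct with a hypothetical MyApp.Account name so that it runs on its own; in an Ecto schema the @derive line would sit above the schema block instead of defstruct.

```elixir
defmodule MyApp.Account do
  # Derive Inspect, hiding the listed fields whenever the struct is printed.
  @derive {Inspect, except: [:first_name, :last_name, :email]}
  defstruct [:id, :first_name, :last_name, :email]
end

account = %MyApp.Account{
  id: 42,
  first_name: "Jane",
  last_name: "Doe",
  email: "jane@example.com"
}

# The printed form keeps :id but leaves out the sensitive values.
printed = inspect(account)
IO.puts(printed)
```

The fields are still present on the struct itself; only its printed representation changes, which is what matters for logs and exception reports.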

Another note on 3rd party tools: services like Papertrail or Timber (used for analysing server logs) will likely contain leaked personal data, so make sure you erase any past records or archives after deploying the change.

4. Protect data at rest and in use

In the case of a database/datacenter breach or stolen backups, encrypting users’ personal data provides a necessary layer of security.

Update 01/03/18: Previous version of this article used Cipher in the examples; however, tokens produced by Cipher are susceptible to certain cryptographic attacks. Follow this issue for more information.

If you’re using Phoenix, which ships with Plug, then there’s already a handy function that you can use to encrypt short text. The function requires access to a secret key and secret signing key, which you should keep safe and never commit in code.

iex(1)> alias Plug.Crypto.MessageEncryptor
Plug.Crypto.MessageEncryptor
iex(2)> encrypted_text = MessageEncryptor.encrypt("Hello world!", key, signing_key)
"QTEyOEdDTQ.1iWlClVXMyhbheDMHsAuexL9Mhd8l2GsuBY7natp2fnKQi5qXmWortpjo0c.ubl2n3viv3bQHj0U.7Yy7NOVJaHCV3pLS.4-shPeCh4BwNMe07K_Bq7Q"
iex(3)> MessageEncryptor.decrypt(encrypted_text, key, signing_key)
{:ok, "Hello world!"}

You can also opt for hairnet, an Erlang package (also on Hex) that, similarly to Plug, implements 128-bit AES encryption in GCM mode.
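To give a feel for what AES-128 in GCM mode involves, here is a minimal sketch built directly on Erlang's :crypto module (OTP 22 or newer). It is not how Plug or hairnet are implemented, and the FieldCrypto module, its token layout and the associated-data label are all assumptions made for illustration. For production use, prefer one of the maintained libraries above.

```elixir
defmodule FieldCrypto do
  # Hypothetical label bound to every token as additional authenticated data.
  @aad "field-crypto-v1"

  # Encrypts a binary with a 16-byte key; returns a URL-safe token
  # laid out as: 12-byte IV, 16-byte auth tag, ciphertext.
  def encrypt(plaintext, key) when is_binary(plaintext) and byte_size(key) == 16 do
    iv = :crypto.strong_rand_bytes(12)

    {ciphertext, tag} =
      :crypto.crypto_one_time_aead(:aes_128_gcm, key, iv, plaintext, @aad, true)

    Base.url_encode64(iv <> tag <> ciphertext, padding: false)
  end

  # Returns {:ok, plaintext}, or :error if the token is malformed,
  # was tampered with, or was encrypted with a different key.
  def decrypt(token, key) when is_binary(token) and byte_size(key) == 16 do
    with {:ok, <<iv::binary-size(12), tag::binary-size(16), ciphertext::binary>>} <-
           Base.url_decode64(token, padding: false),
         plaintext when is_binary(plaintext) <-
           :crypto.crypto_one_time_aead(:aes_128_gcm, key, iv, ciphertext, @aad, tag, false) do
      {:ok, plaintext}
    else
      _ -> :error
    end
  end
end

# The key is generated on the fly only for this example; a real key
# would be loaded from a secret store and never committed in code.
key = :crypto.strong_rand_bytes(16)
token = FieldCrypto.encrypt("jane@example.com", key)
result = FieldCrypto.decrypt(token, key)
```

A fresh random IV per message is what makes two encryptions of the same email address produce different tokens, and the authentication tag is what lets decryption reject modified data instead of returning garbage.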

If you manage your database server(s) yourself, then check if you’re encrypting all backup files; alternatively, it might be easier to just encrypt the whole file system. Otherwise, most providers like AWS/Heroku offer encryption at rest in their paid plans.

What about data in transit? There’s no excuse not to use SSL these days 😉

Database backups, of course, will still contain the old, unencrypted user data. Make sure you either migrate those, or simply delete them if too old.

5. Use Ecto’s embedded schemas

The more personal data you store, the more difficult it is to manage, especially when you have many columns scattered across different tables.

Most popular databases support storing arrays and JSON objects. When you don’t need to query data or update it frequently, Ecto’s embedded schemas are a great way to keep personal data in one place.

A brief example adapted from this excellent article will give you a flavour of how easy it is to use them:

defmodule User do
  # Ecto.Model was removed in Ecto 2.0; schemas now use Ecto.Schema.
  use Ecto.Schema

  schema "users" do
    embeds_many :addresses, Address
  end
end

defmodule Address do
  use Ecto.Schema

  embedded_schema do
    field :street_name
    field :city
    field :state
    field :zip_code
  end
end

In your database, you’d have just a single column for all user addresses:

defmodule MyApp.Repo.Migrations.CreateUsers do
  use Ecto.Migration

  def change do
    alter table(:users) do
      add :addresses, {:array, :map}, default: []
    end
  end
end

This significantly reduces the columns / Ecto fields you’ll have to deal with when doing steps 3) and 4) ✌️
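Writing to the embedded addresses goes through a changeset, using cast_embed from Ecto.Changeset. The following is a sketch under a few assumptions: it runs as a standalone script that fetches Ecto with Mix.install (Elixir 1.12 or newer), it trims Address to two fields, and both changeset functions are hypothetical helpers rather than part of the original example.

```elixir
Mix.install([{:ecto, "~> 3.0"}])

defmodule Address do
  use Ecto.Schema
  import Ecto.Changeset

  embedded_schema do
    field :street_name, :string
    field :city, :string
  end

  # cast_embed/2 calls Address.changeset/2 for each address by default.
  def changeset(address, attrs) do
    cast(address, attrs, [:street_name, :city])
  end
end

defmodule User do
  use Ecto.Schema
  import Ecto.Changeset

  schema "users" do
    embeds_many :addresses, Address
  end

  def changeset(user, attrs) do
    user
    |> cast(attrs, [])
    |> cast_embed(:addresses)
  end
end

attrs = %{"addresses" => [%{"street_name" => "1 Main St", "city" => "Sofia"}]}
changeset = User.changeset(%User{}, attrs)

# apply_changes/1 builds the struct in memory without touching a database.
user = Ecto.Changeset.apply_changes(changeset)
IO.inspect(user.addresses)
```

In an application you would pass the changeset to your Repo instead of calling apply_changes, and Ecto would serialise the whole list into the single addresses column.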

Conclusion

Protecting users’ personal data is not easy, and the amount of work depends heavily on your application and on what personal data you collect and process. The 5 tips above barely scratch the surface, so if you’d like to read more, check out some of the links below.

If you found this short article useful and would like to see more content on Elixir — please share and help spread the word by hitting the 👏 button.

Thanks for reading!

Further reading:

I’m Svilen, a full-stack web developer and co-founder at Heresy. We’re always looking for engineers who enjoy working with the latest technologies and solving challenging problems. If you’re curious, check out our jobs page!