Erlang nugget : ETS

ETS (Erlang Term Storage) is a built-in facility, provided by the ets module of the standard library, that allows developers to create in-memory tables for storing tuples, looked up by key.

That statement is enough to understand what ETS is all about. Now let’s see how to create an ETS table and how to add, retrieve and delete data from it.

Creating ETS table

To create an ETS table we use,

ets:new(TableName, Options).

Here “TableName” is the name of the table, which should be an atom. In “Options” we define different parameters for the table. Following is an example of how to create an ETS table.

1> TableId = ets:new(mytable, [set, public, named_table ,{write_concurrency, false}, {read_concurrency, true}]).
  • The first argument is the name of the table i.e. “mytable”.
  • The second argument, i.e. “Options”, is a list of atoms and tuples,
[set, public, named_table ,{write_concurrency, false}, {read_concurrency, true}]
  • set means that the table will be a set table, i.e. one key, one object, no order among objects. So with set you will not have duplicate keys in your ETS table. There are other table types available too: ordered_set, bag and duplicate_bag.
  • public means that any process can read and write to our ETS table (it needs the TableId, or the table name if named_table is set). If you omit public, the table defaults to protected: only the process that created the table can write to it, but other processes can still read from it. Use private if only the creating process should read and write.
  • named_table means that the table name (i.e. the first argument given to ets:new/2) is associated with the table identifier. What this means is that if you want to be able to refer to your table by the name “mytable” then you will need to set this option.
  • {write_concurrency, false} means that the table is not optimized for concurrent writes: a write takes exclusive access to the whole table, so multiple processes cannot write to the table at once even if they are updating different objects. If set to true then the table is optimized for concurrent write access, i.e. multiple processes can update different objects at the same time.
  • {read_concurrency, true} means that the table will be optimized for concurrent read access.
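
As a rough illustration of the options above, the following sketch shows what ets:new/2 returns with and without named_table, and how the access level is reported. The module name ets_new_demo and the table names demo_named and demo_unnamed are made up for this example; ets:info/2 is not covered in this post but is part of the standard ets module.

```erlang
-module(ets_new_demo).
-export([run/0]).

%% Sketch only: module and table names are hypothetical.
run() ->
    %% With named_table, ets:new/2 returns the name itself.
    demo_named = ets:new(demo_named, [set, public, named_table]),

    %% Without named_table, it returns an opaque table identifier,
    %% and the atom demo_unnamed cannot be used to reach the table.
    Tid = ets:new(demo_unnamed, [set]),
    false = is_atom(Tid),

    %% No access option was given for Tid, so it is protected by default.
    protected = ets:info(Tid, protection),
    public = ets:info(demo_named, protection),
    set = ets:info(demo_named, type),

    %% Clean up so that run/0 can be called again.
    true = ets:delete(demo_named),
    true = ets:delete(Tid),
    ok.
```

Compile it from the shell with c(ets_new_demo). and call ets_new_demo:run(). Each line is a pattern match, so the call only returns ok if every expectation holds.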

Storing into ETS

Storing data into ETS is really simple,

2> ets:insert(mytable, {key1, value1}).
true
3> ets:insert(mytable, {<<"key1">>, <<"value1">>}).
true
4> ets:insert(mytable, {{key21,<<"key22">>}, value21}).
true
5> ets:insert(mytable, {key3, value31, value32, value33}).
true

Above we use ets:insert/2 to store data into our table. The first argument is the table name and the second argument is the tuple which you want to store in the table. By default the first element of the tuple is considered to be the key. Notice in the above example that you can use any data type for keys, e.g. atoms, binaries, tuples etc.

One thing worth mentioning here is that ets:insert/2 can also be used to update the value associated with a key. If you insert a tuple whose key already exists in the table, the stored tuple is replaced by the new one. This holds for the table types set and ordered_set, because in both of these types the keys have to be unique.
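
The overwrite behavior can be checked with a small sketch like the one below. The module name ets_insert_demo and the table names are invented for the example. It also contrasts ets:insert/2 with ets:insert_new/2, which refuses to overwrite an existing key, and with a bag table, where objects sharing a key are kept side by side.

```erlang
-module(ets_insert_demo).
-export([run/0]).

%% Sketch only: module and table names are hypothetical.
run() ->
    T = ets:new(insert_demo, [set]),
    true = ets:insert(T, {key1, value1}),

    %% Same key again in a set table: the old object is replaced.
    true = ets:insert(T, {key1, value2}),
    [[value2]] = ets:match(T, {key1, '$1'}),

    %% insert_new/2 does not overwrite; it returns false and
    %% leaves the stored object untouched.
    false = ets:insert_new(T, {key1, value3}),
    [[value2]] = ets:match(T, {key1, '$1'}),

    %% In a bag table both objects are kept under the same key.
    B = ets:new(insert_demo_bag, [bag]),
    true = ets:insert(B, {key1, value1}),
    true = ets:insert(B, {key1, value2}),
    2 = length(ets:match(B, {key1, '$1'})),

    true = ets:delete(T),
    true = ets:delete(B),
    ok.
```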

Retrieving data

Now that we have stored the data in our table it's time to retrieve it. Firstly we try to see all the data that is there in the table,

6> ets:match(mytable, '$1').
[[{<<"key1">>,<<"value1">>}],
[{{key21,<<"key22">>},value21}],
[{key3,value31,value32,value33}],
[{key1,value1}]]

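We use ets:match/2 to get the data from the table. The first argument is the table name and the second argument is the pattern to match against each object. Our pattern is ‘$1’, a pattern variable that matches every object in the table and returns it whole. Since a set table keeps no order among objects, the order of the results may differ on your machine.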

Now let's have a look at a slightly more involved example of retrieving data,

7> ets:match(mytable, {key1, '$1'}).
[[value1]]
8> ets:match(mytable, {'_', '$1'}).
[[<<"value1">>],[value21],[value1]]
  • In the first line we try to retrieve the value stored under “key1”. Note the pattern is of the form “{key1, ‘$1’}”. So any two-element tuple whose first element, i.e. the key, is “key1” will match against it, and the second element is returned as part of the successful match because of ‘$1’.
  • In the second line we match all tuples that have 2 elements, but we ignore the first element of every matched tuple using ‘_’.
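
To see that the size of the pattern matters, here is a minimal sketch using a subset of the data from this post. The module name ets_wildcard_demo and the table name are assumptions made for the example. The results are sorted because a set table gives no ordering guarantee.

```erlang
-module(ets_wildcard_demo).
-export([run/0]).

%% Sketch only: module and table names are hypothetical.
run() ->
    T = ets:new(wildcard_demo, [set]),
    %% ets:insert/2 also accepts a list of tuples.
    true = ets:insert(T, [{key1, value1},
                          {<<"key1">>, <<"value1">>},
                          {key3, value31, value32, value33}]),

    %% Only the two 2-tuples match; the 4-tuple under key3 does not.
    %% Sorted, because a set table has no defined order.
    [[value1], [<<"value1">>]] = lists:sort(ets:match(T, {'_', '$1'})),

    %% A pattern without any '$N' variable still matches, but each
    %% match contributes an empty list of bindings.
    [[]] = ets:match(T, {key1, '_'}),

    true = ets:delete(T),
    ok.
```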

Here I would like to discuss what this ‘$1’ really means. Recall how you do normal pattern matching in Erlang,

1> {key, Value1, Value2} = {key, v11, v12}.

The above pattern translates to the following in ets:match/2,

ets:match(TableId, {key, '$1', '$2'})

So in place of variables we use ‘$1’, ‘$2’, ‘$3’ and so on. Have a look at the following example for more clarity,

9> ets:match(mytable, {key3, '$1', '_', '$2'}). 
[[value31,value33]]

So a match is successful if the key is “key3” and the tuple has four elements; the values are extracted using ‘$1’ and ‘$2’. Notice that we have ignored the second-to-last value using ‘_’, so it doesn’t appear in the return value of ets:match/2.
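
One detail that is easy to miss is that the bindings come back ordered by variable number, not by where the variable sits in the pattern. The sketch below, with a made-up module name ets_match_demo and table name match_demo, checks this against the key3 tuple used above.

```erlang
-module(ets_match_demo).
-export([run/0]).

%% Sketch only: module and table names are hypothetical.
run() ->
    T = ets:new(match_demo, [set]),
    true = ets:insert(T, {key3, value31, value32, value33}),

    %% Same pattern as in the post.
    [[value31, value33]] = ets:match(T, {key3, '$1', '_', '$2'}),

    %% Swapping the variable numbers swaps the order of the result:
    %% each binding list is ordered as ['$1', '$2', ...].
    [[value33, value31]] = ets:match(T, {key3, '$2', '_', '$1'}),

    %% A pattern of a different size does not match the 4-tuple.
    [] = ets:match(T, {key3, '$1', '$2'}),

    true = ets:delete(T),
    ok.
```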

Other useful ETS functions

  • ets:member(Table, Key) : check if the Key exists in the table.
  • ets:delete(Tab) : deletes the ETS table denoted by “Tab”.
  • ets:delete(Table, Key): delete the given Key and the corresponding Value from the table.
  • ets:delete_all_objects(Tab) : clears the data from the table.
  • ets:tab2file(Table, Path): stores the ETS table to the given path, e.g.
ets:tab2file(mytable, "/home/user/mytable.dat").
  • ets:file2tab(Path): load the stored ETS table from the file.
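
The functions listed above can be combined into a small round trip, sketched below. The module name ets_persist_demo, the table name persist_demo and the Path argument are assumptions for the example; pass any writable file path. Note that the named table is deleted before ets:file2tab/1 is called, because the file recreates the table under the same name.

```erlang
-module(ets_persist_demo).
-export([run/1]).

%% Sketch only: module and table names are hypothetical.
%% Path is a writable file path chosen by the caller.
run(Path) ->
    persist_demo = ets:new(persist_demo, [set, named_table]),
    true = ets:insert(persist_demo, [{key1, value1}, {key2, value2}]),

    %% member/2 and delete/2 work on a single key.
    true = ets:member(persist_demo, key1),
    true = ets:delete(persist_demo, key1),
    false = ets:member(persist_demo, key1),

    %% Dump the table to disk, then drop it from memory.
    ok = ets:tab2file(persist_demo, Path),
    true = ets:delete(persist_demo),
    undefined = ets:info(persist_demo),

    %% file2tab/1 recreates the table with the name and options
    %% it was saved with.
    {ok, persist_demo} = ets:file2tab(Path),
    [[value2]] = ets:match(persist_demo, {key2, '$1'}),

    %% delete_all_objects/1 empties the table but keeps it alive.
    true = ets:delete_all_objects(persist_demo),
    [] = ets:match(persist_demo, '$1'),

    %% Clean up the table and the file.
    true = ets:delete(persist_demo),
    ok = file:delete(Path),
    ok.
```

For example, ets_persist_demo:run("persist_demo.dat"). writes the file into the current working directory and removes it again at the end.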

For more details refer here : http://erlangwiki.dougedmunds.com/doku.php?id=erlang:ets