Single database column anonymization with sqlcmdcli

Anonymize text columns longer than two characters!

Sergio Govoni
CodeX
Published in
3 min readDec 17, 2022

--

In the previous article Data masking with sqlcmdcli we talked about the complex activity of data anonymization when you have to take a production database to carry out tests in the development environment even outside the customer’s company perimeter. A possible solution is to mask sensitive data on a copy of the production database by asking the customer to carry out the appropriate checks before giving us the authorization to withdraw the anonymized copy of the data.

The anonymizedb command of the sqlcmdcli allows you to carry out this type of non-reversible anonymization on all the text-type columns of a database. The anonymizedb command has recently been enriched with new options that allow you to carry out the anonymization operation on a specific column, limiting the operation only to sensitive data for the company or private data for people.

Let’s download the latest version of sqlcmdcli from GitHub, you can find it here.

Let’s extract the executable file contained in the sqlcmdcli.zip and place it, for example, in the C:\SQL\Tools\sqlcmdcli directory.

The VM we will use to test the new options has a local instance of SQL Server 2022 Developer Edition with a copy of the WideWorldImporters sample database that you can download here.

We want to anonymize the PhoneNumber column of the Application.People table in the WideWorldImporters database. The PhoneNumber column contains the telephone number of the company’s employees so we have to anonymize it for privacy reasons. Afterward, we are going to extend the same job to the LogonName column, and so on.

sqlcmdcli can anonymize the PhoneNumber column by running the following command.

sqlcmdcli.exe anonymizedb -servername:sql2022 -databasename:WideWorldImporters -username:xxxxx -password:xxxxx -schemaname:Application -tablename:People -columnname:PhoneNumber -verbose

The output is shown in the following picture.

The PhoneNumber column has been anonymized, let’s go ahead with the anonymization of the LogonName column. Let’s run the following command.

sqlcmdcli.exe anonymizedb -servername:sql2022 -databasename:WideWorldImporters -username:xxxxx -password:xxxxx -schemaname:Application -tablename:People -columnname:LogonName -verbose

The output is shown in the following picture.

The LogonName column has been anonymized, this is the result.

PersonID    LogonName                                          FullName
----------- -------------------------------------------------- ---------------------
2 qgrg}F}ojk}uxrjosvuxzkxy4ius Kayla Woodcock
3 n{jyutuF}ojk}uxrjosvuxzkxy4ius Hudson Onslow
4 oyghkrrgxF}ojk}uxrjosvuxzkxy4ius Isabella Rupp
5 k|gsF}ojk}uxrjosvuxzkxy4ius Eva Muirden
6 yuvnognF}ojk}uxrjosvuxzkxy4ius Sophia Hinton
7 gszF}ojk}uxrjosvuxzkxy4ius Amy Trefl
8 gtznutmF}ojk}uxrjosvuxzkxy4ius Anthony Grosse
9 groiglF}ojk}uxrjosvuxzkxy4ius Alica Fatnowna
10 yzkrrgxF}ojk}uxrjosvuxzkxy4ius Stella Rosenhain
11 kzngtuF}ojk}uxrjosvuxzkxy4ius Ethan Onslow
12 nktxlF}ojk}uxrjosvuxzkxy4ius Henry Forlonge
13 n{jyutnF}ojk}uxrjosvuxzkxy4ius Hudson Hollinworth
14 roriF}ojk}uxrjosvuxzkxy4ius Lily Code
15 zgpyF}ojk}uxrjosvuxzkxy4ius Taj Shand
16 gxinkxrF}ojk}uxrjosvuxzkxy4ius Archer Lamble
17 vovkxqF}ojk}uxrjosvuxzkxy4ius Piper Koch
18 qgzokjF}ojk}uxrjosvuxzkxy4ius Katie Darwin
19 pgoyF}ojk}uxrjosvuxzkxy4ius Jai Shand
20 pgiqvF}ojk}uxrjosvuxzkxy4ius Jack Potter

It is important to remember that this anonymization technique is not reversible, please don’t use it in production environments!

Enjoy sqlcmdcli!

--

--

Sergio Govoni
CodeX
Writer for

CTO at Centro Software, Microsoft Data Platform MVP