CPF: one database to rule them all

Nov 1, 2017

Brazil’s Individual Taxpayer Identification Number — photo credit: Wikimedia Commons

This piece is part of the Identity and Internet Series, written by Yasodara Cordova of Harvard Kennedy School and Coding Rights, a member of the Privacy International Network. It does not necessarily reflect the views or position of Privacy International.

Original in Portuguese: https://medium.com/identidade-e-internet/identidade-e-internet-c49e693dd5c4

The registration issued by the Federal Revenue Service is the only identity that governs Brazilians’ finances and, therefore, is indispensable.

Every transaction made through financial institutions in Brazil is registered in a database linked to the CPF number. CPF stands for “Cadastro de Pessoas Físicas”, usually rendered in English as “Individual Taxpayer Identification Number”. Although there are still places, like slums and rural areas, where financial transactions are independent of banks and removed from digital environments, there are efforts to bring Brazilians into the official financial system. The CPF number is mandatory for every citizen from the age of 12 onward. The availability of small loans as a public policy to reduce inequality became a strong argument for promoting the CPF, even among people who do not have bank accounts.

The CPF assigns a unique number to each citizen able to perform financial operations, so that potential taxpayers can be tracked. As a result, the number ends up linking credentials that can reveal individuals’ buying habits.

In the recent past, the CPF was the financial identification of the entire family, held in the name of the husband as the “head of the family”. When women married, their CPF would become void, and their husband would represent the financial situation of the whole family. Young children and married women did not need their own CPF. Today, in contrast, children receive their CPF shortly after birth. There are more than 180 million CPF numbers in Brazil, out of a population of 207 million. According to the Brazilian Federal Revenue Office, the number is unique and can be replaced only if compromised. This happens only when someone dies, or by administrative order.

The main difference between the CPF and the US Social Security Number resides in its financial nature. Hence, the CPF is widely used by private institutions to access the financial records of citizens to give them a certain level of credit.

Fragile platform

The CPF’s verification digits are generated with a well-known, publicly documented checksum: a weighted modulo-11 calculation (often confused with the Luhn algorithm used on payment cards). Relying on this checksum to verify the authenticity of the document weakens the CPF issuance system, because anyone can compute digits that pass the check. Today there are numerous websites and countless programmes that generate valid CPF numbers, which can be used online to create false identities.
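To illustrate why a checksum of this kind cannot prove authenticity, here is a minimal Python sketch of the weighted modulo-11 check-digit scheme that is widely documented for CPF numbers. This is a mod-11 scheme, distinct from the mod-10 Luhn checksum used on payment cards. The arithmetic is our assumption about the scheme (the article does not spell it out), and the function names are hypothetical.

```python
def cpf_check_digits(base: str) -> str:
    """Return the two check digits for a 9-digit CPF base (assumed mod-11 scheme)."""
    if len(base) != 9 or not (base.isascii() and base.isdigit()):
        raise ValueError("base must be exactly 9 ASCII digits")
    digits = [int(ch) for ch in base]
    for _ in range(2):
        # Weights count down to 2: 10..2 for the first check digit,
        # 11..2 for the second (which also covers the first check digit).
        weights = range(len(digits) + 1, 1, -1)
        total = sum(d * w for d, w in zip(digits, weights))
        remainder = (total * 10) % 11
        digits.append(0 if remainder == 10 else remainder)
    return f"{digits[9]}{digits[10]}"


def is_well_formed_cpf(cpf: str) -> bool:
    """Check only the checksum; this says nothing about whether the number was ever issued."""
    digits = "".join(ch for ch in cpf if ch.isascii() and ch.isdigit())
    if len(digits) != 11:
        return False
    return cpf_check_digits(digits[:9]) == digits[9:]


if __name__ == "__main__":
    # An obviously synthetic base: any 9 digits can be completed into a "valid" number.
    base = "123456789"
    print(f"{base}-{cpf_check_digits(base)}")
    print(is_well_formed_cpf("123.456.789-09"))
```

Because the check digits depend only on the first nine digits, any nine-digit string can be completed into a number that passes validation. A form that accepts a CPF on the strength of the checksum alone is confirming that the number is well formed, not that it belongs to the person typing it.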

The CPF database does not support biometric information, although the law creating the National ID mandates the aggregation of the CPF and electoral databases containing biometric data. The CPF is a publicly disclosed number, available on several platforms and in many situations. Though financial transactions are private, the CPF number can be consulted by companies, mainly to analyse people’s profiles in order to complete credit operations. The number is also used to receive government aid through welfare programmes and can therefore be stored and associated with information from different government registers, such as student databases and public health records.

Since 2012, it has been possible to obtain a CPF number over the Internet. To get one, a person fills out a registration form on the Brazilian Federal Revenue Office website with their name, mother’s name, voter registration number, date and place of birth, address, and telephone numbers (landline and mobile). If the number is lost, they must go in person to a government office, carrying photo identification. In its quest to verify the identity of citizens online, the platform requires information that is potentially invasive, especially if leaked. The lack of information about how this data is secured, and the reckless ways in which the CPF is exposed alongside personal information on the web, show how authorisation is conflated with identification at many levels of these service platforms.

The fact that the number is a gateway to people’s financial data adds value to the CPF database. For example, Experian, a company present in more than 44 countries, acquired part of the most famous Brazilian credit bureau, “Serasa”. To improve its database, Experian made an agreement with the Revenue Service involving fund transfers in exchange for data. After the agreement was made public, the Revenue Service backed down and the transfer was suspended by a court.

The CPF can also be used, according to the Central Bank, to create lists of compliant debtors, called “cadastros positivos”. People listed there can receive bigger discounts, or be eligible for exclusive products and services. The Central Bank has regulated permission to create these listings, as provided for by Law 12.414/2011. According to the regulation,

“any person who meets the requirements established in the Article 1 of the Decree №7.829/2012, can manage a database with information about compliant debtors, to form the credit history of natural or legal persons.”

The Central Bank maintains a general register of clients of financial institutions in a centralised database. This register can be consulted in person or by mail request. The “Registrato”, which also belongs to the Central Bank, is a system that “allows citizens to have quick and secure access to information about their relationship with financial institutions and their credit operations through the Internet”. There is no transparency about its technical specifications, but it allows access to a web database of personal data.

Threats to privacy

To understand the threats the CPF number represents, it is important to note that there will always be a group of people, such as managers and system administrators, who have unrestricted access to this information. At the same time, citizens do not have access to the error logs of these systems. If there is a malfunction and they are disadvantaged for any technical reason, access to their data and the error logs is forbidden, since both are held in the custody of the Central Bank.

The most vulnerable citizens, the ones targeted by public policies involving transfers of small amounts of money, are victims of the Transparency Law, which makes it mandatory to disclose the recipients of public funds on the web. These citizens are often digitally illiterate and do not know that their names, CPF numbers and addresses are published on the web, for instance on the famous “Bolsa-Familia” website. Because their financial connection with the government is publicly available, they end up being easy targets for banks and other financial institutions offering loans and services at low prices.

Since there is no penalty for a data breach, not even a Data Protection Law, it is not difficult to find situations where leaks affected citizens and no one was held accountable. The disclosure or misuse of this information is not even transparent, and in many cases, the leak is not made public. Data exposure is seen as a mechanism against fraud, and is misused as a transparency measure.

On the one hand, systems such as the “Harpia” software, which tracks and learns from citizens’ personal data, help in the fight against tax evasion. On the other hand, the lack of clear standards and norms regarding the responsibilities of those involved, as well as the absence of transparency in how these databases are consulted and modified, is a matter of concern. The opacity in dealing with data from financial transactions goes hand in hand with difficulty in correcting the system, auditing failures, and carrying out other healthy practices for digital systems.

Another factor to weigh is how poorly these systems perform in terms of quality of access. Amid badly implemented captchas and script errors, users can easily come across websites with expired TLS/SSL certificates. Almost no government website that deals with registry numbers or certificates offers HTTPS. Many of these websites have fundamental weaknesses that endanger consumer access and leave them susceptible to breaches, unsolicited modifications, unauthorised downloads, and more.
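Certificate problems of this kind are easy for anyone to check. The following is a minimal Python sketch, using only the standard library, of how one might test whether a site's TLS certificate has expired or is close to expiring. The function names and the host `example.org` are hypothetical placeholders, not sites discussed in this article.

```python
import socket
import ssl
from datetime import datetime, timezone


def days_until_expiry(not_after: str, now: datetime) -> float:
    """Days between `now` and a certificate's notAfter string (negative if expired)."""
    expires = datetime.fromtimestamp(ssl.cert_time_to_seconds(not_after), tz=timezone.utc)
    return (expires - now).total_seconds() / 86400


def check_site(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Connect to `host` and describe the state of its certificate."""
    context = ssl.create_default_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host) as tls:
                not_after = tls.getpeercert()["notAfter"]
    except ssl.SSLCertVerificationError as err:
        # The default context rejects expired or otherwise invalid certificates
        # during the handshake, so they surface here as an error.
        return f"{host}: certificate rejected ({err.verify_message})"
    except OSError as err:
        # Covers refused connections, timeouts, and sites with no HTTPS at all.
        return f"{host}: no usable HTTPS connection ({err})"
    days = days_until_expiry(not_after, datetime.now(timezone.utc))
    return f"{host}: certificate valid for about {days:.0f} more days"


if __name__ == "__main__":
    print(check_site("example.org"))  # hypothetical host, replace as needed
```

The sketch only reports what an ordinary browser would already notice; it does not probe for the deeper weaknesses mentioned above.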

The concept of e-government comprises the digital competence of government institutions themselves. When this capability does not exist, what happens is the expansion of points of weakness due to the nature of the network. In the case of the CPF, the exposure of financial transaction data opens the door to social segregation coming from techniques such as profiling.

Possible solutions?

New possibilities, such as distributed ledgers or federated systems, appear as alternatives to centralised records. Moreover, it would be worth considering interventions that prioritise citizen safety rather than the prevention of bank fraud. Technically, having specific databases with restricted access through well-designed APIs, and following basic rules of Internet security, would be an improvement. These measures, when combined with efficient public policies on data-leak notification and accountability, as well as transparency and access, could represent a reasonable starting point for a better, privacy-respecting framework for e-government.

A person’s financial identity, when networked, can identify much more than their financial transactions. These are “clues” to entire personalities, and it is possible to establish very precise profiles of people, their habits and tastes, based on this data. To think of governments using financial data to track the habits of a population to uphold surveillance politics, or to support dictatorships, awakens one of the most powerful memes of Brazilian Netflix fans: “This is so Black Mirror!”


Written by Privacy International