An Engineer’s Guide to Privacy by Design

There isn’t a single ‘right’ way to implement privacy by design. But an array of emerging best practices, known as “privacy engineering”, helps those who want to build products, systems and processes that integrate privacy and trust without compromising innovation.

Rachel Dulberg
CodeX
13 min read · Sep 10, 2021


“It’s our thesis that privacy will be an integral part of the next wave in the technology revolution and that innovators who are emphasising privacy as an integral part of the product life cycle are on the right track.”

— The Privacy Engineer’s Manifesto

What is privacy by design?

Privacy by design (‘PbD’) was coined and developed in 1995 by Ann Cavoukian, then the Information and Privacy Commissioner for the Canadian province of Ontario, yet it remains a vague and ambiguous concept.

PbD has been heavily criticised for its lack of practical guidance and for leaving too many open questions about its implementation.

Some argue that lawyers and app developers don’t speak the same language, so developers may have no idea how to translate abstract legal principles into concrete engineering steps (or how to evaluate whether the technical privacy measures they’ve implemented are sufficient for compliance purposes).

At the core, PbD is about ensuring that the privacy of individuals is protected by integrating privacy into the development of digital products, services, business practices, and infrastructures. The concept was inspired by the Fair Information Practices developed in the US in the 70s and 80s in response to the growing use of automated data systems containing information about individuals.

PbD is a form of “privacy engineering”, similar to “safety engineering” in civil, electrical and mechanical engineering which ensures that engineered systems like roads, bridges or cars provide acceptable levels of safety.

Note that when we say ‘privacy’ we don’t just mean security. The terms are often used interchangeably but, as you’ll see, while compliance with PbD includes security, it’s a much broader concept.

The Foundational Principles

The PbD concept comprises 7 key principles (aka “foundational principles”). While these principles are a great place to start your PbD journey, they only provide a high-level framework, i.e. they don’t tell you how to go about implementing PbD, nor do they provide any metrics for measuring the success of your PbD practices:

  1. Proactive not reactive; preventive not remedial — Anticipate and prevent privacy invasive events before they happen. Don’t wait for privacy risks to materialise, rather think of solutions for resolving privacy violations before they’ve occurred. Consider privacy upfront in any product design process, and not as an afterthought.
  2. Privacy as the default setting — Ensure that personal data is automatically protected in any given product, system, architecture or business practice. Users shouldn’t have to take any action to protect their privacy (like ticking a box to opt out of tracking, data collection or targeting) — privacy should be built into the system.
  3. Privacy embedded into design — Embed privacy into the design and architecture of any system or product and into relevant processes and protocols. Don’t simply bolt it on later, once the product or system has been launched and privacy issues come up. Make privacy an essential component of the product’s core functionality.
  4. Full functionality — positive-sum, not zero-sum — Don’t view privacy as a drag on functionality, but rather find creative ways to accommodate all legitimate interests (i.e. of both users and the business) in a “win-win” approach.
  5. End-to-end security — full lifecycle protection — Apply privacy and security throughout the entire data lifecycle — from collection to storage, usage and destruction. Ensure cradle-to-grave, secure lifecycle management of information, end-to-end.
  6. Visibility and transparency — Keep it open — your privacy practices should be visible and transparent, to users and all other stakeholders. Privacy notices/policies should be clear, easy to find and not misleading.
  7. Respect for user privacy — keep it user-centric — Keep user interests front and centre. Offer strong privacy defaults, appropriate notices of how data is collected, stored and used, and empower users to update, delete or access their data.
PbD Foundational Principles — Source: https://www.capgemini.com/2020/06/addressing-the-challenge-of-privacy-engineering/

In practice, PbD requires the translation of the above principles into engineering and product principles and practices. This could mean, for example, that:

  • PbD is included in the SDLC/product development process as specific, testable and measurable requirements, covering both the back end and UX/UI design.
  • privacy issues are surfaced as part of the product’s architecture and system design considerations.
  • user-centric products are developed focusing on privacy-related end-user goals, needs, wants and constraints.
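
To make the first bullet concrete, here’s a minimal Python sketch of what a specific, testable data-minimisation requirement could look like. The `build_signup_record` helper, the field names and the allowed-field list are hypothetical, not a prescribed pattern.

```python
# A minimal sketch of "collect only what's needed" expressed as a testable
# requirement. The field names and helper below are illustrative only.

ALLOWED_SIGNUP_FIELDS = {"email", "display_name", "password_hash"}

def build_signup_record(raw_form: dict) -> dict:
    """Keep only the fields the signup feature actually needs."""
    return {k: v for k, v in raw_form.items() if k in ALLOWED_SIGNUP_FIELDS}

def test_signup_collects_only_allowed_fields():
    raw_form = {
        "email": "user@example.com",
        "display_name": "Sam",
        "password_hash": "<hashed>",
        "date_of_birth": "1990-01-01",   # not needed for signup
        "device_fingerprint": "abc123",  # not needed for signup
    }
    record = build_signup_record(raw_form)
    assert set(record) <= ALLOWED_SIGNUP_FIELDS
```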

Is PbD mandatory?

GDPR

In the EU, PbD is now a legal obligation since the introduction of the General Data Protection Regulation (GDPR) in 2018 and infringement carries heavy fines — the maximum fine for non-compliance is the greater of €20 million or 4% of annual global turnover. In the three years that GDPR has been in force, over €1 billion in fines has been issued for violations of PbD obligations.

It’s worth noting that due to the extra-territorial application of the GDPR, any Australian business whose activities are caught by the GDPR (i.e. it operates in the EU and collects personal data from individuals in the EU) should implement PbD to ensure compliance with the GDPR.

Australian Privacy Law reform

While there’s currently no specific legal requirement in Australia for businesses to implement PbD, the Australian Government is undertaking a comprehensive review of the Privacy Act. Following this review, Australian businesses are expected to face wide-ranging GDPR-style legal obligations, including PbD, with penalties for failure to comply likely to be more severe.

The reforms are expected to increase maximum civil penalties for serious or repeated privacy breaches from $2.1 million to the greater of: 10% of a company’s annual domestic turnover; $10 million; or, three times the value of any benefit obtained through the misuse of information. These sweeping changes are expected to become effective in 2022; businesses that aren’t prepared may face massive fines and loss of consumer trust.

Australian privacy best practice

PbD is considered a privacy best practice in Australia and is recommended by the OAIC, the NSW Information and Privacy Commissioner, the Office of the Victorian Commissioner for Privacy & Data Protection and the National Data Commissioner. The Federal Government also took a PbD approach with the (now retired) COVIDSafe app, released in April 2020.

The Australian Consumer Data Right

The recently introduced Australian Consumer Data Right (CDR) aims to provide greater choice and control for Australians over how their data is used and disclosed. It allows consumers to access personal data and direct a business to securely transfer that data to an accredited data recipient. The CDR regime currently applies to the banking sector and will soon be rolled out for the energy and telecoms sectors, with more sectors (insurance, healthcare, retail, super) to be included down the line.

The CDR regime is accompanied by 13 privacy safeguards (contained in the Competition and Consumer Act and supplemented by the Consumer Data Rules). These privacy safeguards set out the privacy rights and obligations for users of the CDR scheme and — importantly — are modelled on PbD principles, covering areas such as obtaining consumers’ informed consent to collect, disclose, hold or use personal data; openness and transparency; anonymising data; the use of data for direct marketing; overseas disclosure of data; data security; and consumers’ right to correct their data. Customer data, account data and transaction data are the types of CDR consumer data to which the Privacy Safeguards apply.

The risk exposure for companies that handle CDR data is greatly increased under the Privacy Safeguards (compared to the existing Australian Privacy Principles) not only as the maximum penalties are now significantly higher (see above), but also because consumers now have a legal right to make direct claims against companies for infringement (this isn’t currently available under the Privacy Act).

Does privacy by design make business sense?

Privacy is often seen as a ‘nice to have’, but it also makes solid commercial sense because of these 3 key trends:

  • Shifting consumer expectations when it comes to privacy as a result of increased awareness of a fundamental privacy power imbalance;
  • The need to establish user trust and confidence, protect the company’s brand and reputation and avoid privacy violations; and
  • Legal and compliance reasons — massive fines for non-compliance with increasing privacy regulation around the world.

Why privacy by design is hard to do

PbD is a complex undertaking and can be difficult to achieve for 3 key reasons:

  1. Lack of a privacy-first culture — for PbD to succeed, the organisation must acknowledge the importance of privacy. Best practice usually requires appointing a dedicated privacy officer or privacy team accountable for privacy protection; having a robust privacy policy approved by senior management; C-Suite commitment to privacy as a company value; regular privacy impact assessments (PIAs) and audits; and rolling out a privacy education/awareness training programme.
  2. Lack of multidisciplinary collaboration — PbD is inherently a multidisciplinary exercise. PbD implementation requires a sustained, consistent collaboration across different stakeholders (and sometimes jurisdictions). Legal, product, engineering, procurement, marketing, HR, Finance and compliance/risk management should all be involved in the PbD effort. PbD can be a challenge for startups and SMEs due to their limited resources. PbD collaboration may also be hampered in larger organisations due to cumbersome politics, pervasive bureaucracy or lack of executive buy-in.
  3. Lack of data hygiene — Multiple siloed systems can lead to ‘orphan datasets’, duplicate, inconsistent or unknown data. Many companies don’t have a single clear corporate view of what data is collected, where it lives and how it’s used across the board (or who’s responsible for data governance and management). This means that privacy risks cannot be properly identified or preempted and the company is more likely to be non-compliant with its legal obligations.

How to implement privacy by design

The bottom line is that there is no one right way to do PbD. PbD should be tackled company-wide by multiple stakeholders, and tailored to the context and specific circumstances of each organisation. Below are a few privacy engineering best practices which have evolved to help engineers and product managers integrate PbD into their products and systems.

Privacy Engineering — Source: https://www.lawinfographic.com/what-is-privacy-engineering_by_jessica_lam/

1. Consider privacy by design early & continuously

The key to complying with the PbD principles and embedding privacy into product functionality, systems and processes is to address privacy early and on a continuous basis. Events which should trigger early consideration of privacy include:

  • Designing or deploying a product, service or project that involves the collection of personal information (whether from users, customers or employees).
  • Changes to methods by which personal data is collected, stored or used by the organisation or any specific team within.
  • Planning a new marketing product or campaign involving the use of user/customer personal information for targeting purposes.
  • A plan to merge, manipulate or transform multiple databases containing personal data.
  • When systems containing personal data are being retired.
  • When there’s a plan to incorporate personal data obtained from public or commercial sources/databases into an existing database.
  • A new business process that involves significant new collection, use or disclosure of personal data.
  • Using 3rd party vendors that could access, store or use personal data.

2. Identify personal data

The first step in assessing any privacy risks in your project/product is to clearly identify the personal data involved:

  • what personal data is collected, processed or used
  • when and how it’s being collected
  • where it’s stored/located
  • how it’s being used
  • who is accessing/using it
  • how it moves across various systems / who it will be shared with
  • how long it’s being retained for (and where)
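
One lightweight way to capture these answers is a per-field data inventory. The sketch below is illustrative only; the `PersonalDataRecord` fields are assumptions about what your organisation might track, not a standard schema.

```python
from dataclasses import dataclass

# An illustrative per-field data inventory entry — not a standard schema.
@dataclass
class PersonalDataRecord:
    data_field: str          # what personal data is collected
    collected_via: str       # when and how it's collected
    stored_in: str           # where it's stored/located
    used_for: str            # how it's being used
    accessed_by: list[str]   # who is accessing/using it
    shared_with: list[str]   # which systems / third parties it flows to
    retention_days: int      # how long it's retained for

inventory = [
    PersonalDataRecord(
        data_field="email_address",
        collected_via="signup form",
        stored_in="users table, eu-west-1 Postgres",
        used_for="login and transactional email",
        accessed_by=["auth service", "support staff"],
        shared_with=["email delivery vendor"],
        retention_days=730,
    ),
]
```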

3. Identify the privacy impact

The goal is to understand, at a very basic level, where the privacy risk is — what is the potential for privacy violations and what impact such violations might have.

Identify:

  • the key threat actors who might violate an individual’s privacy
  • the key individuals whose privacy might be violated

Threat actors — consider not only malicious threat actors but also third parties such as vendors (hosting provider, billing provider, network provider, audit firm, call centre operator, HR platforms, CRM), as well as governments and competitors.

Individuals — consider:

  • High level categories or segments (users vs. employees, certain types of users or locations).
  • What are the user’s (or other data subject’s) privacy expectations?
  • What legal/regulatory issues might impact the risk — is personal data collected from vulnerable data subjects (e.g. children, patients etc) which might be subject to specific legal/regulatory requirements?

Once you have a clear understanding of what personal data would be collected or processed and the potential risks, you may need to involve your privacy officer or legal team to evaluate whether a privacy impact assessment (PIA) may be required. A PIA is an analysis of the privacy risks associated with processing personal data in relation to a product, service or process. PIAs can also include recommendations for reducing the risks identified as part of the PIA. Depending on the size of your organisation, there may be policies or processes around when a PIA is required and who should conduct it.
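
If it helps to make the output of this exercise concrete, one rough way to record it is a simple likelihood × impact score per threat. The scales, example threats and scores below are purely illustrative, not a formal PIA methodology.

```python
# Illustrative only: record each identified threat with a rough
# likelihood x impact score so risks can be compared and prioritised.
LIKELIHOOD = {"rare": 1, "possible": 2, "likely": 3}
IMPACT = {"minor": 1, "moderate": 2, "severe": 3}

def risk_score(likelihood: str, impact: str) -> int:
    return LIKELIHOOD[likelihood] * IMPACT[impact]

threats = [
    ("analytics vendor re-identifies users from shared events", "possible", "severe"),
    ("support staff browse customer records without a reason", "likely", "moderate"),
]

for description, likelihood, impact in threats:
    print(risk_score(likelihood, impact), description)
```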

4. Identify specific privacy product requirements

Each product or project will have its own technical, usability, commercial and legal requirements coming from different stakeholders. You may need to identify and comply with specific privacy requirements such as:

  • The software should only collect the personal data necessary for its intended functionality or purpose;
  • The software should include appropriate mechanisms for obtaining end user consent to personal data collection;
  • The database should have mechanisms in place to avoid future data linkage;
  • The software should encrypt all personal data by default using standardised encryption mechanisms with securely managed encryption keys;
  • All personal data should be anonymised whenever possible;
  • There should be an expiry date associated with all personal data that is collected;
  • All collected personal data should be properly deleted after it expires;
  • The software should provide audit trails showing how personal data was collected, processed and deleted;
  • The software should be subject to a thorough security risk and threat assessment;
  • The system should enable consumer/user requests for personal data edits, changes or deletion (and a prompt response); or
  • Users can control their own personal data (view, access, edit, delete, share with 3rd parties).
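
As an illustration of the encryption-by-default requirement above, here’s a minimal sketch using the `cryptography` library’s Fernet recipe. In practice the key would come from a managed key store (KMS/HSM) rather than being generated next to the data it protects.

```python
from cryptography.fernet import Fernet  # pip install cryptography

# Sketch only: a real system would fetch the key from a managed key store,
# not generate it inline alongside the data it protects.
key = Fernet.generate_key()
fernet = Fernet(key)

def store_personal_field(value: str) -> bytes:
    """Encrypt a personal data field before writing it to storage."""
    return fernet.encrypt(value.encode("utf-8"))

def read_personal_field(ciphertext: bytes) -> str:
    return fernet.decrypt(ciphertext).decode("utf-8")

token = store_personal_field("jane.doe@example.com")
assert read_personal_field(token) == "jane.doe@example.com"
```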

5. Take a privacy-by-architecture approach

Privacy violation risks can be minimised by reducing a threat actor’s access to personal data or governing the threat actor’s behaviour through technical, administrative or operational means. This involves system architecture as well as various strategies, tactics and processes that together ensure the risk is managed and mitigated.

The “privacy-by-architecture” approach to system design says that a system’s privacy friendliness can be measured by:

  • How identifiable personal data is; and
  • How centralised the architecture is.

A system is more privacy friendly when data is less identifiable and less centralised.

According to this approach, you can mitigate privacy risks by anonymising data and pushing that data toward a more client- or user-centric architecture. The most privacy-friendly level of identifiability is a system that is anonymous and unlinkable. Perfect anonymisation is difficult to achieve in practice, but Privacy Enhancing Technologies (particularly emerging ones such as homomorphic encryption (HE), trusted execution environments, secure multi-party computation and differential privacy) offer clever approaches to the identifiability problem.
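
To give a flavour of how differential privacy tackles identifiability, here’s a toy sketch of the Laplace mechanism: a count query is answered with calibrated noise so that any single individual’s presence has limited influence on the released result. The epsilon value and the example query are illustrative only.

```python
import numpy as np

# Toy Laplace mechanism: answer a count query with noise whose scale is
# sensitivity / epsilon, so one individual's presence has bounded influence.
def noisy_count(true_count: int, epsilon: float = 0.5, sensitivity: float = 1.0) -> float:
    return true_count + np.random.laplace(loc=0.0, scale=sensitivity / epsilon)

# e.g. "how many users in this postcode opted in?" — released with noise
print(noisy_count(true_count=42))
```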

When data is centralised, there’s a higher risk of privacy violations compared to data stored and used directly in the user’s domain. Therefore decentralisation is another technique said to make systems more privacy friendly. For example, Federated Learning is a novel machine learning technique (notably used by Google) that trains an algorithm across multiple decentralised edge devices or servers holding local data samples, without exchanging them.
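
The sketch below is a toy simulation of the federated-averaging idea (not Google’s implementation): each client computes an update on its own local data, and only the resulting weights are combined centrally, so raw data never needs to leave the client.

```python
import numpy as np

# Toy simulation of federated averaging: each "client" computes an update on
# its own local data; only the updated weights are averaged centrally.
# Real federated learning runs on devices and adds secure aggregation etc.
def local_update(weights: np.ndarray, local_data: np.ndarray) -> np.ndarray:
    gradient = local_data.mean(axis=0) - weights   # stand-in for real training
    return weights + 0.1 * gradient

def federated_round(global_weights, clients_data):
    updates = [local_update(global_weights, data) for data in clients_data]
    return np.mean(updates, axis=0)                # only weights leave the clients

global_weights = np.zeros(3)
clients_data = [np.random.rand(20, 3) for _ in range(5)]  # each stays "on device"
for _ in range(10):
    global_weights = federated_round(global_weights, clients_data)
```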

Other privacy-friendly tactics include:

  • Role-based access controls (only certain users have access to certain data, e.g. separate HR data from Finance data);
  • Isolating — keeping personal data on separate servers or processing personal data (for different purposes) independently in separate databases or systems. Data can be isolated geographically, demographically, per customer, per individual etc.
  • Minimising the personal data collected.
  • Excluding certain personal data from collection.
  • Stripping personal data from data shared with 3rd parties, and incorporating contractual provisions that require the vendor to delete the data when it’s no longer necessary or following a data subject’s request for removal (see the sketch after this list).
  • Deletion — regularly removing personal data when it’s no longer required.
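
For example, the stripping tactic might look something like the following sketch: direct identifiers are dropped and the user id is replaced with a salted hash before a record is shared with a vendor. The field names and salt handling are illustrative, not a recommendation.

```python
import hashlib

# Illustrative only: drop direct identifiers and replace the user id with a
# salted hash before a record is shared with a third-party vendor.
DIRECT_IDENTIFIERS = {"name", "email", "phone", "address"}

def prepare_for_vendor(record: dict, salt: str) -> dict:
    shared = {k: v for k, v in record.items()
              if k not in DIRECT_IDENTIFIERS and k != "user_id"}
    shared["user_ref"] = hashlib.sha256(
        (salt + str(record["user_id"])).encode("utf-8")
    ).hexdigest()
    return shared

record = {"user_id": 123, "name": "Jane Doe", "email": "jane@example.com",
          "plan": "pro", "signup_date": "2021-06-01"}
print(prepare_for_vendor(record, salt="per-vendor-secret"))
```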

6. Bake privacy into the SDLC

Baking privacy into the SDLC, with better tooling in the CI/CD pipeline, is an emerging best practice, analogous to security’s upstream shift into the development cycle. The result is a system that respects its users’ privacy out of the gate, without band-aids or slow manual processes: the system identifies its data flows clearly (so that separate data discovery and mapping efforts become redundant) and your product stack self-describes its data operations immediately upon deployment. Continuous application privacy can be achieved across the SDLC, from automated privacy coding to automated risk analysis and consent tracking, governance and reporting.
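
One deliberately simplified example of pushing privacy left is a CI-time check that fails the build when a data model declares a personal data field the data inventory hasn’t approved. The field sets below are hypothetical stand-ins, not a real tool’s API.

```python
# A simplified CI-time privacy check: the build fails if a model declares a
# personal data field that the data inventory hasn't approved.
APPROVED_FIELDS = {"email", "display_name", "country"}          # from the inventory
USER_MODEL_FIELDS = {"email", "display_name", "country", "precise_location"}

def test_no_undeclared_personal_data():
    undeclared = USER_MODEL_FIELDS - APPROVED_FIELDS
    assert not undeclared, f"undeclared personal data fields: {undeclared}"
```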

Tools such as Ethyca, Privitar and Dastra (and many other privacy tech and GRC tools) help achieve privacy compliance, covering aspects such as data mapping, customer data requests and consent management, as well as analytics and data science privacy solutions.

The way forward

There’s no question that, today, digital privacy and trust are key issues in the tech sector and beyond. Consumers want to be assured that the digital products they use have been designed with integrity, safety and ethics in mind.

The future undoubtedly holds exciting opportunities for those who can deliver innovation with baked-in privacy, and companies who can capitalise on those capabilities will win the trust race. Engineers will have a critical role to play in making those wins a reality.

_________________________________

Resources & Best practices

  1. IAPP’s Strategic Privacy by Design
  2. Free e-book — The Privacy Engineer’s Manifesto
  3. Applying Privacy by Design in Software Engineering — An European Perspective by Karin Bernsmed, Department of software engineering, safety and security SINTEF ICT
  4. Technical Privacy Metrics: a Systematic Survey by Isabel Wagner and David Eckhoff (June 2018)
  5. Engineering Privacy by Sarah Spiekermann (Vienna University of Economics and Business) & Lorrie Faith Cranor Carnegie Mellon University — School of Computer Science and Carnegie Institute of Technology
  6. Privacy Engineering & Assurance by Nokia
  7. Embedding GDPR in the secure development lifecycle (SDLC)
  8. EY article: How to Successfully Embed a Culture of Privacy by Design
  9. AI ethics: How Salesforce is helping developers build products with ethical use and privacy in mind
  10. ‘Privacy by design’: Google to give people more power over their personal data
  11. Uber Privacy & Security blog
  12. LinkedIn Engineering’s Fairness, Privacy, and Transparency by Design in AI/ML Systems
  13. Shift Left: Create a Fast, Secure Development Lifecycle
  14. Stackoverflow’s blog — Privacy is an afterthought in the software lifecycle. That needs to change.


Rachel Dulberg
CodeX

Privacy, data + product nerd. Former tech lawyer + founder. I write about issues at the convergence of innovation, technology, product & privacy.