Learning Series – What is a digital identity? - Laboratoire de confiance numérique du Canada

April 13, 2023

Part 6 – Privacy

We’ve touched upon privacy in a previous publication (see Part 3 – Models), and emphasized how it was an important component in establishing digital trust and overcoming barriers to adoption in terms of reliability and accountability (see Part 4 – Barriers to Adoption). Now having elaborated some of the underlying technologies that enable the trusted use of identity credentials in a digital space (see Part 5 – Cryptography), we’ll demonstrate how, when paired with prevailing concepts within the digital identity ecosystem, digital credential holders are empowered to protect their privacy and the security of their personal information. Established principles surrounding privacy, adopted in Canada and around the world, lay the groundwork to illustrate how ideas made possible by digital credential technology are aligned with privacy protection for all individuals. We will describe the following concepts:

Privacy by Design
Selective Disclosure
Zero-Knowledge Proof

Privacy by Design

Privacy by Design is a Canadian framework that was first developed and published in the early 2000’s by Ann Cavoukian, the former Information and Privacy Commissioner of Ontario.¹ It has since been recognized by the Global Privacy Assembly², and its elements have been integrated into the International Organization for Standardization ISO/DIS 31700 Consumer protection — Privacy by design for consumer goods and services.³ Privacy by Design presents seven foundational principles that lay out a holistic approach to embedding privacy concerns in systems development. These are:

Privacy protection is proactive, not reactive – threats to privacy should be anticipated and prevented before they happen.
Privacy as a default setting – personal information is automatically protected by default. If an individual does nothing, then their privacy remains intact.
Privacy embedded into design – privacy concerns must be an essential component of the functionality being created, not an add-on to the core.
Privacy as a positive sum – privacy and non-privacy system objectives can usually be met in a positive-sum manner so that all concerns are accommodated.
Full lifecycle protection – privacy measures apply during the full lifecycle of the information, from creation or collection right through to secure destruction.
Openness – all business practices and technology involved operate according to stated objectives and are subject to independent verification.
Respect for user privacy – a user-centric approach that keeps the interests of the individual at the forefront when developing a system or process design.

Applying these principles from the outset helps individuals maintain control over their personal information and holds organizations to a standard of accountability for their processes. Combining these principles with the cryptographic practices discussed in Part 5 – Cryptography instills trust in digital credentials. This trust should accelerate adoption as more people realize that digital credentials can in fact protect privacy better than physical credentials. Selective disclosure and zero-knowledge proofs are two examples of how.

Selective Disclosure

Selective disclosure is the ability for a subject to share a selected subset of information, limited to only what is required at the time. From a digital credential perspective, this means the ability to share only the credential identifiers and attributes required to prove an assertion within a specific context.

An example often used to illustrate this concept is the verification of age to enter an establishment, like a bar. At the door, a bouncer asks you to prove that you are above a certain age. In this example, a physical driver’s licence could be used as a piece of ID to verify your age. You produce your ID and the bouncer compares you to the picture on the ID to make sure they match. This is the bouncer verifying that the credential belongs to you. Your date of birth is then used to determine whether you meet the age requirement for entry. Unfortunately, you have also revealed sensitive personal information that was not required, like your home address and full date of birth. Even more concerning, identity documents are sometimes scanned electronically in this scenario, which means you also risk giving up control of your information. If selective disclosure were possible in this scenario, you would have been able to expose only your picture and date of birth. Even just your year of birth would suffice in most cases.

From a digital identity perspective, the approach to selective disclosure has largely focussed on using digital signature techniques to expose only the identifiers or attributes necessary, while keeping other information inaccessible. Work is ongoing to mature the methods that make this possible. However, broad use of signature-based selective disclosure will require widespread adoption of common underlying standards to ensure interoperability.
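One way to picture signature-based selective disclosure is with salted hashes: the issuer signs a list of attribute digests rather than the attributes themselves, and the holder later reveals only the attributes (and salts) a verifier needs. The sketch below is illustrative only and is not a description of any particular standard. The function names, the sample licence data and `ISSUER_KEY` are assumptions, and an HMAC stands in for a real digital signature, which would normally use an asymmetric key pair so that verifiers never hold the signing key.

```python
import hashlib
import hmac
import json
import secrets

# Hypothetical issuer key. HMAC is a stand-in for a real digital signature.
ISSUER_KEY = b"demo-issuer-key"


def digest(name, value, salt):
    """Hash one salted attribute so it can be committed to without being shown."""
    return hashlib.sha256(json.dumps([salt, name, value]).encode()).hexdigest()


def sign(digests):
    """Stand-in for the issuer's signature over the list of digests."""
    return hmac.new(ISSUER_KEY, json.dumps(digests).encode(), hashlib.sha256).hexdigest()


def issue(attributes):
    """Issuer: salt and hash every attribute, then sign the sorted digest list."""
    salts = {name: secrets.token_hex(16) for name in attributes}
    digests = sorted(digest(n, v, salts[n]) for n, v in attributes.items())
    return {"attributes": attributes, "salts": salts,
            "digests": digests, "signature": sign(digests)}


def present(credential, reveal):
    """Holder: disclose only the chosen attributes (with salts) plus all digests."""
    disclosed = [(n, credential["attributes"][n], credential["salts"][n]) for n in reveal]
    return {"disclosed": disclosed, "digests": credential["digests"],
            "signature": credential["signature"]}


def verify_presentation(presentation):
    """Verifier: check the signature, then match each disclosure to a signed digest."""
    if not hmac.compare_digest(sign(presentation["digests"]), presentation["signature"]):
        return False
    return all(digest(n, v, s) in presentation["digests"]
               for n, v, s in presentation["disclosed"])


licence = issue({"name": "Alex Example", "date_of_birth": "2001-06-15",
                 "address": "123 Main St"})
shown = present(licence, ["date_of_birth"])
print(verify_presentation(shown))  # name and address stay with the holder
```

The random salts matter: without them, a verifier could guess likely values for a hidden attribute, hash each guess, and compare the results against the signed digests.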

Zero-Knowledge Proof

Zero-knowledge proofs go a step further in privacy protection. Let’s go back to the bar example. With a physical driver’s licence, the bouncer sees your photo, name, date of birth, address, height, eye colour, etc. With selective disclosure, the bouncer sees only the specific attributes needed for the transaction (getting into a bar). The attributes in this case are name, photo and date of birth. In fact, the date of birth is not what is really required. The only thing the bouncer needs to know is that you are over a certain age. A zero-knowledge proof enables this by asking a question of the credential: Is the holder’s age greater than or equal to the minimum requirement? Assuming our example is taking place on March 1, 2023 and the legal drinking age is 18, the question could be: Is the date of birth no later than March 1, 2005? In other words, with a zero-knowledge proof you can prove to a bouncer that you are of age without revealing your date of birth; the bouncer learns only that it falls on or before the cutoff date.
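The date arithmetic behind that question can be sketched in a few lines. The function names below are assumptions for illustration, and the code evaluates the statement in the clear; in an actual zero-knowledge proof the verifier would receive only evidence that the statement is true, never the date of birth passed in here.

```python
from datetime import date


def latest_qualifying_birth_date(today, minimum_age):
    """Return the latest date of birth that still satisfies the age requirement."""
    try:
        return today.replace(year=today.year - minimum_age)
    except ValueError:
        # today is February 29 and the target year is not a leap year
        return today.replace(year=today.year - minimum_age, day=28)


def is_of_age(date_of_birth, today, minimum_age):
    """The yes/no statement the verifier wants answered."""
    return date_of_birth <= latest_qualifying_birth_date(today, minimum_age)


today = date(2023, 3, 1)
cutoff = latest_qualifying_birth_date(today, 18)
print(cutoff)                                   # 2005-03-01
print(is_of_age(date(2005, 3, 1), today, 18))   # True
print(is_of_age(date(2005, 3, 2), today, 18))   # False
```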

Zero-knowledge proofs can also be used for password authentication. Just as the bouncer needs to know only that you are of legal drinking age, not your birth date, the verifier does not necessarily need you to enter the password; they just need to know that you know it. A zero-knowledge proof provides a mechanism to establish this without entering or transmitting the password in an environment that might be exposed to malicious actors.

There are a few non-digital examples used to demonstrate this concept. The ‘Ali Baba Cave’ example is commonly used as a physical illustration. In this case, you have two characters, Tania and Simon. While on an adventure together, they end up in front of a cave with two open entrances leading to two distinct paths (A and B). A gate inside the cave connects both paths. A secret code opens the gate and grants passage, after which it is sealed once more. Simon knows the secret code and wants to demonstrate this without revealing it. Tania wants to verify that Simon actually knows the secret code to open the gate. We now have the roles of ‘prover’ (Simon) and ‘verifier’ (Tania).

Simon secretly enters the cave through one of the paths, A or B. Then, Tania approaches the cave and asks Simon to come out through one path or the other. If Simon knows the secret code, he can open the gate and return through the path requested by Tania. The first time, Simon might have passed the test by chance, since there’s a 50% probability that he entered through the same path Tania asked him to exit from. If the process is repeated multiple times, the probability that Simon exits through the requested path every time without knowing the secret code halves with each round and quickly becomes negligible.
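To get a feel for how quickly luck runs out, the short sketch below computes the chance of passing every round by guessing and simulates a Simon who does not know the code. The function names and the fixed random seed are assumptions for illustration.

```python
import random


def cheat_probability(rounds):
    """Chance of guessing Tania's requested path correctly in every round."""
    return 0.5 ** rounds


def bluffer_passes(rounds, rng):
    """Simulate a Simon without the code, who can only exit the way he entered."""
    for _ in range(rounds):
        entered = rng.choice("AB")
        requested = rng.choice("AB")
        if entered != requested:
            return False  # caught: he cannot pass through the gate
    return True


for rounds in (1, 10, 20):
    print(rounds, cheat_probability(rounds))

rng = random.Random(2023)  # fixed seed so the run is repeatable
trials = 10_000
passed = sum(bluffer_passes(10, rng) for _ in range(trials))
print(f"bluffer passed 10 rounds in {passed} of {trials} trials")
```

After 10 rounds the chance of bluffing successfully is 1 in 1,024, and after 20 rounds it is roughly 1 in a million.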

In a digital transaction, the ZKP cryptographic algorithm can be repeated many times very quickly, reducing the odds that the prover was lucky to almost nothing. If an answer can be provided that is correct every time, then there can be statistical certainty on the part of a verifier that the person has the knowledge they are asking about. But how is the possession of knowledge revealed without revealing the knowledge itself in an online world?

A zero-knowledge proof must meet three criteria:

Completeness – When a statement is true, the verifier will be convinced of this fact;
Soundness – When a statement is false, a prover cannot convince the verifier that it is true, whether by random guessing or any other malicious approach, except with near-zero probability; and
Zero-knowledge – If the statement is true, the verifier learns nothing other than the statement is true.

To accomplish this, some very advanced mathematics and encryption techniques are used. If the inputs to a complex algorithm are the digital value of the knowledge requested plus a random number, the results of the algorithm can be sent to a verifier that can determine if the “answer” required possession of the knowledge, without revealing it. If this process is repeated enough times and the answer always demonstrates knowledge held, then probability theory allows the verifier to be statistically certain that the person they are dealing with does indeed have knowledge of the secret. But how efficient can this process be when it needs to be repeated enough times to provide statistical certainty? Several techniques are being advanced in this area, such as non-interactive zero-knowledge proof relying on the Fiat–Shamir heuristic.⁴
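As a rough illustration of the idea, the sketch below uses a Schnorr-style proof of knowledge made non-interactive with the Fiat–Shamir heuristic: the challenge that a verifier would normally send is instead derived from a hash, so a single message is enough. This is one well-known construction chosen for illustration, not necessarily the one any given credential system uses. The group parameters `P`, `Q` and `G` are toy values far too small to be secure, and all function names are assumptions.

```python
import hashlib
import secrets

# Toy group parameters (assumptions for illustration, far too small for real use):
# P = 2*Q + 1 with P and Q prime, and G generates the subgroup of order Q.
P, Q, G = 2039, 1019, 4


def challenge(public_key, commitment):
    """Fiat-Shamir step: derive the challenge by hashing instead of asking the verifier."""
    data = f"{G}|{public_key}|{commitment}".encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q


def prove(secret):
    """Prover: show knowledge of `secret` such that public_key = G^secret mod P."""
    public_key = pow(G, secret, P)
    nonce = secrets.randbelow(Q - 1) + 1   # fresh random number for every proof
    commitment = pow(G, nonce, P)
    response = (nonce + challenge(public_key, commitment) * secret) % Q
    return public_key, (commitment, response)


def verify_proof(public_key, proof):
    """Verifier: check G^response == commitment * public_key^challenge (mod P)."""
    commitment, response = proof
    c = challenge(public_key, commitment)
    return pow(G, response, P) == (commitment * pow(public_key, c, P)) % P


secret = secrets.randbelow(Q - 1) + 1      # never sent to the verifier
public_key, proof = prove(secret)
print(verify_proof(public_key, proof))
```

The verifier sees only the public key, the commitment and the response. The random nonce masks the secret inside the response, which is why a fresh nonce is needed for every proof.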

There is another approach that is applicable when the answer is known to both parties. As an example, both parties may know an authenticator, like a password, but wish to avoid the security concerns associated with transmitting it during authentication. In this case, complex mathematics can be used to generate patterns. Several patterns are produced using the same “seed”, an initial input that defines the sequence of outputs generated under certain conditions, and compared to one generated by the verifier. Complex math can be used to determine if all patterns produced must have come from the same seed. This process is more efficient and provides confidence that the knowledge is held, without entering or transmitting it.
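One loose way to picture this shared-secret approach is a challenge-response exchange: the verifier sends a fresh random challenge, both sides derive a value from the challenge and the shared secret, and only the derived values are compared. The sketch below uses an HMAC for the derivation. This is a substitute technique chosen for illustration rather than the specific method described above, and it is not a zero-knowledge proof in the strict sense. The function names and sample password are assumptions.

```python
import hashlib
import hmac
import secrets


def respond(shared_secret, nonce):
    """Derive a one-time 'pattern' from the shared secret and a fresh challenge."""
    return hmac.new(shared_secret, nonce, hashlib.sha256).digest()


def authenticate(verifier_secret, prover_secret, rounds=3):
    """Run several challenge-response rounds; the secret itself is never sent."""
    for _ in range(rounds):
        nonce = secrets.token_bytes(16)              # verifier's random challenge
        answer = respond(prover_secret, nonce)       # computed by the prover
        expected = respond(verifier_secret, nonce)   # computed by the verifier
        if not hmac.compare_digest(answer, expected):
            return False
    return True


password = b"correct horse battery staple"  # hypothetical shared authenticator
print(authenticate(password, password))
print(authenticate(password, b"wrong guess"))
```

Because each challenge is random and used once, a recorded answer cannot simply be replayed in a later session.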

Wrap-up

The more technical aspects of digital credentials ultimately allow for the simplest of ideas: privacy, security, and trust. Increasing the confidence of everyday credential holders in bringing their interactions online is foundational to the underlying concepts of digital credentials. Accelerating the adoption of digital identity technology will make its benefits more widely available and empower innovation on a greater scale.

We want to hear from you. Let us know your interests, questions, and any topics relating to digital identity that you would like to know more about. Get in touch with IDLab!

¹ Office of the Information and Privacy Commissioner of Ontario, 2013, “Privacy by Design” (PDF), https://www.ipc.on.ca/wp-content/uploads/2013/09/pbd-primer.pdf, retrieved March 13, 2023.

² International Conference of Data Protection and Privacy Commissioners, 2010, “Resolution on Privacy by Design” (PDF), https://edps.europa.eu/sites/edp/files/publication/10-10-27_jerusalem_resolutionon_privacybydesign_en.pdf, retrieved March 13, 2023.

³ International Organization for Standardization, 2021, “Consumer protection — Privacy by design for consumer goods and services”, https://www.iso.org/obp/ui/#iso:std:iso:31700:dis:ed-1:v1:en, retrieved March 13, 2023.

⁴ Fiat & Shamir, 200, “How To Prove Yourself: Practical Solutions to Identification and Signature Problems”, https://link.springer.com/chapter/10.1007/3-540-47721-7_12, retrieved March 13, 2023.
