This FAQ attempts to answer frequently asked questions about the DP-3T project, the problems it tries to address, and its design choices. It is by no means complete. We will keep updating this FAQ as we go; for now, we have focussed on answering the technical questions first. Feedback is very welcome.
* [Protocol Questions](#protocol-questions)
* [P1: Why don’t infected users upload the ephemeral Bluetooth identifiers (EphIDs) they have observed to the backend server, so that other apps can download them and check for contacts locally?](#p1-why-dont-infected-users-upload-the-ephemeral-bluetooth-identifiers-ephids-they-have-observed-to-the-backend-server-so-that-other-apps-can-download-them-and-check-for-contacts-locally)
* [P2: Why don’t infected users upload the ephemeral Bluetooth identifiers (EphIDs) they have observed to the backend server, so that other apps can ask the server if there is a match with their own EphIDs?](#p2-why-dont-infected-users-upload-the-ephemeral-bluetooth-identifiers-ephids-they-have-observed-to-the-backend-server-so-that-other-apps-can-ask-the-server-if-there-is-a-match-with-their-own-ephids)
* [P3: Why not use multi party computation or custom privacy\-preserving protocols (PSI, PIR, etc\.) instead to query the server for the observed ephemeral Bluetooth identifiers?](#p3-why-not-use-multi-party-computation-or-custom-privacy-preserving-protocols-psi-pir-etc-instead-to-query-the-server-for-the-observed-ephemeral-bluetooth-identifiers)
* [P4: Why is the system not using public key cryptography when broadcasting identifiers?](#p4-why-is-the-system-not-using-public-key-cryptography-when-broadcasting-identifiers)
* [P5: Why not use mixnets or other anonymous communication systems to query the server?](#p5-why-not-use-mixnets-or-other-anonymous-communication-systems-to-query-the-server)
* [P6: Why do infected people upload a seed (which enables recreating EphIDs) instead of their individual EphIDs?](#p6-why-do-infected-people-upload-a-seed-which-enables-recreating-ephids-instead-of-their-individual-ephids-)
* [P7: Why do you call your design "decentralized" while having a backend?](#p7-why-do-you-call-your-design-decentralized-while-having-a-backend)
## Protocol Questions
Questions regarding the underlying protocol and mitigations for known vulnerabilities.
### P1: Why don’t infected users upload the ephemeral Bluetooth identifiers (`EphIDs`) they have observed to the backend server, so that other apps can download them and check for contacts locally?
**Short answer:** The bandwidth cost of downloading all observed Bluetooth
identifiers from all infected individuals is high. Furthermore, it facilitates
attacks that insert or remove contact events. Finally, it reveals interactions
between pseudonymous users to the backend server, without providing extra
privacy in comparison with publishing the infected users’ seeds.
**Long answer:** It is possible to build a privacy-friendly contact tracing
system by letting diagnosed patients upload the list of observed ephemeral
Bluetooth identifiers (EphIDs). All other smartphones would then download this list,
and check if any of the identifiers they generated was seen by (and therefore in
close physical proximity to) an infected patient.
This option, however, is very costly. In Europe there are more than 30,000
new patients a day. The number of observed EphIDs is also high. We expect people to
be in close physical proximity with many people. For instance, spending 24 hours
at home with your partner will already yield 96 recorded EphIDs (assuming they
change every 15 minutes). So let’s say an infected person uploads 5000 unique
contact events covering 21 days. At 30,000 patients a day, we then need to
transfer 150 million records per day. Even
using efficient representations (e.g., a cuckoo filter) this would take at least
600 MB to be downloaded by every app, every day.
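
The estimate above can be reproduced with a short back-of-the-envelope calculation. The sketch below uses the figures quoted in the text. The value of 4 bytes per record is an assumption chosen so that the result matches the 600 MB figure; it is not a parameter stated by the design, and the function name is purely illustrative.

```python
# Back-of-the-envelope estimate of the daily download size (figures from the text).
NEW_PATIENTS_PER_DAY = 30_000   # newly diagnosed patients per day in Europe
EPHIDS_PER_PATIENT = 5_000      # unique observed EphIDs uploaded per patient
BYTES_PER_RECORD = 4            # ASSUMPTION: compact filter encoding, ~4 bytes/record


def daily_download(patients: int, ephids: int, bytes_per_record: int) -> tuple[int, int]:
    """Return (records, bytes) that every app would have to download per day."""
    records = patients * ephids
    return records, records * bytes_per_record


records, size_bytes = daily_download(
    NEW_PATIENTS_PER_DAY, EPHIDS_PER_PATIENT, BYTES_PER_RECORD
)
print(f"{records:,} records, {size_bytes / 1e6:.0f} MB per app per day")
```

Changing `BYTES_PER_RECORD` shows how sensitive the total is to the encoding: even a very compact representation still leaves hundreds of megabytes per app per day.
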
Sending observed contacts also increases the likelihood that a tech-savvy user
creates fake contact events, which in turn can lead to unnecessary anxiety. To
### P2: Why don’t infected users upload the ephemeral Bluetooth identifiers (`EphIDs`) they have observed to the backend server, so that other apps can ask the server if there is a match with their own `EphIDs`?
**Short answer:** This results in a high load on the server and either reveals
privacy-sensitive information to the server or requires anonymous
communication.
**Long answer:** In this solution, rather than downloading a list of all
EphIDs observed by infected patients, apps would instead query the backend
server with their own EphIDs to ask if any of them has been in contact with an
infected patient. The consequence is a significant increase in bandwidth usage.
In particular, each app must query, every day, all the EphIDs that it broadcast
in the last 21 days (as newly diagnosed patients might have seen these in the
past), which is estimated at approximately 2,000 EphIDs per day per user.
For privacy reasons, it is essential that the server cannot link all EphIDs of a
single user. Therefore, users must query their EphIDs separately and via an
anonymous communication network so that their identifiers remain unlinkable. For
50 million users, the server must therefore be able to process more than a
million lookup queries per second.
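
The query rate quoted above follows from the figures in this FAQ. The sketch below derives the per-user EphID count from the 15-minute rotation interval and the 21-day window, then spreads the resulting lookups evenly over a day. The even spread is a simplifying assumption (real traffic would have peaks), and the function names are illustrative.

```python
# Rough server load if every app queries each of its own EphIDs separately.
SECONDS_PER_DAY = 24 * 60 * 60
ROTATION_MINUTES = 15        # EphID lifetime assumed in this FAQ
RETENTION_DAYS = 21          # how far back an app must check
USERS = 50_000_000           # user base considered in the text


def ephids_to_query(rotation_minutes: int, retention_days: int) -> int:
    """Number of EphIDs one app broadcast during the retention window."""
    ephids_per_day = 24 * 60 // rotation_minutes
    return ephids_per_day * retention_days


def lookups_per_second(users: int, ephids_per_user: int) -> float:
    """Average lookup rate, ASSUMING queries are spread evenly over the day."""
    return users * ephids_per_user / SECONDS_PER_DAY


per_user = ephids_to_query(ROTATION_MINUTES, RETENTION_DAYS)
rate = lookups_per_second(USERS, per_user)
print(f"{per_user} EphIDs per user, about {rate:,.0f} lookups per second")
```

The derived count of 2,016 EphIDs per user is consistent with the approximate figure of 2,000 used in the text, and the resulting rate is above one million lookups per second.
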
### P3: Why not use multi party computation or custom privacy-preserving protocols (PSI, PIR, etc.) instead to query the server for the observed ephemeral Bluetooth identifiers?
We all love privacy-preserving cryptography. However, the scale at which this
system must operate is significant: a server set size of 150 million entries of
16 bytes each (corresponding to 30k new infections a day and 5000 distinct
recorded EphIDs), a client set of 2,000 items, and 50 million daily queries
(>500 queries per second).
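
To make these scale figures concrete, the following sketch computes the raw size of the server set and the average query rate from the numbers above. It only covers the plain data volume; the overhead of any particular PSI or PIR protocol is not modelled, and the even distribution of queries over the day is an assumption.

```python
# Scale parameters for a private set intersection / retrieval deployment.
SERVER_ENTRIES = 150_000_000   # 30k new infections/day x 5000 recorded EphIDs
ENTRY_BYTES = 16               # size of one server-side entry
DAILY_QUERIES = 50_000_000     # one query per user per day
SECONDS_PER_DAY = 24 * 60 * 60

# Raw size of the server set, before any protocol-specific encoding.
server_set_bytes = SERVER_ENTRIES * ENTRY_BYTES

# Average rate, ASSUMING queries are spread evenly over the day.
queries_per_second = DAILY_QUERIES / SECONDS_PER_DAY

print(f"server set: {server_set_bytes / 1e9:.1f} GB")
print(f"average load: {queries_per_second:.0f} queries per second")
```
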
It might be possible to design and deploy special purpose cryptographic
techniques that scale to this level and we are aware of research prototypes that
might be able to fulfil the requirements and for which code might be available.
However, a significant investment of time and engineering effort would still be
needed to take such prototypes and develop them to the point where they could be
deployed in a mobile application.
### P4: Why is the system not using public key cryptography when broadcasting identifiers?
First, in DP-3T every device must communicate with all of its neighbours, meaning that
authentication is impossible. Thus, a malicious party can inject their own
traffic and hence participate in any exchange.
Secondly, any application of public key cryptography would require a connection
between devices or multiple broadcasts (each broadcast is limited to only 11
bytes and the smallest public keys are around 32 bytes). In a crowded
environment there is substantial message loss from interference between
messages. It is unlikely that performing N^2 connections or exchanges between N
apps would function effectively, in contrast to N broadcasts in the current
protocol.
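
The scaling argument above can be illustrated with a small calculation. The sketch below uses the 11-byte broadcast payload and 32-byte public key size from the text. Counting pairwise exchanges as ordered pairs, `n * (n - 1)`, is an assumption made to illustrate the quadratic growth; the function names are hypothetical.

```python
import math

BROADCAST_PAYLOAD_BYTES = 11   # payload available per broadcast (from the text)
PUBLIC_KEY_BYTES = 32          # approximate size of the smallest public keys


def fragments_needed(key_bytes: int, payload_bytes: int) -> int:
    """Broadcasts needed to carry one public key, ignoring any framing overhead."""
    return math.ceil(key_bytes / payload_bytes)


def broadcasts(n: int) -> int:
    """Current protocol: each of the n apps simply broadcasts its identifier."""
    return n


def pairwise_exchanges(n: int) -> int:
    """ASSUMPTION: one directed exchange per ordered pair of apps, i.e. O(n^2)."""
    return n * (n - 1)


for n in (10, 50, 200):
    print(f"N={n}: {broadcasts(n)} broadcasts vs {pairwise_exchanges(n)} exchanges")
print(f"fragments per public key: {fragments_needed(PUBLIC_KEY_BYTES, BROADCAST_PAYLOAD_BYTES)}")
```

With 50 apps in range, this gives 50 broadcasts against 2,450 directed exchanges, and each public key would need at least three broadcasts on its own, which compounds the message loss in crowded environments.
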
### P5: Why not use mixnets or other anonymous communication systems to query the server?
Our design uses a small number of dummy messages to protect uploads to the backend and to epidemiologists against traffic analysis by network adversaries. Using a mixnet, Tor, or another anonymous communication system would additionally conceal from the backend the IP addresses of users submitting reports.
In future versions of the app, if an appropriate anonymous communication network appears, we may include the option of submitting data anonymously to the backend.