DNS Privacy Considerations
Elkins, WV 26241, United States of America
tjw.ietf@gmail.com
Internet Area
dprive
DNS
This document describes the privacy issues associated with the use of the DNS
by Internet users. It provides general observations about typical current
privacy practices. It is intended to be an analysis of the present situation
and does not prescribe solutions. This document obsoletes RFC 7626.
Status of This Memo
This document is not an Internet Standards Track specification; it is
published for informational purposes.
This document is a product of the Internet Engineering Task Force
(IETF). It represents the consensus of the IETF community. It has
received public review and has been approved for publication by the
Internet Engineering Steering Group (IESG). Not all documents
approved by the IESG are candidates for any level of Internet
Standard; see Section 2 of RFC 7841.
Information about the current status of this document, any
errata, and how to provide feedback on it may be obtained at
.
Copyright Notice
Copyright (c) 2021 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with
respect to this document. Code Components extracted from this
document must include Simplified BSD License text as described in
Section 4.e of the Trust Legal Provisions and are provided without
warranty as described in the Simplified BSD License.
Table of Contents
Introduction
Scope
Risks
Risks in the DNS Data
The Public Nature of DNS Data
Data in the DNS Request
Data in the DNS Payload
Cache Snooping
Risks on the Wire
Unencrypted Transports
Encrypted Transports
Risks in the Servers
In the Recursive Resolvers
Resolver Selection
Active Attacks on Resolver Configuration
Blocking of DNS Resolution Services
Encrypted Transports and Recursive Resolvers
In the Authoritative Name Servers
Other Risks
Re-identification and Other Inferences
More Information
Actual "Attacks"
Legalities
Security Considerations
IANA Considerations
References
Normative References
Informative References
Updates since RFC 7626
Acknowledgments
Contributions
Author's Address
Introduction
This document is an analysis of the DNS privacy issues, in the spirit
of .
The Domain Name System (DNS) is specified in , , and
many later RFCs, which have never been consolidated. It is one of the most
important infrastructure components of the Internet and is often ignored or
misunderstood by Internet users (and even by many professionals). Almost
every activity on the Internet starts with a DNS query (and often several).
Its use has many privacy implications, and this document is an attempt at a
comprehensive and accurate list.
Let us begin with a simplified reminder of how the DNS works (see also
). A client, the stub resolver, issues a
DNS query to a server called the recursive resolver (also called caching
resolver, full resolver, or recursive name server). Let's use the query
"What are the AAAA records for www.example.com?" as an example. AAAA is the
QTYPE (Query Type), and www.example.com is the QNAME (Query Name). (The
description that follows assumes a cold cache, for instance, because the
server just started.) The recursive resolver will first query the root name
servers. In most cases, the root name servers will send a referral. In this
example, the referral will be to the .com name servers. The resolver repeats
the query to one of the .com name servers. The .com name servers, in turn,
will refer to the example.com name servers. The example.com name servers will
then return the answers. The root name servers, the name servers of .com, and
the name servers of example.com are called authoritative name servers. It is
important, when analyzing the privacy issues, to remember that the question
asked to all these name servers is always the original question, not a
derived question. The question sent to the root name servers is "What are
the AAAA records for www.example.com?", not "What are the name servers of
.com?". By repeating the full question, instead of just the relevant part of
the question to the next in line, the DNS provides more information than
necessary to the name server. In this simplified description, recursive
resolvers do not implement QNAME minimization as described in ,
which will only send the relevant part of the question to the upstream name
server.
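The difference between repeating the full question and sending only the relevant part can be sketched in a few lines of Python. This is an illustrative model only, not resolver code: the function name questions_sent is hypothetical, and it assumes, for simplicity, that every label boundary is a zone cut, which is not true in general.

```python
def questions_sent(qname, minimise):
    """Return (zone queried, name asked) pairs for a cold-cache resolution.

    Simplifying assumption: every label boundary is a zone cut.
    """
    labels = qname.rstrip(".").split(".")
    # Zones visited from the root down, e.g. ".", "com.", "example.com."
    zones = ["."] + [
        ".".join(labels[i:]) + "." for i in range(len(labels) - 1, 0, -1)
    ]
    result = []
    for depth, zone in enumerate(zones):
        if minimise:
            # Reveal only one label more than the zone being queried.
            asked = ".".join(labels[len(labels) - depth - 1:]) + "."
        else:
            # Classic behaviour: every server sees the full original name.
            asked = ".".join(labels) + "."
        result.append((zone, asked))
    return result


for zone, asked in questions_sent("www.example.com", minimise=False):
    print("full name ->", zone, "is asked about", asked)
for zone, asked in questions_sent("www.example.com", minimise=True):
    print("minimised ->", zone, "is asked about", asked)
```

In this model, the root name servers see www.example.com. in the first case and only com. in the second.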
DNS relies heavily on caching, so the algorithm described
above is actually a bit more complicated, and not all questions are
sent to the authoritative name servers. If the
stub resolver asks the recursive resolver a few seconds later, "What are the SRV records
of _xmpp-server._tcp.example.com?", the recursive resolver will
remember that it knows the name servers of example.com and will just
query them, bypassing the root and .com. Because there is typically
no caching in the stub resolver, the recursive resolver, unlike the
authoritative servers, sees all the DNS traffic. (Applications, like
web browsers, may have some form of caching that does not follow DNS
rules, for instance, because it may ignore the TTL. So, the
recursive resolver does not see all the name resolution activity.)
It should be noted that DNS recursive resolvers sometimes forward
requests to other recursive resolvers, typically bigger machines,
with a larger and more shared cache (and the query hierarchy can be
even deeper, with more than two levels of recursive resolvers). From
the point of view of privacy, these forwarders are like resolvers
except that they do not see all of the requests being made (due to
caching in the first resolver).
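The effect of caching on who sees what can be illustrated with a toy model. The class below is a hypothetical sketch, not an implementation of any real resolver: it uses a single fixed TTL for every record (an assumption made for brevity) and merely records which questions arrive from the stub and which are passed upstream.

```python
class CachingResolver:
    """Toy recursive resolver that only tracks who observes which question."""

    def __init__(self, ttl_seconds=300):
        self.ttl = ttl_seconds      # assumption: one TTL for all records
        self.cache = {}             # (qname, qtype) -> expiry time
        self.seen_from_stub = []    # everything the resolver itself observes
        self.sent_upstream = []     # what upstream servers get to observe

    def resolve(self, qname, qtype, now):
        key = (qname.lower(), qtype)
        self.seen_from_stub.append(key)
        expiry = self.cache.get(key)
        if expiry is None or expiry <= now:
            # Cache miss or expired entry: the question goes upstream.
            self.sent_upstream.append(key)
            self.cache[key] = now + self.ttl


resolver = CachingResolver(ttl_seconds=300)
for t in (0, 10, 20, 400):
    resolver.resolve("www.example.com", "AAAA", now=t)
print(len(resolver.seen_from_stub), "queries seen by the recursive resolver")
print(len(resolver.sent_upstream), "queries seen upstream")
```

The same asymmetry applies between a forwarding resolver and the larger resolver it forwards to.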
At the time of writing, almost all of this DNS traffic is
sent unencrypted. However, there is increasing deployment
of DNS over TLS (DoT) and DNS over HTTPS (DoH)
, particularly in mobile devices, browsers, and by
providers of anycast recursive DNS resolution services. There are a
few cases where there is some alternative channel encryption, for
instance, in an IPsec VPN tunnel, at least between the stub resolver and
the resolver. Some recent analysis on the service quality of encrypted DNS
traffic can be found in .
Today, almost all DNS queries are sent over UDP . This has
practical consequences when considering encryption of the traffic as a
possible privacy technique. Some encryption solutions are only designed for
TCP, not UDP, although new solutions are still emerging .
Another important point to keep in mind when analyzing the privacy
issues of DNS is the fact that DNS requests received by a server are
triggered for different reasons. Let's assume an eavesdropper wants
to know which web page is viewed by a user. For a typical web page,
there are three sorts of DNS requests being issued:
Primary request:
This is the domain name in the URL that the user
typed, selected from a bookmark, or chose by clicking on a
hyperlink. Presumably, this is what is of interest for the
eavesdropper.
Secondary requests:
These are the additional requests performed by
the user agent (here, the web browser) without any direct
involvement or knowledge of the user. For the Web, they are
triggered by embedded content, Cascading Style Sheets (CSS),
JavaScript code, embedded images, etc. In some cases, there can
be dozens of domain names in different contexts on a single web
page.
Tertiary requests:
These are the additional requests performed by
the DNS service itself. For instance, if the answer to a query is
a referral to a set of name servers and the glue records are not
returned, the resolver will have to send additional requests to turn
the name servers' names into IP addresses. Similarly, even if
glue records are returned, a careful recursive server will send
tertiary requests to verify the IP addresses of those records.
It can also be noted that, in the case of a typical web browser, more
DNS requests than strictly necessary are sent, for instance, to
prefetch resources that the user may query later or when
autocompleting the URL in the address bar. Both are a significant privacy
concern since they may leak information even about non-explicit
actions. For instance, just reading a local HTML page, even without
selecting the hyperlinks, may trigger DNS requests.
Privacy-related terminology is from
. This document obsoletes .
Scope
This document focuses mostly on the study of privacy risks for the
end user (the one performing DNS requests). The risks of
pervasive surveillance are considered as well as risks coming from a more
focused surveillance. In this document, the term "end user" is used
as defined in .
This document does not attempt a comparison of specific privacy protections
provided by individual networks or organizations; it makes only general
observations about typical current practices.
Privacy risks for the holder of a zone (the risk that someone gets the data)
are discussed in and .
Privacy risks for recursive operators (including access providers and
operators in enterprise networks) such as leakage of private namespaces or
blocklists are out of scope for this document.
Non-privacy risks (e.g., security-related considerations such as cache poisoning) are
also out of scope.
The privacy risks associated with the use of other protocols that make use of
DNS information are not considered here.
Risks
The following four sections outline the privacy considerations associated with
different aspects of the DNS for the end user. When reading these sections, it
needs to be kept in mind that many of the considerations (for example, recursive
resolver and transport protocol) can be specific to the network context that a
device is using at a given point in time. A user may have many devices, and each
device might utilize many different networks (e.g., home, work, public, or
cellular) over a period of time or even concurrently. An exhaustive analysis of
the privacy considerations for an individual user would need to take into
account the set of devices used and the multiple dynamic contexts of each
device. This document does not attempt such a complex analysis; instead, it
presents an overview of the various considerations that could form the basis of
such an analysis.
Risks in the DNS Data
The Public Nature of DNS Data
It has been stated that "the data in the DNS is public". This sentence
makes sense for an Internet-wide lookup system, and there
are multiple facets to the data and metadata involved that deserve a
more detailed look. First, access control lists (ACLs) and private
namespaces notwithstanding, the DNS operates under the assumption
that public-facing authoritative name servers will respond to "usual"
DNS queries for any zone they are authoritative for, without further
authentication or authorization of the client (resolver). Due to the
lack of search capabilities, only a given QNAME will reveal the
resource records associated with that name (or that name's nonexistence). In other words: one needs to know what to ask for in
order to receive a response. There are many ways in which supposedly "private"
resources currently leak. A few examples are DNSSEC NSEC zone walking ,
passive DNS services , etc. The zone transfer QTYPE is
often blocked or restricted to authenticated/authorized access to
enforce this difference (and maybe for other reasons).
A further distinction is between the DNS data and a particular DNS
transaction (i.e., a DNS name lookup). DNS data and the results of a
DNS query are public, within the boundaries described above, and may
not have any confidentiality requirements. However, the same is not
true of a single transaction or a sequence of transactions; those
transactions are not, and should not be, public. A single transaction
reveals both the originator of the query and the query contents; this
potentially leaks sensitive information about a specific user. A
typical example from outside the DNS world is that the website of Alcoholics Anonymous is public but the fact that you visit it should not be. Furthermore,
the ability to link queries reveals information about individual use
patterns.
Data in the DNS Request
The DNS request includes many fields, but two of them seem particularly
relevant for the privacy issues: the QNAME and the source IP address.
"Source IP address" is used in a loose sense of "source IP address + maybe
source port number", because the port number is also in the request and can be used to
differentiate between several users sharing an IP address (behind a
Carrier-Grade NAT (CGN), for instance ).
The QNAME is the full name sent by the user. It gives information
about what the user does ("What are the MX records of example.net?"
means they probably want to send email to someone at example.net,
which may be a domain used by only a few persons and is therefore
very revealing about communication relationships). Some QNAMEs are
more sensitive than others. For instance, querying the A record of a
well-known web statistics domain reveals very little (everybody
visits websites that use this analytics service), but querying the A
record of www.verybad.example where verybad.example is the domain of
an organization that some people find offensive or objectionable may
create more problems for the user. Also, sometimes, the QNAME reveals
the software one uses, which could be a privacy issue (for instance,
_ldap._tcp.Default-First-Site-Name._sites.gc._msdcs.example.org).
There are also some BitTorrent clients that query an SRV record for
_bittorrent-tracker._tcp.domain.example.
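To make concrete how directly the QNAME is exposed, the following sketch builds a classic DNS query message and reads the name back out of the raw bytes, as any observer of unencrypted traffic could. The helper names build_query and read_qname are illustrative; the layout (12-octet header, length-prefixed labels, QTYPE, QCLASS) is the standard DNS message format, and name compression is ignored because it does not occur in a single-question query.

```python
import struct

QTYPE_AAAA = 28
QCLASS_IN = 1


def build_query(qname, qtype, query_id=0):
    """Build a minimal one-question DNS query with the RD bit set."""
    # Header: ID, flags (0x0100 = RD), QDCOUNT=1, ANCOUNT=NSCOUNT=ARCOUNT=0
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    labels = qname.rstrip(".").split(".")
    encoded = b"".join(
        bytes([len(label)]) + label.encode("ascii") for label in labels
    ) + b"\x00"
    return header + encoded + struct.pack("!HH", qtype, QCLASS_IN)


def read_qname(message):
    """Recover the QNAME from a query, as a passive observer could."""
    pos, labels = 12, []          # the question starts after the header
    while message[pos] != 0:
        length = message[pos]
        labels.append(message[pos + 1:pos + 1 + length].decode("ascii"))
        pos += 1 + length
    return ".".join(labels)


message = build_query("www.example.com", QTYPE_AAAA)
print(read_qname(message))
```

No key or secret is needed to recover the name; the labels appear as plain ASCII in the message.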
Another important thing about the privacy of the QNAME is future
usages. Today, the lack of privacy is an obstacle to putting
potentially sensitive or personally identifiable data in the DNS. At
the moment, your DNS traffic might reveal that you are exchanging emails but not with whom. If your Mail User Agent (MUA) starts looking up
Pretty Good Privacy (PGP) keys in the DNS , then
privacy becomes a lot more important. And email is just an example;
there would be other really interesting uses for a more privacy-friendly DNS.
For the communication between the stub resolver and the recursive resolver,
the source IP address is the address of the user's machine. Therefore, all
the issues and warnings about collection of IP addresses apply here. For the communication between the recursive resolver and the authoritative name
servers, the source IP address has a different meaning; it does not have the
same status as the source address in an HTTP connection. It is typically the
IP address of the recursive resolver that, in a way, "hides" the real user.
However, hiding does not always work. The edns-client-subnet (ECS) EDNS0 option is sometimes used (see one privacy analysis in ).
Sometimes the end user has a personal recursive resolver on their machine.
In both cases, the IP address originating queries to the authoritative server
is as sensitive as it is for HTTP .
A note about IP addresses: there is currently no IETF document that describes
in detail all the privacy issues around IP addressing in general, although
does discuss privacy considerations for IPv6 address generation
mechanisms. In the meantime, the discussion here is intended to include both
IPv4 and IPv6 source addresses. For a number of reasons, their assignment and
utilization characteristics are different, which may have implications for
details of information leakage associated with the collection of source
addresses. (For example, a specific IPv6 source address seen on the public
Internet is less likely than an IPv4 address to originate behind an address-sharing scheme.) However, for both IPv4 and IPv6 addresses, it is
important to note that source addresses are propagated with queries
via the ECS option and comprise metadata about the host, user,
or application that originated them.
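As a rough illustration of what a resolver using ECS passes upstream, the sketch below truncates a client address to a prefix. The function name ecs_prefix is hypothetical, and the default prefix lengths (24 bits for IPv4, 56 bits for IPv6) are assumptions taken from the maximums recommended in RFC 7871; actual deployments may send shorter or longer prefixes.

```python
import ipaddress


def ecs_prefix(client_ip, v4_bits=24, v6_bits=56):
    """Return the truncated client subnet a resolver might send via ECS.

    The default prefix lengths are assumptions, not a statement about
    any particular resolver's behaviour.
    """
    addr = ipaddress.ip_address(client_ip)
    bits = v4_bits if addr.version == 4 else v6_bits
    # strict=False zeroes the host bits instead of raising an error.
    return str(ipaddress.ip_network(f"{addr}/{bits}", strict=False))


print(ecs_prefix("192.0.2.77"))
print(ecs_prefix("2001:db8:1234:5678::1"))
```

Even truncated, the prefix narrows the originator down to a small set of hosts, which is why the option is relevant to the analysis above.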
Data in the DNS Payload
At the time of writing, there are no standardized client identifiers contained in
the DNS payload itself (ECS, as described in , is widely used; however, is only an Informational RFC).
DNS Cookies are a lightweight DNS transaction security mechanism that
provides limited protection against a variety of increasingly common
denial-of-service and amplification/forgery or cache poisoning attacks by
off-path attackers. It is noted, however, that they are designed to just verify
IP addresses (and should change once a client's IP address changes), but they are
not designed to actively track users (like HTTP cookies).
There are anecdotal accounts of Media Access Control (MAC) addresses
and even user names being inserted in nonstandard EDNS(0) options
for stub-to-resolver communications to support proprietary functionality
implemented at the resolver (e.g., parental filtering).
Cache Snooping
The content of recursive resolvers' caches can reveal data about the
clients using it (the privacy risks depend on the number of clients).
This information can sometimes be examined by sending DNS queries
with RD=0 to inspect cache content, particularly looking at the DNS
TTLs . Since this also is a reconnaissance
technique for subsequent cache poisoning attacks, some countermeasures have already been developed and deployed .
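The TTL inference mentioned above is simple arithmetic. Assuming the snooper knows the TTL published by the authoritative servers and observes the remaining TTL in an answer served from cache, the time at which some client caused the record to be fetched can be estimated as follows (the function name is illustrative):

```python
def estimate_cache_fill_time(observed_at, original_ttl, remaining_ttl):
    """Estimate when a cached record was fetched on behalf of some client.

    observed_at    time (in seconds) at which the cached answer was seen
    original_ttl   TTL published by the authoritative servers
    remaining_ttl  TTL in the answer served from the resolver's cache
    """
    if not 0 <= remaining_ttl <= original_ttl:
        raise ValueError("remaining TTL must lie between 0 and the original TTL")
    # The record has been in the cache for (original - remaining) seconds.
    return observed_at - (original_ttl - remaining_ttl)


# A record with a 300-second TTL is seen at t=1000 with 120 seconds left.
print(estimate_cache_fill_time(1000, 300, 120))
```

With a small client population behind the resolver, such a timestamp can come close to identifying an individual's activity.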
Risks on the Wire
Unencrypted Transports
For unencrypted transports, DNS traffic can be seen by an eavesdropper like
any other traffic. (DNSSEC, specified in , explicitly excludes
confidentiality from its goals.) So, if an initiator starts an HTTPS
communication with a recipient, the HTTP traffic will be encrypted, but the
DNS exchange prior to it will not be. When other protocols become more
and more privacy aware and secured against surveillance (e.g., ,
), the use of unencrypted transports for DNS may
become "the weakest link" in privacy. It is noted that, at the time of writing,
there is ongoing work attempting to encrypt the Server Name Indication (SNI) in the TLS handshake
, which is one of the
last remaining non-DNS cleartext identifiers of a connection target.
An important specificity of the DNS traffic is that it may take a
different path than the communication between the initiator and the
recipient. For instance, an eavesdropper may be unable to tap the
wire between the initiator and the recipient but may have access to
the wire going to the recursive resolver or to the authoritative
name servers.
The best place to tap, from an eavesdropper's point of view, is
clearly between the stub resolvers and the recursive resolvers,
because traffic is not limited by DNS caching.
The attack surface between the stub resolver and the rest of the
world can vary widely depending upon how the end user's device is
configured. In order of increasing attack surface:
The recursive resolver can be on the end user's device. In (currently) a small number of cases, individuals may choose to
operate their own DNS resolver on their local machine. In this
case, the attack surface for the connection between the stub
resolver and the caching resolver is limited to that single
machine. The recursive resolver will expose data to authoritative
name servers as discussed in .
The recursive resolver may be at the local network edge. For
many/most enterprise networks and for some residential networks, the
caching resolver may exist on a server at the edge of the local
network. In this case, the attack surface is the local network.
Note that in large enterprise networks, the DNS resolver may not
be located at the edge of the local network but rather at the edge
of the overall enterprise network. In this case, the enterprise
network could be thought of as similar to the Internet Access
Provider (IAP) network referenced below.
The recursive resolver can be in the IAP network. For most residential
networks and potentially other networks, the typical case is for the
user's device to be configured (typically automatically through DHCP or
relay agent options) with the addresses of the DNS proxy in the Customer
Premises Equipment (CPE), which in turn
points to the DNS recursive resolvers at the IAP. The attack surface for
on-the-wire attacks is therefore from the end user system across the
local network and across the IAP network to the IAP's recursive resolvers.
The recursive resolver can be a public DNS service (or a privately run DNS
resolver hosted on the public Internet). Some machines
may be configured to use public DNS resolvers such as those
operated by Google Public DNS or OpenDNS. The user may
have configured their machine to use these DNS recursive resolvers
themselves -- or their IAP may have chosen to use the public DNS
resolvers rather than operating their own resolvers. In this
case, the attack surface is the entire public Internet between the
user's connection and the public DNS service. It can be noted that if the
user selects a single resolver with a small client population (even when using
an encrypted transport), it can actually serve to aid tracking of that user as
they move across network environments.
It is also noted that, typically, a device connected only to a modern cellular
network is
directly configured with only the recursive resolvers of the IAP and
afforded some level of protection against some types of eavesdropping
for all traffic (including DNS traffic) due to the cellular network
link-layer encryption.
The attack surface for this specific scenario is not considered here.
Encrypted Transports
The use of encrypted transports directly mitigates passive surveillance of the
DNS payload; however, some privacy attacks are still possible. This section
enumerates the residual privacy risks to an end user when an attacker can
passively monitor encrypted DNS traffic flows on the wire.
These are cases where user identification, fingerprinting, or correlations may be
possible due to the use of certain transport layers or cleartext/observable
features. These issues are not specific to DNS, but DNS traffic is susceptible
to these attacks when using specific transports.
Some general examples exist; for example, certain studies highlight
that the OS fingerprint values of IPv4 TTL, IPv6 Hop Limit, or TCP Window size can be used to fingerprint client OSes or that various techniques can be
used to de-NAT DNS queries .
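The TTL-based fingerprinting idea can be sketched as follows. This is a heuristic illustration, not a method taken from the cited studies: it assumes that senders start from one of a few widely used initial values (64, 128, or 255) and that the observed value is the initial value minus the number of hops traversed.

```python
# Assumption: a small set of widely used initial TTL / Hop Limit values.
COMMON_INITIAL_TTLS = (64, 128, 255)


def guess_initial_ttl(observed_ttl):
    """Return (assumed initial TTL, implied hop count) for an observed TTL."""
    if not 1 <= observed_ttl <= 255:
        raise ValueError("TTL / Hop Limit must be between 1 and 255")
    # Pick the smallest common initial value not below the observed one.
    initial = min(t for t in COMMON_INITIAL_TTLS if t >= observed_ttl)
    return initial, initial - observed_ttl


print(guess_initial_ttl(57))
print(guess_initial_ttl(116))
```

Because the initial value is visible regardless of whether the DNS payload is encrypted, it can help an observer tell apart clients that share an address.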
Note that even when using encrypted transports, the use of cleartext transport
options to decrease latency can provide correlation of a user's connections,
e.g., using TCP Fast Open .
Implementations that support encrypted transports also commonly reuse
connections for multiple DNS queries to optimize performance (e.g., via DNS
pipelining or HTTPS multiplexing). Default configuration options for encrypted
transports could, in principle, fingerprint a specific client application.
For example:
TLS version or cipher suite selection
session resumption
the maximum number of messages to send
the maximum connection time before closing a connection and reopening it
If libraries or applications offer user configuration of such options (e.g.,
), then they could, in principle, help to identify a specific user. Users
may want to use only the defaults to avoid this issue.
While there are known attacks on older versions of TLS, the most recent
recommendations and the development of TLS 1.3 largely
mitigate those.
Traffic analysis of unpadded encrypted traffic is also possible
because the sizes and timing of encrypted DNS
requests and responses can be correlated to unencrypted DNS requests upstream
of a recursive resolver.
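A small calculation shows why unpadded sizes leak information and how block padding reduces the leak. The sketch below is illustrative only: it computes the size of a bare one-question query (ignoring the EDNS(0) OPT record that would carry the padding), and the block size of 128 octets is an assumption following the recommendation for queries in RFC 8467.

```python
def query_size(qname):
    """Size in octets of a one-question DNS query without EDNS(0)."""
    labels = qname.rstrip(".").split(".")
    name_octets = sum(1 + len(label) for label in labels) + 1  # root label
    return 12 + name_octets + 4  # header + QNAME + QTYPE/QCLASS


def padded_length(length, block=128):
    """Round a length up to the next multiple of the block size."""
    return -(-length // block) * block


for name in ("www.example.com", "mail.a-much-longer-name.example.org"):
    size = query_size(name)
    print(name, size, padded_length(size))
```

Unpadded, the two queries differ in length and can be told apart on the wire; padded to a common block size, they cannot be distinguished by size alone, although timing correlation remains possible.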
Risks in the Servers
Using the terminology of , the DNS servers (recursive
resolvers and authoritative servers) are enablers: "they facilitate
communication between an initiator and a recipient without being
directly in the communications path". As a result, they are often
forgotten in risk analysis. But, to quote again, "Although
[...] enablers may not generally be considered as attackers, they may
all pose privacy threats (depending on the context) because they are
able to observe, collect, process, and transfer privacy-relevant
data". In parlance, enablers become observers when they
start collecting data.
Many programs exist to collect and analyze DNS data at the servers -- from
the "query log" of some programs like BIND to tcpdump and more sophisticated
programs like PacketQ and DNSmezzo . The
organization managing the DNS server can use this data itself, or it can be
part of a surveillance program like PRISM and pass data to an
outside observer.
Sometimes this data is kept for a long time and/or distributed to
third parties for research purposes , security
analysis, or surveillance tasks. These uses are sometimes under some
sort of contract, with various limitations, for instance, on
redistribution, given the sensitive nature of the data. Also, there
are observation points in the network that gather DNS data and then
make it accessible to third parties for research or security purposes
("passive DNS" ).
In the Recursive Resolvers
Recursive resolvers see all the traffic since there is typically no
caching before them. To summarize: your recursive resolver knows a
lot about you. The resolver of a large IAP, or a large public
resolver, can collect data from many users.
Resolver Selection
Given all the above considerations, the choice of recursive resolver has
direct privacy considerations for end users. Historically, end user devices
have used the DHCP-provided local network recursive resolver. The choice by a
user to join a particular network (e.g., by physically plugging in a cable or
selecting a network in an OS dialogue) typically updates a number of system
resources -- these can include IP addresses, the availability of IPv4/IPv6, DHCP
server, and DNS resolver. These individual changes, including the change in
DNS resolver, are not normally communicated directly to the user by the OS
when the network is joined. The choice of network has historically determined
the default system DNS resolver selection; the two are directly coupled in
this model.
The vast majority of users do not change their default system DNS settings
and so implicitly accept the network settings for the DNS. The network resolvers
have therefore historically been the sole destination for all of the DNS
queries from a device. These resolvers may have varied
privacy policies depending on the network. Privacy policies for these servers
may or may not be available, and users need to be aware that privacy
guarantees will vary with the network.
All major OSes expose the system DNS settings and allow users to manually
override them if desired.
More recently, some networks and users have actively chosen
to use a large public resolver, e.g., Google Public
DNS,
Cloudflare,
or Quad9. There can be many reasons: cost
considerations for network operators, better reliability, or anti-censorship
considerations are just a few. Such services typically do provide a privacy
policy, and the user can get an idea of the data collected by such
operators by reading one, e.g., Google Public DNS - Your
Privacy.
In general, as with many other protocols, issues around centralization also
arise with DNS.
The picture is fluid with several competing factors
contributing, where these factors can also vary by geographic region. These include:
ISP outsourcing, including to third-party and public resolvers
regional market domination by one or only a few ISPs
applications directing DNS traffic by default to a limited subset of resolvers (see )
An increased proportion of the global DNS resolution traffic being served by
only a few entities means that the privacy considerations for users are
highly dependent on the privacy policies and practices of those
entities. Many of the issues around centralization are discussed in
.
Dynamic Discovery of DoH and Strict DoT
While support for opportunistic DoT can be determined by probing a resolver on
port 853, there is currently no standardized discovery mechanism for DoH and
Strict DoT servers.
This means that clients that might want to dynamically discover such encrypted
services, and where users are willing to trust such services, are not able to do
so. At the time of writing, efforts to provide standardized signaling mechanisms
to discover the services offered by local resolvers are in progress
. Note that an increasing number of ISPs
are deploying encrypted DNS; for example, see the Encrypted DNS Deployment
Initiative .
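The probing of port 853 mentioned at the start of this section can be approximated with a plain TCP connection attempt. The sketch below is deliberately minimal and makes a strong simplifying assumption: it checks only that something accepts TCP connections on the port and performs no TLS handshake or certificate validation, so it says nothing about the identity of the server (see the discussion of active attacks later in this document). The function name accepts_tcp is hypothetical.

```python
import socket


def accepts_tcp(host, port=853, timeout=2.0):
    """Return True if a TCP connection to host:port can be established.

    This is only a reachability check; it does not verify that the
    peer speaks DoT or who the peer is.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A client would typically call this with the address of its configured resolver, for example accepts_tcp("192.0.2.53"), where the address shown is a documentation placeholder.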
Application-Specific Resolver Selection
An increasing number of applications are offering application-specific encrypted DNS resolution settings, rather than defaulting to
using only the system resolver. A variety of heuristics and
resolvers are available in different applications, including hard-coded lists of recognized DoH/DoT servers.
Generally, users are not aware of application-specific DNS settings and may
not have control over those settings. Users will only be aware of, and able
to control, such settings if
applications provide the following functions:
communicate the change clearly to users when the default application
resolver changes away from the system resolver
provide configuration options to change the default
application resolver, including a choice to always use the system resolver
provide mechanisms for users to locally inspect, selectively forward,
and filter queries (either via the application itself or use of the
system resolver)
Application-specific changes to default destinations for users' DNS
queries might increase or decrease user privacy; it is highly
dependent on the network context and the application-specific
default. This is an area of active debate, and the IETF is working on
a number of issues related to application-specific DNS settings.
Active Attacks on Resolver Configuration
The previous section discussed DNS privacy, assuming that all the traffic
was directed to the intended servers (i.e., those that would be used in the
absence of an active attack) and that the potential attacker was purely
passive. But, in reality, there can be active attackers in the network.
The Internet Threat model, as described in , assumes that the attacker
controls the network. Such an attacker can completely control any insecure DNS
resolution, both passively monitoring the queries and responses and substituting
their own responses. Even if encrypted DNS such as DoH or DoT is used, unless
the client has been configured in a secure way with the server identity, an active attacker can impersonate the server. This implies that opportunistic
modes of DoH/DoT as well as modes where the client learns of the DoH/DoT server
via in-network mechanisms such as DHCP are vulnerable to attack. In addition, if
the client is compromised, the attacker can replace the DNS configuration with
one of its own choosing.
Blocking of DNS Resolution Services
User privacy can also be at risk if there is blocking
of access to remote recursive servers
that offer encrypted transports, e.g., when the local resolver does not offer
encryption and/or has very poor privacy policies. For example, active blocking
of port 853 for DoT or blocking of specific IP addresses could restrict the resolvers
available to the user. The extent of the risk to user privacy is highly
dependent on the specific network and user context; a user on a network that
is known to perform surveillance would be compromised if they could not access
such services, whereas a user on a trusted network might have no privacy
motivation to do so.
As a matter of policy, some recursive resolvers use their position in the query
path to selectively block access to certain DNS records. This is a form of
rendezvous-based blocking as described in . Such
blocklists often include servers known to be used for malware, bots, or other
security risks. In order to prevent circumvention of their blocking policies,
some networks also block access to resolvers with incompatible policies.
It is also noted that attacks on remote resolver services, e.g., DDoS, could
force users to switch to other services that do not offer encrypted transports
for DNS.
Encrypted Transports and Recursive Resolvers
DoT and DoH
Use of encrypted transports does not reduce the data available in the recursive
resolver and, ironically, can expose more information about users to
operators. As described in , use of session-based encrypted
transports (TCP/TLS) can expose correlation data about users.
DoH-Specific Considerations
DoH inherits the full privacy properties of the HTTPS stack and as a consequence
introduces new privacy considerations when compared with DNS over UDP, TCP, or
TLS . describes the privacy considerations in
the server of the DoH protocol.
A brief summary of some of the issues includes the following:
*  HTTPS presents new considerations for correlation, such as explicit HTTP
   cookies and implicit fingerprinting of the unique set and ordering of HTTP
   request header fields.
*  The User-Agent and Accept-Language request header fields often convey specific
   information about the client version or locale.
*  Utilizing the full set of HTTP features enables DoH to be more than an HTTP
   tunnel, but it is at the cost of opening up implementations to the full set of
   privacy considerations of HTTP.
*  Implementations are advised to expose the minimal set of data needed to
   achieve the desired feature set.
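The implicit fingerprinting mentioned in the summary above can be sketched as follows. This is an illustration only: the function name header_fingerprint, the client names, and the header values are hypothetical, and a real observer could combine many more signals.

```python
import hashlib


def header_fingerprint(headers: list[tuple[str, str]]) -> str:
    """Hash the ordered request header fields as a server would see them."""
    # Order is preserved on purpose: two clients sending the same fields in
    # a different order produce different fingerprints.
    canonical = "\n".join(f"{name.lower()}: {value}" for name, value in headers)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


# Hypothetical clients sending the same fields in a different order.
client_a = [
    ("Accept", "application/dns-message"),
    ("User-Agent", "ExampleBrowser/1.0"),
    ("Accept-Language", "fr-CH"),
]
client_b = [
    ("User-Agent", "ExampleBrowser/1.0"),
    ("Accept", "application/dns-message"),
    ("Accept-Language", "fr-CH"),
]
# A client that exposes only the field needed for DoH.
minimal_client = [("Accept", "application/dns-message")]

fingerprints = {
    "client_a": header_fingerprint(client_a),
    "client_b": header_fingerprint(client_b),
    "minimal_client": header_fingerprint(minimal_client),
}
```

Every client that sends only the minimal field set shares the same fingerprint, which is why reducing the exposed header fields enlarges the set of users an individual request could belong to.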
specifically makes selection of HTTPS functionality vs. privacy an
implementation choice. At the extremes, there may be implementations that
attempt to achieve parity with DoT from a privacy perspective at the cost of
using no identifiable HTTP headers, and there might be others that provide feature-rich data flows where the low-level origin of the DNS query is easily
identifiable. Some implementations have, in fact, chosen to restrict the use of the User-Agent header so that resolver operators cannot identify the specific
application that is originating the DNS queries.
Privacy-focused users should be aware of the potential for additional client
identifiers in DoH compared to DoT and may want to only use DoH client
implementations that provide clear guidance on what identifiers they add.
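As a rough illustration of the minimal end of that spectrum, the sketch below builds a DoH GET request that carries a single request header field. The resolver URL https://doh.example/dns-query and the function names are hypothetical; the query uses DNS ID 0 and unpadded base64url encoding, as the DoH specification describes for GET requests.

```python
import base64
import struct


def build_query(name: str, qtype: int = 1) -> bytes:
    """Encode a one-question DNS query in wire format (QTYPE 1 is A)."""
    # ID 0, flags 0x0100 (recursion desired), one question, no other records.
    header = struct.pack("!HHHHHH", 0, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)  # class IN


def minimal_doh_get(resolver_url: str, name: str) -> tuple[str, dict[str, str]]:
    """Return the URL and the only request header field this sketch sends."""
    wire = build_query(name)
    dns_param = base64.urlsafe_b64encode(wire).rstrip(b"=").decode("ascii")
    # No User-Agent, Accept-Language, or cookies are added here.
    headers = {"Accept": "application/dns-message"}
    return f"{resolver_url}?dns={dns_param}", headers


url, request_headers = minimal_doh_get("https://doh.example/dns-query", "www.example.com")
```

An HTTP library may still add header fields of its own, so an implementation that aims for this profile needs to check what is actually sent on the wire.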
In the Authoritative Name Servers
Unlike what happens for recursive resolvers, the observation capabilities of
authoritative name servers are limited by caching; they see only the requests
for which the answer was not in the cache. For aggregated statistics ("What
is the percentage of LOC queries?"), this is sufficient, but it prevents an
observer from seeing everything. Similarly, the increasing deployment of QNAME
minimization reduces the data visible at the
authoritative name server. Still, the authoritative name servers see a part
of the traffic, and this subset may be sufficient to violate some privacy
expectations.
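The effect of QNAME minimization on what each authoritative server sees can be sketched as follows. This is a simplification that assumes one zone cut per label, which is not true in general; the function name minimised_qnames is hypothetical.

```python
def minimised_qnames(qname: str) -> list[str]:
    """List the names a minimising resolver would send, from the root down.

    Assumes (for illustration only) that every label is a zone cut.
    """
    labels = qname.rstrip(".").split(".")
    return [".".join(labels[i:]) + "." for i in range(len(labels) - 1, -1, -1)]


# The root servers see only the first name, the TLD servers the second, and
# only the servers for the final zone see the full name.
seen_by_level = minimised_qnames("www.example.com")
```

Without minimization, every server in the chain would receive the full name `www.example.com.`, even though the root and TLD servers only need the rightmost labels to return a referral.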
Also, the user often has some legal/contractual link with the
recursive resolver (they have chosen the IAP, or they have chosen to use a
given public resolver) while having no control and perhaps no
awareness of the role of the authoritative name servers and their
observation abilities.
As noted before, using a local resolver or a resolver close to the
machine decreases the attack surface for an on-the-wire eavesdropper.
But it may decrease privacy against an observer located on an
authoritative name server. This authoritative name server will see
the IP address of the end client instead of the address of a big
recursive resolver shared by many users.
This "protection", when using a large resolver with many clients, is
no longer present if ECS is used because, in this case,
the authoritative name server sees the original IP address (or
prefix, depending on the setup).
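What "prefix, depending on the setup" means in practice can be shown with a short sketch using Python's ipaddress module. The function name ecs_prefix and the default prefix lengths of 24 and 56 bits are assumptions chosen for illustration; a resolver operator may configure other values, including the full address.

```python
import ipaddress


def ecs_prefix(client_ip: str, v4_bits: int = 24, v6_bits: int = 56) -> str:
    """Return the client network an ECS-enabled resolver would forward.

    The default prefix lengths are illustrative assumptions, not a statement
    about any particular resolver.
    """
    address = ipaddress.ip_address(client_ip)
    bits = v4_bits if address.version == 4 else v6_bits
    # strict=False zeroes the host bits instead of raising an error.
    return str(ipaddress.ip_network(f"{address}/{bits}", strict=False))


truncated_v4 = ecs_prefix("192.0.2.57")
truncated_v6 = ecs_prefix("2001:db8:1234:5678::1")
full_v4 = ecs_prefix("192.0.2.57", v4_bits=32)
```

Even a truncated prefix narrows the client down from "any user of a large shared resolver" to a single network, and a full-length prefix removes the protection entirely.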
As of today, all the instances of one root name server, L-root,
receive together around 50,000 queries per second. While most of it
is "junk" (errors on the Top-Level Domain (TLD) name), it gives an
idea of the amount of big data that pours into name servers. (And
even "junk" can leak information; for instance, if there is a typing
error in the TLD, the user will send data to a TLD that is not the
usual one.)
Many domains, including TLDs, are partially hosted by third-party
servers, sometimes in a different country. The contracts between the
domain manager and these servers may or may not take privacy into
account. Whatever the contract, the third-party hoster may or may not be honest; in any case, it will have to follow its local laws. For
example,
requests to a given ccTLD may go to servers managed by organizations
outside of the ccTLD's country. Users may not anticipate that
when doing a security analysis.
Also, it seems (see the survey described in ) that there is a
strong concentration of authoritative name servers among "popular" domains
(such as the Alexa Top N list). For instance, among the Alexa Top
100K, one DNS provider hosts 10% of
the domains today. The ten most important DNS providers together host one-third of
all domains. With the control (or the ability to sniff the traffic) of a few
name servers, you can gather a lot of information.
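The kind of concentration figure quoted above can be computed with a few lines of code once a mapping from domains to DNS providers is available. The sketch below is illustrative only: the function name top_provider_share and the sample mapping are hypothetical and do not reproduce the survey's data.

```python
from collections import Counter


def top_provider_share(domain_to_provider: dict[str, str], n: int) -> float:
    """Fraction of domains hosted by the n largest DNS providers."""
    if not domain_to_provider:
        return 0.0
    counts = Counter(domain_to_provider.values())
    hosted_by_top = sum(count for _, count in counts.most_common(n))
    return hosted_by_top / len(domain_to_provider)


# Hypothetical sample; real input would come from NS records of a domain list.
sample = {
    "a.example": "provider-1",
    "b.example": "provider-1",
    "c.example": "provider-2",
    "d.example": "provider-3",
}
share_top1 = top_provider_share(sample, 1)
share_top2 = top_provider_share(sample, 2)
```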
Other Risks
Re-identification and Other Inferences
An observer has access not only to the data they directly collect but also
to the results of various inferences about this data. The term "observer" here is used very generally; for example, the observer might
passively observe cleartext DNS traffic, be an attacker in the network that
actively redirects the user's DNS resolution, or be a
local or remote resolver operator.
For instance, a user can be re-identified via DNS queries. If the
adversary knows a user's identity and can watch their DNS queries for
a period, then that same adversary may be able to re-identify the
user solely based on their pattern of DNS queries later on regardless
of the location from which the user makes those queries. For
example, one study found that such re-identification is possible so that "73.1% of all day-to-day links
were correctly established, i.e. user u was either re-identified
unambiguously (1) or the classifier correctly reported that u was not
present on day t + 1 any more (2)". While that study related to web
browsing behavior, equally characteristic patterns may be produced
even in machine-to-machine communications or without a user taking
specific actions, e.g., at reboot time if a characteristic set of
services are accessed by the device.
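The principle behind such re-identification can be sketched with a much simpler technique than the classifier used in the cited study: comparing the set of names queried in an unlabeled session against known per-user profiles using Jaccard similarity. The function names, the threshold, and the sample data are all hypothetical.

```python
from typing import Optional


def jaccard(a: set[str], b: set[str]) -> float:
    """Size of the intersection divided by the size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def reidentify(
    profiles: dict[str, set[str]], session: set[str], threshold: float = 0.5
) -> Optional[str]:
    """Return the best-matching known user, or None if nobody is close."""
    best_user, best_score = None, 0.0
    for user, names in profiles.items():
        score = jaccard(names, session)
        if score > best_score:
            best_user, best_score = user, score
    return best_user if best_score >= threshold else None


# Day t: queries observed while the users' identities were known.
profiles = {
    "user-1": {"mail.example", "news.example", "bank.example", "club.example"},
    "user-2": {"video.example", "games.example", "shop.example"},
}
# Day t + 1: an unlabeled session seen from a different location.
session = {"mail.example", "news.example", "bank.example", "travel.example"}
match = reidentify(profiles, session)
```

Nothing in the sketch depends on the source IP address of the session, which is the point of the observation above: the pattern of names alone can be enough.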
For instance, one could imagine that an intelligence agency
identifies people going to a site by putting in a very long DNS name
and looking for queries of a specific length. Such traffic analysis
could weaken some privacy solutions.
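A minimal sketch of that length-based analysis follows. It assumes the observer sees a message size that equals the plaintext query size plus a constant overhead, and that queries carry no EDNS options; the function names, flow identifiers, and bait name are hypothetical.

```python
def query_wire_length(qname: str) -> int:
    """Octets in a one-question DNS query without EDNS options."""
    labels = qname.rstrip(".").split(".")
    encoded_name = sum(1 + len(label) for label in labels) + 1  # root label
    return 12 + encoded_name + 4  # header + QNAME + QTYPE and QCLASS


def matching_flows(observed: dict[str, int], bait_name: str, overhead: int = 0) -> list[str]:
    """Flows whose observed message size matches the bait query's size."""
    target = query_wire_length(bait_name) + overhead
    return sorted(flow for flow, size in observed.items() if size == target)


# A deliberately long name makes the query size stand out from usual traffic.
bait = "a" * 60 + ".tracking.example"
observed_sizes = {"flow-1": 33, "flow-2": query_wire_length(bait), "flow-3": 41}
suspects = matching_flows(observed_sizes, bait)
```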
The IAB Privacy and Security Program also has a document
that considers such inference-based attacks in a more
general framework.
More Information
Useful background information can also be found in (regarding the risk of privacy leaks through DNS) and in a few academic papers:
, , , and
.
Actual "Attacks"
A very quick examination of DNS traffic may lead to the false conclusion that
extracting the needle from the haystack is difficult. "Interesting" primary
DNS requests are mixed with useless (for the eavesdropper) secondary and
tertiary requests (see the terminology in ). But, in
this time of "big data" processing, powerful techniques now exist to get from
the raw data to what the eavesdropper is actually interested in.
Many research papers about malware detection use DNS traffic to
detect "abnormal" behavior that can be traced back to the activity of
malware on infected machines.
This research was done for the greater good, but technically it is a privacy attack, and it demonstrates the
power of the observation of DNS traffic. See ,
, and .
Passive DNS services allow reconstruction of a zone's data, sometimes of an entire zone. Well-known passive DNS services keep only the DNS
responses and not the source IP address of the client, precisely for
privacy reasons. Other passive DNS services may not be so careful.
And there are still potential problems with revealing QNAMEs.
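The data minimization practiced by the more careful passive DNS services can be sketched as a transformation that drops the client address before anything is stored. The record types and field names below are hypothetical and do not describe any particular service.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ObservedResponse:
    """What a sensor placed near a resolver could capture (hypothetical)."""
    client_ip: str
    qname: str
    rrtype: str
    rdata: str


@dataclass(frozen=True)
class PassiveDnsRecord:
    """What is kept: the response data only, without the client address."""
    qname: str
    rrtype: str
    rdata: str


def to_passive_record(response: ObservedResponse) -> PassiveDnsRecord:
    # The client address is deliberately not copied.
    return PassiveDnsRecord(response.qname, response.rrtype, response.rdata)


stored = to_passive_record(
    ObservedResponse("192.0.2.57", "www.example.com.", "A", "198.51.100.10")
)
stored_fields = [field.name for field in fields(stored)]
```

The stored record still contains the QNAME, so names that are themselves sensitive remain visible even when the client address is discarded.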
The revelations from the Edward Snowden documents, which were leaked from the
National Security Agency (NSA), provide evidence of the use of the DNS in mass
surveillance operations . For example, the MORECOWBELL
surveillance program uses a dedicated covert monitoring infrastructure
to actively query DNS servers and perform HTTP requests to obtain meta-information about services and to check their availability. Also, the
QUANTUMTHEORY
project, which includes detecting lookups for certain addresses and injecting
bogus replies, is another good example showing that the lack of privacy
protections in the DNS is actively exploited.
Legalities
To our knowledge, there are no specific privacy laws for DNS data in any
country. Interpreting general privacy laws, like the European Union's
or GDPR, in the context of DNS traffic data is not an easy task, and
there is no known court precedent. See an interesting analysis in
.
Security Considerations
This document is entirely about security -- more precisely, privacy. It just
lays out the problem; it does not try to set requirements (with the choices
and compromises they imply), much less define solutions. Possible solutions
to the issues described here are discussed in other documents (currently too
many to all be mentioned); see, for instance, "Recommendations for DNS
Privacy Operators" .
IANA Considerations
This document has no IANA actions.
References
Normative References
Domain names - concepts and facilities.
Domain names - implementation and specification.
Privacy Considerations for Internet Protocols.
Pervasive Monitoring Is an Attack.
Informative References
Vie privée: et le DNS alors? [Privacy: what about DNS?]
DNS Cache snooping - should I be concerned? ISC.
Anonymous Resolution of DNS Queries. Lecture Notes in Computer Science, Vol. 5332.
Cloud Computing: Centralization and Data Sovereignty. European Journal of Law and Technology, Vol. 3, No. 2.
Corrupted DNS Resolution Paths: The Rise of a Malicious Resolution Authority. ISC/OARC Workshop.
Got Malware? Three Signs Revealed In DNS Traffic.
Directive 95/46/EC of the European Parliament and of the Council of 24 October 1995 on the protection of individuals with regard to the processing of personal data and on the free movement of such data. European Parliament, Official Journal L 281, pp. 31-50.
A Day at the Root of the Internet. ACM SIGCOMM Computer Communication Review, Vol. 38, No. 5.
Security and privacy issues of edns-client-subnet.
A Day in the Life of the Internet (DITL). CAIDA.
DNS-DNS: DNS-based De-NAT Scheme.
DNS Footprint of Malware. OARC Workshop.
An End-to-End, Large-Scale Measurement of DNS-over-Encryption: How Far Have We Come? IMC '19: Proceedings of the Internet Measurement Conference, pp. 22-35.
DNSmezzo.
DNS Resolver Information Self-publication. Google, ICANN. Work in Progress.
Specification of DNS over Dedicated QUIC Connections. Private Octopus Inc., Sinodun IT, Salesforce. Work in Progress.
Encrypted DNS Deployment Initiative. EDDI.
Analysis of Privacy Disclosure in DNS Query. MUE '07: Proceedings of the 2007 International Conference on Multimedia and Ubiquitous Engineering, pp. 952-957.
Privacy-Preserving DNS: Analysis of Broadcast, Range Queries and Mix-based Protection Methods. ESORICS 2011, pp. 665-683.
getdns.
Cache Snooping or Snooping the Cache for Fun and Profit.
Analyzing Characteristic Host Access Patterns for Re-Identification of Web User Sessions. Lecture Notes in Computer Science, Vol. 7127.
NSA's MORECOWBELL: Knell for DNS.
A tool that provides a basic SQL-frontend to PCAP-files. DNS-OARC, Release 1.4.3, commit 29a8288.
Passive DNS Replication. 17th Annual FIRST Conference.
Pretty Bad Privacy: Pitfalls of DNS Encryption. WPES '14: Proceedings of the 13th Workshop on Privacy in the Electronic Society, pp. 191-200.
PRISM (surveillance program). Wikipedia.
Guidelines for Writing RFC Text on Security Considerations.
DNS Security Introduction and Requirements.
Minimally Covering NSEC Records and DNSSEC On-line Signing.
DNS Security (DNSSEC) Hashed Authenticated Denial of Existence.
DNS Zone Transfer Protocol (AXFR).
Issues with IP Address Sharing.
Extension Mechanisms for DNS (EDNS(0)).
TCP Fast Open.
Recommendations for Secure Use of Transport Layer Security (TLS) and Datagram Transport Layer Security (DTLS).
Confidentiality in the Face of Pervasive Surveillance: A Threat Model and Problem Statement.
DNS Privacy Considerations.
Security and Privacy Considerations for IPv6 Address Generation Mechanisms.
Technical Considerations for Internet Service Blocking and Filtering.
DNS Query Name Minimisation to Improve Privacy.
Specification for DNS over Transport Layer Security (TLS).
Client Subnet in DNS Queries.
Domain Name System (DNS) Cookies.
DNS-Based Authentication of Named Entities (DANE) Bindings for OpenPGP.
The Transport Layer Security (TLS) Protocol Version 1.3.
DNS Queries over HTTPS (DoH).
DNS Terminology.
Issues and Requirements for Server Name Identification (SNI) Encryption in TLS.
The Internet is for End Users.
Recommendations for DNS Privacy Service Operators.
QUIC: A UDP-Based Multiplexed and Secure Transport.
Making the DNS More Private with QNAME Minimisation.
A privacy framework for 'DNS big data' applications.
An Analysis of TCP Traffic in Root Server DITL Data. DNS-OARC 2014 Fall Workshop.
Tor FAQs: I keep seeing these warnings about SOCKS and DNS information leaks. Should I worry? Tor.
Towards Plugging Privacy Leaks in Domain Name System.
Updates since RFC 7626
Many references were updated. Discussions of encrypted transports, including
DoT and DoH, and sections on DNS payload, authentication of servers, and blocking of services were added.
With the publishing of
on QNAME minimization, text, references, and initial attempts to
measure deployment were added to reflect this. The text and references on the
Snowden revelations were updated.
The "Risks Overview" section was changed to "Scope" to help clarify the risks
being considered. Text on cellular network DNS, blocking, and
security was added. Considerations for recursive resolvers were collected and placed
together. A discussion on resolver selection was added.
Acknowledgments
Thanks to and to the CENTR members for the original work
that led to this document. Thanks to for the interesting
discussions. Thanks to and for proofreading and
to , , , ,
, and for proofreading, providing technical
remarks, and making many readability improvements. Thanks to ,
, , , , , and
for good written contributions. Thanks to and
for a detailed review of the -bis. And thanks to the IESG
members for the last remarks.
Contributions
and were the original authors of the
document, and their contribution to the initial draft of this document is greatly appreciated.
Author's Address
Elkins, WV 26241
United States of America
Email: tjw.ietf@gmail.com