Presentation format
DNS resource records (RRs) can be expressed in text form using the DNS presentation format. The format is originally defined in RFC 1035#section-5.1 and RFC 1034#section-3.6.1 and is most frequently used to define a zone in master files, more commonly known as zone files. The term “presentation format” is officially established in RFC 8499#section-5.
The presentation format is a concise tabular serialization format with provisions for convenient editing. The DNS is intentionally extensible and many RFCs define additional types and the typical representation for the corresponding RDATA sections. Consequently, the presentation format is not defined by one single specification, but rather many specifications.
The presentation format is NOT context-free and correct interpretation of the specification(s) is rather dependent on extensive knowledge of the DNS.
Note
This document is meant to be a concise source on interpretation of the presentation format, but is still very much a work in progress. Please consider contributing if anything is unclear or incorrect.
Format
Note
Modified text from RFC 1035#section-5.1
The presentation format defines a number of entries. Entries are predominantly line-oriented, though parentheses can be used to continue a list of items across a line boundary, and text literals can contain CRLF within the text. Any combination of tabs and spaces acts as a delimiter between the separate items that make up an entry. Any line can end with a comment. Comments start with a ; (semicolon).
The following entries are defined:
<blank>[<comment>]
$ORIGIN <domain-name> [<comment>]
$INCLUDE <file-name> [<domain-name>] [<comment>]
$TTL <TTL> [<comment>]
<domain-name><rr> [<comment>]
<blank><rr> [<comment>]
Blank lines, with or without comments, are allowed anywhere in the file.
Three control entries are defined: $ORIGIN, $INCLUDE and $TTL (defined in RFC 2308#section-4). $ORIGIN is followed by a domain name, and resets the current origin for relative domain names to the stated name. $INCLUDE inserts the named file into the current file, and may optionally specify a domain name that sets the relative domain name origin for the included file. $INCLUDE may also have a comment. Note that an $INCLUDE entry never changes the relative origin of the parent file, regardless of changes to the relative origin made within the included file. $TTL is followed by a decimal integer, and resets the default TTL for RRs which do not explicitly include a TTL value.
The last two forms represent RRs. If an entry for an RR begins with a <blank>, then the RR is assumed to be owned by the last stated owner. If an RR entry begins with a <domain-name>, then the owner name is reset.
<rr> contents take one of the following forms:
[<TTL>] [<class>] <type> <RDATA>
[<class>] [<TTL>] <type> <RDATA>
The RR begins with optional TTL and class fields, followed by a type and RDATA field appropriate to the type and class. Class and type use the standard mnemonics; TTL is a decimal integer. Omitted class and TTL values default to the last explicitly stated values. Since type and class mnemonics are disjoint, the parse is unique. (Note that this order is different from wire format order; the given order allows easier parsing and defaulting.)
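The field ordering rule above can be sketched as follows. This is a minimal illustration, not how any particular parser does it: the function name split_rr_fields, the small CLASSES set, and the default_ttl/default_class parameters (standing in for the last explicitly stated values) are assumptions made for this example.

```python
from typing import List, Optional, Tuple

# Assumption: a small illustrative subset of class mnemonics.
CLASSES = {"IN", "CH", "HS"}


def split_rr_fields(
    tokens: List[str],
    default_ttl: Optional[int] = None,
    default_class: Optional[str] = None,
) -> Tuple[int, str, str, List[str]]:
    """Split the tokens after the owner into (ttl, class, type, rdata)."""
    ttl: Optional[int] = None
    rrclass: Optional[str] = None
    pos = 0
    # At most two optional fields precede the type, in either order.
    while pos < len(tokens) and pos < 2:
        token = tokens[pos].upper()
        if ttl is None and token.isascii() and token.isdigit():
            ttl = int(token)
        elif rrclass is None and token in CLASSES:
            rrclass = token
        else:
            break  # neither TTL nor class, so this must be the type
        pos += 1
    if pos >= len(tokens):
        raise ValueError("missing type field")
    # Fall back to the defaults supplied by the caller.
    ttl = default_ttl if ttl is None else ttl
    rrclass = default_class if rrclass is None else rrclass
    if ttl is None or rrclass is None:
        raise ValueError("no TTL or class stated and no default available")
    return ttl, rrclass, tokens[pos].upper(), tokens[pos + 1:]
```

Because a TTL is all digits and class mnemonics never collide with type mnemonics, both "3600 IN A 192.0.2.1" and "IN 3600 A 192.0.2.1" split into the same four fields.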
<domain-name>s make up a large share of the data in the master file. The labels in the domain name are expressed as character strings and separated by dots. Quoting conventions allow arbitrary characters to be stored in domain names. Domain names that end in a dot are called absolute, and are taken as complete. Domain names which do not end in a dot are called relative; the actual domain name is the concatenation of the relative part with an origin specified in a $ORIGIN, $INCLUDE, or as an argument to the master file loading routine. A relative name is an error when no origin is available.
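As a rough sketch of the absolute/relative rule, the helper below appends the origin to relative names. The names qualify and ends_with_unescaped_dot are hypothetical, the origin is assumed to already be absolute, and the free-standing @ shorthand for the current origin is handled as well.

```python
from typing import Optional


def ends_with_unescaped_dot(name: str) -> bool:
    """True if the name ends in a dot that acts as a label separator."""
    if not name.endswith("."):
        return False
    # Count the backslashes immediately before the final dot; an odd count
    # means the dot itself is escaped (\.) and belongs to the label.
    body = name[:-1]
    backslashes = len(body) - len(body.rstrip("\\"))
    return backslashes % 2 == 0


def qualify(name: str, origin: Optional[str]) -> str:
    """Return the absolute presentation form of a domain name."""
    if name != "@" and ends_with_unescaped_dot(name):
        return name  # absolute names are taken as complete
    if origin is None:
        raise ValueError("relative name without an origin")
    if name == "@":
        return origin
    # Avoid a doubled dot when the origin is the root.
    return name + "." if origin == "." else name + "." + origin
```

For example, qualify("www", "example.com.") gives "www.example.com.", while a name ending in an escaped dot such as bar\. is still treated as relative.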
<character-string> is expressed in one of two ways: as a contiguous set of characters without interior spaces, or as a string beginning with a " and ending with a ". Inside a " delimited string any character can occur, except for a " itself, which must be quoted using \ (backslash).
Because these files are text files several special encodings are necessary to allow arbitrary data to be loaded. In particular:
of the root.
@ A free-standing @ is used to denote the current origin.
\X where X is any character other than a digit (0-9), is used to quote that character so that its special meaning does not apply. For example, "\." can be used to place a dot character in a label.
\DDD where each D is a digit, is the octet corresponding to the decimal number described by DDD. The resulting octet is assumed to be text and is not checked for special meaning.
( ) Parentheses are used to group data that crosses a line boundary. In effect, line terminations are not recognized within parentheses.
; Semicolon is used to start a comment; the remainder of the line is ignored.
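The two backslash escapes can be illustrated with a small decoder. This is a sketch under the assumption that the token has already been isolated (quotes, parentheses and comments handled elsewhere); the function name unescape is made up for this example.

```python
DIGITS = "0123456789"


def unescape(text: str) -> bytes:
    """Decode \\X and \\DDD escapes in a single token into raw octets."""
    out = bytearray()
    pos = 0
    while pos < len(text):
        char = text[pos]
        if char != "\\":
            out += char.encode("utf-8")
            pos += 1
        elif pos + 1 >= len(text):
            raise ValueError("dangling backslash")
        elif text[pos + 1] in DIGITS:
            # \DDD: exactly three decimal digits naming one octet.
            ddd = text[pos + 1:pos + 4]
            if len(ddd) != 3 or any(d not in DIGITS for d in ddd):
                raise ValueError("\\DDD needs exactly three digits")
            if int(ddd) > 255:
                raise ValueError("\\DDD exceeds 255")
            out.append(int(ddd))
            pos += 4
        else:
            # \X: the next character is taken literally.
            out += text[pos + 1].encode("utf-8")
            pos += 2
    return bytes(out)
```

Note that the result is a sequence of octets with all quoting information removed, so it is only safe to call once the caller knows what kind of data the token represents.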
Handling of Unknown DNS Resource Record (RR) Types
The intentional extensibility in the DNS may lead to software implementations lagging behind in support. RFC 3597#section-5 introduces generic notations to represent unknown types, classes and the corresponding RDATA in text form.
Note
Modified text from RFC 3597#section-5.
The type field for an unknown RR type is represented by the word TYPE immediately followed by the decimal RR type code, with no intervening whitespace. In the class field, an unknown class is similarly represented as the word CLASS immediately followed by the decimal class code.
This convention allows types and classes to be distinguished from each other and from TTL values, allowing both <rr> forms to be unambiguously parsed.
[<TTL>] [<class>] <type> <RDATA>
[<class>] [<TTL>] <type> <RDATA>
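Recognizing the generic mnemonics could look like the sketch below. The function name parse_generic_mnemonic is hypothetical, and matching case-insensitively as well as rejecting codes above 65535 (the 16-bit field limit) are assumptions of this example.

```python
import re
from typing import Optional, Tuple

# The word TYPE or CLASS immediately followed by decimal digits.
_GENERIC = re.compile(r"(TYPE|CLASS)([0-9]+)", re.IGNORECASE)


def parse_generic_mnemonic(token: str) -> Optional[Tuple[str, int]]:
    """Return ("TYPE" or "CLASS", code), or None for any other token."""
    match = _GENERIC.fullmatch(token)
    if match is None:
        return None
    code = int(match.group(2))
    if code > 0xFFFF:
        raise ValueError("code does not fit in 16 bits")
    return match.group(1).upper(), code
```

A token such as TYPE731 can never be mistaken for a TTL, because it does not consist solely of digits, and the TYPE and CLASS prefixes keep the two fields apart from each other.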
The RDATA section of an RR of unknown type is represented as a sequence of whitespace-separated words as follows:
- The special token \# (a backslash immediately followed by a hash sign), which identifies the RDATA as having the generic encoding defined herein rather than a traditional type-specific encoding.
- An unsigned decimal integer specifying the RDATA length in octets.
- Zero or more words of hexadecimal data encoding the actual RDATA field, each containing an even number of hexadecimal digits.
If the RDATA is of zero length, the text representation contains only the \# token and the single zero representing the length.
Even though an RR of known type represented in the \# format is effectively treated as an unknown type for the purpose of parsing the RDATA text representation, all further processing by the server MUST treat it as a known type and take into account any applicable type-specific rules regarding compression, canonicalization, etc.
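A minimal sketch of decoding the generic RDATA notation, assuming the words have already been split on whitespace and any parentheses removed. The function name parse_generic_rdata is invented for this example.

```python
from typing import List


def parse_generic_rdata(words: List[str]) -> bytes:
    """Decode the generic form: \\# <length> [<hex words>...]."""
    if len(words) < 2 or words[0] != "\\#":
        raise ValueError("generic RDATA must start with \\# and a length")
    if not (words[1].isascii() and words[1].isdigit()):
        raise ValueError("length must be an unsigned decimal integer")
    length = int(words[1])
    data = bytearray()
    for word in words[2:]:
        if len(word) % 2 != 0:
            raise ValueError("each word needs an even number of hex digits")
        data += bytes.fromhex(word)  # raises ValueError on non-hex input
    if len(data) != length:
        raise ValueError("stated length does not match the data")
    return bytes(data)
```

For instance, the words \#, 4, 0A000001 decode to the four octets 0a 00 00 01, and \# 0 decodes to empty RDATA.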
Service Binding and Parameter Specification via the DNS
RFC 9460 introduces a key-value syntax to the presentation format, initially for the SVCB and HTTPS types. The addition is a major change for implementors of presentation format parsers.
Note
Write (or copy) a section on the format from RFC 9460#section-2.1.
The RFC specifies a number of initial Service Parameter Keys (SvcParamKeys). IANA maintains these and additional keys in the Service Parameter Keys (SvcParamKeys) registry in the DNS Service Bindings (SVCB) category.
alpn and no-default-alpn
RFC 9460#section-7.1.1 specifies the alpn and no-default-alpn SvcParamKeys. The alpn SvcParamKey takes a comma-separated list of Application-Layer Protocol Negotiation (ALPN) Protocol IDs (maintained by IANA in the TLS Application-Layer Protocol Negotiation (ALPN) Protocol IDs category), the syntax for which is defined in RFC 9460#appendix-A.1.
A problem arises when items in the comma-separated list contain a , (comma) or \ (backslash). RFC 9460#section-2.1 specifies SvcParamValue to be a char-string and some implementations (incorrectly) unescape char-string during the scanner stage. Consequently, the fact that a character was escaped (\000 or \X) is lost to the comma-separated list parser. None of the currently registered protocol identifiers contains a , (comma) and the specification dismisses the issue in the interest of progress.
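The second decoding level, the value-list parser, can be sketched as follows. It operates on the octets that remain after char-string decoding, follows the escaping rules as described in RFC 9460#appendix-A.1 (a backslash may only precede a comma or another backslash), and uses the made-up name split_value_list; rejecting empty items is an assumption based on ALPN IDs being at least one octet long.

```python
from typing import List

COMMA, BACKSLASH = 0x2C, 0x5C


def split_value_list(value: bytes) -> List[bytes]:
    """Split char-string-decoded octets into comma-separated items."""
    items: List[bytes] = []
    current = bytearray()
    pos = 0
    while pos < len(value):
        octet = value[pos]
        if octet == BACKSLASH:
            # Only "\," and "\\" are valid escapes at this level.
            if pos + 1 >= len(value) or value[pos + 1] not in (COMMA, BACKSLASH):
                raise ValueError("backslash must escape a comma or a backslash")
            current.append(value[pos + 1])
            pos += 2
        elif octet == COMMA:
            if not current:
                raise ValueError("empty item")
            items.append(bytes(current))
            current = bytearray()
            pos += 1
        else:
            current.append(octet)
            pos += 1
    if not current:
        raise ValueError("empty item")
    items.append(bytes(current))
    return items
```

The text alpn=h2\\,h3 char-string-decodes to the octets h2\,h3, which this function returns as the single item h2,h3. The text alpn=h2\,h3 char-string-decodes to h2,h3, and the escape is gone before the list parser sees it, so the result is two items.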
RFC 9460#appendix-A.1 specifies simple-comma-separated, for lists of items that cannot contain either of the aforementioned characters, and comma-separated for lists of items that can. The specification overlooks that alpn values (comma-separated lists in the presentation format) are encoded on the wire as a sequence of strings, each consisting of a length octet followed by at most 255 data octets. Any octet, including , and \, can therefore appear in an item received on the wire. A name server writing a transfer to disk in plain text therefore cannot encode the data using the simple-comma-separated scheme.
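To make the wire layout concrete, the sketch below encodes a list of ALPN IDs as length-prefixed strings. The function name encode_alpn is hypothetical, and the 1 to 255 octet bound per item follows from the single length octet.

```python
from typing import List


def encode_alpn(ids: List[bytes]) -> bytes:
    """Encode ALPN IDs as a sequence of length-prefixed strings."""
    out = bytearray()
    for proto in ids:
        if not 1 <= len(proto) <= 255:
            raise ValueError("each ALPN ID must be 1 to 255 octets long")
        out.append(len(proto))  # single length octet
        out += proto            # followed by the raw octets, unescaped
    return bytes(out)
```

Nothing in this layout prevents an item such as a,b from appearing, which is why a writer converting wire data back to text needs an escaping scheme that can express commas and backslashes.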
The specification contradicts itself in RFC 9460#section-7.1.1 by stating that presentation format parsers MAY simply disallow the , and \ characters in ALPN IDs instead of implementing the value-list escaping procedure, relying on the opaque key format (e.g., key1=\002h2) in the event that these characters are needed. Since SvcParamValue is defined to be char-string, the problem persists. For implementations that unescape during the scanner stage, the escape sequence is still lost, and implementations that unescape during the parser stage did not have the problem to start with.
RFC 9460 incorrectly assumes that char-string presents text. Programming languages typically classify a token as string if it is quoted, an identifier or keyword if it is a contiguous set of characters, etc. Unescaping is then typically done by the scanner because tokens can be classified during that stage. The presentation format defines basic syntax to identify tokens, but as the format is NOT context-free and intentionally extensible, the token can only be classified during the parser stage. Simply put, char-string in the presentation format cannot be unescaped during the scanner stage as the scanner does not know the type of information the char-string presents. Domain names are a prime example.
The RR foo. NS bar\. defines bar\. as a relative domain name. The \ (backslash) is important because it signals that the trailing dot does not serve as a label separator.
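The domain name case can be demonstrated with a small label splitter that unescapes while parsing rather than before. It is a sketch only: parse_name is a hypothetical name, and limits such as the 63-octet label length are not enforced.

```python
from typing import List, Tuple

DIGITS = "0123456789"


def parse_name(text: str) -> Tuple[List[bytes], bool]:
    """Return (labels, is_absolute) for a presentation-format name."""
    if text == ".":
        return [], True  # the root
    labels: List[bytes] = []
    current = bytearray()
    absolute = False
    pos = 0
    while pos < len(text):
        char = text[pos]
        if char == "\\":
            ddd = text[pos + 1:pos + 4]
            if len(ddd) == 3 and all(d in DIGITS for d in ddd) and int(ddd) <= 255:
                current.append(int(ddd))  # \DDD names one octet
                pos += 4
            elif pos + 1 < len(text) and text[pos + 1] not in DIGITS:
                current += text[pos + 1].encode("utf-8")  # \X is literal
                pos += 2
            else:
                raise ValueError("invalid escape sequence")
        elif char == ".":
            # Only an unescaped dot separates labels.
            if not current:
                raise ValueError("empty label")
            labels.append(bytes(current))
            current = bytearray()
            pos += 1
            absolute = pos == len(text)
        else:
            current += char.encode("utf-8")
            pos += 1
    if current:
        labels.append(bytes(current))
    return labels, absolute
```

Given bar\. the function yields one label containing the octets of "bar." and reports a relative name. Had a scanner already reduced the token to bar. the same function would report an absolute name with the label "bar", which is a different domain name.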
Note
This issue has been discussed on the DNSOP IETF mailing list.
As BIND, Knot and NSD implement double escaping, so does simdzone even though the behavior is incorrect.