Presentation format

DNS resource records (RRs) can be expressed in text form using the DNS presentation format. The format is originally defined in RFC 1035#section-5.1 and RFC 1034#section-3.6.1 and is most frequently used to define a zone in master files, more commonly known as zone files. The term “presentation format” is officially established in RFC 8499#section-5.

The presentation format is a concise tabular serialization format with provisions for convenient editing. The DNS is intentionally extensible and many RFCs define additional types and the typical representation for the corresponding RDATA sections. Consequently, the presentation format is not defined by one single specification, but rather many specifications.

The presentation format is NOT context-free and correct interpretation of the specification(s) is rather dependent on extensive knowledge of the DNS.

Note

This document is meant to be a concise source on interpretation of the presentation format, but is still very much a work in progress. Please consider contributing if anything is unclear or incorrect.

Format

Note

Modified text from RFC 1035#section-5.1

The presentation format defines a number of entries. Entries are predominantly line-oriented, though parentheses can be used to continue a list of items across a line boundary, and text literals can contain CRLF within the text. Any combination of tabs and spaces acts as a delimiter between the separate items that make up an entry. Any line can end with a comment. Comments start with a ; (semicolon).
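The scanning rules above can be sketched roughly as follows. This is a minimal illustration, not how any particular name server does it: the function name split_entries and the convention of marking a leading <blank> with an empty string are assumptions made for this example. Tokens are deliberately returned as written, with quotes and escapes intact.

```python
def split_entries(text):
    """Split presentation-format text into entries (lists of raw tokens).

    Tokens keep their quotes and escapes; a leading "" marks a <blank> start.
    """
    entries, tokens, tok = [], [], ""
    in_quotes = in_parens = False
    line_start = True

    def end_token():
        nonlocal tok
        if tok:
            tokens.append(tok)
            tok = ""

    def end_entry():
        nonlocal tokens
        end_token()
        if any(tokens):            # drop blank and comment-only lines
            entries.append(tokens)
        tokens = []

    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text):
            tok += text[i:i + 2]   # keep the escape exactly as written
            i += 1
        elif in_quotes:
            tok += ch              # delimiters lose their meaning in quotes
            in_quotes = ch != '"'
        elif ch == '"':
            tok += ch
            in_quotes = True
        elif ch == ";":
            end_token()
            while i + 1 < len(text) and text[i + 1] != "\n":
                i += 1             # skip the rest of the comment
        elif ch == "\n" and not in_parens:
            end_entry()
            line_start = True
            i += 1
            continue
        elif ch in " \t\r\n":
            if line_start:
                tokens.append("")  # entry starts with <blank>
            end_token()
        elif ch in "()":
            end_token()
            in_parens = ch == "("
        else:
            tok += ch
        line_start = False
        i += 1
    end_entry()
    return entries
```

Leaving escapes untouched at this stage is a deliberate choice: the scanner cannot know whether a token is a domain name, a text string or something else.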

The following entries are defined:

<blank>[<comment>]

$ORIGIN <domain-name> [<comment>]

$INCLUDE <file-name> [<domain-name>] [<comment>]

$TTL <TTL> [<comment>]

<domain-name><rr> [<comment>]

<blank><rr> [<comment>]

Blank lines, with or without comments, are allowed anywhere in the file.

Three control entries are defined: $ORIGIN, $INCLUDE and $TTL (defined in RFC 2308#section-4). $ORIGIN is followed by a domain name, and resets the current origin for relative domain names to the stated name. $INCLUDE inserts the named file into the current file, and may optionally specify a domain name that sets the relative domain name origin for the included file. $INCLUDE may also have a comment. Note that an $INCLUDE entry never changes the relative origin of the parent file, regardless of changes to the relative origin made within the included file. $TTL is followed by a decimal integer, and resets the default TTL for RRs which do not explicitly include a TTL value.

The last two forms represent RRs. If an entry for an RR begins with a <blank>, then the RR is assumed to be owned by the last stated owner. If an RR entry begins with a <domain-name>, then the owner name is reset.
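As a rough sketch of the bookkeeping described above, the generator below tracks the current origin, the default TTL and the last stated owner. The name walk_entries and the input shape (lists of tokens, with an empty first token standing for a leading <blank>) are assumptions for this example. $INCLUDE is left out because it needs file access.

```python
def walk_entries(entries, origin=None):
    """Yield (owner, fields, origin, default_ttl) for every RR entry."""
    default_ttl = None
    owner = None
    for tokens in entries:
        head = tokens[0]
        if head == "$ORIGIN":
            origin = tokens[1]            # resets the current origin
        elif head == "$TTL":
            default_ttl = int(tokens[1])  # resets the default TTL
        elif head == "$INCLUDE":
            raise NotImplementedError("$INCLUDE is not covered by this sketch")
        else:
            if head != "":
                owner = head              # an explicit owner resets the owner
            if owner is None:
                raise ValueError("RR entry starts with <blank> but no owner was stated")
            yield owner, tokens[1:], origin, default_ttl
```

An entry that starts with <blank> simply inherits the owner of the previous RR entry.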

<rr> contents take one of the following forms:

[<TTL>] [<class>] <type> <RDATA>

[<class>] [<TTL>] <type> <RDATA>

The RR begins with optional TTL and class fields, followed by a type and RDATA field appropriate to the type and class. Class and type use the standard mnemonics, TTL is a decimal integer. Omitted class and TTL values default to the last explicitly stated values. Since type and class mnemonics are disjoint, the parse is unique. (Note that this order is different from wire format order; the given order allows easier parsing and defaulting.)
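Because a TTL is all digits and class mnemonics do not overlap with type mnemonics, both field orders can be handled by one small loop. The sketch below assumes a hypothetical helper parse_rr_fields and a class table limited to the RFC 1035 mnemonics. It does not validate the type or the RDATA.

```python
CLASSES = {"IN", "CS", "CH", "HS"}  # class mnemonics from RFC 1035

def parse_rr_fields(fields, last_ttl=None, last_class=None):
    """Split the tokens after the owner into (ttl, class, type, rdata)."""
    ttl = rr_class = None
    i = 0
    while i < len(fields) and i < 2:
        word = fields[i].upper()
        if ttl is None and word.isascii() and word.isdigit():
            ttl = int(word)
        elif rr_class is None and word in CLASSES:
            rr_class = word
        else:
            break                  # neither TTL nor class: must be the type
        i += 1
    if i >= len(fields):
        raise ValueError("missing type field")
    if ttl is None:
        ttl = last_ttl             # fall back to the last stated TTL
    if rr_class is None:
        rr_class = last_class      # fall back to the last stated class
    return ttl, rr_class, fields[i].upper(), fields[i + 1:]

print(parse_rr_fields(["3600", "IN", "A", "192.0.2.1"]))
print(parse_rr_fields(["IN", "3600", "A", "192.0.2.1"]))
```

Both calls yield the same result, which is the point of the two <rr> forms.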

<domain-name>s make up a large share of the data in the master file. The labels in the domain name are expressed as character strings and separated by dots. Quoting conventions allow arbitrary characters to be stored in domain names. Domain names that end in a dot are called absolute, and are taken as complete. Domain names which do not end in a dot are called relative; the actual domain name is the concatenation of the relative part with an origin specified in a $ORIGIN, $INCLUDE, or as an argument to the master file loading routine. A relative name is an error when no origin is available.
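The distinction between absolute and relative names can be illustrated with a short sketch. The helper names are hypothetical, names are handled in their raw (still escaped) text form, and a free standing @ is taken to mean the current origin. Note that a trailing dot only makes a name absolute when the dot itself is not escaped.

```python
def ends_with_unescaped_dot(name):
    """True if the final dot is a label separator rather than an escaped dot."""
    if not name.endswith("."):
        return False
    body = name[:-1]
    # An odd number of backslashes right before the dot escapes the dot.
    backslashes = len(body) - len(body.rstrip("\\"))
    return backslashes % 2 == 0

def to_absolute(name, origin=None):
    """Return the absolute form of a raw domain name."""
    if name != "@" and ends_with_unescaped_dot(name):
        return name                       # already absolute
    if origin is None:
        raise ValueError(f"relative name {name!r} but no origin is available")
    if name == "@":
        return origin
    if origin == ".":
        return name + "."
    return name + "." + origin

print(to_absolute("www", "example.com."))   # www.example.com.
print(to_absolute("@", "example.com."))     # example.com.
print(to_absolute("bar\\.", "foo."))        # bar\..foo.
```

The last call shows why the escape must survive until the name is interpreted: bar\. is a relative name whose single label ends in a literal dot.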

<character-string> is expressed in one of two ways: as a contiguous set of characters without interior spaces, or as a string beginning with a " and ending with a ". Inside a " delimited string any character can occur, except for a " itself, which must be quoted using \ (backslash).

Because these files are text files several special encodings are necessary to allow arbitrary data to be loaded. In particular:

of the root.

@ A free standing @ is used to denote the current origin.

\X where X is any character other than a digit (0-9), is used to quote that character so that its special meaning does not apply. For example, “\.” can be used to place a dot character in a label.

\DDD where each D is a digit is the octet corresponding to the decimal number described by DDD. The resulting octet is assumed to be text and is not checked for special meaning.

( ) Parentheses are used to group data that crosses a line boundary. In effect, line terminations are not recognized within parentheses.

; Semicolon is used to start a comment; the remainder of the line is ignored.
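The two backslash encodings can be decoded with a few lines of code. The sketch below assumes a hypothetical function unescape that operates on the raw octets of a token whose surrounding quotes, if any, were already removed. Rejecting values above 255 and incomplete escapes is a choice made for this example.

```python
def unescape(raw: bytes) -> bytes:
    """Decode backslash-X and backslash-DDD escapes into plain octets."""
    out = bytearray()
    i = 0
    while i < len(raw):
        if raw[i] != 0x5C:                 # not a backslash: copy as-is
            out.append(raw[i])
            i += 1
            continue
        ddd = raw[i + 1:i + 4]
        nxt = raw[i + 1:i + 2]
        if len(ddd) == 3 and ddd.isdigit():
            value = int(ddd)               # decimal, not octal
            if value > 255:
                raise ValueError("escape does not fit in one octet")
            out.append(value)
            i += 4
        elif nxt and not nxt.isdigit():
            out.append(nxt[0])             # quoted character, taken literally
            i += 2
        else:
            raise ValueError("incomplete escape sequence")
    return bytes(out)

print(unescape(rb"a\.b"))        # b'a.b'
print(unescape(rb"\065\066"))    # b'AB'
```

Whether the resulting dot is a label separator or label content can no longer be told from the output, which is why decoding belongs in the code that knows what the token represents.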

Handling of Unknown DNS Resource Record (RR) Types

The intentional extensibility in the DNS may lead to software implementations lagging behind in support. RFC 3597#section-5 introduces generic notations to represent unknown types, classes and the corresponding RDATA in text form.

Note

Modified text from RFC 3597#section-5.

The type field for an unknown RR type is represented by the word TYPE immediately followed by the decimal RR type code, with no intervening whitespace. In the class field, an unknown class is similarly represented as the word CLASS immediately followed by the decimal class code.

This convention allows types and classes to be distinguished from each other and from TTL values, allowing both <rr> forms to be unambiguously parsed.

[<TTL>] [<class>] <type> <RDATA>

[<class>] [<TTL>] <type> <RDATA>
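To illustrate why the generic mnemonics keep the parse unambiguous, the sketch below classifies a single field. The function name classify_field is hypothetical, and accepting the TYPE and CLASS prefixes in any letter case is an assumption of this example.

```python
import re

GENERIC = re.compile(r"(TYPE|CLASS)(\d+)", re.IGNORECASE | re.ASCII)

def classify_field(word):
    """Return ('type'|'class'|'ttl', number) or None for any other word."""
    match = GENERIC.fullmatch(word)
    if match:
        code = int(match.group(2))
        if code > 0xFFFF:
            raise ValueError("type and class codes are 16-bit values")
        return match.group(1).lower(), code
    if word.isascii() and word.isdigit():
        return "ttl", int(word)
    return None                      # a regular mnemonic or RDATA

print(classify_field("TYPE65534"))   # ('type', 65534)
print(classify_field("CLASS32"))     # ('class', 32)
print(classify_field("3600"))        # ('ttl', 3600)
```

A word that starts with TYPE or CLASS can never be all digits, so it cannot be mistaken for a TTL.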

The RDATA section of an RR of unknown type is represented as a sequence of white space separated words as follows:

The special token \# (a backslash immediately followed by a hash sign), which identifies the RDATA as having the generic encoding defined herein rather than a traditional type-specific encoding.

An unsigned decimal integer specifying the RDATA length in octets.

Zero or more words of hexadecimal data encoding the actual RDATA field, each containing an even number of hexadecimal digits.

If the RDATA is of zero length, the text representation contains only the \# token and the single zero representing the length.

Even though an RR of known type represented in the \# format is effectively treated as an unknown type for the purpose of parsing the RDATA text representation, all further processing by the server MUST treat it as a known type and take into account any applicable type-specific rules regarding compression, canonicalization, etc.
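The generic RDATA notation is simple enough to sketch in full. The function names below are hypothetical. The input is assumed to be the whitespace-separated words of the RDATA, and checking that the stated length matches the decoded data is how this example chooses to validate.

```python
def parse_generic_rdata(words):
    """Decode the words of a generic RDATA representation into octets."""
    if len(words) < 2 or words[0] != "\\#":
        raise ValueError("generic RDATA must start with the \\# token and a length")
    if not (words[1].isascii() and words[1].isdigit()):
        raise ValueError("RDATA length must be an unsigned decimal integer")
    length = int(words[1])
    data = bytearray()
    for word in words[2:]:
        if len(word) % 2:
            raise ValueError("each word needs an even number of hex digits")
        data += bytes.fromhex(word)
    if len(data) != length:
        raise ValueError("stated length does not match the data")
    return bytes(data)

def format_generic_rdata(data):
    """Encode octets using the generic RDATA representation."""
    if not data:
        return "\\# 0"
    return f"\\# {len(data)} {data.hex()}"

print(parse_generic_rdata(["\\#", "4", "0a00", "0001"]))   # b'\n\x00\x00\x01'
print(format_generic_rdata(b"\x0a\x00\x00\x01"))           # \# 4 0a000001
```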

Service Binding and Parameter Specification via the DNS

RFC 9460 introduces a key-value syntax to the presentation format, initially for the SVCB and HTTPS types. The addition is a major change for implementors of presentation format parsers.

Note

Write (or copy) a section on the format from RFC 9460#section-2.1.

The RFC specifies a number of initial Service Parameter Keys (SvcParamKeys). IANA maintains these and additional keys in the Service Parameter Keys (SvcParamKeys) registry in the DNS Service Bindings (SVCB) category.

alpn and no-default-alpn

RFC 9460#section-7.1.1 specifies the alpn and no-default-alpn SvcParamKeys. The alpn SvcParamKey takes a comma-separated list of Application-Layer Protocol Negotiation (ALPN) Protocol IDs (maintained by IANA in the TLS Application-Layer Protocol Negotiation (ALPN) Protocol IDs category), the syntax for which is defined in RFC 9460#appendix-A.1.

A problem arises when items in the comma-separated list may contain a , (comma) or \ (backslash). RFC 9460#section-2.1 specifies SvcParamValue to be a char-string and some implementations (incorrectly) unescape char-string during the scanner stage. Consequently, the fact that a character is escaped (\000 or \X) is lost to the comma-separated list parser. None of the registered protocol identifiers (currently) contains a , (comma) and the specification dismisses the issue in the interest of progress.

RFC 9460#appendix-A.1 specifies simple-comma-separated, for lists of items that cannot contain either of the aforementioned characters, and comma-separated for lists of items that can. The specification overlooks that alpn (a comma-separated list) is encoded on the wire as a sequence of strings, each consisting of a length octet followed by at most 255 data octets. Any octet, including , and \, can therefore appear in an item received over the wire, so a name server writing a transfer to disk in plain text cannot always encode the data using the simple-comma-separated scheme.
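The wire encoding mentioned above can be sketched as follows. The function names are hypothetical and the example only covers the list of length-prefixed strings, not the surrounding SvcParam key and length fields.

```python
def alpn_to_wire(alpn_ids):
    """Encode ALPN IDs as a sequence of length-prefixed strings."""
    wire = bytearray()
    for alpn_id in alpn_ids:
        if not 1 <= len(alpn_id) <= 255:
            raise ValueError("each ALPN ID must be 1 to 255 octets long")
        wire.append(len(alpn_id))   # one length octet
        wire += alpn_id             # followed by the data octets
    return bytes(wire)

def alpn_from_wire(wire):
    """Decode a sequence of length-prefixed strings into ALPN IDs."""
    ids, i = [], 0
    while i < len(wire):
        length = wire[i]
        item = wire[i + 1:i + 1 + length]
        if length == 0 or len(item) != length:
            raise ValueError("malformed alpn value")
        ids.append(item)
        i += 1 + length
    return ids

print(alpn_to_wire([b"h2", b"h3"]))           # b'\x02h2\x02h3'
print(alpn_from_wire(b"\x08f\\oo,bar\x02h2"))  # [b'f\\oo,bar', b'h2']
```

Nothing in this encoding prevents an item from containing a comma or a backslash, as the second call shows.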

The specification contradicts itself in RFC 9460#section-7.1.1 by stating that presentation format parsers MAY simply disallow the , and \ characters in ALPN IDs instead of implementing the value-list escaping procedure, relying on the opaque key format (e.g., key1=\002h2) in the event that these characters are needed. Since SvcParamValue is defined to be char-string, the problem persists. For implementations that unescape during the scanner stage, the escape sequence is still lost, and implementations that unescape during the parser stage did not have the problem to start with.

RFC 9460 incorrectly assumes that char-string presents text. Programming languages typically classify a token as string if it is quoted, an identifier or keyword if it is a contiguous set of characters, etc. Unescaping is then typically done by the scanner because tokens can be classified during that stage. The presentation format defines basic syntax to identify tokens, but as the format is NOT context-free and intentionally extensible, the token can only be classified during the parser stage. Simply put, char-string in the presentation format cannot be unescaped during the scanner stage as the scanner does not know the type of information the char-string presents. Domain names are a prime example.
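The two decoding stages that RFC 9460 describes for comma-separated values can be sketched as follows. Both function names are hypothetical. The first stage decodes the char-string escapes, the second splits the result on commas while honoring the value-list escapes for , and \. The sample inputs are the two equivalent alpn spellings given in RFC 9460, with the surrounding quotes already removed.

```python
import re

def decode_char_string(raw: bytes) -> bytes:
    """Stage 1: decode backslash-DDD and backslash-X escapes.

    Malformed escapes are left untouched in this sketch.
    """
    def replace(match):
        esc = match.group(1)
        return bytes([int(esc)]) if len(esc) == 3 else esc
    return re.sub(rb"\\(\d{3}|[^0-9])", replace, raw)

def split_value_list(value: bytes):
    """Stage 2: split on unescaped commas and drop the list escapes."""
    items, item, i = [], bytearray(), 0
    while i < len(value):
        if value[i] == 0x5C:                       # backslash
            if value[i + 1:i + 2] not in (b",", b"\\"):
                raise ValueError("only comma and backslash can be escaped")
            item.append(value[i + 1])
            i += 2
        elif value[i] == 0x2C:                     # comma ends the item
            items.append(bytes(item))
            item = bytearray()
            i += 1
        else:
            item.append(value[i])
            i += 1
    items.append(bytes(item))
    if b"" in items:
        raise ValueError("empty item in list")
    return items

for raw in (rb"f\\\\oo\\,bar,h2", rb"f\\\092oo\092,bar,h2"):
    print(split_value_list(decode_char_string(raw)))
```

Both spellings decode to the same two items, the first of which contains a backslash and a comma. The second stage only works because it still sees the escapes that survived the first one, which is the double escaping referred to in this section.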

The RR foo. NS bar\. defines bar\. as a relative domain name. The \ (backslash) is important because it signals that the trailing dot does not serve as a label separator.

Note

This issue has been discussed on the DNSOP IETF mailing list.

As BIND, Knot and NSD implement double escaping, so does simdzone even though the behavior is incorrect.