Outlook – Should a use utf-8 or “utf-8” as a charset value in an email header

charsetemailoutlook.comutf-8

When sending an email to an Outlook.com with forwarding turned on, I find that the forwarded mail is rejected.

On examining the sent mail and the mail sitting in Outlook's inbox. I find that Microsoft have essentially re-written parts of the mail body.

For example

This is a multi-part message in MIME format.
--=_5226908e44ebc0462f06052400644d2f
Content-Type: multipart/alternative;
 boundary="=_926d2a45bc543e1972443c87118fa61a"

--=_926d2a45bc543e1972443c87118fa61a
Content-Transfer-Encoding: base64
Content-Type: text/plain; charset=utf-8

SGF2aW5nIGFub3RoZXIgZ28gYXQgZm9yd2FyZGluZyBhbiBlbWFpbCB2aWEgT3V0bG9vay4NCg0K
DQo=
--=_926d2a45bc543e1972443c87118fa61a
Content-Transfer-Encoding: base64
Content-Type: text/html; charset=utf-8

Becomes as follows; note the quotes around the charset value:

--=_5226908e44ebc0462f06052400644d2f
Content-Type: multipart/alternative;
boundary="=_926d2a45bc543e1972443c87118fa61a"

--=_926d2a45bc543e1972443c87118fa61a
Content-Transfer-Encoding: base64
Content-Type: text/plain; charset="utf-8"

SGF2aW5nIGFub3RoZXIgZ28gYXQgZm9yd2FyZGluZyBhbiBlbWFpbCB2aWEgT3V0bG9vay4NCg0K
DQo=
--=_926d2a45bc543e1972443c87118fa61a
Content-Transfer-Encoding: base64
Content-Type: text/html; charset="utf-8"

Now aside from the fact that the mail RFC’s specifically forbid modifying the body anyway (which breaks the DKIM signature) I have to ask which is the correct way to write charset=utf-8 in an email header?

Best Answer

RFC2045 provides in section 5.1 the grammar used to construct valid Content-Type headers in MIME messages:

5.1.  Syntax of the Content-Type Header Field

   In the Augmented BNF notation of RFC 822, a Content-Type header field
   value is defined as follows:

     content := "Content-Type" ":" type "/" subtype
                *(";" parameter)
                ; Matching of media type and subtype
                ; is ALWAYS case-insensitive.

     type := discrete-type / composite-type

     discrete-type := "text" / "image" / "audio" / "video" /
                      "application" / extension-token

     composite-type := "message" / "multipart" / extension-token

     extension-token := ietf-token / x-token

     ietf-token := <An extension token defined by a
                    standards-track RFC and registered
                    with IANA.>

     x-token := <The two characters "X-" or "x-" followed, with
                 no intervening white space, by any token>

     subtype := extension-token / iana-token

     iana-token := <A publicly-defined extension token. Tokens
                    of this form must be registered with IANA
                    as specified in RFC 2048.>

     parameter := attribute "=" value

     attribute := token
                  ; Matching of attributes
                  ; is ALWAYS case-insensitive.

     value := token / quoted-string

     token := 1*<any (US-ASCII) CHAR except SPACE, CTLs,
                 or tspecials>

     tspecials :=  "(" / ")" / "<" / ">" / "@" /
                   "," / ";" / ":" / "\" / <">
                   "/" / "[" / "]" / "?" / "="
                   ; Must be in quoted-string,
                   ; to use within parameter values

Note how value is defined as token / quoted-string.

Further down in the section is a textual clarification with an example:

   Note that the value of a quoted string parameter does not include the
   quotes.  That is, the quotation marks in a quoted-string are not a
   part of the value of the parameter, but are merely used to delimit
   that parameter value.  In addition, comments are allowed in
   accordance with RFC 822 rules for structured header fields.  Thus the
   following two forms

     Content-type: text/plain; charset=us-ascii (Plain text)

     Content-type: text/plain; charset="us-ascii"

   are completely equivalent.

As you can see, quoting is not required when the value already is a token (1*<any (US-ASCII) CHAR except SPACE, CTLs, or tspecials>) but valid nonetheless.

Related Question