33.14 Rmail and Coding Systems

Rmail automatically decodes messages which contain non-ASCII characters, just as Emacs does with files you visit and with subprocess output. Rmail uses the standard ‘charset=charset’ header in the message, if any, to determine how the message was encoded by the sender. It maps charset into the corresponding Emacs coding system (see Coding Systems), and uses that coding system to decode message text. If the message header doesn’t have the ‘charset’ specification, or if charset is not recognized, Rmail chooses the coding system with the usual Emacs heuristics and defaults (see Recognize Coding).

Occasionally, a message is decoded incorrectly, either because Emacs guessed the wrong coding system in the absence of the ‘charset’ specification, or because the specification was inaccurate. For example, a misconfigured mailer could send a message with a ‘charset=iso-8859-1’ header when the message is actually encoded in koi8-r. When you see the message text garbled, or some of its characters displayed as hex codes or empty boxes, this may have happened.

You can correct the problem by decoding the message again using the right coding system, if you can figure out or guess which one is right. To do this, invoke the M-x rmail-redecode-body command. It reads the name of a coding system, and then redecodes the message using the coding system you specified. If you specified the right coding system, the result should be readable.

When you get new mail in Rmail, each message is translated automatically from the coding system it is written in, as if it were a separate file. This uses the priority list of coding systems that you have specified. If a MIME message specifies a character set, Rmail obeys that specification. For reading and saving Rmail files themselves, Emacs uses the coding system specified by the variable rmail-file-coding-system. The default value is nil, which means that Rmail files are not translated (they are read and written in the Emacs internal character code).