Skip to main content

32.26 Checksum and Hash

Emacs has built-in support for computing cryptographic hashes. A cryptographic hash, or checksum, is a digital fingerprint of a piece of data (e.g., a block of text) which can be used to check that you have an unaltered copy of that data.

Emacs supports several common cryptographic hash algorithms: MD5, SHA-1, SHA-2, SHA-224, SHA-256, SHA-384 and SHA-512. MD5 is the oldest of these algorithms, and is commonly used in message digests to check the integrity of messages transmitted over a network. MD5 and SHA-1 are not collision resistant (i.e., it is possible to deliberately design different pieces of data which have the same MD5 or SHA-1 hash), so you should not use them for anything security-related. For security-related applications you should use the other hash types, such as SHA-2 (e.g. sha256 or sha512).

function secure-hash-algorithms

This function returns a list of symbols representing algorithms that secure-hash can use.

function secure-hash algorithm object \&optional start end binary

This function returns a hash for object. The argument algorithm is a symbol stating which hash to compute: one of md5, sha1, sha224, sha256, sha384 or sha512. The argument object should be a buffer or a string.

The optional arguments start and end are character positions specifying the portion of object to compute the message digest for. If they are nil or omitted, the hash is computed for the whole of object.

If the argument binary is omitted or nil, the function returns the text form of the hash, as an ordinary Lisp string. If binary is non-nil, it returns the hash in binary form, as a sequence of bytes stored in a unibyte string.

This function does not compute the hash directly from the internal representation of object’s text (see Text Representations). Instead, it encodes the text using a coding system (see Coding Systems), and computes the hash from that encoded text. If object is a buffer, the coding system used is the one which would be chosen by default for writing the text into a file. If object is a string, the user’s preferred coding system is used (see Recognize Coding in GNU Emacs Manual).

function md5 object \&optional start end coding-system noerror

This function returns an MD5 hash. It is semi-obsolete, since for most purposes it is equivalent to calling secure-hash with md5 as the algorithm argument. The object, start and end arguments have the same meanings as in secure-hash.

If coding-system is non-nil, it specifies a coding system to use to encode the text; if omitted or nil, the default coding system is used, like in secure-hash.

Normally, md5 signals an error if the text can’t be encoded using the specified or chosen coding system. However, if noerror is non-nil, it silently uses raw-text coding instead.

function buffer-hash \&optional buffer-or-name

Return a hash of buffer-or-name. If nil, this defaults to the current buffer. As opposed to secure-hash, this function computes the hash based on the internal representation of the buffer, disregarding any coding systems. It’s therefore only useful when comparing two buffers running in the same Emacs, and is not guaranteed to return the same hash between different Emacs versions. It should be somewhat more efficient on larger buffers than secure-hash is, and should not allocate more memory.