E.10 C Integer Types
Here are some guidelines for use of integer types in the Emacs C source code. These guidelines sometimes give competing advice; common sense is advised.
- Avoid arbitrary limits. For example, avoid
int len = strlen (s);
unless the length ofs
is required for other reasons to fit inint
range. - Do not assume that signed integer arithmetic wraps around on overflow. This is no longer true of Emacs porting targets: signed integer overflow has undefined behavior in practice, and can dump core or even cause earlier or later code to behave illogically. Unsigned overflow does wrap around reliably, modulo a power of two.
- Prefer signed types to unsigned, as code gets confusing when signed and unsigned types are combined. Many other guidelines assume that types are signed; in the rarer cases where unsigned types are needed, similar advice may apply to the unsigned counterparts (e.g.,
size_t
instead ofptrdiff_t
, oruintptr_t
instead ofintptr_t
). - Prefer
int
for Emacs character codes, in the range 0 .. 0x3FFFFF. More generally, preferint
for integers known to be inint
range, e.g., screen column counts. - Prefer
ptrdiff_t
for sizes, i.e., for integers bounded by the maximum size of any individual C object or by the maximum number of elements in any C array. This is part of Emacs’s general preference for signed types. Usingptrdiff_t
limits objects toPTRDIFF_MAX
bytes, but larger objects would cause trouble anyway since they would break pointer subtraction, so this does not impose an arbitrary limit. - Avoid
ssize_t
except when communicating to low-level APIs that havessize_t
-related limitations. Although it’s equivalent toptrdiff_t
on typical platforms,ssize_t
is occasionally narrower, so using it for size-related calculations could overflow. Also,ptrdiff_t
is more ubiquitous and better-standardized, has standardprintf
formats, and is the basis for Emacs’s internal size-overflow checking. When usingssize_t
, please note that POSIX requires support only for values in the range -1 ..SSIZE_MAX
. - Normally, prefer
intptr_t
for internal representations of pointers, or for integers bounded only by the number of objects that can exist at any given time or by the total number of bytes that can be allocated. However, preferuintptr_t
to represent pointer arithmetic that could cross page boundaries. For example, on a machine with a 32-bit address space an array could cross the 0x7fffffff/0x80000000 boundary, which would cause an integer overflow when adding 1 to(intptr_t) 0x7fffffff
. - Prefer the Emacs-defined type
EMACS_INT
for representing values converted to or from Emacs Lisp fixnums, as fixnum arithmetic is based onEMACS_INT
. - When representing a system value (such as a file size or a count of seconds since the Epoch), prefer the corresponding system type (e.g.,
off_t
,time_t
). Do not assume that a system type is signed, unless this assumption is known to be safe. For example, althoughoff_t
is always signed,time_t
need not be. - Prefer
intmax_t
for representing values that might be any signed integer value. Aprintf
-family function can print such a value via a format like"%"PRIdMAX
. - Prefer
bool
,false
andtrue
for booleans. Usingbool
can make programs easier to read and a bit faster than usingint
. Although it is also OK to useint
,0
and1
, this older style is gradually being phased out. When usingbool
, respect the limitations of the replacement implementation ofbool
, as documented in the source filelib/stdbool.in.h
. In particular, boolean bitfields should be of typebool_bf
, notbool
, so that they work correctly even when compiling Objective C with standard GCC. - In bitfields, prefer
unsigned int
orsigned int
toint
, asint
is less portable: it might be signed, and might not be. Single-bit bit fields should beunsigned int
orbool_bf
so that their values are 0 or 1.