Arithmetic Types in C++
Intro
Simple things are always the hardest to explain.
In C++, arithmetic types include integral types and floating-point types.
The Integer
Let’s begin with the most popular integer type, int:
int i = 0;
In C++, int is fixed in width.
This means that its size cannot change, and there’s a limited range of values it can hold.
- Typically, an
intis 32 bits or 4 bytes, and the range is -2,147,483,648 to +2,147,483,647
Length Modifiers
Often, a different size of integer is desired:
- A bigger integer to hold a larger range of values
- A smaller integer to be more space efficient
Thankfully, we have the modifiers long and short:
long int: an integer type whose rank is higher thanintshort int: an integer type whose rank is lower thanint- A plain
longis the same aslong int. Similarly, a plainshortis the same asshort int
So, is long int bigger than int and short int smaller than int?
This is where things start to get interesting.
Take long int. The C++ standard only guarantees that it is at least as big as int. It does not have to be bigger.
In fact:
- On most versions of Windows,
sizeof(long)is4, same assizeof(int) - On 64-bit Linux distros,
sizeof(long)is8
Note that even when they have the same width, int and long int are two different types. Likewise, int and short int are always two different types.
- Typically (on both Windows and Linux), a
short intis 16 bit
What if you want even bigger or smaller integer types? Well, if you want to go bigger, just slap another long:
long long int: an integer type whose rank is higher thanlong int- Typically,
long long intis 64 bit
How about going smaller?
Now, if a C++ novice is asked this question right after being told about long long int, their response is likely:
short short int
Alas, there is no such thing as short short int in C++ (or C). It is spelled
signed char
Huh? What does signed come from and what is char doing here? Are we still talking about integers??
I cannot provide a better answer than
Well, for historical reasons, it is what it is
Signed-ness Modifiers
All C++ integers are either signed or unsigned.
signedtypes can hold negative valuesunsignedtypes can only hold non-negative values
Further, integer arithmetic is different between signed and unsigned:
signedinteger overflow is undefined behavior (read: very bad)unsignedinteger overflow is well-defined - it wraps around- That does not mean
unsignedtypes are safer!- In fact, I’ve encountered bugs that could have been avoided were
signedtype used - Both
signedandunsignedare useful in different use cases
- In fact, I’ve encountered bugs that could have been avoided were
If we don’t specify the signed-ness, it defaults to signed. That is:
intis the same assigned intlongis the same assigned longshortis the same assigned short
and so on.
Except for char.
signed char is not the same as char. In fact, the following are three distinct types:
signed charcharunsigned char
All three are character types. However, only signed char and unsigned char are considered standard integer types.
The unfortunate fact is, char is overloaded to mean at least three different things.
charcan mean character, the basic unit of a stringcharcan mean byte, the basic unit of raw memorycharcan mean integer, as we’ve seen withsigned charandunsigned char
The direction of C++ seems to be that char should only mean character.
- Use
std::byteif you mean a byte - Use
std::int8_tif you mean an integer
Although, there is a lot of existing code out there, and char is ubiquitous.
Looking back, really:
signed charshould have beenshort shortunsigned charshould have beenunsigned short short
Fixed width integer types
In summary, we have 10 standard integer types: there are 5 ranks, each with signed and unsigned variants.
The C++ standard only guarantees minimal bit counts and that the each rank is at least as wide as the last:
// char: at least 8 bit
// short: at least 16 bit
// int: at least 16 bit
// long: at least 32 bit
// long long: at least 64 bit
sizeof(char) <= sizeof(short) <= sizeof(int) <= sizeof(long) <= sizeof(long long)
- For complete info, refer to here
If precise width is demanded, there are fixed width integer types at our disposal:
int8_t: exactly 8-bit signed integeruint8_t: exactly 8-bit unsigned integer- … and so on, up to
int64_tanduint64_t
Aside from width precision, these types are more coherent when it comes to naming/spelling. You don’t need to worry about long or short or char or the order thereof.
long signed longis legal
With fixed width integers, the only pattern is [u]intN_t and that’s it!
The inconvenience is that you have to #include <cstdint>. That, and technically they are defined in namespace std, so the more tedious but correct way to spell them is e.g. std::int32_t.
- This matters, because in the future if there’s only
import std;for pulling in the standard library, you need tostd::prefix to find these types
In reality, the [u]intN_t may be (but not required to be) typedefs of the standard integer types. For example:
int32_tmay be defined asusing int32_t = int;uint8_tmay be defined asusing uint8_t = unsigned char;int64_tmay belongon one system, butlong longon another
That’s the whole point of these integer types: platform-independent access to fixed width integers.
Floating-point types
Aside from integral types, floating-point types are other category of arithmetic types.
There are three standard floating-point types:
floatis single precision floating-point typedoubleis double precision floating-point typelong doubleis extended precision floating-point type
Alas, the spelling of these types is, again, less than perfect.
- If
doubleis double precision, why is single precision notsingle? - If
longis used to denote “higher precision”, why isdoublenotlong float?
The short answer is again “for historical reasons, it is what it is”. We just have to get used to it.
Since C++23, there are fixed width floating-point types, similar to their fixed width integer cousins:
float16_tfloat32_tfloat64_t- …
However, these are never the standard floating-point types (float, double, long double). They are (aliases to) extended floating-point types.
Finally, floating-point types are inherently signed. There is no unsigned float. Nor is there signed float.
Arithmetic Operations
Now comes the hard part.
Seriously; the previous parts are easy
Consider a binary arithmetic operation, T @ U, where T and U denote two arithmetic types. @ can be +, -, *, and so on.
C++ requires converting T and U to a common type before carrying out the operation. The main 3 rules are:
- If at least one of them is floating-point, then the common type is the largest floating-point.
int @ float=>float @ floatfloat @ double=>double @ double
- All
charandshortvariants are first converted toint; this is known as integer promotion.char @ char=>int @ intunsigned short @ unsigned short=>int @ int
- For two integer types with the same signed-ness, the one with higher rank is the common type
int @ long=>long @ longunsigned int @ unsigned long=>unsigned long @ unsigned long
Now, when it comes to mixed signed-ness operation, such as unsigned int @ long, the recommendation is:
- Don’t.
No, really. The rules for this area are complex and the results can be surprising. Most compilers warn about this sort of operation, for good reasons.
- If you want to challenge yourself, you may take look at this infographic from hackingcpp.com
Since C++20, we even introduce integer comparison functions:
cmp_less(T, U)cmp_greater(T, U)- …
Precisely because the builtin mixed-signed-ness operations are error-prone.
Afterword
When it comes to basic stuff like integers and floating-points, pretty much everything was inherited from C, which has been for a long while. Some design probably made sense “back then”, but there is definitely better design today. Unfortunately, we don’t get to do-over; backward compatibility is pivotal to C, and in turn C compatibility is pivotal to C++.