# Arithmetic Types in C++

## Intro

Simple things are always the hardest to explain.

In C++, **arithmetic types** include **integral types** and **floating-point types**.

**The** Integer

Let’s begin with the most popular integer type, `int`

:

```
int i = 0;
```

In C++, `int`

is **fixed** in width.
This means that its size cannot change, and there’s a limited range of values it can hold.

- Typically, an
`int`

is 32 bits or 4 bytes, and the range is -2,147,483,648 to +2,147,483,647

## Length Modifiers

Often, a different size of integer is desired:

- A bigger integer to hold a larger range of values
- A smaller integer to be more space efficient

Thankfully, we have the modifiers `long`

and `short`

:

`long int`

: an integer type whose rank is higher than`int`

`short int`

: an integer type whose rank is lower than`int`

- A plain
`long`

is the same as`long int`

. Similarly, a plain`short`

is the same as`short int`

So, is `long int`

bigger than `int`

and `short int`

smaller than `int`

?

This is where things start to get interesting.

Take `long int`

. The C++ standard only guarantees that it is *at least as big as* `int`

. It does *not* have to be bigger.

In fact:

- On most versions of Windows,
`sizeof(long)`

is`4`

, same as`sizeof(int)`

- On 64-bit Linux distros,
`sizeof(long)`

is`8`

Note that even when they have the same width, `int`

and `long int`

**are two different types**. Likewise, `int`

and `short int`

are always two different types.

- Typically (on both Windows and Linux), a
`short int`

is 16 bit

What if you want even bigger or smaller integer types? Well, if you want to go bigger, just slap another `long`

:

`long long int`

: an integer type whose rank is higher than`long int`

- Typically,
`long long int`

is 64 bit

How about going smaller?

Now, if a C++ novice is asked this question right after being told about `long long int`

, their response is likely:

```
short short int
```

Alas, there is no such thing as `short short int`

in C++ (or C). It is spelled

```
signed char
```

Huh? What does `signed`

come from and what is `char`

doing here? Are we still talking about integers??

I cannot provide a better answer than

Well, for historical reasons, it is what it is

## Signedness Modifiers

All C++ integers are either `signed`

or `unsigned`

.

`signed`

types can hold negative values`unsigned`

types can only hold non-negative values

Further, integer arithmetic is different between `signed`

and `unsigned`

:

`signed`

integer overflow is undefined behavior (read: very bad)`unsigned`

integer overflow is well-defined - it wraps around- That does
**not**mean`unsigned`

types are safer!- In fact, I’ve encountered bugs that could have been avoided were
`signed`

type used - Both
`signed`

and`unsigned`

are useful in different use cases

- In fact, I’ve encountered bugs that could have been avoided were

If we don’t specify the signedness, it defaults to `signed`

. That is:

`int`

is the same as`signed int`

`long`

is the same as`signed long`

`short`

is the same as`signed short`

and so on.

**Except for char**.

`signed char`

is **not** the same as `char`

. In fact, the following are three distinct types:

`signed char`

`char`

`unsigned char`

All three are character types. However, only `signed char`

and `unsigned char`

are considered standard integer types.

The unfortunate fact is, `char`

is overloaded to mean at least three different things.

`char`

can mean*character*, the basic unit of a string`char`

can mean*byte*, the basic unit of raw memory`char`

can mean*integer*, as we’ve seen with`signed char`

and`unsigned char`

The direction of C++ seems to be that ** char should only mean character**.

- Use
`std::byte`

if you mean a byte - Use
`std::int8_t`

if you mean an integer

Although, there is a lot of existing code out there, and `char`

is ubiquitous.

Looking back, really:

`signed char`

should have been`short short`

`unsigned char`

should have been`unsigned short short`

## Fixed width integer types

In summary, we have 10 standard integer types: there are 5 ranks, each with `signed`

and `unsigned`

variants.

The C++ standard only guarantees minimal bit counts and that the each rank is at least as wide as the last:

```
// char: at least 8 bit
// short: at least 16 bit
// int: at least 16 bit
// long: at least 32 bit
// long long: at least 64 bit
sizeof(char) <= sizeof(short) <= sizeof(int) <= sizeof(long) <= sizeof(long long)
```

- For complete info, refer to here

If precise width is demanded, there are **fixed width integer types** at our disposal:

`int8_t`

: exactly 8-bit signed integer`uint8_t`

: exactly 8-bit unsigned integer- … and so on, up to
`int64_t`

and`uint64_t`

Aside from width precision, these types are more coherant when it comes to naming/spelling. You don’t need to worry about `long`

or `short`

or `char`

or the order thereof.

`long signed long`

is legal

With fixed width integers, the only pattern is `[u]intN_t`

and that’s it!

The inconvenience is that you have to `#include <cstdint>`

. That, and technically they are defined in namespace `std`

, so the more tedious but correct way to spell them is e.g. `std::int32_t`

.

- This matters, because in the future if there’s only
`import std;`

for pulling in the standard library, you need to`std::`

prefix to find these types

In reality, the `[u]intN_t`

may be (but not required to be) typedefs of the standard integer types. For example:

`int32_t`

may be defined as`using int32_t = int;`

`uint8_t`

may be defined as`using uint8_t = unsigned char;`

`int64_t`

may be`long`

on one system, but`long long`

on another

That’s the whole point of these integer types: *platform-independent* access to fixed width integers.

## Floating-point types

Aside from integral types, **floating-point types** are other category of arithmetic types.

There are three standard floating-point types:

`float`

is*single precision floating-point type*`double`

is*double precision floating-point type*`long double`

is*extended precision floating-point type*

Alas, the spelling of these types is, again, less than perfect.

- If
`double`

is double precision, why is single precision not`single`

? - If
`long`

is used to denote “higher precision”, why is`double`

not`long float`

?

The short answer is again “for historical reasons, it is what it is”. We just have to get used to it.

Since C++23, there are fixed width floating-point types, similar to their fixed width integer cousins:

`float16_t`

`float32_t`

`float64_t`

- …

However, these are *never* the standard floating-point types (`float`

, `double`

, `long double`

). They are (aliases to) *extended floating-point types*.

Finally, floating-point types are inherently signed. There is no `unsigned float`

. Nor is there `signed float`

.

## Arithmetic Operations

Now comes the hard part.

Seriously; the previous parts are

easy

Consider a binary arithmetic operation, `T @ U`

, where `T`

and `U`

denote two arithmetic types. `@`

can be `+`

, `-`

, `*`

, and so on.

C++ requires converting `T`

and `U`

to a common type before carrying out the operation. The main 3 rules are:

- If at least one of them is floating-point, then the common type is the largest floating-point.
`int @ float`

=>`float @ float`

`float @ double`

=>`double @ double`

- All
`char`

and`short`

variants are first converted to`int`

; this is known as*integer promotion*.`char @ char`

=>`int @ int`

`unsigned short @ unsigned short`

=>`int @ int`

- For two integer types with the same signedness, the one with higher rank is the common type
`int @ long`

=>`long @ long`

`unsigned int @ unsigned long`

=>`unsigned long @ unsigned long`

Now, when it comes to **mixed signedness operation**, such as `unsigned int @ long`

, the recommendation is:

**Don’t**.

No, really. The rules for this area are complex and the results can be surprising. Most compilers warn about this sort of operation, for good reasons.

- If you want to challenge yourself, you may take look at this infographic from hackingcpp.com

Since C++20, we even introduce integer comparison functions:

`cmp_less(T, U)`

`cmp_greater(T, U)`

- …

*Precisely* because the builtin mixed-signedness operations are error-prone.

## Afterword

When it comes to basic stuff like integers and floating-points, pretty much everything was inherited from C, which has been for a long while. Some design probably made sense “back then”, but there is definitely better design today. Unfortunately, we don’t get to do-over; backward compatibility is pivotal to C, and in turn C compatibility is pivotal to C++.