In common with the C language, we have been used to having the float, double and long double types available since the first versions of C++. (Did you know that long double often uses 80 bits of precision, padded to 96 bits, on 32-bit platforms, but a full 128 bits on 64-bit platforms?) In this article we’re going to discuss the <stdfloat> header, new in C++23, which provides types within the std:: namespace with explicit precisions between 16 and 128 bits, regardless of hardware platform.
The first point to note is that even if this header is present (which it isn’t yet with Clang’s libc++), support for all five of the new floating-point types is not guaranteed. For each of these there is a feature test macro with value 1 if support is present. Even then, hardware support is not guaranteed—these types may instead be emulated in software by the standard library (or there could be a typedef to a user-defined type).
There are two new 16-bit types, these being float16_t and bfloat16_t. The latter of these (known as Brain Floating-Point, with applications in neural networks) has the same exponent range as 32-bit floats, but with much reduced precision—only 8 bits. Then there are float32_t, float64_t and float128_t, with the first two of these mapping exactly to float and double in terms of precision and exponent range (and therefore native hardware support). On 32-bit machines, float128_t support most likely either does not exist or will use software emulation, while hardware support for bfloat16_t is probably limited to the GPU.
The feature test macros have names of the form __STDCPP_FLOAT128_T__ and can be tested with #if (support for these does not yet appear to be present in Clang). All types except bfloat16_t map to C native types of the form _Float128, which is useful if binary compatibility is required. There are also literal suffixes of the form f128 (or F128) available whenever the type alias is supported (it would appear that using namespace std::literals is not necessary with either GCC or Clang, but output directly to streams is not currently supported by Clang).
Here is a sample program to illustrate basic usage, with output being `1.20312 x 9.99922e+09 = 1.20125e+10`:

```cpp
#include <stdfloat>
#include <iostream>

#if __STDCPP_BFLOAT16_T__ != 1
#error No support for 16-bit brain float
#endif

int main() {
    auto a = 1.2bf16;
    std::bfloat16_t b = 1e10bf16; // the bf16 literal suffix is needed
    std::cout << a << " x " << b << " = " << a * b << '\n';
}
```
The table below shows the literal suffixes, C language types and properties of all five types defined in header <stdfloat>:
| Type | Literal suffix | C language type | Bits of storage | Bits of precision | Bits of exponent | Max exponent |
|---|---|---|---|---|---|---|
| float16_t | f16 or F16 | _Float16 | 16 | 11 | 5 | 15 |
| float32_t | f32 or F32 | _Float32 | 32 | 24 | 8 | 127 |
| float64_t | f64 or F64 | _Float64 | 64 | 53 | 11 | 1023 |
| float128_t | f128 or F128 | _Float128 | 128 | 113 | 15 | 16383 |
| bfloat16_t | bf16 or BF16 | (N/A) | 16 | 8 | 8 | 127 |
It would appear that conversions between the different types are not always implicit, so unless auto is used, the type specifier must match the suffix exactly. Also, mixed-mode arithmetic between bfloat16_t and float16_t, and automatic promotion to larger types, may not be provided.
To summarize, the <stdfloat> header provides a standardized way for C++ code to target floating-point hardware of a specified precision. Fixed-width types are beneficial in scenarios requiring precise memory control or specific performance characteristics, such as embedded systems or high-performance computing. If hardware support for one of these types arrives later, a simple recompile is all that is needed.