Half-precision floats

Enoki provides a compact implementation of a 16-bit half-precision floating point type that is compatible with the FP16 format on GPUs and high dynamic range image libraries such as OpenEXR. To use this feature, include the following header:

#include <enoki/half.h>

Most current CPUs do not natively implement half-precision arithmetic, so every mathematical operation involving this type incurs a half \(\to\) float \(\to\) half roundtrip. For this reason, it is unwise to rely on it in performance-critical parts of a computation.

The main reason for including a dedicated half-precision type in Enoki is that it provides an ideal storage format for floating point data that does not require the full accuracy of the single precision representation, yielding an immediate \(2\times\) storage saving.
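To make the storage argument concrete, the following sketch compares the memory footprint of an RGBA image stored in single and half precision. The helper image_bytes and the 1920x1080 resolution are assumptions chosen purely for illustration and are not part of Enoki; a half value is represented here by a plain 16-bit integer.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not part of Enoki): bytes needed for a tightly packed image
constexpr std::size_t image_bytes(std::size_t width, std::size_t height,
                                  std::size_t channels,
                                  std::size_t bytes_per_channel) {
    return width * height * channels * bytes_per_channel;
}

// Assumed example resolution: 1920x1080 pixels with 4 channels (RGBA)
constexpr std::size_t float_bytes = image_bytes(1920, 1080, 4, sizeof(float));
constexpr std::size_t half_bytes  = image_bytes(1920, 1080, 4, sizeof(std::uint16_t));

// A 16-bit half occupies half the space of a 32-bit float
static_assert(float_bytes == 2 * half_bytes,
              "half precision storage should be half the size");
```

Under these assumptions, the single precision image occupies 33,177,600 bytes and the half-precision image 16,588,800 bytes.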

Note

If supported by the target architecture, Enoki uses the F16C instruction set to perform efficient vectorized conversion between half and single precision variables (however, this only affects conversion and no other arithmetic operations). ARM NEON also provides native conversion instructions.

Usage

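The following example shows a typical use of the enoki::half type: pixel data is stored in half precision, converted to single precision for arithmetic, and converted back for storage.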

#include <enoki/array.h>
#include <enoki/half.h>

using namespace enoki;

using Color4f = Array<float, 4>;
using Color4h = Array<half, 4>;

uint8_t *image_ptr = ...;

Color4f pixel(load<Color4h>(image_ptr)); // <- conversion vectorized using F16C

/* ... update 'pixel' using single-precision arithmetic ... */

store(image_ptr, Color4h(pixel)); // <- conversion vectorized using F16C

Reference

class half

A half instance consists of a sign bit, a 5-bit exponent, and 10 explicitly stored mantissa bits.
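The bit layout can be illustrated with a small standalone decoder. This is a minimal sketch that assumes the standard IEEE 754 binary16 interpretation (exponent bias of 15, subnormals, infinities and NaNs); decode_half_bits is a hypothetical helper and not how Enoki performs the conversion.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical helper (not part of Enoki): decode an IEEE 754 binary16 bit
// pattern into a float by inspecting the three bit fields directly.
inline float decode_half_bits(std::uint16_t bits) {
    const int sign     = (bits >> 15) & 0x1;   // 1 sign bit
    const int exponent = (bits >> 10) & 0x1F;  // 5 exponent bits, bias 15
    const int mantissa = bits & 0x3FF;         // 10 stored mantissa bits

    float magnitude;
    if (exponent == 0) {
        // Zero or subnormal: no implicit leading one, value = mantissa * 2^-24
        magnitude = std::ldexp(static_cast<float>(mantissa), -24);
    } else if (exponent == 31) {
        // All-ones exponent encodes infinity (mantissa == 0) or NaN
        magnitude = mantissa == 0 ? std::numeric_limits<float>::infinity()
                                  : std::numeric_limits<float>::quiet_NaN();
    } else {
        // Normal number with implicit leading one:
        // (1 + mantissa / 1024) * 2^(exponent - 15) = (1024 + mantissa) * 2^(exponent - 25)
        magnitude = std::ldexp(static_cast<float>(1024 + mantissa), exponent - 25);
    }
    return sign ? -magnitude : magnitude;
}
```

For example, the bit pattern 0x3C00 decodes to 1.0, and 0x7BFF decodes to 65504, the largest finite half-precision value.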

All standard mathematical operators are overloaded and implemented using the processor’s floating point unit after a conversion to IEEE 754 single precision. The result of the operation is then converted back to half precision.
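The practical effect of this roundtrip is that every result is rounded to the nearest half-precision value. The sketch below emulates that behavior on plain float variables; round_to_half_precision and add_via_float are hypothetical helpers written for illustration, not Enoki functions, and they assume the default round-to-nearest-even floating point mode.

```cpp
#include <algorithm>
#include <cmath>
#include <limits>

// Hypothetical helper (not part of Enoki): round a float to the nearest value
// representable in half precision, returned as a float for easy inspection.
// Assumes the default rounding mode (round to nearest, ties to even).
inline float round_to_half_precision(float x) {
    if (x == 0.0f || !std::isfinite(x))
        return x; // zeros, infinities and NaNs pass through unchanged

    int e;
    std::frexp(x, &e);     // |x| lies in [2^(e-1), 2^e)
    e = std::max(e, -13);  // below 2^-14, halves are subnormal with fixed spacing 2^-24

    // Neighboring half values at this magnitude are 2^(e-11) apart
    // (10 stored mantissa bits plus the implicit leading one)
    float scaled  = std::ldexp(x, 11 - e);
    float rounded = std::ldexp(std::nearbyint(scaled), e - 11);

    // Anything that rounds past the largest finite half (65504) becomes infinity
    if (std::fabs(rounded) > 65504.0f)
        return std::copysign(std::numeric_limits<float>::infinity(), x);
    return rounded;
}

// Emulates the operator pattern: the sum is computed in single precision,
// and the result is then rounded back to half precision.
inline float add_via_float(float a, float b) {
    return round_to_half_precision(a + b);
}
```

For instance, add_via_float(2048.0f, 1.0f) returns 2048.0f, because adjacent half-precision values near 2048 are 2 apart and the exact sum 2049 is a tie that rounds to the even neighbor.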

uint16_t value: Stores the represented half precision value as an unsigned 16-bit integer.

half(float value): Constructs a half-precision value from the given single precision argument.

operator float() const: Implicit half to float conversion operator.

static half from_binary(uint16_t value): Reinterpret a 16-bit unsigned integer as a half-precision variable.

half operator+(half h) const: Addition operator.

half &operator+=(half h): Addition compound assignment operator.

half operator-() const: Unary minus operator.

half operator*(half h) const: Multiplication operator.

half &operator*=(half h): Multiplication compound assignment operator.

half operator/(half h) const: Division operator.

half &operator/=(half h): Division compound assignment operator.

bool operator<(half h) const: Less-than comparison operator.

bool operator<=(half h) const: Less-than-or-equal comparison operator.

bool operator>(half h) const: Greater-than comparison operator.

bool operator>=(half h) const: Greater-than-or-equal comparison operator.

bool operator==(half h) const: Equality operator.

bool operator!=(half h) const: Inequality operator.

friend std::ostream &operator<<(std::ostream &os, const half &h): Stream insertion operator.
