lofi-double.Rmd
Double precision floating point values in R are stored in 64 bits: 1 sign bit, 11 exponent bits and 52 mantissa bits.
To convert to a low-fidelity representation, bits are dropped from the exponent and mantissa. Dropping mantissa bits loses precision in the stored number, and dropping exponent bits narrows the range of values that can be stored.
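
The 1/11/52 split can be inspected directly in base R. The following is a minimal sketch, not part of lofi; `double_fields()` is a hypothetical helper name introduced here for illustration.

```r
# Hypothetical helper (not a lofi function): split a double into its
# sign, exponent and mantissa fields using base R only.
double_fields <- function(x) {
  # 8 bytes, most significant byte first
  bytes <- writeBin(as.double(x), raw(), endian = "big")
  # rawToBits() lists each byte least significant bit first, so put the
  # bits of each byte in a column and flip the rows to get MSB-first order
  m    <- matrix(as.integer(rawToBits(bytes)), nrow = 8)
  bits <- as.vector(m[8:1, ])
  list(
    sign     = bits[1],
    exponent = sum(bits[2:12] * 2^(10:0)),   # stored with a bias of 1023
    mantissa = paste(bits[13:64], collapse = "")
  )
}

f <- double_fields(-1.234)
f$sign              # 1, the value is negative
f$exponent - 1023   # 0, because 1 <= 1.234 < 2
nchar(f$mantissa)   # 52
```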
Note: `lofi` has no explicit detection or support for `NA`, `NaN`, `Inf` or denormalized numbers.
Double precision floating point values are converted to low-fidelity representation by truncating the mantissa and re-encoding the exponent. Low-fidelity floats have limited range, poorer precision, and will almost never give back the exact starting value when `unpack()`ed.
The following converts a double into a 10-bit float (with a sign bit, a 2-bit exponent and a 7-bit mantissa). The reconstructed double is close to the original value, but not an exact match.
Representation | Bits | Value | Bit layout |
---|---|---|---|
Double precision | 64 | -1.234 | |
Lofi double `dbl_to_lofi(-1.234, float_bits = c(1, 2, 7))` | 10 | 669L | |
Reconstructed double `lofi_to_dbl(669L, float_bits = c(1, 2, 7))` | 64 | -1.226562 | |
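
The truncate-and-re-encode step can be sketched in base R. This is not the lofi implementation: `pack_sketch()` and `unpack_sketch()` are hypothetical names, and the exponent bias of `2^(exp_bits - 1) - 1` is an assumption chosen because it reproduces the values in the table. Like lofi, the sketch does nothing special for zero, `NA`, `NaN`, `Inf`, or out-of-range inputs.

```r
# Hypothetical helpers, not lofi functions.
# float_bits = c(sign bits, exponent bits, mantissa bits)
pack_sketch <- function(x, float_bits = c(1, 2, 7)) {
  exp_bits <- float_bits[2]
  man_bits <- float_bits[3]
  bias     <- 2^(exp_bits - 1) - 1        # assumed bias

  ax       <- abs(x)
  sign_bit <- as.integer(x < 0)
  e        <- floor(log2(ax))             # unbiased exponent
  e        <- e - (ax < 2^e)              # guard in case log2() rounded up
  frac     <- ax / 2^e - 1                # fractional part of mantissa, in [0, 1)
  man      <- floor(frac * 2^man_bits)    # truncate, do not round

  as.integer(sign_bit * 2^(exp_bits + man_bits) + (e + bias) * 2^man_bits + man)
}

unpack_sketch <- function(code, float_bits = c(1, 2, 7)) {
  exp_bits <- float_bits[2]
  man_bits <- float_bits[3]
  bias     <- 2^(exp_bits - 1) - 1        # same assumed bias

  man      <- code %% 2^man_bits
  exp_code <- (code %/% 2^man_bits) %% 2^exp_bits
  sign_bit <- code %/% 2^(exp_bits + man_bits)

  (1 - 2 * sign_bit) * (1 + man / 2^man_bits) * 2^(exp_code - bias)
}

pack_sketch(-1.234)    # 669L, i.e. bits 1 | 01 | 0011101
unpack_sketch(669L)    # -1.2265625
```

The mantissa keeps only `floor(0.234 * 128) = 29` of the original fraction, which is why the round trip returns -1.2265625 rather than -1.234.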
A common 8-bit float (as described here) has 1 bit for the sign, 3 bits for the exponent and 4 bits for the mantissa.
This 8-bit float has a range of [-15.5, 15.5] and pretty terrible precision: with only 4 mantissa bits, adjacent representable values between 8 and 16 are 0.5 apart.
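
The stated range can be checked with a little arithmetic. The sketch below assumes an exponent bias of 3 and assumes the highest exponent code (7) is not used for finite values; both are inferences from the stated range and the example output, not documented lofi behaviour.

```r
exp_bits <- 3
man_bits <- 4

bias         <- 2^(exp_bits - 1) - 1   # 3 (assumed)
max_exp_code <- 2^exp_bits - 2         # 6 (assumes code 7 is unused)

# Largest mantissa is 1.1111 in binary, i.e. 2 - 2^-4 = 1.9375
largest <- (2 - 2^-man_bits) * 2^(max_exp_code - bias)
largest                                # 15.5

# Gap between neighbouring representable values in each interval [from, to)
e     <- 0:3
steps <- data.frame(from = 2^e, to = 2^(e + 1), step = 2^(e - man_bits))
steps
```

The `step` column doubles with each interval, from 0.0625 in [1, 2) up to 0.5 in [8, 16).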
```r
# original doubles
#> [1] -7.035 -3.836 2.186 12.246 -8.950 11.952 13.340 4.824
#> [9] 3.873 -13.146
# packed as 8-bit lofi integers
#> [1] 220 206 65 104 225 103 106 83 78 234
# reconstructed doubles
#> [1] -7.000 -3.750 2.125 12.000 -8.500 11.500 13.000 4.750
#> [9] 3.750 -13.000
# absolute error
#> [1] 0.035 0.086 0.061 0.246 0.450 0.452 0.340 0.074 0.123 0.146
```
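
The printed integers can be decoded by hand to confirm the reconstructed values and errors. This base R sketch copies the values shown above (the inputs only to the 3 decimals printed) and assumes a 1/3/4 bit layout with an exponent bias of 3; it does not call lofi.

```r
# Values copied from the output above
x     <- c(-7.035, -3.836, 2.186, 12.246, -8.950,
           11.952, 13.340, 4.824, 3.873, -13.146)
codes <- c(220, 206, 65, 104, 225, 103, 106, 83, 78, 234)

# Split each 8-bit code into sign (1 bit), exponent (3 bits), mantissa (4 bits)
man      <- codes %% 16
exp_code <- (codes %/% 16) %% 8
sign_bit <- codes %/% 128

# Assumed bias of 3 for the 3-bit exponent
decoded <- (1 - 2 * sign_bit) * (1 + man / 16) * 2^(exp_code - 3)
decoded

err  <- abs(x - decoded)
step <- 2^(exp_code - 3) / 16   # gap between neighbouring 8-bit values near each input
round(err, 3)
all(err < step)                 # TRUE: truncation loses less than one step
```

The largest errors belong to the inputs between 8 and 16, where the step is 0.5.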