library(lofi)

Introduction

Double precision floating point values in R are stored in 64 bits: 1 sign bit, 11 exponent bits and 52 mantissa bits.
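
The 1/11/52 split can be inspected directly in base R. The following is a minimal sketch that does not use lofi; the helper name double_bits() is hypothetical and only meant for illustration.

```r
# Split a double into its sign, exponent and mantissa bits (base R only).
# double_bits() is a hypothetical helper, not part of lofi.
double_bits <- function(x) {
  # Big-endian byte order puts the byte holding the sign bit first.
  bytes <- writeBin(x, raw(), size = 8, endian = "big")
  # rawToBits() returns each byte least-significant bit first,
  # so lay the bits out one byte per column and flip every column.
  m    <- matrix(as.integer(rawToBits(bytes)), nrow = 8)
  bits <- as.vector(m[8:1, ])
  list(
    sign     = bits[1],      #  1 bit
    exponent = bits[2:12],   # 11 bits
    mantissa = bits[13:64]   # 52 bits
  )
}

parts <- double_bits(-1.234)
parts$sign                        # 1, because the value is negative
sum(parts$exponent * 2^(10:0))    # biased exponent: 1023, i.e. 2^0
length(parts$mantissa)            # 52
```

The biased exponent is 1023 because 1 <= 1.234 < 2, so the unbiased exponent is 0 and the standard double precision bias of 1023 is added to it.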

To convert to a low-fidelity representation, bits are dropped from the exponent and mantissa. Dropping mantissa bits reduces the precision of the stored number, and dropping exponent bits reduces the range of values that can be stored.

Note: lofi has no explicit detection or support for NA, NaN, Inf or denormalized numbers.

Double to 10-bit Lofi

Double precision floating point values are converted to a low-fidelity representation by truncating the mantissa and re-encoding the exponent. Low-fidelity floats have a limited range and poorer precision, and will almost never give back the exact starting value when unpack()ed.
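
The truncate-and-re-encode step can be sketched with plain arithmetic. This is an illustration, not lofi's implementation: sketch_pack() is a hypothetical name, and the exponent bias of 2^(exp_bits - 1) - 1 is an assumption inferred from the values shown in this document. Like lofi, it makes no attempt to handle zero, NA, NaN, Inf or out-of-range inputs.

```r
# Hypothetical sketch of packing a double into a small float.
# Assumption: the exponent bias is 2^(exp_bits - 1) - 1.
sketch_pack <- function(x, exp_bits, man_bits) {
  bias <- 2^(exp_bits - 1) - 1
  sign <- as.integer(x < 0)
  e    <- floor(log2(abs(x)))          # unbiased exponent
  frac <- abs(x) / 2^e - 1             # fractional part of the mantissa, in [0, 1)
  man  <- floor(frac * 2^man_bits)     # truncate: keep only the top man_bits bits
  # Assemble sign | exponent | mantissa into a single integer.
  as.integer(sign * 2^(exp_bits + man_bits) + (e + bias) * 2^man_bits + man)
}

sketch_pack(-1.234, exp_bits = 2, man_bits = 7)
#> [1] 669
```

Under that assumed bias the result agrees with the 669L reported for dbl_to_lofi(-1.234, float_bits = c(1, 2, 7)) in this section.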

The following converts a double into a 10-bit float (1 sign bit, a 2-bit exponent and a 7-bit mantissa). The reconstructed double is close to the original value, but not an exact match.

Representation        Call                                           Bits  Value
Double precision                                                     64    -1.234
Lofi double           dbl_to_lofi(-1.234, float_bits = c(1, 2, 7))   10    669L
Reconstructed double  lofi_to_dbl(669L, float_bits = c(1, 2, 7))     64    -1.226562
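
To see why 669L decodes to roughly -1.2266, the 10 bits can be pulled apart by hand. The sketch below uses base R bitwise helpers; sketch_unpack() is a hypothetical name, and the exponent bias of 2^(exp_bits - 1) - 1 is an assumption that happens to reproduce the values shown here, not a statement about lofi's internals.

```r
# Hypothetical sketch of unpacking a small float back into a double.
# Assumption: the exponent bias is 2^(exp_bits - 1) - 1.
sketch_unpack <- function(v, exp_bits, man_bits) {
  v    <- as.integer(v)
  bias <- 2^(exp_bits - 1) - 1
  # Mask off each field: sign | exponent | mantissa.
  man  <- bitwAnd(v, as.integer(2^man_bits - 1))
  e    <- bitwAnd(bitwShiftR(v, as.integer(man_bits)), as.integer(2^exp_bits - 1))
  sign <- bitwShiftR(v, as.integer(exp_bits + man_bits))
  # Restore the implicit leading 1 and scale by the unbiased exponent.
  ifelse(sign == 1, -1, 1) * (1 + man / 2^man_bits) * 2^(e - bias)
}

# 669 is 1 01 0011101 in binary: sign = 1, exponent = 1, mantissa = 29.
sketch_unpack(669L, exp_bits = 2, man_bits = 7)
#> [1] -1.226562
```

The exact result is -(1 + 29/128) = -1.2265625, which R prints to seven significant digits by default.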

Double to 8-bit Lofi

A common 8-bit float (as described here) has 1 bit for the sign, 3 bits for the exponent and 4 bits for the mantissa.

This 8-bit float has a range of [-15.5, 15.5] and pretty terrible precision: with only 4 mantissa bits, adjacent representable values between 8 and 15.5 are 0.5 apart.

(original       <- round(runif(10, min = -15, max = 15), 3))
#>  [1]  -7.035  -3.836   2.186  12.246  -8.950  11.952  13.340   4.824
#>  [9]   3.873 -13.146
(lofi_values   <- dbl_to_lofi(original, float_bits = c(1, 3, 4)))
#>  [1] 220 206  65 104 225 103 106  83  78 234
(reconstructed <- lofi_to_dbl(lofi_values, float_bits = c(1, 3, 4)))
#>  [1]  -7.000  -3.750   2.125  12.000  -8.500  11.500  13.000   4.750
#>  [9]   3.750 -13.000
abs(reconstructed - original)
#>  [1] 0.035 0.086 0.061 0.246 0.450 0.452 0.340 0.074 0.123 0.146
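
The size of these errors follows from the 4-bit mantissa: a value with unbiased exponent e sits on a grid with spacing 2^(e - 4), and truncation can lose at most one grid step. The check below is a small base R sketch that hard-codes the values printed above (it overwrites original and reconstructed if run in the same session); the variable names spacing and err are illustrative only.

```r
# Values copied from the output above.
original      <- c(-7.035, -3.836, 2.186, 12.246, -8.950,
                   11.952, 13.340, 4.824, 3.873, -13.146)
reconstructed <- c(-7.000, -3.750, 2.125, 12.000, -8.500,
                   11.500, 13.000, 4.750, 3.750, -13.000)

man_bits <- 4
# Gap between adjacent representable values around each original value.
spacing <- 2^(floor(log2(abs(original))) - man_bits)
err     <- abs(reconstructed - original)

all(err < spacing)                         # every error is under one grid step
#> [1] TRUE
all(abs(reconstructed) <= abs(original))   # truncation rounds toward zero
#> [1] TRUE
all(err / abs(original) < 2^-man_bits)     # relative error stays below 1/16
#> [1] TRUE
```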