Pack/unpack values as low-fidelity representations into a single 32-bit integer. These functions require a packing spec (`pack_spec`) which defines how values are converted to and from their low-fidelity representations.
The packing specification (`pack_spec`) is a named list detailing how values should be converted to their low-fidelity representations. The names of the entries in `pack_spec` correspond to the names in the `values` argument to `pack()`.
The following are valid for packing:
Integers are packed by truncating leading bits that aren't needed, e.g. the number 8 needs only 4 bits to represent it, and the other 28 leading bits can be ignored.
Packing an integer in this way is lossless: the value reconstructed using `unpack()` will be identical to the original value.
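The bit truncation described above can be sketched in base R. This is only an illustration of the idea, not the package's implementation; the helper `keep_low_bits()` and the 4-bit and 3-bit field widths are hypothetical.

```r
# Hypothetical helper: keep only the lowest `nbits` bits of a
# non-negative integer by masking off the leading bits.
keep_low_bits <- function(value, nbits) {
  mask <- as.integer(2^nbits - 1)      # nbits = 4 gives binary 1111
  bitwAnd(as.integer(value), mask)
}

# 8 is binary 1000, so 4 bits are enough and nothing is lost.
stored <- keep_low_bits(8L, nbits = 4L)

# Two small fields can then share one integer: shift one, OR in the other.
a <- keep_low_bits(8L, 4L)
b <- keep_low_bits(5L, 3L)
word <- bitwOr(bitwShiftL(a, 3L), b)

# Recover both fields by shifting and masking again.
a_back <- bitwShiftR(word, 3L)
b_back <- keep_low_bits(word, 3L)
```

Because each field keeps every bit it actually uses, both values come back unchanged, which is what makes integer packing lossless.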
* `nbits` - total number of bits to use
* `signed` - keep a sign bit? Default: FALSE
* `mult` - pre-scale the value when packing, and undo the scaling when unpacking, i.e. `(value + offset) * mult`. Default: 1
* `offset` - offset the value when packing, and undo the offset when unpacking, i.e. `(value + offset) * mult`. Default: 0

Doubles are packed by truncating the mantissa and re-encoding the exponent. This will almost certainly lead to a loss of precision, and the reconstructed value will usually not be identical to the original.
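To see why truncating the mantissa loses precision, here is a minimal sketch in base R. It is not the package's code: `truncate_mantissa()` is a hypothetical helper that works arithmetically on positive, finite doubles and ignores the limits of the exponent range. The only outside fact it relies on is that bfloat16 keeps 7 explicit mantissa bits.

```r
# Hypothetical helper: keep only `mantissa_bits` explicit mantissa bits
# of a positive, finite double, discarding (not rounding) the rest.
truncate_mantissa <- function(x, mantissa_bits) {
  exponent <- floor(log2(x))           # largest power of two not above x
  mantissa <- x / 2^exponent           # lies in [1, 2)
  scale    <- 2^mantissa_bits
  floor(mantissa * scale) / scale * 2^exponent
}

# bfloat16 keeps 7 explicit mantissa bits.
approx <- truncate_mantissa(1.234e21, mantissa_bits = 7)

# Ratio to the original value: slightly below 1, so precision was lost.
approx / 1.234e21
```

Under these assumptions the result is about 1.226708e+21, which agrees with the value of `x` shown in the `unpack()` output in the example in this document.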
* `float_name` - name of the floating point representation to use, e.g. `'bfloat16'`.
* `nbits` - total number of bits to use. If specified, this takes precedence over the `float_name` value.
* `maxval` - only used if `nbits` is specified. Used to calculate the total number of bits in the exponent.
* `signed` - keep a sign bit? Only used if `nbits` is specified. Default: FALSE
* `float_bits` - [advanced] A 3-element numeric vector giving the number of bits to assign to the sign, exponent and mantissa, respectively. If given, `float_bits` takes precedence over both `nbits` and `float_name`.

Logical values only require a single bit, but more bits can be specified if desired.
* `nbits` - total number of bits to use. Optional. Default: 1

A choice is very similar to a factor, but the labels are only stored in the specification, and the index is 0-based (instead of 1-based).
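The 0-based indexing can be sketched in base R as follows. The helpers `choice_to_index()` and `index_to_choice()` are hypothetical names used only for illustration, and the bit count at the end is one plausible way to size the field, not necessarily how the package computes it.

```r
grade_options <- c("A", "B", "C", "D")

# Hypothetical helpers: only the 0-based position is stored; the labels
# themselves live in the specification.
choice_to_index <- function(value, options) match(value, options) - 1L
index_to_choice <- function(index, options) options[index + 1L]

idx   <- choice_to_index("B", grade_options)    # position of "B", 0-based
label <- index_to_choice(idx, grade_options)    # back to the label

# Bits needed to distinguish all options (assumed sizing rule).
bits_needed <- ceiling(log2(length(grade_options)))
```

With four options the index runs from 0 to 3, so two bits are enough.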
* `options` - vector of options to match values against. Only the index of the value in this vector is stored.
* `nbits` - total number of bits to use. Optional. If not given, it is calculated as the number of bits necessary to store all possible options.

Specifying the storage of a scaled value is sometimes easier than trying to work out how to correctly store a double precision floating point value.
* `nbits` - number of bits to use
* `min` - minimum value to be stored. Default: 0
* `max` - maximum value to be stored in the given bits. Every stored value is scaled by `(2^nbits - 1)/max` when `pack()`ed, and unscaled when `unpack()`ed.

Packing a colour is achieved by truncating the bits for each of the R, G and B channels separately.
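As a rough illustration of per-channel truncation, the sketch below drops the low bits of each channel in base R. It assumes 8-bit input channels and a 5/6/5 bit split, neither of which comes from this document, and it does not show how the package itself accepts or reconstructs colours.

```r
# Assumed bit budget per channel (R, G, B); hypothetical values.
rgb_bits <- c(5L, 6L, 5L)

# Split a colour into its 8-bit channels: 255, 128, 64.
channels <- as.integer(grDevices::col2rgb("#FF8040"))

# Truncate each channel by shifting away its low-order bits.
truncated <- bitwShiftR(channels, 8L - rgb_bits)

# One simple way to reconstruct: shift back, leaving the lost bits as zero.
restored <- bitwShiftL(truncated, 8L - rgb_bits)
```

Here the red channel comes back as 248 rather than 255, while green and blue happen to survive exactly because their low bits were already zero.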
* `nbits` - number of bits to use
* `rgb_bits` - [advanced] A 3-element numeric vector giving the number of bits to assign to the R, G and B channels, respectively. If given, `rgb_bits` takes precedence over `nbits`.

This packing specification allows you to specify a function to convert a value into a low-fidelity representation, and a matching function to reconstruct the original value.
* `nbits` - number of bits to use
* `pack_func` - a function. Alternatively, can specify a formula which will be converted internally to a function taking a single argument, `.x`.
* `unpack_func` - a function. Alternatively, can specify a formula which will be converted internally to a function taking a single argument, `.x`.

# Specify how to convert values to low-fidelity representation
pack_spec <- list(
x = list(type = 'double', float_name = 'bfloat16'),
valid = list(type = 'logical'),
stars = list(type = 'integer', mult = 10, nbits = 6),
alpha = list(type = 'scaled', max = 1, nbits = 5),
grade = list(type = 'choice', options = c('A', 'B', 'C', 'D'))
)
# Assemble the values
values <- list(
x = 1.234e21,
valid = TRUE,
stars = 4.5, # Star rating 0-5. 1 decimal place.
alpha = 0.8, # alpha. range [0, 1]
grade = 'B'
)
# pack them into a single integer
packed_values <- pack(values, pack_spec)
# Reconstruct the initial values from the packed values
unpack(packed_values, pack_spec)
#> $x
#> [1] 1.226708e+21
#>
#> $valid
#> [1] TRUE
#>
#> $stars
#> [1] 4.5
#>
#> $alpha
#> [1] 0.8064516
#>
#> $grade
#> [1] "B"
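The `stars` and `alpha` results above can be checked by hand using the formulas given for the integer and scaled types. This is a sketch of the arithmetic only; rounding the scaled value to the nearest whole number is an assumption, chosen because it reproduces the output shown.

```r
# stars: integer type with mult = 10, offset = 0, nbits = 6
stars_code <- (4.5 + 0) * 10         # 45, which fits in 6 bits (max 63)
stars_back <- stars_code / 10 - 0    # undo the scaling and the offset

# alpha: scaled type with min = 0, max = 1, nbits = 5
scale_factor <- (2^5 - 1) / 1               # 31
alpha_code   <- round(0.8 * scale_factor)   # 24.8 -> 25 (assumed rounding)
alpha_back   <- alpha_code / scale_factor   # 25 / 31
```

`stars` survives exactly because 4.5 becomes a whole number after multiplying by 10, whereas `alpha` lands on the nearest of the 32 available levels and comes back as roughly 0.8065.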
In the following example, the same distance value is encoded in two different ways.
The first method simply encodes the value as a 16-bit floating point value known as a bfloat16.
In the second method, the `pack_func` is used to take the log of the value, multiply it by 100 and then store this as a 16-bit integer. The `unpack_func` defines the reverse.
pack_spec <- list(
distance1 = list(type = 'double', float_name = 'bfloat16'),
distance2 = list(type = 'custom', nbits = 16,
pack_func = ~100 * log(.x), unpack_func = ~exp(.x/100))
)
values <- list(
distance1 = 1.234e21,
distance2 = 1.234e21
)
packed_values <- pack(values, pack_spec)
unpack(packed_values, pack_spec)
#> $distance1
#> [1] 1.226708e+21
#>
#> $distance2
#> [1] 1.228401e+21
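The `distance2` result can be reproduced with plain arithmetic. The sketch below assumes the transformed value is reduced to a whole number before storage; rounding is used here, and for this input rounding and truncation give the same stored code.

```r
value <- 1.234e21

# pack_func: 100 * log(.x), then keep a whole number (assumed rounding).
code <- round(100 * log(value))      # fits easily in 16 bits

# unpack_func: exp(.x / 100)
restored <- exp(code / 100)

# Relative size of the reconstructed value compared with the original.
restored / value
```

The stored code is 4856 and the reconstructed value is about 1.228401e+21, matching the output above. Storing the logarithm trades absolute precision for a roughly constant relative error across a very wide range of magnitudes.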