library(tidyverse)
library(bench)
library(Rcpp)
Problem
I have a sequence of values in R and I want to reverse the bits in each value.
Problem dimensions:
- Only considering raw values
- Up to ~10^5 raw values to reverse
Bit reversal - overview
For a vector of raw bytes I want:
- the output of rawToBits(), but with each byte bit-reversed, i.e. most significant bit first
- with the bits of all bytes concatenated together into a single vector
That is, each byte within a vector of values is unmoved, but each byte has its bits reversed.
So instead of
rawToBits(as.raw(c(15, 1)))
[1] 01 01 01 01 00 00 00 00 01 00 00 00 00 00 00 00
… I want:
00 00 00 00 01 01 01 01 00 00 00 00 00 00 00 01
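One way to get this MSB-first layout in base R is to reorder the output of rawToBits() in groups of eight. This is only a minimal sketch: msb_first_bits is a hypothetical helper name, not part of the original code.

```r
# Hypothetical helper: rawToBits() output with each byte's bits MSB-first
msb_first_bits <- function(x) {
  stopifnot(is.raw(x))
  bits <- rawToBits(x)  # 8 entries per byte, least significant bit first
  # Byte k (0-based) occupies positions 8k+1 .. 8k+8; read them back as 8k+8 .. 8k+1
  idx <- rep(8L * (seq_along(x) - 1L), each = 8L) + 8:1
  bits[idx]
}

msb_first_bits(as.raw(c(15, 1)))
#  [1] 00 00 00 00 01 01 01 01 00 00 00 00 00 00 00 01
```

Because packBits() treats the first bit of each group as the least significant, calling packBits(msb_first_bits(x), type = "raw") should give the bit-reversed bytes themselves (f0 80 for this input).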
Lookup table in pure R
- Using a method from the bithacks page
- There are only 256 possible raw values, so just define the bit-reversed version of each and look it up
- Adapted table to pure R
- Minor issues:
  - Can’t use a raw value as an index, so need to convert it to integer first
  - R vectors are indexed from 1, so add 1 to the index; otherwise a zero byte would index nothing
reverse_bit_lookup <- as.raw(
c(0x00, 0x80, 0x40, 0xc0, 0x20, 0xa0, 0x60, 0xe0,
0x10, 0x90, 0x50, 0xd0, 0x30, 0xb0, 0x70, 0xf0, 0x08, 0x88, 0x48,
0xc8, 0x28, 0xa8, 0x68, 0xe8, 0x18, 0x98, 0x58, 0xd8, 0x38, 0xb8,
0x78, 0xf8, 0x04, 0x84, 0x44, 0xc4, 0x24, 0xa4, 0x64, 0xe4, 0x14,
0x94, 0x54, 0xd4, 0x34, 0xb4, 0x74, 0xf4, 0x0c, 0x8c, 0x4c, 0xcc,
0x2c, 0xac, 0x6c, 0xec, 0x1c, 0x9c, 0x5c, 0xdc, 0x3c, 0xbc, 0x7c,
0xfc, 0x02, 0x82, 0x42, 0xc2, 0x22, 0xa2, 0x62, 0xe2, 0x12, 0x92,
0x52, 0xd2, 0x32, 0xb2, 0x72, 0xf2, 0x0a, 0x8a, 0x4a, 0xca, 0x2a,
0xaa, 0x6a, 0xea, 0x1a, 0x9a, 0x5a, 0xda, 0x3a, 0xba, 0x7a, 0xfa,
0x06, 0x86, 0x46, 0xc6, 0x26, 0xa6, 0x66, 0xe6, 0x16, 0x96, 0x56,
0xd6, 0x36, 0xb6, 0x76, 0xf6, 0x0e, 0x8e, 0x4e, 0xce, 0x2e, 0xae,
0x6e, 0xee, 0x1e, 0x9e, 0x5e, 0xde, 0x3e, 0xbe, 0x7e, 0xfe, 0x01,
0x81, 0x41, 0xc1, 0x21, 0xa1, 0x61, 0xe1, 0x11, 0x91, 0x51, 0xd1,
0x31, 0xb1, 0x71, 0xf1, 0x09, 0x89, 0x49, 0xc9, 0x29, 0xa9, 0x69,
0xe9, 0x19, 0x99, 0x59, 0xd9, 0x39, 0xb9, 0x79, 0xf9, 0x05, 0x85,
0x45, 0xc5, 0x25, 0xa5, 0x65, 0xe5, 0x15, 0x95, 0x55, 0xd5, 0x35,
0xb5, 0x75, 0xf5, 0x0d, 0x8d, 0x4d, 0xcd, 0x2d, 0xad, 0x6d, 0xed,
0x1d, 0x9d, 0x5d, 0xdd, 0x3d, 0xbd, 0x7d, 0xfd, 0x03, 0x83, 0x43,
0xc3, 0x23, 0xa3, 0x63, 0xe3, 0x13, 0x93, 0x53, 0xd3, 0x33, 0xb3,
0x73, 0xf3, 0x0b, 0x8b, 0x4b, 0xcb, 0x2b, 0xab, 0x6b, 0xeb, 0x1b,
0x9b, 0x5b, 0xdb, 0x3b, 0xbb, 0x7b, 0xfb, 0x07, 0x87, 0x47, 0xc7,
0x27, 0xa7, 0x67, 0xe7, 0x17, 0x97, 0x57, 0xd7, 0x37, 0xb7, 0x77,
0xf7, 0x0f, 0x8f, 0x4f, 0xcf, 0x2f, 0xaf, 0x6f, 0xef, 0x1f, 0x9f,
0x5f, 0xdf, 0x3f, 0xbf, 0x7f, 0xff))
raw_vec <- as.raw(c(15, 1))
reverse_bit_lookup[as.integer(raw_vec) + 1L]
[1] f0 80
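The 256 entries do not have to be typed in by hand. As a rough sketch, the same table can be built with base R bitwise helpers. The names reverse_byte and generated_lookup are hypothetical and only used for illustration here.

```r
# Hypothetical helper: reverse the 8 bits of each integer in 0..255
reverse_byte <- function(v) {
  v <- as.integer(v)
  out <- numeric(length(v))
  for (bit in 0:7) {
    # Take bit `bit` of the input and place it at position 7 - bit of the output
    out <- out + bitwAnd(bitwShiftR(v, bit), 1L) * 2^(7 - bit)
  }
  out
}

# Entry i + 1 holds the bit-reversed form of byte value i
generated_lookup <- as.raw(reverse_byte(0:255))

generated_lookup[as.integer(as.raw(c(15, 1))) + 1L]
# [1] f0 80
```

If the literal table above was transcribed correctly, identical(generated_lookup, reverse_bit_lookup) should return TRUE, which makes this a cheap sanity check on the hand-written values.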
Lookup table in Rcpp
- Using a method from the bithacks page
- Use Rcpp
- This is the same method as the lookup table in R, just with the speed of Rcpp
Rcpp::cppFunction(code='
RawVector reverse_lookup_Rcpp(RawVector x) {
  unsigned int N = x.size();
  RawVector reversed(N);

  // 256-entry table of bit-reversed bytes, generated by the macros below
  static const unsigned char BitReverseTable256[256] = {
#   define R2(n) n, n + 2*64, n + 1*64, n + 3*64
#   define R4(n) R2(n), R2(n + 2*16), R2(n + 1*16), R2(n + 3*16)
#   define R6(n) R4(n), R4(n + 2*4 ), R4(n + 1*4 ), R4(n + 3*4 )
    R6(0), R6(2), R6(1), R6(3)
  };

  // Replace each byte with its table entry
  for (unsigned int i = 0; i < N; ++i) {
    reversed[i] = BitReverseTable256[x[i]];
  }

  return reversed;
}')
reverse_lookup_Rcpp(raw_vec)
[1] f0 80
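The R2/R4/R6 macros are a compact way of generating the same 256 values at compile time. To see what they expand to, here is a sketch that mirrors them as plain R functions. The function names copy the macro names, and macro_table is a hypothetical variable used only for this illustration.

```r
# R translations of the C preprocessor macros used in the table definition
R2 <- function(n) c(n, n + 2 * 64, n + 1 * 64, n + 3 * 64)
R4 <- function(n) c(R2(n), R2(n + 2 * 16), R2(n + 1 * 16), R2(n + 3 * 16))
R6 <- function(n) c(R4(n), R4(n + 2 * 4), R4(n + 1 * 4), R4(n + 3 * 4))

# Same top-level expansion as in the C++ initialiser
macro_table <- c(R6(0), R6(2), R6(1), R6(3))

length(macro_table)
# [1] 256
as.raw(macro_table[1:8])
# [1] 00 80 40 c0 20 a0 60 e0
```

The first eight values match the start of the hand-written R table. Each nesting level fills in two more bits of the reversed value, which is why the arguments appear in the order 0, 2, 1, 3 rather than 0, 1, 2, 3.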
64bit bit-twiddling in Rcpp
- Using a method from the bithacks page
- Uses 64bit operations to reverse the bits in a byte.
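Rcpp::cppFunction(code='
RawVector reverse_twiddle64_Rcpp(RawVector x) {
  unsigned int N = x.size();
  RawVector reversed(N);

  for (unsigned int i = 0; i < N; ++i) {
    // Multiply to copy the byte, mask one bit per position, multiply to gather,
    // then shift down; the assignment keeps only the low 8 bits
    reversed[i] = ((x[i] * 0x80200802ULL) & 0x0884422110ULL) * 0x0101010101ULL >> 32;
  }

  return reversed;
}')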
reverse_twiddle64_Rcpp(raw_vec)
[1] f0 80
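The constants look like magic, so here is my reading of why they work, as a sketch in plain R rather than anything authoritative. The first multiply places copies of the byte at bit offsets 1, 11, 21 and 31. The mask then keeps eight bits, one per source bit, each already sitting at the mirrored position within its own byte. The final multiply and shift add those bytes together. All names below (copy_offsets, mask_bits, emulate_twiddle) are hypothetical and only for illustration.

```r
# Set-bit positions of the two constants (0 = least significant bit)
copy_offsets <- c(1L, 11L, 21L, 31L)                      # 0x80200802
mask_bits    <- c(4L, 8L, 13L, 17L, 22L, 26L, 31L, 35L)   # 0x0884422110

# For each masked bit: the copy it falls in, the source bit it carries,
# and its position inside its own byte
copy_of    <- sapply(mask_bits, function(p) max(copy_offsets[copy_offsets <= p]))
source_bit <- mask_bits - copy_of
dest_bit   <- mask_bits %% 8L

# Every source bit s ends up at in-byte position 7 - s
all(source_bit + dest_bit == 7L)
# [1] TRUE

# Hypothetical helper: emulate the expression one selected bit at a time
emulate_twiddle <- function(x) {
  v <- as.integer(x)
  out <- numeric(length(v))
  for (k in seq_along(mask_bits)) {
    out <- out + bitwAnd(bitwShiftR(v, source_bit[k]), 1L) * 2^dest_bit[k]
  }
  as.raw(out)
}

emulate_twiddle(as.raw(c(15, 1)))
# [1] f0 80
```

The emulation avoids real 64-bit arithmetic because base R has no native 64-bit integer type, so it illustrates the idea only and says nothing about speed.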
Benchmark using bench::mark()
raw_vec <- as.raw(1:10000 %% 256)
bm1 <- bench::mark(
reverse_bit_lookup[as.integer(raw_vec) + 1L],
reverse_lookup_Rcpp(raw_vec),
reverse_twiddle64_Rcpp(raw_vec)
)
expression | median | itr/sec |
---|---|---|
reverse_bit_lookup[as.integer(raw_vec) + 1] | 80.2µs | 11941.8 |
reverse_lookup_Rcpp(raw_vec) | 14.1µs | 73295.6 |
reverse_twiddle64_Rcpp(raw_vec) | 11.4µs | 78641.2 |
Summary
- The Rcpp methods are roughly 6-7x faster than the pure R lookup (median 14.1µs and 11.4µs vs 80.2µs)
- Thanks to hadley for the “+1L” correction.
Future
- Idiomatic Rcpp patches welcomed
- Only if they give a speed boost ;)
- I know that I write my C++ code like it’s just C