mikefc
library(tidyverse)
library(bench)
library(Rcpp)

Problem

I have a sequence of values in R and I want to reverse the bits in each value.

Problem dimensions:

  • Only considering raw values
  • Up to ~10^5 raw values to reverse

Bit reversal - overview

For a vector of raw bytes I want:

  • the output of rawToBits(), but with the bits of each byte reversed, i.e. most significant bit first
  • with the bits for all bytes concatenated into a single vector

That is, each byte keeps its position in the vector; only the order of the bits within each byte is reversed.

So instead of

rawToBits(as.raw(c(15, 1)))
 [1] 01 01 01 01 00 00 00 00 01 00 00 00 00 00 00 00

… I want:

00 00 00 00 01 01 01 01  00 00 00 00 00 00 00 01
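
To pin down the target before looking at faster methods, here is a slow reference sketch in base R. It assumes only that rawToBits() emits 8 bits per byte, least significant bit first; the names x, bit_matrix and msb_first are illustrative and not part of the methods below.

```r
x <- as.raw(c(15, 1))

# rawToBits() emits 8 bits per byte, least significant bit first.
# Laying them out as an 8-row matrix puts one byte in each column.
bit_matrix <- matrix(rawToBits(x), nrow = 8)

# Flip the row order to get most significant bit first within each byte,
# then flatten column by column so the bytes stay in their original order.
msb_first <- as.vector(bit_matrix[8:1, , drop = FALSE])
msb_first
```

This works on the expanded bits (8 raw values per input byte), so it is only meant as a correctness check, not as a fast implementation.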

Lookup table in pure R

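  • Using the lookup-table method from the bithacks page
  • There are only 256 possible raw values, so just define the bit-reversed version of each one and look it up
  • Adapted the table to pure R
  • The result is a vector of bit-reversed bytes; calling rawToBits() on it gives the most-significant-bit-first bits shown above
  • Minor issues:
    • Can’t use a raw value as an index, so need to convert it to integer first
    • R vectors are indexed from 1, but byte values start at 0, so add 1 to the index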
reverse_bit_lookup <- as.raw(
  c(0x00, 0x80, 0x40, 0xc0, 0x20, 0xa0, 0x60, 0xe0,
    0x10, 0x90, 0x50, 0xd0, 0x30, 0xb0, 0x70, 0xf0, 0x08, 0x88, 0x48,
    0xc8, 0x28, 0xa8, 0x68, 0xe8, 0x18, 0x98, 0x58, 0xd8, 0x38, 0xb8,
    0x78, 0xf8, 0x04, 0x84, 0x44, 0xc4, 0x24, 0xa4, 0x64, 0xe4, 0x14,
    0x94, 0x54, 0xd4, 0x34, 0xb4, 0x74, 0xf4, 0x0c, 0x8c, 0x4c, 0xcc,
    0x2c, 0xac, 0x6c, 0xec, 0x1c, 0x9c, 0x5c, 0xdc, 0x3c, 0xbc, 0x7c,
    0xfc, 0x02, 0x82, 0x42, 0xc2, 0x22, 0xa2, 0x62, 0xe2, 0x12, 0x92,
    0x52, 0xd2, 0x32, 0xb2, 0x72, 0xf2, 0x0a, 0x8a, 0x4a, 0xca, 0x2a,
    0xaa, 0x6a, 0xea, 0x1a, 0x9a, 0x5a, 0xda, 0x3a, 0xba, 0x7a, 0xfa,
    0x06, 0x86, 0x46, 0xc6, 0x26, 0xa6, 0x66, 0xe6, 0x16, 0x96, 0x56,
    0xd6, 0x36, 0xb6, 0x76, 0xf6, 0x0e, 0x8e, 0x4e, 0xce, 0x2e, 0xae,
    0x6e, 0xee, 0x1e, 0x9e, 0x5e, 0xde, 0x3e, 0xbe, 0x7e, 0xfe, 0x01,
    0x81, 0x41, 0xc1, 0x21, 0xa1, 0x61, 0xe1, 0x11, 0x91, 0x51, 0xd1,
    0x31, 0xb1, 0x71, 0xf1, 0x09, 0x89, 0x49, 0xc9, 0x29, 0xa9, 0x69,
    0xe9, 0x19, 0x99, 0x59, 0xd9, 0x39, 0xb9, 0x79, 0xf9, 0x05, 0x85,
    0x45, 0xc5, 0x25, 0xa5, 0x65, 0xe5, 0x15, 0x95, 0x55, 0xd5, 0x35,
    0xb5, 0x75, 0xf5, 0x0d, 0x8d, 0x4d, 0xcd, 0x2d, 0xad, 0x6d, 0xed,
    0x1d, 0x9d, 0x5d, 0xdd, 0x3d, 0xbd, 0x7d, 0xfd, 0x03, 0x83, 0x43,
    0xc3, 0x23, 0xa3, 0x63, 0xe3, 0x13, 0x93, 0x53, 0xd3, 0x33, 0xb3,
    0x73, 0xf3, 0x0b, 0x8b, 0x4b, 0xcb, 0x2b, 0xab, 0x6b, 0xeb, 0x1b,
    0x9b, 0x5b, 0xdb, 0x3b, 0xbb, 0x7b, 0xfb, 0x07, 0x87, 0x47, 0xc7,
    0x27, 0xa7, 0x67, 0xe7, 0x17, 0x97, 0x57, 0xd7, 0x37, 0xb7, 0x77,
    0xf7, 0x0f, 0x8f, 0x4f, 0xcf, 0x2f, 0xaf, 0x6f, 0xef, 0x1f, 0x9f,
    0x5f, 0xdf, 0x3f, 0xbf, 0x7f, 0xff))


raw_vec <- as.raw(c(15, 1))
reverse_bit_lookup[as.integer(raw_vec) + 1L]
[1] f0 80
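
The 256-entry table does not have to be typed in by hand. As a sketch, it can also be generated in base R with rawToBits() and packBits(); the names all_bits, generated_lookup and example_bytes are illustrative only.

```r
# Bits of every possible byte value, one byte per column, LSB first.
all_bits <- matrix(rawToBits(as.raw(0:255)), nrow = 8)

# Reverse the bit order within each byte and pack the bits back into bytes.
generated_lookup <- packBits(as.vector(all_bits[8:1, ]), type = "raw")

# Same indexing as before: convert raw to integer, then shift by one
# because byte value 0 lives at index 1.
example_bytes <- as.raw(c(15, 1))
generated_lookup[as.integer(example_bytes) + 1L]
```

With the hand-written table in scope, identical(generated_lookup, reverse_bit_lookup) is a quick way to check it for transcription errors.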

Lookup table in Rcpp

  • Using the same lookup-table method from the bithacks page as the pure R version
  • Implemented in C++ via Rcpp for speed, with the table generated at compile time by preprocessor macros
Rcpp::cppFunction(code='
RawVector reverse_lookup_Rcpp(RawVector x) {
  unsigned int N = x.size();
  RawVector reversed(N);

  static const unsigned char BitReverseTable256[256] = {
# define R2(n)     n,     n + 2*64,     n + 1*64,     n + 3*64
# define R4(n) R2(n), R2(n + 2*16), R2(n + 1*16), R2(n + 3*16)
# define R6(n) R4(n), R4(n + 2*4 ), R4(n + 1*4 ), R4(n + 3*4 )
  R6(0), R6(2), R6(1), R6(3)
};

  for (unsigned int i=0; i<N; ++i) {
    reversed[i] = BitReverseTable256[x[i]]; 
  }

  return reversed;
}')

reverse_lookup_Rcpp(raw_vec)
[1] f0 80
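
The R2/R4/R6 macros can look opaque. As a rough illustration, the same expansion can be mimicked with ordinary R functions (named after the macros, but otherwise hypothetical helpers): each level places two of the index bits, reversed, into the opposite end of the output byte.

```r
# Each function mirrors the C macro of the same name.
R2 <- function(n) c(n, n + 2 * 64, n + 1 * 64, n + 3 * 64)
R4 <- function(n) c(R2(n), R2(n + 2 * 16), R2(n + 1 * 16), R2(n + 3 * 16))
R6 <- function(n) c(R4(n), R4(n + 2 * 4), R4(n + 1 * 4), R4(n + 3 * 4))

# Same top-level expansion as the C initialiser: 4 * 64 = 256 entries.
macro_table <- as.raw(c(R6(0), R6(2), R6(1), R6(3)))

length(macro_table)
macro_table[c(15, 1) + 1L]  # entries for byte values 15 and 1
```

The argument order 0, 2, 1, 3 at every level is itself the bit reversal of the 2-bit values 0 to 3, which is why the nesting produces a full 8-bit reversal.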

64-bit bit-twiddling in Rcpp

  • Using a method from the bithacks page
  • Uses 64-bit multiply, mask and shift operations to reverse the bits in a byte, with no lookup table
Rcpp::cppFunction(code='
RawVector reverse_twiddle64_Rcpp(RawVector x) {
  unsigned int N = x.size();
  RawVector reversed(N);

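  for (unsigned int i=0; i<N; ++i) {
    // Multiply: fan shifted copies of the byte out across a 64-bit word.
    // Mask: keep each source bit once, in its reversed relative position.
    // Multiply again: gather the kept bits into bits 32-39, then shift down.
    // Storing into a raw element keeps only the low 8 bits.
    reversed[i] = (((x[i] * 0x80200802ULL) & 0x0884422110ULL) * 0x0101010101ULL) >> 32;
  }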

  return reversed;
}')

reverse_twiddle64_Rcpp(raw_vec)
[1] f0 80

Benchmark using bench::mark()

raw_vec <- as.raw(1:10000 %% 256)
bm1 <- bench::mark(
  reverse_bit_lookup[as.integer(raw_vec) + 1L],
  reverse_lookup_Rcpp(raw_vec),
  reverse_twiddle64_Rcpp(raw_vec)
)

expression                                    median  itr/sec
reverse_bit_lookup[as.integer(raw_vec) + 1L]  66.9µs  13065.3
reverse_lookup_Rcpp(raw_vec)                    11µs  76799.8
reverse_twiddle64_Rcpp(raw_vec)               11.1µs  78889.4

Summary

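  • The Rcpp methods are ~6x faster than the pure R lookup (median ~11µs vs ~67µs for 10,000 bytes)
  • The two Rcpp methods (lookup table and 64-bit bit-twiddling) run at essentially the same speed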
  • Thanks to hadley for the “+1L” correction.

Future

  • Idiomatic Rcpp patches welcomed
    • Only if they give a speed boost ;)
    • I know that I write my C++ code like it’s just C