mikefc

Introduction

While working on fast output of images for the foist package, I realised the GIF format was simple enough to write quickly, so I worked on an R implementation of a GIF image writer, before translating to C++ for speed.

Update

Thanks to BrodieG for some optimization of the code!

What’s in an uncompressed GIF?

  • Header
    • “GIF89a” to indicate this is a GIF file
    • width + height. 2 bytes each. Little endian.
  • Palette
    • Only going to support 128 colour palette
    • i.e. 128x 3 bytes representing 128 RGB triplets
  • Uncompressed data blocks
    • GIFs with 2^n colours require up to 2^(n+1) bits to encode because of control codes like CLEAR and STOP.
    • So, by limiting GIF to 128 colours/levels, each uncompressed pixel will fit in a single byte. See wikipedia for more detailed info on uncompressed gifs.
    • To prevent the LZW decompressor from actually kicking-in during the decode process, need to send a CLEAR token every 126 bytes

Test matrix

ncol <- 255
nrow <- 160
int_mat <- matrix(0:127, nrow = nrow, ncol = ncol, byrow = TRUE)
dbl_mat <- int_mat/127
int_vec <- as.vector(t(int_mat))  # In row-major ordering

R GIF encoder

Limitations

  • 128 colour levels only
  • No animation
  • Data is not compressed
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#' Write a matrix to an uncompressed GIF using just R
#'
#' @param int_vec integer vector with values in range [0, 127]
#' @param nrow number of rows
#' @param ncol number of cols
#' @param filename output filename
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
write_gif_R <- function(int_vec, nrow, ncol, filename) {
  #~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  # Calculate Width low/high bytes, Height low/high bytes
  #~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  w_lo <- bitwAnd(ncol, 0xff)
  w_hi <- bitwAnd(bitwShiftR(ncol, 8), 0xff)
  
  h_lo <- bitwAnd(nrow, 0xff)
  h_hi <- bitwAnd(bitwShiftR(nrow, 8), 0xff)
  
  
  
  #~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  # GIF header is "GIF89a"
  #~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  header <- c(0x47, 0x49, 0x46, 0x38, 0x39, 0x61)
  
  #~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  # Size = width, height (2-bytes each, little endian)
  #~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  size <- c(w_lo, w_hi, h_lo, h_hi)
  
  #~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  # 0xF6 = global colour table with 7 bits per colour.
  #~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  global_colour_table_header = c(0xf6, 0x00, 0x00)
  
  #~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  # RGB triplets for every second colour in the first viridis palette
  #~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  palette <- t(viridis::viridis.map[seq(1,256,2),1:3] * 255)
  
  #~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  # Description of data
  #~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  image_descriptor_header <- c(0x2c,                   # Data start
                               0x00, 0x00, 0x00, 0x00, # start position
                               w_lo, w_hi, h_lo, h_hi, # end position
                               0x00,                   # No local colour table
                               0x07)                  # Code size 2^7
  
  #~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  # Written at end of GIF file
  #~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  gif_terminator <- 0x3B
  
  #~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  # Maximum length of a data chunk = 2^7 - 2 = 126 bytes
  #~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  N <- 126
  
  #~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  # A chunk of data consists of
  #   - 1 byte for length
  #   - 1 byte for CLEAR code (0x80)
  #   - N bytes of data.  N <= 126
  #~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  extract_chunk <- function(i) {
    indices <- seq(0, N-1) + i*N + 1
    c(N+1, 0x80, int_vec[indices])
  }
  
  #~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  # Number of full chunks
  # Remaining bytes that don't make a full chunk
  #~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  N_full_chunks   <- length(int_vec) %/% N
  remaining_bytes <- length(int_vec) %%  N
  
  
  #~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  # Extract all the data for the full chunks
  # Faster method from @BrodieGaslam
  #~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  # full_chunk_data <- unlist(lapply(seq(N_full_chunks) - 1, extract_chunk))
  full_chunk_data <- rbind(
    rep(N+1, N_full_chunks), rep(0x80, N_full_chunks),
    matrix(int_vec[seq_len(N*N_full_chunks)], nrow = N)
  )
  dim(full_chunk_data) <- NULL
  
  #~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  # Create a chunk for the leftover data
  #~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  rem_data <- c(remaining_bytes + 1, 
                       0x80,
                       int_vec[seq(remaining_bytes) + N_full_chunks * N])
  
  #~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  # This is the last data chunk which contains just the STOP code
  #~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  data_end <- c(0x01, 0x81, 0x00)
  
  #~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  # Write all the data to file
  #~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  writeBin(
    as.raw(c(header, 
      size, 
      global_colour_table_header, 
      palette, 
      image_descriptor_header,
      full_chunk_data, 
      rem_data, 
      data_end,
      gif_terminator)),
    con      = filename,
    size     = 1,
    useBytes = TRUE
  )
  
}
write_gif_R(int_vec, nrow, ncol, "test-r.gif")

Benchmark

  • The above R code was turned into C++ code and included in foist
  • The caTools package also includes a GIF encoder
tmp <- tempfile()
res <- bench::mark(
  write_gif_R(int_vec, nrow, ncol, tmp),
  foist::write_gif(dbl_mat, tmp),
  caTools::write.gif(dbl_mat, tmp)
)
(#tab:benchmark_grey)Benchmark results
expression min mean median itr/sec mem_alloc
write_gif_R(int_vec, nrow, ncol, tmp) 1.62ms 2.26ms 2.19ms 442 6.11MB
foist::write_gif(dbl_mat, tmp) 361.71µs 408.49µs 387.65µs 2448 65.07KB
caTools::write.gif(dbl_mat, tmp) 1.98ms 3.06ms 3.08ms 326 2.35MB

Summary

  • It’s possible to write a GIF encoder in pure R (if compression is excluded)
  • Handling all the data in Rcpp (like foist does) makes things super fast.
  • caTools::write.gif() is relatively slow uses a load of memory.