Introduction
While working on fast output of images for the foist package, I realised the GIF format was simple enough to write quickly, so I worked on an R implementation of a GIF image writer, before translating to C++ for speed.
Update
Thanks to BrodieG for some optimization of the code!
What’s in an uncompressed GIF?
- Header
- “GIF89a” to indicate this is a GIF file
- width + height. 2 bytes each. Little endian.
- Palette
- Only going to support 128 colour palette
- i.e. 128x 3 bytes representing 128 RGB triplets
- Uncompressed data blocks
- GIFs with 2^n colours require up to 2^(n+1) bits to encode because of
control codes like
CLEAR
andSTOP
. - So, by limiting GIF to 128 colours/levels, each uncompressed pixel will fit in a single byte. See wikipedia for more detailed info on uncompressed gifs.
- To prevent the LZW decompressor from actually kicking-in during the decode process, need to send a CLEAR token every 126 bytes
- GIFs with 2^n colours require up to 2^(n+1) bits to encode because of
control codes like
Test matrix
ncol <- 255
nrow <- 160
int_mat <- matrix(0:127, nrow = nrow, ncol = ncol, byrow = TRUE)
dbl_mat <- int_mat/127
int_vec <- as.vector(t(int_mat)) # In row-major ordering
R GIF encoder
Limitations
- 128 colour levels only
- No animation
- Data is not compressed
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#' Write a matrix to an uncompressed GIF using just R
#'
#' @param int_vec integer vector with values in range [0, 127]
#' @param nrow number of rows
#' @param ncol number of cols
#' @param filename output filename
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
write_gif_R <- function(int_vec, nrow, ncol, filename) {
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Calculate Width low/high bytes, Height low/high bytes
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
w_lo <- bitwAnd(ncol, 0xff)
w_hi <- bitwAnd(bitwShiftR(ncol, 8), 0xff)
h_lo <- bitwAnd(nrow, 0xff)
h_hi <- bitwAnd(bitwShiftR(nrow, 8), 0xff)
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# GIF header is "GIF89a"
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
header <- c(0x47, 0x49, 0x46, 0x38, 0x39, 0x61)
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Size = width, height (2-bytes each, little endian)
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
size <- c(w_lo, w_hi, h_lo, h_hi)
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# 0xF6 = global colour table with 7 bits per colour.
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
global_colour_table_header = c(0xf6, 0x00, 0x00)
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# RGB triplets for every second colour in the first viridis palette
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
palette <- t(viridis::viridis.map[seq(1,256,2),1:3] * 255)
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Description of data
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
image_descriptor_header <- c(0x2c, # Data start
0x00, 0x00, 0x00, 0x00, # start position
w_lo, w_hi, h_lo, h_hi, # end position
0x00, # No local colour table
0x07) # Code size 2^7
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Written at end of GIF file
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
gif_terminator <- 0x3B
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Maximum length of a data chunk = 2^7 - 2 = 126 bytes
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
N <- 126
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# A chunk of data consists of
# - 1 byte for length
# - 1 byte for CLEAR code (0x80)
# - N bytes of data. N <= 126
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
extract_chunk <- function(i) {
indices <- seq(0, N-1) + i*N + 1
c(N+1, 0x80, int_vec[indices])
}
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Number of full chunks
# Remaining bytes that don't make a full chunk
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
N_full_chunks <- length(int_vec) %/% N
remaining_bytes <- length(int_vec) %% N
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Extract all the data for the full chunks
# Faster method from @BrodieGaslam
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# full_chunk_data <- unlist(lapply(seq(N_full_chunks) - 1, extract_chunk))
full_chunk_data <- rbind(
rep(N+1, N_full_chunks), rep(0x80, N_full_chunks),
matrix(int_vec[seq_len(N*N_full_chunks)], nrow = N)
)
dim(full_chunk_data) <- NULL
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Create a chunk for the leftover data
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
rem_data <- c(remaining_bytes + 1,
0x80,
int_vec[seq(remaining_bytes) + N_full_chunks * N])
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# This is the last data chunk which contains just the STOP code
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
data_end <- c(0x01, 0x81, 0x00)
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Write all the data to file
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
writeBin(
as.raw(c(header,
size,
global_colour_table_header,
palette,
image_descriptor_header,
full_chunk_data,
rem_data,
data_end,
gif_terminator)),
con = filename,
size = 1,
useBytes = TRUE
)
}
write_gif_R(int_vec, nrow, ncol, "test-r.gif")
Benchmark
- The above R code was turned into C++ code and included in foist
- The
caTools
package also includes a GIF encoder
tmp <- tempfile()
res <- bench::mark(
write_gif_R(int_vec, nrow, ncol, tmp),
foist::write_gif(dbl_mat, tmp),
caTools::write.gif(dbl_mat, tmp)
)
expression | min | median | itr/sec | mem_alloc |
---|---|---|---|---|
write_gif_R(int_vec, nrow, ncol, tmp) | 1.54ms | 2.15ms | 462 | 6.19MB |
foist::write_gif(dbl_mat, tmp) | 323.43µs | 363.27µs | 2605 | 13.19KB |
caTools::write.gif(dbl_mat, tmp) | 2ms | 2.99ms | 333 | 2.35MB |
Loading required namespace: tidyr
Summary
- It’s possible to write a GIF encoder in pure R (if compression is excluded)
- Handling all the data in Rcpp (like foist does) makes things super fast.
caTools::write.gif()
is relatively slow and uses a load of memory.