defaultlist - an R list with a user-defined default value. Take 2

Problem: I want a list with a user-defined default value

The default list in R is great, but sometimes I’d like it to return something other than NULL if a name isn’t in the list.

Some motivating examples and the initial implementation are in yesterday’s post.

What’s new since yesterday?

  • Can now store a NULL in a defaultlist without it being replaced with the default value
  • defaultlist now responds correctly to both character and integer indexing.
  • Added [ extraction (in addition to [[ and $)

defaultlist implementation

#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#' Create a list with a default value
#'
#' This behaves exactly like a 'list()' object, except if the requested value
#' does not exist, a default value is returned (instead of NULL).
#'
#' v0.1 2020-05-11  
#'      Initial release.  
#' v0.2 2020-05-12 
#'      Can now store 'NULL' in list properly. 
#'      Added `[` access
#'      Handle defaults when indexing by a numeric
#'
#' Similar to a `defaultdict` in Python
#'
#' @param value default value to return if item not in list
#'
#' @return new `defaultlist` object
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
defaultlist <- function(value) {
  structure(list(), class = 'defaultlist', value = value)
}

#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Fetch value from defaultlist: [[]] or $
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
`[[.defaultlist` <- `$.defaultlist` <- function(x, y) {
  if ((is.character(y) && y %in% names(x)) ||
      (is.numeric(y) && as.integer(y) <= length(x))) {
    NextMethod()
  } else {
    attr(x, 'value')
  }
}

#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Fetch value from defaultlist: []
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
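`[.defaultlist` <- function(x, y) {
  # Assumes 'y' is a single name or position; vector indices are not handled
  if ((is.character(y) && y %in% names(x)) ||
      (is.numeric(y) && as.integer(y) <= length(x))) {
    NextMethod()
  } else if (is.character(y)) {
    # Missing name: return a one-element list holding the default, keeping the name
    setNames(list(attr(x, 'value')), y)
  } else {
    # Position past the end: return a one-element list holding the default
    list(attr(x, 'value'))
  }
}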

#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Print like a list
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
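print.defaultlist <- function(x, ...) {
  # Strip the class and the stored default from a copy so it prints as a plain list
  y <- x
  attr(y, 'value') <- NULL
  attr(y, 'class') <- NULL
  print(y, ...)
  invisible(x)
}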
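
A quick way to exercise the three v0.2 changes (storing NULL, integer indexing and `[` extraction) is sketched below. The methods are repeated in compact form only so the snippet runs on its own; the names `dl`, `empty` and `nope` are made up for illustration, and storing the NULL with `dl['empty'] <- list(NULL)` is my assumption about how one would do it, since that is the usual way to put a NULL into a plain list.

```r
# Compact copy of the methods above, repeated only so this snippet is standalone
defaultlist <- function(value) {
  structure(list(), class = 'defaultlist', value = value)
}

`[[.defaultlist` <- `$.defaultlist` <- function(x, y) {
  found <- (is.character(y) && y %in% names(x)) ||
    (is.numeric(y) && as.integer(y) <= length(x))
  if (found) NextMethod() else attr(x, 'value')
}

`[.defaultlist` <- function(x, y) {
  found <- (is.character(y) && y %in% names(x)) ||
    (is.numeric(y) && as.integer(y) <= length(x))
  if (found) {
    NextMethod()
  } else if (is.character(y)) {
    setNames(list(attr(x, 'value')), y)
  } else {
    list(attr(x, 'value'))
  }
}

dl <- defaultlist(0)
dl$x <- 42
dl['empty'] <- list(NULL)   # single-bracket assignment stores an explicit NULL

is.null(dl[['empty']])      # the name exists, so the stored NULL comes back
dl[[1]]                     # first element by position
dl[[10]]                    # no such position, so the default is returned
dl['nope']                  # `[` wraps the default in a named one-element list
```

The existence test is done on `names(x)` and `length(x)` rather than on the returned value, which is what lets a stored NULL be told apart from a missing entry.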

Example - defaultlist with a default of FALSE

haystack <- defaultlist(FALSE)
haystack$surprise <- TRUE

haystack[['hello']]
[1] FALSE
haystack$mcfly
[1] FALSE
haystack[['surprise']]
[1] TRUE
haystack$surprise
[1] TRUE

defaultlist in action - list with a default of ‘0’

counter <- defaultlist(0)

things <- c('andy', 'bob', 'carol', 'carol')

for (thing in things) {
  counter[[thing]] <- counter[[thing]] + 1
}

counter
$andy
[1] 1

$bob
[1] 1

$carol
[1] 2
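
For comparison, here is a rough sketch of the same tally with a plain `list()`. A missing name returns NULL, and `NULL + 1` evaluates to `numeric(0)` rather than 1, so each entry has to be initialised by hand before the first increment. The name `plain_counter` is mine and is not part of the original code.

```r
things <- c('andy', 'bob', 'carol', 'carol')

plain_counter <- list()
for (thing in things) {
  # A plain list returns NULL for an unknown name, so seed the count first
  if (is.null(plain_counter[[thing]])) {
    plain_counter[[thing]] <- 0
  }
  plain_counter[[thing]] <- plain_counter[[thing]] + 1
}

str(plain_counter)
```

The defaultlist version folds that `is.null()` check into the lookup itself.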

Extra Credit Example - using nested defaultlists to count most common letter pairs in a stream of unknown characters

#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Create nested defaultlists
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
counter <- defaultlist(defaultlist(0))

#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Create a stream of characters heavily weighted towards 'a' and 'e'
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
set.seed(1)
stream <- sample(letters[1:5], 1000, replace = TRUE, prob = c(5, 1, 1, 1, 4))
head(stream)
[1] "a" "a" "e" "d" "a" "d"
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Count the pair of characters (prev, this)
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
for (idx in 2:length(stream)) {
  this <- stream[[idx]]
  prev <- stream[[idx - 1]]
  counter[[prev]][[this]] <- counter[[prev]][[this]] + 1
}

#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# The most probable letter pair in the stream is: a-a
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
sort(unlist(counter), decreasing = TRUE)
a.a a.e e.a e.e a.b b.a b.e d.a a.c e.b a.d c.e c.a e.c e.d d.e b.b c.b c.c d.b 
178 138 137 108  41  39  37  35  32  32  30  30  30  28  27  19   8   8   8   7 
d.c b.d b.c d.d c.d 
  7   7   5   4   4
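
As a sanity check, the pair counts can also be computed without defaultlist at all. The sketch below uses base R's `table()` on the same seeded stream and builds the same "prev.this" labels that `unlist()` produced above. The names `prev`, `this` and `pair_counts` are mine; the counts should match, although tied pairs may come out in a different order.

```r
set.seed(1)
stream <- sample(letters[1:5], 1000, replace = TRUE, prob = c(5, 1, 1, 1, 4))

# Line up each character with its predecessor
prev <- head(stream, -1)
this <- tail(stream, -1)

# Tabulate the "prev.this" labels and put the most common pair first
pair_counts <- sort(table(paste(prev, this, sep = '.')), decreasing = TRUE)

head(pair_counts)
sum(pair_counts)   # one pair per adjacent position, so 999 in total
```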