Problem: I want a list with a default value
The default list in R is great, but sometimes I’d like it to return something other than NULL if a name isn’t in the list.
Contrived motivating example
Say I want to use a list as a counter for elements in a stream of data, but I’m not sure beforehand which elements are present. Each time I want to increase the count for a particular name, I first have to check manually whether the name is already in the list. If it is, I add 1 to that entry; otherwise I set the counter for that item to 1.
counter <- list()
things <- c('bob', 'david', 'kate', 'susan', 'susan')
for (thing in things) {
  if (thing %in% names(counter)) {
    # Seen before: bump the existing count
    counter[[thing]] <- counter[[thing]] + 1
  } else {
    # First sighting: start the count at 1
    counter[[thing]] <- 1
  }
}
counter
$bob
[1] 1
$david
[1] 1
$kate
[1] 1
$susan
[1] 2
This is fine, but if I’ve got a lot of counters, I’d like something a bit less clunky.
Preferred syntax
I want something which looks pretty much like a list, but if the requested name is not in the list, then the default value is returned, rather than NULL.
In the case of a counter, I want this new defaultlist to return 0 if the requested name is not present.
counter <- defaultlist(0)
things <- c('bob', 'david', 'kate', 'susan', 'susan')
for (thing in things) {
  counter[[thing]] <- counter[[thing]] + 1
}
defaultlist implementation
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#' Create a list with a default value
#'
#' This behaves exactly like a 'list()' object, except if the requested value
#' does not exist, a default value is returned (instead of NULL).
#'
#' Similar to a `defaultdict` in Python
#'
#' @param value default value to return if item not in list
#'
#' @return new `defaultlist` object
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
defaultlist <- function(value) {
  structure(list(), class = 'defaultlist', value = value)
}
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Fetch value from defaultlist
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
`[[.defaultlist` <- `$.defaultlist` <- function(x, y) {
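  # Strip the class so the built-in `[[` is used rather than this method
  res <- unclass(x)[[y]]
  # A missing name yields NULL; substitute the stored default in that case
  if (is.null(res)) attr(x, 'value') else res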
}
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Print like a list
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
print.defaultlist <- function(x, ...) {
  attr(x, 'value') <- NULL
  attr(x, 'class') <- NULL
  print(x, ...)
}
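To see why the getter calls `unclass()` before looking anything up, here is a small sketch that repeats the definitions above and inspects each piece in turn. The name `missing_key` is only an illustrative placeholder, not something from the original code.

```r
# Definitions repeated from above so the snippet runs on its own
defaultlist <- function(value) {
  structure(list(), class = 'defaultlist', value = value)
}

`[[.defaultlist` <- `$.defaultlist` <- function(x, y) {
  res <- unclass(x)[[y]]
  if (is.null(res)) attr(x, 'value') else res
}

counter <- defaultlist(0)

# The default lives in an attribute on an otherwise empty list
stored_default <- attr(counter, 'value')

# Without the class, the built-in `[[` runs and a missing name gives NULL
raw_lookup <- unclass(counter)[['missing_key']]

# With the class, the method above swaps that NULL for the default
classed_lookup <- counter[['missing_key']]
```

If the method called `x[[y]]` directly, it would dispatch back to `[[.defaultlist` and recurse without end, which is why the class is stripped first.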
defaultlist in action - list with a default of ‘0’
counter <- defaultlist(0)
things <- c('bob', 'david', 'kate', 'susan', 'susan')
for (thing in things) {
  counter[[thing]] <- counter[[thing]] + 1
}
counter
$bob
[1] 1
$david
[1] 1
$kate
[1] 1
$susan
[1] 2
Example - defaultlist with a default of FALSE
haystack <- defaultlist(FALSE)
haystack[['surprise']] <- TRUE
haystack[['hello?']]
[1] FALSE
haystack$mcfly
[1] FALSE
haystack[['anyone home?']]
[1] FALSE
haystack[['surprise']]
[1] TRUE
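One difference from Python's `defaultdict` that may be worth knowing: with the implementation above, looking up a missing name does not add it to the list. A minimal sketch, repeating the definitions so it runs standalone (the variable names `lookup`, `stored_names` and `n_stored` are just for illustration):

```r
# Definitions repeated from above so the snippet runs on its own
defaultlist <- function(value) {
  structure(list(), class = 'defaultlist', value = value)
}

`[[.defaultlist` <- `$.defaultlist` <- function(x, y) {
  res <- unclass(x)[[y]]
  if (is.null(res)) attr(x, 'value') else res
}

haystack <- defaultlist(FALSE)
haystack[['surprise']] <- TRUE

# Reading a missing name returns the default ...
lookup <- haystack[['hello?']]

# ... but only the name that was explicitly assigned is actually stored
stored_names <- names(haystack)
n_stored <- length(haystack)
```

Here `stored_names` holds only 'surprise' and `n_stored` is 1, so `names()` and `length()` keep reporting what was really assigned.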
Extra Credit Example - using nested defaultlists to count most common letter pairs in a stream of unknown characters
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Create nested defaultlists
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
counter <- defaultlist(defaultlist(0))
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Create a stream of characters heavily weighted towards 'a' and 'e'
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
set.seed(1)
stream <- sample(letters[1:5], 1000, replace = TRUE, prob = c(5, 1, 1, 1, 4))
head(stream)
[1] "a" "a" "e" "d" "a" "d"
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Count the pair of characters (prev, this)
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
for (idx in 2:length(stream)) {
  this <- stream[[idx]]
  prev <- stream[[idx - 1]]
  counter[[prev]][[this]] <- counter[[prev]][[this]] + 1
}
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# The most probable letter pair in the stream is: a-a
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
sort(unlist(counter), decreasing = TRUE)
a.a a.e e.a e.e a.b b.a b.e d.a a.c e.b a.d c.e c.a e.c e.d d.e b.b c.b c.c d.b
178 138 137 108  41  39  37  35  32  32  30  30  30  28  27  19   8   8   8   7
d.c b.d b.c d.d c.d
  7   7   5   4   4
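The nested update `counter[[prev]][[this]] <- ...` can look a little magical, so here is a rough sketch of what it amounts to when written out by hand. This is my reading of how R handles nested replacement, not a literal trace of the interpreter, and the helper variable `inner` is hypothetical.

```r
# Definitions repeated from above so the snippet runs on its own
defaultlist <- function(value) {
  structure(list(), class = 'defaultlist', value = value)
}

`[[.defaultlist` <- `$.defaultlist` <- function(x, y) {
  res <- unclass(x)[[y]]
  if (is.null(res)) attr(x, 'value') else res
}

counter <- defaultlist(defaultlist(0))

# Step 1: 'a' is not stored yet, so the read returns the default,
# which is itself an empty defaultlist(0)
inner <- counter[['a']]

# Step 2: 'b' is missing from the inner list, so it reads as 0 and becomes 1
inner[['b']] <- inner[['b']] + 1

# Step 3: store the updated inner list back under the outer name
counter[['a']] <- inner

seen_pair   <- counter[['a']][['b']]  # the pair counted once above
unseen_pair <- counter[['a']][['z']]  # known first letter, unseen second
unseen_both <- counter[['q']][['z']]  # neither letter seen
```

After these three steps `seen_pair` is 1, while both unseen lookups fall through to the inner default of 0.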