Re-defining 'return()'

Introduction

  • R has a number of reserved words which are core to the language and cannot be redefined.

  • According to the R language definition, these are:

    if else repeat while function for in next break TRUE FALSE NULL Inf NaN NA NA_integer_ NA_real_ NA_complex_ NA_character_ ... ..1 ..2 etc.

  • return() is not reserved, which means we can redefine it. Note: This is a bad idea (tm).
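
A quick way to see the difference, offered as a rough sketch: base R's make.names() appends a dot to any reserved word to turn it into a legal name, so it can double as a test. The helper name is_reserved is made up for this example, and it only makes sense for words that are otherwise valid names.

```r
# Hypothetical helper: make.names() appends "." to reserved words,
# so an otherwise valid name that comes back changed is reserved.
is_reserved <- function(word) {
  make.names(word) != word
}

is_reserved("if")        # TRUE
is_reserved("function")  # TRUE
is_reserved("return")    # FALSE
```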

Why you might think this is useful.

I immediately thought that redefining return() could be useful by allowing you to intercept the return and do some logging or add meta-information (like a timestamp).

However, this won’t work. A redefined return() is just an ordinary function: when it finishes, control goes back to the function that called it, and that function keeps executing! i.e. it never does the real return.

return <- function(x) {print("Write something to a log file...")}

myfun <- function(x) {
  return(x)
  print("This line should not run")
}

myfun(1)
[1] "Write something to a log file..."
[1] "This line should not run"
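
The reason is that the built-in return() is a primitive handled by the interpreter, whereas the replacement is an ordinary function whose result simply becomes the value of the expression return(x) in the caller. If logging on exit is the goal, one alternative is to wrap the function instead of touching return(). A minimal sketch, where with_logging, square and the log format are hypothetical choices made for illustration:

```r
# The built-in is a primitive, not an ordinary R function
is.primitive(base::return)   # TRUE

# Hypothetical wrapper: builds a new function that calls `f`,
# logs a timestamped message, then hands back f's result unchanged.
with_logging <- function(f, name = "f") {
  function(...) {
    result <- f(...)
    message(format(Sys.time(), "%Y-%m-%d %H:%M:%S"), " ", name, "() returned")
    result
  }
}

square <- function(x) x^2
logged_square <- with_logging(square, "square")

logged_square(4)   # logs a message; the value is 16
```

The wrapped function never relies on the name return, so it behaves the same whether or not return() has been redefined in the workspace.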

Malicious redefinition

So the only apparent use is malicious: code that never really returns, and instead corrupts whatever value the original function produces when it drops off the end.

# redefine 'return' and make things awful for everyone
return <- function(x) {
  switch(typeof(x),
         'double'    = x + runif(1, max=100),
         'integer'   = x + as.integer(runif(1, max=100)),
         'character' = {x[1] <- 'always'; x},
         'logical'   = !x,
         x)
}


myfun <- function(x) {
  return(x)
}

myfun(c(1L, 2L, 3L))
[1] 27 28 29
myfun(c(TRUE, TRUE, FALSE))
[1] FALSE FALSE  TRUE
myfun(c("never", "do", "this"))
[1] "always" "do"     "this"  

Prevention?

So how could we prevent redefinition of return()?

geospacedman’s solution from Twitter was neat: always use base::return(x) to return from a function.

This should be perfectly safe and unsinkable. (Narrator: it is not.)

By default, a call to return() resolves to whichever definition is found first, which is the one in .GlobalEnv if it has been redefined there.

return <- function(x) { stop("<nelson>Ha Ha!</nelson>") }

myfun <- function(x) { return(x) }
myfun(1:3)
Error in return(x): <nelson>Ha Ha!</nelson>
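
Masking of this kind can at least be detected and undone, because the global definition only hides the built-in one. The following is a rough sketch that assumes nothing in ‘base’ has been touched; return_is_masked is a hypothetical helper, and a scratch environment stands in for the workspace so the example has no side effects.

```r
# Hypothetical helper: TRUE when the `return` function visible from `env`
# is an ordinary closure instead of the built-in primitive.
return_is_masked <- function(env = globalenv()) {
  !is.primitive(get("return", envir = env, mode = "function"))
}

# Scratch environment whose parent is base, standing in for a workspace
scratch <- new.env(parent = baseenv())
return_is_masked(scratch)    # FALSE

# Mask return() the same way the example above does
assign("return", function(x) stop("Ha Ha!"), envir = scratch)
return_is_masked(scratch)    # TRUE

# Removing the masking binding exposes the built-in again
rm("return", envir = scratch)
return_is_masked(scratch)    # FALSE
```

In an interactive session the equivalent clean-up would be rm(return).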

We can force it to call the return() defined in the ‘base’ package.

myfun <- function(x) { base::return(x) }
myfun(1:3)
[1] 1 2 3

But we could overwrite the version of return() in the ‘base’ package.

unlockBinding("return", baseenv()) # Tip: Don't do this

evil_return <- function(x) { 
  warning("<nelson>Ha Ha!</nelson>") 
  x
}
  
assign("return", evil_return, pos=baseenv())


myfun <- function(x) { base::return(x) }
myfun(1)
[1] 1
Warning message:
In base::return(x) : <nelson>Ha Ha!</nelson>

It’s evilness all the way down

  • We could also redefine :: so that the return() function in ‘base’ is never reached.
  • We’re screwed.

evil_return <- function(x) { 4:6 }

`::` <- function(pkg, name) { evil_return }


myfun <- function(x) { base::return(x) }
myfun(1:3)
[1] 4 5 6
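
The regress continues one step further. As a sketch rather than a defense: .Primitive("return") fetches the built-in by name without looking up return or going through ::, so it survives both tricks above, yet .Primitive is itself an ordinary binding that could be masked in exactly the same way. The function names fooled and not_fooled are invented for the example, and everything runs inside local() so nothing leaks into the workspace.

```r
results <- local({
  # Mask both `return` and `::`, but only inside this local scope
  `return` <- function(x) 4:6
  `::`     <- function(pkg, name) function(x) 4:6

  # The fake return does nothing, so execution reaches the last line
  fooled <- function(x) {
    base::return(x)
    "fell off the end"
  }

  # Fetch the primitive by name and call it directly
  not_fooled <- function(x) {
    .Primitive("return")(x)
    "fell off the end"
  }

  list(fooled = fooled(1:3), not_fooled = not_fooled(1:3))
})

results$fooled       # "fell off the end"
results$not_fooled   # 1 2 3
```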

The only code where it is sensible to re-define return()

#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# The following is the only #rstats code in existence where 
# you are allowed to redefine the value of 'return' 
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
return  <- function(return) {return + 1}

totally_sensible_function <- function() {
  '+' <- `*`
  `1` <- 2
  return <- `1` + `1` + `1` + `1`
  return(return)
}

totally_sensible_function()
[1] 17
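
What makes this work is R's function lookup: when a name appears in call position, R skips any binding of that name that is not a function and keeps searching the enclosing environments. That is why return(return) finds the global function even though a local numeric return exists. The redefined + lives only inside totally_sensible_function, so the sum becomes 2 * 2 * 2 * 2 = 16, and the global return then adds 1 with the ordinary +. A smaller sketch of the same lookup rule, using c instead of return (lookup_demo is a made-up name):

```r
lookup_demo <- function() {
  c <- 10      # a non-function value that shares its name with base::c
  c(c, c)      # call position: R skips the numeric `c` and finds the function
}

lookup_demo()  # 10 10
```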

Code smells

  • Jenny Bryan’s great useR! 2018 talk on code smells suggested using the S3 class system to make code more readable.
  • Pro Tip: If you’re redefining return, there ain’t nothing going to reduce that code smell.

return           <- function(x) { UseMethod('return') }
return.default   <- function(x) {x}
return.integer   <- function(x) {x + 1}
return.numeric   <- function(x) {x + 2}
return.character <- function(x) {x[1] <- 'hello'; x}
return.logical   <- function(x) {!x}

myfun <- function(x) {
  return(x)
}

myfun(1L)
[1] 2
myfun(1.0)
[1] 3
myfun("goodbye")
[1] "hello"
myfun(TRUE)
[1] FALSE

Summary

Redefining return()

  • It’s a bad idea.
  • Don’t do it.
  • It has no upsides.
  • The only use-cases seem to be malicious.
  • There’s no way to prevent it.