mikefc

I needed to create a special mutate function that uses the dplyr groups if they’re defined, otherwise it should calculate over the entire data.

I couldn’t find anything online on how to do this, so I came with a solution.

It doesn’t seem there’s a blessed/sanctioned way of doing this.

I worked out the method to use in my previous post.

Then I realised I wanted to do this to a few other functions, so I thought I’d have a shot at making a function that converts other functions into ones which respect dplyr groups.

Example function: Reverse the order of rows of a data.frame

#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Simple function to flip the order of rows in a data.frame
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
flip <- function(.data, ...) {
  .data[rev(seq(nrow(.data))), , drop=FALSE]
}


#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# The data
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
mtcars %>%
  head()
# A tibble: 6 x 11
    mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb
  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1  21       6   160   110  3.9   2.62  16.5     0     1     4     4
2  21       6   160   110  3.9   2.88  17.0     0     1     4     4
3  22.8     4   108    93  3.85  2.32  18.6     1     1     4     1
4  21.4     6   258   110  3.08  3.22  19.4     1     0     3     1
5  18.7     8   360   175  3.15  3.44  17.0     0     0     3     2
6  18.1     6   225   105  2.76  3.46  20.2     1     0     3     1
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# The entire data is flipped (as expected). 
# Nothing special has been done with groups
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
mtcars %>%
  head() %>% 
  group_by(am) %>%
  flip()
# A tibble: 6 x 11
# Groups:   am [2]
    mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb
  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1  18.1     6   225   105  2.76  3.46  20.2     1     0     3     1
2  18.7     8   360   175  3.15  3.44  17.0     0     0     3     2
3  21.4     6   258   110  3.08  3.22  19.4     1     0     3     1
4  22.8     4   108    93  3.85  2.32  18.6     1     1     4     1
5  21       6   160   110  3.9   2.88  17.0     0     1     4     4
6  21       6   160   110  3.9   2.62  16.5     0     1     4     4

engroupify()

The engroupify() function takes a function name as an argument, and returns a new version of the function which takes any dplyr groups into account

#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#' Make a function which operates on data.frames respect dplyr `group_by()`
#'
#' @param fsym function name as a symbol which takes '.data' as its first argument
#' 
#' @return new function which respects dplyr groups
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
engroupify <- function(fsym) {
  
  fname <- rlang::ensym(fsym)

  new_body <- bquote({
      if (dplyr::is_grouped_df(.data)) {
        indices <- group_indices(.data)
        .data   <- do(.data, .(fname)(., ...))
        .data  <- .data[order(order(indices)), , drop = FALSE]
        return(.data)
      }

      .(fname)(.data, ...)
  })
  
  rlang::new_function(rlang::fn_fmls(fsym), new_body)
}


#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Create a new version of flip which respects dplyr groups
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
gflip <- engroupify(flip)


#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# The intial data
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
mtcars %>%
  head()
# A tibble: 6 x 11
    mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb
  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1  21       6   160   110  3.9   2.62  16.5     0     1     4     4
2  21       6   160   110  3.9   2.88  17.0     0     1     4     4
3  22.8     4   108    93  3.85  2.32  18.6     1     1     4     1
4  21.4     6   258   110  3.08  3.22  19.4     1     0     3     1
5  18.7     8   360   175  3.15  3.44  17.0     0     0     3     2
6  18.1     6   225   105  2.76  3.46  20.2     1     0     3     1
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Using the new `gflip()` function to flip the data within groups!
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
mtcars %>%
  head() %>%
  group_by(am) %>%
  gflip()
# A tibble: 6 x 11
# Groups:   am [2]
    mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb
  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1  22.8     4   108    93  3.85  2.32  18.6     1     1     4     1
2  21       6   160   110  3.9   2.88  17.0     0     1     4     4
3  21       6   160   110  3.9   2.62  16.5     0     1     4     4
4  18.1     6   225   105  2.76  3.46  20.2     1     0     3     1
5  18.7     8   360   175  3.15  3.44  17.0     0     0     3     2
6  21.4     6   258   110  3.08  3.22  19.4     1     0     3     1

Future

  • Still hoping to find out how this should be done properly within dplyr.