I needed to create a special mutate function that uses the dplyr groups if they’re defined, otherwise it should calculate over the entire data.
I couldn’t find anything online on how to do this, so I came with a solution.
It doesn’t seem there’s a blessed/sanctioned way of doing this.
I worked out the method to use in my previous post.
Then I realised I wanted to do this to a few other functions, so I thought I’d have a shot at making a function that converts other functions into ones which respect dplyr groups.
Example function: Reverse the order of rows of a data.frame
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Simple function to flip the order of rows in a data.frame
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
flip <- function(.data, ...) {
.data[rev(seq(nrow(.data))), , drop=FALSE]
}
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# The data
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
mtcars %>%
head()
# A tibble: 6 x 11
mpg cyl disp hp drat wt qsec vs am gear carb
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 21 6 160 110 3.9 2.62 16.5 0 1 4 4
2 21 6 160 110 3.9 2.88 17.0 0 1 4 4
3 22.8 4 108 93 3.85 2.32 18.6 1 1 4 1
4 21.4 6 258 110 3.08 3.22 19.4 1 0 3 1
5 18.7 8 360 175 3.15 3.44 17.0 0 0 3 2
6 18.1 6 225 105 2.76 3.46 20.2 1 0 3 1
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# The entire data is flipped (as expected).
# Nothing special has been done with groups
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
mtcars %>%
head() %>%
group_by(am) %>%
flip()
# A tibble: 6 x 11
# Groups: am [2]
mpg cyl disp hp drat wt qsec vs am gear carb
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 18.1 6 225 105 2.76 3.46 20.2 1 0 3 1
2 18.7 8 360 175 3.15 3.44 17.0 0 0 3 2
3 21.4 6 258 110 3.08 3.22 19.4 1 0 3 1
4 22.8 4 108 93 3.85 2.32 18.6 1 1 4 1
5 21 6 160 110 3.9 2.88 17.0 0 1 4 4
6 21 6 160 110 3.9 2.62 16.5 0 1 4 4
engroupify()
The engroupify()
function takes a function name as an argument, and returns
a new version of the function which takes any dplyr groups into account
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#' Make a function which operates on data.frames respect dplyr `group_by()`
#'
#' @param fsym function name as a symbol which takes '.data' as its first argument
#'
#' @return new function which respects dplyr groups
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
engroupify <- function(fsym) {
fname <- rlang::ensym(fsym)
new_body <- bquote({
if (dplyr::is_grouped_df(.data)) {
indices <- group_indices(.data)
.data <- do(.data, .(fname)(., ...))
.data <- .data[order(order(indices)), , drop = FALSE]
return(.data)
}
.(fname)(.data, ...)
})
rlang::new_function(rlang::fn_fmls(fsym), new_body)
}
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Create a new version of flip which respects dplyr groups
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
gflip <- engroupify(flip)
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# The intial data
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
mtcars %>%
head()
# A tibble: 6 x 11
mpg cyl disp hp drat wt qsec vs am gear carb
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 21 6 160 110 3.9 2.62 16.5 0 1 4 4
2 21 6 160 110 3.9 2.88 17.0 0 1 4 4
3 22.8 4 108 93 3.85 2.32 18.6 1 1 4 1
4 21.4 6 258 110 3.08 3.22 19.4 1 0 3 1
5 18.7 8 360 175 3.15 3.44 17.0 0 0 3 2
6 18.1 6 225 105 2.76 3.46 20.2 1 0 3 1
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Using the new `gflip()` function to flip the data within groups!
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
mtcars %>%
head() %>%
group_by(am) %>%
gflip()
# A tibble: 6 x 11
# Groups: am [2]
mpg cyl disp hp drat wt qsec vs am gear carb
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 22.8 4 108 93 3.85 2.32 18.6 1 1 4 1
2 21 6 160 110 3.9 2.88 17.0 0 1 4 4
3 21 6 160 110 3.9 2.62 16.5 0 1 4 4
4 18.1 6 225 105 2.76 3.46 20.2 1 0 3 1
5 18.7 8 360 175 3.15 3.44 17.0 0 0 3 2
6 21.4 6 258 110 3.08 3.22 19.4 1 0 3 1
Future
- Still hoping to find out how this should be done properly within dplyr.