dplyr-cli
dplyr-cli uses {littler} to
run dplyr commands on CSV files in the terminal.
dplyr-cli makes use of the terminal pipe | instead of the magrittr pipe (%>%)
to run sequences of commands.
cat mtcars.csv | group_by cyl | summarise "mpg = mean(mpg)" | kable
#> | cyl| mpg|
#> |---:|--------:|
#> | 4| 26.66364|
#> | 6| 19.74286|
#> | 8| 15.10000|
Motivation
I wanted to be able to do quick hacks on CSV files on the command line without actually starting a proper R session.
What dplyr commands are supported?
Any command of the form dplyr::verb(.data, code)
Currently two extra commands are supported which are not part of dplyr.
csvperforms no dplyr command, but only outputs the input data as CSV to stdoutkableperforms no dplyr command, but only outputs the input data as aknitr::kable()formatted string to stdout
Limitations
- When using special shell characters such as
(), you’ll have to quote your code arguments - Every command runs in a separate R session - startup overhead can get prohibitive.
Usage
./dplyr.sh --help
dplyr-cli
Usage:
dplyr.sh <command> [--file=fn] [--csv | -c] [--verbose | -v] [<code>...]
dplyr.sh -h | --help
Options:
-h --help show this help text
-f FILE --file=FILE input CSV or RDS filename. If reading from stdin, assumes CSV [default: stdin]
-c --csv write output to stdout in CSV format (instead of default RDS file)
-v --verbose be verbose
Installation
dplyr-cli is run from the shell but at every invocation is starting a new
rsession where the following packages are expected to be installed:
install.packages('littler') # Scripting front-end for R
install.packages('glue') # text manipulation
install.packages('dplyr') # data manipulation
install.packages('docopt') # CLI description language
You’ll then need to download the shell script from my github and put dplyr.sh
somewhere in your path.
git clone https://github.com/coolbutuseless/dplyr-cli
Example data
write.csv(mtcars, "mtcars.csv", row.names = FALSE)
Example 1 - Basic Usage
# cat contents of input CSV into dplyr-cli.
# Use '-c' to output CSV if this is the final step
cat mtcars.csv | ./dplyr.sh filter -c mpg == 21
"mpg","cyl","disp","hp","drat","wt","qsec","vs","am","gear","carb"
21,6,160,110,3.9,2.62,16.46,0,1,4,4
21,6,160,110,3.9,2.875,17.02,0,1,4,4
# Put quotes around any commands which contain special characters like <>()
cat mtcars.csv | ./dplyr.sh filter -c "mpg < 11"
"mpg","cyl","disp","hp","drat","wt","qsec","vs","am","gear","carb"
10.4,8,472,205,2.93,5.25,17.98,0,0,3,4
10.4,8,460,215,3,5.424,17.82,0,0,3,4
# Combine dplyr commands with shell 'head' command
./dplyr.sh select --file mtcars.csv -c cyl | head -n 6
"cyl"
6
6
4
6
8
Example 2 - Simple piping of commands (with shell pipe, not magrittr pipe)
cat mtcars.csv | \
./dplyr.sh mutate "cyl2 = 2 * cyl" | \
./dplyr.sh filter "cyl == 8" | \
./dplyr.sh kable
| mpg| cyl| disp| hp| drat| wt| qsec| vs| am| gear| carb| cyl2|
|----:|---:|-----:|---:|----:|-----:|-----:|--:|--:|----:|----:|----:|
| 18.7| 8| 360.0| 175| 3.15| 3.440| 17.02| 0| 0| 3| 2| 16|
| 14.3| 8| 360.0| 245| 3.21| 3.570| 15.84| 0| 0| 3| 4| 16|
| 16.4| 8| 275.8| 180| 3.07| 4.070| 17.40| 0| 0| 3| 3| 16|
| 17.3| 8| 275.8| 180| 3.07| 3.730| 17.60| 0| 0| 3| 3| 16|
| 15.2| 8| 275.8| 180| 3.07| 3.780| 18.00| 0| 0| 3| 3| 16|
| 10.4| 8| 472.0| 205| 2.93| 5.250| 17.98| 0| 0| 3| 4| 16|
| 10.4| 8| 460.0| 215| 3.00| 5.424| 17.82| 0| 0| 3| 4| 16|
| 14.7| 8| 440.0| 230| 3.23| 5.345| 17.42| 0| 0| 3| 4| 16|
| 15.5| 8| 318.0| 150| 2.76| 3.520| 16.87| 0| 0| 3| 2| 16|
| 15.2| 8| 304.0| 150| 3.15| 3.435| 17.30| 0| 0| 3| 2| 16|
| 13.3| 8| 350.0| 245| 3.73| 3.840| 15.41| 0| 0| 3| 4| 16|
| 19.2| 8| 400.0| 175| 3.08| 3.845| 17.05| 0| 0| 3| 2| 16|
| 15.8| 8| 351.0| 264| 4.22| 3.170| 14.50| 0| 1| 5| 4| 16|
| 15.0| 8| 301.0| 335| 3.54| 3.570| 14.60| 0| 1| 5| 8| 16|
Example 3 - set up some aliases for convenience
alias mutate="./dplyr.sh mutate"
alias filter="./dplyr.sh filter"
alias select="./dplyr.sh select"
alias summarise="./dplyr.sh summarise"
alias group_by="./dplyr.sh group_by"
alias ungroup="./dplyr.sh ungroup"
alias count="./dplyr.sh count"
alias arrange="./dplyr.sh arrange"
alias kable="./dplyr.sh kable"
cat mtcars.csv | group_by cyl | summarise "mpg = mean(mpg)" | kable
| cyl| mpg|
|---:|--------:|
| 4| 26.66364|
| 6| 19.74286|
| 8| 15.10000|
Security warning
dplyr-cli uses eval(parse(text = ...)) on user input. Do not expose this
program to the internet or random users under any circumstances.