# priorityqueue

`priorityqueue`

is a package with two
priority queue implementations for R:
one written in C, and the other written in plain R.

Both of these are *max-priority-queues* i.e.
“popping” from the queue returns the value with highest priority.

These priority queues support any R object. The priority can be any finite `double()`

value.

## What’s in the box

Two priority queue implementations with identical call APIs.

- Pure R priority queue:
`rpq_create()`

,`rpq_insert(rpq, obj, priority)`

,`rpq_pop(rpq)`

- C-based priority queue:
`cpq_create()`

,`cpq_insert(cpq, obj, priority)`

,`cpq_pop(cpq)`

## Installation

You can install from GitHub with:

```
# install.package('remotes')
remotes::install_github('coolbutuseless/priorityqueue')
```

## Example

Priority queues are a data-structure to store objects along with a priority with each object.

When *popping* a value off the queue, the object with the largest priority is
returned.

```
library(priorityqueue)
rpq <- rpq_create()
rpq_is_empty(rpq)
```

`#> [1] TRUE`

```
rpq_insert(rpq, obj = "#RStats", priority = 2)
rpq_insert(rpq, obj = "hello" , priority = 3)
rpq_insert(rpq, obj = 2 * 2 , priority = 1)
rpq_length(rpq)
```

`#> [1] 3`

```
# Retrieve the objects from largest priority to smallest
rpq_pop(rpq)
```

`#> [1] "hello"`

`rpq_pop(rpq)`

`#> [1] "#RStats"`

`rpq_pop(rpq)`

`#> [1] 4`

## Rationale

I needed a data structure in R which would allow me to insert/pop items based upon
a *priority* value - what is known as a *priority queue*.

I wrote a quick solution in R that met my basic needs i.e. an environment
containing a list of objects and a vector of values, plus some standard functions
to manipulate. This code is the basis of the `rpq_*`

functions in this package.

I then wondered how this could be done with C, and as a learning exercise I
wrote a minimal priority queue in C which supported R objects. This was tidied
to be the C prioirty queue (`cpq_*`

) in this package.

## Timing

In general the C implementation is faster and uses less memory than the R implementation (no surprises there).

For larger queues, the C implementation can be ~10x faster and use 1/30th the memory.

This implementation of a priority queue compares favourably with the priority queue included in the {collections} package on CRAN.

## Click to show/hide the timing code

```
r_priority_queue <- function(N) {
set.seed(1)
rpq <- rpq_create()
for (i in 1:(N*2)) {
rpq_insert(rpq, 1, runif(1))
rpq_insert(rpq, 1, runif(1))
}
for (i in 1:N) {
rpq_insert(rpq, 1, runif(1))
rpq_insert(rpq, 1, runif(1))
rpq_pop(rpq)
rpq_pop(rpq)
rpq_pop(rpq)
}
}
cpq_priority_queue <- function(N) {
set.seed(1)
cpq <- cpq_create()
for (i in 1:(N*2)) {
cpq_insert(cpq, 1, runif(1))
cpq_insert(cpq, 1, runif(1))
}
for (i in 1:N) {
cpq_insert(cpq, 1, runif(1))
cpq_insert(cpq, 1, runif(1))
cpq_pop(cpq)
cpq_pop(cpq)
cpq_pop(cpq)
}
}
collections_priority_queue <- function(N) {
set.seed(1)
colq <- collections::priority_queue()
for (i in 1:(N*2)) {
colq$push(1, runif(1))
colq$push(1, runif(1))
}
for (i in 1:N) {
colq$push(1, runif(1))
colq$push(1, runif(1))
colq$pop()
colq$pop()
colq$pop()
}
}
res <- bench::press(
N = c(10, 100, 1000),
bench::mark(
`Plain R priority queue` = r_priority_queue(N),
`C priority queue` = cpq_priority_queue(N),
`collections::priority_queue()` = collections_priority_queue(N)
)
)
```

`#> Warning: Some expressions had a GC in every iteration; so filtering is disabled.`

expression | N | itr/sec | mem_alloc |
---|---|---|---|

Plain R priority queue | 10 | 774.412196 | 408.02KB |

C priority queue | 10 | 3443.282577 | 226.52KB |

collections::priority_queue() | 10 | 3271.967561 | 278.96KB |

Plain R priority queue | 100 | 61.281542 | 7.94MB |

C priority queue | 100 | 370.743903 | 1.48MB |

collections::priority_queue() | 100 | 317.896057 | 1.47MB |

Plain R priority queue | 1000 | 1.463592 | 645.93MB |

C priority queue | 1000 | 36.510057 | 14.73MB |

collections::priority_queue() | 1000 | 29.865650 | 14.71MB |

## Technical Bits

Plain R priority queue

- Implemented as an environment with storage for a list of objects and a vector of priorities.
- New objects and their priorities are appended to the list/vector in the environment.
- When a value is
*popped*, the maximum priority is found (using`whichmax()`

) and the matching object is returned

C priority queue

- A heap-based priority queue.
- Creating a
`cpq`

creates an ExternalPointer to a`cpq_t`

object which contains the queue and other information all entirely within C. - An R list is attached to this ExternalPointer in C (as a
*protected*item on the pointer). When new R objects are added to the priority queue, they are also added to this list to prevent them from being garbage collected.

## Acknowledgements

- R Core for developing and maintaining the language.
- CRAN maintainers, for patiently shepherding packages onto CRAN and maintaining the repository