Introduction
Many structured formats for documents and data have a natural nested hierarchy.
That is, there are lower-level elements nested within higher-level structures.
A great example of nested document creation using nested function calls is HTML creation with {htmltools}. In the code below, the call to create the body() is nested within the function call to create the top-level html() document.
The structure of the nested function calls in R matches the structure of the created HTML document. This is both aesthetically pleasing and low friction, i.e. it’s easy to work out the structure of the input if you know what you want as the output (and vice versa).
library(htmltools)
tags$html(
  tags$body(
    tags$p('...'),
    tags$p('...')
  )
)
<html>
  <body>
    <p> ... </p>
    <p> ... </p>
  </body>
</html>
However, there are many document formats where the current tools do not support this idea of nested creation. This may be because the underlying C library or API doesn’t really support the concept.
A (contrived) example of this is the creation of {grid} graphics with nested viewports.
The following code creates two nested viewports and then draws squares within them.
I’m calling this style of document creation linear (in contrast to nested).
library(grid)
grid.newpage()
vp1 <- viewport(x = 0.5, y = 0.5, angle = 22.5)
pushViewport(vp1)
grid.rect(width=20, height=20, default.units = 'mm', gp = gpar(fill='hotpink'))
vp2 <- viewport(x = 0.5, y = 0.5, angle = 22.5)
pushViewport(vp2)
grid.rect(width=10, height=10, default.units = 'mm', gp = gpar(fill='blue'))
A hypothetical alternative nested syntax for grid graphics creation could look like the following code.
viewport(
  angle = 22.5,
  grid.rect(...),
  viewport(
    angle = 22.5,
    grid.rect(...)
  )
)
In the above code, the nesting makes clearer the relationship between the viewports and the rectangles.
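For what it’s worth, {grid} can get part of the way towards this nested style through its grob interface. The following is only a minimal sketch and not the hypothetical syntax above: it assumes that wrapping each rectangle in grobTree() with a vp argument is an acceptable stand-in for nested viewport() calls. The names inner and outer are mine.

```r
library(grid)

# Inner group: a blue square drawn inside its own rotated viewport
inner <- grobTree(
  rectGrob(width = 10, height = 10, default.units = "mm",
           gp = gpar(fill = "blue")),
  vp = viewport(x = 0.5, y = 0.5, angle = 22.5)
)

# Outer group: a pink square plus the inner group as a child.
# The inner viewport is pushed within the outer one when drawing.
outer <- grobTree(
  rectGrob(width = 20, height = 20, default.units = "mm",
           gp = gpar(fill = "hotpink")),
  inner,
  vp = viewport(x = 0.5, y = 0.5, angle = 22.5)
)

grid.newpage()
grid.draw(outer)
```

The nesting here lives in the grob tree rather than in the viewport() calls themselves, so it is closer to the spirit of the hypothetical syntax than to its letter.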
My use-case: document creation
I am working through a situation where the tools available are linear (because of the underlying C library), but I want a nested structure exposed to the user.
In its simplest form, my use case is for the creation of a document consisting of multiple paragraphs. Each paragraph in turn consists of multiple sentences.
document
  paragraph
    sentence
    sentence
  paragraph
    sentence
    sentence
    sentence
The constraints on this problem are:
- The creation of the document, paragraph and sentence should each be separate functions.
- There is a document object which all the functions manipulate.
- The final document is returned to the user.
The following post details:
- The current linear structure for document creation
- A nested structure for document creation
Current linear function structure
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Document is stored in a list
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
new_doc <- function() {
  list(par = 0, words = list())
}
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Creating a new paragraph increases the 'par' counts and prepares some
# space for the words in this paragraph
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
new_par <- function(doc) {
  doc$par <- doc$par + 1
  doc$words[[doc$par]] <- list()
  doc
}
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Add a sentence to the current paragraph
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
new_sentence <- function(doc, words) {
  doc$words[[doc$par]] <- append(doc$words[[doc$par]], words)
  doc
}
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Simple print method
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
print_doc <- function(doc) {
  for (i in seq_along(doc$words)) {
    cat(strwrap(paste(doc$words[[i]], collapse = " "), width = 20), sep = "\n")
    if (i != length(doc$words)) cat("\n")
  }
}
Linear function calls
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Modify the `document` with every call
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
my_doc <- new_doc()
my_doc <- new_par(my_doc)
my_doc <- new_sentence(my_doc, "Hello there.")
my_doc <- new_sentence(my_doc, "My name is Greg.")
my_doc <- new_par(my_doc)
my_doc <- new_sentence(my_doc, "Start of second par.")
print_doc(my_doc)
#> Hello there. My
#> name is Greg.
#>
#> Start of second
#> par.
Notes
- The document object must be passed in to each function.
- The document object is modified and returned by each function.
- There is no auto indentation in R which would indicate which sentences go with which paragraphs, i.e. the linear structure of the function calls doesn’t reveal the nested structure of the document when coding.
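As an aside, the repeated assignment can be trimmed by chaining the same functions with a pipe. This is a sketch assuming R 4.1 or later (for the native |> pipe); the function definitions are repeated from above only so that the block runs on its own.

```r
# Same linear functions as above, repeated so this block is self-contained
new_doc <- function() {
  list(par = 0, words = list())
}

new_par <- function(doc) {
  doc$par <- doc$par + 1
  doc$words[[doc$par]] <- list()
  doc
}

new_sentence <- function(doc, words) {
  doc$words[[doc$par]] <- append(doc$words[[doc$par]], words)
  doc
}

# The document flows through the pipe instead of being reassigned each time
my_doc <- new_doc() |>
  new_par() |>
  new_sentence("Hello there.") |>
  new_sentence("My name is Greg.") |>
  new_par() |>
  new_sentence("Start of second par.")

# Number of sentences in each paragraph
lengths(my_doc$words)
#> [1] 2 1
```

This removes some typing, but the calls are still a flat sequence: nothing in the code shape says which sentences belong to which paragraph.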
Nested function structure
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Nested evaluation
# @param ... child elements are evaluated after the parent is created
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
doc <- function(...) {
  # Create document
  document <- as.environment(list(par = 0, words = c()))
  # Evaluate child 'par' elements within this document
  # Adapt the call for child elements to include a 'document' as argument
  for (elem in rlang::enquos(...)) {
    elem <- rlang::call_modify(elem, document = document)
    rlang::eval_tidy(elem)
  }
  as.list(document)
}
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Add a paragraph to the current document
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
par <- function(document, ...) {
  document$par <- document$par + 1
  document$words[[document$par]] <- list()
  for (elem in rlang::enquos(...)) {
    elem <- rlang::call_modify(elem, document = document)
    rlang::eval_tidy(elem, env = sys.frame())
  }
}
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Add words to the current paragraph
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
sentence <- function(document, words) {
  document$words[[document$par]] <- append(document$words[[document$par]], words)
}
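The key step in the functions above is rewriting the user’s call before it is evaluated. The following sketch isolates that one step on a single captured call. The function add_words() is a hypothetical, simplified stand-in for sentence(), used here so the block does not clash with the definitions above.

```r
library(rlang)

# Hypothetical stand-in for sentence(): appends words to a shared environment
add_words <- function(document, words) {
  document$words <- c(document$words, words)
  invisible(document)
}

# Shared, mutable document
document <- new.env()
document$words <- character(0)

# What the user writes, captured as a quosure without being evaluated
user_call <- quo(add_words("Hello there."))

# Add the `document` argument on the user's behalf
full_call <- call_modify(user_call, document = document)

# The rewritten call now has two arguments: the words (unnamed) and `document`
names(as.list(quo_get_expr(full_call))[-1])

# Evaluating the rewritten call updates the shared environment in place
eval_tidy(full_call)
document$words
```

Because the document is an environment, the modification made inside add_words() is visible to the caller without anything being returned and reassigned.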
Nested function calls
my_doc <- doc(
  par(
    sentence("Hello there."),
    sentence("My name is Greg.")
  ),
  par(
    sentence("Start of second par.")
  )
)
print_doc(my_doc)
#> Hello there. My
#> name is Greg.
#>
#> Start of second
#> par.
Notes
- The code structure now mimics the document structure.
- The function calls are adapted internally so that the user does not need to pass the document to each call. This simplifies the user’s code at the cost of complicating the functions a little.
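The same idea does not strictly need {rlang}. Below is a rough base R sketch of the approach, assuming substitute() and eval() are acceptable replacements for enquos() and eval_tidy(). The names doc_base(), par_base() and sentence_base() are hypothetical and chosen only to avoid clashing with the functions above.

```r
# Capture the unevaluated child calls, append the shared document as a
# named argument to each, then evaluate them in the caller's environment
eval_children <- function(children, document, env) {
  for (child in children) {
    child <- as.call(c(as.list(child), list(document = document)))
    eval(child, env)
  }
  invisible(NULL)
}

doc_base <- function(...) {
  # The document is an environment so children can modify it in place
  document <- new.env()
  document$par <- 0
  document$words <- list()
  children <- as.list(substitute(list(...)))[-1]
  eval_children(children, document, parent.frame())
  as.list(document)
}

par_base <- function(document, ...) {
  document$par <- document$par + 1
  document$words[[document$par]] <- list()
  children <- as.list(substitute(list(...)))[-1]
  eval_children(children, document, parent.frame())
}

sentence_base <- function(document, words) {
  document$words[[document$par]] <- append(document$words[[document$par]], words)
}

my_doc <- doc_base(
  par_base(
    sentence_base("Hello there."),
    sentence_base("My name is Greg.")
  ),
  par_base(
    sentence_base("Start of second par.")
  )
)

# Number of sentences in each paragraph
lengths(my_doc$words)
#> [1] 2 1
```

This version evaluates each child in the caller’s environment rather than carrying the environment along in a quosure, so it is a less robust trade of convenience for one fewer dependency.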