5 Assembling bytecode (II)
The package {rbytecode} provides a means of creating R bytecode objects from a text representation of the instructions. This text is referred to as R bytecode assembly.
The compilation process in {rbytecode} differs from R’s built-in bytecode compiler in that it does not take R code as input; instead, the input is a sequence of R bytecode instructions.
5.1 High-level description
A high-level overview of the assembly process is shown below:
- R bytecode assembly (text) is compiled using rbytecode::asm()
- The output is a standard R bytecode object.
- This bytecode object can then be executed by eval(), where it is passed to R’s bytecode VM to produce a result.
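For comparison, the built-in path (R code in, bytecode object out) can be sketched using only the {compiler} package that ships with R. This is a minimal illustration and does not use {rbytecode}; the point is that the object handed to eval() is an ordinary bytecode object, which is the same kind of object the list above describes as the output of rbytecode::asm().

```r
library(compiler)

# Built-in path: R code goes in, a bytecode object comes out.
bc <- compile(quote(1 + 2))

# The result is a standard bytecode object.
bc_type <- typeof(bc)

# eval() hands the bytecode object to R's bytecode VM.
result <- eval(bc)

# disassemble() exposes the instructions and constants stored in the object.
disassemble(bc)
```

Inspecting the output of disassemble() is a convenient way to see the instruction names that bytecode assembly is built from.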
5.2 Details on compilation
The process of compiling bytecode assembly to a bytecode object happens in two phases:
- The assembly code is parsed into a data.frame object called bcdf, i.e. bytecode data.frame
- The bcdf is then translated into a bytecode object; it is the main data structure for this translation.
Creating a bytecode object from the bcdf relies on key data structures and methods from the {compiler} package.
For each row in the bcdf:
- The integer code for the instruction is its position in compiler:::Opcodes.names (adjusted for C’s 0-based indexing)
- The argument count is taken from compiler:::Opcodes.argc
- The type of each argument is recorded in the {rbytecode} package’s main meta-information data structure: rbytecode::ops
- Each argument of the instruction in this row of the bcdf is placed in the consts list
- The full instruction is then a vector of integers comprising:
  - the instruction opcode
  - integer indexes into the consts list, one for each argument
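The per-row steps above can be sketched roughly as follows. This is not the {rbytecode} implementation: encode_row() is a hypothetical helper written for illustration, it takes the full opcode name as stored in compiler:::Opcodes.names (for example "LDCONST.OP"), and it assumes that indexes into the consts list are 0-based. It also ignores the argument type information held in rbytecode::ops.

```r
# Hypothetical helper for illustration only; not part of {rbytecode}.
encode_row <- function(op_name, args, consts) {
  # Opcode: position in Opcodes.names, shifted to C's 0-based indexing.
  opcode <- match(op_name, compiler:::Opcodes.names) - 1L
  stopifnot(!is.na(opcode))

  # Argument count expected for this instruction.
  argc <- as.integer(compiler:::Opcodes.argc[[op_name]])
  stopifnot(length(args) == argc)

  # Append each argument to consts and record its index.
  # Assumption: indexes into consts are 0-based.
  idx <- integer(0)
  for (a in args) {
    consts[[length(consts) + 1L]] <- a
    idx <- c(idx, length(consts) - 1L)
  }

  # Full instruction: opcode followed by one consts index per argument.
  list(code = c(opcode, idx), consts = consts)
}

# One instruction with a single argument, starting from an empty consts list.
step1 <- encode_row("LDCONST.OP", list(42), consts = list())

# An instruction with no arguments leaves consts unchanged.
step2 <- encode_row("RETURN.OP", list(), consts = step1$consts)
```

Returning the updated consts list alongside the integer vector keeps the helper free of side effects, so successive rows can be encoded by threading consts through each call.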
See Section 13 for a walkthrough of the compilation process on some simple bytecode assembly.