regular-vm/specification

The RISC Educational Generalized UtiLity ARchitecture

The RISC Educational Generalized UtiLity ARchitecture (REGULAR) is a capable, general-purpose machine with a minimalist design that is easy to work with while also being well suited for a variety of computational tasks. The little-endian, 32-bit architecture boasts 32 scalar registers, of which 31 (all but the program counter) are available for general-purpose use. The instruction set is cleanly designed to simplify decoding, and it is kept deliberately minimal to facilitate different programming styles, reduce implementation complexity, and allow for expansion in the future.

Registers

REGULAR exposes 32 32-bit registers to the programmer, named sequentially from r0 to r31. All of them are identical and may be used interchangeably as the argument to any instruction where appropriate. The one special case is that the processor reads from r0 to determine control flow; many assemblers support a pc alias (for "program counter", naturally) for this register. This exception has no effect on its use: it may be read from and operated on just like any other register. At the beginning of execution of each instruction, the processor ensures that r0 refers to the address of the instruction immediately after the current one.
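The behaviour of r0 can be illustrated with a minimal sketch of one processor step. This is not taken from a reference implementation: the names `step`, `fetch_word`, and `execute` are hypothetical, and the sketch assumes byte-addressed memory, so the following instruction sits 4 bytes after the current one.

```python
MASK32 = 0xFFFFFFFF
INSTRUCTION_BYTES = 4  # each instruction is 32 bits wide


def step(regs, fetch_word, execute):
    """Run one instruction. All three parameters are hypothetical helpers.

    regs       -- list of 32 integers, with regs[0] acting as r0 (pc)
    fetch_word -- callable mapping a byte address to a 32-bit instruction word
    execute    -- callable that applies an instruction word to regs
    """
    current = regs[0]
    word = fetch_word(current)
    # Advance r0 before executing, so the instruction sees the next address.
    regs[0] = (current + INSTRUCTION_BYTES) & MASK32
    execute(word, regs)


def copy_pc_to_r1(word, regs):
    # Stand-in for "mov r1 r0": r0 is read like any other register.
    regs[1] = regs[0]


def jump_to_r2(word, regs):
    # Stand-in for "mov r0 r2": writing r0 redirects control flow.
    regs[0] = regs[2]


regs = [0] * 32
regs[0] = 0x100
regs[2] = 0x200

step(regs, lambda address: 0, copy_pc_to_r1)
after_copy = (regs[0], regs[1])

step(regs, lambda address: 0, jump_to_r2)
after_jump = regs[0]
```

In this sketch, an instruction that reads r0 observes the address of its successor, and an instruction that writes r0 acts as a jump.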

Special note for stack-based programming

REGULAR has no special register to indicate the current position of the stack pointer ("the top of the stack", in procedural languages that implement control flow in this way). When implementing code that is expected to function in this manner, it is recommended that one of the general-purpose registers be reserved to serve this purpose. Frequently this is r31, and many assemblers alias it to sp for this use. In addition, an assembler temporary (commonly r30, aliased to at) is often reserved for stack manipulation, in which case it has an unspecified value for the purposes of general use.

Suggested calling convention

To simplify function calls for control flow using an execution stack, it is recommended to implement the interprocedural application binary interface (ABI) as follows:

  • Arguments are passed on the stack, in order, followed by any required context. A function's return value, if any, is placed on the stack after this, and this structure is torn down by the caller.
  • All registers are callee-saved, except for the assembler temporary. Any registers used by the callee should first be saved to the stack so they can be restored prior to return from the procedure.

In pictures, the stack layout throughout the various stages of a call would look like this:

[Figure: stack layout at each stage of a call]

Instructions

Each REGULAR instruction is 32 bits wide. The first byte encodes the opcode, while the remaining three bytes encode register information or immediate values. For the purposes of instruction encoding, op is the numerical value of the opcode identifying the instruction, rA, rB, rC, … are registers (not necessarily distinct) with the letters standing in for the register's number, and imm is an immediate constant embedded in the instruction.

Instruction types

The following are possible encodings for the types of instructions that REGULAR supports:

op

Bits  0-7  8-31
Use   op   ignored

op rA

Bits  0-7  8-15  16-31
Use   op   A     ignored

op rA imm

Bits  0-7  8-15  16-31
Use   op   A     imm

op rA rB

Bits  0-7  8-15  16-23  24-31
Use   op   A     B      ignored

op rA rB imm

Bits  0-7  8-15  16-23  24-31
Use   op   A     B      imm

op rA rB rC

Bits  0-7  8-15  16-23  24-31
Use   op   A     B      C

The bits of each component of these instructions are laid out so that the lower bits of their numerical value correspond to lower bit numbers for the instruction as a whole.
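As an illustration of these layouts, here is a minimal encoding and decoding sketch. It assumes that each register field occupies one byte, which follows from the three-register format filling the three bytes after the opcode and from the 16-bit immediate of set. The function names are hypothetical and not part of the specification.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        return value - (1 << bits)
    return value


def encode_rrr(op, a, b, c):
    """Encode the `op rA rB rC` form; op occupies the lowest byte."""
    return (op & 0xFF) | ((a & 0xFF) << 8) | ((b & 0xFF) << 16) | ((c & 0xFF) << 24)


def encode_ri(op, a, imm):
    """Encode the `op rA imm` form with a 16-bit immediate in bits 16-31."""
    return (op & 0xFF) | ((a & 0xFF) << 8) | ((imm & 0xFFFF) << 16)


def decode(word):
    """Split a 32-bit instruction word into every field it might carry."""
    return {
        "op": word & 0xFF,
        "a": (word >> 8) & 0xFF,
        "b": (word >> 16) & 0xFF,
        "c": (word >> 24) & 0xFF,
        "imm16": sign_extend(word >> 16, 16),
    }


add_word = encode_rrr(0x01, 1, 2, 3)   # add r1 r2 r3
set_word = encode_ri(0x0B, 5, -1)      # set r5 -1

# On a little-endian machine the opcode is the first byte in memory.
add_bytes = add_word.to_bytes(4, "little")
```

Under these assumptions, `add r1 r2 r3` is stored in memory as the bytes 01 01 02 03, and decoding `set_word` recovers an immediate of -1.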

Instruction set

Name  Encoding       Description
nop   0x00           Perform no operation.
add   0x01 rA rB rC  Perform an unsigned 32-bit addition of the values contained in rB and rC and store the result in rA.
sub   0x02 rA rB rC  Perform an unsigned 32-bit subtraction of the value contained in rC from the value contained in rB and store the result in rA.
and   0x03 rA rB rC  Perform a bitwise AND operation of the values contained in rB and rC and store the result of the operation in rA.
orr   0x04 rA rB rC  Perform a bitwise OR operation of the values contained in rB and rC and store the result of the operation in rA.
xor   0x05 rA rB rC  Perform a bitwise XOR operation of the values contained in rB and rC and store the result of the operation in rA.
not   0x06 rA rB     Perform a bitwise NOT of the value contained in rB and store the result in rA.
lsh   0x07 rA rB rC  Logically shift the value in rB by the number of bits represented by the signed quantity in rC. If this value is positive, shift the value contained in rB left by this many bits; if it is negative the shift will be to the right by the absolute value of the value in rC. In both instances newly vacated bits will be zeroed. If the value in rC is outside of the range (-32, 32) the result is undefined.
ash   0x08 rA rB rC  Arithmetically shift the value in rB by the number of bits represented by the signed quantity in rC. If this value is positive, shift the value contained in rB left by this many bits; if it is negative the shift will be to the right by the absolute value of the value in rC. Newly vacated bits will be zeroed in the former case and be a duplicate of the most significant bit in the latter. If the value in rC is outside of the range (-32, 32) the result is undefined.
tcu   0x09 rA rB rC  Subtract the unsigned value stored in rC from the unsigned value stored in rB with arbitrary precision and store the sign of the result in rA.
tcs   0x0a rA rB rC  Subtract the signed value stored in rC from the signed value stored in rB with arbitrary precision and store the sign of the result in rA.
set   0x0b rA imm    Store, with sign extension, the 16-bit signed value imm into rA.
mov   0x0c rA rB     Copy the value from rB into rA.
ldw   0x0d rA rB     Read a 32-bit word from the memory address referred to by rB and store the value into rA. If the address in rB is not word-aligned, the result is implementation-defined.
stw   0x0e rA rB     Store the value in rB as a 32-bit value at the memory address referred to by rA. If the address in rA is not word-aligned, the result is implementation-defined.
ldb   0x0f rA rB     Read an 8-bit unsigned byte from the memory address referred to by rB and store the value into rA. The upper 24 bits of rA are unaffected.
stb   0x10 rA rB     Store the lower 8 bits of the value in rB as a byte at the memory address referred to by rA.
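The shift and comparison instructions are the least obvious entries in the table, so a sketch of their semantics may help. It rests on two assumptions that the table does not spell out: the "sign of the result" is taken to be -1, 0, or 1, with -1 stored as the 32-bit pattern 0xFFFFFFFF, and out-of-range shift amounts (undefined by the specification) raise an error here. The function names simply mirror the mnemonics.

```python
MASK32 = 0xFFFFFFFF


def to_signed(value):
    """Reinterpret a 32-bit pattern as a two's-complement signed integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def _shift_amount(c):
    amount = to_signed(c)
    if not -32 < amount < 32:
        # The specification leaves this case undefined; raising is our choice.
        raise ValueError("shift amount outside (-32, 32)")
    return amount


def lsh(b, c):
    """Logical shift: positive amounts go left, negative go right, zeros fill."""
    amount = _shift_amount(c)
    if amount >= 0:
        return (b << amount) & MASK32
    return (b & MASK32) >> -amount


def ash(b, c):
    """Arithmetic shift: right shifts replicate the most significant bit."""
    amount = _shift_amount(c)
    if amount >= 0:
        return (b << amount) & MASK32
    return (to_signed(b) >> -amount) & MASK32


def _sign(n):
    return (n > 0) - (n < 0)


def tcu(b, c):
    """Compare as unsigned values; Python integers give arbitrary precision."""
    return _sign((b & MASK32) - (c & MASK32)) & MASK32


def tcs(b, c):
    """Compare as signed values; no overflow is possible in the subtraction."""
    return _sign(to_signed(b) - to_signed(c)) & MASK32


minus_four = -4 & MASK32  # the register pattern for a shift amount of -4
```

The same operands can compare differently under the two instructions: with rB = 1 and rC = 0xFFFFFFFF, tcu sees 1 against 4294967295 and yields a negative sign, while tcs sees 1 against -1 and yields a positive one.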

To complement this somewhat limited set, most assemblers implement more complex pseudoinstructions built on top of these base instructions by taking advantage of the assembler temporary.
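As a sketch of how such an expansion might work, the following models two hypothetical pseudoinstructions, `push` and `pop`, as sequences of base instructions. Neither name comes from this specification. The sketch assumes r30 is the assembler temporary, r31 is the stack pointer, and the stack grows toward lower addresses, a direction the specification does not mandate.

```python
AT = 30  # assembler temporary, commonly r30
SP = 31  # stack pointer, commonly r31
WORD_BYTES = 4


def expand_push(source_register):
    """Expand a hypothetical `push rX` into base instructions.

    Assumes a stack that grows toward lower addresses.
    """
    return [
        ("set", AT, WORD_BYTES),       # at = 4
        ("sub", SP, SP, AT),           # sp = sp - at
        ("stw", SP, source_register),  # memory[sp] = rX
    ]


def expand_pop(destination_register):
    """Expand a hypothetical `pop rX`, the inverse of expand_push."""
    return [
        ("ldw", destination_register, SP),  # rX = memory[sp]
        ("set", AT, WORD_BYTES),            # at = 4
        ("add", SP, SP, AT),                # sp = sp + at
    ]


push_r5 = expand_push(5)
pop_r5 = expand_pop(5)
```

Both expansions overwrite the assembler temporary, which is why its value is unspecified for general use once it has been reserved in this way.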
