Introduction
What is Universal Machine Code?
UMC Instructions
Misc
| Mneumonic | Operand 1 | Operand 2 | Description |
|---|---|---|---|
mov | Reg | Any | Sets op1 = op2 |
cast | Reg | Any | Sets op1 = op2, casting |
bcast | Integer Reg | Reg | Bit / Reinterpret cast |
abs | Numeric | Signed Numeric | Gets positive value |
nop | Does nothing | ||
dbg | Reg | Prints the value and register for debugging |
Arithmetic
| Mneumonic | Operand 1 | Operand 2 | Operand 3 | Description |
|---|---|---|---|---|
add | Numeric Reg | Numeric | Numeric | op1 = op2 + op3 |
sub | Numeric Reg | Numeric | Numeric | op1 = op2 - op3 |
mul | Numeric Reg | Numeric | Numeric | op1 = op2 * op3 |
div | Numeric Reg | Numeric | Numeric | op1 = op2 / op3 |
mod | Numeric Reg | Numeric | Numeric | op1 = op2 mod op3 |
Jumps, Calls and Conditional Branches
| Mneumonic | Operand 1 | Operand 2 | Description |
|---|---|---|---|
jmp | Label | Unconditional jump to the given label | |
jal | Label | Instr Reg | Jumps to the given label, setting the instruction register to the next instruction |
bz | Label | Numeric | Jump to label if op2 is zero |
bnz | Label | Numeric | Jump to label if op2 is not zero |
Comparison Operations
Comparison operations take an unsigned destination register (minimum width 1), and sets the destination to 1 if true, otherwise sets to 0. Comparisons are only allowed between the same register sets.
| Mneumonic | Operand 1 | Operand 2 | Operand 3 | Description |
|---|---|---|---|---|
eq | Unsigned Reg | Numeric | Numeric | \(op2 = op3\) |
gt | Unsigned Reg | Numeric | Numeric | \(op2 \gt op1\) |
gte | Unsigned Reg | Numeric | Numeric | \(op2 \ge op3\) |
lt | Unsigned Reg | Numeric | Numeric | \(op2 \lt op3\) |
lte | Unsigned Reg | Numeric | Numeric | \(op2 \le op3\) |
Bitwise Operations
All (signed and unsigned) integers are defined to be stored in twos-complement representation. The following bitwise operations work only between integer types.
| Mneumonic | Operand 1 | Operand 2 | Operand 3 | Description |
|---|---|---|---|---|
and | Integer Reg | Integer | Integer | Bitwise AND |
or | Integer Reg | Integer | Integer | Bitwise OR |
xor | Integer Reg | Integer | Integer | Bitwise XOR |
not | Integer Reg | Integer | Integer | Bitwise NOT |
Memory
Blocks of memory can be allocated with the alloc instruction.
| Mneumonic | Operand 1 | Operand 2 | Description |
|---|---|---|---|
alloc | Address Reg | Unsigned | Allocates a continguous block of memory X bytes long |
free | Address Reg | De-allocates a block of memory | |
load | Reg | Address Reg | Loads the given register from the address |
store | Address Reg | Any | Stores a value into the address |
msize | Unsigned Reg | Sets op1 to the size of a memory address in bytes | |
isize | Unsigned Reg | Sets op1 to the size of a instruction address in bytes |
UMC Registers
There are a unlimited number of registers in UMC. Each type (including width) is a separate set of registers.
Register Types
| Register Set | Syntax | Widths | Description |
|---|---|---|---|
| Unsigned Int | u32:0, u8:0, uX:0 | Any | Unsigned Integers |
| Signed Int | i32:0, i64:0 | Any | Signed Integers |
| Floats | f32:0, f64:0 | 32/64 | Floating Point Numbers |
| Memory Address | m:0, m:1 | At least 32-bit | Memory location |
| Instruction Address | n:0, n:1 | At least 32-bit | Instruction location |
Vector registers
All register types can be extended to a vector register type.
| Register Set | Syntax | Description |
|---|---|---|
| Unsigned Int | u32x4:0 | Vector of 4 Unsigned 32-bit Integers |
| Signed Int | i64x4:0 | Vector of 4 Signed 64-bit Integers |
| Float | f32x8:1 | Vector of 8, 32-bit Floats |
| Memory Address | mx8:0 | Vector of 8 memory addresses |
| Instruction Address | nx8:0 | Vector of 8 instruction addresses |
Binary Format
This RFC defines the binary format for UMC.
All UMC binary files begin with the following 16-byte magic data:
Magic: 0x7F 0x55 0x4D 0x43 0x20 0x42 0x79 0x74 0x65 0x64 0x6F 0x64 0x65 0x00 0x00 0x00
This translates to the following ascii string: \x7FUMC Bytecode\0\0\0
All UMC binary files are then immediately followed by two bytes describing the UMC binary format version:
Major Version: 1 byte
Minor Version: 1 byte
If a UMC Virtual Machine encounters an incompatible major version, it should refuse to execute the program. If a UMC Virtual Machine encounters a minor version higher than what is supported for that major version by the Virtual Machine, then it should refuse to execute the program. Lower minor versions than are supported by the UMC VM for a particular major version will work correctly on VMs that correctly follow the UMC Specification for a higher minor version.
Little Endian Base 128 (LEB128)
The UMC Binary File format extensively uses LEB128 for integers.
Version 0.3
Type Header Table
The first header found in a 1.0 UMC Binary file contains the register type header, which maps type indices to types
This table first starts with the number of entries:
Entry Count: Unsigned LEB128
Followed by each entry in series:
Control Byte:
- Length Flag: 1 bit
- Constant Flag: 1 bit
- Reserved (0): 5 bits
- Register Type: 3 bits
Width: Unsigned LEB128
Length: Unsigned LEB128 (Only if indicated in the control byte)
Duplicate entries are currently not permitted, as they may be significant in future versions.
Duplicate entries are permitted for register entries (constant flag not set).
A instruction references a different register if it has a different type index, even if the register index is the same.
This allows using encoding more registers of the same type in the same number of bytes.
For example, it allows 256 unique register operands to be encoded instead of 128 in two bytes.
The register types are encoded as follows:
| Register Type | Encoding |
|---|---|
| Unsigned Integer | 0b000 |
| Signed Integer | 0b001 |
| Float | 0b010 |
| Memory Address | 0b011 |
| Instruction Address | 0b100 |
Pre-initialised memory Table
UMC supports pre-initialised memory regions which can be referenced by a memory label, which is encoded as a memory constant. The pre-initialised memory table immediately follows the RT Header, and begins with the number of entries:
Entry Count: Unsigned LEB128
Then for each entry is stored contiguously:
Data Length: Unsigned LEB128
Data: Variable number of bytes
Instruction Encoding
Immediately after the type header table, are all instructions, one after the other.
The first byte of each instruction is the opcode:
Op Code: 1 byte
| Op Code | Value |
|---|---|
| NOP | 0b000000 |
| MOV | 0b000001 |
| ADD | 0b000010 |
| SUB | 0b000011 |
| MUL | 0b000100 |
| DIV | 0b000101 |
| MOD | 0b000110 |
| JMP | 0b001000 |
| JAL | 0b001001 |
| BZ | 0b001010 |
| BNZ | 0b001011 |
| EQ | 0b001100 |
| GT | 0b001101 |
| GTE | 0b001110 |
| AND | 0b010000 |
| OR | 0b010001 |
| XOR | 0b010010 |
| NOT | 0b010011 |
| ALLOC | 0b100000 |
| FREE | 0b100001 |
| LOAD | 0b100010 |
| STORE | 0b100011 |
| SIZE | 0b100100 |
| CAST | 0b110001 |
| ECALL | 0b110100 |
| DBG | 0b111111 |
Fixed Instruction Encoding
The majority of instructions are encoded using this format, since for a given op code they have fixed number of arguments. Each operand begins with its type:
Operand Type: Unsigned LEB128
This operand type is an index in the Type Header Table. Depending on this entry, one of the following values are then decoded:
Register operands are encoded as:
Register Index: Unsigned LEB128
Unsigned integer constants are encoded as:
Constant Value: Unsigned LEB128
Signed integer constants are encoded as:
Constant Value: Signed LEB128
Float constants are encoded as:
Constant Value: 4 or 8 bytes
The number of bytes for the constant value is dictated by the width’s bytes, rounded up. The only supported float constant types are f32 and f64, which use 4 and 8 byte constants respectively.
Memory Label Constants are encoded as:
Constant Value: Unsigned LEB128
These refer to an index in the pre-initialised memory table.
Instruction Label Constants are encoded as:
Constant Value: Unsigned LEB128
Variable Instruction Encoding
This encoding is used for the ECALL opcode.
The variable instruction encoding first starts with the operand count:
Operand Count: Unsigned LEB128
And then the encoding is the same as the fixed instruction encoding.
Sizeof Instruction Encoding
msize and isize are encoded like a fixed instruction with two operands:
- Register Operand, but with no index
- Unsigned Register Operand
Environment Calls
Environment calls are like system calls in UMC. An environment call requests the the VM perform privilleged, (likely platform specific) action on its behalf.
UMC External Call Table
| Name | id | Details | Short Description |
|---|---|---|---|
exit | 0x0 | exit | Halt the UMC VM with the given exit code |
open | 0x1 | open | Open a file |
close | 0x2 | close | Close a file |
read | 0x3 | read | Read bytes from a file |
write | 0x4 | write | Write bytes to a file |
getarg | 0x10 | getarg | Retrieve an arbitrary bytes argument passed to the UMC VM by index |
External Call List
exit
Exits with the given exit code
Parameters:
- Result Register (Ignored)
- External Call Code
- Exit Code
ecall u1:0, 0x0, #1 ; Exit the VM with exit code 1
open
Open attempts to open a file with the given filename. On success, it returns a positive integer filename handle
Parameters:
- Result Register (signed integer)
- External Call Code
- Null terminated filename (memory address)
ecall i32:1, 0x1, m:0
close
Close the given file. On success, 1 is stored into the result register Parameters:
- Result Register (unsigned, at least 1 bit)
- External Call Code
- File handle (unsigned integer)
ecall u1:0, 0x2, u32:1
read
Read up to the given number of bytes from the file. On success, the number of bytes read is stored into the result register. Parameters:
- Result Register (unsigned)
- External Call Code
- File handle (unsigned integer)
- Buffer (memory address)
- Maximum number of bytes to read (unsigned integer)
ecall u64:1, 0x3, u32:0, m:1, u64:0
write
Write the given number of bytes from a memory address into a file Stores the number of bytes successfully written into the result register Parameters:
- Result Register (unsigned)
- External Call Code
- File handle (unsigned integer)
- Buffer (memory address)
- Maximum number of bytes to write (unsigned integer)
ecall u64:0, 0x4, u32:0, m:2, u64:1
getarg
Retrieve an user-provided argument given the UMC Program at runtime.
The first argument is always a memory address of the null-terminated name of the program. The remaining arguments are provided by the user when the UMC VM is started.
All values are copies of the original arguments, and the same arguments can be retrieved multiple times in the program. Memory addresses are also “copies”, but values may be stored in them - this can allow returning parameters to the caller.
UMC programs that use this environment call should document their required parameters.
Calling getarg with type or argument number that doesn’t match the provided arguments will cause the program to terminate.
Examples:
ecall m:0, 0x10, #0 ; Get 0th argument - null terminated name of the program
ecall m:1, 0x10, #1 ; Get first argument as a memory address
ecall u64:0, 0x10, #1 ; Get first argument as a 64-bit unsigned integer
ecall i64:1, 0x10, #2 ; Get second argument as a 64-bit signed integer
ecall f32:0, 0x10, #3 ; Get third argument as a 32-bit float
Basic Architecture
The original proposition of UMC is to create a register-based VM bytecode (as opposed to a stack-based VM). LuaJIT successfull implemented a register-based bytecode with significant performance improvements [1]
Syntax Proposal - Typed Register Sets (Accepted)
;;;;; FULLY TYPED REGISTER SETS PROTOTYPE
; Pros:
; No implicit casting since independent register sets
; - Vector registers can't be reshaped
; - No undefined behaviour
; - Easy to implement registers as vecs
; Cons:
; Very verbose
; - Can be mitigated by implicit register type
; add u32:0, :0, :1
fib:
; i32:0 (n)
b i32:0 < $0, .L4
mov u32:0, $0 ; int cur = 0
mov u32:1, $1 ; int next = 1
mov u32:2, $0 ; int i = 0
.L3:
; u32:3 next
mov u32:3, u32:2 ; int next2 = next
add u32:3, u32:3, u32:0 ; next2 += cur
mov u32:0, u32:1 ; cur = next
mov u32:2, u32:3 ; next = next2
add u32:2, u32:2, $1
b u32:2 != n, .L3
.L4:
mov i32:0, u32:0
sum:
; a:0 - Array of integers
; u32:0 - Length of the array
b i32:0 == 0, .L8 ; If empty, terminate
mov i32:1, $0 ; int total = 0
shl u32:0, u32:0, $2 ; Multiply by 4 to get length in bytes
add a:1, a:0, u0 ; end address = start address + length
.L8:
load u32:2 a:0
add i32:1, i32:1, u32:2
add a:0, a:0, $4
b a:0 != a:1, .L8
.L9:
mov i32:0, $0
ret
;;;;;;;;;;; IMPLICIT MULTIPLE OPERANDS ;;;;;;;;;;;;
fib:
; i32:0 (n)
b i32:0 < $0, .L4
mov u32:0, $0 ; int cur = 0
mov u32:1, $1 ; int next = 1
mov u32:2, $0 ; int i = 0
.L3:
; u32:3 next
mov u32:3, :2 ; int next2 = next
add u32:3, :3, u32:0 ; next2 += cur
mov u32:0, :1 ; cur = next
mov u32:2, :3 ; next = next2
add u32:2, :2, $1
b u32:2 != n, .L3
.L4:
mov i32:0, u32:0
sum:
; a:0 - Array of integers
; u32:0 - Length of the array
b i32:0 == 0, .L8 ; If empty, terminate
mov i32:1, $0 ; int total = 0
shl u32:0, :0, $2 ; Multiply by 4 to get length in bytes
add a:1, a:0, u32:0 ; end address = start address + length
.L8:
load u32:2 a:0
add i32:1, i32:1, u32:2
add a:0, a:0, $4
b a:0 != a:1, .L8
.L9:
mov i32:0, $0
ret
;;;;;;;;; VECTOR INSTRUCTIONS ;;;;;;;;;;;;;
doublearray:
load u32x8:0, a:0
mul u32x8:0, u32x8:0, $2
store u32x8:0, a:0
Original Syntax Proposal - Types don’t include widths
;;;;; Instructions
; Based off RISC-V basic instruction set
; https://projectf.io/posts/riscv-cheat-sheet/#arithmetic
; Pros:
; Large reduction in instructions achieved
; - Lots of instructions have a signed/unsigned variant - add, div, mul, even bit shift, cond. jump
; - Instructions overloaded, or only work on some types of registers
; - Simple idea with different applications
; - Register names (not numbers) describe their capabilities
; Specifying width for same registers matches hardware
; - Can be simulated if not big enough
; Vector registers having different name, to avoid implicit conversions
; Destination register gives type of operation
; Cons:
; Undefined behaviour to reshape vector registers?
; - Semantic checker for different acceses?
; Lots of implicit sign casting
; Lots of repeating the same widths?
; - Could use implicit from destination like:
; add u0:32 u0: u1:
; Arithmetic
add u0:32, $100 ; Immediate
add i0:32, i1:32, i2:32 ; Registers, all same, ints
add f0:32, f1:32, f3:32 ; Registers, all same, floats
add i0:64, i1:64, u1:32 ; Mix of types and widths
cast i2:64, u1:32
add [i64] %0, %1, ~~~u1:32~~~
neg i1:32 ; cast to i1, then negate
sub i0:32, u1:32, u2:32 ; Can subtract unsigned numbers (can the result be signed?)
mul u1:32, u2:32, u3:32 ; Multiplication of several signed numbers
mul u1:64, u2:32, u3:32 ; Multiplication (completely safe)
mul i1:64, i2:32, u3:32 ; Signed Multiplication
div u3:32, u2:32, u1:32 ; Unsigned division
div i3:32, u2:32, i1:32 ; Signed division
; Bitwise logic
and i3:32, u2:32, i1:32 ; Just works on bits. Only width needs to match.
not i0:32, i0:32 ; same again
or i0:32, i0:32 ; same again
xor i0:32, i0:32 ; same again
; Shifts
shl i0:32, i0:32, $2 ; Arithmetic left shift (x2)
shr u0:32, i0:32, $2 ; Logical shift (determined by destination)
shl i0:32, i1:32, i2:32 ; Arithmetic left shift
shr u0:32, i1:32, i2:32 ; Logical right shift (determined by destination)
; This is quite good, saves a few instructions
; Memory accessing
load u0:32, 0xDEADBEEF ; Loads a 32 bit value into the register
load u0:8, 0xDEADBEEF ; Loads a 8 bit value into the register
mov u0:8, 0xFF
stor 0xDEADBEEF, u0:8
stor 0xDEADBEEF, u0:32 ; Store 32 bits even though its 8 bit
; Immediate offset in load / store?
load u0:32, 4(0xDEADBEEF)
; Jumps
j 0xBEEF ; Jump to immediate address
j .LABEL ; Jump to label (labels have type aX)
jal a0, 0xBEEF ; Jump to immediate address, store next instruction address in a0
jalr a0, a1, $4 ; Jump to a1+4, store next instruction address in a0 (Should immediate be multiplied by instruction size?)
call .LABEL ; Jump to
ret ;
; Conditional branches
beq u0:32, i0:32, .LABEL ; Branch if the two are equal
bne u0:32, i0:32, .LABEL ; Branch if the two are not equal
bgt u0:32, i0:32, .LABEL ; Branch if greater than
bge u0:32, i0:32, .LABEL ; Branch if greater than or equal to
; Allow aX instead of label above?
;;;; zero-register, like ARM64 and RISC-V? Partly an encoding concern
seq u0:1, i0:32, i0:8 ; Set u0 to 1 if equal
sne u0:1,
; Alternative?
add %0:u32, %1:i32, %2:i32
; Vector registers?
; iv, uv, fv, av avoids
add iv0:32x8 iv0:32x8, iv0:32x8
add av0:x8, av0:x8, $4
;;;; Fibonacci example ;;;;
; Problem - how are parameters passed? As u0,u1,.. by default?
fib:
; i0:32 (n)
b i0:32 < $0, .L4
mov u0:32, $0 ; int cur = 0
mov u1:32, $1 ; int next = 1
mov u2:32, $0 ; int i = 0
.L3:
; u3: next
mov u3:32, u2:32 ; int next2 = next
add u3:32, u3:32, u0:32 ; next2 += cur
mov u0:32, u1:32 ; cur = next
mov u2:32, u3:32 ; next = next2
add u2:32, u2:32, $1
b u2:32 != n, .L3
.L4:
mov i0:32, u0:32
ret
;;;; Sum array ;;;;
sum:
; a0 - Array of integers
; u0 - Length of the array
b u0:32 == 0, .L8 ; If empty, terminate
mov i1:32, $0 ; int total = 0
shl u0:32, u0:32, $2 ; Multiply by 4 to get length in bytes
add a1, a0, u0 ; end address = start address
.L8:
load u2:32, a0
add i1:32, i1:32, u2:32
add a0, a0, $4
b a0 != a1, .L8
.L9:
mov i0:32, $0
ret
;;;;; Vector instructions - Double array
doublearray:
; Assume multiple of 4
load uv0:32x8, a0
mul uv0:32x8, uv0:32x8, $2
store uv0:32x8, a0
; Pick operations
pick u1:32, uv1:32x4, 0
pick u2:32, uv1:32x4, 1
pick u3:32, uv1:32x4, 2
pick u4:32, uv1:32x4, 3
pick uv1:32x4, uv0:32x8, 0 ; Take first 4
pick uv2:32x4, uv0:32x8, 4 ; Take second 4
; Put (reverse pick) operations
pick uv1:32x4, u1:32, 0
; Pick alternative - similar to ARM
mov u0:32, uv1:32x4[0]
mov u1:32, uv1:32x4[1]
mov u2:32, uv1:32x4[2]
mov u3:32, uv1:32x4[3]
mov uv2:32x2, uv1:32x4[2]
; Broadcast
brdc uv1:32x4, 0
; MUL V0.4S, V2.4S, V3.S[2] multiplies each of the four 32-bit elements in V2 by the 32-bit scalar value in lane 2 of V3, storing the result vector in V0.
; https://developer.arm.com/documentation/102474/0100/Fundamentals-of-Armv8-Neon-technology/Registers--vectors--lanes-and-elements
; is there any point having a stack pointer register?
; special case of a-type register
; How to detect overflow??
; add u3:32, u2:32, u1:32 : $overflow > u4:1
Alternative Syntax Proposal - No types
add r0:32, $100
adds r0:32, r1:32, r2:32 $-100
neg r0:32, r0:32
sub r3:32, r2:32, r1:32 # r3 = r2 - r1, signed/unsigned doesn't matter
mul r3:32, r2:32, r1:32
muls r3:32, r2:32, r1:32
mul r3:64, r2:32, r2:32
muls r3:64, r2:32, r2:32
div u3:32, u2:32, u1:32
divs u3:32, u2:32, u1:32
; Bitwise - Sign not relevant
and r3:32, r2:32, r1:32
not r3:32, r2:32, r1:32
or r3:32, r2:32, r1:32
xor r3:32, r2:32, r1:32
; Shifts
sll r3:32, r2:32, r1:32 ; Logical left shift
slr r3:32, r2:32, r1:32 ; Logical right shift
sra r3:32, r2:32, r1:32 ; Arithmetic left shift
; No arithmetic right shift
load r3:32, 0xDEADBEEF ; Load 32 bits into register
store 0xDEADBEEF, r3:32 ; Store 32 bits into memory
Alternative Syntax Proposal - Typed Instructions
;;;; TYPED INSTRUCTION PROTOTYPE
; Pros:
; No implicit casting since independent register sets
; - Vector registers can't be reshaped
; - No undefined behaviour
; Cons:
; %1 and %1 in different instructions, makes it hard to read
; - This code is mostly going to be generated anyway
; Some instructions are supposed to have different types:
; - load [u32] a0
; - vector broadcast
; Memory leak of registers?
fib:
; i0:32 (n)
b [i32] %0 < #0, .L4
mov [u32] %0, #0 ; int cur = 0
mov [u32] %1, #1 ; int next = 1
mov [u32] %2, #0 ; int i = 0
.L3:
; u3: next
mov [u32] %3, %2 ; int next2 = next
add [u32] %3, %3, #0 ; next2 += cur
mov [u32] %0, #1 ; cur = next
mov [u32] %2, %3 ; next = next2
add [u32] %2, %2, #1 ; i++
.L4:
mov [i32] %0, {u0:32}
ret
sum:
; a0 - Array of integers
; u0 - Length of the array
b [u32] %0 == 0, .L8 ; If empty, terminate
mov [i32] %1, #0 ; int total = 0
shl [u32] %0, %0, #2 ; Multiply by 4 to get the length in bytes
add [a] %1, %0, {u0:32} ; end address = start address + length
.L8:
load [u32] %2, [a0] ; Load the memory address
add [i32] %1, %1, {u2:32}
add [a] %1, %0, {u0:32}
b [a] %0 == %1, .L8
.L9:
mov [i32] %0, #0
ret
External calls / Syscalls in UMC
; External calls in UMC, function like exact order
ecall retreg, .LABEL?, u32:1, u32:0, f32:3, i32:1,
System V x86-64 ABI Translation
rdi <= u32:1
rsi <= u32:0
rdx <= i32:1
xmm0 <= f32:3
; System V can only return rax, but this is interpreted based on ecall
; e.g. ecall a:0 ; interpret as address
; e.g. ecall u32:0 ; interpret as u32
; e.g. ecall i64:0 ; interpret as i64
ret
UMC External call
.LABEL:
u32:0 <= u32:1
u32:1 <= u32:0
f32:0 <= f32:3
i32:0 <= i32:1
ret X ; retreg <= X
Explicit and Implicit Casting
tl;dr: Widening is allowed, but all other casts must be explicit. Calculations are always done in the domain of the domain of the destination register.
Examples
Examples of what is allowed:
add u32:0, u32:0, u32:0 ; Great
add u64:0, u32:0, u32:0 ; Implicit widening cast
; Alowed, and recommended for large multiplications
; Would translate as multiply low and high
mul u64:0, u32:0, u32:0
mul u128:0, u64:0, u64:0
; Comparisons can use any width unsigned register
gt u1:0, u32:0, u32:0
gt u1:0, u64:0, u32:0 ; 64-bit comparison
; Use explicit casts where necessary
; Do 32-bit subtraction
mov u32:1, u64:0 ; Require u64:0 to fit in 32 bits
sub u32:2, u32:1, u32:0
; Do 64-bit subtraction, expect the result to fit in 32 bits
sub u64:2, u64:1, u32:0
mov u32:2, u64:2
mov u32:0, i32:0
gt u1:0, u32:0, u32:1 ; Performs unsigned comparison: Weird if i32:0 is negative
mov i32:0, i32:0
gt u1:0, i32:0, i32:0 ; Performs signed comparison: Weird u32:0 uses the top bit
What is not allowed
; This is not allowed, result depends on cast before vs after, confusing
sub u16:0, u32:0, u32:0
; Confusing comparisons
gt u1:0, u32:0, i32:0 ; This is actually Undefined behaviour in C when i32:0 is negative
; E.g. 0b1000, 0b0001 => Signed comparison interprets u32:0 as smallest negative
; E.g. 0b0000, 0b1000 => Unsigned comparison interprets smallest negative as bigger than 0
Types of casts
What operations do we need to do?
- Narrowing cast (signed, unsigned or float)
- Unsigned <-> Signed (bit or value-based?)
- Unsigned <-> Float (Value-based)
- Signed <-> Float (Value-based)
bcast - Binary / Bitwise cast - for twos complement conversions/narrowing
vcast - Value cast - Best effort at encoding the same value with the new datatype
bcast u32:0, u64:0 ; Truncation
bcast i32:0, i64:0 ; Truncation
bcast u32:0, i32:0 ; Reinterpret two's complement i32:0 as a u32:0.
bcast i32:0, u32:0 ; Reinterpret unsigned as two's complement - if top bit is set then it signed overflow occurs
vcast f64:0, u32:0 ; Take the unsigned integer and convert it into its closest floating point representation
vcast f64:0, i32:0 ; Take the signed integer and convert it into its closest floating point representation
vcast f32:0, f64:0 ; Take the value of the f64 and re-encode it as a f32 (precision lost)
; Explicit vcast with specific behaviour?
abs u32:0, i32:0 ; Take the signed integer, make it positive so it definitely fits correctly (no vcast for u32 and i32?)
; Maybe just a fcast with specific behaviour too?
fcast f64:0, u32:0 ; Re-encode as a float
fcast f64:0, i32:0 ; Re-encode as a float
fcast f32:0, f64:0 ; Re-encode as a different precision float (not a straightforward bit-cast)
brdc u32x8:0, u32:0 ; Broadcast u32 to vector of u32s.
; Does it make sense to allow every instruction to operate on vector elements?
; Or just allow mov to put/pick (Leaning towards this, otherwise it signficantly complicates the instruction set)
Rationale
Many instructions need to specify whether to operate in signed or unsigned form, such as comparisons. The top bit of an unsigned integer can be confused with a signed negative number.
Doing something like:
sub u32:0, u64:0, u32:1
Does not equate to a single instruction in x86, RISC-V or ARM. It must do a 32-bit or 64-bit operation, with a narrow/widen before/after
Problems with using extra registers
The idea is that the casting instruction will be eliminated during JIT compilation. However, the cast causes there to potentially be 2 registers used for the same value (one of which may outlive the other).
For example, if the following code were to be JIT compiled:
mov u32:1, u64:0
sub u32:2, u32:1, u32:0
it might require one register to store the 64 bit value, and another to store the 32-bit value (one of which may live much longer than the other). In more complex examples, with many more registers, this might cause registers to overflow, even though u32:1 is really not needed. However, if necessary this can be solved by clearing the register in UMC:
mov u32:1, u64:0
sub u32:2, u32:1, u32:0
mov u64:1, #0 ; If the u64:0 value is never needed
Now, only one register is needed if the instruction set supports accessing 64-bit registers as 32-bit. The UMC JIT can set u64:0 to #0 without an output register, as it knows this is guaranteed.
UMC Memory Model
- Four Instructions:
allocandfree,loadandstore - One register type with archecture-dependent width:
a:0 addandsuballowed with one address register and one (un)signed integer “offset”
Examples
alloc a:0, 4
store a:0, u32:1 ; Store 4-byte integer into address register
load u32:1 ; Load it as a 32 bit integer
free a:0 ; Frees the memory allocated earlier. a:0 is now invalid
Rationale
Allocating and de-allocating memory is a common task, which if we are running in the UMC VM, can easily be delegated to the VM.
Questions:
- How do we allocate a block of memory large enough for a memory address if size is machine-dependent? A constant?
UMC Bytecode Format
In UMC we have not only “unlimited” registers, but also “unlimited” register types. These need to be represented in a fixed amount of space, which requires a variable-length encoding of instructions.
Constraints:
- Unlimited registers and register types
- Minimum 64 instruction opcodes, ideally more or extensible
Register-based bytecodes are generally faster than stack-based ones, but spend significantly longer on “instruction-fetch”. So we want to use an efficient encoding to mitigate this, especially since we have unbounded size.
Register Types
- Unsigned Int (+ width)
- Signed Int (+ width)
- Float (+ width)
- Memory Address
- Instruction Address
Plus vector types with a “length” This could fit in 4 bits, first bit as a flag for a vector. However we might also want a flag for a constant, at which point we’d need 8 bits (a full byte).
Basic/Naive encoding
Example of 1.5 byte type and 1 byte index
u32:1 | 0001 00010000 00000001 | 2.5 bytes
^^^^ ^^^^^^^^ ^^^^^^^^
unsigned | |
32 index 1
Example of vectorised register
u32x8:1 | 1001 00010000 0100 00000001
^^^^ ^^^^^^^^ ^^^^ ^^^^^^^^
unsigned vector | | |
32 8 index
Example of a full u32-u32 instruction (8.5 bytes):
add | 00000010 <- add opcode
u32:1, | 0001 00010000 00000001
u32:2, | 0001 00010000 00000010
u32:3 | 0001 00010000 00000011
Some instructions have only one valid register type, or a constant, so they don’t necessarily need a type:
jmp | 00001000 <- jump opcode
n:0 | 00000000 <- jmp to register 0 value
jmp | 00001000 <- Jump-constant instruction ideally?
.LABEL | 00001000
A lot of instructions are likely to be “uniform”, using the same types (for the registers that aren’t forced), so we could consider an encoding like (5.5 bytes):
add | 00000010 <- add opcode
u32 | 0001 00010000
:1 | 0000 0001
:2 | 0000 0010
:3 | 0000 0011
However, using a full byte just to store the 32/64-part of u32/u64 is pretty wasteful, especially considering they are probably the most common type. Unfortunately the full range of u1-u32 is allowed, so we can’t simply store log2(width).
Huffman encoding of register types
Ultimately, the most common register types will be used massively more than the less common types.
I propose a Huffman encoding of types (with their widths / lengths!).
In the following example, a program uses 32-bit registers most often, plus some others used in UMC.
entry | register
--------------------
0000 0000 | u32
0000 0001 | i32
0000 0010 | u64
0000 0011 | i64
0000 0100 | f32
0000 0101 | f64
0000 0111 | u1 (used as a boolean in UMC)
0000 1001 | n (Instruction Address)
0000 1011 | m (Memory Address)
0000 1111 | u32x8 (example of vector registers, probably only a handful used)
... a few more application specific, maybe u128/i128, u8 etc.
0000 0000 0000 0001 | If necessary, for rare types, a 2-byte or 1.5-byte encoding can be used
Instruction and Memory address registers will only really be used in add/sub for offsetting, so won’t be too high, but should still fit in the table.
It might also be a good idea to ‘fix’ the first few entries of the table for types we know will be used, like u32, u64, i32, i64, u1, n and m to
reduce how much binaries change and make things like combining two umc binary files easier.
Instructions now for non-uniform types (7 bytes):
add | 0000 0010 <- add opcode
u32x8:1 | 0000 1111 0000 0001
u32x8:2 | 0000 1111 0000 0010
u32:2 | 0000 0000 0000 0010
Covers any 3 operand types with a range of 256 different registers.
Only 5 bytes needed for a uniform instruction
add | 0100 0010 <- add opcode
u32 | 0000 0000 <- u32 index
:1 | 0000 0001
:2 | 0000 0010
:3 | 0000 0011
Re-indexing with the most common as lower numbers can also be used to ensure the most common register indexes are packed tightly, without changing the result of the program. 1 byte is probably the minimum for a register index. Another advantage is that registers can be stored in an implementation-dependent way without decoding penalty, and we can just store a list of register operands.
Can reserve one or two bits in the instruction opcode for 2-byte encoding where rare types or high indices are used. Leading 1 on operand type could indicate a inline constant
Huffman encoding registers
We could take this even further for a much greater space saving, by encoding the exact registers (type + index).
entry | register
---------------------
0000 0000 | u32:0
0000 0001 | u32:1
0000 0010 | u32:2
0000 0011 | u32:3
0000 0100 | u64:0
0000 0101 | u64:1
0000 0110 | i32:0
0000 0111 | i32:1
0000 1000 | i32:2
0000 1001 | f32:0
0000 1010 | f32:1
0000 1011 | u32x8:0
This causes an 4-byte encoding of:
add | 0000 0010 <- add opcode
u32:2 | 0000 0010
u32:1 | 0000 0001
u32:0 | 0000 0000
The cost of non-uniform encodings is no longer an issue (Still 4 bytes):
add | 0000 0010 <- add opcode
u64:0 | 0000 0100
u32:1 | 0000 0001
u32:2 | 0000 0010
However this would require a decently large table at the start of the file, and decoding logic would also be more complex, if you wanted to store values of the same type contiguously.
Initialised data
In NASM, its possible to define pre-initialised data in a file, which gets put in the .data segment:
x db "hello\0"
This is especially useful for defining static strings, and it would be nice to be able to do this in UMC.
A simple syntax would be:
.x: "hello\0"
However this would then be confusing / mixing memory addresses versus instruction addresses.
Instead, we could introduce a memory label:
&x: "hello\0" ; A null terminated string constant
&y: [0x01, 0x02, 0x03] ; Arbitrary bytes
And this could only be used as a memory address constant:
load u8:0, &x