JIT Compilation in Nitpick
Overview
Nitpick includes a built-in JIT assembler for runtime code generation. The JIT subsystem builds on two foundations:
- WildX (wildx_alloc.cpp): W⊕X memory management with ASLR, guard pages, code signing, and quota enforcement
- Assembler (assembler.cpp): x86-64 instruction encoder with label backpatching, register allocation, peephole optimization, and instruction selection
The JIT is accessible from Nitpick code through the jit stdlib package, which provides FFI bindings to the C++ assembler API.
Architecture
┌──────────────────────────────────────────────┐
│             Nitpick Source Code              │
│             use jit; use wildx;              │
├──────────────────────────────────────────────┤
│             jit.npk FFI Bindings             │
│     81 bindings + 17 helpers + constants     │
├──────────────────────────────────────────────┤
│              Assembler Pipeline              │
│  ┌──────────┐  ┌──────────┐  ┌───────────┐   │
│  │ IR Queue │→ │ Peephole │→ │ Liveness  │   │
│  │  (lazy)  │  │ Optimizer│  │ Analysis  │   │
│  └──────────┘  └──────────┘  └───────────┘   │
│       ↓             ↓              ↓         │
│  ┌──────────┐  ┌──────────┐  ┌───────────┐   │
│  │ Linear   │→ │ Insn     │→ │ Machine   │   │
│  │ Scan RA  │  │Selection │  │ Code Emit │   │
│  └──────────┘  └──────────┘  └───────────┘   │
├──────────────────────────────────────────────┤
│          WildX: W⊕X Memory Manager           │
│  ASLR | Guard Pages | Code Signing | Quota   │
├──────────────────────────────────────────────┤
│               x86-64 Hardware                │
└──────────────────────────────────────────────┘
Instruction Set (v0.7.2+)
The assembler supports 45+ x86-64 instructions across multiple categories:
Integer
- Data movement: MOV r64, imm64 / MOV r64, r64
- Arithmetic: ADD, SUB, IMUL (r64,r64 and r64,imm32)
- Bitwise: XOR, AND, OR, NOT, NEG
- Shifts: SHL, SHR, SAR with imm8
- Compare: CMP r64, r64 / CMP r64, imm32
- Stack: PUSH, POP
- Flow: JMP, JE/JNE/JL/JLE/JG/JGE/JB/JBE/JA/JAE, RET
- Call: CALL r64, CALL label, CALL abs
Floating-Point (SSE2)
- MOVSD (reg-reg, load, store), ADDSD, SUBSD, MULSD, DIVSD, UCOMISD
- XMM0–XMM15 registers
SIMD (SSE)
- MOVAPS (reg-reg, aligned load/store), ADDPS, MULPS
- Packed float32 (4x f32) operations
Memory
- MOV r64, [base+offset] (load), MOV [base+offset], r64 (store)
- LEA r64, [base+offset] (address computation)
- store_local / load_local (RBP-relative stack frame)
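To make the encodings concrete, here is a minimal Python sketch of how a few of the listed instructions map to bytes, and how a forward jump to a label can be backpatched once the label's offset is known. The TinyAsm class and its method names are hypothetical and are not the API of assembler.cpp; only the opcodes themselves follow the standard x86-64 encoding.

```python
import struct

# Register numbers follow the standard x86-64 encoding (RAX=0, RCX=1, ...).
RAX, RCX, RDX, RBX = 0, 1, 2, 3


class TinyAsm:
    """Hypothetical mini-encoder for illustration; not Nitpick's assembler."""

    def __init__(self):
        self.code = bytearray()
        self.labels = {}   # label name -> byte offset
        self.fixups = []   # (offset of rel32 field, label name)

    def mov_r64_imm64(self, reg, value):
        # REX.W (plus REX.B for R8-R15), opcode B8+rd, 8-byte immediate.
        rex = 0x48 | (reg >> 3)
        self.code += bytes([rex, 0xB8 | (reg & 7)])
        self.code += struct.pack("<Q", value & 0xFFFFFFFFFFFFFFFF)

    def add_r64_r64(self, dst, src):
        # ADD r/m64, r64: REX.W 01 /r with ModRM mod=11, reg=src, rm=dst.
        rex = 0x48 | ((src >> 3) << 2) | (dst >> 3)
        modrm = 0xC0 | ((src & 7) << 3) | (dst & 7)
        self.code += bytes([rex, 0x01, modrm])

    def jmp(self, label):
        # JMP rel32: E9 followed by a placeholder patched in finalize().
        self.code.append(0xE9)
        self.fixups.append((len(self.code), label))
        self.code += b"\x00\x00\x00\x00"

    def ret(self):
        self.code.append(0xC3)

    def bind(self, label):
        self.labels[label] = len(self.code)

    def finalize(self):
        for field, label in self.fixups:
            # rel32 is measured from the end of the jump instruction.
            rel = self.labels[label] - (field + 4)
            self.code[field:field + 4] = struct.pack("<i", rel)
        return bytes(self.code)


asm = TinyAsm()
asm.mov_r64_imm64(RAX, 10)
asm.mov_r64_imm64(RCX, 32)
asm.jmp("done")             # forward reference, resolved in finalize()
asm.mov_r64_imm64(RAX, 0)   # skipped by the jump
asm.bind("done")
asm.add_r64_r64(RAX, RCX)
asm.ret()
machine_code = asm.finalize()
```

The sketch only produces bytes; it does not map or execute them. Recording fixups and patching them at the end is one common way to handle forward references, and the real assembler may resolve labels differently.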
Register Allocator (v0.7.3+)
The JIT includes a linear scan register allocator for automatic register assignment:
extern func:asm_create = int64();
extern func:asm_vreg_new_gpr = int64(int64:ctx);
extern func:asm_mov_r64_imm64 = NIL(int64:ctx, int64:reg, int64:val);
extern func:asm_add_r64_r64 = NIL(int64:ctx, int64:dst, int64:src);
extern func:asm_mov_r64_r64 = NIL(int64:ctx, int64:dst, int64:src);
extern func:asm_ret = NIL(int64:ctx);
extern func:asm_finalize = int64(int64:ctx);
extern func:asm_execute = int64(int64:guard);
func:main = int32() {
    int64:a = asm_create();
    int64:v0 = asm_vreg_new_gpr(a);
    int64:v1 = asm_vreg_new_gpr(a);
    drop asm_mov_r64_imm64(a, v0, 10i64);
    drop asm_mov_r64_imm64(a, v1, 32i64);
    drop asm_add_r64_r64(a, v0, v1);
    drop asm_mov_r64_r64(a, 0i64, v0); // REG_RAX = 0
    drop asm_ret(a);
    int64:guard = asm_finalize(a);
    int64:result = asm_execute(guard);
    // result == 42
    exit 0;
};
func:failsafe = int32(tbb32:err) { exit 1; };
Features:
- 12 allocatable GPRs (RAX, RCX, RDX, RSI, RDI, R8, R9, RBX, R12-R15)
- 14 allocatable XMMs (XMM0-XMM13)
- Automatic spill/reload when registers are exhausted
- Automatic prologue/epilogue when callee-saved registers are needed
- Mixed physical + virtual register support
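For readers unfamiliar with linear scan, the following Python sketch shows the textbook form of the algorithm: live intervals are visited in order of start point, expired intervals return their registers to the pool, and when no register is free the interval that ends last is spilled. This is an illustration of the general technique under simplified assumptions (one register class, precomputed intervals, made-up interval data), not a description of how Nitpick's allocator is implemented.

```python
def linear_scan(intervals, num_regs):
    """intervals: dict name -> (start, end). Returns (assignment, spilled)."""
    free = list(range(num_regs))
    active = []       # (end, name) pairs, kept sorted by end point
    assignment = {}   # name -> register index
    spilled = []
    by_start = sorted(intervals.items(), key=lambda kv: kv[1][0])
    for name, (start, end) in by_start:
        # Expire intervals that ended before this one starts.
        while active and active[0][0] < start:
            _, old = active.pop(0)
            free.append(assignment[old])
        if free:
            assignment[name] = free.pop(0)
            active.append((end, name))
        else:
            # No register left: spill whichever interval ends last.
            last_end, victim = active[-1]
            if last_end > end:
                assignment[name] = assignment.pop(victim)
                spilled.append(victim)
                active[-1] = (end, name)
            else:
                spilled.append(name)
        active.sort()
    return assignment, spilled


# Three virtual registers competing for two physical registers.
assignment, spilled = linear_scan({"a": (0, 10), "b": (1, 3), "c": (2, 4)}, 2)
```

Here the long-lived interval "a" is spilled so that the two short intervals can stay in registers, which is the usual heuristic in linear scan.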
Peephole Optimizer (v0.7.4)
The JIT runs a peephole optimization pass on the IR before register allocation:
| Pattern | Optimization | Bytes Saved |
|---|---|---|
| MOV r, 0 | XOR r, r | 6–7 |
| MOV r, r | eliminated | 3–4 |
| ADD r, 0 / SUB r, 0 | eliminated | 7 |
| SHL r, 0 / SHR r, 0 | eliminated | 4 |
| MOV r, X; MOV r, Y | dead store eliminated | 10 |
| MOV r, 2^n; IMUL d, r | SHL d, n | ~7 |
| XOR r, r; ADD r, s | MOV r, s | 3–4 |

Statistics are available via nitpick_asm_peephole_stats().
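As an illustration of how such a pass can work, the sketch below applies a subset of the patterns from the table to a toy IR made of (op, dst, src) tuples. The IR shape and opcode names are assumptions made for this example; the real IR queue in the assembler is not documented here.

```python
def peephole(ir):
    """Rewrite a list of (op, dst, src) tuples using a two-instruction window."""
    out = []
    for op, dst, src in ir:
        if op == "mov" and dst == src:
            continue  # MOV r, r has no effect
        if op in ("add_imm", "sub_imm", "shl_imm", "shr_imm") and src == 0:
            continue  # ADD/SUB/SHL/SHR r, 0 has no effect
        if op == "mov_imm":
            prev = out[-1] if out else None
            overwrites = prev and prev[1] == dst and (
                prev[0] == "mov_imm" or (prev[0] == "xor" and prev[2] == dst))
            if overwrites:
                out.pop()  # MOV r, X; MOV r, Y: the first store is dead
            if src == 0:
                op, src = "xor", dst  # MOV r, 0 -> XOR r, r
        out.append((op, dst, src))
    return out


program = [
    ("mov_imm", "r1", 5),
    ("mov_imm", "r1", 0),
    ("mov", "r2", "r2"),
    ("add_imm", "r1", 0),
    ("add", "r3", "r1"),
]
optimized = peephole(program)
```

Five instructions shrink to two: XOR r1, r1 followed by ADD r3, r1. A production pass also has to confirm that a rewrite such as MOV to XOR does not clobber flags that a later instruction reads; this sketch ignores flags entirely.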
Instruction Selection (v0.7.4)
During code emission, the allocator selects a shorter machine encoding wherever an equivalent one exists:
| IR Instruction | Selected Encoding | Bytes Saved |
|---|---|---|
| MOV_IMM64 (value ≤ 0xFFFFFFFF) | MOV r32, imm32 | 4–5 |
| CMP r, 0 | TEST r, r | 4 |
| ADD r, 1 | INC r | 4 |
| SUB r, 1 | DEC r | 4 |
| ADD/SUB r, imm8 | imm8 form | 3 |

Statistics are available via nitpick_asm_insn_sel_stats().
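The byte counts in the table follow from the standard x86-64 encodings, which the short Python sketch below reproduces for two of the rows. The function names are hypothetical and the code is not taken from the assembler; it only shows where the savings come from.

```python
import struct


def encode_mov_imm(reg, value):
    """Pick the shortest encoding for loading an unsigned 64-bit constant."""
    if value <= 0xFFFFFFFF:
        # MOV r32, imm32 zero-extends into the full 64-bit register.
        prefix = b"\x41" if reg >= 8 else b""   # REX.B only for R8-R15
        return prefix + bytes([0xB8 | (reg & 7)]) + struct.pack("<I", value)
    # Full MOV r64, imm64: REX.W, B8+rd, 8-byte immediate.
    rex = 0x48 | (reg >> 3)
    return bytes([rex, 0xB8 | (reg & 7)]) + struct.pack("<Q", value)


def encode_add_imm(reg, value):
    """ADD r64, imm: use the sign-extended imm8 form when the value fits."""
    rex = 0x48 | (reg >> 3)
    modrm = 0xC0 | (reg & 7)                    # mod=11, /0 selects ADD
    if -128 <= value <= 127:
        return bytes([rex, 0x83, modrm]) + struct.pack("<b", value)
    return bytes([rex, 0x81, modrm]) + struct.pack("<i", value)


short_mov = encode_mov_imm(0, 42)        # 5 bytes instead of 10
long_mov = encode_mov_imm(0, 1 << 32)    # needs the full 10-byte form
short_add = encode_add_imm(0, 5)         # 4 bytes instead of 7
```

The 4–5 byte range for MOV arises because registers R8-R15 still need a one-byte REX prefix in the 32-bit form.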
Profiling Integration (v0.7.4)
JIT code can be registered with Linux perf for profiling:
jit.asm_perf_map_register(code_ptr, code_size, "my_jit_function");
// Now visible in: perf record -p <pid> && perf report
This writes to /tmp/perf-<pid>.map in the format expected by perf.
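The perf map format itself is simple: one line per symbol, containing the start address and size in hexadecimal (without a 0x prefix) followed by the symbol name. The Python sketch below shows the format; the function names are hypothetical and this is not the implementation behind asm_perf_map_register.

```python
import os


def perf_map_line(addr, size, name):
    """Format one perf map entry: hex start, hex size, symbol name."""
    return f"{addr:x} {size:x} {name}\n"


def perf_map_register(addr, size, name, pid=None, directory="/tmp"):
    """Append an entry to perf-<pid>.map so perf can symbolize JIT frames."""
    path = os.path.join(directory, f"perf-{pid or os.getpid()}.map")
    with open(path, "a") as f:   # append: one line per JIT symbol
        f.write(perf_map_line(addr, size, name))
    return path


example = perf_map_line(0x7F0000001000, 64, "my_jit_function")
```

For a 64-byte function at 0x7f0000001000, the resulting line is "7f0000001000 40 my_jit_function".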
WildX Security (v0.7.1)
All JIT code runs through WildX's security pipeline:
- ASLR: Random mmap hints for JIT pages
- Guard pages: PROT_NONE sentinels around executable regions
- Code signing: FNV-1a hash verified before every execution
- W⊕X: Strict WRITABLE → EXECUTABLE state machine (never both)
- Quota: Default 64MB, configurable via nitpick_wildx_set_quota()
- Audit logging: --wildx-audit flag for ALLOC/SEAL/EXEC/FREE events
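The interaction between the W⊕X state machine and the hash check can be modeled in a few lines. The Python sketch below is a simplified model, not WildX's implementation: the Region class and its method names are hypothetical, no real memory is mapped, and the 64-bit FNV-1a variant is an assumption because the hash width is not stated here. Note that FNV-1a is a non-cryptographic hash, so a check like this detects corruption rather than resisting deliberate forgery.

```python
FNV64_OFFSET = 0xCBF29CE484222325
FNV64_PRIME = 0x100000001B3


def fnv1a_64(data):
    """64-bit FNV-1a: XOR each byte in, then multiply by the FNV prime."""
    h = FNV64_OFFSET
    for byte in data:
        h ^= byte
        h = (h * FNV64_PRIME) & 0xFFFFFFFFFFFFFFFF
    return h


class Region:
    """Models the WRITABLE -> EXECUTABLE transition (hypothetical names)."""

    def __init__(self):
        self.state = "WRITABLE"
        self.code = bytearray()
        self.signature = None

    def write(self, data):
        if self.state != "WRITABLE":
            raise PermissionError("region is sealed; writes are rejected")
        self.code += data

    def seal(self):
        # Record the hash, then flip the state; the region is never both.
        self.signature = fnv1a_64(self.code)
        self.state = "EXECUTABLE"

    def check_before_exec(self):
        if self.state != "EXECUTABLE":
            raise PermissionError("region must be sealed before execution")
        if fnv1a_64(self.code) != self.signature:
            raise RuntimeError("code hash mismatch")
        return True


region = Region()
region.write(b"\xc3")   # a single RET
region.seal()
region.check_before_exec()
```

In a real allocator the seal step would correspond to changing page protections from writable to executable, and the check would run before each call into the region.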
Multi-Architecture (v0.7.4)
Architecture detection and abstraction:
let arch = jit.asm_get_arch(); // ASM_ARCH_X86_64 or ASM_ARCH_AARCH64
let ok = jit.asm_arch_supported(arch); // true on x86-64
The AArch64 backend is currently a stub reserved for future implementation. The architecture abstraction layer lets code query the current target and check whether it is supported before generating any code.
Safety
JIT compilation is a Layer 3 (raw) operation: it bypasses all safety guarantees. Executable memory is:
- Not bounds-checked
- Not type-checked
- A security risk if used with untrusted input
Always use WildX guards and code signing for JIT code.
Related
- memory_model/wild.md — unmanaged memory
- types/pointer.md — pointer operations
- safety_layers.md — safety layer definitions