Binary Serialization (binary)
The binary module provides a structured, endian-aware serialization framework in Nitpick. It is primarily used to encode native primitive types (integers, strings, floats, booleans) into contiguous byte formats suitable for disk storage, inter-process communication, or network I/O.
In the current Nitpick specification, binary buffers are managed as opaque int64 handles via the stdlib/binary.npk module rather than a first-class language type.
Overview
Unlike string (which is designed for textual operations but lacks specific character encodings) or buffer (which provides raw, unencoded contiguous memory manipulation), a binary context is inherently stateful. Every binary object implicitly tracks:
- Length (len): The total number of valid bytes written.
- Capacity (cap): The internally allocated buffer space (which auto-expands).
- Position (pos): A read cursor indicating the byte offset for the next read operation.
Writes always append to the binary length, while reads advance the internal pos cursor.
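The bookkeeping above can be modelled in a few lines. The following Python sketch is only an illustration of the len/pos rules, not Nitpick's implementation: the class name BinaryModel is hypothetical, capacity growth is delegated to Python's bytearray, and raising on a short read is a choice made for the model.

```python
class BinaryModel:
    """Toy model of the len/cap/pos bookkeeping (not Nitpick's implementation)."""

    def __init__(self):
        self.data = bytearray()  # valid bytes; len(self.data) plays the role of `len`
        self.pos = 0             # read cursor; bytearray handles capacity growth

    def write(self, chunk: bytes) -> None:
        # Writes always append at offset `len`; `pos` is left untouched.
        self.data += chunk

    def read(self, n: int) -> bytes:
        # Reads start at `pos` and advance it by the number of bytes consumed.
        if n > len(self.data) - self.pos:
            raise ValueError("not enough bytes remaining")
        chunk = bytes(self.data[self.pos:self.pos + n])
        self.pos += n
        return chunk


m = BinaryModel()
m.write(b"\x01\x02")
first = m.read(1)      # consumes byte 0, pos becomes 1
m.write(b"\x03")       # still appended at the end, even though pos == 1
rest = m.read(2)       # consumes bytes 1 and 2
```

Note that the second write does not land at the cursor: the write position and the read position are independent.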
Creating a Binary Buffer
To interact with binary functions, you must import the standard module and use the bin_new allocator, which returns an opaque int64 handle representing the buffer.
use "binary.npk".*;
func:main = int32() {
    // Allocate a new binary buffer (returns an int64 handle)
    int64:buf = raw bin_new();
    // Do operations...
    // Free the buffer when done
    drop bin_free(buf);
    exit(0);
};
Writing Data
Writes always append data to the end of the buffer (at offset len). The internal capacity expands automatically as needed. All integers and floats are encoded in little-endian byte order.
// Write primitives
drop bin_write_int8(buf, 10i32);
drop bin_write_int16(buf, 2000i32);
drop bin_write_int32(buf, 300000i32);
drop bin_write_int64(buf, 9000000000i64);
drop bin_write_flt32(buf, 3.14f64);
drop bin_write_flt64(buf, 2.7182818f64);
drop bin_write_bool(buf, 1i32); // Evaluates as true
Note: bin_write_flt32 accepts a 64-bit float (f64) argument for ergonomic compatibility, but it narrows the value and serializes 4 bytes. bin_write_bool accepts an int32 but writes a single byte (1 or 0).
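To make the resulting byte layout concrete, here is a hedged sketch that reproduces the same write sequence with Python's struct module. It assumes the integers are signed two's complement and the floats are IEEE 754, as the API table suggests; it models the wire format only and says nothing about how Nitpick produces it.

```python
import struct

# Mirror the write sequence above using explicit little-endian ("<") formats.
payload = b"".join([
    struct.pack("<b", 10),           # int8  -> 1 byte
    struct.pack("<h", 2000),         # int16 -> 2 bytes
    struct.pack("<i", 300000),       # int32 -> 4 bytes
    struct.pack("<q", 9000000000),   # int64 -> 8 bytes
    struct.pack("<f", 3.14),         # flt32 -> 4 bytes (narrowed from a 64-bit float)
    struct.pack("<d", 2.7182818),    # flt64 -> 8 bytes
    struct.pack("<?", True),         # bool  -> 1 byte (0x01)
])

total_bytes = len(payload)           # 1 + 2 + 4 + 8 + 4 + 8 + 1
int16_bytes = payload[1:3]           # least significant byte comes first
```

Under these assumptions the seven writes occupy 28 bytes, and the int16 value 2000 (0x07D0) appears as the bytes D0 07.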
Reading Data
Reads decode bytes natively back into Nitpick values. Every read operation evaluates the bytes starting from pos and advances pos by the number of bytes read.
Before reading, ensure the cursor is positioned where you expect. To decode a buffer from its first byte, seek to offset 0.
// Seek back to the start of the buffer
drop bin_seek(buf, 0i64);
// Reading primitives
int32:val1 = raw bin_read_int8(buf); // reads 1 byte, returns sign-extended int32
int32:val2 = raw bin_read_int16(buf); // reads 2 bytes
int32:val3 = raw bin_read_int32(buf); // reads 4 bytes
int64:val4 = raw bin_read_int64(buf); // reads 8 bytes
flt64:f1 = raw bin_read_flt32(buf); // reads 4 bytes, returns up-casted flt64
flt64:f2 = raw bin_read_flt64(buf); // reads 8 bytes
int32:b1 = raw bin_read_bool(buf); // reads 1 byte (1 or 0)
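The cursor behaviour, sign extension, and float widening described above can be illustrated with a small Python model. The Cursor class is hypothetical, and treating int16 as signed is an assumption (the source only states sign extension for int8).

```python
import struct


class Cursor:
    """Minimal read cursor over a bytes object (a model, not the Nitpick API)."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0

    def read(self, fmt: str):
        # Decode one little-endian value at `pos`, then advance by its encoded size.
        full = "<" + fmt
        (value,) = struct.unpack_from(full, self.data, self.pos)
        self.pos += struct.calcsize(full)
        return value


data = struct.pack("<bhf", -1, -2, 3.14)   # int8, int16, flt32 packed back to back
cur = Cursor(data)
v8 = cur.read("b")    # byte 0xFF decodes to -1: the sign carries into the wider integer
v16 = cur.read("h")   # assumed signed as well
f32 = cur.read("f")   # widened to a 64-bit float; precision lost on write is not recovered
```

The value read back for the 32-bit float is close to 3.14 but not equal to it, which is worth remembering when comparing decoded floats against literals.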
String Encoding
Strings in binary are length-prefixed, NOT null-terminated. Because the length is known before the payload is read, the decoder can allocate the exact size up front without scanning for a terminator, and strings may safely contain internal null bytes.
When bin_write_str is invoked:
1. The 8-byte (int64) length of the string is written in little-endian.
2. The raw bytes of the string are written immediately following the prefix.
drop bin_write_str(buf, "Hello World");
// Resets cursor to 0
drop bin_seek(buf, 0i64);
// Automatically decodes the 8-byte prefix and reads 11 bytes
string:s = raw bin_read_str(buf);
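The length-prefixed layout can be sketched in Python as follows. The helper names encode_str and decode_str are hypothetical, and UTF-8 is an assumption made for the example: the source only says that the string's raw bytes are written.

```python
import struct


def encode_str(s: str) -> bytes:
    raw = s.encode("utf-8")                    # assumption: UTF-8 for illustration
    return struct.pack("<q", len(raw)) + raw   # 8-byte little-endian length, then payload


def decode_str(data: bytes, pos: int = 0):
    """Return (string, new_pos) for a length-prefixed string starting at `pos`."""
    (n,) = struct.unpack_from("<q", data, pos)
    start = pos + 8
    if n < 0 or n > len(data) - start:
        raise ValueError("length prefix exceeds available bytes")
    return data[start:start + n].decode("utf-8"), start + n


encoded = encode_str("Hello World")
text, next_pos = decode_str(encoded)
with_null, _ = decode_str(encode_str("a\x00b"))   # embedded null byte survives
```

For "Hello World" this yields 19 bytes in total: an 8-byte prefix holding 11, followed by the 11 payload bytes.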
File I/O
Nitpick supports direct interaction between binary buffers and the filesystem.
// Write the current buffer to a binary file
drop bin_to_file(buf, "data.bin");
// Read a binary file directly into a new buffer
int64:file_buf = raw bin_from_file("data.bin");
Cursor Control
You can manually inspect and move the read cursor and query payload sizes:
- bin_size(buf) -> int64: Returns the total number of bytes written (len).
- bin_pos(buf) -> int64: Returns the current read cursor position (pos).
- bin_remaining(buf) -> int64: Returns len - pos (bytes available for reading).
- bin_seek(buf, offset): Sets the pos cursor. The offset is clamped to [0, len].
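The clamping and remaining-bytes arithmetic is small enough to state directly. This Python sketch uses hypothetical helper names and simply restates the rules above.

```python
def clamp_seek(offset: int, length: int) -> int:
    """Model of bin_seek: the requested offset is clamped to [0, length]."""
    return max(0, min(offset, length))


def remaining(length: int, pos: int) -> int:
    """Model of bin_remaining: bytes available between the cursor and len."""
    return length - pos


pos_low = clamp_seek(-5, 10)    # negative offsets clamp to 0
pos_high = clamp_seek(25, 10)   # offsets past len clamp to len
left = remaining(10, pos_high)  # nothing left to read at the end
```

A consequence of clamping is that a seek never reports failure, so an out-of-range offset shows up only as bin_remaining returning fewer bytes than expected.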
Error Handling
The binary module uses Nitpick's strict error-checking parameters. If a seek offset goes beyond len, the cursor is clamped to len.
If a read occurs beyond the boundary (where bin_remaining() is less than the required bytes), undefined memory reads may occur at the libc-level boundary since k-semantics does not strictly enforce boundary offsets here. Be careful to ensure you have enough bytes using bin_remaining before consuming payloads.
Complete API Table
| Function Signature | Description |
|---|---|
| bin_new() -> int64 | Allocates a new empty binary buffer and returns the handle. |
| bin_free(h) | Deallocates a buffer and its underlying data block. |
| bin_write_int8(h, v) | Writes 1 byte to the end of the buffer. |
| bin_write_int16(h, v) | Writes 2 bytes (little-endian) to the end. |
| bin_write_int32(h, v) | Writes 4 bytes (little-endian) to the end. |
| bin_write_int64(h, v) | Writes 8 bytes (little-endian) to the end. |
| bin_write_flt32(h, v) | Writes a downcasted 4-byte IEEE 754 float. |
| bin_write_flt64(h, v) | Writes an 8-byte IEEE 754 float. |
| bin_write_bool(h, v) | Writes a single byte (0 or 1). |
| bin_write_str(h, s) | Writes an 8-byte length prefix followed by the string payload. |
| bin_read_int8(h) -> int32 | Reads 1 byte, returning a sign-extended int32. |
| bin_read_int16(h) -> int32 | Reads 2 bytes (little-endian). |
| bin_read_int32(h) -> int32 | Reads 4 bytes (little-endian). |
| bin_read_int64(h) -> int64 | Reads 8 bytes (little-endian). |
| bin_read_flt32(h) -> flt64 | Reads 4 bytes, returning an up-casted flt64. |
| bin_read_flt64(h) -> flt64 | Reads 8 bytes. |
| bin_read_bool(h) -> int32 | Reads 1 byte (returns 0 or 1). |
| bin_read_str(h) -> string | Reads an 8-byte length prefix, then returns the allocated string. |
| bin_seek(h, pos) | Sets the read cursor (clamps to length bounds). |
| bin_size(h) -> int64 | Returns the total bytes written. |
| bin_pos(h) -> int64 | Returns the current read cursor position. |
| bin_remaining(h) -> int64 | Returns available bytes to read (len - pos). |
| bin_from_file(path) -> int64 | Reads a file from disk into a newly allocated buffer handle. |
| bin_to_file(h, path) | Writes the buffer payload into a disk file. |
Examples
Defining a Network Packet Sequence:
use "binary.npk".*;
func:build_packet = int64(int32:type, int64:timestamp, string:payload) {
    int64:h = raw bin_new();
    drop bin_write_int32(h, type);
    drop bin_write_int64(h, timestamp);
    drop bin_write_str(h, payload);
    pass h;
};
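Because the layout is fixed and little-endian, a peer written in another language can parse such a packet. The following Python sketch shows one way a receiver might do so; parse_packet is a hypothetical name, and the sample values are made up for illustration. It assumes the bytes on the wire are exactly the buffer payload, with no extra framing.

```python
import struct

# int32 type, int64 timestamp, int64 payload length; "<" means little-endian, no padding.
HEADER = struct.Struct("<iqq")


def parse_packet(data: bytes):
    """Return (type, timestamp, payload) or raise ValueError on a truncated packet."""
    if len(data) < HEADER.size:
        raise ValueError("truncated header")
    ptype, timestamp, n = HEADER.unpack_from(data, 0)
    if n < 0 or n > len(data) - HEADER.size:
        raise ValueError("truncated payload")
    payload = data[HEADER.size:HEADER.size + n]
    return ptype, timestamp, payload


# Illustrative packet: type 7, timestamp 1700000000, 4-byte payload.
sample = struct.pack("<iqq", 7, 1700000000, 4) + b"ping"
parsed = parse_packet(sample)
```

The fixed part of the packet is 20 bytes (4 + 8 + 8), so the receiver validates the header first and the payload length second.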
Complete Encode/Decode and File I/O Example:
use "binary.npk".*;
struct:User = {
    int32:id;
    string:name;
};
func:main = int32() {
    // 1. Encode
    User:u = { id: 42i32, name: "Alice" };
    int64:buf_out = raw bin_new();
    drop bin_write_int32(buf_out, u.id);
    drop bin_write_str(buf_out, u.name);
    // 2. Save to file
    drop bin_to_file(buf_out, "user.dat");
    drop bin_free(buf_out);
    // 3. Load from file
    int64:buf_in = raw bin_from_file("user.dat");
    // 4. Decode
    User:loaded = {
        id: raw bin_read_int32(buf_in),
        name: raw bin_read_str(buf_in)
    };
    drop bin_free(buf_in);
    exit 0;
};
Error Handling Patterns
bin_seek clamps out-of-range offsets to [0, len], so a bad seek cannot move the cursor outside the buffer. Reads are a different matter: as noted under Error Handling, a read that needs more bytes than remain may touch undefined memory, so do not rely on clamping or zeroed defaults. Always check bin_remaining() before performing grouped deserializations.
int64:rem = raw bin_remaining(buf);
if (rem < 8i64) {
    // Cannot safely read an int64 payload
    println("Error: truncated payload");
    exit 1;
}
int64:val = raw bin_read_int64(buf);
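For grouped deserializations, one check covering the whole group is usually simpler than a check per field. The Python sketch below illustrates the idea with a hypothetical fixed-size record (int32 id, int64 timestamp, flt64 value); the record layout and the read_record helper are assumptions made for the example.

```python
import struct

# Hypothetical record: int32 id, int64 timestamp, flt64 value (little-endian, unpadded).
RECORD = struct.Struct("<iqd")


def read_record(data: bytes, pos: int):
    """Return (fields, new_pos), or (None, pos) if fewer than RECORD.size bytes remain."""
    if len(data) - pos < RECORD.size:
        return None, pos                  # cursor is left where it was
    return RECORD.unpack_from(data, pos), pos + RECORD.size


full = RECORD.pack(1, 2, 3.5)
ok_fields, ok_pos = read_record(full, 0)
bad_fields, bad_pos = read_record(full[:19], 0)   # one byte short of a full record
```

Checking the full group size up front means a truncated input never leaves the cursor in the middle of a record.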
Performance Notes
- Append-only Structure: Because binary objects act as serialization streams, bin_write_* functions only append data to the end of the buffer, which handles automatic memory reallocation seamlessly. It is optimized for contiguous streaming.
- Random Access Writes: binary does NOT support random access writes or modifications of bytes in the middle of a buffer. If you need to manipulate specific offset values, memory-map a buffer instead.
- Copy Overheads: Strings dynamically allocate on read operations since they are not zero-copy string views. High-throughput scenarios parsing many strings will incur memory allocation pressure.
Known Limitations
- The binary type is currently an opaque int64 handle interacting with nitpick-libc. There is no first-class language keyword equivalent natively enforcing type limits on it.
- No zero-copy views. Reading strings allocates new strings.
- Writing occurs implicitly at the end (len). You cannot overwrite bytes in the middle of the buffer (requires manual external byte-masking using buffer).
binary vs buffer
A buffer acts as a raw mutable memory region. You access specific byte offsets, deal heavily in pointer arithmetic, and interface closely with C FFI structures or memory pages.
The binary module represents an abstracted serialization stream. It natively enforces layout, endianness, capacity reallocation, and encoding parameters on behalf of the developer. Use binary when dumping variables to disk/network, and use buffer when you just need raw unmanaged RAM blocks.