Keep basic block offsets as construction-only metadata rather than
storing them on every Executable. The validator now receives the offsets
through a transient Rust FFI span, and the bytecode dump rebuilds block
starts by scanning labels, terminators, and exception handler metadata.
Drop the table from the bytecode cache format and bump the format
version so old caches are rebuilt. This removes a field that was only
used by validation and bytecode dump paths.
The plan is to start caching compiled JS bytecode on disk. Before
loading anything from a cache we need confidence that the bytes are
structurally well-formed, since a corrupted or tampered-with cache
file could otherwise hand the interpreter an out-of-bounds jump or a
constant-pool index that points past the end of the table.
This commit lays down the scaffolding for that validator. The walker
lives in Rust (Libraries/LibJS/Rust/src/bytecode/validator.rs) so
that it can share the existing Bytecode.def-driven layout machinery
with the encoder. C++ calls into it through cbindgen, the same way
the rest of the Rust pipeline is wired up.
For now, the validator only does Pass 1: walk the byte stream,
verify each instruction is 8-byte aligned, the opcode byte is in
range, and the reported length keeps us inside the buffer. The
length lookup is generated from Bytecode.def so fixed-length and
variable-length instructions stay in sync with the rest of the
codegen automatically. Per-field bounds checks (operands, labels,
table indices, cache indices) and structural extras (basic block
offsets, exception handlers, source map) come in follow-up commits.
The validator runs after every successful compilation in debug and
sanitizer builds, gated on !NDEBUG || HAS_ADDRESS_SANITIZER, so we
get an extra sanity check on every executable the encoder produces
without paying for it in release builds. Failure trips a
VERIFY_NOT_REACHED with the offset, opcode, and error category
logged via dbgln().