internal/pkgbits: explain the rationale for reference tables

The primary benefit of reference tables is to the linker, though they
are also reasonably compact compared to absolute element indices. It
is also worth checking whether the reference table structure is
similarly exploited past the IR linking stage.

Ideally, the reference table definition would live in / near the linker.
As it stands, it's a bit hard to infer the purpose of the reference
tables when looking at pkgbits in isolation.

Change-Id: I496aca5a4edcf28e66fa7863ddfa4d825e1b2e89
Reviewed-on: https://go-review.googlesource.com/c/go/+/675596
Auto-Submit: Mark Freeman <mark@golang.org>
Reviewed-by: Robert Griesemer <gri@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Mark Freeman 2025-05-22 11:06:23 -04:00 committed by Gopher Robot
parent 4878b4471b
commit f14f3aae1c


@ -37,7 +37,68 @@ type AbsElemIdx = uint32
// relative to some other index, such as the start of a section.
type RelElemIdx = Index
// TODO(markfreeman): Isn't this strictly less efficient than an AbsElemIdx?
/*
All elements are preceded by a reference table. Reference tables provide an
additional indirection layer for element references. That is, for element A to
reference element B, A encodes the index of the reference table entry that
points to B, rather than encoding the entry itself.
# Functional Considerations
Reference table layout is important primarily to the UIR linker. After noding,
the UIR linker sees a UIR file for each package with imported objects
represented as stubs. In a simple sense, the job of the UIR linker is to merge
these "stubbed" UIR files into a single "linked" UIR file for the target package
with stubs replaced by object definitions.
To do this, the UIR linker walks each stubbed UIR file and pulls in elements in
dependency order; that is, if A references B, then B must be placed into the
linked UIR file first. This depth-first traversal is done by recursing through
each element's reference table.
When placing A in the linked UIR file, the reference table entry for B must be
updated, since B is unlikely to be at the same relative element index as it was
in the stubbed UIR file.
Without reference tables, the UIR linker would need to decode the whole element
to discover its references. Note that the UIR linker cannot jump directly to the
reference locations after discovering merely the type of the element; because
primitives are variable-width, references do not sit at fixed offsets.
After updating the reference table, the rest of the element may be copied
directly into the linked UIR file. Note that the UIR linker may decide to read
in the element anyway (for unrelated reasons).
In short, reference tables provide an efficient mechanism for traversing,
discovering, and updating element references during UIR linking.
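The traversal described above can be illustrated with a minimal sketch. It is
not the UIR linker's implementation: the elem type, the link function, and the
use of plain slice indices in place of encoded section-relative indices are all
hypothetical simplifications, and the sketch assumes the reference graph is
acyclic.

```go
package main

import "fmt"

// elem is a hypothetical, simplified element: a reference table followed by
// an opaque body. Real elements are bitstream-encoded; this is only a model.
type elem struct {
	refs []int  // reference table: indices of the referenced elements
	body string // opaque payload, copied without being decoded
}

// link copies src[idx] and everything it references into dst in dependency
// order and returns the new index of src[idx]. The moved map records where
// each already-placed element landed. Assumes no reference cycles.
func link(src []elem, idx int, dst *[]elem, moved map[int]int) int {
	if n, ok := moved[idx]; ok {
		return n // already placed in the linked output
	}
	e := src[idx]
	newRefs := make([]int, len(e.refs))
	for i, r := range e.refs {
		// Place the referenced element first, then record its new index.
		newRefs[i] = link(src, r, dst, moved)
	}
	// Only the reference table changes; the body is copied as-is.
	*dst = append(*dst, elem{refs: newRefs, body: e.body})
	moved[idx] = len(*dst) - 1
	return moved[idx]
}

func main() {
	// A (index 0) references B (index 2); index 1 is never referenced.
	src := []elem{{refs: []int{2}, body: "A"}, {body: "unused"}, {body: "B"}}
	var dst []elem
	idx := link(src, 0, &dst, map[int]int{})
	fmt.Println(idx, dst[idx].refs, dst[0].body) // prints "1 [0] B"
}
```

In this model B lands before A, the unreferenced element is dropped, and A's
table entry is rewritten from 2 to 0 without inspecting A's body.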
# Storage Considerations
Reference tables also have compactness benefits:
- If A refers to B multiple times, the entry is deduplicated and referred to
  more compactly by the index.
- Relative (to a section) element indices are typically smaller than absolute
  element indices, and thus fit into smaller varints.
- Most elements do not reference many elements; thus table size indicators and
  table indices are typically a byte each.
Thus, the storage performance is as follows:
+-----------------------------+-----------+--------------+
| Scenario                    | Best Case | Typical Case |
+-----------------------------+-----------+--------------+
| First reference from A to B | 3 Bytes   | 4 Bytes      |
| Other reference from A to B | 1 Byte    | 1 Byte       |
+-----------------------------+-----------+--------------+
The typical case for the first scenario differs from the best case because many
sections contain elements at relative indices above 127 (the largest value a
1-byte uvarint can hold), so the relative index is typically 2 bytes, though
this depends on the distribution of referenced indices within the section.
The second scenario does not change because most elements reference few enough
elements that every table index stays at or below 127 and fits in 1 byte.
A typically references B only once, so most references cost 4 bytes.
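The byte counts above can be reproduced with a small sketch. The breakdown used
here is an assumption that happens to match the table, not something stated in
this comment: a first reference is modeled as one uvarint for the section kind,
one for the relative index, and one for the table index used in the element
body. The helper names uvarintLen and firstRefCost are hypothetical.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// uvarintLen reports how many bytes x occupies when encoded as a uvarint.
func uvarintLen(x uint64) int {
	var buf [binary.MaxVarintLen64]byte
	return binary.PutUvarint(buf[:], x)
}

// firstRefCost models the cost of a first reference from A to B under the
// assumed breakdown: section kind + relative index + table index.
func firstRefCost(kind, relIdx, tableIdx uint64) int {
	return uvarintLen(kind) + uvarintLen(relIdx) + uvarintLen(tableIdx)
}

func main() {
	fmt.Println(firstRefCost(1, 100, 0)) // relative index <= 127: prints 3
	fmt.Println(firstRefCost(1, 300, 0)) // relative index > 127: prints 4
	fmt.Println(uvarintLen(0))           // repeat reference, table index only: prints 1
}
```

The jump from 3 to 4 bytes comes entirely from the relative index crossing the
1-byte uvarint limit of 127.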
*/
// A RefTableEntry is an entry in an element's reference table. All
// elements are preceded by a reference table which provides locations
// for referenced elements.