The asm interpreter already inlines ECMAScript calls, but builtin calls
still went through the generic C++ Call slow path even when the callee
was a plain native function pointer. That added an avoidable boundary
around hot builtin calls and kept asm from taking full advantage of the
new RawNativeFunction representation.
Teach the asm Call handler to recognize RawNativeFunction, allocate the
callee frame on the interpreter stack, copy the call-site arguments,
and jump straight to the stored C++ entry point.
NativeJavaScriptBackedFunction and other non-raw callees keep falling
through to the existing C++ slow path unchanged.
Place Realm's cached declarative environment next to its global object
so the asm global access fast paths can fetch the two pointers with a
paired load. These handlers never use the intervening GlobalEnvironment
pointer directly.
Mirror Executable's constants size and data pointer in adjacent fields
so the asm Call fast path can pair-load them together. The underlying
Vector layout keeps size and data apart, so a small cached raw span
lets the hot constant-copy loop fetch both pieces of metadata at once.
Load PropertyNameIterator's indexed-property count and next index
together when stepping the fast path. Keeping the paired count live
into the named-property case also avoids reloading it before computing
the flattened index.
Load PropertyNameIterator's cached property cache and shape snapshot
together before validating the receiver shape. The two fields already
sit adjacent in the object layout, so the fast path can fetch both
without any extra reshuffling.
Load EnvironmentCoordinate::hops and ::index together in the asm
environment-walk helper. The pair-load keeps the DSL explicit about
which two fields travel together and removes another scalar metadata
fetch from the fast path.
Load the cached property offset and dictionary generation with paired
loads in the property inline-cache fast paths. AsmIntGen now verifies
these reads against the actual cache layout, so the DSL keeps both
fields named and self-documenting.
Load the cached shape and prototype pointer together in the property
inline-cache fast paths that already read both. This keeps the
cache-entry metadata fetches aligned with the DSL's paired-load model
without changing the surrounding control flow.
Load the inline frame's return pc and destination register at once when
Return or End resumes an asm-managed caller. This keeps the unwind
metadata with the helper that consumes it and removes a separate scalar
load from both handlers.
The asm Call fast path was still reloading the executable pointer while
building the inline callee frame, even though it had already loaded the
same pointer while validating the call target.
Carry that executable pointer through frame setup and reload the passed
argument count from the call bytecode instead of the fresh frame header.
This trims a couple more loads from the hot path.
Pack the asm Call fast path metadata next to the executable pointer
so the interpreter can fetch both values with one paired load. This
removes several dependent shared-data loads from the hot path.
Keep the executable pointer and packed metadata in separate registers
through `this` binding so the fast path can still use the paired-load
layout after any non-strict `this` adjustment.
Lower the packed metadata flag checks correctly on x86_64 as well.
Those bits now live above bit 31, so the generator uses bt for single-
bit high masks and covers that path with a unit test.
Add a runtime test that exercises both object and global this binding
through the asm Call fast path.
Executable already caches the combined registers, locals, and constants
count that the asm Call fast path needs for inline frame allocation.
Use that precomputed total instead of rebuilding it from the registers
count and constants vector size in the hot path.
The asm Call fast path already checks SharedFunctionInstanceData's
cached can_inline_call bit before touching the executable pointer.
That cache is only true for ordinary functions with compiled bytecode,
so the extra executable null check is redundant work in the hot path.
The asm Call fast path reads InterpreterStack::m_top and m_limit
back-to-back while checking whether the inline callee frame fits.
Those fields are adjacent, so we can load them together with one
paired load and keep the stack-size check otherwise unchanged.
Teach the asm Call fast path to use paired stores for the fixed
ExecutionContext header writes and for the caller linkage fields.
This also initializes the five reserved Value slots directly instead
of looping over them as part of the general register clear path.
That keeps the hot frame setup work closer to the actual data layout:
reserved registers are seeded with a couple of fixed stores, while the
remaining register and local slots are cleared in wider chunks.
On x86_64, keep the new explicit-offset formatting on store_pair*
and load_pair* without changing ordinary [base, index, scale]
operands into base-plus-index-plus-offset addresses. Add unit
tests covering both the paired zero-offset form and the preserved
scaled-index lowering.
Use the new paired-load DSL operations in the inline Call path for the
adjacent environment, ScriptOrModule, caller metadata, and callee-entry
loads. The flow stays the same, but the hot call setup now needs fewer
scalar memory operations on aarch64.
Handle inline-eligible JS-to-JS Call directly in asmint.asm instead
of routing the whole operation through AsmInterpreter.cpp.
The asm handler now validates the callee, binds `this` for the
non-allocating cases, reserves the callee InterpreterStack frame,
populates the ExecutionContext header and Value tail, and enters the
callee bytecode at pc 0.
Keep the cases that need NewFunctionEnvironment() or sloppy `this`
boxing on a narrow helper that still builds an inline frame. This
preserves the existing inline-call semantics for promise-job ordering,
receiver binding, and sloppy global-this handling while keeping the
common path in assembly.
Add regression coverage for closure-capturing callees, sloppy
primitive receivers, and sloppy undefined receivers.
Emit the ExecutionContext, function-object, executable, and realm
offsets that the asm Call path needs to inspect and initialize
directly when building inline frames.
Handle Return and End entirely in AsmInt when leaving an inline frame.
The handlers now restore the caller, update the interpreter stack
bookkeeping directly, and bump the execution generation without
bouncing through AsmInterpreter.cpp.
Add WeakRef tests that exercise both inline Return and inline End
so this path stays covered.
Store whether a function can participate in JS-to-JS inline calls on
SharedFunctionInstanceData instead of recomputing the function kind,
class-constructor bit, and bytecode availability at each fast-path
call site.
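A sketch of the cached predicate, computed once when the shared data is
set up instead of at every fast-path call site. Only
SharedFunctionInstanceData and the eligibility rule come from the change;
the enum and field names below are illustrative:

    enum class FunctionKind { Normal, Generator, Async, AsyncGenerator };

    struct SharedFunctionInstanceDataSketch {
        FunctionKind kind { FunctionKind::Normal };
        bool is_class_constructor { false };
        bool has_bytecode { false };
        bool can_inline_call { false };

        void update_can_inline_call()
        {
            // Only ordinary functions with compiled bytecode are eligible
            // for the JS-to-JS inline call fast path.
            can_inline_call = kind == FunctionKind::Normal
                && !is_class_constructor
                && has_bytecode;
        }
    };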
The bytecode interpreter only needed the running execution context,
but still threaded a separate Interpreter object through both the C++
and asm entry points. Move that state and the bytecode execution
helpers onto VM instead, and teach the asm generator and slow paths to
use VM directly.
Specialize only the fixed unary case in the bytecode generator and let
all other argument counts keep using the generic Call instruction. This
keeps the builtin bytecode simple while still covering the common fast
path.
The asm interpreter handles int32 inputs directly, applies the ToUint16
mask in-place, and reuses the VM's cached ASCII single-character
strings when the result is 7-bit representable. Non-ASCII single code
unit results stay on the dedicated builtin path via a small helper, and
the dedicated slow path still handles the generic cases.
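A standalone sketch of that decision, with a hypothetical per-VM cache of
single-character ASCII strings standing in for the real one:

    #include <cstdint>
    #include <optional>
    #include <string>

    std::optional<std::string> try_ascii_single_char(int32_t code_unit_arg,
        std::string const (&ascii_single_char_cache)[128])
    {
        // ToUint16: keep only the low 16 bits of the int32 argument.
        uint16_t code_unit = static_cast<uint16_t>(code_unit_arg);
        if (code_unit >= 0x80)
            return std::nullopt; // non-ASCII: dedicated builtin helper builds the string
        return ascii_single_char_cache[code_unit];
    }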
Tag String.prototype.charAt as a builtin and emit a dedicated
bytecode instruction for non-computed calls.
The asm interpreter can then stay on the fast path when the
receiver is a primitive string with resident UTF-16 data and the
selected code unit is ASCII. In that case we can return the VM's
cached empty or single-character ASCII string directly.
Teach builtin call specialization to recognize non-computed
member calls to charCodeAt() and emit a dedicated builtin opcode.
Mark String.prototype.charCodeAt with that builtin tag, then add
an asm interpreter fast path for primitive-string receivers whose
UTF-16 data is already resident.
The asm path handles both ASCII-backed and UTF-16-backed resident
strings, returns NaN for out-of-bounds Int32 indices, and falls
back to the generic builtin call path for everything else. This
keeps the optimistic case in asm while preserving the ordinary
method call semantics when charCodeAt has been replaced or when
string resolution would be required.
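A sketch of the in-bounds/out-of-bounds handling, with a plain UTF-16 code
unit vector standing in for a resident primitive string; replaced
charCodeAt, non-string receivers, and non-resident data bail out before
this point:

    #include <cmath>
    #include <cstddef>
    #include <cstdint>
    #include <vector>

    double char_code_at_fast_path(std::vector<uint16_t> const& code_units, int32_t index)
    {
        if (index < 0 || static_cast<std::size_t>(index) >= code_units.size())
            return std::nan(""); // out-of-bounds Int32 index: NaN
        return static_cast<double>(code_units[static_cast<std::size_t>(index)]);
    }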
Replace the generic CallBuiltin instruction with one opcode per
supported builtin call and make those instructions fixed-size by
arity. This removes the builtin dispatch sled in the asm
interpreter, gives each builtin a dedicated slow-path entry point,
and lets bytecode generation encode the callee shape directly.
Keep the existing handwritten asm fast paths for the Math builtins
that already benefit from them, while routing the other builtin
opcodes through their own C++ execute implementations. Build the
new opcodes directly in Rust codegen, and keep the generic call
fallback when the original builtin function has been replaced.
Cache the flattened enumerable key snapshot for each `for..in` site and
reuse a `PropertyNameIterator` when the receiver shape, dictionary
generation, indexed storage kind and length, prototype chain
validity, and magical-length state still match.
Handle packed indexed receivers as well as plain named-property
objects. Teach `ObjectPropertyIteratorNext` in `asmint.asm` to return
cached property values directly and to fall back to the slow iterator
logic when any guard fails.
Treat arrays' hidden non-enumerable `length` property as a visited
name for for-in shadowing, and include the receiver's magical-length
state in the cache key so arrays and plain objects do not share
snapshots.
Add `test-js` and `test-js-bytecode` coverage for mixed numeric and
named keys, packed receiver transitions, re-entry, iterator reuse, GC
retention, array length shadowing, and same-site cache reuse.
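A rough sketch of the per-site cache key comparison; the fields mirror the
guards listed above, but the struct and pointer types are illustrative
stand-ins for the real cache entry:

    #include <cstdint>

    struct ForInCacheKeySketch {
        void const* shape { nullptr };
        uint32_t dictionary_generation { 0 };
        uint8_t indexed_storage_kind { 0 };
        uint32_t indexed_storage_length { 0 };
        void const* prototype_chain_validity { nullptr };
        bool has_magical_length { false }; // arrays and plain objects must not share snapshots

        bool matches(ForInCacheKeySketch const& current) const
        {
            return shape == current.shape
                && dictionary_generation == current.dictionary_generation
                && indexed_storage_kind == current.indexed_storage_kind
                && indexed_storage_length == current.indexed_storage_length
                && prototype_chain_validity == current.prototype_chain_validity
                && has_magical_length == current.has_magical_length;
        }
    };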
Teach the asm PutByValue path to materialize in-bounds holey array
elements directly when the receiver is a normal extensible Array with
the default prototype chain and no indexed interference. This avoids
bouncing through generic property setting while preserving the lazy
holey length model.
Keep the fast path narrow so inherited setters, inherited non-writable
properties, and non-extensible arrays still fall back to the generic
semantics. Add regression coverage for those cases alongside the large
holey array stress tests.
Treat setting a large array length as a logical length change instead of
forcing dictionary indexed storage or materializing every hole up front.
This keeps dense fills on Array(length) on the holey indexed path and
only falls back to sparse storage when later writes actually create a
large realized gap.
The asm indexed get/put fast paths assumed holey arrays always had a
materialized backing store. Guard those paths with a capacity check so
lazy holey arrays fall back safely until an index has been realized.
Add regression coverage for very large holey arrays and for densely
filling a large holey array after pre-sizing it with Array(length).
Remove Bytecode::compile() and the old create() overloads on
ECMAScriptFunctionObject that accepted C++ AST nodes. These
have no remaining callers now that all compilation goes through
the Rust pipeline.
Also remove the if-constexpr Parse Node branch from
async_block_start, since the Statement template instantiation
was already removed.
Fix transitive include dependencies on Generator.h by adding
explicit includes for headers that were previously pulled in
transitively.
When the asmint computes a double result for Add, Sub, Mul,
Math.floor, Math.ceil, or Math.sqrt, try to store it as Int32
if the value is a whole number in [INT32_MIN, INT32_MAX] and
not -0.0. This mirrors the JS::Value(double) constructor and
allows downstream int32 fast paths to fire.
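A standalone sketch of that check (the function name is hypothetical; the
real logic lives in the asm handlers and mirrors JS::Value(double)):

    #include <cmath>
    #include <cstdint>
    #include <limits>
    #include <optional>

    std::optional<int32_t> try_canonicalize_to_int32(double value)
    {
        if (!(value >= std::numeric_limits<int32_t>::min()
              && value <= std::numeric_limits<int32_t>::max()))
            return std::nullopt;           // out of range, or NaN
        if (std::trunc(value) != value)
            return std::nullopt;           // not a whole number
        if (value == 0.0 && std::signbit(value))
            return std::nullopt;           // keep -0.0 boxed as a double
        return static_cast<int32_t>(value);
    }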
Also add label uniquification to the DSL macro expander so the
same macro can be used multiple times in one handler without
label collisions.
Teach js_to_int32 to leave a clean low 32-bit result on success, then
use box_int32_clean in the ToInt32 fast path and adjacent boolean
coercions. This removes one instruction from the AArch64 fjcvtzs path
and trims the boolean boxing path without changing behavior.
Add the small AsmIntGen float32 load, store, and conversion operations
needed to handle Float32Array directly in the AsmInt typed-array
GetByValue and PutByValue paths.
This covers direct indexed reads plus both int32 and double stores,
and adds regression coverage for Math.fround rounding, negative zero,
and NaN.
Teach the asm typed-array GetByValue and PutByValue paths to handle
Uint8ClampedArray directly. Reads can share the Uint8Array load path,
while int32 stores clamp in asm instead of bailing out to C++.
Add a direct indexed access regression test for clamped int32 stores.
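The clamp itself is simple; a C++ equivalent of what the asm store path
now does for int32 values, following Uint8ClampedArray semantics for
integer inputs:

    #include <cstdint>

    uint8_t clamp_int32_for_uint8_clamped(int32_t value)
    {
        if (value < 0)
            return 0;
        if (value > 255)
            return 255;
        return static_cast<uint8_t>(value);
    }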
Cache raw data pointers on fixed-length typed array views so asm
GetByValue and PutByValue can use them directly for indexed
element access.
Replace the asm typed-array hot-path
ArrayBuffer/DataBlock/ByteBuffer walk with one cached_data_ptr load.
Remove six unconditional loads, four branches, and the byte_offset
add before the element access, trading them for one
cached_data_ptr null check.
Keep direct C++ typed-array access on IsValidIntegerIndex-based
checks, invalidate cached pointers eagerly when a backing
ArrayBuffer is detached, and add regression coverage for shrink,
regrow, and detach on number and BigInt typed arrays.
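A sketch of the cached-pointer idea with illustrative names: the view keeps
a raw pointer to buffer data plus byte offset, and clears it eagerly on
detach so the fast path's single null check stays sound:

    #include <cstddef>
    #include <cstdint>

    struct TypedArrayViewSketch {
        uint8_t* cached_data_ptr { nullptr }; // buffer data + byte_offset, or null
        std::size_t length { 0 };

        void attach(uint8_t* buffer_data, std::size_t byte_offset, std::size_t element_count)
        {
            cached_data_ptr = buffer_data + byte_offset;
            length = element_count;
        }

        void on_detach()
        {
            cached_data_ptr = nullptr; // fast paths fall back to the C++ slow path
            length = 0;
        }

        bool fast_load_u8(std::size_t index, uint8_t& out) const
        {
            if (!cached_data_ptr || index >= length)
                return false; // one null/bounds check replaces the old pointer walk
            out = cached_data_ptr[index];
            return true;
        }
    };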
Use dedicated Packed branches in GetByValue and PutByValue so
in-bounds indexed accesses can skip hole checks and slot
reloads.
Keep Holey writes on the guarded arm, and keep append writes on
the C++ slow path so PutByValue still respects non-extensible
indexed objects and arrays with a non-writable length.
Add a bytecode regression that exercises both append failure
cases through the real js binary path.
Replace the OwnPtr<IndexedPropertyStorage> indirection with inline
indexed element storage directly on Object. This eliminates virtual
dispatch and reduces indirection for indexed property access.
The new system uses three storage kinds tracked by IndexedStorageKind:
- Packed: Dense array, no holes. Elements stored in a malloc'd Value*
array with capacity header (same layout as named properties).
- Holey: Dense array with possible holes marked by empty sentinel.
Same physical layout as Packed.
- Dictionary: Sparse storage using GenericIndexedPropertyStorage,
type-punned into the m_indexed_elements pointer.
Transitions: None->Packed->Holey->Dictionary (mostly monotonic).
Dictionary mode triggers on non-default attributes or sparse arrays.
Object keeps the same 48-byte size since m_indexed_elements (8 bytes)
replaces IndexedProperties (8 bytes), and the storage kind + array
size fit in existing padding alongside m_flags.
The asm interpreter benefits from one fewer indirection: it now reads
the element pointer and array size directly from Object fields instead
of chasing through OwnPtr -> IndexedPropertyStorage -> Vector.
Removes: IndexedProperties, SimpleIndexedPropertyStorage,
IndexedPropertyStorage, IndexedPropertyIterator.
Keeps: GenericIndexedPropertyStorage (for Dictionary mode).
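A sketch of the kind tracking; the enum values and m_indexed_elements field
mirror the description above, while the surrounding struct is illustrative:

    #include <cstdint>

    enum class IndexedStorageKind : uint8_t {
        None,       // no indexed elements yet
        Packed,     // dense, no holes; malloc'd Value* with capacity header
        Holey,      // dense with holes marked by the empty sentinel; same layout
        Dictionary, // sparse; GenericIndexedPropertyStorage* type-punned into the pointer
    };

    struct IndexedElementsSketch {
        void* m_indexed_elements { nullptr }; // Value* for Packed/Holey, storage object for Dictionary
        IndexedStorageKind kind { IndexedStorageKind::None };

        // Transitions are mostly monotonic:
        // None -> Packed -> Holey -> Dictionary.
        void transition_to(IndexedStorageKind new_kind)
        {
            if (new_kind > kind)
                kind = new_kind;
        }
    };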
Replace the 24-byte Vector<Value> m_storage with an 8-byte raw
Value* m_named_properties pointer, backed by a malloc'd allocation
with an inline capacity header.
Memory layout of the allocation:
[u32 capacity] [u32 padding] [Value 0] [Value 1] ...
m_named_properties points to Value 0.
This shrinks JS::Object from 64 to 48 bytes (on non-Windows
platforms) and removes one level of indirection for property access
in the asm interpreter, since the data pointer is now stored directly
on the object rather than inside a Vector's internal metadata.
Growth policy: max(4, max(needed, old_capacity * 2)).
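A standalone sketch of the layout and growth policy, using a plain uint64_t
slot in place of JS::Value; names and the helper itself are illustrative:

    #include <algorithm>
    #include <cstdint>
    #include <cstdlib>
    #include <cstring>

    struct SlotArraySketch {
        uint64_t* slots { nullptr }; // points at Value 0; capacity header sits just before it

        static uint32_t capacity_of(uint64_t* slots)
        {
            if (!slots)
                return 0;
            return *reinterpret_cast<uint32_t*>(reinterpret_cast<char*>(slots) - 8);
        }

        void ensure_capacity(uint32_t needed)
        {
            uint32_t old_capacity = capacity_of(slots);
            if (needed <= old_capacity)
                return;
            // Growth policy: max(4, max(needed, old_capacity * 2)).
            uint32_t new_capacity = std::max(4u, std::max(needed, old_capacity * 2));
            // Layout: [u32 capacity] [u32 padding] [Value 0] [Value 1] ...
            auto* allocation = static_cast<char*>(malloc(8 + new_capacity * sizeof(uint64_t)));
            *reinterpret_cast<uint32_t*>(allocation) = new_capacity;
            auto* new_slots = reinterpret_cast<uint64_t*>(allocation + 8);
            if (slots) {
                memcpy(new_slots, slots, old_capacity * sizeof(uint64_t));
                free(reinterpret_cast<char*>(slots) - 8);
            }
            slots = new_slots;
        }
    };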
Add Mov2 and Mov3 bytecode instructions that perform 2 or 3 register
moves in a single dispatch. A peephole optimization pass during
bytecode assembly merges consecutive Mov instructions within each
basic block into these combined instructions.
When merging, identical Movs are deduplicated (e.g. two identical Movs
become a single Mov, not a Mov2). This optimization is implemented in
both the C++ and Rust codegen pipelines.
The goal is to reduce the per-instruction dispatch overhead, which is
significant compared to the actual cost of moving a value.
This isn't fancy or elegant, but provides a real speed-up on many
workloads. As an example, Kraken/imaging-desaturate.js improves by
~1.07x on my laptop.
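A simplified sketch of the merge step over one run of consecutive Movs; the
real pass lives in the C++ and Rust assemblers and works on the actual
instruction stream, and the combined instruction is assumed to perform its
moves in source order:

    #include <cstdint>
    #include <vector>

    struct Mov { uint32_t dst; uint32_t src; bool operator==(Mov const&) const = default; };
    struct MovGroup { std::vector<Mov> movs; }; // emitted as Mov, Mov2, or Mov3

    std::vector<MovGroup> merge_consecutive_movs(std::vector<Mov> const& run)
    {
        std::vector<MovGroup> out;
        for (auto const& mov : run) {
            // Deduplicate: an exact repeat of the previous Mov is a no-op.
            if (!out.empty() && out.back().movs.back() == mov)
                continue;
            if (out.empty() || out.back().movs.size() == 3)
                out.push_back({});
            out.back().movs.push_back(mov);
        }
        return out;
    }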
Replace the 16-byte Variant<Empty, GC::Ref<Script>, GC::Ref<Module>>
with a simple 8-byte GC::Ptr<Cell> that points to either a Script or
Module (or is null for Empty).
A helper function script_or_module_from_cell() converts back to the
full ScriptOrModule variant when needed (e.g. in
VM::get_active_script_or_module).
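A sketch of the conversion, with stand-in types; in the engine the pointer
is a GC::Ptr<Cell> and the helper rebuilds the original variant on demand:

    #include <variant>

    struct Cell { virtual ~Cell() = default; };
    struct Script : Cell { };
    struct Module : Cell { };
    struct Empty { };
    using ScriptOrModule = std::variant<Empty, Script*, Module*>;

    ScriptOrModule script_or_module_from_cell(Cell* cell)
    {
        if (!cell)
            return Empty {};
        if (auto* script = dynamic_cast<Script*>(cell))
            return script;
        return static_cast<Module*>(cell); // only Script or Module is ever stored
    }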
Remove four fields that are trivially derivable from other fields
already present in the ExecutionContext:
- global_object (from realm)
- global_declarative_environment (from realm)
- identifier_table (from executable)
- property_key_table (from executable)
This shrinks ExecutionContext from 192 to 160 bytes (-17%).
The asmint's GetGlobal/SetGlobal handlers now load through the realm
pointer, taking advantage of the cached declarative environment
pointer added in the previous commit.
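A sketch of the idea: the removed fields become trivial accessors over the
realm and executable pointers the context already carries (all types below
are simplified stand-ins):

    struct GlobalObject { };
    struct DeclarativeEnvironment { };
    struct Realm {
        GlobalObject* global_object;
        DeclarativeEnvironment* global_declarative_environment; // cached next to global_object
    };
    struct IdentifierTable { };
    struct Executable { IdentifierTable* identifier_table; };

    struct ExecutionContextSketch {
        Realm* realm;
        Executable* executable;

        GlobalObject* global_object() const { return realm->global_object; }
        DeclarativeEnvironment* global_declarative_environment() const
        {
            return realm->global_declarative_environment;
        }
        IdentifierTable* identifier_table() const { return executable->identifier_table; }
    };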
Add branch operations that test only the low 32 bits of a register,
replacing the previous pattern of `and reg, 0xFFFFFFFF` followed by
`branch_zero` or `branch_nonzero`.
On aarch64 the old pattern emitted `mov w1, w1; cbnz x1` (2 insns),
now it's just `cbnz w1` (1 insn). Used in JumpIf, JumpTrue, JumpFalse,
and Not for the int32 truthiness fast path.
Teach the DSL and both arch backends to handle memory operands of
the form [pb, pc, field_ref], meaning base + index + field_offset.
On aarch64, since x21 already caches pb + pc (the instruction
pointer), this emits a single `ldr dst, [x21, #offset]` instead of
the previous `mov t0, x21` + `ldr dst, [t0, #offset]` two-instruction
sequence.
On x86_64, this emits `[r14 + r13 + offset]` which is natively
supported by x86 addressing modes.
Convert all `lea t0, [pb, pc]` + `loadNN tX, [t0, field]` pairs in
the DSL to the new single-instruction form, saving one instruction
per IC access and other field loads in GetById, PutById, GetLength,
GetGlobal, SetGlobal, and CallBuiltin handlers.
Instead of storing a u32 index into a cache vector and looking up the
cache at runtime through a chain of dependent loads (load Executable*,
load vector data pointer, multiply index, add), store the actual cache
pointer as a u64 directly in the instruction stream.
A fixup pass (Executable::fixup_cache_pointers()) runs after Executable
construction in both the Rust and C++ pipelines, walking the bytecode
and replacing each index with the corresponding pointer.
The cache pointer type is encoded in Bytecode.def (e.g.
PropertyLookupCache*, GlobalVariableCache*) so the fixup switch is
auto-generated by the Python Op code generator, making it impossible
to forget updating the fixup when adding new cached instructions.
This eliminates 3-4 dependent loads on every inline cache access in
both the C++ interpreter and the assembly interpreter.
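An illustrative sketch of the pass; in the real pipeline the walk and the
per-opcode switch are generated from Bytecode.def, whereas this stand-in
hard-codes a list of operand offsets:

    #include <cstddef>
    #include <cstdint>
    #include <cstring>
    #include <vector>

    struct PropertyLookupCache { /* ... */ };

    struct ExecutableSketch {
        std::vector<uint8_t> bytecode;
        std::vector<PropertyLookupCache> property_lookup_caches;

        // operand_offsets: byte offsets of cache operands that were emitted
        // as u32 indices into 8-byte slots in the instruction stream.
        void fixup_cache_pointers(std::vector<std::size_t> const& operand_offsets)
        {
            for (std::size_t offset : operand_offsets) {
                uint32_t index;
                memcpy(&index, bytecode.data() + offset, sizeof(index));
                uint64_t pointer = reinterpret_cast<uint64_t>(&property_lookup_caches[index]);
                memcpy(bytecode.data() + offset, &pointer, sizeof(pointer));
            }
        }
    };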
Property lookup cache entries previously used GC::Weak<T> for shape,
prototype, and prototype_chain_validity pointers. Each GC::Weak
requires a ref-counted WeakImpl allocation and an extra indirection
on every access.
Replace these with GC::RawPtr<T> and make Executable a WeakContainer
so the GC can clear stale pointers during sweep via remove_dead_cells.
For static PropertyLookupCache instances (used throughout the runtime
for well-known property lookups), introduce StaticPropertyLookupCache
which registers itself in a global list that also gets swept.
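A very rough sketch of the sweep hook; the real types are GC::RawPtr,
WeakContainer, and remove_dead_cells, while the Cell stand-in and the
live-set representation here are simplified for illustration:

    #include <unordered_set>
    #include <vector>

    struct Cell { };

    struct CacheEntrySketch {
        Cell* shape { nullptr };
        Cell* prototype { nullptr };
        Cell* prototype_chain_validity { nullptr };
    };

    struct WeakContainerSketch {
        std::vector<CacheEntrySketch> caches;

        // Called during GC sweep: clear any raw pointer whose cell died.
        void remove_dead_cells(std::unordered_set<Cell*> const& live_cells)
        {
            for (auto& entry : caches) {
                if (entry.shape && !live_cells.count(entry.shape))
                    entry.shape = nullptr;
                if (entry.prototype && !live_cells.count(entry.prototype))
                    entry.prototype = nullptr;
                if (entry.prototype_chain_validity && !live_cells.count(entry.prototype_chain_validity))
                    entry.prototype_chain_validity = nullptr;
            }
        }
    };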
Now that inline cache entries use GC::RawPtr instead of GC::Weak,
we can compare shape/prototype pointers directly without going
through the WeakImpl indirection. This removes one dependent load
from each IC check in GetById, PutById, GetLength, GetGlobal, and
SetGlobal handlers.
SimpleIndexedPropertyStorage can only hold default-attributed data
properties. Any attempt to store a property with non-default
attributes (such as accessors) triggers conversion to
GenericIndexedPropertyStorage first. So when we've already verified
is_simple_storage, the accessor check is dead code.
Instead of calling into C++ helpers for global let/const variable
access, inline the binding lookup directly in the asm handlers.
This avoids the overhead of a C++ call for the common case.
Module environments still use the C++ helper since they require
additional lookups that aren't worth inlining.
Convert extract_tag, unbox_int32, unbox_object, box_int32, and
box_int32_clean from DSL macros into codegen instructions, allowing
each backend to emit optimal platform-specific code.
On aarch64, this produces significant improvements:
- extract_tag: single `lsr xD, xS, #48` instead of `mov` + `lsr`
(3-operand shifts are free on ARM). Saves 1 instruction at 57
call sites.
- unbox_object: single `and xD, xS, #0xffffffffffff` instead of
`mov` + `shl` + `shr`. The 48-bit mask is a valid ARM64 logical
immediate. Saves 2 instructions at 6 call sites.
- box_int32: `mov wD, wS` + `movk xD, #tag, lsl #48` instead of
`mov` + `and 0xFFFFFFFF` + `movabs tag` + `or`. The w-register
mov zero-extends, and movk overwrites just the top 16 bits.
Saves 2 instructions and no longer clobbers t0 (rax).
- box_int32_clean: `movk xD, #tag, lsl #48` (1 instruction) instead
of `mov` + `movabs tag` + `or` (saves 2 instructions, no t0
clobber).
On x86_64, the generated code is equivalent to the old macros.
Two more spots where the upper 32 bits are already known to be clear:
- UnsignedRightShift: after shr on a zero-extended value, the upper bits
are already clear.
- GetByValue typed array path: load32/load8/load16/load8s/load16s all
write to 32-bit destination registers, zeroing the upper 32 bits.
Both can use box_int32_clean to skip the redundant AND 0xFFFFFFFF.