cmd/compile, cmd/internal: fine-grained fiddling with loop alignment

This appears to be useful only on amd64, and was specifically
benchmarked on Apple Silicon and did not produce any benefit there.
This CL adds the assembly instruction `PCALIGNMAX align,amount`
which aligns to `align` if that can be achieved with `amount`
or fewer bytes of padding. (0 means never, but will align the
enclosing function.)

Specifically, if low-order-address-bits + amount are
greater than or equal to align; thus, `PCALIGNMAX 64,63` is
the same as `PCALIGN 64` and `PCALIGNMAX 64,0` will never
emit any alignment, but will still cause the function itself
to be aligned to (at least) 64 bytes.

Change-Id: Id51a056f1672f8095e8f755e01f72836c9686aa3
Reviewed-on: https://go-review.googlesource.com/c/go/+/577935
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
This commit is contained in:
David Chase 2024-04-02 11:12:44 -04:00
parent 31c8150082
commit 18d0e6a14f
14 changed files with 221 additions and 79 deletions

View file

@ -416,6 +416,7 @@ const (
AJMP
ANOP
APCALIGN
APCALIGNMAX // currently x86, amd64 and arm64
APCDATA
ARET
AGETCALLERPC