Commit graph

719 commits

Author SHA1 Message Date
Cherry Mui
a2b3c73f75 simd/archsimd: correct ARM64 IfElse semantics
ARM64's IfElse behavior is reversed from other platforms. Reverse
it. Internally, its bitSelect is also the reverse of Wasm's
BitSelect. Reverse the ARM64 one to match.

Make Masked and IfElse tests portable.

Change-Id: Icd2dbcb3383b2be642fd6fc7115ef1cbef0f9b78
Reviewed-on: https://go-review.googlesource.com/c/go/+/793361
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: golang-scoped@luci-project-accounts.iam.gserviceaccount.com <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2026-06-23 20:49:14 -07:00
thepudds
61202f574e test/codegen: update runtimefreegc test for slice backing store fix
CL 789060 fixed a case in the backing store analysis
for range over slice statements.

This CL makes a corresponding update to the off-by-default
GOEXPERIMENT=runtimefreegc codegen tests.

While here, we slightly tweak the wording and regexp in the
equivalent default test code (for when GOEXPERIMENT=runtimefreegc
is disabled).

Updates #79909
Fixes #79972

Change-Id: Ic6dfe04fee711b2b71a0edccb115477ad01dc5d0
Reviewed-on: https://go-review.googlesource.com/c/go/+/789980
LUCI-TryBot-Result: golang-scoped@luci-project-accounts.iam.gserviceaccount.com <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Auto-Submit: t hepudds <thepudds1460@gmail.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
2026-06-13 16:46:43 -07:00
Cuong Manh Le
6e04f9f830 cmd/compile: fix slice backing store analysis
The range over slice statement keeps a pointer to the backing store of
the slice, changing it from exclusive to nonexclusive at that point. Thus
we need to mark the transition there.
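
A minimal sketch of the language behavior behind this, not of the compiler's analysis itself: the range expression is evaluated once, so the loop holds its own reference to the slice while the named variable can still be appended to. The function name rangeSnapshot is hypothetical and exists only for illustration.

```go
package main

import "fmt"

// rangeSnapshot appends to s while ranging over it. The range expression is
// evaluated once, so the loop keeps walking the original three elements even
// after the first append has moved s to a larger backing array.
func rangeSnapshot() (seen, final []int) {
	s := []int{1, 2, 3}
	for _, v := range s {
		seen = append(seen, v)
		s = append(s, v*10)
	}
	return seen, s
}

func main() {
	seen, final := rangeSnapshot()
	fmt.Println(seen, final) // [1 2 3] [1 2 3 10 20 30]
}
```

Because the loop and the variable can refer to the same backing store at once, that store can no longer be treated as exclusively owned from the range statement onward.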

Fixes #79909

Change-Id: I7292b5644ac658fa3a6ccd9fa949b454d2f3d770
Reviewed-on: https://go-review.googlesource.com/c/go/+/789060
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: golang-scoped@luci-project-accounts.iam.gserviceaccount.com <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
2026-06-11 11:34:44 -07:00
Keith Randall
7c074f14e6 cmd/compile: use position of nil check when merging it with subsequent store
Semantically the nil check happens first, so we want the position of
the nil check.

In CL 659317 I added the don't-merge-with-store logic. That turned out
to be wrong; it was just a way to work around the problem that I
have just fixed in the previous CL in this stack.

Fixes #79762

Change-Id: Id84d89d1843cc07b6f880f68d881c510d742c5aa
Reviewed-on: https://go-review.googlesource.com/c/go/+/785440
Reviewed-by: Mark Freeman <markfreeman@google.com>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: golang-scoped@luci-project-accounts.iam.gserviceaccount.com <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
2026-06-09 10:32:41 -07:00
Alexander Musman
2458ba018e simd: route HiToLo through Float64x2 and fix codegen tests
Route HiToLo through Float64x2.SetElem/GetElem instead of Uint64x2
to avoid a round-trip through a GP register.

Update simd_arm64.go codegen test for current API.

This is a cherry-pick of CL 787302.

Updates #79899

Change-Id: I3d98bd137474a5188509e5ee365c0d9af386e32c
Reviewed-on: https://go-review.googlesource.com/c/go/+/787303
Reviewed-by: Arseny Samoylov <samoylov.arseny@gmail.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: golang-scoped@luci-project-accounts.iam.gserviceaccount.com <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2026-06-08 11:45:15 -07:00
Josh Bleecher Snyder
a531ad45f7 cmd/compile: remove multi-register shift optimization on amd64
The multi-register shift rewrites were flawed.

When bits is zero mod 64, SHRD and SHLD leave the destination unchanged,
so the result is lo rather than lo | hi.
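
The mismatch can be reproduced in plain Go. This is a sketch under an assumption: that the rewritten source pattern had the shape lo>>(bits&63) | hi<<(-bits&63), which is a reading of this message rather than a quote of the rule. The names sourcePattern and shrdModel are hypothetical; shrdModel imitates the instruction's behavior as described above.

```go
package main

import "fmt"

// sourcePattern is the Go-level expression (assumed shape). With bits == 0
// mod 64 both shift counts are zero, so the result is lo | hi.
func sourcePattern(lo, hi uint64, bits uint) uint64 {
	return lo>>(bits&63) | hi<<(-bits&63)
}

// shrdModel imitates a double-register right shift: a count of zero mod 64
// leaves the destination (lo) unchanged.
func shrdModel(lo, hi uint64, bits uint) uint64 {
	n := bits & 63
	if n == 0 {
		return lo
	}
	return lo>>n | hi<<(64-n)
}

func main() {
	lo, hi := uint64(0xff00), uint64(0x00ff)
	for _, bits := range []uint{0, 8, 64} {
		fmt.Printf("bits=%d source=%#x shrd=%#x\n",
			bits, sourcePattern(lo, hi, bits), shrdModel(lo, hi, bits))
	}
}
```

The two functions agree for bits=8 and differ for bits=0 and bits=64, where the source expression yields 0xffff and the model yields 0xff00.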

We don't have enough information at hand to make better decisions here.
It'd take a lot of machinery to propagate non-zero-ness from prove,
and constant-only would have limited usefulness.

Conveniently, every occurrence in std guards against this.

This was introduced by me (eep) in CL 297050, and extended in CL 399061.
I re-measured on a recent-vintage amd64 machine, and the fused instructions
are no faster. SHRD/SHLD are pretty constrained (resultInArg0,
count in CX, clobbers flags).

Packages math and edwards25519:

                   │ with-fold  │            without-fold             │
                   │   sec/op   │   sec/op     vs base                │
FMA-64                0.7609n ± 0%   0.7591n ± 1%       ~ (p=0.436 n=10)
ScalarBaseMult-64      9.981µ ± 1%    9.827µ ± 0%  -1.54% (p=0.000 n=10)
ScalarMult-64          33.01µ ± 0%    32.90µ ± 0%  -0.33% (p=0.000 n=10)
geomean                630.5n         626.1n       -0.70%

Clean up the ops as well, since nothing now generates them.

Change-Id: I37423aa558d7f626e81ee7db807b43de1747be1b
Reviewed-on: https://go-review.googlesource.com/c/go/+/785801
Reviewed-by: Mark Freeman <markfreeman@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: golang-scoped@luci-project-accounts.iam.gserviceaccount.com <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2026-06-04 09:57:03 -07:00
David Chase
627bc968ea [dev.simd] simd: add ARM64 PMULL (carrylessMultiplyWidenLo) intrinsic
Add polynomial (carryless) multiply long using VPMULL/VPMULL2.
In this CL Uint64x2→Uint64x2 (2D→1Q).
GetHi folding produces VPMULL2 without extra instructions.

Also adds clmul_arm64.go with CarrylessMultiply{Even,Odd,OddEven,EvenOdd}
helper methods matching the amd64 API.

Also adds a feature check for ARM64.PMULL().
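
For readers unfamiliar with carryless multiplication, here is a scalar sketch of what a 64x64→128-bit polynomial multiply computes. It is a software model only and does not use or describe the archsimd API; the function name clmul64 is hypothetical.

```go
package main

import "fmt"

// clmul64 multiplies a and b as polynomials over GF(2): partial products are
// combined with XOR instead of addition, so no carries propagate.
// The 128-bit result is returned as (hi, lo).
func clmul64(a, b uint64) (hi, lo uint64) {
	for i := uint(0); i < 64; i++ {
		if b>>i&1 == 1 {
			lo ^= a << i
			if i > 0 {
				hi ^= a >> (64 - i)
			}
		}
	}
	return hi, lo
}

func main() {
	// (x+1)*(x+1) = x^2+1 over GF(2), i.e. 3 "times" 3 is 5.
	fmt.Println(clmul64(3, 3)) // 0 5
}
```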

Directly based on CL 784020 which includes the 8-bit CLMUL.
Original author: alexander.musman@gmail.com

Change-Id: I6c554398f97c5c827bad92b271b8d03fd8adbd49
Reviewed-on: https://go-review.googlesource.com/c/go/+/785240
LUCI-TryBot-Result: golang-scoped@luci-project-accounts.iam.gserviceaccount.com <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2026-06-01 09:53:55 -07:00
Alexander Musman
80ab7bc1fa [dev.simd] simd: rename LoLong intrinsics to WidenLo
MulLoLong → MulWidenLo, ShiftLeftLoLongConst → ShiftLeftWidenLoConst.
Consistent with wasm's MulWiden{Lo,Hi} naming convention.

Change-Id: I58c4624f62ab977ca37822390779d13019a0c37d
Reviewed-on: https://go-review.googlesource.com/c/go/+/784120
LUCI-TryBot-Result: golang-scoped@luci-project-accounts.iam.gserviceaccount.com <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2026-05-30 13:16:33 -07:00
Joel Sing
687ce6f9af test/codegen: floating point constant loads for riscv64
Change-Id: I0c6fa8ec6bd3ff4f97082fb9a400f19e3f5b28a4
Reviewed-on: https://go-review.googlesource.com/c/go/+/748381
Reviewed-by: Meng Zhuo <mengzhuo1203@gmail.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Auto-Submit: Joel Sing <joel@sing.id.au>
LUCI-TryBot-Result: golang-scoped@luci-project-accounts.iam.gserviceaccount.com <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
2026-05-29 12:19:38 -07:00
Cherry Mui
4b37aa2bc1 [dev.simd] all: merge master (bfbbe96) into dev.simd
Updated internal/runtime/maps/memhash_aes_simd.go to use the
newly renamed Load functions.

Conflicts:

- src/cmd/compile/internal/amd64/simdssa.go
- src/cmd/compile/internal/ssa/_gen/AMD64.rules
- src/cmd/compile/internal/ssa/_gen/ARM64.rules
- src/cmd/compile/internal/ssa/_gen/simdAMD64.rules
- src/cmd/compile/internal/ssa/_gen/simdAMD64ops.go
- src/cmd/compile/internal/ssa/rewriteAMD64.go
- src/cmd/compile/internal/ssagen/intrinsics.go
- src/cmd/compile/internal/types2/stdlib_test.go
- src/go/types/stdlib_test.go
- src/internal/buildcfg/exp.go
- test/codegen/simd.go

Merge List:

+ 2026-05-22 bfbbe9667e runtime: remove unreachable code in malloc_generated.go
+ 2026-05-22 99675026a7 internal/strconv: fix mishandling of long outputs
+ 2026-05-22 fc245b6427 syscall: add export linknames for socketcall on S390X
+ 2026-05-22 cc623858f6 runtime/pprof: update test's expected frame count
+ 2026-05-21 e4283592e5 fmt: give advice on wrapper functions
+ 2026-05-21 1bcea1df64 cmd/{vet,fix}: use new constants from /x/tools/go/analysis/suite
+ 2026-05-21 60f0ced65b internal/testenv: make MustHaveSource detect missing source
+ 2026-05-21 e0a8616941 math/rand/v2: add method Rand.N
+ 2026-05-21 8621461b26 cmd: update vendored x/arch
+ 2026-05-21 0db3804845 archive/zip: turn off large zip test on 32-bit archs
+ 2026-05-21 abdc5da461 simd/archsimd/_gen: annotate text/template usage
+ 2026-05-21 661e0c610e internal/strconv: work around escape analysis bug
+ 2026-05-20 04ed01963e runtime/trace: remove unused runtime_readTrace declaration
+ 2026-05-20 e2c188568d cmd/compile: compute embedded field offset in static initialization
+ 2026-05-20 dd1da37fa4 runtime: always call slowpath for heap bits in span
+ 2026-05-20 5a1c0ee6de test/fixedbugs: minor adjustments to line-directive specific tests
+ 2026-05-20 78f63eb790 crypto/internal/cryptotest/wycheproof: avoid reading go.sum at test time
+ 2026-05-20 244c8ae4c8 crypto/internal/fips140/nistec: avoid some mul64 in p256 calculations
+ 2026-05-20 2c659bb4db crypto/internal/fips140/nistec: optimize P-256 scalar fiat implementation
+ 2026-05-20 e4e6887cee crypto/internal/fips140/nistec: mechanically improve P-256 scalar fiat code
+ 2026-05-20 be35de22f1 crypto/internal/fips140/nistec: replace P-256 scalar assembly with fiat
+ 2026-05-20 91a81e5ae1 go/types,cmd/compile/internal/types2: add String methods
+ 2026-05-20 bbf60f3bbd all: update to x/tools@b38156a7
+ 2026-05-20 f571fc93b0 encoding/json: clarify that v1 Unmarshal calls UnmarshalerFrom methods
+ 2026-05-20 c700213f6c encoding/json/jsontext: expand Decoder.UnreadBuffer documentation
+ 2026-05-20 4a38094e42 database/sql: add RowsColumnScanner, expose ConvertAssign
+ 2026-05-20 4dde0f6c36 all: use linknamestd for new linknames
+ 2026-05-20 6a002d1474 cmd/dist: pass -std to assembler
+ 2026-05-20 694604e524 runtime: further reduce number of size classes
+ 2026-05-20 3652f299a8 cmd/link: skip TestAbstractOriginSanity
+ 2026-05-20 05ab7b8da5 cmd/compile/internal/syntax: refactor/reword new line directives tests
+ 2026-05-20 c0bd270406 net/netip: update godoc comments
+ 2026-05-20 acced3df03 crypto/internal/fips140/edwards25519/field: delete Square amd64 assembly
+ 2026-05-20 a00bbab762 crypto/internal/fips140/edwards25519/field: speed up add chains
+ 2026-05-20 1926d1d95d cmd/compile: clarify relativity of a simple file name in a line directive
+ 2026-05-20 3a9c8e1d90 archive/zip: fix writer-side Zip64 edge cases
+ 2026-05-20 a7ea4a7ecd cmd/compile/internal/syntax: resolve //line filenames relative to source directory
+ 2026-05-20 b8246db0c3 cmd/go/internal/clean: print all removals
+ 2026-05-20 96db4cf31f cmd/go/internal/clean: forget about makefiles
+ 2026-05-20 8a69bfb1bb archive/zip: fix reader-side Zip64 edge cases
+ 2026-05-20 8b672822b2 internal/profile: return error from gzip.Writer.Close in Profile.Write
+ 2026-05-20 c410b4944e runtime: remove duplicated code in no scan slow path
+ 2026-05-20 2f0459745c cmd/compile: make ReassignOracle StaticValue unwrap parens
+ 2026-05-20 1bcfdf2df2 cmd/compile: switch to ReassignOracle.StaticValue in escape call analysis
+ 2026-05-20 84e0c4965a cmd/compile: move FuncAssignments into ReassignOracle
+ 2026-05-20 5af294bac7 cmd/compile: handle multiply-assigned func vars in escape analysis
+ 2026-05-20 3c05d2a519 debug/pe: add FuzzReader
+ 2026-05-20 71300e8011 internal/strconv: use fast unrounded scaling for floating-point
+ 2026-05-20 fd7a0e680d cmd: update golang.org/x/arch for riscv64 disassembler
+ 2026-05-20 c3f7d75877 internal/poll: omit embedded type field in splicePipe construction
+ 2026-05-20 4136ffed69 simd/archsimd: decode non-broadcast memory operands
+ 2026-05-20 856c405c4f internal/cpu: correcting spelling errors in the comments
+ 2026-05-20 b9c5520dbc encoding/json: clarify that v1 Marshal calls MarshalerTo methods
+ 2026-05-20 a40c232e81 crypto/x509: skip TestReadUniqueDirectoryEntries if symlinks unsupported
+ 2026-05-20 7eeacc9cce net/netip: remove incorrect comment in Prefix.AppendTo
+ 2026-05-20 4a6d3a3b46 cmd/go/internal/envcmd: report GOPACKAGESDRIVER
+ 2026-05-20 6c8731962d runtime: have patience for trailing thread in contention test
+ 2026-05-20 b99b8feaae runtime: split gp.m.locks bits for lock vs acquirem
+ 2026-05-19 47f26133bd cmd/internal/obj/loong64: add ll.acq.{w,d}, sc.rel.{w,d}, sc.q instruction
+ 2026-05-19 45f1313c18 go/types: generate alias_test.go from respective types2 source
+ 2026-05-19 8494d25c4c os/signal: make NotifyContext Cause match context.Canceled
+ 2026-05-19 37bce6617f types2, go/types: add missing alias test to types2, simplify go/types test
+ 2026-05-19 2760c3f5a3 go/types: use mustParse helper to simplify tests where possible
+ 2026-05-19 b12ed667d9 crypto/ecdsa: test hash size restrictions
+ 2026-05-19 8329d31307 runtime/loong64: use ABIInternal convention in cgocallbackg
+ 2026-05-19 f93504bfd6 cmd/internal/obj/loong64: add FRINT{F,D} instructions
+ 2026-05-19 15f44ffcc3 runtime: skip gcBlackenEnabled check and gcmarknewobject in fast path
+ 2026-05-19 2a93576965 internal/buildcfg: enable SizeSpecializedMalloc by default
+ 2026-05-19 b7ad0fe092 all: use SkipObjectResolution mode in parser.ParseFile calls where possible
+ 2026-05-19 8ddf0031cf cmd/compile: disallow nointerface method satisfying type constraint
+ 2026-05-19 063f8b07c1 crypto/tls: fix broken quic_test.go
+ 2026-05-19 24e654197a runtime: add benchmarks for allocating slices of pointers
+ 2026-05-19 1dd2bef375 runtime/secret: fix cgo crashes inside of secret.Do
+ 2026-05-19 c8b14e157f math/big: only use pool for large allocations
+ 2026-05-19 aee6009ba5 cmd/link: check linkname access to assembly symbols
+ 2026-05-19 ad46b4815e crypto/tls: clamp effective minimum version to TLS 1.3 when using QUIC
+ 2026-05-19 edf006c9a3 net/mail: escape arbitrary input when including them in errors
+ 2026-05-19 0db7bea636 cmd/dist: fix JSON processing of trailing bytes
+ 2026-05-19 5563d58a15 go/printer: update comments and simplify test (cleanup)
+ 2026-05-19 c07a0f09b8 doc: document new ppc64/linux features
+ 2026-05-19 4b77d329ea encoding/json/v2: add string option hint optimization
+ 2026-05-19 469636308b encoding/json/jsontext: drop duplicate import
+ 2026-05-19 83b29183af crypto/internal/fips140/rsa: add large exponent OAEP for ACVP
+ 2026-05-19 7f4f2c1c7b crypto/ecdsa: check the hash length in PrivateKey.Sign
+ 2026-05-19 2f9a9642e1 crypto/ecdsa: reject empty hashes
+ 2026-05-19 1debc9f0ce crypto/tls: surface private key parsing error from X509KeyPair
+ 2026-05-19 18f72b3842 crypto/tls: add a test for running with broken certificates
+ 2026-05-19 2f57f7626e crypto/tls: remove the x509keypairleaf GODEBUG setting
+ 2026-05-19 1634ae8c7c crypto/tls: remove the tls10server GODEBUG setting
+ 2026-05-19 0f4862de57 crypto/tls: remove tls3des GODEBUG setting
+ 2026-05-19 14a4bc2051 crypto/tls: remove tlsrsakex GODEBUG setting
+ 2026-05-19 a7bc19bf37 crypto/tls: remove the tlsunsafeekm GODEBUG setting
+ 2026-05-19 5cc4ceb800 crypto/tls: add TestInvalidHandshakeSignature
+ 2026-05-19 78b71d40fd encoding/json/jsontext: skip allocation test when inlining is disabled
+ 2026-05-19 6b0243ccf6 crypto/tls: implement MLKEM1024 key exchange
+ 2026-05-19 97a57b481f crypto/tls: use mlkem.GenerateKey for ML-KEM hybrids
+ 2026-05-19 27532dc35c crypto/tls: deprecate Config.Rand
+ 2026-05-19 542d7d549f crypto/tls: let Config.CurvePreferences override GODEBUG options
+ 2026-05-19 c9a3e8bbd2 encoding/json/jsontext: skip inline-dependent test on noopt builders
+ 2026-05-19 e01f29f918 crypto/internal/fips140/rsa: check hash length in PKCS#1 v1.5 signatures
+ 2026-05-19 47cc60743b runtime,runtime/cgo: port ios/arm64 working dir setup from C to Go
+ 2026-05-19 95e935b1b3 crypto/tls: update generated certificates
+ 2026-05-19 c74ba7d265 crypto/tls: add ML-DSA support
+ 2026-05-19 003833a138 cmd/link: track content-hashed-ness for cloned symbols
+ 2026-05-19 99623c5a17 crypto/rsa: skip TestKeyGenerationVectors on older FIPS 140-3 modules
+ 2026-05-19 f142be8f2f go/printer: do not indent composite literals in return statements
+ 2026-05-19 4e51025e3e crypto/x509: add ML-DSA support
+ 2026-05-19 d80de8f117 cmd/go: sort subcommands in help output
+ 2026-05-19 4bf23b51b8 crypto/x509: honor SSL_CERT_{FILE,DIR} on windows/darwin
+ 2026-05-19 93da30397d math/big: move Int.Divide and corresponding test function up (cleanup)
+ 2026-05-19 5f47eb0cdf math/big: refactored TestIntDivide tests, added more test cases
+ 2026-05-19 05f75fb9e8 internal/runtime/maps,runtime/: pass keys by value to MemHash{32,64} and StrHash.
+ 2026-05-19 e26a373785 runtime/secret: implement goroutine inheriting secret state
+ 2026-05-19 e73e73470e cmd/compile: improve known bits debug print
+ 2026-05-19 7d2eb15103 net/http/fcgi: handle error returned by w.Close() in writePairs
+ 2026-05-19 880ef11ecf cmd/compile: make computeKnownBitsForShift iteration faster
+ 2026-05-19 fabaedcbe8 cmd/compile: fold == != with a const and a bijective operation into the const
+ 2026-05-19 75560e67c9 runtime: introduce a mallocgc fast path
+ 2026-05-19 6716b79b58 lib/time: update to 2026b/2026b
+ 2026-05-19 2378242315 runtime/_mkmalloc: allow for folding const bool exprs
+ 2026-05-19 e9edbced42 encoding/json/v2: remove recursion and error on `string` on unsupported type
+ 2026-05-19 e8c1e370c9 database/sql: add cursor cancelation test, document some cursor issues
+ 2026-05-19 64315a2d18 bytes, strings: use builtin min function in genSplit
+ 2026-05-19 03d1f8efc8 crypto/rsa: bypass Go+BoringCrypto for small, insecure, flaky keys
+ 2026-05-19 c2ecd421b8 crypto/mlkem: add Wycheproof coverage
+ 2026-05-19 7df2a42f94 crypto: move Wycheproof test coverage from x/crypto
+ 2026-05-19 caa4c72fee crypto/x509: accept non-string pkix.Name attributes
+ 2026-05-19 d2095798a1 crypto/internal/cryptotest: add Wycheproof schema/helpers
+ 2026-05-19 3e1c31701c crypto/ecdsa: add c2sp.org/det-keygen test vectors for ECDSA key generation
+ 2026-05-19 0db36238c6 cmd/internal/obj/x86: shorten MOVQ r64, imm32 for positive immediates
+ 2026-05-19 c888fd67f0 crypto/mldsa: don't precompute PublicKey
+ 2026-05-19 7bc111c6eb crypto/mldsa: new package
+ 2026-05-19 4212586726 cmd/compile/internal/ssa: prefer registers x8-x15/f8-f15 on riscv64
+ 2026-05-19 0c3b9f837d cmd/compile: fix corner case boundedness for oversized shifts
+ 2026-05-18 8f7f951965 math/big: add Int.Divide and RoundingMode aliases
+ 2026-05-18 2677fe9bbe go/constant: add StringLen function
+ 2026-05-18 15fd4ff942 runtime: move post allocation work into postMallocgc
+ 2026-05-18 c7a107bfbf encoding/json/internal/jsontest: rename testdata to _embed
+ 2026-05-18 722ee60825 runtime: combine sizespecializedmalloc small stubs into a single stub
+ 2026-05-18 0a151acad8 runtime: remove race and valgrind cases from specializedmalloc stubs
+ 2026-05-18 b23aea0c94 runtime/_mkmalloc: set position in substituteWithBasicLit
+ 2026-05-18 e212a16d1e internal/buildcfg: flip default of GenericMethods
+ 2026-05-18 21e3cdefc3 cmd/go: simplify go.mod to have at most two require sections
+ 2026-05-18 f3f3d0859a cmd/compile/internal/noder: update UIR to V4
+ 2026-05-18 813b317cc9 net/http/httptest: add NewTestServer with in-memory network
+ 2026-05-18 a871fd3732 internal/nettest: add internal fake networking implementation
+ 2026-05-18 2e67b18935 net/http: fix data race in TestServerNoWriteTimeout/h2
+ 2026-05-18 71c7ea1c6c crypto/internal/fips140/aes/gcm: constant-time GHASH
+ 2026-05-18 c1f0b9bdba go/printer: fix false positive doc comment
+ 2026-05-18 44fde0fd08 crypto/internal/fips140/rsa: add large exponent support for ACVP tests
+ 2026-05-18 3cdb042b2e crypto/rsa: add c2sp.org/det-keygen test vectors for RSA key generation
+ 2026-05-18 5cd903156e crypto/rsa: generate primes ≡ 7 mod 8 and update comments
+ 2026-05-18 2361851aa9 crypto: improve panic message when a hash function is unavailable
+ 2026-05-18 6de59e2070 crypto: return an error if a hash function is not available
+ 2026-05-18 3825609217 crypto/tls: remove a couple FIPS 140-3 mode skip from tests
+ 2026-05-18 aca2bff284 crypto/tls: consistently use testenv.SetGODEBUG in tests
+ 2026-05-18 9578a80f15 crypto/tls: remove old test config and certificates
+ 2026-05-18 907b4be52b crypto/tls: port TestClientAuth to the new certificates
+ 2026-05-18 c78a8273c8 crypto/tls: migrate off legacy testConfig
+ 2026-05-18 ca4f272170 crypto/tls: switch tests to new test certificates and keys
+ 2026-05-18 320e0be23d runtime/pprof: possibly deflake TestGoroutineLeakProfileConcurrency stress tests
+ 2026-05-18 6997bcd820 crypto/x509: add RawSignatureAlgorithm
+ 2026-05-18 e62d3e6e89 internal/buildcfg: enable JSONv2 as baseline
+ 2026-05-18 250d0eb6ee math/big: reduce x1,x2 via subtraction
+ 2026-05-17 69a99fdcbb net/netip: inline single-use Addr.string{4,6,4In6} methods
+ 2026-05-16 9df04115d6 log/slog: document context.Background use in non-Context methods
+ 2026-05-16 4e06ed21ac runtime: throw if a timespec64 can't be converted to a timespec32
+ 2026-05-16 0d54be530b cmd/compile: cleanup ARM64 shift lowering
+ 2026-05-15 9e0467b174 cmd/compile: remove flags → bool → flags roundtrips on amd64
+ 2026-05-15 c6eaf03788 database/sql: run tests with different driver variants
+ 2026-05-15 9be7615aa2 cmd/compile: represent escape analysis callees as a slice
+ 2026-05-15 f4bfb1a9c6 cmd/compile: treat singly-assigned func vars as static in escape analysis
+ 2026-05-15 1a7e601d07 net/textproto: escape arbitrary input when including them in errors
+ 2026-05-15 ab7c8279a0 cmd/compile, runtime: use fine-grained FENCE instructions on riscv64
+ 2026-05-15 212065c922 cmd/compile: shuffle bits.Sub intrinsic generation on amd64
+ 2026-05-15 8bd95ae848 src: fix spelling mistakes
+ 2026-05-14 080a6d5fa8 net/http: disable HTTP/3 tests prior to freeze
+ 2026-05-14 80123ef4bf cmd/compile: preserve pointerness during splitload
+ 2026-05-14 c203e4ecb9 image, image/gif: document DecodeConfig before Decode for untrusted input
+ 2026-05-13 7601c4bf42 net/http/internal/http2: reject STREAM_ENDED + Content-Length request
+ 2026-05-13 c22f92a751 net/http: fix hang in TestTransportClosesBodyOnError/h3
+ 2026-05-13 81f747893d net/netip: fix typo in AddrPort.AppendBinary godoc
+ 2026-05-13 168fe84e6c go/ast: fix godoc links
+ 2026-05-13 364de84f36 all: turn on cgo/external linking for linux/ppc64
+ 2026-05-13 922abf576d cmd/go: add constant for requires simplification
+ 2026-05-12 58efaf3859 cmd/asm, cmd/internal/obj: add zvbb/zvbc for riscv64
+ 2026-05-12 aa3c8ed492 net/http: move TestOmitHTTP2 to cmd/dist
+ 2026-05-12 42bdffec2d cmd/go: force external linking when CGO_LDFLAGS contains static-linking flags
+ 2026-05-12 cd913caa3f cmd/go/internal/telemetrystats: add go/platform/target/port:*-* counter
+ 2026-05-12 a5a336cda2 debug/pe, debug/macho: use saferio.ReadData for ZLIB section decompression
+ 2026-05-12 f552547748 cmd/compile: fixed error message about println says print
+ 2026-05-12 5b106947d1 cmd/compile: propagate desired registers through phi nodes
+ 2026-05-12 55089b9e27 runtime: remove specialized classes larger than 128 bytes
+ 2026-05-12 15129eb73b runtime: add microbenchmarks for sizespecializedmalloc
+ 2026-05-11 9936a78b78 runtime: consolidate tiny sizespecializedmalloc functions
+ 2026-05-11 326e7845a2 all: update to x/net@ad8140e0aa
+ 2026-05-11 358cf41413 cmd/internal/obj/arm: use single BIC for AND with negative-rotated immediate
+ 2026-05-11 11a3b27b91 crypto/hkdf: fix example to derive three different 128-bit keys
+ 2026-05-11 2568174249 runtime: fix TestUsingVDSO on linux/ppc64le
+ 2026-05-10 2403e594a5 internal/runtime/maps: rewrite MemHash{32,64} using simd/archsimd intrinsics
+ 2026-05-08 ce4fc9417c cmd/compile: use inline tree index to identify call stack
+ 2026-05-08 55ff407d4f cmd/internal/obj: print error on duplicate symbol definitions
+ 2026-05-08 e49b53439d cmd/compile/internal/obj/arm64: add RPRFM instruction for range prefetch
+ 2026-05-08 74c35fca7a cmd/compile: canonicalize x+x into x<<1 in generic.rules
+ 2026-05-08 f133609b75 runtime: eliminate false positives in ctrlGroupMatchH2 on ARM64
+ 2026-05-08 816c1a79fb go/importer: un-deprecate importer.ForCompiler
+ 2026-05-08 3f3387fab8 cmd/compile: catch missed case in binary-search-for-switch
+ 2026-05-08 c7d87cda53 runtime/cgo: add acquire/release back around malloc
+ 2026-05-08 373b3a9097 test: update newinline.go for closure name change
+ 2026-05-08 afcf04cb64 internal/goexperiment: actually delete goroutineleakprofile experiment
+ 2026-05-07 e30b75a910 cmd/cgo/internal/testsanitizers: bound ASAN C support probe
+ 2026-05-07 834214f787 runtime: fix TestUsingVDSO on Linux ARMv6 (Pi 1)
+ 2026-05-07 ea0da4047c net/http/internal/http2: close client conn on GOAWAY with no reqs in-flight
+ 2026-05-07 8042aaf03c runtime/maps: only grow small full maps when inserting new keys
+ 2026-05-07 409f784bea cmd/compile: simplify closure name
+ 2026-05-07 1456da550a crypto/tls: add QUICConfig.ClientHelloInfoConn
+ 2026-05-07 fee42ee058 src: spelling and grammar fixes
+ 2026-05-07 8908cc14cc cmd/internal: fix error message
+ 2026-05-07 f2b1b38293 cmd/compile: crash if we try to generate a truncated AMD64 const shift
+ 2026-05-07 c3bfc824a5 cmd/compile: do not miscompile x+x << 63 to x << 0 on amd64
+ 2026-05-07 784ea961a4 runtime: avoid concurrent use of synctest timer race context.
+ 2026-05-07 15b9fc2659 net/http: support non-tls.Conn TLS connections
+ 2026-05-07 887f38afa9 net/http: fix FileServer tests that are racy for HTTP/3
+ 2026-05-07 70634e7d67 net/http: adjust several tests to work for HTTP/3
+ 2026-05-07 4d7ac7ff23 all: update to x/net@689f70a42a
+ 2026-05-07 1a9af07120 cmd/go: reject sumdb response lacking module hash
+ 2026-05-07 788b1c54c1 all: avoid unsafe StringToUTF16Ptr on Windows
+ 2026-05-07 f9f6dc7c82 archive/tar: clarify that tarinsecurepath=0 does not apply to linknames
+ 2026-05-07 2747d887eb compress/flate: clarify compatibility promise
+ 2026-05-07 714a94dd31 cmd/compile/internal/noder: put type args inside parenthesis
+ 2026-05-06 16449179ec internal/goexperiment,runtime: drop goroutineleakprofile experiment
+ 2026-05-06 3b7d571c99 html/template: use zero-alloc bytes.EqualFold
+ 2026-05-06 66843181d1 cmd/go: fix potential deadlock
+ 2026-05-06 b32283b27b cmd/go: fix length is not equal cause bytes.Equal never return true
+ 2026-05-06 caeb5b7b66 cmd/api: fix false positive and false negative in isDeprecated
+ 2026-05-06 deee1b75cf cmd/api/testdata: add test case for issue 79145
+ 2026-05-06 f03f2ab67a go/types: prevent panic with multi-tag, multi-file test packages
+ 2026-05-06 f230dd8a1d mime: avoid quadratic complexity in WordDecoder.DecodeHeader
+ 2026-05-06 eb845eca72 go/types, types2: include type arguments in instantiated type cycle errors
+ 2026-05-06 3cf84263ec runtime: prune tombstones before rehash in fast32 pointer-key insert
+ 2026-05-06 edc5480072 cmd/go: correct go/vcs counter names
+ 2026-05-06 978f00ab7f crypto/x509/pkix: render string-typed attribute values as strings
+ 2026-05-06 253aa2a12a internal/buildcfg: enable goroutineleakprofile GOEXPERIMENT by default
+ 2026-05-06 0b87c1d350 regexp: reimplement API using iterators, revise doc comments
+ 2026-05-06 d5ebe8100d cmd/compile: use binsearch-not-table for simd non-constant immediates when retpoline
+ 2026-05-05 07840ceeed index/suffixarray: fix incorrect condition
+ 2026-05-05 628674a0c1 cmd/compile: schedule increments after flags
+ 2026-05-05 d81ba6c35d runtime: exclude main goroutine blocked on select{} from goroutine leak profile
+ 2026-05-05 19f8047c26 all: update to x/net@5e11a5ab89
+ 2026-05-05 0b54a75319 encoding/json/v2: support `format` tag option behind goexperiment
+ 2026-05-05 d5d2bde748 encoding/json/jsontext: document underlying data storage of Token
+ 2026-05-05 f2a43196d1 encoding/json/jsontext: use custom wrapper type for Token accessor errors
+ 2026-05-05 1bd98fab2c crypto/internal/fips140/drbg: fix Wasm stub
+ 2026-05-05 6f19c3b459 cmd/compile: add missing bound checks when handle zero-sized values
+ 2026-05-04 e929fb78e4 index/suffixarray: protect against another data corruption
+ 2026-05-04 2098279730 index/suffixarray: report error rather than panic for corrupted data
+ 2026-05-04 4e4b780652 cmd/compile/internal/noder: hoist up generic methods assertion

Change-Id: Iecbe9b5fbcd86b4094a839b03aa8f7e0c28275de
2026-05-23 09:08:07 -04:00
Alexander Musman
ae7bac0a4b [dev.simd] simd: add ARM64 NEON comparisons, Masked/IfElse, and helper intrinsics
Add hardware-backed comparison intrinsics (Equal, Greater, GreaterEqual)
with derived comparisons (Less, LessEqual, NotEqual), mask types with
Masked/IfElse methods using VBIT and VBIF, and Neg/Abs intrinsics for
all element types.  VBIF complements VBIT so that IfElse with an inverted
mask (e.g. from NotEqual) folds away the VNOT.

ARM64 NEON uses bitwise mask representation (all-0 or all-1 per lane).
Comparisons use CMEQ/CMGT/CMHI/CMGE/CMHS and FCMEQ/FCMGT/FCMGE.
Neg/Abs intrinsics are implemented with VNEG/VFNEG and VABS/VFABS.

Here is a small runnable example:
```
package main

import (
        "fmt"
        "simd/archsimd"
)

func main() {
        a := archsimd.LoadFloat32x4([]float32{1.0, -2.0, 3.0, -4.0})
        b := archsimd.LoadFloat32x4([]float32{10.0, 20.0, 30.0, 40.0})
        neg := a.Less(archsimd.Float32x4{})
        result := a.IfElse(neg, b)
        // Expected output: {0,1,0,1} {1,20,3,40} {0,-2,0,-4}
        fmt.Println(neg.String(), result.String(), a.Masked(neg).String())
}
```
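
The bitwise mask representation can also be modeled in scalar Go, which runs on any platform. This is a sketch, not the archsimd implementation: the types lanes and mask and the functions less, ifElse, and masked are hypothetical. The operand order of ifElse is inferred from the expected output in the example above; commit a2b3c73f75 in this log later reverses the ARM64 IfElse semantics, so treat the order as illustrative only.

```go
package main

import (
	"fmt"
	"math"
)

type lanes [4]float32
type mask [4]uint32 // each lane is all-0 or all-1

// less sets a lane to all-1 where a[i] < b[i].
func less(a, b lanes) (m mask) {
	for i := range a {
		if a[i] < b[i] {
			m[i] = ^uint32(0)
		}
	}
	return m
}

// ifElse takes bits from b where the mask is set and from a elsewhere.
func ifElse(a lanes, m mask, b lanes) (r lanes) {
	for i := range a {
		bits := math.Float32bits(b[i])&m[i] | math.Float32bits(a[i])&^m[i]
		r[i] = math.Float32frombits(bits)
	}
	return r
}

// masked keeps lanes where the mask is set and zeroes the rest.
func masked(a lanes, m mask) (r lanes) {
	for i := range a {
		r[i] = math.Float32frombits(math.Float32bits(a[i]) & m[i])
	}
	return r
}

func main() {
	a := lanes{1, -2, 3, -4}
	b := lanes{10, 20, 30, 40}
	neg := less(a, lanes{})
	fmt.Println(ifElse(a, neg, b), masked(a, neg)) // [1 20 3 40] [0 -2 0 -4]
}
```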

Change-Id: I353c34bbcfc7bff25f0c094b3dd13d5ecfb9af53
Reviewed-on: https://go-review.googlesource.com/c/go/+/776560
Reviewed-by: Junyang Shao <shaojunyang@google.com>
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: golang-scoped@luci-project-accounts.iam.gserviceaccount.com <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2026-05-21 14:43:54 -07:00
Alexander Musman
6a8b351af9 [dev.simd] simd: support ARM64 NEON narrow and long intrinsics
Support the "2" (upper-half) variants of narrowing and widening NEON
instructions purely through SSA folding rules: the lower-half base
instruction is paired with SetHi/GetHi and folded at compile time
(e.g. SHRN+SetHi → SHRN2).

Adds intrinsics for multiply-long, shift-left-long, shift-right-narrow,
and truncate (XTN, which folds from SHRN with shift=0).

Example Montgomery multiply from issue #78498, rewritten using intrinsics:
```
package main

import (
        "fmt"
        "simd/archsimd"
)

func main() {
        a := archsimd.LoadUint16x8Array(&[8]uint16{100, 3000, 42, 0, 1, 3328, 2048, 7})
        b := archsimd.LoadUint16x8Array(&[8]uint16{200, 500, 79, 0, 3328, 1, 128, 13})
        q := archsimd.BroadcastUint16x8(3329)
        qinv := archsimd.BroadcastUint16x8(62209)
        cLo := a.Mul(b)
        wLo := a.MulLoLong(b)
        wHi := a.GetHi().MulLoLong(b.GetHi())
        cHi := wLo.ShiftRightNarrowConst(16).SetHi(wHi.ShiftRightNarrowConst(16))
        t := cLo.Mul(qinv)
        tqLo := t.MulLoLong(q)
        tqHi := t.GetHi().MulLoLong(q.GetHi())
        cor := tqLo.ShiftRightNarrowConst(16).SetHi(tqHi.ShiftRightNarrowConst(16))
        fmt.Println(cHi.Sub(cor)) // {63272,65515,63677,0,65367,65367,4,64270}
}
```
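
To check the expected output without ARM64 hardware, the same arithmetic can be written per lane in scalar Go. This is a sketch that mirrors the steps of the example above one lane at a time; the function name montgomeryLane is hypothetical, and the constants q and qinv are the ones broadcast in the example.

```go
package main

import "fmt"

// montgomeryLane repeats the vector example's arithmetic for one 16-bit lane:
// high half of a*b minus high half of (low(a*b)*qinv)*q, modulo 2^16.
func montgomeryLane(a, b uint16) uint16 {
	const q, qinv = 3329, 62209
	prod := uint32(a) * uint32(b)
	cLo := uint16(prod)       // a.Mul(b): low 16 bits
	cHi := uint16(prod >> 16) // widening multiply, then narrow by 16
	t := cLo * qinv           // wraps modulo 2^16
	cor := uint16(uint32(t) * q >> 16)
	return cHi - cor
}

func main() {
	a := [8]uint16{100, 3000, 42, 0, 1, 3328, 2048, 7}
	b := [8]uint16{200, 500, 79, 0, 3328, 1, 128, 13}
	var r [8]uint16
	for i := range a {
		r[i] = montgomeryLane(a[i], b[i])
	}
	fmt.Println(r)
}
```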

Change-Id: Iffb0a0051aaeb5ef18c4724240485e140b052078
Reviewed-on: https://go-review.googlesource.com/c/go/+/775281
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: golang-scoped@luci-project-accounts.iam.gserviceaccount.com <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2026-05-21 12:53:23 -07:00
Alexander Musman
3ea9804aef [dev.simd] simd: add ARM64 NEON SetHi/GetHi methods
SetHi is emulated as VMOV Vn.D[0], Vd.D[1]. When it is the destination of a
narrow instruction, it is folded into the variant that writes only the upper half.

GetHi is emulated as VMOV Vn.D[1], Dd. When it is a source of a
long instruction, it is folded into the variant that reads only the upper half.

Narrow and long instructions that these methods fold with will be added in follow-up CLs.

Simple example:
```
package main

import (
        "fmt"
        "simd/archsimd"
)

func main() {
        x := archsimd.LoadUint32x4Array(&[4]uint32{1, 2, 0xFF, 0xFF})
        y := archsimd.LoadUint32x4Array(&[4]uint32{10, 20, 0, 0})
        s := x.SetHi(y)
        g := s.GetHi()
        fmt.Printf("%v.SetHi(%v) = %v\n", x, y, s) // {1,2,255,255}.SetHi({10,20,0,0}) = {1,2,10,20}
        fmt.Printf("%v.GetHi() = %v\n", s, g)      // {1,2,10,20}.GetHi() = {10,20,0,0}
}
```

Change-Id: Iaf2a6eca15c2be7800eaf72f066227666c7c0d95
Reviewed-on: https://go-review.googlesource.com/c/go/+/773721
Reviewed-by: Junyang Shao <shaojunyang@google.com>
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: golang-scoped@luci-project-accounts.iam.gserviceaccount.com <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2026-05-21 12:45:10 -07:00
Alexander Musman
39b11f4b14 [dev.simd] simd: add ARM64 NEON shift intrinsics
Add element-wise vector shift operations for ARM64 NEON,
supporting all integer element widths (B/H/S/D) for both signed
and unsigned types.

This adds:
  - Shift (SSHL/USHL): per-element shift by signed amount from a second vector
  - ShiftSaturated (SQSHL/UQSHL): saturating per-element shift
  - ShiftLeftConst/ShiftRightConst (VSHL/VSSHR/VUSHR): shift by compile-time constant
  - ShiftLeftSaturatedConst (VSQSHL/VUQSHL): saturating left shift by constant
  - ShiftAllLeft/ShiftAllRight: shift all lanes by a scalar uint64

Lowering uses new case-based specialLower rules for const-shift
(immediate encoding) and ShiftAll (broadcast + VSSHL/VUSHL
with CSEL clamping for out-of-range amounts).

Test helpers are generated via tmplgen into
arm64_shift_helpers_test.go (20 type-specialized helpers for
ShiftConst, ShiftAll, and mixed-type Shift).

Example demonstrating Shift, ShiftLeftSaturatedConst, and ShiftAllRight:
```
package main

import (
        "fmt"
        "simd/archsimd"
)

func main() {
        a := archsimd.LoadInt16x8([]int16{1, -1, 200, -200, 2049, -2049, 100, -100})
        amt := archsimd.LoadInt16x8([]int16{2, -1, 3, 1, -2, 4, 256, -3})
        fmt.Printf("%s\n%s\n%s\n",
                a.Shift(amt).String(),                 // {4,-1,1600,-400,512,32752,100,-13}
                a.ShiftLeftSaturatedConst(4).String(), // {16,-16,3200,-3200,32767,-32768,1600,-1600}
                a.ShiftAllRight(2).String())           // {0,-1,50,-50,512,-513,25,-25}
}
```

Change-Id: Ife4aac499d8732f613325828c0ac16fdb7bedf0c
Reviewed-on: https://go-review.googlesource.com/c/go/+/767262
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: golang-scoped@luci-project-accounts.iam.gserviceaccount.com <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2026-05-21 12:20:55 -07:00
Alexander Musman
098d688071 [dev.simd] simdgen: add argsMatchRule for broadcast-to-VMOVI folding
Implement parseArgsMatchRule to support custom argument patterns
in SSA lowering rules (currently arm64 only). Use it to fold
Broadcast1To16 of a constant into VMOVI.16B. The other arrangements
(H8,S4,D2) are currently pending assembler VMOVI support.

Change-Id: Id36d7e032a940f8261bda10281235e2b818700a3
Reviewed-on: https://go-review.googlesource.com/c/go/+/767261
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
LUCI-TryBot-Result: golang-scoped@luci-project-accounts.iam.gserviceaccount.com <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2026-05-21 11:54:44 -07:00
Alexander Musman
581c1b1081 [dev.simd] cmd/compile: set simdRegMask for ARM64
Set simdRegMask=fp so the register allocator sees available registers.
On ARM64 floats occupy the lower part of NEON vectors, which in turn
occupy the lower 128 bits of SVE vectors.

Change-Id: I091c59b28b0be8011ac8889c21364eac40218fed
Reviewed-on: https://go-review.googlesource.com/c/go/+/780740
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: golang-scoped@luci-project-accounts.iam.gserviceaccount.com <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2026-05-20 19:41:11 -07:00
Roland Shoemaker
4136ffed69 simd/archsimd: decode non-broadcast memory operands
This allows us to properly generate the ops and merge/load rules for
various SIMD instructions that can use memory operands.

Fixes #78159

Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-simd,gotip-linux-amd64_avx512-simd

Change-Id: Idec450c931c41bb903d4cc5b9b9ee8f610ee8796
Reviewed-on: https://go-review.googlesource.com/c/go/+/779521
LUCI-TryBot-Result: golang-scoped@luci-project-accounts.iam.gserviceaccount.com <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
2026-05-20 08:53:04 -07:00
Jorropo
fabaedcbe8 cmd/compile: fold == != with a const and a bijective operation into the const
This extends a pattern we already match for Add* to
- Sub
- Sub (with swapped arguments)
- Xor
- Com
- Neg
- Mul

This more or less equates to constant folding and is particularly hard to
benchmark objectively for the same reasons.

It is 1 or 3 (for mul) cycles faster in a microbenchmark.

However it may require constants that are harder to materialize.

We currently do not consider these drawbacks in generic.rules.

I didn't originally think the o.Uses == 1 check was required; however,
certain arches like PPC64 are able to merge the CMP into the operation
in limited conditions, which are broken by this CL.

Also, when o has other uses we aren't removing a user of o's argument: we
could extend the liveness of that argument without removing o, increasing
register pressure.

The latency gains should be invisible on branches, though perhaps not when
the result is used by CondSelect or CvtBoolToUint8, but I don't bother with
these unproven cases.

Change-Id: I4fe6b5149576d2549e1157e5cc891af9edb79d55
Reviewed-on: https://go-review.googlesource.com/c/go/+/750181
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Auto-Submit: Jorropo <jorropo.pgm@gmail.com>
LUCI-TryBot-Result: golang-scoped@luci-project-accounts.iam.gserviceaccount.com <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
2026-05-19 08:24:22 -07:00
Jorropo
9e0467b174 cmd/compile: remove flags → bool → flags roundtrips on amd64
Fixes #76056
Fixes #76060

If we modify the issue's fieldReduceOnce2 function to:

  // fieldReduceOnce reduces a value a < 2q.
  func fieldReduceOnce2(a uint32) fieldElement {
    x, b := bits.Sub(uint(a), uint(q), 0)
    return fieldElement(subtle.ConstantTimeSelect(int(b), int(a), int(x)))
  }

We get the wanted assembly*:
  MOVL AX, CX
  MOVL AX, DX
  SUBQ $8380417, CX
  CMOVQCS DX, CX
  MOVQ CX, AX ; not ideal code size but handled by the register renaming unit
  RET

Changes made to fieldReduceOnce2:
- fixed a bug where a and x arguments to subtle.ConstantTimeSelect were swapped.
  we should use a when the sub underflows and x otherwise.
- use bits.Sub, which is intrinsified, rather than bits.Sub32.

*we use CMOVQCS + MOVQ because the CMOV randomly gets generated backward;
I believe this would be fixed if we taught regalloc to commute CMOV
(by swapping the two register args and inverting the condition).
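
A self-contained version of the modified function, as a sketch for trying it out. The value of q is taken from the SUBQ immediate in the assembly above; declaring fieldElement as uint32 is an assumption:
```go
package main

import (
	"crypto/subtle"
	"fmt"
	"math/bits"
)

const q = 8380417 // from the SUBQ immediate above

type fieldElement uint32 // assumed underlying type

// fieldReduceOnce2 reduces a value a < 2q.
func fieldReduceOnce2(a uint32) fieldElement {
	x, b := bits.Sub(uint(a), uint(q), 0)
	// b is 1 when the subtraction underflows (a < q): keep a, otherwise use x.
	return fieldElement(subtle.ConstantTimeSelect(int(b), int(a), int(x)))
}

func main() {
	fmt.Println(fieldReduceOnce2(5), fieldReduceOnce2(q), fieldReduceOnce2(2*q-1))
}
```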

Change-Id: I01eca545d3c5c8a1c1f5a107e0089f715359dfc6
Reviewed-on: https://go-review.googlesource.com/c/go/+/778141
LUCI-TryBot-Result: golang-scoped@luci-project-accounts.iam.gserviceaccount.com <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Jorropo <jorropo.pgm@gmail.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
2026-05-15 16:23:27 -07:00
Jorropo
212065c922 cmd/compile: shuffle bits.Sub intrinsic generation on amd64
Assuming the CPU recognizes SBB RX, RX as a dependency break,
this is a no-op; however, SET is much more canonical and easier
to match.

Updates #76056

Change-Id: Icc590dbcc76a8ed2fca7b167cfb66a2d33d4d2d5
Reviewed-on: https://go-review.googlesource.com/c/go/+/778140
LUCI-TryBot-Result: golang-scoped@luci-project-accounts.iam.gserviceaccount.com <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Auto-Submit: Jorropo <jorropo.pgm@gmail.com>
2026-05-15 08:26:25 -07:00
Michael Matloob
9936a78b78 runtime: consolidate tiny sizespecializedmalloc functions
In the sizespecializedmalloc goexperiment, we specialized the tiny
function per tiny size, so there was a different allocation function per
size from 1-15. This created a lot of functions for a code path that was
not executed that often. From the microbenchmarks, comparing the
consolidated tiny function in this cl with the per-size functions, the
specialized functions could be up to 20% faster, but for 8 byte
allocations, which are almost certainly the most common, the per-size
function was slower.

Look at the change description of CL 766980 for the results of those
microbenchmarks. The CL also contains the code used to run the
benchmark.

Since we've noticed significant icache pressure from all the functions,
the tiny functions aren't used as much as the other ones, and the
benefits seem to be mixed, consolidate the 15 functions into a single
function.

This cuts the size of the mallocgc* functions by about 20%.

For #79286

Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64_c2s16-perf_vs_parent-sizespecializedmalloc,gotip-linux-amd64_c3h88-perf_vs_parent-sizespecializedmalloc,gotip-linux-arm64_c4ah72-perf_vs_parent-sizespecializedmalloc,gotip-linux-arm64_c4as16-perf_vs_parent-sizespecializedmalloc,gotip-linux-arm64_c4as16-perf_vs_parent,gotip-linux-arm64_c4ah72-perf_vs_parent,gotip-linux-amd64_c3h88-perf_vs_parent,gotip-linux-amd64_c2s16-perf_vs_parent
Change-Id: I824f65727a858158c14d2edd6fea1e846a6a6964
Reviewed-on: https://go-review.googlesource.com/c/go/+/772540
LUCI-TryBot-Result: golang-scoped@luci-project-accounts.iam.gserviceaccount.com <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Matloob <matloob@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
2026-05-11 14:06:40 -07:00
Jorropo
74c35fca7a cmd/compile: canonicalize x+x into x<<1 in generic.rules
The canonical way to multiply by 2 is x<<1, this is what other
generic rules expect.

It is slower than x+x, but arch rules can turn x<<1 back into x+x.
This avoids adding many special cases to the rules that optimize shifts
so that they also match x+x as x<<1.

Change-Id: I249c60cd2643db2e2a3503f3934211f80fb2912a
Reviewed-on: https://go-review.googlesource.com/c/go/+/774060
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: golang-scoped@luci-project-accounts.iam.gserviceaccount.com <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Jorropo <jorropo.pgm@gmail.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2026-05-08 16:01:21 -07:00
Junyang Shao
b70242b3b9 [dev.simd] simd, cmd/compile: rename loads and stores
This CL contains API renamings:

Loads:
Adjust these names to be "scalable first"
LoadT => LoadTArray
LoadTSlice => LoadT
LoadTSlicePart(s []E) T => LoadTPart(s []E) T (note: the next CL will further refine it to return the elements loaded)
LoadTMasked - Let's drop this for now. Passing an array defeats the main purpose of suppressing faults. Passing a slice would require extra bounds-checking work. It's not clear how to translate this into Go.

Stores:
T.Store => T.StoreArray (not necessary, but gives symmetry and compile-time bounds checking)
T.StoreSlice => T.Store
T.StoreSlicePart => T.StorePart
T.StoreMasked => T.StoreArrayMasked
We may want a slice version of masked store, but we'll leave it out for now. It requires bounds checking. Mostly this will be served by StorePart.

For #78979.

Change-Id: I16dbc269b4566380c19e769892ea55d849024e53
Reviewed-on: https://go-review.googlesource.com/c/go/+/775600
LUCI-TryBot-Result: golang-scoped@luci-project-accounts.iam.gserviceaccount.com <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
2026-05-08 09:33:15 -07:00
Cherry Mui
409f784bea cmd/compile: simplify closure name
Currently, a closure in a function is usually named after the
outer function, usually in the form of pkg.outer.funcN. When the
containing function is inlined, we attach the inlined caller's
name to the closure name, so this may become things like
callerpkg.caller.pkg.outer.funcN. With multiple levels of
inlining, this name can get pretty long and cluttered.

This CL changes the compiler to use the simple, pre-inlining name
for closures. That is, the closure is always named pkg.outer.funcN
where outer is the containing function in the source code. This
name is not changed during inlining. With inlining, there may be
multiple copies of the closure, all with the same name. They are
likely to be compiled identically, although technically it is
possible for the compiler to optimize them differently based on
the context. So we'll use a content hash to distinguish and
deduplicate them.

With the content-addressable symbol mechanism, the linker is
capable of handling multiple symbols with the same name, and uses
the content hash to distinguish and deduplicate them. A
complication is that the compiler is not able to handle multiple
symbols with the same name when compiling a package. So we give
them temporarily unique suffixes during the compilation (based
on the inline call stack), and trim the suffix in the object file
and DWARF generation. So their linker symbols remain simple.

One caveat is nested closure (i.e. a closure within a closure).
Previously, a nested closure is named as topLevelFunc.funcN.M where
topLevelFunc.funcN is the outer closure. When the outer closure is
inlined, and the inlined caller is not a closure, it is named as
caller.topLevelFunc.funcN.funcM (note the extra "func"). This is
arguably a bug in the current code, as it decides whether to
include the "func" word based on whether the physical containing
function is a closure or not, not the source-level function. This
CL removes the "caller" part from the name, but does not address
the extra "func" word. So when the outer closure is inlined, the
inner closure will be named topLevelFunc.funcN.funcM, which
differs from the original topLevelFunc.funcN.M. This is not too
bad in that the name won't get too long, and still match the
source.

Fixes #60324.

Change-Id: Ia69c35a8f9b1a3b2c27db1a0959c1316be8b1f81
Reviewed-on: https://go-review.googlesource.com/c/go/+/770200
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Commit-Queue: Cherry Mui <cherryyz@google.com>
Reviewed-by: David Chase <drchase@google.com>
TryBot-Bypass: Cherry Mui <cherryyz@google.com>
Reviewed-by: Alessandro Arzilli <alessandro.arzilli@gmail.com>
2026-05-07 14:46:33 -07:00
Josh Bleecher Snyder
628674a0c1 cmd/compile: schedule increments after flags
ScoreInductionInc was introduced in 19f05770b0.
The goal was to keep the i++ in-place in a register.

Placing ScoreInductionInc later than ScoreFlags further improves
the generated code, mainly for code involving carry chains.

For example, the math/big.addVW_ref inner loop was:

    LEAQ    1(CX), R8
    ADDQ    DX, R9
    MOVQ    R9, (AX)(CX*8)
    SBBQ    R9, R9
    NEGQ    R9
    MOVQ    R8, CX

After this commit:

    ADDQ    DX, R9
    MOVQ    R9, (AX)(CX*8)
    SBBQ    R9, R9
    NEGQ    R9
    INCQ    CX

This is almost uniformly an improvement, across GOARCHes.
There are a few functions where this perturbs regalloc and causes
a little bit of movement, but they are rare and appear to be the
usual uninteresting regalloc change noise.

Change-Id: I883a92e4511136f478cf49471ba8b628434393dc
Reviewed-on: https://go-review.googlesource.com/c/go/+/773660
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: golang-scoped@luci-project-accounts.iam.gserviceaccount.com <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Keith Randall <khr@golang.org>
2026-05-05 15:30:14 -07:00
Junyang Shao
fdbd2e871a [dev.simd] all: merge master (f6664a0) into dev.simd
Conflicts:

- src/cmd/compile/internal/ssa/_gen/AMD64ops.go
- src/cmd/compile/internal/ssa/opGen.go

+ 2026-05-04 f6664a0a60 cmd/dist: inline matchexpr into its only caller
+ 2026-05-04 e1dff0e0b9 cmd/dist: use go/build/constraint to parse build constraints
+ 2026-05-04 ba02236208 cmd/dist: add a 'gccgo && gc || gc' build constraint testcase
+ 2026-05-02 2868672202 cmd/compile: resync regmask refactoring change
+ 2026-05-02 1901161d96 cmd/compile: refactor regMask for more registers
+ 2026-05-01 8594bf4621 runtime/,internal/runtime/maps: move hashing function implemented in GOASM to maps package
+ 2026-05-01 464dc3f344 encoding/json/jsontext: report errors for numeric Token accessors
+ 2026-05-01 f1bc06b98d encoding/json: document Unmarshal behavior of JSON arrays into non-empty Go slices
+ 2026-05-01 0af7dbf1e6 sync/atomic: document why not Int16
+ 2026-05-01 891f4a8711 reflect: Value.Methods should panic at nil interface value
+ 2026-05-01 c5b875218f debug/pe: should check return of r.ReadAt at NewFile
+ 2026-05-01 3fdac6780b cmd/compile: add math.{Ceil,Trunc,Floor,RoundToEven} intrinsics on loong64
+ 2026-05-01 4e0783368b all: fix a lot of spelling mistakes
+ 2026-05-01 b1772bacc7 cmd/compile: fix type reshaping for nested instantiations
+ 2026-05-01 deaf3e6789 cmd/compile: use HasPointers in memcombine to match write barrier check
+ 2026-05-01 7bb9bc64d0 encoding/json/jsontext: expand signature of AppendFormat
+ 2026-05-01 60a809d31a encoding/json/internal/jsonwire: remove generic implementations
+ 2026-05-01 cb39d7aa5c net: clarify documentation for Dialer.FallbackDelay
+ 2026-05-01 10b5baca54 crypto/tls: skip unsupported ECH config versions
+ 2026-05-01 be9da6ce60 cmd/compile/internal/ssa: limit call stack use in known bits
+ 2026-05-01 914b632202 cmd/compile/internal/types: don't change outer formatting mode when recursing
+ 2026-05-01 c4a8f71e28 cmd/internal/obj/s390x: typo
+ 2026-05-01 bb1dde2709 cmd/compile: remove deadcode
+ 2026-05-01 d9a6e74180 reflect: fix nil array pointer caused panic
+ 2026-05-01 5756e857c8 encoding/json: fix typos in documentation
+ 2026-05-01 7912a25a4e cmd/asm, cmd/internal/obj/arm64: make addr the last op in SVE stores
+ 2026-05-01 f512621129 cmd/go: document input formats for edit -go / -toolchain
+ 2026-05-01 2aa62e3d90 cmd/go/internal/list: disallow empty string arg
+ 2026-05-01 da6a4cd70a cmd/compile: teach deadstore about moves
+ 2026-05-01 4aa6dad54e cmd/internal/obj/s390x: add VSTRL instruction
+ 2026-05-01 e7679df393 cmd/go/internal/workcmd: fix typo
+ 2026-05-01 b163a5975e crypto/tls: clean up and regenerate client recorded test handshakes
+ 2026-05-01 1023dc1af2 crypto/tls: clean up and regenerate server recorded test handshakes
+ 2026-05-01 6baecf3148 crypto/tls: make tests use SetGlobalRandom
+ 2026-05-01 fbab18c66a crypto/tls: fix error handling in recorded test connections
+ 2026-05-01 bb416f5057 runtime: disable CgoCallbackX15 test on freebsd+race
+ 2026-04-30 87fe5fafba cmd/compile: use GOEXPERIMENT from environment for generic methods
+ 2026-04-30 5732f4b76a net/http: disable some flaky HTTP/3 tests
+ 2026-04-30 23eac3d12b cmd/go/internal/vcs: stop making network connections in test
+ 2026-04-30 70e521bdff cmd/link: avoid a copy in Mach-O CodeSign
+ 2026-04-30 fdd592745d test: use goexperiment.genericmethods for tests
+ 2026-04-30 aa62c18749 internal/goexperiment: put generic methods behind GOEXPERIMENT
+ 2026-04-30 17bd5ab8c6 syscall: copy only read bytes in js/wasm
+ 2026-04-29 0e9a844b0d cmd/go: set a http user agent
+ 2026-04-29 d8cab4c45a cmd/compile, go/types: disable constant string size check
+ 2026-04-29 cbaecb2830 cmd/link: make -f flag actually ignore version mismatch
+ 2026-04-29 8191cd8868 cmd/go: loosen go work sync version requirements
+ 2026-04-29 a221442229 cmd/go: add go1.24 requirement when running go get with tools
+ 2026-04-29 60eb90e6b0 encoding/json/jsontext: add TODO about removing Internal symbol
+ 2026-04-29 f2ec1254ff html/template: fix escaping of URLs in meta content attributes
+ 2026-04-29 76c2c9b32a crypto/sha3: ensure unwrapped *sha3.Digest are usable
+ 2026-04-29 79b47a7566 crypto/mlkem: enrich the DecapsulationKey768|1024 doc comments
+ 2026-04-29 f0f2768dff crypto/fips140: add package docs
+ 2026-04-28 5bb6d165f0 os/signal: add Notify windows documentation
+ 2026-04-28 a63b23ffb2 html/template: fix escaper bypass by treating empty script type as JavaScript
+ 2026-04-28 2c59389fcc net/mail: fix quadratic consumePhrase behavior
+ 2026-04-28 a3f569adee net/http: resolve data race in TestMaxBytesHandler
+ 2026-04-28 f93915339a database/sql: prioritize closingMutex.Lock over RLock when no rlocks
+ 2026-04-28 b8e0cb88c8 cmd/compile: consolidate size limits
+ 2026-04-28 343fbe2971 cmd/compile: eliminate impossible type assertions in generic functions
+ 2026-04-28 58968c79e7 crypto/internal/rand: avoid MaybeReadByte non-determinism with SetGlobalRandom
+ 2026-04-28 da36c0eecd crypto/tls: delete orphaned test transcripts
+ 2026-04-28 3103a23124 crypto/tls: wrap ML-KEM hybrids in fips140.WithoutEnforcement
+ 2026-04-28 8b2f069b14 crypto/internal/cryptotest: add RerunWithFIPS140Enabled/Enforced
+ 2026-04-28 10434cb4f2 crypto/tls: switch FIPS 140-3 tests to new certificates
+ 2026-04-28 37b75cc637 crypto/tls: generate test certificates
+ 2026-04-28 3ac09d0ab6 cmd/go/internal/fips140: verify zip hash before unzipping
+ 2026-04-28 d876fda088 crypto/internal/fips140/mldsa: add accumulated field functions test
+ 2026-04-28 e22e20a1e5 lib/fips140: add certified pointing to v1.0.0-c2097c7c
+ 2026-04-28 d0aedae1e2 lib/fips140: update inprocess to v1.26.0
+ 2026-04-28 7dcde17e8d crypto: typo
+ 2026-04-28 65d5c5f6dd net/http: resolve data race in TestTransportReadToEndReusesConn
+ 2026-04-27 5fb2392a6f go/types, types2: add missing state for assertion
+ 2026-04-27 6795bb3317 net/http/httputil: reencode queries with many parameters in proxy
+ 2026-04-27 1f5c165a81 cmd/compile: support optimizing switch statements with fallthroughs to lookup tables
+ 2026-04-27 d79a0079f5 go/ast: incr i after i == 0 check
+ 2026-04-27 1d23a4caa1 runtime: tweak outdated comment in cgocallback for amd64
+ 2026-04-27 efa1eecc7d net/http: re-enable HTTP/3 tests
+ 2026-04-26 85f838f46c cmd/internal/obj, cmd/compile: refactor encoding arm64 RegisterArrangement
+ 2026-04-26 5e45c1df65 crypto/internal/fips140: handle static assembly symbols correctly in FIPS check
+ 2026-04-26 879b659ae0 cmd/compiler,internal/runtime/atomic: optimize Store{64,32,8} on loong64
+ 2026-04-26 ca10097f29 cmd/asm, cmd/internal/obj: add riscv64 pseudo CSR ops
+ 2026-04-24 02b3e0d4dd encoding/json/jsontext: add float32 support
+ 2026-04-24 1225feb0da sync: document guidance on Cond.Broadcast regarding holding the lock
+ 2026-04-24 52fd498a96 runtime: clear X15 before calling cgocallbackg
+ 2026-04-24 9b3f3ad17a cmd/trace: rewrite unspecified address to localhost in URL
+ 2026-04-24 9c0cb3c3a9 runtime: fix should not compare uint64 with zero
+ 2026-04-24 02d136966c cmd/trace: listen on localhost when address omitted
+ 2026-04-24 82885449f7 cmd/go: using cmdFlags instead run flags
+ 2026-04-24 543703d352 cmd/compile: set the limit of string constants to 1 GiB
+ 2026-04-24 33cf6926ec cmd/trace: fix off-by-one bug
+ 2026-04-24 9a32e8ce07 internal/fuzz: use full int64/uint64 range in mutator
+ 2026-04-24 620cefa291 go/build: check result of strings.Cut
+ 2026-04-24 3b345adf2c reflect: correct panic message
+ 2026-04-24 3c770e3233 cmd/internal/obj/loong64: fix copy-paste error
+ 2026-04-24 d8034799e0 internal/reflectlite: use reflectlite instead of reflect
+ 2026-04-24 03dc8c482f internal/runtime/maps: map pointer-key variants missing pruneTombstones
+ 2026-04-24 a91e9fa1de internal/syscall/windows: avoid uint16 overflow in NewNTUnicodeString
+ 2026-04-24 a804e04b7e runtime: remove obsolete memory profiler comment
+ 2026-04-23 d484fb9ddf cmd/internal/obj/riscv: generate inst.go with make
+ 2026-04-23 4c6ba57ea7 cmd/internal/obj/wasm: use p.To instead of p.From
+ 2026-04-23 d44e4e062b runtime/arm64: use ABIInternal convention in cgocallbackg
+ 2026-04-23 2bb808bfc2 cmd/compile: add Trunc support to known bits
+ 2026-04-23 c72ba16e07 cmd/compile: cleanup shift code in known bits
+ 2026-04-23 e133fb1569 cmd/compile: add Rsh support to known bits
+ 2026-04-23 767140eff2 cmd/compile: add RshU support to known bits
+ 2026-04-23 977041b065 cmd/compile: add Lsh support to known bits
+ 2026-04-23 9c0a8a2b46 cmd/compile: add Neq support to known bits
+ 2026-04-23 8963c303b4 cmd/compile: add Sext support to known bits
+ 2026-04-23 d75902b195 cmd/compile: add CvtBoolToUint8 support to known bits
+ 2026-04-23 1ad012aa6b cmd/compile: add Zext support to known bits
+ 2026-04-23 b9e1876c11 cmd/compile: add first boolean and Eq support to known bits
+ 2026-04-23 7a8dcab743 cmd/compile: add known bits pass
+ 2026-04-23 13cab13f78 cmd/compile: recognize OpVarDef and OpZero in cse isMemDef
+ 2026-04-23 e4e2474e12 cmd/go/internal/load: fix a data race in test.go
+ 2026-04-22 729d18bcc0 runtime/debug: mark doc-links in doc-comments
+ 2026-04-22 be45023407 log/slog: return if error happens
+ 2026-04-22 8fe0c0eb63 runtime: on arm64 use all of the hash seed on 32-bit hashes
+ 2026-04-22 82fd4c4967 crypto/tls: increase readFromUntil buffer size
+ 2026-04-22 91c0f6acd8 crypto/tls: reject 0xFFFF AEAD ID in pickECHConfig
+ 2026-04-22 62caa6db3d crypto/x509: stricter email parsing
+ 2026-04-22 122eb7d035 internal/cpu: add zbc extension detection for riscv64
+ 2026-04-22 9c688e3f4d cmd/compile: don't lift divide instructions out of loops
+ 2026-04-22 81973b4038 cmd/link: use /lib/ld64.so.1 as dynamic linker path for s390x
+ 2026-04-22 043b76a90d cmd/go/internal/doc: walk GOROOT when not in a module
+ 2026-04-21 820c83da80 cmd/compile/internal/noder: add README.md

Change-Id: I91de41f5b0fda54374914a45baaed5b0737724cd
2026-05-04 18:56:12 +00:00
David Chase
acc4bad854 [dev.simd] cmd/compile: add optimizations for all the AVX512 mask operations
These will also serve as a model for other architectures.

Change-Id: If36b3e361c285a6a21db0b3c024f683db9125ac2
Reviewed-on: https://go-review.googlesource.com/c/go/+/769761
Auto-Submit: David Chase <drchase@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: golang-scoped@luci-project-accounts.iam.gserviceaccount.com <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2026-05-01 09:33:59 -07:00
Keith Randall
da6a4cd70a cmd/compile: teach deadstore about moves
Moves that read from read-only memory can't be reading the results
of a previous store. These are often generated by constant struct literals.
Moves whose destination memory is immediately overwritten are not
needed.

Saves a few bytes of generated code (~<0.1%).

Change-Id: I8dab6d1b9c066d6b623eae8b8fe31a51dd3de006
Reviewed-on: https://go-review.googlesource.com/c/go/+/771780
LUCI-TryBot-Result: golang-scoped@luci-project-accounts.iam.gserviceaccount.com <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Jakub Ciolek <jakub@ciolek.dev>
Reviewed-by: David Chase <drchase@google.com>
2026-05-01 09:05:50 -07:00
Jake Bailey
343fbe2971 cmd/compile: eliminate impossible type assertions in generic functions
When a generic function converts a shape-typed value to an interface
and then type-asserts or type-switches on it, some cases can never
match because the asserted concrete type has a different shape than
the source. For example:

    func foo[S string | []byte](x S) {
        switch any(x).(type) {
        case string:  // possible only when S has shape string
        case []byte:  // possible only when S has shape []uint8
        }
    }

Since instantiated generic funcs work on shapes, all instantiations
contain the code for all cases even if they will never be hit.

Detect OCONVIFACE of a shape type followed by a concrete type
assertion, and compare the shapes. If they are incompatible, the
assertion can never succeed for that instantiation.

This applies to both type switch cases (which are skipped entirely)
and comma-ok type assertions (which are replaced with zero, false).

The analysis also tracks through intermediate variables using a
pre-walk pass with ReassignOracle, so patterns like

    iface := any(x)
    v, ok := iface.(string)

are handled as well.
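
A runnable sketch of both patterns. The names kind and asString are hypothetical. Behavior is the same with or without the optimization; only the generated code for each instantiation differs:
```go
package main

import "fmt"

// kind type-switches on a shape-typed value converted to an interface.
func kind[S string | []byte](x S) string {
	switch any(x).(type) {
	case string: // impossible in the []byte instantiation
		return "string"
	case []byte: // impossible in the string instantiation
		return "[]byte"
	}
	return "unknown"
}

// asString uses a comma-ok assertion through an intermediate variable.
func asString[S string | []byte](x S) (string, bool) {
	iface := any(x)
	v, ok := iface.(string)
	return v, ok
}

func main() {
	fmt.Println(kind("a"), kind([]byte("a")))
	fmt.Println(asString("a"))
	fmt.Println(asString([]byte("a")))
}
```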

Updates #57072

Change-Id: I837f6089b9e431f856a528463075fd10abe464dc
Reviewed-on: https://go-review.googlesource.com/c/go/+/767640
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: golang-scoped@luci-project-accounts.iam.gserviceaccount.com <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Keith Randall <khr@golang.org>
2026-04-28 11:19:35 -07:00
qmuntal
1f5c165a81 cmd/compile: support optimizing switch statements with fallthroughs to lookup tables
Switch cases that end in a fallthrough, and the case that follows it,
can't be optimized to a lookup table. Others should still be eligible
for optimization.

Change-Id: Iebdde2ab590f2be89ba08a2dc3326553c5a4083c
Reviewed-on: https://go-review.googlesource.com/c/go/+/764440
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: golang-scoped@luci-project-accounts.iam.gserviceaccount.com <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2026-04-27 12:44:53 -07:00
Jorropo
7a8dcab743 cmd/compile: add known bits pass
This pass performs bitwise constant folding.

Its main goal is to optimize bitfields like those generated by defer.

You might have 3 defers in a function and the middle one is always taken,
previously we couldn't remove the branch for it, this pass is able to do so.

This is hit 93 times, counted by unique line of code, when building std.

My first thought was to implement this as part of the limits code.
However, the way limits can tighten known bits and vice versa
means the code complexity of combining the two is multiplicative.
I have therefore avoided this; someone might change it in the future,
but I don't have a good use case now and this simple pass is sufficient.

I have tried multiple places for the pass,
we need it before any opt (here late opt) since we need the generic rules
to optimize any user of a constant folded value.

We also want one run of known bits after prove since prove removing some
never / always taken branches allows known bits to do a better job.

This yields real optimizations when you have a defer inside an always
taken branch.

I thought prove might do a better job if some branches were removed by
running an early known bits pass first.
However, after trying it, this never helped.

I am sure you can build an example where this becomes true, but at least
in the code I've looked at it didn't help.
Thus I decided against running known bits twice (before and after prove).

Fixes #78633

Change-Id: I90a46875cc11d5d26367f00ac83c29fed433cb6d
Reviewed-on: https://go-review.googlesource.com/c/go/+/765560
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Jorropo <jorropo.pgm@gmail.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
LUCI-TryBot-Result: golang-scoped@luci-project-accounts.iam.gserviceaccount.com <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2026-04-23 08:17:49 -07:00
Alexander Musman
13cab13f78 cmd/compile: recognize OpVarDef and OpZero in cse isMemDef
Recognize OpVarDef (width 0, always skippable) and OpZero (similar to
OpStore, it must be disjoint), so these ops do not prevent CSE of loads.

This change slightly improves code size:

Executable           Base .text linux_arm64     Change
----------------------------------------------------
asm                     2133284     2132900     -0.02%
cgo                     1742996     1742868     -0.01%
compile                10567620    10566852     -0.01%
cover                   1906740     1906100     -0.03%
fix                     3131284     3131012     -0.01%
link                    2667604     2667076     -0.02%
preprofile               877908      877876     -0.00%
vet                     3010372     3010084     -0.01%

Change-Id: I428f73008a817d0e302d438c020504c560ae1653
Reviewed-on: https://go-review.googlesource.com/c/go/+/769000
Reviewed-by: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: golang-scoped@luci-project-accounts.iam.gserviceaccount.com <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
Auto-Submit: Keith Randall <khr@golang.org>
2026-04-23 08:12:54 -07:00
Timo Friedl
f42f2a3bb3 cmd/compile: add boolean absorption laws to SSA rewrite rules
The SSA generic rewrite rules implement DeMorgan's laws but are
missing the closely related boolean absorption laws:

  x & (x | y) == x
  x | (x & y) == x

These are fundamental boolean algebra identities (see
https://en.wikipedia.org/wiki/Absorption_law) that hold for all
bit patterns, all widths, signed and unsigned. Both GCC and LLVM
recognize and optimize these patterns at -O2.
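
The identities are easy to check exhaustively at 8 bits. The snippet below
is only a sanity check of the algebra, not the rewrite rules themselves;
the function names are made up for the example.

```go
package main

import "fmt"

// absorbAnd and absorbOr spell out the left-hand sides of the two laws.
func absorbAnd(x, y uint8) uint8 { return x & (x | y) }
func absorbOr(x, y uint8) uint8  { return x | (x & y) }

// holdsForAllUint8 checks both identities for every pair of 8-bit values.
func holdsForAllUint8() bool {
	for x := 0; x < 256; x++ {
		for y := 0; y < 256; y++ {
			a, b := uint8(x), uint8(y)
			if absorbAnd(a, b) != a || absorbOr(a, b) != a {
				return false
			}
		}
	}
	return true
}

func main() {
	fmt.Println(holdsForAllUint8()) // prints "true"
}
```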

Add two generic rules covering all four widths (8, 16, 32, 64).
Commutativity of AND/OR is handled automatically by the rule
engine, so all argument orderings are matched.

The rules eliminate two redundant ALU instructions per occurrence
and fire on real code (defer bit-manipulation patterns in runtime,
testing, go/parser, and third-party packages).

Fixes #78632

Change-Id: Ib59e839081302ad1635e823309d8aec768c25dcf
GitHub-Last-Rev: 23f8296ece
GitHub-Pull-Request: golang/go#78634
Reviewed-on: https://go-review.googlesource.com/c/go/+/765580
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Jorropo <jorropo.pgm@gmail.com>
2026-04-13 03:42:16 -07:00
qmuntal
c4cb9a90f6 cmd/internal/testdir: fix Test/codegen/switch on loong64
loong64 uses ALSLV to compute the lookup index.

Fixes #78575

Change-Id: Ied90a4f811cc19ffec4d304333546d1fa430ccc0
Reviewed-on: https://go-review.googlesource.com/c/go/+/764180
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
2026-04-09 05:10:28 -07:00
Jorropo
455d4f41fb cmd/compile: run CondSelect into math rules on all arches
Fixes #78558

I've also added tests to make sure PPC still generates ISEL when
the constant isn't 1.
This is to make sure we aren't generating a sequence that wouldn't
work right now.

But it does not mean we couldn't try to optimize other constants
on PPC64 if a fast sequence exists; for example like arm64's
inline register shifts.

Change-Id: Ic241d593149b7a11533948f5d4c52db357cc134f
Reviewed-on: https://go-review.googlesource.com/c/go/+/763340
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Jorropo <jorropo.pgm@gmail.com>
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Jayanth Krishnamurthy <jayanth.krishnamurthy@ibm.com>
Reviewed-by: Paul Murphy <paumurph@redhat.com>
2026-04-09 04:42:54 -07:00
Melnikov Denis
996b985008 cmd/compile: improve stp merging for non-sequent cases
The original algorithm merges each store with the first
mergeable store in the chain, but it misses some
cases. Additionally, building a list of stores that
write to adjacent memory cells allows merging them
in order of increasing address.

I already tried another algorithm in CL 698097,
but it was reverted. This algorithm works differently
and fixes the bug introduced by the variant in that CL.

Fixes #71987, #75365

Here are the results of the sweet benchmarks:
                       │  base.stat  │              opt.stat              │
                       │   sec/op    │   sec/op     vs base               │
ESBuildThreeJS-4          1.088 ± 2%    1.086 ± 1%       ~ (p=1.000 n=10)
ESBuildRomeTS-4          263.0m ± 2%   260.8m ± 1%       ~ (p=0.105 n=10)
EtcdPut-4                73.08m ± 1%   73.16m ± 1%       ~ (p=0.971 n=10)
EtcdSTM-4                414.9m ± 1%   415.4m ± 1%       ~ (p=0.393 n=10)
GoBuildKubelet-4          203.3 ± 0%    203.5 ± 0%       ~ (p=0.393 n=10)
GoBuildKubeletLink-4      19.06 ± 1%    19.05 ± 0%       ~ (p=0.280 n=10)
GoBuildIstioctl-4         156.6 ± 0%    156.6 ± 0%       ~ (p=0.796 n=10)
GoBuildIstioctlLink-4     14.16 ± 1%    14.18 ± 1%       ~ (p=0.853 n=10)
GoBuildFrontend-4         56.45 ± 1%    56.57 ± 0%       ~ (p=0.579 n=10)
GoBuildFrontendLink-4     3.635 ± 1%    3.646 ± 0%       ~ (p=0.436 n=10)
GoBuildTsgo-4             103.0 ± 1%    103.4 ± 1%       ~ (p=0.529 n=10)
GoBuildTsgoLink-4         1.865 ± 1%    1.860 ± 1%       ~ (p=0.684 n=10)
GopherLuaKNucleotide-4    33.55 ± 0%    33.58 ± 0%       ~ (p=0.075 n=10)
MarkdownRenderXHTML-4    281.0m ± 0%   280.3m ± 0%  -0.23% (p=0.019 n=10)
Tile38QueryLoad-4        970.0µ ± 1%   969.3µ ± 0%       ~ (p=0.436 n=10)
geomean                   3.128         3.128       -0.01%

Change-Id: Ia548b43601b1bdb1c1723d300a4b8b907ab0c040
Reviewed-on: https://go-review.googlesource.com/c/go/+/760100
Reviewed-by: Mark Freeman <markfreeman@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Keith Randall <khr@golang.org>
2026-04-08 14:21:41 -07:00
qmuntal
6797caf71a cmd/compile: support all constant return types in switch lookup tables
Lookup tables for switch statements can be generalized to also support
bools, strings, floats, and complex numbers.
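
For instance, a switch returning string constants has the same shape as the
integer case. The sketch below is hand-written Go showing the equivalence;
the names and values are invented for the example, and the table the
compiler builds internally may look different.

```go
package main

import "fmt"

// nameSwitch is a hypothetical switch whose cases only return string
// constants.
func nameSwitch(x int) string {
	switch x {
	case 0:
		return "zero"
	case 1:
		return "one"
	case 2:
		return "two"
	default:
		return "unknown"
	}
}

// names is the hand-written table equivalent of the cases above.
var names = [3]string{"zero", "one", "two"}

// nameTable does one bounds check, then indexes the table.
func nameTable(x int) string {
	if uint(x) < uint(len(names)) {
		return names[x]
	}
	return "unknown"
}

func main() {
	for x := -1; x <= 3; x++ {
		fmt.Println(x, nameSwitch(x), nameTable(x))
	}
}
```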

Change-Id: Ic3ece41fe2009050fbf08ba6f06ea8a567407974
Reviewed-on: https://go-review.googlesource.com/c/go/+/763320
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
2026-04-08 03:47:21 -07:00
Jorropo
a93560b70a cmd/compile: optimize CondSelect to math on arm64 with inline register shifts
Change-Id: I27696b1a5fa0593d9f36743efa3559a36d23ec4b
Reviewed-on: https://go-review.googlesource.com/c/go/+/760844
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@golang.org>
Auto-Submit: Jorropo <jorropo.pgm@gmail.com>
Reviewed-by: Keith Randall <khr@google.com>
2026-04-06 19:34:25 -07:00
Jorropo
666e8c59c9 cmd/compile: improve Mul to Left Shift rules
- fix a bug where it wouldn't recognize 1<<63 as a power of two
- remove the IsSigned check; there is no such thing as a signed Mul.
  If the rule works for signed numbers it works for unsigned ones too.
  Even if the intermediary steps make no sense, it ends up wrapping
  the right way around in the end.

Change-Id: I86182762aec5eff784e2d9bc49ee028825fb9ea0
Reviewed-on: https://go-review.googlesource.com/c/go/+/760843
Reviewed-by: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Jorropo <jorropo.pgm@gmail.com>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
2026-04-06 19:34:21 -07:00
Jorropo
68ee544e87 cmd/compile: extend condselect into math code to handle other constants than 1
On amd64, along with:
  if b { x += 1 } => x += b

We can also implement constants 2, 4 and 8:
  if b { x += 2 } => x += b * 2

This compiles to a displacement LEA.
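
In source terms the rewrite corresponds to something like the following.
This is only a sketch of the equivalence: b2i is a hypothetical helper
standing in for the bool-to-int conversion the compiler does internally,
and whether a given function actually compiles to an LEA is not shown here.

```go
package main

import "fmt"

// b2i is a hypothetical helper: 1 for true, 0 for false.
func b2i(b bool) int {
	if b {
		return 1
	}
	return 0
}

// addBranchy is the conditional form from the message.
func addBranchy(x int, b bool) int {
	if b {
		x += 2
	}
	return x
}

// addMath is the arithmetic form: x += b * 2.
func addMath(x int, b bool) int {
	return x + b2i(b)*2
}

func main() {
	for _, b := range []bool{false, true} {
		fmt.Println(b, addBranchy(40, b), addMath(40, b))
	}
}
```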

Change-Id: Ib00fcc5059acb0ebb346e056c4a656f164cc63df
Reviewed-on: https://go-review.googlesource.com/c/go/+/760841
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Auto-Submit: Jorropo <jorropo.pgm@gmail.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
2026-04-06 10:04:56 -07:00
Jayanth Krishnamurthy jayanth.krishnamurthy@ibm.com
d74de3ce79 cmd/compile: improve uint8/uint16 logical immediates on PPC64
Logical ops on uint8/uint16 (AND/OR/XOR) with constants sometimes
materialized the mask via MOVD (often as a negative immediate), even
when the value fit in the UI-immediate range. This prevented the backend
from selecting andi. / ori / xori forms.

With this CL, UI-immediate truncation is performed only at the use site of
logical-immediate ops, and only when the constant does not fit in the
8- or 16-bit unsigned domain (m != uint8(m) / m != uint16(m)).

This avoids negative-mask materialization and enables correct emission of
UI-form logical instructions. Arithmetic SI-immediate instructions (addi, subfic, etc.) and other
use-patterns are unchanged.
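
A small illustration of why the negative form is a problem, assuming the
UI field of andi./ori/xori is a 16-bit unsigned immediate. The helper names
are made up for the example and are not taken from the rules: the same
8-bit mask is out of range when sign-extended but fits when zero-extended.

```go
package main

import "fmt"

// fitsUI16 reports whether v is representable as a 16-bit unsigned
// immediate (hypothetical helper for this example).
func fitsUI16(v int64) bool { return v >= 0 && v <= 0xFFFF }

// signExtend8 views the 8-bit mask as a signed value.
func signExtend8(m uint8) int64 { return int64(int8(m)) }

// zeroExtend8 views the 8-bit mask as an unsigned value.
func zeroExtend8(m uint8) int64 { return int64(m) }

func main() {
	m := uint8(0xF0)
	fmt.Println(signExtend8(m), fitsUI16(signExtend8(m))) // prints "-16 false"
	fmt.Println(zeroExtend8(m), fitsUI16(zeroExtend8(m))) // prints "240 true"
}
```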

Codegen tests are added to ensure the expected andi./ori/xori
patterns appear and that MOVD is not emitted for valid 8/16-bit masks.

Change-Id: I9fcdf4498c4e984c7587814fb9019a75865c4a0d
Cq-Include-Trybots: luci.golang.try:gotip-linux-ppc64_power10,gotip-linux-ppc64_power8,gotip-linux-ppc64le_power8,gotip-linux-ppc64le_power9,gotip-linux-ppc64le_power10
Reviewed-on: https://go-review.googlesource.com/c/go/+/704015
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Paul Murphy <paumurph@redhat.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Mark Freeman <markfreeman@google.com>
2026-04-06 01:13:27 -07:00
qmuntal
2a902c8a8a cmd/compile: optimize switch statements using lookup tables
Switch statements containing integer constant cases whose bodies just
return a constant should be optimizable to a simpler and faster table
lookup instead of a jump table.

That is, a switch like this:

    switch x {
    case 0: return 10
    case 1: return 20
    case 2: return 30
    case 3: return 40
    default: return -1
    }

Could be optimized to this:

    var table = [4]int{10, 20, 30, 40}
    if uint(x) < 4 { return table[x] }
    return -1

The resulting code is smaller and faster, especially on platforms where
jump tables are not supported.
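
The two forms above can be written out as complete functions and compared
directly. This is just the example from this message made runnable; the
function names are invented.

```go
package main

import "fmt"

// viaSwitch is the switch from the example above.
func viaSwitch(x int) int {
	switch x {
	case 0:
		return 10
	case 1:
		return 20
	case 2:
		return 30
	case 3:
		return 40
	default:
		return -1
	}
}

var table = [4]int{10, 20, 30, 40}

// viaTable is the lookup-table form. Converting x to uint turns negative
// values into large ones, so one comparison covers both bounds.
func viaTable(x int) int {
	if uint(x) < 4 {
		return table[x]
	}
	return -1
}

func main() {
	for x := -1; x <= 4; x++ {
		fmt.Println(x, viaSwitch(x), viaTable(x))
	}
}
```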

goos: windows
goarch: arm64
pkg: cmd/compile/internal/test
                               │  .\old.txt  │              .\new.txt              │
                               │   sec/op    │   sec/op     vs base                │
SwitchLookup8Predictable-12      2.708n ± 6%   2.249n ± 5%  -16.97% (p=0.000 n=10)
SwitchLookup8Unpredictable-12    8.758n ± 7%   3.272n ± 4%  -62.65% (p=0.000 n=10)
SwitchLookup32Predictable-12     2.672n ± 5%   2.373n ± 6%  -11.21% (p=0.000 n=10)
SwitchLookup32Unpredictable-12   9.372n ± 7%   3.385n ± 6%  -63.89% (p=0.000 n=10)
geomean                          4.937n        2.772n       -43.84%

Fixes #78203

Change-Id: I74fa3d77ef618412951b2e5c3cb6ebc760ce4ff1
Reviewed-on: https://go-review.googlesource.com/c/go/+/756340
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2026-04-03 16:46:16 -07:00
Jorropo
0a36b58888 cmd/compile: extend all the cmov into math generic rules with their contrary
If the bool comes from a local operation this is foldable into the comparison.
  if a == b {
  } else {
    x++
  }
becomes:
  x += !(a == b)
becomes:
  x += a != b

If the bool is passed in or loaded rather than being locally computed
this adds an extra XOR ^1 to invert it.

At worst, the math version should cost the same as compute + CMP + CMOV,
which is a tie on modern CPUs that can execute CMOV on all int ALUs
and a win on the cheaper or older ones that can't.
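
Written out as plain Go, the locally computed case looks like this. It is a
sketch of the equivalence only; b2i is a hypothetical stand-in for the
compiler's internal bool-to-int conversion.

```go
package main

import "fmt"

// b2i is a hypothetical helper: 1 for true, 0 for false.
func b2i(b bool) int {
	if b {
		return 1
	}
	return 0
}

// incElse increments x only on the else branch, as in the message.
func incElse(a, b, x int) int {
	if a == b {
	} else {
		x++
	}
	return x
}

// incMath folds the negation into the comparison: x += a != b.
func incMath(a, b, x int) int {
	return x + b2i(a != b)
}

func main() {
	fmt.Println(incElse(1, 1, 5), incMath(1, 1, 5)) // prints "5 5"
	fmt.Println(incElse(1, 2, 5), incMath(1, 2, 5)) // prints "6 6"
}
```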

Change-Id: Idd2566c7a3826ec432ebfbba7b3898aa0db4b812
Reviewed-on: https://go-review.googlesource.com/c/go/+/760922
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Jorropo <jorropo.pgm@gmail.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
2026-04-03 13:21:37 -07:00
Keith Randall
14a6bf0e90 test/codegen: remove unneeded commas
After CL 760780, commas aren't allowed.
But some CLs that were already in flight don't know that.

Change-Id: I31f586c87def4a9746dc2c055923fce8bad6647e
Reviewed-on: https://go-review.googlesource.com/c/go/+/761620
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
2026-03-31 12:43:09 -07:00
Keith Randall
1582ad4105 test/codegen: fix some unbalanced quotes
Change-Id: I081da8c79f0264118e079af21ff58c511ae37e6c
Reviewed-on: https://go-review.googlesource.com/c/go/+/760682
Reviewed-by: Junyang Shao <shaojunyang@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Joel Sing <joel@sing.id.au>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Keith Randall <khr@google.com>
2026-03-31 11:01:20 -07:00
Keith Randall
d5b6d583c1 test/codegen: replace commas with spaces between regexps
Change-Id: Ia7a955833d761e08c1b8081fb29a2e6317de004c
Reviewed-on: https://go-review.googlesource.com/c/go/+/760681
Auto-Submit: Keith Randall <khr@google.com>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
Reviewed-by: Paul Murphy <paumurph@redhat.com>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2026-03-31 11:01:16 -07:00
Keith Randall
d6492e284b test/codegen: get rid of \s
Replace \s with a space in backtick-quoted strings
Replace \\s with a space in double-quoted strings

Change-Id: I0c8b249bb12c2c8ca69e683e4bc6f27544fd6094
Reviewed-on: https://go-review.googlesource.com/c/go/+/760680
Auto-Submit: Keith Randall <khr@google.com>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
Reviewed-by: Paul Murphy <paumurph@redhat.com>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2026-03-31 11:01:13 -07:00
Keith Randall
1673075d4b test/codegen: fix broken syntax
A bunch of tests had undetected syntax errors in their assembly
output regexps: things like mismatched quotes, using ^ instead of -
for negation, etc.

In addition, since CL 716060, using commas as separators between
regexps doesn't work and ends up silently dropping every
regexp after the comma.

Fix all these things, and add a test to make sure that we're not
silently dropping regexps on the floor.

After this CL I will do some cleanup to align with CL 716060, like
replacing commas and \s with spaces (which was the point of that CL,
but wasn't consistently rewritten everywhere).

Change-Id: I54f226120a311ead0c6c62eaf5d152ceed106034
Reviewed-on: https://go-review.googlesource.com/c/go/+/760521
Reviewed-by: Junyang Shao <shaojunyang@google.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Paul Murphy <paumurph@redhat.com>
Auto-Submit: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2026-03-31 11:01:09 -07:00
Jorropo
d9fbe4c90d cmd/compile: convert some condmoves in XOR
Similar to CL 685676 but for XOR.

Change-Id: Ib5ffd4c13348f176a808b3218fdbbafc2c42794f
Reviewed-on: https://go-review.googlesource.com/c/go/+/760921
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
Auto-Submit: Jorropo <jorropo.pgm@gmail.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
2026-03-31 10:58:48 -07:00
Jorropo
de7f006df2 cmd/compile: convert some condmoves in OR
Similar to CL 685676 but for OR.

Change-Id: I0ddfd457ed9e8888462306138a251ac48ad42084
Reviewed-on: https://go-review.googlesource.com/c/go/+/760920
Auto-Submit: Jorropo <jorropo.pgm@gmail.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
2026-03-31 10:58:45 -07:00