Mirror of https://github.com/golang/go.git, synced 2025-12-08 06:10:04 +00:00 (5447 commits).
509ddf3868 |
cmd/compile: ensure bloop only keeps addressable nodes alive
Fixes #76636 Change-Id: I881f88dbf62a901452c1d77e6ffca651451c7790 Reviewed-on: https://go-review.googlesource.com/c/go/+/725420 Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Junyang Shao <shaojunyang@google.com> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com> |
3f94f3d4b2 |
test/codegen: fix shift tests on riscv64
These were broken by CL 721206, which changes Rsh to RshU for positive inputs. Change-Id: I9e38c3c428fb8aeb70cf51e7e76f4711c864f027 Reviewed-on: https://go-review.googlesource.com/c/go/+/723340 Reviewed-by: Meng Zhuo <mengzhuo1203@gmail.com> Reviewed-by: Mark Ryan <markdryan@rivosinc.com> Reviewed-by: Keith Randall <khr@google.com> Auto-Submit: Jorropo <jorropo.pgm@gmail.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Reviewed-by: Jorropo <jorropo.pgm@gmail.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> |
2ac1f9cbc3 |
cmd/compile: avoid unnecessary interface conversion in bloop
Fixes #76482 Change-Id: I076568d8ae92ad6c9e0a5797cfe5bbfb615f63d2 Reviewed-on: https://go-review.googlesource.com/c/go/+/725180 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com> |
4879151d1d |
cmd/compile: introduce alias analysis and automatically free non-aliased memory after growslice
This CL is part of a set of CLs that attempt to reduce how much work the GC must do. See the design in https://go.dev/design/74299-runtime-freegc

This CL updates the compiler to examine append calls to prove whether or not the slice is aliased. If proven unaliased, the compiler automatically inserts a call to a new runtime function introduced with this CL, runtime.growsliceNoAlias, which frees the old backing memory immediately after slice growth is complete and the old storage is logically dead.

Two append benchmarks below show promising results, executing up to ~2x faster and with up to a factor of ~3 reduction in memory with this CL. The approach works with multiple append calls for the same slice, including inside loops, and the final slice memory can be escaping, such as in a classic pattern of returning a slice from a function after the slice is built. (The final slice memory is never freed with this CL, though we have other work that tackles that.)

An example target for this CL is that we automatically free the intermediate memory for the appends in the loop in this function:

func f1(input []int) []int {
    var s []int
    for _, x := range input {
        s = append(s, g(x)) // s cannot be aliased here
        if h(x) {
            s = append(s, x) // s cannot be aliased here
        }
    }
    return s // slice escapes at end
}

In this case, the compiler and the runtime collaborate so that the heap allocated backing memory for s is automatically freed after a successful grow. (For the first grow, there is nothing to free, but for the second and subsequent growths, the old heap memory is freed automatically.) The new runtime.growsliceNoAlias is primarily implemented by calling runtime.freegc, which we introduced in CL 673695.

The high-level approach here is that we step through the IR starting from a slice declaration and look for any operations that either alias the slice or might do so, and treat any IR construct we don't specifically handle as a potential alias (and therefore conservatively fall back to treating the slice as aliased when encountering something not understood).

For loops, some additional care is required. We arrange the analysis so that an alias in the body of a loop causes all the appends in that same loop body to be marked aliased, even if the aliasing occurs after the append in the IR:

func f2() {
    var s []int
    for i := range 10 {
        s = append(s, i) // aliased due to next line
        alias = s
    }
}

For nested loops, we analyse the nesting appropriately so that, for example, this append is still proven as non-aliased in the inner loop even though it is aliased for the outer loop:

func f3() {
    for range 10 {
        var s []int
        for i := range 10 {
            s = append(s, i) // append using non-aliased slice
        }
        alias = s
    }
}

A good starting point is the beginning of the test/escape_alias.go file, which starts with ~10 introductory examples with brief comments that attempt to illustrate the high-level approach. For more details, see the new .../internal/escape/alias.go file, especially the (*aliasAnalysis).analyze method.

In the first benchmark, an append in a loop builds up a slice from nothing, where the slice elements are each 64 bytes. In the table below, 'count' is the number of appends. With 1 append, there is no opportunity for this CL to free memory. Once there are 2 appends, the growth from 1 element to 2 elements means the compiler-inserted growsliceNoAlias frees the 1-element array, and we see a ~33% reduction in memory use and a small reported speed improvement.
As the number of appends increases for example to 5, we are at a ~20% speed improvement and ~45% memory reduction, and so on until we reach ~40% faster and ~50% less memory allocated at the end of the table. There can be variation in the reported numbers based on -randlayout, so this table is for 30 different values of -randlayout with a total n=150. (Even so, there is still some variation, so we probably should not read too much into small changes.) This is with GOAMD64=v3 on a VM that gcc reports is cascadelake. goos: linux goarch: amd64 pkg: runtime cpu: Intel(R) Xeon(R) CPU @ 2.80GHz │ old-1bb1f2bf0c │ freegc-8ba7421-ps16 │ │ sec/op │ sec/op vs base │ Append64Bytes/count=1-4 31.09n ± 2% 31.69n ± 1% +1.95% (n=150) Append64Bytes/count=2-4 73.31n ± 1% 70.27n ± 0% -4.15% (n=150) Append64Bytes/count=3-4 142.7n ± 1% 124.6n ± 1% -12.68% (n=150) Append64Bytes/count=4-4 149.6n ± 1% 127.7n ± 0% -14.64% (n=150) Append64Bytes/count=5-4 277.1n ± 1% 213.6n ± 0% -22.90% (n=150) Append64Bytes/count=6-4 280.7n ± 1% 216.5n ± 1% -22.87% (n=150) Append64Bytes/count=10-4 544.3n ± 1% 386.6n ± 0% -28.97% (n=150) Append64Bytes/count=20-4 1058.5n ± 1% 715.6n ± 1% -32.39% (n=150) Append64Bytes/count=50-4 2.121µ ± 1% 1.404µ ± 1% -33.83% (n=150) Append64Bytes/count=100-4 4.152µ ± 1% 2.736µ ± 1% -34.11% (n=150) Append64Bytes/count=200-4 7.753µ ± 1% 4.882µ ± 1% -37.03% (n=150) Append64Bytes/count=400-4 15.163µ ± 2% 9.273µ ± 1% -38.84% (n=150) geomean 601.8n 455.0n -24.39% │ old-1bb1f2bf0c │ freegc-8ba7421-ps16 │ │ B/op │ B/op vs base │ Append64Bytes/count=1-4 64.00 ± 0% 64.00 ± 0% ~ (n=150) Append64Bytes/count=2-4 192.0 ± 0% 128.0 ± 0% -33.33% (n=150) Append64Bytes/count=3-4 448.0 ± 0% 256.0 ± 0% -42.86% (n=150) Append64Bytes/count=4-4 448.0 ± 0% 256.0 ± 0% -42.86% (n=150) Append64Bytes/count=5-4 960.0 ± 0% 512.0 ± 0% -46.67% (n=150) Append64Bytes/count=6-4 960.0 ± 0% 512.0 ± 0% -46.67% (n=150) Append64Bytes/count=10-4 1.938Ki ± 0% 1.000Ki ± 0% -48.39% (n=150) Append64Bytes/count=20-4 3.938Ki ± 0% 2.001Ki ± 0% -49.18% (n=150) Append64Bytes/count=50-4 7.938Ki ± 0% 4.005Ki ± 0% -49.54% (n=150) Append64Bytes/count=100-4 15.938Ki ± 0% 8.021Ki ± 0% -49.67% (n=150) Append64Bytes/count=200-4 31.94Ki ± 0% 16.08Ki ± 0% -49.64% (n=150) Append64Bytes/count=400-4 63.94Ki ± 0% 32.33Ki ± 0% -49.44% (n=150) geomean 1.991Ki 1.124Ki -43.54% │ old-1bb1f2bf0c │ freegc-8ba7421-ps16 │ │ allocs/op │ allocs/op vs base │ Append64Bytes/count=1-4 1.000 ± 0% 1.000 ± 0% ~ (n=150) Append64Bytes/count=2-4 2.000 ± 0% 1.000 ± 0% -50.00% (n=150) Append64Bytes/count=3-4 3.000 ± 0% 1.000 ± 0% -66.67% (n=150) Append64Bytes/count=4-4 3.000 ± 0% 1.000 ± 0% -66.67% (n=150) Append64Bytes/count=5-4 4.000 ± 0% 1.000 ± 0% -75.00% (n=150) Append64Bytes/count=6-4 4.000 ± 0% 1.000 ± 0% -75.00% (n=150) Append64Bytes/count=10-4 5.000 ± 0% 1.000 ± 0% -80.00% (n=150) Append64Bytes/count=20-4 6.000 ± 0% 1.000 ± 0% -83.33% (n=150) Append64Bytes/count=50-4 7.000 ± 0% 1.000 ± 0% -85.71% (n=150) Append64Bytes/count=100-4 8.000 ± 0% 1.000 ± 0% -87.50% (n=150) Append64Bytes/count=200-4 9.000 ± 0% 1.000 ± 0% -88.89% (n=150) Append64Bytes/count=400-4 10.000 ± 0% 1.000 ± 0% -90.00% (n=150) geomean 4.331 1.000 -76.91% The second benchmark is similar, but instead uses an 8-byte integer for the slice element. The first 4 appends in the loop never call into the runtime thanks to the excellent CL 664299 introduced by Keith in Go 1.25 that allows some <= 32 byte dynamically-sized slices to be on the stack, so this CL is neutral for <= 32 bytes. 
Once the 5th append occurs at count=5, a grow happens via the runtime and heap allocates as normal, but freegc does not yet have anything to free, so we see a small ~1.4ns penalty reported there. But once the second growth happens, the older heap memory is now automatically freed by freegc, so we start to see some benefit in memory reductions and speed improvements, starting at a tiny speed improvement (close to a wash, or maybe noise) by the second growth before count=10, and building up to ~2x faster with ~68% fewer allocated bytes reported. goos: linux goarch: amd64 pkg: runtime cpu: Intel(R) Xeon(R) CPU @ 2.80GHz │ old-1bb1f2bf0c │ freegc-8ba7421-ps16 │ │ sec/op │ sec/op vs base │ AppendInt/count=1-4 2.978n ± 0% 2.969n ± 0% -0.30% (p=0.000 n=150) AppendInt/count=4-4 4.292n ± 3% 4.163n ± 3% ~ (p=0.528 n=150) AppendInt/count=5-4 33.50n ± 0% 34.93n ± 0% +4.25% (p=0.000 n=150) AppendInt/count=10-4 76.21n ± 1% 75.67n ± 0% -0.72% (p=0.000 n=150) AppendInt/count=20-4 150.6n ± 1% 133.0n ± 0% -11.65% (n=150) AppendInt/count=50-4 284.1n ± 1% 225.6n ± 0% -20.59% (n=150) AppendInt/count=100-4 544.2n ± 1% 392.4n ± 1% -27.89% (n=150) AppendInt/count=200-4 1051.5n ± 1% 702.3n ± 0% -33.21% (n=150) AppendInt/count=400-4 2.041µ ± 1% 1.312µ ± 1% -35.70% (n=150) AppendInt/count=1000-4 5.224µ ± 2% 2.851µ ± 1% -45.43% (n=150) AppendInt/count=2000-4 11.770µ ± 1% 6.010µ ± 1% -48.94% (n=150) AppendInt/count=3000-4 17.747µ ± 2% 8.264µ ± 1% -53.44% (n=150) geomean 331.8n 246.4n -25.72% │ old-1bb1f2bf0c │ freegc-8ba7421-ps16 │ │ B/op │ B/op vs base │ AppendInt/count=1-4 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=150) AppendInt/count=4-4 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=150) AppendInt/count=5-4 64.00 ± 0% 64.00 ± 0% ~ (p=1.000 n=150) AppendInt/count=10-4 192.0 ± 0% 128.0 ± 0% -33.33% (n=150) AppendInt/count=20-4 448.0 ± 0% 256.0 ± 0% -42.86% (n=150) AppendInt/count=50-4 960.0 ± 0% 512.0 ± 0% -46.67% (n=150) AppendInt/count=100-4 1.938Ki ± 0% 1.000Ki ± 0% -48.39% (n=150) AppendInt/count=200-4 3.938Ki ± 0% 2.001Ki ± 0% -49.18% (n=150) AppendInt/count=400-4 7.938Ki ± 0% 4.005Ki ± 0% -49.54% (n=150) AppendInt/count=1000-4 24.56Ki ± 0% 10.05Ki ± 0% -59.07% (n=150) AppendInt/count=2000-4 58.56Ki ± 0% 20.31Ki ± 0% -65.32% (n=150) AppendInt/count=3000-4 85.19Ki ± 0% 27.30Ki ± 0% -67.95% (n=150) geomean ² -42.81% │ old-1bb1f2bf0c │ freegc-8ba7421-ps16 │ │ allocs/op │ allocs/op vs base │ AppendInt/count=1-4 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=150) AppendInt/count=4-4 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=150) AppendInt/count=5-4 1.000 ± 0% 1.000 ± 0% ~ (p=1.000 n=150) AppendInt/count=10-4 2.000 ± 0% 1.000 ± 0% -50.00% (n=150) AppendInt/count=20-4 3.000 ± 0% 1.000 ± 0% -66.67% (n=150) AppendInt/count=50-4 4.000 ± 0% 1.000 ± 0% -75.00% (n=150) AppendInt/count=100-4 5.000 ± 0% 1.000 ± 0% -80.00% (n=150) AppendInt/count=200-4 6.000 ± 0% 1.000 ± 0% -83.33% (n=150) AppendInt/count=400-4 7.000 ± 0% 1.000 ± 0% -85.71% (n=150) AppendInt/count=1000-4 9.000 ± 0% 1.000 ± 0% -88.89% (n=150) AppendInt/count=2000-4 11.000 ± 0% 1.000 ± 0% -90.91% (n=150) AppendInt/count=3000-4 12.000 ± 0% 1.000 ± 0% -91.67% (n=150) geomean ² -72.76% ² Of course, these are just microbenchmarks, but likely indicate there are some opportunities here. 
The immediately following CL 712422 tackles inlining and is able to get runtime.freegc working automatically with iterators such as used by slices.Collect, which becomes able to automatically free the intermediate memory from its repeated appends (which earlier in this work required a temporary hand edit to the slices package). For now, we only use the NoAlias version for element types without pointers while waiting on additional runtime support in CL 698515. Updates #74299 Change-Id: I1b9d286aa97c170dcc2e203ec0f8ca72d84e8221 Reviewed-on: https://go-review.googlesource.com/c/go/+/710015 Reviewed-by: Keith Randall <khr@google.com> Auto-Submit: Keith Randall <khr@golang.org> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@golang.org> |
3531ac23d4 |
go/types, types2: replace setDefType with pending type check
Given a type definition of the form:
type T RHS
The setDefType function would set T.fromRHS as soon as we knew its
top-level type. For instance, in:
type S struct { ... }
S.fromRHS is set to a struct type before type-checking anything inside
the struct.
This permitted access to the (incomplete) RHS type in a cyclic type
declaration. Accessing this information is fraught (as it's incomplete),
but was used for reporting certain types of cycles.
This CL replaces setDefType with a check that ensures no value of type
T is used before its RHS is set up.
This CL is strictly more complete than what setDefType achieved. For
instance, it enables correct reporting for the below cycles:
type A [unsafe.Sizeof(A{})]int
var v any = 42
type B [v.(B)]int
func f() C {
return C{}
}
type C [unsafe.Sizeof(f())]int
Fixes #76383
Fixes #76384
Change-Id: I9dfab5b708013b418fa66e43362bb4d8483fedec
Reviewed-on: https://go-review.googlesource.com/c/go/+/724140
Auto-Submit: Mark Freeman <markfreeman@google.com>
Reviewed-by: Robert Griesemer <gri@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
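As a minimal harness for seeing the improved cycle reporting (my sketch, not part of the CL; the package and file names are made up), one can type-check the second cycle above with go/types and print whatever errors come back:

package main

import (
    "fmt"
    "go/ast"
    "go/parser"
    "go/token"
    "go/types"
)

// Type-check a tiny package containing one of the cyclic declarations quoted
// above and report the errors. No Importer is needed since nothing is imported.
const src = `package p

var v any = 42

type B [v.(B)]int
`

func main() {
    fset := token.NewFileSet()
    f, err := parser.ParseFile(fset, "p.go", src, 0)
    if err != nil {
        panic(err)
    }
    conf := types.Config{
        Error: func(err error) { fmt.Println(err) }, // collect every type-checking error
    }
    _, _ = conf.Check("p", fset, []*ast.File{f}, nil)
}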
3fd9cb1895 |
cmd/compile: fix bloop get name logic
This CL changes the getNameFrom implementation to pattern-match addressable patterns. Change-Id: If1faa22a3a012d501e911d8468a5702b348abf16 Reviewed-on: https://go-review.googlesource.com/c/go/+/724180 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: David Chase <drchase@google.com> |
3c6bf6fbf3 |
cmd/compile: handle loops better during stack allocation of slices
Don't use the move2heap optimization if the move2heap is inside a loop deeper than the declaration of the slice. We really only want to do the move2heap operation once. Change-Id: I4a68d01609c2c9d4e0abe4580839e70059393a81 Reviewed-on: https://go-review.googlesource.com/c/go/+/722440 Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Junyang Shao <shaojunyang@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> |
dda7c8253d |
cmd/compile,internal/bytealg: add MemEq intrinsic for runtime.memequal
Introduce a new MemEq SSA operation for runtime.memequal. The operation is initially implemented for arm64. The change adds opt rules (following the existing rules for calls to runtime.memequal) that work with MemEq, plus a later op, LoweredMemEq, which may be lowered differently for more constant-size cases in the future (for other targets as well as for arm64). The new MemEq SSA operation does not have a memory result, allowing CSE of load operations around it.

Code size difference (for arm64 linux):

Executable    Old .text   New .text   Change
--------------------------------------------
asm             1970420     1969668   -0.04%
cgo             1741220     1740212   -0.06%
compile         8956756     8959428   +0.03%
cover           1879332     1878772   -0.03%
link            2574116     2572660   -0.06%
preprofile       867124      866820   -0.04%
vet             2890404     2888596   -0.06%

Change-Id: I6ab507929b861884d17d5818cfbd152cf7879751 Reviewed-on: https://go-review.googlesource.com/c/go/+/686655 Reviewed-by: Keith Randall <khr@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Cherry Mui <cherryyz@google.com> Auto-Submit: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@golang.org> |
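As a rough illustration (my example, not from the CL), this is the sort of comparison that is typically compiled as a call to runtime.memequal and can therefore use the new MemEq op on arm64:

// Comparing arrays beyond a few machine words is typically lowered to a
// runtime.memequal call; with this CL that call is modeled by the MemEq op,
// which has no memory result, so loads around it can be CSE'd.
func equalKeys(a, b [128]byte) bool {
    return a == b
}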
62cd044a79 |
cmd/compile: add cases for StringLen to prove
Tricky index-offset logic had been added for slices, but not for strings. This fixes that, and also adds tests for the same behavior in string/slice cases, and adds a new test for code in prove that had been added but not explicitly tested. Fixes #76270. Change-Id: Ibd92b89e944d86b7f30b4486a9008e6f1ac6af7d Reviewed-on: https://go-review.googlesource.com/c/go/+/723980 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Keith Randall <khr@golang.org> |
220d73cc44 |
[dev.simd] all: merge master (8dd5b13) into dev.simd
Merge List: + 2025-11-24 |
feae743bdb |
cmd/compile: use 32x32->64 multiplies on loong64
Gets rid of some sign extensions, like arm64. Change-Id: I9fc37e15a82718bfcf53db8cab0c4e7baaa0a747 Reviewed-on: https://go-review.googlesource.com/c/go/+/721522 Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Meidan Li <limeidan@loongson.cn> Reviewed-by: Mark Freeman <markfreeman@google.com> Reviewed-by: abner chenc <chenguoqi@loongson.cn> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> |
22f24f90b5 |
cmd/compile: change testing.B.Loop keep alive semantic
This CL implements the initial design of testing.B.Loop's keep-variables-alive semantics: https://github.com/golang/go/issues/61515#issuecomment-2407963248. Fixes #73137. Change-Id: I8060470dbcb0dda0819334f3615cc391ff0f6501 Reviewed-on: https://go-review.googlesource.com/c/go/+/716660 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: David Chase <drchase@google.com> |
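A hedged sketch of the usage pattern this targets (makeInput and parse are placeholders, not from the CL): under the linked design, arguments and results of calls in the b.Loop body are kept alive, so the benchmarked call is not optimized away even though its result is discarded.

package demo // in a _test.go file

import "testing"

// makeInput and parse are stand-ins for whatever is being measured.
func makeInput() []byte  { return []byte("hello, world") }
func parse(p []byte) int { return len(p) }

func BenchmarkParse(b *testing.B) {
    input := makeInput() // setup before the first b.Loop call is excluded from timing
    for b.Loop() {
        parse(input) // arguments and result are kept alive, so the call is not optimized away
    }
}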
e7d47ac33d |
cmd/compile: simplify negative on multiplication
goos: linux
goarch: amd64
pkg: cmd/compile/internal/test
cpu: AMD EPYC 7532 32-Core Processor
│ simplify_base │ simplify_new │
│ sec/op │ sec/op vs base │
SimplifyNegMul 623.0n ± 0% 319.3n ± 1% -48.75% (p=0.000 n=10)
goos: linux
goarch: riscv64
pkg: cmd/compile/internal/test
cpu: Spacemit(R) X60
│ simplify.base │ simplify.new │
│ sec/op │ sec/op vs base │
SimplifyNegMul 10.928µ ± 0% 6.432µ ± 0% -41.14% (p=0.000 n=10)
Change-Id: I1d9393cd19a0b948a5d3a512d627cdc0cf0b38be
Reviewed-on: https://go-review.googlesource.com/c/go/+/721520
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Mark Freeman <markfreeman@google.com>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
ec92bc6d63 |
cmd/compile: rewrite Rsh to RshU if arguments are proved positive
Fixes #76332 Change-Id: I9044025d5dc599531c7f88ed2870bcf3d8b0acbd Reviewed-on: https://go-review.googlesource.com/c/go/+/721206 Reviewed-by: Mark Freeman <markfreeman@google.com> Reviewed-by: Keith Randall <khr@google.com> Auto-Submit: Jorropo <jorropo.pgm@gmail.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@golang.org> |
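A hedged sketch (my own, not the CL's test) of the kind of code this helps: once prove knows the operand is non-negative, the arithmetic right shift can be rewritten to an unsigned one, which later unsigned-only rules understand.

func bucket(x int64) int64 {
    if x < 0 {
        return 0
    }
    // Here prove knows x >= 0, so the signed shift can be rewritten to an
    // unsigned (logical) shift by the rule introduced in this CL.
    return x >> 6
}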
3820f94c1d |
cmd/compile: propagate unsigned relations for Rsh if arguments are positive
Updates #76332 Change-Id: Ifaa4d12897138d88d56b9d4e530c53dcee70bd58 Reviewed-on: https://go-review.googlesource.com/c/go/+/721205 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Jorropo <jorropo.pgm@gmail.com> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Mark Freeman <markfreeman@google.com> Reviewed-by: Keith Randall <khr@google.com> |
f87aaec53d |
cmd/compile: fix integer overflow in prove pass
The detectSliceLenRelation function incorrectly deduced lower bounds for "len(s) - i" without checking if the subtraction could overflow (e.g. when i is negative). This led to incorrect elimination of bounds checks. Fixes: #76355 Change-Id: I30ada0e5f1425929ddd8ae1b66e55096ec209b5b Reviewed-on: https://go-review.googlesource.com/c/go/+/721920 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Keith Randall <khr@google.com> |
e3d4645693 |
[dev.simd] all: merge master (ca37d24) into dev.simd
Conflicts: - src/cmd/compile/internal/typecheck/builtin.go Merge List: + 2025-11-20 |
32f5aadd2f |
cmd/compile: stack allocate backing stores during append
We can already stack allocate the backing store during append if the
resulting backing store doesn't escape. See CL 664299.
This CL enables us to often stack allocate the backing store during
append *even if* the result escapes. Typically, for code like:
func f(n int) []int {
var r []int
for i := range n {
r = append(r, i)
}
return r
}
the backing store for r escapes, but only by returning it.
Could we operate with r on the stack for most of its lifetime,
and only move it to the heap at the return point?
The current implementation of append will need to do an allocation
each time it calls growslice. This will happen on the 1st, 2nd, 4th,
8th, etc. append calls. The allocations done by all but the
last growslice call will then immediately be garbage.
We'd like to avoid doing some of those intermediate allocations
if possible. We rewrite the above code by introducing a move2heap
operation:
func f(n int) []int {
var r []int
for i := range n {
r = append(r, i)
}
r = move2heap(r)
return r
}
Using the move2heap runtime function, which does:
move2heap(r):
If r is already backed by heap storage, return r.
Otherwise, copy r to the heap and return the copy.
Now we can treat the backing store of r allocated at the
append site as not escaping. Previous stack allocation
optimizations now apply, which can use a fixed-size
stack-allocated backing store for r when appending.
See the description in cmd/compile/internal/slice/slice.go
for how we ensure that this optimization is safe.
Change-Id: I81f36e58bade2241d07f67967d8d547fff5302b8
Reviewed-on: https://go-review.googlesource.com/c/go/+/707755
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2239520d1c |
test: go fmt prove.go tests
Change-Id: Ia4c2ceffcf2bfde862e9dba02a4b38245f868692 Reviewed-on: https://go-review.googlesource.com/c/go/+/721202 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Sean Liao <sean@liao.dev> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Mark Freeman <markfreeman@google.com> Auto-Submit: Jorropo <jorropo.pgm@gmail.com> |
ba634ca5c7 |
cmd/compile: fold boolean NOT into branches
Gets rid of an EOR $1 instruction. Change-Id: Ib032b0cee9ac484329c978af9b1305446f8d5dac Reviewed-on: https://go-review.googlesource.com/c/go/+/721501 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Junyang Shao <shaojunyang@google.com> Reviewed-by: Keith Randall <khr@google.com> |
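A small illustrative function (mine, not the CL's test): the negation below no longer needs a separate instruction to flip the boolean; the branch sense is inverted instead.

func pick(ok bool) int {
    if !ok { // the NOT is folded into the branch condition; no EOR $1 is emitted
        return 1
    }
    return 2
}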
e1a12c781f |
cmd/compile: use 32x32->64 multiplies on arm64
Gets rid of some sign extensions. Change-Id: Ie67ef36b4ca1cd1a2cd9fa5d84578db553578a22 Reviewed-on: https://go-review.googlesource.com/c/go/+/721241 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Junyang Shao <shaojunyang@google.com> Reviewed-by: Keith Randall <khr@google.com> |
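For example (illustrative, not the CL's test case), a product of two sign-extended 32-bit values can now use a single widening multiply:

func dot(a, b int32) int64 {
    // Lowered to a 32x32->64 multiply (e.g. SMULL on arm64) instead of two
    // sign extensions followed by a full 64x64 multiply.
    return int64(a) * int64(b)
}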
934dbcea1a |
[dev.simd] simd: update CPU feature APIs
This CL also updates the internal uses of these APIs. This CL also fixes an unstable output issue left by previous CLs. Change-Id: Ibc38361d35e2af0c4943a48578f3c610b74ed14d Reviewed-on: https://go-review.googlesource.com/c/go/+/720020 Reviewed-by: Cherry Mui <cherryyz@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> |
c12c337099 |
cmd/compile: teach prove about subtract idioms
For v = x-y:
if y >= 0 then v <= x
if y <= x then v >= 0
(With appropriate guards against overflow/underflow.)
Fixes #76304
Change-Id: I8f8f1254156c347fa97802bd057a8379676720ae
Reviewed-on: https://go-review.googlesource.com/c/go/+/720740
Reviewed-by: Mark Freeman <markfreeman@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Jorropo <jorropo.pgm@gmail.com>
Reviewed-by: Keith Randall <khr@google.com>
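A hedged sketch of the idiom (my example): with the new facts, v = len(s)-n is known to lie in [0, len(s)] once n is bounded, so the slicing below needs no bounds check.

func trimTail(s []byte, n int) []byte {
    if n < 0 || n > len(s) {
        return s
    }
    // n >= 0 gives len(s)-n <= len(s); n <= len(s) gives len(s)-n >= 0,
    // so the slice expression below is provably in bounds.
    return s[:len(s)-n]
}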
03ed43988f |
cmd/compile: allow multi-field structs to be stored directly in interfaces
If the struct is a bunch of 0-sized fields and one pointer field. Merged revert-of-revert for 4 CLs. original revert 681937 695016 693415 694996 693615 695015 694195 694995 Fixes #74092 Update #74888 Update #74908 Update #74935 (updated issues are bugs in the last attempt at this) Change-Id: I32246d49b8bac3bb080972dc06ab432a5480d560 Reviewed-on: https://go-review.googlesource.com/c/go/+/714421 Auto-Submit: Keith Randall <khr@golang.org> Reviewed-by: David Chase <drchase@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@google.com> |
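Roughly, the newly allowed shape looks like this (illustrative type names, not from the CL): zero-sized fields plus a single pointer field, so the value is pointer-shaped and fits directly in the interface data word.

type handle struct {
    _ struct{} // zero-sized field
    p *byte    // the single pointer field
}

func box(h handle) any {
    // h is pointer-shaped, so it can now be stored directly in the
    // interface data word rather than being heap-allocated first.
    return h
}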
2cdcc4150b |
cmd/compile: fold negation into multiplication
goos: linux
goarch: riscv64
pkg: cmd/compile/internal/test
cpu: Spacemit(R) X60
│ /root/mul.base.log │ /root/mul.new.log │
│ sec/op │ sec/op vs base │
MulNeg 6.426µ ± 0% 4.501µ ± 0% -29.96% (p=0.000 n=10)
Mul2Neg 9.000µ ± 0% 6.431µ ± 0% -28.54% (p=0.000 n=10)
Mul2 1.263µ ± 0% 1.263µ ± 0% ~ (p=1.000 n=10)
MulNeg2 1.577µ ± 0% 1.577µ ± 0% ~ (p=0.211 n=10)
geomean 3.276µ 2.756µ -15.89%
goos: linux
goarch: amd64
pkg: cmd/compile/internal/test
cpu: AMD EPYC 7532 32-Core Processor
│ /root/base │ /root/new │
│ sec/op │ sec/op vs base │
MulNeg 691.9n ± 1% 319.4n ± 0% -53.83% (p=0.000 n=10)
Mul2Neg 630.0n ± 0% 629.6n ± 0% -0.07% (p=0.000 n=10)
Mul2 438.1n ± 0% 438.1n ± 0% ~ (p=0.728 n=10)
MulNeg2 439.3n ± 0% 439.4n ± 0% ~ (p=0.656 n=10)
geomean 538.2n 443.6n -17.58%
Change-Id: Ice8e6c8d1e8e3009ba8a0b1b689205174e199019
Reviewed-on: https://go-review.googlesource.com/c/go/+/720180
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
Reviewed-by: Joel Sing <joel@sing.id.au>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Keith Randall <khr@golang.org>
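An illustrative pair of functions (mine; the exact rewrite rules are in the CL) matching the MulNeg and Mul2Neg benchmarks above:

func mulNeg(x, y int64) int64 {
    // Rewritten along the lines of (-x)*y => -(x*y), so the negation can
    // combine with or cancel against other operations on the product.
    return -x * y
}

func mulNeg2(x, y int64) int64 {
    return -x * -y // both negations fold into the product and can cancel
}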
0a569528ea |
cmd/compile: optimize comparisons with single bit difference
Optimize comparisons with constants that only differ by 1 bit (i.e.
a power of 2). For example:
x == 4 || x == 6 -> x|2 == 6
x != 1 && x != 5 -> x|4 != 5
Change-Id: Ic61719e5118446d21cf15652d9da22f7d95b2a15
Reviewed-on: https://go-review.googlesource.com/c/go/+/719420
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
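Using the first example from the message, the kind of source that benefits:

func isFourOrSix(x uint64) bool {
    return x == 4 || x == 6 // now compiled as x|2 == 6: one OR and a single comparison
}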
d50a571ddf |
test: fix tests to work with sizespecializedmalloc turned off
Cq-Include-Trybots: luci.golang.try:gotip-linux-386-nosizespecializedmalloc,gotip-linux-amd64-nosizespecializedmalloc,gotip-linux-arm64-nosizespecializedmalloc Change-Id: I6a6a696465004b939c989afc058c4c3e1fb7134f Reviewed-on: https://go-review.googlesource.com/c/go/+/720401 Auto-Submit: Michael Matloob <matloob@golang.org> Reviewed-by: Michael Knyszek <mknyszek@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Michael Matloob <matloob@google.com> |
d7a0c45642 |
[dev.simd] all: merge master (57362e9) into dev.simd
Conflicts:
- src/cmd/compile/internal/ir/symtab.go
- src/cmd/compile/internal/ssa/prove.go
- src/cmd/compile/internal/ssa/rewriteAMD64.go
- src/cmd/compile/internal/ssagen/intrinsics.go
- src/cmd/compile/internal/typecheck/builtin.go
- src/internal/buildcfg/exp.go
- src/internal/strconv/ftoa.go
- test/codegen/stack.go

Manually resolved some conflicts:
- Use internal/strconv for simd.String, remove internal/ftoa
- prove.go is just copied from the one on the main branch. We have cherry-picked the changes to prove.go to main branch, so our copy is identical to an old version of the one on the main branch. There are CLs landed after our cherry-picks. Just copy it over to adopt the new code.

Merge List: + 2025-11-13 |
34aef89366 |
cmd/compile: use FCLASSD for subnormal checks on riscv64
Only implemented for 64 bit floating point operations for now.
goos: linux
goarch: riscv64
pkg: math
cpu: Spacemit(R) X60
│ sec/op │ sec/op vs base │
Acos 154.1n ± 0% 154.1n ± 0% ~ (p=0.303 n=10)
Acosh 215.8n ± 6% 226.7n ± 0% ~ (p=0.439 n=10)
Asin 149.2n ± 1% 149.2n ± 0% ~ (p=0.700 n=10)
Asinh 262.1n ± 0% 258.5n ± 0% -1.37% (p=0.000 n=10)
Atan 99.48n ± 0% 99.49n ± 0% ~ (p=0.836 n=10)
Atanh 244.9n ± 0% 243.8n ± 0% -0.43% (p=0.002 n=10)
Atan2 158.2n ± 1% 153.3n ± 0% -3.10% (p=0.000 n=10)
Cbrt 186.8n ± 0% 181.1n ± 0% -3.03% (p=0.000 n=10)
Ceil 36.71n ± 1% 36.71n ± 0% ~ (p=0.434 n=10)
Copysign 6.531n ± 1% 6.526n ± 0% ~ (p=0.268 n=10)
Cos 98.19n ± 0% 95.40n ± 0% -2.84% (p=0.000 n=10)
Cosh 233.1n ± 0% 222.6n ± 0% -4.50% (p=0.000 n=10)
Erf 122.5n ± 0% 114.2n ± 0% -6.78% (p=0.000 n=10)
Erfc 126.0n ± 1% 116.6n ± 0% -7.46% (p=0.000 n=10)
Erfinv 138.8n ± 0% 138.6n ± 0% ~ (p=0.082 n=10)
Erfcinv 140.0n ± 0% 139.7n ± 0% ~ (p=0.359 n=10)
Exp 193.3n ± 0% 184.2n ± 0% -4.68% (p=0.000 n=10)
ExpGo 204.8n ± 0% 194.5n ± 0% -5.03% (p=0.000 n=10)
Expm1 152.5n ± 1% 145.0n ± 0% -4.92% (p=0.000 n=10)
Exp2 174.5n ± 0% 164.2n ± 0% -5.85% (p=0.000 n=10)
Exp2Go 184.4n ± 1% 175.4n ± 0% -4.88% (p=0.000 n=10)
Abs 4.912n ± 0% 4.914n ± 0% ~ (p=0.283 n=10)
Dim 15.50n ± 1% 15.52n ± 1% ~ (p=0.331 n=10)
Floor 36.89n ± 1% 36.76n ± 1% ~ (p=0.325 n=10)
Max 31.05n ± 1% 31.17n ± 1% ~ (p=0.628 n=10)
Min 31.01n ± 0% 31.06n ± 0% ~ (p=0.767 n=10)
Mod 294.1n ± 0% 245.6n ± 0% -16.52% (p=0.000 n=10)
Frexp 44.86n ± 1% 35.20n ± 0% -21.53% (p=0.000 n=10)
Gamma 195.8n ± 0% 185.4n ± 1% -5.29% (p=0.000 n=10)
Hypot 84.91n ± 0% 84.54n ± 1% -0.43% (p=0.006 n=10)
HypotGo 96.70n ± 0% 95.42n ± 1% -1.32% (p=0.000 n=10)
Ilogb 45.03n ± 0% 35.07n ± 1% -22.10% (p=0.000 n=10)
J0 634.5n ± 0% 627.2n ± 0% -1.16% (p=0.000 n=10)
J1 644.5n ± 0% 636.9n ± 0% -1.18% (p=0.000 n=10)
Jn 1.357µ ± 0% 1.344µ ± 0% -0.92% (p=0.000 n=10)
Ldexp 49.89n ± 0% 39.96n ± 0% -19.90% (p=0.000 n=10)
Lgamma 186.6n ± 0% 184.3n ± 0% -1.21% (p=0.000 n=10)
Log 150.4n ± 0% 141.1n ± 0% -6.15% (p=0.000 n=10)
Logb 46.70n ± 0% 35.89n ± 0% -23.15% (p=0.000 n=10)
Log1p 164.1n ± 0% 163.9n ± 0% ~ (p=0.122 n=10)
Log10 153.1n ± 0% 143.5n ± 0% -6.24% (p=0.000 n=10)
Log2 58.83n ± 0% 49.75n ± 0% -15.43% (p=0.000 n=10)
Modf 40.82n ± 1% 40.78n ± 0% ~ (p=0.239 n=10)
Nextafter32 49.15n ± 0% 48.93n ± 0% -0.44% (p=0.011 n=10)
Nextafter64 43.33n ± 0% 43.23n ± 0% ~ (p=0.228 n=10)
PowInt 269.4n ± 0% 243.8n ± 0% -9.49% (p=0.000 n=10)
PowFrac 618.0n ± 0% 571.7n ± 0% -7.48% (p=0.000 n=10)
Pow10Pos 13.09n ± 0% 13.05n ± 0% -0.31% (p=0.003 n=10)
Pow10Neg 30.99n ± 1% 30.99n ± 0% ~ (p=0.173 n=10)
Round 23.73n ± 0% 23.65n ± 0% -0.36% (p=0.011 n=10)
RoundToEven 27.87n ± 0% 27.73n ± 0% -0.48% (p=0.003 n=10)
Remainder 282.1n ± 0% 249.6n ± 0% -11.52% (p=0.000 n=10)
Signbit 11.46n ± 0% 11.42n ± 0% -0.39% (p=0.003 n=10)
Sin 115.2n ± 0% 113.2n ± 0% -1.74% (p=0.000 n=10)
Sincos 140.6n ± 0% 138.6n ± 0% -1.39% (p=0.000 n=10)
Sinh 252.0n ± 0% 241.4n ± 0% -4.21% (p=0.000 n=10)
SqrtIndirect 4.909n ± 0% 4.893n ± 0% -0.34% (p=0.021 n=10)
SqrtLatency 19.57n ± 1% 19.57n ± 0% ~ (p=0.087 n=10)
SqrtIndirectLatency 19.64n ± 0% 19.57n ± 0% -0.36% (p=0.025 n=10)
SqrtGoLatency 198.1n ± 0% 197.4n ± 0% -0.35% (p=0.014 n=10)
SqrtPrime 5.733µ ± 0% 5.725µ ± 0% ~ (p=0.116 n=10)
Tan 149.1n ± 0% 146.8n ± 0% -1.54% (p=0.000 n=10)
Tanh 248.2n ± 1% 238.1n ± 0% -4.05% (p=0.000 n=10)
Trunc 36.86n ± 0% 36.70n ± 0% -0.43% (p=0.029 n=10)
Y0 638.2n ± 0% 633.6n ± 0% -0.71% (p=0.000 n=10)
Y1 641.8n ± 0% 636.1n ± 0% -0.87% (p=0.000 n=10)
Yn 1.358µ ± 0% 1.345µ ± 0% -0.92% (p=0.000 n=10)
Float64bits 5.721n ± 0% 5.709n ± 0% -0.22% (p=0.044 n=10)
Float64frombits 4.905n ± 0% 4.893n ± 0% ~ (p=0.266 n=10)
Float32bits 12.27n ± 0% 12.23n ± 0% ~ (p=0.122 n=10)
Float32frombits 4.909n ± 0% 4.893n ± 0% -0.32% (p=0.024 n=10)
FMA 6.556n ± 0% 6.526n ± 0% ~ (p=0.283 n=10)
geomean 86.82n 83.75n -3.54%
Change-Id: I522297a79646d76543d516accce291f5a3cea337
Reviewed-on: https://go-review.googlesource.com/c/go/+/717560
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
86b4fe31d9 |
[dev.simd] cmd/compile: add masked merging ops and optimizations
This CL generates optimizations for the masked variants of AVX512 instructions for patterns of the form x.Op(y).Merge(z, mask) => OpMasked(z, x, y, mask), where OpMasked is resultInArg0. Change-Id: Ife7ccc9ddbf76ae921a085bd6a42b965da9bc179 Reviewed-on: https://go-review.googlesource.com/c/go/+/718160 Reviewed-by: David Chase <drchase@google.com> TryBot-Bypass: Junyang Shao <shaojunyang@google.com> |
771a1dc216 |
[dev.simd] cmd/compile: add peepholes for all masked ops and bug fixes
For 512 bits they are unchanged. This CL adds the optimization rules for 128/256 bits under a feature check. It also fixes a bug in the masked-load variants of instructions and makes them zeroing by default as well. Change-Id: I6fe395541c0cd509984a81841420e71c3af732f2 Reviewed-on: https://go-review.googlesource.com/c/go/+/717822 Reviewed-by: David Chase <drchase@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> |
c7ccbddf22 |
cmd/compile/internal/ssa: more aggressive on dead auto elim
Propagate "unread" across OpMoves. If the addr of this auto is only used
by an OpMove as its source arg, and the OpMove's target arg is the addr
of another auto. If the 2nd auto can be eliminated, this one can also be
eliminated.
This CL eliminates unnecessary memory copies and makes the frame smaller
in the following code snippet:
func contains(m map[string][16]int, k string) bool {
_, ok := m[k]
return ok
}
These are the benchmark results followed by the benchmark code:
goos: linux
goarch: amd64
cpu: Intel(R) Core(TM) i7-8650U CPU @ 1.90GHz
│ old.txt │ new.txt │
│ sec/op │ sec/op vs base │
Map1Access2Ok-8 9.582n ± 2% 9.226n ± 0% -3.72% (p=0.000 n=20)
Map2Access2Ok-8 13.79n ± 1% 10.24n ± 1% -25.77% (p=0.000 n=20)
Map3Access2Ok-8 68.68n ± 1% 12.65n ± 1% -81.58% (p=0.000 n=20)
package main_test
import "testing"
var (
m1 = map[int]int{}
m2 = map[int][16]int{}
m3 = map[int][256]int{}
)
func init() {
for i := range 1000 {
m1[i] = i
m2[i] = [16]int{15:i}
m3[i] = [256]int{255:i}
}
}
func BenchmarkMap1Access2Ok(b *testing.B) {
for i := range b.N {
_, ok := m1[i%1000]
if !ok {
b.Errorf("%d not found", i)
}
}
}
func BenchmarkMap2Access2Ok(b *testing.B) {
for i := range b.N {
_, ok := m2[i%1000]
if !ok {
b.Errorf("%d not found", i)
}
}
}
func BenchmarkMap3Access2Ok(b *testing.B) {
for i := range b.N {
_, ok := m3[i%1000]
if !ok {
b.Errorf("%d not found", i)
}
}
}
Fixes #75398
Change-Id: If75e9caaa50d460efc31a94565b9ba28c8158771
Reviewed-on: https://go-review.googlesource.com/c/go/+/702875
Reviewed-by: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
6e165b4d17 |
cmd/compile: implement Avg64u, Hmul64, Hmul64u for wasm
This lets us remove useAvg and useHmul from the division rules.
The compiler is simpler and the generated code is faster.
goos: wasip1
goarch: wasm
pkg: internal/strconv
│ old.txt │ new.txt │
│ sec/op │ sec/op vs base │
AppendFloat/Decimal 192.8n ± 1% 194.6n ± 0% +0.91% (p=0.000 n=10)
AppendFloat/Float 328.6n ± 0% 279.6n ± 0% -14.93% (p=0.000 n=10)
AppendFloat/Exp 335.6n ± 1% 289.2n ± 1% -13.80% (p=0.000 n=10)
AppendFloat/NegExp 336.0n ± 0% 289.1n ± 1% -13.97% (p=0.000 n=10)
AppendFloat/LongExp 332.4n ± 0% 285.2n ± 1% -14.20% (p=0.000 n=10)
AppendFloat/Big 348.2n ± 0% 300.1n ± 0% -13.83% (p=0.000 n=10)
AppendFloat/BinaryExp 137.4n ± 0% 138.2n ± 0% +0.55% (p=0.001 n=10)
AppendFloat/32Integer 193.3n ± 1% 196.5n ± 0% +1.66% (p=0.000 n=10)
AppendFloat/32ExactFraction 283.3n ± 0% 268.9n ± 1% -5.08% (p=0.000 n=10)
AppendFloat/32Point 279.9n ± 0% 266.5n ± 0% -4.80% (p=0.000 n=10)
AppendFloat/32Exp 300.1n ± 0% 288.3n ± 1% -3.90% (p=0.000 n=10)
AppendFloat/32NegExp 288.2n ± 1% 277.9n ± 1% -3.59% (p=0.000 n=10)
AppendFloat/32Shortest 261.7n ± 0% 250.2n ± 0% -4.39% (p=0.000 n=10)
AppendFloat/32Fixed8Hard 173.3n ± 1% 158.9n ± 1% -8.31% (p=0.000 n=10)
AppendFloat/32Fixed9Hard 180.0n ± 0% 167.9n ± 2% -6.70% (p=0.000 n=10)
AppendFloat/64Fixed1 167.1n ± 0% 149.6n ± 1% -10.50% (p=0.000 n=10)
AppendFloat/64Fixed2 162.4n ± 1% 146.5n ± 0% -9.73% (p=0.000 n=10)
AppendFloat/64Fixed2.5 165.5n ± 0% 149.4n ± 1% -9.70% (p=0.000 n=10)
AppendFloat/64Fixed3 166.4n ± 1% 150.2n ± 0% -9.74% (p=0.000 n=10)
AppendFloat/64Fixed4 163.7n ± 0% 149.6n ± 1% -8.62% (p=0.000 n=10)
AppendFloat/64Fixed5Hard 182.8n ± 1% 167.1n ± 1% -8.61% (p=0.000 n=10)
AppendFloat/64Fixed12 222.2n ± 0% 208.8n ± 0% -6.05% (p=0.000 n=10)
AppendFloat/64Fixed16 197.6n ± 1% 181.7n ± 0% -8.02% (p=0.000 n=10)
AppendFloat/64Fixed12Hard 194.5n ± 0% 181.0n ± 0% -6.99% (p=0.000 n=10)
AppendFloat/64Fixed17Hard 205.1n ± 1% 191.9n ± 0% -6.44% (p=0.000 n=10)
AppendFloat/64Fixed18Hard 6.269µ ± 0% 6.643µ ± 0% +5.97% (p=0.000 n=10)
AppendFloat/64FixedF1 211.7n ± 1% 197.0n ± 0% -6.95% (p=0.000 n=10)
AppendFloat/64FixedF2 189.4n ± 0% 174.2n ± 0% -8.08% (p=0.000 n=10)
AppendFloat/64FixedF3 169.0n ± 0% 154.9n ± 0% -8.32% (p=0.000 n=10)
AppendFloat/Slowpath64 321.2n ± 0% 274.2n ± 1% -14.63% (p=0.000 n=10)
AppendFloat/SlowpathDenormal64 307.4n ± 1% 261.2n ± 0% -15.03% (p=0.000 n=10)
AppendInt 3.367µ ± 1% 3.376µ ± 0% ~ (p=0.517 n=10)
AppendUint 675.5n ± 0% 676.9n ± 0% ~ (p=0.196 n=10)
AppendIntSmall 28.13n ± 1% 28.17n ± 0% +0.14% (p=0.015 n=10)
AppendUintVarlen/digits=1 20.70n ± 0% 20.51n ± 1% -0.89% (p=0.018 n=10)
AppendUintVarlen/digits=2 20.43n ± 0% 20.27n ± 0% -0.81% (p=0.001 n=10)
AppendUintVarlen/digits=3 38.48n ± 0% 37.93n ± 0% -1.43% (p=0.000 n=10)
AppendUintVarlen/digits=4 41.10n ± 0% 38.78n ± 1% -5.62% (p=0.000 n=10)
AppendUintVarlen/digits=5 42.25n ± 1% 42.11n ± 0% -0.32% (p=0.041 n=10)
AppendUintVarlen/digits=6 45.40n ± 1% 43.14n ± 0% -4.98% (p=0.000 n=10)
AppendUintVarlen/digits=7 46.81n ± 1% 46.03n ± 0% -1.66% (p=0.000 n=10)
AppendUintVarlen/digits=8 48.88n ± 1% 46.59n ± 1% -4.68% (p=0.000 n=10)
AppendUintVarlen/digits=9 49.94n ± 2% 49.41n ± 1% -1.06% (p=0.000 n=10)
AppendUintVarlen/digits=10 57.28n ± 1% 56.92n ± 1% -0.62% (p=0.045 n=10)
AppendUintVarlen/digits=11 60.09n ± 1% 58.11n ± 2% -3.30% (p=0.000 n=10)
AppendUintVarlen/digits=12 62.22n ± 0% 61.85n ± 0% -0.59% (p=0.000 n=10)
AppendUintVarlen/digits=13 64.94n ± 0% 62.92n ± 0% -3.10% (p=0.000 n=10)
AppendUintVarlen/digits=14 65.42n ± 1% 65.19n ± 1% -0.34% (p=0.005 n=10)
AppendUintVarlen/digits=15 68.17n ± 0% 66.13n ± 0% -2.99% (p=0.000 n=10)
AppendUintVarlen/digits=16 70.21n ± 1% 70.09n ± 1% ~ (p=0.517 n=10)
AppendUintVarlen/digits=17 72.93n ± 0% 70.49n ± 0% -3.34% (p=0.000 n=10)
AppendUintVarlen/digits=18 73.01n ± 0% 72.75n ± 0% -0.35% (p=0.000 n=10)
AppendUintVarlen/digits=19 79.27n ± 1% 79.49n ± 1% ~ (p=0.671 n=10)
AppendUintVarlen/digits=20 82.18n ± 0% 80.43n ± 1% -2.14% (p=0.000 n=10)
geomean 143.4n 136.0n -5.20%
Change-Id: I8245814a0259ad13cf9225f57db8e9fe3d2e4267
Reviewed-on: https://go-review.googlesource.com/c/go/+/717407
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
235b4e729d |
cmd/compile/internal/ssa: model right shift more precisely
Prove currently checks for 0 sign bit extraction (x>>63) at the end of the pass, but it is more general and more useful (and not really more work) to model right shift during value range tracking. This handles sign bit extraction (both 0 and -1) but also makes the value ranges available for proving bounds checks.

'go build -a -gcflags=-d=ssa/prove/debug=1 std' finds 105 new things to prove. https://gist.github.com/rsc/8ac41176e53ed9c2f1a664fc668e8336

For example, the compiler now recognizes that this code in strconv does not need to check the second shift for being ≥ 64:

msb := xHi >> 63
retMantissa := xHi >> (msb + 38)

nor does this code in regexp:

return b < utf8.RuneSelf && specialBytes[b%16]&(1<<(b/16)) != 0

This code in math no longer has a bounds check on the first index:

if 0 <= n && n <= 308 {
    return pow10postab32[uint(n)/32] * pow10tab[uint(n)%32]
}

The diff shows one "lost" proof in ycbcr.go but it's not really lost: the expression was folded to a constant instead, and that only shows up with debug=2. A diff of that output is at https://gist.github.com/rsc/9139ed46c6019ae007f5a1ba4bb3250f

Change-Id: I84087311e0a303f00e2820d957a6f8b29ee22519 Reviewed-on: https://go-review.googlesource.com/c/go/+/716140 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Russ Cox <rsc@golang.org> Reviewed-by: David Chase <drchase@google.com> |
1e5bb416d8 |
cmd/compile: implement bits.Mul64 on 32-bit systems
This CL implements Mul64uhilo, Hmul64, Hmul64u, and Avg64u
on 32-bit systems, with the effect that constant division of both
int64s and uint64s can now be emitted directly in all cases,
and also that bits.Mul64 can be intrinsified on 32-bit systems.
Previously, constant division of uint64s by values 0 ≤ c ≤ 0xFFFF were
implemented as uint32 divisions by c and some fixup. After expanding
those smaller constant divisions, the code for i/999 required:
(386) 7 mul, 10 add, 2 sub, 3 rotate, 3 shift (104 bytes)
(arm) 7 mul, 9 add, 3 sub, 2 shift (104 bytes)
(mips) 7 mul, 10 add, 5 sub, 6 shift, 3 sgtu (176 bytes)
For that much code, we might as well use a full 64x64->128 multiply
that can be used for all divisors, not just small ones.
Having done that, the same i/999 now generates:
(386) 4 mul, 9 add, 2 sub, 2 or, 6 shift (112 bytes)
(arm) 4 mul, 8 add, 2 sub, 2 or, 3 shift (92 bytes)
(mips) 4 mul, 11 add, 3 sub, 6 shift, 8 sgtu, 4 or (196 bytes)
The size increase on 386 is due to a few extra register spills.
The size increase on mips is due to add-with-carry being hard.
The new approach is more general, letting us delete the old special case
and guarantee that all int64 and uint64 divisions by constants are
generated directly on 32-bit systems.
This especially speeds up code making heavy use of bits.Mul64 with
a constant argument, which happens in strconv and various crypto
packages. A few examples are benchmarked below.
pkg: cmd/compile/internal/test
benchmark \ host local linux-amd64 s7 linux-386 s7:GOARCH=386
vs base vs base vs base vs base vs base
DivconstI64 ~ ~ ~ -49.66% -21.02%
ModconstI64 ~ ~ ~ -13.45% +14.52%
DivisiblePow2constI64 ~ ~ ~ +0.97% -1.32%
DivisibleconstI64 ~ ~ ~ -20.01% -48.28%
DivisibleWDivconstI64 ~ ~ -1.76% -38.59% -42.74%
DivconstU64/3 ~ ~ ~ -13.82% -4.09%
DivconstU64/5 ~ ~ ~ -14.10% -3.54%
DivconstU64/37 -2.07% -4.45% ~ -19.60% -9.55%
DivconstU64/1234567 ~ ~ ~ -61.55% -56.93%
ModconstU64 ~ ~ ~ -6.25% ~
DivisibleconstU64 ~ ~ ~ -2.78% -7.82%
DivisibleWDivconstU64 ~ ~ ~ +4.23% +2.56%
pkg: math/bits
benchmark \ host s7 linux-amd64 linux-386 s7:GOARCH=386
vs base vs base vs base vs base
Add ~ ~ ~ ~
Add32 +1.59% ~ ~ ~
Add64 ~ ~ ~ ~
Add64multiple ~ ~ ~ ~
Sub ~ ~ ~ ~
Sub32 ~ ~ ~ ~
Sub64 ~ ~ -9.20% ~
Sub64multiple ~ ~ ~ ~
Mul ~ ~ ~ ~
Mul32 ~ ~ ~ ~
Mul64 ~ ~ -41.58% -53.21%
Div ~ ~ ~ ~
Div32 ~ ~ ~ ~
Div64 ~ ~ ~ ~
pkg: strconv
benchmark \ host s7 linux-amd64 linux-386 s7:GOARCH=386
vs base vs base vs base vs base
ParseInt/Pos/7bit ~ ~ -11.08% -6.75%
ParseInt/Pos/26bit ~ ~ -13.65% -11.02%
ParseInt/Pos/31bit ~ ~ -14.65% -9.71%
ParseInt/Pos/56bit -1.80% ~ -17.97% -10.78%
ParseInt/Pos/63bit ~ ~ -13.85% -9.63%
ParseInt/Neg/7bit ~ ~ -12.14% -7.26%
ParseInt/Neg/26bit ~ ~ -14.18% -9.81%
ParseInt/Neg/31bit ~ ~ -14.51% -9.02%
ParseInt/Neg/56bit ~ ~ -15.79% -9.79%
ParseInt/Neg/63bit ~ ~ -15.68% -11.07%
AppendFloat/Decimal ~ ~ -7.25% -12.26%
AppendFloat/Float ~ ~ -15.96% -19.45%
AppendFloat/Exp ~ ~ -13.96% -17.76%
AppendFloat/NegExp ~ ~ -14.89% -20.27%
AppendFloat/LongExp ~ ~ -12.68% -17.97%
AppendFloat/Big ~ ~ -11.10% -16.64%
AppendFloat/BinaryExp ~ ~ ~ ~
AppendFloat/32Integer ~ ~ -10.05% -10.91%
AppendFloat/32ExactFraction ~ ~ -8.93% -13.00%
AppendFloat/32Point ~ ~ -10.36% -14.89%
AppendFloat/32Exp ~ ~ -9.88% -13.54%
AppendFloat/32NegExp ~ ~ -10.16% -14.26%
AppendFloat/32Shortest ~ ~ -11.39% -14.96%
AppendFloat/32Fixed8Hard ~ ~ ~ -2.31%
AppendFloat/32Fixed9Hard ~ ~ ~ -7.01%
AppendFloat/64Fixed1 ~ ~ -2.83% -8.23%
AppendFloat/64Fixed2 ~ ~ ~ -7.94%
AppendFloat/64Fixed3 ~ ~ -4.07% -7.22%
AppendFloat/64Fixed4 ~ ~ -7.24% -7.62%
AppendFloat/64Fixed12 ~ ~ -6.57% -4.82%
AppendFloat/64Fixed16 ~ ~ -4.00% -5.81%
AppendFloat/64Fixed12Hard -2.22% ~ -4.07% -6.35%
AppendFloat/64Fixed17Hard -2.12% ~ ~ -3.79%
AppendFloat/64Fixed18Hard -1.89% ~ +2.48% ~
AppendFloat/Slowpath64 -1.85% ~ -14.49% -18.21%
AppendFloat/SlowpathDenormal64 ~ ~ -13.08% -19.41%
pkg: crypto/internal/fips140/nistec/fiat
benchmark \ host s7 linux-amd64 linux-386 s7:GOARCH=386
vs base vs base vs base vs base
Mul/P224 ~ ~ -29.95% -39.60%
Mul/P384 ~ ~ -37.11% -63.33%
Mul/P521 ~ ~ -26.62% -12.42%
Square/P224 +1.46% ~ -40.62% -49.18%
Square/P384 ~ ~ -45.51% -69.68%
Square/P521 +90.37% ~ -25.26% -11.23%
(The +90% is a separate problem and not real; that much variation
can be seen on that system by running the same binary from two
different files.)
pkg: crypto/internal/fips140/edwards25519
benchmark \ host s7 linux-amd64 linux-386 s7:GOARCH=386
vs base vs base vs base vs base
EncodingDecoding ~ ~ -34.67% -35.75%
ScalarBaseMult ~ ~ -31.25% -30.29%
ScalarMult ~ ~ -33.45% -32.54%
VarTimeDoubleScalarBaseMult ~ ~ -33.78% -33.68%
Change-Id: Id3c91d42cd01def6731b755e99f8f40c6ad1bb65
Reviewed-on: https://go-review.googlesource.com/c/go/+/716061
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Russ Cox <rsc@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
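For reference, a small example (mine, not the CL's) of the pattern that now intrinsifies on 32-bit targets as well; a 64x64->128 multiply by a constant is also exactly what constant division lowers to:

package demo

import "math/bits"

// With this CL, bits.Mul64 is intrinsified on 32-bit systems too, so this
// compiles to a direct 64x64->128 multiply rather than falling back to the
// generic Go implementation.
func mulWide(x, y uint64) (hi, lo uint64) {
    return bits.Mul64(x, y)
}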
9bbda7c99d |
cmd/compile: make prove understand div, mod better
This CL introduces new divisible and divmod passes that rewrite divisibility checks and div, mod, and mul. These happen after prove, so that prove can make better sense of the code for deriving bounds, and they must run before decompose, so that 64-bit ops can be lowered to 32-bit ops on 32-bit systems. And then they need another generic pass as well, to optimize the generated code before decomposing. The three opt passes are "opt", "middle opt", and "late opt". (Perhaps instead they should be "generic", "opt", and "late opt"?) The "late opt" pass repeats the "middle opt" work on any new code that has been generated in the interim. There will not be new divs or mods, but there may be new muls.

The x%c==0 rewrite rules are much simpler now, since they can match before divs have been rewritten. This has the effect of applying them more consistently and making the rewrite rules independent of the exact div rewrites.

Prove is also now charged with marking signed div/mod as unsigned when the arguments call for it, allowing simpler code to be emitted in various cases. For example, t.Seconds()/2 and len(x)/2 are now recognized as unsigned, meaning they compile to a simple shift (unsigned division), avoiding the more complex fixup we need for signed values.

https://gist.github.com/rsc/99d9d3bd99cde87b6a1a390e3d85aa32 shows a diff of 'go build -a -gcflags=-d=ssa/prove/debug=1 std' output before and after. "Proved Rsh64x64 shifts to zero" is replaced by the higher-level "Proved Div64 is unsigned" (the shift was in the signed expansion of div by constant), but otherwise prove is only finding more things to prove. One short example, in code that does x[i%len(x)]:

< runtime/mfinal.go:131:34: Proved Rsh64x64 shifts to zero
---
> runtime/mfinal.go:131:34: Proved Div64 is unsigned
> runtime/mfinal.go:131:38: Proved IsInBounds

A longer example:

< crypto/internal/fips140/sha3/shake.go:28:30: Proved Rsh64x64 shifts to zero
< crypto/internal/fips140/sha3/shake.go:38:27: Proved Rsh64x64 shifts to zero
< crypto/internal/fips140/sha3/shake.go:53:46: Proved Rsh64x64 shifts to zero
< crypto/internal/fips140/sha3/shake.go:55:46: Proved Rsh64x64 shifts to zero
---
> crypto/internal/fips140/sha3/shake.go:28:30: Proved Div64 is unsigned
> crypto/internal/fips140/sha3/shake.go:28:30: Proved IsInBounds
> crypto/internal/fips140/sha3/shake.go:28:30: Proved IsSliceInBounds
> crypto/internal/fips140/sha3/shake.go:38:27: Proved Div64 is unsigned
> crypto/internal/fips140/sha3/shake.go:45:7: Proved IsSliceInBounds
> crypto/internal/fips140/sha3/shake.go:46:4: Proved IsInBounds
> crypto/internal/fips140/sha3/shake.go:53:46: Proved Div64 is unsigned
> crypto/internal/fips140/sha3/shake.go:53:46: Proved IsInBounds
> crypto/internal/fips140/sha3/shake.go:53:46: Proved IsSliceInBounds
> crypto/internal/fips140/sha3/shake.go:55:46: Proved Div64 is unsigned
> crypto/internal/fips140/sha3/shake.go:55:46: Proved IsInBounds
> crypto/internal/fips140/sha3/shake.go:55:46: Proved IsSliceInBounds

These diffs are due to the smaller opt being better and taking work away from prove:

< image/jpeg/dct.go:307:5: Proved IsInBounds
< image/jpeg/dct.go:308:5: Proved IsInBounds
...
< image/jpeg/dct.go:442:5: Proved IsInBounds

In the old opt, Mul by 8 was rewritten to Lsh by 3 early. This CL delays that rule to help prove recognize mods, but it also helps opt constant-fold the slice x[8*i:8*i+8:8*i+8].
Specifically, computing the length, opt can now do: (Sub64 (Add (Mul 8 i) 8) (Add (Mul 8 i) 8)) -> (Add 8 (Sub (Mul 8 i) (Mul 8 i))) -> (Add 8 (Mul 8 (Sub i i))) -> (Add 8 (Mul 8 0)) -> (Add 8 0) -> 8 The key step is (Sub (Mul x y) (Mul x z)) -> (Mul x (Sub y z)), Leaving the multiply as Mul enables using that step; the old rewrite to Lsh blocked it, leaving prove to figure out the length and then remove the bounds checks. But now opt can evaluate the length down to a constant 8 and then constant-fold away the bounds checks 0 < 8, 1 < 8, and so on. After that, the compiler has nothing left to prove. Benchmarks are noisy in general; I checked the assembly for the many large increases below, and the vast majority are unchanged and presumably hitting the caches differently in some way. The divisibility optimizations were not reliably triggering before. This leads to a very large improvement in some cases, like DivisiblePow2constI64, DivisibleconstI64 on 64-bit systems and DivisbleconstU64 on 32-bit systems. Another way the divisibility optimizations were unreliable before was incorrectly triggering for x/3, x%3 even though they are written not to do that. There is a real but small slowdown in the DivisibleWDivconst benchmarks on Mac because in the cases used in the benchmark, it is still faster (on Mac) to do the divisibility check than to remultiply. This may be worth further study. Perhaps when there is no rotate (meaning the divisor is odd), the divisibility optimization should be enabled always. In any event, this CL makes it possible to study that. benchmark \ host s7 linux-amd64 mac linux-arm64 linux-ppc64le linux-386 s7:GOARCH=386 linux-arm vs base vs base vs base vs base vs base vs base vs base vs base LoadAdd ~ ~ ~ ~ ~ -1.59% ~ ~ ExtShift ~ ~ -42.14% +0.10% ~ +1.44% +5.66% +8.50% Modify ~ ~ ~ ~ ~ ~ ~ -1.53% MullImm ~ ~ ~ ~ ~ +37.90% -21.87% +3.05% ConstModify ~ ~ ~ ~ -49.14% ~ ~ ~ BitSet ~ ~ ~ ~ -15.86% -14.57% +6.44% +0.06% BitClear ~ ~ ~ ~ ~ +1.78% +3.50% +0.06% BitToggle ~ ~ ~ ~ ~ -16.09% +2.91% ~ BitSetConst ~ ~ ~ ~ ~ ~ ~ -0.49% BitClearConst ~ ~ ~ ~ -28.29% ~ ~ -0.40% BitToggleConst ~ ~ ~ +8.89% -31.19% ~ ~ -0.77% MulNeg ~ ~ ~ ~ ~ ~ ~ ~ Mul2Neg ~ ~ -4.83% ~ ~ -13.75% -5.92% ~ DivconstI64 ~ ~ ~ ~ ~ -30.12% ~ +0.50% ModconstI64 ~ ~ -9.94% -4.63% ~ +3.15% ~ +5.32% DivisiblePow2constI64 -34.49% -12.58% ~ ~ -12.25% ~ ~ ~ DivisibleconstI64 -24.69% -25.06% -0.40% -2.27% -42.61% -3.31% ~ +1.63% DivisibleWDivconstI64 ~ ~ ~ ~ ~ -17.55% ~ -0.60% DivconstU64/3 ~ ~ ~ ~ ~ +1.51% ~ ~ DivconstU64/5 ~ ~ ~ ~ ~ ~ ~ ~ DivconstU64/37 ~ ~ -0.18% ~ ~ +2.70% ~ ~ DivconstU64/1234567 ~ ~ ~ ~ ~ ~ ~ +0.12% ModconstU64 ~ ~ ~ -0.24% ~ -5.10% -1.07% -1.56% DivisibleconstU64 ~ ~ ~ ~ ~ -29.01% -59.13% -50.72% DivisibleWDivconstU64 ~ ~ -12.18% -18.88% ~ -5.50% -3.91% +5.17% DivconstI32 ~ ~ -0.48% ~ -34.69% +89.01% -6.01% -16.67% ModconstI32 ~ +2.95% -0.33% ~ ~ -2.98% -5.40% -8.30% DivisiblePow2constI32 ~ ~ ~ ~ ~ ~ ~ -16.22% DivisibleconstI32 ~ ~ ~ ~ ~ -37.27% -47.75% -25.03% DivisibleWDivconstI32 -11.59% +5.22% -12.99% -23.83% ~ +45.95% -7.03% -10.01% DivconstU32 ~ ~ ~ ~ ~ +74.71% +4.81% ~ ModconstU32 ~ ~ +0.53% +0.18% ~ +51.16% ~ ~ DivisibleconstU32 ~ ~ ~ -0.62% ~ -4.25% ~ ~ DivisibleWDivconstU32 -2.77% +5.56% +11.12% -5.15% ~ +48.70% +25.11% -4.07% DivconstI16 -6.06% ~ -0.33% +0.22% ~ ~ -9.68% +5.47% ModconstI16 ~ ~ +4.44% +2.82% ~ ~ ~ +5.06% DivisiblePow2constI16 ~ ~ ~ ~ ~ ~ ~ -0.17% DivisibleconstI16 ~ ~ -0.23% ~ ~ ~ +4.60% +6.64% DivisibleWDivconstI16 -1.44% -0.43% +13.48% -5.76% ~ +1.62% -23.15% -9.06% 
DivconstU16 +1.61% ~ -0.35% -0.47% ~ ~ +15.59% ~ ModconstU16 ~ ~ ~ ~ ~ -0.72% ~ +14.23% DivisibleconstU16 ~ ~ -0.05% +3.00% ~ ~ ~ +5.06% DivisibleWDivconstU16 +52.10% +0.75% +17.28% +4.79% ~ -37.39% +5.28% -9.06% DivconstI8 ~ ~ -0.34% -0.96% ~ ~ -9.20% ~ ModconstI8 +2.29% ~ +4.38% +2.96% ~ ~ ~ ~ DivisiblePow2constI8 ~ ~ ~ ~ ~ ~ ~ ~ DivisibleconstI8 ~ ~ ~ ~ ~ ~ +6.04% ~ DivisibleWDivconstI8 -26.44% +1.69% +17.03% +4.05% ~ +32.48% -24.90% ~ DivconstU8 -4.50% +14.06% -0.28% ~ ~ ~ +4.16% +0.88% ModconstU8 ~ ~ +25.84% -0.64% ~ ~ ~ ~ DivisibleconstU8 ~ ~ -5.70% ~ ~ ~ ~ ~ DivisibleWDivconstU8 +49.55% +9.07% ~ +4.03% +53.87% -40.03% +39.72% -3.01% Mul2 ~ ~ ~ ~ ~ ~ ~ ~ MulNeg2 ~ ~ ~ ~ -11.73% ~ ~ -0.02% EfaceInteger ~ ~ ~ ~ ~ +18.11% ~ +2.53% TypeAssert +33.90% +2.86% ~ ~ ~ -1.07% -5.29% -1.04% Div64UnsignedSmall ~ ~ ~ ~ ~ ~ ~ ~ Div64Small ~ ~ ~ ~ ~ -0.88% ~ +2.39% Div64SmallNegDivisor ~ ~ ~ ~ ~ ~ ~ +0.35% Div64SmallNegDividend ~ ~ ~ ~ ~ -0.84% ~ +3.57% Div64SmallNegBoth ~ ~ ~ ~ ~ -0.86% ~ +3.55% Div64Unsigned ~ ~ ~ ~ ~ ~ ~ -0.11% Div64 ~ ~ ~ ~ ~ ~ ~ +0.11% Div64NegDivisor ~ ~ ~ ~ ~ -1.29% ~ ~ Div64NegDividend ~ ~ ~ ~ ~ -1.44% ~ ~ Div64NegBoth ~ ~ ~ ~ ~ ~ ~ +0.28% Mod64UnsignedSmall ~ ~ ~ ~ ~ +0.48% ~ +0.93% Mod64Small ~ ~ ~ ~ ~ ~ ~ ~ Mod64SmallNegDivisor ~ ~ ~ ~ ~ ~ ~ +1.44% Mod64SmallNegDividend ~ ~ ~ ~ ~ +0.22% ~ +1.37% Mod64SmallNegBoth ~ ~ ~ ~ ~ ~ ~ -2.22% Mod64Unsigned ~ ~ ~ ~ ~ -0.95% ~ +0.11% Mod64 ~ ~ ~ ~ ~ ~ ~ ~ Mod64NegDivisor ~ ~ ~ ~ ~ ~ ~ -0.02% Mod64NegDividend ~ ~ ~ ~ ~ ~ ~ ~ Mod64NegBoth ~ ~ ~ ~ ~ ~ ~ -0.02% MulconstI32/3 ~ ~ ~ -25.00% ~ ~ ~ +47.37% MulconstI32/5 ~ ~ ~ +33.28% ~ ~ ~ +32.21% MulconstI32/12 ~ ~ ~ -2.13% ~ ~ ~ -0.02% MulconstI32/120 ~ ~ ~ +2.93% ~ ~ ~ -0.03% MulconstI32/-120 ~ ~ ~ -2.17% ~ ~ ~ -0.03% MulconstI32/65537 ~ ~ ~ ~ ~ ~ ~ +0.03% MulconstI32/65538 ~ ~ ~ ~ ~ -33.38% ~ +0.04% MulconstI64/3 ~ ~ ~ +33.35% ~ -0.37% ~ -0.13% MulconstI64/5 ~ ~ ~ -25.00% ~ -0.34% ~ ~ MulconstI64/12 ~ ~ ~ +2.13% ~ +11.62% ~ +2.30% MulconstI64/120 ~ ~ ~ -1.98% ~ ~ ~ ~ MulconstI64/-120 ~ ~ ~ +0.75% ~ ~ ~ ~ MulconstI64/65537 ~ ~ ~ ~ ~ +5.61% ~ ~ MulconstI64/65538 ~ ~ ~ ~ ~ +5.25% ~ ~ MulconstU32/3 ~ +0.81% ~ +33.39% ~ +77.92% ~ -32.31% MulconstU32/5 ~ ~ ~ -24.97% ~ +77.92% ~ -24.47% MulconstU32/12 ~ ~ ~ +2.06% ~ ~ ~ +0.03% MulconstU32/120 ~ ~ ~ -2.74% ~ ~ ~ +0.03% MulconstU32/65537 ~ ~ ~ ~ ~ ~ ~ +0.03% MulconstU32/65538 ~ ~ ~ ~ ~ -33.42% ~ -0.03% MulconstU64/3 ~ ~ ~ +33.33% ~ -0.28% ~ +1.22% MulconstU64/5 ~ ~ ~ -25.00% ~ ~ ~ -0.64% MulconstU64/12 ~ ~ ~ +2.30% ~ +11.59% ~ +0.14% MulconstU64/120 ~ ~ ~ -2.82% ~ ~ ~ +0.04% MulconstU64/65537 ~ +0.37% ~ ~ ~ +5.58% ~ ~ MulconstU64/65538 ~ ~ ~ ~ ~ +5.16% ~ ~ ShiftArithmeticRight ~ ~ ~ ~ ~ -10.81% ~ +0.31% Switch8Predictable +14.69% ~ ~ ~ ~ -24.85% ~ ~ Switch8Unpredictable ~ -0.58% -3.80% ~ ~ -11.78% ~ -0.79% Switch32Predictable -10.33% +17.89% ~ ~ ~ +5.76% ~ ~ Switch32Unpredictable -3.15% +1.19% +9.42% ~ ~ -10.30% -5.09% +0.44% SwitchStringPredictable +70.88% +20.48% ~ ~ ~ +2.39% ~ +0.31% SwitchStringUnpredictable ~ +3.91% -5.06% -0.98% ~ +0.61% +2.03% ~ SwitchTypePredictable +146.58% -1.10% ~ -12.45% ~ -0.46% -3.81% ~ SwitchTypeUnpredictable +0.46% -0.83% ~ +4.18% ~ +0.43% ~ +0.62% SwitchInterfaceTypePredictable -13.41% -10.13% +11.03% ~ ~ -4.38% ~ +0.75% SwitchInterfaceTypeUnpredictable -6.37% -2.14% ~ -3.21% ~ -4.20% ~ +1.08% Fixes #63110. Fixes #75954. 
Change-Id: I55a876f08c6c14f419ce1a8cbba2eaae6c6efbf0 Reviewed-on: https://go-review.googlesource.com/c/go/+/714160 Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com> Auto-Submit: Russ Cox <rsc@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> |
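As a hedged illustration of the kinds of Go code affected (these functions are made up for this listing, not taken from the CL):

    // first returns the sum of the first and last bytes of the i'th 8-byte
    // window of p. The length of w is (i*8+8) - (i*8); opt can now simplify
    // that to the constant 8, so the w[0] and w[7] bounds checks constant-fold
    // away with nothing left for prove to do.
    func first(p []byte, i int) byte {
        w := p[i*8 : i*8+8]
        return w[0] + w[7]
    }

    // divisible3 is the shape of divisibility test that the (now reliably
    // triggering) divisibility optimization rewrites into a multiply rather
    // than a divide.
    func divisible3(x uint64) bool {
        return x%3 == 0
    }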
||
|
|
915c1839fe |
test/codegen: simplify asmcheck pattern matching
Separate patterns in asmcheck by spaces instead of commas. Many patterns end in a comma (like "MOV [$]123,"), so separating patterns by comma is not great; they're already quoted, so spaces are fine. Also replace all tabs in the assembly lines with spaces before matching. Finally, replace \$ or \\$ with [$] as the matching idiom. The effect of all these is to make the patterns look like:

// amd64:"BSFQ" "ORQ [$]256"

instead of the old:

// amd64:"BSFQ","ORQ\t\\$256"

Update all tests as well. Change-Id: Ia39febe5d7f67ba115846422789e11b185d5c807 Reviewed-on: https://go-review.googlesource.com/c/go/+/716060 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Alan Donovan <adonovan@google.com> Reviewed-by: Jorropo <jorropo.pgm@gmail.com> |
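For illustration, this is how such a test reads with the new syntax; the function and the exact instruction asserted here are assumptions for the example, not taken from the CL:

    // or256 demonstrates the new pattern style: quoted patterns separated by
    // spaces, with [$] standing in for a literal $ in the assembly.
    func or256(x uint64) uint64 {
        // amd64:"ORQ [$]256"
        return x | 256
    }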
||
|
|
e452f4ac7d |
[dev.simd] cmd/compile: enhance inlining for closure-of-SIMD
We noticed some hand-translated code that used nested functions as the translation of asm macros, and they were too big to inline, and the resulting performance was underwhelming. Any such closures really need to be inlined. Because Gerrit removed votes from a previous patch set, and because in offline discussion we realized that this was actually a hard-to-abuse inlining hack, I decided to turn it up some more, and also add a "this one goes to 11" joke. The number is utterly unprincipled, only "simd is supposed to go fast, and this is a natural use of closures, and we don't want there to be issues where it doesn't go fast." The test verifies that the inlining occurs for a function that exceeds the current inlining threshold. Inspection of the generated code shows that it has the desired effect. Change-Id: I7a8b57c07d6482e6d98cedaf9622c960f956834d Reviewed-on: https://go-review.googlesource.com/c/go/+/715740 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Junyang Shao <shaojunyang@google.com> |
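A rough sketch of the pattern being described, with hypothetical names (not code from the CL): a small closure standing in for an asm macro, which is only fast if the inliner absorbs it into the loop:

    // addBlocks assumes dst, a, and b have the same length. The add4 closure
    // plays the role of an asm macro; with the raised inlining budget it can
    // be inlined into the loop instead of being called through a funcval.
    func addBlocks(dst, a, b []float32) {
        add4 := func(i int) {
            dst[i+0] = a[i+0] + b[i+0]
            dst[i+1] = a[i+1] + b[i+1]
            dst[i+2] = a[i+2] + b[i+2]
            dst[i+3] = a[i+3] + b[i+3]
        }
        for i := 0; i+4 <= len(dst); i += 4 {
            add4(i)
        }
    }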
||
|
|
9035f7aea5 |
runtime: use internal/strconv
Runtime doing its own number formatting dates back to when runtime was the bottom-most Go package. Those days are long gone. Use internal/strconv to avoid duplicating code and also to get better floating-point formatting:

% go1.24.6 run x.go
+1.234568e+004
% go run x.go
12345.678
%

With accurate floating point it becomes necessary to introduce separate printers for float32 vs float64 and for complex64 vs complex128. Otherwise float32(93.7) prints as 93.69999694824219. Change-Id: I25ae3f09519342dc3d1dcabf4711651423e00128 Reviewed-on: https://go-review.googlesource.com/c/go/+/716002 Reviewed-by: David Chase <drchase@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> |
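The x.go program behind the transcript is not shown; a minimal stand-in that exercises the runtime's own printer (rather than fmt) would be:

    package main

    func main() {
        // The builtin println is formatted by the runtime, so it reflects the
        // switch from %e-style output to shortest-decimal formatting.
        println(12345.678)
        // Needs the separate float32 printer to avoid 93.69999694824219.
        println(float32(93.7))
    }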
||
|
|
bd4dc413cd |
cmd/compile: don't optimize away a panicking interface comparison
We can't do direct pointer comparisons if the type is not a comparable type. Fixes #76008 Change-Id: I1687acff21832d2c2e8f3b875e7b5ec125702ef3 Reviewed-on: https://go-review.googlesource.com/c/go/+/713840 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com> Reviewed-by: Keith Randall <khr@google.com> |
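A hedged example of the case being protected (not the CL's test case):

    // equalFuncs compares two interface values; it must panic at run time
    // when both hold values of the same non-comparable dynamic type (for
    // example two func() values), rather than being optimized into a direct
    // pointer comparison that silently returns a result.
    func equalFuncs(a, b any) bool {
        return a == b
    }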
||
|
|
24af441437 |
cmd/compile: rewrite proved multiplies by 0 or 1 into CondSelect
Updates #76056 Change-Id: I64fe631ab381c74f902f877392530d7cc91860ab Reviewed-on: https://go-review.googlesource.com/c/go/+/715044 Reviewed-by: Michael Knyszek <mknyszek@google.com> Auto-Submit: Jorropo <jorropo.pgm@gmail.com> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Keith Randall <khr@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> |
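A hedged sketch of code this rewrite could apply to; the exact trigger conditions live in the CL, not here:

    // keepIf multiplies x by a value that prove knows is either 0 or 1,
    // which can now be rewritten into a conditional select between 0 and x.
    func keepIf(x int64, keep bool) int64 {
        m := int64(0)
        if keep {
            m = 1
        }
        return x * m
    }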
||
|
|
73d7635fae |
cmd/compile: add generic rules to remove bool → int → bool roundtrips
Change-Id: I8b0a3b64c89fe167d304f901a5d38470f35400ab Reviewed-on: https://go-review.googlesource.com/c/go/+/715200 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@google.com> Auto-Submit: Jorropo <jorropo.pgm@gmail.com> Reviewed-by: Michael Knyszek <mknyszek@google.com> Reviewed-by: Keith Randall <khr@golang.org> |
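For example, a bool → int → bool roundtrip of the kind the new rules can remove (illustrative only):

    // roundtrip encodes b as an int and immediately decodes it again; the
    // generic rules can now reduce the whole function body to just b.
    func roundtrip(b bool) bool {
        n := 0
        if b {
            n = 1
        }
        return n != 0
    }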
||
|
|
77dc138030 |
cmd/compile: teach prove about unsigned rounding-up divide
Change-Id: Ia7b5242c723f83ba85d12e4ca64a19fbbd126016 Reviewed-on: https://go-review.googlesource.com/c/go/+/714622 Auto-Submit: Jorropo <jorropo.pgm@gmail.com> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Michael Knyszek <mknyszek@google.com> |
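A hedged example of the expression shape involved:

    // ceilDiv8 is the usual unsigned rounding-up divide; prove can now
    // derive facts about the range of results of expressions of this form.
    func ceilDiv8(n uint) uint {
        return (n + 7) / 8
    }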
||
|
|
ca1264ac50 |
[dev.simd] test: add some trickier cases to ternary-boolean simd test
These new tests check a hypothesis about interactions between CPU features and common subexpressions. Happily, the hypothesis was not (yet) correct and the test did not fail. These are probably good to have in the corpus in case we decide to tinker with the rewrite in the future, or if someone wants to write a fuzzer and needs a little inspiration. Change-Id: I8ea6e1655a293c22e39bf53e4d2c5afd3dcb2510 Reviewed-on: https://go-review.googlesource.com/c/go/+/714803 Reviewed-by: Junyang Shao <shaojunyang@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> |
||
|
|
d7a52f9369 |
cmd/compile: use MOV(D|F) with const for Const(64|32)F on riscv64
The original Const64F used AUIPC + LD + FMVDX to load a float64 constant; we can now use AUIPC + FLD instead, the same as Const32F. Change-Id: I8ca0a0e90d820a26e69b74cd25df3cc662132bf7 Reviewed-on: https://go-review.googlesource.com/c/go/+/703215 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Joel Sing <joel@sing.id.au> Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> |
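For reference, a trivial function whose constant load is the kind affected (illustrative only):

    // pi returns a float64 constant; on riscv64 the load can now be an
    // AUIPC+FLD pair instead of AUIPC+LD followed by FMVDX.
    func pi() float64 {
        return 3.141592653589793
    }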
||
|
|
f6b4711095 |
[dev.simd] cmd/compile, simd: add rewrite to convert logical expression trees into TERNLOG instructions
includes tests of both rewrite application and rewrite correctness Change-Id: I7983ccf87a8408af95bb6c447cb22f01beda9f61 Reviewed-on: https://go-review.googlesource.com/c/go/+/710697 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Junyang Shao <shaojunyang@google.com> |
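A hedged scalar sketch of the expression shape involved; the real rewrite applies to SIMD vector values, and the names here are not from the CL:

    // blend computes a three-input boolean function bit by bit, the kind of
    // logical expression tree that can collapse into a single TERNLOG
    // instruction when the operands are vectors.
    func blend(mask, a, b uint64) uint64 {
        return (mask & a) | (^mask & b)
    }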
||
|
|
519ae514ab |
cmd/compile: eliminate bound check for slices of the same length
If two slices start out with the same length and decrease in length by
the same amount on each round of the loop (or in the if block), then
we know their lengths are always equal.
For example:
    if len(a) != len(b) {
        return
    }
    for len(a) >= 4 {
        a = a[4:]
        b = b[4:] // proved here, omit boundary check
    }
    if len(a) == len(b) { // proved here
        //...
    }

Or, change 'for' to 'if':

    if len(a) != len(b) {
        return
    }
    if len(a) >= 4 {
        a = a[4:]
        b = b[4:]
    }
    if len(a) == len(b) { // proved here
        //...
    }
Fixes #75144
Change-Id: I4e5902a02b5cf8fdc122715a7dbd2fb5e9a8f5dc
Reviewed-on: https://go-review.googlesource.com/c/go/+/699155
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Keith Randall <khr@golang.org>
|
||
|
|
7056c71d32 |
cmd/compile: disable use of new saturating float-to-int conversions
The new conversions can be activated (or bisected) with -gcflags=all=-d=converthash=PATTERN, where PATTERN is either a hash string or one of n, qn, y, qy, meaning no, quietly no, yes, quietly yes. This CL makes the default pattern "qn" instead of the default-default, which is an efficient encoding of "qy". Updates #75834 Change-Id: I88a9fd7880bc999132420c8d0a22a8fdc1e95a2a Reviewed-on: https://go-review.googlesource.com/c/go/+/711845 Reviewed-by: Cherry Mui <cherryyz@google.com> TryBot-Bypass: David Chase <drchase@google.com> |
||
|
|
6d5b13793f |
Revert "cmd/compile: make 386 float-to-int conversions match amd64"
This reverts commit
|
||
|
|
bb2a14252b |
Revert "runtime: adjust softfloat corner cases to match amd64/arm64"
This reverts commit
|