Commit graph

63575 commits

Author SHA1 Message Date
Cherry Mui
f53dcb6280 cmd/internal/testdir: unify link command
There are three places where we manually construct a "go tool link"
command. Unify them.

Test binaries don't need the symbol table or debug info, so pass
-s -w always.

Change-Id: I40143894172877738e250f291d7e7ef8dce62488
Reviewed-on: https://go-review.googlesource.com/c/go/+/692875
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-08-06 12:27:04 -07:00
Damien Neil
a3895fe9f1 database/sql: avoid closing Rows while scan is in progress
A database/sql/driver.Rows can return database-owned data
from Rows.Next. The driver.Rows documentation doesn't explicitly
document the lifetime guarantees for this data, but a reasonable
expectation is that the caller of Next should only access it
until the next call to Rows.Close or Rows.Next.

Avoid violating that constraint when a query is cancelled while
a call to database/sql.Rows.Scan (note the difference between
the two different Rows types!) is in progress. We previously
took care to avoid closing a driver.Rows while the user has
access to driver-owned memory via a RawData, but we could still
close a driver.Rows while a Scan call was in the process of
reading previously-returned driver-owned data.

Update the fake DB used in database/sql tests to invalidate
returned data to help catch other places we might be
incorrectly retaining it.

Fixes #74831.

Change-Id: Ice45b5fad51b679c38e3e1d21ef39156b56d6037
Reviewed-on: https://go-internal-review.googlesource.com/c/go/+/2540
Reviewed-by: Roland Shoemaker <bracewell@google.com>
Reviewed-by: Neal Patel <nealpatel@google.com>
Reviewed-on: https://go-review.googlesource.com/c/go/+/693735
Auto-Submit: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Roland Shoemaker <roland@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
2025-08-06 11:36:35 -07:00
Mark Freeman
608e9fac90 go/types, types2: flip on position tracing
Running compilebench with flags off / on, we get the below:

                         │   old.txt    │              new.txt               │
                         │    sec/op    │   sec/op     vs base               │
Template                    149.2m ± 6%   155.5m ± 5%       ~ (p=0.280 n=10)
Unicode                     110.1m ± 3%   105.8m ± 7%       ~ (p=0.280 n=10)
GoTypes                     774.0m ± 6%   757.7m ± 2%       ~ (p=0.247 n=10)
Compiler                    109.6m ± 6%   109.8m ± 6%       ~ (p=0.579 n=10)
SSA                          4.562 ± 2%    4.550 ± 2%       ~ (p=0.436 n=10)
Flate                      101.65m ± 9%   96.32m ± 7%  -5.24% (p=0.043 n=10)
GoParser                    168.7m ± 6%   173.7m ± 6%       ~ (p=0.436 n=10)
Reflect                     390.2m ± 5%   387.8m ± 6%       ~ (p=0.684 n=10)
Tar                         185.9m ± 3%   182.2m ± 4%       ~ (p=0.529 n=10)
XML                         212.7m ± 4%   211.4m ± 4%       ~ (p=0.971 n=10)
LinkCompiler                490.9m ± 4%   480.4m ± 4%       ~ (p=0.353 n=10)
ExternalLinkCompiler         1.501 ± 1%    1.501 ± 1%       ~ (p=0.853 n=10)
LinkWithoutDebugCompiler    311.8m ± 4%   308.6m ± 4%       ~ (p=0.579 n=10)
StdCmd                       17.60 ± 1%    17.62 ± 1%       ~ (p=0.912 n=10)
geomean                     427.5m        424.2m       -0.77%

Overall, we do not see a statistically significant perforance impact. Flate
actually reports a speedup, but with a p-value of 0.043, it's quite close
to the significance threshold (which is fairly lenient). In my opinion,
this is likely due to chance.

Fixes #51603

Change-Id: I7f439730be45e02c7f799df768590ef78e321952
Reviewed-on: https://go-review.googlesource.com/c/go/+/676816
Reviewed-by: Robert Griesemer <gri@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-08-06 11:05:55 -07:00
Keith Randall
72e8237cc1 cmd/compile: allow more args in StructMake folding rule
imakeOfStructMake does the right thing, but we never call it
when the StructMake has more than one argument.

Fixes #74908

Change-Id: Ib4b1a025bfb1fa69a325207e47b74bd6217092bf
Reviewed-on: https://go-review.googlesource.com/c/go/+/693615
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-08-06 11:04:07 -07:00
Joel Sing
3406a617d9 internal/bytealg: vector implementation of indexbyte for riscv64
Provide a vector implementation of indexbyte for riscv64, which is used
when compiled with the rva23u64 profile, or when vector is detected
to be available. Inputs that are smaller than 24 bytes will continue
to use the non-vector path.

On a Banana Pi F3, with GORISCV64=rva23u64:

                │  indexbyte.1  │             indexbyte.2              │
                │    sec/op     │    sec/op     vs base                │
IndexByte/10-8     52.68n ±  0%   47.26n ±  0%  -10.30% (p=0.000 n=10)
IndexByte/32-8     68.62n ±  0%   47.02n ±  0%  -31.49% (p=0.000 n=10)
IndexByte/4K-8    2217.0n ±  0%   420.4n ±  0%  -81.04% (p=0.000 n=10)
IndexByte/4M-8    2624.4µ ±  0%   767.5µ ±  0%  -70.75% (p=0.000 n=10)
IndexByte/64M-8    68.08m ± 10%   47.84m ± 45%  -29.73% (p=0.004 n=10)
geomean            17.03µ         8.073µ        -52.59%

                │ indexbyte.1  │               indexbyte.2               │
                │     B/s      │      B/s        vs base                 │
IndexByte/10-8    181.0Mi ± 0%    201.8Mi ±  0%   +11.48% (p=0.000 n=10)
IndexByte/32-8    444.7Mi ± 0%    649.1Mi ±  0%   +45.97% (p=0.000 n=10)
IndexByte/4K-8    1.721Gi ± 0%    9.076Gi ±  0%  +427.51% (p=0.000 n=10)
IndexByte/4M-8    1.488Gi ± 0%    5.089Gi ±  0%  +241.93% (p=0.000 n=10)
IndexByte/64M-8   940.3Mi ± 9%   1337.8Mi ± 31%   +42.27% (p=0.004 n=10)
geomean           727.1Mi         1.498Gi        +110.94%

Change-Id: If7b0dbef38d76fa7a2021e4ecaed668a1d4b9783
Reviewed-on: https://go-review.googlesource.com/c/go/+/648856
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Meng Zhuo <mengzhuo1203@gmail.com>
Reviewed-by: Mark Freeman <markfreeman@google.com>
Reviewed-by: Mark Ryan <markdryan@rivosinc.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
2025-08-06 06:23:02 -07:00
Joel Sing
75ea2d05c0 internal/bytealg: vector implementation of equal for riscv64
Provide a vector implementation of equal for riscv64, which is used
when compiled with the rva23u64 profile, or when vector is detected
to be available. Inputs that are 8 byte aligned will still be handled
via a the non-vector code if the length is less than or equal to 64
bytes.

On a Banana Pi F3, with GORISCV64=rva23u64:

                                │   equal.1    │               equal.2                │
                                │    sec/op    │    sec/op     vs base                │
Equal/0-8                         1.254n ±  0%   1.254n ±  0%        ~ (p=1.000 n=10)
Equal/same/1-8                    21.32n ±  0%   21.32n ±  0%        ~ (p=0.466 n=10)
Equal/same/6-8                    21.32n ±  0%   21.32n ±  0%        ~ (p=0.689 n=10)
Equal/same/9-8                    21.32n ±  0%   21.32n ±  0%        ~ (p=0.861 n=10)
Equal/same/15-8                   21.32n ±  0%   21.32n ±  0%        ~ (p=0.657 n=10)
Equal/same/16-8                   21.32n ±  0%   21.33n ±  0%        ~ (p=0.075 n=10)
Equal/same/20-8                   21.32n ±  0%   21.32n ±  0%        ~ (p=0.249 n=10)
Equal/same/32-8                   21.32n ±  0%   21.32n ±  0%        ~ (p=0.303 n=10)
Equal/same/4K-8                   21.32n ±  0%   21.32n ±  0%        ~ (p=1.000 n=10)
Equal/same/4M-8                   21.32n ±  0%   21.32n ±  0%        ~ (p=0.582 n=10)
Equal/same/64M-8                  21.32n ±  0%   21.32n ±  0%        ~ (p=0.930 n=10)
Equal/1-8                         39.16n ±  1%   38.71n ±  0%   -1.15% (p=0.000 n=10)
Equal/6-8                         51.49n ±  1%   50.40n ±  1%   -2.12% (p=0.000 n=10)
Equal/9-8                         54.46n ±  1%   53.89n ±  0%   -1.04% (p=0.000 n=10)
Equal/15-8                        71.81n ±  1%   70.59n ±  0%   -1.71% (p=0.000 n=10)
Equal/16-8                        69.14n ±  0%   68.21n ±  0%   -1.34% (p=0.000 n=10)
Equal/20-8                        78.59n ±  0%   77.59n ±  0%   -1.26% (p=0.000 n=10)
Equal/32-8                        41.55n ±  0%   41.16n ±  0%   -0.96% (p=0.000 n=10)
Equal/4K-8                        925.5n ±  0%   561.4n ±  1%  -39.34% (p=0.000 n=10)
Equal/4M-8                        3.110m ± 32%   2.463m ± 16%  -20.80% (p=0.000 n=10)
Equal/64M-8                       47.34m ± 30%   39.89m ± 16%  -15.75% (p=0.004 n=10)
EqualBothUnaligned/64_0-8         32.17n ±  1%   32.11n ±  1%        ~ (p=0.184 n=10)
EqualBothUnaligned/64_1-8         79.48n ±  0%   48.24n ±  1%  -39.31% (p=0.000 n=10)
EqualBothUnaligned/64_4-8         72.71n ±  0%   48.37n ±  1%  -33.48% (p=0.000 n=10)
EqualBothUnaligned/64_7-8         77.12n ±  0%   48.16n ±  1%  -37.56% (p=0.000 n=10)
EqualBothUnaligned/4096_0-8       908.4n ±  0%   562.4n ±  2%  -38.09% (p=0.000 n=10)
EqualBothUnaligned/4096_1-8       956.6n ±  0%   571.4n ±  3%  -40.26% (p=0.000 n=10)
EqualBothUnaligned/4096_4-8       949.6n ±  0%   571.6n ±  3%  -39.81% (p=0.000 n=10)
EqualBothUnaligned/4096_7-8       954.2n ±  0%   571.7n ±  3%  -40.09% (p=0.000 n=10)
EqualBothUnaligned/4194304_0-8    2.935m ± 29%   2.664m ± 19%        ~ (p=0.089 n=10)
EqualBothUnaligned/4194304_1-8    3.341m ± 13%   2.896m ± 34%        ~ (p=0.075 n=10)
EqualBothUnaligned/4194304_4-8    3.204m ± 39%   3.352m ± 33%        ~ (p=0.796 n=10)
EqualBothUnaligned/4194304_7-8    3.226m ± 30%   2.737m ± 34%  -15.16% (p=0.043 n=10)
EqualBothUnaligned/67108864_0-8   49.04m ± 17%   39.94m ± 12%  -18.57% (p=0.005 n=10)
EqualBothUnaligned/67108864_1-8   51.96m ± 15%   42.48m ± 15%  -18.23% (p=0.015 n=10)
EqualBothUnaligned/67108864_4-8   47.67m ± 17%   37.85m ± 41%  -20.61% (p=0.035 n=10)
EqualBothUnaligned/67108864_7-8   53.00m ± 22%   38.76m ± 21%  -26.87% (p=0.000 n=10)
CompareBytesEqual-8               51.71n ±  1%   52.00n ±  0%   +0.57% (p=0.002 n=10)
geomean                           1.469µ         1.265µ        -13.93%

                                │    equal.1     │                equal.2                 │
                                │      B/s       │      B/s        vs base                │
Equal/same/1-8                     44.73Mi ±  0%    44.72Mi ±  0%        ~ (p=0.426 n=10)
Equal/same/6-8                     268.3Mi ±  0%    268.4Mi ±  0%        ~ (p=0.753 n=10)
Equal/same/9-8                     402.6Mi ±  0%    402.5Mi ±  0%        ~ (p=0.209 n=10)
Equal/same/15-8                    670.9Mi ±  0%    670.9Mi ±  0%        ~ (p=0.724 n=10)
Equal/same/16-8                    715.6Mi ±  0%    715.4Mi ±  0%   -0.04% (p=0.022 n=10)
Equal/same/20-8                    894.6Mi ±  0%    894.5Mi ±  0%        ~ (p=0.060 n=10)
Equal/same/32-8                    1.398Gi ±  0%    1.398Gi ±  0%        ~ (p=0.986 n=10)
Equal/same/4K-8                    178.9Gi ±  0%    178.9Gi ±  0%        ~ (p=0.853 n=10)
Equal/same/4M-8                    178.9Ti ±  0%    178.9Ti ±  0%        ~ (p=0.971 n=10)
Equal/same/64M-8                  2862.8Ti ±  0%   2862.6Ti ±  0%        ~ (p=0.971 n=10)
Equal/1-8                          24.35Mi ±  1%    24.63Mi ±  0%   +1.16% (p=0.000 n=10)
Equal/6-8                          111.1Mi ±  1%    113.5Mi ±  1%   +2.17% (p=0.000 n=10)
Equal/9-8                          157.6Mi ±  1%    159.3Mi ±  0%   +1.05% (p=0.000 n=10)
Equal/15-8                         199.2Mi ±  1%    202.7Mi ±  0%   +1.74% (p=0.000 n=10)
Equal/16-8                         220.7Mi ±  0%    223.7Mi ±  0%   +1.36% (p=0.000 n=10)
Equal/20-8                         242.7Mi ±  0%    245.8Mi ±  0%   +1.27% (p=0.000 n=10)
Equal/32-8                         734.3Mi ±  0%    741.6Mi ±  0%   +0.98% (p=0.000 n=10)
Equal/4K-8                         4.122Gi ±  0%    6.795Gi ±  1%  +64.84% (p=0.000 n=10)
Equal/4M-8                         1.258Gi ± 24%    1.586Gi ± 14%  +26.12% (p=0.000 n=10)
Equal/64M-8                        1.320Gi ± 23%    1.567Gi ± 14%  +18.69% (p=0.004 n=10)
EqualBothUnaligned/64_0-8          1.853Gi ±  1%    1.856Gi ±  1%        ~ (p=0.190 n=10)
EqualBothUnaligned/64_1-8          767.9Mi ±  0%   1265.2Mi ±  1%  +64.76% (p=0.000 n=10)
EqualBothUnaligned/64_4-8          839.4Mi ±  0%   1261.9Mi ±  1%  +50.33% (p=0.000 n=10)
EqualBothUnaligned/64_7-8          791.4Mi ±  0%   1267.5Mi ±  1%  +60.16% (p=0.000 n=10)
EqualBothUnaligned/4096_0-8        4.199Gi ±  0%    6.784Gi ±  2%  +61.54% (p=0.000 n=10)
EqualBothUnaligned/4096_1-8        3.988Gi ±  0%    6.676Gi ±  3%  +67.40% (p=0.000 n=10)
EqualBothUnaligned/4096_4-8        4.017Gi ±  0%    6.674Gi ±  3%  +66.14% (p=0.000 n=10)
EqualBothUnaligned/4096_7-8        3.998Gi ±  0%    6.673Gi ±  3%  +66.92% (p=0.000 n=10)
EqualBothUnaligned/4194304_0-8     1.332Gi ± 22%    1.468Gi ± 16%        ~ (p=0.089 n=10)
EqualBothUnaligned/4194304_1-8     1.169Gi ± 12%    1.350Gi ± 25%        ~ (p=0.075 n=10)
EqualBothUnaligned/4194304_4-8     1.222Gi ± 28%    1.165Gi ± 48%        ~ (p=0.796 n=10)
EqualBothUnaligned/4194304_7-8     1.211Gi ± 23%    1.427Gi ± 26%  +17.88% (p=0.043 n=10)
EqualBothUnaligned/67108864_0-8    1.274Gi ± 14%    1.567Gi ± 14%  +22.97% (p=0.005 n=10)
EqualBothUnaligned/67108864_1-8    1.204Gi ± 14%    1.471Gi ± 13%  +22.18% (p=0.015 n=10)
EqualBothUnaligned/67108864_4-8    1.311Gi ± 14%    1.651Gi ± 29%  +25.92% (p=0.035 n=10)
EqualBothUnaligned/67108864_7-8    1.179Gi ± 18%    1.612Gi ± 17%  +36.73% (p=0.000 n=10)
geomean                            1.870Gi          2.190Gi        +17.16%

Change-Id: I9c5270bcc6997d020a96d1e97c7e7cfc7ca7fd34
Reviewed-on: https://go-review.googlesource.com/c/go/+/646736
Reviewed-by: Mark Ryan <markdryan@rivosinc.com>
Reviewed-by: Meng Zhuo <mengzhuo1203@gmail.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Mark Freeman <markfreeman@google.com>
2025-08-06 06:18:46 -07:00
Julian Zhu
17a8be7117 crypto/sha512: use const table for key loading on loong64
Load constant keys from a static memory table rather than loading immediates into registers on loong64.

Benchmark for Loongson-3A5000:
goos: linux
goarch: loong64
pkg: crypto/sha512
cpu: Loongson-3A5000-HV @ 2500.00MHz
                    │   sha512o   │              sha512n            │
                    │   sec/op    │   sec/op     vs base            │
Hash8Bytes/New-4      489.1n ± 0%   464.7n ± 0%  -5.00% (p=0.000 n=8)
Hash8Bytes/Sum384-4   499.1n ± 0%   474.6n ± 0%  -4.92% (p=0.000 n=8)
Hash8Bytes/Sum512-4   506.6n ± 0%   481.9n ± 0%  -4.86% (p=0.000 n=8)
Hash1K/New-4          3.371µ ± 0%   3.152µ ± 0%  -6.51% (p=0.000 n=8)
Hash1K/Sum384-4       3.385µ ± 0%   3.164µ ± 0%  -6.53% (p=0.000 n=8)
Hash1K/Sum512-4       3.392µ ± 0%   3.170µ ± 0%  -6.54% (p=0.000 n=8)
Hash8K/New-4          23.62µ ± 0%   22.01µ ± 0%  -6.82% (p=0.000 n=8)
Hash8K/Sum384-4       23.63µ ± 0%   22.02µ ± 0%  -6.82% (p=0.000 n=8)
Hash8K/Sum512-4       23.64µ ± 0%   22.02µ ± 0%  -6.86% (p=0.000 n=8)
geomean               3.415µ        3.207µ       -6.10%

                    │   sha512o    │              sha512n            │
                    │     B/s      │     B/s       vs base           │
Hash8Bytes/New-4     15.60Mi ± 0%   16.42Mi ± 0%  +5.29% (p=0.000 n=8)
Hash8Bytes/Sum384-4  15.29Mi ± 0%   16.08Mi ± 0%  +5.18% (p=0.000 n=8)
Hash8Bytes/Sum512-4  15.06Mi ± 0%   15.83Mi ± 0%  +5.13% (p=0.000 n=8)
Hash1K/New-4         289.7Mi ± 0%   309.9Mi ± 0%  +6.97% (p=0.000 n=8)
Hash1K/Sum384-4      288.5Mi ± 0%   308.6Mi ± 0%  +6.97% (p=0.000 n=8)
Hash1K/Sum512-4      287.9Mi ± 0%   308.0Mi ± 0%  +7.00% (p=0.000 n=8)
Hash8K/New-4         330.8Mi ± 0%   355.0Mi ± 0%  +7.32% (p=0.000 n=8)
Hash8K/Sum384-4      330.6Mi ± 0%   354.9Mi ± 0%  +7.32% (p=0.000 n=8)
Hash8K/Sum512-4      330.5Mi ± 0%   354.8Mi ± 0%  +7.36% (p=0.000 n=8)
geomean              113.5Mi        120.9Mi       +6.50%

Benchmark for Loongson-3A6000:
goos: linux
goarch: loong64
pkg: crypto/sha512
cpu: Loongson-3A6000 @ 2500.00MHz
                    │ sha512.old  │             sha512.new           │
                    │   sec/op    │   sec/op     vs base             │
Hash8Bytes/New-8      397.2n ± 0%   380.6n ± 0%  -4.17% (p=0.000 n=10)
Hash8Bytes/Sum384-8   406.1n ± 0%   397.9n ± 0%  -2.02% (p=0.000 n=10)
Hash8Bytes/Sum512-8   410.1n ± 0%   395.8n ± 1%  -3.50% (p=0.000 n=10)
Hash1K/New-8          2.932µ ± 0%   2.800µ ± 0%  -4.50% (p=0.000 n=10)
Hash1K/Sum384-8       2.941µ ± 0%   2.812µ ± 0%  -4.39% (p=0.000 n=10)
Hash1K/Sum512-8       2.947µ ± 0%   2.814µ ± 0%  -4.50% (p=0.000 n=10)
Hash8K/New-8          20.68µ ± 0%   19.73µ ± 1%  -4.58% (p=0.000 n=10)
Hash8K/Sum384-8       20.69µ ± 0%   19.73µ ± 0%  -4.62% (p=0.000 n=10)
Hash8K/Sum512-8       20.70µ ± 0%   19.75µ ± 0%  -4.60% (p=0.000 n=10)
geomean               2.908µ        2.789µ       -4.10%

                    │  sha512.old  │             sha512.new          │
                    │     B/s      │     B/s       vs base           │
Hash8Bytes/New-8    19.21Mi ± 0%   20.05Mi ± 0%  +4.37% (p=0.000 n=10)
Hash8Bytes/Sum384-8 18.79Mi ± 0%   19.18Mi ± 0%  +2.08% (p=0.000 n=10)
Hash8Bytes/Sum512-8 18.60Mi ± 0%   19.28Mi ± 1%  +3.64% (p=0.000 n=10)
Hash1K/New-8        333.1Mi ± 0%   348.8Mi ± 0%  +4.71% (p=0.000 n=10)
Hash1K/Sum384-8     332.0Mi ± 0%   347.3Mi ± 0%  +4.60% (p=0.000 n=10)
Hash1K/Sum512-8     331.5Mi ± 0%   347.0Mi ± 0%  +4.69% (p=0.000 n=10)
Hash8K/New-8        377.8Mi ± 0%   396.0Mi ± 1%  +4.80% (p=0.000 n=10)
Hash8K/Sum384-8     377.7Mi ± 0%   396.0Mi ± 0%  +4.85% (p=0.000 n=10)
Hash8K/Sum512-8     377.5Mi ± 0%   395.7Mi ± 0%  +4.82% (p=0.000 n=10)
geomean             133.3Mi        139.0Mi       +4.28%

Change-Id: I55ae4a8e4b0c51a98583f654158235fe738cf348
Reviewed-on: https://go-review.googlesource.com/c/go/+/678436
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
Reviewed-by: Mark Freeman <markfreeman@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
2025-08-05 18:02:52 -07:00
Julian Zhu
dda9d780e2 crypto/sha256: use const table for key loading on loong64
Load constant keys from a static memory table rather than loading immediates into registers on loong64.

Benchmark for Loongson-3A5000:
goos: linux
goarch: loong64
pkg: crypto/sha256
cpu: Loongson-3A5000-HV @ 2500.00MHz
                    │   sha256o   │              sha256n              │
                    │   sec/op    │   sec/op     vs base              │
Hash8Bytes/New-4      356.1n ± 0%   347.0n ± 0%  -2.54% (p=0.000 n=8)
Hash8Bytes/Sum224-4   368.7n ± 0%   359.5n ± 0%  -2.50% (p=0.000 n=8)
Hash8Bytes/Sum256-4   367.7n ± 0%   358.9n ± 0%  -2.41% (p=0.000 n=8)
Hash1K/New-4          4.741µ ± 0%   4.578µ ± 0%  -3.44% (p=0.000 n=8)
Hash1K/Sum224-4       4.755µ ± 0%   4.591µ ± 0%  -3.44% (p=0.000 n=8)
Hash1K/Sum256-4       4.753µ ± 0%   4.589µ ± 0%  -3.46% (p=0.000 n=8)
Hash8K/New-4          35.42µ ± 0%   34.19µ ± 0%  -3.45% (p=0.000 n=8)
Hash8K/Sum224-4       35.43µ ± 0%   34.21µ ± 0%  -3.44% (p=0.000 n=8)
Hash8K/Sum256-4       35.46µ ± 0%   34.22µ ± 0%  -3.48% (p=0.000 n=8)
Hash256K/New-4        1.138m ± 0%   1.098m ± 0%  -3.54% (p=0.000 n=8)
Hash256K/Sum224-4     1.138m ± 0%   1.098m ± 0%  -3.53% (p=0.000 n=8)
Hash256K/Sum256-4     1.139m ± 0%   1.099m ± 0%  -3.48% (p=0.000 n=8)
Hash1M/New-4          4.488m ± 0%   4.388m ± 0%  -2.22% (p=0.000 n=8)
Hash1M/Sum224-4       4.488m ± 0%   4.387m ± 0%  -2.24% (p=0.000 n=8)
Hash1M/Sum256-4       4.489m ± 0%   4.388m ± 0%  -2.25% (p=0.000 n=8)
geomean               50.02µ        48.50µ       -3.03%

                    │   sha256o    │              sha256n               │
                    │     B/s      │     B/s       vs base              │
Hash8Bytes/New-4      21.42Mi ± 0%   21.99Mi ± 0%  +2.63% (p=0.000 n=8)
Hash8Bytes/Sum224-4   20.69Mi ± 0%   21.22Mi ± 0%  +2.56% (p=0.000 n=8)
Hash8Bytes/Sum256-4   20.74Mi ± 0%   21.26Mi ± 0%  +2.48% (p=0.000 n=8)
Hash1K/New-4          206.0Mi ± 0%   213.3Mi ± 0%  +3.57% (p=0.000 n=8)
Hash1K/Sum224-4       205.4Mi ± 0%   212.7Mi ± 0%  +3.57% (p=0.000 n=8)
Hash1K/Sum256-4       205.5Mi ± 0%   212.8Mi ± 0%  +3.58% (p=0.000 n=8)
Hash8K/New-4          220.6Mi ± 0%   228.5Mi ± 0%  +3.58% (p=0.000 n=8)
Hash8K/Sum224-4       220.5Mi ± 0%   228.4Mi ± 0%  +3.56% (p=0.000 n=8)
Hash8K/Sum256-4       220.3Mi ± 0%   228.3Mi ± 0%  +3.61% (p=0.000 n=8)
Hash256K/New-4        219.7Mi ± 0%   227.7Mi ± 0%  +3.67% (p=0.000 n=8)
Hash256K/Sum224-4     219.6Mi ± 0%   227.6Mi ± 0%  +3.66% (p=0.000 n=8)
Hash256K/Sum256-4     219.6Mi ± 0%   227.5Mi ± 0%  +3.60% (p=0.000 n=8)
Hash1M/New-4          222.8Mi ± 0%   227.9Mi ± 0%  +2.27% (p=0.000 n=8)
Hash1M/Sum224-4       222.8Mi ± 0%   227.9Mi ± 0%  +2.29% (p=0.000 n=8)
Hash1M/Sum256-4       222.8Mi ± 0%   227.9Mi ± 0%  +2.30% (p=0.000 n=8)
geomean               136.0Mi        140.2Mi       +3.13%

Benchmark for Loongson-3A6000:
goos: linux
goarch: loong64
pkg: crypto/sha256
cpu: Loongson-3A6000 @ 2500.00MHz
                    │ sha256.old  │             sha256.new             │
                    │   sec/op    │   sec/op     vs base               │
Hash8Bytes/New-8      294.5n ± 0%   288.6n ± 0%  -2.00% (p=0.000 n=10)
Hash8Bytes/Sum224-8   305.0n ± 0%   299.7n ± 0%  -1.74% (p=0.000 n=10)
Hash8Bytes/Sum256-8   302.0n ± 0%   296.8n ± 0%  -1.74% (p=0.000 n=10)
Hash1K/New-8          4.186µ ± 0%   4.096µ ± 0%  -2.15% (p=0.000 n=10)
Hash1K/Sum224-8       4.193µ ± 0%   4.104µ ± 0%  -2.12% (p=0.000 n=10)
Hash1K/Sum256-8       4.194µ ± 0%   4.108µ ± 0%  -2.04% (p=0.000 n=10)
Hash8K/New-8          31.44µ ± 0%   30.76µ ± 0%  -2.17% (p=0.000 n=10)
Hash8K/Sum224-8       31.45µ ± 0%   30.79µ ± 0%  -2.10% (p=0.000 n=10)
Hash8K/Sum256-8       31.45µ ± 0%   30.78µ ± 0%  -2.12% (p=0.000 n=10)
Hash256K/New-8        996.7µ ± 0%   975.6µ ± 0%  -2.12% (p=0.000 n=10)
Hash256K/Sum224-8     996.8µ ± 0%   975.8µ ± 0%  -2.11% (p=0.000 n=10)
Hash256K/Sum256-8     996.8µ ± 0%   975.6µ ± 0%  -2.12% (p=0.000 n=10)
Hash1M/New-8          3.987m ± 0%   3.904m ± 0%  -2.08% (p=0.000 n=10)
Hash1M/Sum224-8       3.990m ± 0%   3.902m ± 0%  -2.20% (p=0.000 n=10)
Hash1M/Sum256-8       3.987m ± 0%   3.903m ± 0%  -2.10% (p=0.000 n=10)
geomean               43.59µ        42.69µ       -2.06%

                    │  sha256.old  │             sha256.new              │
                    │     B/s      │     B/s       vs base               │
Hash8Bytes/New-8      25.90Mi ± 0%   26.44Mi ± 0%  +2.06% (p=0.000 n=10)
Hash8Bytes/Sum224-8   25.01Mi ± 0%   25.46Mi ± 0%  +1.77% (p=0.000 n=10)
Hash8Bytes/Sum256-8   25.26Mi ± 0%   25.72Mi ± 0%  +1.79% (p=0.000 n=10)
Hash1K/New-8          233.3Mi ± 0%   238.5Mi ± 0%  +2.19% (p=0.000 n=10)
Hash1K/Sum224-8       232.9Mi ± 0%   238.0Mi ± 0%  +2.17% (p=0.000 n=10)
Hash1K/Sum256-8       232.9Mi ± 0%   237.7Mi ± 0%  +2.07% (p=0.000 n=10)
Hash8K/New-8          248.5Mi ± 0%   254.0Mi ± 0%  +2.22% (p=0.000 n=10)
Hash8K/Sum224-8       248.4Mi ± 0%   253.7Mi ± 0%  +2.14% (p=0.000 n=10)
Hash8K/Sum256-8       248.4Mi ± 0%   253.8Mi ± 0%  +2.17% (p=0.000 n=10)
Hash256K/New-8        250.8Mi ± 0%   256.3Mi ± 0%  +2.17% (p=0.000 n=10)
Hash256K/Sum224-8     250.8Mi ± 0%   256.2Mi ± 0%  +2.16% (p=0.000 n=10)
Hash256K/Sum256-8     250.8Mi ± 0%   256.2Mi ± 0%  +2.17% (p=0.000 n=10)
Hash1M/New-8          250.8Mi ± 0%   256.2Mi ± 0%  +2.12% (p=0.000 n=10)
Hash1M/Sum224-8       250.6Mi ± 0%   256.3Mi ± 0%  +2.25% (p=0.000 n=10)
Hash1M/Sum256-8       250.8Mi ± 0%   256.2Mi ± 0%  +2.14% (p=0.000 n=10)
geomean               156.0Mi        159.3Mi       +2.11%

Change-Id: Ib72cf3c746d4ad73e52e5d31f6b4a834fd36d934
Reviewed-on: https://go-review.googlesource.com/c/go/+/678435
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
Reviewed-by: Mark Freeman <markfreeman@google.com>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
2025-08-05 18:02:46 -07:00
Xiaolin Zhao
5defe8ebb3 internal/chacha8rand: replace WORD with instruction VMOVQ
Change-Id: I5d05af4d071b4b0ee60fafbd2a39494128bdf3f1
Reviewed-on: https://go-review.googlesource.com/c/go/+/682896
Reviewed-by: Mark Freeman <markfreeman@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
2025-08-05 18:02:27 -07:00
limeidan
4c7362e41c cmd/internal/obj/loong64: add new instructions ALSL{W/WU/V} for loong64
Go asm syntax:
	ALSL{W/WU/V}	$3, R4, R5, R6

Equivalent platform assembler syntax:
	alsl.{w/wu/d}	$r6, $r4, $r5, 3

Change-Id: Ic8364dfe2753bcea7de6cffe656ca0dde6875766
Reviewed-on: https://go-review.googlesource.com/c/go/+/692136
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: Mark Freeman <markfreeman@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
2025-08-05 18:02:17 -07:00
Xiaolin Zhao
a552737418 cmd/compile: fold negation into multiplication on loong64
This change also add corresponding benchmark tests and codegen tests.
The performance improvement on CPU Loongson-3A6000-HV is as follows:

goos: linux
goarch: loong64
pkg: cmd/compile/internal/test
cpu: Loongson-3A6000-HV @ 2500.00MHz
        |  bench.old   |              bench.new              |
        |    sec/op    |   sec/op     vs base                |
MulNeg     828.4n ± 0%   655.9n ± 0%  -20.82% (p=0.000 n=10)
Mul2Neg   1062.0n ± 0%   826.8n ± 0%  -22.15% (p=0.000 n=10)
geomean    938.0n        736.4n       -21.49%

Change-Id: Ia999732880ec65be0c66cddc757a4868847e5b15
Reviewed-on: https://go-review.googlesource.com/c/go/+/682535
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Mark Freeman <markfreeman@google.com>
2025-08-05 18:02:06 -07:00
Tobias Klauser
e1fd4faf91 runtime: fix godoc comment for inVDSOPage
Change-Id: I7dcab0c915a748e52c5c689c1cb774f486d2b9e6
Reviewed-on: https://go-review.googlesource.com/c/go/+/693195
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Auto-Submit: Tobias Klauser <tobias.klauser@gmail.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2025-08-05 14:31:35 -07:00
Keith Randall
bcd25c79aa cmd/compile: allow StructSelect [x] of interface data fields for x>0
As of CL 681937 we can now have structs which are pointer shaped, but
their pointer field is not the first field, like struct{ struct{}; *int }.

Fixes #74888

Change-Id: Idc80f6b1abde3ae01437e2a9cadb5aa23d04b806
Reviewed-on: https://go-review.googlesource.com/c/go/+/693415
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-08-05 14:30:13 -07:00
Mark Freeman
b0945a54b5 cmd/dist, internal/platform: mark freebsd/riscv64 broken
It seems we have a builder, but it is not running correctly. Until
then, we should mark this port broken.

For #74734
For #74735

Change-Id: I536d037a43499cbd033fb6ebdf004a3df76332ab
Reviewed-on: https://go-review.googlesource.com/c/go/+/691835
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
2025-08-05 14:10:52 -07:00
Austin Clements
55d961b202 runtime: save AVX2 and AVX-512 state on asynchronous preemption
Based on CL 669415 by shaojunyang@google.com.

This is a cherry-pick of CL 680900 from the dev.simd branch.

Change-Id: I574f15c3b18a7179a1573aaf567caf18d8602ef1
Reviewed-on: https://go-review.googlesource.com/c/go/+/693397
Reviewed-by: Cherry Mui <cherryyz@google.com>
Auto-Submit: Austin Clements <austin@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-08-05 14:00:15 -07:00
Austin Clements
af0c4fe2ca runtime: save scalar registers off stack in amd64 async preemption
Asynchronous preemption must save all registers that could be in use
by Go code. Currently, it saves all of these to the goroutine stack.
As a result, the stack frame requirements of asynchronous preemption
can be rather high. On amd64, this requires 368 bytes of stack space,
most of which is the XMM registers. Several RISC architectures are
around 0.5 KiB.

As we add support for SIMD instructions, this is going to become a
problem. The AVX-512 register state is 2.5 KiB. This well exceeds the
nosplit limit, and even if it didn't, could constrain when we can
asynchronously preempt goroutines on small stacks.

This CL fixes this by moving pure scalar state stored in non-GP
registers off the stack and into an allocated "extended register
state" object. To reduce space overhead, we only allocate these
objects as needed. While in the theoretical limit, every G could need
this register state, in practice very few do at a time.

However, we can't allocate when we're in the middle of saving the
register state during an asynchronous preemption, so we reserve
scratch space on every P to temporarily store the register state,
which can then be copied out to an allocated state object later by Go
code.

This commit only implements this for amd64, since that's where we're
about to add much more vector state, but it lays the groundwork for
doing this on any architecture that could benefit.

This is a cherry-pick of CL 680898 plus bug fix CL 684836 from the
dev.simd branch.

Change-Id: I123a95e21c11d5c10942d70e27f84d2d99bbf735
Reviewed-on: https://go-review.googlesource.com/c/go/+/669195
Auto-Submit: Austin Clements <austin@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2025-08-05 13:58:52 -07:00
Austin Clements
e73afaae69 internal/cpu: add AVX-512-CD and DQ, and derived "basic AVX-512"
This adds detection for the CD and DQ sub-features of x86 AVX-512.

Building on these, we also add a "derived" AVX-512 feature that
bundles together the basic usable subset of subfeatures. Despite the F
in AVX-512-F standing for "foundation", AVX-512-F+BW+DQ+VL together
really form the basic usable subset of AVX-512 functionality. These
have also all been supported together by almost every CPU, and are
guaranteed by GOAMD64=v4, so there's little point in separating them
out.

This is a cherry-pick of CL 680899 from the dev.simd branch.

Change-Id: I34356502bd1853ba2372e48db0b10d55cffe07a1
Reviewed-on: https://go-review.googlesource.com/c/go/+/693396
Reviewed-by: Cherry Mui <cherryyz@google.com>
Auto-Submit: Austin Clements <austin@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-08-05 13:57:45 -07:00
Austin Clements
cef381ba60 runtime: eliminate global state in mkpreempt.go
We're going to start writing two files, so having a single global file
we're writing will be a problem.

This has no effect on the generated code.

This is a cherry-pick of CL 680897 from the dev.simd branch.

Change-Id: I49897ea0c6500a29eac89b597d75c0eb3e9b6706
Reviewed-on: https://go-review.googlesource.com/c/go/+/693395
Auto-Submit: Austin Clements <austin@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-08-05 13:55:01 -07:00
Mateusz Poliwczak
c0025d5e0b go/parser: correct comment in expectedErrors
If `here` were already the start of the comment, then
the `pos = here` assignment would be redundant. Since pos
is already the start of the comment.

Change-Id: I793334988951ae5441327cb62d7524b423155b74
Reviewed-on: https://go-review.googlesource.com/c/go/+/693295
Reviewed-by: Alan Donovan <adonovan@google.com>
Commit-Queue: Alan Donovan <adonovan@google.com>
Reviewed-by: Robert Findley <rfindley@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Alan Donovan <adonovan@google.com>
2025-08-05 10:41:10 -07:00
qiulaidongfeng
4ee0df8c46 cmd: remove dead code
Fixes #74076

Change-Id: Icc67b3d4e342f329584433bd1250c56ae8f5a73d
Reviewed-on: https://go-review.googlesource.com/c/go/+/690635
Reviewed-by: Alan Donovan <adonovan@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Commit-Queue: Alan Donovan <adonovan@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Alan Donovan <adonovan@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
2025-08-05 10:31:25 -07:00
Michael Pratt
a2c45f0eb1 runtime: test VDSO symbol hash values
In addition to verifying existing values, this makes it easier to add a
new one by adding an invalid entry and running the test.

Change-Id: I6a6a636c9c413add29884e4f6759196f4db34de7
Reviewed-on: https://go-review.googlesource.com/c/go/+/693276
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2025-08-05 09:52:44 -07:00
Keith Randall
cd55f86b8d cmd/compile: allow multi-field structs to be stored directly in interfaces
If the struct is a bunch of 0-sized fields and one pointer field.

Fixes #74092

Change-Id: I87c5d162c8c9fdba812420d7f9d21de97295b62c
Reviewed-on: https://go-review.googlesource.com/c/go/+/681937
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@google.com>
2025-08-05 09:18:31 -07:00
Keith Randall
21ab0128b6 cmd/compile: remove support for old-style bounds check calls
This CL rips out the support for old-style assembly stubs.

We need to keep the Go stubs for wasm support.

Change-Id: I23d6d9f2f06be1ded8d22b3e0ef04ff6e252a587
Reviewed-on: https://go-review.googlesource.com/c/go/+/682402
Reviewed-by: Austin Clements <austin@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2025-08-05 08:59:28 -07:00
Keith Randall
802d056c78 cmd/compile: move ppc64 over to new bounds check strategy
Change-Id: I25a9bbc247b2490e7e37ed843386f53a71822146
Reviewed-on: https://go-review.googlesource.com/c/go/+/682498
Reviewed-by: Paul Murphy <paumurph@redhat.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Keith Randall <khr@google.com>
2025-08-05 08:59:16 -07:00
Daniel Morsing
a3295df873 cmd/compile/internal/ssa: Use transitive properties for len/cap
Remove the special casing for len/cap and rely on the posets.

After removing the special logic, I ran `go build -gcflags='-d
ssa/prove/debug=2' all` to verify my results. During this, I found 2
common cases where the old implicit unsigned->signed domain conversion
made proving a branch possible that shouldn't be strictly possible and
added these.

The 2 cases are shifting a non-negative signed integer and unsigned
comparisons that happen with arguments that fits entirely inside the
unsigned argument

Change-Id: Ic88049ff69efc5602fc15f5dad02028e704f5483
Reviewed-on: https://go-review.googlesource.com/c/go/+/679155
Reviewed-by: Mark Freeman <mark@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@golang.org>
2025-08-05 08:58:11 -07:00
Derek Parker
bd082857a5 doc: fix typo in go memory model doc
Fixes a typo where originally "may by" was written where the intent was
"may be".

Change-Id: Ia5ba51a966506395c41b17ca28d59f63bd487f3e
Reviewed-on: https://go-review.googlesource.com/c/go/+/693075
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Auto-Submit: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Bypass: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-08-05 08:41:37 -07:00
Cuong Manh Le
2b622b05a9 cmd/compile: remove isUintXPowerOfTwo functions
And use the generic version instead.

While at it, also correct the corresponding rules to use logXu variants
instead of logXu, following discussion in CL 689815.

Change-Id: Iba85d14ff0e26d45a126764e7bd5702586358d23
Reviewed-on: https://go-review.googlesource.com/c/go/+/692917
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
2025-08-05 08:37:45 -07:00
Cuong Manh Le
72147ffa75 cmd/compile: simplify isUintXPowerOfTwo implementation
By calling isUnsignedPowerOfTwo instead of duplicating the same ones.

Change-Id: I1e29d3b7eda1bc8773fcd25728d8f508ae633ac9
Reviewed-on: https://go-review.googlesource.com/c/go/+/692916
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-08-05 08:34:05 -07:00
Cuong Manh Le
26da1199eb cmd/compile: make isUint{32,64}PowerOfTwo implementations clearer
Since these functions cast the input to uint64, so the result always
non-negative. The condition should be changed to comparing with zero,
thus maaking it clearer to reader, and open room for simplifying in the
future by using the generic isUnsignedPowerOfTwo function.

Separated this change, so it's easier to do bisecting if there's any
problems happened.

Change-Id: Ibec28c2590f4c52caa36384b710d526459725e49
Reviewed-on: https://go-review.googlesource.com/c/go/+/692915
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
2025-08-05 08:32:51 -07:00
Cuong Manh Le
5ab9f23977 cmd/compile, runtime: add checkptr instrumentation for unsafe.Add
Fixes #74431

Change-Id: Id651ea0b82599ccaff8816af0a56ddbb149b6f89
Reviewed-on: https://go-review.googlesource.com/c/go/+/692015
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@golang.org>
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: t hepudds <thepudds1460@gmail.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
2025-08-05 08:28:50 -07:00
Michael Munday
fcc036f03b cmd/compile: optimise float <-> int register moves on riscv64
Use the FMV* instructions to move values between the floating point and
integer register files.

Note: I'm unsure why there is a slowdown in the Float32bits benchmark,
I've checked and an FMVXS instruction is being used as expected. There
are multiple loads and other instructions in the main loop.

goos: linux
goarch: riscv64
pkg: math
cpu: Spacemit(R) X60
                    │ fmv-before.txt │            fmv-after.txt            │
                    │     sec/op     │   sec/op     vs base                │
Acos                     122.7n ± 0%   122.7n ± 0%        ~ (p=1.000 n=10)
Acosh                    197.2n ± 0%   191.5n ± 0%   -2.89% (p=0.000 n=10)
Asin                     122.7n ± 0%   122.7n ± 0%        ~ (p=0.474 n=10)
Asinh                    231.0n ± 0%   224.1n ± 0%   -2.99% (p=0.000 n=10)
Atan                     91.39n ± 0%   91.41n ± 0%        ~ (p=0.465 n=10)
Atanh                    210.3n ± 0%   203.4n ± 0%   -3.26% (p=0.000 n=10)
Atan2                    149.6n ± 0%   149.6n ± 0%        ~ (p=0.721 n=10)
Cbrt                     176.5n ± 0%   165.9n ± 0%   -6.01% (p=0.000 n=10)
Ceil                     25.67n ± 0%   24.42n ± 0%   -4.87% (p=0.000 n=10)
Copysign                 3.756n ± 0%   3.756n ± 0%        ~ (p=0.149 n=10)
Cos                      95.15n ± 0%   95.15n ± 0%        ~ (p=0.374 n=10)
Cosh                     228.6n ± 0%   224.7n ± 0%   -1.71% (p=0.000 n=10)
Erf                      115.2n ± 0%   115.2n ± 0%        ~ (p=0.474 n=10)
Erfc                     116.4n ± 0%   116.4n ± 0%        ~ (p=0.628 n=10)
Erfinv                   133.3n ± 0%   133.3n ± 0%        ~ (p=1.000 n=10)
Erfcinv                  133.3n ± 0%   133.3n ± 0%        ~ (p=1.000 n=10)
Exp                      194.1n ± 0%   190.3n ± 0%   -1.93% (p=0.000 n=10)
ExpGo                    204.7n ± 0%   200.3n ± 0%   -2.15% (p=0.000 n=10)
Expm1                    137.7n ± 0%   135.2n ± 0%   -1.82% (p=0.000 n=10)
Exp2                     173.4n ± 0%   169.0n ± 0%   -2.54% (p=0.000 n=10)
Exp2Go                   182.8n ± 0%   178.4n ± 0%   -2.41% (p=0.000 n=10)
Abs                      3.756n ± 0%   3.756n ± 0%        ~ (p=0.157 n=10)
Dim                      12.52n ± 0%   12.52n ± 0%        ~ (p=0.737 n=10)
Floor                    25.67n ± 0%   24.42n ± 0%   -4.87% (p=0.000 n=10)
Max                      21.29n ± 0%   20.03n ± 0%   -5.92% (p=0.000 n=10)
Min                      21.28n ± 0%   20.04n ± 0%   -5.85% (p=0.000 n=10)
Mod                      344.9n ± 0%   319.2n ± 0%   -7.45% (p=0.000 n=10)
Frexp                    55.71n ± 0%   48.85n ± 0%  -12.30% (p=0.000 n=10)
Gamma                    165.9n ± 0%   167.8n ± 0%   +1.15% (p=0.000 n=10)
Hypot                    73.24n ± 0%   70.74n ± 0%   -3.41% (p=0.000 n=10)
HypotGo                  84.50n ± 0%   82.63n ± 0%   -2.21% (p=0.000 n=10)
Ilogb                    49.45n ± 0%   45.70n ± 0%   -7.59% (p=0.000 n=10)
J0                       556.5n ± 0%   544.0n ± 0%   -2.25% (p=0.000 n=10)
J1                       555.3n ± 0%   542.8n ± 0%   -2.24% (p=0.000 n=10)
Jn                       1.181µ ± 0%   1.156µ ± 0%   -2.12% (p=0.000 n=10)
Ldexp                    59.47n ± 0%   53.84n ± 0%   -9.47% (p=0.000 n=10)
Lgamma                   167.2n ± 0%   154.6n ± 0%   -7.51% (p=0.000 n=10)
Log                      160.9n ± 0%   154.6n ± 0%   -3.92% (p=0.000 n=10)
Logb                     49.45n ± 0%   45.70n ± 0%   -7.58% (p=0.000 n=10)
Log1p                    147.1n ± 0%   137.1n ± 0%   -6.80% (p=0.000 n=10)
Log10                    162.1n ± 1%   154.6n ± 0%   -4.63% (p=0.000 n=10)
Log2                     66.99n ± 0%   60.72n ± 0%   -9.36% (p=0.000 n=10)
Modf                     29.42n ± 0%   26.29n ± 0%  -10.64% (p=0.000 n=10)
Nextafter32              41.95n ± 0%   37.88n ± 0%   -9.70% (p=0.000 n=10)
Nextafter64              38.82n ± 0%   33.49n ± 0%  -13.73% (p=0.000 n=10)
PowInt                   252.3n ± 0%   237.3n ± 0%   -5.95% (p=0.000 n=10)
PowFrac                  615.5n ± 0%   589.7n ± 0%   -4.19% (p=0.000 n=10)
Pow10Pos                 10.64n ± 0%   10.64n ± 0%        ~ (p=1.000 n=10)
Pow10Neg                 24.42n ± 0%   15.02n ± 0%  -38.49% (p=0.000 n=10)
Round                    21.91n ± 0%   18.16n ± 0%  -17.12% (p=0.000 n=10)
RoundToEven              24.42n ± 0%   21.29n ± 0%  -12.84% (p=0.000 n=10)
Remainder                308.0n ± 0%   291.2n ± 0%   -5.44% (p=0.000 n=10)
Signbit                  10.02n ± 0%   10.02n ± 0%        ~ (p=1.000 n=10)
Sin                      102.7n ± 0%   102.7n ± 0%        ~ (p=0.211 n=10)
Sincos                   124.0n ± 1%   123.3n ± 0%   -0.56% (p=0.002 n=10)
Sinh                     239.1n ± 0%   234.7n ± 0%   -1.84% (p=0.000 n=10)
SqrtIndirect             2.504n ± 0%   2.504n ± 0%        ~ (p=0.303 n=10)
SqrtLatency              15.03n ± 0%   15.02n ± 0%        ~ (p=0.598 n=10)
SqrtIndirectLatency      15.02n ± 0%   15.02n ± 0%        ~ (p=0.907 n=10)
SqrtGoLatency            165.3n ± 0%   157.2n ± 0%   -4.90% (p=0.000 n=10)
SqrtPrime                3.801µ ± 0%   3.802µ ± 0%        ~ (p=1.000 n=10)
Tan                      125.2n ± 0%   125.2n ± 0%        ~ (p=0.458 n=10)
Tanh                     244.2n ± 0%   239.9n ± 0%   -1.76% (p=0.000 n=10)
Trunc                    25.67n ± 0%   24.42n ± 0%   -4.87% (p=0.000 n=10)
Y0                       550.2n ± 0%   538.1n ± 0%   -2.21% (p=0.000 n=10)
Y1                       552.8n ± 0%   540.6n ± 0%   -2.21% (p=0.000 n=10)
Yn                       1.168µ ± 0%   1.143µ ± 0%   -2.14% (p=0.000 n=10)
Float64bits              8.139n ± 0%   4.385n ± 0%  -46.13% (p=0.000 n=10)
Float64frombits          7.512n ± 0%   3.759n ± 0%  -49.96% (p=0.000 n=10)
Float32bits              8.138n ± 0%   9.393n ± 0%  +15.42% (p=0.000 n=10)
Float32frombits          7.513n ± 0%   3.757n ± 0%  -49.98% (p=0.000 n=10)
FMA                      3.756n ± 0%   3.756n ± 0%        ~ (p=0.246 n=10)
geomean                  77.43n        72.42n        -6.47%

Change-Id: I8dac69b1d17cb3d2af78d1c844d2b5d80000d667
Reviewed-on: https://go-review.googlesource.com/c/go/+/599235
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Michael Munday <mikemndy@gmail.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@golang.org>
2025-08-05 08:27:15 -07:00
Keith Randall
7a1679d7ae cmd/compile: move s390x over to new bounds check strategy
Change-Id: I86ed1a60165b729bb88a8a418da0ea1b59b3dc10
Reviewed-on: https://go-review.googlesource.com/c/go/+/682499
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Mauri de Souza Meneguzzo <mauri870@gmail.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Michael Munday <mikemndy@gmail.com>
Reviewed-by: Mark Freeman <mark@golang.org>
2025-08-04 10:08:22 -07:00
Keith Randall
95693816a5 cmd/compile: move riscv64 over to new bounds check strategy
Change-Id: Idd9eaf051aa57f7fef7049c12085926030c35d70
Reviewed-on: https://go-review.googlesource.com/c/go/+/682401
Reviewed-by: Mark Freeman <mark@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Joel Sing <joel@sing.id.au>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-08-04 10:08:13 -07:00
Mateusz Poliwczak
d7bd7773eb go/parser: remove safePos
The logic in safePos is wrong, since (*token.File).Offset does not panic,
so this function was basically a noop (since CL 559436).

To work properly it would have to be:

return p.file.Pos(p.file.Offset(pos))

Since it effectively acts as a no-op and hasn't been noticed since,
let's go ahead and remove it.

Change-Id: I00a1bcc5af6a996c63de3f1175c15062e85cf89b
Reviewed-on: https://go-review.googlesource.com/c/go/+/692955
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Commit-Queue: Alan Donovan <adonovan@google.com>
Auto-Submit: Alan Donovan <adonovan@google.com>
Reviewed-by: Alan Donovan <adonovan@google.com>
Reviewed-by: Robert Findley <rfindley@google.com>
2025-08-04 09:33:59 -07:00
Cherry Mui
4b6cbc377f cmd/cgo/internal/test: use (syntactic) constant for C array bound
A test in C has an array bound defined as a "const int", which is
technically a variable. The new version of C compiler in Xcode 26
beta emits a warning "variable length array folded to constant
array as an extension" for this (as an error since we build the
test with -Werror). Work around this by using an enum, which is
syntactically a constant.

Change-Id: Icfa943f293f6eac8f41d0615da40c126330d7d11
Reviewed-on: https://go-review.googlesource.com/c/go/+/692877
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
2025-08-04 09:11:25 -07:00
Xiaolin Zhao
b2960e3580 cmd/internal/obj/loong64: add {V,XV}{BITCLR/BITSET/BITREV}[I].{B/H/W/D} instructions support
Go asm syntax:
	 V{BITCLR/BITSET/BITREV}{B/H/W/V}	$1, V2, V3
	XV{BITCLR/BITSET/BITREV}{B/H/W/V}	$1, X2, X3
	 V{BITCLR/BITSET/BITREV}{B/H/W/V}	VK, VJ, VD
	XV{BITCLR/BITSET/BITREV}{B/H/W/V}	XK, XJ, XD

Equivalent platform assembler syntax:
	 v{bitclr/bitset/bitrev}i.{b/h/w/d}	v3, v2, $1
	xv{bitclr/bitset/bitrev}i.{b/h/w/d}	x3, x2, $1
	 v{bitclr/bitset/bitrev}.{b/h/w/d}	vd, vj, vk
	xv{bitclr/bitset/bitrev}.{b/h/w/d}	xd, xj, xk

Change-Id: I244f8ae316f72cc7ea01ca0139ac78c5616a3c5b
Reviewed-on: https://go-review.googlesource.com/c/go/+/677435
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: Mark Freeman <mark@golang.org>
2025-08-03 18:26:56 -07:00
Xiaolin Zhao
abeeef1c08 cmd/compile/internal/test: fix typo in comments
Change-Id: Iba6bb7f8252120f56d7e6ae49c9edc9382e8c7e0
Reviewed-on: https://go-review.googlesource.com/c/go/+/679855
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Mark Freeman <mark@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2025-08-03 18:25:35 -07:00
Xiaolin Zhao
d44749b65b cmd/internal/obj/loong64: add [X]VLDREPL.{B/H/W/D} instructions support
Go asm syntax:
	 VMOVQ	offset(Rj), Vd.<T>
	XVMOVQ	offset(Rj), Xd.<T>

<T> can have the following values:
B16, H8, W4, V2, B32, H16, W8, V4

Change-Id: I44af51d58bb62649d3fe360b3abb771565e78a8a
Reviewed-on: https://go-review.googlesource.com/c/go/+/682895
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Meidan Li <limeidan@loongson.cn>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Mark Freeman <mark@golang.org>
2025-08-03 18:25:27 -07:00
limeidan
d6beda863e runtime: add reference to debugPinnerV1
This is intended to be used by debuggers, to keep heap memory reachable
even if it isn't referenced from anywhere else.

Change-Id: I1e900e02b4fe3a188f8173cec70f8de32122489b
Reviewed-on: https://go-review.googlesource.com/c/go/+/682875
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: Mark Freeman <mark@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-08-03 18:25:12 -07:00
Roy Reznik
4ab1aec007 cmd/go: modload should use a read-write lock to improve concurrency
This PR will be imported into Gerrit with the title and first
comment (this text) used to generate the subject and body of
the Gerrit change.

Change-Id: I3f9bc8a2459059a924a04fa02794e258957819b5
GitHub-Last-Rev: 6ad6f6a70e
GitHub-Pull-Request: golang/go#74311
Reviewed-on: https://go-review.googlesource.com/c/go/+/683215
Reviewed-by: Michael Matloob <matloob@golang.org>
Reviewed-by: Ian Alexander <jitsu@google.com>
Reviewed-by: Michael Matloob <matloob@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-08-01 11:56:27 -07:00
qmuntal
e666972a67 runtime: deduplicate Windows stdcall
There is no need to have a dedicated stdcall variant for each
number of arguments. Instead, we can use a variadic function
that accepts any number of arguments and handles them uniformly.

While here, improve documentation of syscall_syscalln to make it clear
that it should not be used within the runtime package.

Change-Id: I022afc7f28d969fd7307bb2b1f4594246ac38d18
Reviewed-on: https://go-review.googlesource.com/c/go/+/691215
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Mark Freeman <mark@golang.org>
2025-08-01 09:05:13 -07:00
qmuntal
ef40549786 runtime,syscall: move loadlibrary and getprocaddress to syscall
There is no need for loadlibrary, loadsystemlibrary and getprocaddress
to be implemented in the runtime and linknamed from syscall.

Change-Id: Icefd53a8e8f7012ed0c94c356be4179d5e45a01b
Reviewed-on: https://go-review.googlesource.com/c/go/+/690516
Reviewed-by: Mark Freeman <mark@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-08-01 09:05:06 -07:00
qmuntal
336931a4ca cmd/go: use os.Rename to move files on Windows
The Go toolchain has been copying files instead of moving them on
Windows since CL 72910. It did this because os.Rename doesn't respect
the destination directory ACL permissions, and we want to honor them.

The drawback is that copying files is slower than moving them.

This CL reintroduces the use of os.Rename, but it also sets the
destination file's ACL to inherit the permissions from the
destination directory.

On my computer this change speeds up a simple "go build" (with warm
cache) by 1 second, going from 3 seconds to 2 seconds.

Updates #22343

Change-Id: I65b2b6f301e9e18bf2588959764eb06b9a01dfa2
Reviewed-on: https://go-review.googlesource.com/c/go/+/691255
Reviewed-by: Mark Freeman <mark@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Damien Neil <dneil@google.com>
2025-08-01 09:04:50 -07:00
Keith Randall
eef5f8d930 cmd/compile: enforce that locals are always accessed with SP base register
After CL 678937, we could have a situation where the value of the
stack pointer is in both SP and another register. We need to make sure
that regalloc picks SP when issuing a reference to local variables;
the assembler expects that.

Fixes #74836

Change-Id: I2ac73ece6eb44b4a78c1369f8a69e51ab9748754
Reviewed-on: https://go-review.googlesource.com/c/go/+/692395
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Mark Freeman <mark@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-08-01 08:57:29 -07:00
Xiaolin Zhao
e071617222 cmd/compile: optimize multiplication rules on loong64
Improve multiplication strength reduction, refer to CL 626998,
add additional 3 linear combination instructions for loong64.

goos: linux
goarch: loong64
pkg: cmd/compile/internal/test
cpu: Loongson-3A6000-HV @ 2500.00MHz
                  |  bench.old   |              bench.new               |
                  |    sec/op    |    sec/op     vs base                |
MulconstI32/3       1.6010n ± 0%   0.8005n ± 0%  -50.00% (p=0.000 n=10)
MulconstI32/5       1.6010n ± 0%   0.8005n ± 0%  -50.00% (p=0.000 n=10)
MulconstI32/12       1.601n ± 0%    1.201n ± 0%  -24.98% (p=0.000 n=10)
MulconstI32/120     1.6010n ± 0%   0.8130n ± 0%  -49.22% (p=0.000 n=10)
MulconstI32/-120    1.6010n ± 0%   0.8109n ± 0%  -49.35% (p=0.000 n=10)
MulconstI32/65537   1.6275n ± 0%   0.8005n ± 0%  -50.81% (p=0.000 n=10)
MulconstI32/65538   1.6290n ± 0%   0.8004n ± 0%  -50.87% (p=0.000 n=10)
MulconstI64/3       1.6010n ± 0%   0.8004n ± 0%  -50.01% (p=0.000 n=10)
MulconstI64/5       1.6010n ± 0%   0.8004n ± 0%  -50.01% (p=0.000 n=10)
MulconstI64/12       1.601n ± 0%    1.201n ± 0%  -24.98% (p=0.000 n=10)
MulconstI64/120     1.6010n ± 0%   0.8005n ± 0%  -50.00% (p=0.000 n=10)
MulconstI64/-120    1.6010n ± 0%   0.8005n ± 0%  -50.00% (p=0.000 n=10)
MulconstI64/65537   1.6270n ± 0%   0.8005n ± 0%  -50.80% (p=0.000 n=10)
MulconstI64/65538   1.6290n ± 0%   0.8071n ± 1%  -50.45% (p=0.000 n=10)
MulconstU32/3       1.6010n ± 0%   0.8004n ± 0%  -50.01% (p=0.000 n=10)
MulconstU32/5       1.6010n ± 0%   0.8004n ± 0%  -50.01% (p=0.000 n=10)
MulconstU32/12       1.601n ± 0%    1.201n ± 0%  -24.98% (p=0.000 n=10)
MulconstU32/120     1.6010n ± 0%   0.8066n ± 0%  -49.62% (p=0.000 n=10)
MulconstU32/65537   1.6290n ± 0%   0.8005n ± 0%  -50.86% (p=0.000 n=10)
MulconstU32/65538   1.6280n ± 0%   0.8005n ± 0%  -50.83% (p=0.000 n=10)
MulconstU64/3       1.6010n ± 0%   0.8005n ± 0%  -50.00% (p=0.000 n=10)
MulconstU64/5       1.6010n ± 0%   0.8005n ± 0%  -50.00% (p=0.000 n=10)
MulconstU64/12       1.601n ± 0%    1.201n ± 0%  -24.98% (p=0.000 n=10)
MulconstU64/120     1.6010n ± 0%   0.8005n ± 0%  -50.00% (p=0.000 n=10)
MulconstU64/65537   1.6290n ± 0%   0.8005n ± 0%  -50.86% (p=0.000 n=10)
MulconstU64/65538   1.6300n ± 0%   0.8067n ± 0%  -50.51% (p=0.000 n=10)
geomean              1.609n        0.8537n       -46.95%

goos: linux
goarch: loong64
pkg: cmd/compile/internal/test
cpu: Loongson-3A5000 @ 2500.00MHz
                  |  bench.old   |              bench.new               |
                  |    sec/op    |    sec/op     vs base                |
MulconstI32/3       1.6010n ± 0%   0.8007n ± 0%  -49.99% (p=0.000 n=10)
MulconstI32/5       1.6010n ± 0%   0.8007n ± 0%  -49.99% (p=0.000 n=10)
MulconstI32/12       1.601n ± 0%    1.202n ± 0%  -24.92% (p=0.000 n=10)
MulconstI32/120     1.6020n ± 0%   0.8012n ± 0%  -49.99% (p=0.000 n=10)
MulconstI32/-120    1.6010n ± 0%   0.8007n ± 0%  -49.99% (p=0.000 n=10)
MulconstI32/65537   1.6020n ± 0%   0.8007n ± 0%  -50.02% (p=0.000 n=10)
MulconstI32/65538   1.6010n ± 0%   0.8007n ± 0%  -49.99% (p=0.000 n=10)
MulconstI64/3       1.6015n ± 0%   0.8007n ± 0%  -50.00% (p=0.000 n=10)
MulconstI64/5       1.6020n ± 0%   0.8007n ± 0%  -50.02% (p=0.000 n=10)
MulconstI64/12       1.602n ± 0%    1.202n ± 0%  -25.00% (p=0.000 n=10)
MulconstI64/120     1.6030n ± 0%   0.8011n ± 0%  -50.02% (p=0.000 n=10)
MulconstI64/-120    1.6020n ± 0%   0.8007n ± 0%  -50.02% (p=0.000 n=10)
MulconstI64/65537   1.6010n ± 0%   0.8007n ± 0%  -49.99% (p=0.000 n=10)
MulconstI64/65538   1.6010n ± 0%   0.8007n ± 0%  -49.99% (p=0.000 n=10)
MulconstU32/3       1.6010n ± 0%   0.8006n ± 0%  -49.99% (p=0.000 n=10)
MulconstU32/5       1.6010n ± 0%   0.8007n ± 0%  -49.99% (p=0.000 n=10)
MulconstU32/12       1.601n ± 0%    1.202n ± 0%  -24.92% (p=0.000 n=10)
MulconstU32/120     1.6010n ± 0%   0.8006n ± 0%  -49.99% (p=0.000 n=10)
MulconstU32/65537   1.6010n ± 0%   0.8007n ± 0%  -49.99% (p=0.000 n=10)
MulconstU32/65538   1.6020n ± 0%   0.8009n ± 0%  -50.01% (p=0.000 n=10)
MulconstU64/3       1.6010n ± 0%   0.8007n ± 0%  -49.99% (p=0.000 n=10)
MulconstU64/5       1.6010n ± 0%   0.8007n ± 0%  -49.98% (p=0.000 n=10)
MulconstU64/12       1.601n ± 0%    1.201n ± 0%  -24.98% (p=0.000 n=10)
MulconstU64/120     1.6020n ± 0%   0.8007n ± 0%  -50.02% (p=0.000 n=10)
MulconstU64/65537   1.6010n ± 0%   0.8007n ± 0%  -49.99% (p=0.000 n=10)
MulconstU64/65538   1.6010n ± 0%   0.8007n ± 0%  -49.99% (p=0.000 n=10)
geomean              1.601n        0.8523n       -46.77%

Change-Id: I9fb0e47ca57875da171a347bf4828adfab41b875
Reviewed-on: https://go-review.googlesource.com/c/go/+/675455
Reviewed-by: Mark Freeman <mark@golang.org>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Keith Randall <khr@golang.org>
2025-08-01 08:42:40 -07:00
Keith Randall
eb7f515c4d cmd/compile: use generated loops instead of DUFFZERO on amd64
goarch: amd64
cpu: 12th Gen Intel(R) Core(TM) i7-12700
                        │     base      │                 exp                 │
                        │    sec/op     │   sec/op     vs base                │
MemclrKnownSize112-20      1.270n ± 14%   1.006n ± 0%  -20.72% (p=0.000 n=10)
MemclrKnownSize128-20      1.266n ±  0%   1.005n ± 0%  -20.58% (p=0.000 n=10)
MemclrKnownSize192-20      1.771n ±  0%   1.579n ± 1%  -10.84% (p=0.000 n=10)
MemclrKnownSize248-20      4.034n ±  0%   3.520n ± 0%  -12.75% (p=0.000 n=10)
MemclrKnownSize256-20      2.269n ±  0%   2.014n ± 0%  -11.26% (p=0.000 n=10)
MemclrKnownSize512-20      4.280n ±  0%   4.030n ± 0%   -5.84% (p=0.000 n=10)
MemclrKnownSize1024-20     8.309n ±  1%   8.057n ± 0%   -3.03% (p=0.000 n=10)

Change-Id: I8f1627e2a1e981ff351dc7178932b32a2627f765
Reviewed-on: https://go-review.googlesource.com/c/go/+/678937
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-07-31 17:12:39 -07:00
Michael Matloob
c0ee2fd4e3 cmd/go: explicitly reject module paths "go" and "toolchain"
The module paths "go" and "toolchain" are reserved for the dependency on
the go and toolchain versions. Check for those paths in go mod init to
create modules, go mod edit, and in the module loader and return an
error when attempting to use those paths for a work module. Trying to
init or load a work module with a go.mod that specifies the module path
"go" panics since Go 1.21 (when the toolchain switching logic and the
implicit dependencies on the "go" module was introduced), and this
change returns a proper error instead of panicking.

Fixes #74784

Change-Id: I10e712f8fddbea63edaeb37e14c6d783722e623f
Reviewed-on: https://go-review.googlesource.com/c/go/+/691515
Reviewed-by: Ian Alexander <jitsu@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Matloob <matloob@google.com>
2025-07-31 11:00:08 -07:00
Michael Anthony Knyszek
a4d99770c0 runtime/metrics: add cleanup and finalizer queue metrics
These metrics are useful for identifying finalizer and cleanup problems,
namely slow finalizers and/or cleanups holding up the queue, which can
lead to a memory leak.

Fixes #72948.

Change-Id: I1bb64a9ca751fcb462c96d986d0346e0c2894c95
Reviewed-on: https://go-review.googlesource.com/c/go/+/690396
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
2025-07-30 14:00:16 -07:00
Michael Anthony Knyszek
70a2ff7648 runtime: add cgo call benchmark
Change-Id: I12d2ae7dd6a33ecb7110b7d090871e7143fd609f
Reviewed-on: https://go-review.googlesource.com/c/go/+/646196
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
2025-07-30 13:37:28 -07:00
Michael Matloob
69338a335a cmd/go/internal/gover: fix ModIsPrerelease for toolchain versions
We forgot to call the IsPrerelease function on FromToolchain(vers)
rather than on vers itself. IsPrerelase expects a version without the
"go" prefix. See the corresponding code in ModIsValid and ModIsPrefix
that call FromToolchain before passing the versions to IsValid and
IsLang respectively.

Fixes #74786

Change-Id: I3cf055e1348e6a9dc0334e414f06fe85eaf78024
Reviewed-on: https://go-review.googlesource.com/c/go/+/691655
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Michael Matloob <matloob@golang.org>
Reviewed-by: Michael Matloob <matloob@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
2025-07-30 13:18:24 -07:00