Commit graph

63793 commits

Author SHA1 Message Date
Damien Neil
a3895fe9f1 database/sql: avoid closing Rows while scan is in progress
A database/sql/driver.Rows can return database-owned data
from Rows.Next. The driver.Rows documentation doesn't explicitly
document the lifetime guarantees for this data, but a reasonable
expectation is that the caller of Next should only access it
until the next call to Rows.Close or Rows.Next.

Avoid violating that constraint when a query is cancelled while
a call to database/sql.Rows.Scan (note the difference between
the two different Rows types!) is in progress. We previously
took care to avoid closing a driver.Rows while the user has
access to driver-owned memory via a RawData, but we could still
close a driver.Rows while a Scan call was in the process of
reading previously-returned driver-owned data.

Update the fake DB used in database/sql tests to invalidate
returned data to help catch other places we might be
incorrectly retaining it.
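
A minimal sketch of the lifetime concern described above, assuming a hypothetical driver.Rows implementation (reusingRows) that reuses one buffer across Next calls. It is not the fake DB from the database/sql tests; it only illustrates why data returned by Next should be copied if it has to outlive the next call to Next or Close.

```go
package main

import (
	"database/sql/driver"
	"fmt"
	"io"
)

// reusingRows is a hypothetical driver.Rows whose Next hands out a slice
// backed by a buffer that the driver overwrites on every call.
type reusingRows struct {
	rows []string
	buf  []byte
	next int
}

func (r *reusingRows) Columns() []string { return []string{"v"} }

// Close scribbles over the buffer, mimicking a driver that releases or
// invalidates its memory when the rows are closed.
func (r *reusingRows) Close() error {
	for i := range r.buf {
		r.buf[i] = 'X'
	}
	return nil
}

func (r *reusingRows) Next(dest []driver.Value) error {
	if r.next >= len(r.rows) {
		return io.EOF
	}
	r.buf = append(r.buf[:0], r.rows[r.next]...)
	r.next++
	dest[0] = r.buf // driver-owned memory, valid until the next Next or Close
	return nil
}

// retainAcrossNext reads the first row, keeps both an alias and a private
// copy of it, advances to the second row, and reports what each now holds.
func retainAcrossNext() (aliased, copied string) {
	var rows driver.Rows = &reusingRows{
		rows: []string{"alpha", "bravo"},
		buf:  make([]byte, 0, 16),
	}
	defer rows.Close()

	dest := make([]driver.Value, 1)
	if err := rows.Next(dest); err != nil {
		panic(err)
	}
	alias := dest[0].([]byte)
	cp := append([]byte(nil), alias...) // private copy, safe to keep

	if err := rows.Next(dest); err != nil {
		panic(err)
	}
	return string(alias), string(cp)
}

func main() {
	aliased, copied := retainAcrossNext()
	fmt.Println("alias now reads:", aliased)
	fmt.Println("copy still reads:", copied)
}
```

The alias silently changes once the second row is fetched, which is the kind of retention that invalidating returned data is meant to surface.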

Fixes #74831.

Change-Id: Ice45b5fad51b679c38e3e1d21ef39156b56d6037
Reviewed-on: https://go-internal-review.googlesource.com/c/go/+/2540
Reviewed-by: Roland Shoemaker <bracewell@google.com>
Reviewed-by: Neal Patel <nealpatel@google.com>
Reviewed-on: https://go-review.googlesource.com/c/go/+/693735
Auto-Submit: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Roland Shoemaker <roland@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
2025-08-06 11:36:35 -07:00
Mark Freeman
608e9fac90 go/types, types2: flip on position tracing
Running compilebench with flags off / on, we get the below:

                         │   old.txt    │              new.txt               │
                         │    sec/op    │   sec/op     vs base               │
Template                    149.2m ± 6%   155.5m ± 5%       ~ (p=0.280 n=10)
Unicode                     110.1m ± 3%   105.8m ± 7%       ~ (p=0.280 n=10)
GoTypes                     774.0m ± 6%   757.7m ± 2%       ~ (p=0.247 n=10)
Compiler                    109.6m ± 6%   109.8m ± 6%       ~ (p=0.579 n=10)
SSA                          4.562 ± 2%    4.550 ± 2%       ~ (p=0.436 n=10)
Flate                      101.65m ± 9%   96.32m ± 7%  -5.24% (p=0.043 n=10)
GoParser                    168.7m ± 6%   173.7m ± 6%       ~ (p=0.436 n=10)
Reflect                     390.2m ± 5%   387.8m ± 6%       ~ (p=0.684 n=10)
Tar                         185.9m ± 3%   182.2m ± 4%       ~ (p=0.529 n=10)
XML                         212.7m ± 4%   211.4m ± 4%       ~ (p=0.971 n=10)
LinkCompiler                490.9m ± 4%   480.4m ± 4%       ~ (p=0.353 n=10)
ExternalLinkCompiler         1.501 ± 1%    1.501 ± 1%       ~ (p=0.853 n=10)
LinkWithoutDebugCompiler    311.8m ± 4%   308.6m ± 4%       ~ (p=0.579 n=10)
StdCmd                       17.60 ± 1%    17.62 ± 1%       ~ (p=0.912 n=10)
geomean                     427.5m        424.2m       -0.77%

Overall, we do not see a statistically significant performance impact. Flate
actually reports a speedup, but with a p-value of 0.043, it's quite close
to the significance threshold (which is fairly lenient). In my opinion,
this is likely due to chance.

Fixes #51603

Change-Id: I7f439730be45e02c7f799df768590ef78e321952
Reviewed-on: https://go-review.googlesource.com/c/go/+/676816
Reviewed-by: Robert Griesemer <gri@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-08-06 11:05:55 -07:00
Keith Randall
72e8237cc1 cmd/compile: allow more args in StructMake folding rule
imakeOfStructMake does the right thing, but we never call it
when the StructMake has more than one argument.

Fixes #74908

Change-Id: Ib4b1a025bfb1fa69a325207e47b74bd6217092bf
Reviewed-on: https://go-review.googlesource.com/c/go/+/693615
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-08-06 11:04:07 -07:00
Joel Sing
3406a617d9 internal/bytealg: vector implementation of indexbyte for riscv64
Provide a vector implementation of indexbyte for riscv64, which is used
when compiled with the rva23u64 profile, or when vector is detected
to be available. Inputs that are smaller than 24 bytes will continue
to use the non-vector path.
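
The dispatch described above could be sketched in Go as follows. This is only an illustration: the real routine is riscv64 assembly, hasVector is a hypothetical stand-in for "built with rva23u64, or vector detected at run time", and indexByteWide simply defers to bytes.IndexByte in place of the vector code.

```go
package main

import (
	"bytes"
	"fmt"
)

// vectorMinLen is the cutoff quoted in the commit message: inputs shorter
// than this stay on the non-vector path.
const vectorMinLen = 24

// hasVector is a hypothetical flag standing in for the profile/detection check.
var hasVector = true

// indexByteScalar is a plain byte-at-a-time search.
func indexByteScalar(b []byte, c byte) int {
	for i, x := range b {
		if x == c {
			return i
		}
	}
	return -1
}

// indexByteWide stands in for the vector routine.
func indexByteWide(b []byte, c byte) int {
	return bytes.IndexByte(b, c)
}

// indexByte picks a path by input length and reports which one it used.
func indexByte(b []byte, c byte) (idx int, path string) {
	if !hasVector || len(b) < vectorMinLen {
		return indexByteScalar(b, c), "scalar"
	}
	return indexByteWide(b, c), "vector"
}

func main() {
	short := []byte("hello, riscv64")
	long := bytes.Repeat([]byte("abcdefgh"), 8)
	long[50] = '!'

	fmt.Println(indexByte(short, 'r'))
	fmt.Println(indexByte(long, '!'))
}
```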

On a Banana Pi F3, with GORISCV64=rva23u64:

                │  indexbyte.1  │             indexbyte.2              │
                │    sec/op     │    sec/op     vs base                │
IndexByte/10-8     52.68n ±  0%   47.26n ±  0%  -10.30% (p=0.000 n=10)
IndexByte/32-8     68.62n ±  0%   47.02n ±  0%  -31.49% (p=0.000 n=10)
IndexByte/4K-8    2217.0n ±  0%   420.4n ±  0%  -81.04% (p=0.000 n=10)
IndexByte/4M-8    2624.4µ ±  0%   767.5µ ±  0%  -70.75% (p=0.000 n=10)
IndexByte/64M-8    68.08m ± 10%   47.84m ± 45%  -29.73% (p=0.004 n=10)
geomean            17.03µ         8.073µ        -52.59%

                │ indexbyte.1  │               indexbyte.2               │
                │     B/s      │      B/s        vs base                 │
IndexByte/10-8    181.0Mi ± 0%    201.8Mi ±  0%   +11.48% (p=0.000 n=10)
IndexByte/32-8    444.7Mi ± 0%    649.1Mi ±  0%   +45.97% (p=0.000 n=10)
IndexByte/4K-8    1.721Gi ± 0%    9.076Gi ±  0%  +427.51% (p=0.000 n=10)
IndexByte/4M-8    1.488Gi ± 0%    5.089Gi ±  0%  +241.93% (p=0.000 n=10)
IndexByte/64M-8   940.3Mi ± 9%   1337.8Mi ± 31%   +42.27% (p=0.004 n=10)
geomean           727.1Mi         1.498Gi        +110.94%

Change-Id: If7b0dbef38d76fa7a2021e4ecaed668a1d4b9783
Reviewed-on: https://go-review.googlesource.com/c/go/+/648856
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Meng Zhuo <mengzhuo1203@gmail.com>
Reviewed-by: Mark Freeman <markfreeman@google.com>
Reviewed-by: Mark Ryan <markdryan@rivosinc.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
2025-08-06 06:23:02 -07:00
Joel Sing
75ea2d05c0 internal/bytealg: vector implementation of equal for riscv64
Provide a vector implementation of equal for riscv64, which is used
when compiled with the rva23u64 profile, or when vector is detected
to be available. Inputs that are 8 byte aligned will still be handled
via the non-vector code if the length is less than or equal to 64
bytes.

On a Banana Pi F3, with GORISCV64=rva23u64:

                                │   equal.1    │               equal.2                │
                                │    sec/op    │    sec/op     vs base                │
Equal/0-8                         1.254n ±  0%   1.254n ±  0%        ~ (p=1.000 n=10)
Equal/same/1-8                    21.32n ±  0%   21.32n ±  0%        ~ (p=0.466 n=10)
Equal/same/6-8                    21.32n ±  0%   21.32n ±  0%        ~ (p=0.689 n=10)
Equal/same/9-8                    21.32n ±  0%   21.32n ±  0%        ~ (p=0.861 n=10)
Equal/same/15-8                   21.32n ±  0%   21.32n ±  0%        ~ (p=0.657 n=10)
Equal/same/16-8                   21.32n ±  0%   21.33n ±  0%        ~ (p=0.075 n=10)
Equal/same/20-8                   21.32n ±  0%   21.32n ±  0%        ~ (p=0.249 n=10)
Equal/same/32-8                   21.32n ±  0%   21.32n ±  0%        ~ (p=0.303 n=10)
Equal/same/4K-8                   21.32n ±  0%   21.32n ±  0%        ~ (p=1.000 n=10)
Equal/same/4M-8                   21.32n ±  0%   21.32n ±  0%        ~ (p=0.582 n=10)
Equal/same/64M-8                  21.32n ±  0%   21.32n ±  0%        ~ (p=0.930 n=10)
Equal/1-8                         39.16n ±  1%   38.71n ±  0%   -1.15% (p=0.000 n=10)
Equal/6-8                         51.49n ±  1%   50.40n ±  1%   -2.12% (p=0.000 n=10)
Equal/9-8                         54.46n ±  1%   53.89n ±  0%   -1.04% (p=0.000 n=10)
Equal/15-8                        71.81n ±  1%   70.59n ±  0%   -1.71% (p=0.000 n=10)
Equal/16-8                        69.14n ±  0%   68.21n ±  0%   -1.34% (p=0.000 n=10)
Equal/20-8                        78.59n ±  0%   77.59n ±  0%   -1.26% (p=0.000 n=10)
Equal/32-8                        41.55n ±  0%   41.16n ±  0%   -0.96% (p=0.000 n=10)
Equal/4K-8                        925.5n ±  0%   561.4n ±  1%  -39.34% (p=0.000 n=10)
Equal/4M-8                        3.110m ± 32%   2.463m ± 16%  -20.80% (p=0.000 n=10)
Equal/64M-8                       47.34m ± 30%   39.89m ± 16%  -15.75% (p=0.004 n=10)
EqualBothUnaligned/64_0-8         32.17n ±  1%   32.11n ±  1%        ~ (p=0.184 n=10)
EqualBothUnaligned/64_1-8         79.48n ±  0%   48.24n ±  1%  -39.31% (p=0.000 n=10)
EqualBothUnaligned/64_4-8         72.71n ±  0%   48.37n ±  1%  -33.48% (p=0.000 n=10)
EqualBothUnaligned/64_7-8         77.12n ±  0%   48.16n ±  1%  -37.56% (p=0.000 n=10)
EqualBothUnaligned/4096_0-8       908.4n ±  0%   562.4n ±  2%  -38.09% (p=0.000 n=10)
EqualBothUnaligned/4096_1-8       956.6n ±  0%   571.4n ±  3%  -40.26% (p=0.000 n=10)
EqualBothUnaligned/4096_4-8       949.6n ±  0%   571.6n ±  3%  -39.81% (p=0.000 n=10)
EqualBothUnaligned/4096_7-8       954.2n ±  0%   571.7n ±  3%  -40.09% (p=0.000 n=10)
EqualBothUnaligned/4194304_0-8    2.935m ± 29%   2.664m ± 19%        ~ (p=0.089 n=10)
EqualBothUnaligned/4194304_1-8    3.341m ± 13%   2.896m ± 34%        ~ (p=0.075 n=10)
EqualBothUnaligned/4194304_4-8    3.204m ± 39%   3.352m ± 33%        ~ (p=0.796 n=10)
EqualBothUnaligned/4194304_7-8    3.226m ± 30%   2.737m ± 34%  -15.16% (p=0.043 n=10)
EqualBothUnaligned/67108864_0-8   49.04m ± 17%   39.94m ± 12%  -18.57% (p=0.005 n=10)
EqualBothUnaligned/67108864_1-8   51.96m ± 15%   42.48m ± 15%  -18.23% (p=0.015 n=10)
EqualBothUnaligned/67108864_4-8   47.67m ± 17%   37.85m ± 41%  -20.61% (p=0.035 n=10)
EqualBothUnaligned/67108864_7-8   53.00m ± 22%   38.76m ± 21%  -26.87% (p=0.000 n=10)
CompareBytesEqual-8               51.71n ±  1%   52.00n ±  0%   +0.57% (p=0.002 n=10)
geomean                           1.469µ         1.265µ        -13.93%

                                │    equal.1     │                equal.2                 │
                                │      B/s       │      B/s        vs base                │
Equal/same/1-8                     44.73Mi ±  0%    44.72Mi ±  0%        ~ (p=0.426 n=10)
Equal/same/6-8                     268.3Mi ±  0%    268.4Mi ±  0%        ~ (p=0.753 n=10)
Equal/same/9-8                     402.6Mi ±  0%    402.5Mi ±  0%        ~ (p=0.209 n=10)
Equal/same/15-8                    670.9Mi ±  0%    670.9Mi ±  0%        ~ (p=0.724 n=10)
Equal/same/16-8                    715.6Mi ±  0%    715.4Mi ±  0%   -0.04% (p=0.022 n=10)
Equal/same/20-8                    894.6Mi ±  0%    894.5Mi ±  0%        ~ (p=0.060 n=10)
Equal/same/32-8                    1.398Gi ±  0%    1.398Gi ±  0%        ~ (p=0.986 n=10)
Equal/same/4K-8                    178.9Gi ±  0%    178.9Gi ±  0%        ~ (p=0.853 n=10)
Equal/same/4M-8                    178.9Ti ±  0%    178.9Ti ±  0%        ~ (p=0.971 n=10)
Equal/same/64M-8                  2862.8Ti ±  0%   2862.6Ti ±  0%        ~ (p=0.971 n=10)
Equal/1-8                          24.35Mi ±  1%    24.63Mi ±  0%   +1.16% (p=0.000 n=10)
Equal/6-8                          111.1Mi ±  1%    113.5Mi ±  1%   +2.17% (p=0.000 n=10)
Equal/9-8                          157.6Mi ±  1%    159.3Mi ±  0%   +1.05% (p=0.000 n=10)
Equal/15-8                         199.2Mi ±  1%    202.7Mi ±  0%   +1.74% (p=0.000 n=10)
Equal/16-8                         220.7Mi ±  0%    223.7Mi ±  0%   +1.36% (p=0.000 n=10)
Equal/20-8                         242.7Mi ±  0%    245.8Mi ±  0%   +1.27% (p=0.000 n=10)
Equal/32-8                         734.3Mi ±  0%    741.6Mi ±  0%   +0.98% (p=0.000 n=10)
Equal/4K-8                         4.122Gi ±  0%    6.795Gi ±  1%  +64.84% (p=0.000 n=10)
Equal/4M-8                         1.258Gi ± 24%    1.586Gi ± 14%  +26.12% (p=0.000 n=10)
Equal/64M-8                        1.320Gi ± 23%    1.567Gi ± 14%  +18.69% (p=0.004 n=10)
EqualBothUnaligned/64_0-8          1.853Gi ±  1%    1.856Gi ±  1%        ~ (p=0.190 n=10)
EqualBothUnaligned/64_1-8          767.9Mi ±  0%   1265.2Mi ±  1%  +64.76% (p=0.000 n=10)
EqualBothUnaligned/64_4-8          839.4Mi ±  0%   1261.9Mi ±  1%  +50.33% (p=0.000 n=10)
EqualBothUnaligned/64_7-8          791.4Mi ±  0%   1267.5Mi ±  1%  +60.16% (p=0.000 n=10)
EqualBothUnaligned/4096_0-8        4.199Gi ±  0%    6.784Gi ±  2%  +61.54% (p=0.000 n=10)
EqualBothUnaligned/4096_1-8        3.988Gi ±  0%    6.676Gi ±  3%  +67.40% (p=0.000 n=10)
EqualBothUnaligned/4096_4-8        4.017Gi ±  0%    6.674Gi ±  3%  +66.14% (p=0.000 n=10)
EqualBothUnaligned/4096_7-8        3.998Gi ±  0%    6.673Gi ±  3%  +66.92% (p=0.000 n=10)
EqualBothUnaligned/4194304_0-8     1.332Gi ± 22%    1.468Gi ± 16%        ~ (p=0.089 n=10)
EqualBothUnaligned/4194304_1-8     1.169Gi ± 12%    1.350Gi ± 25%        ~ (p=0.075 n=10)
EqualBothUnaligned/4194304_4-8     1.222Gi ± 28%    1.165Gi ± 48%        ~ (p=0.796 n=10)
EqualBothUnaligned/4194304_7-8     1.211Gi ± 23%    1.427Gi ± 26%  +17.88% (p=0.043 n=10)
EqualBothUnaligned/67108864_0-8    1.274Gi ± 14%    1.567Gi ± 14%  +22.97% (p=0.005 n=10)
EqualBothUnaligned/67108864_1-8    1.204Gi ± 14%    1.471Gi ± 13%  +22.18% (p=0.015 n=10)
EqualBothUnaligned/67108864_4-8    1.311Gi ± 14%    1.651Gi ± 29%  +25.92% (p=0.035 n=10)
EqualBothUnaligned/67108864_7-8    1.179Gi ± 18%    1.612Gi ± 17%  +36.73% (p=0.000 n=10)
geomean                            1.870Gi          2.190Gi        +17.16%

Change-Id: I9c5270bcc6997d020a96d1e97c7e7cfc7ca7fd34
Reviewed-on: https://go-review.googlesource.com/c/go/+/646736
Reviewed-by: Mark Ryan <markdryan@rivosinc.com>
Reviewed-by: Meng Zhuo <mengzhuo1203@gmail.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Mark Freeman <markfreeman@google.com>
2025-08-06 06:18:46 -07:00
Julian Zhu
17a8be7117 crypto/sha512: use const table for key loading on loong64
Load constant keys from a static memory table rather than loading immediates into registers on loong64.

Benchmark for Loongson-3A5000:
goos: linux
goarch: loong64
pkg: crypto/sha512
cpu: Loongson-3A5000-HV @ 2500.00MHz
                    │   sha512o   │              sha512n            │
                    │   sec/op    │   sec/op     vs base            │
Hash8Bytes/New-4      489.1n ± 0%   464.7n ± 0%  -5.00% (p=0.000 n=8)
Hash8Bytes/Sum384-4   499.1n ± 0%   474.6n ± 0%  -4.92% (p=0.000 n=8)
Hash8Bytes/Sum512-4   506.6n ± 0%   481.9n ± 0%  -4.86% (p=0.000 n=8)
Hash1K/New-4          3.371µ ± 0%   3.152µ ± 0%  -6.51% (p=0.000 n=8)
Hash1K/Sum384-4       3.385µ ± 0%   3.164µ ± 0%  -6.53% (p=0.000 n=8)
Hash1K/Sum512-4       3.392µ ± 0%   3.170µ ± 0%  -6.54% (p=0.000 n=8)
Hash8K/New-4          23.62µ ± 0%   22.01µ ± 0%  -6.82% (p=0.000 n=8)
Hash8K/Sum384-4       23.63µ ± 0%   22.02µ ± 0%  -6.82% (p=0.000 n=8)
Hash8K/Sum512-4       23.64µ ± 0%   22.02µ ± 0%  -6.86% (p=0.000 n=8)
geomean               3.415µ        3.207µ       -6.10%

                    │   sha512o    │              sha512n            │
                    │     B/s      │     B/s       vs base           │
Hash8Bytes/New-4     15.60Mi ± 0%   16.42Mi ± 0%  +5.29% (p=0.000 n=8)
Hash8Bytes/Sum384-4  15.29Mi ± 0%   16.08Mi ± 0%  +5.18% (p=0.000 n=8)
Hash8Bytes/Sum512-4  15.06Mi ± 0%   15.83Mi ± 0%  +5.13% (p=0.000 n=8)
Hash1K/New-4         289.7Mi ± 0%   309.9Mi ± 0%  +6.97% (p=0.000 n=8)
Hash1K/Sum384-4      288.5Mi ± 0%   308.6Mi ± 0%  +6.97% (p=0.000 n=8)
Hash1K/Sum512-4      287.9Mi ± 0%   308.0Mi ± 0%  +7.00% (p=0.000 n=8)
Hash8K/New-4         330.8Mi ± 0%   355.0Mi ± 0%  +7.32% (p=0.000 n=8)
Hash8K/Sum384-4      330.6Mi ± 0%   354.9Mi ± 0%  +7.32% (p=0.000 n=8)
Hash8K/Sum512-4      330.5Mi ± 0%   354.8Mi ± 0%  +7.36% (p=0.000 n=8)
geomean              113.5Mi        120.9Mi       +6.50%
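
The B/s table is the sec/op table restated as throughput. A small sketch of the conversion, assuming Hash8Bytes, Hash1K and Hash8K hash 8, 1024 and 8192 bytes per operation (sizes inferred from the benchmark names, not stated in the message); mibPerSec is a hypothetical helper:

```go
package main

import "fmt"

// mibPerSec converts bytes processed per operation and seconds per
// operation into MiB/s, the unit benchstat prints as "Mi" under B/s.
func mibPerSec(bytesPerOp int, secPerOp float64) float64 {
	return float64(bytesPerOp) / secPerOp / (1 << 20)
}

func main() {
	// Old-column sec/op values for the */New-4 rows of the 3A5000 table.
	fmt.Printf("Hash8Bytes/New: %.2f MiB/s\n", mibPerSec(8, 489.1e-9))
	fmt.Printf("Hash1K/New:     %.1f MiB/s\n", mibPerSec(1024, 3.371e-6))
	fmt.Printf("Hash8K/New:     %.1f MiB/s\n", mibPerSec(8192, 23.62e-6))
}
```

Because the table entries are rounded medians, recomputed values can differ from the printed B/s column in the last digit.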

Benchmark for Loongson-3A6000:
goos: linux
goarch: loong64
pkg: crypto/sha512
cpu: Loongson-3A6000 @ 2500.00MHz
                    │ sha512.old  │             sha512.new           │
                    │   sec/op    │   sec/op     vs base             │
Hash8Bytes/New-8      397.2n ± 0%   380.6n ± 0%  -4.17% (p=0.000 n=10)
Hash8Bytes/Sum384-8   406.1n ± 0%   397.9n ± 0%  -2.02% (p=0.000 n=10)
Hash8Bytes/Sum512-8   410.1n ± 0%   395.8n ± 1%  -3.50% (p=0.000 n=10)
Hash1K/New-8          2.932µ ± 0%   2.800µ ± 0%  -4.50% (p=0.000 n=10)
Hash1K/Sum384-8       2.941µ ± 0%   2.812µ ± 0%  -4.39% (p=0.000 n=10)
Hash1K/Sum512-8       2.947µ ± 0%   2.814µ ± 0%  -4.50% (p=0.000 n=10)
Hash8K/New-8          20.68µ ± 0%   19.73µ ± 1%  -4.58% (p=0.000 n=10)
Hash8K/Sum384-8       20.69µ ± 0%   19.73µ ± 0%  -4.62% (p=0.000 n=10)
Hash8K/Sum512-8       20.70µ ± 0%   19.75µ ± 0%  -4.60% (p=0.000 n=10)
geomean               2.908µ        2.789µ       -4.10%

                    │  sha512.old  │             sha512.new          │
                    │     B/s      │     B/s       vs base           │
Hash8Bytes/New-8    19.21Mi ± 0%   20.05Mi ± 0%  +4.37% (p=0.000 n=10)
Hash8Bytes/Sum384-8 18.79Mi ± 0%   19.18Mi ± 0%  +2.08% (p=0.000 n=10)
Hash8Bytes/Sum512-8 18.60Mi ± 0%   19.28Mi ± 1%  +3.64% (p=0.000 n=10)
Hash1K/New-8        333.1Mi ± 0%   348.8Mi ± 0%  +4.71% (p=0.000 n=10)
Hash1K/Sum384-8     332.0Mi ± 0%   347.3Mi ± 0%  +4.60% (p=0.000 n=10)
Hash1K/Sum512-8     331.5Mi ± 0%   347.0Mi ± 0%  +4.69% (p=0.000 n=10)
Hash8K/New-8        377.8Mi ± 0%   396.0Mi ± 1%  +4.80% (p=0.000 n=10)
Hash8K/Sum384-8     377.7Mi ± 0%   396.0Mi ± 0%  +4.85% (p=0.000 n=10)
Hash8K/Sum512-8     377.5Mi ± 0%   395.7Mi ± 0%  +4.82% (p=0.000 n=10)
geomean             133.3Mi        139.0Mi       +4.28%

Change-Id: I55ae4a8e4b0c51a98583f654158235fe738cf348
Reviewed-on: https://go-review.googlesource.com/c/go/+/678436
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
Reviewed-by: Mark Freeman <markfreeman@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
2025-08-05 18:02:52 -07:00
Julian Zhu
dda9d780e2 crypto/sha256: use const table for key loading on loong64
Load constant keys from a static memory table rather than loading immediates into registers on loong64.

Benchmark for Loongson-3A5000:
goos: linux
goarch: loong64
pkg: crypto/sha256
cpu: Loongson-3A5000-HV @ 2500.00MHz
                    │   sha256o   │              sha256n              │
                    │   sec/op    │   sec/op     vs base              │
Hash8Bytes/New-4      356.1n ± 0%   347.0n ± 0%  -2.54% (p=0.000 n=8)
Hash8Bytes/Sum224-4   368.7n ± 0%   359.5n ± 0%  -2.50% (p=0.000 n=8)
Hash8Bytes/Sum256-4   367.7n ± 0%   358.9n ± 0%  -2.41% (p=0.000 n=8)
Hash1K/New-4          4.741µ ± 0%   4.578µ ± 0%  -3.44% (p=0.000 n=8)
Hash1K/Sum224-4       4.755µ ± 0%   4.591µ ± 0%  -3.44% (p=0.000 n=8)
Hash1K/Sum256-4       4.753µ ± 0%   4.589µ ± 0%  -3.46% (p=0.000 n=8)
Hash8K/New-4          35.42µ ± 0%   34.19µ ± 0%  -3.45% (p=0.000 n=8)
Hash8K/Sum224-4       35.43µ ± 0%   34.21µ ± 0%  -3.44% (p=0.000 n=8)
Hash8K/Sum256-4       35.46µ ± 0%   34.22µ ± 0%  -3.48% (p=0.000 n=8)
Hash256K/New-4        1.138m ± 0%   1.098m ± 0%  -3.54% (p=0.000 n=8)
Hash256K/Sum224-4     1.138m ± 0%   1.098m ± 0%  -3.53% (p=0.000 n=8)
Hash256K/Sum256-4     1.139m ± 0%   1.099m ± 0%  -3.48% (p=0.000 n=8)
Hash1M/New-4          4.488m ± 0%   4.388m ± 0%  -2.22% (p=0.000 n=8)
Hash1M/Sum224-4       4.488m ± 0%   4.387m ± 0%  -2.24% (p=0.000 n=8)
Hash1M/Sum256-4       4.489m ± 0%   4.388m ± 0%  -2.25% (p=0.000 n=8)
geomean               50.02µ        48.50µ       -3.03%

                    │   sha256o    │              sha256n               │
                    │     B/s      │     B/s       vs base              │
Hash8Bytes/New-4      21.42Mi ± 0%   21.99Mi ± 0%  +2.63% (p=0.000 n=8)
Hash8Bytes/Sum224-4   20.69Mi ± 0%   21.22Mi ± 0%  +2.56% (p=0.000 n=8)
Hash8Bytes/Sum256-4   20.74Mi ± 0%   21.26Mi ± 0%  +2.48% (p=0.000 n=8)
Hash1K/New-4          206.0Mi ± 0%   213.3Mi ± 0%  +3.57% (p=0.000 n=8)
Hash1K/Sum224-4       205.4Mi ± 0%   212.7Mi ± 0%  +3.57% (p=0.000 n=8)
Hash1K/Sum256-4       205.5Mi ± 0%   212.8Mi ± 0%  +3.58% (p=0.000 n=8)
Hash8K/New-4          220.6Mi ± 0%   228.5Mi ± 0%  +3.58% (p=0.000 n=8)
Hash8K/Sum224-4       220.5Mi ± 0%   228.4Mi ± 0%  +3.56% (p=0.000 n=8)
Hash8K/Sum256-4       220.3Mi ± 0%   228.3Mi ± 0%  +3.61% (p=0.000 n=8)
Hash256K/New-4        219.7Mi ± 0%   227.7Mi ± 0%  +3.67% (p=0.000 n=8)
Hash256K/Sum224-4     219.6Mi ± 0%   227.6Mi ± 0%  +3.66% (p=0.000 n=8)
Hash256K/Sum256-4     219.6Mi ± 0%   227.5Mi ± 0%  +3.60% (p=0.000 n=8)
Hash1M/New-4          222.8Mi ± 0%   227.9Mi ± 0%  +2.27% (p=0.000 n=8)
Hash1M/Sum224-4       222.8Mi ± 0%   227.9Mi ± 0%  +2.29% (p=0.000 n=8)
Hash1M/Sum256-4       222.8Mi ± 0%   227.9Mi ± 0%  +2.30% (p=0.000 n=8)
geomean               136.0Mi        140.2Mi       +3.13%

Benchmark for Loongson-3A6000:
goos: linux
goarch: loong64
pkg: crypto/sha256
cpu: Loongson-3A6000 @ 2500.00MHz
                    │ sha256.old  │             sha256.new             │
                    │   sec/op    │   sec/op     vs base               │
Hash8Bytes/New-8      294.5n ± 0%   288.6n ± 0%  -2.00% (p=0.000 n=10)
Hash8Bytes/Sum224-8   305.0n ± 0%   299.7n ± 0%  -1.74% (p=0.000 n=10)
Hash8Bytes/Sum256-8   302.0n ± 0%   296.8n ± 0%  -1.74% (p=0.000 n=10)
Hash1K/New-8          4.186µ ± 0%   4.096µ ± 0%  -2.15% (p=0.000 n=10)
Hash1K/Sum224-8       4.193µ ± 0%   4.104µ ± 0%  -2.12% (p=0.000 n=10)
Hash1K/Sum256-8       4.194µ ± 0%   4.108µ ± 0%  -2.04% (p=0.000 n=10)
Hash8K/New-8          31.44µ ± 0%   30.76µ ± 0%  -2.17% (p=0.000 n=10)
Hash8K/Sum224-8       31.45µ ± 0%   30.79µ ± 0%  -2.10% (p=0.000 n=10)
Hash8K/Sum256-8       31.45µ ± 0%   30.78µ ± 0%  -2.12% (p=0.000 n=10)
Hash256K/New-8        996.7µ ± 0%   975.6µ ± 0%  -2.12% (p=0.000 n=10)
Hash256K/Sum224-8     996.8µ ± 0%   975.8µ ± 0%  -2.11% (p=0.000 n=10)
Hash256K/Sum256-8     996.8µ ± 0%   975.6µ ± 0%  -2.12% (p=0.000 n=10)
Hash1M/New-8          3.987m ± 0%   3.904m ± 0%  -2.08% (p=0.000 n=10)
Hash1M/Sum224-8       3.990m ± 0%   3.902m ± 0%  -2.20% (p=0.000 n=10)
Hash1M/Sum256-8       3.987m ± 0%   3.903m ± 0%  -2.10% (p=0.000 n=10)
geomean               43.59µ        42.69µ       -2.06%

                    │  sha256.old  │             sha256.new              │
                    │     B/s      │     B/s       vs base               │
Hash8Bytes/New-8      25.90Mi ± 0%   26.44Mi ± 0%  +2.06% (p=0.000 n=10)
Hash8Bytes/Sum224-8   25.01Mi ± 0%   25.46Mi ± 0%  +1.77% (p=0.000 n=10)
Hash8Bytes/Sum256-8   25.26Mi ± 0%   25.72Mi ± 0%  +1.79% (p=0.000 n=10)
Hash1K/New-8          233.3Mi ± 0%   238.5Mi ± 0%  +2.19% (p=0.000 n=10)
Hash1K/Sum224-8       232.9Mi ± 0%   238.0Mi ± 0%  +2.17% (p=0.000 n=10)
Hash1K/Sum256-8       232.9Mi ± 0%   237.7Mi ± 0%  +2.07% (p=0.000 n=10)
Hash8K/New-8          248.5Mi ± 0%   254.0Mi ± 0%  +2.22% (p=0.000 n=10)
Hash8K/Sum224-8       248.4Mi ± 0%   253.7Mi ± 0%  +2.14% (p=0.000 n=10)
Hash8K/Sum256-8       248.4Mi ± 0%   253.8Mi ± 0%  +2.17% (p=0.000 n=10)
Hash256K/New-8        250.8Mi ± 0%   256.3Mi ± 0%  +2.17% (p=0.000 n=10)
Hash256K/Sum224-8     250.8Mi ± 0%   256.2Mi ± 0%  +2.16% (p=0.000 n=10)
Hash256K/Sum256-8     250.8Mi ± 0%   256.2Mi ± 0%  +2.17% (p=0.000 n=10)
Hash1M/New-8          250.8Mi ± 0%   256.2Mi ± 0%  +2.12% (p=0.000 n=10)
Hash1M/Sum224-8       250.6Mi ± 0%   256.3Mi ± 0%  +2.25% (p=0.000 n=10)
Hash1M/Sum256-8       250.8Mi ± 0%   256.2Mi ± 0%  +2.14% (p=0.000 n=10)
geomean               156.0Mi        159.3Mi       +2.11%

Change-Id: Ib72cf3c746d4ad73e52e5d31f6b4a834fd36d934
Reviewed-on: https://go-review.googlesource.com/c/go/+/678435
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
Reviewed-by: Mark Freeman <markfreeman@google.com>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
2025-08-05 18:02:46 -07:00
Xiaolin Zhao
5defe8ebb3 internal/chacha8rand: replace WORD with instruction VMOVQ
Change-Id: I5d05af4d071b4b0ee60fafbd2a39494128bdf3f1
Reviewed-on: https://go-review.googlesource.com/c/go/+/682896
Reviewed-by: Mark Freeman <markfreeman@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
2025-08-05 18:02:27 -07:00
limeidan
4c7362e41c cmd/internal/obj/loong64: add new instructions ALSL{W/WU/V} for loong64
Go asm syntax:
	ALSL{W/WU/V}	$3, R4, R5, R6

Equivalent platform assembler syntax:
	alsl.{w/wu/d}	$r6, $r4, $r5, 3
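
For readers unfamiliar with the instruction, here is a rough Go model of what it computes. It assumes the usual LoongArch definition (shift the first source left by the immediate, add the second source, then sign- or zero-extend for the 32-bit forms), so with the operands above R6 = (R4 << 3) + R5. The function names are hypothetical and this is not the assembler's code.

```go
package main

import "fmt"

// alslV models ALSLV / alsl.d: a 64-bit shift-and-add.
func alslV(rj, rk uint64, sa uint) uint64 {
	return rj<<sa + rk
}

// alslW models ALSLW / alsl.w: a 32-bit shift-and-add whose result is
// sign-extended to 64 bits.
func alslW(rj, rk uint64, sa uint) uint64 {
	return uint64(int64(int32(uint32(rj)<<sa + uint32(rk))))
}

// alslWU models ALSLWU / alsl.wu: a 32-bit shift-and-add whose result is
// zero-extended to 64 bits.
func alslWU(rj, rk uint64, sa uint) uint64 {
	return uint64(uint32(rj)<<sa + uint32(rk))
}

func main() {
	// Typical use: base + index*8 address arithmetic.
	fmt.Println(alslV(5, 7, 3))
	fmt.Printf("%#x\n", alslW(0x10000000, 0, 3))
	fmt.Printf("%#x\n", alslWU(0x10000000, 0, 3))
}
```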

Change-Id: Ic8364dfe2753bcea7de6cffe656ca0dde6875766
Reviewed-on: https://go-review.googlesource.com/c/go/+/692136
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: Mark Freeman <markfreeman@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
2025-08-05 18:02:17 -07:00
Xiaolin Zhao
a552737418 cmd/compile: fold negation into multiplication on loong64
This change also adds corresponding benchmark tests and codegen tests.
The performance improvement on CPU Loongson-3A6000-HV is as follows:

goos: linux
goarch: loong64
pkg: cmd/compile/internal/test
cpu: Loongson-3A6000-HV @ 2500.00MHz
        |  bench.old   |              bench.new              |
        |    sec/op    |   sec/op     vs base                |
MulNeg     828.4n ± 0%   655.9n ± 0%  -20.82% (p=0.000 n=10)
Mul2Neg   1062.0n ± 0%   826.8n ± 0%  -22.15% (p=0.000 n=10)
geomean    938.0n        736.4n       -21.49%
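
The message does not list the rewrite rules, so the following is only a sketch of the algebra such a fold presumably relies on: under Go's wraparound integer arithmetic, (-x)*y equals -(x*y) and (-x)*(-y) equals x*y. The function names are hypothetical and merely echo the benchmark names.

```go
package main

import "fmt"

// mulNeg and mul2Neg are the source-level shapes; the *Folded variants are
// the equivalent forms with fewer negations.
func mulNeg(x, y int64) int64        { return -x * y }
func mulNegFolded(x, y int64) int64  { return -(x * y) }
func mul2Neg(x, y int64) int64       { return -x * -y }
func mul2NegFolded(x, y int64) int64 { return x * y }

// identitiesHold checks both identities for every pair drawn from vals.
func identitiesHold(vals []int64) bool {
	for _, x := range vals {
		for _, y := range vals {
			if mulNeg(x, y) != mulNegFolded(x, y) {
				return false
			}
			if mul2Neg(x, y) != mul2NegFolded(x, y) {
				return false
			}
		}
	}
	return true
}

func main() {
	// Includes the extremes, where negation itself wraps around.
	vals := []int64{0, 1, -1, 7, -13, 1 << 62, -1 << 63}
	fmt.Println(identitiesHold(vals))
}
```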

Change-Id: Ia999732880ec65be0c66cddc757a4868847e5b15
Reviewed-on: https://go-review.googlesource.com/c/go/+/682535
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Mark Freeman <markfreeman@google.com>
2025-08-05 18:02:06 -07:00
David Chase
7ca34599ec [dev.simd] simd, cmd/compile: generated files to add 'blend' and 'blendMasked'
Generated by arch/internal/simdgen CL 693175

These methods are not public because of simdgen-induced name/signature
issues, and because their addition was motivated by the need for
emulation tools.

The specific name/signature problems are:

1) one set of instructions has the "Masked" suffix (because of how
that is incorporated into names) and the other set does not (though I
suppose the operation could be renamed).

2) because the AVX2 instruction is bytes-only, to get the signature
right, requires "OverwriteBase" but OverwriteBase also requires
OverwriteClass and "simdgen does not support [OverwriteClass] in
inputs".

3) the default operation order is false, true, but we want this in a
"x.Merged(y, mask)" that pairs with "x.Masked(mask)" where the true
case is x and the false case is y/zero, but the default ordering for
VPBLENDVB and VPBLENDMB is false->x and true->y.

4) VPBLENDVB only comes in byte width, which causes problems
for floats.

All this may get fixed in the future; for now it is just an
implementation detail.
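
The operand-order mismatch in point 3 can be illustrated with a scalar emulation. This is a sketch only: vec, mask, blendHW, merged and masked are hypothetical names, four int32 lanes are an arbitrary choice, and none of this is the generated simd API.

```go
package main

import "fmt"

type vec [4]int32
type mask [4]bool

// blendHW models the default ordering described above:
// a false lane selects x, a true lane selects y.
func blendHW(x, y vec, m mask) vec {
	var out vec
	for i := range out {
		if m[i] {
			out[i] = y[i]
		} else {
			out[i] = x[i]
		}
	}
	return out
}

// merged models the desired x.Merged(y, mask): true keeps x, false takes y.
// It is the hardware-order blend with its operands swapped.
func merged(x, y vec, m mask) vec {
	return blendHW(y, x, m)
}

// masked models x.Masked(mask): true keeps x, false yields zero.
func masked(x vec, m mask) vec {
	return merged(x, vec{}, m)
}

func main() {
	x := vec{1, 2, 3, 4}
	y := vec{10, 20, 30, 40}
	m := mask{true, false, true, false}

	fmt.Println(merged(x, y, m))
	fmt.Println(masked(x, m))
}
```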

Change-Id: I61b655c7011e2c33f8644f704f886133c89d2f15
Reviewed-on: https://go-review.googlesource.com/c/go/+/693155
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
2025-08-05 17:43:49 -07:00
Tobias Klauser
e1fd4faf91 runtime: fix godoc comment for inVDSOPage
Change-Id: I7dcab0c915a748e52c5c689c1cb774f486d2b9e6
Reviewed-on: https://go-review.googlesource.com/c/go/+/693195
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Auto-Submit: Tobias Klauser <tobias.klauser@gmail.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2025-08-05 14:31:35 -07:00
Keith Randall
bcd25c79aa cmd/compile: allow StructSelect [x] of interface data fields for x>0
As of CL 681937 we can now have structs which are pointer shaped, but
their pointer field is not the first field, like struct{ struct{}; *int }.
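
A small program of the shape described, as a sketch: a struct whose only non-zero-size field is a pointer that comes after a zero-size field, stored in an interface and read back. The type wrapped and the helper names are hypothetical, named fields are used because source code cannot embed struct{} directly, and whether a given compiler version treats the type as pointer-shaped is an assumption here.

```go
package main

import "fmt"

// wrapped has a zero-size first field followed by a single pointer field.
type wrapped struct {
	_ struct{}
	p *int
}

// box stores the struct in an interface value.
//
//go:noinline
func box(w wrapped) any { return w }

// unbox extracts the struct again and reads through its pointer field.
//
//go:noinline
func unbox(v any) int { return *v.(wrapped).p }

// roundTrip sends n through an interface via the struct's pointer field.
func roundTrip(n int) int {
	return unbox(box(wrapped{p: &n}))
}

func main() {
	fmt.Println(roundTrip(42))
}
```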

Fixes #74888

Change-Id: Idc80f6b1abde3ae01437e2a9cadb5aa23d04b806
Reviewed-on: https://go-review.googlesource.com/c/go/+/693415
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-08-05 14:30:13 -07:00
Mark Freeman
b0945a54b5 cmd/dist, internal/platform: mark freebsd/riscv64 broken
It seems we have a builder, but it is not running correctly. Until
then, we should mark this port broken.

For #74734
For #74735

Change-Id: I536d037a43499cbd033fb6ebdf004a3df76332ab
Reviewed-on: https://go-review.googlesource.com/c/go/+/691835
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
2025-08-05 14:10:52 -07:00
Austin Clements
55d961b202 runtime: save AVX2 and AVX-512 state on asynchronous preemption
Based on CL 669415 by shaojunyang@google.com.

This is a cherry-pick of CL 680900 from the dev.simd branch.

Change-Id: I574f15c3b18a7179a1573aaf567caf18d8602ef1
Reviewed-on: https://go-review.googlesource.com/c/go/+/693397
Reviewed-by: Cherry Mui <cherryyz@google.com>
Auto-Submit: Austin Clements <austin@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-08-05 14:00:15 -07:00
Austin Clements
af0c4fe2ca runtime: save scalar registers off stack in amd64 async preemption
Asynchronous preemption must save all registers that could be in use
by Go code. Currently, it saves all of these to the goroutine stack.
As a result, the stack frame requirements of asynchronous preemption
can be rather high. On amd64, this requires 368 bytes of stack space,
most of which is the XMM registers. Several RISC architectures are
around 0.5 KiB.

As we add support for SIMD instructions, this is going to become a
problem. The AVX-512 register state is 2.5 KiB. This well exceeds the
nosplit limit, and even if it didn't, could constrain when we can
asynchronously preempt goroutines on small stacks.

This CL fixes this by moving pure scalar state stored in non-GP
registers off the stack and into an allocated "extended register
state" object. To reduce space overhead, we only allocate these
objects as needed. While in the theoretical limit, every G could need
this register state, in practice very few do at a time.

However, we can't allocate when we're in the middle of saving the
register state during an asynchronous preemption, so we reserve
scratch space on every P to temporarily store the register state,
which can then be copied out to an allocated state object later by Go
code.

This commit only implements this for amd64, since that's where we're
about to add much more vector state, but it lays the groundwork for
doing this on any architecture that could benefit.
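
The two-step scheme described above (save into per-P scratch space without allocating, then copy into a lazily allocated per-G object from ordinary Go code) might be modeled like this. All type and function names are hypothetical and the state size is illustrative; this is a sketch of the pattern, not the runtime's code.

```go
package main

import "fmt"

// xRegState is a hypothetical stand-in for the extended register state.
type xRegState struct {
	regs [32][64]byte
}

// p models a processor with preallocated scratch space.
type p struct {
	scratch xRegState
}

// g models a goroutine; xregs stays nil until the goroutine first needs it.
type g struct {
	xregs *xRegState
}

// asyncSave models the preemption path, which must not allocate:
// it only writes into the P's scratch space.
func asyncSave(pp *p, live *xRegState) {
	pp.scratch = *live
}

// publish models the later step in ordinary Go code, where allocation is
// allowed: it copies the scratch contents into the G's own state object.
func publish(gp *g, pp *p) {
	if gp.xregs == nil {
		gp.xregs = new(xRegState)
	}
	*gp.xregs = pp.scratch
}

func main() {
	var pp p
	preempted, idle := &g{}, &g{}

	live := xRegState{}
	live.regs[0][0] = 0xAB

	asyncSave(&pp, &live)
	publish(preempted, &pp)

	fmt.Println(preempted.xregs != nil, idle.xregs == nil)
}
```

Only goroutines that are actually preempted pay for a state object, which matches the message's point that very few need one at a time.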

This is a cherry-pick of CL 680898 plus bug fix CL 684836 from the
dev.simd branch.

Change-Id: I123a95e21c11d5c10942d70e27f84d2d99bbf735
Reviewed-on: https://go-review.googlesource.com/c/go/+/669195
Auto-Submit: Austin Clements <austin@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2025-08-05 13:58:52 -07:00
Austin Clements
e73afaae69 internal/cpu: add AVX-512-CD and DQ, and derived "basic AVX-512"
This adds detection for the CD and DQ sub-features of x86 AVX-512.

Building on these, we also add a "derived" AVX-512 feature that
bundles together the basic usable subset of subfeatures. Despite the F
in AVX-512-F standing for "foundation", AVX-512-F+BW+DQ+VL together
really form the basic usable subset of AVX-512 functionality. These
have also all been supported together by almost every CPU, and are
guaranteed by GOAMD64=v4, so there's little point in separating them
out.
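A derived feature of this kind is just the conjunction of the sub-feature bits. The sketch below assumes a hypothetical `x86Features` struct and `derive` method; the field names and the exact set of sub-features used in the real package may differ.

```go
package main

import "fmt"

// x86Features is a hypothetical feature table for this example.
type x86Features struct {
	HasAVX512F  bool
	HasAVX512BW bool
	HasAVX512DQ bool
	HasAVX512VL bool

	// HasAVX512 is derived: true only if the whole basic subset is present.
	HasAVX512 bool
}

// derive computes the bundled feature from the detected sub-features.
func (f *x86Features) derive() {
	f.HasAVX512 = f.HasAVX512F && f.HasAVX512BW && f.HasAVX512DQ && f.HasAVX512VL
}

func main() {
	f := x86Features{HasAVX512F: true, HasAVX512BW: true, HasAVX512DQ: true, HasAVX512VL: true}
	f.derive()
	fmt.Println(f.HasAVX512)
}
```

Callers can then test one flag instead of repeating the four-way check at every use site.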

This is a cherry-pick of CL 680899 from the dev.simd branch.

Change-Id: I34356502bd1853ba2372e48db0b10d55cffe07a1
Reviewed-on: https://go-review.googlesource.com/c/go/+/693396
Reviewed-by: Cherry Mui <cherryyz@google.com>
Auto-Submit: Austin Clements <austin@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-08-05 13:57:45 -07:00
Austin Clements
cef381ba60 runtime: eliminate global state in mkpreempt.go
We're going to start writing two files, so having a single global file
we're writing will be a problem.

This has no effect on the generated code.

This is a cherry-pick of CL 680897 from the dev.simd branch.

Change-Id: I49897ea0c6500a29eac89b597d75c0eb3e9b6706
Reviewed-on: https://go-review.googlesource.com/c/go/+/693395
Auto-Submit: Austin Clements <austin@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-08-05 13:55:01 -07:00
Mateusz Poliwczak
c0025d5e0b go/parser: correct comment in expectedErrors
If `here` were already the start of the comment, then
the `pos = here` assignment would be redundant, since pos
is already the start of the comment.

Change-Id: I793334988951ae5441327cb62d7524b423155b74
Reviewed-on: https://go-review.googlesource.com/c/go/+/693295
Reviewed-by: Alan Donovan <adonovan@google.com>
Commit-Queue: Alan Donovan <adonovan@google.com>
Reviewed-by: Robert Findley <rfindley@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Alan Donovan <adonovan@google.com>
2025-08-05 10:41:10 -07:00
qiulaidongfeng
4ee0df8c46 cmd: remove dead code
Fixes #74076

Change-Id: Icc67b3d4e342f329584433bd1250c56ae8f5a73d
Reviewed-on: https://go-review.googlesource.com/c/go/+/690635
Reviewed-by: Alan Donovan <adonovan@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Commit-Queue: Alan Donovan <adonovan@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Alan Donovan <adonovan@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
2025-08-05 10:31:25 -07:00
Michael Pratt
a2c45f0eb1 runtime: test VDSO symbol hash values
In addition to verifying existing values, this makes it easier to add a
new one by adding an invalid entry and running the test.
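For context, ELF symbol lookup conventionally uses two name hashes, the classic System V hash and the GNU hash. The sketch below shows both under the assumption that these are the hash values being tested; the function names `elfHash` and `gnuHash` are chosen for this example and are not taken from the runtime.

```go
package main

import "fmt"

// elfHash computes the classic System V ELF symbol hash.
func elfHash(name string) uint32 {
	var h uint32
	for i := 0; i < len(name); i++ {
		h = h<<4 + uint32(name[i])
		g := h & 0xf0000000
		if g != 0 {
			h ^= g >> 24
		}
		h &^= g // clear the top four bits
	}
	return h
}

// gnuHash computes the GNU-style symbol hash (h = h*33 + c, seeded with 5381).
func gnuHash(name string) uint32 {
	h := uint32(5381)
	for i := 0; i < len(name); i++ {
		h = h*33 + uint32(name[i])
	}
	return h
}

func main() {
	name := "__vdso_clock_gettime"
	fmt.Printf("%s: elf=%#x gnu=%#x\n", name, elfHash(name), gnuHash(name))
}
```

A table-driven test over precomputed values can then report the computed hash for any entry that does not match, which is how an intentionally invalid entry reveals the correct value.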

Change-Id: I6a6a636c9c413add29884e4f6759196f4db34de7
Reviewed-on: https://go-review.googlesource.com/c/go/+/693276
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2025-08-05 09:52:44 -07:00
Keith Randall
cd55f86b8d cmd/compile: allow multi-field structs to be stored directly in interfaces
This applies if the struct is a bunch of 0-sized fields and one pointer field.
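An example of the kind of type this covers, as a sketch: the `wrapper` type and helper functions below are hypothetical, and whether a given compiler version actually stores the value directly in the interface word is not observable from this program.

```go
package main

import (
	"fmt"
	"unsafe"
)

// wrapper has several zero-sized fields and exactly one pointer field,
// so its in-memory representation is a single pointer.
type wrapper struct {
	_ [0]int
	_ struct{}
	p *int
}

// pointerSized reports whether wrapper occupies exactly one pointer word.
func pointerSized() bool {
	return unsafe.Sizeof(wrapper{}) == unsafe.Sizeof(uintptr(0))
}

// roundTrip stores w in an interface and extracts the pointer again.
func roundTrip(w wrapper) *int {
	var i any = w
	return i.(wrapper).p
}

func main() {
	x := 42
	fmt.Println(pointerSized())
	fmt.Println(*roundTrip(wrapper{p: &x}))
}
```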

Fixes #74092

Change-Id: I87c5d162c8c9fdba812420d7f9d21de97295b62c
Reviewed-on: https://go-review.googlesource.com/c/go/+/681937
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@google.com>
2025-08-05 09:18:31 -07:00
Keith Randall
21ab0128b6 cmd/compile: remove support for old-style bounds check calls
This CL rips out the support for old-style assembly stubs.

We need to keep the Go stubs for wasm support.

Change-Id: I23d6d9f2f06be1ded8d22b3e0ef04ff6e252a587
Reviewed-on: https://go-review.googlesource.com/c/go/+/682402
Reviewed-by: Austin Clements <austin@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2025-08-05 08:59:28 -07:00
Keith Randall
802d056c78 cmd/compile: move ppc64 over to new bounds check strategy
Change-Id: I25a9bbc247b2490e7e37ed843386f53a71822146
Reviewed-on: https://go-review.googlesource.com/c/go/+/682498
Reviewed-by: Paul Murphy <paumurph@redhat.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Keith Randall <khr@google.com>
2025-08-05 08:59:16 -07:00
Daniel Morsing
a3295df873 cmd/compile/internal/ssa: Use transitive properties for len/cap
Remove the special casing for len/cap and rely on the posets.

After removing the special logic, I ran `go build -gcflags='-d
ssa/prove/debug=2' all` to verify my results. During this, I found 2
common cases where the old implicit unsigned->signed domain conversion
made proving a branch possible that shouldn't be strictly possible and
added these.

The 2 cases are shifting a non-negative signed integer and unsigned
comparisons that happen with arguments that fit entirely inside the
unsigned argument.
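As a rough illustration of the two shapes of code involved (an assumption about what the cases look like, not code from the CL), consider the functions below. The names `half` and `at` are hypothetical, and whether the bounds checks are actually eliminated depends on the compiler version; the prove debug flag quoted above is the way to check.

```go
package main

import "fmt"

// half indexes with a right-shifted non-negative signed integer.
// Since 0 <= i < len(s), i>>1 is also non-negative and no larger than i.
func half(s []byte, i int) byte {
	if i >= 0 && i < len(s) {
		return s[i>>1]
	}
	return 0
}

// at uses an unsigned comparison on a value already known to be
// non-negative, so the signed and unsigned orderings agree.
func at(s []byte, i int) byte {
	if i >= 0 && uint(i) < uint(len(s)) {
		return s[i]
	}
	return 0
}

func main() {
	s := []byte{10, 20, 30}
	fmt.Println(half(s, 2), at(s, 2), at(s, 5))
}
```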

Change-Id: Ic88049ff69efc5602fc15f5dad02028e704f5483
Reviewed-on: https://go-review.googlesource.com/c/go/+/679155
Reviewed-by: Mark Freeman <mark@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@golang.org>
2025-08-05 08:58:11 -07:00
Derek Parker
bd082857a5 doc: fix typo in go memory model doc
Fixes a typo where originally "may by" was written where the intent was
"may be".

Change-Id: Ia5ba51a966506395c41b17ca28d59f63bd487f3e
Reviewed-on: https://go-review.googlesource.com/c/go/+/693075
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Auto-Submit: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Bypass: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-08-05 08:41:37 -07:00
Cuong Manh Le
2b622b05a9 cmd/compile: remove isUintXPowerOfTwo functions
And use the generic version instead.

While at it, also correct the corresponding rules to use logXu variants
instead of logX, following discussion in CL 689815.

Change-Id: Iba85d14ff0e26d45a126764e7bd5702586358d23
Reviewed-on: https://go-review.googlesource.com/c/go/+/692917
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
2025-08-05 08:37:45 -07:00
Junyang Shao
82d056ddd7 [dev.simd] cmd/compile: add ShiftAll immediate variant
This CL is generated by CL 693136.

Change-Id: Ifd2278d3f927efa008a14cc5e592e7c14b7120ff
Reviewed-on: https://go-review.googlesource.com/c/go/+/693157
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: David Chase <drchase@google.com>
2025-08-05 08:37:44 -07:00
Cuong Manh Le
72147ffa75 cmd/compile: simplify isUintXPowerOfTwo implementation
By calling isUnsignedPowerOfTwo instead of duplicating the same ones.

Change-Id: I1e29d3b7eda1bc8773fcd25728d8f508ae633ac9
Reviewed-on: https://go-review.googlesource.com/c/go/+/692916
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-08-05 08:34:05 -07:00
Cuong Manh Le
26da1199eb cmd/compile: make isUint{32,64}PowerOfTwo implementations clearer
Since these functions cast the input to uint64, the result is always
non-negative. The condition should be changed to compare with zero,
making it clearer to the reader and opening room for simplification in
the future by using the generic isUnsignedPowerOfTwo function.
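For reference, a generic unsigned power-of-two check can be written as below. This is a minimal sketch: the function name matches the one mentioned above, but the constraint name `unsigned` and the body are assumptions for illustration, not a copy of the compiler's source.

```go
package main

import "fmt"

// unsigned is a hypothetical constraint covering the fixed-size unsigned types.
type unsigned interface {
	~uint8 | ~uint16 | ~uint32 | ~uint64
}

// isUnsignedPowerOfTwo reports whether n has exactly one bit set.
// For unsigned values the only non-positive case is zero, so a
// comparison against zero is all that is needed.
func isUnsignedPowerOfTwo[T unsigned](n T) bool {
	return n != 0 && n&(n-1) == 0
}

func main() {
	fmt.Println(isUnsignedPowerOfTwo(uint32(8)), isUnsignedPowerOfTwo(uint32(6)))
}
```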

Separated this change so that it's easier to bisect if any problems
arise.

Change-Id: Ibec28c2590f4c52caa36384b710d526459725e49
Reviewed-on: https://go-review.googlesource.com/c/go/+/692915
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
2025-08-05 08:32:51 -07:00
Cuong Manh Le
5ab9f23977 cmd/compile, runtime: add checkptr instrumentation for unsafe.Add
Fixes #74431

Change-Id: Id651ea0b82599ccaff8816af0a56ddbb149b6f89
Reviewed-on: https://go-review.googlesource.com/c/go/+/692015
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@golang.org>
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: t hepudds <thepudds1460@gmail.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
2025-08-05 08:28:50 -07:00
Michael Munday
fcc036f03b cmd/compile: optimise float <-> int register moves on riscv64
Use the FMV* instructions to move values between the floating point and
integer register files.
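The Go-level operations that reduce to such register moves are the bit-reinterpretation functions in package math. A small example follows; the helper names `bitsOf` and `roundTrip` are chosen for this sketch.

```go
package main

import (
	"fmt"
	"math"
)

// bitsOf reinterprets a float64 as its IEEE 754 bit pattern.
// No numeric conversion takes place; only the register file changes.
func bitsOf(f float64) uint64 {
	return math.Float64bits(f)
}

// roundTrip moves a value to the integer side and back.
func roundTrip(f float64) float64 {
	return math.Float64frombits(math.Float64bits(f))
}

func main() {
	fmt.Printf("%#x\n", bitsOf(1.0))
	fmt.Println(roundTrip(1.5))
}
```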

Note: I'm unsure why there is a slowdown in the Float32bits benchmark.
I've checked, and an FMVXS instruction is being used as expected. There
are multiple loads and other instructions in the main loop.

goos: linux
goarch: riscv64
pkg: math
cpu: Spacemit(R) X60
                    │ fmv-before.txt │            fmv-after.txt            │
                    │     sec/op     │   sec/op     vs base                │
Acos                     122.7n ± 0%   122.7n ± 0%        ~ (p=1.000 n=10)
Acosh                    197.2n ± 0%   191.5n ± 0%   -2.89% (p=0.000 n=10)
Asin                     122.7n ± 0%   122.7n ± 0%        ~ (p=0.474 n=10)
Asinh                    231.0n ± 0%   224.1n ± 0%   -2.99% (p=0.000 n=10)
Atan                     91.39n ± 0%   91.41n ± 0%        ~ (p=0.465 n=10)
Atanh                    210.3n ± 0%   203.4n ± 0%   -3.26% (p=0.000 n=10)
Atan2                    149.6n ± 0%   149.6n ± 0%        ~ (p=0.721 n=10)
Cbrt                     176.5n ± 0%   165.9n ± 0%   -6.01% (p=0.000 n=10)
Ceil                     25.67n ± 0%   24.42n ± 0%   -4.87% (p=0.000 n=10)
Copysign                 3.756n ± 0%   3.756n ± 0%        ~ (p=0.149 n=10)
Cos                      95.15n ± 0%   95.15n ± 0%        ~ (p=0.374 n=10)
Cosh                     228.6n ± 0%   224.7n ± 0%   -1.71% (p=0.000 n=10)
Erf                      115.2n ± 0%   115.2n ± 0%        ~ (p=0.474 n=10)
Erfc                     116.4n ± 0%   116.4n ± 0%        ~ (p=0.628 n=10)
Erfinv                   133.3n ± 0%   133.3n ± 0%        ~ (p=1.000 n=10)
Erfcinv                  133.3n ± 0%   133.3n ± 0%        ~ (p=1.000 n=10)
Exp                      194.1n ± 0%   190.3n ± 0%   -1.93% (p=0.000 n=10)
ExpGo                    204.7n ± 0%   200.3n ± 0%   -2.15% (p=0.000 n=10)
Expm1                    137.7n ± 0%   135.2n ± 0%   -1.82% (p=0.000 n=10)
Exp2                     173.4n ± 0%   169.0n ± 0%   -2.54% (p=0.000 n=10)
Exp2Go                   182.8n ± 0%   178.4n ± 0%   -2.41% (p=0.000 n=10)
Abs                      3.756n ± 0%   3.756n ± 0%        ~ (p=0.157 n=10)
Dim                      12.52n ± 0%   12.52n ± 0%        ~ (p=0.737 n=10)
Floor                    25.67n ± 0%   24.42n ± 0%   -4.87% (p=0.000 n=10)
Max                      21.29n ± 0%   20.03n ± 0%   -5.92% (p=0.000 n=10)
Min                      21.28n ± 0%   20.04n ± 0%   -5.85% (p=0.000 n=10)
Mod                      344.9n ± 0%   319.2n ± 0%   -7.45% (p=0.000 n=10)
Frexp                    55.71n ± 0%   48.85n ± 0%  -12.30% (p=0.000 n=10)
Gamma                    165.9n ± 0%   167.8n ± 0%   +1.15% (p=0.000 n=10)
Hypot                    73.24n ± 0%   70.74n ± 0%   -3.41% (p=0.000 n=10)
HypotGo                  84.50n ± 0%   82.63n ± 0%   -2.21% (p=0.000 n=10)
Ilogb                    49.45n ± 0%   45.70n ± 0%   -7.59% (p=0.000 n=10)
J0                       556.5n ± 0%   544.0n ± 0%   -2.25% (p=0.000 n=10)
J1                       555.3n ± 0%   542.8n ± 0%   -2.24% (p=0.000 n=10)
Jn                       1.181µ ± 0%   1.156µ ± 0%   -2.12% (p=0.000 n=10)
Ldexp                    59.47n ± 0%   53.84n ± 0%   -9.47% (p=0.000 n=10)
Lgamma                   167.2n ± 0%   154.6n ± 0%   -7.51% (p=0.000 n=10)
Log                      160.9n ± 0%   154.6n ± 0%   -3.92% (p=0.000 n=10)
Logb                     49.45n ± 0%   45.70n ± 0%   -7.58% (p=0.000 n=10)
Log1p                    147.1n ± 0%   137.1n ± 0%   -6.80% (p=0.000 n=10)
Log10                    162.1n ± 1%   154.6n ± 0%   -4.63% (p=0.000 n=10)
Log2                     66.99n ± 0%   60.72n ± 0%   -9.36% (p=0.000 n=10)
Modf                     29.42n ± 0%   26.29n ± 0%  -10.64% (p=0.000 n=10)
Nextafter32              41.95n ± 0%   37.88n ± 0%   -9.70% (p=0.000 n=10)
Nextafter64              38.82n ± 0%   33.49n ± 0%  -13.73% (p=0.000 n=10)
PowInt                   252.3n ± 0%   237.3n ± 0%   -5.95% (p=0.000 n=10)
PowFrac                  615.5n ± 0%   589.7n ± 0%   -4.19% (p=0.000 n=10)
Pow10Pos                 10.64n ± 0%   10.64n ± 0%        ~ (p=1.000 n=10)
Pow10Neg                 24.42n ± 0%   15.02n ± 0%  -38.49% (p=0.000 n=10)
Round                    21.91n ± 0%   18.16n ± 0%  -17.12% (p=0.000 n=10)
RoundToEven              24.42n ± 0%   21.29n ± 0%  -12.84% (p=0.000 n=10)
Remainder                308.0n ± 0%   291.2n ± 0%   -5.44% (p=0.000 n=10)
Signbit                  10.02n ± 0%   10.02n ± 0%        ~ (p=1.000 n=10)
Sin                      102.7n ± 0%   102.7n ± 0%        ~ (p=0.211 n=10)
Sincos                   124.0n ± 1%   123.3n ± 0%   -0.56% (p=0.002 n=10)
Sinh                     239.1n ± 0%   234.7n ± 0%   -1.84% (p=0.000 n=10)
SqrtIndirect             2.504n ± 0%   2.504n ± 0%        ~ (p=0.303 n=10)
SqrtLatency              15.03n ± 0%   15.02n ± 0%        ~ (p=0.598 n=10)
SqrtIndirectLatency      15.02n ± 0%   15.02n ± 0%        ~ (p=0.907 n=10)
SqrtGoLatency            165.3n ± 0%   157.2n ± 0%   -4.90% (p=0.000 n=10)
SqrtPrime                3.801µ ± 0%   3.802µ ± 0%        ~ (p=1.000 n=10)
Tan                      125.2n ± 0%   125.2n ± 0%        ~ (p=0.458 n=10)
Tanh                     244.2n ± 0%   239.9n ± 0%   -1.76% (p=0.000 n=10)
Trunc                    25.67n ± 0%   24.42n ± 0%   -4.87% (p=0.000 n=10)
Y0                       550.2n ± 0%   538.1n ± 0%   -2.21% (p=0.000 n=10)
Y1                       552.8n ± 0%   540.6n ± 0%   -2.21% (p=0.000 n=10)
Yn                       1.168µ ± 0%   1.143µ ± 0%   -2.14% (p=0.000 n=10)
Float64bits              8.139n ± 0%   4.385n ± 0%  -46.13% (p=0.000 n=10)
Float64frombits          7.512n ± 0%   3.759n ± 0%  -49.96% (p=0.000 n=10)
Float32bits              8.138n ± 0%   9.393n ± 0%  +15.42% (p=0.000 n=10)
Float32frombits          7.513n ± 0%   3.757n ± 0%  -49.98% (p=0.000 n=10)
FMA                      3.756n ± 0%   3.756n ± 0%        ~ (p=0.246 n=10)
geomean                  77.43n        72.42n        -6.47%

Change-Id: I8dac69b1d17cb3d2af78d1c844d2b5d80000d667
Reviewed-on: https://go-review.googlesource.com/c/go/+/599235
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Michael Munday <mikemndy@gmail.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@golang.org>
2025-08-05 08:27:15 -07:00
Cherry Mui
775fb52745 [dev.simd] all: merge master (7a1679d) into dev.simd
Conflicts:

- src/cmd/compile/internal/amd64/ssa.go
- src/cmd/compile/internal/ssa/rewriteAMD64.go
- src/internal/buildcfg/exp.go
- src/internal/cpu/cpu.go
- src/internal/cpu/cpu_x86.go
- src/internal/goexperiment/flags.go

Merge List:

+ 2025-08-04 7a1679d7ae cmd/compile: move s390x over to new bounds check strategy
+ 2025-08-04 95693816a5 cmd/compile: move riscv64 over to new bounds check strategy
+ 2025-08-04 d7bd7773eb go/parser: remove safePos
+ 2025-08-04 4b6cbc377f cmd/cgo/internal/test: use (syntactic) constant for C array bound
+ 2025-08-03 b2960e3580 cmd/internal/obj/loong64: add {V,XV}{BITCLR/BITSET/BITREV}[I].{B/H/W/D} instructions support
+ 2025-08-03 abeeef1c08 cmd/compile/internal/test: fix typo in comments
+ 2025-08-03 d44749b65b cmd/internal/obj/loong64: add [X]VLDREPL.{B/H/W/D} instructions support
+ 2025-08-03 d6beda863e runtime: add reference to debugPinnerV1
+ 2025-08-01 4ab1aec007 cmd/go: modload should use a read-write lock to improve concurrency
+ 2025-08-01 e666972a67 runtime: deduplicate Windows stdcall
+ 2025-08-01 ef40549786 runtime,syscall: move loadlibrary and getprocaddress to syscall
+ 2025-08-01 336931a4ca cmd/go: use os.Rename to move files on Windows
+ 2025-08-01 eef5f8d930 cmd/compile: enforce that locals are always accessed with SP base register
+ 2025-08-01 e071617222 cmd/compile: optimize multiplication rules on loong64
+ 2025-07-31 eb7f515c4d cmd/compile: use generated loops instead of DUFFZERO on amd64
+ 2025-07-31 c0ee2fd4e3 cmd/go: explicitly reject module paths "go" and "toolchain"
+ 2025-07-30 a4d99770c0 runtime/metrics: add cleanup and finalizer queue metrics
+ 2025-07-30 70a2ff7648 runtime: add cgo call benchmark
+ 2025-07-30 69338a335a cmd/go/internal/gover: fix ModIsPrerelease for toolchain versions
+ 2025-07-30 cedf63616a cmd/compile: add floating point min/max intrinsics on s390x
+ 2025-07-30 82a1921c3b all: remove redundant Swiss prefixes
+ 2025-07-30 2ae059ccaf all: remove GOEXPERIMENT=swissmap
+ 2025-07-30 cc571dab91 cmd/compile: deduplicate instructions when rewrite func results
+ 2025-07-30 2174a7936c crypto/tls: use standard chacha20-poly1305 cipher suite names
+ 2025-07-30 8330fb48a6 cmd/compile: move mips32 over to new bounds check strategy
+ 2025-07-30 9f9d7b50e8 cmd/compile: move mips64 over to new bounds check strategy
+ 2025-07-30 5216fd570e cmd/compile: move loong64 over to new bounds check strategy
+ 2025-07-30 89a0af86b8 cmd/compile: allow ops to specify clobbering input registers
+ 2025-07-30 5e94d72158 cmd/compile: simplify zerorange on arm64
+ 2025-07-30 8cd85e602a cmd/compile: check domination of loop return in both controls
+ 2025-07-30 cefaed0de0 reflect: fix noswiss builder
+ 2025-07-30 3aa1b00081 regexp: fix compiling alternate patterns of different fold case literals
+ 2025-07-30 b1e933d955 cmd/compile: avoid extending when already sufficiently masked on loong64
+ 2025-07-29 880ca333d7 cmd/compile: removing log2uint32 function
+ 2025-07-29 1513661dc3 cmd/compile: simplify logX implementations
+ 2025-07-29 bd94ae8903 cmd/compile: use unsigned power-of-two detector for unsigned mod
+ 2025-07-29 f3582fc80e cmd/compile: add unsigned power-of-two detector
+ 2025-07-29 f7d167fe71 internal/abi: move direct/indirect flag from Kind to TFlag
+ 2025-07-29 e0b07dc22e os/exec: fix incorrect expansion of "", "." and ".." in LookPath
+ 2025-07-29 25816d401c internal/goexperiment: delete RangeFunc goexperiment
+ 2025-07-29 7961bf71f8 internal/goexperiment: delete CacheProg goexperiment
+ 2025-07-29 e15a14c4dd sync: remove synchashtriemap GOEXPERIMENT
+ 2025-07-29 7dccd6395c cmd/compile: move arm32 over to new bounds check strategy
+ 2025-07-29 d79405a344 runtime: only deduct assist credit for arenas during GC
+ 2025-07-29 19a086f716 cmd/go/internal/telemetrystats: count goexperiments
+ 2025-07-29 aa95ab8215 image: fix formatting of godoc link
+ 2025-07-29 4c854b7a3e crypto/elliptic: change a variable name that have the same name as keywords
+ 2025-07-28 b10eb1d042 cmd/compile: simplify zerorange on amd64
+ 2025-07-28 f8eae7a3c3 os/user: fix tests to pass on non-english Windows
+ 2025-07-28 0984264471 internal/poll: remove msg field from Windows' poll.operation
+ 2025-07-28 d7b4114346 internal/poll: remove rsan field from Windows' poll.operation
+ 2025-07-28 361b1ab41f internal/poll: remove sa field from Windows' poll.operation
+ 2025-07-28 9b6bd64e46 internal/poll: remove qty and flags fields from Windows' poll.operation
+ 2025-07-28 cd3655a824 internal/runtime/maps: fix spelling errors in comments
+ 2025-07-28 d5dc36af45 runtime: remove openbsd/mips64 related code
+ 2025-07-28 64ba72474d errors: omit redundant nil check in type assertion for Join
+ 2025-07-28 e151db3e06 all: omit unnecessary type conversions
+ 2025-07-28 4569255f8c cmd/compile: cleanup SelectN rules by indexing into args
+ 2025-07-28 94645d2413 cmd/compile: rewrite cmov(x, x, cond) into x
+ 2025-07-28 10c5cf68d4 net/http: add proper panic message
+ 2025-07-28 46b5839231 test/codegen: fix failing condmove wasm tests
+ 2025-07-28 98f301cf68 runtime,syscall: move SyscallX implementations from runtime to syscall
+ 2025-07-28 c7ed3a1c5a internal/runtime/syscall/windows: factor out code from runtime
+ 2025-07-28 e81eac19d3 hash/crc32: fix incorrect checksums with avx512+race
+ 2025-07-25 6fbad4be75 cmd/compile: remove no-longer-necessary call to calculateDepths
+ 2025-07-25 5045fdd8ff cmd/compile: fix containsUnavoidableCall computation
+ 2025-07-25 d28b27cd8e go/types, types2: use nil to represent incomplete explicit aliases
+ 2025-07-25 7b53d8d06e cmd/compile/internal/types2: add loaded state between loader calls and constraint expansion
+ 2025-07-25 374e3be2eb os/user: user random name for the test user account
+ 2025-07-25 1aa154621d runtime: rename scanobject to scanObject
+ 2025-07-25 41b429881a runtime: duplicate scanobject in greentea and non-greentea files
+ 2025-07-25 aeb256e98a cmd/compile: remove unused arg from gorecover
+ 2025-07-25 08376e1a9c runtime: iterate through inlinings when processing recover()
+ 2025-07-25 c76c3abc54 encoding/json: fix truncated Token error regression in goexperiment.jsonv2
+ 2025-07-25 ebdbfccd98 encoding/json/jsontext: preserve buffer capacity in Encoder.Reset
+ 2025-07-25 91c4f0ccd5 reflect: avoid a bounds check in stack-constrained code
+ 2025-07-24 3636ced112 encoding/json: fix extra data regression under goexperiment.jsonv2
+ 2025-07-24 a6eec8bdc7 encoding/json: reduce error text regressions under goexperiment.jsonv2
+ 2025-07-24 0fa88dec1e time: remove redundant uint32 conversion in split
+ 2025-07-24 ada30b8248 internal/buildcfg: add ability to get GORISCV64 variable in GOGOARCH
+ 2025-07-24 6f6c6c5782 cmd/internal/obj: rip out argp adjustment for wrapper frames
+ 2025-07-24 7b50024330 runtime: detect successful recovers differently
+ 2025-07-24 7b9de668bd unicode/utf8: skip ahead during ascii runs in Valid/ValidString
+ 2025-07-24 076eae436e cmd/compile: move amd64 and 386 over to new bounds check strategy
+ 2025-07-24 f703dc5bef cmd/compile: add missing StringLen rule in prove
+ 2025-07-24 394d0bee8d cmd/compile: move arm64 over to new bounds check strategy
+ 2025-07-24 3024785b92 cmd/compile,runtime: remember idx+len for bounds check failure with less code
+ 2025-07-24 741a19ab41 runtime: move bounds check constants to internal/abi
+ 2025-07-24 ce05ad448f cmd/compile: rewrite condselects into doublings and halvings
+ 2025-07-24 fcd28070fe cmd/compile: add opt branchelim to rewrite some CondSelect into math
+ 2025-07-24 f32cf8e4b0 cmd/compile: learn transitive proofs for safe unsigned subs
+ 2025-07-24 d574856482 cmd/compile: learn transitive proofs for safe negative signed adds
+ 2025-07-24 1a72920f09 cmd/compile: learn transitive proofs for safe positive signed adds
+ 2025-07-24 e5f202bb60 cmd/compile: learn transitive proofs for safe unsigned adds
+ 2025-07-24 bd80f74bc1 cmd/compile: fold shift through AND for slice operations
+ 2025-07-24 5c45fe1385 internal/runtime/syscall: rename to internal/runtime/syscall/linux
+ 2025-07-24 592c2db868 cmd/compile: improve loopRotate to handle nested loops
+ 2025-07-24 dcb479c2f9 cmd/compile: optimize slice bounds checking with SUB/SUBconst comparisons
+ 2025-07-24 f11599b0b9 internal/poll: remove handle field from Windows' poll.operation
+ 2025-07-24 f7432e0230 internal/poll: remove fd field from Windows' poll.operation
+ 2025-07-24 e84ed38641 runtime: add benchmark for small-size memmory operation
+ 2025-07-24 18dbe5b941 hash/crc32: add AVX512 IEEE CRC32 calculation
+ 2025-07-24 c641900f72 cmd/compile: prefer base.Fatalf to panic in dwarfgen
+ 2025-07-24 d71d8aeafd cmd/internal/obj/s390x: add MVCLE instruction
+ 2025-07-24 b6cf1d94dc runtime: optimize memclr on mips64x
+ 2025-07-24 a8edd99479 runtime: improvement in memclr for s390x
+ 2025-07-24 bd04f65511 internal/runtime/exithook: fix a typo
+ 2025-07-24 5c8624a396 cmd/internal/goobj: make error output clear
+ 2025-07-24 44d73dfb4e cmd/go/internal/doc: clean up after merge with cmd/internal/doc
+ 2025-07-24 bd446662dd cmd/internal/doc: merge with cmd/go/internal/doc
+ 2025-07-24 da8b50c830 cmd/doc: delete
+ 2025-07-24 6669aa3b14 runtime: randomize heap base address
+ 2025-07-24 26338a7f69 cmd/compile: use better fatal message for staticValue1
+ 2025-07-24 8587ba272e cmd/cgo: compare malloc return value to NULL instead of literal 0
+ 2025-07-24 cae45167b7 go/types, types2: better error messages for certain type mismatches
+ 2025-07-24 2ddf542e4c cmd/compile: use ,ok return idiom for sparsemap.get
+ 2025-07-24 6505fcbd0a cmd/compile: use generics for sparse map
+ 2025-07-24 14f5eb7812 cmd/api: rerun updategolden
+ 2025-07-24 52b6d7f67a runtime: drop NetBSD kernel bug sysmon workaround fixed in NetBSD 9.2
+ 2025-07-24 1ebebf1cc1 cmd/go: clean should respect workspaces
+ 2025-07-24 6536a93547 encoding/json/jsontext: preserve buffer capacity in Decoder.Reset
+ 2025-07-24 efc37e97c0 cmd/go: always return the cached path from go tool -n
+ 2025-07-23 98a031193b runtime: check TestUsingVDSO ExitError type assertion
+ 2025-07-23 6bb42997c8 doc/next: initialize
+ 2025-07-23 2696a11a97 internal/goversion: update Version to 1.26
+ 2025-07-23 489868f776 cmd/link: scope test to linux & net.sendFile
+ 2025-07-22 71c2bf5513 cmd/compile: fix loclist for heap return vars without optimizations
+ 2025-07-22 c74399e7f5 net: correct comment for ListenConfig.ListenPacket
+ 2025-07-22 4ed9943b26 all: go fmt
+ 2025-07-22 1aaf7422f1 cmd/internal/objabi: remove redundant word in comment
+ 2025-07-21 d5ec0815e6 runtime: relax TestMemoryLimitNoGCPercent a bit
+ 2025-07-21 f7cc61e7d7 cmd/compile: for arm64 epilog, do SP increment with a single instruction
+ 2025-07-21 5dac42363b runtime: fix asan wrapper for riscv64
+ 2025-07-21 e5502e0959 cmd/go: check subcommand properties
+ 2025-07-19 2363897932 cmd/internal/obj: enable got pcrel itype in fips140 for riscv64
+ 2025-07-19 e32255fcc0 cmd/compile/internal/ssa: restrict architectures for TestDebugLines_74576
+ 2025-07-18 0451816430 os: revert the use of AddCleanup to close files and roots
+ 2025-07-18 34b70684ba go/types: infer correct type for y in append(bytes, y...)
+ 2025-07-17 66536242fc cmd/compile/internal/escape: improve DWARF .debug_line numbering for literal rewriting optimizations
+ 2025-07-16 385000b004 runtime: fix idle time double-counting bug
+ 2025-07-16 f506ad2644 cmd/compile/internal/escape: speed up analyzing some functions with many closures
+ 2025-07-16 9c507e7942 cmd/link, runtime: on Wasm, put only function index in method table and func table
+ 2025-07-16 9782dcfd16 runtime: use 32-bit function index on Wasm
+ 2025-07-16 c876bf9346 cmd/internal/obj/wasm: use 64-bit instructions for indirect calls
+ 2025-07-15 b4309ece66 cmd/internal/doc: upgrade godoc pkgsite to 01b046e
+ 2025-07-15 75a19dbcd7 runtime: use memclrNoHeapPointers to clear inline mark bits
+ 2025-07-15 6d4a91c7a5 runtime: only clear inline mark bits on span alloc if necessary
+ 2025-07-15 0c6296ab12 runtime: have mergeInlineMarkBits also clear the inline mark bits
+ 2025-07-15 397d2117ec runtime: merge inline mark bits with gcmarkBits 8 bytes at a time
+ 2025-07-15 7dceabd3be runtime/maps: fix typo in group.go comment (instrinsified -> intrinsified)
+ 2025-07-15 d826bf4d74 os: remove useless error check
+ 2025-07-14 bb07e55aff runtime: expand GOMAXPROCS documentation
+ 2025-07-14 9159cd4ec6 encoding/json: decompose legacy options
+ 2025-07-14 c6556b8eb3 encoding/json/v2: add security section to doc
+ 2025-07-11 6ebb5f56d9 runtime: gofmt after CL 643897 and CL 662455
+ 2025-07-11 1e48ca7020 encoding/json: remove legacy option to EscapeInvalidUTF8
+ 2025-07-11 a0a99cb22b encoding/json/v2: report wrapped io.ErrUnexpectedEOF
+ 2025-07-11 9d04122d24 crypto/rsa: drop contradictory promise to keep PublicKey modulus secret
+ 2025-07-11 1ca23682dd crypto/rsa: fix documentation formatting
+ 2025-07-11 4bc3373c8e runtime: turn off large memmove tests under asan/msan

Change-Id: I1e32d964eba770b85421efb86b305a2242f24466
2025-08-04 15:07:05 -04:00
David Chase
6b9b59e144 [dev.simd] simd, cmd/compile: rename some methods
generated by simdgen CL 692556

these are the "easy" ones
SaturatedOp -> OpSaturated
PairwiseOp -> OpPairs
OpWithPrecision -> OpScaled
DiffWithOpWithPrecision -> OpScaledResidue

Change-Id: I036bf89c0690bcf9922c376d62cef48392942af3
Reviewed-on: https://go-review.googlesource.com/c/go/+/692357
Reviewed-by: Junyang Shao <shaojunyang@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-08-04 11:53:11 -07:00
David Chase
d375b95357 [dev.simd] simd: move lots of slice functions and methods to generated code
Lots of handwritten/stenciled code is now untouched by human hands.

For certain combinations of operation-arity and type, there
is an option to use a flaky version of a test helper that only
requires "close enough".  For example:

testFloat32x4TernaryFlaky(t, simd.Float32x4.FusedMultiplyAdd, fmaSlice[float32], 0.001)

Some of the quirkier operations have their behavior captured
in their test-simulation, for example, ceilResidue regards
infinities as integers (therefore their residue is zero).
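A "close enough" comparison and a simulation with the infinity quirk might look like the following. This is a sketch under assumptions: `closeEnough` is a hypothetical helper using a relative tolerance, and the definition of `ceilResidue` as x minus its ceiling is a guess at the sign convention; only the zero-residue-for-infinities behavior comes from the description above.

```go
package main

import (
	"fmt"
	"math"
)

// posInf is a convenience value for exercising the infinity case.
var posInf = math.Inf(1)

// closeEnough reports whether got is within relative tolerance tol of want.
func closeEnough(got, want, tol float64) bool {
	if got == want {
		return true // exact matches, including equal infinities
	}
	if math.IsNaN(got) || math.IsNaN(want) {
		return math.IsNaN(got) && math.IsNaN(want)
	}
	if math.IsInf(got, 0) || math.IsInf(want, 0) {
		return false // an infinity is only close to the same infinity
	}
	return math.Abs(got-want) <= tol*math.Max(math.Abs(got), math.Abs(want))
}

// ceilResidue models x - ceil(x), treating infinities as integers,
// so their residue is zero rather than NaN.
func ceilResidue(x float64) float64 {
	if math.IsInf(x, 0) {
		return 0
	}
	return x - math.Ceil(x)
}

func main() {
	fmt.Println(closeEnough(1.0005, 1.0, 0.001))
	fmt.Println(ceilResidue(2.5), ceilResidue(posInf))
}
```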

Change-Id: I8242914e5ab399edbe226da8586988441cffa83f
Reviewed-on: https://go-review.googlesource.com/c/go/+/690575
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2025-08-04 11:52:48 -07:00
Junyang Shao
3f92aa1eca [dev.simd] cmd/compile, simd: make bitwise logic ops available to all u?int vectors
This CL is generated by CL 692555.

Change-Id: I24e6de83e0408576f385a1c8e861b08c583f9098
Reviewed-on: https://go-review.googlesource.com/c/go/+/692356
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-08-04 11:23:40 -07:00
Keith Randall
7a1679d7ae cmd/compile: move s390x over to new bounds check strategy
Change-Id: I86ed1a60165b729bb88a8a418da0ea1b59b3dc10
Reviewed-on: https://go-review.googlesource.com/c/go/+/682499
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Mauri de Souza Meneguzzo <mauri870@gmail.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Michael Munday <mikemndy@gmail.com>
Reviewed-by: Mark Freeman <mark@golang.org>
2025-08-04 10:08:22 -07:00
Keith Randall
95693816a5 cmd/compile: move riscv64 over to new bounds check strategy
Change-Id: Idd9eaf051aa57f7fef7049c12085926030c35d70
Reviewed-on: https://go-review.googlesource.com/c/go/+/682401
Reviewed-by: Mark Freeman <mark@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Joel Sing <joel@sing.id.au>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-08-04 10:08:13 -07:00
Junyang Shao
c2d775d401 [dev.simd] cmd/compile, simd: change PairDotProdAccumulate to AddDotProd
This CL is generated by CL 692219.

Change-Id: I50fa919f1edc5c6505bc6d3238f65b37fc7628b5
Reviewed-on: https://go-review.googlesource.com/c/go/+/692156
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
2025-08-04 09:52:09 -07:00
Junyang Shao
2c25f3e846 [dev.simd] cmd/compile, simd: change Shift*AndFillUpperFrom to Shift*Concat
This CL is generated by CL 692216.

Change-Id: Ib7530142bcce2a23f90d48866271994c57561955
Reviewed-on: https://go-review.googlesource.com/c/go/+/692215
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-08-04 09:52:05 -07:00
Mateusz Poliwczak
d7bd7773eb go/parser: remove safePos
The logic in safePos is wrong, since (*token.File).Offset does not panic,
so this function was basically a no-op (since CL 559436).

To work properly it would have to be:

return p.file.Pos(p.file.Offset(pos))

Since it effectively acts as a no-op and hasn't been noticed since,
let's go ahead and remove it.

Change-Id: I00a1bcc5af6a996c63de3f1175c15062e85cf89b
Reviewed-on: https://go-review.googlesource.com/c/go/+/692955
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Commit-Queue: Alan Donovan <adonovan@google.com>
Auto-Submit: Alan Donovan <adonovan@google.com>
Reviewed-by: Alan Donovan <adonovan@google.com>
Reviewed-by: Robert Findley <rfindley@google.com>
2025-08-04 09:33:59 -07:00
Cherry Mui
4b6cbc377f cmd/cgo/internal/test: use (syntactic) constant for C array bound
A test in C has an array bound defined as a "const int", which is
technically a variable. The new version of the C compiler in Xcode 26
beta emits a warning "variable length array folded to constant
array as an extension" for this (as an error since we build the
test with -Werror). Work around this by using an enum, which is
syntactically a constant.

Change-Id: Icfa943f293f6eac8f41d0615da40c126330d7d11
Reviewed-on: https://go-review.googlesource.com/c/go/+/692877
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
2025-08-04 09:11:25 -07:00
Xiaolin Zhao
b2960e3580 cmd/internal/obj/loong64: add {V,XV}{BITCLR/BITSET/BITREV}[I].{B/H/W/D} instructions support
Go asm syntax:
	 V{BITCLR/BITSET/BITREV}{B/H/W/V}	$1, V2, V3
	XV{BITCLR/BITSET/BITREV}{B/H/W/V}	$1, X2, X3
	 V{BITCLR/BITSET/BITREV}{B/H/W/V}	VK, VJ, VD
	XV{BITCLR/BITSET/BITREV}{B/H/W/V}	XK, XJ, XD

Equivalent platform assembler syntax:
	 v{bitclr/bitset/bitrev}i.{b/h/w/d}	v3, v2, $1
	xv{bitclr/bitset/bitrev}i.{b/h/w/d}	x3, x2, $1
	 v{bitclr/bitset/bitrev}.{b/h/w/d}	vd, vj, vk
	xv{bitclr/bitset/bitrev}.{b/h/w/d}	xd, xj, xk

Change-Id: I244f8ae316f72cc7ea01ca0139ac78c5616a3c5b
Reviewed-on: https://go-review.googlesource.com/c/go/+/677435
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: Mark Freeman <mark@golang.org>
2025-08-03 18:26:56 -07:00
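
A rough scalar model of what the byte-element forms of the instructions in
the loong64 commit above (b2960e3580) compute per lane. The commit message
gives only the syntax, so the semantics here are an assumption based on the
usual reading of the LoongArch vector instructions: the bit index is taken
modulo the element width, and BITCLR clears, BITSET sets, and BITREV flips
that bit. All function names are hypothetical.

```go
package main

import "fmt"

// Hypothetical models of one byte lane. The bit index k is reduced modulo
// the element width (8 here); this is an assumption, not taken from the commit.
func bitClr8(x, k uint8) uint8 { return x &^ (1 << (k % 8)) } // clear bit k
func bitSet8(x, k uint8) uint8 { return x | (1 << (k % 8)) }  // set bit k
func bitRev8(x, k uint8) uint8 { return x ^ (1 << (k % 8)) }  // flip bit k

// applyLanes models the register form on a 128-bit vector of 16 byte lanes:
// lane i of the result is op(vj[i], vk[i]).
func applyLanes(vj, vk [16]uint8, op func(x, k uint8) uint8) [16]uint8 {
	var vd [16]uint8
	for i := range vd {
		vd[i] = op(vj[i], vk[i])
	}
	return vd
}

func main() {
	var vj, vk [16]uint8
	for i := range vj {
		vj[i] = 0xF0
		vk[i] = uint8(i) // per-lane bit index
	}
	fmt.Printf("%02x\n", applyLanes(vj, vk, bitClr8))
	fmt.Printf("%02x\n", applyLanes(vj, vk, bitSet8))
	fmt.Printf("%02x\n", applyLanes(vj, vk, bitRev8))
}
```

The immediate forms would correspond to passing the same k for every lane.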
Xiaolin Zhao
abeeef1c08 cmd/compile/internal/test: fix typo in comments
Change-Id: Iba6bb7f8252120f56d7e6ae49c9edc9382e8c7e0
Reviewed-on: https://go-review.googlesource.com/c/go/+/679855
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Mark Freeman <mark@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2025-08-03 18:25:35 -07:00
Xiaolin Zhao
d44749b65b cmd/internal/obj/loong64: add [X]VLDREPL.{B/H/W/D} instructions support
Go asm syntax:
	 VMOVQ	offset(Rj), Vd.<T>
	XVMOVQ	offset(Rj), Xd.<T>

<T> can have the following values:
B16, H8, W4, V2, B32, H16, W8, V4

Change-Id: I44af51d58bb62649d3fe360b3abb771565e78a8a
Reviewed-on: https://go-review.googlesource.com/c/go/+/682895
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Meidan Li <limeidan@loongson.cn>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Mark Freeman <mark@golang.org>
2025-08-03 18:25:27 -07:00
limeidan
d6beda863e runtime: add reference to debugPinnerV1
This is intended to be used by debuggers, to keep heap memory reachable
even if it isn't referenced from anywhere else.

Change-Id: I1e900e02b4fe3a188f8173cec70f8de32122489b
Reviewed-on: https://go-review.googlesource.com/c/go/+/682875
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: Mark Freeman <mark@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-08-03 18:25:12 -07:00
David Chase
c25e5c86b2 [dev.simd] cmd/compile: generated code for K-mask-register slice load/stores
plus slice-part load, store and test for a single type.

Generated by arch/internal/simdgen CL 690315

Change-Id: I58052728b544c4a772a2870ac68f3c832813e1ea
Reviewed-on: https://go-review.googlesource.com/c/go/+/690336
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
2025-08-01 14:27:47 -07:00
David Chase
1ac5f3533f [dev.simd] cmd/compile: opcodes and rules and code generation to enable AVX512 masked loads/stores
Change-Id: I9e05fc5031420f60a2e6bac7b9f86365f0f4c0f1
Reviewed-on: https://go-review.googlesource.com/c/go/+/690335
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
2025-08-01 14:26:37 -07:00
David Chase
f39711a03d [dev.simd] cmd/compile: test for int-to-mask conversion
Change-Id: If341cb2c25dc535cdebe6f539db3cab8917d5afe
Reviewed-on: https://go-review.googlesource.com/c/go/+/689937
Reviewed-by: Junyang Shao <shaojunyang@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-08-01 14:26:25 -07:00
David Chase
08bec02907 [dev.simd] cmd/compile: add register-to-mask moves, other simd glue
This includes code generated by simdgen CL 689955,
here because of git-facilitated pilot error
(the generated file should have been in the next CL
but that is related to this one, so, oh well).

Change-Id: Ibfea3f1cd93ca9cd12970edf15a013471677a6ba
Reviewed-on: https://go-review.googlesource.com/c/go/+/689936
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-08-01 14:26:11 -07:00