Commit graph

238 commits

Author SHA1 Message Date
Jakub Ciolek
8354f6b5bb cmd/compile: use a boolean as a avoid clobbering flags mov marker
The Value type implements Aux interface because it is being used as a
"avoid clobbering flags" marker by amd64, x86 and s390x SSA parts.

Create a boolean that implements the Aux interface. Use it as the marker
instead. We no longer need Value to implement Aux.

Resolves a TODO.

See CL 275756 for more info.

Change-Id: I8a1eddf7e738b8aa31e82f3c4c590bafd2cdc56b
Reviewed-on: https://go-review.googlesource.com/c/go/+/461156
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Run-TryBot: Jakub Ciolek <jakub@ciolek.dev>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2023-01-20 05:01:15 +00:00
Keith Randall
f959fb3872 cmd/compile: add anchored version of SP
The SPanchored opcode is identical to SP, except that it takes a memory
argument so that it (and more importantly, anything that uses it)
must be scheduled at or after that memory argument.

This opcode ensures that a LEAQ of a variable gets scheduled after the
corresponding VARDEF for that variable.

This may lead to less CSE of LEAQ operations. The effect is very small.
The go binary is only 80 bytes bigger after this CL. Usually LEAQs get
folded into load/store operations, so the effect is only for pointerful
types, large enough to need a duffzero, and have their address passed
somewhere. Even then, usually the CSEd LEAQs will be un-CSEd because
the two uses are on different sides of a function call and the LEAQ
ends up being rematerialized at the second use anyway.

Change-Id: Ib893562cd05369b91dd563b48fb83f5250950293
Reviewed-on: https://go-review.googlesource.com/c/go/+/452916
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Martin Möhrmann <moehrmann@google.com>
Reviewed-by: Martin Möhrmann <martin@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
2023-01-19 22:43:12 +00:00
cui fliter
3a7a528c2d all: fix some comments for method
Change-Id: I4cff6b2a1fed6acdf754539c3c53a61eaa3b3f84
Reviewed-on: https://go-review.googlesource.com/c/go/+/450176
Auto-Submit: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2022-12-03 17:08:51 +00:00
cui fliter
b2faff18ce all: add missing periods in comments
Change-Id: I69065f8adf101fdb28682c55997f503013a50e29
Reviewed-on: https://go-review.googlesource.com/c/go/+/449757
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Joedian Reid <joedian@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Joedian Reid <joedian@golang.org>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
2022-11-18 17:59:44 +00:00
Johan Brandhorst-Satzkorn
85196fc982 cmd/internal/ssa: correct references to _gen folder
The gen folder was renamed to _gen in CL 435472, but references in code
and docs were not updated. This updates the references.

Change-Id: Ibadc0cdcb5bed145c3257b58465a8df370487ae5
Reviewed-on: https://go-review.googlesource.com/c/go/+/444355
Reviewed-by: Bryan Mills <bcmills@google.com>
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Johan Brandhorst-Satzkorn <johan.brandhorst@gmail.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-10-23 17:42:11 +00:00
eric fang
ddc7d2a80c cmd/compile: add late lower pass for last rules to run
Usually optimization rules have corresponding priorities, some need to
be run first, some run next, and some run last, which produces the best
code. But currently our optimization rules have no priority, this CL
adds a late lower pass that runs those rules that need to be run at last,
such as split unreasonable constant folding. This pass can be seen as
the second round of the lower pass.

For example:
func foo(a, b uint64) uint64 {
        d := a+0x1234568
        d1 := b+0x1234568
        return d&d1
}
The code generated by the master branch:
	0x0004 00004        ADD     $19088744, R0, R2 // movz+movk+add
	0x0010 00016        ADD     $19088744, R1, R1 // movz+movk+add
	0x001c 00028        AND     R1, R2, R0

This is because the current constant folding optimization rules do not
take into account the range of constants, causing the constant to be
loaded repeatedly. This CL splits these unreasonable constants folding
in the late lower pass. With this CL the generated code:
	0x0004 00004        MOVD    $19088744, R2 // movz+movk
	0x000c 00012        ADD     R0, R2, R3
	0x0010 00016        ADD     R1, R2, R1
	0x0014 00020        AND     R1, R3, R0

This CL also adds constant folding optimization for ADDS instruction.

In addition, in order not to introduce the codegen regression, an
optimization rule is added to change the addition of a negative number
into a subtraction of a positive number.

go1 benchmarks:
name                     old time/op    new time/op    delta
BinaryTree17-8              1.22s ± 1%     1.24s ± 0%  +1.56%  (p=0.008 n=5+5)
Fannkuch11-8                1.54s ± 0%     1.53s ± 0%  -0.69%  (p=0.016 n=4+5)
FmtFprintfEmpty-8          14.1ns ± 0%    14.1ns ± 0%    ~     (p=0.079 n=4+5)
FmtFprintfString-8         26.0ns ± 0%    26.1ns ± 0%  +0.23%  (p=0.008 n=5+5)
FmtFprintfInt-8            32.3ns ± 0%    32.9ns ± 1%  +1.72%  (p=0.008 n=5+5)
FmtFprintfIntInt-8         54.5ns ± 0%    55.5ns ± 0%  +1.83%  (p=0.008 n=5+5)
FmtFprintfPrefixedInt-8    61.5ns ± 0%    62.0ns ± 0%  +0.93%  (p=0.008 n=5+5)
FmtFprintfFloat-8          72.0ns ± 0%    73.6ns ± 0%  +2.24%  (p=0.008 n=5+5)
FmtManyArgs-8               221ns ± 0%     224ns ± 0%  +1.22%  (p=0.008 n=5+5)
GobDecode-8                1.91ms ± 0%    1.93ms ± 0%  +0.98%  (p=0.008 n=5+5)
GobEncode-8                1.40ms ± 1%    1.39ms ± 0%  -0.79%  (p=0.032 n=5+5)
Gzip-8                      115ms ± 0%     117ms ± 1%  +1.17%  (p=0.008 n=5+5)
Gunzip-8                   19.4ms ± 1%    19.3ms ± 0%  -0.71%  (p=0.016 n=5+4)
HTTPClientServer-8         27.0µs ± 0%    27.3µs ± 0%  +0.80%  (p=0.008 n=5+5)
JSONEncode-8               3.36ms ± 1%    3.33ms ± 0%    ~     (p=0.056 n=5+5)
JSONDecode-8               17.5ms ± 2%    17.8ms ± 0%  +1.71%  (p=0.016 n=5+4)
Mandelbrot200-8            2.29ms ± 0%    2.29ms ± 0%    ~     (p=0.151 n=5+5)
GoParse-8                  1.35ms ± 1%    1.36ms ± 1%    ~     (p=0.056 n=5+5)
RegexpMatchEasy0_32-8      24.5ns ± 0%    24.5ns ± 0%    ~     (p=0.444 n=4+5)
RegexpMatchEasy0_1K-8       131ns ±11%     118ns ± 6%    ~     (p=0.056 n=5+5)
RegexpMatchEasy1_32-8      22.9ns ± 0%    22.9ns ± 0%    ~     (p=0.905 n=4+5)
RegexpMatchEasy1_1K-8       126ns ± 0%     127ns ± 0%    ~     (p=0.063 n=4+5)
RegexpMatchMedium_32-8      486ns ± 5%     483ns ± 0%    ~     (p=0.381 n=5+4)
RegexpMatchMedium_1K-8     15.4µs ± 1%    15.5µs ± 0%    ~     (p=0.151 n=5+5)
RegexpMatchHard_32-8        687ns ± 0%     686ns ± 0%    ~     (p=0.103 n=5+5)
RegexpMatchHard_1K-8       20.7µs ± 0%    20.7µs ± 1%    ~     (p=0.151 n=5+5)
Revcomp-8                   175ms ± 2%     176ms ± 3%    ~     (p=1.000 n=5+5)
Template-8                 20.4ms ± 6%    20.1ms ± 2%    ~     (p=0.151 n=5+5)
TimeParse-8                 112ns ± 0%     113ns ± 0%  +0.97%  (p=0.016 n=5+4)
TimeFormat-8                156ns ± 0%     145ns ± 0%  -7.14%  (p=0.029 n=4+4)

Change-Id: I3ced26e89041f873ac989586514ccc5ee09f13da
Reviewed-on: https://go-review.googlesource.com/c/go/+/425134
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Eric Fang <eric.fang@arm.com>
2022-10-05 02:40:56 +00:00
ruinan
121344ac33 cmd/compile: optimize RotateLeft8/16 on arm64
This CL optimizes RotateLeft8/16 on arm64.

For 16 bits, we form a 32 bits register by duplicating two 16 bits
registers, then use RORW instruction to do the rotate shift.

For 8 bits, we just use LSR and LSL instead of RORW because the code is
simpler.

Benchmark          Old          ThisCL       delta
RotateLeft8-46     2.16 ns/op   1.73 ns/op   -19.70%
RotateLeft16-46    2.16 ns/op   1.54 ns/op   -28.53%

Change-Id: I09cde4383d12e31876a57f8cdfd3bb4f324fadb0
Reviewed-on: https://go-review.googlesource.com/c/go/+/420976
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Heschi Kreinick <heschi@google.com>
Run-TryBot: Keith Randall <khr@golang.org>
2022-09-02 17:46:31 +00:00
Wayne Zuo
e8f0340fa4 cmd/compile: intrinsify RotateLeft{32,64} on loong64
Benchmark on crypto/sha256 (provided by Xiaodong Liu):
name               old time/op    new time/op    delta
Hash8Bytes/New       1.19µs ± 0%    0.97µs ± 0%  -18.75%  (p=0.000 n=9+9)
Hash8Bytes/Sum224    1.21µs ± 0%    0.97µs ± 0%  -20.04%  (p=0.000 n=9+10)
Hash8Bytes/Sum256    1.21µs ± 0%    0.98µs ± 0%  -19.16%  (p=0.000 n=10+7)
Hash1K/New           15.9µs ± 0%    12.4µs ± 0%  -22.10%  (p=0.000 n=10+10)
Hash1K/Sum224        15.9µs ± 0%    12.4µs ± 0%  -22.18%  (p=0.000 n=8+10)
Hash1K/Sum256        15.9µs ± 0%    12.4µs ± 0%  -22.15%  (p=0.000 n=10+9)
Hash8K/New            119µs ± 0%      92µs ± 0%  -22.40%  (p=0.000 n=10+9)
Hash8K/Sum224         119µs ± 0%      92µs ± 0%  -22.41%  (p=0.000 n=9+10)
Hash8K/Sum256         119µs ± 0%      92µs ± 0%  -22.40%  (p=0.000 n=9+9)

name               old speed      new speed      delta
Hash8Bytes/New     6.70MB/s ± 0%  8.25MB/s ± 0%  +23.13%  (p=0.000 n=10+10)
Hash8Bytes/Sum224  6.60MB/s ± 0%  8.26MB/s ± 0%  +25.06%  (p=0.000 n=10+10)
Hash8Bytes/Sum256  6.59MB/s ± 0%  8.15MB/s ± 0%  +23.67%  (p=0.000 n=10+7)
Hash1K/New         64.3MB/s ± 0%  82.5MB/s ± 0%  +28.36%  (p=0.000 n=10+10)
Hash1K/Sum224      64.3MB/s ± 0%  82.6MB/s ± 0%  +28.51%  (p=0.000 n=10+10)
Hash1K/Sum256      64.3MB/s ± 0%  82.6MB/s ± 0%  +28.46%  (p=0.000 n=9+9)
Hash8K/New         69.0MB/s ± 0%  89.0MB/s ± 0%  +28.87%  (p=0.000 n=10+8)
Hash8K/Sum224      69.0MB/s ± 0%  89.0MB/s ± 0%  +28.88%  (p=0.000 n=9+10)
Hash8K/Sum256      69.0MB/s ± 0%  88.9MB/s ± 0%  +28.87%  (p=0.000 n=8+9)

Benchmark on crypto/sha512 (provided by Xiaodong Liu):
name               old time/op    new time/op     delta
Hash8Bytes/New       1.55µs ± 0%     1.31µs ± 0%  -15.67%  (p=0.000 n=10+10)
Hash8Bytes/Sum384    1.59µs ± 0%     1.35µs ± 0%  -14.97%  (p=0.000 n=10+10)
Hash8Bytes/Sum512    1.62µs ± 0%     1.39µs ± 0%  -14.02%  (p=0.000 n=10+10)
Hash1K/New           10.7µs ± 0%      8.6µs ± 0%  -19.60%  (p=0.000 n=8+8)
Hash1K/Sum384        10.8µs ± 0%      8.7µs ± 0%  -19.40%  (p=0.000 n=9+9)
Hash1K/Sum512        10.8µs ± 0%      8.7µs ± 0%  -19.35%  (p=0.000 n=9+10)
Hash8K/New           74.6µs ± 0%     59.6µs ± 0%  -20.08%  (p=0.000 n=10+9)
Hash8K/Sum384        74.7µs ± 0%     59.7µs ± 0%  -20.04%  (p=0.000 n=9+8)
Hash8K/Sum512        74.7µs ± 0%     59.7µs ± 0%  -20.01%  (p=0.000 n=10+10)

name               old speed      new speed       delta
Hash8Bytes/New     5.16MB/s ± 0%   6.12MB/s ± 0%  +18.60%  (p=0.000 n=10+8)
Hash8Bytes/Sum384  5.02MB/s ± 0%   5.90MB/s ± 0%  +17.56%  (p=0.000 n=10+10)
Hash8Bytes/Sum512  4.94MB/s ± 0%   5.74MB/s ± 0%  +16.29%  (p=0.000 n=10+9)
Hash1K/New         95.4MB/s ± 0%  118.6MB/s ± 0%  +24.38%  (p=0.000 n=10+10)
Hash1K/Sum384      95.0MB/s ± 0%  117.9MB/s ± 0%  +24.06%  (p=0.000 n=8+9)
Hash1K/Sum512      94.8MB/s ± 0%  117.5MB/s ± 0%  +23.99%  (p=0.000 n=8+9)
Hash8K/New          110MB/s ± 0%    137MB/s ± 0%  +25.11%  (p=0.000 n=9+6)
Hash8K/Sum384       110MB/s ± 0%    137MB/s ± 0%  +25.07%  (p=0.000 n=9+8)
Hash8K/Sum512       110MB/s ± 0%    137MB/s ± 0%  +25.01%  (p=0.000 n=10+10)

Change-Id: I28ccfce634659305a336c8e0a3f8589f7361d661
Reviewed-on: https://go-review.googlesource.com/c/go/+/422317
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Wayne Zuo <wdvxdr@golangcn.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: David Chase <drchase@google.com>
2022-08-30 03:21:44 +00:00
Keith Randall
60ad3c48f5 cmd/compile: move SSA rotate instruction detection to arch-independent rules
Detect rotate instructions while still in architecture-independent form.
It's easier to do here, and we don't need to repeat it in each
architecture file.

Change-Id: I9396954b3f3b3bfb96c160d064a02002309935bb
Reviewed-on: https://go-review.googlesource.com/c/go/+/421195
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Eric Fang <eric.fang@arm.com>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Joedian Reid <joedian@golang.org>
Reviewed-by: Ruinan Sun <Ruinan.Sun@arm.com>
Run-TryBot: Keith Randall <khr@golang.org>
2022-08-23 21:24:14 +00:00
Keith Randall
332a5981d0 cmd/compile: handle partially overlapping assignments
Normally, when moving Go values of type T from one location to another,
we don't need to worry about partial overlaps. The two Ts must either be
in disjoint (nonoverlapping) memory or in exactly the same location.
There are 2 cases where this isn't true:
 1) Using unsafe you can arrange partial overlaps.
 2) Since Go 1.17, you can use a cast from a slice to a ptr-to-array.
    https://go.dev/ref/spec#Conversions_from_slice_to_array_pointer
    This feature can be used to construct partial overlaps of array types.
      var a [3]int
      p := (*[2]int)(a[:])
      q := (*[2]int)(a[1:])
      *p = *q
We don't care about solving 1. Or at least, we haven't historically
and no one has complained.
For 2, we need to ensure that if there might be partial overlap,
then we can't use OpMove; we must use memmove instead.
(memmove handles partial overlap by copying in the correct
direction. OpMove does not.)

Note that we have to be careful here not to introduce a call when
we're marshaling arguments to a call or unmarshaling results from a call.

Fixes #54467

Change-Id: I1ca6aba8041576849c1d85f1fa33ae61b80a373d
Reviewed-on: https://go-review.googlesource.com/c/go/+/425076
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
2022-08-23 19:56:32 +00:00
Jakub Ciolek
de561dc766 cmd/compile: mark booleans as registerizable
Boolean values fit in registers, mark them accordingly. Improves codegen a bit.

compilecmp for darwin/amd64:

compress/gzip
compress/gzip.(*Reader).Reset 1017 -> 985  (-3.15%)

net
net.newRequest 1002 -> 970  (-3.19%)

crypto/tls
crypto/tls.(*sessionState).unmarshal 1054 -> 968  (-8.16%)

cmd/compile/internal/syntax
cmd/compile/internal/syntax.Fprint 518 -> 453  (-12.55%)

cmd/vendor/github.com/ianlancetaylor/demangle
cmd/vendor/github.com/ianlancetaylor/demangle.ASTToString 389 -> 325  (-16.45%)

cmd/go/internal/load
cmd/go/internal/load.PackagesAndErrors 3453 -> 3381  (-2.09%)

cmd/compile/internal/ssa
cmd/compile/internal/ssa.registerizable 249 -> 255  (+2.41%)

cmd/compile/internal/ssagen
cmd/compile/internal/ssagen.buildssa 9388 -> 9356  (-0.34%)

file                                            before   after    Δ       %
compress/gzip.s                                 8247     8215     -32     -0.388%
net.s                                           266667   266635   -32     -0.012%
crypto/tls.s                                    290324   290238   -86     -0.030%
cmd/compile/internal/syntax.s                   156422   156357   -65     -0.042%
cmd/vendor/github.com/ianlancetaylor/demangle.s 268313   268249   -64     -0.024%
cmd/go/internal/load.s                          122946   122874   -72     -0.059%
cmd/compile/internal/ssa.s                      3551201  3551207  +6      +0.000%
cmd/compile/internal/ssagen.s                   362299   362267   -32     -0.009%
total                                           19725872 19725495 -377    -0.002%

Change-Id: I4cd40b54d8b2da6d1f946e51f16689315a369dca
Reviewed-on: https://go-review.googlesource.com/c/go/+/408474
Run-TryBot: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
2022-08-23 18:16:49 +00:00
cuiweixie
e7307034cc cmd/compile: store combine on amd64
Fixes #54120

Change-Id: I6915b6e8d459d9becfdef4fdcba95ee4dea6af05
GitHub-Last-Rev: 03f19942c7
GitHub-Pull-Request: golang/go#54126
Reviewed-on: https://go-review.googlesource.com/c/go/+/420115
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Than McIntosh <thanm@google.com>
Run-TryBot: Wayne Zuo <wdvxdr@golangcn.org>
2022-08-08 16:40:58 +00:00
Cherry Mui
c5da4fb7ac cmd/compile: make jump table symbol local
When using plugins, if the plugin and the main executable both
have the same function, and if it uses jump table, currently the
jump table symbol have the same name so it will be deduplicated by
the dynamic linker. This causes a function in the plugin may (in
the middle of the function) jump to the function with the same name
in the main executable (or vice versa). But the function may be
compiled slightly differently, because the plugin needs to be PIC.
Jumping from the middle of one function to the other will not work.
Avoid this problem by marking the jump table symbol local to a DSO.

Fixes #53989.

Change-Id: I2b573b9dfc22401c8a09ffe9b9ea8bb83d3700ca
Reviewed-on: https://go-review.googlesource.com/c/go/+/418960
Reviewed-by: Keith Randall <khr@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
Reviewed-by: Than McIntosh <thanm@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-07-22 20:12:19 +00:00
John Bampton
b2116f748a all: fix spelling
Change-Id: Iee18987c495d1d4bde9da888d454eea8079d3ebc
GitHub-Last-Rev: ff5e01599d
GitHub-Pull-Request: golang/go#52949
Reviewed-on: https://go-review.googlesource.com/c/go/+/406915
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Reviewed-by: Robert Griesemer <gri@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
2022-05-17 21:46:33 +00:00
Xiaodong Liu
a6c95e75d9 cmd/compile/internal/ssa: inline memmove with known size
Contributors to the loong64 port are:
  Weining Lu <luweining@loongson.cn>
  Lei Wang <wanglei@loongson.cn>
  Lingqin Gong <gonglingqin@loongson.cn>
  Xiaolin Zhao <zhaoxiaolin@loongson.cn>
  Meidan Li <limeidan@loongson.cn>
  Xiaojuan Zhai <zhaixiaojuan@loongson.cn>
  Qiyuan Pu <puqiyuan@loongson.cn>
  Guoqi Chen <chenguoqi@loongson.cn>

This port has been updated to Go 1.15.6:
  https://github.com/loongson/go

Updates #46229

Change-Id: I1534b66b527efaf2bbaa8e6e6ac0618aac0b5930
Reviewed-on: https://go-review.googlesource.com/c/go/+/367040
Run-TryBot: Ian Lance Taylor <iant@google.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-05-11 23:58:56 +00:00
Keith Randall
1ba96d8c09 cmd/compile: implement jump tables
Performance is kind of hard to exactly quantify.

One big difference between jump tables and the old binary search
scheme is that there's only 1 branch statement instead of O(n) of
them. That can be both a blessing and a curse, and can make evaluating
jump tables very hard to do.

The single branch can become a choke point for the hardware branch
predictor. A branch table jump must fit all of its state in a single
branch predictor entry (technically, a branch target predictor entry).
With binary search that predictor state can be spread among lots of
entries. In cases where the case selection is repetitive and thus
predictable, binary search can perform better.

The big win for a jump table is that it doesn't consume so much of the
branch predictor's resources. But that benefit is essentially never
observed in microbenchmarks, because the branch predictor can easily
keep state for all the binary search branches in a microbenchmark. So
that benefit is really hard to measure.

So predictable switch microbenchmarks are ~useless - they will almost
always favor the binary search scheme. Fully unpredictable switch
microbenchmarks are better, as they aren't lying to us quite so
much. In a perfectly unpredictable situation, a jump table will expect
to incur 1-1/N branch mispredicts, where a binary search would incur
lg(N)/2 of them. That makes the crossover point at about N=4. But of
course switches in real programs are seldom fully unpredictable, so
we'll use a higher crossover point.

Beyond the branch predictor, jump tables tend to execute more
instructions per switch but have no additional instructions per case,
which also argues for a larger crossover.

As far as code size goes, with this CL cmd/go has a slightly smaller
code segment and a slightly larger overall size (from the jump tables
themselves which live in the data segment).

This is a case where some FDO (feedback-directed optimization) would
be really nice to have. #28262

Some large-program benchmarks might help make the case for this
CL. Especially if we can turn on branch mispredict counters so we can
see how much using jump tables can free up branch prediction resources
that can be gainfully used elsewhere in the program.

name                         old time/op  new time/op  delta
Switch8Predictable         1.89ns ± 2%  1.27ns ± 3%  -32.58%  (p=0.000 n=9+10)
Switch8Unpredictable       9.33ns ± 1%  7.50ns ± 1%  -19.60%  (p=0.000 n=10+9)
Switch32Predictable        2.20ns ± 2%  1.64ns ± 1%  -25.39%  (p=0.000 n=10+9)
Switch32Unpredictable      10.0ns ± 2%   7.6ns ± 2%  -24.04%  (p=0.000 n=10+10)

Fixes #5496
Update #34381

Change-Id: I3ff56011d02be53f605ca5fd3fb96b905517c34f
Reviewed-on: https://go-review.googlesource.com/c/go/+/357330
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Keith Randall <khr@google.com>
2022-04-14 19:30:00 +00:00
Russ Cox
19309779ac all: gofmt main repo
[This CL is part of a sequence implementing the proposal #51082.
The design doc is at https://go.dev/s/godocfmt-design.]

Run the updated gofmt, which reformats doc comments,
on the main repository. Vendored files are excluded.

For #51082.

Change-Id: I7332f099b60f716295fb34719c98c04eb1a85407
Reviewed-on: https://go-review.googlesource.com/c/go/+/384268
Reviewed-by: Jonathan Amsterdam <jba@google.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2022-04-11 16:34:30 +00:00
Keith Randall
15728ce950 cmd/compile: disable rewrite loop detector for deadcode-only changes
We're guaranteed we won't infinite loop on deadcode-only changes,
because each change converts valid -> invalid, and there are only a
finite number of valid values.

The loops this test is looking for are those generated by rule
applications, so it isn't useful to check for loops when rules aren't
involved.

Fixes #51639

Change-Id: Idf1abeab9d47baafddc3a1197d5064faaf07ef78
Reviewed-on: https://go-review.googlesource.com/c/go/+/392760
Trust: Keith Randall <khr@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Trust: Josh Bleecher Snyder <josharian@gmail.com>
2022-03-15 00:05:18 +00:00
Cuong Manh Le
f686f6a963 cmd/compile: remove Value.RemoveArg
It's only used in two places:

 - The one in regalloc.go can be replaced with v.resetArgs()
 - The one in rewrite.go can be open coded

and can cause wrong usage like the bug that CL 358117 fixed.

Change-Id: I125baf237db159d056fe4b1c73072331eea4d06a
Reviewed-on: https://go-review.googlesource.com/c/go/+/357965
Trust: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2021-10-25 02:59:35 +00:00
Ruslan Andreev
810b08b8ec cmd/compile: inline memequal(x, const, sz) for small sizes
This CL adds late expanded memequal(x, const, sz) inlining for 2, 4, 8
bytes size. This PoC is using the same method as CL 248404.
This optimization fires about 100 times in Go compiler (1675 occurrences
reduced to 1574, so -6%).
Also, added unit-tests to codegen/comparisions.go file.

Updates #37275

Change-Id: Ia52808d573cb706d1da8166c5746ede26f46c5da
Reviewed-on: https://go-review.googlesource.com/c/go/+/328291
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
Trust: David Chase <drchase@google.com>
2021-10-06 13:47:50 +00:00
Keith Randall
6f35430faa cmd/compile: allow rotates to be merged with logical ops on arm64
Fixes #48002

Change-Id: Ie3a157d55b291f5ac2ef4845e6ce4fefd84fc642
Reviewed-on: https://go-review.googlesource.com/c/go/+/350912
Trust: Keith Randall <khr@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2021-09-20 16:25:36 +00:00
fanzha02
2091bd3f26 cmd/compile: simiplify arm64 bitfield optimizations
In some rewrite rules for arm64 bitfield optimizations, the
bitfield lsb value and the bitfield width value are related
to datasize, some of them use datasize directly to check the
bitfield lsb value is valid, to get the bitfiled width value,
but some of them call isARM64BFMask() and arm64BFWidth()
functions. In order to be consistent, this patch changes them
all to use datasize.

Besides, this patch sorts the codegen test cases.

Run the "toolstash-check -all" command and find one inconsistent code
is as the following.

new:    src/math/fma.go:104      BEQ	247
master: src/math/fma.go:104      BEQ	248

The above inconsistence is due to this patch changing the range of the
field lsb value in "UBFIZ" optimization rules from "lc+(32|16|8)<64" to
"lc<64", so that the following code is generated as "UBFIZ". The logical
of changed code is still correct.

The code of src/math/fma.go:160:
  const uvinf = 0x7FF0000000000000
  func FMA(a, b uint32) float64 {
  	ps := a+b
  	return Float64frombits(uint64(ps)<<63 | uvinf)
  }

The new assembly code:
  TEXT    "".FMA(SB), LEAF|NOFRAME|ABIInternal, $0-16
  MOVWU   "".a(FP), R0
  MOVWU   "".b+4(FP), R1
  ADD     R1, R0, R0
  UBFIZ   $63, R0, $1, R0
  ORR     $9218868437227405312, R0, R0
  MOVD    R0, "".~r2+8(FP)
  RET     (R30)

The master assembly code:
  TEXT    "".FMA(SB), LEAF|NOFRAME|ABIInternal, $0-16
  MOVWU   "".a(FP), R0
  MOVWU   "".b+4(FP), R1
  ADD     R1, R0, R0
  MOVWU   R0, R0
  LSL     $63, R0, R0
  ORR     $9218868437227405312, R0, R0
  MOVD    R0, "".~r2+8(FP)
  RET     (R30)

Change-Id: I9061104adfdfd3384d0525327ae1e5c8b0df5c35
Reviewed-on: https://go-review.googlesource.com/c/go/+/265038
Trust: fannie zhang <Fannie.Zhang@arm.com>
Run-TryBot: fannie zhang <Fannie.Zhang@arm.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2021-09-10 02:44:36 +00:00
Josh Bleecher Snyder
bff39cf6cb cmd/compile: add automated rewrite cycle detection
A common bug during development is to introduce rewrite rule cycles.
This is annoying because it takes a while to notice that
make.bash is a bit too slow this time, and to remember why.
And then you have to manually arrange to debug.

Make this all easier by automating it.
Detect cycles, and when we detect one, print the sequence
of rewrite rules that occur within a single cycle before crashing.

Change-Id: I8dadda13990ab925a81940d4833c9e5243368435
Reviewed-on: https://go-review.googlesource.com/c/go/+/347829
Trust: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2021-09-08 22:47:57 +00:00
Matthew Dempsky
72c003ef82 cmd/compile: unexport Type.Width and Type.Align [generated]
[git-generate]
cd src/cmd/compile/internal

: Workaround rf issue with types2 tests.
rm types2/*_test.go

: Rewrite uses. First a type-safe rewrite,
: then a second pass to fix unnecessary conversions.
rf '
ex ./abi ./escape ./gc ./liveness ./noder ./reflectdata ./ssa ./ssagen ./staticinit ./typebits ./typecheck ./walk {
  import "cmd/compile/internal/types"
  var t *types.Type
  t.Width -> t.Size()
  t.Align -> uint8(t.Alignment())
}

ex ./abi ./escape ./gc ./liveness ./noder ./reflectdata ./ssa ./ssagen ./staticinit ./typebits ./typecheck ./walk {
  import "cmd/compile/internal/types"
  var t *types.Type
  int64(uint8(t.Alignment())) -> t.Alignment()
}
'

: Rename fields to lower case.
(
cd types
rf '
mv Type.Width Type.width
mv Type.Align Type.align
'
)

: Revert types2 changes.
git checkout HEAD^ types2

Change-Id: I42091faece104c4ef619d9d4d50514fd48c8f029
Reviewed-on: https://go-review.googlesource.com/c/go/+/345480
Trust: Matthew Dempsky <mdempsky@google.com>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2021-08-27 20:43:31 +00:00
Matthew Dempsky
dd95a4e3db [dev.typeparams] cmd/compile: simplify SSA devirtualization
This CL implements a few improvements to SSA devirtualization to make
it simpler and more general:

1. Change reflectdata.ITabAddr to now immediately generate the wrapper
functions and write out the itab symbol data. Previously, these were
each handled by separate phases later on.

2. Removes the hack in typecheck where we marked itabs that we
expected to need later. Instead, the calls to ITabAddr in walk now
handle generating the wrappers.

3. Changes the SSA interface call devirtualization algorithm to just
use the itab symbol data (namely, its relocations) to figure out what
pointer is available in memory at the given offset. This decouples it
somewhat from reflectdata.

Change-Id: I8fe06922af8f8a1e7c93f5aff2b60ff59b8e7114
Reviewed-on: https://go-review.googlesource.com/c/go/+/327871
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Trust: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2021-06-16 20:57:38 +00:00
Than McIntosh
00d42ffc89 cmd/compile: spos handling fixes to improve prolog debuggability
With the new register ABI, the compiler sometimes introduces spills of
argument registers in function prologs; depending on the positions
assigned to these spills and whether they have the IsStmt flag set,
this can degrade the debugging experience. For example, in this
function from one of the Delve regression tests:

L13:  func foo((eface interface{}) {
L14:	if eface != nil {
L15:		n++
L16:	}
L17   }

we wind up with a prolog containing two spill instructions, the first
with line 14, the second with line 13.  The end result for the user
is that if you set a breakpoint in foo and run to it, then do "step",
execution will initially stop at L14, then jump "backwards" to L13.

The root of the problem in this case is that an ArgIntReg pseudo-op is
introduced during expand calls, then promoted (due to lowering) to a
first-class statement (IsStmt flag set), which in turn causes
downstream handling to propagate its position to the first of the register
spills in the prolog.

To help improve things, this patch changes the rewriter to avoid
moving an "IsStmt" flag from a deleted/replaced instruction to an
Arg{Int,Float}Reg value, and adds Arg{Int,Float}Reg to the list of
opcodes not suitable for selection as statement boundaries, and
suppresses generation of additional register spills in defframe() when
optimization is disabled (since in that case things will get spilled
in any case).

This is not a comprehensive/complete fix; there are still cases where
we get less-than-ideal source position markers (ex: issue 45680).

Updates #40724.

Change-Id: Ica8bba4940b2291bef6b5d95ff0cfd84412a2d40
Reviewed-on: https://go-review.googlesource.com/c/go/+/312989
Trust: Than McIntosh <thanm@google.com>
Run-TryBot: Than McIntosh <thanm@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2021-04-26 14:55:26 +00:00
Paul E. Murphy
c8fb0ec5a0 cmd/compile: fix ANDI/SRWI merge on ppc64
The shift amount should be masked to avoid rotation values
beyond the numer of bits. In this case, if the shift amount
is 0, it should rotate 0, not 32.

Fixes #45589

Change-Id: I1e764497a39d0ec128e29af42352b70c70b2ecc5
Reviewed-on: https://go-review.googlesource.com/c/go/+/310569
Run-TryBot: Paul Murphy <murp@ibm.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Carlos Eduardo Seo <carlos.seo@linaro.org>
Trust: Carlos Eduardo Seo <carlos.seo@linaro.org>
2021-04-15 21:57:02 +00:00
Cherry Zhang
70ed28e5f7 cmd/compile: support memmove inlining with register args
The rule that inlines memmove expects SSA ops that calls memmove
with arguments in memory. This CL adds a version that matches
it with arguments in registers, so the optimization works for
both situations.

Change-Id: Ideb64f65b7521481ab2ca7c9975a6cf7b70d5966
Reviewed-on: https://go-review.googlesource.com/c/go/+/309332
Trust: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Than McIntosh <thanm@google.com>
Reviewed-by: David Chase <drchase@google.com>
2021-04-12 16:44:49 +00:00
David Chase
9b78c68a15 cmd/compile: remove AuxCall.results, cleanup ssagen/ssa.go
More cleanup to remove unnecessary parts of AuxCall.
Passed testing on arm64 (a link-register architecture)
in addition to amd64 so very likely okay.

(Gratuitously updated commit message to see if it will
correctly this time.)

Updates #40724

Change-Id: Iaece952ceb5066149a5d32aaa14b36755f26bb8e
Reviewed-on: https://go-review.googlesource.com/c/go/+/303433
Trust: David Chase <drchase@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2021-03-23 16:47:43 +00:00
David Chase
a93849b9e2 cmd/compile: remove now-redundant AuxCall.args
Cleanup, ABI information subsumes this.

Updates #40724

Change-Id: I6c69da44380f7b0d159b22acacbd68dc000e4725
Reviewed-on: https://go-review.googlesource.com/c/go/+/303432
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2021-03-22 19:09:53 +00:00
John Bampton
289d34a465 all: remove duplicate words
Change-Id: Ib0469232a2b69a869e58d5d24990ad74ac96ea56
GitHub-Last-Rev: eb38e049ee
GitHub-Pull-Request: golang/go#44805
Reviewed-on: https://go-review.googlesource.com/c/go/+/299109
Trust: Emmanuel Odeke <emmanuel@orijtech.com>
Run-TryBot: Emmanuel Odeke <emmanuel@orijtech.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Bryan C. Mills <bcmills@google.com>
2021-03-13 11:56:59 +00:00
Josh Bleecher Snyder
b4787201c9 cmd/compile: minor doc improvements
These are left over from comments I failed to leave on CL 249463;
apparently I never hit "Reply".

Change-Id: Ia3f8a900703c347f8f98581ec1ac172c0f72cd9e
Reviewed-on: https://go-review.googlesource.com/c/go/+/299589
Trust: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2021-03-08 18:56:25 +00:00
Cherry Zhang
a22bd3dc73 cmd/compile: use getcallerpc for racefuncentry
Currently, when instrumenting for the race detector, the compiler
inserts racefuncentry/racefuncentryfp at the entry of instrumented
functions. racefuncentry takes the caller's PC. On AMD64, we synthesize
a node which points to -8(FP) which is where the return address is
stored. Later this node turns to a special Arg in SSA that is not
really an argument. This causes problems in the new ABI work so that
special node has to be special-cased.

This CL changes the special node to a call to getcallerpc, which lowers
to an intrinsic in SSA. This also unifies AMD64 code path and LR machine
code path, as getcallerpc works on all platforms.

Change-Id: I1377e140b91e0473cfcadfda221f26870c1b124d
Reviewed-on: https://go-review.googlesource.com/c/go/+/297929
Trust: Cherry Zhang <cherryyz@google.com>
Reviewed-by: David Chase <drchase@google.com>
2021-03-05 22:42:10 +00:00
David Chase
f2df1e3c34 cmd/compile: retrieve Args from registers
in progress; doesn't fully work until they are also passed on
register on the caller side.

For #40724.

Change-Id: I29a6680e60bdbe9d132782530214f2a2b51fb8f6
Reviewed-on: https://go-review.googlesource.com/c/go/+/293394
Trust: David Chase <drchase@google.com>
Run-TryBot: David Chase <drchase@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2021-03-03 03:02:39 +00:00
David Chase
e25040d162 cmd/compile: change StaticCall to return a "Results"
StaticLECall (multiple value in +mem, multiple value result +mem) ->
StaticCall (multiple ergister value in +mem,
   multiple register-sized-value result +mem) ->
ARCH CallStatic (multiple ergister value in +mem,
   multiple register-sized-value result +mem)

But the architecture-dependent stuff is indifferent to whether
it is mem->mem or (mem)->(mem) until Prog generation.

Deal with OpSelectN -> Prog in ssagen/ssa.go, others, as they
appear.

For #40724.

Change-Id: I1d0436f6371054f1881862641d8e7e418e4a6a16
Reviewed-on: https://go-review.googlesource.com/c/go/+/293391
Trust: David Chase <drchase@google.com>
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2021-02-26 02:52:33 +00:00
David Chase
7a2f3273c5 cmd/compile: plumb abi info into ssagen/ssa
Plumb abi information into ssa/ssagen for plain calls
and plain functions (not methods).  Does not extend all the
way through the compiler (yet).

One test disabled because it extends far enough to break the test.

Normalized all the compiler's register args TODOs to
// TODO(register args) ...

For #40724.

Change-Id: I0173a4579f032ac3c9db3aef1749d40da5ea01ff
Reviewed-on: https://go-review.googlesource.com/c/go/+/293389
Trust: David Chase <drchase@google.com>
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2021-02-24 17:54:29 +00:00
Matthew Dempsky
7e0a81d280 [dev.regabi] all: merge master (dab3e5a) into dev.regabi
This merge had two conflicts to resolve:

1. The embed code on master had somewhat substantially diverged, so
this CL tediously backported the changes to dev.regabi. In particular,
I went through all of the embed changes to gc/{embed,noder,syntax}.go
and made sure the analogous code on dev.regabi in noder/noder.go and
staticdata/embed.go mirrors it.

2. The init-cycle reporting code on master was extended slightly to
track already visited declarations to avoid exponential behavior. The
same fix is applied on dev.regabi, just using ir.NameSet instead of
map[ir.Node]bool.

Conflicts:

- src/cmd/compile/internal/gc/embed.go
- src/cmd/compile/internal/gc/noder.go
- src/cmd/compile/internal/gc/syntax.go
- src/cmd/compile/internal/pkginit/initorder.go
- src/embed/internal/embedtest/embed_test.go
- src/go/types/stdlib_test.go

Merge List:

+ 2021-01-22 dab3e5affe runtime: switch runtime to libc for openbsd/amd64
+ 2021-01-22 a1b53d85da cmd/go: add documentation for test and xtest fields output by go list
+ 2021-01-22 b268b60774 runtime: remove pthread_kill/pthread_self for openbsd
+ 2021-01-22 ec4051763d runtime: fix typo in mgcscavenge.go
+ 2021-01-22 7ece3a7b17 net/http: fix flaky TestDisableKeepAliveUpgrade
+ 2021-01-22 50cba0506f time: clarify Timer.Reset behavior on AfterFunc Timers
+ 2021-01-22 cf10e69f17 doc/go1.16: mention net/http.Transport.GetProxyConnectHeader
+ 2021-01-22 ec1b945265 doc/go1.16: mention path/filepath.WalkDir
+ 2021-01-22 11def3d40b doc/go1.16: mention syscall.AllThreadsSyscall
+ 2021-01-21 07b0235609 doc/go1.16: add notes about package-specific fs.FS changes
+ 2021-01-21 e2b4f1fea5 doc/go1.16: minor formatting fix
+ 2021-01-21 9f43a9e07b doc/go1.16: mention new debug/elf constants
+ 2021-01-21 3c2f11ba5b cmd/go: overwrite program name with full path
+ 2021-01-21 953d1feca9 all: introduce and use internal/execabs
+ 2021-01-21 b186e4d70d cmd/go: add test case for cgo CC setting
+ 2021-01-21 5a8a2265fb cmd/cgo: report exec errors a bit more clearly
+ 2021-01-21 46e2e2e9d9 cmd/go: pass resolved CC, GCCGO to cgo
+ 2021-01-21 3d40895e36 runtime: switch openbsd/arm64 to pthreads
+ 2021-01-21 d95ca91380 crypto/elliptic: fix P-224 field reduction
+ 2021-01-20 ecf4ebf100 cmd/internal/moddeps: check content of all modules in GOROOT
+ 2021-01-20 d2d155d1ae runtime: don't adjust timer pp field in timerWaiting status
+ 2021-01-20 803d18fc6c cmd/go: set Incomplete field on go list output if no files match embed
+ 2021-01-20 6e243ce71d cmd/go: have go mod vendor copy embedded files in subdirs
+ 2021-01-20 be28e5abc5 cmd/go: fix mod_get_fallback test
+ 2021-01-20 928bda4f4a runtime: convert openbsd/amd64 locking to libc
+ 2021-01-19 824f2d635c cmd/go: allow go fmt to complete when embedded file is missing
+ 2021-01-19 0575e35e50 cmd/compile: require 'go 1.16' go.mod line for //go:embed
+ 2021-01-19 ccb2e90688 cmd/link: exit before Asmb2 if error
+ 2021-01-19 ca5774a5a5 embed: treat uninitialized FS as empty
+ 2021-01-19 d047c91a6c cmd/link,runtime: switch openbsd/amd64 to pthreads
+ 2021-01-19 61debffd97 runtime: factor out usesLibcall
+ 2021-01-19 9fed39d281 runtime: factor out mStackIsSystemAllocated
+ 2021-01-18 dbab079835 runtime: free Windows event handles after last lock is dropped
+ 2021-01-18 5a8fbb0d2d os: do not close syscall.Stdin in TestReadStdin
+ 2021-01-15 682a1d2176 runtime: detect errors in DuplicateHandle
+ 2021-01-15 9f83418b83 cmd/link: remove GOROOT write in TestBuildForTvOS
+ 2021-01-15 ec9470162f cmd/compile: allow embed into any string or byte slice type
+ 2021-01-15 54198b04db cmd/compile: disallow embed of var inside func
+ 2021-01-15 b386c735e7 cmd/go: fix go generate docs
+ 2021-01-15 bb5075a525 syscall: remove RtlGenRandom and move it into internal/syscall
+ 2021-01-15 1deae0b597 os: invoke processKiller synchronously in testKillProcess
+ 2021-01-15 ff196c3e84 crypto/x509: update iOS bundled roots to version 55188.40.9
+ 2021-01-14 e125ccd10e cmd/go: in 'go mod edit', validate versions given to -retract and -exclude
+ 2021-01-14 eb330020dc cmd/dist, cmd/go: pass -arch for C compilation on Darwin
+ 2021-01-14 84e8a06f62 cmd/cgo: remove unnecessary space in cgo export header
+ 2021-01-14 0c86b999c3 cmd/test2json: document passing -test.paniconexit0
+ 2021-01-14 9135795891 cmd/go/internal/load: report positions for embed errors
+ 2021-01-14 d9b79e53bb cmd/compile: fix wrong complement for arm64 floating-point comparisons
+ 2021-01-14 c73232d08f cmd/go/internal/load: refactor setErrorPos to PackageError.setPos
+ 2021-01-14 6aa28d3e06 go/build: report positions for go:embed directives
+ 2021-01-13 7eb31d999c cmd/go: add hints to more missing sum error messages
+ 2021-01-12 ba76567bc2 cmd/go/internal/modload: delete unused *mvsReqs.next method
+ 2021-01-12 665def2c11 encoding/asn1: document unmarshaling behavior for IMPLICIT string fields
+ 2021-01-11 81ea89adf3 cmd/go: fix non-script staleness checks interacting badly with GOFLAGS
+ 2021-01-11 759309029f doc: update editors.html for Go 1.16
+ 2021-01-11 c3b4c7093a cmd/internal/objfile: don't require runtime.symtab symbol for XCOFF
+ 2021-01-08 59bfc18e34 cmd/go: add hint to read 'go help vcs' to GOVCS errors
+ 2021-01-08 cd6f3a54e4 cmd/go: revise 'go help' documentation for modules
+ 2021-01-08 6192b98751 cmd/go: make hints in error messages more consistent
+ 2021-01-08 25886cf4bd cmd/go: preserve sums for indirect deps fetched by 'go mod download'
+ 2021-01-08 6250833911 runtime/metrics: mark histogram metrics as cumulative
+ 2021-01-08 8f6a9acbb3 runtime/metrics: remove unused StopTheWorld Description field
+ 2021-01-08 6598c65646 cmd/compile: fix exponential-time init-cycle reporting
+ 2021-01-08 fefad1dc85 test: fix timeout code for invoking compiler
+ 2021-01-08 6728118e0a cmd/go: pass signals forward during "go tool"
+ 2021-01-08 e65c543f3c go/build/constraint: add parser for build tag constraint expressions
+ 2021-01-08 0c5afc4fb7 testing/fstest,os: clarify racy behavior of TestFS
+ 2021-01-08 32afcc9436 runtime/metrics: change unit on *-by-size metrics to match bucket unit
+ 2021-01-08 c6513bca5a io/fs: minor corrections to Glob doc
+ 2021-01-08 304f769ffc cmd/compile: don't short-circuit copies whose source is volatile
+ 2021-01-08 ae97717133 runtime,runtime/metrics: use explicit histogram boundaries
+ 2021-01-08 a9ccd2d795 go/build: skip string literal while findEmbed
+ 2021-01-08 d92f8add32 archive/tar: fix typo in comment
+ 2021-01-08 cab1202183 cmd/link: accept extra blocks in TestFallocate
+ 2021-01-08 ee4d32249b io/fs: minor corrections to Glob release date
+ 2021-01-08 54bd1ccce2 cmd: update to latest golang.org/x/tools
+ 2021-01-07 9ec21a8f34 Revert "reflect: support multiple keys in struct tags"
+ 2021-01-07 091414b5b7 io/fs: correct WalkDirFunc documentation
+ 2021-01-07 9b55088d6b doc/go1.16: add release note for disallowing non-ASCII import paths
+ 2021-01-07 fa90aaca7d cmd/compile: fix late expand_calls leaf type for OpStructSelect/OpArraySelect
+ 2021-01-07 7cee66d4cb cmd/go: add documentation for Embed fields in go list output
+ 2021-01-07 e60cffa4ca html/template: attach functions to namespace
+ 2021-01-07 6da2d3b7d7 cmd/link: fix typo in asm.go
+ 2021-01-07 df81a15819 runtime: check mips64 VDSO clock_gettime return code
+ 2021-01-06 4787e906cf crypto/x509: rollback new CertificateRequest fields
+ 2021-01-06 c9658bee93 cmd/go: make module suggestion more friendly
+ 2021-01-06 4c668b25c6 runtime/metrics: fix panic message for Float64Histogram
+ 2021-01-06 d2131704a6 net/http/httputil: fix deadlock in DumpRequestOut
+ 2021-01-05 3e1e13ce6d cmd/go: set cfg.BuildMod to "readonly" by default with no module root
+ 2021-01-05 0b0d004983 cmd/go: pass embedcfg to gccgo if supported
+ 2021-01-05 1b85e7c057 cmd/go: don't scan gccgo standard library packages for imports
+ 2021-01-05 6b37b15d95 runtime: don't take allglock in tracebackothers
+ 2021-01-04 9eef49cfa6 math/rand: fix typo in comment
+ 2021-01-04 b01fb2af9e testing/fstest: fix typo in error message
+ 2021-01-01 3dd5867605 doc: 2021 is the Year of the Gopher
+ 2020-12-31 95ce805d14 io/fs: remove darwin/arm64 special condition
+ 2020-12-30 20d0991b86 lib/time, time/tzdata: update tzdata to 2020f
+ 2020-12-30 ed301733bb misc/cgo/testcarchive: remove special flags for Darwin/ARM
+ 2020-12-30 0ae2e032f2 misc/cgo/test: enable TestCrossPackageTests on darwin/arm64
+ 2020-12-29 780b4de16b misc/ios: fix wording for command line instructions
+ 2020-12-29 b4a71c95d2 doc/go1.16: reference misc/ios/README for how to build iOS programs
+ 2020-12-29 f83e0f6616 misc/ios: add to README how to build ios executables
+ 2020-12-28 4fd9455882 io/fs: fix typo in comment

Change-Id: I2f257bbc5fbb05f15c2d959f8cfe0ce13b083538
2021-01-22 12:01:13 -08:00
Junchen Li
d9b79e53bb cmd/compile: fix wrong complement for arm64 floating-point comparisons
Consider the following example,

  func test(a, b float64, x uint64) uint64 {
    if a < b {
      x = 0
    }
    return x
  }

  func main() {
    fmt.Println(test(1, math.NaN(), 123))
  }

The output is 0, but the expectation is 123.

This is because the rewrite rule

  (CSEL [cc] (MOVDconst [0]) y flag) => (CSEL0 [arm64Negate(cc)] y flag)

converts

  FCMP NaN, 1
  CSEL MI, 0, 123, R0 // if 1 < NaN then R0 = 0 else R0 = 123

to

  FCMP NaN, 1
  CSEL GE, 123, 0, R0 // if 1 >= NaN then R0 = 123 else R0 = 0

But both 1 < NaN and 1 >= NaN are false. So the output is 0, not 123.

The root cause is arm64Negate not handle negation of floating comparison
correctly. According to the ARM manual, the meaning of MI, GE, and PL
are

  MI: Less than
  GE: Greater than or equal to
  PL: Greater than, equal to, or unordered

Because NaN cannot be compared with other numbers, the result of such
comparison is unordered. So when NaN is involved, unlike integer, the
result of !(a < b) is not a >= b, it is a >= b || a is NaN || b is NaN.
This is exactly what PL means. We add NotLessThanF to represent PL. Then
the negation of LessThanF is NotLessThanF rather than GreaterEqualF. The
same reason for the other floating comparison operations.

Fixes #43619

Change-Id: Ia511b0027ad067436bace9fbfd261dbeaae01bcd
Reviewed-on: https://go-review.googlesource.com/c/go/+/283572
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Trust: Keith Randall <khr@golang.org>
2021-01-14 17:23:11 +00:00
David Chase
9a19481acb [dev.regabi] cmd/compile: make ordering for InvertFlags more stable
Current many architectures use a rule along the lines of

// Canonicalize the order of arguments to comparisons - helps with CSE.
((CMP|CMPW) x y) && x.ID > y.ID => (InvertFlags ((CMP|CMPW) y x))

to normalize comparisons as much as possible for CSE.  Replace the
ID comparison with something less variable across compiler changes.
This helps avoid spurious failures in some of the codegen-comparison
tests (though the current choice of comparison is sensitive to Op
ordering).

Two tests changed to accommodate modified instruction choice.

Change-Id: Ib35f450bd2bae9d4f9f7838ceaf7ec682bcf1e1a
Reviewed-on: https://go-review.googlesource.com/c/go/+/280155
Trust: David Chase <drchase@google.com>
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2021-01-13 02:40:43 +00:00
Matthew Dempsky
1a98ab0e2d [dev.regabi] cmd/compile: add ssa.Aux tag interface for Value.Aux
It's currently hard to automate refactorings around the Value.Aux
field, because we don't have any static typing information for it.
Adding a tag interface will make subsequent CLs easier and safer.

Passes buildall w/ toolstash -cmp.

Updates #42982.

Change-Id: I41ae8e411a66bda3195a0957b60c2fe8a8002893
Reviewed-on: https://go-review.googlesource.com/c/go/+/275756
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Trust: Matthew Dempsky <mdempsky@google.com>
2020-12-08 01:46:31 +00:00
Lynn Boger
0ae3b7cb74 cmd/compile: fix rules regression with shifts on PPC64
Some rules for PPC64 were checking for a case
where a shift followed by an 'and' of a mask could
be lowered, depending on the format of the mask. The
function to verify if the mask was valid for this purpose
was not checking if the mask was 0 which we don't want to
allow. This case can happen if previous optimizations
resulted in that mask value.

This fixes isPPC64ValidShiftMask to check for a mask of 0 and return
false.

This also adds a codegen testcase to verify it doesn't try to
match the rules in the future.

Fixes #42610

Change-Id: I565d94e88495f51321ab365d6388c01e791b4dbb
Reviewed-on: https://go-review.googlesource.com/c/go/+/270358
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Paul Murphy <murp@ibm.com>
Reviewed-by: Carlos Eduardo Seo <carlos.seo@linaro.org>
Trust: Lynn Boger <laboger@linux.vnet.ibm.com>
2020-11-17 13:20:20 +00:00
Cherry Zhang
0387bedadf cmd/compile: remove racefuncenterfp when it is not needed
We already remove racefuncenter and racefuncexit if they are not
needed (i.e. the function doesn't have any other race  calls).
racefuncenterfp is like racefuncenter but used on LR machines.
Remove unnecessary racefuncenterfp as well.

Change-Id: I65edb00e19c6d9ab55a204cbbb93e9fb710559f1
Reviewed-on: https://go-review.googlesource.com/c/go/+/267099
Trust: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2020-11-02 03:03:16 +00:00
Alberto Donizetti
1090f0986d cmd/compile: rename mergeSymTyped to mergeSym
Also make canMergeSym take Syms instead of interface{}

Change-Id: I4926a1fc586aa90e198249d67e5b520404b40869
Reviewed-on: https://go-review.googlesource.com/c/go/+/265817
Trust: Alberto Donizetti <alb.donizetti@gmail.com>
Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2020-10-28 19:19:18 +00:00
Alberto Donizetti
bc0d7fd9b7 cmd/compile: delete log2, switch to log64
rewrite.go has two identical functions log2 and log64; the former has
been there for a while, while the latter was added together with
log{8,16,32} for use in typed rules.

This change deletes log2 and switches to using log64 everywhere.

Change-Id: I759b878814e4c115a5fa470274f22477738d69ef
Reviewed-on: https://go-review.googlesource.com/c/go/+/265457
Trust: Alberto Donizetti <alb.donizetti@gmail.com>
Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
Reviewed-by: Keith Randall <khr@golang.org>
2020-10-28 19:19:04 +00:00
Alberto Donizetti
5c1122b528 cmd/compile: delete isPowerOfTwo, switch to isPowerOfTwo64
rewrite.go has two identical functions isPowerOfTwo and
isPowerOfTwo64; the former has been there for a while, while the
latter was added together with isPowerOfTwo{8,16,32} for use in typed
rules.

This change deletes isPowerOfTwo and switch to using isPowerOfTwo64
everywhere.

Change-Id: If26c94565d2393fac6f0ba117ee7ee2fc915f7cd
Reviewed-on: https://go-review.googlesource.com/c/go/+/265417
Trust: Alberto Donizetti <alb.donizetti@gmail.com>
Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2020-10-27 20:03:41 +00:00
Paul E. Murphy
c3c6fbf314 cmd/compile: combine more 32 bit shift and mask operations on ppc64
Combine (AND m (SRWconst x)) or (SRWconst (AND m x)) when mask m is
and the shift value produce constant which can be encoded into an
RLWINM instruction.

Combine (CLRLSLDI (SRWconst x)) if the combining of the underling rotate
masks produces a constant which can be encoded into RLWINM.

Likewise for (SLDconst (SRWconst x)) and (CLRLSDI (RLWINM x)).

Combine rotate word + and operations which can be encoded as a single
RLWINM/RLWNM instruction.

The most notable performance improvements arise from the crypto
benchmarks below (GOARCH=power8 on a ppc64le/linux):

pkg:golang.org/x/crypto/blowfish goos:linux goarch:ppc64le
ExpandKeyWithSalt                               52.2µs ± 0%    47.5µs ± 0%  -8.88%
ExpandKey                                       44.4µs ± 0%    40.3µs ± 0%  -9.15%

pkg:golang.org/x/crypto/ssh/internal/bcrypt_pbkdf goos:linux goarch:ppc64le
Key                                             57.6ms ± 0%    52.3ms ± 0%  -9.13%

pkg:golang.org/x/crypto/bcrypt goos:linux goarch:ppc64le
Equal                                           90.9ms ± 0%    82.6ms ± 0%  -9.13%
DefaultCost                                     91.0ms ± 0%    82.7ms ± 0%  -9.12%

Change-Id: I59a0ca29face38f4ab46e37124c32906f216c4ce
Reviewed-on: https://go-review.googlesource.com/c/go/+/260798
Run-TryBot: Carlos Eduardo Seo <carlos.seo@linaro.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Carlos Eduardo Seo <carlos.seo@linaro.com>
Trust: Lynn Boger <laboger@linux.vnet.ibm.com>
2020-10-27 18:33:20 +00:00
David Chase
7bda6154ca cmd/compile: add generic optimization patterns for late-expanded calls.
Repeats existing patterns for old calls, so that these will apply
during the optimization phases that precede call expansion.

Change-Id: I1ca0a78c159aa1a51004db217edde4ecc772b646
Reviewed-on: https://go-review.googlesource.com/c/go/+/248190
Trust: David Chase <drchase@google.com>
Run-TryBot: David Chase <drchase@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
2020-10-12 22:39:53 +00:00
Lynn Boger
cc2a5cf4b8 cmd/compile,cmd/internal/obj/ppc64: fix some shift rules due to a regression
A recent change to improve shifts was generating some
invalid cases when the rule was based on an AND. The
extended mnemonics CLRLSLDI and CLRLSLWI only allow
certain values for the operands and in the mask case
those values were not being checked properly. This
adds a check to those rules to verify that the
'b' and 'n' values used when an AND was part of the rule
have correct values.

There was a bug in some diag messages in asm9. The
message expected 3 values but only provided 2. Those are
corrected here also.

The test/codegen/shift.go was updated to add a few more
cases to check for the case mentioned here.

Some of the comments that mention the order of operands
in these extended mnemonics were wrong and those have been
corrected.

Fixes #41683.

Change-Id: If5bb860acaa5051b9e0cd80784b2868b85898c31
Reviewed-on: https://go-review.googlesource.com/c/go/+/258138
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Paul Murphy <murp@ibm.com>
Reviewed-by: Carlos Eduardo Seo <carlos.seo@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
Trust: Lynn Boger <laboger@linux.vnet.ibm.com>
2020-10-01 18:51:18 +00:00
David Chase
adef4deeb8 cmd/compile: enable late expansion for interface calls
Includes a few tweaks to Value.copyOf(a) (make it a no-op for
a self-copy) and new pattern hack "___" (3 underscores) is
like ellipsis, except the replacement doesn't need to have
matching ellipsis/underscores.

Moved the arg-length check in generated pattern-matching code
BEFORE the args are probed, because not all instances of
variable length OpFoo will have all the args mentioned in
some rule for OpFoo, and when that happens, the compiler
panics without the early check.

Change-Id: I66de40672b3794a6427890ff96c805a488d783f4
Reviewed-on: https://go-review.googlesource.com/c/go/+/247537
Trust: David Chase <drchase@google.com>
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-10-01 17:47:21 +00:00
Lynn Boger
967465da29 cmd/compile: use combined shifts to improve array addressing on ppc64x
This change adds rules to find pairs of instructions that can
be combined into a single shifts. These instruction sequences
are common in array addressing within loops. Improvements can
be seen in many crypto packages and the hash packages.

These are based on the extended mnemonics found in the ISA
sections C.8.1 and C.8.2.

Some rules in PPC64.rules were moved because the ordering prevented
some matching.

The following results were generated on power9.

hash/crc32:
    CRC32/poly=Koopman/size=40/align=0          195ns ± 0%     163ns ± 0%  -16.41%
    CRC32/poly=Koopman/size=40/align=1          200ns ± 0%     163ns ± 0%  -18.50%
    CRC32/poly=Koopman/size=512/align=0        1.98µs ± 0%    1.67µs ± 0%  -15.46%
    CRC32/poly=Koopman/size=512/align=1        1.98µs ± 0%    1.69µs ± 0%  -14.80%
    CRC32/poly=Koopman/size=1kB/align=0        3.90µs ± 0%    3.31µs ± 0%  -15.27%
    CRC32/poly=Koopman/size=1kB/align=1        3.85µs ± 0%    3.31µs ± 0%  -14.15%
    CRC32/poly=Koopman/size=4kB/align=0        15.3µs ± 0%    13.1µs ± 0%  -14.22%
    CRC32/poly=Koopman/size=4kB/align=1        15.4µs ± 0%    13.1µs ± 0%  -14.79%
    CRC32/poly=Koopman/size=32kB/align=0        137µs ± 0%     105µs ± 0%  -23.56%
    CRC32/poly=Koopman/size=32kB/align=1        137µs ± 0%     105µs ± 0%  -23.53%

crypto/rc4:
    RC4_128    733ns ± 0%    650ns ± 0%  -11.32%  (p=1.000 n=1+1)
    RC4_1K    5.80µs ± 0%   5.17µs ± 0%  -10.89%  (p=1.000 n=1+1)
    RC4_8K    45.7µs ± 0%   40.8µs ± 0%  -10.73%  (p=1.000 n=1+1)

crypto/sha1:
    Hash8Bytes       635ns ± 0%     613ns ± 0%   -3.46%  (p=1.000 n=1+1)
    Hash320Bytes    2.30µs ± 0%    2.18µs ± 0%   -5.38%  (p=1.000 n=1+1)
    Hash1K          5.88µs ± 0%    5.38µs ± 0%   -8.62%  (p=1.000 n=1+1)
    Hash8K          42.0µs ± 0%    37.9µs ± 0%   -9.75%  (p=1.000 n=1+1)

There are other improvements found in golang.org/x/crypto which are all in the
range of 5-15%.

Change-Id: I193471fbcf674151ffe2edab212799d9b08dfb8c
Reviewed-on: https://go-review.googlesource.com/c/go/+/252097
Trust: Lynn Boger <laboger@linux.vnet.ibm.com>
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Carlos Eduardo Seo <cseo@linux.vnet.ibm.com>
2020-09-17 12:37:40 +00:00