Commit graph

155 commits

Author SHA1 Message Date
Cherry Mui
c93d5d1a52 [dev.typeparams] all: always enable regabig on AMD64
Always enable regabig on AMD64, which enables the G register and
the X15 zero register. Remove the fallback path.

Also remove the regabig GOEXPERIMENT. On AMD64 it is always
enabled (this CL). Other architectures already have a G register,
except for 386, where there are too few registers and it is
unlikely that we will reserve one. (If we really do, we can just
add a new experiment).

Change-Id: I229cac0060f48fe58c9fdaabd38d6fa16b8a0855
Reviewed-on: https://go-review.googlesource.com/c/go/+/327272
Trust: Cherry Mui <cherryyz@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
Reviewed-by: Than McIntosh <thanm@google.com>
Reviewed-by: David Chase <drchase@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
2021-06-11 20:52:41 +00:00
Cherry Mui
06df0ee7fa [dev.typeparams] cmd/compile: add arg/result register load/spill code on ARM64
Add code that loads results into registers (used in defer return
code path) and spills argument registers (used for partially lived
in-register args).

Move some code from the amd64 package to a common place.

Change-Id: I8d59b68693048fdba86e10769c4ac58de5fcfb64
Reviewed-on: https://go-review.googlesource.com/c/go/+/322851
Trust: Cherry Mui <cherryyz@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Than McIntosh <thanm@google.com>
2021-05-27 16:37:50 +00:00
Cherry Mui
02117775d1 [dev.typeparams] cmd/compile, runtime: do not zero X15 on Plan 9
On Plan 9, we cannot use SSE registers in note handlers, so we
don't use X15 for zeroing (MOVOstorezero and DUFFZERO). Do not
zero X15 on Plan 9.

Change-Id: I2b083b01b27965611cb83d19afd66b383dc77846
Reviewed-on: https://go-review.googlesource.com/c/go/+/321329
Trust: Cherry Mui <cherryyz@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Than McIntosh <thanm@google.com>
2021-05-20 19:03:48 +00:00
Keith Randall
b211fe0058 cmd/compile: remove bit operations that modify memory directly
These operations (BT{S,R,C}{Q,L}modify) are quite a bit slower than
other ways of doing the same thing.

Without the BTxmodify operations, there are two fallback ways the compiler
performs these operations: AND/OR/XOR operations directly on memory, or
load-BTx-write sequences. The compiler kinda chooses one arbitrarily
depending on rewrite rule application order. Currently, it uses
load-BTx-write for the Const benchmarks and AND/OR/XOR directly to memory
for the non-Const benchmarks. TBD, someone might investigate which of
the two fallback strategies is really better. For now, they are both
better than BTx ops.

name              old time/op  new time/op  delta
BitSet-8          1.09µs ± 2%  0.64µs ± 5%  -41.60%  (p=0.000 n=9+10)
BitClear-8        1.15µs ± 3%  0.68µs ± 6%  -41.00%  (p=0.000 n=10+10)
BitToggle-8       1.18µs ± 4%  0.73µs ± 2%  -38.36%  (p=0.000 n=10+8)
BitSetConst-8     37.0ns ± 7%  25.8ns ± 2%  -30.24%  (p=0.000 n=10+10)
BitClearConst-8   30.7ns ± 2%  25.0ns ±12%  -18.46%  (p=0.000 n=10+10)
BitToggleConst-8  36.9ns ± 1%  23.8ns ± 3%  -35.46%  (p=0.000 n=9+10)

Fixes #45790
Update #45242

Change-Id: Ie33a72dc139f261af82db15d446cd0855afb4e59
Reviewed-on: https://go-review.googlesource.com/c/go/+/318149
Trust: Keith Randall <khr@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Ben Shi <powerman1st@163.com>
2021-05-08 03:27:59 +00:00
Than McIntosh
00d42ffc89 cmd/compile: spos handling fixes to improve prolog debuggability
With the new register ABI, the compiler sometimes introduces spills of
argument registers in function prologs; depending on the positions
assigned to these spills and whether they have the IsStmt flag set,
this can degrade the debugging experience. For example, in this
function from one of the Delve regression tests:

L13:  func foo((eface interface{}) {
L14:	if eface != nil {
L15:		n++
L16:	}
L17   }

we wind up with a prolog containing two spill instructions, the first
with line 14, the second with line 13.  The end result for the user
is that if you set a breakpoint in foo and run to it, then do "step",
execution will initially stop at L14, then jump "backwards" to L13.

The root of the problem in this case is that an ArgIntReg pseudo-op is
introduced during expand calls, then promoted (due to lowering) to a
first-class statement (IsStmt flag set), which in turn causes
downstream handling to propagate its position to the first of the register
spills in the prolog.

To help improve things, this patch changes the rewriter to avoid
moving an "IsStmt" flag from a deleted/replaced instruction to an
Arg{Int,Float}Reg value, and adds Arg{Int,Float}Reg to the list of
opcodes not suitable for selection as statement boundaries, and
suppresses generation of additional register spills in defframe() when
optimization is disabled (since in that case things will get spilled
in any case).

This is not a comprehensive/complete fix; there are still cases where
we get less-than-ideal source position markers (ex: issue 45680).

Updates #40724.

Change-Id: Ica8bba4940b2291bef6b5d95ff0cfd84412a2d40
Reviewed-on: https://go-review.googlesource.com/c/go/+/312989
Trust: Than McIntosh <thanm@google.com>
Run-TryBot: Than McIntosh <thanm@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2021-04-26 14:55:26 +00:00
David Chase
b6e1c33603 cmd/compile: spill all the parameters around morestack
former code only spilled those parameters mentioned in code
AT THE REGISTER LEVEL, this caused problems with liveness
sometimes (which worked on whole variables including
aggregates).

Updates #40724.

Change-Id: Ib9fdc50d95d1d2b1f1e405dd370540e88582ac71
Reviewed-on: https://go-review.googlesource.com/c/go/+/310690
Trust: David Chase <drchase@google.com>
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2021-04-16 20:12:20 +00:00
Russ Cox
95ed5c3800 internal/buildcfg: move build configuration out of cmd/internal/objabi
The go/build package needs access to this configuration,
so move it into a new package available to the standard library.

Change-Id: I868a94148b52350c76116451f4ad9191246adcff
Reviewed-on: https://go-review.googlesource.com/c/go/+/310731
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Jay Conrod <jayconrod@google.com>
2021-04-16 19:20:53 +00:00
Cherry Zhang
23f8c203f0 cmd/compile: rework/reduce partially lived argument spilling
In CL 307909 we generate code that spills pointer-typed argument
registers if it is part of an SSA-able aggregate. The current
code spill the register unconditionally. Sometimes it is
unnecessary, because it is already spilled, or it is never live.

This CL reworks the spill generation. We move it to the end of
compilation, after liveness analysis, so we have information about
if a spill is necessary, and only generate spills for the
necessary ones.

Change-Id: I8d60be9b2c47651aeda14f5e2d1bbd207c134b26
Reviewed-on: https://go-review.googlesource.com/c/go/+/309331
Trust: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Than McIntosh <thanm@google.com>
2021-04-14 21:00:20 +00:00
Cherry Zhang
f5efa5a313 cmd/compile: load results into registers on open defer return path
When a function panics then recovers, it needs to return to the
caller with named results having the correct values. For
in-register results, we need to load them into registers at the
defer return path.

For non-open-coded defers, we already generate correct code, as
the defer return path is part of the SSA CFG and contains the
instructions that are the same as an ordinary return statement,
including putting the results to the right places.

For open-coded defers, we have a special code generation that
emits a disconnected block that currently contains only the
deferreturn call and a RET instruction. It leaves the result
registers unset. This CL adds instructions that load the result
registers on that path.

Updates #40724.

Change-Id: I1f60514da644fd5fb4b4871a1153c62f42927282
Reviewed-on: https://go-review.googlesource.com/c/go/+/307231
Trust: Cherry Zhang <cherryyz@google.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Austin Clements <austin@google.com>
2021-04-06 20:22:15 +00:00
Cherry Zhang
6996bae5d1 cmd/compile: use ABI0 for cgo_unsafe_args functions
cgo_unsafe_args paragma indicates that the function (or its
callee) uses address and offsets to dispatch arguments, which
currently using ABI0 frame layout. Pin them to ABI0.

With this, "GOEXPERIMENT=regabi,regabiargs go run hello.go" works
on Darwin/AMD64.

Change-Id: I3eadd5a3646a9de8fa681fa0a7f46e7cdc217d24
Reviewed-on: https://go-review.googlesource.com/c/go/+/306609
Trust: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Than McIntosh <thanm@google.com>
2021-04-02 16:31:47 +00:00
Cherry Zhang
6ae3b70ef2 cmd/compile: add clobberdeadreg mode
When -clobberdeadreg flag is set, the compiler inserts code that
clobbers integer registers at call sites. This may be helpful for
debugging register ABI.

Only implemented on AMD64 for now.

Change-Id: Ia203d3f891c30fd95d0103489056fe01d63a2899
Reviewed-on: https://go-review.googlesource.com/c/go/+/302809
Trust: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2021-03-19 23:21:21 +00:00
Austin Clements
eaa1ddee84 all: explode GOEXPERIMENT=regabi into 5 sub-experiments
This separates GOEXPERIMENT=regabi into five sub-experiments:
regabiwrappers, regabig, regabireflect, regabidefer, and regabiargs.
Setting GOEXPERIMENT=regabi now implies the working subset of these
(currently, regabiwrappers, regabig, and regabireflect).

This simplifies testing, helps derisk the register ABI project,
and will also help with performance comparisons.

This replaces the -abiwrap flag to the compiler and linker with
the regabiwrappers experiment.

As part of this, regabiargs now enables registers for all calls
in the compiler. Previously, this was statically disabled in
regabiEnabledForAllCompilation, but now that we can control it
independently, this isn't necessary.

For #40724.

Change-Id: I5171e60cda6789031f2ef034cc2e7c5d62459122
Reviewed-on: https://go-review.googlesource.com/c/go/+/302070
Trust: Austin Clements <austin@google.com>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Reviewed-by: David Chase <drchase@google.com>
2021-03-18 16:51:27 +00:00
Josh Bleecher Snyder
43d5f213e2 cmd/compile: optimize multi-register shifts on amd64
amd64 can shift in bits from another register instead of filling with 0/1.
This pattern is helpful when implementing 128 bit shifts or arbitrary length shifts.
In the standard library, it shows up in pure Go math/big.

Benchmarks results on amd64 with -tags=math_big_pure_go.

name                          old time/op  new time/op  delta
NonZeroShifts/1/shrVU-8       4.45ns ± 3%  4.39ns ± 1%   -1.28%  (p=0.000 n=30+27)
NonZeroShifts/1/shlVU-8       4.13ns ± 4%  4.10ns ± 2%     ~     (p=0.254 n=29+28)
NonZeroShifts/2/shrVU-8       5.55ns ± 1%  5.63ns ± 2%   +1.42%  (p=0.000 n=28+29)
NonZeroShifts/2/shlVU-8       5.70ns ± 2%  5.14ns ± 1%   -9.82%  (p=0.000 n=29+28)
NonZeroShifts/3/shrVU-8       6.79ns ± 2%  6.35ns ± 2%   -6.46%  (p=0.000 n=28+29)
NonZeroShifts/3/shlVU-8       6.69ns ± 1%  6.25ns ± 1%   -6.60%  (p=0.000 n=28+27)
NonZeroShifts/4/shrVU-8       7.79ns ± 2%  7.06ns ± 2%   -9.48%  (p=0.000 n=30+30)
NonZeroShifts/4/shlVU-8       7.82ns ± 1%  7.24ns ± 1%   -7.37%  (p=0.000 n=28+29)
NonZeroShifts/5/shrVU-8       8.90ns ± 3%  7.93ns ± 1%  -10.84%  (p=0.000 n=29+26)
NonZeroShifts/5/shlVU-8       8.68ns ± 1%  7.92ns ± 1%   -8.76%  (p=0.000 n=29+29)
NonZeroShifts/10/shrVU-8      14.4ns ± 1%  12.3ns ± 2%  -14.79%  (p=0.000 n=28+29)
NonZeroShifts/10/shlVU-8      14.1ns ± 1%  11.9ns ± 2%  -15.55%  (p=0.000 n=28+27)
NonZeroShifts/100/shrVU-8      118ns ± 1%    96ns ± 3%  -18.82%  (p=0.000 n=30+29)
NonZeroShifts/100/shlVU-8      120ns ± 2%    98ns ± 2%  -18.46%  (p=0.000 n=29+28)
NonZeroShifts/1000/shrVU-8    1.10µs ± 1%  0.88µs ± 2%  -19.63%  (p=0.000 n=29+30)
NonZeroShifts/1000/shlVU-8    1.10µs ± 2%  0.88µs ± 2%  -20.28%  (p=0.000 n=29+28)
NonZeroShifts/10000/shrVU-8   10.9µs ± 1%   8.7µs ± 1%  -19.78%  (p=0.000 n=28+27)
NonZeroShifts/10000/shlVU-8   10.9µs ± 2%   8.7µs ± 1%  -19.64%  (p=0.000 n=29+27)
NonZeroShifts/100000/shrVU-8   111µs ± 2%    90µs ± 2%  -19.39%  (p=0.000 n=28+29)
NonZeroShifts/100000/shlVU-8   113µs ± 2%    90µs ± 2%  -20.43%  (p=0.000 n=30+27)

The assembly version is still faster, unfortunately, but the gap is narrowing.
Speedup from pure Go to assembly:

name                          old time/op  new time/op  delta
NonZeroShifts/1/shrVU-8       4.39ns ± 1%  3.45ns ± 2%  -21.36%  (p=0.000 n=27+29)
NonZeroShifts/1/shlVU-8       4.10ns ± 2%  3.47ns ± 3%  -15.42%  (p=0.000 n=28+30)
NonZeroShifts/2/shrVU-8       5.63ns ± 2%  3.97ns ± 0%  -29.40%  (p=0.000 n=29+25)
NonZeroShifts/2/shlVU-8       5.14ns ± 1%  3.77ns ± 2%  -26.65%  (p=0.000 n=28+26)
NonZeroShifts/3/shrVU-8       6.35ns ± 2%  4.79ns ± 2%  -24.52%  (p=0.000 n=29+29)
NonZeroShifts/3/shlVU-8       6.25ns ± 1%  4.42ns ± 1%  -29.29%  (p=0.000 n=27+26)
NonZeroShifts/4/shrVU-8       7.06ns ± 2%  5.64ns ± 1%  -20.05%  (p=0.000 n=30+29)
NonZeroShifts/4/shlVU-8       7.24ns ± 1%  5.34ns ± 2%  -26.23%  (p=0.000 n=29+29)
NonZeroShifts/5/shrVU-8       7.93ns ± 1%  6.56ns ± 2%  -17.26%  (p=0.000 n=26+30)
NonZeroShifts/5/shlVU-8       7.92ns ± 1%  6.27ns ± 1%  -20.79%  (p=0.000 n=29+25)
NonZeroShifts/10/shrVU-8      12.3ns ± 2%  10.2ns ± 2%  -17.21%  (p=0.000 n=29+29)
NonZeroShifts/10/shlVU-8      11.9ns ± 2%  10.5ns ± 2%  -12.45%  (p=0.000 n=27+29)
NonZeroShifts/100/shrVU-8     95.9ns ± 3%  77.7ns ± 1%  -19.00%  (p=0.000 n=29+30)
NonZeroShifts/100/shlVU-8     97.5ns ± 2%  66.8ns ± 2%  -31.47%  (p=0.000 n=28+30)
NonZeroShifts/1000/shrVU-8     884ns ± 2%   705ns ± 1%  -20.17%  (p=0.000 n=30+28)
NonZeroShifts/1000/shlVU-8     880ns ± 2%   590ns ± 1%  -32.96%  (p=0.000 n=28+25)
NonZeroShifts/10000/shrVU-8   8.74µs ± 1%  7.34µs ± 3%  -15.94%  (p=0.000 n=27+30)
NonZeroShifts/10000/shlVU-8   8.73µs ± 1%  6.00µs ± 1%  -31.25%  (p=0.000 n=27+28)
NonZeroShifts/100000/shrVU-8  89.6µs ± 2%  75.5µs ± 2%  -15.80%  (p=0.000 n=29+29)
NonZeroShifts/100000/shlVU-8  89.6µs ± 2%  68.0µs ± 3%  -24.09%  (p=0.000 n=27+30)

Change-Id: I18f58d8f5513d737d9cdf09b8f9d14011ffe3958
Reviewed-on: https://go-review.googlesource.com/c/go/+/297050
Trust: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2021-03-11 19:11:46 +00:00
Alberto Donizetti
b70a2bc9c6 cmd/compile: make ValAndOff.{Val,Off} return an int32
The ValAndOff type is a 64bit integer holding a 32bit value and a
32bit offset in each half, but for historical reasons its Val and Off
methods returned an int64. This was convenient when AuxInt was always
an int64, but now that AuxInts are typed we can return int32 from Val
and Off and get rid of a several casts and now unnecessary range
checks.

This change:

- changes the Val and Off methods to return an int32 (from int64);
- adds Val64 and Off64 methods for convenience in the few remaining
  places (in the ssa.go files) where Val and Off are stored in int64
  fields;
- deletes makeValAndOff64, renames makeValAndOff32 to makeValAndOff
- deletes a few ValAndOff methods that are now unused;
- removes several validOff/validValAndOff check that will always
  return true.

Passes:

  GOARCH=amd64 gotip build -toolexec 'toolstash -cmp' -a std
  GOARCH=386 gotip build -toolexec 'toolstash -cmp' -a std
  GOARCH=s390x gotip build -toolexec 'toolstash -cmp' -a std

(the three GOARCHs with SSA rules files impacted by the change).

Change-Id: I2abbbf42188c798631b94d3a55ca44256f140be7
Reviewed-on: https://go-review.googlesource.com/c/go/+/299149
Trust: Alberto Donizetti <alb.donizetti@gmail.com>
Trust: Keith Randall <khr@golang.org>
Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2021-03-09 08:19:14 +00:00
David Chase
a2d92b5143 cmd/compile: register abi, morestack work and mole whacking
Morestack works for non-pointer register parameters

Within a function body, pointer-typed parameters are correctly
tracked.

Results still not hooked up.

For #40724.

Change-Id: Icaee0b51d0da54af983662d945d939b756088746
Reviewed-on: https://go-review.googlesource.com/c/go/+/294410
Trust: David Chase <drchase@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2021-03-04 16:19:12 +00:00
David Chase
9f33dc3ca1 cmd/compile: handle aggregate OpArg in registers
Also handles case where OpArg does not escape but has its address
taken.

May have exposed a lurking bug in 1.16 expandCalls,
if e.g., loading len(someArrayOfstructThing[0].secondStringField)
from a local.  Maybe.

For #40724.

Change-Id: I0298c4ad5d652b5e3d7ed6a62095d59e2d8819c7
Reviewed-on: https://go-review.googlesource.com/c/go/+/293396
Trust: David Chase <drchase@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2021-03-03 15:00:48 +00:00
David Chase
f2df1e3c34 cmd/compile: retrieve Args from registers
in progress; doesn't fully work until they are also passed on
register on the caller side.

For #40724.

Change-Id: I29a6680e60bdbe9d132782530214f2a2b51fb8f6
Reviewed-on: https://go-review.googlesource.com/c/go/+/293394
Trust: David Chase <drchase@google.com>
Run-TryBot: David Chase <drchase@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2021-03-03 03:02:39 +00:00
fanzha02
2b50ab2aee cmd/compile: optimize single-precision floating point square root
Add generic rule to rewrite the single-precision square root expression
with one single-precision instruction. The optimization will reduce two
times of precision converting between double-precision and single-precision.

On arm64 flatform.

previous:
  FCVTSD F0, F0
  FSQRTD F0, F0
  FCVTDS F0, F0

optimized:
  FSQRTS S0, S0

And this patch adds the test case to check the correctness.

This patch refers to CL 241877, contributed by Alice Xu
(dianhong.xu@arm.com)

Change-Id: I6de5d02281c693017ac4bd4c10963dd55989bd7e
Reviewed-on: https://go-review.googlesource.com/c/go/+/276873
Trust: fannie zhang <Fannie.Zhang@arm.com>
Run-TryBot: fannie zhang <Fannie.Zhang@arm.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2021-03-02 06:38:07 +00:00
Josh Bleecher Snyder
a61524d103 cmd/internal/obj: add Prog.SetFrom3{Reg,Const}
These are the the most common uses, and they reduce line noise.

I don't love adding new deprecated APIs,
but since they're trivial wrappers,
it'll be very easy to update them along with the rest.
No functional changes; passes toolstash-check.

Change-Id: I691a8175cfef9081180e463c63f326376af3f3a6
Reviewed-on: https://go-review.googlesource.com/c/go/+/296009
Trust: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2021-02-25 23:19:16 +00:00
Josh Bleecher Snyder
4ebb6f5110 cmd/compile: automate resultInArg0 register checks
No functional changes; passes toolstash-check.
No measureable performance changes.

Change-Id: I2629f73d4a3cc56d80f512f33cf57cf41d8f15d3
Reviewed-on: https://go-review.googlesource.com/c/go/+/296010
Trust: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2021-02-25 18:57:20 +00:00
Cherry Zhang
bf5fa2d198 cmd/compile: guard special register usage with GOEXPERIMENT=regabi
Previously, some special register uses are only guarded with ABI
wrapper generation (-abiwrap). This CL makes it also guarded with
the GOEXPERIMENT. This way we can enable only the wrapper
generation without fully the new ABI, for benchmarking purposes.

Change-Id: I90fc34afa1dc17c9c73e7b06e940e79e4c4bf7f6
Reviewed-on: https://go-review.googlesource.com/c/go/+/295289
Trust: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Than McIntosh <thanm@google.com>
2021-02-23 17:53:19 +00:00
Cherry Zhang
5d7dc53888 [dev.regabi] cmd/compile, runtime: reserve R14 as g registers on AMD64
This is a proof-of-concept change for using the g register on
AMD64. getg is now lowered to R14 in the new ABI. The g register
is not yet used in all places where it can be used (e.g. stack
bounds check, runtime assembly code).

Change-Id: I10123ddf38e31782cf58bafcdff170aee0ff0d1b
Reviewed-on: https://go-review.googlesource.com/c/go/+/289196
Trust: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Than McIntosh <thanm@google.com>
Reviewed-by: David Chase <drchase@google.com>
2021-02-08 16:30:07 +00:00
Cherry Zhang
401d7e5a24 [dev.regabi] cmd/compile: reserve X15 as zero register on AMD64
In ABIInternal, reserve X15 as constant zero, and use it to zero
memory. (Maybe there can be more use of it?)

The register is zeroed when transition to ABIInternal from ABI0.

Caveat: using X15 generates longer instructions than using X0.
Maybe we want to use X0?

Change-Id: I12d5ee92a01fc0b59dad4e5ab023ac71bc2a8b7d
Reviewed-on: https://go-review.googlesource.com/c/go/+/288093
Trust: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2021-02-03 22:44:53 +00:00
Russ Cox
6c34d2f420 [dev.regabi] cmd/compile: split out package ssagen [generated]
[git-generate]

cd src/cmd/compile/internal/gc
rf '
	# maxOpenDefers is declared in ssa.go but used only by walk.
	mv maxOpenDefers walk.go

	# gc.Arch -> ssagen.Arch
	# It is not as nice but will do for now.
	mv Arch ArchInfo
	mv thearch Arch
	mv Arch ArchInfo arch.go

	# Pull dwarf out of pgen.go.
	mv debuginfo declPos createDwarfVars preInliningDcls \
		createSimpleVars createSimpleVar \
		createComplexVars createComplexVar \
		dwarf.go

	# Pull high-level compilation out of pgen.go,
	# leaving only the SSA code.
	mv compilequeue funccompile compile compilenow \
		compileFunctions isInlinableButNotInlined \
		initLSym \
		compile.go

	mv BoundsCheckFunc GCWriteBarrierReg ssa.go
	mv largeStack largeStackFrames CheckLargeStacks pgen.go

	# All that is left in dcl.go is the nowritebarrierrecCheck
	mv dcl.go nowb.go

	# Export API and unexport non-API.
	mv initssaconfig InitConfig
	mv isIntrinsicCall IsIntrinsicCall
	mv ssaDumpInline DumpInline
	mv initSSATables InitTables
	mv initSSAEnv InitEnv
	mv compileSSA Compile
	mv stackOffset StackOffset
	mv canSSAType TypeOK
	mv SSAGenState State
	mv FwdRefAux fwdRefAux

	mv cgoSymABIs CgoSymABIs
	mv readSymABIs ReadSymABIs
	mv initLSym InitLSym
	mv useABIWrapGen symabiDefs CgoSymABIs ReadSymABIs InitLSym selectLSym makeABIWrapper setupTextLSym abi.go

	mv arch.go abi.go nowb.go phi.go pgen.go pgen_test.go ssa.go cmd/compile/internal/ssagen
'
rm go.go gsubr.go

Change-Id: I47fad6cbf1d1e583fd9139003a08401d7cd048a1
Reviewed-on: https://go-review.googlesource.com/c/go/+/279476
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2020-12-23 06:39:29 +00:00
Russ Cox
dac0de3748 [dev.regabi] cmd/compile: move type size calculations into package types [generated]
To break up package gc, we need to put these calculations somewhere
lower in the import graph, either an existing or new package. Package types
already needs this code and is using hacks to get it without an import cycle.
We can remove the hacks and set up for the new package gc by moving the
code into package types itself.

[git-generate]
cd src/cmd/compile/internal/gc
rf '
	# Remove old import cycle hacks in gc.
	rm TypecheckInit:/types.Widthptr =/-0,/types.Dowidth =/+0 \
		../ssa/export_test.go:/types.Dowidth =/-+
	ex {
		import "cmd/compile/internal/types"
		types.Widthptr -> Widthptr
		types.Dowidth -> dowidth
	}

	# Disable CalcSize in tests instead of base.Fatalf
	sub dowidth:/base.Fatalf\("dowidth without betypeinit"\)/ \
		// Assume this is a test. \
		return

	# Move size calculation into cmd/compile/internal/types
	mv Widthptr PtrSize
	mv Widthreg RegSize
	mv slicePtrOffset SlicePtrOffset
	mv sliceLenOffset SliceLenOffset
	mv sliceCapOffset SliceCapOffset
	mv sizeofSlice SliceSize
	mv sizeofString StringSize
	mv skipDowidthForTracing SkipSizeForTracing
	mv dowidth CalcSize
	mv checkwidth CheckSize
	mv widstruct calcStructOffset
	mv sizeCalculationDisabled CalcSizeDisabled
	mv defercheckwidth DeferCheckSize
	mv resumecheckwidth ResumeCheckSize
	mv typeptrdata PtrDataSize
	mv \
		PtrSize RegSize SlicePtrOffset SkipSizeForTracing typePos align.go PtrDataSize \
		size.go
	mv size.go cmd/compile/internal/types
'

: # Remove old import cycle hacks in types.
cd ../types
rf '
	ex {
		Widthptr -> PtrSize
		Dowidth -> CalcSize
	}
	rm Widthptr Dowidth
'

Change-Id: Ib96cdc6bda2617235480c29392ea5cfb20f60cd8
Reviewed-on: https://go-review.googlesource.com/c/go/+/279234
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2020-12-23 06:38:20 +00:00
Russ Cox
65c4c6dfb2 [dev.regabi] cmd/compile: group known symbols, packages, names [generated]
There are a handful of pre-computed magic symbols known by
package gc, and we need a place to store them.

If we keep them together, the need for type *ir.Name means that
package ir is the lowest package in the import hierarchy that they
can go in. And package ir needs gopkg for methodSymSuffix
(in a later CL), so they can't go any higher either, at least not all together.
So package ir it is.

Rather than dump them all into the top-level package ir
namespace, however, we introduce global structs, Syms, Pkgs, and Names,
and make the known symbols, packages, and names fields of those.

[git-generate]
cd src/cmd/compile/internal/gc

rf '
	add go.go:$ \
		// Names holds known names. \
		var Names struct{} \
		\
		// Syms holds known symbols. \
		var Syms struct {} \
		\
		// Pkgs holds known packages. \
		var Pkgs struct {} \

	mv staticuint64s Names.Staticuint64s
	mv zerobase Names.Zerobase

	mv assertE2I Syms.AssertE2I
	mv assertE2I2 Syms.AssertE2I2
	mv assertI2I Syms.AssertI2I
	mv assertI2I2 Syms.AssertI2I2
	mv deferproc Syms.Deferproc
	mv deferprocStack Syms.DeferprocStack
	mv Deferreturn Syms.Deferreturn
	mv Duffcopy Syms.Duffcopy
	mv Duffzero Syms.Duffzero
	mv gcWriteBarrier Syms.GCWriteBarrier
	mv goschedguarded Syms.Goschedguarded
	mv growslice Syms.Growslice
	mv msanread Syms.Msanread
	mv msanwrite Syms.Msanwrite
	mv msanmove Syms.Msanmove
	mv newobject Syms.Newobject
	mv newproc Syms.Newproc
	mv panicdivide Syms.Panicdivide
	mv panicshift Syms.Panicshift
	mv panicdottypeE Syms.PanicdottypeE
	mv panicdottypeI Syms.PanicdottypeI
	mv panicnildottype Syms.Panicnildottype
	mv panicoverflow Syms.Panicoverflow
	mv raceread Syms.Raceread
	mv racereadrange Syms.Racereadrange
	mv racewrite Syms.Racewrite
	mv racewriterange Syms.Racewriterange
	mv SigPanic Syms.SigPanic
	mv typedmemclr Syms.Typedmemclr
	mv typedmemmove Syms.Typedmemmove
	mv Udiv Syms.Udiv
	mv writeBarrier Syms.WriteBarrier
	mv zerobaseSym Syms.Zerobase
	mv arm64HasATOMICS Syms.ARM64HasATOMICS
	mv armHasVFPv4 Syms.ARMHasVFPv4
	mv x86HasFMA Syms.X86HasFMA
	mv x86HasPOPCNT Syms.X86HasPOPCNT
	mv x86HasSSE41 Syms.X86HasSSE41
	mv WasmDiv Syms.WasmDiv
	mv WasmMove Syms.WasmMove
	mv WasmZero Syms.WasmZero
	mv WasmTruncS Syms.WasmTruncS
	mv WasmTruncU Syms.WasmTruncU

	mv gopkg Pkgs.Go
	mv itabpkg Pkgs.Itab
	mv itablinkpkg Pkgs.Itablink
	mv mappkg Pkgs.Map
	mv msanpkg Pkgs.Msan
	mv racepkg Pkgs.Race
	mv Runtimepkg Pkgs.Runtime
	mv trackpkg Pkgs.Track
	mv unsafepkg Pkgs.Unsafe

	mv Names Syms Pkgs symtab.go
	mv symtab.go cmd/compile/internal/ir
'

Change-Id: Ic143862148569a3bcde8e70b26d75421aa2d00f3
Reviewed-on: https://go-review.googlesource.com/c/go/+/279235
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2020-12-23 06:38:07 +00:00
Russ Cox
26b66fd60b [dev.regabi] cmd/compile: introduce cmd/compile/internal/base [generated]
Move Flag, Debug, Ctxt, Exit, and error messages to
new package cmd/compile/internal/base.

These are the core functionality that everything in gc uses
and which otherwise prevent splitting any other code
out of gc into different packages.

A minor milestone: the compiler source code
no longer contains the string "yy".

[git-generate]
cd src/cmd/compile/internal/gc
rf '
        mv atExit AtExit
        mv Ctxt atExitFuncs AtExit Exit base.go

        mv lineno Pos
        mv linestr FmtPos
        mv flusherrors FlushErrors
        mv yyerror Errorf
        mv yyerrorl ErrorfAt
        mv yyerrorv ErrorfVers
        mv noder.yyerrorpos noder.errorAt
        mv Warnl WarnfAt
        mv errorexit ErrorExit

        mv base.go debug.go flag.go print.go cmd/compile/internal/base
'

: # update comments
sed -i '' 's/yyerrorl/ErrorfAt/g; s/yyerror/Errorf/g' *.go

: # bootstrap.go is not built by default so invisible to rf
sed -i '' 's/Fatalf/base.Fatalf/' bootstrap.go
goimports -w bootstrap.go

: # update cmd/dist to add internal/base
cd ../../../dist
sed -i '' '/internal.amd64/a\
	"cmd/compile/internal/base",
' buildtool.go
gofmt -w buildtool.go

Change-Id: I59903c7084222d6eaee38823fd222159ba24a31a
Reviewed-on: https://go-review.googlesource.com/c/go/+/272250
Trust: Russ Cox <rsc@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2020-11-25 16:39:54 +00:00
Russ Cox
3c240f5d17 [dev.regabi] cmd/compile: clean up debug flag (-d) handling [generated]
The debug table is not as haphazard as flags, but there are still
a few mismatches between command-line names and variable names.
This CL moves them all into a consistent home (var Debug, like var Flag).

Code updated automatically using the rf command below.
A followup CL will make a few manual cleanups, leaving this CL
completely automated and easier to regenerate during merge
conflicts.

[git-generate]
cd src/cmd/compile/internal/gc
rf '
	add main.go var Debug struct{}
	mv Debug_append Debug.Append
	mv Debug_checkptr Debug.Checkptr
	mv Debug_closure Debug.Closure
	mv Debug_compilelater Debug.CompileLater
	mv disable_checknil Debug.DisableNil
	mv debug_dclstack Debug.DclStack
	mv Debug_gcprog Debug.GCProg
	mv Debug_libfuzzer Debug.Libfuzzer
	mv Debug_checknil Debug.Nil
	mv Debug_panic Debug.Panic
	mv Debug_slice Debug.Slice
	mv Debug_typeassert Debug.TypeAssert
	mv Debug_wb Debug.WB
	mv Debug_export Debug.Export
	mv Debug_pctab Debug.PCTab
	mv Debug_locationlist Debug.LocationLists
	mv Debug_typecheckinl Debug.TypecheckInl
	mv Debug_gendwarfinl Debug.DwarfInl
	mv Debug_softfloat Debug.SoftFloat
	mv Debug_defer Debug.Defer
	mv Debug_dumpptrs Debug.DumpPtrs

	mv flag.go:/parse.-d/-1,/unknown.debug/+2 parseDebug

	mv debugtab Debug parseDebug \
		debugHelpHeader debugHelpFooter \
		debug.go

	# Remove //go:generate line copied from main.go
	rm debug.go:/go:generate/-+
'

Change-Id: I625761ca5659be4052f7161a83baa00df75cca91
Reviewed-on: https://go-review.googlesource.com/c/go/+/272246
Trust: Russ Cox <rsc@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2020-11-25 16:39:42 +00:00
David Chase
92c732e901 cmd/compile: fix load of interface{}-typed OpIData in expand_calls
In certain cases, the declkared type of an OpIData is interface{}.
This was not expected (since interface{} is a pair, right?) and
thus caused a crash.  What is intended is that these be treated as
a byteptr, so do that instead (this is what happens in 1.15).

Fixes #42568.

Change-Id: Id7c9e5dc2cbb5d7c71c6748832491ea62b0b339f
Reviewed-on: https://go-review.googlesource.com/c/go/+/270057
Trust: David Chase <drchase@google.com>
Run-TryBot: David Chase <drchase@google.com>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
2020-11-14 17:24:37 +00:00
Michele Di Pede
94b3fd06cb cmd/compile: code cleanup
Change-Id: Ibf68e663f29a5cb3b64a7d923c005c16da647769
Reviewed-on: https://go-review.googlesource.com/c/go/+/266537
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
2020-10-30 19:48:57 +00:00
Michael Pratt
44dbeaf356 cmd/compile: intrinsify runtime/internal/atomic.{And,Or} on AMD64
These are identical to And8 and Or8, just using ANDL/ORL instead of
ANDB/ORB.

Change-Id: I99cf90a8b0b5f211fb23325dddd55821875f0c8f
Reviewed-on: https://go-review.googlesource.com/c/go/+/263140
Run-TryBot: Michael Pratt <mpratt@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Trust: Michael Pratt <mpratt@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-10-23 14:18:13 +00:00
Keith Randall
9e70564f63 cmd/compile,cmd/asm: simplify recording of branch targets, take 2
We currently use two fields to store the targets of branches.
Some phases use p.To.Val, some use p.Pcond. Rewrite so that
every branch instruction uses p.To.Val.
p.From.Val is also used in rare instances.
Introduce a Pool link for use by arm/arm64, instead of
repurposing Pcond.

This is a cleanup CL in preparation for some stack frame CLs.

Change-Id: If8239177e4b1ea2bccd0608eb39553d23210d405
Reviewed-on: https://go-review.googlesource.com/c/go/+/251437
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-08-31 17:36:08 +00:00
Keith Randall
26ad27bb02 Revert "cmd/compile,cmd/asm: simplify recording of branch targets"
This reverts CL 243318.

Reason for revert: Seems to be crashing some builders.

Change-Id: I2ffc59bc5535be60b884b281c8d0eff4647dc756
Reviewed-on: https://go-review.googlesource.com/c/go/+/251169
Reviewed-by: Bryan C. Mills <bcmills@google.com>
2020-08-28 02:10:13 +00:00
Keith Randall
8247da3662 cmd/compile,cmd/asm: simplify recording of branch targets
We currently use two fields to store the targets of branches.
Some phases use p.To.Val, some use p.Pcond. Rewrite so that
every branch instruction uses p.To.Val.
p.From.Val is also used in rare instances.
Introduce a Pool link for use by arm/arm64, instead of
repurposing Pcond.

This is a cleanup CL in preparation for some stack frame CLs.

Change-Id: I9055bf0a1d986aff421e47951a1dedc301c846f8
Reviewed-on: https://go-review.googlesource.com/c/go/+/243318
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-08-27 22:35:45 +00:00
Keith Randall
c4fed25553 cmd/compile: add floating point load+op operations to addressing modes pass
They were missed as part of the refactoring to use a separate
addressing modes pass.

Fixes #40426

Change-Id: Ie0418b2fac4ba1ffe720644ac918f6d728d5e420
Reviewed-on: https://go-review.googlesource.com/c/go/+/244859
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-07-27 18:24:32 +00:00
Xiangdong Ji
e8f5a33191 cmd/compile: fix incorrect rewriting to if condition
Some ARM64 rewriting rules convert 'comparing to zero' conditions of if
statements to a simplified version utilizing CMN and CMP instructions to
branch over condition flags, in order to save one Add or Sub caculation.

Such optimizations lead to wrong branching in case an overflow/underflow
occurs when executing CMN or CMP.

Fix the issue by introducing new block opcodes that don't honor the
overflow/underflow flag, in the following categories:

  Block-Op        Meaning                   ARM condition codes
  1. LTnoov        less than                 MI
  2. GEnoov        greater than or equal     PL
  3. LEnoov        less than or equal        MI || EQ
  4. GTnoov        greater than              NEQ & PL

The backend generates two consecutive branch instructions for 'LEnoov'
and 'GTnoov' to model their expected behavior. A slight change to 'gc'
and amd64/386 backends is made to unify the code generation.

Add a test 'TestCondRewrite' as justification, it covers 32 incorrect rules
identified on arm64, more might be needed on other arches, like 32-bit arm.

Add two benchmarks profiling the aforementioned category 1&2 and category
3&4 separetely, we expect the first two categories will show performance
improvement and the second will not result in visible regression compared with
the non-optimized version.

This change also updates TestFormats to support using %#x.

Examples exhibiting where does the issue come from:
  1: 'if x + 3 < 0' might be converted to:
  before:
    CMN $3, R0
    BGE <else branch> // wrong branch is taken if 'x+3' overflows
  after:
    CMN $3, R0
    BPL <else branch>

  2: 'if y - 3 > 0' might be converted to:
  before:
    CMP $3, R0
    BLE <else branch> // wrong branch is taken if 'y-3' underflows
  after:
    CMP $3, R0
    BMI <else branch>
    BEQ <else branch>

Benchmark data from different kinds of arm64 servers, 'old' is the non-optimized
version (not the parent commit), generally the optimization version outperforms.

S1:
name                    old time/op  new time/op  delta
CondRewrite/SoloJump  13.6ns ± 0%  12.9ns ± 0%  -5.15%  (p=0.000 n=10+10)
CondRewrite/CombJump  13.8ns ± 1%  12.9ns ± 0%  -6.32%  (p=0.000 n=10+10)

S2:
name                     old time/op  new time/op  delta
CondRewrite/SoloJump  11.6ns ± 0%  10.9ns ± 0%  -6.03%  (p=0.000 n=10+10)
CondRewrite/CombJump  11.4ns ± 0%  10.8ns ± 1%  -5.53%  (p=0.000 n=10+10)

S3:
name                     old time/op  new time/op  delta
CondRewrite/SoloJump  7.36ns ± 0%  7.50ns ± 0%  +1.79%  (p=0.000 n=9+10)
CondRewrite/CombJump  7.35ns ± 0%  7.75ns ± 0%  +5.51%  (p=0.000 n=8+9)

S4:
name                      old time/op  new time/op  delta
CondRewrite/SoloJump-224  11.5ns ± 1%  10.9ns ± 0%  -4.97%  (p=0.000 n=10+10)
CondRewrite/CombJump-224  11.9ns ± 0%  11.5ns ± 0%  -2.95%  (p=0.000 n=10+10)

S5:
name                     old time/op  new time/op  delta
CondRewrite/SoloJump  10.0ns ± 0%  10.0ns ± 0%  -0.45%  (p=0.000 n=9+10)
CondRewrite/CombJump  9.93ns ± 0%  9.77ns ± 0%  -1.53%  (p=0.000 n=10+9)

Go1 perf. data:

name                     old time/op    new time/op    delta
BinaryTree17              6.29s ± 1%     6.30s ± 1%    ~     (p=1.000 n=5+5)
Fannkuch11                5.40s ± 0%     5.40s ± 0%    ~     (p=0.841 n=5+5)
FmtFprintfEmpty          97.9ns ± 0%    98.9ns ± 3%    ~     (p=0.937 n=4+5)
FmtFprintfString          171ns ± 3%     171ns ± 2%    ~     (p=0.754 n=5+5)
FmtFprintfInt             212ns ± 0%     217ns ± 6%  +2.55%  (p=0.008 n=5+5)
FmtFprintfIntInt          296ns ± 1%     297ns ± 2%    ~     (p=0.516 n=5+5)
FmtFprintfPrefixedInt     371ns ± 2%     374ns ± 7%    ~     (p=1.000 n=5+5)
FmtFprintfFloat           435ns ± 1%     439ns ± 2%    ~     (p=0.056 n=5+5)
FmtManyArgs              1.37µs ± 1%    1.36µs ± 1%    ~     (p=0.730 n=5+5)
GobDecode                14.6ms ± 4%    14.4ms ± 4%    ~     (p=0.690 n=5+5)
GobEncode                11.8ms ±20%    11.6ms ±15%    ~     (p=1.000 n=5+5)
Gzip                      507ms ± 0%     491ms ± 0%  -3.22%  (p=0.008 n=5+5)
Gunzip                   73.8ms ± 0%    73.9ms ± 0%    ~     (p=0.690 n=5+5)
HTTPClientServer          116µs ± 0%     116µs ± 0%    ~     (p=0.686 n=4+4)
JSONEncode               21.8ms ± 1%    21.6ms ± 2%    ~     (p=0.151 n=5+5)
JSONDecode                104ms ± 1%     103ms ± 1%  -1.08%  (p=0.016 n=5+5)
Mandelbrot200            9.53ms ± 0%    9.53ms ± 0%    ~     (p=0.421 n=5+5)
GoParse                  7.55ms ± 1%    7.51ms ± 1%    ~     (p=0.151 n=5+5)
RegexpMatchEasy0_32       158ns ± 0%     158ns ± 0%    ~     (all equal)
RegexpMatchEasy0_1K       606ns ± 1%     608ns ± 3%    ~     (p=0.937 n=5+5)
RegexpMatchEasy1_32       143ns ± 0%     144ns ± 1%    ~     (p=0.095 n=5+4)
RegexpMatchEasy1_1K       927ns ± 2%     944ns ± 2%    ~     (p=0.056 n=5+5)
RegexpMatchMedium_32     16.0ns ± 0%    16.0ns ± 0%    ~     (all equal)
RegexpMatchMedium_1K     69.3µs ± 2%    69.7µs ± 0%    ~     (p=0.690 n=5+5)
RegexpMatchHard_32       3.73µs ± 0%    3.73µs ± 1%    ~     (p=0.984 n=5+5)
RegexpMatchHard_1K        111µs ± 1%     110µs ± 0%    ~     (p=0.151 n=5+5)
Revcomp                   1.91s ±47%     1.77s ±68%    ~     (p=1.000 n=5+5)
Template                  138ms ± 1%     138ms ± 1%    ~     (p=1.000 n=5+5)
TimeParse                 787ns ± 2%     785ns ± 1%    ~     (p=0.540 n=5+5)
TimeFormat                729ns ± 1%     726ns ± 1%    ~     (p=0.151 n=5+5)

Updates #38740
Change-Id: I06c604874acdc1e63e66452dadee5df053045222
Reviewed-on: https://go-review.googlesource.com/c/go/+/233097
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
2020-05-29 15:39:54 +00:00
Keith Randall
9ed0fb42e3 cmd/compile: add indexed memory modification ops to amd64
name            old time/op  new time/op  delta
Modify-16        404ns ± 1%   365ns ± 1%  -9.73%  (p=0.000 n=10+10)
ConstModify-16   407ns ± 0%   385ns ± 2%  -5.56%  (p=0.000 n=9+10)

Seems to generally help generated code.

Binary size change is in the noise.

Change-Id: I57891bfaf0f7dfc5d143bb9f7ebafc7079d2614f
Reviewed-on: https://go-review.googlesource.com/c/go/+/228098
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2020-04-30 17:21:31 +00:00
Keith Randall
882ec701d2 cmd/compile: add indexed load+op operations to amd64
name        old time/op  new time/op  delta
LoadAdd-16   545ns ± 0%   456ns ± 0%  -16.31%  (p=0.000 n=10+10)

Update #36468

Change-Id: I84f390d55490648fa1f58cdbc24fd74c4f1bc8c1
Reviewed-on: https://go-review.googlesource.com/c/go/+/227960
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2020-04-30 17:19:57 +00:00
Josh Bleecher Snyder
b3e8a00060 cmd/compile: move duffcopy auxint calculation out of rewrite rules
Package amd64 is a more natural home for it.
It also makes it easier to see how many bytes
are being copied in ssa.html.

Passes toolstash-check.

Change-Id: I5ecf0f0f18e8db2faa2caf7a05028c310952bd94
Reviewed-on: https://go-review.googlesource.com/c/go/+/229703
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2020-04-24 23:56:54 +00:00
Keith Randall
ea7126fe14 cmd/compile: use a Sym type instead of interface{} for symbolic offsets
Will help with strongly typed rewrite rules.

Change-Id: Ifbf316a49f4081322b3b8f13bc962713437d9aba
Reviewed-on: https://go-review.googlesource.com/c/go/+/227785
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
2020-04-10 16:24:46 +00:00
Josh Bleecher Snyder
7ee8467b27 cmd/compile: use MOVBQZX for OpAMD64LoweredHasCPUFeature
In the commit message of CL 212360, I wrote:

> This new intrinsic ... generates MOVB+TESTB+NE.
> (It is possible that MOVBQZX+TESTQ+NE would be better.)

I should have tested. MOVBQZX+TESTQ+NE does in fact appear to be better.

For the benchmark in #36196, on my machine:

name      old time/op  new time/op  delta
FMA-8     0.86ns ± 6%  0.70ns ± 5%  -18.79%  (p=0.000 n=98+97)
NonFMA-8  0.61ns ± 5%  0.60ns ± 4%   -0.74%  (p=0.001 n=100+97)

Interestingly, these are both considerably faster than
the measurements I took a couple of months ago (1.4ns/2ns).
It appears that CL 219131 (clearing VZEROUPPER in asyncPreempt) helped a lot.
And FMA is now once again slower than NonFMA, although this change
helps it regain some ground.

Updates #15808
Updates #36351
Updates #36196

Change-Id: I8a326289a963b1939aaa7eaa2fab2ec536467c7d
Reviewed-on: https://go-review.googlesource.com/c/go/+/227238
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2020-04-07 18:19:55 +00:00
Josh Bleecher Snyder
fff7509d47 cmd/compile: add intrinsic HasCPUFeature for checking cpu features
Before using some CPU instructions, we must check for their presence.
We use global variables in the runtime package to record features.

Prior to this CL, we issued a regular memory load for these features.
The downside to this is that, because it is a regular memory load,
it cannot be hoisted out of loops or otherwise reordered with other loads.

This CL introduces a new intrinsic just for checking cpu features.
It still ends up resulting in a memory load, but that memory load can
now be floated to the entry block and rematerialized as needed.

One downside is that the regular load could be combined with the comparison
into a CMPBconstload+NE. This new intrinsic cannot; it generates MOVB+TESTB+NE.
(It is possible that MOVBQZX+TESTQ+NE would be better.)

This CL does only amd64. It is easy to extend to other architectures.

For the benchmark in #36196, on my machine, this offers a mild speedup.

name      old time/op  new time/op  delta
FMA-8     1.39ns ± 6%  1.29ns ± 9%  -7.19%  (p=0.000 n=97+96)
NonFMA-8  2.03ns ±11%  2.04ns ±12%    ~     (p=0.618 n=99+98)

Updates #15808
Updates #36196

Change-Id: I75e2fcfcf5a6df1bdb80657a7143bed69fca6deb
Reviewed-on: https://go-review.googlesource.com/c/go/+/212360
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Giovanni Bajo <rasky@develer.com>
2020-04-04 01:01:04 +00:00
Keith Randall
bba88467f8 cmd/compile: add indexed-load CMP instructions
Things like CMPQ 4(AX)(BX*8), CX

Fixes #37955

Change-Id: Icbed430f65c91a0e3f38a633d8321d79433ad8b3
Reviewed-on: https://go-review.googlesource.com/c/go/+/224219
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2020-04-01 17:03:26 +00:00
Josh Bleecher Snyder
8114242359 cmd/compile, runtime: use more registers for amd64 write barrier calls
The compiler-inserted write barrier calls use a special ABI
for speed and to minimize the binary size impact.

runtime.gcWriteBarrier takes its args in DI and AX.
This change adds gcWriteBarrier wrapper functions,
varying only in the register used for the second argument.
(Allowing variation in the first argument doesn't offer improvements,
which is convenient, as it avoids quadratic API growth.)
This reduces the number of register copies.

The goals are reduced binary size via reduced register pressure/copies.

One downside to this change is that when the write barrier is on,
we may bounce through several different write barrier wrappers,
which is bad for the instruction cache.

Package runtime write barrier benchmarks for this change:

name                old time/op  new time/op  delta
WriteBarrier-8      16.6ns ± 6%  15.6ns ± 6%  -5.73%  (p=0.000 n=97+99)
BulkWriteBarrier-8  4.37ns ± 7%  4.22ns ± 8%  -3.45%  (p=0.000 n=96+99)

However, I don't particularly trust these numbers.
I ran runtime.BenchmarkWriteBarrier multiple times as I rebased
this change, and noticed that the results have high variance
depending on the parent change, perhaps due to aligment.

This change was stress tested with GOGC=1 GODEBUG=gccheckmark=1 go test std.

This change reduces binary sizes:

file      before    after     Δ       %
addr2line 4308720   4296688   -12032  -0.279%
api       5965592   5945368   -20224  -0.339%
asm       5148088   5025464   -122624 -2.382%
buildid   2848760   2844904   -3856   -0.135%
cgo       4828968   4812840   -16128  -0.334%
compile   19754720  19529744  -224976 -1.139%
cover     5256840   5236600   -20240  -0.385%
dist      3670312   3658264   -12048  -0.328%
doc       4669608   4657576   -12032  -0.258%
fix       3377976   3365944   -12032  -0.356%
link      6614888   6586472   -28416  -0.430%
nm        4258368   4254528   -3840   -0.090%
objdump   4656336   4644304   -12032  -0.258%
pack      2295176   2295432   +256    +0.011%
pprof     14762356  14709364  -52992  -0.359%
test2json 2824456   2820600   -3856   -0.137%
trace     11684404  11643700  -40704  -0.348%
vet       8284760   8252248   -32512  -0.392%
total     115210328 114580040 -630288 -0.547%

This change improves compiler performance:

name        old time/op       new time/op       delta
Template          208ms ± 3%        207ms ± 3%  -0.40%  (p=0.030 n=43+44)
Unicode          80.2ms ± 3%       81.3ms ± 3%  +1.25%  (p=0.000 n=41+44)
GoTypes           699ms ± 3%        694ms ± 2%  -0.71%  (p=0.016 n=42+37)
Compiler          3.26s ± 2%        3.23s ± 2%  -0.86%  (p=0.000 n=43+45)
SSA               6.97s ± 1%        6.93s ± 1%  -0.63%  (p=0.000 n=43+45)
Flate             134ms ± 3%        133ms ± 2%    ~     (p=0.139 n=45+42)
GoParser          165ms ± 2%        164ms ± 1%  -0.79%  (p=0.000 n=45+40)
Reflect           434ms ± 4%        435ms ± 4%    ~     (p=0.937 n=44+44)
Tar               181ms ± 2%        181ms ± 2%    ~     (p=0.702 n=43+45)
XML               244ms ± 2%        244ms ± 2%    ~     (p=0.237 n=45+44)
[Geo mean]        403ms             402ms       -0.29%

name        old user-time/op  new user-time/op  delta
Template          271ms ± 2%        268ms ± 1%  -1.40%  (p=0.000 n=42+42)
Unicode           117ms ± 3%        116ms ± 5%    ~     (p=0.066 n=45+45)
GoTypes           948ms ± 2%        936ms ± 2%  -1.30%  (p=0.000 n=41+40)
Compiler          4.26s ± 1%        4.21s ± 2%  -1.25%  (p=0.000 n=37+45)
SSA               9.52s ± 2%        9.41s ± 1%  -1.18%  (p=0.000 n=44+45)
Flate             167ms ± 2%        165ms ± 2%  -1.15%  (p=0.000 n=44+41)
GoParser          201ms ± 2%        198ms ± 1%  -1.40%  (p=0.000 n=43+43)
Reflect           563ms ± 8%        560ms ± 7%    ~     (p=0.206 n=45+44)
Tar               224ms ± 2%        222ms ± 2%  -0.81%  (p=0.000 n=45+45)
XML               308ms ± 2%        304ms ± 1%  -1.17%  (p=0.000 n=42+43)
[Geo mean]        525ms             519ms       -1.08%

name        old alloc/op      new alloc/op      delta
Template         36.3MB ± 0%       36.3MB ± 0%    ~     (p=0.421 n=5+5)
Unicode          28.4MB ± 0%       28.3MB ± 0%    ~     (p=0.056 n=5+5)
GoTypes           121MB ± 0%        121MB ± 0%  -0.14%  (p=0.008 n=5+5)
Compiler          567MB ± 0%        567MB ± 0%  -0.06%  (p=0.016 n=4+5)
SSA              1.26GB ± 0%       1.26GB ± 0%  -0.07%  (p=0.008 n=5+5)
Flate            22.9MB ± 0%       22.8MB ± 0%    ~     (p=0.310 n=5+5)
GoParser         28.0MB ± 0%       27.9MB ± 0%  -0.09%  (p=0.008 n=5+5)
Reflect          78.4MB ± 0%       78.4MB ± 0%  -0.03%  (p=0.008 n=5+5)
Tar              34.2MB ± 0%       34.2MB ± 0%  -0.05%  (p=0.008 n=5+5)
XML              44.4MB ± 0%       44.4MB ± 0%  -0.04%  (p=0.016 n=5+5)
[Geo mean]       76.4MB            76.3MB       -0.05%

name        old allocs/op     new allocs/op     delta
Template           356k ± 0%         356k ± 0%  -0.13%  (p=0.008 n=5+5)
Unicode            326k ± 0%         326k ± 0%  -0.07%  (p=0.008 n=5+5)
GoTypes           1.24M ± 0%        1.24M ± 0%  -0.24%  (p=0.008 n=5+5)
Compiler          5.30M ± 0%        5.28M ± 0%  -0.34%  (p=0.008 n=5+5)
SSA               11.9M ± 0%        11.9M ± 0%  -0.16%  (p=0.008 n=5+5)
Flate              226k ± 0%         225k ± 0%  -0.12%  (p=0.008 n=5+5)
GoParser           287k ± 0%         286k ± 0%  -0.29%  (p=0.008 n=5+5)
Reflect            930k ± 0%         929k ± 0%  -0.05%  (p=0.008 n=5+5)
Tar                332k ± 0%         331k ± 0%  -0.12%  (p=0.008 n=5+5)
XML                411k ± 0%         411k ± 0%  -0.12%  (p=0.008 n=5+5)
[Geo mean]         771k              770k       -0.16%

For some packages, this change significantly reduces the size of executable text.
Examples:

file                                   before   after    Δ       %
cmd/internal/obj/arm.s                 68658    66855    -1803   -2.626%
cmd/internal/obj/mips.s                57486    56272    -1214   -2.112%
cmd/internal/obj/arm64.s               152107   147163   -4944   -3.250%
cmd/internal/obj/ppc64.s               125544   120456   -5088   -4.053%
cmd/vendor/golang.org/x/tools/go/cfg.s 31699    30742    -957    -3.019%

Full listing:

file                                                                     before   after    Δ       %
container/ring.s                                                         1890     1870     -20     -1.058%
container/list.s                                                         5366     5390     +24     +0.447%
internal/cpu.s                                                           3298     3295     -3      -0.091%
internal/testlog.s                                                       1507     1501     -6      -0.398%
image/color.s                                                            8281     8248     -33     -0.399%
runtime.s                                                                480970   480075   -895    -0.186%
sync.s                                                                   16497    16408    -89     -0.539%
internal/singleflight.s                                                  2591     2577     -14     -0.540%
math/rand.s                                                              10456    10438    -18     -0.172%
cmd/go/internal/par.s                                                    2801     2790     -11     -0.393%
internal/reflectlite.s                                                   28477    28417    -60     -0.211%
errors.s                                                                 2750     2736     -14     -0.509%
internal/oserror.s                                                       446      434      -12     -2.691%
sort.s                                                                   17061    17046    -15     -0.088%
io.s                                                                     17063    16999    -64     -0.375%
vendor/golang.org/x/crypto/hkdf.s                                        1962     1936     -26     -1.325%
text/tabwriter.s                                                         9617     9574     -43     -0.447%
hash/crc64.s                                                             3414     3408     -6      -0.176%
hash/crc32.s                                                             6657     6651     -6      -0.090%
bytes.s                                                                  31932    31863    -69     -0.216%
strconv.s                                                                53158    52799    -359    -0.675%
strings.s                                                                42829    42665    -164    -0.383%
encoding/ascii85.s                                                       4833     4791     -42     -0.869%
vendor/golang.org/x/text/transform.s                                     16810    16724    -86     -0.512%
path.s                                                                   6848     6845     -3      -0.044%
encoding/base32.s                                                        9658     9592     -66     -0.683%
bufio.s                                                                  23051    22908    -143    -0.620%
compress/bzip2.s                                                         11773    11764    -9      -0.076%
image.s                                                                  37565    37502    -63     -0.168%
syscall.s                                                                82359    82279    -80     -0.097%
regexp/syntax.s                                                          83573    82930    -643    -0.769%
image/jpeg.s                                                             36535    36490    -45     -0.123%
regexp.s                                                                 64396    64214    -182    -0.283%
time.s                                                                   82724    82622    -102    -0.123%
plugin.s                                                                 6539     6536     -3      -0.046%
context.s                                                                10959    10865    -94     -0.858%
internal/poll.s                                                          24286    24270    -16     -0.066%
reflect.s                                                                168304   167927   -377    -0.224%
internal/fmtsort.s                                                       7416     7376     -40     -0.539%
os.s                                                                     52465    51787    -678    -1.292%
cmd/go/internal/lockedfile/internal/filelock.s                           2326     2317     -9      -0.387%
os/signal.s                                                              4657     4648     -9      -0.193%
runtime/debug.s                                                          6040     5998     -42     -0.695%
encoding/binary.s                                                        30838    30801    -37     -0.120%
vendor/golang.org/x/net/route.s                                          23694    23491    -203    -0.857%
path/filepath.s                                                          17895    17889    -6      -0.034%
cmd/vendor/golang.org/x/sys/unix.s                                       78125    78109    -16     -0.020%
io/ioutil.s                                                              6999     6996     -3      -0.043%
encoding/base64.s                                                        12094    12007    -87     -0.719%
crypto/cipher.s                                                          20466    20372    -94     -0.459%
cmd/go/internal/robustio.s                                               2672     2669     -3      -0.112%
encoding/pem.s                                                           9302     9286     -16     -0.172%
internal/obscuretestdata.s                                               1719     1695     -24     -1.396%
crypto/aes.s                                                             11014    11002    -12     -0.109%
os/exec.s                                                                29388    29231    -157    -0.534%
cmd/internal/browser.s                                                   2266     2260     -6      -0.265%
internal/goroot.s                                                        4601     4592     -9      -0.196%
vendor/golang.org/x/crypto/chacha20poly1305.s                            8945     8942     -3      -0.034%
cmd/vendor/golang.org/x/crypto/ssh/terminal.s                            27226    27195    -31     -0.114%
index/suffixarray.s                                                      36431    36411    -20     -0.055%
fmt.s                                                                    77017    76709    -308    -0.400%
encoding/hex.s                                                           6241     6154     -87     -1.394%
compress/lzw.s                                                           7133     7069     -64     -0.897%
database/sql/driver.s                                                    18888    18877    -11     -0.058%
net/url.s                                                                29838    29739    -99     -0.332%
debug/plan9obj.s                                                         8329     8279     -50     -0.600%
encoding/csv.s                                                           12986    12902    -84     -0.647%
debug/gosym.s                                                            25403    25330    -73     -0.287%
compress/flate.s                                                         51192    50970    -222    -0.434%
vendor/golang.org/x/net/dns/dnsmessage.s                                 86769    86208    -561    -0.647%
compress/gzip.s                                                          9791     9758     -33     -0.337%
compress/zlib.s                                                          7310     7277     -33     -0.451%
archive/zip.s                                                            42356    42166    -190    -0.449%
debug/dwarf.s                                                            108259   107730   -529    -0.489%
encoding/json.s                                                          106378   105910   -468    -0.440%
os/user.s                                                                14751    14724    -27     -0.183%
database/sql.s                                                           99011    98404    -607    -0.613%
log.s                                                                    9466     9423     -43     -0.454%
debug/pe.s                                                               31272    31182    -90     -0.288%
debug/macho.s                                                            32764    32608    -156    -0.476%
encoding/gob.s                                                           136976   136517   -459    -0.335%
vendor/golang.org/x/text/unicode/bidi.s                                  27318    27276    -42     -0.154%
archive/tar.s                                                            71416    70975    -441    -0.618%
vendor/golang.org/x/net/http2/hpack.s                                    23892    23848    -44     -0.184%
vendor/golang.org/x/text/secure/bidirule.s                               3354     3351     -3      -0.089%
mime/quotedprintable.s                                                   5960     5925     -35     -0.587%
net/http/internal.s                                                      5874     5853     -21     -0.358%
math/big.s                                                               184147   183692   -455    -0.247%
debug/elf.s                                                              63775    63567    -208    -0.326%
mime.s                                                                   39802    39709    -93     -0.234%
encoding/xml.s                                                           111038   110713   -325    -0.293%
crypto/dsa.s                                                             6044     6029     -15     -0.248%
go/token.s                                                               12139    12077    -62     -0.511%
crypto/rand.s                                                            6889     6866     -23     -0.334%
go/scanner.s                                                             19030    19008    -22     -0.116%
flag.s                                                                   22320    22236    -84     -0.376%
vendor/golang.org/x/text/unicode/norm.s                                  66652    66391    -261    -0.392%
crypto/rsa.s                                                             31671    31650    -21     -0.066%
crypto/elliptic.s                                                        51553    51403    -150    -0.291%
internal/xcoff.s                                                         22950    22822    -128    -0.558%
go/constant.s                                                            43750    43689    -61     -0.139%
encoding/asn1.s                                                          57086    57035    -51     -0.089%
runtime/trace.s                                                          2609     2603     -6      -0.230%
crypto/x509/pkix.s                                                       10458    10471    +13     +0.124%
image/gif.s                                                              27544    27385    -159    -0.577%
vendor/golang.org/x/net/idna.s                                           24558    24502    -56     -0.228%
image/png.s                                                              42775    42685    -90     -0.210%
vendor/golang.org/x/crypto/cryptobyte.s                                  33616    33493    -123    -0.366%
go/ast.s                                                                 80684    80449    -235    -0.291%
net/internal/socktest.s                                                  16571    16535    -36     -0.217%
crypto/ecdsa.s                                                           11948    11936    -12     -0.100%
text/template/parse.s                                                    95138    94002    -1136   -1.194%
runtime/pprof.s                                                          59702    59639    -63     -0.106%
testing.s                                                                68427    68088    -339    -0.495%
internal/testenv.s                                                       5620     5596     -24     -0.427%
testing/internal/testdeps.s                                              3312     3294     -18     -0.543%
internal/trace.s                                                         78473    78239    -234    -0.298%
testing/iotest.s                                                         4968     4908     -60     -1.208%
os/signal/internal/pty.s                                                 3011     2990     -21     -0.697%
testing/quick.s                                                          12179    12125    -54     -0.443%
cmd/internal/bio.s                                                       9286     9274     -12     -0.129%
cmd/internal/src.s                                                       17684    17663    -21     -0.119%
cmd/internal/goobj2.s                                                    12588    12558    -30     -0.238%
cmd/internal/objabi.s                                                    16408    16390    -18     -0.110%
go/printer.s                                                             77417    77308    -109    -0.141%
go/parser.s                                                              80045    79113    -932    -1.164%
go/format.s                                                              5434     5419     -15     -0.276%
cmd/internal/goobj.s                                                     26146    25954    -192    -0.734%
runtime/pprof/internal/profile.s                                         102518   102178   -340    -0.332%
text/template.s                                                          95343    94935    -408    -0.428%
cmd/internal/dwarf.s                                                     31718    31572    -146    -0.460%
cmd/vendor/golang.org/x/arch/arm/armasm.s                                45240    45151    -89     -0.197%
internal/lazytemplate.s                                                  1470     1457     -13     -0.884%
cmd/vendor/golang.org/x/arch/ppc64/ppc64asm.s                            37253    37220    -33     -0.089%
cmd/asm/internal/flags.s                                                 2593     2590     -3      -0.116%
cmd/asm/internal/lex.s                                                   25068    24921    -147    -0.586%
cmd/internal/buildid.s                                                   18536    18263    -273    -1.473%
cmd/vendor/golang.org/x/arch/x86/x86asm.s                                80209    80105    -104    -0.130%
go/doc.s                                                                 75140    74585    -555    -0.739%
cmd/internal/edit.s                                                      3893     3899     +6      +0.154%
html/template.s                                                          89377    88809    -568    -0.636%
cmd/vendor/golang.org/x/arch/arm64/arm64asm.s                            117998   117824   -174    -0.147%
cmd/internal/obj.s                                                       115015   114290   -725    -0.630%
go/build.s                                                               69379    68862    -517    -0.745%
cmd/internal/objfile.s                                                   48106    47982    -124    -0.258%
cmd/cover.s                                                              46239    46113    -126    -0.272%
cmd/addr2line.s                                                          2845     2833     -12     -0.422%
cmd/internal/obj/arm.s                                                   68658    66855    -1803   -2.626%
cmd/internal/obj/mips.s                                                  57486    56272    -1214   -2.112%
cmd/internal/obj/riscv.s                                                 63834    63006    -828    -1.297%
cmd/compile/internal/syntax.s                                            146582   145456   -1126   -0.768%
cmd/internal/obj/wasm.s                                                  44117    44066    -51     -0.116%
cmd/cgo.s                                                                242645   241653   -992    -0.409%
cmd/internal/obj/arm64.s                                                 152107   147163   -4944   -3.250%
net.s                                                                    295972   292010   -3962   -1.339%
go/types.s                                                               321371   319432   -1939   -0.603%
vendor/golang.org/x/net/http/httpproxy.s                                 9450     9423     -27     -0.286%
net/textproto.s                                                          19455    19406    -49     -0.252%
cmd/internal/obj/ppc64.s                                                 125544   120456   -5088   -4.053%
go/internal/srcimporter.s                                                6475     6409     -66     -1.019%
log/syslog.s                                                             8017     7929     -88     -1.098%
cmd/compile/internal/logopt.s                                            10183    10162    -21     -0.206%
net/mail.s                                                               24085    23948    -137    -0.569%
mime/multipart.s                                                         21527    21420    -107    -0.497%
cmd/internal/obj/s390x.s                                                 127610   127757   +147    +0.115%
go/internal/gcimporter.s                                                 34913    34548    -365    -1.045%
vendor/golang.org/x/net/nettest.s                                        28103    28016    -87     -0.310%
cmd/go/internal/cfg.s                                                    9967     9916     -51     -0.512%
cmd/api.s                                                                39703    39603    -100    -0.252%
go/internal/gccgoimporter.s                                              56470    56120    -350    -0.620%
go/importer.s                                                            2077     2056     -21     -1.011%
cmd/compile/internal/types.s                                             48202    47282    -920    -1.909%
cmd/go/internal/str.s                                                    4341     4320     -21     -0.484%
cmd/internal/obj/x86.s                                                   89440    88625    -815    -0.911%
cmd/go/internal/base.s                                                   12667    12580    -87     -0.687%
cmd/go/internal/cache.s                                                  30754    30571    -183    -0.595%
cmd/doc.s                                                                62976    62755    -221    -0.351%
cmd/go/internal/search.s                                                 20114    19993    -121    -0.602%
cmd/vendor/golang.org/x/xerrors.s                                        17923    17855    -68     -0.379%
cmd/go/internal/lockedfile.s                                             16451    16415    -36     -0.219%
cmd/vendor/golang.org/x/mod/sumdb/note.s                                 18200    18150    -50     -0.275%
cmd/vendor/golang.org/x/mod/module.s                                     17869    17851    -18     -0.101%
cmd/asm/internal/arch.s                                                  37533    37482    -51     -0.136%
cmd/fix.s                                                                87728    87492    -236    -0.269%
cmd/vendor/golang.org/x/mod/sumdb/tlog.s                                 36394    36367    -27     -0.074%
cmd/vendor/golang.org/x/mod/sumdb/dirhash.s                              4990     4963     -27     -0.541%
cmd/go/internal/imports.s                                                16499    16469    -30     -0.182%
cmd/vendor/golang.org/x/mod/zip.s                                        18816    18745    -71     -0.377%
cmd/go/internal/cmdflag.s                                                5126     5123     -3      -0.059%
cmd/internal/test2json.s                                                 9540     9452     -88     -0.922%
cmd/go/internal/tool.s                                                   3629     3623     -6      -0.165%
cmd/go/internal/version.s                                                11232    11220    -12     -0.107%
cmd/go/internal/mvs.s                                                    25383    25179    -204    -0.804%
cmd/nm.s                                                                 5815     5803     -12     -0.206%
cmd/dist.s                                                               210146   209140   -1006   -0.479%
cmd/asm/internal/asm.s                                                   68655    68549    -106    -0.154%
cmd/vendor/golang.org/x/mod/modfile.s                                    72974    72510    -464    -0.636%
cmd/go/internal/load.s                                                   107548   106861   -687    -0.639%
cmd/link/internal/sym.s                                                  18708    18581    -127    -0.679%
cmd/asm.s                                                                3367     3343     -24     -0.713%
cmd/gofmt.s                                                              30795    30698    -97     -0.315%
cmd/link/internal/objfile.s                                              21828    21630    -198    -0.907%
cmd/pack.s                                                               14878    14869    -9      -0.060%
cmd/vendor/github.com/google/pprof/internal/elfexec.s                    6788     6782     -6      -0.088%
cmd/test2json.s                                                          1647     1641     -6      -0.364%
cmd/link/internal/loader.s                                               48677    48483    -194    -0.399%
cmd/vendor/golang.org/x/tools/go/analysis/internal/analysisflags.s       16783    16773    -10     -0.060%
cmd/link/internal/loadelf.s                                              35464    35126    -338    -0.953%
cmd/link/internal/loadmacho.s                                            29438    29180    -258    -0.876%
cmd/link/internal/loadpe.s                                               16440    16371    -69     -0.420%
cmd/vendor/golang.org/x/tools/go/analysis/passes/internal/analysisutil.s 2106     2100     -6      -0.285%
cmd/link/internal/loadxcoff.s                                            11711    11615    -96     -0.820%
cmd/vendor/golang.org/x/tools/go/analysis/internal/facts.s               14954    14883    -71     -0.475%
cmd/vendor/golang.org/x/tools/go/ast/inspector.s                         5394     5374     -20     -0.371%
cmd/vendor/golang.org/x/tools/go/analysis/passes/asmdecl.s               37029    36822    -207    -0.559%
cmd/vendor/golang.org/x/tools/go/analysis/passes/inspect.s               340      337      -3      -0.882%
cmd/vendor/golang.org/x/tools/go/analysis/passes/cgocall.s               9919     9858     -61     -0.615%
cmd/vendor/golang.org/x/tools/go/analysis/passes/bools.s                 6705     6690     -15     -0.224%
cmd/vendor/golang.org/x/tools/go/analysis/passes/copylock.s              9783     9741     -42     -0.429%
cmd/vendor/golang.org/x/tools/go/cfg.s                                   31699    30742    -957    -3.019%
cmd/vendor/golang.org/x/tools/go/analysis/passes/ifaceassert.s           2768     2762     -6      -0.217%
cmd/vendor/golang.org/x/tools/go/analysis/passes/loopclosure.s           3031     2998     -33     -1.089%
cmd/vendor/golang.org/x/tools/go/analysis/passes/shift.s                 4382     4376     -6      -0.137%
cmd/vendor/golang.org/x/tools/go/analysis/passes/stdmethods.s            8654     8642     -12     -0.139%
cmd/vendor/golang.org/x/tools/go/analysis/passes/stringintconv.s         3458     3446     -12     -0.347%
cmd/vendor/golang.org/x/tools/go/analysis/passes/structtag.s             8011     7995     -16     -0.200%
cmd/vendor/golang.org/x/tools/go/analysis/passes/tests.s                 6205     6193     -12     -0.193%
cmd/vendor/golang.org/x/tools/go/ast/astutil.s                           66183    65861    -322    -0.487%
cmd/vendor/github.com/google/pprof/profile.s                             150844   150261   -583    -0.386%
cmd/vendor/golang.org/x/tools/go/analysis/passes/unreachable.s           8057     8054     -3      -0.037%
cmd/vendor/golang.org/x/tools/go/analysis/passes/unusedresult.s          3670     3667     -3      -0.082%
cmd/vendor/github.com/google/pprof/internal/measurement.s                10464    10440    -24     -0.229%
cmd/vendor/golang.org/x/tools/go/types/typeutil.s                        12319    12274    -45     -0.365%
cmd/vendor/golang.org/x/tools/go/analysis/unitchecker.s                  13503    13342    -161    -1.192%
cmd/vendor/golang.org/x/tools/go/analysis/passes/ctrlflow.s              5261     5218     -43     -0.817%
cmd/vendor/golang.org/x/tools/go/analysis/passes/errorsas.s              1462     1459     -3      -0.205%
cmd/vendor/golang.org/x/tools/go/analysis/passes/lostcancel.s            9594     9582     -12     -0.125%
cmd/vendor/golang.org/x/tools/go/analysis/passes/printf.s                34397    34338    -59     -0.172%
cmd/vendor/github.com/google/pprof/internal/graph.s                      53225    52936    -289    -0.543%
cmd/vendor/github.com/ianlancetaylor/demangle.s                          177450   175329   -2121   -1.195%
crypto/x509.s                                                            147892   147388   -504    -0.341%
cmd/go/internal/work.s                                                   306465   304950   -1515   -0.494%
cmd/go/internal/run.s                                                    4664     4657     -7      -0.150%
crypto/tls.s                                                             313130   311833   -1297   -0.414%
net/http/httptrace.s                                                     3979     3905     -74     -1.860%
net/smtp.s                                                               14413    14344    -69     -0.479%
cmd/link/internal/ld.s                                                   545343   542279   -3064   -0.562%
cmd/link/internal/mips.s                                                 6218     6215     -3      -0.048%
cmd/link/internal/mips64.s                                               6108     6103     -5      -0.082%
cmd/link/internal/amd64.s                                                18154    18112    -42     -0.231%
cmd/link/internal/arm64.s                                                22527    22494    -33     -0.146%
cmd/link/internal/arm.s                                                  22574    22494    -80     -0.354%
cmd/link/internal/s390x.s                                                20779    20746    -33     -0.159%
cmd/link/internal/wasm.s                                                 16531    16493    -38     -0.230%
cmd/link/internal/x86.s                                                  18906    18849    -57     -0.301%
cmd/link/internal/ppc64.s                                                26856    26778    -78     -0.290%
net/http.s                                                               559101   556513   -2588   -0.463%
net/http/cookiejar.s                                                     15912    15885    -27     -0.170%
expvar.s                                                                 9531     9525     -6      -0.063%
net/http/httptest.s                                                      16616    16475    -141    -0.849%
net/http/cgi.s                                                           23624    23458    -166    -0.703%
cmd/go/internal/web.s                                                    16546    16489    -57     -0.344%
cmd/vendor/golang.org/x/mod/sumdb.s                                      33197    33117    -80     -0.241%
net/http/fcgi.s                                                          19266    19169    -97     -0.503%
net/http/httputil.s                                                      39875    39728    -147    -0.369%
cmd/vendor/github.com/google/pprof/internal/symbolz.s                    5888     5867     -21     -0.357%
net/rpc.s                                                                34154    34003    -151    -0.442%
cmd/vendor/github.com/google/pprof/internal/transport.s                  2746     2716     -30     -1.092%
cmd/vendor/github.com/google/pprof/internal/binutils.s                   35999    35875    -124    -0.344%
net/rpc/jsonrpc.s                                                        6637     6598     -39     -0.588%
cmd/vendor/github.com/google/pprof/internal/symbolizer.s                 11533    11458    -75     -0.650%
cmd/go/internal/get.s                                                    62921    62803    -118    -0.188%
cmd/vendor/github.com/google/pprof/internal/report.s                     80364    80058    -306    -0.381%
cmd/go/internal/modfetch/codehost.s                                      89680    89066    -614    -0.685%
cmd/trace.s                                                              117171   116701   -470    -0.401%
cmd/vendor/github.com/google/pprof/internal/driver.s                     144268   143297   -971    -0.673%
cmd/go/internal/modfetch.s                                               126299   125860   -439    -0.348%
cmd/vendor/github.com/google/pprof/driver.s                              9042     9000     -42     -0.464%
cmd/go/internal/modconv.s                                                17947    17889    -58     -0.323%
cmd/pprof.s                                                              12399    12326    -73     -0.589%
cmd/go/internal/modload.s                                                151182   150389   -793    -0.525%
cmd/go/internal/generate.s                                               11738    11636    -102    -0.869%
cmd/go/internal/help.s                                                   6571     6531     -40     -0.609%
cmd/go/internal/clean.s                                                  11174    11142    -32     -0.286%
cmd/go/internal/vet.s                                                    7897     7867     -30     -0.380%
cmd/go/internal/envcmd.s                                                 22176    22095    -81     -0.365%
cmd/go/internal/list.s                                                   15216    15067    -149    -0.979%
cmd/go/internal/modget.s                                                 38698    38519    -179    -0.463%
cmd/go/internal/modcmd.s                                                 46674    46441    -233    -0.499%
cmd/go/internal/test.s                                                   64664    64456    -208    -0.322%
cmd/go.s                                                                 6730     6703     -27     -0.401%
cmd/compile/internal/ssa.s                                               3592565  3582500  -10065  -0.280%
cmd/compile/internal/gc.s                                                1549123  1537123  -12000  -0.775%
cmd/compile/internal/riscv64.s                                           14579    14483    -96     -0.658%
cmd/compile/internal/mips.s                                              20578    20419    -159    -0.773%
cmd/compile/internal/ppc64.s                                             25524    25359    -165    -0.646%
cmd/compile/internal/mips64.s                                            19795    19636    -159    -0.803%
cmd/compile/internal/wasm.s                                              13329    13290    -39     -0.293%
cmd/compile/internal/s390x.s                                             28097    27892    -205    -0.730%
cmd/compile/internal/arm.s                                               31489    31321    -168    -0.534%
cmd/compile/internal/arm64.s                                             29803    29590    -213    -0.715%
cmd/compile/internal/amd64.s                                             32961    33221    +260    +0.789%
cmd/compile/internal/x86.s                                               31029    30878    -151    -0.487%
total                                                                    18534966 18440341 -94625  -0.511%

Change-Id: I830d37364f14f0297800adc42c99f60a74c51aca
Reviewed-on: https://go-review.googlesource.com/c/go/+/226367
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2020-03-31 21:26:33 +00:00
Josh Bleecher Snyder
1b47fde55c cmd/compile: clarify division bounds check optimization
The name of the function should mention division.
Eliminate double negatives from the comment describing it.

Change-Id: Icef1a5139b3a91b86acb930af97938f5160f7342
Reviewed-on: https://go-review.googlesource.com/c/go/+/217001
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
2020-02-21 17:28:10 +00:00
David Chase
40ebcfaa17 cmd/compile: enable nil check logging for other architectures.
Change-Id: If82ebd9cd6470863eb5de9e031e7905a66218857
Reviewed-on: https://go-review.googlesource.com/c/go/+/204159
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2019-11-10 17:12:15 +00:00
David Chase
cd53fddabb cmd/compile: add framework for logging optimizer (non)actions to LSP
This is intended to allow IDEs to note where the optimizer
was not able to improve users' code.  There may be other
applications for this, for example in studying effectiveness
of optimizer changes more quickly than running benchmarks,
or in verifying that code changes did not accidentally disable
optimizations in performance-critical code.

Logging of nilcheck (bad) for amd64 is implemented as
proof-of-concept.  In general, the intent is that optimizations
that didn't happen are what will be logged, because that is
believed to be what IDE users want.

Added flag -json=version,dest

Check that version=0.  (Future compilers will support a
few recent versions, I hope that version is always <=3.)

Dest is expected to be one of:

/path (or \path in Windows)
  will create directory /path and fill it w/ json files
file://path
  will create directory path, intended either for
     I:\dont\know\enough\about\windows\paths
     trustme_I_know_what_I_am_doing_probably_testing

Not passing an absolute path name usually leads to
json splattered all over source directories,
or failure when those directories are not writeable.
If you want a foot-gun, you have to ask for it.

The JSON output is directed to subdirectories of dest,
where each subdirectory is net/url.PathEscape of the
package name, and each for each foo.go in the package,
net/url.PathEscape(foo).json is created.  The first line
of foo.json contains version and context information,
and subsequent lines contains LSP-conforming JSON
describing the missing optimizations.

Change-Id: Ib83176a53a8c177ee9081aefc5ae05604ccad8a0
Reviewed-on: https://go-review.googlesource.com/c/go/+/204338
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2019-11-10 17:11:34 +00:00
Austin Clements
97592b3c14 cmd/compile: intrinsics for runtime/internal/atomic.Store8
For #10958, #24543, but makes sense on its own.

Change-Id: I2a87dab66b82a1863e4b6512b1f8def51463ce2a
Reviewed-on: https://go-review.googlesource.com/c/go/+/203284
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2019-10-29 03:18:55 +00:00
smasher164
7a6da218b1 cmd/compile: add fma intrinsic for amd64
To permit ssa-level optimization, this change introduces an amd64 intrinsic
that generates the VFMADD231SD instruction for the fused-multiply-add
operation on systems that support it. System support is detected via
cpu.X86.HasFMA. A rewrite rule can then translate the generic ssa intrinsic
("Fma") to VFMADD231SD.

The benchmark compares the software implementation (old) with the intrinsic
(new).

name   old time/op  new time/op  delta
Fma-4  27.2ns ± 1%   1.0ns ± 9%  -96.48%  (p=0.008 n=5+5)

Updates #25819.

Change-Id: I966655e5f96817a5d06dff5942418a3915b09584
Reviewed-on: https://go-review.googlesource.com/c/go/+/137156
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2019-10-21 16:42:10 +00:00
Ben Shi
11d7775c9f cmd/compile: remove some nacl SSA rules
Updates golang/go#30439

Change-Id: I7ef5301fbd650d26a37a1241ddf7ca1ccd58b89d
Reviewed-on: https://go-review.googlesource.com/c/go/+/200941
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2019-10-15 16:45:31 +00:00