Commit graph

102 commits

Author SHA1 Message Date
David Chase
92cb157cf3 [dev.regabi] cmd/compile: late expansion of return values
By-hand rebase of earlier CL, because that was easier than
letting git try to figure things out.

This will naively insert self-moves; in the case that these
involve memory, the expander detects these and removes them
and their vardefs.

Change-Id: Icf72575eb7ae4a186b0de462bc8cf0bedc84d3e9
Reviewed-on: https://go-review.googlesource.com/c/go/+/279519
Trust: David Chase <drchase@google.com>
Reviewed-by: Jeremy Faller <jeremy@golang.org>
2021-01-20 19:57:48 +00:00
David Chase
c1370e918f [dev.regabi] cmd/compile: add code to support register ABI spills around morestack calls
This is a selected copy from the register ABI experiment CL, focused
on the files and data structures that handle spilling around morestack.
Unnecessary code from the experiment was removed, other code was adapted.

Would it make sense to leave comments in the experiment as pieces are
brought over?

Experiment CL (for comparison purposes)
https://go-review.googlesource.com/c/go/+/28832

Change-Id: I92136f070351d4fcca1407b52ecf9b80898fed95
Reviewed-on: https://go-review.googlesource.com/c/go/+/279520
Trust: David Chase <drchase@google.com>
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Jeremy Faller <jeremy@golang.org>
2021-01-13 15:47:13 +00:00
Matthew Dempsky
1a98ab0e2d [dev.regabi] cmd/compile: add ssa.Aux tag interface for Value.Aux
It's currently hard to automate refactorings around the Value.Aux
field, because we don't have any static typing information for it.
Adding a tag interface will make subsequent CLs easier and safer.

Passes buildall w/ toolstash -cmp.

Updates #42982.

Change-Id: I41ae8e411a66bda3195a0957b60c2fe8a8002893
Reviewed-on: https://go-review.googlesource.com/c/go/+/275756
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Trust: Matthew Dempsky <mdempsky@google.com>
2020-12-08 01:46:31 +00:00
Cuong Manh Le
0ecf769633 cmd/compile: do not mark OpSP, OpSB pos for debugging
Fixes #42801

Change-Id: I2080ecacc109479f5820035401ce2b26d72e2ef2
Reviewed-on: https://go-review.googlesource.com/c/go/+/273506
Trust: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Reviewed-by: David Chase <drchase@google.com>
2020-12-01 01:35:19 +00:00
soolaugust
2e7706d1fa ssa: comment Sdom() with the form "Sdom..."
Change-Id: I7ddb3d178e5437a7c3d8e94a089ac7a476a7dc85
GitHub-Last-Rev: bc27289128
GitHub-Pull-Request: golang/go#41925
Reviewed-on: https://go-review.googlesource.com/c/go/+/261437
Trust: Alberto Donizetti <alb.donizetti@gmail.com>
Reviewed-by: Keith Randall <khr@golang.org>
2020-10-12 15:38:05 +00:00
David Chase
44c0586931 cmd/compile: add code to expand calls just before late opt
Still needs to generate the calls that will need lowering.

Change-Id: Ifd4e510193441a5e27c462c1f1d704f07bf6dec3
Reviewed-on: https://go-review.googlesource.com/c/go/+/242359
Trust: David Chase <drchase@google.com>
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-09-18 16:14:40 +00:00
David Chase
bf833ead62 cmd/compile: ensure that ssa.Func constant cache is consistent
It was not necessarily consistent before, we were just lucky.

Change-Id: I3a92dc724e0af7b4d810a6a0b7b1d58844eb8f87
Reviewed-on: https://go-review.googlesource.com/c/go/+/251440
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-09-05 14:11:21 +00:00
David Chase
4220d67084 cmd/compile: make GOSSAHASH package-sensitive, also append to log files
Turns out if your failure is in a function with a name like "Reset()"
there will be a lot of hits on the same hashcode.  Adding package sensitivity
solves this problem.

In additionm, it turned out that in the case that a logfile was specified
for the GOSSAHASH logging, that it was opened in create mode, which meant
that multiple compiler invocations would reset the file to zero length.
Opening in append mode works better; the automated harness
(github.com/dr2chase/gossahash) takes care of truncating the file before use.

Change-Id: I5601bc280faa94cbd507d302448831849db6c842
Reviewed-on: https://go-review.googlesource.com/c/go/+/246937
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2020-08-24 02:38:18 +00:00
surechen
17553c6e71 cmd/compile: move dumpFileSeq
I noticed that there is a Todo comment here. This variable is only used for filename when dump a function's ssa passes result in details. It is no problem to print a function alone, but may be edited by not only one goroutine if dump multiple functions at the same time. Although it looks only dump one function's ssa passes now. As far as I am concerned this variable can be a member variable of the struct Func. I'm not sure if this change is necessary. Looking forward to your advices, thank you very much.

Change-Id: I35dd7247889e0cc7f19c0b400b597206592dee75
Reviewed-on: https://go-review.googlesource.com/c/go/+/244918
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
2020-08-17 21:04:19 +00:00
Keith Randall
32a84c99e1 cmd/compile: fix live variable computation for deferreturn
Taking the live variable set from the last return point is problematic.
See #40629 for details, but there may not be a return point, or it may
be before the final defer.

Additionally, keeping track of the last call as a *Value doesn't quite
work. If it is dead-code eliminated, the storage for the Value is reused
for some other random instruction. Its live variable information,
if it is available at all, is wrong.

Instead, just mark all the open-defer argument slots as live
throughout the function. (They are already zero-initialized.)

Fixes #40629

Change-Id: Ie456c7db3082d0de57eaa5234a0f32525a1cce13
Reviewed-on: https://go-review.googlesource.com/c/go/+/247522
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Dan Scales <danscales@google.com>
2020-08-14 21:49:36 +00:00
Dan Scales
1b3a1db19f cmd/compile: fix liveness for open-coded defer args for infinite loops
Once defined, a stack slot holding an open-coded defer arg should always be marked
live, since it may be used at any time if there is a panic. These stack slots are
typically kept live naturally by the open-defer code inlined at each return/exit point.
However, we need to do extra work to make sure that they are kept live if a
function has an infinite loop or a panic exit.

For this fix, only in the case of a function that is using open-coded defers, we
compute the set of blocks (most often empty) that cannot reach a return or a
BlockExit (panic) because of an infinite loop. Then, for each block b which
cannot reach a return or BlockExit or is a BlockExit block, we mark each defer arg
slot as live, as long as the definition of the defer arg slot dominates block b.

For this change, had to export (*Func).sdom (-> Sdom) and SparseTree.isAncestorEq
(-> IsAncestorEq)

Updates #35277

Change-Id: I7b53c9bd38ba384a3794386dd0eb94e4cbde4eb1
Reviewed-on: https://go-review.googlesource.com/c/go/+/204802
Run-TryBot: Dan Scales <danscales@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2019-11-05 17:19:16 +00:00
Dan Scales
be64a19d99 cmd/compile, cmd/link, runtime: make defers low-cost through inline code and extra funcdata
Generate inline code at defer time to save the args of defer calls to unique
(autotmp) stack slots, and generate inline code at exit time to check which defer
calls were made and make the associated function/method/interface calls. We
remember that a particular defer statement was reached by storing in the deferBits
variable (always stored on the stack). At exit time, we check the bits of the
deferBits variable to determine which defer function calls to make (in reverse
order). These low-cost defers are only used for functions where no defers
appear in loops. In addition, we don't do these low-cost defers if there are too
many defer statements or too many exits in a function (to limit code increase).

When a function uses open-coded defers, we produce extra
FUNCDATA_OpenCodedDeferInfo information that specifies the number of defers, and
for each defer, the stack slots where the closure and associated args have been
stored. The funcdata also includes the location of the deferBits variable.
Therefore, for panics, we can use this funcdata to determine exactly which defers
are active, and call the appropriate functions/methods/closures with the correct
arguments for each active defer.

In order to unwind the stack correctly after a recover(), we need to add an extra
code segment to functions with open-coded defers that simply calls deferreturn()
and returns. This segment is not reachable by the normal function, but is returned
to by the runtime during recovery. We set the liveness information of this
deferreturn() to be the same as the liveness at the first function call during the
last defer exit code (so all return values and all stack slots needed by the defer
calls will be live).

I needed to increase the stackguard constant from 880 to 896, because of a small
amount of new code in deferreturn().

The -N flag disables open-coded defers. '-d defer' prints out the kind of defer
being used at each defer statement (heap-allocated, stack-allocated, or
open-coded).

Cost of defer statement  [ go test -run NONE -bench BenchmarkDefer$ runtime ]
  With normal (stack-allocated) defers only:         35.4  ns/op
  With open-coded defers:                             5.6  ns/op
  Cost of function call alone (remove defer keyword): 4.4  ns/op

Text size increase (including funcdata) for go binary without/with open-coded defers:  0.09%

The average size increase (including funcdata) for only the functions that use
open-coded defers is 1.1%.

The cost of a panic followed by a recover got noticeably slower, since panic
processing now requires a scan of the stack for open-coded defer frames. This scan
is required, even if no frames are using open-coded defers:

Cost of panic and recover [ go test -run NONE -bench BenchmarkPanicRecover runtime ]
  Without open-coded defers:        62.0 ns/op
  With open-coded defers:           255  ns/op

A CGO Go-to-C-to-Go benchmark got noticeably faster because of open-coded defers:

CGO Go-to-C-to-Go benchmark [cd misc/cgo/test; go test -run NONE -bench BenchmarkCGoCallback ]
  Without open-coded defers:        443 ns/op
  With open-coded defers:           347 ns/op

Updates #14939 (defer performance)
Updates #34481 (design doc)

Change-Id: I63b1a60d1ebf28126f55ee9fd7ecffe9cb23d1ff
Reviewed-on: https://go-review.googlesource.com/c/go/+/202340
Reviewed-by: Austin Clements <austin@google.com>
2019-10-24 13:54:11 +00:00
Bryan C. Mills
b76e6f8825 Revert "cmd/compile, cmd/link, runtime: make defers low-cost through inline code and extra funcdata"
This reverts CL 190098.

Reason for revert: broke several builders.

Change-Id: I69161352f9ded02537d8815f259c4d391edd9220
Reviewed-on: https://go-review.googlesource.com/c/go/+/201519
Run-TryBot: Bryan C. Mills <bcmills@google.com>
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Dan Scales <danscales@google.com>
2019-10-16 20:59:53 +00:00
Dan Scales
dad616375f cmd/compile, cmd/link, runtime: make defers low-cost through inline code and extra funcdata
Generate inline code at defer time to save the args of defer calls to unique
(autotmp) stack slots, and generate inline code at exit time to check which defer
calls were made and make the associated function/method/interface calls. We
remember that a particular defer statement was reached by storing in the deferBits
variable (always stored on the stack). At exit time, we check the bits of the
deferBits variable to determine which defer function calls to make (in reverse
order). These low-cost defers are only used for functions where no defers
appear in loops. In addition, we don't do these low-cost defers if there are too
many defer statements or too many exits in a function (to limit code increase).

When a function uses open-coded defers, we produce extra
FUNCDATA_OpenCodedDeferInfo information that specifies the number of defers, and
for each defer, the stack slots where the closure and associated args have been
stored. The funcdata also includes the location of the deferBits variable.
Therefore, for panics, we can use this funcdata to determine exactly which defers
are active, and call the appropriate functions/methods/closures with the correct
arguments for each active defer.

In order to unwind the stack correctly after a recover(), we need to add an extra
code segment to functions with open-coded defers that simply calls deferreturn()
and returns. This segment is not reachable by the normal function, but is returned
to by the runtime during recovery. We set the liveness information of this
deferreturn() to be the same as the liveness at the first function call during the
last defer exit code (so all return values and all stack slots needed by the defer
calls will be live).

I needed to increase the stackguard constant from 880 to 896, because of a small
amount of new code in deferreturn().

The -N flag disables open-coded defers. '-d defer' prints out the kind of defer
being used at each defer statement (heap-allocated, stack-allocated, or
open-coded).

Cost of defer statement  [ go test -run NONE -bench BenchmarkDefer$ runtime ]
  With normal (stack-allocated) defers only:         35.4  ns/op
  With open-coded defers:                             5.6  ns/op
  Cost of function call alone (remove defer keyword): 4.4  ns/op

Text size increase (including funcdata) for go cmd without/with open-coded defers:  0.09%

The average size increase (including funcdata) for only the functions that use
open-coded defers is 1.1%.

The cost of a panic followed by a recover got noticeably slower, since panic
processing now requires a scan of the stack for open-coded defer frames. This scan
is required, even if no frames are using open-coded defers:

Cost of panic and recover [ go test -run NONE -bench BenchmarkPanicRecover runtime ]
  Without open-coded defers:        62.0 ns/op
  With open-coded defers:           255  ns/op

A CGO Go-to-C-to-Go benchmark got noticeably faster because of open-coded defers:

CGO Go-to-C-to-Go benchmark [cd misc/cgo/test; go test -run NONE -bench BenchmarkCGoCallback ]
  Without open-coded defers:        443 ns/op
  With open-coded defers:           347 ns/op

Updates #14939 (defer performance)
Updates #34481 (design doc)

Change-Id: I51a389860b9676cfa1b84722f5fb84d3c4ee9e28
Reviewed-on: https://go-review.googlesource.com/c/go/+/190098
Reviewed-by: Austin Clements <austin@google.com>
2019-10-16 18:27:16 +00:00
David Chase
6081a9f7e6 cmd/compile: index line number tables by source file to improve sparsity
This reduces allocations and also resolves some
lurking inliner/inlinee line-number match problems.
However, it does add about 1.5% to compile time.

This fixes compiler OOMs seen compiling some large protobuf-
derived inputs.  For compiling the compiler itself,

compilebench -pkg cmd/compile/internal/ssa -memprofile withcl.prof

the numberlines-related memory consumption is reduced from 129MB
to 29MB (about a 5% overall reduction in allocation).

Additionally modified after going over changes with Austin
to remove unused code (nobody called size()) and correct
the cache-clearing code.

I've attempted to speed this up by not using maps, and have
not succeeded.  I'd rather get correct code in now, speed it
up later if I can.

Updates #27739.
Fixes #29279.

Change-Id: I098005de4e45196a5f5b10c0886a49f88e9f8fd5
Reviewed-on: https://go-review.googlesource.com/c/go/+/154617
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2019-05-14 18:48:16 +00:00
Josh Bleecher Snyder
0047353c53 cmd/compile: add countRule rewrite rule helper
noteRule is useful when you're trying to debug
a particular rule, or get a general sense for
how often a rule fires overall.

It is less useful if you're trying to figure
out which functions might be useful to benchmark
to ascertain the impact of a newly added rule.

Enter countRule. You use it like noteRule,
except that you get per-function summaries.

Sample output:

 # runtime
(*mspan).sweep: idx1=1
evacuate_faststr: idx1=1
evacuate_fast32: idx1=1
evacuate: idx1=2
evacuate_fast64: idx1=1
sweepone: idx1=1
purgecachedstats: idx1=1
mProf_Free: idx1=1

This suggests that the map benchmarks
might be good to run for this added rule.

Change-Id: Id471c3231f1736165f2020f6979ff01c29677808
Reviewed-on: https://go-review.googlesource.com/c/go/+/167088
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2019-05-08 22:58:10 +00:00
Keith Randall
2c423f063b cmd/compile,runtime: provide index information on bounds check failure
A few examples (for accessing a slice of length 3):

   s[-1]    runtime error: index out of range [-1]
   s[3]     runtime error: index out of range [3] with length 3
   s[-1:0]  runtime error: slice bounds out of range [-1:]
   s[3:0]   runtime error: slice bounds out of range [3:0]
   s[3:-1]  runtime error: slice bounds out of range [:-1]
   s[3:4]   runtime error: slice bounds out of range [:4] with capacity 3
   s[0:3:4] runtime error: slice bounds out of range [::4] with capacity 3

Note that in cases where there are multiple things wrong with the
indexes (e.g. s[3:-1]), we report one of those errors kind of
arbitrarily, currently the rightmost one.

An exhaustive set of examples is in issue30116[u].out in the CL.

The message text has the same prefix as the old message text. That
leads to slightly awkward phrasing but hopefully minimizes the chance
that code depending on the error text will break.

Increases the size of the go binary by 0.5% (amd64). The panic functions
take arguments in registers in order to keep the size of the compiled code
as small as possible.

Fixes #30116

Change-Id: Idb99a827b7888822ca34c240eca87b7e44a04fdd
Reviewed-on: https://go-review.googlesource.com/c/go/+/161477
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2019-03-18 17:33:38 +00:00
Josh Bleecher Snyder
c9ccdf1f8c cmd/compile: make deadcode pass cheaper
The deadcode pass runs a lot.
I'd like it to run even more.

This change adds dedicated storage for deadcode to ssa.Cache.
In addition to being a nice win now, it makes
deadcode easier to add other places in the future.

name        old time/op       new time/op       delta
Template          210ms ± 3%        209ms ± 2%    ~     (p=0.951 n=93+95)
Unicode          92.2ms ± 3%       93.0ms ± 3%  +0.87%  (p=0.000 n=94+94)
GoTypes           739ms ± 2%        733ms ± 2%  -0.84%  (p=0.000 n=92+94)
Compiler          3.51s ± 2%        3.49s ± 2%  -0.57%  (p=0.000 n=94+91)
SSA               9.80s ± 2%        9.75s ± 2%  -0.57%  (p=0.000 n=95+92)
Flate             132ms ± 2%        132ms ± 3%    ~     (p=0.165 n=94+98)
GoParser          160ms ± 3%        159ms ± 3%  -0.42%  (p=0.005 n=96+94)
Reflect           446ms ± 4%        442ms ± 4%  -0.91%  (p=0.000 n=95+98)
Tar               186ms ± 3%        186ms ± 2%    ~     (p=0.221 n=94+97)
XML               252ms ± 2%        250ms ± 2%  -0.55%  (p=0.000 n=95+94)
[Geo mean]        430ms             429ms       -0.34%

name        old user-time/op  new user-time/op  delta
Template          256ms ± 3%        257ms ± 3%    ~     (p=0.521 n=94+98)
Unicode           120ms ± 9%        121ms ± 9%    ~     (p=0.074 n=99+100)
GoTypes           935ms ± 3%        935ms ± 2%    ~     (p=0.574 n=82+96)
Compiler          4.56s ± 1%        4.55s ± 2%    ~     (p=0.247 n=88+90)
SSA               13.6s ± 2%        13.6s ± 1%    ~     (p=0.277 n=94+95)
Flate             155ms ± 3%        156ms ± 3%    ~     (p=0.181 n=95+100)
GoParser          193ms ± 8%        184ms ± 6%  -4.39%  (p=0.000 n=100+89)
Reflect           549ms ± 3%        552ms ± 3%  +0.45%  (p=0.036 n=94+96)
Tar               230ms ± 4%        230ms ± 4%    ~     (p=0.670 n=97+99)
XML               315ms ± 5%        309ms ±12%  -2.05%  (p=0.000 n=99+99)
[Geo mean]        540ms             538ms       -0.47%

name        old alloc/op      new alloc/op      delta
Template         40.3MB ± 0%       38.9MB ± 0%  -3.36%  (p=0.008 n=5+5)
Unicode          28.6MB ± 0%       28.4MB ± 0%  -0.90%  (p=0.008 n=5+5)
GoTypes           137MB ± 0%        132MB ± 0%  -3.65%  (p=0.008 n=5+5)
Compiler          637MB ± 0%        609MB ± 0%  -4.40%  (p=0.008 n=5+5)
SSA              2.19GB ± 0%       2.07GB ± 0%  -5.63%  (p=0.008 n=5+5)
Flate            25.0MB ± 0%       24.1MB ± 0%  -3.80%  (p=0.008 n=5+5)
GoParser         30.0MB ± 0%       29.1MB ± 0%  -3.17%  (p=0.008 n=5+5)
Reflect          87.1MB ± 0%       84.4MB ± 0%  -3.05%  (p=0.008 n=5+5)
Tar              37.3MB ± 0%       36.0MB ± 0%  -3.31%  (p=0.008 n=5+5)
XML              49.8MB ± 0%       48.0MB ± 0%  -3.69%  (p=0.008 n=5+5)
[Geo mean]       87.6MB            84.6MB       -3.50%

name        old allocs/op     new allocs/op     delta
Template           387k ± 0%         380k ± 0%  -1.76%  (p=0.008 n=5+5)
Unicode            342k ± 0%         341k ± 0%  -0.31%  (p=0.008 n=5+5)
GoTypes           1.39M ± 0%        1.37M ± 0%  -1.64%  (p=0.008 n=5+5)
Compiler          5.68M ± 0%        5.60M ± 0%  -1.41%  (p=0.008 n=5+5)
SSA               17.1M ± 0%        16.8M ± 0%  -1.49%  (p=0.008 n=5+5)
Flate              240k ± 0%         236k ± 0%  -1.99%  (p=0.008 n=5+5)
GoParser           309k ± 0%         304k ± 0%  -1.57%  (p=0.008 n=5+5)
Reflect           1.01M ± 0%        0.99M ± 0%  -2.69%  (p=0.008 n=5+5)
Tar                360k ± 0%         353k ± 0%  -1.91%  (p=0.008 n=5+5)
XML                447k ± 0%         441k ± 0%  -1.26%  (p=0.008 n=5+5)
[Geo mean]         858k              844k       -1.60%

Fixes #15306

Change-Id: I9f558adb911efddead3865542fe2ca71f66fe1da
Reviewed-on: https://go-review.googlesource.com/c/go/+/166718
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2019-03-11 21:20:01 +00:00
Yury Smolsky
3068fcfa0d cmd/compile: add control flow graphs to ssa.html
This CL adds CFGs to ssa.html.
It execs dot to generate SVG,
which then gets inlined into the html.
Some standard naming and javascript hacks
enable integration with the rest of ssa.html.
Clicking on blocks highlights the relevant
part of the CFG, and vice versa.

Sample output and screenshots can be seen in #28177.

CFGs can be turned on with the suffix mask:
:*            - dump CFG for every phase
:lower        - just the lower phase
:lower-layout - lower through layout
:w,x-y        - phases w and x through y

Calling dot after every pass is noticeably slow,
instead use the range of phases.

Dead blocks are not displayed on CFG.

User can zoom and pan individual CFG
when the automatic adjustment has failed.

Dot-related errors are reported
without bringing down the process.

Fixes #28177

Change-Id: Id52c42d86c4559ca737288aa10561b67a119c63d
Reviewed-on: https://go-review.googlesource.com/c/142517
Run-TryBot: Yury Smolsky <yury@smolsky.by>
Reviewed-by: David Chase <drchase@google.com>
2018-11-21 10:22:43 +00:00
Brad Fitzpatrick
3813edf26e all: use "reports whether" consistently in the few places that didn't
Go documentation style for boolean funcs is to say:

    // Foo reports whether ...
    func Foo() bool

(rather than "returns true if")

This CL also replaces 4 uses of "iff" with the same "reports whether"
wording, which doesn't lose any meaning, and will prevent people from
sending typo fixes when they don't realize it's "if and only if". In
the past I think we've had the typo CLs updated to just say "reports
whether". So do them all at once.

(Inspired by the addition of another "returns true if" in CL 146938
in fd_plan9.go)

Created with:

$ perl -i -npe 's/returns true if/reports whether/' $(git grep -l "returns true iff" | grep -v vendor)
$ perl -i -npe 's/returns true if/reports whether/' $(git grep -l "returns true if" | grep -v vendor)

Change-Id: Ided502237f5ab0d25cb625dbab12529c361a8b9f
Reviewed-on: https://go-review.googlesource.com/c/147037
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-11-02 22:47:58 +00:00
David Chase
69c5830c2b cmd/compile: repair display of values & blocks in prog column
This restores the printing of vXX and bYY in the left-hand
edge of the last column of ssa.html, where the generated
progs appear.

Change-Id: I81ab9b2fa5ae28e6e5de1b77665cfbed8d14e000
Reviewed-on: https://go-review.googlesource.com/c/141277
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Yury Smolsky <yury@smolsky.by>
2018-10-11 15:29:00 +00:00
David Chase
0029cd479e cmd/compile: add LocalAddr that takes SP,mem operands
Lack of a well-defined order between VarDef and related
address operations sometimes causes problems with store order
and write barrier transformations; glitches in the order are
made irreparable (by later optimizations) if the two parts of
the glitch straddle a split in the original block caused by
insertion of a write barrier diamond.

Fix this by creating a LocalAddr for addresses of locals
(what VarDef matters for) that takes a memory input to
help make the order explicit.  Addr is modified to only
be legal for SB operand, so there is no overlap between
Addr and LocalAddr uses (there may be some downstream
cleanup from this).

Changes to generic.rules and rewrite.go ensure that codegen
tests continue to pass; CSE of LocalAddr is impaired, not
quite sure of the cost.

Fixes #26105.

Change-Id: Id4192b4440aa4e9d7ba54a465c456df9b530b515
Reviewed-on: https://go-review.googlesource.com/122483
Run-TryBot: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
2018-07-12 18:45:31 +00:00
Austin Clements
a367f44c18 cmd/compile: enable stack maps everywhere except unsafe points
This modifies issafepoint in liveness analysis to report almost every
operation as a safe point. There are four things we don't mark as
safe-points:

1. Runtime code (other than at calls).

2. go:nosplit functions (other than at calls).

3. Instructions between the load of the write barrier-enabled flag and
   the write.

4. Instructions leading up to a uintptr -> unsafe.Pointer conversion.

We'll optimize this in later CLs:

name        old time/op       new time/op       delta
Template          185ms ± 2%        190ms ± 2%   +2.95%  (p=0.000 n=10+10)
Unicode          96.3ms ± 3%       96.4ms ± 1%     ~     (p=0.905 n=10+9)
GoTypes           658ms ± 0%        669ms ± 1%   +1.72%  (p=0.000 n=10+9)
Compiler          3.14s ± 1%        3.18s ± 1%   +1.56%  (p=0.000 n=9+10)
SSA               7.41s ± 2%        7.59s ± 1%   +2.48%  (p=0.000 n=9+10)
Flate             126ms ± 1%        128ms ± 1%   +2.08%  (p=0.000 n=10+10)
GoParser          153ms ± 1%        157ms ± 2%   +2.38%  (p=0.000 n=10+10)
Reflect           437ms ± 1%        442ms ± 1%   +0.98%  (p=0.001 n=10+10)
Tar               178ms ± 1%        179ms ± 1%   +0.67%  (p=0.035 n=10+9)
XML               223ms ± 1%        229ms ± 1%   +2.58%  (p=0.000 n=10+10)
[Geo mean]        394ms             401ms        +1.75%

No effect on binary size because we're not yet emitting these extra
safe points.

For #24543.

Change-Id: I16a1eebb9183cad7cef9d53c0fd21a973cad6859
Reviewed-on: https://go-review.googlesource.com/109348
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2018-05-22 14:43:37 +00:00
Giovanni Bajo
3c8545c5f6 cmd/compile: reduce allocations in prove by reusing posets
In prove, reuse posets between different functions by storing them
in the per-worker cache.

Allocation count regression caused by prove improvements is down
from 5% to 3% after this CL.

Updates #25179

Change-Id: I6d14003109833d9b3ef5165fdea00aa9c9e952e8
Reviewed-on: https://go-review.googlesource.com/110455
Run-TryBot: Giovanni Bajo <rasky@develer.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2018-05-14 14:44:55 +00:00
David Chase
c2c1822b12 cmd/compile: assign and preserve statement boundaries.
A new pass run after ssa building (before any other
optimization) identifies the "first" ssa node for each
statement. Other "noise" nodes are tagged as being never
appropriate for a statement boundary (e.g., VarKill, VarDef,
Phi).

Rewrite, deadcode, cse, and nilcheck are modified to move
the statement boundaries forward whenever possible if a
boundary-tagged ssa value is removed; never-boundary nodes
are ignored in this search (some operations involving
constants are also tagged as never-boundary and also ignored
because they are likely to be moved or removed during
optimization).

Code generation treats all nodes except those explicitly
marked as statement boundaries as "not statement" nodes,
and floats statement boundaries to the beginning of each
same-line run of instructions found within a basic block.

Line number html conversion was modified to make statement
boundary nodes a bit more obvious by prepending a "+".

The code in fuse.go that glued together the value slices
of two blocks produced a result that depended on the
former capacities (not lengths) of the two slices.  This
causes differences in the 386 bootstrap, and also can
sometimes put values into an order that does a worse job
of preserving statement boundaries when values are removed.

Portions of two delve tests that had caught problems were
incorporated into ssa/debug_test.go.  There are some
opportunities to do better with optimized code, but the
next-ing is not lying or overly jumpy.

Over 4 CLs, compilebench geomean measured binary size
increase of 3.5% and compile user time increase of 3.8%
(this is after optimization to reuse a sparse map instead
of creating multiple maps.)

This CL worsens the optimized-debugging experience with
Delve; we need to work with the delve team so that
they can use the is_stmt marks that we're emitting now.

The reference output changes from time to time depending
on other changes in the compiler, sometimes better,
sometimes worse.

This CL now includes a test ensuring that 99+% of the lines
in the Go command itself (a handy optimized binary) include
is_stmt markers.

Change-Id: I359c94e06843f1eb41f9da437bd614885aa9644a
Reviewed-on: https://go-review.googlesource.com/102435
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2018-05-14 14:09:49 +00:00
Daniel Martí
14393c5cd4 cmd: remove a few more unused parameters
ssa's pos parameter on the Const* funcs is unused, so remove it.

ld's alloc parameter on elfnote is always true, so remove the arguments
and simplify the code.

Finally, arm's addpltreloc never has its return parameter used, so
remove it.

Change-Id: I63387ecf6ab7b5f7c20df36be823322bb98427b8
Reviewed-on: https://go-review.googlesource.com/104456
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-04-09 17:10:25 +00:00
Daniel Martí
cd2cb6e3f5 cmd/compile: cache sparse maps across ssa passes
This is done for sparse sets already, but it was missing for sparse
maps. Only affects deadstore and regalloc, as they're the only ones that
use sparse maps.

name                 old time/op    new time/op    delta
DSEPass-4               247µs ± 0%     216µs ± 0%  -12.75%  (p=0.008 n=5+5)
DSEPassBlock-4         3.05ms ± 1%    2.87ms ± 1%   -6.02%  (p=0.002 n=6+6)
CSEPass-4              2.30ms ± 0%    2.32ms ± 0%   +0.53%  (p=0.026 n=6+6)
CSEPassBlock-4         23.8ms ± 0%    23.8ms ± 0%     ~     (p=0.931 n=6+5)
DeadcodePass-4         51.7µs ± 1%    51.5µs ± 2%     ~     (p=0.429 n=5+6)
DeadcodePassBlock-4     734µs ± 1%     742µs ± 3%     ~     (p=0.394 n=6+6)
MultiPass-4             152µs ± 0%     149µs ± 2%     ~     (p=0.082 n=5+6)
MultiPassBlock-4       2.67ms ± 1%    2.41ms ± 2%   -9.77%  (p=0.008 n=5+5)

name                 old alloc/op   new alloc/op   delta
DSEPass-4              41.2kB ± 0%     0.1kB ± 0%  -99.68%  (p=0.002 n=6+6)
DSEPassBlock-4          560kB ± 0%       4kB ± 0%  -99.34%  (p=0.026 n=5+6)
CSEPass-4               189kB ± 0%     189kB ± 0%     ~     (all equal)
CSEPassBlock-4         3.10MB ± 0%    3.10MB ± 0%     ~     (p=0.444 n=5+5)
DeadcodePass-4         10.5kB ± 0%    10.5kB ± 0%     ~     (all equal)
DeadcodePassBlock-4     164kB ± 0%     164kB ± 0%     ~     (all equal)
MultiPass-4             240kB ± 0%     199kB ± 0%  -17.06%  (p=0.002 n=6+6)
MultiPassBlock-4       3.60MB ± 0%    2.99MB ± 0%  -17.06%  (p=0.002 n=6+6)

name                 old allocs/op  new allocs/op  delta
DSEPass-4                8.00 ± 0%      4.00 ± 0%  -50.00%  (p=0.002 n=6+6)
DSEPassBlock-4            240 ± 0%       120 ± 0%  -50.00%  (p=0.002 n=6+6)
CSEPass-4                9.00 ± 0%      9.00 ± 0%     ~     (all equal)
CSEPassBlock-4          1.35k ± 0%     1.35k ± 0%     ~     (all equal)
DeadcodePass-4           3.00 ± 0%      3.00 ± 0%     ~     (all equal)
DeadcodePassBlock-4      9.00 ± 0%      9.00 ± 0%     ~     (all equal)
MultiPass-4              11.0 ± 0%      10.0 ± 0%   -9.09%  (p=0.002 n=6+6)
MultiPassBlock-4          165 ± 0%       150 ± 0%   -9.09%  (p=0.002 n=6+6)

Change-Id: I43860687c88f33605eb1415f36473c5cfe8fde4a
Reviewed-on: https://go-review.googlesource.com/98449
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2018-03-15 17:24:39 +00:00
Austin Clements
491f409a32 cmd/compile: minor comment improvements/corrections
Change-Id: Ie0934f1528d58d4971cdef726d3e2d23cf3935d3
Reviewed-on: https://go-review.googlesource.com/87475
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Alexandru Moșoi <alexandru@mosoi.ro>
2018-03-08 22:25:21 +00:00
Keith Randall
4b00d3f4a2 cmd/compile: implement comparisons directly with memory
Allow the compiler to generate code like CMPQ 16(AX), $7

It's tricky because it's difficult to spill such a comparison during
flagalloc, because the same memory state might not be available at
the restore locations.

Solve this problem by decomposing the compare+load back into its parts
if it needs to be spilled.

The big win is that the write barrier test goes from:

MOVL	runtime.writeBarrier(SB), CX
TESTL	CX, CX
JNE	60

to

CMPL	runtime.writeBarrier(SB), $0
JNE	59

It's one instruction and one byte smaller.

Fixes #19485
Fixes #15245
Update #22460

Binaries are about 0.15% smaller.

Change-Id: I4fd8d1111b6b9924d52f9a0901ca1b2e5cce0836
Reviewed-on: https://go-review.googlesource.com/86035
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>
2018-02-26 23:49:44 +00:00
David Chase
f22cf7131a cmd/compile: use src.NoXPos for entry-block constants
The ssa backend is aggressive about placing constants and
certain other values in the Entry block.  It's implausible
that the original line numbers for these constants makes
any sort of sense when it appears to a user stepping in a
debugger, and they're also not that useful in dumps since
entry-block instructions tend to be constants (i.e.,
unlikely to be the cause of a crash).

Therefore, use src.NoXPos for any values that are explicitly
inserted into a function's entry block.

Passes all tests, including ssa/debug_test.go with both
gdb and a fairly recent dlv.  Hand-verified that it solves
the reported problem; constructed a test that reproduced
a problem, and fixed it.

Modified test harness to allow injection of slightly more
interesting inputs.

Fixes #22558.

Change-Id: I4476927067846bc4366da7793d2375c111694c55
Reviewed-on: https://go-review.googlesource.com/81215
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2017-12-01 07:09:54 +00:00
Austin Clements
afbe646ab4 cmd/compile: report typedslicecopy write barriers
Most write barrier calls are inserted by SSA, but copy and append are
lowered to runtime.typedslicecopy during walk. Fix these to set
Func.WBPos and emit the "write barrier" warning, as done for the write
barriers inserted by SSA. As part of this, we refactor setting WBPos
and emitting this warning into the frontend so it can be shared by
both walk and SSA.

Change-Id: I5fe9997d9bdb55e03e01dd58aee28908c35f606b
Reviewed-on: https://go-review.googlesource.com/73411
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2017-10-29 20:21:43 +00:00
Keith Randall
770d8d8207 cmd/compile: free value earlier in nilcheck
When we remove a nil check, add it back to the free Value pool immediately.

Fixes #18732

Change-Id: I8d644faabbfb52157d3f2d071150ff0342ac28dc
Reviewed-on: https://go-review.googlesource.com/58810
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2017-08-25 06:01:26 +00:00
Josh Bleecher Snyder
46b88c9fbc cmd/compile: change ssa.Type into *types.Type
When package ssa was created, Type was in package gc.
To avoid circular dependencies, we used an interface (ssa.Type)
to represent type information in SSA.

In the Go 1.9 cycle, gri extricated the Type type from package gc.
As a result, we can now use it in package ssa.
Now, instead of package types depending on package ssa,
it is the other way.
This is a more sensible dependency tree,
and helps compiler performance a bit.

Though this is a big CL, most of the changes are
mechanical and uninteresting.

Interesting bits:

* Add new singleton globals to package types for the special
  SSA types Memory, Void, Invalid, Flags, and Int128.
* Add two new Types, TSSA for the special types,
  and TTUPLE, for SSA tuple types.
  ssa.MakeTuple is now types.NewTuple.
* Move type comparison result constants CMPlt, CMPeq, and CMPgt
  to package types.
* We had picked the name "types" in our rules for the handy
  list of types provided by ssa.Config. That conflicted with
  the types package name, so change it to "typ".
* Update the type comparison routine to handle tuples and special
  types inline.
* Teach gc/fmt.go how to print special types.
* We can now eliminate ElemTypes in favor of just Elem,
  and probably also some other duplicated Type methods
  designed to return ssa.Type instead of *types.Type.
* The ssa tests were using their own dummy types,
  and they were not particularly careful about types in general.
  Of necessity, this CL switches them to use *types.Type;
  it does not make them more type-accurate.
  Unfortunately, using types.Type means initializing a bit
  of the types universe.
  This is prime for refactoring and improvement.

This shrinks ssa.Value; it now fits in a smaller size class
on 64 bit systems. This doesn't have a giant impact,
though, since most Values are preallocated in a chunk.

name        old alloc/op      new alloc/op      delta
Template         37.9MB ± 0%       37.7MB ± 0%  -0.57%  (p=0.000 n=10+8)
Unicode          28.9MB ± 0%       28.7MB ± 0%  -0.52%  (p=0.000 n=10+10)
GoTypes           110MB ± 0%        109MB ± 0%  -0.88%  (p=0.000 n=10+10)
Flate            24.7MB ± 0%       24.6MB ± 0%  -0.66%  (p=0.000 n=10+10)
GoParser         31.1MB ± 0%       30.9MB ± 0%  -0.61%  (p=0.000 n=10+9)
Reflect          73.9MB ± 0%       73.4MB ± 0%  -0.62%  (p=0.000 n=10+8)
Tar              25.8MB ± 0%       25.6MB ± 0%  -0.77%  (p=0.000 n=9+10)
XML              41.2MB ± 0%       40.9MB ± 0%  -0.80%  (p=0.000 n=10+10)
[Geo mean]       40.5MB            40.3MB       -0.68%

name        old allocs/op     new allocs/op     delta
Template           385k ± 0%         386k ± 0%    ~     (p=0.356 n=10+9)
Unicode            343k ± 1%         344k ± 0%    ~     (p=0.481 n=10+10)
GoTypes           1.16M ± 0%        1.16M ± 0%  -0.16%  (p=0.004 n=10+10)
Flate              238k ± 1%         238k ± 1%    ~     (p=0.853 n=10+10)
GoParser           320k ± 0%         320k ± 0%    ~     (p=0.720 n=10+9)
Reflect            957k ± 0%         957k ± 0%    ~     (p=0.460 n=10+8)
Tar                252k ± 0%         252k ± 0%    ~     (p=0.133 n=9+10)
XML                400k ± 0%         400k ± 0%    ~     (p=0.796 n=10+10)
[Geo mean]         428k              428k       -0.01%


Removing all the interface calls helps non-trivially with CPU, though.

name        old time/op       new time/op       delta
Template          178ms ± 4%        173ms ± 3%  -2.90%  (p=0.000 n=94+96)
Unicode          85.0ms ± 4%       83.9ms ± 4%  -1.23%  (p=0.000 n=96+96)
GoTypes           543ms ± 3%        528ms ± 3%  -2.73%  (p=0.000 n=98+96)
Flate             116ms ± 3%        113ms ± 4%  -2.34%  (p=0.000 n=96+99)
GoParser          144ms ± 3%        140ms ± 4%  -2.80%  (p=0.000 n=99+97)
Reflect           344ms ± 3%        334ms ± 4%  -3.02%  (p=0.000 n=100+99)
Tar               106ms ± 5%        103ms ± 4%  -3.30%  (p=0.000 n=98+94)
XML               198ms ± 5%        192ms ± 4%  -2.88%  (p=0.000 n=92+95)
[Geo mean]        178ms             173ms       -2.65%

name        old user-time/op  new user-time/op  delta
Template          229ms ± 5%        224ms ± 5%  -2.36%  (p=0.000 n=95+99)
Unicode           107ms ± 6%        106ms ± 5%  -1.13%  (p=0.001 n=93+95)
GoTypes           696ms ± 4%        679ms ± 4%  -2.45%  (p=0.000 n=97+99)
Flate             137ms ± 4%        134ms ± 5%  -2.66%  (p=0.000 n=99+96)
GoParser          176ms ± 5%        172ms ± 8%  -2.27%  (p=0.000 n=98+100)
Reflect           430ms ± 6%        411ms ± 5%  -4.46%  (p=0.000 n=100+92)
Tar               128ms ±13%        123ms ±13%  -4.21%  (p=0.000 n=100+100)
XML               239ms ± 6%        233ms ± 6%  -2.50%  (p=0.000 n=95+97)
[Geo mean]        220ms             213ms       -2.76%


Change-Id: I15c7d6268347f8358e75066dfdbd77db24e8d0c1
Reviewed-on: https://go-review.googlesource.com/42145
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2017-05-09 23:01:51 +00:00
Josh Bleecher Snyder
4ee934ad27 cmd/compile: remove references to *os.File from ssa package
This reduces the size of the ssa export data
by 10%, from 76154 to 67886.

It doesn't appear that #20084, which would do this automatically,
is going to be fixed soon. Do it manually for now.

This speeds up compiling cmd/compile/internal/amd64
and presumably its comrades as well:

name          old time/op       new time/op       delta
CompileAMD64       89.6ms ± 6%       86.7ms ± 5%  -3.29%  (p=0.000 n=49+47)

name          old user-time/op  new user-time/op  delta
CompileAMD64        116ms ± 5%        112ms ± 5%  -3.51%  (p=0.000 n=45+42)

name          old alloc/op      new alloc/op      delta
CompileAMD64       26.7MB ± 0%       25.8MB ± 0%  -3.26%  (p=0.008 n=5+5)

name          old allocs/op     new allocs/op     delta
CompileAMD64         223k ± 0%         213k ± 0%  -4.46%  (p=0.008 n=5+5)

Updates #20084

Change-Id: I49e8951c5bfce63ad2b7f4fc3bfa0868c53114f9
Reviewed-on: https://go-review.googlesource.com/41493
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-04-24 23:58:14 +00:00
Josh Bleecher Snyder
0323895cc0 cmd/compile: catch and report nowritebarrier violations later
Prior to this CL, the SSA backend reported violations
of the //go:nowritebarrier annotation immediately.
This necessitated emitting errors during SSA compilation,
which is not compatible with a concurrent backend.

Instead, check for such violations later.
We already save the data required to do a late check
for violations of the //go:nowritebarrierrec annotation.
Use the same data, and check //go:nowritebarrier at the same time.

One downside to doing this is that now only a single
violation will be reported per function.
Given that this is for the runtime only,
and violations are rare, this seems an acceptable cost.

While we are here, remove several 'nerrors != 0' checks
that are rendered pointless.

Updates #15756
Fixes #19250 (as much as it ever can be)

Change-Id: Ia44c4ad5b6fd6f804d9f88d9571cec8d23665cb3
Reviewed-on: https://go-review.googlesource.com/38973
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-03-31 16:31:20 +00:00
Josh Bleecher Snyder
b3a8beb9d1 cmd/compile: minor cleanup in debug code
Change-Id: I9885606801b9c8fcb62c16d0856025c4e83e658b
Reviewed-on: https://go-review.googlesource.com/38650
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-03-24 22:21:55 +00:00
Matthew Dempsky
325904fe6a cmd/compile: port liveness analysis to SSA
Passes toolstash-check -all.

Change-Id: I92c3c25d6c053f971f346f4fa3bbc76419b58183
Reviewed-on: https://go-review.googlesource.com/38087
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2017-03-20 22:58:50 +00:00
Josh Bleecher Snyder
2cdb7f118a cmd/compile: move Frontend field from ssa.Config to ssa.Func
Suggested by mdempsky in CL 38232.
This allows us to use the Frontend field
to associate frontend state and information
with a function.
See the following CL in the series for examples.

This is a giant CL, but it is almost entirely routine refactoring.

The ssa test API is starting to feel a bit unwieldy.
I will clean it up separately, once the dust has settled.

Passes toolstash -cmp.

Updates #15756

Change-Id: I71c573bd96ff7251935fce1391b06b1f133c3caf
Reviewed-on: https://go-review.googlesource.com/38327
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2017-03-17 23:18:57 +00:00
Josh Bleecher Snyder
88e47187c1 cmd/compile: relocate code from config.go to func.go
This is a follow-up to CL 38167.
Pure code movement.

Change-Id: I13e58f7eac6718c77076d89e13fc721a5205ec57
Reviewed-on: https://go-review.googlesource.com/38322
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-03-17 05:21:53 +00:00
Josh Bleecher Snyder
a5e3cac895 cmd/compile: rearrange fields between ssa.Func, ssa.Cache, and ssa.Config
This makes ssa.Func, ssa.Cache, and ssa.Config fulfill
the roles laid out for them in CL 38160.

The only non-trivial change in this CL is how cached
values and blocks get IDs. Prior to this CL, their IDs were
assigned as part of resetting the cache, and only modified
IDs were reset. This required knowing how many values and
blocks were modified, which required a tight coupling between
ssa.Func and ssa.Config. To eliminate that coupling,
we now zero values and blocks during reset,
and assign their IDs when they are used.
Since unused values and blocks have ID == 0,
we can efficiently find the last used value/block,
to avoid zeroing everything.
Bulk zeroing is efficient, but not efficient enough
to obviate the need to avoid zeroing everything every time.
As a happy side-effect, ssa.Func.Free is no longer necessary.

DebugHashMatch and friends now belong in func.go.
They have been left in place for clarity and review.
I will move them in a subsequent CL.

Passes toolstash -cmp. No compiler performance impact.
No change in 'go test cmd/compile/internal/ssa' execution time.

Change-Id: I2eb7af58da067ef6a36e815a6f386cfe8634d098
Reviewed-on: https://go-review.googlesource.com/38167
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2017-03-17 05:21:42 +00:00
Cherry Zhang
c8f38b3398 cmd/compile: use type information in Aux for Store size
Remove size AuxInt in Store, and alignment in Move/Zero. We still
pass size AuxInt to Move/Zero, as it is used for partial Move/Zero
lowering (e.g. cmd/compile/internal/ssa/gen/386.rules:288).
SizeAndAlign is gone.

Passes "toolstash -cmp" on std.

Change-Id: I1ca34652b65dd30de886940e789fcf41d521475d
Reviewed-on: https://go-review.googlesource.com/38150
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2017-03-16 14:25:04 +00:00
Cherry Zhang
1b85300602 cmd/compile: clean up SSA-building code
Now that the write barrier insertion is moved to SSA, the SSA
building code can be simplified.

Updates #17583.

Change-Id: I5cacc034b11aa90b0abe6f8dd97e4e3994e2bc25
Reviewed-on: https://go-review.googlesource.com/36840
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2017-03-16 14:24:40 +00:00
Josh Bleecher Snyder
43afcb5c96 cmd/compile: define roles for ssa.Func, ssa.Config, and ssa.Cache
The line between ssa.Func and ssa.Config has blurred.
Concurrent compilation in the backend will require more precision.
This CL lays out an (aspirational) organization.
The implementation will come in follow-up CLs,
once the organization is settled.

ssa.Config holds basic compiler configuration,
mostly arch-specific information.
It is configured once, early on, and is readonly,
so it is safe for concurrent use.

ssa.Func is a single-shot object used for
compiling a single Func. It is not concurrency-safe
and not re-usable.

ssa.Cache is a multi-use object used to avoid
expensive allocations during compilation.
Each ssa.Func is given an ssa.Cache to use.
ssa.Cache is not concurrency-safe.

Change-Id: Id02809b6f3541541cac6c27bbb598834888ce1cc
Reviewed-on: https://go-review.googlesource.com/38160
Reviewed-by: Keith Randall <khr@golang.org>
2017-03-15 04:27:49 +00:00
David Chase
886e9e6065 cmd/compile: put spills in better places
Previously we always issued a spill right after the op
that was being spilled.  This CL pushes spills father away
from the generator, hopefully pushing them into unlikely branches.
For example:

  x = ...
  if unlikely {
    call ...
  }
  ... use x ...

Used to compile to

  x = ...
  spill x
  if unlikely {
    call ...
    restore x
  }

It now compiles to

  x = ...
  if unlikely {
    spill x
    call ...
    restore x
  }

This is particularly useful for code which appends, as the only
call is an unlikely call to growslice.  It also helps for the
spills needed around write barrier calls.

The basic algorithm is walk down the dominator tree following a
path where the block still dominates all of the restores.  We're
looking for a block that:
 1) dominates all restores
 2) has the value being spilled in a register
 3) has a loop depth no deeper than the value being spilled

The walking-down code is iterative.  I was forced to limit it to
searching 100 blocks so it doesn't become O(n^2).  Maybe one day
we'll find a better way.

I had to delete most of David's code which pushed spills out of loops.
I suspect this CL subsumes most of the cases that his code handled.

Generally positive performance improvements, but hard to tell for sure
with all the noise.  (compilebench times are unchanged.)

name                      old time/op    new time/op    delta
BinaryTree17-12              2.91s ±15%     2.80s ±12%    ~     (p=0.063 n=10+10)
Fannkuch11-12                3.47s ± 0%     3.30s ± 4%  -4.91%   (p=0.000 n=9+10)
FmtFprintfEmpty-12          48.0ns ± 1%    47.4ns ± 1%  -1.32%    (p=0.002 n=9+9)
FmtFprintfString-12         85.6ns ±11%    79.4ns ± 3%  -7.27%  (p=0.005 n=10+10)
FmtFprintfInt-12            91.8ns ±10%    85.9ns ± 4%    ~      (p=0.203 n=10+9)
FmtFprintfIntInt-12          135ns ±13%     127ns ± 1%  -5.72%   (p=0.025 n=10+9)
FmtFprintfPrefixedInt-12     167ns ± 1%     168ns ± 2%    ~      (p=0.580 n=9+10)
FmtFprintfFloat-12           249ns ±11%     230ns ± 1%  -7.32%  (p=0.000 n=10+10)
FmtManyArgs-12               504ns ± 7%     506ns ± 1%    ~       (p=0.198 n=9+9)
GobDecode-12                6.95ms ± 1%    7.04ms ± 1%  +1.37%  (p=0.001 n=10+10)
GobEncode-12                6.32ms ±13%    6.04ms ± 1%    ~     (p=0.063 n=10+10)
Gzip-12                      233ms ± 1%     235ms ± 0%  +1.01%   (p=0.000 n=10+9)
Gunzip-12                   40.1ms ± 1%    39.6ms ± 0%  -1.12%   (p=0.000 n=10+8)
HTTPClientServer-12          227µs ± 9%     221µs ± 5%    ~       (p=0.114 n=9+8)
JSONEncode-12               16.1ms ± 2%    15.8ms ± 1%  -2.09%    (p=0.002 n=9+8)
JSONDecode-12               61.8ms ±11%    57.9ms ± 1%  -6.30%   (p=0.000 n=10+9)
Mandelbrot200-12            4.30ms ± 3%    4.28ms ± 1%    ~      (p=0.203 n=10+8)
GoParse-12                  3.18ms ± 2%    3.18ms ± 2%    ~     (p=0.579 n=10+10)
RegexpMatchEasy0_32-12      76.7ns ± 1%    77.5ns ± 1%  +0.92%    (p=0.002 n=9+8)
RegexpMatchEasy0_1K-12       239ns ± 3%     239ns ± 1%    ~     (p=0.204 n=10+10)
RegexpMatchEasy1_32-12      71.4ns ± 1%    70.6ns ± 0%  -1.15%   (p=0.000 n=10+9)
RegexpMatchEasy1_1K-12       383ns ± 2%     390ns ±10%    ~       (p=0.181 n=8+9)
RegexpMatchMedium_32-12      114ns ± 0%     113ns ± 1%  -0.88%    (p=0.000 n=9+8)
RegexpMatchMedium_1K-12     36.3µs ± 1%    36.8µs ± 1%  +1.59%   (p=0.000 n=10+8)
RegexpMatchHard_32-12       1.90µs ± 1%    1.90µs ± 1%    ~     (p=0.341 n=10+10)
RegexpMatchHard_1K-12       59.4µs ±11%    57.8µs ± 1%    ~      (p=0.968 n=10+9)
Revcomp-12                   461ms ± 1%     462ms ± 1%    ~       (p=1.000 n=9+9)
Template-12                 67.5ms ± 1%    66.3ms ± 1%  -1.77%   (p=0.000 n=10+8)
TimeParse-12                 314ns ± 3%     309ns ± 0%  -1.56%    (p=0.000 n=9+8)
TimeFormat-12                340ns ± 2%     331ns ± 1%  -2.79%  (p=0.000 n=10+10)

The go binary is 0.2% larger.  Not really sure why the size
would change.

Change-Id: Ia5116e53a3aeb025ef350ffc51c14ae5cc17871c
Reviewed-on: https://go-review.googlesource.com/34822
Reviewed-by: David Chase <drchase@google.com>
2017-03-15 02:09:25 +00:00
Josh Bleecher Snyder
3dcfce8d19 cmd/compile: add OpOffPtr [c] SP to constant cache
They accounted for almost 30% of all CSE'd values.

By never creating the duplicates in the first place,
we reduce the high water mark of Value IDs,
which in turn makes all SSA phases cheaper,
particularly regalloc.

name       old time/op     new time/op     delta
Template       200ms ± 3%      198ms ± 4%  -0.87%  (p=0.016 n=50+49)
Unicode       86.9ms ± 2%     85.5ms ± 3%  -1.56%  (p=0.000 n=49+50)
GoTypes        553ms ± 4%      551ms ± 4%    ~     (p=0.183 n=50+49)
SSA            3.97s ± 3%      3.93s ± 2%  -1.06%  (p=0.000 n=48+48)
Flate          124ms ± 4%      124ms ± 3%    ~     (p=0.545 n=48+50)
GoParser       146ms ± 4%      146ms ± 4%    ~     (p=0.810 n=49+49)
Reflect        357ms ± 3%      355ms ± 3%  -0.59%  (p=0.049 n=50+48)
Tar            106ms ± 4%      107ms ± 5%    ~     (p=0.454 n=49+50)
XML            203ms ± 4%      203ms ± 4%    ~     (p=0.726 n=48+50)

name       old user-ns/op  new user-ns/op  delta
Template        237M ± 3%       235M ± 4%    ~     (p=0.208 n=47+48)
Unicode         111M ± 4%       108M ± 9%  -2.50%  (p=0.000 n=47+50)
GoTypes         736M ± 5%       729M ± 4%  -0.95%  (p=0.017 n=50+46)
SSA            5.73G ± 4%      5.74G ± 4%    ~     (p=0.765 n=50+50)
Flate           150M ± 5%       148M ± 6%  -0.89%  (p=0.045 n=48+47)
GoParser        180M ± 5%       178M ± 7%  -1.34%  (p=0.012 n=50+50)
Reflect         450M ± 4%       444M ± 4%  -1.40%  (p=0.000 n=50+49)
Tar             124M ± 7%       123M ± 7%    ~     (p=0.092 n=50+50)
XML             248M ± 6%       245M ± 5%    ~     (p=0.057 n=50+50)

name       old alloc/op    new alloc/op    delta
Template      39.4MB ± 0%     39.3MB ± 0%  -0.37%  (p=0.000 n=50+50)
Unicode       30.9MB ± 0%     30.9MB ± 0%  -0.27%  (p=0.000 n=48+50)
GoTypes        114MB ± 0%      113MB ± 0%  -1.03%  (p=0.000 n=50+49)
SSA            882MB ± 0%      865MB ± 0%  -1.95%  (p=0.000 n=49+49)
Flate         25.8MB ± 0%     25.7MB ± 0%  -0.21%  (p=0.000 n=50+50)
GoParser      31.7MB ± 0%     31.6MB ± 0%  -0.33%  (p=0.000 n=50+50)
Reflect       79.7MB ± 0%     79.3MB ± 0%  -0.49%  (p=0.000 n=44+49)
Tar           27.2MB ± 0%     27.1MB ± 0%  -0.31%  (p=0.000 n=50+50)
XML           42.7MB ± 0%     42.3MB ± 0%  -1.05%  (p=0.000 n=48+49)

name       old allocs/op   new allocs/op   delta
Template        379k ± 1%       380k ± 1%  +0.26%  (p=0.000 n=50+50)
Unicode         324k ± 1%       324k ± 1%    ~     (p=0.964 n=49+50)
GoTypes        1.14M ± 0%      1.15M ± 0%  +0.14%  (p=0.000 n=50+49)
SSA            7.89M ± 0%      7.89M ± 0%  -0.05%  (p=0.000 n=49+49)
Flate           240k ± 1%       241k ± 1%  +0.27%  (p=0.001 n=50+50)
GoParser        310k ± 1%       311k ± 1%  +0.48%  (p=0.000 n=50+49)
Reflect        1.00M ± 0%      1.00M ± 0%  +0.17%  (p=0.000 n=48+50)
Tar             254k ± 1%       255k ± 1%  +0.23%  (p=0.005 n=50+50)
XML             395k ± 1%       395k ± 1%  +0.19%  (p=0.002 n=49+47)

Change-Id: Iaa8f5f37e23bd81983409f7359f9dcd4dfe2961f
Reviewed-on: https://go-review.googlesource.com/38003
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-03-10 16:50:58 +00:00
Josh Bleecher Snyder
d11a2184fb cmd/compile: allow earlier GC of freed constant value
Minor fix, because it's the right thing to do.
No significant impact.

Change-Id: I2138285d397494daa9a88c414149c2a7860edd7e
Reviewed-on: https://go-review.googlesource.com/38001
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-03-10 01:39:09 +00:00
Josh Bleecher Snyder
c63ad970f6 cmd/compile: rename Func.constVal arg for clarity
Values have an Aux and an AuxInt.
We're setting AuxInt, not Aux.
Say so.

Change-Id: I41aa783273bb7e1ba47c941aa4233f818e37dadd
Reviewed-on: https://go-review.googlesource.com/37997
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-03-09 23:39:01 +00:00
Josh Bleecher Snyder
f791b288d1 cmd/compile: remove some allocs from CSE
Pick up a few pennies:

* CSE gets run twice for each function,
but the set of Aux values doesn't change.
Avoid populating it twice.

* Don't bother populating auxmap for values
that can't be CSE'd anyway.

name       old alloc/op     new alloc/op     delta
Template       41.0MB ± 0%      40.7MB ± 0%  -0.61%  (p=0.008 n=5+5)
Unicode        32.3MB ± 0%      32.3MB ± 0%  -0.22%  (p=0.008 n=5+5)
GoTypes         122MB ± 0%       121MB ± 0%  -0.55%  (p=0.008 n=5+5)
Compiler        482MB ± 0%       479MB ± 0%  -0.58%  (p=0.008 n=5+5)
SSA             865MB ± 0%       862MB ± 0%  -0.35%  (p=0.008 n=5+5)
Flate          26.5MB ± 0%      26.5MB ± 0%    ~     (p=0.056 n=5+5)
GoParser       32.6MB ± 0%      32.4MB ± 0%  -0.58%  (p=0.008 n=5+5)
Reflect        84.2MB ± 0%      83.8MB ± 0%  -0.57%  (p=0.008 n=5+5)
Tar            27.7MB ± 0%      27.6MB ± 0%  -0.37%  (p=0.008 n=5+5)
XML            44.7MB ± 0%      44.5MB ± 0%  -0.53%  (p=0.008 n=5+5)

name       old allocs/op    new allocs/op    delta
Template         373k ± 0%        373k ± 1%    ~     (p=1.000 n=5+5)
Unicode          326k ± 0%        325k ± 0%    ~     (p=0.548 n=5+5)
GoTypes         1.16M ± 0%       1.16M ± 0%    ~     (p=0.841 n=5+5)
Compiler        4.16M ± 0%       4.15M ± 0%    ~     (p=0.222 n=5+5)
SSA             7.57M ± 0%       7.56M ± 0%  -0.22%  (p=0.008 n=5+5)
Flate            238k ± 1%        239k ± 1%    ~     (p=0.690 n=5+5)
GoParser         304k ± 0%        304k ± 0%    ~     (p=1.000 n=5+5)
Reflect         1.01M ± 0%       1.00M ± 0%  -0.31%  (p=0.016 n=4+5)
Tar              245k ± 0%        245k ± 1%    ~     (p=0.548 n=5+5)
XML              393k ± 0%        391k ± 1%    ~     (p=0.095 n=5+5)

Change-Id: I78f1ffe129bd8fd590b7511717dd2bf9f5ecbd6d
Reviewed-on: https://go-review.googlesource.com/36690
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-02-09 20:42:46 +00:00
Matthew Dempsky
5c90e1cf8a cmd/compile/internal/ssa: remove Func.StaticData field
Rather than collecting static data nodes to be written out later, just
write them out immediately.

Change-Id: I51708b690e94bc3e288b4d6ba3307bf738a80f64
Reviewed-on: https://go-review.googlesource.com/36352
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2017-02-04 01:09:26 +00:00
Josh Bleecher Snyder
57546d67ec cmd/compile: add reusable []Location to ssa.Config
name       old time/op      new time/op      delta
Template        218ms ± 3%       214ms ± 3%  -1.70%  (p=0.000 n=30+30)
Unicode         100ms ± 3%       100ms ± 4%    ~     (p=0.614 n=29+30)
GoTypes         657ms ± 1%       660ms ± 3%  +0.46%  (p=0.046 n=29+30)
Compiler        2.80s ± 2%       2.80s ± 1%    ~     (p=0.451 n=28+29)
Flate           131ms ± 2%       132ms ± 4%    ~     (p=1.000 n=29+29)
GoParser        159ms ± 3%       160ms ± 5%    ~     (p=0.341 n=28+30)
Reflect         406ms ± 3%       408ms ± 4%    ~     (p=0.511 n=28+30)
Tar             118ms ± 4%       118ms ± 4%    ~     (p=0.827 n=29+30)
XML             222ms ± 6%       222ms ± 3%    ~     (p=0.532 n=30+30)

name       old user-ns/op   new user-ns/op   delta
Template   274user-ms ± 3%  272user-ms ± 3%  -0.87%  (p=0.015 n=29+30)
Unicode    140user-ms ± 4%  140user-ms ± 3%    ~     (p=0.735 n=29+30)
GoTypes    890user-ms ± 1%  897user-ms ± 2%  +0.88%  (p=0.002 n=29+30)
Compiler   3.88user-s ± 2%  3.89user-s ± 1%    ~     (p=0.132 n=30+29)
Flate      168user-ms ± 2%  157user-ms ± 4%  -6.21%  (p=0.000 n=25+28)
GoParser   211user-ms ± 2%  213user-ms ± 5%    ~     (p=0.086 n=28+30)
Reflect    539user-ms ± 2%  541user-ms ± 3%    ~     (p=0.267 n=27+29)
Tar        156user-ms ± 7%  155user-ms ± 5%    ~     (p=0.708 n=30+30)
XML        291user-ms ± 5%  294user-ms ± 3%  +0.83%  (p=0.029 n=29+30)

name       old alloc/op     new alloc/op     delta
Template       40.7MB ± 0%      39.4MB ± 0%  -3.26%  (p=0.000 n=29+26)
Unicode        30.8MB ± 0%      30.7MB ± 0%  -0.40%  (p=0.000 n=28+30)
GoTypes         123MB ± 0%       119MB ± 0%  -3.47%  (p=0.000 n=30+29)
Compiler        472MB ± 0%       455MB ± 0%  -3.60%  (p=0.000 n=30+30)
Flate          26.5MB ± 0%      25.6MB ± 0%  -3.21%  (p=0.000 n=28+30)
GoParser       32.3MB ± 0%      31.4MB ± 0%  -2.98%  (p=0.000 n=29+30)
Reflect        84.4MB ± 0%      82.1MB ± 0%  -2.83%  (p=0.000 n=30+30)
Tar            27.3MB ± 0%      26.5MB ± 0%  -2.70%  (p=0.000 n=29+29)
XML            44.6MB ± 0%      43.1MB ± 0%  -3.49%  (p=0.000 n=30+30)

name       old allocs/op    new allocs/op    delta
Template         401k ± 1%        399k ± 0%  -0.35%  (p=0.000 n=30+28)
Unicode          331k ± 0%        331k ± 1%    ~     (p=0.907 n=28+30)
GoTypes         1.24M ± 0%       1.23M ± 0%  -0.43%  (p=0.000 n=30+30)
Compiler        4.26M ± 0%       4.25M ± 0%  -0.34%  (p=0.000 n=29+30)
Flate            252k ± 1%        251k ± 1%  -0.41%  (p=0.000 n=30+30)
GoParser         325k ± 1%        324k ± 1%  -0.31%  (p=0.000 n=27+30)
Reflect         1.06M ± 0%       1.05M ± 0%  -0.69%  (p=0.000 n=30+30)
Tar              266k ± 1%        265k ± 1%  -0.51%  (p=0.000 n=29+30)
XML              416k ± 1%        415k ± 1%  -0.36%  (p=0.002 n=30+30)

Change-Id: I8f784001324df83b2764c44f0e83a540e5beab34
Reviewed-on: https://go-review.googlesource.com/36212
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-02-02 22:39:32 +00:00