Commit graph

68 commits

Author SHA1 Message Date
Josh Bleecher Snyder
0323895cc0 cmd/compile: catch and report nowritebarrier violations later
Prior to this CL, the SSA backend reported violations
of the //go:nowritebarrier annotation immediately.
This necessitated emitting errors during SSA compilation,
which is not compatible with a concurrent backend.

Instead, check for such violations later.
We already save the data required to do a late check
for violations of the //go:nowritebarrierrec annotation.
Use the same data, and check //go:nowritebarrier at the same time.

One downside to doing this is that now only a single
violation will be reported per function.
Given that this is for the runtime only,
and violations are rare, this seems an acceptable cost.

While we are here, remove several 'nerrors != 0' checks
that are rendered pointless.

Updates #15756
Fixes #19250 (as much as it ever can be)

Change-Id: Ia44c4ad5b6fd6f804d9f88d9571cec8d23665cb3
Reviewed-on: https://go-review.googlesource.com/38973
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-03-31 16:31:20 +00:00
Josh Bleecher Snyder
b3a8beb9d1 cmd/compile: minor cleanup in debug code
Change-Id: I9885606801b9c8fcb62c16d0856025c4e83e658b
Reviewed-on: https://go-review.googlesource.com/38650
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-03-24 22:21:55 +00:00
Matthew Dempsky
325904fe6a cmd/compile: port liveness analysis to SSA
Passes toolstash-check -all.

Change-Id: I92c3c25d6c053f971f346f4fa3bbc76419b58183
Reviewed-on: https://go-review.googlesource.com/38087
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2017-03-20 22:58:50 +00:00
Josh Bleecher Snyder
2cdb7f118a cmd/compile: move Frontend field from ssa.Config to ssa.Func
Suggested by mdempsky in CL 38232.
This allows us to use the Frontend field
to associate frontend state and information
with a function.
See the following CL in the series for examples.

This is a giant CL, but it is almost entirely routine refactoring.

The ssa test API is starting to feel a bit unwieldy.
I will clean it up separately, once the dust has settled.

Passes toolstash -cmp.

Updates #15756

Change-Id: I71c573bd96ff7251935fce1391b06b1f133c3caf
Reviewed-on: https://go-review.googlesource.com/38327
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2017-03-17 23:18:57 +00:00
Josh Bleecher Snyder
88e47187c1 cmd/compile: relocate code from config.go to func.go
This is a follow-up to CL 38167.
Pure code movement.

Change-Id: I13e58f7eac6718c77076d89e13fc721a5205ec57
Reviewed-on: https://go-review.googlesource.com/38322
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-03-17 05:21:53 +00:00
Josh Bleecher Snyder
a5e3cac895 cmd/compile: rearrange fields between ssa.Func, ssa.Cache, and ssa.Config
This makes ssa.Func, ssa.Cache, and ssa.Config fulfill
the roles laid out for them in CL 38160.

The only non-trivial change in this CL is how cached
values and blocks get IDs. Prior to this CL, their IDs were
assigned as part of resetting the cache, and only modified
IDs were reset. This required knowing how many values and
blocks were modified, which required a tight coupling between
ssa.Func and ssa.Config. To eliminate that coupling,
we now zero values and blocks during reset,
and assign their IDs when they are used.
Since unused values and blocks have ID == 0,
we can efficiently find the last used value/block,
to avoid zeroing everything.
Bulk zeroing is efficient, but not efficient enough
to obviate the need to avoid zeroing everything every time.
As a happy side-effect, ssa.Func.Free is no longer necessary.

DebugHashMatch and friends now belong in func.go.
They have been left in place for clarity and review.
I will move them in a subsequent CL.

Passes toolstash -cmp. No compiler performance impact.
No change in 'go test cmd/compile/internal/ssa' execution time.

Change-Id: I2eb7af58da067ef6a36e815a6f386cfe8634d098
Reviewed-on: https://go-review.googlesource.com/38167
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2017-03-17 05:21:42 +00:00
Cherry Zhang
c8f38b3398 cmd/compile: use type information in Aux for Store size
Remove size AuxInt in Store, and alignment in Move/Zero. We still
pass size AuxInt to Move/Zero, as it is used for partial Move/Zero
lowering (e.g. cmd/compile/internal/ssa/gen/386.rules:288).
SizeAndAlign is gone.

Passes "toolstash -cmp" on std.

Change-Id: I1ca34652b65dd30de886940e789fcf41d521475d
Reviewed-on: https://go-review.googlesource.com/38150
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2017-03-16 14:25:04 +00:00
Cherry Zhang
1b85300602 cmd/compile: clean up SSA-building code
Now that the write barrier insertion is moved to SSA, the SSA
building code can be simplified.

Updates #17583.

Change-Id: I5cacc034b11aa90b0abe6f8dd97e4e3994e2bc25
Reviewed-on: https://go-review.googlesource.com/36840
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2017-03-16 14:24:40 +00:00
Josh Bleecher Snyder
43afcb5c96 cmd/compile: define roles for ssa.Func, ssa.Config, and ssa.Cache
The line between ssa.Func and ssa.Config has blurred.
Concurrent compilation in the backend will require more precision.
This CL lays out an (aspirational) organization.
The implementation will come in follow-up CLs,
once the organization is settled.

ssa.Config holds basic compiler configuration,
mostly arch-specific information.
It is configured once, early on, and is readonly,
so it is safe for concurrent use.

ssa.Func is a single-shot object used for
compiling a single Func. It is not concurrency-safe
and not re-usable.

ssa.Cache is a multi-use object used to avoid
expensive allocations during compilation.
Each ssa.Func is given an ssa.Cache to use.
ssa.Cache is not concurrency-safe.

Change-Id: Id02809b6f3541541cac6c27bbb598834888ce1cc
Reviewed-on: https://go-review.googlesource.com/38160
Reviewed-by: Keith Randall <khr@golang.org>
2017-03-15 04:27:49 +00:00
David Chase
886e9e6065 cmd/compile: put spills in better places
Previously we always issued a spill right after the op
that was being spilled.  This CL pushes spills father away
from the generator, hopefully pushing them into unlikely branches.
For example:

  x = ...
  if unlikely {
    call ...
  }
  ... use x ...

Used to compile to

  x = ...
  spill x
  if unlikely {
    call ...
    restore x
  }

It now compiles to

  x = ...
  if unlikely {
    spill x
    call ...
    restore x
  }

This is particularly useful for code which appends, as the only
call is an unlikely call to growslice.  It also helps for the
spills needed around write barrier calls.

The basic algorithm is walk down the dominator tree following a
path where the block still dominates all of the restores.  We're
looking for a block that:
 1) dominates all restores
 2) has the value being spilled in a register
 3) has a loop depth no deeper than the value being spilled

The walking-down code is iterative.  I was forced to limit it to
searching 100 blocks so it doesn't become O(n^2).  Maybe one day
we'll find a better way.

I had to delete most of David's code which pushed spills out of loops.
I suspect this CL subsumes most of the cases that his code handled.

Generally positive performance improvements, but hard to tell for sure
with all the noise.  (compilebench times are unchanged.)

name                      old time/op    new time/op    delta
BinaryTree17-12              2.91s ±15%     2.80s ±12%    ~     (p=0.063 n=10+10)
Fannkuch11-12                3.47s ± 0%     3.30s ± 4%  -4.91%   (p=0.000 n=9+10)
FmtFprintfEmpty-12          48.0ns ± 1%    47.4ns ± 1%  -1.32%    (p=0.002 n=9+9)
FmtFprintfString-12         85.6ns ±11%    79.4ns ± 3%  -7.27%  (p=0.005 n=10+10)
FmtFprintfInt-12            91.8ns ±10%    85.9ns ± 4%    ~      (p=0.203 n=10+9)
FmtFprintfIntInt-12          135ns ±13%     127ns ± 1%  -5.72%   (p=0.025 n=10+9)
FmtFprintfPrefixedInt-12     167ns ± 1%     168ns ± 2%    ~      (p=0.580 n=9+10)
FmtFprintfFloat-12           249ns ±11%     230ns ± 1%  -7.32%  (p=0.000 n=10+10)
FmtManyArgs-12               504ns ± 7%     506ns ± 1%    ~       (p=0.198 n=9+9)
GobDecode-12                6.95ms ± 1%    7.04ms ± 1%  +1.37%  (p=0.001 n=10+10)
GobEncode-12                6.32ms ±13%    6.04ms ± 1%    ~     (p=0.063 n=10+10)
Gzip-12                      233ms ± 1%     235ms ± 0%  +1.01%   (p=0.000 n=10+9)
Gunzip-12                   40.1ms ± 1%    39.6ms ± 0%  -1.12%   (p=0.000 n=10+8)
HTTPClientServer-12          227µs ± 9%     221µs ± 5%    ~       (p=0.114 n=9+8)
JSONEncode-12               16.1ms ± 2%    15.8ms ± 1%  -2.09%    (p=0.002 n=9+8)
JSONDecode-12               61.8ms ±11%    57.9ms ± 1%  -6.30%   (p=0.000 n=10+9)
Mandelbrot200-12            4.30ms ± 3%    4.28ms ± 1%    ~      (p=0.203 n=10+8)
GoParse-12                  3.18ms ± 2%    3.18ms ± 2%    ~     (p=0.579 n=10+10)
RegexpMatchEasy0_32-12      76.7ns ± 1%    77.5ns ± 1%  +0.92%    (p=0.002 n=9+8)
RegexpMatchEasy0_1K-12       239ns ± 3%     239ns ± 1%    ~     (p=0.204 n=10+10)
RegexpMatchEasy1_32-12      71.4ns ± 1%    70.6ns ± 0%  -1.15%   (p=0.000 n=10+9)
RegexpMatchEasy1_1K-12       383ns ± 2%     390ns ±10%    ~       (p=0.181 n=8+9)
RegexpMatchMedium_32-12      114ns ± 0%     113ns ± 1%  -0.88%    (p=0.000 n=9+8)
RegexpMatchMedium_1K-12     36.3µs ± 1%    36.8µs ± 1%  +1.59%   (p=0.000 n=10+8)
RegexpMatchHard_32-12       1.90µs ± 1%    1.90µs ± 1%    ~     (p=0.341 n=10+10)
RegexpMatchHard_1K-12       59.4µs ±11%    57.8µs ± 1%    ~      (p=0.968 n=10+9)
Revcomp-12                   461ms ± 1%     462ms ± 1%    ~       (p=1.000 n=9+9)
Template-12                 67.5ms ± 1%    66.3ms ± 1%  -1.77%   (p=0.000 n=10+8)
TimeParse-12                 314ns ± 3%     309ns ± 0%  -1.56%    (p=0.000 n=9+8)
TimeFormat-12                340ns ± 2%     331ns ± 1%  -2.79%  (p=0.000 n=10+10)

The go binary is 0.2% larger.  Not really sure why the size
would change.

Change-Id: Ia5116e53a3aeb025ef350ffc51c14ae5cc17871c
Reviewed-on: https://go-review.googlesource.com/34822
Reviewed-by: David Chase <drchase@google.com>
2017-03-15 02:09:25 +00:00
Josh Bleecher Snyder
3dcfce8d19 cmd/compile: add OpOffPtr [c] SP to constant cache
They accounted for almost 30% of all CSE'd values.

By never creating the duplicates in the first place,
we reduce the high water mark of Value IDs,
which in turn makes all SSA phases cheaper,
particularly regalloc.

name       old time/op     new time/op     delta
Template       200ms ± 3%      198ms ± 4%  -0.87%  (p=0.016 n=50+49)
Unicode       86.9ms ± 2%     85.5ms ± 3%  -1.56%  (p=0.000 n=49+50)
GoTypes        553ms ± 4%      551ms ± 4%    ~     (p=0.183 n=50+49)
SSA            3.97s ± 3%      3.93s ± 2%  -1.06%  (p=0.000 n=48+48)
Flate          124ms ± 4%      124ms ± 3%    ~     (p=0.545 n=48+50)
GoParser       146ms ± 4%      146ms ± 4%    ~     (p=0.810 n=49+49)
Reflect        357ms ± 3%      355ms ± 3%  -0.59%  (p=0.049 n=50+48)
Tar            106ms ± 4%      107ms ± 5%    ~     (p=0.454 n=49+50)
XML            203ms ± 4%      203ms ± 4%    ~     (p=0.726 n=48+50)

name       old user-ns/op  new user-ns/op  delta
Template        237M ± 3%       235M ± 4%    ~     (p=0.208 n=47+48)
Unicode         111M ± 4%       108M ± 9%  -2.50%  (p=0.000 n=47+50)
GoTypes         736M ± 5%       729M ± 4%  -0.95%  (p=0.017 n=50+46)
SSA            5.73G ± 4%      5.74G ± 4%    ~     (p=0.765 n=50+50)
Flate           150M ± 5%       148M ± 6%  -0.89%  (p=0.045 n=48+47)
GoParser        180M ± 5%       178M ± 7%  -1.34%  (p=0.012 n=50+50)
Reflect         450M ± 4%       444M ± 4%  -1.40%  (p=0.000 n=50+49)
Tar             124M ± 7%       123M ± 7%    ~     (p=0.092 n=50+50)
XML             248M ± 6%       245M ± 5%    ~     (p=0.057 n=50+50)

name       old alloc/op    new alloc/op    delta
Template      39.4MB ± 0%     39.3MB ± 0%  -0.37%  (p=0.000 n=50+50)
Unicode       30.9MB ± 0%     30.9MB ± 0%  -0.27%  (p=0.000 n=48+50)
GoTypes        114MB ± 0%      113MB ± 0%  -1.03%  (p=0.000 n=50+49)
SSA            882MB ± 0%      865MB ± 0%  -1.95%  (p=0.000 n=49+49)
Flate         25.8MB ± 0%     25.7MB ± 0%  -0.21%  (p=0.000 n=50+50)
GoParser      31.7MB ± 0%     31.6MB ± 0%  -0.33%  (p=0.000 n=50+50)
Reflect       79.7MB ± 0%     79.3MB ± 0%  -0.49%  (p=0.000 n=44+49)
Tar           27.2MB ± 0%     27.1MB ± 0%  -0.31%  (p=0.000 n=50+50)
XML           42.7MB ± 0%     42.3MB ± 0%  -1.05%  (p=0.000 n=48+49)

name       old allocs/op   new allocs/op   delta
Template        379k ± 1%       380k ± 1%  +0.26%  (p=0.000 n=50+50)
Unicode         324k ± 1%       324k ± 1%    ~     (p=0.964 n=49+50)
GoTypes        1.14M ± 0%      1.15M ± 0%  +0.14%  (p=0.000 n=50+49)
SSA            7.89M ± 0%      7.89M ± 0%  -0.05%  (p=0.000 n=49+49)
Flate           240k ± 1%       241k ± 1%  +0.27%  (p=0.001 n=50+50)
GoParser        310k ± 1%       311k ± 1%  +0.48%  (p=0.000 n=50+49)
Reflect        1.00M ± 0%      1.00M ± 0%  +0.17%  (p=0.000 n=48+50)
Tar             254k ± 1%       255k ± 1%  +0.23%  (p=0.005 n=50+50)
XML             395k ± 1%       395k ± 1%  +0.19%  (p=0.002 n=49+47)

Change-Id: Iaa8f5f37e23bd81983409f7359f9dcd4dfe2961f
Reviewed-on: https://go-review.googlesource.com/38003
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-03-10 16:50:58 +00:00
Josh Bleecher Snyder
d11a2184fb cmd/compile: allow earlier GC of freed constant value
Minor fix, because it's the right thing to do.
No significant impact.

Change-Id: I2138285d397494daa9a88c414149c2a7860edd7e
Reviewed-on: https://go-review.googlesource.com/38001
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-03-10 01:39:09 +00:00
Josh Bleecher Snyder
c63ad970f6 cmd/compile: rename Func.constVal arg for clarity
Values have an Aux and an AuxInt.
We're setting AuxInt, not Aux.
Say so.

Change-Id: I41aa783273bb7e1ba47c941aa4233f818e37dadd
Reviewed-on: https://go-review.googlesource.com/37997
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-03-09 23:39:01 +00:00
Josh Bleecher Snyder
f791b288d1 cmd/compile: remove some allocs from CSE
Pick up a few pennies:

* CSE gets run twice for each function,
but the set of Aux values doesn't change.
Avoid populating it twice.

* Don't bother populating auxmap for values
that can't be CSE'd anyway.

name       old alloc/op     new alloc/op     delta
Template       41.0MB ± 0%      40.7MB ± 0%  -0.61%  (p=0.008 n=5+5)
Unicode        32.3MB ± 0%      32.3MB ± 0%  -0.22%  (p=0.008 n=5+5)
GoTypes         122MB ± 0%       121MB ± 0%  -0.55%  (p=0.008 n=5+5)
Compiler        482MB ± 0%       479MB ± 0%  -0.58%  (p=0.008 n=5+5)
SSA             865MB ± 0%       862MB ± 0%  -0.35%  (p=0.008 n=5+5)
Flate          26.5MB ± 0%      26.5MB ± 0%    ~     (p=0.056 n=5+5)
GoParser       32.6MB ± 0%      32.4MB ± 0%  -0.58%  (p=0.008 n=5+5)
Reflect        84.2MB ± 0%      83.8MB ± 0%  -0.57%  (p=0.008 n=5+5)
Tar            27.7MB ± 0%      27.6MB ± 0%  -0.37%  (p=0.008 n=5+5)
XML            44.7MB ± 0%      44.5MB ± 0%  -0.53%  (p=0.008 n=5+5)

name       old allocs/op    new allocs/op    delta
Template         373k ± 0%        373k ± 1%    ~     (p=1.000 n=5+5)
Unicode          326k ± 0%        325k ± 0%    ~     (p=0.548 n=5+5)
GoTypes         1.16M ± 0%       1.16M ± 0%    ~     (p=0.841 n=5+5)
Compiler        4.16M ± 0%       4.15M ± 0%    ~     (p=0.222 n=5+5)
SSA             7.57M ± 0%       7.56M ± 0%  -0.22%  (p=0.008 n=5+5)
Flate            238k ± 1%        239k ± 1%    ~     (p=0.690 n=5+5)
GoParser         304k ± 0%        304k ± 0%    ~     (p=1.000 n=5+5)
Reflect         1.01M ± 0%       1.00M ± 0%  -0.31%  (p=0.016 n=4+5)
Tar              245k ± 0%        245k ± 1%    ~     (p=0.548 n=5+5)
XML              393k ± 0%        391k ± 1%    ~     (p=0.095 n=5+5)

Change-Id: I78f1ffe129bd8fd590b7511717dd2bf9f5ecbd6d
Reviewed-on: https://go-review.googlesource.com/36690
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-02-09 20:42:46 +00:00
Matthew Dempsky
5c90e1cf8a cmd/compile/internal/ssa: remove Func.StaticData field
Rather than collecting static data nodes to be written out later, just
write them out immediately.

Change-Id: I51708b690e94bc3e288b4d6ba3307bf738a80f64
Reviewed-on: https://go-review.googlesource.com/36352
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2017-02-04 01:09:26 +00:00
Josh Bleecher Snyder
57546d67ec cmd/compile: add reusable []Location to ssa.Config
name       old time/op      new time/op      delta
Template        218ms ± 3%       214ms ± 3%  -1.70%  (p=0.000 n=30+30)
Unicode         100ms ± 3%       100ms ± 4%    ~     (p=0.614 n=29+30)
GoTypes         657ms ± 1%       660ms ± 3%  +0.46%  (p=0.046 n=29+30)
Compiler        2.80s ± 2%       2.80s ± 1%    ~     (p=0.451 n=28+29)
Flate           131ms ± 2%       132ms ± 4%    ~     (p=1.000 n=29+29)
GoParser        159ms ± 3%       160ms ± 5%    ~     (p=0.341 n=28+30)
Reflect         406ms ± 3%       408ms ± 4%    ~     (p=0.511 n=28+30)
Tar             118ms ± 4%       118ms ± 4%    ~     (p=0.827 n=29+30)
XML             222ms ± 6%       222ms ± 3%    ~     (p=0.532 n=30+30)

name       old user-ns/op   new user-ns/op   delta
Template   274user-ms ± 3%  272user-ms ± 3%  -0.87%  (p=0.015 n=29+30)
Unicode    140user-ms ± 4%  140user-ms ± 3%    ~     (p=0.735 n=29+30)
GoTypes    890user-ms ± 1%  897user-ms ± 2%  +0.88%  (p=0.002 n=29+30)
Compiler   3.88user-s ± 2%  3.89user-s ± 1%    ~     (p=0.132 n=30+29)
Flate      168user-ms ± 2%  157user-ms ± 4%  -6.21%  (p=0.000 n=25+28)
GoParser   211user-ms ± 2%  213user-ms ± 5%    ~     (p=0.086 n=28+30)
Reflect    539user-ms ± 2%  541user-ms ± 3%    ~     (p=0.267 n=27+29)
Tar        156user-ms ± 7%  155user-ms ± 5%    ~     (p=0.708 n=30+30)
XML        291user-ms ± 5%  294user-ms ± 3%  +0.83%  (p=0.029 n=29+30)

name       old alloc/op     new alloc/op     delta
Template       40.7MB ± 0%      39.4MB ± 0%  -3.26%  (p=0.000 n=29+26)
Unicode        30.8MB ± 0%      30.7MB ± 0%  -0.40%  (p=0.000 n=28+30)
GoTypes         123MB ± 0%       119MB ± 0%  -3.47%  (p=0.000 n=30+29)
Compiler        472MB ± 0%       455MB ± 0%  -3.60%  (p=0.000 n=30+30)
Flate          26.5MB ± 0%      25.6MB ± 0%  -3.21%  (p=0.000 n=28+30)
GoParser       32.3MB ± 0%      31.4MB ± 0%  -2.98%  (p=0.000 n=29+30)
Reflect        84.4MB ± 0%      82.1MB ± 0%  -2.83%  (p=0.000 n=30+30)
Tar            27.3MB ± 0%      26.5MB ± 0%  -2.70%  (p=0.000 n=29+29)
XML            44.6MB ± 0%      43.1MB ± 0%  -3.49%  (p=0.000 n=30+30)

name       old allocs/op    new allocs/op    delta
Template         401k ± 1%        399k ± 0%  -0.35%  (p=0.000 n=30+28)
Unicode          331k ± 0%        331k ± 1%    ~     (p=0.907 n=28+30)
GoTypes         1.24M ± 0%       1.23M ± 0%  -0.43%  (p=0.000 n=30+30)
Compiler        4.26M ± 0%       4.25M ± 0%  -0.34%  (p=0.000 n=29+30)
Flate            252k ± 1%        251k ± 1%  -0.41%  (p=0.000 n=30+30)
GoParser         325k ± 1%        324k ± 1%  -0.31%  (p=0.000 n=27+30)
Reflect         1.06M ± 0%       1.05M ± 0%  -0.69%  (p=0.000 n=30+30)
Tar              266k ± 1%        265k ± 1%  -0.51%  (p=0.000 n=29+30)
XML              416k ± 1%        415k ± 1%  -0.36%  (p=0.002 n=30+30)

Change-Id: I8f784001324df83b2764c44f0e83a540e5beab34
Reviewed-on: https://go-review.googlesource.com/36212
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-02-02 22:39:32 +00:00
Russ Cox
47ce87877b all: merge dev.inline into master
Change-Id: I7715581a04e513dcda9918e853fa6b1ddc703770
2017-02-01 09:47:23 -05:00
Robert Griesemer
472c792e0a [dev.inline] cmd/internal/src: introduce compact source position representation
XPos is a compact (8 instead of 16 bytes on a 64bit machine) source
position representation. There is a 1:1 correspondence between each
XPos and each regular Pos, translated via a global table.

In some sense this brings back the LineHist, though positions can
track line and column information; there is a O(1) translation
between the representations (no binary search), and the translation
is factored out.

The size increase with the prior change is brought down again and
the compiler speed is in line with the master repo (measured on
the same "quiet" machine as for prior change):

name       old time/op     new time/op     delta
Template       256ms ± 1%      262ms ± 2%    ~             (p=0.063 n=5+4)
Unicode        132ms ± 1%      135ms ± 2%    ~             (p=0.063 n=5+4)
GoTypes        891ms ± 1%      871ms ± 1%  -2.28%          (p=0.016 n=5+4)
Compiler       3.84s ± 2%      3.89s ± 2%    ~             (p=0.413 n=5+4)
MakeBash       47.1s ± 1%      46.2s ± 2%    ~             (p=0.095 n=5+5)

name       old user-ns/op  new user-ns/op  delta
Template        309M ± 1%       314M ± 2%    ~             (p=0.111 n=5+4)
Unicode         165M ± 1%       172M ± 9%    ~             (p=0.151 n=5+5)
GoTypes        1.14G ± 2%      1.12G ± 1%    ~             (p=0.063 n=5+4)
Compiler       5.00G ± 1%      4.96G ± 1%    ~             (p=0.286 n=5+4)

Change-Id: Icc570cc60ab014d8d9af6976f1f961ab8828cc47
Reviewed-on: https://go-review.googlesource.com/34506
Run-TryBot: Robert Griesemer <gri@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-01-09 22:43:22 +00:00
David Chase
7f1ff65c39 cmd/compile: insert scheduling checks on loop backedges
Loop breaking with a counter.  Benchmarked (see comments),
eyeball checked for sanity on popular loops.  This code
ought to handle loops in general, and properly inserts phi
functions in cases where the earlier version might not have.

Includes test, plus modifications to test/run.go to deal with
timeout and killing looping test.  Tests broken by the addition
of extra code (branch frequency and live vars) for added
checks turn the check insertion off.

If GOEXPERIMENT=preemptibleloops, the compiler inserts reschedule
checks on every backedge of every reducible loop.  Alternately,
specifying GO_GCFLAGS=-d=ssa/insert_resched_checks/on will
enable it for a single compilation, but because the core Go
libraries contain some loops that may run long, this is less
likely to have the desired effect.

This is intended as a tool to help in the study and diagnosis
of GC and other latency problems, now that goal STW GC latency
is on the order of 100 microseconds or less.

Updates #17831.
Updates #10958.

Change-Id: I6206c163a5b0248e3f21eb4fc65f73a179e1f639
Reviewed-on: https://go-review.googlesource.com/33910
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2017-01-09 21:01:29 +00:00
Robert Griesemer
c10499b539 [dev.inline] cmd/compile/internal/ssa: another round of renames from line -> pos (cleanup)
Mostly mechanical renames. Make variable names consistent with use.

Change-Id: Iaa89d31deab11eca6e784595b58e779ad525c8a3
Reviewed-on: https://go-review.googlesource.com/34146
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2016-12-08 23:10:30 +00:00
Robert Griesemer
cfd17f51c8 [dev.inline] cmd/compile/internal/ssa: rename various fields from Line to Pos
This is a mostly mechanical rename followed by manual fixes where necessary.

Change-Id: Ie5c670b133db978f15dc03e50dc2da0c80fc8842
Reviewed-on: https://go-review.googlesource.com/34137
Reviewed-by: David Lazar <lazard@golang.org>
2016-12-08 21:36:52 +00:00
Robert Griesemer
24597c080b [dev.inline] cmd/compile: introduce cmd/internal/src.Pos type for line numbers
This is a step toward chosing a different position representation.
By introducing an explicit type, it will be easier to make the
transition step-wise while ensuring everything keeps running.

This has been reviewed via https://go-review.googlesource.com/#/c/34025/.

Change-Id: Ibceddcd62d8f346321ac3250e3940e9c436ed684
Reviewed-on: https://go-review.googlesource.com/34132
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Lazar <lazard@golang.org>
2016-12-08 21:26:25 +00:00
David Chase
a190f3c8a3 cmd/compile: enable flag-specified dump of specific phase+function
For very large input files, use of GOSSAFUNC to obtain a dump
after compilation steps can lead to both unwieldy large output
files and unwieldy larger processes (because the output is
buffered in a string).  This flag

  -d=ssa/<phase>/dump:<function name>

provides finer control of what is dumped, into a smaller
file, and with less memory overhead in the running compiler.
The special phase name "build" is added to allow printing
of the just-built ssa before any transformations are applied.

This was helpful in making sense of the gogo/protobuf
problems.

The output format was tweaked to remove gratuitous spaces,
and a crude -d=ssa/help help text was added.

Change-Id: If7516e22203420eb6ed3614f7cee44cb9260f43e
Reviewed-on: https://go-review.googlesource.com/23044
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2016-10-20 22:23:56 +00:00
Hajime Hoshi
c5368123fe cmd/compile: remove redundant function idom
Change-Id: Ib14b5421bb5e407bbd4d3cbfc68c92d3dd257cb1
Reviewed-on: https://go-review.googlesource.com/30732
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2016-10-11 16:43:12 +00:00
Keith Randall
5a6e511c61 cmd/compile: Use Sreedhar+Gao phi building algorithm
Should be more asymptotically happy.

We process each variable in turn to find all the
locations where it needs a phi (the dominance frontier
of all of its definitions).  Then we add all those phis.
This takes O(n * #variables), although hopefully much less.

Then we do a single tree walk to match all the
FwdRefs with the nearest definition or phi.
This takes O(n) time.

The one remaining inefficiency is that we might end up
introducing a bunch of dead phis in the first step.
A TODO is to introduce phis only where they might be
used by a read.

The old algorithm is still faster on small functions,
so there's a cutover size (currently 500 blocks).

This algorithm supercedes the David's sparse phi
placement algorithm for large functions.

Lowers compile time of example from #14934 from
~10 sec to ~4 sec.
Lowers compile time of example from #16361 from
~4.5 sec to ~3 sec.
Lowers #16407 from ~20 min to ~30 sec.

Update #14934
Update #16361
Fixes #16407

Change-Id: I1cff6364e1623c143190b6a924d7599e309db58f
Reviewed-on: https://go-review.googlesource.com/30163
Reviewed-by: David Chase <drchase@google.com>
2016-10-03 20:30:08 +00:00
Keith Randall
75ce89c20d cmd/compile: cache CFG-dependent computations
We compute a lot of stuff based off the CFG: postorder traversal,
dominators, dominator tree, loop nest.  Multiple phases use this
information and we end up recomputing some of it.  Add a cache
for this information so if the CFG hasn't changed, we can reuse
the previous computation.

Change-Id: I9b5b58af06830bd120afbee9cfab395a0a2f74b2
Reviewed-on: https://go-review.googlesource.com/29356
Reviewed-by: David Chase <drchase@google.com>
2016-09-19 16:00:13 +00:00
Keith Randall
167e381f40 cmd/compile: make ssa compilation unconditional
Rip out the code that allows SSA to be used conditionally.

No longer exists:
 ssa=0 flag
 GOSSAHASH
 GOSSAPKG
 SSATEST

GOSSAFUNC now only controls the printing of the IR/html.

Still need to rip out all of the old backend.  It should no longer be
callable after this CL.

Update #16357

Change-Id: Ib30cc18fba6ca52232c41689ba610b0a94aa74f5
Reviewed-on: https://go-review.googlesource.com/29155
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2016-09-14 17:38:04 +00:00
Keith Randall
84aac622a4 cmd/compile: intrinsify the rest of runtime/internal/atomic for amd64
Atomic swap, add/and/or, compare and swap.

Also works on amd64p32.

Change-Id: Idf2d8f3e1255f71deba759e6e75e293afe4ab2ba
Reviewed-on: https://go-review.googlesource.com/27813
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2016-08-28 16:31:08 +00:00
David Chase
6b99fb5bea cmd/compile: use sparse algorithm for phis in large program
This adds a sparse method for locating nearest ancestors
in a dominator tree, and checks blocks with more than one
predecessor for differences and inserts phi functions where
there are.

Uses reversed post order to cut number of passes, running
it from first def to last use ("last use" for paramout and
mem is end-of-program; last use for a phi input from a
backedge is the source of the back edge)

Includes a cutover from old algorithm to new to avoid paying
large constant factor for small programs.  This keeps normal
builds running at about the same time, while not running
over-long on large machine-generated inputs.

Add "phase" flags for ssa/build -- ssa/build/stats prints
number of blocks, values (before and after linking references
and inserting phis, so expansion can be measured), and their
product; the product governs the cutover, where a good value
seems to be somewhere between 1 and 5 million.

Among the files compiled by make.bash, this is the shape of
the tail of the distribution for #blocks, #vars, and their
product:

	 #blocks	#vars	    product
 max	6171	28180	173,898,780
99.9%	1641	 6548	 10,401,878
  99%	 463	 1909	    873,721
  95%	 152	  639	     95,235
  90%	  84	  359	     30,021

The old algorithm is indeed usually fastest, for 99%ile
values of usually.

The fix to LookupVarOutgoing
( https://go-review.googlesource.com/#/c/22790/ )
deals with some of the same problems addressed by this CL,
but on at least one bug ( #15537 ) this change is still
a significant help.

With this CL:
/tmp/gopath$ rm -rf pkg bin
/tmp/gopath$ time go get -v -gcflags -memprofile=y.mprof \
   github.com/gogo/protobuf/test/theproto3/combos/...
...
real	4m35.200s
user	13m16.644s
sys	0m36.712s
and pprof reports 3.4GB allocated in one of the larger profiles

With tip:
/tmp/gopath$ rm -rf pkg bin
/tmp/gopath$ time go get -v -gcflags -memprofile=y.mprof \
   github.com/gogo/protobuf/test/theproto3/combos/...
...
real	10m36.569s
user	25m52.286s
sys	4m3.696s
and pprof reports 8.3GB allocated in the same larger profile

With this CL, most of the compilation time on the benchmarked
input is spent in register/stack allocation (cumulative 53%)
and in the sparse lookup algorithm itself (cumulative 20%).

Fixes #15537.

Change-Id: Ia0299dda6a291534d8b08e5f9883216ded677a00
Reviewed-on: https://go-review.googlesource.com/22342
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-05-16 21:08:05 +00:00
Keith Randall
4fa050024f cmd/compile: enable constant-time CFG editing
Provide indexes along with block pointers for Preds
and Succs arrays.  This allows us to splice edges in
and out of those arrays in constant time.

Fixes worst-case O(n^2) behavior in deadcode and fuse.

benchmark                     old ns/op      new ns/op     delta
BenchmarkFuse1-8              2065           2057          -0.39%
BenchmarkFuse10-8             9408           9073          -3.56%
BenchmarkFuse100-8            105238         76277         -27.52%
BenchmarkFuse1000-8           3982562        1026750       -74.22%
BenchmarkFuse10000-8          301220329      12824005      -95.74%
BenchmarkDeadCode1-8          1588           1566          -1.39%
BenchmarkDeadCode10-8         4333           4250          -1.92%
BenchmarkDeadCode100-8        32031          32574         +1.70%
BenchmarkDeadCode1000-8       590407         468275        -20.69%
BenchmarkDeadCode10000-8      17822890       5000818       -71.94%
BenchmarkDeadCode100000-8     1388706640     78021127      -94.38%
BenchmarkDeadCode200000-8     5372518479     168598762     -96.86%

Change-Id: Iccabdbb9343fd1c921ba07bbf673330a1c36ee17
Reviewed-on: https://go-review.googlesource.com/22589
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-05-05 15:58:59 +00:00
Josh Bleecher Snyder
2563b6f9fe cmd/compile/internal/ssa: use Compare instead of Equal
They have different semantics.

Equal is stricter and is designed for the front-end.
Compare is looser and cheaper and is designed for the back-end.
To avoid possible regression, remove Equal from ssa.Type.

Updates #15043

Change-Id: Ie23ce75ff6b4d01b7982e0a89e6f81b5d099d8d6
Reviewed-on: https://go-review.googlesource.com/21483
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
2016-04-17 04:50:45 +00:00
Alexandru Moșoi
9743e4b031 cmd/compile: share dominator tree among many passes
These passes do not modify the dominator tree too much.

% benchstat old.txt new.txt
name       old time/op     new time/op     delta
Template       335ms ± 3%      325ms ± 8%    ~             (p=0.074 n=8+9)
GoTypes        1.05s ± 1%      1.05s ± 3%    ~            (p=0.095 n=9+10)
Compiler       5.37s ± 4%      5.29s ± 1%  -1.42%         (p=0.022 n=9+10)
MakeBash       34.9s ± 3%      34.4s ± 2%    ~            (p=0.095 n=9+10)

name       old alloc/op    new alloc/op    delta
Template      55.4MB ± 0%     54.9MB ± 0%  -0.81%        (p=0.000 n=10+10)
GoTypes        179MB ± 0%      178MB ± 0%  -0.89%        (p=0.000 n=10+10)
Compiler       807MB ± 0%      798MB ± 0%  -1.10%        (p=0.000 n=10+10)

name       old allocs/op   new allocs/op   delta
Template        498k ± 0%       496k ± 0%  -0.29%          (p=0.000 n=9+9)
GoTypes        1.42M ± 0%      1.41M ± 0%  -0.24%        (p=0.000 n=10+10)
Compiler       5.61M ± 0%      5.60M ± 0%  -0.12%        (p=0.000 n=10+10)

Change-Id: I4cd20cfba3f132ebf371e16046ab14d7e42799ec
Reviewed-on: https://go-review.googlesource.com/21806
Run-TryBot: Alexandru Moșoi <alexandru@mosoi.ro>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2016-04-12 14:44:26 +00:00
Keith Randall
7e40627a0e cmd/compile: zero all three argstorage slots
These changes were missed when going from 2 to 3 argstorage slots.
https://go-review.googlesource.com/20296/

Change-Id: I930a307bb0b695bf1ae088030c9bbb6d14ca31d2
Reviewed-on: https://go-review.googlesource.com/21841
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2016-04-11 20:49:22 +00:00
Keith Randall
56e0ecc5ea cmd/compile: keep value use counts in SSA
Keep track of how many uses each Value has.  Each appearance in
Value.Args and in Block.Control counts once.

The number of uses of a value is generically useful to
constrain rewrite rules.  For instance, we might want to
prevent merging index operations into loads if the same
index expression is used lots of times.

But I have one use in particular for which the use count is required.
We must make sure we don't combine ops with loads if the load has
more than one use.  Otherwise, we may split a single load
into multiple loads and that breaks perceived behavior in
the presence of races.  In particular, the load of m.state
in sync/mutex.go:Lock can't be done twice.  (I have a separate
CL which triggers the mutex failure.  This CL has a test which
demonstrates a similar failure.)

Change-Id: Icaafa479239f48632a069d0c3f624e6ebc6b1f0e
Reviewed-on: https://go-review.googlesource.com/20790
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Todd Neal <todd@tneal.org>
2016-03-17 04:20:02 +00:00
Todd Neal
98b88de56f cmd/compile: change the type of ssa Warnl line number
Line numbers are always int32, so the Warnl function should take the
line number as an int32 as well.  This matches gc.Warnl and removes
a cast every place it's used.

Change-Id: I5d6201e640d52ec390eb7174f8fd8c438d4efe58
Reviewed-on: https://go-review.googlesource.com/20662
Run-TryBot: Todd Neal <todd@tneal.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2016-03-14 11:04:40 +00:00
Todd Neal
f6ceed2cab cmd/compile: const folding for float32/64
Split the auxFloat type into 32/64 bit versions and perform checking for
exactly representable float32 values.  Perform const folding on
float32/64.  Comment out some const negation rules that the frontend
already performs.

Change-Id: Ib3f8d59fa8b30e50fe0267786cfb3c50a06169d2
Reviewed-on: https://go-review.googlesource.com/20568
Run-TryBot: Todd Neal <todd@tneal.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2016-03-13 13:32:41 +00:00
Todd Neal
6cb2e1d015 cmd/compile: remove values from const cache upon free
When calling freeValue for possible const values, remove them from the
cache as well.

Change-Id: I087ed592243e33c58e5db41700ab266fc70196d9
Reviewed-on: https://go-review.googlesource.com/20481
Run-TryBot: Todd Neal <tolchz@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2016-03-10 10:28:05 +00:00
Josh Bleecher Snyder
39214275d6 cmd/compile: cache const nil, iface, slice, and ""
name      old time/op    new time/op    delta
Template     441ms ± 4%     446ms ± 4%  +1.23%  (p=0.048 n=22+25)
GoTypes      1.51s ± 2%     1.51s ± 2%    ~     (p=0.224 n=25+25)
Compiler     5.59s ± 1%     5.57s ± 2%  -0.38%  (p=0.019 n=24+24)

name      old alloc/op   new alloc/op   delta
Template    85.6MB ± 0%    85.6MB ± 0%  -0.11%  (p=0.000 n=25+24)
GoTypes      307MB ± 0%     305MB ± 0%  -0.45%  (p=0.000 n=25+25)
Compiler    1.06GB ± 0%    1.06GB ± 0%  -0.34%  (p=0.000 n=25+25)

name      old allocs/op  new allocs/op  delta
Template     1.10M ± 0%     1.10M ± 0%  -0.03%  (p=0.001 n=25+24)
GoTypes      3.36M ± 0%     3.35M ± 0%  -0.13%  (p=0.000 n=25+25)
Compiler     13.0M ± 0%     13.0M ± 0%  -0.12%  (p=0.000 n=25+24)

Change-Id: I7fc18acbc3b1588aececef9692e24a0bd3dba974
Reviewed-on: https://go-review.googlesource.com/20295
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2016-03-08 16:21:51 +00:00
Brad Fitzpatrick
5fea2ccc77 all: single space after period.
The tree's pretty inconsistent about single space vs double space
after a period in documentation. Make it consistently a single space,
per earlier decisions. This means contributors won't be confused by
misleading precedence.

This CL doesn't use go/doc to parse. It only addresses // comments.
It was generated with:

$ perl -i -npe 's,^(\s*// .+[a-z]\.)  +([A-Z]),$1 $2,' $(git grep -l -E '^\s*//(.+\.)  +([A-Z])')
$ go test go/doc -update

Change-Id: Iccdb99c37c797ef1f804a94b22ba5ee4b500c4f7
Reviewed-on: https://go-review.googlesource.com/20022
Reviewed-by: Rob Pike <r@golang.org>
Reviewed-by: Dave Day <djd@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-03-02 00:13:47 +00:00
Josh Bleecher Snyder
194c79c163 [dev.ssa] cmd/compile: add constant cache
The cache gets a 62% hit rate while compiling
the standard library.


name      old time/op    new time/op    delta
Template     449ms ± 2%     443ms ± 4%  -1.40%  (p=0.006 n=23+25)
GoTypes      1.54s ± 1%     1.50s ± 2%  -2.53%  (p=0.000 n=22+22)
Compiler     5.51s ± 1%     5.39s ± 1%  -2.29%  (p=0.000 n=23+25)

name      old alloc/op   new alloc/op   delta
Template    90.4MB ± 0%    90.0MB ± 0%  -0.45%  (p=0.000 n=25+25)
GoTypes      334MB ± 0%     331MB ± 0%  -1.05%  (p=0.000 n=25+25)
Compiler    1.12GB ± 0%    1.10GB ± 0%  -1.57%  (p=0.000 n=25+24)

name      old allocs/op  new allocs/op  delta
Template      681k ± 0%      682k ± 0%  +0.26%  (p=0.000 n=25+25)
GoTypes      2.23M ± 0%     2.23M ± 0%  +0.05%  (p=0.000 n=23+24)
Compiler     6.46M ± 0%     6.46M ± 0%  +0.02%  (p=0.000 n=24+25)


Change-Id: I2629c291892827493d7b55ec4d83f6973a2ab133
Reviewed-on: https://go-review.googlesource.com/20026
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Alexandru Moșoi <alexandru@mosoi.ro>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-02-29 22:40:43 +00:00
David Chase
378a863682 [dev.ssa] cmd/compile: enhance command line option processing for SSA
The -d compiler flag can also specify ssa phase and flag,
for example -d=ssa/generic_cse/time,ssa/generic_cse/stats

Spaces in the phase names can be specified with an
underscore.  Flags currently parsed (not necessarily
recognized by the phases yet) are:

   on, off, mem, time, debug, stats, and test

On, off and time are handled in the harness,
debug, stats, and test are interpreted by the phase itself.

The pass is now attached to the Func being compiled, and a
new method logStats(key, ...value) on *Func to encourage a
semi-standardized format for that output.  Output fields
are separated by tabs to ease digestion by awk and
spreadsheets.  For example,
	if f.pass.stats > 0 {
		f.logStat("CSE REWRITES", rewrites)
	}

Change-Id: I16db2b5af64c50ca9a47efeb51d961147a903abc
Reviewed-on: https://go-review.googlesource.com/19885
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Todd Neal <todd@tneal.org>
2016-02-25 20:32:15 +00:00
Todd Neal
3297a4f5f3 [dev.ssa] cmd/compile: cache sparse sets in Config
Move the cached sparse sets to the Config.  I tested make.bash with
pre-allocating sets of size 150 and not caching very small sets, but the
difference between this implementation (no min size, no preallocation)
and a min size with preallocation was fairly negligible:

Number of sparse sets allocated:
Cached in Config w/none preallocated no min size    3684 *this CL*
Cached in Config w/three preallocated no min size   3370
Cached in Config w/three preallocated min size=150  3370
Cached in Config w/none preallocated min size=150  15947
Cached in Func,  w/no min                          96996 *previous code*

Change-Id: I7f9de8a7cae192648a7413bfb18a6690fad34375
Reviewed-on: https://go-review.googlesource.com/19152
Reviewed-by: Keith Randall <khr@golang.org>
2016-02-02 18:08:46 +00:00
Todd Neal
f962f33035 [dev.ssa] cmd/compile: reuse sparse sets across compiler passes
Cache sparse sets in the function so they can be reused by subsequent
compiler passes.

benchmark                        old ns/op     new ns/op     delta
BenchmarkDSEPass-8               206945        180022        -13.01%
BenchmarkDSEPassBlock-8          5286103       2614054       -50.55%
BenchmarkCSEPass-8               1790277       1790655       +0.02%
BenchmarkCSEPassBlock-8          18083588      18112771      +0.16%
BenchmarkDeadcodePass-8          59837         41375         -30.85%
BenchmarkDeadcodePassBlock-8     1651575       511169        -69.05%
BenchmarkMultiPass-8             531529        427506        -19.57%
BenchmarkMultiPassBlock-8        7033496       4487814       -36.19%

benchmark                        old allocs     new allocs     delta
BenchmarkDSEPass-8               11             4              -63.64%
BenchmarkDSEPassBlock-8          599            120            -79.97%
BenchmarkCSEPass-8               18             18             +0.00%
BenchmarkCSEPassBlock-8          2700           2700           +0.00%
BenchmarkDeadcodePass-8          4              3              -25.00%
BenchmarkDeadcodePassBlock-8     30             9              -70.00%
BenchmarkMultiPass-8             24             20             -16.67%
BenchmarkMultiPassBlock-8        1800           1000           -44.44%

benchmark                        old bytes     new bytes     delta
BenchmarkDSEPass-8               221367        142           -99.94%
BenchmarkDSEPassBlock-8          3695207       3846          -99.90%
BenchmarkCSEPass-8               303328        303328        +0.00%
BenchmarkCSEPassBlock-8          5006400       5006400       +0.00%
BenchmarkDeadcodePass-8          84232         10506         -87.53%
BenchmarkDeadcodePassBlock-8     1274940       163680        -87.16%
BenchmarkMultiPass-8             608674        313834        -48.44%
BenchmarkMultiPassBlock-8        9906001       5003450       -49.49%

Change-Id: Ib1fa58c7f494b374d1a4bb9cffbc2c48377b59d3
Reviewed-on: https://go-review.googlesource.com/19100
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
2016-01-30 13:57:39 +00:00
David Chase
88b230eaa6 [dev.ssa] cmd/compile: exposed do-log boolean to reduce allocations
From memory profiling, about 3% reduction in allocation count.

Change-Id: I4b662d55b8a94fe724759a2b22f05a08d0bf40f8
Reviewed-on: https://go-review.googlesource.com/19103
Reviewed-by: Keith Randall <khr@golang.org>
2016-01-29 21:30:29 +00:00
Keith Randall
056c09bb88 [dev.ssa] cmd/compile: add backing store buffers for block.{Preds,Succs,Values}
Speeds up compilation by 6%.

Change-Id: Ibaad95710323ddbe13c1b0351843fe43a48d776e
Reviewed-on: https://go-review.googlesource.com/19080
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-01-29 01:01:39 +00:00
Keith Randall
2f57d0fe02 [dev.ssa] cmd/compile: preallocate small-numbered values and blocks
Speeds up the compiler ~5%.

Change-Id: Ia5cf0bcd58701fd14018ec77d01f03d5c7d6385b
Reviewed-on: https://go-review.googlesource.com/19060
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-01-28 22:52:42 +00:00
Keith Randall
da8af47710 [dev.ssa] cmd/compile: report better line numbers for Unimplemented/Fatal
If a failure occurs in SSA processing, we always report the
last line of the function we're compiling.  Modify the callbacks
from SSA to the GC compiler so we can pass a line number back
and use it in Fatalf.

Change-Id: Ifbfad50d5e167e997e0a96f0775bcc369f5c397e
Reviewed-on: https://go-review.googlesource.com/18599
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2016-01-19 15:43:32 +00:00
Keith Randall
02f4d0a130 [dev.ssa] cmd/compile: start arguments as spilled
Declare a function's arguments as having already been
spilled so their use just requires a restore.

Allow spill locations to be portions of larger objects the stack.
Required to load portions of compound input arguments.

Rename the memory input to InputMem.  Use Arg for the
pre-spilled argument values.

Change-Id: I8fe2a03ffbba1022d98bfae2052b376b96d32dda
Reviewed-on: https://go-review.googlesource.com/16536
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2015-11-03 17:29:40 +00:00
Keith Randall
582baae22a [dev.ssa] cmd/compile: Do pointer arithmetic with int, not uintptr
Be more consistent about this.  There's no reason to do the
pointer arithmetic on a different type, as sizeof(int) >=
sizeof(ptr) on all of our platforms.  It simplifies our
rewrite rules also, except for a few that need duplication.

Add some more constant folding to get constant indexing and
slicing to fold down to nothing.

Change-Id: I3e56cdb14b3dc1a6a0514f0333e883f92c19e3c7
Reviewed-on: https://go-review.googlesource.com/16586
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2015-11-03 17:28:12 +00:00
Keith Randall
c24681ae2e [dev.ssa] cmd/compile: remember names of values
For debugging, spill values to named variables instead of autotmp_
variables if possible.  We do this by keeping a name -> value map
for each function, keep it up-to-date during deadcode elim, and use
it to override spill decisions in stackalloc.

It might even make stack frames a bit smaller, as it makes it easy
to identify a set of spills which are likely not to interfere.

This just works for one-word variables for now.  Strings/slices
will be a separate CL.

Change-Id: Ie89eba8cab16bcd41b311c479ec46dd7e64cdb67
Reviewed-on: https://go-review.googlesource.com/16336
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2015-10-28 17:00:31 +00:00