Commit graph

10 commits

Author SHA1 Message Date
Josh Bleecher Snyder
46b88c9fbc cmd/compile: change ssa.Type into *types.Type
When package ssa was created, Type was in package gc.
To avoid circular dependencies, we used an interface (ssa.Type)
to represent type information in SSA.

In the Go 1.9 cycle, gri extricated the Type type from package gc.
As a result, we can now use it in package ssa.
Now, instead of package types depending on package ssa,
it is the other way.
This is a more sensible dependency tree,
and helps compiler performance a bit.

Though this is a big CL, most of the changes are
mechanical and uninteresting.

Interesting bits:

* Add new singleton globals to package types for the special
  SSA types Memory, Void, Invalid, Flags, and Int128.
* Add two new Types, TSSA for the special types,
  and TTUPLE, for SSA tuple types.
  ssa.MakeTuple is now types.NewTuple.
* Move type comparison result constants CMPlt, CMPeq, and CMPgt
  to package types.
* We had picked the name "types" in our rules for the handy
  list of types provided by ssa.Config. That conflicted with
  the types package name, so change it to "typ".
* Update the type comparison routine to handle tuples and special
  types inline.
* Teach gc/fmt.go how to print special types.
* We can now eliminate ElemTypes in favor of just Elem,
  and probably also some other duplicated Type methods
  designed to return ssa.Type instead of *types.Type.
* The ssa tests were using their own dummy types,
  and they were not particularly careful about types in general.
  Of necessity, this CL switches them to use *types.Type;
  it does not make them more type-accurate.
  Unfortunately, using types.Type means initializing a bit
  of the types universe.
  This is prime for refactoring and improvement.

This shrinks ssa.Value; it now fits in a smaller size class
on 64 bit systems. This doesn't have a giant impact,
though, since most Values are preallocated in a chunk.

name        old alloc/op      new alloc/op      delta
Template         37.9MB ± 0%       37.7MB ± 0%  -0.57%  (p=0.000 n=10+8)
Unicode          28.9MB ± 0%       28.7MB ± 0%  -0.52%  (p=0.000 n=10+10)
GoTypes           110MB ± 0%        109MB ± 0%  -0.88%  (p=0.000 n=10+10)
Flate            24.7MB ± 0%       24.6MB ± 0%  -0.66%  (p=0.000 n=10+10)
GoParser         31.1MB ± 0%       30.9MB ± 0%  -0.61%  (p=0.000 n=10+9)
Reflect          73.9MB ± 0%       73.4MB ± 0%  -0.62%  (p=0.000 n=10+8)
Tar              25.8MB ± 0%       25.6MB ± 0%  -0.77%  (p=0.000 n=9+10)
XML              41.2MB ± 0%       40.9MB ± 0%  -0.80%  (p=0.000 n=10+10)
[Geo mean]       40.5MB            40.3MB       -0.68%

name        old allocs/op     new allocs/op     delta
Template           385k ± 0%         386k ± 0%    ~     (p=0.356 n=10+9)
Unicode            343k ± 1%         344k ± 0%    ~     (p=0.481 n=10+10)
GoTypes           1.16M ± 0%        1.16M ± 0%  -0.16%  (p=0.004 n=10+10)
Flate              238k ± 1%         238k ± 1%    ~     (p=0.853 n=10+10)
GoParser           320k ± 0%         320k ± 0%    ~     (p=0.720 n=10+9)
Reflect            957k ± 0%         957k ± 0%    ~     (p=0.460 n=10+8)
Tar                252k ± 0%         252k ± 0%    ~     (p=0.133 n=9+10)
XML                400k ± 0%         400k ± 0%    ~     (p=0.796 n=10+10)
[Geo mean]         428k              428k       -0.01%


Removing all the interface calls helps non-trivially with CPU, though.

name        old time/op       new time/op       delta
Template          178ms ± 4%        173ms ± 3%  -2.90%  (p=0.000 n=94+96)
Unicode          85.0ms ± 4%       83.9ms ± 4%  -1.23%  (p=0.000 n=96+96)
GoTypes           543ms ± 3%        528ms ± 3%  -2.73%  (p=0.000 n=98+96)
Flate             116ms ± 3%        113ms ± 4%  -2.34%  (p=0.000 n=96+99)
GoParser          144ms ± 3%        140ms ± 4%  -2.80%  (p=0.000 n=99+97)
Reflect           344ms ± 3%        334ms ± 4%  -3.02%  (p=0.000 n=100+99)
Tar               106ms ± 5%        103ms ± 4%  -3.30%  (p=0.000 n=98+94)
XML               198ms ± 5%        192ms ± 4%  -2.88%  (p=0.000 n=92+95)
[Geo mean]        178ms             173ms       -2.65%

name        old user-time/op  new user-time/op  delta
Template          229ms ± 5%        224ms ± 5%  -2.36%  (p=0.000 n=95+99)
Unicode           107ms ± 6%        106ms ± 5%  -1.13%  (p=0.001 n=93+95)
GoTypes           696ms ± 4%        679ms ± 4%  -2.45%  (p=0.000 n=97+99)
Flate             137ms ± 4%        134ms ± 5%  -2.66%  (p=0.000 n=99+96)
GoParser          176ms ± 5%        172ms ± 8%  -2.27%  (p=0.000 n=98+100)
Reflect           430ms ± 6%        411ms ± 5%  -4.46%  (p=0.000 n=100+92)
Tar               128ms ±13%        123ms ±13%  -4.21%  (p=0.000 n=100+100)
XML               239ms ± 6%        233ms ± 6%  -2.50%  (p=0.000 n=95+97)
[Geo mean]        220ms             213ms       -2.76%


Change-Id: I15c7d6268347f8358e75066dfdbd77db24e8d0c1
Reviewed-on: https://go-review.googlesource.com/42145
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2017-05-09 23:01:51 +00:00
Josh Bleecher Snyder
5272a2cdc5 cmd/compile: avoid infinite loops in dead blocks during phi insertion
Now that we no longer generate dead code,
it is possible to follow block predecessors
into infinite loops with no variable definitions,
causing an infinite loop during phi insertion.

To fix that, check explicitly whether the predecessor
is dead in lookupVarOutgoing, and if so, bail.

The loop in lookupVarOutgoing is very hot code,
so I am wary of adding anything to it.
However, a long, CPU-only benchmarking run shows no
performance impact at all.

Fixes #19783

Change-Id: I8ef8d267e0b20a29b5cb0fecd7084f76c6f98e47
Reviewed-on: https://go-review.googlesource.com/38913
Reviewed-by: David Chase <drchase@google.com>
2017-03-30 17:06:08 +00:00
Josh Bleecher Snyder
e00e57d67c cmd/compile: ignore all unreachable values during simple phi insertion
Simple phi insertion already had a heuristic to check
for dead blocks, namely having no predecessors.
When we stopped generating code for dead blocks,
we eliminated some values contained in more subtle
dead blocks, which confused phi insertion.
Compensate by beefing up the reachability check.

Fixes #19678

Change-Id: I0081e4a46f7ce2f69b131a34a0553874a0cb373e
Reviewed-on: https://go-review.googlesource.com/38602
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2017-03-24 18:00:15 +00:00
Josh Bleecher Snyder
61fb2f6d63 cmd/compile: speed up hot phi insertion code
This speeds up compilation of the code in #8225 by 25%-30%.
The complexity of the algorithm is unchanged,
but this shrinks the constant factor so much that it doesn't matter,
even the size of the giant type switch gets scaled up dramatically.

name       old time/op      new time/op      delta
Template        218ms ± 5%       217ms ±10%    ~           (p=0.163 n=27+30)
Unicode        98.2ms ± 6%      97.7ms ±10%    ~           (p=0.150 n=27+29)
GoTypes         654ms ± 5%       650ms ± 5%    ~           (p=0.350 n=30+30)
Compiler        2.70s ± 4%       2.68s ± 3%    ~           (p=0.128 n=30+29)

name       old user-ns/op   new user-ns/op   delta
Template   276user-ms ± 6%  271user-ms ± 7%  -1.83%        (p=0.003 n=29+28)
Unicode    138user-ms ± 5%  137user-ms ± 4%    ~           (p=0.071 n=27+27)
GoTypes    881user-ms ± 4%  877user-ms ± 4%    ~           (p=0.423 n=30+30)
Compiler   3.76user-s ± 4%  3.72user-s ± 2%  -0.84%        (p=0.028 n=30+29)

name       old alloc/op     new alloc/op     delta
Template       40.7MB ± 0%      40.7MB ± 0%    ~           (p=0.936 n=30+30)
Unicode        30.8MB ± 0%      30.8MB ± 0%    ~           (p=0.859 n=28+30)
GoTypes         123MB ± 0%       123MB ± 0%    ~           (p=0.273 n=30+30)
Compiler        472MB ± 0%       472MB ± 0%    ~           (p=0.432 n=30+30)

name       old allocs/op    new allocs/op    delta
Template         401k ± 1%        401k ± 1%    ~           (p=0.859 n=30+30)
Unicode          331k ± 0%        331k ± 1%    ~           (p=0.823 n=28+30)
GoTypes         1.24M ± 0%       1.24M ± 0%    ~           (p=0.286 n=30+30)
Compiler        4.26M ± 0%       4.26M ± 0%    ~           (p=0.359 n=30+30)

Change-Id: Ia850065a9a84c07a5b0b4e23c1758b5679498da7
Reviewed-on: https://go-review.googlesource.com/36112
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2017-02-03 05:36:22 +00:00
Josh Bleecher Snyder
1cbc5aa529 cmd/compile: insertVarPhis micro-optimization
Algorithmic improvements here are hard.
Lifting a lookup out of the loop helps a little, though.

To compile the code in #17926:

name  old s/op   new s/op   delta
Real   146 ± 3%   140 ± 4%  -3.87%  (p=0.002 n=10+10)
User   143 ± 3%   139 ± 4%  -3.08%  (p=0.005 n=10+10)
Sys   8.28 ±35%  8.08 ±28%    ~     (p=0.684 n=10+10)

Updates #17926.

Change-Id: Ic255ac8b7b409c1a53791058818b7e2cf574abe3
Reviewed-on: https://go-review.googlesource.com/33305
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2017-02-01 20:23:36 +00:00
Robert Griesemer
472c792e0a [dev.inline] cmd/internal/src: introduce compact source position representation
XPos is a compact (8 instead of 16 bytes on a 64bit machine) source
position representation. There is a 1:1 correspondence between each
XPos and each regular Pos, translated via a global table.

In some sense this brings back the LineHist, though positions can
track line and column information; there is a O(1) translation
between the representations (no binary search), and the translation
is factored out.

The size increase with the prior change is brought down again and
the compiler speed is in line with the master repo (measured on
the same "quiet" machine as for prior change):

name       old time/op     new time/op     delta
Template       256ms ± 1%      262ms ± 2%    ~             (p=0.063 n=5+4)
Unicode        132ms ± 1%      135ms ± 2%    ~             (p=0.063 n=5+4)
GoTypes        891ms ± 1%      871ms ± 1%  -2.28%          (p=0.016 n=5+4)
Compiler       3.84s ± 2%      3.89s ± 2%    ~             (p=0.413 n=5+4)
MakeBash       47.1s ± 1%      46.2s ± 2%    ~             (p=0.095 n=5+5)

name       old user-ns/op  new user-ns/op  delta
Template        309M ± 1%       314M ± 2%    ~             (p=0.111 n=5+4)
Unicode         165M ± 1%       172M ± 9%    ~             (p=0.151 n=5+5)
GoTypes        1.14G ± 2%      1.12G ± 1%    ~             (p=0.063 n=5+4)
Compiler       5.00G ± 1%      4.96G ± 1%    ~             (p=0.286 n=5+4)

Change-Id: Icc570cc60ab014d8d9af6976f1f961ab8828cc47
Reviewed-on: https://go-review.googlesource.com/34506
Run-TryBot: Robert Griesemer <gri@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-01-09 22:43:22 +00:00
Robert Griesemer
cfd17f51c8 [dev.inline] cmd/compile/internal/ssa: rename various fields from Line to Pos
This is a mostly mechanical rename followed by manual fixes where necessary.

Change-Id: Ie5c670b133db978f15dc03e50dc2da0c80fc8842
Reviewed-on: https://go-review.googlesource.com/34137
Reviewed-by: David Lazar <lazard@golang.org>
2016-12-08 21:36:52 +00:00
Robert Griesemer
24597c080b [dev.inline] cmd/compile: introduce cmd/internal/src.Pos type for line numbers
This is a step toward chosing a different position representation.
By introducing an explicit type, it will be easier to make the
transition step-wise while ensuring everything keeps running.

This has been reviewed via https://go-review.googlesource.com/#/c/34025/.

Change-Id: Ibceddcd62d8f346321ac3250e3940e9c436ed684
Reviewed-on: https://go-review.googlesource.com/34132
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Lazar <lazard@golang.org>
2016-12-08 21:26:25 +00:00
Andrew Pogrebnoy
433be563b6 cmd/compile: fix choice of phi building algorithm
The algorithm for placing a phi nodes in small functions now
unreachable. This patch fix that.

Change-Id: I253d745b414fa12ee0719459c28e78a69c6861ae
Reviewed-on: https://go-review.googlesource.com/30106
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2016-10-07 19:44:07 +00:00
Keith Randall
5a6e511c61 cmd/compile: Use Sreedhar+Gao phi building algorithm
Should be more asymptotically happy.

We process each variable in turn to find all the
locations where it needs a phi (the dominance frontier
of all of its definitions).  Then we add all those phis.
This takes O(n * #variables), although hopefully much less.

Then we do a single tree walk to match all the
FwdRefs with the nearest definition or phi.
This takes O(n) time.

The one remaining inefficiency is that we might end up
introducing a bunch of dead phis in the first step.
A TODO is to introduce phis only where they might be
used by a read.

The old algorithm is still faster on small functions,
so there's a cutover size (currently 500 blocks).

This algorithm supercedes the David's sparse phi
placement algorithm for large functions.

Lowers compile time of example from #14934 from
~10 sec to ~4 sec.
Lowers compile time of example from #16361 from
~4.5 sec to ~3 sec.
Lowers #16407 from ~20 min to ~30 sec.

Update #14934
Update #16361
Fixes #16407

Change-Id: I1cff6364e1623c143190b6a924d7599e309db58f
Reviewed-on: https://go-review.googlesource.com/30163
Reviewed-by: David Chase <drchase@google.com>
2016-10-03 20:30:08 +00:00