Commit graph

136 commits

Author SHA1 Message Date
Giovanni Bajo
7ec25d0acf cmd/compile: implement loop BCE in prove
Reuse findIndVar to discover induction variables, and then
register the facts we know about them into the facts table
when entering the loop block.

Moreover, handle "x+delta > w" while updating the facts table,
to be able to prove accesses to slices with constant offsets
such as slice[i-10].

Change-Id: I2a63d050ed58258136d54712ac7015b25c893d71
Reviewed-on: https://go-review.googlesource.com/104038
Run-TryBot: Giovanni Bajo <rasky@develer.com>
Reviewed-by: David Chase <drchase@google.com>
2018-04-29 09:37:35 +00:00
Giovanni Bajo
29162ec9a7 cmd/compile: in prove, infer unsigned relations while branching
When a branch is followed, we apply the relation as described
in the domain relation table. In case the relation is in the
positive domain, we can also infer an unsigned relation if,
by that point, we know that both operands are non-negative.

Fixes #20393

Change-Id: Ieaf0c81558b36d96616abae3eb834c788dd278d5
Reviewed-on: https://go-review.googlesource.com/100278
Run-TryBot: Giovanni Bajo <rasky@develer.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Giovanni Bajo <rasky@develer.com>
Reviewed-by: David Chase <drchase@google.com>
2018-04-29 09:37:15 +00:00
Giovanni Bajo
5c40210987 cmd/compile: in prove, add transitive closure of relations
Implement it through a partial order datastructure, which
keeps the relations between SSA values in a forest of DAGs
and is able to discover contradictions.

In make.bash, this patch is able to prove hundreds of conditions
which were not proved before.

Compilebench:

name       old time/op       new time/op       delta
Template         371ms ± 2%        368ms ± 1%    ~     (p=0.222 n=5+5)
Unicode          203ms ± 6%        199ms ± 3%    ~     (p=0.421 n=5+5)
GoTypes          1.17s ± 4%        1.18s ± 1%    ~     (p=0.151 n=5+5)
Compiler         5.54s ± 2%        5.59s ± 1%    ~     (p=0.548 n=5+5)
SSA              12.9s ± 2%        13.2s ± 1%  +2.96%  (p=0.032 n=5+5)
Flate            245ms ± 2%        247ms ± 3%    ~     (p=0.690 n=5+5)
GoParser         302ms ± 6%        302ms ± 4%    ~     (p=0.548 n=5+5)
Reflect          764ms ± 4%        773ms ± 3%    ~     (p=0.095 n=5+5)
Tar              354ms ± 6%        361ms ± 3%    ~     (p=0.222 n=5+5)
XML              434ms ± 3%        429ms ± 1%    ~     (p=0.421 n=5+5)
StdCmd           22.6s ± 1%        22.9s ± 1%  +1.40%  (p=0.032 n=5+5)

name       old user-time/op  new user-time/op  delta
Template         436ms ± 8%        426ms ± 5%    ~     (p=0.579 n=5+5)
Unicode          219ms ±15%        219ms ±12%    ~     (p=1.000 n=5+5)
GoTypes          1.47s ± 6%        1.53s ± 6%    ~     (p=0.222 n=5+5)
Compiler         7.26s ± 4%        7.40s ± 2%    ~     (p=0.389 n=5+5)
SSA              17.7s ± 4%        18.5s ± 4%  +4.13%  (p=0.032 n=5+5)
Flate            257ms ± 5%        268ms ± 9%    ~     (p=0.333 n=5+5)
GoParser         354ms ± 6%        348ms ± 6%    ~     (p=0.913 n=5+5)
Reflect          904ms ± 2%        944ms ± 4%    ~     (p=0.056 n=5+5)
Tar              398ms ±11%        430ms ± 7%    ~     (p=0.079 n=5+5)
XML              501ms ± 7%        489ms ± 5%    ~     (p=0.444 n=5+5)

name       old text-bytes    new text-bytes    delta
HelloSize        670kB ± 0%        670kB ± 0%  +0.00%  (p=0.008 n=5+5)
CmdGoSize       7.22MB ± 0%       7.21MB ± 0%  -0.07%  (p=0.008 n=5+5)

name       old data-bytes    new data-bytes    delta
HelloSize       9.88kB ± 0%       9.88kB ± 0%    ~     (all equal)
CmdGoSize        248kB ± 0%        248kB ± 0%  -0.06%  (p=0.008 n=5+5)

name       old bss-bytes     new bss-bytes     delta
HelloSize        125kB ± 0%        125kB ± 0%    ~     (all equal)
CmdGoSize        145kB ± 0%        144kB ± 0%  -0.20%  (p=0.008 n=5+5)

name       old exe-bytes     new exe-bytes     delta
HelloSize       1.43MB ± 0%       1.43MB ± 0%    ~     (all equal)
CmdGoSize       14.5MB ± 0%       14.5MB ± 0%  -0.06%  (p=0.008 n=5+5)

Fixes #19714
Updates #20393

Change-Id: Ia090f5b5dc1bcd274ba8a39b233c1e1ace1b330e
Reviewed-on: https://go-review.googlesource.com/100277
Run-TryBot: Giovanni Bajo <rasky@develer.com>
Reviewed-by: David Chase <drchase@google.com>
2018-04-29 09:35:39 +00:00
Josh Bleecher Snyder
b9785fc844 cmd/compile: log Ctz non-zero proofs
I forgot this in CL 109358.

Change-Id: Ia5e8bd9cf43393f098b101a0d6a0c526e3e4f101
Reviewed-on: https://go-review.googlesource.com/109775
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-04-27 04:49:09 +00:00
Josh Bleecher Snyder
d9a50a6531 cmd/compile: use prove pass to detect Ctz of non-zero values
On amd64, Ctz must include special handling of zeros.
But the prove pass has enough information to detect whether the input
is non-zero, allowing a more efficient lowering.

Introduce new CtzNonZero ops to capture and use this information.

Benchmark code:

func BenchmarkVisitBits(b *testing.B) {
	b.Run("8", func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			x := uint8(0xff)
			for x != 0 {
				sink = bits.TrailingZeros8(x)
				x &= x - 1
			}
		}
	})

    // and similarly so for 16, 32, 64
}

name            old time/op  new time/op  delta
VisitBits/8-8   7.27ns ± 4%  5.58ns ± 4%  -23.35%  (p=0.000 n=28+26)
VisitBits/16-8  14.7ns ± 7%  10.5ns ± 4%  -28.43%  (p=0.000 n=30+28)
VisitBits/32-8  27.6ns ± 8%  19.3ns ± 3%  -30.14%  (p=0.000 n=30+26)
VisitBits/64-8  44.0ns ±11%  38.0ns ± 5%  -13.48%  (p=0.000 n=30+30)

Fixes #25077

Change-Id: Ie6e5bd86baf39ee8a4ca7cadcf56d934e047f957
Reviewed-on: https://go-review.googlesource.com/109358
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-04-26 18:22:28 +00:00
Giovanni Bajo
ac43de3ae5 cmd/compile: in prove, complete support for OpIsInBounds/OpIsSliceInBounds
The logic in addBranchRestrictions didn't allow to correctly
model OpIs(Slice)Bound for signed domain, and it was also partly
implemented within addRestrictions.

Thanks to the previous changes, it is now possible to handle
the negative conditions correctly, so that we can learn
both signed/LT + unsigned/LT on the positive side, and
signed/GE + unsigned/GE on the negative side (but only if
the index can be proved to be non-negative).

This is able to prove ~50 more slice accesses in std+cmd.

Change-Id: I9858080dc03b16f85993a55983dbc4b00f8491b0
Reviewed-on: https://go-review.googlesource.com/104037
Run-TryBot: Giovanni Bajo <rasky@develer.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2018-04-03 20:25:34 +00:00
Giovanni Bajo
b846edfd59 cmd/compile: in prove, make addRestrictions more generic
addRestrictions was taking a branch parameter, binding its logic
to that of addBranchRestrictions. Since we will need to use it
for updating the facts table for induction variables, refactor it
to remove the branch parameter.

Passes toolstash -cmp.

Change-Id: Iaaec350a8becd1919d03d8574ffd1bbbd906d068
Reviewed-on: https://go-review.googlesource.com/104036
Run-TryBot: Giovanni Bajo <rasky@develer.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2018-04-03 20:25:17 +00:00
Giovanni Bajo
321bd8c93b cmd/compile: in prove, simplify logic of branch pushing
prove used a complex logic when trying to prove branch conditions:
tryPushBranch() was sometimes leaving a checkpoint on the factsTable,
sometimes not, and the caller was supposed to check the return value
to know what to do.

Since we're going to make the prove descend logic a little bit more
complex by adding also induction variables, simplify the tryPushBranch
logic, by removing any factsTable checkpoint handling from it.

Passes toolstash -cmp.

Change-Id: Idfb1703df8a455f612f93158328b36c461560781
Reviewed-on: https://go-review.googlesource.com/104035
Run-TryBot: Giovanni Bajo <rasky@develer.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2018-04-03 11:14:26 +00:00
Austin Clements
2d8181e7b5 cmd/compile: clarify unsigned interpretation of AuxInt
The way Value.AuxInt represents unsigned numbers is currently
documented in genericOps.go, which is not the most obvious place for
it. Move that documentation to Value.AuxInt. Furthermore, to make it
harder to use incorrectly, introduce a Value.AuxUnsigned accessor that
returns the zero-extended value of Value.AuxInt.

Change-Id: I85030c3c68761404058a430e0b1c7464591b2f42
Reviewed-on: https://go-review.googlesource.com/102597
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-03-26 19:08:24 +00:00
Giovanni Bajo
d54902ece9 cmd/compile: in prove, shortcircuit self-facts
Sometimes, we can end up calling update with a self-relation
about a variable (x REL x). In this case, there is no need
to record anything: the relation is unsatisfiable if and only
if it doesn't contain eq.

This also helps avoiding infinite loop in next CL that will
introduce transitive closure of relations.

Passes toolstash -cmp.

Change-Id: Ic408452ec1c13653f22ada35466ec98bc14aaa8e
Reviewed-on: https://go-review.googlesource.com/100276
Reviewed-by: Austin Clements <austin@google.com>
2018-03-24 03:06:21 +00:00
Giovanni Bajo
385d936fb2 cmd/compile: in prove, fail fast when unsat is found
When an unsatisfiable relation is recorded in the facts table,
there is no need to compute further relations or updates
additional data structures.

Since we're about to transitively propagate relations, make
sure to fail as fast as possible to avoid doing useless work
in dead branches.

Passes toolstash -cmp.

Change-Id: I23eed376d62776824c33088163c7ac9620abce85
Reviewed-on: https://go-review.googlesource.com/100275
Reviewed-by: Austin Clements <austin@google.com>
2018-03-24 03:06:01 +00:00
Austin Clements
da022da900 cmd/compile: simplify OpSlicemask optimization
The previous CL introduced isConstDelta. Use it to simplify the
OpSlicemask optimization in the prove pass. This passes toolstash
-cmp.

Change-Id: If2aa762db4cdc0cd1c581a536340530a9831081b
Reviewed-on: https://go-review.googlesource.com/87481
Reviewed-by: Keith Randall <khr@golang.org>
2018-03-08 22:25:29 +00:00
Austin Clements
6436270dad cmd/compile: add fence-post implications to prove
This adds four new deductions to the prove pass, all related to adding
or subtracting one from a value. This is the first hint of actual
arithmetic relations in the prove pass.

The most effective of these is

   x-1 >= w && x > min  ⇒  x > w

This helps eliminate bounds checks in code like

  if x > 0 {
    // do something with s[x-1]
  }

Altogether, these deductions prove an additional 260 branches in std
and cmd. Furthermore, they will let us eliminate some tricky
compiler-inserted panics in the runtime that are interfering with
static analysis.

Fixes #23354.

Change-Id: I7088223e0e0cd6ff062a75c127eb4bb60e6dce02
Reviewed-on: https://go-review.googlesource.com/87480
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Alexandru Moșoi <alexandru@mosoi.ro>
2018-03-08 22:25:28 +00:00
Austin Clements
941fc129e2 cmd/compile: derive unsigned limits from signed limits in prove
This adds a few simple deductions to the prove pass' fact table to
derive unsigned concrete limits from signed concrete limits where
possible.

This tweak lets the pass prove 70 additional branch conditions in std
and cmd.

This is based on a comment from the recently-deleted factsTable.get:
"// TODO: also use signed data if lim.min >= 0".

Change-Id: Ib4340249e7733070f004a0aa31254adf5df8a392
Reviewed-on: https://go-review.googlesource.com/87479
Reviewed-by: Alexandru Moșoi <alexandru@mosoi.ro>
Reviewed-by: Keith Randall <khr@golang.org>
2018-03-08 22:25:27 +00:00
Austin Clements
669db2cef5 cmd/compile: make prove pass use unsatisfiability
Currently the prove pass uses implication queries. For each block, it
collects the set of branch conditions leading to that block, and
queries this fact table for whether any of these facts imply the
block's own branch condition (or its inverse). This works remarkably
well considering it doesn't do any deduction on these facts, but it
has various downsides:

1. It requires an implementation both of adding facts to the table and
   determining implications. These are very nearly duals of each
   other, but require separate implementations. Likewise, the process
   of asserting facts of dominating branch conditions is very nearly
   the dual of the process of querying implied branch conditions.

2. It leads to less effective use of derived facts. For example, the
   prove pass currently derives facts about the relations between len
   and cap, but can't make use of these unless a branch condition is
   in the exact form of a derived fact. If one of these derived facts
   contradicts another fact, it won't notice or make use of this.

This CL changes the approach of the prove pass to instead use
*contradiction* instead of implication. Rather than ever querying a
branch condition, it simply adds branch conditions to the fact table.
If this leads to a contradiction (specifically, it makes the fact set
unsatisfiable), that branch is impossible and can be cut. As a result,

1. We can eliminate the code for determining implications
   (factsTable.get disappears entirely). Also, there is now a single
   implementation of visiting and asserting branch conditions, since
   we don't have to flip them around to treat them as facts in one
   place and queries in another.

2. Derived facts can be used effectively. It doesn't matter *why* the
   fact table is unsatisfiable; a contradiction in any of the facts is
   enough.

3. As an added benefit, it's now quite easy to avoid traversing beyond
   provably-unreachable blocks. In contrast, the current
   implementation always visits all blocks.

The prove pass already has nearly all of the mechanism necessary to
compute unsatisfiability, which means this both simplifies the code
and makes it more powerful.

The only complication is that the current implication procedure has a
hack for dealing with the 0 <= Args[0] condition of OpIsInBounds and
OpIsSliceInBounds. We replace this with asserting the appropriate fact
when we process one of these conditions. This seems much cleaner
anyway, and works because we can now take advantage of derived facts.

This has no measurable effect on compiler performance.

Effectiveness:

There is exactly one condition in all of std and cmd that this fails
to prove that the old implementation could: (int64(^uint(0)>>1) < x)
in encoding/gob. This can never be true because x is an int, and it's
basically coincidence that the old code gets this. (For example, it
fails to prove the similar (x < ^int64(^uint(0)>>1)) condition that
immediately precedes it, and even though the conditions are logically
unrelated, it wouldn't get the second one if it hadn't first processed
the first!)

It does, however, prove a few dozen additional branches. These come
from facts that are added to the fact table about the relations
between len and cap. These were almost never queried directly before,
but could lead to contradictions, which the unsat-based approach is
able to use.

There are exactly two branches in std and cmd that this implementation
proves in the *other* direction. This sounds scary, but is okay
because both occur in already-unreachable blocks, so it doesn't matter
what we chose. Because the fact table logic is sound but incomplete,
it fails to prove that the block isn't reachable, even though it is
able to prove that both outgoing branches are impossible. We could
turn these blocks into BlockExit blocks, but it doesn't seem worth the
trouble of the extra proof effort for something that happens twice in
all of std and cmd.

Tests:

This CL updates test/prove.go to change the expected messages because
it can no longer give a "reason" why it proved or disproved a
condition. It also adds a new test of a branch it couldn't prove
before.

It mostly guts test/sliceopt.go, removing everything related to slice
bounds optimizations and moving a few relevant tests to test/prove.go.
Much of this test is actually unreachable. The new prove pass figures
this out and doesn't try to prove anything about the unreachable
parts. The output on the unreachable parts is already suspect because
anything can be proved at that point, so it's really just a regression
test for an algorithm the compiler no longer uses.

This is a step toward fixing #23354. That issue is quite easy to fix
once we can use derived facts effectively.

Change-Id: Ia48a1b9ee081310579fe474e4a61857424ff8ce8
Reviewed-on: https://go-review.googlesource.com/87478
Reviewed-by: Keith Randall <khr@golang.org>
2018-03-08 22:25:25 +00:00
Austin Clements
2e9cf5f66e cmd/compile: simplify limit logic in prove
This replaces the open-coded intersection of limits in the prove pass
with a general limit intersection operation. This should get identical
results except in one case where it's more precise: when handling an
equality relation, if the value is *outside* the existing range, this
will reduce the range to empty rather than resetting it. This will be
important to a follow-up CL where we can take advantage of empty
ranges.

For #23354.

Change-Id: I3d3d75924f61b1da1cb604b3a9d189b26fb3a14e
Reviewed-on: https://go-review.googlesource.com/87477
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Alexandru Moșoi <alexandru@mosoi.ro>
2018-03-08 22:25:24 +00:00
Austin Clements
44e20b64ef cmd/compile: more String methods for prove types
These aid in debugging.

Change-Id: Ieb38c996765f780f6103f8c3292639d408c25123
Reviewed-on: https://go-review.googlesource.com/87476
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-03-08 22:25:23 +00:00
Austin Clements
491f409a32 cmd/compile: minor comment improvements/corrections
Change-Id: Ie0934f1528d58d4971cdef726d3e2d23cf3935d3
Reviewed-on: https://go-review.googlesource.com/87475
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Alexandru Moșoi <alexandru@mosoi.ro>
2018-03-08 22:25:21 +00:00
Josh Bleecher Snyder
2cdb7f118a cmd/compile: move Frontend field from ssa.Config to ssa.Func
Suggested by mdempsky in CL 38232.
This allows us to use the Frontend field
to associate frontend state and information
with a function.
See the following CL in the series for examples.

This is a giant CL, but it is almost entirely routine refactoring.

The ssa test API is starting to feel a bit unwieldy.
I will clean it up separately, once the dust has settled.

Passes toolstash -cmp.

Updates #15756

Change-Id: I71c573bd96ff7251935fce1391b06b1f133c3caf
Reviewed-on: https://go-review.googlesource.com/38327
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2017-03-17 23:18:57 +00:00
Keith Randall
73f92f9b04 cmd/compile: use len(s)<=cap(s) to remove more bounds checks
When we discover a relation x <= len(s), also discover the relation
x <= cap(s).  That way, in situations like:

a := s[x:]  // tests 0 <= x <= len(s)
b := s[:x]  // tests 0 <= x <= cap(s)

the second check can be eliminated.

Fixes #16813

Change-Id: Ifc037920b6955e43bac1a1eaf6bac63a89cfbd44
Reviewed-on: https://go-review.googlesource.com/33633
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Alexandru Moșoi <alexandru@mosoi.ro>
Reviewed-by: David Chase <drchase@google.com>
2017-02-02 17:45:58 +00:00
Robert Griesemer
cfd17f51c8 [dev.inline] cmd/compile/internal/ssa: rename various fields from Line to Pos
This is a mostly mechanical rename followed by manual fixes where necessary.

Change-Id: Ie5c670b133db978f15dc03e50dc2da0c80fc8842
Reviewed-on: https://go-review.googlesource.com/34137
Reviewed-by: David Lazar <lazard@golang.org>
2016-12-08 21:36:52 +00:00
Keith Randall
deb4177cf0 cmd/compile: use masks instead of branches for slicing
When we do

  var x []byte = ...
  y := x[i:]

We can't just use y.ptr = x.ptr + i, as the new pointer may point to the
next object in memory after the backing array.
We used to fix this by doing:

  y.cap = x.cap - i
  delta := i
  if y.cap == 0 {
    delta = 0
  }
  y.ptr = x.ptr + delta

That generates a branch in what is otherwise straight-line code.

Better to do:

  y.cap = x.cap - i
  mask := (y.cap - 1) >> 63 // -1 if y.cap==0, 0 otherwise
  y.ptr = x.ptr + i &^ mask

It's about the same number of instructions (~4, depending on what
parts are constant, and the target architecture), but it is all
inline. It plays nicely with CSE, and the mask can be computed
in parallel with the index (in cases where a multiply is required).

It is a minor win in both speed and space.

Change-Id: Ied60465a0b8abb683c02208402e5bb7ac0e8370f
Reviewed-on: https://go-review.googlesource.com/32022
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2016-10-27 20:22:49 +00:00
David Chase
0f29942489 cmd/compile: Repurpose old sliceopt.go for prove phase.
Adapt old test for prove's bounds check elimination.
Added missing rule to generic rules that lead to differences
between 32 and 64 bit platforms on sliceopt test.
Added debugging to prove.go that was helpful-to-necessary to
discover that missing rule.
Lowered debugging level on prove.go from 3 to 1; no idea
why it was previously 3.

Change-Id: I09de206aeb2fced9f2796efe2bfd4a59927eda0c
Reviewed-on: https://go-review.googlesource.com/23290
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2016-10-20 23:50:19 +00:00
Hajime Hoshi
c5368123fe cmd/compile: remove redundant function idom
Change-Id: Ib14b5421bb5e407bbd4d3cbfc68c92d3dd257cb1
Reviewed-on: https://go-review.googlesource.com/30732
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2016-10-11 16:43:12 +00:00
Keith Randall
75ce89c20d cmd/compile: cache CFG-dependent computations
We compute a lot of stuff based off the CFG: postorder traversal,
dominators, dominator tree, loop nest.  Multiple phases use this
information and we end up recomputing some of it.  Add a cache
for this information so if the CFG hasn't changed, we can reuse
the previous computation.

Change-Id: I9b5b58af06830bd120afbee9cfab395a0a2f74b2
Reviewed-on: https://go-review.googlesource.com/29356
Reviewed-by: David Chase <drchase@google.com>
2016-09-19 16:00:13 +00:00
David Chase
6b99fb5bea cmd/compile: use sparse algorithm for phis in large program
This adds a sparse method for locating nearest ancestors
in a dominator tree, and checks blocks with more than one
predecessor for differences and inserts phi functions where
there are.

Uses reversed post order to cut number of passes, running
it from first def to last use ("last use" for paramout and
mem is end-of-program; last use for a phi input from a
backedge is the source of the back edge)

Includes a cutover from old algorithm to new to avoid paying
large constant factor for small programs.  This keeps normal
builds running at about the same time, while not running
over-long on large machine-generated inputs.

Add "phase" flags for ssa/build -- ssa/build/stats prints
number of blocks, values (before and after linking references
and inserting phis, so expansion can be measured), and their
product; the product governs the cutover, where a good value
seems to be somewhere between 1 and 5 million.

Among the files compiled by make.bash, this is the shape of
the tail of the distribution for #blocks, #vars, and their
product:

	 #blocks	#vars	    product
 max	6171	28180	173,898,780
99.9%	1641	 6548	 10,401,878
  99%	 463	 1909	    873,721
  95%	 152	  639	     95,235
  90%	  84	  359	     30,021

The old algorithm is indeed usually fastest, for 99%ile
values of usually.

The fix to LookupVarOutgoing
( https://go-review.googlesource.com/#/c/22790/ )
deals with some of the same problems addressed by this CL,
but on at least one bug ( #15537 ) this change is still
a significant help.

With this CL:
/tmp/gopath$ rm -rf pkg bin
/tmp/gopath$ time go get -v -gcflags -memprofile=y.mprof \
   github.com/gogo/protobuf/test/theproto3/combos/...
...
real	4m35.200s
user	13m16.644s
sys	0m36.712s
and pprof reports 3.4GB allocated in one of the larger profiles

With tip:
/tmp/gopath$ rm -rf pkg bin
/tmp/gopath$ time go get -v -gcflags -memprofile=y.mprof \
   github.com/gogo/protobuf/test/theproto3/combos/...
...
real	10m36.569s
user	25m52.286s
sys	4m3.696s
and pprof reports 8.3GB allocated in the same larger profile

With this CL, most of the compilation time on the benchmarked
input is spent in register/stack allocation (cumulative 53%)
and in the sparse lookup algorithm itself (cumulative 20%).

Fixes #15537.

Change-Id: Ia0299dda6a291534d8b08e5f9883216ded677a00
Reviewed-on: https://go-review.googlesource.com/22342
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-05-16 21:08:05 +00:00
Keith Randall
4fa050024f cmd/compile: enable constant-time CFG editing
Provide indexes along with block pointers for Preds
and Succs arrays.  This allows us to splice edges in
and out of those arrays in constant time.

Fixes worst-case O(n^2) behavior in deadcode and fuse.

benchmark                     old ns/op      new ns/op     delta
BenchmarkFuse1-8              2065           2057          -0.39%
BenchmarkFuse10-8             9408           9073          -3.56%
BenchmarkFuse100-8            105238         76277         -27.52%
BenchmarkFuse1000-8           3982562        1026750       -74.22%
BenchmarkFuse10000-8          301220329      12824005      -95.74%
BenchmarkDeadCode1-8          1588           1566          -1.39%
BenchmarkDeadCode10-8         4333           4250          -1.92%
BenchmarkDeadCode100-8        32031          32574         +1.70%
BenchmarkDeadCode1000-8       590407         468275        -20.69%
BenchmarkDeadCode10000-8      17822890       5000818       -71.94%
BenchmarkDeadCode100000-8     1388706640     78021127      -94.38%
BenchmarkDeadCode200000-8     5372518479     168598762     -96.86%

Change-Id: Iccabdbb9343fd1c921ba07bbf673330a1c36ee17
Reviewed-on: https://go-review.googlesource.com/22589
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-05-05 15:58:59 +00:00
Alexandru Moșoi
9743e4b031 cmd/compile: share dominator tree among many passes
These passes do not modify the dominator tree too much.

% benchstat old.txt new.txt
name       old time/op     new time/op     delta
Template       335ms ± 3%      325ms ± 8%    ~             (p=0.074 n=8+9)
GoTypes        1.05s ± 1%      1.05s ± 3%    ~            (p=0.095 n=9+10)
Compiler       5.37s ± 4%      5.29s ± 1%  -1.42%         (p=0.022 n=9+10)
MakeBash       34.9s ± 3%      34.4s ± 2%    ~            (p=0.095 n=9+10)

name       old alloc/op    new alloc/op    delta
Template      55.4MB ± 0%     54.9MB ± 0%  -0.81%        (p=0.000 n=10+10)
GoTypes        179MB ± 0%      178MB ± 0%  -0.89%        (p=0.000 n=10+10)
Compiler       807MB ± 0%      798MB ± 0%  -1.10%        (p=0.000 n=10+10)

name       old allocs/op   new allocs/op   delta
Template        498k ± 0%       496k ± 0%  -0.29%          (p=0.000 n=9+9)
GoTypes        1.42M ± 0%      1.41M ± 0%  -0.24%        (p=0.000 n=10+10)
Compiler       5.61M ± 0%      5.60M ± 0%  -0.12%        (p=0.000 n=10+10)

Change-Id: I4cd20cfba3f132ebf371e16046ab14d7e42799ec
Reviewed-on: https://go-review.googlesource.com/21806
Run-TryBot: Alexandru Moșoi <alexandru@mosoi.ro>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2016-04-12 14:44:26 +00:00
Eric Engestrom
7a8caf7d43 all: fix spelling mistakes
Signed-off-by: Eric Engestrom <eric@engestrom.ch>

Change-Id: I91873aaebf79bdf1c00d38aacc1a1fb8d79656a7
Reviewed-on: https://go-review.googlesource.com/21433
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-04-03 17:03:15 +00:00
Alexandru Moșoi
27ebc84716 cmd/compile: handle non-negatives in prove
Handle this case:
if 0 <= i && i < len(a) {
        use a[i]
}

Shaves about 5k from pkg/tools/linux_amd64/*.

Change-Id: I6675ff49aa306b0d241b074c5738e448204cd981
Reviewed-on: https://go-review.googlesource.com/21431
Run-TryBot: Alexandru Moșoi <alexandru@mosoi.ro>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-04-02 20:34:38 +00:00
Keith Randall
47c9e139ae cmd/compile: extend prove pass to handle constant comparisons
Find comparisons to constants and propagate that information
down the dominator tree.  Use it to resolve other constant
comparisons on the same variable.

So if we know x >= 7, then a x > 4 condition must return true.

This change allows us to use "_ = b[7]" hints to eliminate bounds checks.

Fixes #14900

Change-Id: Idbf230bd5b7da43de3ecb48706e21cf01bf812f7
Reviewed-on: https://go-review.googlesource.com/21008
Reviewed-by: Alexandru Moșoi <alexandru@mosoi.ro>
2016-03-31 21:16:23 +00:00
Keith Randall
56e0ecc5ea cmd/compile: keep value use counts in SSA
Keep track of how many uses each Value has.  Each appearance in
Value.Args and in Block.Control counts once.

The number of uses of a value is generically useful to
constrain rewrite rules.  For instance, we might want to
prevent merging index operations into loads if the same
index expression is used lots of times.

But I have one use in particular for which the use count is required.
We must make sure we don't combine ops with loads if the load has
more than one use.  Otherwise, we may split a single load
into multiple loads and that breaks perceived behavior in
the presence of races.  In particular, the load of m.state
in sync/mutex.go:Lock can't be done twice.  (I have a separate
CL which triggers the mutex failure.  This CL has a test which
demonstrates a similar failure.)

Change-Id: Icaafa479239f48632a069d0c3f624e6ebc6b1f0e
Reviewed-on: https://go-review.googlesource.com/20790
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Todd Neal <todd@tneal.org>
2016-03-17 04:20:02 +00:00
Todd Neal
98b88de56f cmd/compile: change the type of ssa Warnl line number
Line numbers are always int32, so the Warnl function should take the
line number as an int32 as well.  This matches gc.Warnl and removes
a cast every place it's used.

Change-Id: I5d6201e640d52ec390eb7174f8fd8c438d4efe58
Reviewed-on: https://go-review.googlesource.com/20662
Run-TryBot: Todd Neal <todd@tneal.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2016-03-14 11:04:40 +00:00
Alexandru Moșoi
cd798dcb88 cmd/compile/internal/ssa: generalize prove to all booleans
* Refacts a bit saving and restoring parents restrictions
* Shaves ~100k from pkg/tools/linux_amd64,
but most of the savings come from the rewrite rules.
* Improves on the following artificial test case:
func f1(a4 bool, a6 bool) bool {
  return a6 || (a6 || (a6 || a4)) || (a6 || (a4 || a6 || (false || a6)))
}

Change-Id: I714000f75a37a3a6617c6e6834c75bd23674215f
Reviewed-on: https://go-review.googlesource.com/20306
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Alexandru Moșoi <alexandru@mosoi.ro>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-03-13 12:05:41 +00:00
David Chase
34f048c9d9 [dev.ssa] cmd/compile: small optimization to prove using sdom tweak
Exposed data already in sdom to avoid recreating it in prove.

Change-Id: I834c9c03ed8faeaee013e5a1b3f955908f0e0915
Reviewed-on: https://go-review.googlesource.com/19999
Run-TryBot: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Alexandru Moșoi <alexandru@mosoi.ro>
2016-02-28 22:28:24 +00:00
Alexandru Moșoi
bdea1d58cf [dev.ssa] cmd/compile/internal/ssa: remove proven redundant controls.
* It does very simple bounds checking elimination. E.g.
removes the second check in for i := range a { a[i]++; a[i++]; }
* Improves on the following redundant expression:
return a6 || (a6 || (a6 || a4)) || (a6 || (a4 || a6 || (false || a6)))
* Linear in the number of block edges.

I patched in CL 12960 that does bounds, nil and constant propagation
to make sure this CL is not just redundant. Size of pkg/tool/linux_amd64/*
(excluding compile which is affected by this change):

With IsInBounds and IsSliceInBounds
-this -12960 92285080
+this -12960 91947416
-this +12960 91978976
+this +12960 91923088

Gain is ~110% of 12960.

Without IsInBounds and IsSliceInBounds (older run)
-this -12960 95515512
+this -12960 95492536
-this +12960 95216920
+this +12960 95204440

Shaves 22k on its own.

* Can we handle IsInBounds better with this? In
for i := range a { a[i]++; } the bounds checking at a[i]
is not eliminated.

Change-Id: I98957427399145fb33693173fd4d5a8d71c7cc20
Reviewed-on: https://go-review.googlesource.com/19710
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Alexandru Moșoi <alexandru@mosoi.ro>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-02-28 19:48:20 +00:00