2015-03-23 17:02:11 -07:00
|
|
|
// Copyright 2015 The Go Authors. All rights reserved.
|
|
|
|
|
// Use of this source code is governed by a BSD-style
|
|
|
|
|
// license that can be found in the LICENSE file.
|
|
|
|
|
|
|
|
|
|
package ssa
|
|
|
|
|
|
cmd/compile: assign and preserve statement boundaries.
A new pass run after ssa building (before any other
optimization) identifies the "first" ssa node for each
statement. Other "noise" nodes are tagged as being never
appropriate for a statement boundary (e.g., VarKill, VarDef,
Phi).
Rewrite, deadcode, cse, and nilcheck are modified to move
the statement boundaries forward whenever possible if a
boundary-tagged ssa value is removed; never-boundary nodes
are ignored in this search (some operations involving
constants are also tagged as never-boundary and also ignored
because they are likely to be moved or removed during
optimization).
Code generation treats all nodes except those explicitly
marked as statement boundaries as "not statement" nodes,
and floats statement boundaries to the beginning of each
same-line run of instructions found within a basic block.
Line number html conversion was modified to make statement
boundary nodes a bit more obvious by prepending a "+".
The code in fuse.go that glued together the value slices
of two blocks produced a result that depended on the
former capacities (not lengths) of the two slices. This
causes differences in the 386 bootstrap, and also can
sometimes put values into an order that does a worse job
of preserving statement boundaries when values are removed.
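A minimal sketch of an order-preserving merge whose result depends only on the lengths and contents of the two slices, not on their capacities. The helper name concatInto and the use of []int in place of []*Value are assumptions made for illustration; it also assumes the two slices do not share a backing array.

```go
package main

import "fmt"

// concatInto (hypothetical helper) returns b's elements followed by c's
// elements. It reuses c's backing array when there is room, but the order of
// the result never depends on cap(b) or cap(c).
func concatInto(b, c []int) []int {
	bl, cl := len(b), len(c)
	var t []int
	if cap(c) < bl+cl {
		t = make([]int, bl+cl) // not enough room: allocate
	} else {
		t = c[:bl+cl] // enough room: extend c in place
	}
	copy(t[bl:], c) // shift c's elements right; copy tolerates overlap
	copy(t, b)      // put b's elements in front
	return t
}

func main() {
	b := []int{1, 2}
	small := []int{3, 4}        // cap 2, forces allocation
	large := make([]int, 2, 16) // cap 16, merged in place
	large[0], large[1] = 3, 4
	fmt.Println(concatInto(b, small), concatInto(b, large))
}
```

Both calls produce the same element order, which is the property the paragraph above asks for.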
Portions of two delve tests that had caught problems were
incorporated into ssa/debug_test.go. There are some
opportunities to do better with optimized code, but the
next-ing is not lying or overly jumpy.
Over 4 CLs, compilebench geomean measured binary size
increase of 3.5% and compile user time increase of 3.8%
(this is after optimization to reuse a sparse map instead
of creating multiple maps.)
This CL worsens the optimized-debugging experience with
Delve; we need to work with the delve team so that
they can use the is_stmt marks that we're emitting now.
The reference output changes from time to time depending
on other changes in the compiler, sometimes better,
sometimes worse.
This CL now includes a test ensuring that 99+% of the lines
in the Go command itself (a handy optimized binary) include
is_stmt markers.
Change-Id: I359c94e06843f1eb41f9da437bd614885aa9644a
Reviewed-on: https://go-review.googlesource.com/102435
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2018-03-23 22:46:06 -04:00
|
|
|
import (
|
|
|
|
|
"cmd/internal/src"
|
|
|
|
|
)
|
|
|
|
|
|
2018-05-27 09:03:45 -07:00
|
|
|
// fusePlain runs fuse(f, fuseTypePlain).
|
|
|
|
|
func fusePlain(f *Func) { fuse(f, fuseTypePlain) }
|
|
|
|
|
|
|
|
|
|
// fuseAll runs fuse(f, fuseTypeAll).
|
|
|
|
|
func fuseAll(f *Func) { fuse(f, fuseTypeAll) }
|
|
|
|
|
|
|
|
|
|
type fuseType uint8
|
|
|
|
|
|
|
|
|
|
const (
|
|
|
|
|
fuseTypePlain fuseType = 1 << iota
|
|
|
|
|
fuseTypeIf
|
|
|
|
|
fuseTypeAll = fuseTypePlain | fuseTypeIf
|
|
|
|
|
)
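A small standalone sketch of how these bit flags combine. The constants mirror the declaration above; the enabled helper is hypothetical and only illustrates the typ&flag != 0 test that fuse performs.

```go
package main

import "fmt"

type fuseType uint8

const (
	fuseTypePlain fuseType = 1 << iota // bit 0
	fuseTypeIf                         // bit 1
	fuseTypeAll = fuseTypePlain | fuseTypeIf
)

// enabled (hypothetical helper) reports whether flag is set in typ.
func enabled(typ, flag fuseType) bool { return typ&flag != 0 }

func main() {
	fmt.Println(enabled(fuseTypePlain, fuseTypeIf)) // plain-only run skips the If case
	fmt.Println(enabled(fuseTypeAll, fuseTypeIf))   // fuseTypeAll enables both
}
```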
|
|
|
|
|
|
2015-03-23 17:02:11 -07:00
|
|
|
// fuse simplifies control flow by joining basic blocks.
|
2018-05-27 09:03:45 -07:00
|
|
|
func fuse(f *Func, typ fuseType) {
|
2016-02-10 00:27:33 +01:00
|
|
|
for changed := true; changed; {
|
|
|
|
|
changed = false
|
cmd/compile: fuse from end to beginning
fuseBlockPlain was accidentally quadratic.
If you had plain blocks b1 -> b2 -> b3 -> b4,
each containing single values v1, v2, v3, and v4 respectively,
fuseBlockPlain would move v1 from b1 to b2 to b3 to b4,
then v2 from b2 to b3 to b4, etc.
There are two obvious fixes.
* Look for runs of blocks in fuseBlockPlain
and handle them in a single go.
* Fuse from end to beginning; any given value in a run
of blocks to fuse then moves only once.
The latter is much simpler, so that's what this CL does.
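The difference between the two visiting orders can be sketched with a toy simulation. This is not the compiler's code: blocks are modeled only by how many values they hold, and countMoves is a hypothetical name. It counts how often a value changes blocks when a straight-line chain of n single-value blocks is fused.

```go
package main

import "fmt"

// countMoves simulates fusing a chain of n blocks that each start with one
// value and returns the total number of value moves.
func countMoves(n int, fromEnd bool) int {
	sizes := make([]int, n) // sizes[i]: values currently held by block i
	succ := make([]int, n)  // succ[i]: block that block i currently flows into
	for i := range sizes {
		sizes[i] = 1
		succ[i] = i + 1
	}
	moves := 0
	fuseInto := func(b int) {
		c := succ[b]
		moves += sizes[b] // every value in b is moved to c
		sizes[c] += sizes[b]
		sizes[b] = 0
		if b > 0 {
			succ[b-1] = c // b's predecessor now flows straight into c
		}
	}
	if fromEnd {
		for b := n - 2; b >= 0; b-- {
			fuseInto(b)
		}
	} else {
		for b := 0; b < n-1; b++ {
			fuseInto(b)
		}
	}
	return moves
}

func main() {
	fmt.Println(countMoves(4, false), countMoves(4, true)) // prints "6 3"
}
```

For the four-block example in the message, visiting from the beginning costs 3+2+1 = 6 moves, while visiting from the end moves each of the three non-final values exactly once.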
Somewhat surprisingly, this change does not pass toolstash-check.
The resulting set of blocks is the same,
and the values in them are the same,
but the order of values in them differs,
and that order of values (while arbitrary)
is enough to change the compiler's output.
This may be due to #20178; deadstore is the next pass after fuse.
Adding basic sorting to the beginning of deadstore
is enough to make this CL pass toolstash-check:
for _, b := range f.Blocks {
obj.SortSlice(b.Values, func(i, j int) bool { return b.Values[i].ID < b.Values[j].ID })
}
Happily, this CL appears to result in better code on average,
if only by accident. It cuts 4k off of cmd/go; go1 benchmarks
are noisy as always but don't regress (numbers below).
No impact on the standard compilebench benchmarks.
For the code in #13554, this speeds up compilation dramatically:
name old time/op new time/op delta
Pkg 53.1s ± 2% 12.8s ± 3% -75.92% (p=0.008 n=5+5)
name old user-time/op new user-time/op delta
Pkg 55.0s ± 2% 14.9s ± 3% -73.00% (p=0.008 n=5+5)
name old alloc/op new alloc/op delta
Pkg 2.04GB ± 0% 2.04GB ± 0% +0.18% (p=0.008 n=5+5)
name old allocs/op new allocs/op delta
Pkg 6.21M ± 0% 6.21M ± 0% ~ (p=0.222 n=5+5)
name old object-bytes new object-bytes delta
Pkg 28.4M ± 0% 28.4M ± 0% +0.00% (p=0.008 n=5+5)
name old export-bytes new export-bytes delta
Pkg 208 ± 0% 208 ± 0% ~ (all equal)
Updates #13554
go1 benchmarks:
name old time/op new time/op delta
BinaryTree17-8 2.29s ± 2% 2.26s ± 2% -1.43% (p=0.000 n=48+50)
Fannkuch11-8 2.74s ± 2% 2.79s ± 2% +1.63% (p=0.000 n=50+49)
FmtFprintfEmpty-8 36.6ns ± 3% 34.6ns ± 4% -5.29% (p=0.000 n=49+50)
FmtFprintfString-8 58.3ns ± 3% 59.1ns ± 3% +1.35% (p=0.000 n=50+49)
FmtFprintfInt-8 62.4ns ± 2% 63.2ns ± 3% +1.19% (p=0.000 n=49+49)
FmtFprintfIntInt-8 95.1ns ± 2% 96.7ns ± 3% +1.61% (p=0.000 n=49+50)
FmtFprintfPrefixedInt-8 118ns ± 3% 113ns ± 2% -4.00% (p=0.000 n=50+49)
FmtFprintfFloat-8 191ns ± 2% 192ns ± 2% +0.40% (p=0.034 n=50+50)
FmtManyArgs-8 419ns ± 2% 420ns ± 2% ~ (p=0.228 n=49+49)
GobDecode-8 5.26ms ± 3% 5.19ms ± 2% -1.33% (p=0.000 n=50+49)
GobEncode-8 4.12ms ± 2% 4.15ms ± 3% +0.68% (p=0.007 n=49+50)
Gzip-8 198ms ± 2% 197ms ± 2% -0.50% (p=0.018 n=48+48)
Gunzip-8 31.9ms ± 3% 31.8ms ± 3% -0.47% (p=0.024 n=50+50)
HTTPClientServer-8 64.4µs ± 0% 64.0µs ± 0% -0.55% (p=0.000 n=43+46)
JSONEncode-8 10.6ms ± 2% 10.6ms ± 3% ~ (p=0.543 n=49+49)
JSONDecode-8 43.3ms ± 3% 43.1ms ± 2% ~ (p=0.079 n=50+50)
Mandelbrot200-8 3.70ms ± 2% 3.70ms ± 2% ~ (p=0.553 n=47+50)
GoParse-8 2.70ms ± 2% 2.71ms ± 3% ~ (p=0.843 n=49+50)
RegexpMatchEasy0_32-8 70.5ns ± 4% 70.4ns ± 4% ~ (p=0.867 n=48+50)
RegexpMatchEasy0_1K-8 162ns ± 3% 162ns ± 2% ~ (p=0.739 n=48+48)
RegexpMatchEasy1_32-8 66.1ns ± 5% 66.2ns ± 4% ~ (p=0.970 n=50+50)
RegexpMatchEasy1_1K-8 297ns ± 7% 296ns ± 7% ~ (p=0.406 n=50+50)
RegexpMatchMedium_32-8 105ns ± 5% 105ns ± 5% ~ (p=0.702 n=50+50)
RegexpMatchMedium_1K-8 32.3µs ± 4% 32.2µs ± 3% ~ (p=0.614 n=49+49)
RegexpMatchHard_32-8 1.75µs ±18% 1.74µs ±12% ~ (p=0.738 n=50+48)
RegexpMatchHard_1K-8 52.2µs ±14% 51.3µs ±13% ~ (p=0.230 n=50+50)
Revcomp-8 366ms ± 3% 367ms ± 3% ~ (p=0.745 n=49+49)
Template-8 48.5ms ± 4% 48.5ms ± 4% ~ (p=0.824 n=50+48)
TimeParse-8 263ns ± 2% 256ns ± 2% -2.98% (p=0.000 n=48+49)
TimeFormat-8 265ns ± 3% 262ns ± 3% -1.35% (p=0.000 n=48+49)
[Geo mean] 41.1µs 40.9µs -0.48%
Change-Id: Ib35fa15b54282abb39c077d150beee27f610891a
Reviewed-on: https://go-review.googlesource.com/43570
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
2017-05-16 21:57:18 -07:00
|
|
|
// Fuse from end to beginning, to avoid quadratic behavior in fuseBlockPlain. See issue 13554.
|
|
|
|
|
for i := len(f.Blocks) - 1; i >= 0; i-- {
|
|
|
|
|
b := f.Blocks[i]
|
2018-05-27 09:03:45 -07:00
|
|
|
if typ&fuseTypeIf != 0 {
|
|
|
|
|
changed = fuseBlockIf(b) || changed
|
|
|
|
|
}
|
|
|
|
|
if typ&fuseTypePlain != 0 {
|
|
|
|
|
changed = fuseBlockPlain(b) || changed
|
|
|
|
|
}
|
2015-03-23 17:02:11 -07:00
|
|
|
}
|
2019-05-14 14:46:15 -07:00
|
|
|
if changed {
|
|
|
|
|
f.invalidateCFG()
|
|
|
|
|
}
|
2016-02-10 00:27:33 +01:00
|
|
|
}
|
|
|
|
|
}
|
2015-03-23 17:02:11 -07:00
|
|
|
|
2016-02-10 00:27:33 +01:00
|
|
|
// fuseBlockIf handles the following cases where s0 and s1 are empty blocks.
|
|
|
|
|
//
|
2016-02-17 17:21:53 +01:00
|
|
|
// b b b b
|
|
|
|
|
// / \ | \ / | | |
|
|
|
|
|
// s0 s1 | s1 s0 | | |
|
|
|
|
|
// \ / | / \ | | |
|
|
|
|
|
// ss ss ss ss
|
2016-02-10 00:27:33 +01:00
|
|
|
//
|
2016-02-17 17:21:53 +01:00
|
|
|
// If all Phi ops in ss have identical variables for slots corresponding to
|
|
|
|
|
// s0, s1 and b then the branch can be dropped.
|
2016-04-28 16:52:47 -07:00
|
|
|
// This optimization often comes up in switch statements with multiple
|
|
|
|
|
// expressions in a case clause:
|
|
|
|
|
// switch n {
|
|
|
|
|
// case 1,2,3: return 4
|
|
|
|
|
// }
|
2016-02-17 17:21:53 +01:00
|
|
|
// TODO: If ss doesn't contain any OpPhis, are s0 and s1 dead code anyway?
|
2016-02-10 00:27:33 +01:00
|
|
|
func fuseBlockIf(b *Block) bool {
|
|
|
|
|
if b.Kind != BlockIf {
|
|
|
|
|
return false
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
var ss0, ss1 *Block
|
2016-04-28 16:52:47 -07:00
|
|
|
s0 := b.Succs[0].b
|
|
|
|
|
i0 := b.Succs[0].i
|
2019-05-14 10:11:23 -07:00
|
|
|
if s0.Kind != BlockPlain || len(s0.Preds) != 1 || !isEmpty(s0) {
|
2016-02-17 17:21:53 +01:00
|
|
|
s0, ss0 = b, s0
|
2016-02-10 00:27:33 +01:00
|
|
|
} else {
|
2016-04-28 16:52:47 -07:00
|
|
|
ss0 = s0.Succs[0].b
|
|
|
|
|
i0 = s0.Succs[0].i
|
2016-02-10 00:27:33 +01:00
|
|
|
}
|
2016-04-28 16:52:47 -07:00
|
|
|
s1 := b.Succs[1].b
|
|
|
|
|
i1 := b.Succs[1].i
|
2019-05-14 10:11:23 -07:00
|
|
|
if s1.Kind != BlockPlain || len(s1.Preds) != 1 || !isEmpty(s1) {
|
2016-02-17 17:21:53 +01:00
|
|
|
s1, ss1 = b, s1
|
2016-02-10 00:27:33 +01:00
|
|
|
} else {
|
2016-04-28 16:52:47 -07:00
|
|
|
ss1 = s1.Succs[0].b
|
|
|
|
|
i1 = s1.Succs[0].i
|
2016-02-10 00:27:33 +01:00
|
|
|
}
|
2015-03-23 17:02:11 -07:00
|
|
|
|
2016-02-10 00:27:33 +01:00
|
|
|
if ss0 != ss1 {
|
|
|
|
|
return false
|
|
|
|
|
}
|
|
|
|
|
ss := ss0
|
|
|
|
|
|
2016-02-17 17:21:53 +01:00
|
|
|
// s0 and s1 are equal to b if the corresponding block is missing
|
|
|
|
|
// (2nd, 3rd and 4th case in the figure).
|
2016-04-28 16:52:47 -07:00
|
|
|
|
2016-02-10 00:27:33 +01:00
|
|
|
for _, v := range ss.Values {
|
2016-03-30 16:19:10 +02:00
|
|
|
if v.Op == OpPhi && v.Uses > 0 && v.Args[i0] != v.Args[i1] {
|
2016-02-10 00:27:33 +01:00
|
|
|
return false
|
2016-01-28 15:54:45 -08:00
|
|
|
}
|
2016-02-10 00:27:33 +01:00
|
|
|
}
|
|
|
|
|
|
2016-02-17 17:21:53 +01:00
|
|
|
// Now we have two of the following: b->ss, b->s0->ss and b->s1->ss,
|
2016-02-10 00:27:33 +01:00
|
|
|
// with s0 and s1 empty if they exist.
|
2016-02-17 17:21:53 +01:00
|
|
|
// We can replace them with a single edge b->ss if all OpPhis in ss
|
|
|
|
|
// have identical predecessors (verified above).
|
2016-02-10 00:27:33 +01:00
|
|
|
// No critical edge is introduced because b will have one successor.
|
2016-02-17 17:21:53 +01:00
|
|
|
if s0 != b && s1 != b {
|
2016-04-28 16:52:47 -07:00
|
|
|
// Replace edge b->s0->ss with b->ss.
|
2016-02-17 17:21:53 +01:00
|
|
|
// We need to keep a slot for Phis corresponding to b.
|
2016-04-28 16:52:47 -07:00
|
|
|
b.Succs[0] = Edge{ss, i0}
|
|
|
|
|
ss.Preds[i0] = Edge{b, 0}
|
|
|
|
|
b.removeEdge(1)
|
|
|
|
|
s1.removeEdge(0)
|
2016-02-17 17:21:53 +01:00
|
|
|
} else if s0 != b {
|
2016-04-28 16:52:47 -07:00
|
|
|
b.removeEdge(0)
|
|
|
|
|
s0.removeEdge(0)
|
2016-02-17 17:21:53 +01:00
|
|
|
} else if s1 != b {
|
2016-04-28 16:52:47 -07:00
|
|
|
b.removeEdge(1)
|
|
|
|
|
s1.removeEdge(0)
|
|
|
|
|
} else {
|
|
|
|
|
b.removeEdge(1)
|
2016-02-10 00:27:33 +01:00
|
|
|
}
|
|
|
|
|
b.Kind = BlockPlain
|
2018-01-22 09:43:27 -08:00
|
|
|
b.Likely = BranchUnknown
|
2019-08-12 20:19:58 +01:00
|
|
|
b.ResetControls()
|
2016-02-10 00:27:33 +01:00
|
|
|
|
2019-05-14 10:11:23 -07:00
|
|
|
// Trash the empty blocks s0 and s1.
|
|
|
|
|
blocks := [...]*Block{s0, s1}
|
|
|
|
|
for _, s := range &blocks {
|
|
|
|
|
if s == b {
|
|
|
|
|
continue
|
|
|
|
|
}
|
|
|
|
|
// Move any (dead) values in s0 or s1 to b,
|
|
|
|
|
// where they will be eliminated by the next deadcode pass.
|
|
|
|
|
for _, v := range s.Values {
|
|
|
|
|
v.Block = b
|
|
|
|
|
}
|
|
|
|
|
b.Values = append(b.Values, s.Values...)
|
|
|
|
|
// Clear s.
|
|
|
|
|
s.Kind = BlockInvalid
|
|
|
|
|
s.Values = nil
|
|
|
|
|
s.Succs = nil
|
|
|
|
|
s.Preds = nil
|
2016-02-10 00:27:33 +01:00
|
|
|
}
|
2019-05-14 10:11:23 -07:00
|
|
|
return true
|
|
|
|
|
}
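The Phi condition that fuseBlockIf checks before dropping a branch can be sketched in isolation. The phi type and canDropBranch below are simplified stand-ins invented for this illustration, with string names in place of *Value arguments; i0 and i1 are the predecessor slots of the two paths that reach ss.

```go
package main

import "fmt"

// phi is a hypothetical, simplified stand-in for an OpPhi value.
type phi struct {
	uses int      // number of uses; unused phis are ignored
	args []string // one argument per predecessor slot of the block
}

// canDropBranch reports whether every live phi receives the same argument
// along both paths, so that the choice of path cannot be observed.
func canDropBranch(phis []phi, i0, i1 int) bool {
	for _, p := range phis {
		if p.uses > 0 && p.args[i0] != p.args[i1] {
			return false
		}
	}
	return true
}

func main() {
	same := []phi{{uses: 1, args: []string{"x", "x"}}}
	diff := []phi{{uses: 1, args: []string{"x", "y"}}}
	dead := []phi{{uses: 0, args: []string{"x", "y"}}}
	fmt.Println(canDropBranch(same, 0, 1), canDropBranch(diff, 0, 1), canDropBranch(dead, 0, 1))
}
```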
|
|
|
|
|
|
|
|
|
|
// isEmpty reports whether b contains no live values.
// It is conservative: it may return false for a block that is in fact empty.
|
|
|
|
|
func isEmpty(b *Block) bool {
|
|
|
|
|
for _, v := range b.Values {
|
2019-12-05 18:56:54 -05:00
|
|
|
if v.Uses > 0 || v.Op.IsCall() || v.Op.HasSideEffects() || v.Type.IsVoid() {
|
2019-05-14 10:11:23 -07:00
|
|
|
return false
|
|
|
|
|
}
|
2016-02-10 00:27:33 +01:00
|
|
|
}
|
|
|
|
|
return true
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
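// fuseBlockPlain merges the plain block b into its successor c
// when b is c's only predecessor. It reports whether a merge happened.
func fuseBlockPlain(b *Block) bool {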
|
|
|
|
|
if b.Kind != BlockPlain {
|
|
|
|
|
return false
|
|
|
|
|
}
|
|
|
|
|
|
2016-04-28 16:52:47 -07:00
|
|
|
c := b.Succs[0].b
|
2016-02-10 00:27:33 +01:00
|
|
|
if len(c.Preds) != 1 {
|
|
|
|
|
return false
|
|
|
|
|
}
|
|
|
|
|
|
2018-03-23 22:46:06 -04:00
|
|
|
// If a block happened to end in a statement marker,
|
|
|
|
|
// try to preserve it.
|
|
|
|
|
if b.Pos.IsStmt() == src.PosIsStmt {
|
|
|
|
|
l := b.Pos.Line()
|
|
|
|
|
for _, v := range c.Values {
|
|
|
|
|
if v.Pos.IsStmt() == src.PosNotStmt {
|
|
|
|
|
continue
|
|
|
|
|
}
|
|
|
|
|
if l == v.Pos.Line() {
|
|
|
|
|
v.Pos = v.Pos.WithIsStmt()
|
|
|
|
|
l = 0
|
|
|
|
|
break
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
if l != 0 && c.Pos.Line() == l {
|
|
|
|
|
c.Pos = c.Pos.WithIsStmt()
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
|
2016-04-28 16:52:47 -07:00
|
|
|
// move all of b's values to c.
|
2016-02-10 00:27:33 +01:00
|
|
|
for _, v := range b.Values {
|
|
|
|
|
v.Block = c
|
2017-01-11 13:58:20 -08:00
|
|
|
}
|
|
|
|
|
// Use whichever value slice is larger, in the hopes of avoiding growth.
|
|
|
|
|
// However, take care to avoid c.Values pointing to b.valstorage.
|
|
|
|
|
// See golang.org/issue/18602.
|
2018-03-23 22:46:06 -04:00
|
|
|
// It's important to keep the elements in the same order; maintenance of
|
|
|
|
|
// debugging information depends on the order of *Values in Blocks.
|
|
|
|
|
// This can also cause changes in the order (which may affect other
|
|
|
|
|
// optimizations and possibly compiler output) for 32-vs-64 bit compilation
|
2018-05-18 17:43:11 -04:00
|
|
|
// platforms (word size affects allocation bucket size affects slice capacity).
|
2017-01-11 13:58:20 -08:00
|
|
|
if cap(c.Values) >= cap(b.Values) || len(b.Values) <= len(b.valstorage) {
|
2018-03-23 22:46:06 -04:00
|
|
|
bl := len(b.Values)
|
|
|
|
|
cl := len(c.Values)
|
2018-05-18 17:43:11 -04:00
|
|
|
var t []*Value // construct t = b.Values followed-by c.Values, but with attention to allocation.
|
2018-03-23 22:46:06 -04:00
|
|
|
if cap(c.Values) < bl+cl {
|
|
|
|
|
// reallocate
|
2018-05-18 17:43:11 -04:00
|
|
|
t = make([]*Value, bl+cl)
|
2018-03-23 22:46:06 -04:00
|
|
|
} else {
|
|
|
|
|
// in place.
|
2018-05-18 17:43:11 -04:00
|
|
|
t = c.Values[0 : bl+cl]
|
2018-03-23 22:46:06 -04:00
|
|
|
}
|
2018-05-18 17:43:11 -04:00
|
|
|
copy(t[bl:], c.Values) // possibly in-place
|
|
|
|
|
c.Values = t
|
|
|
|
|
copy(c.Values, b.Values)
|
2017-01-11 13:58:20 -08:00
|
|
|
} else {
|
|
|
|
|
c.Values = append(b.Values, c.Values...)
|
2016-02-10 00:27:33 +01:00
|
|
|
}
|
|
|
|
|
|
|
|
|
|
// replace b->c edge with preds(b) -> c
|
2016-04-28 16:52:47 -07:00
|
|
|
c.predstorage[0] = Edge{}
|
2016-02-10 00:27:33 +01:00
|
|
|
if len(b.Preds) > len(b.predstorage) {
|
|
|
|
|
c.Preds = b.Preds
|
|
|
|
|
} else {
|
|
|
|
|
c.Preds = append(c.predstorage[:0], b.Preds...)
|
|
|
|
|
}
|
2016-04-28 16:52:47 -07:00
|
|
|
for i, e := range c.Preds {
|
|
|
|
|
p := e.b
|
|
|
|
|
p.Succs[e.i] = Edge{c, i}
|
2015-03-23 17:02:11 -07:00
|
|
|
}
|
2016-09-07 14:04:31 -07:00
|
|
|
f := b.Func
|
|
|
|
|
if f.Entry == b {
|
2016-02-10 00:27:33 +01:00
|
|
|
f.Entry = c
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
// trash b, just in case
|
|
|
|
|
b.Kind = BlockInvalid
|
|
|
|
|
b.Values = nil
|
|
|
|
|
b.Preds = nil
|
|
|
|
|
b.Succs = nil
|
|
|
|
|
return true
|
2015-03-23 17:02:11 -07:00
|
|
|
}
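The edge rewiring at the end of fuseBlockPlain relies on each edge storing the index of its reverse edge. A toy sketch of that bookkeeping follows; the block and edge types and the connect and splice helpers are simplified, hypothetical stand-ins and not the compiler's own definitions.

```go
package main

import "fmt"

// edge points at a block and records the index of the reverse edge there.
type edge struct {
	b *block
	i int
}

type block struct {
	name  string
	preds []edge
	succs []edge
}

// connect adds an edge from -> to, keeping both reverse indexes in sync.
func connect(from, to *block) {
	from.succs = append(from.succs, edge{to, len(to.preds)})
	to.preds = append(to.preds, edge{from, len(from.succs) - 1})
}

// splice removes b from the graph by handing b's predecessors to c,
// assuming b is c's only predecessor.
func splice(b, c *block) {
	c.preds = append([]edge(nil), b.preds...)
	for i, e := range c.preds {
		e.b.succs[e.i] = edge{c, i} // redirect p->b to p->c
	}
	b.preds, b.succs = nil, nil
}

func main() {
	p1, p2 := &block{name: "p1"}, &block{name: "p2"}
	b, c := &block{name: "b"}, &block{name: "c"}
	connect(p1, b)
	connect(p2, b)
	connect(b, c)
	splice(b, c)
	fmt.Println(p1.succs[0].b.name, p2.succs[0].b.name, len(c.preds))
}
```

Because every edge knows where its twin lives, redirecting a predecessor needs no search through its successor list.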
|