cmd/internal/gc: improve flow of input params to output params

This includes the following information in the per-function summary:

outK = paramJ   encoded in outK bits for paramJ
outK = *paramJ  encoded in outK bits for paramJ
heap = paramJ   EscHeap
heap = *paramJ  EscContentEscapes

Note that (currently) if the address of a parameter is taken and
returned, necessarily a heap allocation occurred to contain that
reference, and the heap can never refer to stack, therefore the
parameter and everything downstream from it escapes to the heap.

The per-function summary information now has a tuneable number of bits
(2 is probably noticeably better than 1, 3 is likely overkill, but it
is now easy to check and the -m debugging output includes information
that allows you to figure out if more would be better.)

A new test was  added to check pointer flow through struct-typed and
*struct-typed parameters and returns; some of these are sensitive to
the number of summary bits, and ought to yield better results with a
more competent escape analysis algorithm.  Another new test checks
(some) correctness with array parameters, results, and operations.

The old analysis inferred a piece of plan9 runtime was non-escaping by
counteracting overconservative analysis with buggy analysis; with the
bug fixed, the result was too conservative (and it's not easy to fix
in this framework) so the source code was tweaked to get the desired
result.  A test was added against the discovered bug.

The escape analysis was further improved splitting the "level" into
3 parts, one tracking the conventional "level" and the other two
computing the highest-level-suffix-from-copy, which is used to
generally model the cancelling effect of indirection applied to
address-of.

With the improved escape analysis enabled, it was necessary to
modify one of the runtime tests because it now attempts to allocate
too much on the (small, fixed-size) G0 (system) stack and this
failed the test.

Compiling src/std after touching src/runtime/*.go with -m logging
turned on shows 420 fewer heap allocation sites (10538 vs 10968).

Profiling allocations in src/html/template with
for i in {1..5} ;
  do go tool 6g -memprofile=mastx.${i}.prof  -memprofilerate=1 *.go;
  go tool pprof -alloc_objects -text  mastx.${i}.prof ;
done

showed a 15% reduction in allocations performed by the compiler.

Update #3753
Update #4720
Fixes #10466

Change-Id: I0fd97d5f5ac527b45f49e2218d158a6e89951432
Reviewed-on: https://go-review.googlesource.com/8202
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
This commit is contained in:
David Chase 2015-03-26 16:36:15 -04:00
parent 4044adedf7
commit 7fbb1b36c3
20 changed files with 1444 additions and 298 deletions

View file

@ -44,15 +44,15 @@ type Node struct {
Isddd bool // is the argument variadic
Readonly bool
Implicit bool
Addrtaken bool // address taken, even if not moved to heap
Assigned bool // is the variable ever assigned to
Captured bool // is the variable captured by a closure
Byval bool // is the variable captured by value or by reference
Reslice bool // this is a reslice x = x[0:y] or x = append(x, ...)
Likely int8 // likeliness of if statement
Hasbreak bool // has break statement
Needzero bool // if it contains pointers, needs to be zeroed on function entry
Esc uint8 // EscXXX
Addrtaken bool // address taken, even if not moved to heap
Assigned bool // is the variable ever assigned to
Captured bool // is the variable captured by a closure
Byval bool // is the variable captured by value or by reference
Reslice bool // this is a reslice x = x[0:y] or x = append(x, ...)
Likely int8 // likeliness of if statement
Hasbreak bool // has break statement
Needzero bool // if it contains pointers, needs to be zeroed on function entry
Esc uint16 // EscXXX
Funcdepth int32
// most nodes
@ -103,14 +103,14 @@ type Node struct {
Escloopdepth int // -1: global, 0: return variables, 1:function top level, increased inside function for every loop or label to mark scopes
Sym *Sym // various
Vargen int32 // unique name for OTYPE/ONAME
Vargen int32 // unique name for OTYPE/ONAME within a function. Function outputs are numbered starting at one.
Lineno int32
Xoffset int64
Stkdelta int64 // offset added by stack frame compaction phase.
Ostk int32 // 6g only
Iota int32
Walkgen uint32
Esclevel int32
Esclevel Level
Opt interface{} // for optimization passes
}