We compute a lot of information based on the CFG: postorder traversal,
dominators, dominator tree, loop nest. Multiple phases use this
information and we end up recomputing some of it. Add a cache for
this information so that if the CFG hasn't changed, we can reuse the
previous computation.
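Illustrative sketch of the idea (the types and field names here are made
up, not the compiler's actual ones): tie the cached results to a
generation counter that is bumped whenever the edge set changes.

    package main

    import "fmt"

    // block and cfg are simplified stand-ins for ssa.Block and ssa.Func.
    type block struct {
        succs []*block
    }

    type cfg struct {
        entry *block
        gen   int // bumped on every CFG mutation

        // cached analysis results, valid only while cachedGen == gen
        cachedGen       int
        cachedPostorder []*block
    }

    // addEdge mutates the CFG and thereby invalidates the cache.
    func (c *cfg) addEdge(from, to *block) {
        from.succs = append(from.succs, to)
        c.gen++
    }

    // postorder returns a postorder traversal, reusing the previous
    // result if the CFG has not changed since it was computed.
    func (c *cfg) postorder() []*block {
        if c.cachedPostorder != nil && c.cachedGen == c.gen {
            return c.cachedPostorder
        }
        var order []*block
        seen := map[*block]bool{}
        var walk func(b *block)
        walk = func(b *block) {
            if seen[b] {
                return
            }
            seen[b] = true
            for _, s := range b.succs {
                walk(s)
            }
            order = append(order, b)
        }
        walk(c.entry)
        c.cachedGen, c.cachedPostorder = c.gen, order
        return order
    }

    func main() {
        a, b := &block{}, &block{}
        g := &cfg{entry: a}
        g.addEdge(a, b)
        fmt.Println(len(g.postorder())) // 2; a second call would reuse the cache
    }

Dominators, the dominator tree, and the loop nest can be cached the same
way, keyed off the same generation counter.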
Change-Id: I9b5b58af06830bd120afbee9cfab395a0a2f74b2
Reviewed-on: https://go-review.googlesource.com/29356
Reviewed-by: David Chase <drchase@google.com>
Rip out the code that allows SSA to be used conditionally.
No longer exists:
ssa=0 flag
GOSSAHASH
GOSSAPKG
SSATEST
GOSSAFUNC now only controls the printing of the IR/html.
Still need to rip out all of the old backend. It should no longer be
callable after this CL.
Update #16357
Change-Id: Ib30cc18fba6ca52232c41689ba610b0a94aa74f5
Reviewed-on: https://go-review.googlesource.com/29155
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Atomic swap, add/and/or, compare and swap.
Also works on amd64p32.
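For reference, these are the operations at the Go level; the CL adds the
SSA ops that implement them. Illustration only, via the standard
sync/atomic package (the and/or variants live in the runtime and have no
public equivalent here):

    package main

    import (
        "fmt"
        "sync/atomic"
    )

    func main() {
        var n int64

        atomic.AddInt64(&n, 5)                       // atomic add
        old := atomic.SwapInt64(&n, 10)              // atomic swap, returns the old value
        ok := atomic.CompareAndSwapInt64(&n, 10, 42) // compare and swap

        fmt.Println(old, ok, atomic.LoadInt64(&n)) // 5 true 42
    }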
Change-Id: Idf2d8f3e1255f71deba759e6e75e293afe4ab2ba
Reviewed-on: https://go-review.googlesource.com/27813
Reviewed-by: Cherry Zhang <cherryyz@google.com>
This adds a sparse method for locating nearest ancestors in a
dominator tree, checks blocks with more than one predecessor for
differences among their incoming definitions, and inserts phi
functions where there are differences.
Uses reverse postorder to cut the number of passes, running from
first def to last use ("last use" for paramout and mem is
end-of-program; last use for a phi input from a backedge is the
source of the back edge).
Includes a cutover from the old algorithm to the new one to avoid
paying a large constant factor for small programs. This keeps normal
builds running in about the same time, while not running over-long
on large machine-generated inputs.
Add "phase" flags for ssa/build -- ssa/build/stats prints
number of blocks, values (before and after linking references
and inserting phis, so expansion can be measured), and their
product; the product governs the cutover, where a good value
seems to be somewhere between 1 and 5 million.
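Minimal sketch of the cutover decision; the function and constant below
are hypothetical, with the threshold picked from the observation above.

    package main

    import "fmt"

    // The product of block count and value count governs the cutover:
    // small functions keep the old algorithm and its small constant
    // factor, large machine-generated functions switch to the sparse
    // ancestor-lookup algorithm.
    const sparseCutover = 2500000 // somewhere between 1 and 5 million

    func useSparsePhiInsertion(numBlocks, numValues int) bool {
        return numBlocks*numValues >= sparseCutover
    }

    func main() {
        fmt.Println(useSparsePhiInsertion(84, 359))     // false: a 90th-percentile function
        fmt.Println(useSparsePhiInsertion(6171, 28180)) // true: the largest observed input
    }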
Among the files compiled by make.bash, this is the shape of
the tail of the distribution for #blocks, #vars, and their
product:
        #blocks   #vars       product
max        6171   28180   173,898,780
99.9%      1641    6548    10,401,878
99%         463    1909       873,721
95%         152     639        95,235
90%          84     359        30,021
The old algorithm is indeed usually fastest, for 99%ile
values of usually.
The fix to LookupVarOutgoing
( https://go-review.googlesource.com/#/c/22790/ )
deals with some of the same problems addressed by this CL,
but on at least one bug ( #15537 ) this change is still
a significant help.
With this CL:
/tmp/gopath$ rm -rf pkg bin
/tmp/gopath$ time go get -v -gcflags -memprofile=y.mprof \
github.com/gogo/protobuf/test/theproto3/combos/...
...
real 4m35.200s
user 13m16.644s
sys 0m36.712s
and pprof reports 3.4GB allocated in one of the larger profiles
With tip:
/tmp/gopath$ rm -rf pkg bin
/tmp/gopath$ time go get -v -gcflags -memprofile=y.mprof \
github.com/gogo/protobuf/test/theproto3/combos/...
...
real 10m36.569s
user 25m52.286s
sys 4m3.696s
and pprof reports 8.3GB allocated in the same larger profile
With this CL, most of the compilation time on the benchmarked
input is spent in register/stack allocation (cumulative 53%)
and in the sparse lookup algorithm itself (cumulative 20%).
Fixes #15537.
Change-Id: Ia0299dda6a291534d8b08e5f9883216ded677a00
Reviewed-on: https://go-review.googlesource.com/22342
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
They have different semantics.
Equal is stricter and is designed for the front-end.
Compare is looser and cheaper and is designed for the back-end.
To avoid possible regression, remove Equal from ssa.Type.
Updates #15043
Change-Id: Ie23ce75ff6b4d01b7982e0a89e6f81b5d099d8d6
Reviewed-on: https://go-review.googlesource.com/21483
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Keep track of how many uses each Value has. Each appearance in
Value.Args and in Block.Control counts once.
The number of uses of a value is generically useful to
constrain rewrite rules. For instance, we might want to
prevent merging index operations into loads if the same
index expression is used lots of times.
But I have one use in particular for which the use count is required.
We must make sure we don't combine ops with loads if the load has
more than one use. Otherwise, we may split a single load
into multiple loads and that breaks perceived behavior in
the presence of races. In particular, the load of m.state
in sync/mutex.go:Lock can't be done twice. (I have a separate
CL which triggers the mutex failure. This CL has a test which
demonstrates a similar failure.)
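Rough sketch of the bookkeeping, with simplified stand-in types (not the
compiler's actual Value and Block definitions):

    package main

    import "fmt"

    type value struct {
        op   string
        args []*value
        uses int // appearances in args and as a block control
    }

    type blk struct {
        values  []*value
        control *value
    }

    // countUses recomputes use counts: each appearance in Args and each
    // use as a block control counts once.
    func countUses(blocks []*blk) {
        for _, b := range blocks {
            for _, v := range b.values {
                v.uses = 0
            }
        }
        for _, b := range blocks {
            for _, v := range b.values {
                for _, a := range v.args {
                    a.uses++
                }
            }
            if b.control != nil {
                b.control.uses++
            }
        }
    }

    func main() {
        load := &value{op: "Load"}
        add := &value{op: "Add", args: []*value{load, load}}
        b := &blk{values: []*value{load, add}, control: add}
        countUses([]*blk{b})
        // A rule that wants to fold the load into its user would check
        // that the load has exactly one use before combining.
        fmt.Println(load.uses, add.uses) // 2 1
    }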
Change-Id: Icaafa479239f48632a069d0c3f624e6ebc6b1f0e
Reviewed-on: https://go-review.googlesource.com/20790
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Todd Neal <todd@tneal.org>
Line numbers are always int32, so the Warnl function should take the
line number as an int32 as well. This matches gc.Warnl and removes
a cast every place it's used.
Change-Id: I5d6201e640d52ec390eb7174f8fd8c438d4efe58
Reviewed-on: https://go-review.googlesource.com/20662
Run-TryBot: Todd Neal <todd@tneal.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Split the auxFloat type into 32-bit and 64-bit versions and check
that float32 values are exactly representable. Perform const folding
on float32/64. Comment out some const negation rules that the
frontend already performs.
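Minimal sketch of the representability check, assuming nothing beyond a
round trip through float32:

    package main

    import "fmt"

    // exactFloat32 reports whether x survives a round trip through
    // float32, i.e. whether it can be narrowed without losing value.
    func exactFloat32(x float64) bool {
        return float64(float32(x)) == x
    }

    func main() {
        fmt.Println(exactFloat32(0.5)) // true: exactly representable
        fmt.Println(exactFloat32(0.1)) // false: 0.1 is not exact in float32
    }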
Change-Id: Ib3f8d59fa8b30e50fe0267786cfb3c50a06169d2
Reviewed-on: https://go-review.googlesource.com/20568
Run-TryBot: Todd Neal <todd@tneal.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
When calling freeValue for possible const values, remove them from the
cache as well.
Change-Id: I087ed592243e33c58e5db41700ab266fc70196d9
Reviewed-on: https://go-review.googlesource.com/20481
Run-TryBot: Todd Neal <tolchz@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
The tree's pretty inconsistent about single space vs double space
after a period in documentation. Make it consistently a single space,
per earlier decisions. This means contributors won't be confused by
misleading precedent.
This CL doesn't use go/doc to parse. It only addresses // comments.
It was generated with:
$ perl -i -npe 's,^(\s*// .+[a-z]\.) +([A-Z]),$1 $2,' $(git grep -l -E '^\s*//(.+\.) +([A-Z])')
$ go test go/doc -update
Change-Id: Iccdb99c37c797ef1f804a94b22ba5ee4b500c4f7
Reviewed-on: https://go-review.googlesource.com/20022
Reviewed-by: Rob Pike <r@golang.org>
Reviewed-by: Dave Day <djd@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
The -d compiler flag can also specify ssa phase and flag,
for example -d=ssa/generic_cse/time,ssa/generic_cse/stats
Spaces in the phase names can be specified with an
underscore. Flags currently parsed (not necessarily
recognized by the phases yet) are:
on, off, mem, time, debug, stats, and test
On, off, and time are handled in the harness; debug, stats, and test
are interpreted by the phase itself.
The pass is now attached to the Func being compiled, and a new
method logStats(key, ...value) on *Func encourages a semi-standardized
format for that output. Output fields are separated by tabs to ease
digestion by awk and spreadsheets. For example,
    if f.pass.stats > 0 {
        f.logStat("CSE REWRITES", rewrites)
    }
Change-Id: I16db2b5af64c50ca9a47efeb51d961147a903abc
Reviewed-on: https://go-review.googlesource.com/19885
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Todd Neal <todd@tneal.org>
Move the cached sparse sets to the Config. I tested make.bash with
pre-allocating sets of size 150 and not caching very small sets, but the
difference between this implementation (no min size, no preallocation)
and a min size with preallocation was fairly negligible:
Number of sparse sets allocated:
   3684  Cached in Config w/none preallocated, no min size  *this CL*
   3370  Cached in Config w/three preallocated, no min size
   3370  Cached in Config w/three preallocated, min size=150
  15947  Cached in Config w/none preallocated, min size=150
  96996  Cached in Func, w/no min  *previous code*
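Sketch of the caching scheme with illustrative names (the real Config and
sparse sets carry more state):

    package main

    import "fmt"

    // sparseSet is a standard sparse/dense pair keyed by dense IDs.
    type sparseSet struct {
        dense  []int
        sparse []int
    }

    func newSparseSet(n int) *sparseSet {
        return &sparseSet{sparse: make([]int, n)}
    }

    func (s *sparseSet) add(x int) {
        s.sparse[x] = len(s.dense)
        s.dense = append(s.dense, x)
    }

    func (s *sparseSet) contains(x int) bool {
        i := s.sparse[x]
        return i < len(s.dense) && s.dense[i] == x
    }

    func (s *sparseSet) clear() { s.dense = s.dense[:0] }

    // config plays the role of ssa.Config: one per compile, shared by
    // every function, so cached sets are reused across functions.
    type config struct{ free []*sparseSet }

    // newSparseSet hands out a cached set if a large enough one is free.
    func (c *config) newSparseSet(n int) *sparseSet {
        if k := len(c.free); k > 0 && len(c.free[k-1].sparse) >= n {
            s := c.free[k-1]
            c.free = c.free[:k-1]
            s.clear()
            return s
        }
        return newSparseSet(n)
    }

    // retSparseSet returns a set to the cache when a pass is done with it.
    func (c *config) retSparseSet(s *sparseSet) { c.free = append(c.free, s) }

    func main() {
        c := &config{}
        s := c.newSparseSet(100)
        s.add(3)
        fmt.Println(s.contains(3), s.contains(4)) // true false
        c.retSparseSet(s)
        _ = c.newSparseSet(50) // reuses the set cached above
    }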
Change-Id: I7f9de8a7cae192648a7413bfb18a6690fad34375
Reviewed-on: https://go-review.googlesource.com/19152
Reviewed-by: Keith Randall <khr@golang.org>
From memory profiling, about 3% reduction in allocation count.
Change-Id: I4b662d55b8a94fe724759a2b22f05a08d0bf40f8
Reviewed-on: https://go-review.googlesource.com/19103
Reviewed-by: Keith Randall <khr@golang.org>
If a failure occurs in SSA processing, we always report the
last line of the function we're compiling. Modify the callbacks
from SSA to the GC compiler so we can pass a line number back
and use it in Fatalf.
Change-Id: Ifbfad50d5e167e997e0a96f0775bcc369f5c397e
Reviewed-on: https://go-review.googlesource.com/18599
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Declare a function's arguments as having already been
spilled so their use just requires a restore.
Allow spill locations to be portions of larger objects on the stack.
This is required to load portions of compound input arguments.
Rename the memory input to InputMem. Use Arg for the
pre-spilled argument values.
Change-Id: I8fe2a03ffbba1022d98bfae2052b376b96d32dda
Reviewed-on: https://go-review.googlesource.com/16536
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Be more consistent about this. There's no reason to do the
pointer arithmetic on a different type, as sizeof(int) >=
sizeof(ptr) on all of our platforms. It simplifies our
rewrite rules also, except for a few that need duplication.
Add some more constant folding to get constant indexing and
slicing to fold down to nothing.
Change-Id: I3e56cdb14b3dc1a6a0514f0333e883f92c19e3c7
Reviewed-on: https://go-review.googlesource.com/16586
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: David Chase <drchase@google.com>
For debugging, spill values to named variables instead of autotmp_
variables if possible. We do this by keeping a name -> value map
for each function, keeping it up-to-date during deadcode elim, and
using it to override spill decisions in stackalloc.
It might even make stack frames a bit smaller, as it makes it easy
to identify a set of spills which are likely not to interfere.
This just works for one-word variables for now. Strings/slices
will be a separate CL.
Change-Id: Ie89eba8cab16bcd41b311c479ec46dd7e64cdb67
Reviewed-on: https://go-review.googlesource.com/16336
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: David Chase <drchase@google.com>
This change is all about leveraging the gc bitmap generation
that is already done by the current compiler. We rearrange how
stack allocation is done so that we generate a variable declaration
for each spill. We also reorganize how args/locals are recorded
during SSA. Then we can use the existing allocauto/defframe to
allocate the stack frame and liveness to make the gc bitmaps.
With this change, stack copying works correctly and we no longer
need hacks in runtime/stack*.go to make tests work. GC is close
to working; it just needs write barriers.
Change-Id: I990fb4e3fbe98850c6be35c3185a1c85d9e1a6ba
Reviewed-on: https://go-review.googlesource.com/13894
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Store floats in AuxInt to reduce allocations.
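Minimal illustration of the encoding using the standard math bit-cast
helpers (illustrative, not the compiler's code):

    package main

    import (
        "fmt"
        "math"
    )

    func main() {
        // Store: bit-cast the float constant into the int64 aux field.
        auxInt := int64(math.Float64bits(3.75))

        // Load: bit-cast it back when the constant is needed.
        c := math.Float64frombits(uint64(auxInt))

        fmt.Println(c) // 3.75, with no interface boxing along the way
    }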
Change-Id: I101e6322530b4a0b2ea3591593ad022c992e8df8
Reviewed-on: https://go-review.googlesource.com/14320
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Added F32 and F64 load, store, and addition.
Added F32 and F64 multiply.
Added F32 and F64 subtraction and division.
Added X15 to "clobber" for FP sub/div
Added FP constants
Added separate FP test in gc/testdata
Change-Id: Ifa60dbad948a40011b478d9605862c4b0cc9134c
Reviewed-on: https://go-review.googlesource.com/13612
Reviewed-by: Keith Randall <khr@golang.org>
Using the type of the store argument is not safe; it may change
during rewriting, giving us the wrong store width.
(Store ptr (Trunc32to16 val) mem)
This should be a 2-byte store. But we have the rule:
(Trunc32to16 x) -> x
So if the Trunc rewrite happens before the Store -> MOVW rewrite,
then the Store thinks that the value it is storing is 4 bytes
in size and uses a MOVL. Bad things ensue.
Fix this by encoding the store width explicitly in the auxint field.
In general, we can't rely on the type of arguments, as they may
change during rewrites. The type of the op itself (as used by
the Load rules) is still ok to use.
Change-Id: I9e2359e4f657bb0ea0e40038969628bf0f84e584
Reviewed-on: https://go-review.googlesource.com/13636
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
The DFS scheduler doesn't do the right thing. If a Value x is used by
more than one other Value, then x is put into the DFS queue when
its first user (call it y) is visited. It is not removed and reinserted
when the second user of x (call it z) is visited, so the dependency
between x and z is not respected. There is no easy way to fix this with
the DFS queue because we'd have to rip values out of the middle of the
DFS queue.
The new scheduler works from the end of the block backwards, scheduling
instructions which have had all of their uses already scheduled.
A simple priority scheme breaks ties between multiple instructions that
are ready to schedule simultaneously.
Keep track of whether we've scheduled or not, and make print() use
the scheduled order if we have.
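Rough sketch of the backward approach, greatly simplified (stand-in
types, no priority scheme, assumes no dependency cycles in the block):

    package main

    import "fmt"

    // val is a simplified stand-in for ssa.Value.
    type val struct {
        name string
        args []*val
    }

    // scheduleBackward orders the values of one block so every value
    // comes after all of its arguments. It works from the end of the
    // block backward, emitting a value only once all of its uses inside
    // the block have been emitted.
    func scheduleBackward(values []*val) []*val {
        inBlock := map[*val]bool{}
        for _, v := range values {
            inBlock[v] = true
        }
        // For each value, count how many of its in-block users are
        // still unscheduled.
        pendingUses := map[*val]int{}
        for _, v := range values {
            for _, a := range v.args {
                if inBlock[a] {
                    pendingUses[a]++
                }
            }
        }
        var order []*val // built back to front
        scheduled := map[*val]bool{}
        for len(order) < len(values) {
            for _, v := range values {
                if scheduled[v] || pendingUses[v] != 0 {
                    continue // some user of v is not scheduled yet
                }
                scheduled[v] = true
                order = append(order, v)
                for _, a := range v.args {
                    if inBlock[a] {
                        pendingUses[a]--
                    }
                }
            }
        }
        // Reverse: values were collected from last to first.
        for i, j := 0, len(order)-1; i < j; i, j = i+1, j-1 {
            order[i], order[j] = order[j], order[i]
        }
        return order
    }

    func main() {
        x := &val{name: "x"}
        y := &val{name: "y", args: []*val{x}}
        z := &val{name: "z", args: []*val{x}} // x has two users, the case DFS got wrong
        for _, v := range scheduleBackward([]*val{z, y, x}) {
            fmt.Print(v.name, " ") // x y z: x is placed before both of its users
        }
        fmt.Println()
    }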
Fix some shift tests that this change tickles. Add unsigned right shift tests.
Change-Id: I44164c10bb92ae8ab8f76d7a5180cbafab826ea1
Reviewed-on: https://go-review.googlesource.com/13069
Reviewed-by: Todd Neal <todd@tneal.org>
The existing backend recognizes special
assignment statements as being implementable
with static data rather than code.
Unfortunately, it assumes that it is in the middle
of codegen; it emits data and modifies the AST.
This does not play well with SSA's two-phase
bootstrapping approach, in which we attempt to
compile code but fall back to the existing backend
if something goes wrong.
To work around this:
* Add the ability to inquire about static data
without side-effects.
* Save the static data required for a function.
* Emit that static data during SSA codegen.
Change-Id: I2e8a506c866ea3e27dffb597095833c87f62d87e
Reviewed-on: https://go-review.googlesource.com/12790
Reviewed-by: Keith Randall <khr@golang.org>
This change has some tests verifying functionality and an assortment of
benchmarks of various block lists. It modifies NewBlock to allocate in
contiguous blocks, improving the performance of intersect() for extremely
large graphs by 30-40%.
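Sketch of the contiguous-allocation idea (illustrative only; the real
Block has many more fields and the chunk size here is arbitrary):

    package main

    import "fmt"

    // blockT is a stand-in for ssa.Block.
    type blockT struct {
        id int
    }

    // fn hands out blocks from a slab so that consecutive NewBlock calls
    // return pointers into one contiguous backing array, which keeps the
    // dominator computation's intersect() walk cache-friendly on very
    // large graphs.
    type fn struct {
        slab   []blockT
        nextID int
    }

    func (f *fn) newBlock() *blockT {
        if len(f.slab) == cap(f.slab) {
            // Start a fresh chunk rather than growing one block at a time.
            // Blocks already handed out keep the old chunk alive.
            f.slab = make([]blockT, 0, 32)
        }
        f.slab = f.slab[:len(f.slab)+1]
        b := &f.slab[len(f.slab)-1]
        b.id = f.nextID
        f.nextID++
        return b
    }

    func main() {
        f := &fn{}
        a, b := f.newBlock(), f.newBlock()
        fmt.Println(b.id-a.id, &f.slab[1] == b) // 1 true: adjacent in memory
    }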
benchmark                         old ns/op     new ns/op     delta
BenchmarkDominatorsLinear-8         1185619        901154   -23.99%
BenchmarkDominatorsFwdBack-8        1302138        863537   -33.68%
BenchmarkDominatorsManyPred-8     404670521     247450911   -38.85%
BenchmarkDominatorsMaxPred-8      455809002     471675119    +3.48%
BenchmarkDominatorsMaxPredVal-8   819315864     468257300   -42.85%
BenchmarkNilCheckDeep1-8                766           706    -7.83%
BenchmarkNilCheckDeep10-8              2553          2209   -13.47%
BenchmarkNilCheckDeep100-8            58606         57545    -1.81%
BenchmarkNilCheckDeep1000-8         7753012       8025750    +3.52%
BenchmarkNilCheckDeep10000-8     1224165946     789995184   -35.47%
Change-Id: Id3d6bc9cb1138e8177934441073ac7873ddf7ade
Reviewed-on: https://go-review.googlesource.com/11716
Reviewed-by: Keith Randall <khr@golang.org>
This will make it possible for us to start implementing interfaces
and other stack-allocated types that are more than one machine word.
Change-Id: I52b187a791cf1919cb70ed6dabdc9f57b317ea83
Reviewed-on: https://go-review.googlesource.com/11631
Reviewed-by: Keith Randall <khr@golang.org>
The SSA implementation logs for three purposes:
* debug logging
* fatal errors
* unimplemented features
Separating these three uses lets us attempt an SSA
implementation for all functions, not just
_ssa functions. This turns the entire standard
library into a compilation test, and makes it
easy to figure out things like
"how much coverage does SSA have now" and
"what should we do next to get more coverage?".
Functions called _ssa are still special.
They log profusely by default and
the output of the SSA implementation
is used. For all other functions,
logging is off, and the implementation
is built and discarded, due to lack of
support for the runtime.
While we're here, fix a few minor bugs and
add some extra Unimplementeds to allow
all.bash to pass.
As of now, SSA handles 20.79% of the functions
in the standard library (689 of 3314).
The top missing features are:
10.03% 2597 SSA unimplemented: zero for type error not implemented
7.79% 2016 SSA unimplemented: addr: bad op DOTPTR
7.33% 1898 SSA unimplemented: unhandled expr EQ
6.10% 1579 SSA unimplemented: unhandled expr OROR
4.91% 1271 SSA unimplemented: unhandled expr NE
4.49% 1163 SSA unimplemented: unhandled expr LROT
4.00% 1036 SSA unimplemented: unhandled expr LEN
3.56% 923 SSA unimplemented: unhandled stmt CALLFUNC
2.37% 615 SSA unimplemented: zero for type []byte not implemented
1.90% 492 SSA unimplemented: unhandled stmt CALLMETH
1.74% 450 SSA unimplemented: unhandled expr CALLINTER
1.74% 450 SSA unimplemented: unhandled expr DOT
1.71% 444 SSA unimplemented: unhandled expr ANDAND
1.65% 426 SSA unimplemented: unhandled expr CLOSUREVAR
1.54% 400 SSA unimplemented: unhandled expr CALLMETH
1.51% 390 SSA unimplemented: unhandled stmt SWITCH
1.47% 380 SSA unimplemented: unhandled expr CONV
1.33% 345 SSA unimplemented: addr: bad op *
1.30% 336 SSA unimplemented: unhandled OLITERAL 6
Change-Id: I4ca07951e276714dc13c31de28640aead17a1be7
Reviewed-on: https://go-review.googlesource.com/11160
Reviewed-by: Keith Randall <khr@golang.org>
This CL sets line numbers on Values in the newValue variants
introduced in cl/10929.
Change-Id: Ibd15bc90631a1e948177878ea4191d995e8bb19b
Reviewed-on: https://go-review.googlesource.com/11090
Reviewed-by: Keith Randall <khr@golang.org>
In the previous line number CL the NewValue\d? functions took
a line number argument but neglected to set the Line field on
the value struct. Fix that.
Change-Id: I53c79ff93703f66f5f0266178c94803719ae2074
Reviewed-on: https://go-review.googlesource.com/11054
Reviewed-by: Keith Randall <khr@golang.org>
Add an additional int64 auxiliary field to Value.
There are two main reasons for doing this:
1) Ints in interfaces require allocation, and we store ints in Aux a lot.
2) I'd like to have both *gc.Sym and int offsets included in lots
of operations (e.g. MOVQloadidx8). It will be more efficient to
store them as separate fields instead of a pointer to a sym/int pair.
It also simplifies a bunch of code.
This is just the refactoring. I'll start using this some more in a
subsequent changelist.
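Small illustration of point (1). The struct shapes are simplified
stand-ins and exact counts depend on the compiler version, but storing
an int through an interface-typed field costs an extra allocation that
a dedicated int64 field avoids:

    package main

    import (
        "fmt"
        "os"
        "testing"
    )

    // Before: ints had to go through the interface-typed Aux field.
    type valueAuxOnly struct {
        aux interface{}
    }

    // After: ints get their own field and need no boxing.
    type valueAuxInt struct {
        auxInt int64
    }

    var sink1 *valueAuxOnly
    var sink2 *valueAuxInt

    func main() {
        n := int64(len(os.Args) + 1000) // a value the compiler can't constant-fold away
        boxed := testing.AllocsPerRun(100, func() {
            sink1 = &valueAuxOnly{aux: n} // allocates the struct and a box for n
        })
        unboxed := testing.AllocsPerRun(100, func() {
            sink2 = &valueAuxInt{auxInt: n} // allocates only the struct
        })
        fmt.Println(boxed, unboxed) // expect something like: 2 1
    }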
Change-Id: I1ca797ff572553986cf90cab3ac0a0c1d01ad241
Reviewed-on: https://go-review.googlesource.com/10929
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Semi-regular merge of tip to dev.ssa.
Complicated a bit by the move of cmd/internal/* to cmd/compile/internal/*.
Change-Id: I1c66d3c29bb95cce4a53c5a3476373aa5245303d