After benchmarking with a compiler modified to have better
spill location, it became clear that this method of checking
was actually faster on (at least) two different architectures
(ppc64 and amd64) and it also provides more timely interruption
of loops.
This change adds a modified FOR loop node "FORUNTIL" that
checks after executing the loop body instead of before (i.e.,
always at least once). This ensures that a pointer past the
end of a slice or array is not made visible to the garbage
collector.
Without the rescheduling checks inserted, the restructured
loop from this change apparently provides a 1% geomean
improvement on PPC64 running the go1 benchmarks; the
improvement on AMD64 is only 0.12%.
Inserting the rescheduling check exposed some peculiar bug
with the ssa test code for s390x; this was updated based on
initial code actually generated for GOARCH=s390x to use
appropriate OpArg, OpAddr, and OpVarDef.
NaCl is disabled in testing.
Change-Id: Ieafaa9a61d2a583ad00968110ef3e7a441abca50
Reviewed-on: https://go-review.googlesource.com/36206
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Previously the compiler rewrote constant division into OHMUL
operations, but that rewriting was moved to SSA in CL 37015. Now OHMUL
is unused, so we can get rid of it.
Change-Id: Ib6fc7c2b6435510bafb5735b3b4f42cfd8ed8cdb
Reviewed-on: https://go-review.googlesource.com/37750
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Instead we can just call needwritebarrier when constructing the SSA
representation.
Change-Id: I6fefaad49daada9cdb3050f112889e49dca0047b
Reviewed-on: https://go-review.googlesource.com/34566
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Remove rotate generation from walk. Remove OLROT and ssa.Lrot* opcodes.
Generate rotates during SSA lowering for architectures that have them.
This CL will allow rotates to be generated in more situations,
like when the shift values are determined to be constant
only after some analysis.
Fixes#18254
Change-Id: I8d6d684ff5ce2511aceaddfda98b908007851079
Reviewed-on: https://go-review.googlesource.com/34232
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Previously, we used OKEY nodes to represent keyed struct literal
elements. The field names were represented by an ONAME node, but this
is clumsy because it's the only remaining case where ONAME was used to
represent a bare identifier and not a variable.
This CL introduces a new OSTRUCTKEY node op for use in struct
literals. These ops instead store the field name in the node's own Sym
field. This is similar in spirit to golang.org/cl/20890.
Significant reduction in allocations for struct literal heavy code
like package unicode:
name old time/op new time/op delta
Template 345ms ± 6% 341ms ± 6% ~ (p=0.141 n=29+28)
Unicode 200ms ± 9% 184ms ± 7% -7.77% (p=0.000 n=29+30)
GoTypes 1.04s ± 3% 1.05s ± 3% ~ (p=0.096 n=30+30)
Compiler 4.47s ± 9% 4.49s ± 6% ~ (p=0.890 n=29+29)
name old user-ns/op new user-ns/op delta
Template 523M ±13% 516M ±17% ~ (p=0.400 n=29+30)
Unicode 334M ±27% 314M ±30% ~ (p=0.093 n=30+30)
GoTypes 1.53G ±10% 1.52G ±10% ~ (p=0.572 n=30+30)
Compiler 6.28G ± 7% 6.34G ±11% ~ (p=0.300 n=30+30)
name old alloc/op new alloc/op delta
Template 44.5MB ± 0% 44.4MB ± 0% -0.35% (p=0.000 n=27+30)
Unicode 39.2MB ± 0% 34.5MB ± 0% -11.79% (p=0.000 n=26+30)
GoTypes 125MB ± 0% 125MB ± 0% -0.12% (p=0.000 n=29+30)
Compiler 515MB ± 0% 515MB ± 0% -0.10% (p=0.000 n=29+30)
name old allocs/op new allocs/op delta
Template 426k ± 0% 424k ± 0% -0.39% (p=0.000 n=29+30)
Unicode 374k ± 0% 323k ± 0% -13.67% (p=0.000 n=29+30)
GoTypes 1.21M ± 0% 1.21M ± 0% -0.14% (p=0.000 n=29+29)
Compiler 4.40M ± 0% 4.39M ± 0% -0.13% (p=0.000 n=29+30)
Passes toolstash/buildall.
Change-Id: Iba4ee765dd1748f67e52fcade1cd75c9f6e13fa9
Reviewed-on: https://go-review.googlesource.com/30974
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Does not pass toolstash -cmp due to changed export data,
but the cmd/go binary (which doesn't contain export data)
is bit-for-bit identical.
Change-Id: I6b12f9de18cf7da528e9207dccbf8f08c969f142
Reviewed-on: https://go-review.googlesource.com/26753
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Crawshaw <crawshaw@golang.org>
When T is a scalar, there are no runtime calls
required, which makes this a clear win.
encoding/binary:
WriteInts-8 958ns ± 3% 864ns ± 2% -9.80% (p=0.000 n=15+15)
This also considerably shrinks a core fmt
routine:
Before: "".(*pp).printArg t=1 size=3952 args=0x20 locals=0xf0
After: "".(*pp).printArg t=1 size=2624 args=0x20 locals=0x98
Unfortunately, I find it very hard to get stable
numbers out of the fmt benchmarks due to thermal scaling.
Change-Id: I1278006b030253bf8e48dc7631d18985cdaa143d
Reviewed-on: https://go-review.googlesource.com/26659
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Keith Randall <khr@golang.org>
The liveness computation of parameters generally was never
correct, but forcing all parameters to be live throughout the
function covered up that problem. The new SSA back end is
too clever: even though it currently keeps the parameter values live
throughout the function, it may find optimizations that mean
the current values are not written back to the original parameter
stack slots immediately or ever (for example if a parameter is set
to nil, SSA constant propagation may replace all later uses of the
parameter with a constant nil, eliminating the need to write the nil
value back to the stack slot), so the liveness code must now
track the actual operations on the stack slots, exposing these
problems.
One small problem in the handling of arguments is that nodarg
can return ONAME PPARAM nodes with adjusted offsets, so that
there are actually multiple *Node pointers for the same parameter
in the instruction stream. This might be possible to correct, but
not in this CL. For now, we fix this by using n.Orig instead of n
when considering PPARAM and PPARAMOUT nodes.
The major problem in the handling of arguments is general
confusion in the liveness code about the meaning of PPARAM|PHEAP
and PPARAMOUT|PHEAP nodes, especially as contrasted with PAUTO|PHEAP.
The difference between these two is that when a local variable "moves"
to the heap, it's really just allocated there to start with; in contrast,
when an argument moves to the heap, the actual data has to be copied
there from the stack at the beginning of the function, and when a
result "moves" to the heap the value in the heap has to be copied
back to the stack when the function returns
This general confusion is also present in the SSA back end.
The PHEAP bit worked decently when I first introduced it 7 years ago (!)
in 391425ae. The back end did nothing sophisticated, and in particular
there was no analysis at all: no escape analysis, no liveness analysis,
and certainly no SSA back end. But the complications caused in the
various downstream consumers suggest that this should be a detail
kept mainly in the front end.
This CL therefore eliminates both the PHEAP bit and even the idea of
"heap variables" from the back ends.
First, it replaces the PPARAM|PHEAP, PPARAMOUT|PHEAP, and PAUTO|PHEAP
variable classes with the single PAUTOHEAP, a pseudo-class indicating
a variable maintained on the heap and available by indirecting a
local variable kept on the stack (a plain PAUTO).
Second, walkexpr replaces all references to PAUTOHEAP variables
with indirections of the corresponding PAUTO variable.
The back ends and the liveness code now just see plain indirected
variables. This may actually produce better code, but the real goal
here is to eliminate these little-used and somewhat suspect code
paths in the back end analyses.
The OPARAM node type goes away too.
A followup CL will do the same to PPARAMREF. I'm not sure that
the back ends (SSA in particular) are handling those right either,
and with the framework established in this CL that change is trivial
and the result clearly more correct.
Fixes#15747.
Change-Id: I2770b1ce3cbc93981bfc7166be66a9da12013d74
Reviewed-on: https://go-review.googlesource.com/23393
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Semi-regular merge from tip to dev.ssa.
Two fixes:
1) Mark selectgo as not returning. This caused problems
because there are no VARKILL ops on the selectgo path,
causing things to be marked live that shouldn't be.
2) Tell the amd64 assembler that addressing modes like
name(SP)(AX*4) are ok.
Change-Id: I9ca81c76391b1a65cc47edc8610c70ff1a621913
This claims to be autogenerated from go tool dist,
but I don't see where.
In any case, the update is trivial.
Change-Id: I58daaba755f3d34a0396005046b89411a02ada7e
Reviewed-on: https://go-review.googlesource.com/13584
Reviewed-by: Keith Randall <khr@golang.org>
Trivial merging of 5g, 6g, ... into go tool compile,
and similarlly 5l, 6l, ... into go tool link.
The files compile/main.go and link/main.go are new.
Everything else in those directories is a move followed by
change of imports and package name.
This CL breaks the build. Manual fixups are in the next CL.
See golang-dev thread titled "go tool compile, etc" for background.
Change-Id: Id35ff5a5859ad9037c61275d637b1bd51df6828b
Reviewed-on: https://go-review.googlesource.com/10287
Reviewed-by: Dave Cheney <dave@cheney.net>
Reviewed-by: Rob Pike <r@golang.org>
2015-05-21 17:31:51 +00:00
Renamed from src/cmd/internal/gc/opnames.go (Browse further)