- implement *, /, %, shifts, Zero, Move.
- fix mistakes in comparisons.
- fix floating-point rounding.
- handle RetJmp in the assembler (previously unhandled; as a consequence,
Duff's device was disabled in the old backend).
all.bash now passes with SSA on.
Updates #16359.
Change-Id: Ia14eed0ed1176b5d800592080c8f53dded7fe73f
Reviewed-on: https://go-review.googlesource.com/27592
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Now that we have ops that can return two results, have BSF return both a
result and flags. We can then get rid of the redundant comparison and use
CMOV instead of CMOVconst ops.
Get rid of a bunch of ops we don't use: Ctz{8,16}, all the Clzs, and the
CMOVNEs. I don't think we'll ever use them, and they would be easy to add
back if needed.
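For reference, BSF sets ZF when its source operand is zero (and leaves its
destination undefined), which is what lets a CMOV keyed off BSF's own flags
supply the zero-input result without re-testing the input. A pure-Go
statement of the semantics being implemented (a sketch, not the intrinsic
lowering itself):
// Count trailing zeros with a defined zero case: the BSF result is used
// when x != 0; the CMOV supplies 64 when x == 0.
func ctz64Semantics(x uint64) uint64 {
    if x == 0 {
        return 64
    }
    var n uint64
    for x&1 == 0 {
        x >>= 1
        n++
    }
    return n
}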
Change-Id: I8858a1d017903474ea7e4002fc76a6a86e7bd487
Reviewed-on: https://go-review.googlesource.com/27630
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
This adds some additional rules to improve loads and
stores for ppc64x.
Change-Id: I96b99c3a0019e6ac5393910c086f58330a04fc5a
Reviewed-on: https://go-review.googlesource.com/27354
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Fib with all int and float types runs correctly.
*, /, shifts, Zero, Move not implemented yet. No optimization yet.
Updates #16359.
Change-Id: I4b0412954d5fd4c13a5fcddd8689ed8ac701d345
Reviewed-on: https://go-review.googlesource.com/27404
Reviewed-by: David Chase <drchase@google.com>
This time with the cherry-pick from the proper patch of
the old CL.
Stack size increased.
Corrected NaN-comparison glitches.
Marked g register as clobbered by calls.
Fixed shared libraries.
live_ssa.go still disabled because of differences.
Presumably turning on more optimization will fix
both the stack size and the live_ssa.go glitches.
Enhanced debugging output for shared libs test.
Rebased onto master.
Updates #16010.
Change-Id: I40864faf1ef32c118fb141b7ef8e854498e6b2c4
Reviewed-on: https://go-review.googlesource.com/27159
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
- Use machine instructions for uint64<->float conversions (see the sketch
below).
- Do not enforce alignment on Zero/Move.
ARM64 supports unaligned loads/stores, but only aligned or small offsets
can be encoded directly into the instructions.
- Do combined loads
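The conversions the first bullet refers to, in source terms (illustrative
Go only; the point is that these now lower to ARM64 convert instructions):
// uint64 <-> float conversions, now handled by machine instructions.
func u64ToF64(x uint64) float64 { return float64(x) }
func f64ToU64(x float64) uint64 { return uint64(x) }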
Change-Id: Iffca7dd0f13070b17b784861ce5a30af584680eb
Reviewed-on: https://go-review.googlesource.com/27086
Reviewed-by: David Chase <drchase@google.com>
We only need to zero-extend to 32 bits; the top 32 bits are zeroed for
free.
Only the WQ change actually generates different code.
The assembler did this optimization for us in the other two cases.
But we might as well do it during SSA so -S output more closely
matches the actual generated instructions.
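Concretely (an illustration, not code from this CL): a conversion like the
one below only needs a 32-bit zero-extending move such as MOVWLZX, because
on amd64 a write to a 32-bit register clears bits 63:32 of the full
register.
// A 32-bit zero extension of the uint16 already yields a correct uint64.
func zext16to64(x uint16) uint64 { return uint64(x) }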
Change-Id: I3e4ac50dc4da124014d4e31c86e9fc539d94f7fd
Reviewed-on: https://go-review.googlesource.com/23711
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Passes ssa_test.
Requires a few new instructions and some scratchpad
memory to move data between G and F registers.
Also fixed comparisons to be correct in case of NaN.
Added missing instructions for run.bash.
Removed some FP registers that are apparently "reserved"
(but that also appear to be unused, except for a gratuitous
multiplication by two where y = x+x would work just as well).
Currently failing stack splits.
Updates #16010.
Change-Id: I73b161bfff54445d72bd7b813b1479f89fc72602
Reviewed-on: https://go-review.googlesource.com/26813
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Add more ARM64 optimizations:
- use the hardware zero register where possible (see the sketch below).
- use shifted ops.
The assembler supports shifted ops, but they are not documented, nor does
it know how to print them. This CL adds both.
- enable fast division.
This was disabled because it made the old backend generate slower code,
but with SSA it generates faster code.
Turn on SSA by default and adjust tests.
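The zero-register sketch referenced above (illustrative only; instruction
choice is the compiler's): stores of zero can use ZR directly, with no
constant to materialize.
// Each store below can be a single MOVD ZR, <addr>-style store.
func clearPair(p *[2]uint64) {
    p[0] = 0
    p[1] = 0
}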
Change-Id: I7794479954c83bb65008dcb457bc1e21d7496da6
Reviewed-on: https://go-review.googlesource.com/26950
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
In position-independent 386 code, loading floating-point constants from
the constant pool requires two steps: materializing the address of
the constant pool entry (requires calling a thunk) and then loading
from that address.
Before this CL, the materialization happened implicitly in CX, which
clobbered that register.
Change-Id: Id094e0fb2d3be211089f299e8f7c89c315de0a87
Reviewed-on: https://go-review.googlesource.com/26811
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: David Crawshaw <crawshaw@golang.org>
Reviewed-by: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Use the destination register for materializing the pc
for GOT references also. See https://go-review.googlesource.com/c/25442/
The SSA backend assumes CX does not get clobbered for these instructions.
Mark duffzero as clobbering CX. The linker needs to clobber CX
to materialize the address to call. (This affects the non-shared-library
duffzero also, but hopefully forbidding one register across duffzero
won't be a big deal.)
Hopefully these are all the cases where the linker clobbers CX under the
hood while SSA assumes it doesn't.
Change-Id: I080c938170193df57cd5ce1f2a956b68a34cc886
Reviewed-on: https://go-review.googlesource.com/26611
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Hudson-Doyle <michael.hudson@canonical.com>
It's not a new backend, just a PtrSize==4 modification
of the existing AMD64 backend.
Change-Id: Icc63521a5cf4ebb379f7430ef3f070894c09afda
Reviewed-on: https://go-review.googlesource.com/25586
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
The register allocator skips flag-typed values. The flag allocator uses the
type and whether the op has "clobberFlags" set.
Tested on AMD64, ARM, ARM64, 386. Passed 'toolstash -cmp' on AMD64.
PPC64 is coded blindly.
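A minimal sketch of what marking an op looks like in the gen op tables
(field set trimmed and entries illustrative, not copied from this CL):
// Ops whose instruction writes the flags register carry clobberFlags, so
// the flag allocator knows live flags must be recomputed around them.
type opData struct {
    name         string
    asm          string
    argLength    int32
    clobberFlags bool
}
var ops = []opData{
    {name: "ADDQconst", asm: "ADDQ", argLength: 1, clobberFlags: true},  // arithmetic sets flags
    {name: "MOVQconst", asm: "MOVQ", argLength: 0, clobberFlags: false}, // plain move leaves flags alone
}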
Change-Id: Ib1cc27efecef6a1bb27f7d7ed035a582660d244f
Reviewed-on: https://go-review.googlesource.com/25480
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Includes:
- hmul (all widths)
- compare for boolean result, and simplifications
- shift operations, plus changes/additions for their implementation
(ORN, ADDME, ADDC)
Also fixed a backwards-operand CMP.
Change-Id: Id723c4e25125c38e0d9ab9ec9448176b75f4cdb4
Reviewed-on: https://go-review.googlesource.com/25410
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Instead of comparing against the address of the end of the memory to
zero/copy, compare against the address of the last element, which is a
valid pointer.
Also unify large and unaligned Zero/Move, by passing alignment as AuxInt.
Fixes #16515 for ARM.
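Sketch of the resulting loop shape for a large Zero, in Go terms
(hypothetical helper; word size 4 and nWords > 0 assumed; the real change
is in the generated machine code):
import "unsafe"
// The loop bound is the address of the last word written, which stays
// inside the object, rather than one past the end.
func zeroWords(p unsafe.Pointer, nWords uintptr) {
    last := unsafe.Add(p, (nWords-1)*4)
    for {
        *(*uint32)(p) = 0
        if p == last {
            return
        }
        p = unsafe.Add(p, 4)
    }
}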
Change-Id: I19a62b31c5acf5c55c16a89bea1039c926dc91e5
Reviewed-on: https://go-review.googlesource.com/25300
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Support the following:
- Shifts. ARM64 machine instructions use only the lowest 6 bits of the
shift amount (i.e. mod 64). Use a conditional-select instruction to
ensure Go semantics (see the sketch after this list).
- Zero/Move. Alignment is ensured.
- Hmul, Avg64u, Sqrt.
- Reserve R18 (the platform register in the ARM64 ABI) and R29 (the frame
pointer in the ARM64 ABI).
Everything compiles, all.bash passed (with non-SSA test disabled).
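The sketch referenced in the shifts bullet: what Go requires once the shift
count reaches the operand width, which the conditional select provides on
top of the raw hardware shift (illustrative Go, not compiler code):
// The hardware shift uses s mod 64; Go requires 0 here once s >= 64
// (and sign-fill for arithmetic right shifts), so a conditional select
// picks between the hardware result and that value.
func shiftLeft(x, s uint64) uint64 {
    if s >= 64 {
        return 0
    }
    return x << s
}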
Change-Id: Ia8ed58dae5cbc001946f0b889357b258655078b1
Reviewed-on: https://go-review.googlesource.com/25290
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Added support for ClosureCall, DeferCall, InterCall
(GoCall not yet tested).
Added support for GetClosurePtr, IsNonNil, IsInBounds, IsSliceInBounds, NilCheck
(Convert and GetG not yet tested)
Still need to implement NilCheck optimizations.
Fixed the move of a boolean constant and the order of operands to subtract.
Updates #16010.
Change-Id: Ibe0f6a6e688df4396cd77de0e9095997e4ca8ed2
Reviewed-on: https://go-review.googlesource.com/25241
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Because PPC lacks store-immediate, remove the instruction that implies it
exists. Replace it with storezero for the special case of storing zero,
because R0 is reserved as zero for Go (the assembler knows this, but do it
in SSA as well).
Also added address folding for storezero.
(Now corrected to use right-sized stores in bulk-zero code.)
Hello.go now compiles to
genssa main
00000 (...hello.go:7) TEXT "".main(SB), $0
00001 (...hello.go:7) FUNCDATA $0, "".gcargs·0(SB)
00002 (...hello.go:7) FUNCDATA $1, "".gclocals·1(SB)
v23 00003 (...hello.go:8) MOVD $go.string."Hello, World!\n"(SB), R3
v11 00004 (...hello.go:8) MOVD R3, 32(R1)
v22 00005 (...hello.go:8) MOVD $14, R3
v6 00006 (...hello.go:8) MOVD R3, 40(R1)
v20 00007 (...hello.go:8) MOVD R0, 48(R1)
v18 00008 (...hello.go:8) MOVD R0, 56(R1)
v9 00009 (...hello.go:8) MOVD R0, 64(R1)
v10 00010 (...hello.go:8) CALL fmt.Printf(SB)
b2 00011 (...hello.go:9) RET
00012 (<unknown line number>) END
Updates #16010
Change-Id: I33cfd98c21a1617502260ac753fa8cad68c8d85a
Reviewed-on: https://go-review.googlesource.com/25151
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Mostly copied from ARM port, with instruction names and Prog fields
adjusted, and 64-bit int ops added. Not complete.
Fib compiles and runs correctly.
Change-Id: Id3ecb0d4b571200a035344b3e8e4408769f76221
Reviewed-on: https://go-review.googlesource.com/25130
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
For a register-register move with only one use, allocate the result in the
same register so we don't need to emit an instruction.
Updates #15365.
Change-Id: Iad41843854a506c521d577ad93fcbe73e8de8065
Reviewed-on: https://go-review.googlesource.com/25059
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Fix up zero/move code, including duff calls and rep movs.
Handle the new ops generated by dec64.rules.
Fix constant shifts.
Change-Id: I7d89194b29b04311bfafa0fd93b9f5644af04df9
Reviewed-on: https://go-review.googlesource.com/25033
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
We now allow Values to have two outputs. Use that ability for amd64.
This allows x, y := a/b, a%b to use just a single divide instruction.
Update #6815
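For example (illustrative Go): both results below can now come from a
single divide.
// Quotient and remainder share one DIV instruction now that an op can
// return the (quo, rem) pair.
func divmod(a, b int64) (int64, int64) {
    return a / b, a % b
}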
Change-Id: Id70bcd20188a2dd8445e631a11d11f60991921e4
Reviewed-on: https://go-review.googlesource.com/25004
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: David Chase <drchase@google.com>
Make tuple types and their SelectX ops fully generic.
These ops no longer need to be lowered.
Regalloc understands them and their tuple-generating arguments.
We can now have opcodes returning arbitrary pairs of results.
(And it would be easy to move to >2 results if needed.)
Update arm implementation to the new standard.
Implement just enough in 386 port to do 64-bit add.
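The 386 piece in source terms (a sketch; the interesting part is that the
low-word add produces a tuple whose carry feeds the high-word
add-with-carry):
// On 386 this lowers to a 32-bit add returning (carry, low) as a tuple,
// followed by an add-with-carry for the high word.
func add64(a, b uint64) uint64 { return a + b }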
Change-Id: I370ed5aacce219c82e1954c61d1f63af76c16f79
Reviewed-on: https://go-review.googlesource.com/24976
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Add some simplification rules for floating point ops.
cmd/internal/obj/arm supports instructions that compare an FP register to
0, but the runtime softfloat simulator does not. This CL adds these
instructions to the softfloat simulator as well.
Updates #15365.
Change-Id: I29405b2bfcb4c8cf106cb7a1a811409fec91b170
Reviewed-on: https://go-review.googlesource.com/24790
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
This CL implements the following optimizations for ARM:
- use shifted ops (e.g. ADD R1<<2, R2) and indexed loads/stores (see the
sketch below).
- break up shift ops. Shifts used to be a single SSA op that generated
multiple instructions. We break them up into multiple ops, which allows
constant folding and CSE of comparisons. Conditional moves are introduced
for this.
- simplify zero/sign-extension ops.
Updates #15365.
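The sketch referenced in the first bullet (illustrative only; exact
instruction selection is the compiler's): array indexing is a typical
source of these patterns, since the element address a + 4*i can be formed
with a shifted add (ADD R(i)<<2, R(a)) or folded into an indexed load.
// Candidate for a shifted add or an indexed load on ARM.
func elem(a []int32, i int) int32 { return a[i] }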
Change-Id: I55e262a776a7ef2a1505d75e04d1208913c35d39
Reviewed-on: https://go-review.googlesource.com/24512
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Basically just copied all the amd64 files, removed all the *Q ops,
and rebuilt.
Compiles fib successfully.
Still need to do:
- all the 64->32 bit op translations.
- audit for instructions that aren't available on 386.
- GO386=387?
Update #16358
Change-Id: Ib8c684586416a554a527a5eefa0cff71424e36f5
Reviewed-on: https://go-review.googlesource.com/24912
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Mostly constant-folding rules, analogous to the AMD64 ones, along with some
simplifications.
Updates #15365.
Change-Id: If83bc1188bb05acb982ef3a1c21704c187e3eb24
Reviewed-on: https://go-review.googlesource.com/24210
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Encode the size and the alignment into AuxInt of Zero and Move ops.
On AMD64, we simply don't look at the alignment. On ARM and PPC64, we
only generate aligned stores.
Updates #15365.
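One possible packing, purely to illustrate fitting both values into a
single int64 AuxInt (hypothetical layout; the real encoding may differ):
size in the low bits, alignment in the top byte.
// Hypothetical AuxInt layout: low 56 bits hold the size, top 8 the alignment.
type sizeAndAlign int64
func makeSizeAndAlign(size, align int64) sizeAndAlign {
    return sizeAndAlign(size | align<<56)
}
func (x sizeAndAlign) size() int64  { return int64(x) & (1<<56 - 1) }
func (x sizeAndAlign) align() int64 { return int64(x) >> 56 }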
Change-Id: Ifdcc205c364f67c4516b9adebfe7d50d223b6863
Reviewed-on: https://go-review.googlesource.com/24511
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
If the files in cmd/compile/internal/ssa/gen
are passed to go run in a different order,
e.g. due to shell differences or manual entry,
then the order of constants in opGen churns.
Sort archs by name to enforce stability.
The movement of the PPC constants is a one-time cost.
Change-Id: Iebcfdb9e612d7dd8cde575f920f1292891f2f24a
Reviewed-on: https://go-review.googlesource.com/24680
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
This adds the initial SSA implementation for PPC64.
The Go tree builds and all.bash runs correctly. A simple hello.go builds
but does not run.
Change-Id: I7cec211b934cd7a2dd75a6cdfaf9f71867063466
Reviewed-on: https://go-review.googlesource.com/24453
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Use the hardware g register (R10) for GetG; allow g to appear on the LHS
of some ops.
Progress on SSA backend for ARM. Now everything compiles and runs.
Updates #15365.
Change-Id: Icdf93585579faa86cc29b1e17ab7c90f0119fc4e
Reviewed-on: https://go-review.googlesource.com/23952
Reviewed-by: David Chase <drchase@google.com>
Introduce an op MOVWaddr for addresses on ARM, instead of overusing
ADDconst.
Mark MOVWaddr as rematerializable. This fixes a liveness problem: if it
were not rematerializable, the address of a variable might be spilled, and
a later use of the address might just load the spilled value without
mentioning the variable, so the liveness code could think the variable is
dead prematurely.
Update #15365.
Change-Id: Ib0b0fa826bdb75c9e6bb362b95c6cf132cc6b1c0
Reviewed-on: https://go-review.googlesource.com/23942
Reviewed-by: David Chase <drchase@google.com>
The machine supports (or the runtime simulates, in soft-float mode)
(u)int32<->float conversions. The frontend rewrites int64<->float
conversions into calls to runtime functions.
For int64->float32 conversion, the frontend generates
. . AS u(100) l(10) tc(1)
. . . NAME-main.~r1 u(1) a(true) g(1) l(9) x(8+0) class(PPARAMOUT) f(1) float32
. . . CALLFUNC u(100) l(10) tc(1) float32
. . . . NAME-runtime.int64tofloat64 u(1) a(true) x(0+0) class(PFUNC) tc(1) used(true) FUNC-func(int64) float64
The CALLFUNC node has type float32, whereas runtime.int64tofloat64
returns float64. The legacy backend implicitly makes a float64->float32
conversion. The SSA backend does not do implicit conversion, so we
insert an explicit CONV here.
All cmd/compile/internal/gc/testdata/*_ssa.go tests passed.
Progress on SSA for ARM. Still not complete.
Update #15365.
Change-Id: I30937c8ff977271246b068f48224693776804339
Reviewed-on: https://go-review.googlesource.com/23652
Reviewed-by: Keith Randall <khr@golang.org>
This CL adds support for Div, Mod, Convert, GetClosurePtr, and 64-bit
indexing to the SSA backend for ARM.
Add tests for 64-bit indexing to cmd/compile/internal/gc/testdata/string_ssa.go.
Tests cmd/compile/internal/gc/testdata/*_ssa.go passed, except compound_ssa.go
and fp_ssa.go.
Progress on SSA for ARM. Still not complete. Essentially the only unsupported
part is floating point.
Updates #15365.
Change-Id: I269e88b67f641c25e7a813d910c96d356d236bff
Reviewed-on: https://go-review.googlesource.com/23542
Reviewed-by: David Chase <drchase@google.com>
Also fix a mistake in previous CL about x8 and x16 shifts:
the shift needs ZeroExt.
Progress on SSA for ARM. Still not complete.
Updates #15365.
Change-Id: Ibc352760023d38bc6b9c5251e929fe26e016637a
Reviewed-on: https://go-review.googlesource.com/23486
Reviewed-by: David Chase <drchase@google.com>
Auto-generate register masks and load them through Config.
Passed toolstash -cmp on AMD64.
Tests phi_ssa.go and regalloc_ssa.go in cmd/compile/internal/gc/testdata
passed on ARM.
Updates #15365.
Change-Id: I393924d68067f2dbb13dab82e569fb452c986593
Reviewed-on: https://go-review.googlesource.com/23292
Reviewed-by: David Chase <drchase@google.com>
Introduce dec64 rules to (generically) decompose 64-bit integer ops on
32-bit architectures. 64-bit integers are composed/decomposed with
Int64Make/Hi/Lo ops, as for complex types.
The idea of dealing with Add64 is the following:
(Add64 (Int64Make xh xl) (Int64Make yh yl))
->
(Int64Make
(Add32withcarry xh yh (Select0 (Add32carry xl yl)))
(Select1 (Add32carry xl yl)))
where Add32carry returns a tuple (flags,uint32). Select0 and Select1
read the first and the second component of the tuple, respectively.
The two Add32carry will be CSE'd.
Similarly for multiplication, Mul32uhilo returns a tuple (hi, lo).
Also add support for KeepAlive, to fix the build after merge.
Tests addressed_ssa.go, array_ssa.go, break_ssa.go, chan_ssa.go,
cmp_ssa.go, ctl_ssa.go, map_ssa.go, and string_ssa.go in
cmd/compile/internal/gc/testdata passed.
Progress on SSA for ARM. Still not complete.
Updates #15365.
Change-Id: I7867c76785a456312de5d8398a6b3f7ca5a4f7ec
Reviewed-on: https://go-review.googlesource.com/23213
Reviewed-by: Keith Randall <khr@golang.org>
Generate load/stores for small zeroing/move, DUFFZERO/DUFFCOPY for
medium zeroing/move, and loops for large zeroing/move.
cmd/compile/internal/gc/testdata/{copy_ssa.go,zero_ssa.go} tests
passed.
Progress on SSA backend for ARM. Still not complete. A few packages
in the standard library compile and tests passed, including
container/list, hash/crc32, unicode/utf8, etc.
Updates #15365.
Change-Id: Ieb4b68b44ee7de66bf7b68f5f33a605349fcc6fa
Reviewed-on: https://go-review.googlesource.com/23097
Reviewed-by: Keith Randall <khr@golang.org>
Implement shifts and multiplications for up to 32-bit values.
Also handle Exit block.
Progress on SSA backend for ARM. Still not complete.
container/heap, crypto/subtle, hash/adler32 packages compile and
tests passed.
Updates #15365.
Change-Id: I6bee4d5b0051e51d5de97e8a1938c4b87a36cbf8
Reviewed-on: https://go-review.googlesource.com/23096
Reviewed-by: Keith Randall <khr@golang.org>
Fix hardcoded flag register mask in ssa/flagalloc.go by auto-generating
the mask.
Also fix a mistake (in previous CL) about conditional branches.
Progress on SSA backend for ARM. Still not complete. Now "container/ring"
package compiles and tests passed.
Updates #15365.
Change-Id: Id7c8805c30dbb8107baedb485ed0f71f59ed6ea8
Reviewed-on: https://go-review.googlesource.com/23093
Reviewed-by: Keith Randall <khr@golang.org>
Introduce a KeepAlive op which makes sure that its argument is kept
live until the KeepAlive. Use KeepAlive to mark pointer input
arguments as live after each function call and at each return.
We do this change only for pointer arguments. Those are the
critical ones to handle because they might have finalizers.
Doing compound arguments (slices, structs, ...) is more complicated
because we would need to track field liveness individually (we do
that for auto variables now, but inputs require extra trickery).
Turn off the automatic marking of args as live. That way, when args
are explicitly nulled, plive will know that the original argument is
dead.
The KeepAlive op will be the eventual implementation of
runtime.KeepAlive.
Fixes #15277
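In terms of the runtime.KeepAlive API this op is meant to back, the
situation looks like the classic finalizer example below (a sketch; the
File type and fd field are made up for illustration):
import (
    "runtime"
    "syscall"
)
type File struct{ fd int }
// Without the KeepAlive, f could be considered dead as soon as f.fd is
// loaded, letting a finalizer close the descriptor while Read still uses it.
func (f *File) Read(buf []byte) (int, error) {
    n, err := syscall.Read(f.fd, buf)
    runtime.KeepAlive(f)
    return n, err
}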
Change-Id: I5f223e65d99c9f8342c03fbb1512c4d363e903e5
Reviewed-on: https://go-review.googlesource.com/22365
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Russ Cox <rsc@golang.org>