Implement Slicemask the same way every other architecture does - negate
then arithmetic right shift. This sets or clears the sign bit, before
extending it to the entire register.
Removes around 2,500 instructions from the Go binary on linux/riscv64.
Change-Id: I4d675b826e7eb23fe2b1e6e46b95dcd49ab49733
Reviewed-on: https://go-review.googlesource.com/c/go/+/426354
Reviewed-by: Meng Zhuo <mzh@golangcn.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Joel Sing <joel@sing.id.au>
Reviewed-by: Wayne Zuo <wdvxdr@golangcn.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Convert subtraction from const to a negated ADDI with negative const
value, where possible. At worst this avoids a register load and uses
the same number of instructions. At best, this allows for further
optimisation to occur, particularly where equality is involved.
For example, this sequence:
li t0,-1
sub t1,t0,a0
snez t1,t1
Becomes:
addi t0,a0,1
snez t0,t0
Removes more than 2000 instructions from the Go binary on linux/riscv64.
Change-Id: I68f3be897bc645d4a8fa3ab3cef165a00a74df19
Reviewed-on: https://go-review.googlesource.com/c/go/+/426263
Reviewed-by: Meng Zhuo <mzh@golangcn.org>
Reviewed-by: Heschi Kreinick <heschi@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Joel Sing <joel@sing.id.au>
The FNES and FNED instructions are pseudo-instructions, which the
assembler expands to FEQS/NEG or FEQD/NEG - if we're comparing the
result via a branch instruction, we can avoid an instruction by
negating both the branch comparision and the floating point
comparision.
This only removes a handful of instructions from the Go binary,
however, it will provide benefit to floating point intensive code.
Change-Id: I4e3124440b7659acc4d9bc9948b755a4900a422f
Reviewed-on: https://go-review.googlesource.com/c/go/+/426261
Reviewed-by: Meng Zhuo <mzh@golangcn.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Joel Sing <joel@sing.id.au>
Run-TryBot: Meng Zhuo <mzh@golangcn.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Heschi Kreinick <heschi@google.com>
The prove pass will mark some shifts bounded, and then we can use that
information to generate better code on riscv64.
Change-Id: Ia22f43d0598453c9417adac7017db28d7240948b
Reviewed-on: https://go-review.googlesource.com/c/go/+/422616
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Joel Sing <joel@sing.id.au>
Run-TryBot: Wayne Zuo <wdvxdr@golangcn.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Keith Randall <khr@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
The negation does not change the comparison to zero.
Also remove unnecessary x.Uses == 1 condition from equivalent BEQZ/BNEZ rules.
Change-Id: I62dd8e383e42bfe5c46d11bbf78d8e5ff862a1d5
Reviewed-on: https://go-review.googlesource.com/c/go/+/426262
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Cherry Mui <cherryyz@google.com>
Reviewed-by: Heschi Kreinick <heschi@google.com>
The result of these operations are already extended.
Change-Id: Ifc8ba362dda7035d8fd0d40046a96f61d3082877
Reviewed-on: https://go-review.googlesource.com/c/go/+/426260
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Meng Zhuo <mzh@golangcn.org>
Reviewed-by: Heschi Kreinick <heschi@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Removes more than 2000 instructions from the Go binary on linux/risv64.
Change-Id: I6db3e3b1c93f29f00869adcba7c6192bfb90b25c
Reviewed-on: https://go-review.googlesource.com/c/go/+/426259
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Heschi Kreinick <heschi@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
Reviewed-by: Wayne Zuo <wdvxdr@golangcn.org>
Reviewed-by: Meng Zhuo <mzh@golangcn.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
This is a follow up of CL 425101 on RISCV64.
According to RISCV Volume 1, Unprivileged Spec v. 20191213 Chapter 7.1:
If both the high and low bits of the same product are required, then the
recommended code sequence is: MULH[[S]U] rdh, rs1, rs2; MUL rdl, rs1, rs2
(source register specifiers must be in same order and rdh cannot be the
same as rs1 or rs2). Microarchitectures can then fuse these into a single
multiply operation instead of performing two separate multiplies.
So we should not split Muluhilo to separate instructions.
Updates #54607
Change-Id: If47461f3aaaf00e27cd583a9990e144fb8bcdb17
Reviewed-on: https://go-review.googlesource.com/c/go/+/425203
Auto-Submit: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Wayne Zuo <wdvxdr@golangcn.org>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
This appeases Go 1.4, making it possible to bootstrap GOARCH=riscv64 with
a Go 1.4 compiler.
Fixes#52583
Change-Id: Ib13c2afeb095b2bb1464dcd7f1502574209bc7ab
Reviewed-on: https://go-review.googlesource.com/c/go/+/409974
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Joel Sing <joel@sing.id.au>
Reviewed-by: Bryan Mills <bcmills@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Following CL 405114, for RISCV64.
May fix RISCV64 builds.
Updates #52788.
Change-Id: Ifc34658703d1e8b97665e7b862060152e3005d71
Reviewed-on: https://go-review.googlesource.com/c/go/+/405553
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
Pointer comparison is lowered to the following on RISCV64
(EqPtr x y) => (SEQZ (SUB <x.Type> x y))
The difference of two pointers (the SUB) should not be pointer
type. Otherwise it can cause the GC to find a bad pointer.
Should fix#51101.
Change-Id: I7e73c2155c36ff403c032981a9aa9cccbfdf0f64
Reviewed-on: https://go-review.googlesource.com/c/go/+/385655
Trust: Cherry Mui <cherryyz@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Instructions with immediates can be precomputed when operating on a
constant - do so for SLTI/SLTIU, SLLI/SRLI/SRAI, NEG/NEGW, ANDI, ORI
and ADDI. Additionally, optimise ANDI and ORI when the immediate is
all ones or all zeroes.
In particular, the RISCV64 logical left and right shift rules
(Lsh*x*/Rsh*Ux*) produce sequences that check if the shift amount
exceeds 64 and if so returns zero. When the shift amount is a
constant we can precompute and eliminate the filter entirely.
Likewise the arithmetic right shift rules produce sequences that
check if the shift amount exceeds 64 and if so, ensures that the
lower six bits of the shift are all ones. When the shift amount
is a constant we can precompute the shift value.
Arithmetic right shift sequences like:
117fc: 00100513 li a0,1
11800: 04053593 sltiu a1,a0,64
11804: fff58593 addi a1,a1,-1
11808: 0015e593 ori a1,a1,1
1180c: 40b45433 sra s0,s0,a1
Are now a single srai instruction:
117fc: 40145413 srai s0,s0,0x1
Likewise for logical left shift (and logical right shift):
1d560: 01100413 li s0,17
1d564: 04043413 sltiu s0,s0,64
1d568: 40800433 neg s0,s0
1d56c: 01131493 slli s1,t1,0x11
1d570: 0084f433 and s0,s1,s0
Which are now a single slli (or srli) instruction:
1d120: 01131413 slli s0,t1,0x11
This removes more than 30,000 instructions from the Go binary and
should improve performance in a variety of areas - of note
runtime.makemap_small drops from 48 to 36 instructions. Similar
gains exist in at least other parts of runtime and math/bits.
Change-Id: I33f6f3d1fd36d9ff1bda706997162bfe4bb859b6
Reviewed-on: https://go-review.googlesource.com/c/go/+/350689
Trust: Joel Sing <joel@sing.id.au>
Reviewed-by: Michael Munday <mike.munday@lowrisc.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
For certain type of method wrappers we used to generate a tail
call. That was disabled in CL 307234 when register ABI is used,
because with the current IR it was difficult to generate a tail
call with the arguments in the right places. The problem was that
the IR does not contain a CALL-like node with arguments; instead,
it contains an OAS node that adjusts the receiver, than an
OTAILCALL node that just contains the target, but no argument
(with the assumption that the OAS node will put the adjusted
receiver in the right place). With register ABI, putting
arguments in registers are done in SSA. The assignment (OAS)
doesn't put the receiver in register.
This CL changes the IR of a tail call to take an actual OCALL
node. Specifically, a tail call is represented as
OTAILCALL (OCALL target args...)
This way, the call target and args are connected through the OCALL
node. So the call can be analyzed in SSA and the args can be passed
in the right places.
(Alternatively, we could have OTAILCALL node directly take the
target and the args, without the OCALL node. Using an OCALL node is
convenient as there are existing code that processes OCALL nodes
which do not need to be changed. Also, a tail call is similar to
ORETURN (OCALL target args...), except it doesn't preserve the
frame. I did the former but I'm open to change.)
The SSA representation is similar. Previously, the IR lowers to
a Store the receiver then a BlockRetJmp which jumps to the target
(without putting the arg in register). Now we use a TailCall op,
which takes the target and the args. The call expansion pass and
the register allocator handles TailCall pretty much like a
StaticCall, and it will do the right ABI analysis and put the args
in the right places. (Args other than the receiver are already in
the right places. For register args it generates no code for them.
For stack args currently it generates a self copy. I'll work on
optimize that out.) BlockRetJmp is still used, signaling it is a
tail call. The actual call is made in the TailCall op so
BlockRetJmp generates no code (we could use BlockExit if we like).
This slightly reduces binary size:
old new
cmd/go 14003088 13953936
cmd/link 6275552 6271456
Change-Id: I2d16d8d419fe1f17554916d317427383e17e27f0
Reviewed-on: https://go-review.googlesource.com/c/go/+/350145
Trust: Cherry Mui <cherryyz@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: David Chase <drchase@google.com>
Also, add the FABSS and FABSD pseudo instructions to the assembler.
The compiler could use FSGNJX[SD] directly but there doesn't seem
to be much advantage to doing so and the pseudo instructions are
easier to understand.
Change-Id: Ie8825b8aa8773c69cc4f07a32ef04abf4061d80d
Reviewed-on: https://go-review.googlesource.com/c/go/+/348989
Trust: Michael Munday <mike.munday@lowrisc.org>
Run-TryBot: Michael Munday <mike.munday@lowrisc.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Joel Sing <joel@sing.id.au>
Add support to the assembler for F[N]M{ADD,SUB}[SD] instructions.
Argument order is:
OP RS1, RS2, RS3, RD
Also, add support for the FMA intrinsic to the compiler. Automatic
FMA matching is left to a future CL.
Change-Id: I47166c7393b2ab6bfc2e42aa8c1a8997c3a071b3
Reviewed-on: https://go-review.googlesource.com/c/go/+/293030
Trust: Michael Munday <mike.munday@lowrisc.org>
Run-TryBot: Michael Munday <mike.munday@lowrisc.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Joel Sing <joel@sing.id.au>
We can end up with this situation due to our equality tests being based on
'SEQZ (SUB x y)' - if x is a zero valued constant, 'SUB x y' can be converted
to 'NEG x'. When used with a branch the SEQZ can be absorbed, leading to
'BNEZ (NEG x)' where the NEG is redundant.
Removes around 1700 instructions from the go binary on riscv64.
Change-Id: I947a080d8bf7d2d6378ab114172e2342ce2c51db
Reviewed-on: https://go-review.googlesource.com/c/go/+/342850
Trust: Joel Sing <joel@sing.id.au>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Convert BLT and BGE with a zero valued constant to BGTZ/BLTZ/BLEZ/BGEZ as
appropriate.
Removes over 4,500 instructions from the go binary on riscv64.
Change-Id: Icc266e968b126ba04863ec88529630a9dd44498b
Reviewed-on: https://go-review.googlesource.com/c/go/+/342849
Trust: Joel Sing <joel@sing.id.au>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
According to RISCV instruction set manual v2.2 Sec 6.1
MULHU followed by MUL will be fused into one multiply by microarchitecture
name old time/op new time/op delta
MulUintptr/small 11.2ns ±24% 9.2ns ± 0% -17.54% (p=0.000 n=10+9)
MulUintptr/large 15.9ns ± 0% 10.9ns ± 0% -31.55% (p=0.000 n=8+8)
Change-Id: I3d152218f83948cbc5c576bda29dc86e9b4206ee
Reviewed-on: https://go-review.googlesource.com/c/go/+/338753
Trust: Meng Zhuo <mzh@golangcn.org>
Reviewed-by: Joel Sing <joel@sing.id.au>
In external linking mode the external linker may insert
trampolines, which use R12 as a scratch register. So a call could
potentially clobber R12 if the target is laid out too far. Mark
R12 clobbered.
Also, we will use R12 for trampolines in the Go linker as well.
CL 310731 updated the generated rewrite files so imports are
grouped, but the generator was not updated to do so. Grouped
imports are nice. But as those are generated files, for
simplicity and my laziness, just regenerate with the current
generator (which makes imports not grouped).
Change-Id: Iddb741ff7314a291ade5fbffc7d315f555808409
Reviewed-on: https://go-review.googlesource.com/c/go/+/314453
Trust: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Than McIntosh <thanm@google.com>
The go/build package needs access to this configuration,
so move it into a new package available to the standard library.
Change-Id: I868a94148b52350c76116451f4ad9191246adcff
Reviewed-on: https://go-review.googlesource.com/c/go/+/310731
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Jay Conrod <jayconrod@google.com>
Correct sign extension handling for consts on riscv64. This fixes a bug
in part exposed by CL 302609 - previously 64 bit consts were rewritten into
multiple 32 bit consts and the expansion would result in sign/zero extension
not being eliminated. With this change a MOVDconst with a 64 bit value can be
followed by a MOV{B,H,W}reg, which will be eliminated without actually
truncating to a smaller value.
Change-Id: I8d9cd380217466997b341e008a1f139bc11a0d51
Reviewed-on: https://go-review.googlesource.com/c/go/+/303350
Trust: Joel Sing <joel@sing.id.au>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Follow what MIPS does and load >32-bit constants from memory using two instructions,
rather than generating a four to six instruction sequence. This removes more than 2,500
instructions from the Go binary. This also makes it possible to load >32-bit constants
via a single assembly instruction, if required.
Change-Id: Ie679a0754071e6d8c52fe0d027f00eb241b3a758
Reviewed-on: https://go-review.googlesource.com/c/go/+/302609
Trust: Joel Sing <joel@sing.id.au>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Most platforms only use a single MOV const operand - remove the MOV{B,H,W}const
operands from riscv64 and consistently use MOVDconst instead. The implementation
of all four is the same and there is no benefit gained from having multiple const
operands (in fact it requires a lot more rewrite rules).
Change-Id: I0ba7d7554e371a1de762ef5f3745e9c0c30d41ac
Reviewed-on: https://go-review.googlesource.com/c/go/+/302610
Trust: Joel Sing <joel@sing.id.au>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Michael Munday <mike.munday@lowrisc.org>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
IsNonNil is readily implemented using SNEZ on riscv64, removing over 8,000
instructions from the go binary. Other rules will improve on this sequence,
however in this case it makes sense to use a direct simplification.
Change-Id: Ib4068599532398afcd05f51d160673ef5fb5e5a0
Reviewed-on: https://go-review.googlesource.com/c/go/+/299230
Trust: Joel Sing <joel@sing.id.au>
Reviewed-by: Michael Munday <mike.munday@lowrisc.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
The 32 bit versions are easily implement with a single instruction, while the
8 bit versions require a bit more effort but use the same atomic instructions
via rewrite rules.
Change-Id: I42e8d457b239c8f75e39a8e282fc88c1bb292a99
Reviewed-on: https://go-review.googlesource.com/c/go/+/268098
Trust: Joel Sing <joel@sing.id.au>
Run-TryBot: Joel Sing <joel@sing.id.au>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Add generic rule to rewrite the single-precision square root expression
with one single-precision instruction. The optimization will reduce two
times of precision converting between double-precision and single-precision.
On arm64 flatform.
previous:
FCVTSD F0, F0
FSQRTD F0, F0
FCVTDS F0, F0
optimized:
FSQRTS S0, S0
And this patch adds the test case to check the correctness.
This patch refers to CL 241877, contributed by Alice Xu
(dianhong.xu@arm.com)
Change-Id: I6de5d02281c693017ac4bd4c10963dd55989bd7e
Reviewed-on: https://go-review.googlesource.com/c/go/+/276873
Trust: fannie zhang <Fannie.Zhang@arm.com>
Run-TryBot: fannie zhang <Fannie.Zhang@arm.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
MOV*nop and MOV*reg seem superfluous. They are there to keep type
information around that would otherwise get thrown away. Not sure
what we need it for. I think our compiler needs a normalization of
how types are represented in SSA, especially after lowering.
MOV*nop gets in the way of some optimization rules firing, like for
load combining.
For now, just fold MOV*nop and MOV*const. It's certainly safe to
do that, as the type info on the MOV*const isn't ever useful.
R=go1.17
Change-Id: I3630a80afc2455a8e9cd9fde10c7abe05ddc3767
Reviewed-on: https://go-review.googlesource.com/c/go/+/276792
Trust: Keith Randall <khr@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
As done with other equality tests, zero extend before subtraction rather than
after (or in this case, at the same time). While at face value this appears to
require more instructions, in reality it allows for most sign extensions to
be completely eliminated due to correctly typed loads. Existing optimisations
(such as subtraction of zero) then become more effective.
This removes more than 10,000 instructions from the Go binary and in particular,
a writeBarrier check only requires three instructions (AUIPC, LWU, BNEZ) instead
of the current four (AUIPC, LWU, NEGW, BNEZ).
Change-Id: I7afdc1921c4916ddbd414c3b3f5c2089107ec016
Reviewed-on: https://go-review.googlesource.com/c/go/+/274066
Trust: Joel Sing <joel@sing.id.au>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
Optimize small (s <= 32) zeroing/moving operations on riscv64.
Avoid generating unaligned memory accesses.
The code is almost one to one translation of the corresponding
mips64 rules with additional rule for s=32.
Change-Id: I753b0b8e53cb9efcf43c8080cab90f3d03539fb8
Reviewed-on: https://go-review.googlesource.com/c/go/+/266217
Reviewed-by: Joel Sing <joel@sing.id.au>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Sign extension for consts is unnecessary and zero extension for consts can be avoided
via casts. This removes over 16,000 instructions from the Go binary, in part because it
allows for better zero const absorbtion in blocks - for example,
`(BEQ (MOVBU (MOVBconst [0])) cond yes no)` now becomes `(BEQZ cond yes no)` when
this change is combined with existing rules.
Change-Id: I27e791bfa84869639db653af6119f6e10369ba3d
Reviewed-on: https://go-review.googlesource.com/c/go/+/265041
Trust: Joel Sing <joel@sing.id.au>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Also make canMergeSym take Syms instead of interface{}
Change-Id: I4926a1fc586aa90e198249d67e5b520404b40869
Reviewed-on: https://go-review.googlesource.com/c/go/+/265817
Trust: Alberto Donizetti <alb.donizetti@gmail.com>
Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Implement runtime.duffzero and runtime.duffcopy for riscv64.
Use obj.ADUFFZERO/obj.ADUFFCOPY for medium size, word aligned
zeroing/moving.
Change-Id: I42ec622055630c94cb77e286d8d33dbe7c9f846c
Reviewed-on: https://go-review.googlesource.com/c/go/+/237797
Run-TryBot: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Joel Sing <joel@sing.id.au>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Add additional rules to eliminate unnecessary sign/zero extension for riscv64.
Also where possible, replace an extension following a load with a different typed
load. This removes almost another 8,000 instructions from the go binary.
Of particular note, change Eq16/Eq8/Neq16/Neq8 to zero extend each value before
subtraction, rather than zero extending after subtraction. While this appears to
double the number of zero extensions, it often lets us completely eliminate them
as the load can already be performed in a properly typed manner.
As an example, prior to this change runtime.memequal16 was:
0000000000013028 <runtime.memequal16>:
13028: 00813183 ld gp,8(sp)
1302c: 00019183 lh gp,0(gp)
13030: 01013283 ld t0,16(sp)
13034: 00029283 lh t0,0(t0)
13038: 405181b3 sub gp,gp,t0
1303c: 03019193 slli gp,gp,0x30
13040: 0301d193 srli gp,gp,0x30
13044: 0011b193 seqz gp,gp
13048: 00310c23 sb gp,24(sp)
1304c: 00008067 ret
Whereas it now becomes:
0000000000012fa8 <runtime.memequal16>:
12fa8: 00813183 ld gp,8(sp)
12fac: 0001d183 lhu gp,0(gp)
12fb0: 01013283 ld t0,16(sp)
12fb4: 0002d283 lhu t0,0(t0)
12fb8: 405181b3 sub gp,gp,t0
12fbc: 0011b193 seqz gp,gp
12fc0: 00310c23 sb gp,24(sp)
12fc4: 00008067 ret
Change-Id: I16321feb18381241cab121c0097a126104c56c2c
Reviewed-on: https://go-review.googlesource.com/c/go/+/264659
Trust: Joel Sing <joel@sing.id.au>
Run-TryBot: Joel Sing <joel@sing.id.au>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Rather than handling sign and zero extension via rules, defer to the assembler
and use MOV pseudo-instructions. The instruction can also be omitted where the
type and size is already correct. This change results in more than 6,000
instructions being removed from the go binary (in part due to omitted
instructions, in part due to MOVBU having a more efficient implementation in
the assembler than what is used in the current ZeroExt8to{16,32,64} rules).
This will also allow for further rewriting to remove redundant sign/zero
extension.
Change-Id: I05e42fd9f09f40a69948be7de772cce8946c8744
Reviewed-on: https://go-review.googlesource.com/c/go/+/264658
Trust: Joel Sing <joel@sing.id.au>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Make use of multi-control values and branch pseudo-instructions to optimise
compiler generated branches.
Change-Id: I7a8bf754db3c2082a390bf6a662ccf18cbcbee39
Reviewed-on: https://go-review.googlesource.com/c/go/+/226400
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
This makes the intent clearer, allows for another ellipsis and will aid
in future rewriting. While here, document boolean loads to explain register
contents.
Change-Id: I933db2813826d88819366191fbbea8fcee5e4dda
Reviewed-on: https://go-review.googlesource.com/c/go/+/230120
Reviewed-by: Keith Randall <khr@golang.org>
Implement multi-control branches for riscv64, switching to using the BNEZ
pseudo-instruction when rewriting conditionals. This will allow for further
branch optimisations to later be performed via rewrites.
Change-Id: I7f2c69f3c77494b403f26058c6bc8432d8070ad0
Reviewed-on: https://go-review.googlesource.com/c/go/+/226399
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Joel Sing <joel@sing.id.au>
This reverts commit 98c32670fd454939794504225dca1d4ec55045d5.
Rolling-forward with trivial format-string fix
cmd/compile: adjust RISCV64 rewrite rules to use typed aux fields
Also add a typed version of mergeSym to rewrite.go to assist with a few
rules that used mergeSym in the untyped-form.
Remove a few extra int32 overflow checks that no longer make sense, as
adding two int8s or int16s should never overflow an int32.
Passes toolstash-check -all.
Original review: https://go-review.googlesource.com/c/go/+/228882
Change-Id: Ib63db4ee1687446f0f3d9f11575a40dd85cbce55
Reviewed-on: https://go-review.googlesource.com/c/go/+/229126
Run-TryBot: Than McIntosh <thanm@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Than McIntosh <thanm@google.com>
Also add a typed version of mergeSym to rewrite.go to assist with a few
rules that used mergeSym in the untyped-form.
Remove a few extra int32 overflow checks that no longer make sense, as
adding two int8s or int16s should never overflow an int32.
Passes toolstash-check -all.
Change-Id: I72ddd2b0d9001faa87ad0ab54f500057164661b7
Reviewed-on: https://go-review.googlesource.com/c/go/+/228882
Reviewed-by: Keith Randall <khr@golang.org>
Otherwise, just copying the aux and auxint fields doesn't make much sense.
(Although there's no bug - it just means it isn't typechecked correctly.)
Change-Id: I4e21ac67f0c7bfd04ed5af1713cd24bca08af092
Reviewed-on: https://go-review.googlesource.com/c/go/+/227962
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Log large copies in the riscv64 compiler.
This was missed in 47ade08141, resulting in
the new test added to cmd/compile/internal/logopt failing on riscv64.
Change-Id: I6f763e86f42834148e911d16928f9fbabcfa4290
Reviewed-on: https://go-review.googlesource.com/c/go/+/227804
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Extend CL 220417 (which removed the integer Greater and Geq ops) to
floating point comparisons. Greater and Geq can always be
implemented using Less and Leq.
Fixes#37316.
Change-Id: Ieaddb4877dd0ff9037a1dd11d0a9a9e45ced71e7
Reviewed-on: https://go-review.googlesource.com/c/go/+/222397
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Where possible, fold constants into versions of instructions that take
an immediate. This avoids the need to allocate a register and load the
immediate into it.
Change-Id: If911ca41235e218490679aed2ce5f48bf807a2b3
Reviewed-on: https://go-review.googlesource.com/c/go/+/222639
Run-TryBot: Joel Sing <joel@sing.id.au>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>