Author SHA1 Message Date
Guoqi Chen
751a817ccc cmd/internal/obj/loong64: add {V,XV}LD/{V,XV}LDX/{V,XV}ST/{V,XV}STX instructions support
This CL adds primitive asm support for Loong64 LSX [1] and LASX [2] by introducing the new
register sets V0-V31 (C_VREG) and X0-X31 (C_XREG) and 8 new instructions.

On Loong64, VLD, XVLD, VST and XVST implement vector memory access using an immediate
offset. VLDX, XVLDX, VSTX and XVSTX implement vector memory access using a
register offset.

Go asm syntax:
        VMOVQ           n(RJ), RV      (128bit vector load)
        XVMOVQ          n(RJ), RX      (256bit vector load)
        VMOVQ           RV, n(RJ)      (128bit vector store)
        XVMOVQ          RX, n(RJ)      (256bit vector store)

        VMOVQ           (RJ)(RK), RV   (128bit vector load)
        XVMOVQ          (RJ)(RK), RX   (256bit vector load)
        VMOVQ           RV, (RJ)(RK)   (128bit vector store)
        XVMOVQ          RX, (RJ)(RK)   (256bit vector store)

Equivalent platform assembler syntax:
         vld            vd, rj, si12
        xvld            xd, rj, si12
         vst            vd, rj, si12
        xvst            xd, rj, si12
         vldx           vd, rj, rk
        xvldx           xd, rj, rk
         vstx           vd, rj, rk
        xvstx           xd, rj, rk

[1]: LSX: Loongson SIMD Extension, 128bit
[2]: LASX: Loongson Advanced SIMD Extension, 256bit

Change-Id: Ibaf5ddfd29b77670c3c44cc32bead36b2c8b8003
Reviewed-on: https://go-review.googlesource.com/c/go/+/616075
Reviewed-by: Qiqi Huang <huangqiqi@loongson.cn>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
Reviewed-by: Meidan Li <limeidan@loongson.cn>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2024-11-07 02:20:14 +00:00
Xiaolin Zhao
3ae5ff2a27 cmd/asm: add support for loong64 FMA instructions
Add support for assembling the FMA instructions present in the LoongArch
base ISA v1.00. This requires adding a new instruction format and making
use of a third source operand, which is put in RestArgs[0].

The single-precision instructions have the `.s` suffix in their official
mnemonics, and similar Go asm instructions all have the `S` suffix on the
other architectures with FMA support, but in this change they instead
have the `F` suffix in Go asm because loong64 currently follows the mips
backends in its naming convention. This could be changed later because
FMA is fully expressible in pure Go, making it unlikely that anyone has to
hand-write such assembly in the wild.

Example mapping between actual encoding and Go asm syntax:

fmadd.s fd, fj, fk, fa -> FMADDF fa, fk, fj, fd
(prog.From = fa, prog.Reg = fk, prog.RestArgs[0] = fj and prog.To = fd)

fmadd.s fd, fd, fk, fa -> FMADDF fa, fk, fd
(prog.From = fa, prog.Reg = fk and prog.To = fd)
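
The operand reordering in the two mappings above can be sketched in Go. This is only an illustration: `goAsmFMADDF` is a hypothetical helper, not part of cmd/asm, and it assumes the short three-operand form is chosen whenever fj and fd name the same register.

```go
package main

import "fmt"

// goAsmFMADDF is a hypothetical helper (not part of cmd/asm). Given the
// operands of the platform instruction "fmadd.s fd, fj, fk, fa", it renders
// the Go asm form, which lists the operands in reverse order.
func goAsmFMADDF(fd, fj, fk, fa string) string {
	if fj == fd {
		// Short form: fj is omitted when it is the same register as fd.
		return fmt.Sprintf("FMADDF %s, %s, %s", fa, fk, fd)
	}
	return fmt.Sprintf("FMADDF %s, %s, %s, %s", fa, fk, fj, fd)
}

func main() {
	fmt.Println(goAsmFMADDF("F4", "F3", "F2", "F1"))
	fmt.Println(goAsmFMADDF("F4", "F4", "F2", "F1"))
}
```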

This patch is a copy of CL 477716.
Co-authored-by: WANG Xuerui <git@xen0n.name>

Change-Id: I9b4e4c601d6c5a854ee238f085849666e4faf090
Reviewed-on: https://go-review.googlesource.com/c/go/+/623877
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
2024-11-02 01:36:19 +00:00
Xiaolin Zhao
7240c6cb97 cmd/asm: add support for loong64 CRC32 instructions
This patch is a copy of CL 478595.
Co-authored-by: WANG Xuerui <git@xen0n.name>

Change-Id: Ifb6e8183c83a5dfe5dec84e173a74d5de62692a0
Reviewed-on: https://go-review.googlesource.com/c/go/+/623875
Reviewed-by: Carlos Amedee <carlos@golang.org>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-11-01 01:55:14 +00:00
Xiaolin Zhao
3f694f73d0 cmd/asm: add support for the rest of loong64 unary bitops
All remaining unary bitop instructions in the LoongArch v1.00 base ISA
are added with this change.

While at it, add the missing W suffix to the current CLO/CLZ names. They
are not used anywhere as far as we know, so no breakage is expected.
Also, stop reusing SLL's instruction format for simplicity, in favor of
a new but trivial instruction format case.

This patch is a copy of CL 477717.
Co-authored-by: WANG Xuerui <git@xen0n.name>

Change-Id: Idbcaca25dda1ed313674ef8b26da722e8d7151c0
Reviewed-on: https://go-review.googlesource.com/c/go/+/623876
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2024-11-01 01:54:58 +00:00
Xiaolin Zhao
b895dd5630 cmd/internal/obj/loong64: add support for instructions FSCALEB{F/D} and FLOGB{F/D}
Go asm syntax:
	FSCALEB{F/D}	FK, FJ, FD
	FLOGB{F/D}	FJ, FD

Equivalent platform assembler syntax:
	fscaleb.{s/d}	fd, fj, fk
	flogb.{s/d}	fd, fj

Ref: https://loongson.github.io/LoongArch-Documentation/LoongArch-Vol1-EN.html

Change-Id: I6cd75c7605adbb572dae86d6470ec7cf20ce0f6c
Reviewed-on: https://go-review.googlesource.com/c/go/+/612975
Auto-Submit: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Tim King <taking@google.com>
Reviewed-by: Meidan Li <limeidan@loongson.cn>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-09-13 17:05:38 +00:00
Xiaolin Zhao
db07c8607a cmd/internal/obj/loong64: add support for instructions ANDN and ORN
Go asm syntax:
	ANDN/ORN	RK, RJ, RD
    or  ANDN/ORN	RK, RD

Equivalent platform assembler syntax:
	andn/orn	rd, rj, rk
    or  andn/orn	rd, rd, rk

Ref: https://loongson.github.io/LoongArch-Documentation/LoongArch-Vol1-EN.html

Change-Id: I6d240ecae8f9443811ca450aed3574f13f0f4a81
Reviewed-on: https://go-review.googlesource.com/c/go/+/610475
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Commit-Queue: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: abner chenc <chenguoqi@loongson.cn>
2024-09-05 00:48:33 +00:00
Zxilly
d91a2e5b11 cmd: replace many sort.Interface with slices.Sort and SortFunc
With slices, there is no need to implement sort.Interface.
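
As a rough illustration of the kind of rewrite this describes (the `sym` type and `sortBySize` function are made up for the example and do not come from the cmd tree), a slice can be ordered with slices.SortFunc and a comparison function instead of a type that implements Len, Less and Swap:

```go
package main

import (
	"cmp"
	"fmt"
	"slices"
)

// sym is a hypothetical record type used only for this example.
type sym struct {
	name string
	size int64
}

// sortBySize orders syms by size, breaking ties by name. No named slice
// type with Len/Less/Swap methods is required.
func sortBySize(syms []sym) {
	slices.SortFunc(syms, func(a, b sym) int {
		if c := cmp.Compare(a.size, b.size); c != 0 {
			return c
		}
		return cmp.Compare(a.name, b.name)
	})
}

func main() {
	syms := []sym{{"c", 8}, {"a", 16}, {"b", 8}}
	sortBySize(syms)
	fmt.Println(syms)
}
```

Both slices.SortFunc and cmp.Compare require Go 1.21 or later.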

Change-Id: I59167e78881cb1df89a71e33d738d6aeca7adb71
GitHub-Last-Rev: 507ba84453
GitHub-Pull-Request: golang/go#68724
Reviewed-on: https://go-review.googlesource.com/c/go/+/602895
Reviewed-by: Ian Lance Taylor <iant@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Robert Griesemer <gri@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
2024-09-03 20:55:18 +00:00
Xiaolin Zhao
ea08952aa2 cmd/internal/obj/loong64: add support for instructions BSTRPICK.{W/D} and BSTRINS.{W/D}
Go asm syntax:
	BSTRPICK{W/V}	$msb, RJ, $lsb, RD
	BSTRINS{W/V}	$msb, RJ, $lsb, RD

Equivalent platform assembler syntax:
	bstrpick.{w/d}	rd, rj, $msb, $lsb
	bstrins.{w/d}	rd, rj, $msb, $lsb

Ref: https://loongson.github.io/LoongArch-Documentation/LoongArch-Vol1-EN.html

Change-Id: I8b89b766ed22a96da7d8d5b2b2873382a49208de
Reviewed-on: https://go-review.googlesource.com/c/go/+/604735
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
2024-08-23 00:53:08 +00:00
Guoqi Chen
f428c7b729 cmd/internal/obj/loong64: add FLDX, FSTX, LDX, STX instructions support
The LDX.{B,BU,H,HU,W,WU,D}, STX.{B,H,W,D}, FLDX.{S,D} and FSTX.{S,D} instructions
on Loong64 implement memory access operations using a register offset.

Go asm syntax:
	MOV{B,BU,H,HU,W,WU,V}	(RJ)(RK), RD
	MOV{B,H,W,V}		RD, (RJ)(RK)
	MOV{F,D}		(RJ)(RK), FD
	MOV{F,D}		FD, (RJ)(RK)

Equivalent platform assembler syntax:
        ldx.{b,bu,h,hu,w,wu,d}	rd, rj, rk
        stx.{b,h,w,d}		rd, rj, rk
        fldx.{s,d}		fd, rj, rk
        fstx.{s,d}		fd, rj, rk

Ref: https://loongson.github.io/LoongArch-Documentation/LoongArch-Vol1-EN.html

Change-Id: Ic7d13bf45dab8342f034b6469465e6337a087144
Reviewed-on: https://go-review.googlesource.com/c/go/+/588215
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Qiqi Huang <huangqiqi@loongson.cn>
Reviewed-by: Meidan Li <limeidan@loongson.cn>
Auto-Submit: abner chenc <chenguoqi@loongson.cn>
2024-08-03 05:06:40 +00:00
Xiaolin Zhao
3ae819ad1c cmd/internal/obj/loong64: add support for instructions FTINT{RM/RP/RZ/RNE}.{W/L}.{S/D}
These instructions convert floating-point numbers to fixed-point numbers
using the specified rounding mode.

Go asm syntax:
            FTINT{RM/RP/RZ/RNE}{W/V}{F/D}	FJ, FD

Equivalent platform assembler syntax:
            ftint{rm/rp/rz/rne}.{w/l}.{s/d}	fd, fj

Ref: https://loongson.github.io/LoongArch-Documentation/LoongArch-Vol1-EN.html

Change-Id: I6d650d1b48b10296d01a98fadf9d806206f9b96e
Reviewed-on: https://go-review.googlesource.com/c/go/+/590995
Auto-Submit: abner chenc <chenguoqi@loongson.cn>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Meidan Li <limeidan@loongson.cn>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Qiqi Huang <huangqiqi@loongson.cn>
2024-08-03 03:27:43 +00:00
Xiaolin Zhao
4087624473 cmd/internal/obj/loong64: add support for instructions FFINT.{S/D}.{W/L} and FTINT.{W/L}.{S/D}
Go asm syntax:
	FFINT{F/D}{W/V}		FJ, FD
	FTINT{W/V}{F/D}		FJ, FD

Equivalent platform assembler syntax:
	ffint.{s/d}.{w/l}	fd, fj
	ftint.{w/l}.{s/d}	fd, fj

Ref: https://loongson.github.io/LoongArch-Documentation/LoongArch-Vol1-EN.html

Change-Id: Ie7646c5d49645c63b274b34b66539f10370f4930
Reviewed-on: https://go-review.googlesource.com/c/go/+/590996
Reviewed-by: Meidan Li <limeidan@loongson.cn>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Auto-Submit: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: Qiqi Huang <huangqiqi@loongson.cn>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-08-03 02:48:45 +00:00
Xiaolin Zhao
b874005a84 cmd/internal/obj/loong64: add support for instructions FCOPYSIGN.{S/D} and FCLASS.{S/D}
Go asm syntax:
	FCOPYSG{F/D}	FK, FJ, FD
	FCLASS{F/D}	FJ, FD

Equivalent platform assembler syntax:
	fcopysign.{s/d}	fd, fj, fk
	fclass.{s/d}	fd, fj

Ref: https://loongson.github.io/LoongArch-Documentation/LoongArch-Vol1-EN.html

Change-Id: Ied34b71c9d0b34456ac5782a59d29d2d0229e326
Reviewed-on: https://go-review.googlesource.com/c/go/+/590675
Reviewed-by: Cherry Mui <cherryyz@google.com>
Auto-Submit: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-08-03 01:57:52 +00:00
Xiaolin Zhao
e761921688 cmd/internal/obj/loong64: add support for instructions F{MAX/MIN}.{S/D}
Go asm syntax:
	F{MAX/MIN}{F/D}		FK, FJ, FD

Equivalent platform assembler syntax:
	f{max/min}.{s/d}	fd, fj, fk

Ref: https://loongson.github.io/LoongArch-Documentation/LoongArch-Vol1-EN.html

Change-Id: Ib11fed1fe3700be5ebba33b5818661c4071b7b7c
Reviewed-on: https://go-review.googlesource.com/c/go/+/590676
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: abner chenc <chenguoqi@loongson.cn>
2024-08-02 14:33:57 +00:00
Xiaolin Zhao
11dbbaffe1 cmd/internal/obj/loong64: add support for MOV{GR2FCSR/FCSR2GR/FR2CF/CF2FR} instructions
Go asm syntax example:
	MOVV	R4, FCSR0
	MOVV	FCSR1, R5
	MOVV	F4, FCC0
	MOVV	FCC1, F5

Equivalent platform assembler syntax:
	movgr2fcsr	fcsr0, r4
	movfcsr2gr	r5, fcsr1
	movfr2cf	fcc0, f4
	movcf2fr	f5, fcc1

Ref: https://loongson.github.io/LoongArch-Documentation/LoongArch-Vol1-EN.html

This change also merges the cases of the floating-point move instructions
and adds checks for the range of special registers.

Change-Id: Ib08fbce83e7a31dc0ab4857bf9ba959855241d1c
Reviewed-on: https://go-review.googlesource.com/c/go/+/580279
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2024-08-02 00:29:24 +00:00
limeidan
bd85a3b153 cmd/internal/obj/loong64: remove Class C_LEXT and C_SEXT
There is no need to check whether the symbol is empty, since we have already
checked it before. In addition, it is enough to use C_ADDR to represent memory
access, C_LEXT and C_SEXT are not needed.

Change-Id: I7158d6b549482b35cd9ac5fba781648fb3f21922
Reviewed-on: https://go-review.googlesource.com/c/go/+/565615
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: Qiqi Huang <huangqiqi@loongson.cn>
2024-08-01 02:53:30 +00:00
limeidan
01ab9a016a cmd/internal/obj/loong64: optimize the code logic of jump instructions
If p.To.Sym is nil, we can compute the target offset as
p.To.Target().pc - c.pc. Only when p.To.Sym is not nil do we need a relocation
to get the true address of the target symbol.
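
The decision described above can be sketched as follows. The `branch` type and `resolve` function are hypothetical simplifications for illustration; they are not the assembler's actual obj.Prog handling.

```go
package main

import "fmt"

// branch is a hypothetical, simplified stand-in for a jump instruction.
type branch struct {
	pc       int64  // address of the jump instruction itself
	targetPC int64  // address of the local target (used only when sym == "")
	sym      string // target symbol name; empty means a local label
}

// resolve returns the PC-relative byte offset for a local target, or
// reports that a relocation is needed when the target is a symbol.
func resolve(b branch) (offset int64, needsReloc bool) {
	if b.sym == "" {
		return b.targetPC - b.pc, false
	}
	// The symbol's address is not known here; it is left to a relocation.
	return 0, true
}

func main() {
	fmt.Println(resolve(branch{pc: 0x40, targetPC: 0x10}))
	fmt.Println(resolve(branch{pc: 0x40, sym: "runtime.morestack"}))
}
```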

Change-Id: Ied52f675c6aa6e8fb8d972b7699f5cadd1ecb268
Reviewed-on: https://go-review.googlesource.com/c/go/+/565627
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
Reviewed-by: Qiqi Huang <huangqiqi@loongson.cn>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: Meidan Li <limeidan@loongson.cn>
2024-08-01 00:51:41 +00:00
limeidan
864513dda9 cmd/internal/obj/loong64: merge two branch classes into one
When the kind of the operand is TYPE_BRANCH, we cannot determine
whether it is a long branch or a short branch, so we merge these
two classes into one.

Change-Id: I7d7fa8f62ff02791ec3de4e3e3f7610bc9cb1743
Reviewed-on: https://go-review.googlesource.com/c/go/+/565626
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
Reviewed-by: Qiqi Huang <huangqiqi@loongson.cn>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2024-08-01 00:51:17 +00:00
limeidan
ee3da50617 cmd/internal/obj/loong64: reclassify three-register operation instructions and two-register operation instructions
The instructions belonging to case 32 have the same structure as the
instructions in case 2.

The instructions in case 33 are actually two-register operation
instructions. We move their definitions from function oprrr to oprr and
merge their implementation into case 9.

Change-Id: Id04aaa497e78d8198a58f8d406876d16b3f393a7
Reviewed-on: https://go-review.googlesource.com/c/go/+/565616
Reviewed-by: Qiqi Huang <huangqiqi@loongson.cn>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2024-08-01 00:51:04 +00:00
limeidan
b53809d75d cmd/internal/obj/loong64: optimize instruction implementation
The plan9 instructions ASLLV and -ASLLV are translated into the same assembly
instructions, so -ASLLV can be removed and replaced with ASLLV in the
corresponding position.

The same reasoning applies to ASRLV and -ASRLV.

Change-Id: I4bd79ca7bb070f7a924a0205ef2f19cf2b9ae2c9
Reviewed-on: https://go-review.googlesource.com/c/go/+/565623
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: Qiqi Huang <huangqiqi@loongson.cn>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
2024-07-31 15:39:35 +00:00
limeidan
5ca7d4645f cmd/internal/obj/loong64: remove case 17 in func asmout
No optab item refers to case 17, so remove it.

Change-Id: I3ceaa3283c3641afafd46362737ff847a1d80665
Reviewed-on: https://go-review.googlesource.com/c/go/+/565617
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Auto-Submit: abner chenc <chenguoqi@loongson.cn>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Qiqi Huang <huangqiqi@loongson.cn>
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2024-07-31 15:38:21 +00:00
limeidan
0214749fef cmd/internal/obj/loong64: rename Class to represent the external symbol address
There is no need to define a separate C_SECON class to express a short
external symbol address, because the external symbol address is unknown
to the assembler and is relocated by the linker.

Change-Id: Id9fbd848c43ca63a21f2b6640e947140c26eeaf7
Reviewed-on: https://go-review.googlesource.com/c/go/+/565624
Reviewed-by: Qiqi Huang <huangqiqi@loongson.cn>
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: abner chenc <chenguoqi@loongson.cn>
2024-07-31 15:36:31 +00:00
limeidan
8b51146c69 cmd/internal/obj/loong64, cmd/asm: remove useless instructions
Change-Id: I180c40898672a757d72cd0ef38e6e8cc20dc4c3d
Reviewed-on: https://go-review.googlesource.com/c/go/+/565618
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Qiqi Huang <huangqiqi@loongson.cn>
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2024-07-30 00:34:48 +00:00
limeidan
5881c41e7f cmd/internal/obj/loong64: remove useless functions
Change-Id: Ieee97a9477090d4273e54a6667b0a051bb0c1e9d
Reviewed-on: https://go-review.googlesource.com/c/go/+/565619
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Qiqi Huang <huangqiqi@loongson.cn>
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
2024-07-30 00:34:30 +00:00
limeidan
ff0c2d9634 cmd/internal/obj/loong64: fixed operand assignment error for BFPT/BFPF instructions
BFPT corresponds to the BCNEZ instruction of LoongArch64, whose encoding
is:
	| op-p1 | offs[15:0] | op-p2 | cj | offs[20:16] |
The register REG_FCC0 should be assigned to the source operand cj, which is named rj here.

Change-Id: I696d0a46028924da1cd7e240fbb40a1913f1a757
Reviewed-on: https://go-review.googlesource.com/c/go/+/565620
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Qiqi Huang <huangqiqi@loongson.cn>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
2024-07-30 00:34:05 +00:00
limeidan
44663f333b cmd/internal/obj/loong64: return an error when getting address of tls variable
A TLS variable is a thread-local variable. Taking its address
is not supported, so we should return an error here.

Change-Id: Ia6a637f549cb886fdb643bdc04eeb269849d1096
Reviewed-on: https://go-review.googlesource.com/c/go/+/565621
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Qiqi Huang <huangqiqi@loongson.cn>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
2024-07-30 00:33:36 +00:00
limeidan
33b247437f cmd/internal/obj/loong64, cmd/asm: remove invalid optab items
Cases 27 and 28 handle floating-point operations, whereas MOVW is normally
used for integer processing. In addition, both cases contain code like this:
	a := AMOVF
	if p.As == AMOVD {
		a = AMOVD
	}
This means that MOVW was eventually replaced by MOVF, so remove MOVW from cases 27 and 28.

Change-Id: Ib438febab88058e98b569e0dfe70b8610668ee31
Reviewed-on: https://go-review.googlesource.com/c/go/+/565622
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Qiqi Huang <huangqiqi@loongson.cn>
Auto-Submit: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2024-07-29 14:29:30 +00:00
Xiaolin Zhao
f95ae3d689 cmd/asm: change register type for loong64 floating-point
On Loong64, the two input operands and the output operand of the ADDF
instruction are all floating-point registers. For the floating-point
comparison instructions CMPEQ{F,D}, CMPGE{F,D} and CMPGT{F,D}, both input
operands are floating-point registers and the output operand is a
floating-point condition register. Currently, only FCC0 is used as the
floating-point condition register.

Example:
	ADDF	F0, F1, F0
	CMPEQF	F0, F1, FCC0

Change-Id: I4c1c453e522d43f294a8dcab7b6b5247f41c9c68
Reviewed-on: https://go-review.googlesource.com/c/go/+/580281
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Auto-Submit: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-07-29 02:47:00 +00:00
Guoqi Chen
0a9321ad7f cmd/internal/obj/loong64: add CPUCFG instructions support
The CPUCFG instruction is used to dynamically obtain the features
supported by the current CPU during the running of the program.

Go asm syntax:
	CPUCFG RJ, RD

Equivalent platform assembler syntax:
	cpucfg rd, rj

Reference: https://loongson.github.io/LoongArch-Documentation/LoongArch-Vol1-EN.html

Change-Id: I545110ff837ae9c5ccd7c448a1daf2d1277f9aa1
Reviewed-on: https://go-review.googlesource.com/c/go/+/493436
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Qiqi Huang <huangqiqi@loongson.cn>
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-07-27 08:44:18 +00:00
Guoqi Chen
504212bbd7 cmd/internal/obj/loong64: add atomic memory access instructions support
Each AM* atomic access instruction performs a “read-modify-write” sequence
on a memory cell atomically. Specifically, it retrieves the old
value at the specified address in memory and writes it to the general register
rd, performs a simple operation on the old value in memory and the value
in the general register rk, and then writes the result of the operation back
to the memory address pointed to by general register rj.
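
The read-modify-write behaviour described above can be modelled in Go with sync/atomic. This is only a semantic sketch of the 32-bit add and swap variants; `amaddW` and `amswapW` are hypothetical names, and the code says nothing about how the Go runtime or compiler actually uses these instructions.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// amaddW models the add variant: it atomically adds rk to the word at mem
// and returns the old value, which the instruction would place in rd.
func amaddW(mem *int32, rk int32) (rd int32) {
	// AddInt32 returns the new value, so subtract rk to recover the old one.
	return atomic.AddInt32(mem, rk) - rk
}

// amswapW models the swap variant: it atomically stores rk at mem and
// returns the old value.
func amswapW(mem *int32, rk int32) (rd int32) {
	return atomic.SwapInt32(mem, rk)
}

func main() {
	var cell int32 = 10
	fmt.Println(amaddW(&cell, 5), cell)
	fmt.Println(amswapW(&cell, 7), cell)
}
```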

Go asm syntax:
	AM{SWAP/ADD/AND/OR/XOR/MAX/MIN}[DB]{W/V} RK, (RJ), RD
	AM{MAX/MIN}[DB]{WU/VU} RK, (RJ), RD

Equivalent platform assembler syntax:
	am{swap/add/and/or/xor/max/min}[_db].{w/d} rd, rk, rj
	am{max/min}[_db].{wu/du} rd, rk, rj

Ref: https://loongson.github.io/LoongArch-Documentation/LoongArch-Vol1-EN.html

Change-Id: I99ea4553ae731675180d63691c19ef334e7e7817
Reviewed-on: https://go-review.googlesource.com/c/go/+/481577
Reviewed-by: Meidan Li <limeidan@loongson.cn>
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
Reviewed-by: Mauri de Souza Meneguzzo <mauri870@gmail.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Qiqi Huang <huangqiqi@loongson.cn>
Reviewed-by: WANG Xuerui <git@xen0n.name>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2024-05-22 02:04:54 +00:00
Guoqi Chen
9ea4770e14 cmd/internal/obj/loong64: improve the definition of plan9 assembly format in optab
In the three formats corresponding to case 7 of the function asmout, BREAK actually
corresponds to the cacop instruction of Loong64. According to the loong64 instruction
manual volume 1 [1], cacop is a privileged instruction used to
maintain the cache, and user mode does not have permission to execute it.

Referring to the loong64 instruction manual volume 1 [1], the SYSCALL, BREAK and DBAR
instructions have similar formats and can be grouped into one category, the RDTIMED,
RDTIMELW and RDTIMEHW instructions can be grouped into one category, and the NOOP and
UNDEF instructions can be grouped into one category.

[1]: https://loongson.github.io/LoongArch-Documentation/LoongArch-Vol1-EN.html

Change-Id: I0b8998270102d1557fc2b2410cf8c0b078bd0c2e
Reviewed-on: https://go-review.googlesource.com/c/go/+/493435
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Meidan Li <limeidan@loongson.cn>
Reviewed-by: Carlos Amedee <carlos@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: Qiqi Huang <huangqiqi@loongson.cn>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Auto-Submit: Cherry Mui <cherryyz@google.com>
2024-05-13 15:52:19 +00:00
Guoqi Chen
f7f56ded01 cmd/internal/obj/loong64: recheck jump offset boundary after auto-aligning loop heads
After the alignment of the loop header is performed, the offset of the checked
conditional branch instruction may overflow, so it needs to be checked again.

When checking whether the offset of the branch jump instruction overflows, it
can be classified and processed according to the range of the immediate field
of the specific instruction, which can reduce the introduction of unnecessary
jump instructions.
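
A per-instruction range check of this kind might look like the sketch below. `fitsBranch` is a hypothetical helper, not the assembler's code. The field widths in the comments (16, 21 and 26 bits, counted in 4-byte instruction units) are an assumption based on a reading of the LoongArch manual and should be checked against it.

```go
package main

import "fmt"

// fitsBranch reports whether a byte offset can be encoded in a branch whose
// signed immediate field is immBits wide and counts 4-byte instructions.
// Assumed widths: 16 for two-register compare-and-branch forms, 21 for the
// compare-with-zero forms, 26 for unconditional jumps.
func fitsBranch(offset int64, immBits uint) bool {
	if offset%4 != 0 {
		return false // branch targets are instruction-aligned
	}
	words := offset / 4
	limit := int64(1) << (immBits - 1)
	return words >= -limit && words < limit
}

func main() {
	fmt.Println(fitsBranch(131068, 16)) // largest forward offset for 16 bits
	fmt.Println(fitsBranch(131072, 16)) // one instruction too far
	fmt.Println(fitsBranch(131072, 21)) // fits in the wider field
}
```

Checking against the actual field width, rather than the narrowest one, is what avoids inserting extra jumps for branches that already fit.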

Fixes #61819

Change-Id: I772a5b5b8b8de21c78d7566be30be8ff65fdbce8
Reviewed-on: https://go-review.googlesource.com/c/go/+/519915
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Meidan Li <limeidan@loongson.cn>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Run-TryBot: qiu laidongfeng2 <2645477756@qq.com>
Reviewed-by: WANG Xuerui <git@xen0n.name>
2024-04-15 17:39:37 +00:00
limeidan
0584574797 cmd/internal, cmd/link: unify the relocation naming style of loong64
Change-Id: I2990701e71a63af7bdd6851b6008dc63cb1c1a83
Reviewed-on: https://go-review.googlesource.com/c/go/+/535616
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2024-02-27 17:26:07 +00:00
Guoqi Chen
346e06c46d cmd/internal/obj,cmd/link: access global data via GOT in -dynlink mode on loong64
Updates #58784

Change-Id: Ic98d10a512fea0c3ca321ab52693d9f6775126a6
Reviewed-on: https://go-review.googlesource.com/c/go/+/480875
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Meidan Li <limeidan@loongson.cn>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: WANG Xuerui <git@xen0n.name>
Reviewed-by: WANG Xuerui <git@xen0n.name>
2023-11-21 17:49:21 +00:00
cui fliter
36b14a78b5 cmd: fix mismatched symbols
Change-Id: I6365cdf22ad5e669908519d0ee8b78d76ae8f1b9
Reviewed-on: https://go-review.googlesource.com/c/go/+/532075
Reviewed-by: Than McIntosh <thanm@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Run-TryBot: shuang cui <imcusg@gmail.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
2023-10-03 12:57:25 +00:00
limeidan
1f908bd060 cmd/internal/obj/loong64, cmd/internal/objabi, cmd/link: add support for --buildmode=c-shared on loong64
Updates #53301
Updates #58784

Change-Id: Ifcb40871f609531dfd8b568db9ac14da9b451742
Reviewed-on: https://go-review.googlesource.com/c/go/+/425476
Reviewed-by: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Benny Siegert <bsiegert@gmail.com>
Auto-Submit: Ian Lance Taylor <iant@golang.org>
Run-TryBot: WANG Xuerui <git@xen0n.name>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: WANG Xuerui <git@xen0n.name>
2023-04-11 16:54:53 +00:00
WANG Xuerui
f736a9ad01 cmd/internal/obj/loong64: auto-align loop heads to 16-byte boundaries
CL 479816 took care of loops in hand-written assembly, but did not
account for those written in Go, which may become performance-sensitive
as well.

In this patch, all loop heads are automatically identified and aligned
to 16-byte boundaries, by inserting a synthetic `PCALIGN $16` before
them. "Loop heads" are defined as targets of backward branches.

While at it, tweak some of the local comments so the flow is hopefully
clearer.

Because LoongArch instructions are all 32 bits long, at most 3 NOOPs
can be inserted for each target Prog. This may sound excessive, but
benchmark results indicate the current approach is overall profitable
anyway.
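
The bound of 3 NOOPs follows from simple arithmetic: with 4-byte instructions, a PC can be at most 12 bytes short of the next 16-byte boundary. A minimal sketch, where `nopsToAlign` is a hypothetical helper and pc is assumed to be a non-negative multiple of 4:

```go
package main

import "fmt"

// nopsToAlign returns how many 4-byte NOOPs are needed to advance pc to the
// next 16-byte boundary. pc is assumed non-negative and 4-byte aligned.
func nopsToAlign(pc int64) int {
	const align, insnSize = 16, 4
	return int((align - pc%align) % align / insnSize)
}

func main() {
	for _, pc := range []int64{0, 4, 8, 12, 16} {
		fmt.Println(pc, nopsToAlign(pc))
	}
}
```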

Benchmark results on Loongson 3A5000 (LA464):

goos: linux
goarch: loong64
pkg: test/bench/go1
                      │  CL 479816  │              this CL               │
                      │   sec/op    │   sec/op     vs base               │
BinaryTree17             14.10 ± 1%    14.06 ± 1%       ~ (p=0.280 n=10)
Fannkuch11               3.579 ± 0%    3.419 ± 0%  -4.45% (p=0.000 n=10)
FmtFprintfEmpty         94.73n ± 0%   94.44n ± 0%  -0.31% (p=0.000 n=10)
FmtFprintfString        151.9n ± 0%   149.1n ± 0%  -1.84% (p=0.000 n=10)
FmtFprintfInt           158.3n ± 0%   155.2n ± 0%  -1.96% (p=0.000 n=10)
FmtFprintfIntInt        241.4n ± 0%   235.4n ± 0%  -2.49% (p=0.000 n=10)
FmtFprintfPrefixedInt   320.2n ± 0%   314.7n ± 0%  -1.73% (p=0.000 n=10)
FmtFprintfFloat         414.3n ± 0%   398.7n ± 0%  -3.77% (p=0.000 n=10)
FmtManyArgs             949.9n ± 0%   929.8n ± 0%  -2.12% (p=0.000 n=10)
GobDecode               15.24m ± 0%   15.30m ± 0%  +0.38% (p=0.035 n=10)
GobEncode               18.10m ± 2%   17.59m ± 1%  -2.81% (p=0.002 n=10)
Gzip                    429.9m ± 0%   421.5m ± 0%  -1.97% (p=0.000 n=10)
Gunzip                  88.31m ± 0%   87.39m ± 0%  -1.04% (p=0.000 n=10)
HTTPClientServer        85.71µ ± 0%   87.24µ ± 0%  +1.79% (p=0.000 n=10)
JSONEncode              19.74m ± 0%   18.55m ± 0%  -6.00% (p=0.000 n=10)
JSONDecode              78.60m ± 1%   77.93m ± 0%  -0.84% (p=0.000 n=10)
Mandelbrot200           7.208m ± 0%   7.217m ± 0%       ~ (p=0.481 n=10)
GoParse                 7.616m ± 1%   7.630m ± 2%       ~ (p=0.796 n=10)
RegexpMatchEasy0_32     133.0n ± 0%   134.1n ± 0%  +0.83% (p=0.000 n=10)
RegexpMatchEasy0_1K     1.362µ ± 0%   1.364µ ± 0%  +0.15% (p=0.000 n=10)
RegexpMatchEasy1_32     161.8n ± 0%   163.7n ± 0%  +1.17% (p=0.000 n=10)
RegexpMatchEasy1_1K     1.497µ ± 0%   1.497µ ± 0%       ~ (p=1.000 n=10)
RegexpMatchMedium_32    1.420µ ± 0%   1.446µ ± 0%  +1.83% (p=0.000 n=10)
RegexpMatchMedium_1K    42.25µ ± 0%   42.53µ ± 0%  +0.65% (p=0.000 n=10)
RegexpMatchHard_32      2.108µ ± 0%   2.116µ ± 0%  +0.38% (p=0.000 n=10)
RegexpMatchHard_1K      62.65µ ± 0%   63.23µ ± 0%  +0.93% (p=0.000 n=10)
Revcomp                  1.192 ± 0%    1.198 ± 0%  +0.55% (p=0.000 n=10)
Template                115.6m ± 2%   116.9m ± 1%       ~ (p=0.075 n=10)
TimeParse               418.1n ± 1%   414.7n ± 0%  -0.81% (p=0.000 n=10)
TimeFormat              517.9n ± 0%   513.7n ± 0%  -0.81% (p=0.000 n=10)
geomean                 103.5µ        102.6µ       -0.79%

                     │  CL 479816   │               this CL               │
                     │     B/s      │     B/s       vs base               │
GobDecode              48.04Mi ± 0%   47.86Mi ± 0%  -0.38% (p=0.035 n=10)
GobEncode              40.44Mi ± 2%   41.61Mi ± 1%  +2.89% (p=0.001 n=10)
Gzip                   43.04Mi ± 0%   43.91Mi ± 0%  +2.02% (p=0.000 n=10)
Gunzip                 209.6Mi ± 0%   211.8Mi ± 0%  +1.05% (p=0.000 n=10)
JSONEncode             93.76Mi ± 0%   99.75Mi ± 0%  +6.39% (p=0.000 n=10)
JSONDecode             23.55Mi ± 1%   23.75Mi ± 0%  +0.85% (p=0.000 n=10)
GoParse                7.253Mi ± 1%   7.238Mi ± 2%       ~ (p=0.698 n=10)
RegexpMatchEasy0_32    229.4Mi ± 0%   227.6Mi ± 0%  -0.82% (p=0.000 n=10)
RegexpMatchEasy0_1K    717.3Mi ± 0%   716.2Mi ± 0%  -0.15% (p=0.000 n=10)
RegexpMatchEasy1_32    188.6Mi ± 0%   186.4Mi ± 0%  -1.13% (p=0.000 n=10)
RegexpMatchEasy1_1K    652.2Mi ± 0%   652.3Mi ± 0%  +0.01% (p=0.005 n=10)
RegexpMatchMedium_32   21.49Mi ± 0%   21.11Mi ± 0%  -1.73% (p=0.000 n=10)
RegexpMatchMedium_1K   23.11Mi ± 0%   22.96Mi ± 0%  -0.62% (p=0.000 n=10)
RegexpMatchHard_32     14.48Mi ± 0%   14.42Mi ± 0%  -0.40% (p=0.000 n=10)
RegexpMatchHard_1K     15.59Mi ± 0%   15.44Mi ± 0%  -0.98% (p=0.000 n=10)
Revcomp                203.4Mi ± 0%   202.3Mi ± 0%  -0.55% (p=0.000 n=10)
Template               16.00Mi ± 2%   15.83Mi ± 1%       ~ (p=0.078 n=10)
geomean                60.72Mi        60.89Mi       +0.29%

The slight regression on the Regexp cases is likely because the previous
numbers were just coincidental: indeed, large regressions or improvements
(of roughly ±10%) happen with clearly unrelated changes during
development. This CL should (hopefully) bring such random performance
fluctuations down a bit.

Change-Id: I8bdda6e65336da00d4ad79650937b3eeb9db0e7c
Reviewed-on: https://go-review.googlesource.com/c/go/+/479817
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: David Chase <drchase@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: WANG Xuerui <git@xen0n.name>
2023-04-11 08:19:45 +00:00
WANG Xuerui
a3dd959229 cmd/internal/obj/loong64, cmd/link/internal: switch to LoongArch ELF psABI v2 relocs
The LoongArch ELF psABI v2 [1] relocs are vastly simplified compared to
v1, which involved a stack machine for computing the reloc values, but the
details of PC-relative addressing are changed as well. Specifically, the
`pcaddu12i` instruction is replaced with `pcalau12i`, which is
like arm64's `adrp` -- meaning the lower bits of a symbol's address now
have to be absolute and not PC-relative.

However, apart from the little bit of added complexity, the obvious
advantage is that only 1 reloc needs to be emitted for every kind of
external reloc we care about. This can mean substantial space savings
(each RELA reloc occupies 24 bytes), and no open-coded stack ops have to
remain any more.
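
To make the 24-byte figure concrete, here is a small sketch that derives the size of a 64-bit RELA entry from the standard library's `debug/elf.Rela64` type and estimates the savings. The `saved` helper and the counts passed to it are hypothetical; the number of relocs the v1 scheme needed per reference is not stated here, so it is left as a parameter.

```go
package main

import (
	"debug/elf"
	"encoding/binary"
	"fmt"
)

// relaSize is the encoded size of one 64-bit RELA entry:
// an 8-byte offset, an 8-byte info word and an 8-byte addend.
var relaSize = binary.Size(elf.Rela64{})

// saved estimates the bytes saved when n external references each go
// from perRef reloc entries down to a single entry. perRef is an
// assumption supplied by the caller, not a figure from the psABI.
func saved(n, perRef int) int {
	return n * (perRef - 1) * relaSize
}

func main() {
	fmt.Println("bytes per RELA entry:", relaSize)
	fmt.Println("bytes saved for 1000 refs at 3 entries each:", saved(1000, 3))
}
```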

While at it, update the preset value for the output ELF's flags to
indicate the psABI update.

Fixes #58784

[1]: https://loongson.github.io/LoongArch-Documentation/LoongArch-ELF-ABI-EN.html

Change-Id: I5c13bc710eaf58293a32e930dd33feff2ef14c28
Reviewed-on: https://go-review.googlesource.com/c/go/+/455017
Run-TryBot: Ben Shi <powerman1st@163.com>
Reviewed-by: xiaodong liu <teaofmoli@gmail.com>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Reviewed-by: Meidan Li <limeidan@loongson.cn>
Auto-Submit: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
2023-04-10 15:50:11 +00:00
WANG Xuerui
47b22b6548 cmd/link, cmd/internal/obj/loong64: support the PCALIGN directive
Allow writing `PCALIGN $imm` where imm is a power-of-2 between 8 and
2048 (inclusive), for ensuring that the following instruction is
placed at an imm-byte boundary relative to the beginning of the
function. If the PC is not sufficiently aligned, NOOPs will be
inserted to make it so; otherwise the directive does nothing.
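
The constraint and the padding rule described above can be sketched as follows. This is only an illustration under the stated limits (power of 2, 8 to 2048 inclusive); `checkPCAlign` and `padBytes` are hypothetical names, not functions from the assembler.

```go
package main

import (
	"errors"
	"fmt"
)

// checkPCAlign validates the PCALIGN immediate: it must be a power of 2
// between 8 and 2048 (inclusive).
func checkPCAlign(imm int64) error {
	if imm < 8 || imm > 2048 || imm&(imm-1) != 0 {
		return errors.New("alignment must be a power of 2 between 8 and 2048")
	}
	return nil
}

// padBytes returns how many bytes of NOOP padding are needed so that pc
// (an offset from the start of the function) reaches the next imm-byte
// boundary. It returns 0 when pc is already aligned.
func padBytes(pc, imm int64) int64 {
	return -pc & (imm - 1)
}

func main() {
	for _, imm := range []int64{4, 16, 24, 2048, 4096} {
		fmt.Printf("PCALIGN $%d: %v\n", imm, checkPCAlign(imm))
	}
	// An instruction at offset 36 needs 12 bytes (3 NOOPs) to reach 48.
	fmt.Println(padBytes(36, 16))
}
```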

This could be useful for both asm performance hand-tuning, and future
scenarios where a certain bigger alignment might be required.

Change-Id: Iad6244669a3d5adea88eceb0dc7be1af4f0d4fc9
Reviewed-on: https://go-review.googlesource.com/c/go/+/479815
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Run-TryBot: WANG Xuerui <git@xen0n.name>
Auto-Submit: Ian Lance Taylor <iant@golang.org>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2023-04-07 20:20:25 +00:00
WANG Xuerui
c054f223e7 cmd/internal/obj/loong64: remove Optab.family and reorganize operand class fields
There is currently no support for GOARCH=loong32, so the Optab.family
field is unused so far. Remove it to simplify the optab; the loong
assembler backend would likely already be overhauled into a sufficiently
different shape by the time we start to care for loong32, that the data
we have today would be useless anyway.

While at it, add an operand class slot for the 3rd source operand
(support for which will arrive in later commits), and rename the other
operand class fields to be self-documenting. The changes are being
merged into this patch for the sake of reducing code churn.

Change-Id: Icf0988e34ff1c0f762c8e0708cfcef2e7954760c
Reviewed-on: https://go-review.googlesource.com/c/go/+/477715
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Run-TryBot: Ben Shi <powerman1st@163.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Auto-Submit: Wayne Zuo <wdvxdr@golangcn.org>
2023-03-31 02:57:51 +00:00
WANG Xuerui
22f9317f20 cmd/internal/obj/loong64: assemble BEQ/BNEs comparing with 0 as beqz/bnez
LoongArch (except for the extremely reduced LA32 Primary subset) has
dedicated beqz/bnez instructions as alternative encodings for beq/bne
with one of the source registers being R0, that allow the offset field
to occupy 5 more bits, giving 21 bits in total (equal to the FP
branches). Make use of them instead of beq/bne if one source operand is
omitted in asm, or if one of the registers being compared is R0.
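
The practical effect of the wider field is a longer branch range. The sketch below assumes, per the LoongArch ISA manual rather than this commit, that the offset field counts 4-byte instructions; `fitsBranch` is a hypothetical helper, not the assembler's own range check.

```go
package main

import "fmt"

// fitsBranch reports whether a byte offset can be encoded in a signed
// offset field of the given width, assuming the field counts 4-byte
// instructions (so the byte offset must be a multiple of 4).
func fitsBranch(offset int64, bits uint) bool {
	if offset%4 != 0 {
		return false
	}
	words := offset >> 2
	limit := int64(1) << (bits - 1)
	return words >= -limit && words < limit
}

func main() {
	const off = 1 << 20 // a 1 MiB forward branch

	fmt.Println("16-bit field (beq/bne):  ", fitsBranch(off, 16)) // false
	fmt.Println("21-bit field (beqz/bnez):", fitsBranch(off, 21)) // true
}
```

Under that assumption a 16-bit field reaches roughly ±128 KiB, while a 21-bit field reaches roughly ±4 MiB.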

Multiple go1 benchmark runs indicate the change is not perf-sensitive.

Change-Id: If6267623c82092e81d75578091fb4e013658b9f3
Reviewed-on: https://go-review.googlesource.com/c/go/+/478377
Reviewed-by: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Run-TryBot: Ben Shi <powerman1st@163.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Auto-Submit: Wayne Zuo <wdvxdr@golangcn.org>
2023-03-31 02:56:19 +00:00
WANG Xuerui
1ae306a5be cmd/internal/obj/loong64: clean up code for short conditional branches
Untangle the logic so the preparation of operands and actual assembling
(branch range checking included) are properly separated, making future
changes easier to review and maintain. No functional change intended.

Change-Id: I1f73282f9d92ff23d84846453d3597ba66d207d1
Reviewed-on: https://go-review.googlesource.com/c/go/+/478376
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Auto-Submit: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Cherry Mui <cherryyz@google.com>
Reviewed-by: Heschi Kreinick <heschi@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-03-27 21:45:39 +00:00
WANG Xuerui
09f1ddb158 cmd/internal/obj/loong64: realize all unconditional jumps with B/BL
The current practice of using the "PC-relative" `BEQ ZERO, ZERO` for
short jumps is inherited from the MIPS port, where the pre-R6 long
jumps are PC-regional instead of PC-relative. LoongArch has never had
this quirk, so there is no reason to keep the behavior any more.
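
For readers unfamiliar with the distinction, the following sketch contrasts the two ways of computing a jump target. The MIPS formula (top 4 bits of PC+4 combined with a 26-bit word index) and the 26-bit LoongArch `B` offset come from the respective ISA manuals, not from this commit; both function names are hypothetical.

```go
package main

import "fmt"

// pcRegionalTarget models a pre-R6 MIPS32 `J`: the target keeps the top
// 4 bits of PC+4 and takes the rest from a 26-bit word index, so the
// jump can only land inside the current 256 MiB region.
func pcRegionalTarget(pc, index26 uint32) uint32 {
	return ((pc + 4) & 0xf0000000) | (index26&0x03ffffff)<<2
}

// pcRelativeTarget models a LoongArch-style `B`: the target is the PC
// plus a signed word offset, independent of where the PC happens to be.
func pcRelativeTarget(pc uint64, offsWords int32) uint64 {
	return uint64(int64(pc) + int64(offsWords)<<2)
}

func main() {
	// The same encoded index resolves to different distances depending
	// on which region the jump instruction itself sits in.
	fmt.Printf("%#x\n", pcRegionalTarget(0x00001000, 1))
	fmt.Printf("%#x\n", pcRegionalTarget(0x10001000, 1))

	// A PC-relative jump always moves by the same distance.
	fmt.Printf("%#x\n", pcRelativeTarget(0x1000, -1))
}
```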

While at it, simplify the code to not place anything in the jump offset
field if a relocation is to take place. (It may be a relic of a previous
REL-era treatment where the addend is to be stored in the instruction
word, but again, loong64 is exclusively RELA from day 1 so no point in
doing so either.)

Benchmarks show a very slight improvement on a 3A5000 box, suggesting
that the LA464 micro-architecture presumably does *not* treat the
always-true BEQs as equivalent to B:

goos: linux
goarch: loong64
pkg: test/bench/go1
                      │  2ef70d9d0f  │                this CL                │
                      │    sec/op    │    sec/op     vs base                 │
BinaryTree17             14.57 ±  4%    14.54 ±  1%       ~ (p=0.353 n=10)
Fannkuch11               3.570 ±  0%    3.570 ±  0%       ~ (p=0.529 n=10)
FmtFprintfEmpty         92.84n ±  0%   92.84n ±  0%       ~ (p=0.970 n=10)
FmtFprintfString        150.0n ±  0%   149.9n ±  0%       ~ (p=0.350 n=10)
FmtFprintfInt           153.3n ±  0%   153.3n ±  0%       ~ (p=1.000 n=10) ¹
FmtFprintfIntInt        235.8n ±  0%   235.8n ±  0%       ~ (p=0.963 n=10)
FmtFprintfPrefixedInt   318.5n ±  0%   318.5n ±  0%       ~ (p=0.474 n=10)
FmtFprintfFloat         410.4n ±  0%   410.4n ±  0%       ~ (p=0.628 n=10)
FmtManyArgs             944.9n ±  0%   945.0n ±  0%       ~ (p=0.240 n=10)
GobDecode               13.97m ± 12%   12.83m ± 21%       ~ (p=0.165 n=10)
GobEncode               17.84m ±  5%   18.60m ±  4%       ~ (p=0.123 n=10)
Gzip                    421.0m ±  0%   421.0m ±  0%       ~ (p=0.579 n=10)
Gunzip                  89.80m ±  0%   89.77m ±  0%       ~ (p=0.529 n=10)
HTTPClientServer        86.54µ ±  1%   86.25µ ±  0%  -0.33% (p=0.003 n=10)
JSONEncode              18.57m ±  0%   18.57m ±  0%       ~ (p=0.353 n=10)
JSONDecode              77.48m ±  0%   77.30m ±  0%  -0.23% (p=0.035 n=10)
Mandelbrot200           7.217m ±  0%   7.217m ±  0%       ~ (p=0.436 n=10)
GoParse                 7.599m ±  2%   7.632m ±  1%       ~ (p=0.353 n=10)
RegexpMatchEasy0_32     140.1n ±  0%   140.1n ±  0%       ~ (p=0.582 n=10)
RegexpMatchEasy0_1K     1.538µ ±  0%   1.538µ ±  0%       ~ (p=1.000 n=10) ¹
RegexpMatchEasy1_32     161.7n ±  0%   161.7n ±  0%       ~ (p=1.000 n=10) ¹
RegexpMatchEasy1_1K     1.632µ ±  0%   1.632µ ±  0%       ~ (p=1.000 n=10) ¹
RegexpMatchMedium_32    1.369µ ±  0%   1.369µ ±  0%       ~ (p=1.000 n=10)
RegexpMatchMedium_1K    39.96µ ±  0%   39.96µ ±  0%  +0.01% (p=0.010 n=10)
RegexpMatchHard_32      2.099µ ±  0%   2.099µ ±  0%       ~ (p=1.000 n=10) ¹
RegexpMatchHard_1K      62.50µ ±  0%   62.50µ ±  0%       ~ (p=0.099 n=10)
Revcomp                  1.349 ±  0%    1.347 ±  0%  -0.14% (p=0.001 n=10)
Template                118.4m ±  0%   118.0m ±  0%  -0.36% (p=0.023 n=10)
TimeParse               407.8n ±  0%   407.9n ±  0%  +0.02% (p=0.000 n=10)
TimeFormat              508.0n ±  0%   507.9n ±  0%       ~ (p=0.421 n=10)
geomean                 103.5µ         103.3µ        -0.17%
¹ all samples are equal

                     │  2ef70d9d0f   │                this CL                 │
                     │      B/s      │      B/s       vs base                 │
GobDecode              52.67Mi ± 11%   57.04Mi ± 17%       ~ (p=0.149 n=10)
GobEncode              41.03Mi ±  4%   39.35Mi ±  4%       ~ (p=0.118 n=10)
Gzip                   43.95Mi ±  0%   43.95Mi ±  0%       ~ (p=0.428 n=10)
Gunzip                 206.1Mi ±  0%   206.1Mi ±  0%       ~ (p=0.399 n=10)
JSONEncode             99.64Mi ±  0%   99.66Mi ±  0%       ~ (p=0.304 n=10)
JSONDecode             23.88Mi ±  0%   23.94Mi ±  0%  +0.22% (p=0.030 n=10)
GoParse                7.267Mi ±  2%   7.238Mi ±  1%       ~ (p=0.360 n=10)
RegexpMatchEasy0_32    217.8Mi ±  0%   217.8Mi ±  0%  -0.00% (p=0.006 n=10)
RegexpMatchEasy0_1K    635.0Mi ±  0%   635.0Mi ±  0%       ~ (p=0.194 n=10)
RegexpMatchEasy1_32    188.7Mi ±  0%   188.7Mi ±  0%       ~ (p=0.338 n=10)
RegexpMatchEasy1_1K    598.5Mi ±  0%   598.5Mi ±  0%  -0.00% (p=0.000 n=10)
RegexpMatchMedium_32   22.30Mi ±  0%   22.30Mi ±  0%       ~ (p=0.211 n=10)
RegexpMatchMedium_1K   24.43Mi ±  0%   24.43Mi ±  0%       ~ (p=1.000 n=10)
RegexpMatchHard_32     14.54Mi ±  0%   14.54Mi ±  0%       ~ (p=0.474 n=10)
RegexpMatchHard_1K     15.62Mi ±  0%   15.62Mi ±  0%       ~ (p=1.000 n=10) ¹
Revcomp                179.7Mi ±  0%   180.0Mi ±  0%  +0.14% (p=0.001 n=10)
Template               15.63Mi ±  0%   15.68Mi ±  0%  +0.34% (p=0.022 n=10)
geomean                60.29Mi         60.44Mi        +0.24%
¹ all samples are equal

Change-Id: I112dd663c49567386ea75dd4966a9f8127ffb90e
Reviewed-on: https://go-review.googlesource.com/c/go/+/478075
Run-TryBot: Ian Lance Taylor <iant@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Heschi Kreinick <heschi@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-03-22 18:50:59 +00:00
Huang Qiqi
2ef70d9d0f cmd/internal/obj/loong64: add support for movgr2cf and movcf2gr instructions
Change-Id: I7ff3c8df24ed7990fe104bc2530354c0bd5fe018
Reviewed-on: https://go-review.googlesource.com/c/go/+/475576
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Cherry Mui <cherryyz@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: xiaodong liu <teaofmoli@gmail.com>
Reviewed-by: WANG Xuerui <git@xen0n.name>
2023-03-21 06:53:28 +00:00
WANG Xuerui
b4ac4b4b42 cmd/internal/obj/loong64: add the PCALAU12I instruction for reloc use
The LoongArch ELF psABI v2.00 revamped the relocation design, largely
moving to using the `pcalau12i + addi/ld/st` pair for PC-relative
addressing within +/- 32 bits. The "pcala" in `pcalau12i` stands for
"PC-aligned add"; the instruction's semantics happen to coincide with
arm64's `adrp`.
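
A minimal sketch of how such a pair addresses a symbol, assuming the usual psABI v2 split (a page delta rounded with +0x800 for the high part, the low 12 bits sign-extended by `addi`). `pcalaSplit` and `resolve` are hypothetical helpers for illustration, not code from the assembler or linker.

```go
package main

import "fmt"

// pcalaSplit splits the address of target, as seen from pc, into the
// 20-bit page delta for pcalau12i and the 12-bit immediate for a
// following addi. Because addi sign-extends its immediate, the page
// delta is computed from target+0x800 to compensate.
func pcalaSplit(pc, target uint64) (hi20, lo12 int64) {
	lo12 = int64(target & 0xfff)
	if lo12 >= 0x800 {
		lo12 -= 0x1000 // value as seen after sign extension
	}
	pageDelta := int64((target+0x800)&^0xfff) - int64(pc&^0xfff)
	return pageDelta >> 12, lo12
}

// resolve replays what the instruction pair computes: pcalau12i clears
// the low 12 bits of the PC and adds hi20<<12, then addi adds lo12.
func resolve(pc uint64, hi20, lo12 int64) uint64 {
	return uint64(int64(pc&^0xfff) + hi20<<12 + lo12)
}

func main() {
	pc, target := uint64(0x12345678), uint64(0x12349abc)
	hi, lo := pcalaSplit(pc, target)
	fmt.Printf("hi20=%d lo12=%d resolved=%#x\n", hi, lo, resolve(pc, hi, lo))
}
```

Note that the low 12 bits depend only on the target address, not on the PC, which is what makes the lower part absolute rather than PC-relative.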

Add support for emitting this instruction as part of the relevant
addressing ops, for use with new reloc types later.

Updates #58784

Change-Id: Ic1747cd9745aad0d1abb9bd78400cd5ff5978bc8
Reviewed-on: https://go-review.googlesource.com/c/go/+/455016
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Auto-Submit: Wayne Zuo <wdvxdr@golangcn.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Run-TryBot: Wayne Zuo <wdvxdr@golangcn.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Wayne Zuo <wdvxdr@golangcn.org>
Reviewed-by: xiaodong liu <teaofmoli@gmail.com>
Reviewed-by: Meidan Li <limeidan@loongson.cn>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-03-16 17:24:42 +00:00
Guoqi Chen
b561ebab46 cmd/internal/obj/loong64: remove invalid branch delay slots
Change-Id: I222717771019f7aefa547971b2d94ef4677a42c9
Reviewed-on: https://go-review.googlesource.com/c/go/+/420979
Reviewed-by: WANG Xuerui <git@xen0n.name>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Meidan Li <limeidan@loongson.cn>
Run-TryBot: hopehook <hopehook@golangcn.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: xiaodong liu <teaofmoli@gmail.com>
2023-03-13 14:16:39 +00:00
Guoqi Chen
bf8d142b4e cmd/asm: add RDTIME{L,H}.W, RDTIME.D support for loong64
Instruction formats: rdtime rd, rj

The RDTIME family of instructions is used to read constant-frequency timer
information: the stable counter value is written into general register
rd, and the counter ID is written into general register rj.
(Note: both register operands are outputs.)

Ref: https://loongson.github.io/LoongArch-Documentation/LoongArch-Vol1-EN.html

Change-Id: Ida5bbb28316ef70b5f616dac3e6fa6f2e77875b5
Reviewed-on: https://go-review.googlesource.com/c/go/+/421655
Reviewed-by: xiaodong liu <teaofmoli@gmail.com>
Reviewed-by: WANG Xuerui <git@xen0n.name>
Reviewed-by: Wayne Zuo <wdvxdr@golangcn.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Wayne Zuo <wdvxdr@golangcn.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Meidan Li <limeidan@loongson.cn>
2023-02-06 13:49:53 +00:00
cui fliter
b2faff18ce all: add missing periods in comments
Change-Id: I69065f8adf101fdb28682c55997f503013a50e29
Reviewed-on: https://go-review.googlesource.com/c/go/+/449757
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Joedian Reid <joedian@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Joedian Reid <joedian@golang.org>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
2022-11-18 17:59:44 +00:00
Wayne Zuo
7d574466a9 cmd/internal/obj/loong64: add ROTR, ROTRV instructions support
Reference: https://loongson.github.io/LoongArch-Documentation/LoongArch-Vol1-EN.html

Change-Id: I29adb84eb70bffd963c79ed6957a5197896fb2bf
Reviewed-on: https://go-review.googlesource.com/c/go/+/422316
Run-TryBot: Wayne Zuo <wdvxdr@golangcn.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2022-08-30 03:21:06 +00:00
Wayne Zuo
1dcef7b3bd cmd/internal/obj/loong64: add MASKEQZ and MASKNEZ instructions support
Change-Id: Ied16c3be47c863a94d46bd568191057ded4b7d0a
Reviewed-on: https://go-review.googlesource.com/c/go/+/416734
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Wayne Zuo <wdvxdr@golangcn.org>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: xiaodong liu <teaofmoli@gmail.com>
2022-08-23 23:17:55 +00:00
Xiaodong Liu
c1105cfd43 cmd/internal/obj{,/loong64}: instructions and registers for loong64
Implement an assembler for LoongArch64 (loong64 for short). This
provides register definitions and instruction encodings as
defined in the LoongArch Instruction Set Manual.

LoongArch Instruction Set Manual:
  https://github.com/loongson/LoongArch-Documentation/releases

Contributors to the linux/loong64 port are:
  Weining Lu <luweining@loongson.cn>
  Lei Wang <wanglei@loongson.cn>
  Lingqin Gong <gonglingqin@loongson.cn>
  Xiaolin Zhao <zhaoxiaolin@loongson.cn>
  Meidan Li <limeidan@loongson.cn>
  Xiaojuan Zhai <zhaixiaojuan@loongson.cn>
  Qiyuan Pu <puqiyuan@loongson.cn>
  Guoqi Chen <chenguoqi@loongson.cn>

This port has been updated to Go 1.15.6:
  https://github.com/loongson/go

Updates #46229

Change-Id: I930d2a19246496e3ca36d55539183c0f9f650ad9
Reviewed-on: https://go-review.googlesource.com/c/go/+/342309
Reviewed-by: David Chase <drchase@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Run-TryBot: Ian Lance Taylor <iant@google.com>
2022-05-11 20:11:34 +00:00