mirror of
https://github.com/golang/go.git
synced 2025-12-08 06:10:04 +00:00
cmd/compile: add fma intrinsic for amd64
To permit SSA-level optimization, this change introduces an amd64 intrinsic
that generates the VFMADD231SD instruction for the fused multiply-add
operation on systems that support it. System support is detected via
cpu.X86.HasFMA. A rewrite rule then translates the generic SSA intrinsic
("Fma") to VFMADD231SD.
The benchmark compares the software implementation (old) with the intrinsic
(new).
name   old time/op  new time/op  delta
Fma-4  27.2ns ± 1%   1.0ns ± 9%  -96.48%  (p=0.008 n=5+5)
Updates #25819.
Change-Id: I966655e5f96817a5d06dff5942418a3915b09584
Reviewed-on: https://go-review.googlesource.com/c/go/+/137156
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
parent 50f4896b72
commit 7a6da218b1
12 changed files with 85 additions and 0 deletions
@@ -743,6 +743,7 @@ const (
 	OpAMD64POPCNTL
 	OpAMD64SQRTSD
 	OpAMD64ROUNDSD
+	OpAMD64VFMADD231SD
 	OpAMD64SBBQcarrymask
 	OpAMD64SBBLcarrymask
 	OpAMD64SETEQ
@@ -9625,6 +9626,22 @@ var opcodeTable = [...]opInfo{
 			},
 		},
 	},
+	{
+		name:         "VFMADD231SD",
+		argLen:       3,
+		resultInArg0: true,
+		asm:          x86.AVFMADD231SD,
+		reg: regInfo{
+			inputs: []inputInfo{
+				{0, 4294901760}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15
+				{1, 4294901760}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15
+				{2, 4294901760}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15
+			},
+			outputs: []outputInfo{
+				{0, 4294901760}, // X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15
+			},
+		},
+	},
 	{
 		name:   "SBBQcarrymask",
 		argLen: 1,
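The register masks in the new opcodeTable entry are easier to read in hexadecimal: 4294901760 is 0xFFFF0000, a bitmask with 16 consecutive bits set. The sketch below decodes such a mask into register names. It assumes that bit 16 corresponds to X0 and bit 31 to X15, which matches the X0 through X15 comment in the diff but is not spelled out there; fpMask and xmmRegs are hypothetical names used only for this example.

```go
package main

import (
	"fmt"
	"math/bits"
)

// fpMask is the mask value that appears in the VFMADD231SD entry.
const fpMask uint64 = 4294901760 // 0xFFFF0000

// xmmRegs lists the registers selected by mask, assuming (for this
// sketch) that bit 16 is X0, bit 17 is X1, and so on up to X15.
func xmmRegs(mask uint64) []string {
	var regs []string
	for mask != 0 {
		i := bits.TrailingZeros64(mask) // index of the lowest set bit
		regs = append(regs, fmt.Sprintf("X%d", i-16))
		mask &^= 1 << uint(i) // clear that bit and continue
	}
	return regs
}

func main() {
	fmt.Printf("%#x\n", fpMask)
	fmt.Println(xmmRegs(fpMask))
}
```

Read this way, the entry says that all three inputs and the single output may live in any of the sixteen floating-point registers, while resultInArg0 additionally ties the output to the register holding argument 0.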