// Copyright 2015 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.

package ssa

import (
	"cmd/compile/internal/abi"
	"cmd/compile/internal/base"
	"cmd/compile/internal/ir"
	"cmd/compile/internal/types"
	"cmd/internal/obj"
	"cmd/internal/src"
)

// A Config holds readonly compilation information.
// It is created once, early during compilation,
// and shared across all compilations.
type Config struct {
	arch           string // "amd64", etc.
	PtrSize        int64  // 4 or 8; copy of cmd/internal/sys.Arch.PtrSize
	RegSize        int64  // 4 or 8; copy of cmd/internal/sys.Arch.RegSize
	Types          Types
	lowerBlock     blockRewriter // block lowering function, first round
	lowerValue     valueRewriter // value lowering function, first round
	lateLowerBlock blockRewriter // block lowering function that needs to be run after the first round; only used on some architectures
	lateLowerValue valueRewriter // value lowering function that needs to be run after the first round; only used on some architectures
	splitLoad      valueRewriter // function for splitting merged load ops; only used on some architectures
	registers      []Register    // machine registers
	gpRegMask      regMask       // general purpose integer register mask
	fpRegMask      regMask       // floating point register mask
	fp32RegMask    regMask       // 32-bit floating point register mask
	fp64RegMask    regMask       // 64-bit floating point register mask
	specialRegMask regMask       // special register mask
	intParamRegs   []int8        // register numbers of integer param (in/out) registers
	floatParamRegs []int8        // register numbers of floating param (in/out) registers
	ABI1           *abi.ABIConfig // "ABIInternal" under development // TODO change comment when this becomes current
	ABI0           *abi.ABIConfig
	FPReg          int8      // register number of frame pointer, -1 if not used
	LinkReg        int8      // register number of link register if it is a general purpose register, -1 if not used
	hasGReg        bool      // has hardware g register
	ctxt           *obj.Link // Generic arch information
	optimize       bool      // Do optimization
	useAvg         bool      // Use optimizations that need Avg* operations
	useHmul        bool      // Use optimizations that need Hmul* operations
	SoftFloat      bool      // emit software floating point instead of hardware FP instructions
	Race           bool      // race detector enabled
	BigEndian      bool      // target byte order is big-endian
	unalignedOK    bool      // Unaligned loads/stores are ok
	haveBswap64    bool      // architecture implements Bswap64
	haveBswap32    bool      // architecture implements Bswap32
	haveBswap16    bool      // architecture implements Bswap16

	// mulRecipes[x] = function to build v * x from v.
	mulRecipes map[int64]mulRecipe
}

// A mulRecipe describes how to build v * x from v using one or two cheap
// linear-combination instructions, together with the cost of doing so.
type mulRecipe struct {
	cost  int                         // cost in the same arbitrary units used by buildRecipes
	build func(*Value, *Value) *Value // build(m, v) returns v * x built at m.
}
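
// For example, on amd64 buildRecipes registers LEAQ4 as the linear combination
// x + 4*y with cost 10, so mulRecipes[5] ends up holding a single-instruction
// recipe roughly equivalent to
//
//	m.Block.NewValue2(m.Pos, OpAMD64LEAQ4, m.Type, v, v) // v + 4*v == 5*v
//
// which is well under the imul-based cutoff (mulCost = 30) used there.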

type (
	blockRewriter func(*Block) bool
	valueRewriter func(*Value) bool
)

type Types struct {
	Bool       *types.Type
	Int8       *types.Type
	Int16      *types.Type
	Int32      *types.Type
	Int64      *types.Type
	UInt8      *types.Type
	UInt16     *types.Type
	UInt32     *types.Type
	UInt64     *types.Type
	Int        *types.Type
	Float32    *types.Type
	Float64    *types.Type
	UInt       *types.Type
	Uintptr    *types.Type
	String     *types.Type
	BytePtr    *types.Type // TODO: use unsafe.Pointer instead?
	Int32Ptr   *types.Type
	UInt32Ptr  *types.Type
	IntPtr     *types.Type
	UintptrPtr *types.Type
	Float32Ptr *types.Type
	Float64Ptr *types.Type
	BytePtrPtr *types.Type
}
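
// Code that has a *Config in hand reaches these through the Types field; the
// generated rewrite files conventionally use a shorthand like the following
// (illustrative only):
//
//	typ := &b.Func.Config.Types
//	v := b.NewValue0(pos, OpConst32, typ.Int32)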

// NewTypes creates and populates a Types.
func NewTypes() *Types {
	t := new(Types)
	t.SetTypPtrs()
	return t
}

// SetTypPtrs populates t.
func (t *Types) SetTypPtrs() {
	t.Bool = types.Types[types.TBOOL]
	t.Int8 = types.Types[types.TINT8]
	t.Int16 = types.Types[types.TINT16]
	t.Int32 = types.Types[types.TINT32]
	t.Int64 = types.Types[types.TINT64]
	t.UInt8 = types.Types[types.TUINT8]
	t.UInt16 = types.Types[types.TUINT16]
	t.UInt32 = types.Types[types.TUINT32]
	t.UInt64 = types.Types[types.TUINT64]
	t.Int = types.Types[types.TINT]
	t.Float32 = types.Types[types.TFLOAT32]
	t.Float64 = types.Types[types.TFLOAT64]
	t.UInt = types.Types[types.TUINT]
	t.Uintptr = types.Types[types.TUINTPTR]
	t.String = types.Types[types.TSTRING]
	t.BytePtr = types.NewPtr(types.Types[types.TUINT8])
	t.Int32Ptr = types.NewPtr(types.Types[types.TINT32])
	t.UInt32Ptr = types.NewPtr(types.Types[types.TUINT32])
	t.IntPtr = types.NewPtr(types.Types[types.TINT])
	t.UintptrPtr = types.NewPtr(types.Types[types.TUINTPTR])
	t.Float32Ptr = types.NewPtr(types.Types[types.TFLOAT32])
	t.Float64Ptr = types.NewPtr(types.Types[types.TFLOAT64])
	t.BytePtrPtr = types.NewPtr(types.NewPtr(types.Types[types.TUINT8]))
}

// A Logger is the diagnostic and debug-logging interface the SSA backend
// expects from the rest of the compiler.
type Logger interface {
	// Logf logs a message from the compiler.
	Logf(string, ...interface{})

	// Log reports whether logging is not a no-op;
	// some logging calls account for more than a few heap allocations.
	Log() bool

	// Fatalf reports a compiler error and exits.
	Fatalf(pos src.XPos, msg string, args ...interface{})

	// Warnl writes compiler messages in the form expected by "errorcheck" tests.
	Warnl(pos src.XPos, fmt_ string, args ...interface{})

	// Debug_checknil forwards the Debug flags from gc.
	Debug_checknil() bool
}

// A Frontend provides the SSA backend with information about the function
// being compiled and with services from the rest of the compiler.
type Frontend interface {
	Logger

	// StringData returns a symbol pointing to the given string's contents.
	StringData(string) *obj.LSym

	// SplitSlot is given the name of a compound-typed slot and returns the
	// name we should use for the parts of that compound type.
	SplitSlot(parent *LocalSlot, suffix string, offset int64, t *types.Type) LocalSlot

	// Syslook returns a symbol of the runtime function/variable with the
	// given name.
	Syslook(string) *obj.LSym

	// UseWriteBarrier reports whether write barrier is enabled.
	UseWriteBarrier() bool

	// Func returns the ir.Func of the function being compiled.
	Func() *ir.Func
}

// NewConfig returns a new configuration object for the given architecture.
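// The caller supplies the architecture name, the shared Types table, the
// linker context, and the optimize/softfloat flags. A sketch of a call, with
// illustrative (not the compiler's actual) flag values:
//
//	c := NewConfig("amd64", *NewTypes(), ctxt, true, false)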
func NewConfig(arch string, types Types, ctxt *obj.Link, optimize, softfloat bool) *Config {
	c := &Config{arch: arch, Types: types}
	c.useAvg = true
	c.useHmul = true
	switch arch {
	case "amd64":
		c.PtrSize = 8
		c.RegSize = 8
		c.lowerBlock = rewriteBlockAMD64
		c.lowerValue = rewriteValueAMD64
		c.lateLowerBlock = rewriteBlockAMD64latelower
		c.lateLowerValue = rewriteValueAMD64latelower
		c.splitLoad = rewriteValueAMD64splitload
		c.registers = registersAMD64[:]
		c.gpRegMask = gpRegMaskAMD64
		c.fpRegMask = fpRegMaskAMD64
		c.specialRegMask = specialRegMaskAMD64
		c.intParamRegs = paramIntRegAMD64
		c.floatParamRegs = paramFloatRegAMD64
		c.FPReg = framepointerRegAMD64
		c.LinkReg = linkRegAMD64
		c.hasGReg = true
		c.unalignedOK = true
		c.haveBswap64 = true
		c.haveBswap32 = true
		c.haveBswap16 = true
	case "386":
		c.PtrSize = 4
		c.RegSize = 4
		c.lowerBlock = rewriteBlock386
		c.lowerValue = rewriteValue386
		c.splitLoad = rewriteValue386splitload
		c.registers = registers386[:]
		c.gpRegMask = gpRegMask386
		c.fpRegMask = fpRegMask386
		c.FPReg = framepointerReg386
		c.LinkReg = linkReg386
		c.hasGReg = false
		c.unalignedOK = true
		c.haveBswap32 = true
		c.haveBswap16 = true
	case "arm":
		c.PtrSize = 4
		c.RegSize = 4
		c.lowerBlock = rewriteBlockARM
		c.lowerValue = rewriteValueARM
		c.registers = registersARM[:]
		c.gpRegMask = gpRegMaskARM
		c.fpRegMask = fpRegMaskARM
		c.FPReg = framepointerRegARM
		c.LinkReg = linkRegARM
		c.hasGReg = true
	case "arm64":
		c.PtrSize = 8
		c.RegSize = 8
		c.lowerBlock = rewriteBlockARM64
		c.lowerValue = rewriteValueARM64
		c.lateLowerBlock = rewriteBlockARM64latelower
		c.lateLowerValue = rewriteValueARM64latelower
		c.registers = registersARM64[:]
		c.gpRegMask = gpRegMaskARM64
		c.fpRegMask = fpRegMaskARM64
		c.intParamRegs = paramIntRegARM64
		c.floatParamRegs = paramFloatRegARM64
		c.FPReg = framepointerRegARM64
		c.LinkReg = linkRegARM64
		c.hasGReg = true
		c.unalignedOK = true
		c.haveBswap64 = true
		c.haveBswap32 = true
		c.haveBswap16 = true
	case "ppc64":
		c.BigEndian = true
		fallthrough
	case "ppc64le":
		c.PtrSize = 8
		c.RegSize = 8
		c.lowerBlock = rewriteBlockPPC64
		c.lowerValue = rewriteValuePPC64
		c.lateLowerBlock = rewriteBlockPPC64latelower
		c.lateLowerValue = rewriteValuePPC64latelower
		c.registers = registersPPC64[:]
		c.gpRegMask = gpRegMaskPPC64
		c.fpRegMask = fpRegMaskPPC64
		c.specialRegMask = specialRegMaskPPC64
		c.intParamRegs = paramIntRegPPC64
		c.floatParamRegs = paramFloatRegPPC64
		c.FPReg = framepointerRegPPC64
		c.LinkReg = linkRegPPC64
		c.hasGReg = true
		c.unalignedOK = true
		// Note: ppc64 has register bswap ops only when GOPPC64>=10.
		// But it has bswap+load and bswap+store ops for all ppc64 variants.
		// That is the sense we're using them here - they are only used
		// in contexts where they can be merged with a load or store.
		c.haveBswap64 = true
		c.haveBswap32 = true
		c.haveBswap16 = true
	case "mips64":
		c.BigEndian = true
		fallthrough
	case "mips64le":
		c.PtrSize = 8
		c.RegSize = 8
		c.lowerBlock = rewriteBlockMIPS64
		c.lowerValue = rewriteValueMIPS64
		c.registers = registersMIPS64[:]
		c.gpRegMask = gpRegMaskMIPS64
		c.fpRegMask = fpRegMaskMIPS64
		c.specialRegMask = specialRegMaskMIPS64
		c.FPReg = framepointerRegMIPS64
		c.LinkReg = linkRegMIPS64
		c.hasGReg = true
	case "loong64":
		c.PtrSize = 8
		c.RegSize = 8
		c.lowerBlock = rewriteBlockLOONG64
		c.lowerValue = rewriteValueLOONG64
		c.registers = registersLOONG64[:]
		c.gpRegMask = gpRegMaskLOONG64
		c.fpRegMask = fpRegMaskLOONG64
		c.intParamRegs = paramIntRegLOONG64
		c.floatParamRegs = paramFloatRegLOONG64
		c.FPReg = framepointerRegLOONG64
		c.LinkReg = linkRegLOONG64
		c.hasGReg = true
		c.unalignedOK = true
	case "s390x":
		c.PtrSize = 8
		c.RegSize = 8
		c.lowerBlock = rewriteBlockS390X
		c.lowerValue = rewriteValueS390X
		c.registers = registersS390X[:]
		c.gpRegMask = gpRegMaskS390X
		c.fpRegMask = fpRegMaskS390X
		c.FPReg = framepointerRegS390X
		c.LinkReg = linkRegS390X
		c.hasGReg = true
		c.BigEndian = true
		c.unalignedOK = true
		c.haveBswap64 = true
		c.haveBswap32 = true
		c.haveBswap16 = true // only for loads&stores, see ppc64 comment
	case "mips":
		c.BigEndian = true
		fallthrough
	case "mipsle":
		c.PtrSize = 4
		c.RegSize = 4
		c.lowerBlock = rewriteBlockMIPS
		c.lowerValue = rewriteValueMIPS
		c.registers = registersMIPS[:]
		c.gpRegMask = gpRegMaskMIPS
		c.fpRegMask = fpRegMaskMIPS
		c.specialRegMask = specialRegMaskMIPS
		c.FPReg = framepointerRegMIPS
		c.LinkReg = linkRegMIPS
		c.hasGReg = true
	case "riscv64":
		c.PtrSize = 8
		c.RegSize = 8
		c.lowerBlock = rewriteBlockRISCV64
		c.lowerValue = rewriteValueRISCV64
		c.lateLowerBlock = rewriteBlockRISCV64latelower
		c.lateLowerValue = rewriteValueRISCV64latelower
		c.registers = registersRISCV64[:]
		c.gpRegMask = gpRegMaskRISCV64
		c.fpRegMask = fpRegMaskRISCV64
		c.intParamRegs = paramIntRegRISCV64
		c.floatParamRegs = paramFloatRegRISCV64
		c.FPReg = framepointerRegRISCV64
		c.hasGReg = true
	case "wasm":
		c.PtrSize = 8
		c.RegSize = 8
		c.lowerBlock = rewriteBlockWasm
		c.lowerValue = rewriteValueWasm
		c.registers = registersWasm[:]
		c.gpRegMask = gpRegMaskWasm
		c.fpRegMask = fpRegMaskWasm
		c.fp32RegMask = fp32RegMaskWasm
		c.fp64RegMask = fp64RegMaskWasm
		c.FPReg = framepointerRegWasm
		c.LinkReg = linkRegWasm
		c.hasGReg = true
		c.useAvg = false
		c.useHmul = false
	default:
		ctxt.Diag("arch %s not implemented", arch)
	}
	c.ctxt = ctxt
	c.optimize = optimize
	c.SoftFloat = softfloat
	if softfloat {
		c.floatParamRegs = nil // no FP registers in softfloat mode
	}
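
	// Describe the two ABIs: ABI0 is the stack-only ABI (no parameter
	// registers), while ABI1 ("ABIInternal") passes parameters in the
	// integer/float registers selected above.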
	c.ABI0 = abi.NewABIConfig(0, 0, ctxt.Arch.FixedFrameSize, 0)
	c.ABI1 = abi.NewABIConfig(len(c.intParamRegs), len(c.floatParamRegs), ctxt.Arch.FixedFrameSize, 1)

	if ctxt.Flag_shared {
		// LoweredWB is secretly a CALL and CALLs on 386 in
		// shared mode get rewritten by obj6.go to go through
		// the GOT, which clobbers BX.
		opcodeTable[Op386LoweredWB].reg.clobbers |= 1 << 3 // BX
	}

	c.buildRecipes(arch)

	return c
}

func (c *Config) Ctxt() *obj.Link { return c.ctxt }
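
// haveByteSwap reports whether the target can byte-swap a value of the given
// size in bytes, based on the haveBswap* flags set in NewConfig above. For
// example, c.haveByteSwap(4) is simply c.haveBswap32.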
func (c *Config) haveByteSwap(size int64) bool {
	switch size {
	case 8:
		return c.haveBswap64
	case 4:
		return c.haveBswap32
	case 2:
		return c.haveBswap16
	default:
		base.Fatalf("bad size %d\n", size)
		return false
	}
}

func (c *Config) buildRecipes(arch string) {
	// Information for strength-reducing multiplies.
	type linearCombo struct {
		// we can compute a*x+b*y in one instruction
		a, b int64
		// cost, in arbitrary units (tenths of cycles, usually)
		cost int
		// builds SSA value for a*x+b*y. Use the position
		// information from m.
		build func(m, x, y *Value) *Value
	}
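
	// For instance, on amd64 the LEAQ2 entry registered below is the combo
	// {a: 1, b: 2, cost: 10}: it computes x + 2*y in one instruction.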

	// List all the linear combination instructions we have.
	var linearCombos []linearCombo
	r := func(a, b int64, cost int, build func(m, x, y *Value) *Value) {
		linearCombos = append(linearCombos, linearCombo{a: a, b: b, cost: cost, build: build})
	}
	var mulCost int
	switch arch {
	case "amd64":
		// Assumes the following costs, from https://gmplib.org/~tege/x86-timing.pdf:
		//   1 - addq, shlq, leaq, negq, subq
		//   3 - imulq
		// These costs limit the rewrites to two instructions.
		// Operations which have to happen in place (and thus
		// may require a reg-reg move) score slightly higher.
		mulCost = 30
		// add
		r(1, 1, 10,
			func(m, x, y *Value) *Value {
				v := m.Block.NewValue2(m.Pos, OpAMD64ADDQ, m.Type, x, y)
				if m.Type.Size() == 4 {
					v.Op = OpAMD64ADDL
				}
				return v
			})
		// neg
		r(-1, 0, 11,
			func(m, x, y *Value) *Value {
				v := m.Block.NewValue1(m.Pos, OpAMD64NEGQ, m.Type, x)
				if m.Type.Size() == 4 {
					v.Op = OpAMD64NEGL
				}
				return v
			})
		// sub
		r(1, -1, 11,
			func(m, x, y *Value) *Value {
				v := m.Block.NewValue2(m.Pos, OpAMD64SUBQ, m.Type, x, y)
				if m.Type.Size() == 4 {
					v.Op = OpAMD64SUBL
				}
				return v
			})
		// lea
		r(1, 2, 10,
			func(m, x, y *Value) *Value {
				v := m.Block.NewValue2(m.Pos, OpAMD64LEAQ2, m.Type, x, y)
				if m.Type.Size() == 4 {
					v.Op = OpAMD64LEAL2
				}
				return v
			})
		r(1, 4, 10,
			func(m, x, y *Value) *Value {
				v := m.Block.NewValue2(m.Pos, OpAMD64LEAQ4, m.Type, x, y)
				if m.Type.Size() == 4 {
					v.Op = OpAMD64LEAL4
				}
				return v
			})
		r(1, 8, 10,
			func(m, x, y *Value) *Value {
				v := m.Block.NewValue2(m.Pos, OpAMD64LEAQ8, m.Type, x, y)
				if m.Type.Size() == 4 {
					v.Op = OpAMD64LEAL8
				}
				return v
			})
		// regular shifts
		for i := 2; i < 64; i++ {
			r(1<<i, 0, 11,
				func(m, x, y *Value) *Value {
					v := m.Block.NewValue1I(m.Pos, OpAMD64SHLQconst, m.Type, int64(i), x)
					if m.Type.Size() == 4 {
						v.Op = OpAMD64SHLLconst
					}
					return v
				})
		}
case "arm64":
|
|
|
|
|
// Rationale (for M2 ultra):
|
|
|
|
|
// - multiply is 3 cycles.
|
|
|
|
|
// - add/neg/sub/shift are 1 cycle.
|
|
|
|
|
// - add/neg/sub+shiftLL are 2 cycles.
|
|
|
|
|
// We break ties against the multiply because using a
|
|
|
|
|
// multiply also needs to load the constant into a register.
|
|
|
|
|
// (It's 3 cycles and 2 instructions either way, but the
|
|
|
|
|
// linear combo one might use 1 less register.)
|
|
|
|
|
// The multiply constant might get lifted out of a loop though. Hmm....
|
|
|
|
|
// Other arm64 chips have different tradeoffs.
|
|
|
|
|
// Some chip's add+shift instructions are 1 cycle for shifts up to 4
|
|
|
|
|
// and 2 cycles for shifts bigger than 4. So weight the larger shifts
|
|
|
|
|
// a bit more.
|
|
|
|
|
// TODO: figure out a happy medium.
|
|
|
|
|
mulCost = 35
|
|
|
|
|
// add
|
|
|
|
|
r(1, 1, 10,
|
|
|
|
|
func(m, x, y *Value) *Value {
|
|
|
|
|
return m.Block.NewValue2(m.Pos, OpARM64ADD, m.Type, x, y)
|
|
|
|
|
})
|
|
|
|
|
// neg
|
|
|
|
|
r(-1, 0, 10,
|
|
|
|
|
func(m, x, y *Value) *Value {
|
|
|
|
|
return m.Block.NewValue1(m.Pos, OpARM64NEG, m.Type, x)
|
|
|
|
|
})
|
|
|
|
|
// sub
|
|
|
|
|
r(1, -1, 10,
|
|
|
|
|
func(m, x, y *Value) *Value {
|
|
|
|
|
return m.Block.NewValue2(m.Pos, OpARM64SUB, m.Type, x, y)
|
|
|
|
|
})
|
|
|
|
|
// regular shifts
|
|
|
|
|
for i := 1; i < 64; i++ {
|
|
|
|
|
c := 10
|
|
|
|
|
if i == 1 {
|
|
|
|
|
// Prefer x<<1 over x+x.
|
|
|
|
|
// Note that we eventually reverse this decision in ARM64latelower.rules,
|
|
|
|
|
// but this makes shift combining rules in ARM64.rules simpler.
|
|
|
|
|
c--
|
|
|
|
|
}
|
|
|
|
|
r(1<<i, 0, c,
|
|
|
|
|
func(m, x, y *Value) *Value {
|
|
|
|
|
return m.Block.NewValue1I(m.Pos, OpARM64SLLconst, m.Type, int64(i), x)
|
|
|
|
|
})
|
|
|
|
|
}
|
|
|
|
|
// ADDshiftLL
|
|
|
|
|
for i := 1; i < 64; i++ {
|
|
|
|
|
c := 20
|
|
|
|
|
if i > 4 {
|
|
|
|
|
c++
|
|
|
|
|
}
|
|
|
|
|
r(1, 1<<i, c,
|
|
|
|
|
func(m, x, y *Value) *Value {
|
|
|
|
|
return m.Block.NewValue2I(m.Pos, OpARM64ADDshiftLL, m.Type, int64(i), x, y)
|
|
|
|
|
})
|
|
|
|
|
}
|
|
|
|
|
// NEGshiftLL
|
|
|
|
|
for i := 1; i < 64; i++ {
|
|
|
|
|
c := 20
|
|
|
|
|
if i > 4 {
|
|
|
|
|
c++
|
|
|
|
|
}
|
|
|
|
|
r(-1<<i, 0, c,
|
|
|
|
|
func(m, x, y *Value) *Value {
|
|
|
|
|
return m.Block.NewValue1I(m.Pos, OpARM64NEGshiftLL, m.Type, int64(i), x)
|
|
|
|
|
})
|
|
|
|
|
}
|
|
|
|
|
// SUBshiftLL
|
|
|
|
|
for i := 1; i < 64; i++ {
|
|
|
|
|
c := 20
|
|
|
|
|
if i > 4 {
|
|
|
|
|
c++
|
|
|
|
|
}
|
|
|
|
|
r(1, -1<<i, c,
|
|
|
|
|
func(m, x, y *Value) *Value {
|
|
|
|
|
return m.Block.NewValue2I(m.Pos, OpARM64SUBshiftLL, m.Type, int64(i), x, y)
|
|
|
|
|
})
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
c.mulRecipes = map[int64]mulRecipe{}
|
|
|
|
|
|
|
|
|
|
// Single-instruction recipes.
|
|
|
|
|
// The only option for the input value(s) is v.
|
|
|
|
|
for _, combo := range linearCombos {
|
|
|
|
|
x := combo.a + combo.b
|
|
|
|
|
cost := combo.cost
|
|
|
|
|
old := c.mulRecipes[x]
|
|
|
|
|
if (old.build == nil || cost < old.cost) && cost < mulCost {
|
|
|
|
|
c.mulRecipes[x] = mulRecipe{cost: cost, build: func(m, v *Value) *Value {
|
|
|
|
|
return combo.build(m, v, v)
|
|
|
|
|
}}
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
// Two-instruction recipes.
|
|
|
|
|
// A: Both of the outer's inputs are from the same single-instruction recipe.
|
|
|
|
|
// B: First input is v and the second is from a single-instruction recipe.
|
|
|
|
|
// C: Second input is v and the first is from a single-instruction recipe.
|
|
|
|
|
// A is slightly preferred because it often needs 1 less register, so it
|
|
|
|
|
// goes first.
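	// For example, on amd64 45*v can be built with form A: the inner recipe
	// computes t = 5*v via LEAQ4 (cost 10) and the outer one computes 9*t via
	// LEAQ8 (cost 10); the total cost of 20 beats mulCost (30), so
	// mulRecipes[45] becomes a two-instruction recipe.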

	// A
	for _, inner := range linearCombos {
		for _, outer := range linearCombos {
			x := (inner.a + inner.b) * (outer.a + outer.b)
			cost := inner.cost + outer.cost
			old := c.mulRecipes[x]
			if (old.build == nil || cost < old.cost) && cost < mulCost {
				c.mulRecipes[x] = mulRecipe{cost: cost, build: func(m, v *Value) *Value {
					v = inner.build(m, v, v)
					return outer.build(m, v, v)
				}}
			}
		}
	}

	// B
	for _, inner := range linearCombos {
		for _, outer := range linearCombos {
			x := outer.a + outer.b*(inner.a+inner.b)
			cost := inner.cost + outer.cost
			old := c.mulRecipes[x]
			if (old.build == nil || cost < old.cost) && cost < mulCost {
				c.mulRecipes[x] = mulRecipe{cost: cost, build: func(m, v *Value) *Value {
					return outer.build(m, v, inner.build(m, v, v))
				}}
			}
		}
	}

	// C
	for _, inner := range linearCombos {
		for _, outer := range linearCombos {
			x := outer.a*(inner.a+inner.b) + outer.b
			cost := inner.cost + outer.cost
			old := c.mulRecipes[x]
			if (old.build == nil || cost < old.cost) && cost < mulCost {
				c.mulRecipes[x] = mulRecipe{cost: cost, build: func(m, v *Value) *Value {
					return outer.build(m, inner.build(m, v, v), v)
				}}
			}
		}
	}

	// These cases should be handled specially by rewrite rules.
	// (Otherwise v * 1 == (neg (neg v)))
	delete(c.mulRecipes, 0)
	delete(c.mulRecipes, 1)

	// Currently we assume that it doesn't help to do 3 linear
	// combination instructions.

	// Currently:
	//   len(c.mulRecipes) == 5984 on arm64
	//                         680 on amd64
	// This function takes ~2.5ms on arm64.
	//println(len(c.mulRecipes))
}
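
// A hypothetical use of the table built above (illustrative only; this file
// does not contain the lookup, which lives with the multiplication rewrite
// rules):
//
//	if recipe, ok := c.mulRecipes[27]; ok {
//		product := recipe.build(m, v) // v*27 in at most two cheap instructions
//		_ = product
//	}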