The goal of this change is to move work from walk to SSA,
and simplify things along the way.

This is hard to accomplish cleanly with small incremental changes,
so this large commit message aims to provide a roadmap to the diff.

High level description:

Prior to this change, walk was responsible for constructing (most of)
the stack for function calls.

ascompatte gathered variadic arguments into a slice.
It also rewrote n.List from a list of arguments
to a list of assignments to stack slots.
ascompatte was called multiple times to handle the receiver
in a method call.

reorder1 then introduced temporaries into n.List as needed
to avoid smashing the stack.

adjustargs then made extra stack space for go/defer args as needed.

Node-to-SSA construction evaluated all the statements in n.List,
and issued the function call, assuming that the stack
was correctly constructed.

Intrinsic calls had to dig around inside n.List to extract the
arguments, since intrinsics don't use the stack to make function calls.

This change moves stack construction to the SSA construction phase.

ascompatte, now called walkParams, does all the work that ascompatte
and reorder1 did. It handles variadic arguments, inserts the method
receiver if needed, and allocates temporaries. It does not, however,
make any assignments to stack slots. Instead, it moves the function
arguments to n.Rlist, leaving assignments to temporaries in n.List.
(It would be better to use Ninit instead of List; future work.)

During SSA construction, after doing all the temporary assignments
in n.List, the function arguments are assigned to stack slots by
constructing the appropriate SSA Value, using (*state).storeArg.
SSA construction also now handles adjustments for go/defer args.

This change also simplifies intrinsic calls, since we no longer
need to undo walk's work.

Along the way, we simplify nodarg by pushing the fp==1 case to its
callers, where it fits nicely.

Generated code differences:

Under the old scheme, a few optimizations were applied along the way.
f(g()) was rewritten to do a block copy of function results to
function arguments, and reorder1 avoided introducing the final
"save the stack" temporary in n.List.

The f(g()) block copy optimization never actually triggered;
the order pass rewrote away g(), so it has been removed.

SSA optimizations mostly obviated the need for reorder1's trick of
avoiding the final temporary. The exception was when the temporary's
type was not SSA-able; in that case, we got a Move into an autotmp
followed by an immediate Move onto the stack, with the autotmp never
read or used again. This change introduces a new rewrite rule to
detect such pointless double Moves and collapse them into a single
Move. This is actually more powerful than the original optimization,
since the original optimization relied on the imprecise Node.HasCall
calculation.
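
To make that rewrite concrete, here is a minimal sketch in the shape
of a generic SSA rewrite helper. The name collapseDoubleMove and the
simplified side conditions are illustrative assumptions, not the rule
as committed:

	// collapseDoubleMove matches v = (Move dst tmp m) where
	// m = (Move tmp src m2): a Move into a temporary that is
	// immediately Moved onward. (Sketch only; hypothetical name,
	// simplified conditions.)
	func collapseDoubleMove(v *Value) bool {
		if v.Op != OpMove {
			return false
		}
		tmp, m := v.Args[1], v.Args[2]
		if m.Op != OpMove || m.Args[0] != tmp || m.AuxInt != v.AuxInt {
			return false
		}
		if tmp.Uses != 2 { // tmp must be the inner dst and the outer src, nothing else
			return false
		}
		// The committed rule must also exclude aliasing among
		// src, tmp, and dst across the intervening memory state.
		v.SetArg(1, m.Args[1]) // copy directly from the original source
		v.SetArg(2, m.Args[2]) // thread through m's memory input
		return true
	}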
The other significant difference in the generated code is that the
stack is now constructed completely in SP-offset order. Prior to this
change, the stack was constructed somewhat haphazardly: first the
final argument that Node.HasCall deemed to require a temporary, then
other arguments, then the method receiver, then the defer/go args.
SP-offset is probably a good default order. See future work.

There are a few minor object file size changes as a result of this
change. I investigated some regressions in early versions of this
change.

One regression (in archive/tar) was the addition of a single CMPQ
instruction, which would be eliminated were this TODO from flagalloc
to be done:

	// TODO: Remove original instructions if they are never used.

One regression (in text/template) was an ADDQconstmodify that is now
a regular MOVQLoad+ADDQconst+MOVQStore, due to an unlucky change in
the order in which arguments are written. The argument write order
can also now be luckier, so this appears to be a wash.

All in all, though there will be minor winners and losers,
this change appears to be performance neutral.

Future work:

Move loading the result of function calls to SSA construction;
eliminate OINDREGSP.

Consider pushing stack construction deeper into the SSA world,
perhaps in an arch-specific pass. Among other benefits, this would
make it easier to transition to a new calling convention. This would
require rethinking the handling of stack conflicts and is non-trivial.

Figure out some clean way to indicate that stack construction
Stores/Moves do not alias each other, so that subsequent passes may
do things like CSE+tighten shared stack setup, do DSE using non-first
Stores, etc. This would allow us to eliminate the minor text/template
regression.

Possibly make assignments to stack slots not treated as statements
by DWARF.

Compiler benchmarks:

name        old time/op       new time/op       delta
Template    182ms ± 2%        179ms ± 2%        -1.69%  (p=0.000 n=47+48)
Unicode     86.3ms ± 5%       85.1ms ± 4%       -1.36%  (p=0.001 n=50+50)
GoTypes     646ms ± 1%        642ms ± 1%        -0.63%  (p=0.000 n=49+48)
Compiler    2.89s ± 1%        2.86s ± 2%        -1.36%  (p=0.000 n=48+50)
SSA         8.47s ± 1%        8.37s ± 2%        -1.22%  (p=0.000 n=47+50)
Flate       122ms ± 2%        121ms ± 2%        -0.66%  (p=0.000 n=47+45)
GoParser    147ms ± 2%        146ms ± 2%        -0.53%  (p=0.006 n=46+49)
Reflect     406ms ± 2%        403ms ± 2%        -0.76%  (p=0.000 n=48+43)
Tar         162ms ± 3%        162ms ± 4%        ~       (p=0.191 n=46+50)
XML         223ms ± 2%        222ms ± 2%        -0.37%  (p=0.031 n=45+49)
[Geo mean]  382ms             378ms             -0.89%

name        old user-time/op  new user-time/op  delta
Template    219ms ± 3%        216ms ± 3%        -1.56%  (p=0.000 n=50+48)
Unicode     109ms ± 6%        109ms ± 5%        ~       (p=0.190 n=50+49)
GoTypes     836ms ± 2%        828ms ± 2%        -0.96%  (p=0.000 n=49+48)
Compiler    3.87s ± 2%        3.80s ± 1%        -1.81%  (p=0.000 n=49+46)
SSA         12.0s ± 1%        11.8s ± 1%        -2.01%  (p=0.000 n=48+50)
Flate       142ms ± 3%        141ms ± 3%        -0.85%  (p=0.003 n=50+48)
GoParser    178ms ± 4%        175ms ± 4%        -1.66%  (p=0.000 n=48+46)
Reflect     520ms ± 2%        512ms ± 2%        -1.44%  (p=0.000 n=45+48)
Tar         200ms ± 3%        198ms ± 4%        -0.61%  (p=0.037 n=47+50)
XML         277ms ± 3%        275ms ± 3%        -0.85%  (p=0.000 n=49+48)
[Geo mean]  482ms             476ms             -1.23%

name        old alloc/op      new alloc/op      delta
Template    36.1MB ± 0%       35.3MB ± 0%       -2.18%  (p=0.008 n=5+5)
Unicode     29.8MB ± 0%       29.3MB ± 0%       -1.58%  (p=0.008 n=5+5)
GoTypes     125MB ± 0%        123MB ± 0%        -2.13%  (p=0.008 n=5+5)
Compiler    531MB ± 0%        513MB ± 0%        -3.40%  (p=0.008 n=5+5)
SSA         2.00GB ± 0%       1.93GB ± 0%       -3.34%  (p=0.008 n=5+5)
Flate       24.5MB ± 0%       24.3MB ± 0%       -1.18%  (p=0.008 n=5+5)
GoParser    29.4MB ± 0%       28.7MB ± 0%       -2.34%  (p=0.008 n=5+5)
Reflect     87.1MB ± 0%       86.0MB ± 0%       -1.33%  (p=0.008 n=5+5)
Tar         35.3MB ± 0%       34.8MB ± 0%       -1.44%  (p=0.008 n=5+5)
XML         47.9MB ± 0%       47.1MB ± 0%       -1.86%  (p=0.008 n=5+5)
[Geo mean]  82.8MB            81.1MB            -2.08%

name        old allocs/op     new allocs/op     delta
Template    352k ± 0%         347k ± 0%         -1.32%  (p=0.008 n=5+5)
Unicode     342k ± 0%         339k ± 0%         -0.66%  (p=0.008 n=5+5)
GoTypes     1.29M ± 0%        1.27M ± 0%        -1.30%  (p=0.008 n=5+5)
Compiler    4.98M ± 0%        4.87M ± 0%        -2.14%  (p=0.008 n=5+5)
SSA         15.7M ± 0%        15.2M ± 0%        -2.86%  (p=0.008 n=5+5)
Flate       233k ± 0%         231k ± 0%         -0.83%  (p=0.008 n=5+5)
GoParser    296k ± 0%         291k ± 0%         -1.54%  (p=0.016 n=5+4)
Reflect     1.05M ± 0%        1.04M ± 0%        -0.65%  (p=0.008 n=5+5)
Tar         343k ± 0%         339k ± 0%         -0.97%  (p=0.008 n=5+5)
XML         432k ± 0%         426k ± 0%         -1.19%  (p=0.008 n=5+5)
[Geo mean]  815k              804k              -1.35%

name        old object-bytes  new object-bytes  delta
Template    505kB ± 0%        505kB ± 0%        -0.01%  (p=0.008 n=5+5)
Unicode     224kB ± 0%        224kB ± 0%        ~       (all equal)
GoTypes     1.82MB ± 0%       1.83MB ± 0%       +0.06%  (p=0.008 n=5+5)
Flate       324kB ± 0%        324kB ± 0%        +0.00%  (p=0.008 n=5+5)
GoParser    402kB ± 0%        402kB ± 0%        +0.04%  (p=0.008 n=5+5)
Reflect     1.39MB ± 0%       1.39MB ± 0%       -0.01%  (p=0.008 n=5+5)
Tar         449kB ± 0%        449kB ± 0%        -0.02%  (p=0.008 n=5+5)
XML         598kB ± 0%        597kB ± 0%        -0.05%  (p=0.008 n=5+5)

Change-Id: Ifc9d5c1bd01f90171414b8fb18ffe2290d271143
Reviewed-on: https://go-review.googlesource.com/c/114797
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
// Copyright 2015 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.

package ssa

import (
	"cmd/compile/internal/types"
	"cmd/internal/obj"
	"cmd/internal/objabi"
	"cmd/internal/src"
)

// A Config holds readonly compilation information.
// It is created once, early during compilation,
// and shared across all compilations.
type Config struct {
	arch           string // "amd64", etc.
	PtrSize        int64  // 4 or 8; copy of cmd/internal/sys.Arch.PtrSize
	RegSize        int64  // 4 or 8; copy of cmd/internal/sys.Arch.RegSize
	Types          Types
	lowerBlock     blockRewriter // lowering function
	lowerValue     valueRewriter // lowering function
	registers      []Register    // machine registers
	gpRegMask      regMask       // general purpose integer register mask
	fpRegMask      regMask       // floating point register mask
	specialRegMask regMask       // special register mask
	GCRegMap       []*Register   // garbage collector register map, by GC register index
	FPReg          int8          // register number of frame pointer, -1 if not used
	LinkReg        int8          // register number of link register if it is a general purpose register, -1 if not used
	hasGReg        bool          // has hardware g register
	ctxt           *obj.Link     // Generic arch information
	optimize       bool          // Do optimization
	noDuffDevice   bool          // Don't use Duff's device
	useSSE         bool          // Use SSE for non-float operations
	useAvg         bool          // Use optimizations that need Avg* operations
	useHmul        bool          // Use optimizations that need Hmul* operations
	nacl           bool          // GOOS=nacl
	use387         bool          // GO386=387
	SoftFloat      bool          // Use software floating point instead of FP instructions
	Race           bool          // race detector enabled
	NeedsFpScratch bool          // No direct move between GP and FP register sets
	BigEndian      bool          // Target byte order is big-endian
}

type (
	blockRewriter func(*Block) bool
	valueRewriter func(*Value) bool
)

type Types struct {
	Bool       *types.Type
	Int8       *types.Type
	Int16      *types.Type
	Int32      *types.Type
	Int64      *types.Type
	UInt8      *types.Type
	UInt16     *types.Type
	UInt32     *types.Type
	UInt64     *types.Type
	Int        *types.Type
	Float32    *types.Type
	Float64    *types.Type
	UInt       *types.Type
	Uintptr    *types.Type
	String     *types.Type
	BytePtr    *types.Type // TODO: use unsafe.Pointer instead?
	Int32Ptr   *types.Type
	UInt32Ptr  *types.Type
	IntPtr     *types.Type
	UintptrPtr *types.Type
	Float32Ptr *types.Type
	Float64Ptr *types.Type
	BytePtrPtr *types.Type
}

// NewTypes creates and populates a Types.
func NewTypes() *Types {
	t := new(Types)
	t.SetTypPtrs()
	return t
}

// SetTypPtrs populates t.
func (t *Types) SetTypPtrs() {
	t.Bool = types.Types[types.TBOOL]
	t.Int8 = types.Types[types.TINT8]
	t.Int16 = types.Types[types.TINT16]
	t.Int32 = types.Types[types.TINT32]
	t.Int64 = types.Types[types.TINT64]
	t.UInt8 = types.Types[types.TUINT8]
	t.UInt16 = types.Types[types.TUINT16]
	t.UInt32 = types.Types[types.TUINT32]
	t.UInt64 = types.Types[types.TUINT64]
	t.Int = types.Types[types.TINT]
	t.Float32 = types.Types[types.TFLOAT32]
	t.Float64 = types.Types[types.TFLOAT64]
	t.UInt = types.Types[types.TUINT]
	t.Uintptr = types.Types[types.TUINTPTR]
	t.String = types.Types[types.TSTRING]
	t.BytePtr = types.NewPtr(types.Types[types.TUINT8])
	t.Int32Ptr = types.NewPtr(types.Types[types.TINT32])
	t.UInt32Ptr = types.NewPtr(types.Types[types.TUINT32])
	t.IntPtr = types.NewPtr(types.Types[types.TINT])
	t.UintptrPtr = types.NewPtr(types.Types[types.TUINTPTR])
	t.Float32Ptr = types.NewPtr(types.Types[types.TFLOAT32])
	t.Float64Ptr = types.NewPtr(types.Types[types.TFLOAT64])
	t.BytePtrPtr = types.NewPtr(types.NewPtr(types.Types[types.TUINT8]))
}
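
// Illustrative note (an assumption about the surrounding compiler, not
// documented in this file): the gc frontend is expected to build a
// populated Types once, e.g.
//
//	t := ssa.NewTypes() // t.Int == types.Types[types.TINT], etc.
//
// and copy it into the Config it passes to NewConfig below.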

type Logger interface {
	// Logf logs a message from the compiler.
	Logf(string, ...interface{})

	// Log reports whether logging is not a no-op;
	// some logging calls account for more than a few heap allocations.
	Log() bool

	// Fatalf reports a compiler error and exits.
	Fatalf(pos src.XPos, msg string, args ...interface{})

	// Warnl writes compiler messages in the form expected by "errorcheck" tests.
	Warnl(pos src.XPos, fmt_ string, args ...interface{})

	// Debug_checknil forwards the Debug flags from gc.
	Debug_checknil() bool
}

type Frontend interface {
	CanSSA(t *types.Type) bool

	Logger

	// StringData returns a symbol pointing to the given string's contents.
	StringData(string) interface{} // returns *gc.Sym

	// Auto returns a Node for an auto variable of the given type.
	// The SSA compiler uses this function to allocate space for spills.
	Auto(src.XPos, *types.Type) GCNode

	// Given the name for a compound type, returns the name we should use
	// for the parts of that compound type.
	SplitString(LocalSlot) (LocalSlot, LocalSlot)
	SplitInterface(LocalSlot) (LocalSlot, LocalSlot)
	SplitSlice(LocalSlot) (LocalSlot, LocalSlot, LocalSlot)
	SplitComplex(LocalSlot) (LocalSlot, LocalSlot)
	SplitStruct(LocalSlot, int) LocalSlot
	SplitArray(LocalSlot) LocalSlot              // array must be length 1
	SplitInt64(LocalSlot) (LocalSlot, LocalSlot) // returns (hi, lo)
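	//
	// For example (illustrative): splitting a string slot s yields slots
	// for its pointer and length (conceptually s.ptr and s.len), and
	// splitting a slice additionally yields a capacity slot.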

	// DerefItab dereferences an itab function
	// entry, given the symbol of the itab and
	// the byte offset of the function pointer.
	// It may return nil.
	DerefItab(sym *obj.LSym, offset int64) *obj.LSym

	// Line returns a string describing the given position.
	Line(src.XPos) string

	// AllocFrame assigns frame offsets to all live auto variables.
	AllocFrame(f *Func)

	// Syslook returns a symbol of the runtime function/variable with the
	// given name.
	Syslook(string) *obj.LSym

	// UseWriteBarrier reports whether write barriers are enabled.
	UseWriteBarrier() bool

	// SetWBPos indicates that a write barrier has been inserted
	// in this function at position pos.
	SetWBPos(pos src.XPos)
}

// GCNode is an interface used to hold a *gc.Node (a stack variable).
// We'd use *gc.Node directly but that would lead to an import cycle.
type GCNode interface {
	Typ() *types.Type
	String() string
	IsSynthetic() bool
	IsAutoTmp() bool
	StorageClass() StorageClass
}

type StorageClass uint8

const (
	ClassAuto     StorageClass = iota // local stack variable
	ClassParam                        // argument
	ClassParamOut                     // return value
)
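
// Illustrative note: the frontend reports a function's incoming parameters
// as ClassParam, named results as ClassParamOut, and all other stack slots,
// including spill slots allocated via Auto, as ClassAuto.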

// NewConfig returns a new configuration object for the given architecture.
func NewConfig(arch string, types Types, ctxt *obj.Link, optimize bool) *Config {
	c := &Config{arch: arch, Types: types}
	c.useAvg = true
	c.useHmul = true
	switch arch {
	case "amd64":
		c.PtrSize = 8
		c.RegSize = 8
		c.lowerBlock = rewriteBlockAMD64
		c.lowerValue = rewriteValueAMD64
		c.registers = registersAMD64[:]
		c.gpRegMask = gpRegMaskAMD64
		c.fpRegMask = fpRegMaskAMD64
		c.FPReg = framepointerRegAMD64
		c.LinkReg = linkRegAMD64
		c.hasGReg = false
	case "amd64p32":
		c.PtrSize = 4
		c.RegSize = 8
		c.lowerBlock = rewriteBlockAMD64
		c.lowerValue = rewriteValueAMD64
		c.registers = registersAMD64[:]
		c.gpRegMask = gpRegMaskAMD64
		c.fpRegMask = fpRegMaskAMD64
		c.FPReg = framepointerRegAMD64
		c.LinkReg = linkRegAMD64
		c.hasGReg = false
		c.noDuffDevice = true
	case "386":
		c.PtrSize = 4
		c.RegSize = 4
		c.lowerBlock = rewriteBlock386
		c.lowerValue = rewriteValue386
		c.registers = registers386[:]
		c.gpRegMask = gpRegMask386
		c.fpRegMask = fpRegMask386
		c.FPReg = framepointerReg386
		c.LinkReg = linkReg386
		c.hasGReg = false
	case "arm":
		c.PtrSize = 4
		c.RegSize = 4
		c.lowerBlock = rewriteBlockARM
		c.lowerValue = rewriteValueARM
		c.registers = registersARM[:]
		c.gpRegMask = gpRegMaskARM
		c.fpRegMask = fpRegMaskARM
		c.FPReg = framepointerRegARM
		c.LinkReg = linkRegARM
		c.hasGReg = true
	case "arm64":
		c.PtrSize = 8
		c.RegSize = 8
		c.lowerBlock = rewriteBlockARM64
		c.lowerValue = rewriteValueARM64
		c.registers = registersARM64[:]
		c.gpRegMask = gpRegMaskARM64
		c.fpRegMask = fpRegMaskARM64
		c.FPReg = framepointerRegARM64
		c.LinkReg = linkRegARM64
		c.hasGReg = true
		c.noDuffDevice = objabi.GOOS == "darwin" // darwin linker cannot handle BR26 reloc with non-zero addend
	case "ppc64":
		c.BigEndian = true
		fallthrough
	case "ppc64le":
		c.PtrSize = 8
		c.RegSize = 8
		c.lowerBlock = rewriteBlockPPC64
		c.lowerValue = rewriteValuePPC64
		c.registers = registersPPC64[:]
		c.gpRegMask = gpRegMaskPPC64
		c.fpRegMask = fpRegMaskPPC64
		c.FPReg = framepointerRegPPC64
		c.LinkReg = linkRegPPC64
		c.noDuffDevice = true // TODO: Resolve PPC64 DuffDevice (has zero, but not copy)
		c.hasGReg = true
	case "mips64":
		c.BigEndian = true
		fallthrough
	case "mips64le":
		c.PtrSize = 8
		c.RegSize = 8
		c.lowerBlock = rewriteBlockMIPS64
		c.lowerValue = rewriteValueMIPS64
		c.registers = registersMIPS64[:]
		c.gpRegMask = gpRegMaskMIPS64
		c.fpRegMask = fpRegMaskMIPS64
		c.specialRegMask = specialRegMaskMIPS64
		c.FPReg = framepointerRegMIPS64
		c.LinkReg = linkRegMIPS64
		c.hasGReg = true
	case "s390x":
		c.PtrSize = 8
		c.RegSize = 8
		c.lowerBlock = rewriteBlockS390X
		c.lowerValue = rewriteValueS390X
		c.registers = registersS390X[:]
		c.gpRegMask = gpRegMaskS390X
		c.fpRegMask = fpRegMaskS390X
		c.FPReg = framepointerRegS390X
		c.LinkReg = linkRegS390X
		c.hasGReg = true
		c.noDuffDevice = true
		c.BigEndian = true
	case "mips":
		c.BigEndian = true
		fallthrough
	case "mipsle":
		c.PtrSize = 4
		c.RegSize = 4
		c.lowerBlock = rewriteBlockMIPS
		c.lowerValue = rewriteValueMIPS
		c.registers = registersMIPS[:]
		c.gpRegMask = gpRegMaskMIPS
		c.fpRegMask = fpRegMaskMIPS
		c.specialRegMask = specialRegMaskMIPS
		c.FPReg = framepointerRegMIPS
		c.LinkReg = linkRegMIPS
		c.hasGReg = true
		c.noDuffDevice = true
	case "wasm":
		c.PtrSize = 8
		c.RegSize = 8
		c.lowerBlock = rewriteBlockWasm
		c.lowerValue = rewriteValueWasm
		c.registers = registersWasm[:]
		c.gpRegMask = gpRegMaskWasm
		c.fpRegMask = fpRegMaskWasm
		c.FPReg = framepointerRegWasm
		c.LinkReg = linkRegWasm
		c.hasGReg = true
		c.noDuffDevice = true
		c.useAvg = false
		c.useHmul = false
	default:
		ctxt.Diag("arch %s not implemented", arch)
	}
	c.ctxt = ctxt
	c.optimize = optimize
	c.nacl = objabi.GOOS == "nacl"
	c.useSSE = true

	// Don't use Duff's device nor SSE on Plan 9 AMD64, because
	// floating point operations are not allowed in note handler.
	if objabi.GOOS == "plan9" && arch == "amd64" {
		c.noDuffDevice = true
		c.useSSE = false
	}

	if c.nacl {
		c.noDuffDevice = true // Don't use Duff's device on NaCl

		// Returns clobber BP on nacl/386, so the write
		// barrier does too.
		opcodeTable[Op386LoweredWB].reg.clobbers |= 1 << 5 // BP

		// ... and SI on nacl/amd64.
		opcodeTable[OpAMD64LoweredWB].reg.clobbers |= 1 << 6 // SI
	}

	if ctxt.Flag_shared {
		// LoweredWB is secretly a CALL and CALLs on 386 in
		// shared mode get rewritten by obj6.go to go through
		// the GOT, which clobbers BX.
		opcodeTable[Op386LoweredWB].reg.clobbers |= 1 << 3 // BX
	}

	// Create the GC register map index.
	// TODO: This is only used for debug printing. Maybe export config.registers?
	gcRegMapSize := int16(0)
	for _, r := range c.registers {
		if r.gcNum+1 > gcRegMapSize {
			gcRegMapSize = r.gcNum + 1
		}
	}
	c.GCRegMap = make([]*Register, gcRegMapSize)
	for i, r := range c.registers {
		if r.gcNum != -1 {
			c.GCRegMap[r.gcNum] = &c.registers[i]
		}
	}

	return c
}
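
// Illustrative call sequence (an assumption about how the frontend drives
// this package; the real wiring lives in cmd/compile/internal/gc):
//
//	conf := ssa.NewConfig(objabi.GOARCH, *ssa.NewTypes(), ctxt, optimize)
//	if objabi.GOARCH == "386" && objabi.GO386 == "387" {
//		conf.Set387(true) // x87 mode needs a scratch slot for GP<->FP moves
//	}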

func (c *Config) Set387(b bool) {
	c.NeedsFpScratch = b
	c.use387 = b
}

func (c *Config) Ctxt() *obj.Link { return c.ctxt }