go/src/cmd/compile/internal/ssa/expand_calls.go

// Copyright 2020 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
package ssa
import (
"cmd/compile/internal/abi"
"cmd/compile/internal/base"
"cmd/compile/internal/ir"
"cmd/compile/internal/types"
"cmd/internal/src"
"fmt"
)
func postExpandCallsDecompose(f *Func) {
decomposeUser(f) // redo user decompose to cleanup after expand calls
decomposeBuiltIn(f) // handles both regular decomposition and cleanup.
}
func expandCalls(f *Func) {
// Convert each aggregate arg to a call into "dismantle aggregate, store/pass parts"
// Convert each aggregate result from a call into "assemble aggregate from parts"
// Convert each multivalue exit into "dismantle aggregate, store/return parts"
// Convert incoming aggregate arg into assembly of parts.
// Feed modified AST to decompose.
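//
// As a rough illustration only (schematic, not the exact SSA this pass
// emits): a static call f(s) taking a string argument s starts out as
//
//    call = StaticLECall {f} s mem
//
// and is rewritten so that only ABI-sized pieces reach the call, roughly
//
//    p    = StringPtr s
//    l    = StringLen s
//    call = StaticCall {f} p l mem
//
// with aggregate results later reassembled from SelectN pieces.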
sp, _ := f.spSb()
x := &expandState{
f: f,
debug: f.pass.debug,
regSize: f.Config.RegSize,
sp: sp,
typs: &f.Config.Types,
wideSelects: make(map[*Value]*Value),
commonArgs: make(map[selKey]*Value),
commonSelectors: make(map[selKey]*Value),
memForCall: make(map[ID]*Value),
}
// For 32-bit, need to deal with decomposition of 64-bit integers, which depends on endianness.
if f.Config.BigEndian {
x.firstOp = OpInt64Hi
x.secondOp = OpInt64Lo
x.firstType = x.typs.Int32
x.secondType = x.typs.UInt32
} else {
x.firstOp = OpInt64Lo
x.secondOp = OpInt64Hi
x.firstType = x.typs.UInt32
x.secondType = x.typs.Int32
}
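// For example, on a little-endian 32-bit target the low 32 bits (Int64Lo) are
// the first word in memory and the high 32 bits (Int64Hi) the second;
// big-endian targets use the opposite order.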
// Defer select processing until after all calls and selects are seen.
var selects []*Value
var calls []*Value
var args []*Value
var exitBlocks []*Block
var m0 *Value
// Accumulate lists of calls, args, selects, and exit blocks to process,
// note "wide" selects consumed by stores,
// rewrite mem for each call,
// rewrite each OpSelectNAddr.
for _, b := range f.Blocks {
for _, v := range b.Values {
switch v.Op {
case OpInitMem:
m0 = v
case OpClosureLECall, OpInterLECall, OpStaticLECall, OpTailLECall:
calls = append(calls, v)
case OpArg:
args = append(args, v)
case OpStore:
if a := v.Args[1]; a.Op == OpSelectN && !CanSSA(a.Type) {
if a.Uses > 1 {
panic(fmt.Errorf("Saw double use of wide SelectN %s operand of Store %s",
a.LongString(), v.LongString()))
}
x.wideSelects[a] = v
}
case OpSelectN:
if v.Type == types.TypeMem {
// rewrite the mem selector in place
call := v.Args[0]
aux := call.Aux.(*AuxCall)
mem := x.memForCall[call.ID]
if mem == nil {
v.AuxInt = int64(aux.abiInfo.OutRegistersUsed())
x.memForCall[call.ID] = v
} else {
panic(fmt.Errorf("Saw two memories for call %v, %v and %v", call, mem, v))
}
} else {
selects = append(selects, v)
}
case OpSelectNAddr:
call := v.Args[0]
which := v.AuxInt
aux := call.Aux.(*AuxCall)
pt := v.Type
off := x.offsetFrom(x.f.Entry, x.sp, aux.OffsetOfResult(which), pt)
v.copyOf(off)
}
}
// rewrite function results from an exit block;
// values returned by the function need to be split out into registers.
if isBlockMultiValueExit(b) {
exitBlocks = append(exitBlocks, b)
}
}
// Convert each aggregate arg into Make of its parts (and so on, to primitive types)
for _, v := range args {
var rc registerCursor
a := x.prAssignForArg(v)
aux := x.f.OwnAux
regs := a.Registers
var offset int64
if len(regs) == 0 {
offset = a.FrameOffset(aux.abiInfo)
}
auxBase := x.offsetFrom(x.f.Entry, x.sp, offset, types.NewPtr(v.Type))
rc.init(regs, aux.abiInfo, nil, auxBase, 0)
x.rewriteSelectOrArg(f.Entry.Pos, f.Entry, v, v, m0, v.Type, rc)
}
// Rewrite selects of results (which may be aggregates) into make-aggregates of register/memory-targeted selects
for _, v := range selects {
if v.Op == OpInvalid {
continue
}
call := v.Args[0]
aux := call.Aux.(*AuxCall)
mem := x.memForCall[call.ID]
if mem == nil {
mem = call.Block.NewValue1I(call.Pos, OpSelectN, types.TypeMem, int64(aux.abiInfo.OutRegistersUsed()), call)
x.memForCall[call.ID] = mem
}
i := v.AuxInt
regs := aux.RegsOfResult(i)
// If this select cannot fit into SSA and is stored, either disaggregate to register stores, or mem-mem move.
if store := x.wideSelects[v]; store != nil {
// Use the mem that comes from the store operation.
storeAddr := store.Args[0]
mem := store.Args[2]
if len(regs) > 0 {
// Cannot do a rewrite that builds up a result from pieces; instead, copy pieces to the store operation.
var rc registerCursor
rc.init(regs, aux.abiInfo, nil, storeAddr, 0)
mem = x.rewriteWideSelectToStores(call.Pos, call.Block, v, mem, v.Type, rc)
store.copyOf(mem)
} else {
// Move directly from AuxBase to store target; rewrite the store instruction.
offset := aux.OffsetOfResult(i)
auxBase := x.offsetFrom(x.f.Entry, x.sp, offset, types.NewPtr(v.Type))
// was Store dst, v, mem
// now Move dst, auxBase, mem
move := store.Block.NewValue3A(store.Pos, OpMove, types.TypeMem, v.Type, storeAddr, auxBase, mem)
move.AuxInt = v.Type.Size()
store.copyOf(move)
}
continue
}
var auxBase *Value
if len(regs) == 0 {
offset := aux.OffsetOfResult(i)
auxBase = x.offsetFrom(x.f.Entry, x.sp, offset, types.NewPtr(v.Type))
}
var rc registerCursor
rc.init(regs, aux.abiInfo, nil, auxBase, 0)
x.rewriteSelectOrArg(call.Pos, call.Block, v, v, mem, v.Type, rc)
}
rewriteCall := func(v *Value, newOp Op, argStart int) {
// Break aggregate args passed to call into smaller pieces.
x.rewriteCallArgs(v, argStart)
v.Op = newOp
rts := abi.RegisterTypes(v.Aux.(*AuxCall).abiInfo.OutParams())
v.Type = types.NewResults(append(rts, types.TypeMem))
}
// Rewrite calls
for _, v := range calls {
switch v.Op {
case OpStaticLECall:
rewriteCall(v, OpStaticCall, 0)
case OpTailLECall:
rewriteCall(v, OpTailCall, 0)
case OpClosureLECall:
rewriteCall(v, OpClosureCall, 2)
case OpInterLECall:
rewriteCall(v, OpInterCall, 1)
}
}
// Rewrite results from exit blocks
for _, b := range exitBlocks {
v := b.Controls[0]
x.rewriteFuncResults(v, b, f.OwnAux)
b.SetControl(v)
}
}
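// rewriteFuncResults rewrites the MakeResult control value v of exit block b,
// decomposing each result into the register values and/or stores to the
// results area that aux's ABI information calls for, and threading the memory
// through any stores it creates.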
func (x *expandState) rewriteFuncResults(v *Value, b *Block, aux *AuxCall) {
// This is very similar to rewriteCallArgs
// differences:
// firstArg + preArgs
// sp vs auxBase
m0 := v.MemoryArg()
mem := m0
allResults := []*Value{}
var oldArgs []*Value
argsWithoutMem := v.Args[:len(v.Args)-1]
for j, a := range argsWithoutMem {
oldArgs = append(oldArgs, a)
i := int64(j)
auxType := aux.TypeOfResult(i)
auxBase := b.NewValue2A(v.Pos, OpLocalAddr, types.NewPtr(auxType), aux.NameOfResult(i), x.sp, mem)
auxOffset := int64(0)
aRegs := aux.RegsOfResult(int64(j))
if a.Op == OpDereference {
a.Op = OpLoad
}
var rc registerCursor
var result *[]*Value
if len(aRegs) > 0 {
result = &allResults
} else {
if a.Op == OpLoad && a.Args[0].Op == OpLocalAddr && a.Args[0].Aux == aux.NameOfResult(i) {
continue // Self move to output parameter
}
}
rc.init(aRegs, aux.abiInfo, result, auxBase, auxOffset)
mem = x.decomposeAsNecessary(v.Pos, b, a, mem, rc)
}
v.resetArgs()
v.AddArgs(allResults...)
v.AddArg(mem)
for _, a := range oldArgs {
if a.Uses == 0 {
if x.debug > 1 {
x.Printf("...marking %v unused\n", a.LongString())
}
x.invalidateRecursively(a)
}
}
v.Type = types.NewResults(append(abi.RegisterTypes(aux.abiInfo.OutParams()), types.TypeMem))
return
}
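// rewriteCallArgs decomposes the aggregate arguments of call v, starting at
// argument index firstArg (which skips any closure- or interface-specific
// leading args), into register values and/or stores to the outgoing args
// area, threading the call's memory argument through the stores it creates.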
func (x *expandState) rewriteCallArgs(v *Value, firstArg int) {
if x.debug > 1 {
x.indent(3)
defer x.indent(-3)
x.Printf("rewriteCallArgs(%s; %d)\n", v.LongString(), firstArg)
}
// Thread the stores on the memory arg
aux := v.Aux.(*AuxCall)
m0 := v.MemoryArg()
mem := m0
allResults := []*Value{}
oldArgs := []*Value{}
argsWithoutMem := v.Args[firstArg : len(v.Args)-1] // Also strip closure/interface Op-specific args
sp := x.sp
if v.Op == OpTailLECall {
// For tail call, we unwind the frame before the call so we'll use the caller's
// SP.
sp = v.Block.NewValue1(src.NoXPos, OpGetCallerSP, x.typs.Uintptr, mem)
}
for i, a := range argsWithoutMem { // skip leading non-parameter SSA Args and trailing mem SSA Arg.
oldArgs = append(oldArgs, a)
auxI := int64(i)
aRegs := aux.RegsOfArg(auxI)
aType := aux.TypeOfArg(auxI)
if a.Op == OpDereference {
a.Op = OpLoad
}
var rc registerCursor
var result *[]*Value
var aOffset int64
if len(aRegs) > 0 {
result = &allResults
} else {
aOffset = aux.OffsetOfArg(auxI)
}
if v.Op == OpTailLECall && a.Op == OpArg && a.AuxInt == 0 {
// It is common for a tail call to pass along the same arguments as its caller
// (e.g. in a method wrapper), in which case this argument would be a self copy.
// Detect this and optimize it out.
n := a.Aux.(*ir.Name)
if n.Class == ir.PPARAM && n.FrameOffset()+x.f.Config.ctxt.Arch.FixedFrameSize == aOffset {
continue
}
}
if x.debug > 1 {
x.Printf("...storeArg %s, %v, %d\n", a.LongString(), aType, aOffset)
}
rc.init(aRegs, aux.abiInfo, result, sp, aOffset)
mem = x.decomposeAsNecessary(v.Pos, v.Block, a, mem, rc)
}
var preArgStore [2]*Value
preArgs := append(preArgStore[:0], v.Args[0:firstArg]...)
v.resetArgs()
v.AddArgs(preArgs...)
v.AddArgs(allResults...)
v.AddArg(mem)
for _, a := range oldArgs {
if a.Uses == 0 {
x.invalidateRecursively(a)
}
}
return
}
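// decomposePair decomposes a two-part value a (e.g. a string or a complex
// number) by selecting its halves with o0 and o1 (of types t0 and t1),
// recursing on each half, and returning the resulting memory.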
func (x *expandState) decomposePair(pos src.XPos, b *Block, a, mem *Value, t0, t1 *types.Type, o0, o1 Op, rc *registerCursor) *Value {
e := b.NewValue1(pos, o0, t0, a)
pos = pos.WithNotStmt()
mem = x.decomposeAsNecessary(pos, b, e, mem, rc.next(t0))
e = b.NewValue1(pos, o1, t1, a)
mem = x.decomposeAsNecessary(pos, b, e, mem, rc.next(t1))
return mem
}
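// decomposeOne decomposes a single component of a, selected with o0 and of
// type t0, and returns the resulting memory.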
func (x *expandState) decomposeOne(pos src.XPos, b *Block, a, mem *Value, t0 *types.Type, o0 Op, rc *registerCursor) *Value {
e := b.NewValue1(pos, o0, t0, a)
pos = pos.WithNotStmt()
mem = x.decomposeAsNecessary(pos, b, e, mem, rc.next(t0))
return mem
}
// decomposeAsNecessary converts a value (perhaps an aggregate) passed to a call or returned by a function,
// into the appropriate sequence of stores and register assignments to transmit that value in a given ABI, and
// returns the current memory after this convert/rewrite (it may be the input memory if no stores were needed).
// 'pos' is the source position all this is tied to
// 'b' is the enclosing block
// 'a' is the value to decompose
// 'm0' is the input memory arg used for the first store (or returned if there are no stores)
// 'rc' is a registerCursor which identifies the register/memory destination for the value
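//
// As a rough sketch (schematic, not the exact SSA): a slice value destined
// for registers becomes
//
//    ptr = SlicePtr a
//    len = SliceLen a
//    cap = SliceCap a
//
// with each piece recorded in rc's register list, while the same slice with
// no registers assigned becomes three stores at increasing offsets from
// rc.storeDest.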
func (x *expandState) decomposeAsNecessary(pos src.XPos, b *Block, a, m0 *Value, rc registerCursor) *Value {
if x.debug > 1 {
x.indent(3)
defer x.indent(-3)
}
at := a.Type
if at.Size() == 0 {
return m0
}
if a.Op == OpDereference {
a.Op = OpLoad // For purposes of parameter passing expansion, a Dereference is a Load.
}
if !rc.hasRegs() && !CanSSA(at) {
dst := x.offsetFrom(b, rc.storeDest, rc.storeOffset, types.NewPtr(at))
if x.debug > 1 {
x.Printf("...recur store %s at %s\n", a.LongString(), dst.LongString())
}
if a.Op == OpLoad {
m0 = b.NewValue3A(pos, OpMove, types.TypeMem, at, dst, a.Args[0], m0)
m0.AuxInt = at.Size()
return m0
} else {
panic(fmt.Errorf("Store of not a load"))
}
}
mem := m0
switch at.Kind() {
case types.TARRAY:
et := at.Elem()
for i := int64(0); i < at.NumElem(); i++ {
e := b.NewValue1I(pos, OpArraySelect, et, i, a)
pos = pos.WithNotStmt()
mem = x.decomposeAsNecessary(pos, b, e, mem, rc.next(et))
}
return mem
case types.TSTRUCT:
for i := 0; i < at.NumFields(); i++ {
et := at.Field(i).Type // might need to read offsets from the fields
e := b.NewValue1I(pos, OpStructSelect, et, int64(i), a)
pos = pos.WithNotStmt()
if x.debug > 1 {
x.Printf("...recur decompose %s, %v\n", e.LongString(), et)
}
mem = x.decomposeAsNecessary(pos, b, e, mem, rc.next(et))
}
return mem
case types.TSLICE:
mem = x.decomposeOne(pos, b, a, mem, at.Elem().PtrTo(), OpSlicePtr, &rc)
pos = pos.WithNotStmt()
mem = x.decomposeOne(pos, b, a, mem, x.typs.Int, OpSliceLen, &rc)
return x.decomposeOne(pos, b, a, mem, x.typs.Int, OpSliceCap, &rc)
case types.TSTRING:
return x.decomposePair(pos, b, a, mem, x.typs.BytePtr, x.typs.Int, OpStringPtr, OpStringLen, &rc)
case types.TINTER:
mem = x.decomposeOne(pos, b, a, mem, x.typs.Uintptr, OpITab, &rc)
pos = pos.WithNotStmt()
// Immediate interfaces cause so many headaches.
if a.Op == OpIMake {
data := a.Args[1]
for data.Op == OpStructMake || data.Op == OpArrayMake1 {
data = data.Args[0]
}
return x.decomposeAsNecessary(pos, b, data, mem, rc.next(data.Type))
}
return x.decomposeOne(pos, b, a, mem, x.typs.BytePtr, OpIData, &rc)
case types.TCOMPLEX64:
return x.decomposePair(pos, b, a, mem, x.typs.Float32, x.typs.Float32, OpComplexReal, OpComplexImag, &rc)
case types.TCOMPLEX128:
return x.decomposePair(pos, b, a, mem, x.typs.Float64, x.typs.Float64, OpComplexReal, OpComplexImag, &rc)
case types.TINT64:
if at.Size() > x.regSize {
return x.decomposePair(pos, b, a, mem, x.firstType, x.secondType, x.firstOp, x.secondOp, &rc)
}
case types.TUINT64:
if at.Size() > x.regSize {
return x.decomposePair(pos, b, a, mem, x.typs.UInt32, x.typs.UInt32, x.firstOp, x.secondOp, &rc)
}
}
// An atomic type, either record the register or store it and update the memory.
if rc.hasRegs() {
if x.debug > 1 {
x.Printf("...recur addArg %s\n", a.LongString())
}
rc.addArg(a)
} else {
dst := x.offsetFrom(b, rc.storeDest, rc.storeOffset, types.NewPtr(at))
if x.debug > 1 {
x.Printf("...recur store %s at %s\n", a.LongString(), dst.LongString())
}
mem = b.NewValue3A(pos, OpStore, types.TypeMem, at, dst, a, mem)
}
return mem
}
// Convert scalar OpArg into the proper OpWhateverArg instruction
// Convert scalar OpSelectN into perhaps-differently-indexed OpSelectN
// Convert aggregate OpArg into Make of its parts (which are eventually scalars)
// Convert aggregate OpSelectN into Make of its parts (which are eventually scalars)
// Returns the converted value.
//
// - "pos" the position for any generated instructions
// - "b" the block for any generated instructions
// - "container" the outermost OpArg/OpSelectN
// - "a" the instruction to overwrite, if any (only the outermost caller)
// - "m0" the memory arg for any loads that are necessary
// - "at" the type of the Arg/part
// - "rc" the register/memory cursor locating the various parts of the Arg.
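//
// As a rough sketch (schematic, with illustrative offsets and register
// indices assuming a 64-bit target): a string-typed OpArg arriving in
// registers is rebuilt as
//
//    p = ArgIntReg {name+0} [0]
//    l = ArgIntReg {name+8} [1]
//    a = StringMake p l
//
// while a stack-passed Arg is rebuilt from narrower OpArgs at the
// corresponding frame offsets.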
func (x *expandState) rewriteSelectOrArg(pos src.XPos, b *Block, container, a, m0 *Value, at *types.Type, rc registerCursor) *Value {
if at == types.TypeMem {
a.copyOf(m0)
return a
}
makeOf := func(a *Value, op Op, args []*Value) *Value {
if a == nil {
a = b.NewValue0(pos, op, at)
a.AddArgs(args...)
} else {
a.resetArgs()
a.Aux, a.AuxInt = nil, 0
a.Pos, a.Op, a.Type = pos, op, at
a.AddArgs(args...)
}
return a
}
if at.Size() == 0 {
// For consistency, create these values even though they'll ultimately be unused
if at.IsArray() {
return makeOf(a, OpArrayMake0, nil)
}
if at.IsStruct() {
return makeOf(a, OpStructMake, nil)
}
return a
}
sk := selKey{from: container, size: 0, offsetOrIndex: rc.storeOffset, typ: at}
dupe := x.commonSelectors[sk]
if dupe != nil {
if a == nil {
return dupe
}
a.copyOf(dupe)
return a
}
var argStore [10]*Value
args := argStore[:0]
addArg := func(a0 *Value) {
if a0 == nil {
as := "<nil>"
if a != nil {
as = a.LongString()
}
panic(fmt.Errorf("a0 should not be nil, a=%v, container=%v, at=%v", as, container.LongString(), at))
}
args = append(args, a0)
}
switch at.Kind() {
case types.TARRAY:
et := at.Elem()
for i := int64(0); i < at.NumElem(); i++ {
e := x.rewriteSelectOrArg(pos, b, container, nil, m0, et, rc.next(et))
addArg(e)
}
a = makeOf(a, OpArrayMake1, args)
x.commonSelectors[sk] = a
return a
case types.TSTRUCT:
// Assume ssagen/ssa.go (in buildssa) spills large aggregates so they won't appear here.
for i := 0; i < at.NumFields(); i++ {
et := at.Field(i).Type
e := x.rewriteSelectOrArg(pos, b, container, nil, m0, et, rc.next(et))
if e == nil {
panic(fmt.Errorf("nil e, et=%v, et.Size()=%d, i=%d", et, et.Size(), i))
}
addArg(e)
pos = pos.WithNotStmt()
}
if at.NumFields() > 4 {
panic(fmt.Errorf("Too many fields (%d, %d bytes), container=%s", at.NumFields(), at.Size(), container.LongString()))
}
a = makeOf(a, OpStructMake, args)
x.commonSelectors[sk] = a
return a
case types.TSLICE:
addArg(x.rewriteSelectOrArg(pos, b, container, nil, m0, at.Elem().PtrTo(), rc.next(x.typs.BytePtr)))
pos = pos.WithNotStmt()
addArg(x.rewriteSelectOrArg(pos, b, container, nil, m0, x.typs.Int, rc.next(x.typs.Int)))
addArg(x.rewriteSelectOrArg(pos, b, container, nil, m0, x.typs.Int, rc.next(x.typs.Int)))
a = makeOf(a, OpSliceMake, args)
x.commonSelectors[sk] = a
return a
case types.TSTRING:
addArg(x.rewriteSelectOrArg(pos, b, container, nil, m0, x.typs.BytePtr, rc.next(x.typs.BytePtr)))
pos = pos.WithNotStmt()
addArg(x.rewriteSelectOrArg(pos, b, container, nil, m0, x.typs.Int, rc.next(x.typs.Int)))
a = makeOf(a, OpStringMake, args)
x.commonSelectors[sk] = a
return a
case types.TINTER:
addArg(x.rewriteSelectOrArg(pos, b, container, nil, m0, x.typs.Uintptr, rc.next(x.typs.Uintptr)))
pos = pos.WithNotStmt()
addArg(x.rewriteSelectOrArg(pos, b, container, nil, m0, x.typs.BytePtr, rc.next(x.typs.BytePtr)))
a = makeOf(a, OpIMake, args)
x.commonSelectors[sk] = a
return a
case types.TCOMPLEX64:
addArg(x.rewriteSelectOrArg(pos, b, container, nil, m0, x.typs.Float32, rc.next(x.typs.Float32)))
pos = pos.WithNotStmt()
addArg(x.rewriteSelectOrArg(pos, b, container, nil, m0, x.typs.Float32, rc.next(x.typs.Float32)))
a = makeOf(a, OpComplexMake, args)
x.commonSelectors[sk] = a
return a
case types.TCOMPLEX128:
addArg(x.rewriteSelectOrArg(pos, b, container, nil, m0, x.typs.Float64, rc.next(x.typs.Float64)))
pos = pos.WithNotStmt()
addArg(x.rewriteSelectOrArg(pos, b, container, nil, m0, x.typs.Float64, rc.next(x.typs.Float64)))
a = makeOf(a, OpComplexMake, args)
x.commonSelectors[sk] = a
return a
case types.TINT64:
if at.Size() > x.regSize {
addArg(x.rewriteSelectOrArg(pos, b, container, nil, m0, x.firstType, rc.next(x.firstType)))
pos = pos.WithNotStmt()
addArg(x.rewriteSelectOrArg(pos, b, container, nil, m0, x.secondType, rc.next(x.secondType)))
if !x.f.Config.BigEndian {
// Int64Make args are big, little
args[0], args[1] = args[1], args[0]
}
a = makeOf(a, OpInt64Make, args)
x.commonSelectors[sk] = a
return a
}
case types.TUINT64:
if at.Size() > x.regSize {
addArg(x.rewriteSelectOrArg(pos, b, container, nil, m0, x.typs.UInt32, rc.next(x.typs.UInt32)))
pos = pos.WithNotStmt()
addArg(x.rewriteSelectOrArg(pos, b, container, nil, m0, x.typs.UInt32, rc.next(x.typs.UInt32)))
if !x.f.Config.BigEndian {
// Int64Make args are big, little
args[0], args[1] = args[1], args[0]
}
a = makeOf(a, OpInt64Make, args)
x.commonSelectors[sk] = a
return a
}
}
// An atomic type, either record the register or store it and update the memory.
// Depending on the container Op, the leaves are either OpSelectN or OpArg{Int,Float}Reg
if container.Op == OpArg {
if rc.hasRegs() {
op, i := rc.ArgOpAndRegisterFor()
name := container.Aux.(*ir.Name)
a = makeOf(a, op, nil)
a.AuxInt = i
a.Aux = &AuxNameOffset{name, rc.storeOffset}
} else {
key := selKey{container, rc.storeOffset, at.Size(), at}
w := x.commonArgs[key]
if w != nil && w.Uses != 0 {
if a == nil {
a = w
} else {
a.copyOf(w)
}
} else {
if a == nil {
aux := container.Aux
auxInt := container.AuxInt + rc.storeOffset
a = container.Block.NewValue0IA(container.Pos, OpArg, at, auxInt, aux)
} else {
// do nothing, the original should be okay.
}
x.commonArgs[key] = a
}
}
} else if container.Op == OpSelectN {
call := container.Args[0]
aux := call.Aux.(*AuxCall)
which := container.AuxInt
if at == types.TypeMem {
if a != m0 || a != x.memForCall[call.ID] {
panic(fmt.Errorf("Memories %s, %s, and %s should all be equal after %s", a.LongString(), m0.LongString(), x.memForCall[call.ID], call.LongString()))
}
} else if rc.hasRegs() {
firstReg := uint32(0)
for i := 0; i < int(which); i++ {
firstReg += uint32(len(aux.abiInfo.OutParam(i).Registers))
}
reg := int64(rc.nextSlice + Abi1RO(firstReg))
a = makeOf(a, OpSelectN, []*Value{call})
a.AuxInt = reg
} else {
off := x.offsetFrom(x.f.Entry, x.sp, rc.storeOffset+aux.OffsetOfResult(which), types.NewPtr(at))
a = makeOf(a, OpLoad, []*Value{off, m0})
}
} else {
panic(fmt.Errorf("Expected container OpArg or OpSelectN, saw %v instead", container.LongString()))
}
x.commonSelectors[sk] = a
return a
}
// rewriteWideSelectToStores handles the case of a SelectN'd result from a function call that is too large for SSA,
// but is transferred in registers. In this case the register cursor tracks both operands; the register sources and
// the memory destinations.
// This returns the memory flowing out of the last store
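// Schematically, each leaf of the too-large result becomes
//
//    r  = SelectN [reg] call
//    m0 = Store dst r m0
//
// where dst is the leaf's offset from rc.storeDest.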
func (x *expandState) rewriteWideSelectToStores(pos src.XPos, b *Block, container, m0 *Value, at *types.Type, rc registerCursor) *Value {
if at.Size() == 0 {
return m0
}
switch at.Kind() {
case types.TARRAY:
et := at.Elem()
for i := int64(0); i < at.NumElem(); i++ {
m0 = x.rewriteWideSelectToStores(pos, b, container, m0, et, rc.next(et))
}
return m0
case types.TSTRUCT:
// Assume ssagen/ssa.go (in buildssa) spills large aggregates so they won't appear here.
for i := 0; i < at.NumFields(); i++ {
et := at.Field(i).Type
m0 = x.rewriteWideSelectToStores(pos, b, container, m0, et, rc.next(et))
pos = pos.WithNotStmt()
}
return m0
case types.TSLICE:
m0 = x.rewriteWideSelectToStores(pos, b, container, m0, at.Elem().PtrTo(), rc.next(x.typs.BytePtr))
pos = pos.WithNotStmt()
m0 = x.rewriteWideSelectToStores(pos, b, container, m0, x.typs.Int, rc.next(x.typs.Int))
m0 = x.rewriteWideSelectToStores(pos, b, container, m0, x.typs.Int, rc.next(x.typs.Int))
return m0
case types.TSTRING:
m0 = x.rewriteWideSelectToStores(pos, b, container, m0, x.typs.BytePtr, rc.next(x.typs.BytePtr))
pos = pos.WithNotStmt()
m0 = x.rewriteWideSelectToStores(pos, b, container, m0, x.typs.Int, rc.next(x.typs.Int))
return m0
case types.TINTER:
m0 = x.rewriteWideSelectToStores(pos, b, container, m0, x.typs.Uintptr, rc.next(x.typs.Uintptr))
pos = pos.WithNotStmt()
m0 = x.rewriteWideSelectToStores(pos, b, container, m0, x.typs.BytePtr, rc.next(x.typs.BytePtr))
return m0
case types.TCOMPLEX64:
m0 = x.rewriteWideSelectToStores(pos, b, container, m0, x.typs.Float32, rc.next(x.typs.Float32))
pos = pos.WithNotStmt()
m0 = x.rewriteWideSelectToStores(pos, b, container, m0, x.typs.Float32, rc.next(x.typs.Float32))
return m0
case types.TCOMPLEX128:
m0 = x.rewriteWideSelectToStores(pos, b, container, m0, x.typs.Float64, rc.next(x.typs.Float64))
pos = pos.WithNotStmt()
m0 = x.rewriteWideSelectToStores(pos, b, container, m0, x.typs.Float64, rc.next(x.typs.Float64))
return m0
case types.TINT64:
if at.Size() > x.regSize {
m0 = x.rewriteWideSelectToStores(pos, b, container, m0, x.firstType, rc.next(x.firstType))
pos = pos.WithNotStmt()
m0 = x.rewriteWideSelectToStores(pos, b, container, m0, x.secondType, rc.next(x.secondType))
return m0
}
case types.TUINT64:
if at.Size() > x.regSize {
m0 = x.rewriteWideSelectToStores(pos, b, container, m0, x.typs.UInt32, rc.next(x.typs.UInt32))
pos = pos.WithNotStmt()
m0 = x.rewriteWideSelectToStores(pos, b, container, m0, x.typs.UInt32, rc.next(x.typs.UInt32))
return m0
}
}
// TODO could change treatment of too-large OpArg, would deal with it here.
if container.Op == OpSelectN {
call := container.Args[0]
aux := call.Aux.(*AuxCall)
which := container.AuxInt
if rc.hasRegs() {
firstReg := uint32(0)
for i := 0; i < int(which); i++ {
firstReg += uint32(len(aux.abiInfo.OutParam(i).Registers))
}
reg := int64(rc.nextSlice + Abi1RO(firstReg))
a := b.NewValue1I(pos, OpSelectN, at, reg, call)
dst := x.offsetFrom(b, rc.storeDest, rc.storeOffset, types.NewPtr(at))
m0 = b.NewValue3A(pos, OpStore, types.TypeMem, at, dst, a, m0)
} else {
panic(fmt.Errorf("Expected rc to have registers"))
}
} else {
panic(fmt.Errorf("Expected container OpSelectN, saw %v instead", container.LongString()))
}
return m0
}
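// isBlockMultiValueExit reports whether b is a return (or return-jump) block
// whose control value is an OpMakeResult, i.e. an exit whose results need
// expansion.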
func isBlockMultiValueExit(b *Block) bool {
return (b.Kind == BlockRet || b.Kind == BlockRetJmp) && b.Controls[0] != nil && b.Controls[0].Op == OpMakeResult
}
type Abi1RO uint8 // An offset within a parameter's slice of register indices, for abi1.
// A registerCursor tracks which register is used for an Arg or regValues, or a piece of such.
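// A cursor is initialized once per argument or result (see init) and advanced
// with next as the leaves of the value are visited; each leaf either
// accumulates in regValues or is stored at storeOffset from storeDest.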
type registerCursor struct {
storeDest *Value // if there are no register targets, then this is the base of the store.
storeOffset int64
regs []abi.RegIndex // the registers available for this Arg/result (which is all in registers or not at all)
nextSlice Abi1RO // the next register/register-slice offset
config *abi.ABIConfig
regValues *[]*Value // values assigned to registers accumulate here
}
func (c *registerCursor) String() string {
dest := "<none>"
if c.storeDest != nil {
dest = fmt.Sprintf("%s+%d", c.storeDest.String(), c.storeOffset)
}
regs := "<none>"
if c.regValues != nil {
regs = ""
for i, x := range *c.regValues {
if i > 0 {
regs = regs + "; "
}
regs = regs + x.LongString()
}
}
// not printing the config because that has not been useful
return fmt.Sprintf("RCSR{storeDest=%v, regsLen=%d, nextSlice=%d, regValues=[%s]}", dest, len(c.regs), c.nextSlice, regs)
}
// next effectively post-increments the register cursor; the receiver is advanced,
// the (aligned) old value is returned.
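// For example, starting at storeOffset 4, next for an 8-byte, 8-aligned type
// returns a cursor at offset 8 (rounded up) and leaves the receiver's offset
// at 16, ready for the following component.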
func (c *registerCursor) next(t *types.Type) registerCursor {
c.storeOffset = types.RoundUp(c.storeOffset, t.Alignment())
rc := *c
c.storeOffset = types.RoundUp(c.storeOffset+t.Size(), t.Alignment())
if int(c.nextSlice) < len(c.regs) {
w := c.config.NumParamRegs(t)
c.nextSlice += Abi1RO(w)
}
return rc
}
// plus returns a register cursor offset from the original, without modifying the original.
func (c *registerCursor) plus(regWidth Abi1RO) registerCursor {
rc := *c
rc.nextSlice += regWidth
return rc
}
func (c *registerCursor) init(regs []abi.RegIndex, info *abi.ABIParamResultInfo, result *[]*Value, storeDest *Value, storeOffset int64) {
c.regs = regs
c.nextSlice = 0
c.storeOffset = storeOffset
c.storeDest = storeDest
c.config = info.Config()
c.regValues = result
}
func (c *registerCursor) addArg(v *Value) {
*c.regValues = append(*c.regValues, v)
}
func (c *registerCursor) hasRegs() bool {
return len(c.regs) > 0
}
func (c *registerCursor) ArgOpAndRegisterFor() (Op, int64) {
r := c.regs[c.nextSlice]
return ArgOpAndRegisterFor(r, c.config)
}
// ArgOpAndRegisterFor converts an abi register index into an ssa Op and corresponding
// arg register index.
func ArgOpAndRegisterFor(r abi.RegIndex, abiConfig *abi.ABIConfig) (Op, int64) {
i := abiConfig.FloatIndexFor(r)
if i >= 0 { // float PR
return OpArgFloatReg, i
}
return OpArgIntReg, int64(r)
}
type selKey struct {
from *Value // what is selected from
offsetOrIndex int64 // whatever is appropriate for the selector
size int64
typ *types.Type
}
type expandState struct {
f *Func
debug int // odd values log lost statement markers, so likely settings are 1 (stmts), 2 (expansion), and 3 (both)
regSize int64
sp *Value
typs *Types
firstOp Op // for 64-bit integers on 32-bit machines, first word in memory
secondOp Op // for 64-bit integers on 32-bit machines, second word in memory
firstType *types.Type // first half type, for Int64
secondType *types.Type // second half type, for Int64
wideSelects map[*Value]*Value // Selects that are not SSA-able, mapped to consuming stores.
commonSelectors map[selKey]*Value // used to de-dupe selectors
commonArgs map[selKey]*Value // used to de-dupe OpArg/OpArgIntReg/OpArgFloatReg
memForCall map[ID]*Value // For a call, need to know the unique selector that gets the mem.
indentLevel int // Indentation for debugging recursion
}
// offsetFrom creates an offset from a pointer, simplifying chained offsets and offsets from SP
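// For example (schematically), asking for offset 16 from an OffPtr [8] of SP
// folds the chain and yields a single constant OffPtr [24] of SP with the
// requested pointer type pt.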
func (x *expandState) offsetFrom(b *Block, from *Value, offset int64, pt *types.Type) *Value {
ft := from.Type
if offset == 0 {
if ft == pt {
return from
}
// This captures common, (apparently) safe cases. The unsafe cases involve ft == uintptr
if (ft.IsPtr() || ft.IsUnsafePtr()) && pt.IsPtr() {
return from
}
}
// Simplify, canonicalize
for from.Op == OpOffPtr {
offset += from.AuxInt
from = from.Args[0]
}
if from == x.sp {
return x.f.ConstOffPtrSP(pt, offset, x.sp)
}
return b.NewValue1I(from.Pos.WithNotStmt(), OpOffPtr, pt, offset, from)
}
// prAssignForArg returns the ABIParamAssignment for v, assumed to be an OpArg.
func (x *expandState) prAssignForArg(v *Value) *abi.ABIParamAssignment {
if v.Op != OpArg {
panic(fmt.Errorf("Wanted OpArg, instead saw %s", v.LongString()))
}
return ParamAssignmentForArgName(x.f, v.Aux.(*ir.Name))
}
// ParamAssignmentForArgName returns the ABIParamAssignment for f's arg with matching name.
func ParamAssignmentForArgName(f *Func, name *ir.Name) *abi.ABIParamAssignment {
abiInfo := f.OwnAux.abiInfo
ip := abiInfo.InParams()
for i, a := range ip {
if a.Name == name {
return &ip[i]
}
}
panic(fmt.Errorf("Did not match param %v in prInfo %+v", name, abiInfo.InParams()))
}
// indent increments (or decrements) the indentation.
func (x *expandState) indent(n int) {
x.indentLevel += n
}
// Printf does an indented fmt.Printf on the format and args.
func (x *expandState) Printf(format string, a ...any) (n int, err error) {
if x.indentLevel > 0 {
fmt.Printf("%[1]*s", x.indentLevel, "")
}
return fmt.Printf(format, a...)
}
func (x *expandState) invalidateRecursively(a *Value) {
var s string
if x.debug > 0 {
plus := " "
if a.Pos.IsStmt() == src.PosIsStmt {
plus = " +"
}
s = a.String() + plus + a.Pos.LineNumber() + " " + a.LongString()
if x.debug > 1 {
x.Printf("...marking %v unused\n", s)
}
}
lost := a.invalidateRecursively()
if x.debug&1 != 0 && lost { // For odd values of x.debug, do this.
x.Printf("Lost statement marker in %s on former %s\n", base.Ctxt.Pkgpath+"."+x.f.Name, s)
}
}