Add compiler support for emitting ABI wrappers by creating real IR as
opposed to introducing ABI aliases. At the moment these are "no-op"
wrappers in the sense that they make a simple call (using the existing
ABI) to their target. The assumption here is that once late call
expansion can handle both ABI0 and the "new" ABIInternal (register
version), it can expand the call to do the right thing.
Note that the runtime contains functions that do not strictly follow
the rules of the current Go ABI0; this has been handled in most cases
by treating these as ABIInternal instead (these changes have been made
in previous patches).
Generation of ABI wrappers (as opposed to ABI aliases) is currently
gated by GOEXPERIMENT=regabi -- wrapper generation is on by default if
GOEXPERIMENT=regabi is set and off otherwise (but can be turned on
using "-gcflags=all=-abiwrap -ldflags=-abiwrap"). Wrapper generation
currently only workd on AMD64; explicitly enabling wrapper for other
architectures (via the command line) is not supported.
Also in this patch are a few other command line options for debugging
(tracing and/or limiting wrapper creation). These will presumably go
away at some point.
Updates #27539, #40724.
Change-Id: I1ee3226fc15a3c32ca2087b8ef8e41dbe6df4a75
Reviewed-on: https://go-review.googlesource.com/c/go/+/270863
Run-TryBot: Than McIntosh <thanm@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Trust: Than McIntosh <thanm@google.com>
Deleting the Pc assignment from Patch is safe because the actual PCs
are not assigned until well after the compiler is done patching jumps.
And it proves that replacing uses of Patch with SetTarget will be safe later.
Change-Id: Iffcbe03f0b5949ccd4c91e79c1272cd06be0f434
Reviewed-on: https://go-review.googlesource.com/c/go/+/279296
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Now that we have specific types for ONAME and ODCLFUNC nodes
(*Name and *Func), use them throughout the compiler to be more
precise about what data is being operated on.
This is a somewhat large CL, but once you start applying the types
in a few places, you end up needing to apply them to many other
places to keep everything type-checking. A lot of code also melts
away as types are added.
Passes buildall w/ toolstash -cmp.
Change-Id: I21dd9b945d701c470332bac5394fca744a5b232d
Reviewed-on: https://go-review.googlesource.com/c/go/+/274097
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
The plan is to introduce a Node interface that replaces the old *Node pointer-to-struct.
The previous CL defined an interface INode modeling a *Node.
This CL:
- Changes all references outside internal/ir to use INode,
along with many references inside internal/ir as well.
- Renames Node to node.
- Renames INode to Node
So now ir.Node is an interface implemented by *ir.node, which is otherwise inaccessible,
and the code outside package ir is now (clearly) using only the interface.
The usual rule is never to redefine an existing name with a new meaning,
so that old code that hasn't been updated gets a "unknown name" error
instead of more mysterious errors or silent misbehavior. That rule would
caution against replacing Node-the-struct with Node-the-interface,
as in this CL, because code that says *Node would now be using a pointer
to an interface. But this CL is being landed at the same time as another that
moves Node from gc to ir. So the net effect is to replace *gc.Node with ir.Node,
which does follow the rule: any lingering references to gc.Node will be told
it's gone, not silently start using pointers to interfaces. So the rule is followed
by the CL sequence, just not this specific CL.
Overall, the loss of inlining caused by using interfaces cuts the compiler speed
by about 6%, a not insignificant amount. However, as we convert the representation
to concrete structs that are not the giant Node over the next weeks, that speed
should come back as more of the compiler starts operating directly on concrete types
and the memory taken up by the graph of Nodes drops due to the more precise
structs. Honestly, I was expecting worse.
% benchstat bench.old bench.new
name old time/op new time/op delta
Template 168ms ± 4% 182ms ± 2% +8.34% (p=0.000 n=9+9)
Unicode 72.2ms ±10% 82.5ms ± 6% +14.38% (p=0.000 n=9+9)
GoTypes 563ms ± 8% 598ms ± 2% +6.14% (p=0.006 n=9+9)
Compiler 2.89s ± 4% 3.04s ± 2% +5.37% (p=0.000 n=10+9)
SSA 6.45s ± 4% 7.25s ± 5% +12.41% (p=0.000 n=9+10)
Flate 105ms ± 2% 115ms ± 1% +9.66% (p=0.000 n=10+8)
GoParser 144ms ±10% 152ms ± 2% +5.79% (p=0.011 n=9+8)
Reflect 345ms ± 9% 370ms ± 4% +7.28% (p=0.001 n=10+9)
Tar 149ms ± 9% 161ms ± 5% +8.05% (p=0.001 n=10+9)
XML 190ms ± 3% 209ms ± 2% +9.54% (p=0.000 n=9+8)
LinkCompiler 327ms ± 2% 325ms ± 2% ~ (p=0.382 n=8+8)
ExternalLinkCompiler 1.77s ± 4% 1.73s ± 6% ~ (p=0.113 n=9+10)
LinkWithoutDebugCompiler 214ms ± 4% 211ms ± 2% ~ (p=0.360 n=10+8)
StdCmd 14.8s ± 3% 15.9s ± 1% +6.98% (p=0.000 n=10+9)
[Geo mean] 480ms 510ms +6.31%
name old user-time/op new user-time/op delta
Template 223ms ± 3% 237ms ± 3% +6.16% (p=0.000 n=9+10)
Unicode 103ms ± 6% 113ms ± 3% +9.53% (p=0.000 n=9+9)
GoTypes 758ms ± 8% 800ms ± 2% +5.55% (p=0.003 n=10+9)
Compiler 3.95s ± 2% 4.12s ± 2% +4.34% (p=0.000 n=10+9)
SSA 9.43s ± 1% 9.74s ± 4% +3.25% (p=0.000 n=8+10)
Flate 132ms ± 2% 141ms ± 2% +6.89% (p=0.000 n=9+9)
GoParser 177ms ± 9% 183ms ± 4% ~ (p=0.050 n=9+9)
Reflect 467ms ±10% 495ms ± 7% +6.17% (p=0.029 n=10+10)
Tar 183ms ± 9% 197ms ± 5% +7.92% (p=0.001 n=10+10)
XML 249ms ± 5% 268ms ± 4% +7.82% (p=0.000 n=10+9)
LinkCompiler 544ms ± 5% 544ms ± 6% ~ (p=0.863 n=9+9)
ExternalLinkCompiler 1.79s ± 4% 1.75s ± 6% ~ (p=0.075 n=10+10)
LinkWithoutDebugCompiler 248ms ± 6% 246ms ± 2% ~ (p=0.965 n=10+8)
[Geo mean] 483ms 504ms +4.41%
[git-generate]
cd src/cmd/compile/internal/ir
: # We need to do the conversion in multiple steps, so we introduce
: # a temporary type alias that will start out meaning the pointer-to-struct
: # and then change to mean the interface.
rf '
mv Node OldNode
add node.go \
type Node = *OldNode
'
: # It should work to do this ex in ir, but it misses test files, due to a bug in rf.
: # Run the command in gc to handle gc's tests, and then again in ssa for ssa's tests.
cd ../gc
rf '
ex . ../arm ../riscv64 ../arm64 ../mips64 ../ppc64 ../mips ../wasm {
import "cmd/compile/internal/ir"
*ir.OldNode -> ir.Node
}
'
cd ../ssa
rf '
ex {
import "cmd/compile/internal/ir"
*ir.OldNode -> ir.Node
}
'
: # Back in ir, finish conversion clumsily with sed,
: # because type checking and circular aliases do not mix.
cd ../ir
sed -i '' '
/type Node = \*OldNode/d
s/\*OldNode/Node/g
s/^func (n Node)/func (n *OldNode)/
s/OldNode/node/g
s/type INode interface/type Node interface/
s/var _ INode = (Node)(nil)/var _ Node = (*node)(nil)/
' *.go
gofmt -w *.go
sed -i '' '
s/{Func{}, 136, 248}/{Func{}, 152, 280}/
s/{Name{}, 32, 56}/{Name{}, 44, 80}/
s/{Param{}, 24, 48}/{Param{}, 44, 88}/
s/{node{}, 76, 128}/{node{}, 88, 152}/
' sizeof_test.go
cd ../ssa
sed -i '' '
s/{LocalSlot{}, 28, 40}/{LocalSlot{}, 32, 48}/
' sizeof_test.go
cd ../gc
sed -i '' 's/\*ir.Node/ir.Node/' mkbuiltin.go
cd ../../../..
go install std cmd
cd cmd/compile
go test -u || go test -u
Change-Id: I196bbe3b648e4701662e4a2bada40bf155e2a553
Reviewed-on: https://go-review.googlesource.com/c/go/+/272935
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
If we want to break up package gc at all, we will need to move
the compiler IR it defines into a separate package that can be
imported by packages that gc itself imports. This CL does that.
It also removes the TINT8 etc aliases so that all code is clear
about which package things are coming from.
This CL is automatically generated by the script below.
See the comments in the script for details about the changes.
[git-generate]
cd src/cmd/compile/internal/gc
rf '
# These names were never fully qualified
# when the types package was added.
# Do it now, to avoid confusion about where they live.
inline -rm \
Txxx \
TINT8 \
TUINT8 \
TINT16 \
TUINT16 \
TINT32 \
TUINT32 \
TINT64 \
TUINT64 \
TINT \
TUINT \
TUINTPTR \
TCOMPLEX64 \
TCOMPLEX128 \
TFLOAT32 \
TFLOAT64 \
TBOOL \
TPTR \
TFUNC \
TSLICE \
TARRAY \
TSTRUCT \
TCHAN \
TMAP \
TINTER \
TFORW \
TANY \
TSTRING \
TUNSAFEPTR \
TIDEAL \
TNIL \
TBLANK \
TFUNCARGS \
TCHANARGS \
NTYPE \
BADWIDTH
# esc.go and escape.go do not need to be split.
# Append esc.go onto the end of escape.go.
mv esc.go escape.go
# Pull out the type format installation from func Main,
# so it can be carried into package ir.
mv Main:/Sconv.=/-0,/TypeLinkSym/-1 InstallTypeFormats
# Names that need to be exported for use by code left in gc.
mv Isconst IsConst
mv asNode AsNode
mv asNodes AsNodes
mv asTypesNode AsTypesNode
mv basicnames BasicTypeNames
mv builtinpkg BuiltinPkg
mv consttype ConstType
mv dumplist DumpList
mv fdumplist FDumpList
mv fmtMode FmtMode
mv goopnames OpNames
mv inspect Inspect
mv inspectList InspectList
mv localpkg LocalPkg
mv nblank BlankNode
mv numImport NumImport
mv opprec OpPrec
mv origSym OrigSym
mv stmtwithinit StmtWithInit
mv dump DumpAny
mv fdump FDumpAny
mv nod Nod
mv nodl NodAt
mv newname NewName
mv newnamel NewNameAt
mv assertRepresents AssertValidTypeForConst
mv represents ValidTypeForConst
mv nodlit NewLiteral
# Types and fields that need to be exported for use by gc.
mv nowritebarrierrecCallSym SymAndPos
mv SymAndPos.lineno SymAndPos.Pos
mv SymAndPos.target SymAndPos.Sym
mv Func.lsym Func.LSym
mv Func.setWBPos Func.SetWBPos
mv Func.numReturns Func.NumReturns
mv Func.numDefers Func.NumDefers
mv Func.nwbrCalls Func.NWBRCalls
# initLSym is an algorithm left behind in gc,
# not an operation on Func itself.
mv Func.initLSym initLSym
mv nodeQueue NodeQueue
mv NodeQueue.empty NodeQueue.Empty
mv NodeQueue.popLeft NodeQueue.PopLeft
mv NodeQueue.pushRight NodeQueue.PushRight
# Many methods on Node are actually algorithms that
# would apply to any node implementation.
# Those become plain functions.
mv Node.funcname FuncName
mv Node.isBlank IsBlank
mv Node.isGoConst isGoConst
mv Node.isNil IsNil
mv Node.isParamHeapCopy isParamHeapCopy
mv Node.isParamStackCopy isParamStackCopy
mv Node.isSimpleName isSimpleName
mv Node.mayBeShared MayBeShared
mv Node.pkgFuncName PkgFuncName
mv Node.backingArrayPtrLen backingArrayPtrLen
mv Node.isterminating isTermNode
mv Node.labeledControl labeledControl
mv Nodes.isterminating isTermNodes
mv Nodes.sigerr fmtSignature
mv Node.MethodName methodExprName
mv Node.MethodFunc methodExprFunc
mv Node.IsMethod IsMethod
# Every node will need to implement RawCopy;
# Copy and SepCopy algorithms will use it.
mv Node.rawcopy Node.RawCopy
mv Node.copy Copy
mv Node.sepcopy SepCopy
# Extract Node.Format method body into func FmtNode,
# but leave method wrapper behind.
mv Node.Format:0,$ FmtNode
# Formatting helpers that will apply to all node implementations.
mv Node.Line Line
mv Node.exprfmt exprFmt
mv Node.jconv jconvFmt
mv Node.modeString modeString
mv Node.nconv nconvFmt
mv Node.nodedump nodeDumpFmt
mv Node.nodefmt nodeFmt
mv Node.stmtfmt stmtFmt
# Constant support needed for code moving to ir.
mv okforconst OKForConst
mv vconv FmtConst
mv int64Val Int64Val
mv float64Val Float64Val
mv Node.ValueInterface ConstValue
# Organize code into files.
mv LocalPkg BuiltinPkg ir.go
mv NumImport InstallTypeFormats Line fmt.go
mv syntax.go Nod NodAt NewNameAt Class Pxxx PragmaFlag Nointerface SymAndPos \
AsNode AsTypesNode BlankNode OrigSym \
Node.SliceBounds Node.SetSliceBounds Op.IsSlice3 \
IsConst Node.Int64Val Node.CanInt64 Node.Uint64Val Node.BoolVal Node.StringVal \
Node.RawCopy SepCopy Copy \
IsNil IsBlank IsMethod \
Node.Typ Node.StorageClass node.go
mv ConstType ConstValue Int64Val Float64Val AssertValidTypeForConst ValidTypeForConst NewLiteral idealType OKForConst val.go
# Move files to new ir package.
mv bitset.go class_string.go dump.go fmt.go \
ir.go node.go op_string.go val.go \
sizeof_test.go cmd/compile/internal/ir
'
: # fix mkbuiltin.go to generate the changes made to builtin.go during rf
sed -i '' '
s/\[T/[types.T/g
s/\*Node/*ir.Node/g
/internal\/types/c \
fmt.Fprintln(&b, `import (`) \
fmt.Fprintln(&b, ` "cmd/compile/internal/ir"`) \
fmt.Fprintln(&b, ` "cmd/compile/internal/types"`) \
fmt.Fprintln(&b, `)`)
' mkbuiltin.go
gofmt -w mkbuiltin.go
: # update cmd/dist to add internal/ir
cd ../../../dist
sed -i '' '/compile.internal.gc/a\
"cmd/compile/internal/ir",
' buildtool.go
gofmt -w buildtool.go
: # update cmd/compile TestFormats
cd ../..
go install std cmd
cd cmd/compile
go test -u || go test # first one updates but fails; second passes
Change-Id: I5f7caf6b20629b51970279e81231a3574d5b51db
Reviewed-on: https://go-review.googlesource.com/c/go/+/273008
Trust: Russ Cox <rsc@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Move Flag, Debug, Ctxt, Exit, and error messages to
new package cmd/compile/internal/base.
These are the core functionality that everything in gc uses
and which otherwise prevent splitting any other code
out of gc into different packages.
A minor milestone: the compiler source code
no longer contains the string "yy".
[git-generate]
cd src/cmd/compile/internal/gc
rf '
mv atExit AtExit
mv Ctxt atExitFuncs AtExit Exit base.go
mv lineno Pos
mv linestr FmtPos
mv flusherrors FlushErrors
mv yyerror Errorf
mv yyerrorl ErrorfAt
mv yyerrorv ErrorfVers
mv noder.yyerrorpos noder.errorAt
mv Warnl WarnfAt
mv errorexit ErrorExit
mv base.go debug.go flag.go print.go cmd/compile/internal/base
'
: # update comments
sed -i '' 's/yyerrorl/ErrorfAt/g; s/yyerror/Errorf/g' *.go
: # bootstrap.go is not built by default so invisible to rf
sed -i '' 's/Fatalf/base.Fatalf/' bootstrap.go
goimports -w bootstrap.go
: # update cmd/dist to add internal/base
cd ../../../dist
sed -i '' '/internal.amd64/a\
"cmd/compile/internal/base",
' buildtool.go
gofmt -w buildtool.go
Change-Id: I59903c7084222d6eaee38823fd222159ba24a31a
Reviewed-on: https://go-review.googlesource.com/c/go/+/272250
Trust: Russ Cox <rsc@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
When a variable symbol is both imported (possibly through
inlining) and linkname'd, make sure its LSym is marked as
non-package for symbol indexing in the object file, so it is
resolved by name and dedup'd with the original definition.
Fixes#42401.
Change-Id: I8e90c0418c6f46a048945c5fdc06c022b77ed68d
Reviewed-on: https://go-review.googlesource.com/c/go/+/268178
Trust: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Than McIntosh <thanm@google.com>
Reviewed-by: Jeremy Faller <jeremy@golang.org>
gc debug flags are currently stored in a 256-long array, that is then
addressed using the ASCII numeric value of the flag itself (a quirk
inherited from the old C compiler). It is also a little wasteful,
since we only define 16 flags, and the other 240 array elements are
always empty.
This change makes Debug a struct, which also provides static checking
that we're not referencing flags that does not exist.
Change-Id: I2f0dfef2529325514b3398cf78635543cdf48fe0
Reviewed-on: https://go-review.googlesource.com/c/go/+/263539
Trust: Alberto Donizetti <alb.donizetti@gmail.com>
Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
This creates space for a different kind of extension field
in LSym without making the struct any larger.
(There are many LSym, so we care about keeping the struct small.)
Change-Id: Ib16edb9e15f54c2a7351c8b875e19684058711e5
Reviewed-on: https://go-review.googlesource.com/c/go/+/243943
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
We currently use two fields to store the targets of branches.
Some phases use p.To.Val, some use p.Pcond. Rewrite so that
every branch instruction uses p.To.Val.
p.From.Val is also used in rare instances.
Introduce a Pool link for use by arm/arm64, instead of
repurposing Pcond.
This is a cleanup CL in preparation for some stack frame CLs.
Change-Id: If8239177e4b1ea2bccd0608eb39553d23210d405
Reviewed-on: https://go-review.googlesource.com/c/go/+/251437
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
This reverts CL 243318.
Reason for revert: Seems to be crashing some builders.
Change-Id: I2ffc59bc5535be60b884b281c8d0eff4647dc756
Reviewed-on: https://go-review.googlesource.com/c/go/+/251169
Reviewed-by: Bryan C. Mills <bcmills@google.com>
We currently use two fields to store the targets of branches.
Some phases use p.To.Val, some use p.Pcond. Rewrite so that
every branch instruction uses p.To.Val.
p.From.Val is also used in rare instances.
Introduce a Pool link for use by arm/arm64, instead of
repurposing Pcond.
This is a cleanup CL in preparation for some stack frame CLs.
Change-Id: I9055bf0a1d986aff421e47951a1dedc301c846f8
Reviewed-on: https://go-review.googlesource.com/c/go/+/243318
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
More ergonomic that way. Also change Haspointers to HasPointers
while we are here.
Change-Id: I45bedc294c1a8c2bd01dc14bd04615ae77555375
Reviewed-on: https://go-review.googlesource.com/c/go/+/249959
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Emmanuel Odeke <emm.odeke@gmail.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Currently, we emit stack maps and register maps at almost every
instruction. This was originally intended to support non-cooperative
preemption, but was only ever used for debug call injection. Now debug
call injection also uses conservative frame scanning. As a result,
stack maps are only needed at call sites and register maps aren't
needed at all except that we happen to also encode unsafe-point
information in the register map PCDATA stream.
This CL reduces stack maps to only appear at calls, and replace full
register maps with just safe/unsafe-point information.
This is all protected by the go115ReduceLiveness feature flag, which
is defined in both runtime and cmd/compile.
This CL significantly reduces binary sizes and also speeds up compiles
and links:
name old exe-bytes new exe-bytes delta
BinGoSize 15.0MB ± 0% 14.1MB ± 0% -5.72%
name old pcln-bytes new pcln-bytes delta
BinGoSize 3.14MB ± 0% 2.48MB ± 0% -21.08%
name old time/op new time/op delta
Template 178ms ± 7% 172ms ±14% -3.59% (p=0.005 n=19+19)
Unicode 71.0ms ±12% 69.8ms ±10% ~ (p=0.126 n=18+18)
GoTypes 655ms ± 8% 615ms ± 8% -6.11% (p=0.000 n=19+19)
Compiler 3.27s ± 6% 3.15s ± 7% -3.69% (p=0.001 n=20+20)
SSA 7.10s ± 5% 6.85s ± 8% -3.53% (p=0.001 n=19+20)
Flate 124ms ±15% 116ms ±22% -6.57% (p=0.024 n=18+19)
GoParser 156ms ±26% 147ms ±34% ~ (p=0.070 n=19+19)
Reflect 406ms ± 9% 387ms ±21% -4.69% (p=0.028 n=19+20)
Tar 163ms ±15% 162ms ±27% ~ (p=0.370 n=19+19)
XML 223ms ±13% 218ms ±14% ~ (p=0.157 n=20+20)
LinkCompiler 503ms ±21% 484ms ±23% ~ (p=0.072 n=20+20)
ExternalLinkCompiler 1.27s ± 7% 1.22s ± 8% -3.85% (p=0.005 n=20+19)
LinkWithoutDebugCompiler 294ms ±17% 273ms ±11% -7.16% (p=0.001 n=19+18)
(https://perf.golang.org/search?q=upload:20200428.8)
The binary size improvement is even slightly better when you include
the CLs leading up to this. Relative to the parent of "cmd/compile:
mark PanicBounds/Extend as calls":
name old exe-bytes new exe-bytes delta
BinGoSize 15.0MB ± 0% 14.1MB ± 0% -6.18%
name old pcln-bytes new pcln-bytes delta
BinGoSize 3.22MB ± 0% 2.48MB ± 0% -22.92%
(https://perf.golang.org/search?q=upload:20200428.9)
For #36365.
Change-Id: I69448e714f2a44430067ca97f6b78e08c0abed27
Reviewed-on: https://go-review.googlesource.com/c/go/+/230544
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
The compiler currently conflates whether a Value has a stack map with
whether it's an unsafe point. For the most part, unsafe-points don't
have stack maps, so this is mostly fine, but call instructions can be
both an unsafe-point *and* have a stack map. For example, none of the
instructions in a nosplit function should be preemptible, but calls
must still have stack maps in case the called function grows the stack
or get preempted.
Currently, the compiler can't distinguish this case, so calls in
nosplit functions are marked as safe-points just because they have
stack maps. This is particularly problematic if a nosplit function
calls another nosplit function, since this can introduce a preemption
point where there should be none.
We realized this was a problem for split-stack prologues a while back,
and CL 207349 changed the encoding of unsafe-points to use the
register map index instead of the stack map index so we could record
both a stack map and an unsafe-point at the same instruction. But this
was never extended into the compiler.
This CL fixes this problem in the compiler. We make LivenessIndex
slightly more abstract by separating unsafe-point marks from stack and
register map indexes. We map this to the PCDATA encoding later when
producing Progs. This isn't enough to fix the whole problem for
nosplit functions, because obj still adds prologues and marks those as
preemptible, but it's a step in the right direction.
I checked this CL by comparing maps before and after this change in
the runtime and net/http. In net/http, unsafe-points match exactly; at
anything that isn't an unsafe-point, both the stack and register maps
are unchanged by this CL. In the runtime, at every point that was a
safe-point before this change, the stack maps agree (and mostly the
runtime doesn't have register maps at all now). In both, all CALLs
(except write barrier calls) have stack maps.
For #36365.
Change-Id: I066628938b02e78be5c81a6614295bcf7cc566c2
Reviewed-on: https://go-review.googlesource.com/c/go/+/230541
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
This CL adds experimental coverage instrumentation similar to what
github.com/dvyukov/go-fuzz produces in its -libfuzzer mode. The
coverage can be enabled by compiling with -d=libfuzzer. It's intended
to be used in conjunction with -buildmode=c-archive to produce an ELF
archive (.a) file that can be linked with libFuzzer. See #14565 for
example usage.
The coverage generates a unique 8-bit counter for each basic block in
the original source code, and emits an increment operation. These
counters are then collected into the __libfuzzer_extra_counters ELF
section for use by libFuzzer.
Updates #14565.
Change-Id: I239758cc0ceb9ca1220f2d9d3d23b9e761db9bf1
Reviewed-on: https://go-review.googlesource.com/c/go/+/202117
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Currently, the compiler fails to mark any unsafe-points in the initial
instructions of a function as unsafe points. This happens because
unsafe points are encoded as a stack map index of -2 and the compiler
emits PCDATA instructions when there's a change in the stack map
index, but I had set the initial stack map index to -2. The actual
initial PCDATA value assumed by the PCDATA encoder and the runtime is
-1. Hence, if the first instructions had a stack map index of -2, no
PCDATA was emitted, which cause the runtime to assume the index was -1
instead.
This was particularly problematic in the runtime, where the compiler
was supposed to mark only calls as safe-points and everything else as
unsafe-points. Runtime leaf functions, for example, should have been
marked as entirely unsafe-points, but were instead marked entirely as
safe-points.
Fix this by making the PCDATA instruction generator assume the initial
PCDATA value is -1 instead of -2, so it will emit a PCDATA instruction
right away if the first real instruction is an unsafe-point.
This increases the size of the cmd/go binary by 0.02% since we now
emit slightly more PCDATA than before.
For #10958, #24543.
Change-Id: I92222107f799130072b36d49098d2686f1543699
Reviewed-on: https://go-review.googlesource.com/c/go/+/202084
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
isfat was removed in walkexpr in CL 32313. For consistency,
remove it from order expr, too.
Change-Id: I0a47e0da13ba0168d6a055d990b8efad26ad790d
Reviewed-on: https://go-review.googlesource.com/c/go/+/179057
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
dist passes the -allabis flag to the compiler to avoid having to
recreate the cross-package ABI logic from cmd/go. However, we removed
that logic from cmd/go in CL 179863 and replaced it with a different
mechanism that doesn't depend on the build system. Hence, passing
-allabis in dist is no longer necessary.
This CL removes -allabis from dist and, since that was the only use of
it, removes support for it from the compiler as well.
Updates #31230.
Change-Id: Ib005db95755a7028f49c885785e72c3970aea4f9
Reviewed-on: https://go-review.googlesource.com/c/go/+/181079
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Jay Conrod <jayconrod@google.com>
Calling a Go symbol from assembly in another package currently results
in a link failure because the Go symbol is defined as ABIInternal, but
the assembly call is from ABI0. In general this is okay because you
shouldn't do this anyway, but there are special cases where this is
necessary, especially between the runtime and packages closely tied to
the runtime in std.
Currently, we address this for runtime symbols with a hack in cmd/go
that knows to scan related packages when building the symabis file for
the runtime and runtime/internal/atomic. However, in addition to being
a messy solution in the first place, this hack causes races in cmd/go
that are difficult to work around.
We considered creating dummy references from assembly in the runtime
to these symbols, just to make sure they get ABI0 wrappers. However,
there are a fairly large number of these symbols on some platforms,
and it can vary significantly depending on build flags (e.g., race
mode), so even this solution is fairly unpalatable.
This CL addresses this by providing a way to mark symbols in Go code
that should be made available to assembly in other packages. Rather
than introduce a new pragma, we lightly expand the meaning of
"//go:linkname", since that pragma already generally indicates that
you're making the symbol available in a way it wasn't before. This
also dovetails nicely with the behavior of go:linkname in gccgo, which
makes unexported symbols available to other packages.
Follow-up CLs will make use of this and then remove the hack from
cmd/go.
Updates #31230.
Change-Id: I23060c97280626581f025c5c01fb8d24bb4c5159
Reviewed-on: https://go-review.googlesource.com/c/go/+/179860
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
They can't be used, so we don't need code generated for them. We just
need to report errors in their bodies.
The compiler currently has a bunch of special cases sprinkled about
for "_" functions, because we never generate a linker symbol for them.
Instead, abort compilation earlier so we never reach any of that
special-case code.
Fixes#29870
Change-Id: I3530c9c353deabcf75ce9072c0b740e992349ee5
Reviewed-on: https://go-review.googlesource.com/c/158845
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
This fixes the linux-amd64-longtest builder, which was broken by CL
147160.
Updates #27539.
Change-Id: If6e69581ef503bba2449ec9bacaa31f34f59beb1
Reviewed-on: https://go-review.googlesource.com/c/149157
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
This implements compiler and linker support for separating the
function calling ABI into two ABIs: a stable and an internal ABI. At
the moment, the two ABIs are identical, but we'll be able to evolve
the internal ABI without breaking existing assembly code that depends
on the stable ABI for calling to and from Go.
The Go compiler generates internal ABI symbols for all Go functions.
It uses the symabis information produced by the assembler to create
ABI wrappers whenever it encounters a body-less Go function that's
defined in assembly or a Go function that's referenced from assembly.
Since the two ABIs are currently identical, for the moment this is
implemented using "ABI alias" symbols, which are just forwarding
references to the native ABI symbol for a function. This way there's
no actual code involved in the ABI wrapper, which is good because
we're not deriving any benefit from it right now. Once the ABIs
diverge, we can eliminate ABI aliases.
The linker represents these different ABIs internally as different
versions of the same symbol. This way, the linker keeps us honest,
since every symbol definition and reference also specifies its
version. The linker is responsible for resolving ABI aliases.
Fixes#27539.
Change-Id: I197c52ec9f8fc435db8f7a4259029b20f6d65e95
Reviewed-on: https://go-review.googlesource.com/c/147160
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Currently, liveness produces a distinct obj.LSym for each GC bitmap
for each function. These are then named by content hash and only
ultimately deduplicated by WriteObjFile.
For various reasons (see next commit), we want to remove this
deduplication behavior from WriteObjFile. Furthermore, it's
inefficient to produce these duplicate symbols in the first place.
GC bitmaps are the only source of duplicate symbols in the compiler.
This commit eliminates these duplicate symbols by declaring them in
the Ctxt symbol hash just like every other obj.LSym. As a result, all
GC bitmaps with the same content now refer to the same obj.LSym.
The next commit will remove deduplication from WriteObjFile.
For #27539.
Change-Id: I4f15e3d99530122cdf473b7a838c69ef5f79db59
Reviewed-on: https://go-review.googlesource.com/c/146557
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
There's a glitch in how attributes from procs that do not
generate code are combined, and the workaround for this
glitch appeared in two places.
"One big pile is better than two little ones."
Updates #25426.
Change-Id: I252f9adc5b77591720a61fa22e6f9dda33d95350
Reviewed-on: https://go-review.googlesource.com/113717
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
A new pass run after ssa building (before any other
optimization) identifies the "first" ssa node for each
statement. Other "noise" nodes are tagged as being never
appropriate for a statement boundary (e.g., VarKill, VarDef,
Phi).
Rewrite, deadcode, cse, and nilcheck are modified to move
the statement boundaries forward whenever possible if a
boundary-tagged ssa value is removed; never-boundary nodes
are ignored in this search (some operations involving
constants are also tagged as never-boundary and also ignored
because they are likely to be moved or removed during
optimization).
Code generation treats all nodes except those explicitly
marked as statement boundaries as "not statement" nodes,
and floats statement boundaries to the beginning of each
same-line run of instructions found within a basic block.
Line number html conversion was modified to make statement
boundary nodes a bit more obvious by prepending a "+".
The code in fuse.go that glued together the value slices
of two blocks produced a result that depended on the
former capacities (not lengths) of the two slices. This
causes differences in the 386 bootstrap, and also can
sometimes put values into an order that does a worse job
of preserving statement boundaries when values are removed.
Portions of two delve tests that had caught problems were
incorporated into ssa/debug_test.go. There are some
opportunities to do better with optimized code, but the
next-ing is not lying or overly jumpy.
Over 4 CLs, compilebench geomean measured binary size
increase of 3.5% and compile user time increase of 3.8%
(this is after optimization to reuse a sparse map instead
of creating multiple maps.)
This CL worsens the optimized-debugging experience with
Delve; we need to work with the delve team so that
they can use the is_stmt marks that we're emitting now.
The reference output changes from time to time depending
on other changes in the compiler, sometimes better,
sometimes worse.
This CL now includes a test ensuring that 99+% of the lines
in the Go command itself (a handy optimized binary) include
is_stmt markers.
Change-Id: I359c94e06843f1eb41f9da437bd614885aa9644a
Reviewed-on: https://go-review.googlesource.com/102435
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Focus on "isfoo" funcs that take a *Node, and conver them to isFoo
methods instead. This makes for more idiomatic Go code, and also more
readable func names.
Found candidates with grep, and applied most changes with sed. The funcs
chosen were isgoconst, isnil, and isblank. All had the same signature,
func(*Node) bool.
While at it, camelCase the isliteral and iszero function names. Don't
move these to methods, as they are only used in the backend part of gc,
which might one day be split into a separate package.
Passes toolstash -cmp on std cmd.
Change-Id: I4df081b12d36c46c253167c8841c5a841f1c5a16
Reviewed-on: https://go-review.googlesource.com/105555
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
The NoFramePointer function flag is no longer used, so this CL
eliminates it. This cleans up some confusion between the compiler's
NoFramePointer flag and obj's NOFRAME flag. NoFramePointer was
intended to eliminate the saved base pointer on x86, but it was
translated into obj's NOFRAME flag. On x86, NOFRAME does mean to omit
the saved base pointer, but on ppc64 and s390x it has a more general
meaning of omitting *everything* from the frame, including the saved
LR and ppc64's "fixed frame". Hence, on ppc64 and s390x there are far
fewer situations where it is safe to set this flag.
Change-Id: If68991310b4d00638128c296bdd57f4ed731b46d
Reviewed-on: https://go-review.googlesource.com/92036
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Compiler and linker changes to support DWARF inlined instances,
see https://go.googlesource.com/proposal/+/HEAD/design/22080-dwarf-inlining.md
for design details.
This functionality is gated via the cmd/compile option -gendwarfinl=N,
where N={0,1,2}, where a value of 0 disables dwarf inline generation,
a value of 1 turns on dwarf generation without tracking of formal/local
vars from inlined routines, and a value of 2 enables inlines with
variable tracking.
Updates #22080
Change-Id: I69309b3b815d9fed04aebddc0b8d33d0dbbfad6e
Reviewed-on: https://go-review.googlesource.com/75550
Run-TryBot: Than McIntosh <thanm@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
* replace a copy of IsMethod with a call of it.
* a few more switches where they simplify the code.
* prefer composite literals over "n := new(...); n.x = y; ...".
* use defers to get rid of three goto labels.
* rewrite updateHasCall into two funcs to remove gotos.
Passes toolstash-check on std cmd.
Change-Id: Icb5442a89a87319ef4b640bbc5faebf41b193ef1
Reviewed-on: https://go-review.googlesource.com/72070
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
As per golint's suggestions.
Change-Id: Ie0c6ad9aa5dc69966a279562a341c7b095c47ede
Reviewed-on: https://go-review.googlesource.com/64192
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Its sole use is in walk.go. 100% code movement.
gsubr.go increasingly contains backend-y things.
With a few more relocations, it could probably be
fruitfully renamed progs.go.
Change-Id: I61ec5c2bc1f8cfdda64c6d6f580952c154ff60e0
Reviewed-on: https://go-review.googlesource.com/41972
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
This CL adds initial support for concurrent backend compilation.
BACKGROUND
The compiler currently consists (very roughly) of the following phases:
1. Initialization.
2. Lexing and parsing into the cmd/compile/internal/syntax AST.
3. Translation into the cmd/compile/internal/gc AST.
4. Some gc AST passes: typechecking, escape analysis, inlining,
closure handling, expression evaluation ordering (order.go),
and some lowering and optimization (walk.go).
5. Translation into the cmd/compile/internal/ssa SSA form.
6. Optimization and lowering of SSA form.
7. Translation from SSA form to assembler instructions.
8. Translation from assembler instructions to machine code.
9. Writing lots of output: machine code, DWARF symbols,
type and reflection info, export data.
Phase 2 was already concurrent as of Go 1.8.
Phase 3 is planned for eventual removal;
we hope to go straight from syntax AST to SSA.
Phases 5–8 are per-function; this CL adds support for
processing multiple functions concurrently.
The slowest phases in the compiler are 5 and 6,
so this offers the opportunity for some good speed-ups.
Unfortunately, it's not quite that straightforward.
In the current compiler, the latter parts of phase 4
(order, walk) are done function-at-a-time as needed.
Making order and walk concurrency-safe proved hard,
and they're not particularly slow, so there wasn't much reward.
To enable phases 5–8 to be done concurrently,
when concurrent backend compilation is requested,
we complete phase 4 for all functions
before starting later phases for any functions.
Also, in reality, we automatically generate new
functions in phase 9, such as method wrappers
and equality and has routines.
Those new functions then go through phases 4–8.
This CL disables concurrent backend compilation
after the first, big, user-provided batch of
functions has been compiled.
This is done to keep things simple,
and because the autogenerated functions
tend to be small, few, simple, and fast to compile.
USAGE
Concurrent backend compilation still defaults to off.
To set the number of functions that may be backend-compiled
concurrently, use the compiler flag -c.
In future work, cmd/go will automatically set -c.
Furthermore, this CL has been intentionally written
so that the c=1 path has no backend concurrency whatsoever,
not even spawning any goroutines.
This helps ensure that, should problems arise
late in the development cycle,
we can simply have cmd/go set c=1 always,
and revert to the original compiler behavior.
MUTEXES
Most of the work required to make concurrent backend
compilation safe has occurred over the past month.
This CL adds a handful of mutexes to get the rest of the way there;
they are the mutexes that I didn't see a clean way to avoid.
Some of them may still be eliminable in future work.
In no particular order:
* gc.funcsymsmu. The global funcsyms slice is populated
lazily when we need function symbols for closures.
This occurs during gc AST to SSA translation.
The function funcsym also does a package lookup,
which is a source of races on types.Pkg.Syms;
funcsymsmu also covers that package lookup.
This mutex is low priority: it adds a single global,
it is in an infrequently used code path, and it is low contention.
Since funcsyms may now be added in any order,
we must sort them to preserve reproducible builds.
* gc.largeStackFramesMu. We don't discover until after SSA compilation
that a function's stack frame is gigantic.
Recording that error happens basically never,
but it does happen concurrently.
Fix with a low priority mutex and sorting.
* obj.Link.hashmu. ctxt.hash stores the mapping from
types.Syms (compiler symbols) to obj.LSyms (linker symbols).
It is accessed fairly heavily through all the phases.
This is the only heavily contended mutex.
* gc.signatlistmu. The global signatlist map is
populated with types through several of the concurrent phases,
including notably via ngotype during DWARF generation.
It is low priority for removal.
* gc.typepkgmu. Looking up symbols in the types package
happens a fair amount during backend compilation
and DWARF generation, particularly via ngotype.
This mutex helps us to avoid a broader mutex on types.Pkg.Syms.
It has low-to-moderate contention.
* types.internedStringsmu. gc AST to SSA conversion and
some SSA work introduce new autotmps.
Those autotmps have their names interned to reduce allocations.
That interning requires protecting types.internedStrings.
The autotmp names are heavily re-used, and the mutex
overhead and contention here are low, so it is probably
a worthwhile performance optimization to keep this mutex.
TESTING
I have been testing this code locally by running
'go install -race cmd/compile'
and then doing
'go build -a -gcflags=-c=128 std cmd'
for all architectures and a variety of compiler flags.
This obviously needs to be made part of the builders,
but it is too expensive to make part of all.bash.
I have filed #19962 for this.
REPRODUCIBLE BUILDS
This version of the compiler generates reproducible builds.
Testing reproducible builds also needs automation, however,
and is also too expensive for all.bash.
This is #19961.
Also of note is that some of the compiler flags used by 'toolstash -cmp'
are currently incompatible with concurrent backend compilation.
They still work fine with c=1.
Time will tell whether this is a problem.
NEXT STEPS
* Continue to find and fix races and bugs,
using a combination of code inspection, fuzzing,
and hopefully some community experimentation.
I do not know of any outstanding races,
but there probably are some.
* Improve testing.
* Improve performance, for many values of c.
* Integrate with cmd/go and fine tune.
* Support concurrent compilation with the -race flag.
It is a sad irony that it does not yet work.
* Minor code cleanup that has been deferred during
the last month due to uncertainty about the
ultimate shape of this CL.
PERFORMANCE
Here's the buried lede, at last. :)
All benchmarks are from my 8 core 2.9 GHz Intel Core i7 darwin/amd64 laptop.
First, going from tip to this CL with c=1 has almost no impact.
name old time/op new time/op delta
Template 195ms ± 3% 194ms ± 5% ~ (p=0.370 n=30+29)
Unicode 86.6ms ± 3% 87.0ms ± 7% ~ (p=0.958 n=29+30)
GoTypes 548ms ± 3% 555ms ± 4% +1.35% (p=0.001 n=30+28)
Compiler 2.51s ± 2% 2.54s ± 2% +1.17% (p=0.000 n=28+30)
SSA 5.16s ± 3% 5.16s ± 2% ~ (p=0.910 n=30+29)
Flate 124ms ± 5% 124ms ± 4% ~ (p=0.947 n=30+30)
GoParser 146ms ± 3% 146ms ± 3% ~ (p=0.150 n=29+28)
Reflect 354ms ± 3% 352ms ± 4% ~ (p=0.096 n=29+29)
Tar 107ms ± 5% 106ms ± 3% ~ (p=0.370 n=30+29)
XML 200ms ± 4% 201ms ± 4% ~ (p=0.313 n=29+28)
[Geo mean] 332ms 333ms +0.10%
name old user-time/op new user-time/op delta
Template 227ms ± 5% 225ms ± 5% ~ (p=0.457 n=28+27)
Unicode 109ms ± 4% 109ms ± 5% ~ (p=0.758 n=29+29)
GoTypes 713ms ± 4% 721ms ± 5% ~ (p=0.051 n=30+29)
Compiler 3.36s ± 2% 3.38s ± 3% ~ (p=0.146 n=30+30)
SSA 7.46s ± 3% 7.47s ± 3% ~ (p=0.804 n=30+29)
Flate 146ms ± 7% 147ms ± 3% ~ (p=0.833 n=29+27)
GoParser 179ms ± 5% 179ms ± 5% ~ (p=0.866 n=30+30)
Reflect 431ms ± 4% 429ms ± 4% ~ (p=0.593 n=29+30)
Tar 124ms ± 5% 123ms ± 5% ~ (p=0.140 n=29+29)
XML 243ms ± 4% 242ms ± 7% ~ (p=0.404 n=29+29)
[Geo mean] 415ms 415ms +0.02%
name old obj-bytes new obj-bytes delta
Template 382k ± 0% 382k ± 0% ~ (all equal)
Unicode 203k ± 0% 203k ± 0% ~ (all equal)
GoTypes 1.18M ± 0% 1.18M ± 0% ~ (all equal)
Compiler 3.98M ± 0% 3.98M ± 0% ~ (all equal)
SSA 8.28M ± 0% 8.28M ± 0% ~ (all equal)
Flate 230k ± 0% 230k ± 0% ~ (all equal)
GoParser 287k ± 0% 287k ± 0% ~ (all equal)
Reflect 1.00M ± 0% 1.00M ± 0% ~ (all equal)
Tar 190k ± 0% 190k ± 0% ~ (all equal)
XML 416k ± 0% 416k ± 0% ~ (all equal)
[Geo mean] 660k 660k +0.00%
Comparing this CL to itself, from c=1 to c=2
improves real times 20-30%, costs 5-10% more CPU time,
and adds about 2% alloc.
The allocation increase comes from allocating more ssa.Caches.
name old time/op new time/op delta
Template 202ms ± 3% 149ms ± 3% -26.15% (p=0.000 n=49+49)
Unicode 87.4ms ± 4% 84.2ms ± 3% -3.68% (p=0.000 n=48+48)
GoTypes 560ms ± 2% 398ms ± 2% -28.96% (p=0.000 n=49+49)
Compiler 2.46s ± 3% 1.76s ± 2% -28.61% (p=0.000 n=48+46)
SSA 6.17s ± 2% 4.04s ± 1% -34.52% (p=0.000 n=49+49)
Flate 126ms ± 3% 92ms ± 2% -26.81% (p=0.000 n=49+48)
GoParser 148ms ± 4% 107ms ± 2% -27.78% (p=0.000 n=49+48)
Reflect 361ms ± 3% 281ms ± 3% -22.10% (p=0.000 n=49+49)
Tar 109ms ± 4% 86ms ± 3% -20.81% (p=0.000 n=49+47)
XML 204ms ± 3% 144ms ± 2% -29.53% (p=0.000 n=48+45)
name old user-time/op new user-time/op delta
Template 246ms ± 9% 246ms ± 4% ~ (p=0.401 n=50+48)
Unicode 109ms ± 4% 111ms ± 4% +1.47% (p=0.000 n=44+50)
GoTypes 728ms ± 3% 765ms ± 3% +5.04% (p=0.000 n=46+50)
Compiler 3.33s ± 3% 3.41s ± 2% +2.31% (p=0.000 n=49+48)
SSA 8.52s ± 2% 9.11s ± 2% +6.93% (p=0.000 n=49+47)
Flate 149ms ± 4% 161ms ± 3% +8.13% (p=0.000 n=50+47)
GoParser 181ms ± 5% 192ms ± 2% +6.40% (p=0.000 n=49+46)
Reflect 452ms ± 9% 474ms ± 2% +4.99% (p=0.000 n=50+48)
Tar 126ms ± 6% 136ms ± 4% +7.95% (p=0.000 n=50+49)
XML 247ms ± 5% 264ms ± 3% +6.94% (p=0.000 n=48+50)
name old alloc/op new alloc/op delta
Template 38.8MB ± 0% 39.3MB ± 0% +1.48% (p=0.008 n=5+5)
Unicode 29.8MB ± 0% 30.2MB ± 0% +1.19% (p=0.008 n=5+5)
GoTypes 113MB ± 0% 114MB ± 0% +0.69% (p=0.008 n=5+5)
Compiler 443MB ± 0% 447MB ± 0% +0.95% (p=0.008 n=5+5)
SSA 1.25GB ± 0% 1.26GB ± 0% +0.89% (p=0.008 n=5+5)
Flate 25.3MB ± 0% 25.9MB ± 1% +2.35% (p=0.008 n=5+5)
GoParser 31.7MB ± 0% 32.2MB ± 0% +1.59% (p=0.008 n=5+5)
Reflect 78.2MB ± 0% 78.9MB ± 0% +0.91% (p=0.008 n=5+5)
Tar 26.6MB ± 0% 27.0MB ± 0% +1.80% (p=0.008 n=5+5)
XML 42.4MB ± 0% 43.4MB ± 0% +2.35% (p=0.008 n=5+5)
name old allocs/op new allocs/op delta
Template 379k ± 0% 378k ± 0% ~ (p=0.421 n=5+5)
Unicode 322k ± 0% 321k ± 0% ~ (p=0.222 n=5+5)
GoTypes 1.14M ± 0% 1.14M ± 0% ~ (p=0.548 n=5+5)
Compiler 4.12M ± 0% 4.11M ± 0% -0.14% (p=0.032 n=5+5)
SSA 9.72M ± 0% 9.72M ± 0% ~ (p=0.421 n=5+5)
Flate 234k ± 1% 234k ± 0% ~ (p=0.421 n=5+5)
GoParser 316k ± 1% 315k ± 0% ~ (p=0.222 n=5+5)
Reflect 980k ± 0% 979k ± 0% ~ (p=0.095 n=5+5)
Tar 249k ± 1% 249k ± 1% ~ (p=0.841 n=5+5)
XML 392k ± 0% 391k ± 0% ~ (p=0.095 n=5+5)
From c=1 to c=4, real time is down ~40%, CPU usage up 10-20%, alloc up ~5%:
name old time/op new time/op delta
Template 203ms ± 3% 131ms ± 5% -35.45% (p=0.000 n=50+50)
Unicode 87.2ms ± 4% 84.1ms ± 2% -3.61% (p=0.000 n=48+47)
GoTypes 560ms ± 4% 310ms ± 2% -44.65% (p=0.000 n=50+49)
Compiler 2.47s ± 3% 1.41s ± 2% -43.10% (p=0.000 n=50+46)
SSA 6.17s ± 2% 3.20s ± 2% -48.06% (p=0.000 n=49+49)
Flate 126ms ± 4% 74ms ± 2% -41.06% (p=0.000 n=49+48)
GoParser 148ms ± 4% 89ms ± 3% -39.97% (p=0.000 n=49+50)
Reflect 360ms ± 3% 242ms ± 3% -32.81% (p=0.000 n=49+49)
Tar 108ms ± 4% 73ms ± 4% -32.48% (p=0.000 n=50+49)
XML 203ms ± 3% 119ms ± 3% -41.56% (p=0.000 n=49+48)
name old user-time/op new user-time/op delta
Template 246ms ± 9% 287ms ± 9% +16.98% (p=0.000 n=50+50)
Unicode 109ms ± 4% 118ms ± 5% +7.56% (p=0.000 n=46+50)
GoTypes 735ms ± 4% 806ms ± 2% +9.62% (p=0.000 n=50+50)
Compiler 3.34s ± 4% 3.56s ± 2% +6.78% (p=0.000 n=49+49)
SSA 8.54s ± 3% 10.04s ± 3% +17.55% (p=0.000 n=50+50)
Flate 149ms ± 6% 176ms ± 3% +17.82% (p=0.000 n=50+48)
GoParser 181ms ± 5% 213ms ± 3% +17.47% (p=0.000 n=50+50)
Reflect 453ms ± 6% 499ms ± 2% +10.11% (p=0.000 n=50+48)
Tar 126ms ± 5% 149ms ±11% +18.76% (p=0.000 n=50+50)
XML 246ms ± 5% 287ms ± 4% +16.53% (p=0.000 n=49+50)
name old alloc/op new alloc/op delta
Template 38.8MB ± 0% 40.4MB ± 0% +4.21% (p=0.008 n=5+5)
Unicode 29.8MB ± 0% 30.9MB ± 0% +3.68% (p=0.008 n=5+5)
GoTypes 113MB ± 0% 116MB ± 0% +2.71% (p=0.008 n=5+5)
Compiler 443MB ± 0% 455MB ± 0% +2.75% (p=0.008 n=5+5)
SSA 1.25GB ± 0% 1.27GB ± 0% +1.84% (p=0.008 n=5+5)
Flate 25.3MB ± 0% 26.9MB ± 1% +6.31% (p=0.008 n=5+5)
GoParser 31.7MB ± 0% 33.2MB ± 0% +4.61% (p=0.008 n=5+5)
Reflect 78.2MB ± 0% 80.2MB ± 0% +2.53% (p=0.008 n=5+5)
Tar 26.6MB ± 0% 27.9MB ± 0% +5.19% (p=0.008 n=5+5)
XML 42.4MB ± 0% 44.6MB ± 0% +5.20% (p=0.008 n=5+5)
name old allocs/op new allocs/op delta
Template 380k ± 0% 379k ± 0% -0.39% (p=0.032 n=5+5)
Unicode 321k ± 0% 321k ± 0% ~ (p=0.841 n=5+5)
GoTypes 1.14M ± 0% 1.14M ± 0% ~ (p=0.421 n=5+5)
Compiler 4.12M ± 0% 4.14M ± 0% +0.52% (p=0.008 n=5+5)
SSA 9.72M ± 0% 9.76M ± 0% +0.37% (p=0.008 n=5+5)
Flate 234k ± 1% 234k ± 1% ~ (p=0.690 n=5+5)
GoParser 316k ± 0% 317k ± 1% ~ (p=0.841 n=5+5)
Reflect 981k ± 0% 981k ± 0% ~ (p=1.000 n=5+5)
Tar 250k ± 0% 249k ± 1% ~ (p=0.151 n=5+5)
XML 393k ± 0% 392k ± 0% ~ (p=0.056 n=5+5)
Going beyond c=4 on my machine tends to increase CPU time and allocs
without impacting real time.
The CPU time numbers matter, because when there are many concurrent
compilation processes, that will impact the overall throughput.
The numbers above are in many ways the best case scenario;
we can take full advantage of all cores.
Fortunately, the most common compilation scenario is incremental
re-compilation of a single package during a build/test cycle.
Updates #15756
Change-Id: I6725558ca2069edec0ac5b0d1683105a9fff6bea
Reviewed-on: https://go-review.googlesource.com/40693
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Robert Griesemer <gri@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Now only cmd/asm and cmd/compile depend on cmd/internal/obj. Changing
the assembler backends no longer requires reinstalling cmd/link or
cmd/addr2line.
There's also now one canonical definition of the object file format in
cmd/internal/objabi/doc.go, with a warning to update all three
implementations.
objabi is still something of a grab bag of unrelated code (e.g., flag
and environment variable handling probably belong in a separate "tool"
package), but this is still progress.
Fixes#15165.
Fixes#20026.
Change-Id: Ic4b92fac7d0d35438e0d20c9579aad4085c5534c
Reviewed-on: https://go-review.googlesource.com/40972
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Automated refactoring using github.com/mdempsky/unbed (to rewrite
s.Foo to s.FuncInfo.Foo) and then gorename (to rename the FuncInfo
field to just Func).
Passes toolstash-check -all.
Change-Id: I802c07a1239e0efea058a91a87c5efe12170083a
Reviewed-on: https://go-review.googlesource.com/40670
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
The compiler handled gcargs and gclocals LSyms unusually.
It generated placeholder symbols (makefuncdatasym),
filled them in, and then renamed them for content-addressability.
This is an important binary size optimization;
the same locals information occurs over and over.
This CL continues to treat these LSyms unusually,
but in a slightly more explicit way,
and importantly for concurrent compilation,
in a way that does not require concurrent
modification of Ctxt.Hash.
Instead of creating gcargs and gclocals in the usual way,
by creating a types.Sym and then an obj.LSym,
we add them directly to obj.FuncInfo,
initialize them in obj.InitTextSym,
and deduplicate and add them to ctxt.Data at the end.
Then the backend's job is simply to fill them in
and rename them appropriately.
Updates #15756
name old alloc/op new alloc/op delta
Template 38.8MB ± 0% 38.7MB ± 0% -0.22% (p=0.016 n=5+5)
Unicode 29.8MB ± 0% 29.8MB ± 0% ~ (p=0.690 n=5+5)
GoTypes 113MB ± 0% 113MB ± 0% -0.24% (p=0.008 n=5+5)
SSA 1.25GB ± 0% 1.24GB ± 0% -0.39% (p=0.008 n=5+5)
Flate 25.3MB ± 0% 25.2MB ± 0% -0.43% (p=0.008 n=5+5)
GoParser 31.7MB ± 0% 31.7MB ± 0% -0.22% (p=0.008 n=5+5)
Reflect 78.2MB ± 0% 77.6MB ± 0% -0.80% (p=0.008 n=5+5)
Tar 26.6MB ± 0% 26.3MB ± 0% -0.85% (p=0.008 n=5+5)
XML 42.4MB ± 0% 41.9MB ± 0% -1.04% (p=0.008 n=5+5)
name old allocs/op new allocs/op delta
Template 378k ± 0% 377k ± 1% ~ (p=0.151 n=5+5)
Unicode 321k ± 1% 321k ± 0% ~ (p=0.841 n=5+5)
GoTypes 1.14M ± 0% 1.14M ± 0% -0.47% (p=0.016 n=5+5)
SSA 9.71M ± 0% 9.67M ± 0% -0.33% (p=0.008 n=5+5)
Flate 233k ± 1% 232k ± 1% ~ (p=0.151 n=5+5)
GoParser 316k ± 0% 315k ± 0% -0.49% (p=0.016 n=5+5)
Reflect 979k ± 0% 972k ± 0% -0.75% (p=0.008 n=5+5)
Tar 250k ± 0% 247k ± 1% -0.92% (p=0.008 n=5+5)
XML 392k ± 1% 389k ± 0% -0.67% (p=0.008 n=5+5)
Change-Id: Idc36186ca9d2f8214b5f7720bbc27b6bb22fdc48
Reviewed-on: https://go-review.googlesource.com/40697
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Prior to this CL, flags such as NOSPLIT
on ATEXT Progs were stored in From3.Offset.
Some but not all of those flags were also
duplicated into From.Sym.Attribute.
This CL migrates all of those flags into
From.Sym.Attribute and stops creating a From3.
A side-effect of this is that printing an
ATEXT Prog can no longer simply dump From3.Offset.
That's kind of good, since the raw flag value
wasn't very informative anyway, but it did
necessitate a bunch of updates to the cmd/asm tests.
The reason I'm doing this work now is that
avoiding storing flags in both From.Sym and From3.Offset
simplifies some other changes to fix the data
race first described in CL 40254.
This CL almost passes toolstash-check -all.
The only changes are in cases where the assembler
has decided that a function's flags may be altered,
e.g. to make a function with no calls in it NOSPLIT.
Prior to this CL, that information was not printed.
Sample before:
"".Ctz64 t=1 size=63 args=0x10 locals=0x0
0x0000 00000 (/Users/josh/go/tip/src/runtime/internal/sys/intrinsics.go:35) TEXT "".Ctz64(SB), $0-16
0x0000 00000 (/Users/josh/go/tip/src/runtime/internal/sys/intrinsics.go:35) FUNCDATA $0, gclocals·f207267fbf96a0178e8758c6e3e0ce28(SB)
Sample after:
"".Ctz64 t=1 nosplit size=63 args=0x10 locals=0x0
0x0000 00000 (/Users/josh/go/tip/src/runtime/internal/sys/intrinsics.go:35) TEXT "".Ctz64(SB), NOSPLIT, $0-16
0x0000 00000 (/Users/josh/go/tip/src/runtime/internal/sys/intrinsics.go:35) FUNCDATA $0, gclocals·f207267fbf96a0178e8758c6e3e0ce28(SB)
Observe the additional "nosplit" in the first line
and the additional "NOSPLIT" in the second line.
Updates #15756
Change-Id: I5c59bd8f3bdc7c780361f801d94a261f0aef3d13
Reviewed-on: https://go-review.googlesource.com/40495
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
This makes the core Flushplist loop clearer.
We may also want to move the Sym initialization
much earlier in the compiler (see discussion on
CL 40254), for which this paves the way.
While we're here, eliminate package log in favor of ctxt.Diag.
Passes toolstash-check -all.
Updates #15756
Change-Id: Ieaf848d196764a5aa82578b689af7bc6638c385a
Reviewed-on: https://go-review.googlesource.com/40313
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Hudson-Doyle <michael.hudson@canonical.com>