Commit graph

82 commits

Author SHA1 Message Date
Matthew Dempsky
f153b6739b cmd/compile: use typecheck.InitUniverse in unit tests
Rather than ad hoc setting up the universe, just initialize it
properly.

Change-Id: I18484b952321f55eb3e1e48fd383068a4ee75f66
Reviewed-on: https://go-review.googlesource.com/c/go/+/345475
Trust: Matthew Dempsky <mdempsky@google.com>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Robert Griesemer <gri@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
2021-08-27 00:36:19 +00:00
Cherry Mui
656f0888b7 [dev.typeparams] cmd/compile: make softfloat mode work with register ABI
Previously, softfloat mode does not work with register ABI, mainly
because the compiler doesn't know how to pass floating point
arguments and results. According to the ABI it should be passed in
FP registers, but there isn't any in softfloat mode.

This CL makes it work. When softfloat is used, we define the ABI
as having 0 floating point registers (because there aren't any).
The integer registers are unchanged. So floating point arguments
and results are passed in memory.

Another option is to pass (the bit representation of) floating
point values in integer registers. But this complicates things
because it'd need to reorder integer argument registers.

Change-Id: Ibecbeccb658c10a868fa7f2dcf75138f719cc809
Reviewed-on: https://go-review.googlesource.com/c/go/+/327274
Trust: Cherry Mui <cherryyz@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2021-08-03 16:14:24 +00:00
David Chase
b38b1b2f9a cmd/compile: manage Slot array better
steals idea from CL 312093

further investigation revealed additional duplicate
slots (equivalent, but not equal), so delete those too.

Rearranged Func.Names to be addresses of slots,
create canonical addresses so that split slots
(which use those addresses to refer to their parent,
and split slots can be further split)
will preserve "equivalent slots are equal".

Removes duplicates, improves metrics for "args at entry".

Change-Id: I5bbdcb50bd33655abcab3d27ad8cdce25499faaf
Reviewed-on: https://go-review.googlesource.com/c/go/+/312292
Trust: David Chase <drchase@google.com>
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2021-05-08 17:03:18 +00:00
Matthew Dempsky
f24e40c14a [dev.regabi] cmd/compile: remove Name.Class_ accessors
These aren't part of the Node interface anymore, so no need to keep
them around.

Passes toolstash -cmp.

[git-generate]
cd src/cmd/compile/internal/ir

: Fix one off case that causes trouble for rf.
sed -i -e 's/n.SetClass(ir.PAUTO)/n.Class_ = ir.PAUTO/' ../ssa/export_test.go

pkgs=$(go list . ../...)
rf '
	ex '"$(echo $pkgs)"' {
		var n *Name
		var c Class
		n.Class() -> n.Class_
		n.SetClass(c) -> n.Class_ = c
	}

	rm Name.Class
	rm Name.SetClass
	mv Name.Class_ Name.Class
'

Change-Id: Ifb304bf4691a8c455456aabd8aa77178d4a49500
Reviewed-on: https://go-review.googlesource.com/c/go/+/281294
Trust: Matthew Dempsky <mdempsky@google.com>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
2021-01-04 10:30:09 +00:00
Russ Cox
dac0de3748 [dev.regabi] cmd/compile: move type size calculations into package types [generated]
To break up package gc, we need to put these calculations somewhere
lower in the import graph, either an existing or new package. Package types
already needs this code and is using hacks to get it without an import cycle.
We can remove the hacks and set up for the new package gc by moving the
code into package types itself.

[git-generate]
cd src/cmd/compile/internal/gc
rf '
	# Remove old import cycle hacks in gc.
	rm TypecheckInit:/types.Widthptr =/-0,/types.Dowidth =/+0 \
		../ssa/export_test.go:/types.Dowidth =/-+
	ex {
		import "cmd/compile/internal/types"
		types.Widthptr -> Widthptr
		types.Dowidth -> dowidth
	}

	# Disable CalcSize in tests instead of base.Fatalf
	sub dowidth:/base.Fatalf\("dowidth without betypeinit"\)/ \
		// Assume this is a test. \
		return

	# Move size calculation into cmd/compile/internal/types
	mv Widthptr PtrSize
	mv Widthreg RegSize
	mv slicePtrOffset SlicePtrOffset
	mv sliceLenOffset SliceLenOffset
	mv sliceCapOffset SliceCapOffset
	mv sizeofSlice SliceSize
	mv sizeofString StringSize
	mv skipDowidthForTracing SkipSizeForTracing
	mv dowidth CalcSize
	mv checkwidth CheckSize
	mv widstruct calcStructOffset
	mv sizeCalculationDisabled CalcSizeDisabled
	mv defercheckwidth DeferCheckSize
	mv resumecheckwidth ResumeCheckSize
	mv typeptrdata PtrDataSize
	mv \
		PtrSize RegSize SlicePtrOffset SkipSizeForTracing typePos align.go PtrDataSize \
		size.go
	mv size.go cmd/compile/internal/types
'

: # Remove old import cycle hacks in types.
cd ../types
rf '
	ex {
		Widthptr -> PtrSize
		Dowidth -> CalcSize
	}
	rm Widthptr Dowidth
'

Change-Id: Ib96cdc6bda2617235480c29392ea5cfb20f60cd8
Reviewed-on: https://go-review.googlesource.com/c/go/+/279234
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2020-12-23 06:38:20 +00:00
Matthew Dempsky
dcec658f6c [dev.regabi] cmd/compile: change LocalSlot.N to *ir.Name
This was already documented as always being an ONAME, so it just
needed a few type assertion changes.

Passes buildall w/ toolstash -cmp.

Updates #42982.

Change-Id: I61f4b6ebd57c43b41977f4b37b81fe94fb11a723
Reviewed-on: https://go-review.googlesource.com/c/go/+/275757
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Russ Cox <rsc@golang.org>
Trust: Matthew Dempsky <mdempsky@google.com>
2020-12-08 01:46:40 +00:00
Russ Cox
6ea2b8c54c [dev.regabi] cmd/compile: clean up and document formatting
Some cleanup left over from moving the Type and Sym formatting to types.
And then document what the type formats are, now that it's clear.

Passes buildall w/ toolstash -cmp.

Change-Id: I35cb8978f1627db1056cb8ab343ce6ba6c99afad
Reviewed-on: https://go-review.googlesource.com/c/go/+/275780
Trust: Russ Cox <rsc@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2020-12-07 20:41:17 +00:00
Russ Cox
bb4a37bd93 [dev.regabi] cmd/compile: move Type, Sym printing to package types [generated]
Move the printing of types.Type and types.Sym out of ir
into package types, where it properly belongs. This wasn't
done originally (when the code was in gc) because the Type
and Sym printing was a bit tangled up with the Node printing.
But now they are untangled and can move into the correct
package.

This CL is automatically generated.
A followup CL will clean up a little bit more by hand.

Passes buildall w/ toolstash -cmp.

[git-generate]
cd src/cmd/compile/internal/ir
rf '
	mv FmtMode fmtMode
	mv FErr fmtGo
	mv FDbg fmtDebug
	mv FTypeId fmtTypeID
	mv FTypeIdName fmtTypeIDName
	mv methodSymName SymMethodName

	mv BuiltinPkg LocalPkg BlankSym OrigSym NumImport \
		fmtMode fmtGo symFormat sconv sconv2 symfmt SymMethodName \
		BasicTypeNames fmtBufferPool InstallTypeFormats typeFormat tconv tconv2 fldconv FmtConst \
		typefmt.go

	mv typefmt.go cmd/compile/internal/types
'
cd ../types
mv typefmt.go fmt.go

Change-Id: I6f3fd818323733ab8446f00594937c1628760b27
Reviewed-on: https://go-review.googlesource.com/c/go/+/275779
Trust: Russ Cox <rsc@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2020-12-07 20:41:11 +00:00
Russ Cox
fb17dfa43d [dev.regabi] cmd/compile: narrow interface between ir and types
Narrow the interface between package ir and package types
to make it easier to clean up the type formatting code all in one place.

Also introduce ir.BlankSym for use by OrigSym, so that later
OrigSym can move to package types without needing to reference
a variable of type ir.Node.

Passes buildall w/ toolstash -cmp.

Change-Id: I39fa419a1c8fb3318203e31cacc8d06399deeff9
Reviewed-on: https://go-review.googlesource.com/c/go/+/275776
Trust: Russ Cox <rsc@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2020-12-07 20:40:52 +00:00
Matthew Dempsky
5ffa275f3c [dev.regabi] cmd/compile: first pass at abstracting Type
Passes toolstash/buildall.

[git-generate]
cd src/cmd/compile/internal/ssa
rf '
ex . ../ir ../gc {
  import "cmd/compile/internal/types"
  var t *types.Type
  t.Etype -> t.Kind()
  t.Sym -> t.GetSym()
  t.Orig -> t.Underlying()
}
'

cd ../types
rf '
mv EType Kind
mv IRNode Object

mv Type.Etype Type.kind
mv Type.Sym Type.sym
mv Type.Orig Type.underlying
mv Type.Cache Type.cache

mv Type.GetSym Type.Sym

mv Bytetype ByteType
mv Runetype RuneType
mv Errortype ErrorType
'

cd ../gc
sed -i 's/Bytetype/ByteType/; s/Runetype/RuneType/' mkbuiltin.go

git codereview gofmt
go install cmd/compile/internal/...
go test cmd/compile -u || go test cmd/compile

Change-Id: Ibecb2d7100d3318a49238eb4a78d70acb49eedca
Reviewed-on: https://go-review.googlesource.com/c/go/+/274437
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Russ Cox <rsc@golang.org>
Trust: Matthew Dempsky <mdempsky@google.com>
2020-12-01 19:24:00 +00:00
Russ Cox
41f3af9d04 [dev.regabi] cmd/compile: replace *Node type with an interface Node [generated]
The plan is to introduce a Node interface that replaces the old *Node pointer-to-struct.

The previous CL defined an interface INode modeling a *Node.

This CL:
 - Changes all references outside internal/ir to use INode,
   along with many references inside internal/ir as well.
 - Renames Node to node.
 - Renames INode to Node

So now ir.Node is an interface implemented by *ir.node, which is otherwise inaccessible,
and the code outside package ir is now (clearly) using only the interface.

The usual rule is never to redefine an existing name with a new meaning,
so that old code that hasn't been updated gets a "unknown name" error
instead of more mysterious errors or silent misbehavior. That rule would
caution against replacing Node-the-struct with Node-the-interface,
as in this CL, because code that says *Node would now be using a pointer
to an interface. But this CL is being landed at the same time as another that
moves Node from gc to ir. So the net effect is to replace *gc.Node with ir.Node,
which does follow the rule: any lingering references to gc.Node will be told
it's gone, not silently start using pointers to interfaces. So the rule is followed
by the CL sequence, just not this specific CL.

Overall, the loss of inlining caused by using interfaces cuts the compiler speed
by about 6%, a not insignificant amount. However, as we convert the representation
to concrete structs that are not the giant Node over the next weeks, that speed
should come back as more of the compiler starts operating directly on concrete types
and the memory taken up by the graph of Nodes drops due to the more precise
structs. Honestly, I was expecting worse.

% benchstat bench.old bench.new
name                      old time/op       new time/op       delta
Template                        168ms ± 4%        182ms ± 2%   +8.34%  (p=0.000 n=9+9)
Unicode                        72.2ms ±10%       82.5ms ± 6%  +14.38%  (p=0.000 n=9+9)
GoTypes                         563ms ± 8%        598ms ± 2%   +6.14%  (p=0.006 n=9+9)
Compiler                        2.89s ± 4%        3.04s ± 2%   +5.37%  (p=0.000 n=10+9)
SSA                             6.45s ± 4%        7.25s ± 5%  +12.41%  (p=0.000 n=9+10)
Flate                           105ms ± 2%        115ms ± 1%   +9.66%  (p=0.000 n=10+8)
GoParser                        144ms ±10%        152ms ± 2%   +5.79%  (p=0.011 n=9+8)
Reflect                         345ms ± 9%        370ms ± 4%   +7.28%  (p=0.001 n=10+9)
Tar                             149ms ± 9%        161ms ± 5%   +8.05%  (p=0.001 n=10+9)
XML                             190ms ± 3%        209ms ± 2%   +9.54%  (p=0.000 n=9+8)
LinkCompiler                    327ms ± 2%        325ms ± 2%     ~     (p=0.382 n=8+8)
ExternalLinkCompiler            1.77s ± 4%        1.73s ± 6%     ~     (p=0.113 n=9+10)
LinkWithoutDebugCompiler        214ms ± 4%        211ms ± 2%     ~     (p=0.360 n=10+8)
StdCmd                          14.8s ± 3%        15.9s ± 1%   +6.98%  (p=0.000 n=10+9)
[Geo mean]                      480ms             510ms        +6.31%

name                      old user-time/op  new user-time/op  delta
Template                        223ms ± 3%        237ms ± 3%   +6.16%  (p=0.000 n=9+10)
Unicode                         103ms ± 6%        113ms ± 3%   +9.53%  (p=0.000 n=9+9)
GoTypes                         758ms ± 8%        800ms ± 2%   +5.55%  (p=0.003 n=10+9)
Compiler                        3.95s ± 2%        4.12s ± 2%   +4.34%  (p=0.000 n=10+9)
SSA                             9.43s ± 1%        9.74s ± 4%   +3.25%  (p=0.000 n=8+10)
Flate                           132ms ± 2%        141ms ± 2%   +6.89%  (p=0.000 n=9+9)
GoParser                        177ms ± 9%        183ms ± 4%     ~     (p=0.050 n=9+9)
Reflect                         467ms ±10%        495ms ± 7%   +6.17%  (p=0.029 n=10+10)
Tar                             183ms ± 9%        197ms ± 5%   +7.92%  (p=0.001 n=10+10)
XML                             249ms ± 5%        268ms ± 4%   +7.82%  (p=0.000 n=10+9)
LinkCompiler                    544ms ± 5%        544ms ± 6%     ~     (p=0.863 n=9+9)
ExternalLinkCompiler            1.79s ± 4%        1.75s ± 6%     ~     (p=0.075 n=10+10)
LinkWithoutDebugCompiler        248ms ± 6%        246ms ± 2%     ~     (p=0.965 n=10+8)
[Geo mean]                      483ms             504ms        +4.41%

[git-generate]
cd src/cmd/compile/internal/ir
: # We need to do the conversion in multiple steps, so we introduce
: # a temporary type alias that will start out meaning the pointer-to-struct
: # and then change to mean the interface.
rf '
	mv Node OldNode

	add node.go \
		type Node = *OldNode
'

: # It should work to do this ex in ir, but it misses test files, due to a bug in rf.
: # Run the command in gc to handle gc's tests, and then again in ssa for ssa's tests.
cd ../gc
rf '
        ex .  ../arm ../riscv64 ../arm64 ../mips64 ../ppc64 ../mips ../wasm {
                import "cmd/compile/internal/ir"
                *ir.OldNode -> ir.Node
        }
'
cd ../ssa
rf '
        ex {
                import "cmd/compile/internal/ir"
                *ir.OldNode -> ir.Node
        }
'

: # Back in ir, finish conversion clumsily with sed,
: # because type checking and circular aliases do not mix.
cd ../ir
sed -i '' '
	/type Node = \*OldNode/d
	s/\*OldNode/Node/g
	s/^func (n Node)/func (n *OldNode)/
	s/OldNode/node/g
	s/type INode interface/type Node interface/
	s/var _ INode = (Node)(nil)/var _ Node = (*node)(nil)/
' *.go
gofmt -w *.go

sed -i '' '
	s/{Func{}, 136, 248}/{Func{}, 152, 280}/
	s/{Name{}, 32, 56}/{Name{}, 44, 80}/
	s/{Param{}, 24, 48}/{Param{}, 44, 88}/
	s/{node{}, 76, 128}/{node{}, 88, 152}/
' sizeof_test.go

cd ../ssa
sed -i '' '
	s/{LocalSlot{}, 28, 40}/{LocalSlot{}, 32, 48}/
' sizeof_test.go

cd ../gc
sed -i '' 's/\*ir.Node/ir.Node/' mkbuiltin.go

cd ../../../..
go install std cmd
cd cmd/compile
go test -u || go test -u

Change-Id: I196bbe3b648e4701662e4a2bada40bf155e2a553
Reviewed-on: https://go-review.googlesource.com/c/go/+/272935
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2020-11-25 17:30:43 +00:00
Russ Cox
048debb224 [dev.regabi] cmd/compile: remove gc ↔ ssa cycle hacks
The cycle hacks existed because gc needed to import ssa
which need to know about gc.Node. But now that's ir.Node,
and there's no cycle anymore.

Don't know how much it matters but LocalSlot is now
one word shorter than before, because it holds a pointer
instead of an interface for the *Node. That won't last long.

Now that they're not necessary for interface satisfaction,
IsSynthetic and IsAutoTmp can move to top-level ir functions.

Change-Id: Ie511e93466cfa2b17d9a91afc4bd8d53fdb80453
Reviewed-on: https://go-review.googlesource.com/c/go/+/272931
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2020-11-25 17:30:36 +00:00
Russ Cox
9e0e43d84d [dev.regabi] cmd/compile: remove uses of dummy
Per https://developers.google.com/style/inclusive-documentation,
since we are editing some of this code anyway and it is easier
to put the cleanup in a separate CL.

Change-Id: Ib6b851f43f9cc0a57676564477d4ff22abb1cee5
Reviewed-on: https://go-review.googlesource.com/c/go/+/273106
Trust: Russ Cox <rsc@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2020-11-25 04:35:29 +00:00
David Chase
c3fe874f25 cmd/compile: avoid generating CSEs; do all aggregates; maintain debug names
This adds a pass to detect common selection operations,
to avoid generating duplicates.  Duplicate offsets are
also detected.

All aggregate types are now handled; there is some freedom in where
expand_calls is run, though it must run before softfloat.

Debug-name-maintenance is now incremental both in decompose builtin
and in expand_calls; it might be good to push this into all the
decompose passes.

(this is a smash of 5 CLs that rewrote some of the same code several
times to deal with phase-ordering problems, and included an abandoned
attempt.)

For #40724.

Change-Id: I2a0c32f20660bf8b99e2bcecd33545d97d2bd3c6
Reviewed-on: https://go-review.googlesource.com/c/go/+/249458
Trust: David Chase <drchase@google.com>
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-10-23 18:02:16 +00:00
David Chase
4220d67084 cmd/compile: make GOSSAHASH package-sensitive, also append to log files
Turns out if your failure is in a function with a name like "Reset()"
there will be a lot of hits on the same hashcode.  Adding package sensitivity
solves this problem.

In additionm, it turned out that in the case that a logfile was specified
for the GOSSAHASH logging, that it was opened in create mode, which meant
that multiple compiler invocations would reset the file to zero length.
Opening in append mode works better; the automated harness
(github.com/dr2chase/gossahash) takes care of truncating the file before use.

Change-Id: I5601bc280faa94cbd507d302448831849db6c842
Reviewed-on: https://go-review.googlesource.com/c/go/+/246937
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2020-08-24 02:38:18 +00:00
Keith Randall
28157b3292 cmd/compile: start implementing strongly typed aux and auxint fields
Right now the Aux and AuxInt fields of ssa.Values are typed as
interface{} and int64, respectively. Each rule that uses these values
must cast them to the type they actually are (*obj.LSym, or int32, or
ValAndOff, etc.), use them, and then cast them back to interface{} or
int64.

We know for each opcode what the types of the Aux and AuxInt fields
should be. So let's modify the rule generator to declare the types to
be what we know they should be, autoconverting to and from the generic
types for us. That way we can make the rules more type safe.

It's difficult to make a single CL for this, so I've coopted the "=>"
token to indicate a rule that is strongly typed. "->" rules are
processed as before. That will let us migrate a few rules at a time in
separate CLs.  Hopefully we can reach a state where all rules are
strongly typed and we can drop the distinction.

This CL changes just a few rules to get a feel for what this
transition would look like.

I've decided not to put explicit types in the rules. I think it
makes the rules somewhat clearer, but definitely more verbose.
In particular, the passthrough rules that don't modify the fields
in question are verbose for no real reason.

Change-Id: I63a1b789ac5702e7caf7934cd49f784235d1d73d
Reviewed-on: https://go-review.googlesource.com/c/go/+/190197
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2020-04-09 21:18:55 +00:00
Keith Randall
5d8a61a43e cmd/compile: print recursive types correctly
Change the type printer to take a map of types that we're currently
printing. When we happen upon a type that we're already in the middle
of printing, print a reference to it instead.

A reference to another type is built using the offset of the first
byte of that type's string representation in the result. To facilitate
that computation (and it's probably more efficient, regardless), we
print the type to a buffer as we go, and build the string at the end.

It would be nice to use string.Builder instead of bytes.Buffer, but
string.Builder wasn't around in Go 1.4, and we'd like to bootstrap
from that version.

Fixes #29312

Change-Id: I49d788c1fa20f770df7b2bae3b9979d990d54803
Reviewed-on: https://go-review.googlesource.com/c/go/+/214239
Reviewed-by: Robert Griesemer <gri@golang.org>
2020-01-13 18:52:18 +00:00
Josh Bleecher Snyder
2578ac54eb cmd/compile: move argument stack construction to SSA generation
The goal of this change is to move work from walk to SSA,
and simplify things along the way.

This is hard to accomplish cleanly with small incremental changes,
so this large commit message aims to provide a roadmap to the diff.

High level description:

Prior to this change, walk was responsible for constructing (most of) the stack for function calls.
ascompatte gathered variadic arguments into a slice.
It also rewrote n.List from a list of arguments to a list of assignments to stack slots.
ascompatte was called multiple times to handle the receiver in a method call.
reorder1 then introduced temporaries into n.List as needed to avoid smashing the stack.
adjustargs then made extra stack space for go/defer args as needed.

Node to SSA construction evaluated all the statements in n.List,
and issued the function call, assuming that the stack was correctly constructed.
Intrinsic calls had to dig around inside n.List to extract the arguments,
since intrinsics don't use the stack to make function calls.

This change moves stack construction to the SSA construction phase.
ascompatte, now called walkParams, does all the work that ascompatte and reorder1 did.
It handles variadic arguments, inserts the method receiver if needed, and allocates temporaries.
It does not, however, make any assignments to stack slots.
Instead, it moves the function arguments to n.Rlist, leaving assignments to temporaries in n.List.
(It would be better to use Ninit instead of List; future work.)
During SSA construction, after doing all the temporary assignments in n.List,
the function arguments are assigned to stack slots by
constructing the appropriate SSA Value, using (*state).storeArg.
SSA construction also now handles adjustments for go/defer args.
This change also simplifies intrinsic calls, since we no longer need to undo walk's work.

Along the way, we simplify nodarg by pushing the fp==1 case to its callers, where it fits nicely.

Generated code differences:

There were a few optimizations applied along the way, the old way.
f(g()) was rewritten to do a block copy of function results to function arguments.
And reorder1 avoided introducing the final "save the stack" temporary in n.List.

The f(g()) block copy optimization never actually triggered; the order pass rewrote away g(), so that has been removed.

SSA optimizations mostly obviated the need for reorder1's optimization of avoiding the final temporary.
The exception was when the temporary's type was not SSA-able;
in that case, we got a Move into an autotmp and then an immediate Move onto the stack,
with the autotmp never read or used again.
This change introduces a new rewrite rule to detect such pointless double Moves
and collapse them into a single Move.
This is actually more powerful than the original optimization,
since the original optimization relied on the imprecise Node.HasCall calculation.

The other significant difference in the generated code is that the stack is now constructed
completely in SP-offset order. Prior to this change, the stack was constructed somewhat
haphazardly: first the final argument that Node.HasCall deemed to require a temporary,
then other arguments, then the method receiver, then the defer/go args.
SP-offset is probably a good default order. See future work.

There are a few minor object file size changes as a result of this change.
I investigated some regressions in early versions of this change.

One regression (in archive/tar) was the addition of a single CMPQ instruction,
which would be eliminated were this TODO from flagalloc to be done:
	// TODO: Remove original instructions if they are never used.

One regression (in text/template) was an ADDQconstmodify that is now
a regular MOVQLoad+ADDQconst+MOVQStore, due to an unlucky change
in the order in which arguments are written. The argument change
order can also now be luckier, so this appears to be a wash.

All in all, though there will be minor winners and losers,
this change appears to be performance neutral.

Future work:

Move loading the result of function calls to SSA construction; eliminate OINDREGSP.

Consider pushing stack construction deeper into SSA world, perhaps in an arch-specific pass.
Among other benefits, this would make it easier to transition to a new calling convention.
This would require rethinking the handling of stack conflicts and is non-trivial.

Figure out some clean way to indicate that stack construction Stores/Moves
do not alias each other, so that subsequent passes may do things like
CSE+tighten shared stack setup, do DSE using non-first Stores, etc.
This would allow us to eliminate the minor text/template regression.

Possibly make assignments to stack slots not treated as statements by DWARF.

Compiler benchmarks:

name        old time/op       new time/op       delta
Template          182ms ± 2%        179ms ± 2%  -1.69%  (p=0.000 n=47+48)
Unicode          86.3ms ± 5%       85.1ms ± 4%  -1.36%  (p=0.001 n=50+50)
GoTypes           646ms ± 1%        642ms ± 1%  -0.63%  (p=0.000 n=49+48)
Compiler          2.89s ± 1%        2.86s ± 2%  -1.36%  (p=0.000 n=48+50)
SSA               8.47s ± 1%        8.37s ± 2%  -1.22%  (p=0.000 n=47+50)
Flate             122ms ± 2%        121ms ± 2%  -0.66%  (p=0.000 n=47+45)
GoParser          147ms ± 2%        146ms ± 2%  -0.53%  (p=0.006 n=46+49)
Reflect           406ms ± 2%        403ms ± 2%  -0.76%  (p=0.000 n=48+43)
Tar               162ms ± 3%        162ms ± 4%    ~     (p=0.191 n=46+50)
XML               223ms ± 2%        222ms ± 2%  -0.37%  (p=0.031 n=45+49)
[Geo mean]        382ms             378ms       -0.89%

name        old user-time/op  new user-time/op  delta
Template          219ms ± 3%        216ms ± 3%  -1.56%  (p=0.000 n=50+48)
Unicode           109ms ± 6%        109ms ± 5%    ~     (p=0.190 n=50+49)
GoTypes           836ms ± 2%        828ms ± 2%  -0.96%  (p=0.000 n=49+48)
Compiler          3.87s ± 2%        3.80s ± 1%  -1.81%  (p=0.000 n=49+46)
SSA               12.0s ± 1%        11.8s ± 1%  -2.01%  (p=0.000 n=48+50)
Flate             142ms ± 3%        141ms ± 3%  -0.85%  (p=0.003 n=50+48)
GoParser          178ms ± 4%        175ms ± 4%  -1.66%  (p=0.000 n=48+46)
Reflect           520ms ± 2%        512ms ± 2%  -1.44%  (p=0.000 n=45+48)
Tar               200ms ± 3%        198ms ± 4%  -0.61%  (p=0.037 n=47+50)
XML               277ms ± 3%        275ms ± 3%  -0.85%  (p=0.000 n=49+48)
[Geo mean]        482ms             476ms       -1.23%

name        old alloc/op      new alloc/op      delta
Template         36.1MB ± 0%       35.3MB ± 0%  -2.18%  (p=0.008 n=5+5)
Unicode          29.8MB ± 0%       29.3MB ± 0%  -1.58%  (p=0.008 n=5+5)
GoTypes           125MB ± 0%        123MB ± 0%  -2.13%  (p=0.008 n=5+5)
Compiler          531MB ± 0%        513MB ± 0%  -3.40%  (p=0.008 n=5+5)
SSA              2.00GB ± 0%       1.93GB ± 0%  -3.34%  (p=0.008 n=5+5)
Flate            24.5MB ± 0%       24.3MB ± 0%  -1.18%  (p=0.008 n=5+5)
GoParser         29.4MB ± 0%       28.7MB ± 0%  -2.34%  (p=0.008 n=5+5)
Reflect          87.1MB ± 0%       86.0MB ± 0%  -1.33%  (p=0.008 n=5+5)
Tar              35.3MB ± 0%       34.8MB ± 0%  -1.44%  (p=0.008 n=5+5)
XML              47.9MB ± 0%       47.1MB ± 0%  -1.86%  (p=0.008 n=5+5)
[Geo mean]       82.8MB            81.1MB       -2.08%

name        old allocs/op     new allocs/op     delta
Template           352k ± 0%         347k ± 0%  -1.32%  (p=0.008 n=5+5)
Unicode            342k ± 0%         339k ± 0%  -0.66%  (p=0.008 n=5+5)
GoTypes           1.29M ± 0%        1.27M ± 0%  -1.30%  (p=0.008 n=5+5)
Compiler          4.98M ± 0%        4.87M ± 0%  -2.14%  (p=0.008 n=5+5)
SSA               15.7M ± 0%        15.2M ± 0%  -2.86%  (p=0.008 n=5+5)
Flate              233k ± 0%         231k ± 0%  -0.83%  (p=0.008 n=5+5)
GoParser           296k ± 0%         291k ± 0%  -1.54%  (p=0.016 n=5+4)
Reflect           1.05M ± 0%        1.04M ± 0%  -0.65%  (p=0.008 n=5+5)
Tar                343k ± 0%         339k ± 0%  -0.97%  (p=0.008 n=5+5)
XML                432k ± 0%         426k ± 0%  -1.19%  (p=0.008 n=5+5)
[Geo mean]         815k              804k       -1.35%

name        old object-bytes  new object-bytes  delta
Template          505kB ± 0%        505kB ± 0%  -0.01%  (p=0.008 n=5+5)
Unicode           224kB ± 0%        224kB ± 0%    ~     (all equal)
GoTypes          1.82MB ± 0%       1.83MB ± 0%  +0.06%  (p=0.008 n=5+5)
Flate             324kB ± 0%        324kB ± 0%  +0.00%  (p=0.008 n=5+5)
GoParser          402kB ± 0%        402kB ± 0%  +0.04%  (p=0.008 n=5+5)
Reflect          1.39MB ± 0%       1.39MB ± 0%  -0.01%  (p=0.008 n=5+5)
Tar               449kB ± 0%        449kB ± 0%  -0.02%  (p=0.008 n=5+5)
XML               598kB ± 0%        597kB ± 0%  -0.05%  (p=0.008 n=5+5)

Change-Id: Ifc9d5c1bd01f90171414b8fb18ffe2290d271143
Reviewed-on: https://go-review.googlesource.com/c/114797
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2018-10-19 21:23:16 +00:00
Matthew Dempsky
62e5215a2a cmd/compile: merge TPTR32 and TPTR64 as TPTR
Change-Id: I0490098a7235458c5aede1135426a9f19f8584a7
Reviewed-on: https://go-review.googlesource.com/c/76312
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2018-10-04 04:08:08 +00:00
David Chase
31e1c30f55 cmd/compile: do not allow regalloc to LoadReg G register
On architectures where G is stored in a register, it is
possible for a variable to allocated to it, and subsequently
that variable may be spilled and reloaded, for example
because of an intervening call.  If such an allocation
reaches a join point and it is the primary predecessor,
it becomes the target of a reload, which is only usually
right.

Fix: guard all the LoadReg ops, and spill value in the G
register (if any) before merges (in the same way that 387
FP registers are freed between blocks).

Includes test.

Fixes #25504.

Change-Id: I0482a53e20970c7315bf09c0e407ae5bba2fe05d
Reviewed-on: https://go-review.googlesource.com/114695
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-05-30 16:39:21 +00:00
Matthew Dempsky
e10ee798c4 cmd/compile/internal/types: remove ElemType wrapper
This was an artifact from when we had a separate ssa.Type interface to
break circular dependency between packages ssa and gc. It's no longer
needed now that package ssa directly uses package types.

Change-Id: I6a93e5d79082815f7f0eb89507381969cc6cb403
Reviewed-on: https://go-review.googlesource.com/109137
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2018-04-24 22:24:47 +00:00
Giovanni Bajo
a35ec9a59e cmd/compile: implement CMOV on amd64
This builds upon the branchelim pass, activating it for amd64 and
lowering CondSelect. Special care is made to FPU instructions for
NaN handling.

Benchmark results on Xeon E5630 (Westmere EP):

name                      old time/op    new time/op    delta
BinaryTree17-16              4.99s ± 9%     4.66s ± 2%     ~     (p=0.095 n=5+5)
Fannkuch11-16                4.93s ± 3%     5.04s ± 2%     ~     (p=0.548 n=5+5)
FmtFprintfEmpty-16          58.8ns ± 7%    61.4ns ±14%     ~     (p=0.579 n=5+5)
FmtFprintfString-16          114ns ± 2%     114ns ± 4%     ~     (p=0.603 n=5+5)
FmtFprintfInt-16             181ns ± 4%     125ns ± 3%  -30.90%  (p=0.008 n=5+5)
FmtFprintfIntInt-16          263ns ± 2%     217ns ± 2%  -17.34%  (p=0.008 n=5+5)
FmtFprintfPrefixedInt-16     230ns ± 1%     212ns ± 1%   -7.99%  (p=0.008 n=5+5)
FmtFprintfFloat-16           411ns ± 3%     344ns ± 5%  -16.43%  (p=0.008 n=5+5)
FmtManyArgs-16               828ns ± 4%     790ns ± 2%   -4.59%  (p=0.032 n=5+5)
GobDecode-16                10.9ms ± 4%    10.8ms ± 5%     ~     (p=0.548 n=5+5)
GobEncode-16                9.52ms ± 5%    9.46ms ± 2%     ~     (p=1.000 n=5+5)
Gzip-16                      334ms ± 2%     337ms ± 2%     ~     (p=0.548 n=5+5)
Gunzip-16                   64.4ms ± 1%    65.0ms ± 1%   +1.00%  (p=0.008 n=5+5)
HTTPClientServer-16          156µs ± 3%     155µs ± 3%     ~     (p=0.690 n=5+5)
JSONEncode-16               21.0ms ± 1%    21.8ms ± 0%   +3.76%  (p=0.016 n=5+4)
JSONDecode-16               95.1ms ± 0%    95.7ms ± 1%     ~     (p=0.151 n=5+5)
Mandelbrot200-16            6.38ms ± 1%    6.42ms ± 1%     ~     (p=0.095 n=5+5)
GoParse-16                  5.47ms ± 2%    5.36ms ± 1%   -1.95%  (p=0.016 n=5+5)
RegexpMatchEasy0_32-16       111ns ± 1%     111ns ± 1%     ~     (p=0.635 n=5+4)
RegexpMatchEasy0_1K-16       408ns ± 1%     411ns ± 2%     ~     (p=0.087 n=5+5)
RegexpMatchEasy1_32-16       103ns ± 1%     104ns ± 1%     ~     (p=0.484 n=5+5)
RegexpMatchEasy1_1K-16       659ns ± 2%     652ns ± 1%     ~     (p=0.571 n=5+5)
RegexpMatchMedium_32-16      176ns ± 2%     174ns ± 1%     ~     (p=0.476 n=5+5)
RegexpMatchMedium_1K-16     58.6µs ± 4%    57.7µs ± 4%     ~     (p=0.548 n=5+5)
RegexpMatchHard_32-16       3.07µs ± 3%    3.04µs ± 4%     ~     (p=0.421 n=5+5)
RegexpMatchHard_1K-16       89.2µs ± 1%    87.9µs ± 2%   -1.52%  (p=0.032 n=5+5)
Revcomp-16                   575ms ± 0%     587ms ± 2%   +2.12%  (p=0.032 n=4+5)
Template-16                  110ms ± 1%     107ms ± 3%   -3.00%  (p=0.032 n=5+5)
TimeParse-16                 463ns ± 0%     462ns ± 0%     ~     (p=0.810 n=5+4)
TimeFormat-16                538ns ± 0%     535ns ± 0%   -0.63%  (p=0.024 n=5+5)

name                      old speed      new speed      delta
GobDecode-16              70.7MB/s ± 4%  71.4MB/s ± 5%     ~     (p=0.452 n=5+5)
GobEncode-16              80.7MB/s ± 5%  81.2MB/s ± 2%     ~     (p=1.000 n=5+5)
Gzip-16                   58.2MB/s ± 2%  57.7MB/s ± 2%     ~     (p=0.452 n=5+5)
Gunzip-16                  302MB/s ± 1%   299MB/s ± 1%   -0.99%  (p=0.008 n=5+5)
JSONEncode-16             92.4MB/s ± 1%  89.1MB/s ± 0%   -3.63%  (p=0.016 n=5+4)
JSONDecode-16             20.4MB/s ± 0%  20.3MB/s ± 1%     ~     (p=0.135 n=5+5)
GoParse-16                10.6MB/s ± 2%  10.8MB/s ± 1%   +2.00%  (p=0.016 n=5+5)
RegexpMatchEasy0_32-16     286MB/s ± 1%   285MB/s ± 3%     ~     (p=1.000 n=5+5)
RegexpMatchEasy0_1K-16    2.51GB/s ± 1%  2.49GB/s ± 2%     ~     (p=0.095 n=5+5)
RegexpMatchEasy1_32-16     309MB/s ± 1%   307MB/s ± 1%     ~     (p=0.548 n=5+5)
RegexpMatchEasy1_1K-16    1.55GB/s ± 2%  1.57GB/s ± 1%     ~     (p=0.690 n=5+5)
RegexpMatchMedium_32-16   5.68MB/s ± 2%  5.73MB/s ± 1%     ~     (p=0.579 n=5+5)
RegexpMatchMedium_1K-16   17.5MB/s ± 4%  17.8MB/s ± 4%     ~     (p=0.500 n=5+5)
RegexpMatchHard_32-16     10.4MB/s ± 3%  10.5MB/s ± 4%     ~     (p=0.460 n=5+5)
RegexpMatchHard_1K-16     11.5MB/s ± 1%  11.7MB/s ± 2%   +1.57%  (p=0.032 n=5+5)
Revcomp-16                 442MB/s ± 0%   433MB/s ± 2%   -2.05%  (p=0.032 n=4+5)
Template-16               17.7MB/s ± 1%  18.2MB/s ± 3%   +3.12%  (p=0.032 n=5+5)

Change-Id: I6972e8f35f2b31f9a42ac473a6bf419a18022558
Reviewed-on: https://go-review.googlesource.com/100935
Run-TryBot: Giovanni Bajo <rasky@develer.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-03-15 16:41:59 +00:00
Ilya Tocar
644d14ea0f Revert "cmd/compile: implement CMOV on amd64"
This reverts commit 080187f4f7.

It broke build of golang.org/x/exp/shiny/iconvg
See issue 24395 for details

Change-Id: Ifd6134f6214e6cee40bd3c63c32941d5fc96ae8b
Reviewed-on: https://go-review.googlesource.com/100755
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-03-14 21:21:23 +00:00
Giovanni Bajo
080187f4f7 cmd/compile: implement CMOV on amd64
This builds upon the branchelim pass, activating it for amd64 and
lowering CondSelect. Special care is made to FPU instructions for
NaN handling.

Benchmark results on Xeon E5630 (Westmere EP):

name                      old time/op    new time/op    delta
BinaryTree17-16              4.99s ± 9%     4.66s ± 2%     ~     (p=0.095 n=5+5)
Fannkuch11-16                4.93s ± 3%     5.04s ± 2%     ~     (p=0.548 n=5+5)
FmtFprintfEmpty-16          58.8ns ± 7%    61.4ns ±14%     ~     (p=0.579 n=5+5)
FmtFprintfString-16          114ns ± 2%     114ns ± 4%     ~     (p=0.603 n=5+5)
FmtFprintfInt-16             181ns ± 4%     125ns ± 3%  -30.90%  (p=0.008 n=5+5)
FmtFprintfIntInt-16          263ns ± 2%     217ns ± 2%  -17.34%  (p=0.008 n=5+5)
FmtFprintfPrefixedInt-16     230ns ± 1%     212ns ± 1%   -7.99%  (p=0.008 n=5+5)
FmtFprintfFloat-16           411ns ± 3%     344ns ± 5%  -16.43%  (p=0.008 n=5+5)
FmtManyArgs-16               828ns ± 4%     790ns ± 2%   -4.59%  (p=0.032 n=5+5)
GobDecode-16                10.9ms ± 4%    10.8ms ± 5%     ~     (p=0.548 n=5+5)
GobEncode-16                9.52ms ± 5%    9.46ms ± 2%     ~     (p=1.000 n=5+5)
Gzip-16                      334ms ± 2%     337ms ± 2%     ~     (p=0.548 n=5+5)
Gunzip-16                   64.4ms ± 1%    65.0ms ± 1%   +1.00%  (p=0.008 n=5+5)
HTTPClientServer-16          156µs ± 3%     155µs ± 3%     ~     (p=0.690 n=5+5)
JSONEncode-16               21.0ms ± 1%    21.8ms ± 0%   +3.76%  (p=0.016 n=5+4)
JSONDecode-16               95.1ms ± 0%    95.7ms ± 1%     ~     (p=0.151 n=5+5)
Mandelbrot200-16            6.38ms ± 1%    6.42ms ± 1%     ~     (p=0.095 n=5+5)
GoParse-16                  5.47ms ± 2%    5.36ms ± 1%   -1.95%  (p=0.016 n=5+5)
RegexpMatchEasy0_32-16       111ns ± 1%     111ns ± 1%     ~     (p=0.635 n=5+4)
RegexpMatchEasy0_1K-16       408ns ± 1%     411ns ± 2%     ~     (p=0.087 n=5+5)
RegexpMatchEasy1_32-16       103ns ± 1%     104ns ± 1%     ~     (p=0.484 n=5+5)
RegexpMatchEasy1_1K-16       659ns ± 2%     652ns ± 1%     ~     (p=0.571 n=5+5)
RegexpMatchMedium_32-16      176ns ± 2%     174ns ± 1%     ~     (p=0.476 n=5+5)
RegexpMatchMedium_1K-16     58.6µs ± 4%    57.7µs ± 4%     ~     (p=0.548 n=5+5)
RegexpMatchHard_32-16       3.07µs ± 3%    3.04µs ± 4%     ~     (p=0.421 n=5+5)
RegexpMatchHard_1K-16       89.2µs ± 1%    87.9µs ± 2%   -1.52%  (p=0.032 n=5+5)
Revcomp-16                   575ms ± 0%     587ms ± 2%   +2.12%  (p=0.032 n=4+5)
Template-16                  110ms ± 1%     107ms ± 3%   -3.00%  (p=0.032 n=5+5)
TimeParse-16                 463ns ± 0%     462ns ± 0%     ~     (p=0.810 n=5+4)
TimeFormat-16                538ns ± 0%     535ns ± 0%   -0.63%  (p=0.024 n=5+5)

name                      old speed      new speed      delta
GobDecode-16              70.7MB/s ± 4%  71.4MB/s ± 5%     ~     (p=0.452 n=5+5)
GobEncode-16              80.7MB/s ± 5%  81.2MB/s ± 2%     ~     (p=1.000 n=5+5)
Gzip-16                   58.2MB/s ± 2%  57.7MB/s ± 2%     ~     (p=0.452 n=5+5)
Gunzip-16                  302MB/s ± 1%   299MB/s ± 1%   -0.99%  (p=0.008 n=5+5)
JSONEncode-16             92.4MB/s ± 1%  89.1MB/s ± 0%   -3.63%  (p=0.016 n=5+4)
JSONDecode-16             20.4MB/s ± 0%  20.3MB/s ± 1%     ~     (p=0.135 n=5+5)
GoParse-16                10.6MB/s ± 2%  10.8MB/s ± 1%   +2.00%  (p=0.016 n=5+5)
RegexpMatchEasy0_32-16     286MB/s ± 1%   285MB/s ± 3%     ~     (p=1.000 n=5+5)
RegexpMatchEasy0_1K-16    2.51GB/s ± 1%  2.49GB/s ± 2%     ~     (p=0.095 n=5+5)
RegexpMatchEasy1_32-16     309MB/s ± 1%   307MB/s ± 1%     ~     (p=0.548 n=5+5)
RegexpMatchEasy1_1K-16    1.55GB/s ± 2%  1.57GB/s ± 1%     ~     (p=0.690 n=5+5)
RegexpMatchMedium_32-16   5.68MB/s ± 2%  5.73MB/s ± 1%     ~     (p=0.579 n=5+5)
RegexpMatchMedium_1K-16   17.5MB/s ± 4%  17.8MB/s ± 4%     ~     (p=0.500 n=5+5)
RegexpMatchHard_32-16     10.4MB/s ± 3%  10.5MB/s ± 4%     ~     (p=0.460 n=5+5)
RegexpMatchHard_1K-16     11.5MB/s ± 1%  11.7MB/s ± 2%   +1.57%  (p=0.032 n=5+5)
Revcomp-16                 442MB/s ± 0%   433MB/s ± 2%   -2.05%  (p=0.032 n=4+5)
Template-16               17.7MB/s ± 1%  18.2MB/s ± 3%   +3.12%  (p=0.032 n=5+5)

Change-Id: Ic7cb7374d07da031e771bdcbfdd832fd1b17159c
Reviewed-on: https://go-review.googlesource.com/98695
Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>
2018-03-12 18:01:33 +00:00
ChrisALiles
4f5389c321 cmd/compile: move the SSA local type definitions to a single location
Fixes #20304

Change-Id: I52ee02d1602ed7fffc96b27fd60990203c771aaf
Reviewed-on: https://go-review.googlesource.com/94256
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2018-02-27 19:40:36 +00:00
Heschi Kreinick
e181852dd4 cmd/compile/internal: use sparseSet, optimize isSynthetic
changedVars was functionally a set, but couldn't be iterated over
efficiently. In functions with many variables, the wasted iteration was
costly. Use a sparseSet instead.

(*gc.Node).String() is very expensive: it calls Sprintf, which does
reflection, etc, etc. Instead, just look at .Sym.Name, which is all we
care about.

Change-Id: Ib61cd7b5c796e1813b8859135e85da5bfe2ac686
Reviewed-on: https://go-review.googlesource.com/92402
Reviewed-by: David Chase <drchase@google.com>
2018-02-21 18:01:22 +00:00
Austin Clements
2010189407 runtime: remove legacy eager write barrier
Now that the buffered write barrier is implemented for all
architectures, we can remove the old eager write barrier
implementation. This CL removes the implementation from the runtime,
support in the compiler for calling it, and updates some compiler
tests that relied on the old eager barrier support. It also makes sure
that all of the useful comments from the old write barrier
implementation still have a place to live.

Fixes #22460.

Updates #21640 since this fixes the layering concerns of the write
barrier (but not the other things in that issue).

Change-Id: I580f93c152e89607e0a72fe43370237ba97bae74
Reviewed-on: https://go-review.googlesource.com/92705
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Rick Hudson <rlh@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-02-13 16:34:46 +00:00
Austin Clements
7e343134d3 cmd/compile: compiler support for buffered write barrier
This CL implements the compiler support for calling the buffered write
barrier added by the previous CL.

Since the buffered write barrier is only implemented on amd64 right
now, this still supports the old, eager write barrier as well. There's
little overhead to supporting both and this way a few tests in
test/fixedbugs that expect to have liveness maps at write barrier
calls can easily opt-in to the old, eager barrier.

This significantly improves the performance of the write barrier:

name             old time/op  new time/op  delta
WriteBarrier-12  73.5ns ±20%  19.2ns ±27%  -73.90%  (p=0.000 n=19+18)

It also reduces the size of binaries because the write barrier call is
more compact:

name        old object-bytes  new object-bytes  delta
Template           398k ± 0%         393k ± 0%  -1.14%  (p=0.008 n=5+5)
Unicode            208k ± 0%         206k ± 0%  -1.00%  (p=0.008 n=5+5)
GoTypes           1.18M ± 0%        1.15M ± 0%  -2.00%  (p=0.008 n=5+5)
Compiler          4.05M ± 0%        3.88M ± 0%  -4.26%  (p=0.008 n=5+5)
SSA               8.25M ± 0%        8.11M ± 0%  -1.59%  (p=0.008 n=5+5)
Flate              228k ± 0%         224k ± 0%  -1.83%  (p=0.008 n=5+5)
GoParser           295k ± 0%         284k ± 0%  -3.62%  (p=0.008 n=5+5)
Reflect           1.00M ± 0%        0.99M ± 0%  -0.70%  (p=0.008 n=5+5)
Tar                339k ± 0%         333k ± 0%  -1.67%  (p=0.008 n=5+5)
XML                404k ± 0%         395k ± 0%  -2.10%  (p=0.008 n=5+5)
[Geo mean]         704k              690k       -2.00%

name        old exe-bytes     new exe-bytes     delta
HelloSize         1.05M ± 0%        1.04M ± 0%  -1.55%  (p=0.008 n=5+5)

https://perf.golang.org/search?q=upload:20171027.1

(Amusingly, this also reduces compiler allocations by 0.75%, which,
combined with the better write barrier, speeds up the compiler overall
by 2.10%. See the perf link.)

It slightly improves the performance of most of the go1 benchmarks and
improves the performance of the x/benchmarks:

name                      old time/op    new time/op    delta
BinaryTree17-12              2.40s ± 1%     2.47s ± 1%  +2.69%  (p=0.000 n=19+19)
Fannkuch11-12                2.95s ± 0%     2.95s ± 0%  +0.21%  (p=0.000 n=20+19)
FmtFprintfEmpty-12          41.8ns ± 4%    41.4ns ± 2%  -1.03%  (p=0.014 n=20+20)
FmtFprintfString-12         68.7ns ± 2%    67.5ns ± 1%  -1.75%  (p=0.000 n=20+17)
FmtFprintfInt-12            79.0ns ± 3%    77.1ns ± 1%  -2.40%  (p=0.000 n=19+17)
FmtFprintfIntInt-12          127ns ± 1%     123ns ± 3%  -3.42%  (p=0.000 n=20+20)
FmtFprintfPrefixedInt-12     152ns ± 1%     150ns ± 1%  -1.02%  (p=0.000 n=18+17)
FmtFprintfFloat-12           211ns ± 1%     209ns ± 0%  -0.99%  (p=0.000 n=20+16)
FmtManyArgs-12               500ns ± 0%     496ns ± 0%  -0.73%  (p=0.000 n=17+20)
GobDecode-12                6.44ms ± 1%    6.53ms ± 0%  +1.28%  (p=0.000 n=20+19)
GobEncode-12                5.46ms ± 0%    5.46ms ± 1%    ~     (p=0.550 n=19+20)
Gzip-12                      220ms ± 1%     216ms ± 0%  -1.75%  (p=0.000 n=19+19)
Gunzip-12                   38.8ms ± 0%    38.6ms ± 0%  -0.30%  (p=0.000 n=18+19)
HTTPClientServer-12         79.0µs ± 1%    78.2µs ± 1%  -1.01%  (p=0.000 n=20+20)
JSONEncode-12               11.9ms ± 0%    11.9ms ± 0%  -0.29%  (p=0.000 n=20+19)
JSONDecode-12               52.6ms ± 0%    52.2ms ± 0%  -0.68%  (p=0.000 n=19+20)
Mandelbrot200-12            3.69ms ± 0%    3.68ms ± 0%  -0.36%  (p=0.000 n=20+20)
GoParse-12                  3.13ms ± 1%    3.18ms ± 1%  +1.67%  (p=0.000 n=19+20)
RegexpMatchEasy0_32-12      73.2ns ± 1%    72.3ns ± 1%  -1.19%  (p=0.000 n=19+18)
RegexpMatchEasy0_1K-12       241ns ± 0%     239ns ± 0%  -0.83%  (p=0.000 n=17+16)
RegexpMatchEasy1_32-12      68.6ns ± 1%    69.0ns ± 1%  +0.47%  (p=0.015 n=18+16)
RegexpMatchEasy1_1K-12       364ns ± 0%     361ns ± 0%  -0.67%  (p=0.000 n=16+17)
RegexpMatchMedium_32-12      104ns ± 1%     103ns ± 1%  -0.79%  (p=0.001 n=20+15)
RegexpMatchMedium_1K-12     33.8µs ± 3%    34.0µs ± 2%    ~     (p=0.267 n=20+19)
RegexpMatchHard_32-12       1.64µs ± 1%    1.62µs ± 2%  -1.25%  (p=0.000 n=19+18)
RegexpMatchHard_1K-12       49.2µs ± 0%    48.7µs ± 1%  -0.93%  (p=0.000 n=19+18)
Revcomp-12                   391ms ± 5%     396ms ± 7%    ~     (p=0.154 n=19+19)
Template-12                 63.1ms ± 0%    59.5ms ± 0%  -5.76%  (p=0.000 n=18+19)
TimeParse-12                 307ns ± 0%     306ns ± 0%  -0.39%  (p=0.000 n=19+17)
TimeFormat-12                325ns ± 0%     323ns ± 0%  -0.50%  (p=0.000 n=19+19)
[Geo mean]                  47.3µs         46.9µs       -0.67%

https://perf.golang.org/search?q=upload:20171026.1

name                       old time/op  new time/op  delta
Garbage/benchmem-MB=64-12  2.25ms ± 1%  2.20ms ± 1%  -2.31%  (p=0.000 n=18+18)
HTTP-12                    12.6µs ± 0%  12.6µs ± 0%  -0.72%  (p=0.000 n=18+17)
JSON-12                    11.0ms ± 0%  11.0ms ± 1%  -0.68%  (p=0.000 n=17+19)

https://perf.golang.org/search?q=upload:20171026.2

Updates #14951.
Updates #22460.

Change-Id: Id4c0932890a1d41020071bec73b8522b1367d3e7
Reviewed-on: https://go-review.googlesource.com/73712
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-10-30 18:12:46 +00:00
Austin Clements
afbe646ab4 cmd/compile: report typedslicecopy write barriers
Most write barrier calls are inserted by SSA, but copy and append are
lowered to runtime.typedslicecopy during walk. Fix these to set
Func.WBPos and emit the "write barrier" warning, as done for the write
barriers inserted by SSA. As part of this, we refactor setting WBPos
and emitting this warning into the frontend so it can be shared by
both walk and SSA.

Change-Id: I5fe9997d9bdb55e03e01dd58aee28908c35f606b
Reviewed-on: https://go-review.googlesource.com/73411
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2017-10-29 20:21:43 +00:00
Keith Randall
1787ced894 cmd/compile: remove Symbol wrappers from Aux fields
We used to have {Arg,Auto,Extern}Symbol structs with which we wrapped
a *gc.Node or *obj.LSym before storing them in the Aux field
of an ssa.Value.  This let the SSA part of the compiler distinguish
between autos and args, for example.  We no longer need the wrappers
as we can query the underlying objects directly.

There was also some sloppy usage, where VarDef had a *gc.Node
directly in its Aux field, whereas the use of that variable had
that *gc.Node wrapped in an AutoSymbol. Thus the Aux fields didn't
match (using ==) when they probably should.
This sloppy usage cleanup is the only thing in the CL that changes the
generated code - we can get rid of some more unused auto variables if
the matching happens reliably.

Removing this wrapper also lets us get rid of the varsyms cache
(which was used to prevent wrapping the same *gc.Node twice).

Change-Id: I0dedf8f82f84bfee413d310342b777316bd1d478
Reviewed-on: https://go-review.googlesource.com/64452
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2017-09-19 22:03:10 +00:00
Heschi Kreinick
2d57d94ac3 [dev.debug] cmd/compile: track variable decomposition in LocalSlot
When the compiler decomposes a user variable, track its origin so that
it can be recomposed during DWARF generation.

Change-Id: Ia71c7f8e7f4d65f0652f1c97b0dda5d9cad41936
Reviewed-on: https://go-review.googlesource.com/50878
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2017-07-26 18:39:39 +00:00
Josh Bleecher Snyder
46b88c9fbc cmd/compile: change ssa.Type into *types.Type
When package ssa was created, Type was in package gc.
To avoid circular dependencies, we used an interface (ssa.Type)
to represent type information in SSA.

In the Go 1.9 cycle, gri extricated the Type type from package gc.
As a result, we can now use it in package ssa.
Now, instead of package types depending on package ssa,
it is the other way.
This is a more sensible dependency tree,
and helps compiler performance a bit.

Though this is a big CL, most of the changes are
mechanical and uninteresting.

Interesting bits:

* Add new singleton globals to package types for the special
  SSA types Memory, Void, Invalid, Flags, and Int128.
* Add two new Types, TSSA for the special types,
  and TTUPLE, for SSA tuple types.
  ssa.MakeTuple is now types.NewTuple.
* Move type comparison result constants CMPlt, CMPeq, and CMPgt
  to package types.
* We had picked the name "types" in our rules for the handy
  list of types provided by ssa.Config. That conflicted with
  the types package name, so change it to "typ".
* Update the type comparison routine to handle tuples and special
  types inline.
* Teach gc/fmt.go how to print special types.
* We can now eliminate ElemTypes in favor of just Elem,
  and probably also some other duplicated Type methods
  designed to return ssa.Type instead of *types.Type.
* The ssa tests were using their own dummy types,
  and they were not particularly careful about types in general.
  Of necessity, this CL switches them to use *types.Type;
  it does not make them more type-accurate.
  Unfortunately, using types.Type means initializing a bit
  of the types universe.
  This is prime for refactoring and improvement.

This shrinks ssa.Value; it now fits in a smaller size class
on 64 bit systems. This doesn't have a giant impact,
though, since most Values are preallocated in a chunk.

name        old alloc/op      new alloc/op      delta
Template         37.9MB ± 0%       37.7MB ± 0%  -0.57%  (p=0.000 n=10+8)
Unicode          28.9MB ± 0%       28.7MB ± 0%  -0.52%  (p=0.000 n=10+10)
GoTypes           110MB ± 0%        109MB ± 0%  -0.88%  (p=0.000 n=10+10)
Flate            24.7MB ± 0%       24.6MB ± 0%  -0.66%  (p=0.000 n=10+10)
GoParser         31.1MB ± 0%       30.9MB ± 0%  -0.61%  (p=0.000 n=10+9)
Reflect          73.9MB ± 0%       73.4MB ± 0%  -0.62%  (p=0.000 n=10+8)
Tar              25.8MB ± 0%       25.6MB ± 0%  -0.77%  (p=0.000 n=9+10)
XML              41.2MB ± 0%       40.9MB ± 0%  -0.80%  (p=0.000 n=10+10)
[Geo mean]       40.5MB            40.3MB       -0.68%

name        old allocs/op     new allocs/op     delta
Template           385k ± 0%         386k ± 0%    ~     (p=0.356 n=10+9)
Unicode            343k ± 1%         344k ± 0%    ~     (p=0.481 n=10+10)
GoTypes           1.16M ± 0%        1.16M ± 0%  -0.16%  (p=0.004 n=10+10)
Flate              238k ± 1%         238k ± 1%    ~     (p=0.853 n=10+10)
GoParser           320k ± 0%         320k ± 0%    ~     (p=0.720 n=10+9)
Reflect            957k ± 0%         957k ± 0%    ~     (p=0.460 n=10+8)
Tar                252k ± 0%         252k ± 0%    ~     (p=0.133 n=9+10)
XML                400k ± 0%         400k ± 0%    ~     (p=0.796 n=10+10)
[Geo mean]         428k              428k       -0.01%


Removing all the interface calls helps non-trivially with CPU, though.

name        old time/op       new time/op       delta
Template          178ms ± 4%        173ms ± 3%  -2.90%  (p=0.000 n=94+96)
Unicode          85.0ms ± 4%       83.9ms ± 4%  -1.23%  (p=0.000 n=96+96)
GoTypes           543ms ± 3%        528ms ± 3%  -2.73%  (p=0.000 n=98+96)
Flate             116ms ± 3%        113ms ± 4%  -2.34%  (p=0.000 n=96+99)
GoParser          144ms ± 3%        140ms ± 4%  -2.80%  (p=0.000 n=99+97)
Reflect           344ms ± 3%        334ms ± 4%  -3.02%  (p=0.000 n=100+99)
Tar               106ms ± 5%        103ms ± 4%  -3.30%  (p=0.000 n=98+94)
XML               198ms ± 5%        192ms ± 4%  -2.88%  (p=0.000 n=92+95)
[Geo mean]        178ms             173ms       -2.65%

name        old user-time/op  new user-time/op  delta
Template          229ms ± 5%        224ms ± 5%  -2.36%  (p=0.000 n=95+99)
Unicode           107ms ± 6%        106ms ± 5%  -1.13%  (p=0.001 n=93+95)
GoTypes           696ms ± 4%        679ms ± 4%  -2.45%  (p=0.000 n=97+99)
Flate             137ms ± 4%        134ms ± 5%  -2.66%  (p=0.000 n=99+96)
GoParser          176ms ± 5%        172ms ± 8%  -2.27%  (p=0.000 n=98+100)
Reflect           430ms ± 6%        411ms ± 5%  -4.46%  (p=0.000 n=100+92)
Tar               128ms ±13%        123ms ±13%  -4.21%  (p=0.000 n=100+100)
XML               239ms ± 6%        233ms ± 6%  -2.50%  (p=0.000 n=95+97)
[Geo mean]        220ms             213ms       -2.76%


Change-Id: I15c7d6268347f8358e75066dfdbd77db24e8d0c1
Reviewed-on: https://go-review.googlesource.com/42145
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2017-05-09 23:01:51 +00:00
Josh Bleecher Snyder
dae5389d3d Revert "cmd/compile: add Type.MustSize and Type.MustAlignment"
This reverts commit 94d540a4b6.

Reason for revert: prefer something along the lines of CL 42018.

Change-Id: I876fe32e98f37d8d725fe55e0fd0ea429c0198e0
Reviewed-on: https://go-review.googlesource.com/42022
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2017-04-28 01:24:13 +00:00
Josh Bleecher Snyder
94d540a4b6 cmd/compile: add Type.MustSize and Type.MustAlignment
Type.Size and Type.Alignment are for the front end:
They calculate size and alignment if needed.

Type.MustSize and Type.MustAlignment are for the back end:
They call Fatal if size and alignment are not already calculated.

Most uses are of MustSize and MustAlignment,
but that's because the back end is newer,
and this API was added to support it.

This CL was mostly generated with sed and selective reversion.
The only mildly interesting bit is the change of the ssa.Type interface
and the supporting ssa dummy types.

Follow-up to review feedback on CL 41970.

Passes toolstash-check.

Change-Id: I0d9b9505e57453dae8fb6a236a07a7a02abd459e
Reviewed-on: https://go-review.googlesource.com/42016
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-04-27 22:57:57 +00:00
Matthew Dempsky
c87520c598 cmd: remove IntSize and Widthint
Use PtrSize and Widthptr instead. CL prepared mostly with sed and
uniq.

Passes toolstash-check -all.

Fixes #19954.

Change-Id: I09371bd7128672885cb8bc4e7f534ad56a88d755
Reviewed-on: https://go-review.googlesource.com/40506
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Martin Möhrmann <moehrmann@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2017-04-22 17:43:43 +00:00
Josh Bleecher Snyder
405a280d01 cmd/internal/obj: eliminate LSym.Version
There were only two versions, 0 and 1,
and the only user of version 1 was the assembler,
to indicate that a symbol was static.

Rename LSym.Version to Static,
and add it to LSym.Attributes.
Simplify call-sites.

Passes toolstash-check.

Change-Id: Iabd39918f5019cce78f381d13f0481ae09f3871f
Reviewed-on: https://go-review.googlesource.com/41201
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2017-04-20 21:56:23 +00:00
Josh Bleecher Snyder
c311488283 cmd/internal/obj: remove Linklookup
It was simply a wrapper around Link.Lookup.
Unwrap everything.

CL prepared using eg with template:

package p

import "cmd/internal/obj"

func before(ctxt *obj.Link, name string, version int) *obj.LSym {
	return obj.Linklookup(ctxt, name, version)
}

func after(ctxt *obj.Link, name string, version int) *obj.LSym {
	return ctxt.Lookup(name, version)
}

Then one comment in cmd/asm/internal/asm/parse.go
was manually updated (and gofmt'ed!),
and func Linklookup deleted.

Passes toolstash-check (as a sanity measure).

Change-Id: Icc4d56b0b2b5c8888d3184c1898c48359ea1e638
Reviewed-on: https://go-review.googlesource.com/39715
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-04-06 19:01:41 +00:00
Josh Bleecher Snyder
0323895cc0 cmd/compile: catch and report nowritebarrier violations later
Prior to this CL, the SSA backend reported violations
of the //go:nowritebarrier annotation immediately.
This necessitated emitting errors during SSA compilation,
which is not compatible with a concurrent backend.

Instead, check for such violations later.
We already save the data required to do a late check
for violations of the //go:nowritebarrierrec annotation.
Use the same data, and check //go:nowritebarrier at the same time.

One downside to doing this is that now only a single
violation will be reported per function.
Given that this is for the runtime only,
and violations are rare, this seems an acceptable cost.

While we are here, remove several 'nerrors != 0' checks
that are rendered pointless.

Updates #15756
Fixes #19250 (as much as it ever can be)

Change-Id: Ia44c4ad5b6fd6f804d9f88d9571cec8d23665cb3
Reviewed-on: https://go-review.googlesource.com/38973
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-03-31 16:31:20 +00:00
Josh Bleecher Snyder
34975095d0 cmd/compile: provide pos and curfn to temp
Concurrent compilation requires providing an
explicit position and curfn to temp.
This implementation of tempAt temporarily
continues to use the globals lineno and Curfn,
so as not to collide with mdempsky's
work for #19683 eliminating the Curfn dependency
from func nod.

Updates #15756
Updates #19683

Change-Id: Ib3149ca4b0740e9f6eea44babc6f34cdd63028a9
Reviewed-on: https://go-review.googlesource.com/38592
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2017-03-25 00:09:21 +00:00
Josh Bleecher Snyder
a68e5d94fa cmd/compile: clean up SSA test API
I noted in CL 38327 that the SSA test API felt a bit
clunky after the ssa.Func/ssa.Cache/ssa.Config refactoring,
and promised to clean it up once the dust settled.
The dust has settled.

Along the way, this CL fixes a potential latent bug,
in which the amd64 test context was used for all dummy Syslook calls.
The lone SSA test using the s390x context did not depend on the
Syslook context being correct, so the bug did not arise in practice.

Change-Id: If964251d1807976073ad7f47da0b1f1f77c58413
Reviewed-on: https://go-review.googlesource.com/38346
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-03-19 05:37:39 +00:00
Josh Bleecher Snyder
872db79989 cmd/compile: add more types to ssa.Types
This reduces the number of calls back into the
gc Type routines, which will help performance
in a concurrent backend.
It also reduces the number of callsites
that must be considered in making the transition.

Passes toolstash-check -all. No compiler performance changes.

Updates #15756

Change-Id: Ic7a8f1daac7e01a21658ae61ac118b2a70804117
Reviewed-on: https://go-review.googlesource.com/38340
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2017-03-19 02:03:44 +00:00
Josh Bleecher Snyder
aea3aff669 cmd/compile: separate ssa.Frontend and ssa.TypeSource
Prior to this CL, the ssa.Frontend field was responsible
for providing types to the backend during compilation.
However, the types needed by the backend are few and static.
It makes more sense to use a struct for them
and to hang that struct off the ssa.Config,
which is the correct home for readonly data.
Now that Types is a struct, we can clean up the names a bit as well.

This has the added benefit of allowing early construction
of all types needed by the backend.
This will be useful for concurrent backend compilation.

Passes toolstash-check -all. No compiler performance change.

Updates #15756

Change-Id: I021658c8cf2836d6a22bbc20cc828ac38c7da08a
Reviewed-on: https://go-review.googlesource.com/38336
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2017-03-19 00:21:23 +00:00
Josh Bleecher Snyder
2cdb7f118a cmd/compile: move Frontend field from ssa.Config to ssa.Func
Suggested by mdempsky in CL 38232.
This allows us to use the Frontend field
to associate frontend state and information
with a function.
See the following CL in the series for examples.

This is a giant CL, but it is almost entirely routine refactoring.

The ssa test API is starting to feel a bit unwieldy.
I will clean it up separately, once the dust has settled.

Passes toolstash -cmp.

Updates #15756

Change-Id: I71c573bd96ff7251935fce1391b06b1f133c3caf
Reviewed-on: https://go-review.googlesource.com/38327
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2017-03-17 23:18:57 +00:00
Cherry Zhang
1b85300602 cmd/compile: clean up SSA-building code
Now that the write barrier insertion is moved to SSA, the SSA
building code can be simplified.

Updates #17583.

Change-Id: I5cacc034b11aa90b0abe6f8dd97e4e3994e2bc25
Reviewed-on: https://go-review.googlesource.com/36840
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2017-03-16 14:24:40 +00:00
Cherry Zhang
9ebf3d5100 cmd/compile: move write barrier insertion to SSA
When the compiler insert write barriers, the frontend makes
conservative decisions at an early stage. This sometimes have
false positives because of the lack of information, for example,
writes on stack. SSA's writebarrier pass identifies writes on
stack and eliminates write barriers for them.

This CL moves write barrier insertion into SSA. The frontend no
longer makes decisions about write barriers, and simply does
normal assignments and emits normal Store ops when building SSA.
SSA writebarrier pass inserts write barrier for Stores when needed.
There, it has better information about the store because Phi and
Copy propagation are done at that time.

This CL only changes StoreWB to Store in gc/ssa.go. A followup CL
simplifies SSA building code.

Updates #17583.

Change-Id: I4592d9bc0067503befc169c50b4e6f4765673bec
Reviewed-on: https://go-review.googlesource.com/36839
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2017-03-16 14:24:21 +00:00
philhofer
295307ae78 cmd/compile: de-virtualize interface calls
With this change, code like

    h := sha1.New()
    h.Write(buf)
    sum := h.Sum()

gets compiled into static calls rather than
interface calls, because the compiler is able
to prove that 'h' is really a *sha1.digest.

The InterCall re-write rule hits a few dozen times
during make.bash, and hundreds of times during all.bash.

The most common pattern identified by the compiler
is a constructor like

    func New() Interface { return &impl{...} }

where the constructor gets inlined into the caller,
and the result is used immediately. Examples include
{sha1,md5,crc32,crc64,...}.New, base64.NewEncoder,
base64.NewDecoder, errors.New, net.Pipe, and so on.

Some existing benchmarks that change on darwin/amd64:

Crc64/ISO4KB-8        2.67µs ± 1%    2.66µs ± 0%  -0.36%  (p=0.015 n=10+10)
Crc64/ISO1KB-8         694ns ± 0%     690ns ± 1%  -0.59%  (p=0.001 n=10+10)
Adler32KB-8            473ns ± 1%     471ns ± 0%  -0.39%  (p=0.010 n=10+9)

On architectures like amd64, the reduction in code size
appears to contribute more to benchmark improvements than just
removing the indirect call, since that branch gets predicted
accurately when called in a loop.

Updates #19361

Change-Id: I57d4dc21ef40a05ec0fbd55a9bb0eb74cdc67a3d
Reviewed-on: https://go-review.googlesource.com/38139
Run-TryBot: Philip Hofer <phofer@umich.edu>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2017-03-14 18:49:23 +00:00
David Chase
b59a405656 Revert "cmd/compile: de-virtualize interface calls"
This reverts commit 4e0c7c3f61.

Reason for revert: The presence-of-optimization test program is fragile, breaks under noopt, and might break if the Go libraries are tweaked.  It needs to be (re)written without reference to other packages.

Change-Id: I3aaf1ab006a1a255f961a978e9c984341740e3c7
Reviewed-on: https://go-review.googlesource.com/38097
Reviewed-by: Keith Randall <khr@golang.org>
2017-03-13 21:15:32 +00:00
Philip Hofer
4e0c7c3f61 cmd/compile: de-virtualize interface calls
With this change, code like

    h := sha1.New()
    h.Write(buf)
    sum := h.Sum()

gets compiled into static calls rather than
interface calls, because the compiler is able
to prove that 'h' is really a *sha1.digest.

The InterCall re-write rule hits a few dozen times
during make.bash, and hundreds of times during all.bash.

The most common pattern identified by the compiler
is a constructor like

    func New() Interface { return &impl{...} }

where the constructor gets inlined into the caller,
and the result is used immediately. Examples include
{sha1,md5,crc32,crc64,...}.New, base64.NewEncoder,
base64.NewDecoder, errors.New, net.Pipe, and so on.

Some existing benchmarks that change on darwin/amd64:

Crc64/ISO4KB-8        2.67µs ± 1%    2.66µs ± 0%  -0.36%  (p=0.015 n=10+10)
Crc64/ISO1KB-8         694ns ± 0%     690ns ± 1%  -0.59%  (p=0.001 n=10+10)
Adler32KB-8            473ns ± 1%     471ns ± 0%  -0.39%  (p=0.010 n=10+9)

On architectures like amd64, the reduction in code size
appears to contribute more to benchmark improvements than just
removing the indirect call, since that branch gets predicted
accurately when called in a loop.

Updates #19361

Change-Id: Ia9d30afdd5f6b4d38d38b14b88f308acae8ce7ed
Reviewed-on: https://go-review.googlesource.com/37751
Run-TryBot: Philip Hofer <phofer@umich.edu>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2017-03-13 18:24:57 +00:00
David Chase
d71f36b5aa cmd/compile: check loop rescheduling with stack bound, not counter
After benchmarking with a compiler modified to have better
spill location, it became clear that this method of checking
was actually faster on (at least) two different architectures
(ppc64 and amd64) and it also provides more timely interruption
of loops.

This change adds a modified FOR loop node "FORUNTIL" that
checks after executing the loop body instead of before (i.e.,
always at least once).  This ensures that a pointer past the
end of a slice or array is not made visible to the garbage
collector.

Without the rescheduling checks inserted, the restructured
loop from this  change apparently provides a 1% geomean
improvement on PPC64 running the go1 benchmarks; the
improvement on AMD64 is only 0.12%.

Inserting the rescheduling check exposed some peculiar bug
with the ssa test code for s390x; this was updated based on
initial code actually generated for GOARCH=s390x to use
appropriate OpArg, OpAddr, and OpVarDef.

NaCl is disabled in testing.

Change-Id: Ieafaa9a61d2a583ad00968110ef3e7a441abca50
Reviewed-on: https://go-review.googlesource.com/36206
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-03-08 18:52:12 +00:00
Dave Cheney
d945b28675 cmd/compile/internal/ssa: remove unused PrintFunc variable
Change-Id: I8c581eec77beacaddc0aac29e7d380a4d5ca8acc
Reviewed-on: https://go-review.googlesource.com/37551
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-03-01 01:23:31 +00:00