Commit graph

180 commits

Author SHA1 Message Date
Martin Möhrmann
020a18c545 cmd/compile: move slice construction to callers of makeslice
Only return a pointer p to the new slices backing array from makeslice.
Makeslice callers then construct sliceheader{p, len, cap} explictly
instead of makeslice returning the slice.

Reduces go binary size by ~0.2%.
Removes 92 (~3.5%) panicindex calls from go binary.

Change-Id: I29b7c3b5fe8b9dcec96e2c43730575071cfe8a94
Reviewed-on: https://go-review.googlesource.com/c/141822
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2018-10-29 19:23:00 +00:00
Matthew Dempsky
2dda040f19 cmd/compile/internal/gc: represent labels as bare Syms
Avoids allocating an ONAME for OLABEL, OGOTO, and named OBREAK and
OCONTINUE nodes.

Passes toolstash-check.

Change-Id: I359142cd48e8987b5bf29ac100752f8c497261c1
Reviewed-on: https://go-review.googlesource.com/c/145200
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2018-10-27 07:32:06 +00:00
Matthew Dempsky
67a9c0afd1 cmd/compile/internal/gc: fix ONAME documentation
Named constants are represented as OLITERAL with n.Sym != nil.

Change-Id: If6bc8c507ef8c3e4e47f586d86fd1d0f20bf8974
Reviewed-on: https://go-review.googlesource.com/c/145198
Reviewed-by: Robert Griesemer <gri@golang.org>
2018-10-27 00:22:57 +00:00
Josh Bleecher Snyder
2578ac54eb cmd/compile: move argument stack construction to SSA generation
The goal of this change is to move work from walk to SSA,
and simplify things along the way.

This is hard to accomplish cleanly with small incremental changes,
so this large commit message aims to provide a roadmap to the diff.

High level description:

Prior to this change, walk was responsible for constructing (most of) the stack for function calls.
ascompatte gathered variadic arguments into a slice.
It also rewrote n.List from a list of arguments to a list of assignments to stack slots.
ascompatte was called multiple times to handle the receiver in a method call.
reorder1 then introduced temporaries into n.List as needed to avoid smashing the stack.
adjustargs then made extra stack space for go/defer args as needed.

Node to SSA construction evaluated all the statements in n.List,
and issued the function call, assuming that the stack was correctly constructed.
Intrinsic calls had to dig around inside n.List to extract the arguments,
since intrinsics don't use the stack to make function calls.

This change moves stack construction to the SSA construction phase.
ascompatte, now called walkParams, does all the work that ascompatte and reorder1 did.
It handles variadic arguments, inserts the method receiver if needed, and allocates temporaries.
It does not, however, make any assignments to stack slots.
Instead, it moves the function arguments to n.Rlist, leaving assignments to temporaries in n.List.
(It would be better to use Ninit instead of List; future work.)
During SSA construction, after doing all the temporary assignments in n.List,
the function arguments are assigned to stack slots by
constructing the appropriate SSA Value, using (*state).storeArg.
SSA construction also now handles adjustments for go/defer args.
This change also simplifies intrinsic calls, since we no longer need to undo walk's work.

Along the way, we simplify nodarg by pushing the fp==1 case to its callers, where it fits nicely.

Generated code differences:

There were a few optimizations applied along the way, the old way.
f(g()) was rewritten to do a block copy of function results to function arguments.
And reorder1 avoided introducing the final "save the stack" temporary in n.List.

The f(g()) block copy optimization never actually triggered; the order pass rewrote away g(), so that has been removed.

SSA optimizations mostly obviated the need for reorder1's optimization of avoiding the final temporary.
The exception was when the temporary's type was not SSA-able;
in that case, we got a Move into an autotmp and then an immediate Move onto the stack,
with the autotmp never read or used again.
This change introduces a new rewrite rule to detect such pointless double Moves
and collapse them into a single Move.
This is actually more powerful than the original optimization,
since the original optimization relied on the imprecise Node.HasCall calculation.

The other significant difference in the generated code is that the stack is now constructed
completely in SP-offset order. Prior to this change, the stack was constructed somewhat
haphazardly: first the final argument that Node.HasCall deemed to require a temporary,
then other arguments, then the method receiver, then the defer/go args.
SP-offset is probably a good default order. See future work.

There are a few minor object file size changes as a result of this change.
I investigated some regressions in early versions of this change.

One regression (in archive/tar) was the addition of a single CMPQ instruction,
which would be eliminated were this TODO from flagalloc to be done:
	// TODO: Remove original instructions if they are never used.

One regression (in text/template) was an ADDQconstmodify that is now
a regular MOVQLoad+ADDQconst+MOVQStore, due to an unlucky change
in the order in which arguments are written. The argument change
order can also now be luckier, so this appears to be a wash.

All in all, though there will be minor winners and losers,
this change appears to be performance neutral.

Future work:

Move loading the result of function calls to SSA construction; eliminate OINDREGSP.

Consider pushing stack construction deeper into SSA world, perhaps in an arch-specific pass.
Among other benefits, this would make it easier to transition to a new calling convention.
This would require rethinking the handling of stack conflicts and is non-trivial.

Figure out some clean way to indicate that stack construction Stores/Moves
do not alias each other, so that subsequent passes may do things like
CSE+tighten shared stack setup, do DSE using non-first Stores, etc.
This would allow us to eliminate the minor text/template regression.

Possibly make assignments to stack slots not treated as statements by DWARF.

Compiler benchmarks:

name        old time/op       new time/op       delta
Template          182ms ± 2%        179ms ± 2%  -1.69%  (p=0.000 n=47+48)
Unicode          86.3ms ± 5%       85.1ms ± 4%  -1.36%  (p=0.001 n=50+50)
GoTypes           646ms ± 1%        642ms ± 1%  -0.63%  (p=0.000 n=49+48)
Compiler          2.89s ± 1%        2.86s ± 2%  -1.36%  (p=0.000 n=48+50)
SSA               8.47s ± 1%        8.37s ± 2%  -1.22%  (p=0.000 n=47+50)
Flate             122ms ± 2%        121ms ± 2%  -0.66%  (p=0.000 n=47+45)
GoParser          147ms ± 2%        146ms ± 2%  -0.53%  (p=0.006 n=46+49)
Reflect           406ms ± 2%        403ms ± 2%  -0.76%  (p=0.000 n=48+43)
Tar               162ms ± 3%        162ms ± 4%    ~     (p=0.191 n=46+50)
XML               223ms ± 2%        222ms ± 2%  -0.37%  (p=0.031 n=45+49)
[Geo mean]        382ms             378ms       -0.89%

name        old user-time/op  new user-time/op  delta
Template          219ms ± 3%        216ms ± 3%  -1.56%  (p=0.000 n=50+48)
Unicode           109ms ± 6%        109ms ± 5%    ~     (p=0.190 n=50+49)
GoTypes           836ms ± 2%        828ms ± 2%  -0.96%  (p=0.000 n=49+48)
Compiler          3.87s ± 2%        3.80s ± 1%  -1.81%  (p=0.000 n=49+46)
SSA               12.0s ± 1%        11.8s ± 1%  -2.01%  (p=0.000 n=48+50)
Flate             142ms ± 3%        141ms ± 3%  -0.85%  (p=0.003 n=50+48)
GoParser          178ms ± 4%        175ms ± 4%  -1.66%  (p=0.000 n=48+46)
Reflect           520ms ± 2%        512ms ± 2%  -1.44%  (p=0.000 n=45+48)
Tar               200ms ± 3%        198ms ± 4%  -0.61%  (p=0.037 n=47+50)
XML               277ms ± 3%        275ms ± 3%  -0.85%  (p=0.000 n=49+48)
[Geo mean]        482ms             476ms       -1.23%

name        old alloc/op      new alloc/op      delta
Template         36.1MB ± 0%       35.3MB ± 0%  -2.18%  (p=0.008 n=5+5)
Unicode          29.8MB ± 0%       29.3MB ± 0%  -1.58%  (p=0.008 n=5+5)
GoTypes           125MB ± 0%        123MB ± 0%  -2.13%  (p=0.008 n=5+5)
Compiler          531MB ± 0%        513MB ± 0%  -3.40%  (p=0.008 n=5+5)
SSA              2.00GB ± 0%       1.93GB ± 0%  -3.34%  (p=0.008 n=5+5)
Flate            24.5MB ± 0%       24.3MB ± 0%  -1.18%  (p=0.008 n=5+5)
GoParser         29.4MB ± 0%       28.7MB ± 0%  -2.34%  (p=0.008 n=5+5)
Reflect          87.1MB ± 0%       86.0MB ± 0%  -1.33%  (p=0.008 n=5+5)
Tar              35.3MB ± 0%       34.8MB ± 0%  -1.44%  (p=0.008 n=5+5)
XML              47.9MB ± 0%       47.1MB ± 0%  -1.86%  (p=0.008 n=5+5)
[Geo mean]       82.8MB            81.1MB       -2.08%

name        old allocs/op     new allocs/op     delta
Template           352k ± 0%         347k ± 0%  -1.32%  (p=0.008 n=5+5)
Unicode            342k ± 0%         339k ± 0%  -0.66%  (p=0.008 n=5+5)
GoTypes           1.29M ± 0%        1.27M ± 0%  -1.30%  (p=0.008 n=5+5)
Compiler          4.98M ± 0%        4.87M ± 0%  -2.14%  (p=0.008 n=5+5)
SSA               15.7M ± 0%        15.2M ± 0%  -2.86%  (p=0.008 n=5+5)
Flate              233k ± 0%         231k ± 0%  -0.83%  (p=0.008 n=5+5)
GoParser           296k ± 0%         291k ± 0%  -1.54%  (p=0.016 n=5+4)
Reflect           1.05M ± 0%        1.04M ± 0%  -0.65%  (p=0.008 n=5+5)
Tar                343k ± 0%         339k ± 0%  -0.97%  (p=0.008 n=5+5)
XML                432k ± 0%         426k ± 0%  -1.19%  (p=0.008 n=5+5)
[Geo mean]         815k              804k       -1.35%

name        old object-bytes  new object-bytes  delta
Template          505kB ± 0%        505kB ± 0%  -0.01%  (p=0.008 n=5+5)
Unicode           224kB ± 0%        224kB ± 0%    ~     (all equal)
GoTypes          1.82MB ± 0%       1.83MB ± 0%  +0.06%  (p=0.008 n=5+5)
Flate             324kB ± 0%        324kB ± 0%  +0.00%  (p=0.008 n=5+5)
GoParser          402kB ± 0%        402kB ± 0%  +0.04%  (p=0.008 n=5+5)
Reflect          1.39MB ± 0%       1.39MB ± 0%  -0.01%  (p=0.008 n=5+5)
Tar               449kB ± 0%        449kB ± 0%  -0.02%  (p=0.008 n=5+5)
XML               598kB ± 0%        597kB ± 0%  -0.05%  (p=0.008 n=5+5)

Change-Id: Ifc9d5c1bd01f90171414b8fb18ffe2290d271143
Reviewed-on: https://go-review.googlesource.com/c/114797
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2018-10-19 21:23:16 +00:00
Matthew Dempsky
a0d6420d8b cmd/compile/internal/gc: remove OCMPIFACE/OCMPSTR placeholders
Change-Id: If05f6146a1fd97f61fc71629c5c29df43220d0c8
Reviewed-on: https://go-review.googlesource.com/c/141638
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-10-11 21:18:44 +00:00
Matthew Dempsky
28fbbf4111 cmd/compile/internal/gc: remove OCMPIFACE and OCMPSTR
Interface and string comparisons don't need separate Ops any more than
struct or array comparisons do.

Removing them requires shuffling some code around in walk (and a
little in order), but overall allows simplifying things a bit.

Passes toolstash-check.

Change-Id: I084b8a6c089b768dc76d220379f4daed8a35db15
Reviewed-on: https://go-review.googlesource.com/c/141637
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-10-11 21:18:33 +00:00
Igor Zhilianin
f90e89e675 all: fix a bunch of misspellings
Change-Id: If2954bdfc551515403706b2cd0dde94e45936e08
GitHub-Last-Rev: d4cfc41a55
GitHub-Pull-Request: golang/go#28049
Reviewed-on: https://go-review.googlesource.com/c/140299
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-10-06 15:40:03 +00:00
Keith Randall
9a8372f8bd cmd/compile,runtime: remove ambiguously live logic
The previous CL introduced stack objects. This CL removes the old
ambiguously live liveness analysis. After this CL we're relying
on stack objects exclusively.

Update a bunch of liveness tests to reflect the new world.

Fixes #22350

Change-Id: I739b26e015882231011ce6bc1a7f426049e59f31
Reviewed-on: https://go-review.googlesource.com/c/134156
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-10-03 19:54:16 +00:00
Austin Clements
837ed98d63 cmd/compile: don't produce a past-the-end pointer in range loops
Currently, range loops over slices and arrays are compiled roughly
like:

for i, x := range s { b }
  ⇓
for i, _n, _p := 0, len(s), &s[0]; i < _n; i, _p = i+1, _p + unsafe.Sizeof(s[0]) { b }
  ⇓
i, _n, _p := 0, len(s), &s[0]
goto cond
body:
{ b }
i, _p = i+1, _p + unsafe.Sizeof(s[0])
cond:
if i < _n { goto body } else { goto end }
end:

The problem with this lowering is that _p may temporarily point past
the end of the allocation the moment before the loop terminates. Right
now this isn't a problem because there's never a safe-point during
this brief moment.

We're about to introduce safe-points everywhere, so this bad pointer
is going to be a problem. We could mark the increment as an unsafe
block, but this inhibits reordering opportunities and could result in
infrequent safe-points if the body is short.

Instead, this CL fixes this by changing how we compile range loops to
never produce this past-the-end pointer. It changes the lowering to
roughly:

i, _n, _p := 0, len(s), &s[0]
if i < _n { goto body } else { goto end }
top:
_p += unsafe.Sizeof(s[0])
body:
{ b }
i++
if i < _n { goto top } else { goto end }
end:

Notably, the increment is split into two parts: we increment the index
before checking the condition, but increment the pointer only *after*
the condition check has succeeded.

The implementation builds on the OFORUNTIL construct that was
introduced during the loop preemption experiments, since OFORUNTIL
places the increment and condition after the loop body. To support the
extra "late increment" step, we further define OFORUNTIL's "List"
field to contain the late increment statements. This makes all of this
a relatively small change.

This depends on the improvements to the prove pass in CL 102603. With
the current lowering, bounds-check elimination knows that i < _n in
the body because the body block is dominated by the cond block. In the
new lowering, deriving this fact requires detecting that i < _n on
*both* paths into body and hence is true in body. CL 102603 made prove
able to detect this.

The code size effect of this is minimal. The cmd/go binary on
linux/amd64 increases by 0.17%. Performance-wise, this actually
appears to be a net win, though it's mostly noise:

name                      old time/op    new time/op    delta
BinaryTree17-12              2.80s ± 0%     2.61s ± 1%  -6.88%  (p=0.000 n=20+18)
Fannkuch11-12                2.41s ± 0%     2.42s ± 0%  +0.05%  (p=0.005 n=20+20)
FmtFprintfEmpty-12          41.6ns ± 5%    41.4ns ± 6%    ~     (p=0.765 n=20+19)
FmtFprintfString-12         69.4ns ± 3%    69.3ns ± 1%    ~     (p=0.084 n=19+17)
FmtFprintfInt-12            76.1ns ± 1%    77.3ns ± 1%  +1.57%  (p=0.000 n=19+19)
FmtFprintfIntInt-12          122ns ± 2%     123ns ± 3%  +0.95%  (p=0.015 n=20+20)
FmtFprintfPrefixedInt-12     153ns ± 2%     151ns ± 3%  -1.27%  (p=0.013 n=20+20)
FmtFprintfFloat-12           215ns ± 0%     216ns ± 0%  +0.47%  (p=0.000 n=20+16)
FmtManyArgs-12               486ns ± 1%     498ns ± 0%  +2.40%  (p=0.000 n=20+17)
GobDecode-12                6.43ms ± 0%    6.50ms ± 0%  +1.08%  (p=0.000 n=18+19)
GobEncode-12                5.43ms ± 1%    5.47ms ± 0%  +0.76%  (p=0.000 n=20+20)
Gzip-12                      218ms ± 1%     218ms ± 1%    ~     (p=0.883 n=20+20)
Gunzip-12                   38.8ms ± 0%    38.9ms ± 0%    ~     (p=0.644 n=19+19)
HTTPClientServer-12         76.2µs ± 1%    76.4µs ± 2%    ~     (p=0.218 n=20+20)
JSONEncode-12               12.2ms ± 0%    12.3ms ± 1%  +0.45%  (p=0.000 n=19+19)
JSONDecode-12               54.2ms ± 1%    53.3ms ± 0%  -1.67%  (p=0.000 n=20+20)
Mandelbrot200-12            3.71ms ± 0%    3.71ms ± 0%    ~     (p=0.143 n=19+20)
GoParse-12                  3.22ms ± 0%    3.19ms ± 1%  -0.72%  (p=0.000 n=20+20)
RegexpMatchEasy0_32-12      76.7ns ± 1%    75.8ns ± 1%  -1.19%  (p=0.000 n=20+17)
RegexpMatchEasy0_1K-12       245ns ± 1%     243ns ± 0%  -0.72%  (p=0.000 n=18+17)
RegexpMatchEasy1_32-12      71.9ns ± 0%    71.7ns ± 1%  -0.39%  (p=0.006 n=12+18)
RegexpMatchEasy1_1K-12       358ns ± 1%     354ns ± 1%  -1.13%  (p=0.000 n=20+19)
RegexpMatchMedium_32-12      105ns ± 2%     105ns ± 1%  -0.63%  (p=0.007 n=19+20)
RegexpMatchMedium_1K-12     31.9µs ± 1%    31.9µs ± 1%    ~     (p=1.000 n=17+17)
RegexpMatchHard_32-12       1.51µs ± 1%    1.52µs ± 2%  +0.46%  (p=0.042 n=18+18)
RegexpMatchHard_1K-12       45.3µs ± 1%    45.5µs ± 2%  +0.44%  (p=0.029 n=18+19)
Revcomp-12                   388ms ± 1%     385ms ± 0%  -0.57%  (p=0.000 n=19+18)
Template-12                 63.0ms ± 1%    63.3ms ± 0%  +0.50%  (p=0.000 n=19+20)
TimeParse-12                 309ns ± 1%     307ns ± 0%  -0.62%  (p=0.000 n=20+20)
TimeFormat-12                328ns ± 0%     333ns ± 0%  +1.35%  (p=0.000 n=19+19)
[Geo mean]                  47.0µs         46.9µs       -0.20%

(https://perf.golang.org/search?q=upload:20180326.1)

For #10958.
For #24543.

Change-Id: Icbd52e711fdbe7938a1fea3e6baca1104b53ac3a
Reviewed-on: https://go-review.googlesource.com/102604
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2018-05-22 14:15:46 +00:00
Robert Griesemer
2b23996939 cmd/compile: use existing flag bits to record 'used' property of Names (cleanup)
Change-Id: I804d5ab111e33bd2c2554e2bac75b5273b0b4160
Reviewed-on: https://go-review.googlesource.com/106121
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2018-04-11 16:43:17 +00:00
Matthew Dempsky
17df5ed910 cmd/compile: insert instrumentation during SSA building
Insert appropriate race/msan calls before each memory operation during
SSA construction.

This is conceptually simple, but subtle because we need to be careful
that inserted instrumentation calls don't clobber arguments that are
currently being prepared for a user function call.

reorder1 already handles introducing temporary variables for arguments
in some cases. This CL changes it to use them for all arguments when
instrumenting.

Also, we can't SSA struct types with more than one field while
instrumenting. Otherwise, concurrent uses of disjoint fields within an
SSA-able struct can introduce false races.

This is both somewhat better and somewhat worse than the old racewalk
instrumentation pass. We're now able to easily recognize cases like
constructing non-escaping closures on the stack or accessing closure
variables don't need instrumentation calls. On the other hand,
spilling escaping parameters to the heap now results in an
instrumentation call.

Overall, this CL results in a small net reduction in the number of
instrumentation calls, but a small net increase in binary size for
instrumented executables. cmd/go ends up with 5.6% fewer calls, but a
2.4% larger binary.

Fixes #19054.

Change-Id: I70d1dd32ad6340e6fdb691e6d5a01452f58e97f3
Reviewed-on: https://go-review.googlesource.com/102817
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-04-09 18:40:55 +00:00
Matthew Dempsky
562a199961 cmd/compile: extract inline related fields into separate Inline type
Inl, Inldcl, and InlCost are only applicable to functions with bodies
that can be inlined, so pull them out into a separate Inline type to
make understanding them easier.

A side benefit is that we can check if a function can be inlined by
just checking if n.Func.Inl is non-nil, which simplifies handling of
empty function bodies.

While here, remove some unnecessary Curfn twiddling, and make imported
functions use Inl.Dcl instead of Func.Dcl for consistency for local
functions.

Passes toolstash-check.

Change-Id: Ifd4a80349d85d9e8e4484952b38ec4a63182e81f
Reviewed-on: https://go-review.googlesource.com/104756
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2018-04-05 05:12:36 +00:00
Matthew Dempsky
eb3c44b2c4 cmd/compile: cleanup closure.go
The main thing is we now eagerly create the ODCLFUNC node for
closures, immediately cross-link them, and assign fields (e.g., Nbody,
Dcl, Parents, Marks) directly on the ODCLFUNC (previously they were
assigned on the OCLOSURE and later moved to the ODCLFUNC).

This allows us to set Curfn to the ODCLFUNC instead of the OCLOSURE,
which makes things more consistent with normal function declarations.
(Notably, this means Cvars now hang off the ODCLFUNC instead of the
OCLOSURE.)

Assignment of xfunc symbol names also now happens before typechecking
their body, which means debugging output now provides a more helpful
name than "<S>".

In golang.org/cl/66810, we changed "x := y" statements to avoid
creating false closure variables for x, but we still create them for
struct literals like "s{f: x}". Update comment in capturevars
accordingly.

More opportunity for cleanups still, but this makes some substantial
progress, IMO.

Passes toolstash-check.

Change-Id: I65a4efc91886e3dcd1000561348af88297775cd7
Reviewed-on: https://go-review.googlesource.com/100197
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2018-03-14 23:54:39 +00:00
Matthew Dempsky
e4de522c95 cmd/compile: fix Node.Etype overloading
Add helper methods that validate n.Op and convert to/from the
appropriate type.

Notably, there was a lot of code in walk.go that thought setting
Etype=1 on an OADDR node affected escape analysis.

Passes toolstash-check.

TBR=marvin

Change-Id: Ieae7c67225c1459c9719f9e6a748a25b975cf758
Reviewed-on: https://go-review.googlesource.com/99535
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-03-09 21:44:35 +00:00
Matthew Dempsky
e3127f023f cmd/compile: fuse escape analysis parameter tagging loops
Simplifies the code somewhat and allows removing Param.Field.

Passes toolstash-check.

Change-Id: Id854416aea8afd27ce4830ff0f5ff940f7353792
Reviewed-on: https://go-review.googlesource.com/99336
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-03-08 18:21:52 +00:00
Matthew Dempsky
d7eb4901f1 cmd/compile: remove funcdepth variables
There were only two large classes of use for these variables:

1) Testing "funcdepth != 0" or "funcdepth > 0", which is equivalent to
checking "Curfn != nil".

2) In oldname, detecting whether a closure variable has been created
for the current function, which can be handled by instead testing
"n.Name.Curfn != Curfn".

Lastly, merge funcstart into funchdr, since it's only called once, and
it better matches up with funcbody now.

Passes toolstash-check.

Change-Id: I8fe159a9d37ef7debc4cd310354cea22a8b23394
Reviewed-on: https://go-review.googlesource.com/99076
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-03-07 06:05:18 +00:00
Heschi Kreinick
e181852dd4 cmd/compile/internal: use sparseSet, optimize isSynthetic
changedVars was functionally a set, but couldn't be iterated over
efficiently. In functions with many variables, the wasted iteration was
costly. Use a sparseSet instead.

(*gc.Node).String() is very expensive: it calls Sprintf, which does
reflection, etc, etc. Instead, just look at .Sym.Name, which is all we
care about.

Change-Id: Ib61cd7b5c796e1813b8859135e85da5bfe2ac686
Reviewed-on: https://go-review.googlesource.com/92402
Reviewed-by: David Chase <drchase@google.com>
2018-02-21 18:01:22 +00:00
Austin Clements
a7f73c436d cmd/compile: eliminate NoFramePointer
The NoFramePointer function flag is no longer used, so this CL
eliminates it. This cleans up some confusion between the compiler's
NoFramePointer flag and obj's NOFRAME flag. NoFramePointer was
intended to eliminate the saved base pointer on x86, but it was
translated into obj's NOFRAME flag. On x86, NOFRAME does mean to omit
the saved base pointer, but on ppc64 and s390x it has a more general
meaning of omitting *everything* from the frame, including the saved
LR and ppc64's "fixed frame". Hence, on ppc64 and s390x there are far
fewer situations where it is safe to set this flag.

Change-Id: If68991310b4d00638128c296bdd57f4ed731b46d
Reviewed-on: https://go-review.googlesource.com/92036
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-02-12 21:41:21 +00:00
Than McIntosh
4435fcfd6c compiler,linker: support for DWARF inlined instances
Compiler and linker changes to support DWARF inlined instances,
see https://go.googlesource.com/proposal/+/HEAD/design/22080-dwarf-inlining.md
for design details.

This functionality is gated via the cmd/compile option -gendwarfinl=N,
where N={0,1,2}, where a value of 0 disables dwarf inline generation,
a value of 1 turns on dwarf generation without tracking of formal/local
vars from inlined routines, and a value of 2 enables inlines with
variable tracking.

Updates #22080

Change-Id: I69309b3b815d9fed04aebddc0b8d33d0dbbfad6e
Reviewed-on: https://go-review.googlesource.com/75550
Run-TryBot: Than McIntosh <thanm@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2017-11-30 14:39:19 +00:00
Daniel Martí
3231d4e4ef cmd/compile: replace opnames with stringer
Now possible, since stringer just got the -trimprefix flag added.

While at it, simplify a few Op stringifications since we can now use %v,
and no longer have to worry about o<len(opnames).

Passes toolstash -cmp on std cmd.

Fixes #15462.

Change-Id: Icdcde0b0a5eb165d18488918175024da274f782b
Reviewed-on: https://go-review.googlesource.com/76790
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Dave Cheney <dave@cheney.net>
2017-11-10 10:56:22 +00:00
Matthew Dempsky
8684534321 cmd/compile: don't export unreachable inline method bodies
Previously, anytime we exported a function or method declaration
(which includes methods for every type transitively exported), we
included the inline function bodies, if any. However, in many cases,
it's impossible (or at least very unlikely) for the importing package
to call the method.

For example:

    package p
    type T int
    func (t T) M() { t.u() }
    func (t T) u() {}
    func (t T) v() {}

T.M and T.u are inlineable, and they're both reachable through calls
to T.M, which is exported. However, t.v is also inlineable, but cannot
be reached.

Exception: if p.T is embedded in another type q.U, p.T.v will be
promoted to q.U.v, and the generated wrapper function could have
inlined the call to p.T.v. However, in practice, this doesn't happen,
and a missed inlining opportunity doesn't affect correctness.

To implement this, this CL introduces an extra flood fill pass before
exporting to mark inline bodies that are actually reachable, so the
exporter can skip over methods like t.v.

This reduces Kubernetes build time (as measured by "time go build -a
k8s.io/kubernetes/cmd/...") on an HP Z620 measurably:

    == before ==
    real    0m44.658s
    user    11m19.136s
    sys     0m53.844s

    == after ==
    real    0m41.702s
    user    10m29.732s
    sys     0m50.908s

It also significantly cuts down the cost of enabling mid-stack
inlining (-l=4):

    == before (-l=4) ==
    real    1m19.236s
    user    20m6.528s
    sys     1m17.328s

    == after (-l=4) ==
    real    0m59.100s
    user    13m12.808s
    sys     0m58.776s

Updates #19348.

Change-Id: Iade58233ca42af823a1630517a53848b5d3c7a7e
Reviewed-on: https://go-review.googlesource.com/74110
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2017-10-31 19:12:51 +00:00
Austin Clements
afbe646ab4 cmd/compile: report typedslicecopy write barriers
Most write barrier calls are inserted by SSA, but copy and append are
lowered to runtime.typedslicecopy during walk. Fix these to set
Func.WBPos and emit the "write barrier" warning, as done for the write
barriers inserted by SSA. As part of this, we refactor setting WBPos
and emitting this warning into the frontend so it can be shared by
both walk and SSA.

Change-Id: I5fe9997d9bdb55e03e01dd58aee28908c35f606b
Reviewed-on: https://go-review.googlesource.com/73411
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2017-10-29 20:21:43 +00:00
Austin Clements
5a4b6bce37 cmd/compile: improve coverage of nowritebarrierrec check
The current go:nowritebarrierrec checker has two problems that limit
its coverage:

1. It doesn't understand that systemstack calls its argument, which
means there are several cases where we fail to detect prohibited write
barriers.

2. It only observes calls in the AST, so calls constructed during
lowering by SSA aren't followed.

This CL completely rewrites this checker to address these issues.

The current checker runs entirely after walk and uses visitBottomUp,
which introduces several problems for checking across systemstack.
First, visitBottomUp itself doesn't understand systemstack calls, so
the callee may be ordered after the caller, causing the checker to
fail to propagate constraints. Second, many systemstack calls are
passed a closure, which is quite difficult to resolve back to the
function definition after transformclosure and walk have run. Third,
visitBottomUp works exclusively on the AST, so it can't observe calls
created by SSA.

To address these problems, this commit splits the check into two
phases and rewrites it to use a call graph generated during SSA
lowering. The first phase runs before transformclosure/walk and simply
records systemstack arguments when they're easy to get. Then, it
modifies genssa to record static call edges at the point where we're
lowering to Progs (which is the latest point at which position
information is conveniently available). Finally, the second phase runs
after all functions have been lowered and uses a direct BFS walk of
the call graph (combining systemstack calls with static calls) to find
prohibited write barriers and construct nice error messages.

Fixes #22384.
For #22460.

Change-Id: I39668f7f2366ab3c1ab1a71eaf25484d25349540
Reviewed-on: https://go-review.googlesource.com/72773
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2017-10-29 19:36:44 +00:00
Matthew Dempsky
4b1f2bb688 cmd/compile: document function syntax representation
Not perfect, but it's a start.

Change-Id: I3283385223a39ea73567b0130290bfe4de69d018
Reviewed-on: https://go-review.googlesource.com/73552
Reviewed-by: Robert Griesemer <gri@golang.org>
2017-10-25 22:29:57 +00:00
Matthew Dempsky
fcd32885df cmd/compile: refactor method expression detection
Eliminates lots of ad hoc code for recognizing the same thing in
different ways.

Passes toolstash-check.

Change-Id: Ic0bb005308e96331b4ef30f455b860e476725b61
Reviewed-on: https://go-review.googlesource.com/73190
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-10-24 22:21:34 +00:00
Matthew Dempsky
12c9d753f8 cmd/compile: refactor generic AST walking code
racewalk's "foreach" function applies a function to all of a Node's
immediate children, but with a non-idiomatic signature.

This CL reworks it to recursively iterate over the entire subtree
rooted at Node and provides a way to short-circuit iteration.

Passes toolstash -cmp for std cmd with -race.

Change-Id: I738b73953d608709802c97945b7e0f4e4940d3f4
Reviewed-on: https://go-review.googlesource.com/71911
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2017-10-19 20:06:45 +00:00
Hugues Bruant
4f70a2a699 cmd/compile: inline calls to local closures
Calls to a closure held in a local, non-escaping,
variable can be inlined, provided the closure body
can be inlined and the variable is never written to.

The current implementation has the following limitations:

 - closures with captured variables are not inlined because
   doing so naively triggers invariant violation in the SSA
   phase
 - re-assignment check is currently approximated by checking
   the Addrtaken property of the variable which should be safe
   but may miss optimization opportunities if the address is
   not used for a write before the invocation

Updates #15561

Change-Id: I508cad5d28f027bd7e933b1f793c14dcfef8b5a1
Reviewed-on: https://go-review.googlesource.com/65071
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Hugues Bruant <hugues.bruant@gmail.com>
Reviewed-by: Keith Randall <khr@golang.org>
2017-10-11 22:32:36 +00:00
Alberto Donizetti
07f7db3ea9 cmd/compile: fix OTYPESW Op comment
OTYPESW Op comment says

  // List = Left.(type)

this seems to be wrong. Adding

  fmt.Printf("n: %v\n", cond)
  fmt.Printf("  n.List: %v\n", cond.List)
  fmt.Printf("  n.Left: %v\n", cond.Left)
  fmt.Printf("  n.Right: %v\n", cond.Right)

to (s *typeSwitch) walk(sw *Node), and compiling the following code
snippet

  var y interface{}
  switch x := y.(type) {
  default:
    println(x)
  }

prints

  n: <node TYPESW>
    n.List:
    n.Left: x
    n.Right: y

The correct OTYPESW Node field positions are

  // Left = Right.(type)

This is confirmed by the fact that, further in the code,
typeSwitch.walk() checks that Right (and not Left) is of type
interface:

  cond.Right = walkexpr(cond.Right, &sw.Ninit)
  if !cond.Right.Type.IsInterface() {
    yyerror("type switch must be on an interface")
    return
  }

This patch fixes the OTYPESW comment.

Change-Id: Ief1e409cfabb7640d7f7b0d4faabbcffaf605450
Reviewed-on: https://go-review.googlesource.com/69112
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2017-10-11 07:55:04 +00:00
Matthew Dempsky
b2e7eae7c4 cmd/compile: remove Local flags on Type and Node
These are redundant with checking x.Sym.Pkg == localpkg.

Passes toolstash-check -all.

Change-Id: Iebe25f7932cd15a036141b468ad75c239abcdcf7
Reviewed-on: https://go-review.googlesource.com/66670
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2017-10-04 17:55:43 +00:00
Matthew Dempsky
3066dbad52 cmd/compile: cleanup toolstash pacifier from OXFALL removal
Change-Id: Ide7fe6b09247b7a6befbdfc2d6ce5988aa1df323
Reviewed-on: https://go-review.googlesource.com/61131
Reviewed-by: Robert Griesemer <gri@golang.org>
2017-09-19 18:20:29 +00:00
Matthew Dempsky
4347baac7d cmd/compile: eliminate OXFALL
Previously, we used OXFALL vs OFALL to distinguish fallthrough
statements that had been validated. Because in the Node AST we flatten
statement blocks, OXCASE and OXFALL needed to keep track of their
block scopes for this purpose.

Now that we have an AST that keeps these separate, we can just perform
the validation earlier.

Passes toolstash-check.

Fixes #14540.

Change-Id: I8421eaba16c2b3b72c9c5483b5cf20b14261385e
Reviewed-on: https://go-review.googlesource.com/61130
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2017-09-19 18:08:50 +00:00
Heschi Kreinick
4c54a047c6 [dev.debug] cmd/compile: better DWARF with optimizations on
Debuggers use DWARF information to find local variables on the
stack and in registers. Prior to this CL, the DWARF information for
functions claimed that all variables were on the stack at all times.
That's incorrect when optimizations are enabled, and results in
debuggers showing data that is out of date or complete gibberish.

After this CL, the compiler is capable of representing variable
locations more accurately, and attempts to do so. Due to limitations of
the SSA backend, it's not possible to be completely correct.

There are a number of problems in the current design. One of the easier
to understand is that variable names currently must be attached to an
SSA value, but not all assignments in the source code actually result
in machine code. For example:

  type myint int
  var a int
  b := myint(int)
and
  b := (*uint64)(unsafe.Pointer(a))

don't generate machine code because the underlying representation is the
same, so the correct value of b will not be set when the user would
expect.

Generating the more precise debug information is behind a flag,
dwarflocationlists. Because of the issues described above, setting the
flag may not make the debugging experience much better, and may actually
make it worse in cases where the variable actually is on the stack and
the more complicated analysis doesn't realize it.

A number of changes are included:
- Add a new pseudo-instruction, RegKill, which indicates that the value
in the register has been clobbered.
- Adjust regalloc to emit RegKills in the right places. Significantly,
this means that phis are mixed with StoreReg and RegKills after
regalloc.
- Track variable decomposition in ssa.LocalSlots.
- After the SSA backend is done, analyze the result and build location
lists for each LocalSlot.
- After assembly is done, update the location lists with the assembled
PC offsets, recompose variables, and build DWARF location lists. Emit the
list as a new linker symbol, one per function.
- In the linker, aggregate the location lists into a .debug_loc section.

TODO:
- currently disabled for non-X86/AMD64 because there are no data tables.

go build -toolexec 'toolstash -cmp' -a std succeeds.

With -dwarflocationlists false:
before: f02812195637909ff675782c0b46836a8ff01976
after:  06f61e8112a42ac34fb80e0c818b3cdb84a5e7ec
benchstat -geomean  /tmp/220352263 /tmp/621364410
completed   15 of   15, estimated time remaining 0s (eta 3:52PM)
name        old time/op       new time/op       delta
Template          199ms ± 3%        198ms ± 2%     ~     (p=0.400 n=15+14)
Unicode          96.6ms ± 5%       96.4ms ± 5%     ~     (p=0.838 n=15+15)
GoTypes           653ms ± 2%        647ms ± 2%     ~     (p=0.102 n=15+14)
Flate             133ms ± 6%        129ms ± 3%   -2.62%  (p=0.041 n=15+15)
GoParser          164ms ± 5%        159ms ± 3%   -3.05%  (p=0.000 n=15+15)
Reflect           428ms ± 4%        422ms ± 3%     ~     (p=0.156 n=15+13)
Tar               123ms ±10%        124ms ± 8%     ~     (p=0.461 n=15+15)
XML               228ms ± 3%        224ms ± 3%   -1.57%  (p=0.045 n=15+15)
[Geo mean]        206ms             377ms       +82.86%

name        old user-time/op  new user-time/op  delta
Template          292ms ±10%        301ms ±12%     ~     (p=0.189 n=15+15)
Unicode           166ms ±37%        158ms ±14%     ~     (p=0.418 n=15+14)
GoTypes           962ms ± 6%        963ms ± 7%     ~     (p=0.976 n=15+15)
Flate             207ms ±19%        200ms ±14%     ~     (p=0.345 n=14+15)
GoParser          246ms ±22%        240ms ±15%     ~     (p=0.587 n=15+15)
Reflect           611ms ±13%        587ms ±14%     ~     (p=0.085 n=15+13)
Tar               211ms ±12%        217ms ±14%     ~     (p=0.355 n=14+15)
XML               335ms ±15%        320ms ±18%     ~     (p=0.169 n=15+15)
[Geo mean]        317ms             583ms       +83.72%

name        old alloc/op      new alloc/op      delta
Template         40.2MB ± 0%       40.2MB ± 0%   -0.15%  (p=0.000 n=14+15)
Unicode          29.2MB ± 0%       29.3MB ± 0%     ~     (p=0.624 n=15+15)
GoTypes           114MB ± 0%        114MB ± 0%   -0.15%  (p=0.000 n=15+14)
Flate            25.7MB ± 0%       25.6MB ± 0%   -0.18%  (p=0.000 n=13+15)
GoParser         32.2MB ± 0%       32.2MB ± 0%   -0.14%  (p=0.003 n=15+15)
Reflect          77.8MB ± 0%       77.9MB ± 0%     ~     (p=0.061 n=15+15)
Tar              27.1MB ± 0%       27.0MB ± 0%   -0.11%  (p=0.029 n=15+15)
XML              42.7MB ± 0%       42.5MB ± 0%   -0.29%  (p=0.000 n=15+15)
[Geo mean]       42.1MB            75.0MB       +78.05%

name        old allocs/op     new allocs/op     delta
Template           402k ± 1%         398k ± 0%   -0.91%  (p=0.000 n=15+15)
Unicode            344k ± 1%         344k ± 0%     ~     (p=0.715 n=15+14)
GoTypes           1.18M ± 0%        1.17M ± 0%   -0.91%  (p=0.000 n=15+14)
Flate              243k ± 0%         240k ± 1%   -1.05%  (p=0.000 n=13+15)
GoParser           327k ± 1%         324k ± 1%   -0.96%  (p=0.000 n=15+15)
Reflect            984k ± 1%         982k ± 0%     ~     (p=0.050 n=15+15)
Tar                261k ± 1%         259k ± 1%   -0.77%  (p=0.000 n=15+15)
XML                411k ± 0%         404k ± 1%   -1.55%  (p=0.000 n=15+15)
[Geo mean]         439k              755k       +72.01%

name        old text-bytes    new text-bytes    delta
HelloSize         694kB ± 0%        694kB ± 0%   -0.00%  (p=0.000 n=15+15)

name        old data-bytes    new data-bytes    delta
HelloSize        5.55kB ± 0%       5.55kB ± 0%     ~     (all equal)

name        old bss-bytes     new bss-bytes     delta
HelloSize         133kB ± 0%        133kB ± 0%     ~     (all equal)

name        old exe-bytes     new exe-bytes     delta
HelloSize        1.04MB ± 0%       1.04MB ± 0%     ~     (all equal)

Change-Id: I991fc553ef175db46bb23b2128317bbd48de70d8
Reviewed-on: https://go-review.googlesource.com/41770
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2017-07-27 20:19:44 +00:00
Alessandro Arzilli
2ad41a3090 cmd/compile: output DWARF lexical blocks for local variables
Change compiler and linker to emit DWARF lexical blocks in .debug_info
section when compiling with -N -l.

Version of debug_info is updated from DWARF v2 to DWARF v3 since
version 2 does not allow lexical blocks with discontinuous PC ranges.

Remaining open problems:
- scope information is removed from inlined functions
- variables records do not have DW_AT_start_scope attributes so a
variable will shadow other variables with the same name as soon as its
containing scope begins, even before its declaration.

Updates #6913.
Updates #12899.

Change-Id: Idc6808788512ea20e7e45bcf782453acb416fb49
Reviewed-on: https://go-review.googlesource.com/40095
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2017-05-18 23:10:50 +00:00
Josh Bleecher Snyder
53e62aba2f cmd/compile: add Func.SetNilCheckDisabled
Generated hash and eq routines don't need nil checks.
Prior to this CL, this was accomplished by
temporarily incrementing the global variable disable_checknil.
However, that increment lasted only the lifetime of the
call to funccompile. After CL 41503, funccompile may
do nothing but enqueue the function for compilation,
resulting in nil checks being generated.

Fix this by adding an explicit flag to a function
indicating whether nil checks should be disabled
for that function.

While we're here, allow concurrent compilation
with the -w and -W flags, since that was needed
to investigate this issue.

Fixes #20242

Change-Id: Ib9140c22c49e9a09e62fa3cf350f5d3eff18e2bd
Reviewed-on: https://go-review.googlesource.com/42591
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Marvin Stenger <marvin.stenger94@gmail.com>
Reviewed-by: Robert Griesemer <gri@golang.org>
2017-05-05 19:34:09 +00:00
Josh Bleecher Snyder
794d29a46f cmd/compile: use a map to track liveness variable indices
It is not safe to modify Node.Opt in the backend.
Instead of using Node.Opt to store liveness variable indices, use a map.
This simplifies the code and makes it much more clearly race-free.
There are generally few such variables, so the maps are not a significant
source of allocations; this also remove some allocations from putting
int32s into interfaces.

Because map lookups are more expensive than interface value extraction,
reorder valueEffects to do the map lookup last.

The only remaining use of Node.Opt is now in esc.go.

Passes toolstash-check.

Fixes #20144

name        old alloc/op      new alloc/op      delta
Template         37.8MB ± 0%       37.9MB ± 0%    ~     (p=0.548 n=5+5)
Unicode          28.9MB ± 0%       28.9MB ± 0%    ~     (p=0.548 n=5+5)
GoTypes           110MB ± 0%        110MB ± 0%  +0.16%  (p=0.008 n=5+5)
Compiler          461MB ± 0%        462MB ± 0%  +0.08%  (p=0.008 n=5+5)
SSA              1.11GB ± 0%       1.11GB ± 0%  +0.11%  (p=0.008 n=5+5)
Flate            24.7MB ± 0%       24.7MB ± 0%    ~     (p=0.690 n=5+5)
GoParser         31.1MB ± 0%       31.1MB ± 0%    ~     (p=0.841 n=5+5)
Reflect          73.7MB ± 0%       73.8MB ± 0%  +0.23%  (p=0.008 n=5+5)
Tar              25.8MB ± 0%       25.7MB ± 0%    ~     (p=0.690 n=5+5)
XML              41.2MB ± 0%       41.2MB ± 0%    ~     (p=0.841 n=5+5)
[Geo mean]       71.9MB            71.9MB       +0.06%

name        old allocs/op     new allocs/op     delta
Template           385k ± 0%         384k ± 0%    ~     (p=0.548 n=5+5)
Unicode            344k ± 0%         343k ± 1%    ~     (p=0.421 n=5+5)
GoTypes           1.16M ± 0%        1.16M ± 0%    ~     (p=0.690 n=5+5)
Compiler          4.43M ± 0%        4.42M ± 0%    ~     (p=0.095 n=5+5)
SSA               9.86M ± 0%        9.84M ± 0%  -0.19%  (p=0.008 n=5+5)
Flate              238k ± 0%         238k ± 0%    ~     (p=1.000 n=5+5)
GoParser           321k ± 0%         320k ± 0%    ~     (p=0.310 n=5+5)
Reflect            956k ± 0%         956k ± 0%    ~     (p=1.000 n=5+5)
Tar                252k ± 0%         251k ± 0%    ~     (p=0.056 n=5+5)
XML                402k ± 1%         400k ± 1%  -0.57%  (p=0.032 n=5+5)
[Geo mean]         740k              739k       -0.19%

Change-Id: Id5916c9def76add272e89c59fe10968f0a6bb01d
Reviewed-on: https://go-review.googlesource.com/42135
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2017-04-28 19:50:53 +00:00
Josh Bleecher Snyder
fc08a19cef cmd/compile: move Used from gc.Node to gc.Name
Node.Used was written to from the backend
concurrently with reads of Node.Class
for the same ONAME Nodes.
I do not know why it was not failing consistently
under the race detector, but it is a race.

This is likely also a problem with Node.HasVal and Node.HasOpt.
They will be handled in a separate CL.

Fix Used by moving it to gc.Name and making it a separate bool.
There was one non-Name use of Used, marking OLABELs as used.
That is no longer needed, now that goto and label checking
happens early in the front end.

Leave the getters and setters in place,
to ease changing the representation in the future
(or changing to an interface!).

Updates #20144

Change-Id: I9bec7c6d33dcb129a4cfa9d338462ea33087f9f7
Reviewed-on: https://go-review.googlesource.com/42015
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2017-04-27 22:58:13 +00:00
Josh Bleecher Snyder
386765afdf cmd/compile: move Node.Class to flags
Put it at position zero, since it is fairly hot.

This shrinks gc.Node into a smaller size class on 64 bit systems.

name        old time/op       new time/op       delta
Template          193ms ± 5%        192ms ± 3%    ~     (p=0.353 n=94+93)
Unicode          86.1ms ± 5%       85.0ms ± 4%  -1.23%  (p=0.000 n=95+98)
GoTypes           546ms ± 3%        544ms ± 4%  -0.40%  (p=0.007 n=94+97)
Compiler          2.56s ± 3%        2.54s ± 3%  -0.67%  (p=0.000 n=99+97)
SSA               5.13s ± 2%        5.10s ± 3%  -0.55%  (p=0.000 n=94+98)
Flate             122ms ± 6%        121ms ± 4%  -0.75%  (p=0.002 n=97+95)
GoParser          144ms ± 5%        144ms ± 4%    ~     (p=0.298 n=98+97)
Reflect           348ms ± 4%        349ms ± 4%    ~     (p=0.350 n=98+97)
Tar               105ms ± 5%        104ms ± 5%    ~     (p=0.154 n=96+98)
XML               200ms ± 5%        198ms ± 4%  -0.71%  (p=0.015 n=97+98)
[Geo mean]        330ms             328ms       -0.52%

name        old user-time/op  new user-time/op  delta
Template          229ms ±11%        224ms ± 7%  -2.16%  (p=0.001 n=100+87)
Unicode           109ms ± 5%        109ms ± 6%    ~     (p=0.897 n=96+91)
GoTypes           712ms ± 4%        709ms ± 4%    ~     (p=0.085 n=96+98)
Compiler          3.41s ± 3%        3.36s ± 3%  -1.43%  (p=0.000 n=98+98)
SSA               7.46s ± 3%        7.31s ± 3%  -2.02%  (p=0.000 n=100+99)
Flate             145ms ± 6%        143ms ± 6%  -1.11%  (p=0.001 n=99+97)
GoParser          177ms ± 5%        176ms ± 5%  -0.78%  (p=0.018 n=95+95)
Reflect           432ms ± 7%        435ms ± 9%    ~     (p=0.296 n=100+100)
Tar               121ms ± 7%        121ms ± 5%    ~     (p=0.072 n=100+95)
XML               241ms ± 4%        239ms ± 5%    ~     (p=0.085 n=97+99)
[Geo mean]        413ms             410ms       -0.73%

name        old alloc/op      new alloc/op      delta
Template         38.4MB ± 0%       37.7MB ± 0%  -1.85%  (p=0.008 n=5+5)
Unicode          30.1MB ± 0%       28.8MB ± 0%  -4.09%  (p=0.008 n=5+5)
GoTypes           112MB ± 0%        110MB ± 0%  -1.69%  (p=0.008 n=5+5)
Compiler          470MB ± 0%        461MB ± 0%  -1.91%  (p=0.008 n=5+5)
SSA              1.13GB ± 0%       1.11GB ± 0%  -1.70%  (p=0.008 n=5+5)
Flate            25.0MB ± 0%       24.6MB ± 0%  -1.67%  (p=0.008 n=5+5)
GoParser         31.6MB ± 0%       31.1MB ± 0%  -1.66%  (p=0.008 n=5+5)
Reflect          77.1MB ± 0%       75.8MB ± 0%  -1.69%  (p=0.008 n=5+5)
Tar              26.3MB ± 0%       25.7MB ± 0%  -2.06%  (p=0.008 n=5+5)
XML              41.9MB ± 0%       41.1MB ± 0%  -1.93%  (p=0.008 n=5+5)
[Geo mean]       73.5MB            72.0MB       -2.03%

name        old allocs/op     new allocs/op     delta
Template           383k ± 0%         383k ± 0%    ~     (p=0.690 n=5+5)
Unicode            343k ± 0%         343k ± 0%    ~     (p=0.841 n=5+5)
GoTypes           1.16M ± 0%        1.16M ± 0%    ~     (p=0.310 n=5+5)
Compiler          4.43M ± 0%        4.42M ± 0%  -0.17%  (p=0.008 n=5+5)
SSA               9.85M ± 0%        9.85M ± 0%    ~     (p=0.310 n=5+5)
Flate              236k ± 0%         236k ± 1%    ~     (p=0.841 n=5+5)
GoParser           320k ± 0%         320k ± 0%    ~     (p=0.421 n=5+5)
Reflect            988k ± 0%         987k ± 0%    ~     (p=0.690 n=5+5)
Tar                252k ± 0%         251k ± 0%    ~     (p=0.095 n=5+5)
XML                399k ± 0%         399k ± 0%    ~     (p=1.000 n=5+5)
[Geo mean]         741k              740k       -0.07%

Change-Id: I9e952b58a98e30a12494304db9ce50d0a85e459c
Reviewed-on: https://go-review.googlesource.com/41797
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Marvin Stenger <marvin.stenger94@gmail.com>
2017-04-26 16:58:33 +00:00
Josh Bleecher Snyder
502a03ffcf cmd/compile: move Node.Typecheck to flags
Change-Id: Id5aa4a1499068bf2d3497b21d794f970b7e47fdf
Reviewed-on: https://go-review.googlesource.com/41795
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-04-26 01:27:28 +00:00
Josh Bleecher Snyder
e2560ace3c cmd/compile: move Node.Initorder to flags
Grand savings: 6 bits.

Change-Id: I364be54cc41534689e01672ed0fe2c10a560d3d4
Reviewed-on: https://go-review.googlesource.com/41794
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-04-26 01:12:09 +00:00
Josh Bleecher Snyder
af7da9a53b cmd/compile: convert Node.Embedded into a flag
Change-Id: I30c59ba84dcacc3de39c42f94484b47bb7c36eba
Reviewed-on: https://go-review.googlesource.com/41792
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-04-26 01:01:53 +00:00
Josh Bleecher Snyder
d286399641 cmd/compile: move Node.Walkdef into flags
Node.Walkdef is 0, 1, or 2, so it only requires two bits.
Add support for 2-bit values to bitset,
and use it for Node.Walkdef.

Class, Embedded, Typecheck, and Initorder will follow suit
in subsequent CLs.

The multi-bit flags will go at the beginning,
since that generates (marginally) more efficient code.

Change-Id: Id6e2e66e437f10aaa05b8a6e1652efb327d06128
Reviewed-on: https://go-review.googlesource.com/41791
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-04-26 00:43:48 +00:00
Josh Bleecher Snyder
2fb2ebc32e cmd/compile: make node.hasVal into two bools
In addition to being more compact,
this makes the code a lot clearer.

Change-Id: Ibcb70526c2e5913dcf34904fda194e3585228c3f
Reviewed-on: https://go-review.googlesource.com/41761
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-04-26 00:02:30 +00:00
Josh Bleecher Snyder
7f0757b394 cmd/compile: make node.Likely a flag
node.Likely may once have held -1/0/+1,
but it is now only 0/1.

With improved SSA heuristics,
it may someday go away entirely.

Change-Id: I6451d17fd7fb47e67fea4d39df302b6db00ea57b
Reviewed-on: https://go-review.googlesource.com/41760
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-04-26 00:02:24 +00:00
Josh Bleecher Snyder
eba396f596 cmd/compile: add and use gc.Node.funcname
Change-Id: If5631eae7e2ad2bef56e79b82f77105246e68773
Reviewed-on: https://go-review.googlesource.com/41494
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-04-23 19:00:28 +00:00
Josh Bleecher Snyder
3692925c5e cmd/compile: move Text.From.Sym initialization earlier
The initialization of an ATEXT Prog's From.Sym
can race with the assemblers in a concurrent compiler.
CL 40254 contains an initial, failed attempt to
fix that race.

This CL takes a different approach: Rather than
expose an API to initialize the Prog,
expose an API to initialize the Sym.

The initialization of the Sym can then be
moved earlier in the compiler, avoiding the race.

The growth of gc.Func has negligible
performance impact; see below.

Passes toolstash -cmp.

Updates #15756

name       old alloc/op      new alloc/op      delta
Template        38.8MB ± 0%       38.8MB ± 0%    ~     (p=0.968 n=9+10)
Unicode         29.8MB ± 0%       29.8MB ± 0%    ~     (p=0.684 n=10+10)
GoTypes          113MB ± 0%        113MB ± 0%    ~     (p=0.912 n=10+10)
SSA             1.25GB ± 0%       1.25GB ± 0%    ~     (p=0.481 n=10+10)
Flate           25.3MB ± 0%       25.3MB ± 0%    ~     (p=0.105 n=10+10)
GoParser        31.7MB ± 0%       31.8MB ± 0%  +0.09%  (p=0.016 n=8+10)
Reflect         78.3MB ± 0%       78.2MB ± 0%    ~     (p=0.190 n=10+10)
Tar             26.5MB ± 0%       26.6MB ± 0%  +0.13%  (p=0.011 n=10+10)
XML             42.4MB ± 0%       42.4MB ± 0%    ~     (p=0.971 n=10+10)

name       old allocs/op     new allocs/op     delta
Template          378k ± 1%         378k ± 0%    ~     (p=0.315 n=10+9)
Unicode           321k ± 1%         321k ± 0%    ~     (p=0.436 n=10+10)
GoTypes          1.14M ± 0%        1.14M ± 0%    ~     (p=0.079 n=10+9)
SSA              9.70M ± 0%        9.70M ± 0%  -0.04%  (p=0.035 n=10+10)
Flate             233k ± 1%         234k ± 1%    ~     (p=0.529 n=10+10)
GoParser          315k ± 0%         316k ± 0%    ~     (p=0.095 n=9+10)
Reflect           980k ± 0%         980k ± 0%    ~     (p=0.436 n=10+10)
Tar               249k ± 1%         250k ± 0%    ~     (p=0.280 n=10+10)
XML               391k ± 1%         391k ± 1%    ~     (p=0.481 n=10+10)

Change-Id: I3c93033dddd2e1df8cc54a106a6e615d27859e71
Reviewed-on: https://go-review.googlesource.com/40496
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-04-12 22:49:06 +00:00
Josh Bleecher Snyder
3cf1ce40bd Revert "cmd/compile: output DWARF lexical blocks for local variables"
This reverts commit c8b889cc48.

Reason for revert: broke noopt build, compiler performance regression, new Curfn uses

Let's fix those and then try this again.

Change-Id: Icc3cad1365d04cac8fd09da9dbb0bbf55c13ef44
Reviewed-on: https://go-review.googlesource.com/39991
Reviewed-by: Robert Griesemer <gri@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2017-04-07 19:52:26 +00:00
Alessandro Arzilli
c8b889cc48 cmd/compile: output DWARF lexical blocks for local variables
Change compiler and linker to emit DWARF lexical blocks in debug_info.
Version of debug_info is updated from DWARF v.2 to DWARF v.3 since version 2
does not allow lexical blocks with discontinuous ranges.

Second attempt at https://go-review.googlesource.com/#/c/29591/

Remaining open problems:
- scope information is removed from inlined functions
- variables in debug_info do not have DW_AT_start_scope attributes so a
variable will shadow other variables with the same name as soon as its
containing scope begins, before its declaration.

Updates golang/go#12899, golang/go#6913

Change-Id: I0e260a45b564d14a87b88974eb16c5387cb410a5
Reviewed-on: https://go-review.googlesource.com/36879
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2017-04-07 18:15:06 +00:00
Robert Griesemer
f68f292820 cmd/compile: factor out Pkg, Sym, and Type into package types
- created new package cmd/compile/internal/types
- moved Pkg, Sym, Type to new package
- to break cycles, for now we need the (ugly) types/utils.go
  file which contains a handful of functions that must be installed
  early by the gc frontend
- to break cycles, for now we need two functions to convert between
  *gc.Node and *types.Node (the latter is a dummy type)
- adjusted the gc's code to use the new package and the conversion
  functions as needed
- made several Pkg, Sym, and Type methods functions as needed
- renamed constructors typ, typPtr, typArray, etc. to types.New,
  types.NewPtr, types.NewArray, etc.

Passes toolstash-check -all.

Change-Id: I8adfa5e85c731645d0a7fd2030375ed6ebf54b72
Reviewed-on: https://go-review.googlesource.com/39855
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2017-04-07 03:04:00 +00:00
Josh Bleecher Snyder
8e36575ebe cmd/compile: don't mutate shared nodes in orderinit
A few gc.Node ops may be shared across functions.
The compiler is (mostly) already careful to avoid mutating them.
However, from a concurrency perspective, replacing (say)
an empty list with an empty list still counts as a mutation.
One place this occurs is orderinit. Avoid it.

This requires fixing one spot where shared nodes were mutated.
It doesn't result in any functional or performance changes.

Passes toolstash-check.

Updates #15756

Change-Id: I63c93b31baeeac62d7574804acb6b7f2bc9d14a9
Reviewed-on: https://go-review.googlesource.com/39196
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2017-03-31 22:38:01 +00:00
Josh Bleecher Snyder
a0d6d3855f cmd/compile: construct typename in walk instead of SSA conversion
This eliminates references to lineno and
other globals from ssa conversion.

Passes toolstash-check.

Updates #15756

Change-Id: I9792074fab0036b42f454b79139d0b27db913fb5
Reviewed-on: https://go-review.googlesource.com/38721
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2017-03-27 23:10:39 +00:00