Commit graph

16 commits

Author SHA1 Message Date
khr@golang.org
524946d247 cmd/compile: don't preload registers if destination already scheduled
In regalloc, we allocate some values to registers before loop entry,
so that they don't need to be loaded (from spill locations) during
the loop.

But it is pointless if we've already regalloc'd the loop body.
Whatever restores we needed for the body are already generated.

It's not clear if this code is ever useful. No tests fail if I just
remove it. But at least this change is worthwhile. It doesn't help,
and it actively inserts more restores than we really need (mostly
because the desired register list is approximate - I have seen cases
where the loads implicated here end up being dead because the restores
hit the wrong registers and the edge shuffle pass knows it needs
the restores in different registers).

While we are here, might as well have layoutRegallocOrder return
the standard layout order instead of recomputing it.

Change-Id: Ia624d5121de59b6123492603695de50b272b277f
Reviewed-on: https://go-review.googlesource.com/c/go/+/672735
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: David Chase <drchase@google.com>
2025-05-19 17:13:21 -07:00
Keith Randall
68bd383368 cmd/compile: add cache of sizeable objects so they can be reused
We kind of have this mechanism already, just normalizing it and
using it in a bunch of places. Previously a bunch of places cached
slices only for the duration of a single function compilation. Now
we can reuse slices across a whole compiler run.

Use a sync.Pool of powers-of-two sizes. This lets us use not
too much memory, and avoid holding onto memory we're no longer
using when a GC happens.

There's a few different types we need, so generate the code for it.
Generics would be useful here, but we can't use generics in the
compiler because of bootstrapping.

Change-Id: I6cf37e7b7b2e802882aaa723a0b29770511ccd82
Reviewed-on: https://go-review.googlesource.com/c/go/+/444820
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Heschi Kreinick <heschi@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2022-10-31 21:41:20 +00:00
David Chase
b38b1b2f9a cmd/compile: manage Slot array better
steals idea from CL 312093

further investigation revealed additional duplicate
slots (equivalent, but not equal), so delete those too.

Rearranged Func.Names to be addresses of slots,
create canonical addresses so that split slots
(which use those addresses to refer to their parent,
and split slots can be further split)
will preserve "equivalent slots are equal".

Removes duplicates, improves metrics for "args at entry".

Change-Id: I5bbdcb50bd33655abcab3d27ad8cdce25499faaf
Reviewed-on: https://go-review.googlesource.com/c/go/+/312292
Trust: David Chase <drchase@google.com>
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2021-05-08 17:03:18 +00:00
erifan01
600259b099 cmd/compile: use depth first topological sort algorithm for layout
The current layout algorithm tries to put consecutive blocks together,
so the priority of the successor block is higher than the priority of
the zero indegree block. This algorithm is beneficial for subsequent
register allocation, but will result in more branch instructions.
The depth-first topological sorting algorithm is a well-known layout
algorithm, which has applications in many languages, and it helps to
reduce branch instructions. This CL applies it to the layout pass.
The test results show that it helps to reduce the code size.

This CL also includes the following changes:
1, Removed the primary predecessor mechanism. The new layout algorithm is
  not very friendly to register allocator in some cases, in order to adapt
  to the new layout algorithm, a new primary predecessor selection strategy
  is introduced.
2, Since the new layout implementation may place non-loop blocks between
  loop blocks, some adaptive modifications have also been made to looprotate
  pass.
3, The layout also affects the results of codegen, so this CL also adjusted
  several codegen tests accordingly.

It is inevitable that this CL will cause the code size or performance of a
few functions to decrease, but the number of cases it improves is much larger
than the number of cases it drops.

Statistical data from compilecmp on linux/amd64 is as follow:
name                      old time/op       new time/op       delta
Template                        382ms ± 4%        382ms ± 4%    ~     (p=0.497 n=49+50)
Unicode                         170ms ± 9%        169ms ± 8%    ~     (p=0.344 n=48+50)
GoTypes                         2.01s ± 4%        2.01s ± 4%    ~     (p=0.628 n=50+48)
Compiler                        190ms ±10%        189ms ± 9%    ~     (p=0.734 n=50+50)
SSA                             11.8s ± 2%        11.8s ± 3%    ~     (p=0.877 n=50+50)
Flate                           241ms ± 9%        241ms ± 8%    ~     (p=0.897 n=50+49)
GoParser                        366ms ± 3%        361ms ± 4%  -1.21%  (p=0.004 n=47+50)
Reflect                         835ms ± 3%        838ms ± 3%    ~     (p=0.275 n=50+49)
Tar                             336ms ± 4%        335ms ± 3%    ~     (p=0.454 n=48+48)
XML                             433ms ± 4%        431ms ± 3%    ~     (p=0.071 n=49+48)
LinkCompiler                    706ms ± 4%        705ms ± 4%    ~     (p=0.608 n=50+49)
ExternalLinkCompiler            1.85s ± 3%        1.83s ± 2%  -1.47%  (p=0.000 n=49+48)
LinkWithoutDebugCompiler        437ms ± 5%        437ms ± 6%    ~     (p=0.953 n=49+50)
[Geo mean]                      615ms             613ms       -0.37%

name                      old alloc/op      new alloc/op      delta
Template                       38.7MB ± 1%       38.7MB ± 1%    ~     (p=0.834 n=50+50)
Unicode                        28.1MB ± 0%       28.1MB ± 0%  -0.22%  (p=0.000 n=49+50)
GoTypes                         168MB ± 1%        168MB ± 1%    ~     (p=0.054 n=47+47)
Compiler                       23.0MB ± 1%       23.0MB ± 1%    ~     (p=0.432 n=50+50)
SSA                            1.54GB ± 0%       1.54GB ± 0%  +0.21%  (p=0.000 n=50+50)
Flate                          23.6MB ± 1%       23.6MB ± 1%    ~     (p=0.153 n=43+46)
GoParser                       35.1MB ± 1%       35.1MB ± 2%    ~     (p=0.202 n=50+50)
Reflect                        84.7MB ± 1%       84.7MB ± 1%    ~     (p=0.333 n=48+49)
Tar                            34.5MB ± 1%       34.5MB ± 1%    ~     (p=0.406 n=46+49)
XML                            44.3MB ± 2%       44.2MB ± 3%    ~     (p=0.981 n=50+50)
LinkCompiler                    131MB ± 0%        128MB ± 0%  -2.74%  (p=0.000 n=50+50)
ExternalLinkCompiler            120MB ± 0%        120MB ± 0%  +0.01%  (p=0.007 n=50+50)
LinkWithoutDebugCompiler       77.3MB ± 0%       77.3MB ± 0%  -0.02%  (p=0.000 n=50+50)
[Geo mean]                     69.3MB            69.1MB       -0.22%

file      before    after     Δ        %
addr2line 4104220   4043684   -60536   -1.475%
api       5342502   5249678   -92824   -1.737%
asm       4973785   4858257   -115528  -2.323%
buildid   2667844   2625660   -42184   -1.581%
cgo       4686849   4616313   -70536   -1.505%
compile   23667431  23268406  -399025  -1.686%
cover     4959676   4874108   -85568   -1.725%
dist      3515934   3450422   -65512   -1.863%
doc       3995581   3925469   -70112   -1.755%
fix       3379202   3318522   -60680   -1.796%
link      6743249   6629913   -113336  -1.681%
nm        4047529   3991777   -55752   -1.377%
objdump   4456151   4388151   -68000   -1.526%
pack      2435040   2398072   -36968   -1.518%
pprof     13804080  13565808  -238272  -1.726%
test2json 2690043   2645987   -44056   -1.638%
trace     10418492  10232716  -185776  -1.783%
vet       7258259   7121259   -137000  -1.888%
total     113145867 111204202 -1941665 -1.716%

The situation on linux/arm64 is as follow:
name                      old time/op       new time/op       delta
Template                        280ms ± 1%        282ms ± 1%  +0.75%  (p=0.000 n=46+48)
Unicode                         124ms ± 2%        124ms ± 2%  +0.37%  (p=0.045 n=50+50)
GoTypes                         1.69s ± 1%        1.70s ± 1%  +0.56%  (p=0.000 n=49+50)
Compiler                        122ms ± 1%        123ms ± 1%  +0.93%  (p=0.000 n=50+50)
SSA                             12.6s ± 1%        12.7s ± 0%  +0.72%  (p=0.000 n=50+50)
Flate                           170ms ± 1%        172ms ± 1%  +0.97%  (p=0.000 n=49+49)
GoParser                        262ms ± 1%        263ms ± 1%  +0.39%  (p=0.000 n=49+48)
Reflect                         639ms ± 1%        650ms ± 1%  +1.63%  (p=0.000 n=49+49)
Tar                             243ms ± 1%        245ms ± 1%  +0.82%  (p=0.000 n=50+50)
XML                             324ms ± 1%        327ms ± 1%  +0.72%  (p=0.000 n=50+49)
LinkCompiler                    597ms ± 1%        596ms ± 1%  -0.27%  (p=0.001 n=48+47)
ExternalLinkCompiler            1.90s ± 1%        1.88s ± 1%  -1.00%  (p=0.000 n=50+50)
LinkWithoutDebugCompiler        364ms ± 1%        363ms ± 1%    ~     (p=0.220 n=49+50)
[Geo mean]                      485ms             488ms       +0.49%

name                      old alloc/op      new alloc/op      delta
Template                       38.7MB ± 0%       38.8MB ± 1%    ~     (p=0.093 n=43+49)
Unicode                        28.4MB ± 0%       28.4MB ± 0%  +0.03%  (p=0.000 n=49+45)
GoTypes                         169MB ± 1%        169MB ± 1%  +0.23%  (p=0.010 n=50+50)
Compiler                       23.2MB ± 1%       23.2MB ± 1%  +0.11%  (p=0.000 n=40+44)
SSA                            1.54GB ± 0%       1.55GB ± 0%  +0.45%  (p=0.000 n=47+49)
Flate                          23.8MB ± 2%       23.8MB ± 1%    ~     (p=0.543 n=50+50)
GoParser                       35.3MB ± 1%       35.4MB ± 1%    ~     (p=0.792 n=50+50)
Reflect                        85.2MB ± 1%       85.2MB ± 0%    ~     (p=0.055 n=50+47)
Tar                            34.5MB ± 1%       34.5MB ± 1%  +0.06%  (p=0.015 n=50+50)
XML                            43.8MB ± 2%       43.9MB ± 2%  +0.19%  (p=0.000 n=48+48)
LinkCompiler                    137MB ± 0%        136MB ± 0%  -0.92%  (p=0.000 n=50+50)
ExternalLinkCompiler            127MB ± 0%        127MB ± 0%    ~     (p=0.516 n=50+50)
LinkWithoutDebugCompiler       84.0MB ± 0%       84.0MB ± 0%    ~     (p=0.057 n=50+50)
[Geo mean]                     70.4MB            70.4MB       +0.01%

file      before    after     Δ        %
addr2line 4021557   4002933   -18624   -0.463%
api       5127847   5028503   -99344   -1.937%
asm       5034716   4936836   -97880   -1.944%
buildid   2608118   2594094   -14024   -0.538%
cgo       4488592   4398320   -90272   -2.011%
compile   22501129  22213592  -287537  -1.278%
cover     4742301   4713573   -28728   -0.606%
dist      3388071   3365311   -22760   -0.672%
doc       3802250   3776082   -26168   -0.688%
fix       3306147   3216939   -89208   -2.698%
link      6404483   6363699   -40784   -0.637%
nm        3941026   3921930   -19096   -0.485%
objdump   4383330   4295122   -88208   -2.012%
pack      2404547   2389515   -15032   -0.625%
pprof     12996234  12856818  -139416  -1.073%
test2json 2668500   2586788   -81712   -3.062%
trace     9816276   9609580   -206696  -2.106%
vet       6900682   6787338   -113344  -1.643%
total     108535806 107056973 -1478833 -1.363%

Change-Id: Iaec1cdcaacca8025e9babb0fb8a532fddb70c87d
Reviewed-on: https://go-review.googlesource.com/c/go/+/255239
Reviewed-by: eric fang <eric.fang@arm.com>
Reviewed-by: Keith Randall <khr@golang.org>
Trust: eric fang <eric.fang@arm.com>
2021-03-16 02:44:54 +00:00
Josh Bleecher Snyder
42d4df9459 cmd/compile: lay out exit post-dominated blocks at the end
Complete a long-standing TODO in the code.

Exit blocks are cold code, so we lay them out at the end of the function.
Blocks that are post-dominated by exit blocks are also ipso facto exit blocks.
Treat them as such.

Implement using a simple loop, because there are generally very few exit blocks.

In addition to improved instruction cache, this empirically yields
better register allocation.

Binary size impact:

file    before    after     Δ       %       
cgo     4812872   4808776   -4096   -0.085% 
fix     3370072   3365976   -4096   -0.122% 
vet     8252280   8248184   -4096   -0.050% 
total   115052984 115040696 -12288  -0.011% 

This also appears to improve compiler performance
(-0.15% geomean time/op, -1.20% geomean user time/op),
but that could just be alignment effects.
Compiler benchmarking hasn't been super reliably recently,
and there's no particular reason to think this should
speed up the compiler that much.

Change-Id: I3d262c4f5cb80626a67a5c17285e2fa09f423c00
Reviewed-on: https://go-review.googlesource.com/c/go/+/227217
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2020-04-06 16:08:18 +00:00
Yury Smolsky
3068fcfa0d cmd/compile: add control flow graphs to ssa.html
This CL adds CFGs to ssa.html.
It execs dot to generate SVG,
which then gets inlined into the html.
Some standard naming and javascript hacks
enable integration with the rest of ssa.html.
Clicking on blocks highlights the relevant
part of the CFG, and vice versa.

Sample output and screenshots can be seen in #28177.

CFGs can be turned on with the suffix mask:
:*            - dump CFG for every phase
:lower        - just the lower phase
:lower-layout - lower through layout
:w,x-y        - phases w and x through y

Calling dot after every pass is noticeably slow,
instead use the range of phases.

Dead blocks are not displayed on CFG.

User can zoom and pan individual CFG
when the automatic adjustment has failed.

Dot-related errors are reported
without bringing down the process.

Fixes #28177

Change-Id: Id52c42d86c4559ca737288aa10561b67a119c63d
Reviewed-on: https://go-review.googlesource.com/c/142517
Run-TryBot: Yury Smolsky <yury@smolsky.by>
Reviewed-by: David Chase <drchase@google.com>
2018-11-21 10:22:43 +00:00
Igor Zhilianin
f90e89e675 all: fix a bunch of misspellings
Change-Id: If2954bdfc551515403706b2cd0dde94e45936e08
GitHub-Last-Rev: d4cfc41a55
GitHub-Pull-Request: golang/go#28049
Reviewed-on: https://go-review.googlesource.com/c/140299
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-10-06 15:40:03 +00:00
David Chase
c18ff18465 cmd/compile: decouple emitted block order from regalloc block order
While tinkering with different block orders for the preemptible
loop experiment, crashed the register allocator with a "bad"
one (these exist).  Realized that one knob was controlling
two things (register allocation and branch patterns) and
decided that life would be simpler if the two orders were
independent.

Ran some experiments and determined that we have probably,
mostly, been optimizing for register allocation effects, not
branch effects.  Bad block orders for register allocation are
somewhat costly.

This will also allow separate experimentation with perhaps-
better block orders for register allocation.

Change-Id: I6ecf2f24cca178b6f8acc0d3c4caaef043c11ed9
Reviewed-on: https://go-review.googlesource.com/47314
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-02-22 03:02:34 +00:00
Josh Bleecher Snyder
4b0d74f89d cmd/compile: lay out exit blocks last
In Go 1.8.x, panics are generally scheduled at the very end of functions.
That property was lost in Go 1.9; this CL restores it.

This helps with the Fannkuch benchmark:

name          old time/op  new time/op  delta
Fannkuch11-8   2.74s ± 2%   2.55s ± 2%  -7.03%  (p=0.000 n=20+20)

This increases the fannkuch function size from 801 bytes to 831 bytes,
but that is still smaller than Go 1.8.1 at 844 bytes.

It generally increases binary size a tiny amount.
Negligible compiler performance impact.

For the code in #14758:

name   old time/op    new time/op    delta
Foo-8     326ns ± 3%     312ns ± 3%  -4.32%  (p=0.000 n=28+30)
Bar-8     560ns ± 2%     565ns ± 2%  +0.96%  (p=0.002 n=30+27)

Updates #18977

name        old alloc/op      new alloc/op      delta
Template         38.8MB ± 0%       38.8MB ± 0%    ~     (p=0.690 n=5+5)
Unicode          28.7MB ± 0%       28.7MB ± 0%    ~     (p=0.841 n=5+5)
GoTypes           109MB ± 0%        109MB ± 0%    ~     (p=0.690 n=5+5)
Compiler          457MB ± 0%        457MB ± 0%    ~     (p=0.841 n=5+5)
SSA              1.10GB ± 0%       1.10GB ± 0%  +0.03%  (p=0.032 n=5+5)
Flate            24.4MB ± 0%       24.5MB ± 0%    ~     (p=0.690 n=5+5)
GoParser         30.9MB ± 0%       30.9MB ± 0%    ~     (p=0.421 n=5+5)
Reflect          73.3MB ± 0%       73.3MB ± 0%    ~     (p=1.000 n=5+5)
Tar              25.5MB ± 0%       25.5MB ± 0%    ~     (p=0.095 n=5+5)
XML              40.8MB ± 0%       40.9MB ± 0%    ~     (p=0.056 n=5+5)
[Geo mean]       71.6MB            71.6MB       +0.01%

name        old allocs/op     new allocs/op     delta
Template           395k ± 0%         394k ± 1%    ~     (p=1.000 n=5+5)
Unicode            344k ± 0%         344k ± 0%    ~     (p=0.690 n=5+5)
GoTypes           1.16M ± 0%        1.16M ± 0%    ~     (p=0.421 n=5+5)
Compiler          4.41M ± 0%        4.41M ± 0%    ~     (p=0.841 n=5+5)
SSA               9.79M ± 0%        9.79M ± 0%    ~     (p=0.310 n=5+5)
Flate              237k ± 0%         237k ± 0%    ~     (p=0.841 n=5+5)
GoParser           321k ± 0%         321k ± 1%    ~     (p=0.421 n=5+5)
Reflect            956k ± 0%         956k ± 0%    ~     (p=1.000 n=5+5)
Tar                251k ± 1%         252k ± 0%    ~     (p=0.095 n=5+5)
XML                399k ± 0%         400k ± 0%    ~     (p=0.222 n=5+5)
[Geo mean]         741k              741k       +0.03%

name        old object-bytes  new object-bytes  delta
Template           386k ± 0%         386k ± 0%  +0.05%  (p=0.008 n=5+5)
Unicode            202k ± 0%         202k ± 0%  +0.02%  (p=0.008 n=5+5)
GoTypes           1.16M ± 0%        1.16M ± 0%  +0.07%  (p=0.008 n=5+5)
Compiler          3.91M ± 0%        3.91M ± 0%  +0.05%  (p=0.008 n=5+5)
SSA               7.86M ± 0%        7.87M ± 0%  +0.07%  (p=0.008 n=5+5)
Flate              227k ± 0%         227k ± 0%  +0.10%  (p=0.008 n=5+5)
GoParser           283k ± 0%         283k ± 0%  +0.04%  (p=0.008 n=5+5)
Reflect            950k ± 0%         951k ± 0%  +0.04%  (p=0.008 n=5+5)
Tar                187k ± 0%         187k ± 0%  -0.03%  (p=0.008 n=5+5)
XML                406k ± 0%         406k ± 0%  +0.04%  (p=0.008 n=5+5)
[Geo mean]         647k              647k       +0.04%

Change-Id: I2015aa26338b90cf41e47f89564e336dc02608df
Reviewed-on: https://go-review.googlesource.com/43293
Reviewed-by: Keith Randall <khr@golang.org>
2017-05-16 13:46:10 +00:00
Keith Randall
4fa050024f cmd/compile: enable constant-time CFG editing
Provide indexes along with block pointers for Preds
and Succs arrays.  This allows us to splice edges in
and out of those arrays in constant time.

Fixes worst-case O(n^2) behavior in deadcode and fuse.

benchmark                     old ns/op      new ns/op     delta
BenchmarkFuse1-8              2065           2057          -0.39%
BenchmarkFuse10-8             9408           9073          -3.56%
BenchmarkFuse100-8            105238         76277         -27.52%
BenchmarkFuse1000-8           3982562        1026750       -74.22%
BenchmarkFuse10000-8          301220329      12824005      -95.74%
BenchmarkDeadCode1-8          1588           1566          -1.39%
BenchmarkDeadCode10-8         4333           4250          -1.92%
BenchmarkDeadCode100-8        32031          32574         +1.70%
BenchmarkDeadCode1000-8       590407         468275        -20.69%
BenchmarkDeadCode10000-8      17822890       5000818       -71.94%
BenchmarkDeadCode100000-8     1388706640     78021127      -94.38%
BenchmarkDeadCode200000-8     5372518479     168598762     -96.86%

Change-Id: Iccabdbb9343fd1c921ba07bbf673330a1c36ee17
Reviewed-on: https://go-review.googlesource.com/22589
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-05-05 15:58:59 +00:00
Dave Cheney
b9feb91f32 cmd/compile: minor cleanups
Some minor scoping cleanups found by a very old version of grind.

Change-Id: I1d373817586445fc87e38305929097b652696fdd
Reviewed-on: https://go-review.googlesource.com/21064
Run-TryBot: Dave Cheney <dave@cheney.net>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-03-24 11:18:04 +00:00
Todd Neal
f962f33035 [dev.ssa] cmd/compile: reuse sparse sets across compiler passes
Cache sparse sets in the function so they can be reused by subsequent
compiler passes.

benchmark                        old ns/op     new ns/op     delta
BenchmarkDSEPass-8               206945        180022        -13.01%
BenchmarkDSEPassBlock-8          5286103       2614054       -50.55%
BenchmarkCSEPass-8               1790277       1790655       +0.02%
BenchmarkCSEPassBlock-8          18083588      18112771      +0.16%
BenchmarkDeadcodePass-8          59837         41375         -30.85%
BenchmarkDeadcodePassBlock-8     1651575       511169        -69.05%
BenchmarkMultiPass-8             531529        427506        -19.57%
BenchmarkMultiPassBlock-8        7033496       4487814       -36.19%

benchmark                        old allocs     new allocs     delta
BenchmarkDSEPass-8               11             4              -63.64%
BenchmarkDSEPassBlock-8          599            120            -79.97%
BenchmarkCSEPass-8               18             18             +0.00%
BenchmarkCSEPassBlock-8          2700           2700           +0.00%
BenchmarkDeadcodePass-8          4              3              -25.00%
BenchmarkDeadcodePassBlock-8     30             9              -70.00%
BenchmarkMultiPass-8             24             20             -16.67%
BenchmarkMultiPassBlock-8        1800           1000           -44.44%

benchmark                        old bytes     new bytes     delta
BenchmarkDSEPass-8               221367        142           -99.94%
BenchmarkDSEPassBlock-8          3695207       3846          -99.90%
BenchmarkCSEPass-8               303328        303328        +0.00%
BenchmarkCSEPassBlock-8          5006400       5006400       +0.00%
BenchmarkDeadcodePass-8          84232         10506         -87.53%
BenchmarkDeadcodePassBlock-8     1274940       163680        -87.16%
BenchmarkMultiPass-8             608674        313834        -48.44%
BenchmarkMultiPassBlock-8        9906001       5003450       -49.49%

Change-Id: Ib1fa58c7f494b374d1a4bb9cffbc2c48377b59d3
Reviewed-on: https://go-review.googlesource.com/19100
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
2016-01-30 13:57:39 +00:00
Josh Bleecher Snyder
bbf8c5ce2f [dev.ssa] cmd/compile: initial implementation of likely direction
Change-Id: Id8457b18c07bf717d13c9423d8f314f253eee64f
Reviewed-on: https://go-review.googlesource.com/13580
Reviewed-by: Keith Randall <khr@golang.org>
2015-08-12 22:03:08 +00:00
Josh Bleecher Snyder
37ddc270ca [dev.ssa] cmd/compile/ssa: add -f suffix to logging methods
Requested in CL 11380.

Change-Id: Icf0d23fb8d383c76272401e363cc9b2169d11403
Reviewed-on: https://go-review.googlesource.com/11450
Reviewed-by: Alan Donovan <adonovan@google.com>
2015-06-24 21:48:26 +00:00
Josh Bleecher Snyder
8c6abfeacb [dev.ssa] cmd/compile/ssa: separate logging, work in progress, and fatal errors
The SSA implementation logs for three purposes:

	* debug logging
	* fatal errors
	* unimplemented features

Separating these three uses lets us attempt an SSA
implementation for all functions, not just
_ssa functions. This turns the entire standard
library into a compilation test, and makes it
easy to figure out things like
"how much coverage does SSA have now" and
"what should we do next to get more coverage?".

Functions called _ssa are still special.
They log profusely by default and
the output of the SSA implementation
is used. For all other functions,
logging is off, and the implementation
is built and discarded, due to lack of
support for the runtime.

While we're here, fix a few minor bugs and
add some extra Unimplementeds to allow
all.bash to pass.

As of now, SSA handles 20.79% of the functions
in the standard library (689 of 3314).

The top missing features are:

 10.03%  2597 SSA unimplemented: zero for type error not implemented
  7.79%  2016 SSA unimplemented: addr: bad op DOTPTR
  7.33%  1898 SSA unimplemented: unhandled expr EQ
  6.10%  1579 SSA unimplemented: unhandled expr OROR
  4.91%  1271 SSA unimplemented: unhandled expr NE
  4.49%  1163 SSA unimplemented: unhandled expr LROT
  4.00%  1036 SSA unimplemented: unhandled expr LEN
  3.56%   923 SSA unimplemented: unhandled stmt CALLFUNC
  2.37%   615 SSA unimplemented: zero for type []byte not implemented
  1.90%   492 SSA unimplemented: unhandled stmt CALLMETH
  1.74%   450 SSA unimplemented: unhandled expr CALLINTER
  1.74%   450 SSA unimplemented: unhandled expr DOT
  1.71%   444 SSA unimplemented: unhandled expr ANDAND
  1.65%   426 SSA unimplemented: unhandled expr CLOSUREVAR
  1.54%   400 SSA unimplemented: unhandled expr CALLMETH
  1.51%   390 SSA unimplemented: unhandled stmt SWITCH
  1.47%   380 SSA unimplemented: unhandled expr CONV
  1.33%   345 SSA unimplemented: addr: bad op *
  1.30%   336 SSA unimplemented: unhandled OLITERAL 6

Change-Id: I4ca07951e276714dc13c31de28640aead17a1be7
Reviewed-on: https://go-review.googlesource.com/11160
Reviewed-by: Keith Randall <khr@golang.org>
2015-06-21 02:56:36 +00:00
Keith Randall
067e8dfd82 [dev.ssa] Merge remote-tracking branch 'origin/master' into mergebranch
Semi-regular merge of tip to dev.ssa.

Complicated a bit by the move of cmd/internal/* to cmd/compile/internal/*.

Change-Id: I1c66d3c29bb95cce4a53c5a3476373aa5245303d
2015-05-28 13:51:18 -07:00