Stowage/go - Remotebranch.eu

mirror of https://github.com/golang/go.git synced 2025-12-08 06:10:04 +00:00

Author	SHA1	Message	Date
Josh Bleecher Snyder	c9ccdf1f8c	cmd/compile: make deadcode pass cheaper

The deadcode pass runs a lot. I'd like it to run even more.

This change adds dedicated storage for deadcode to ssa.Cache. In addition to being a nice win now, it makes deadcode easier to add other places in the future.

name        old time/op       new time/op       delta
Template    210ms ± 3%        209ms ± 2%        ~        (p=0.951 n=93+95)
Unicode     92.2ms ± 3%       93.0ms ± 3%       +0.87%   (p=0.000 n=94+94)
GoTypes     739ms ± 2%        733ms ± 2%        -0.84%   (p=0.000 n=92+94)
Compiler    3.51s ± 2%        3.49s ± 2%        -0.57%   (p=0.000 n=94+91)
SSA         9.80s ± 2%        9.75s ± 2%        -0.57%   (p=0.000 n=95+92)
Flate       132ms ± 2%        132ms ± 3%        ~        (p=0.165 n=94+98)
GoParser    160ms ± 3%        159ms ± 3%        -0.42%   (p=0.005 n=96+94)
Reflect     446ms ± 4%        442ms ± 4%        -0.91%   (p=0.000 n=95+98)
Tar         186ms ± 3%        186ms ± 2%        ~        (p=0.221 n=94+97)
XML         252ms ± 2%        250ms ± 2%        -0.55%   (p=0.000 n=95+94)
[Geo mean]  430ms             429ms             -0.34%

name        old user-time/op  new user-time/op  delta
Template    256ms ± 3%        257ms ± 3%        ~        (p=0.521 n=94+98)
Unicode     120ms ± 9%        121ms ± 9%        ~        (p=0.074 n=99+100)
GoTypes     935ms ± 3%        935ms ± 2%        ~        (p=0.574 n=82+96)
Compiler    4.56s ± 1%        4.55s ± 2%        ~        (p=0.247 n=88+90)
SSA         13.6s ± 2%        13.6s ± 1%        ~        (p=0.277 n=94+95)
Flate       155ms ± 3%        156ms ± 3%        ~        (p=0.181 n=95+100)
GoParser    193ms ± 8%        184ms ± 6%        -4.39%   (p=0.000 n=100+89)
Reflect     549ms ± 3%        552ms ± 3%        +0.45%   (p=0.036 n=94+96)
Tar         230ms ± 4%        230ms ± 4%        ~        (p=0.670 n=97+99)
XML         315ms ± 5%        309ms ±12%        -2.05%   (p=0.000 n=99+99)
[Geo mean]  540ms             538ms             -0.47%

name        old alloc/op      new alloc/op      delta
Template    40.3MB ± 0%       38.9MB ± 0%       -3.36%   (p=0.008 n=5+5)
Unicode     28.6MB ± 0%       28.4MB ± 0%       -0.90%   (p=0.008 n=5+5)
GoTypes     137MB ± 0%        132MB ± 0%        -3.65%   (p=0.008 n=5+5)
Compiler    637MB ± 0%        609MB ± 0%        -4.40%   (p=0.008 n=5+5)
SSA         2.19GB ± 0%       2.07GB ± 0%       -5.63%   (p=0.008 n=5+5)
Flate       25.0MB ± 0%       24.1MB ± 0%       -3.80%   (p=0.008 n=5+5)
GoParser    30.0MB ± 0%       29.1MB ± 0%       -3.17%   (p=0.008 n=5+5)
Reflect     87.1MB ± 0%       84.4MB ± 0%       -3.05%   (p=0.008 n=5+5)
Tar         37.3MB ± 0%       36.0MB ± 0%       -3.31%   (p=0.008 n=5+5)
XML         49.8MB ± 0%       48.0MB ± 0%       -3.69%   (p=0.008 n=5+5)
[Geo mean]  87.6MB            84.6MB            -3.50%

name        old allocs/op     new allocs/op     delta
Template    387k ± 0%         380k ± 0%         -1.76%   (p=0.008 n=5+5)
Unicode     342k ± 0%         341k ± 0%         -0.31%   (p=0.008 n=5+5)
GoTypes     1.39M ± 0%        1.37M ± 0%        -1.64%   (p=0.008 n=5+5)
Compiler    5.68M ± 0%        5.60M ± 0%        -1.41%   (p=0.008 n=5+5)
SSA         17.1M ± 0%        16.8M ± 0%        -1.49%   (p=0.008 n=5+5)
Flate       240k ± 0%         236k ± 0%         -1.99%   (p=0.008 n=5+5)
GoParser    309k ± 0%         304k ± 0%         -1.57%   (p=0.008 n=5+5)
Reflect     1.01M ± 0%        0.99M ± 0%        -2.69%   (p=0.008 n=5+5)
Tar         360k ± 0%         353k ± 0%         -1.91%   (p=0.008 n=5+5)
XML         447k ± 0%         441k ± 0%         -1.26%   (p=0.008 n=5+5)
[Geo mean]  858k              844k              -1.60%

Fixes #15306
Change-Id: I9f558adb911efddead3865542fe2ca71f66fe1da
Reviewed-on: https://go-review.googlesource.com/c/go/+/166718
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>	2019-03-11 21:20:01 +00:00
Yury Smolsky	3068fcfa0d	cmd/compile: add control flow graphs to ssa.html This CL adds CFGs to ssa.html. It execs dot to generate SVG, which then gets inlined into the html. Some standard naming and javascript hacks enable integration with the rest of ssa.html. Clicking on blocks highlights the relevant part of the CFG, and vice versa. Sample output and screenshots can be seen in #28177. CFGs can be turned on with the suffix mask: :* - dump CFG for every phase :lower - just the lower phase :lower-layout - lower through layout :w,x-y - phases w and x through y Calling dot after every pass is noticeably slow, instead use the range of phases. Dead blocks are not displayed on CFG. User can zoom and pan individual CFG when the automatic adjustment has failed. Dot-related errors are reported without bringing down the process. Fixes #28177 Change-Id: Id52c42d86c4559ca737288aa10561b67a119c63d Reviewed-on: https://go-review.googlesource.com/c/142517 Run-TryBot: Yury Smolsky <yury@smolsky.by> Reviewed-by: David Chase <drchase@google.com>	2018-11-21 10:22:43 +00:00
Brad Fitzpatrick	3813edf26e	all: use "reports whether" consistently in the few places that didn't Go documentation style for boolean funcs is to say: // Foo reports whether ... func Foo() bool (rather than "returns true if") This CL also replaces 4 uses of "iff" with the same "reports whether" wording, which doesn't lose any meaning, and will prevent people from sending typo fixes when they don't realize it's "if and only if". In the past I think we've had the typo CLs updated to just say "reports whether". So do them all at once. (Inspired by the addition of another "returns true if" in CL 146938 in fd_plan9.go) Created with: $ perl -i -npe 's/returns true if/reports whether/' $(git grep -l "returns true iff" | grep -v vendor) $ perl -i -npe 's/returns true if/reports whether/' $(git grep -l "returns true if" | grep -v vendor) Change-Id: Ided502237f5ab0d25cb625dbab12529c361a8b9f Reviewed-on: https://go-review.googlesource.com/c/147037 Reviewed-by: Ian Lance Taylor <iant@golang.org>	2018-11-02 22:47:58 +00:00
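The doc-comment wording described in the commit above can be illustrated with a small sketch. IsEven is a hypothetical function invented here for illustration; it does not come from the Go tree.

```go
package main

import "fmt"

// IsEven reports whether n is divisible by two.
//
// IsEven is a hypothetical helper; only the "reports whether" phrasing
// of the comment is the point of this example.
func IsEven(n int) bool {
	return n%2 == 0
}

func main() {
	fmt.Println(IsEven(4), IsEven(7))
}
```

The comment starts with the function name and states the condition, rather than saying "returns true if".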
David Chase	69c5830c2b	cmd/compile: repair display of values & blocks in prog column This restores the printing of vXX and bYY in the left-hand edge of the last column of ssa.html, where the generated progs appear. Change-Id: I81ab9b2fa5ae28e6e5de1b77665cfbed8d14e000 Reviewed-on: https://go-review.googlesource.com/c/141277 Run-TryBot: David Chase <drchase@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Yury Smolsky <yury@smolsky.by>	2018-10-11 15:29:00 +00:00
David Chase	0029cd479e	cmd/compile: add LocalAddr that takes SP,mem operands Lack of a well-defined order between VarDef and related address operations sometimes causes problems with store order and write barrier transformations; glitches in the order are made irreparable (by later optimizations) if the two parts of the glitch straddle a split in the original block caused by insertion of a write barrier diamond. Fix this by creating a LocalAddr for addresses of locals (what VarDef matters for) that takes a memory input to help make the order explicit. Addr is modified to only be legal for SB operand, so there is no overlap between Addr and LocalAddr uses (there may be some downstream cleanup from this). Changes to generic.rules and rewrite.go ensure that codegen tests continue to pass; CSE of LocalAddr is impaired, not quite sure of the cost. Fixes #26105. Change-Id: Id4192b4440aa4e9d7ba54a465c456df9b530b515 Reviewed-on: https://go-review.googlesource.com/122483 Run-TryBot: David Chase <drchase@google.com> Reviewed-by: Keith Randall <khr@golang.org>	2018-07-12 18:45:31 +00:00
Austin Clements	a367f44c18	cmd/compile: enable stack maps everywhere except unsafe points This modifies issafepoint in liveness analysis to report almost every operation as a safe point. There are four things we don't mark as safe-points: 1. Runtime code (other than at calls). 2. go:nosplit functions (other than at calls). 3. Instructions between the load of the write barrier-enabled flag and the write. 4. Instructions leading up to a uintptr -> unsafe.Pointer conversion. We'll optimize this in later CLs: name old time/op new time/op delta Template 185ms ± 2% 190ms ± 2% +2.95% (p=0.000 n=10+10) Unicode 96.3ms ± 3% 96.4ms ± 1% ~ (p=0.905 n=10+9) GoTypes 658ms ± 0% 669ms ± 1% +1.72% (p=0.000 n=10+9) Compiler 3.14s ± 1% 3.18s ± 1% +1.56% (p=0.000 n=9+10) SSA 7.41s ± 2% 7.59s ± 1% +2.48% (p=0.000 n=9+10) Flate 126ms ± 1% 128ms ± 1% +2.08% (p=0.000 n=10+10) GoParser 153ms ± 1% 157ms ± 2% +2.38% (p=0.000 n=10+10) Reflect 437ms ± 1% 442ms ± 1% +0.98% (p=0.001 n=10+10) Tar 178ms ± 1% 179ms ± 1% +0.67% (p=0.035 n=10+9) XML 223ms ± 1% 229ms ± 1% +2.58% (p=0.000 n=10+10) [Geo mean] 394ms 401ms +1.75% No effect on binary size because we're not yet emitting these extra safe points. For #24543. Change-Id: I16a1eebb9183cad7cef9d53c0fd21a973cad6859 Reviewed-on: https://go-review.googlesource.com/109348 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>	2018-05-22 14:43:37 +00:00
Giovanni Bajo	3c8545c5f6	cmd/compile: reduce allocations in prove by reusing posets In prove, reuse posets between different functions by storing them in the per-worker cache. Allocation count regression caused by prove improvements is down from 5% to 3% after this CL. Updates #25179 Change-Id: I6d14003109833d9b3ef5165fdea00aa9c9e952e8 Reviewed-on: https://go-review.googlesource.com/110455 Run-TryBot: Giovanni Bajo <rasky@develer.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>	2018-05-14 14:44:55 +00:00
David Chase	c2c1822b12	cmd/compile: assign and preserve statement boundaries. A new pass run after ssa building (before any other optimization) identifies the "first" ssa node for each statement. Other "noise" nodes are tagged as being never appropriate for a statement boundary (e.g., VarKill, VarDef, Phi). Rewrite, deadcode, cse, and nilcheck are modified to move the statement boundaries forward whenever possible if a boundary-tagged ssa value is removed; never-boundary nodes are ignored in this search (some operations involving constants are also tagged as never-boundary and also ignored because they are likely to be moved or removed during optimization). Code generation treats all nodes except those explicitly marked as statement boundaries as "not statement" nodes, and floats statement boundaries to the beginning of each same-line run of instructions found within a basic block. Line number html conversion was modified to make statement boundary nodes a bit more obvious by prepending a "+". The code in fuse.go that glued together the value slices of two blocks produced a result that depended on the former capacities (not lengths) of the two slices. This causes differences in the 386 bootstrap, and also can sometimes put values into an order that does a worse job of preserving statement boundaries when values are removed. Portions of two delve tests that had caught problems were incorporated into ssa/debug_test.go. There are some opportunities to do better with optimized code, but the next-ing is not lying or overly jumpy. Over 4 CLs, compilebench geomean measured binary size increase of 3.5% and compile user time increase of 3.8% (this is after optimization to reuse a sparse map instead of creating multiple maps.) This CL worsens the optimized-debugging experience with Delve; we need to work with the delve team so that they can use the is_stmt marks that we're emitting now. 
The reference output changes from time to time depending on other changes in the compiler, sometimes better, sometimes worse. This CL now includes a test ensuring that 99+% of the lines in the Go command itself (a handy optimized binary) include is_stmt markers. Change-Id: I359c94e06843f1eb41f9da437bd614885aa9644a Reviewed-on: https://go-review.googlesource.com/102435 Run-TryBot: David Chase <drchase@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2018-05-14 14:09:49 +00:00
Daniel Martí	14393c5cd4	cmd: remove a few more unused parameters ssa's pos parameter on the Const* funcs is unused, so remove it. ld's alloc parameter on elfnote is always true, so remove the arguments and simplify the code. Finally, arm's addpltreloc never has its return parameter used, so remove it. Change-Id: I63387ecf6ab7b5f7c20df36be823322bb98427b8 Reviewed-on: https://go-review.googlesource.com/104456 Run-TryBot: Daniel Martí <mvdan@mvdan.cc> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2018-04-09 17:10:25 +00:00
Daniel Martí	cd2cb6e3f5	cmd/compile: cache sparse maps across ssa passes This is done for sparse sets already, but it was missing for sparse maps. Only affects deadstore and regalloc, as they're the only ones that use sparse maps. name old time/op new time/op delta DSEPass-4 247µs ± 0% 216µs ± 0% -12.75% (p=0.008 n=5+5) DSEPassBlock-4 3.05ms ± 1% 2.87ms ± 1% -6.02% (p=0.002 n=6+6) CSEPass-4 2.30ms ± 0% 2.32ms ± 0% +0.53% (p=0.026 n=6+6) CSEPassBlock-4 23.8ms ± 0% 23.8ms ± 0% ~ (p=0.931 n=6+5) DeadcodePass-4 51.7µs ± 1% 51.5µs ± 2% ~ (p=0.429 n=5+6) DeadcodePassBlock-4 734µs ± 1% 742µs ± 3% ~ (p=0.394 n=6+6) MultiPass-4 152µs ± 0% 149µs ± 2% ~ (p=0.082 n=5+6) MultiPassBlock-4 2.67ms ± 1% 2.41ms ± 2% -9.77% (p=0.008 n=5+5) name old alloc/op new alloc/op delta DSEPass-4 41.2kB ± 0% 0.1kB ± 0% -99.68% (p=0.002 n=6+6) DSEPassBlock-4 560kB ± 0% 4kB ± 0% -99.34% (p=0.026 n=5+6) CSEPass-4 189kB ± 0% 189kB ± 0% ~ (all equal) CSEPassBlock-4 3.10MB ± 0% 3.10MB ± 0% ~ (p=0.444 n=5+5) DeadcodePass-4 10.5kB ± 0% 10.5kB ± 0% ~ (all equal) DeadcodePassBlock-4 164kB ± 0% 164kB ± 0% ~ (all equal) MultiPass-4 240kB ± 0% 199kB ± 0% -17.06% (p=0.002 n=6+6) MultiPassBlock-4 3.60MB ± 0% 2.99MB ± 0% -17.06% (p=0.002 n=6+6) name old allocs/op new allocs/op delta DSEPass-4 8.00 ± 0% 4.00 ± 0% -50.00% (p=0.002 n=6+6) DSEPassBlock-4 240 ± 0% 120 ± 0% -50.00% (p=0.002 n=6+6) CSEPass-4 9.00 ± 0% 9.00 ± 0% ~ (all equal) CSEPassBlock-4 1.35k ± 0% 1.35k ± 0% ~ (all equal) DeadcodePass-4 3.00 ± 0% 3.00 ± 0% ~ (all equal) DeadcodePassBlock-4 9.00 ± 0% 9.00 ± 0% ~ (all equal) MultiPass-4 11.0 ± 0% 10.0 ± 0% -9.09% (p=0.002 n=6+6) MultiPassBlock-4 165 ± 0% 150 ± 0% -9.09% (p=0.002 n=6+6) Change-Id: I43860687c88f33605eb1415f36473c5cfe8fde4a Reviewed-on: https://go-review.googlesource.com/98449 Run-TryBot: Daniel Martí <mvdan@mvdan.cc> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>	2018-03-15 17:24:39 +00:00
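A minimal sketch of the idea in the commit above: keep sparse maps in a cache so that later passes reuse the backing storage instead of allocating again. The types and method names (sparseMap, mapCache, get, put) are assumptions made for this illustration and are not the compiler's actual API.

```go
package main

import "fmt"

// entry pairs a key with its value in the dense slice.
type entry struct{ key, val int }

// sparseMap maps keys in [0, n) to ints and can be cleared in O(1).
type sparseMap struct {
	dense  []entry
	sparse []int // sparse[k] is the index of k in dense, if present
}

func newSparseMap(n int) *sparseMap { return &sparseMap{sparse: make([]int, n)} }

func (m *sparseMap) size() int { return len(m.sparse) }

func (m *sparseMap) contains(k int) bool {
	i := m.sparse[k]
	return i < len(m.dense) && m.dense[i].key == k
}

func (m *sparseMap) set(k, v int) {
	if m.contains(k) {
		m.dense[m.sparse[k]].val = v
		return
	}
	m.dense = append(m.dense, entry{k, v})
	m.sparse[k] = len(m.dense) - 1
}

// clear forgets all entries but keeps the allocated storage.
func (m *sparseMap) clear() { m.dense = m.dense[:0] }

// mapCache (hypothetical) holds maps released by earlier passes.
type mapCache struct{ free []*sparseMap }

// get returns a cleared cached map that is large enough, or a new one.
func (c *mapCache) get(n int) *sparseMap {
	for i, m := range c.free {
		if m.size() >= n {
			c.free[i] = c.free[len(c.free)-1]
			c.free = c.free[:len(c.free)-1]
			m.clear()
			return m
		}
	}
	return newSparseMap(n)
}

// put hands a map back to the cache for the next pass.
func (c *mapCache) put(m *sparseMap) { c.free = append(c.free, m) }

func main() {
	c := &mapCache{}
	m := c.get(16)
	m.set(3, 42)
	c.put(m)
	reused := c.get(8)
	fmt.Println(reused == m, reused.contains(3))
}
```

A pass would call get at its start and put when it finishes, so the second pass that needs a map of similar size pays no allocation.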
Austin Clements	491f409a32	cmd/compile: minor comment improvements/corrections Change-Id: Ie0934f1528d58d4971cdef726d3e2d23cf3935d3 Reviewed-on: https://go-review.googlesource.com/87475 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Alexandru Moșoi <alexandru@mosoi.ro>	2018-03-08 22:25:21 +00:00
Keith Randall	4b00d3f4a2	cmd/compile: implement comparisons directly with memory Allow the compiler to generate code like CMPQ 16(AX), $7 It's tricky because it's difficult to spill such a comparison during flagalloc, because the same memory state might not be available at the restore locations. Solve this problem by decomposing the compare+load back into its parts if it needs to be spilled. The big win is that the write barrier test goes from: MOVL runtime.writeBarrier(SB), CX TESTL CX, CX JNE 60 to CMPL runtime.writeBarrier(SB), $0 JNE 59 It's one instruction and one byte smaller. Fixes #19485 Fixes #15245 Update #22460 Binaries are about 0.15% smaller. Change-Id: I4fd8d1111b6b9924d52f9a0901ca1b2e5cce0836 Reviewed-on: https://go-review.googlesource.com/86035 Reviewed-by: Cherry Zhang <cherryyz@google.com> Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>	2018-02-26 23:49:44 +00:00
David Chase	f22cf7131a	cmd/compile: use src.NoXPos for entry-block constants The ssa backend is aggressive about placing constants and certain other values in the Entry block. It's implausible that the original line numbers for these constants makes any sort of sense when it appears to a user stepping in a debugger, and they're also not that useful in dumps since entry-block instructions tend to be constants (i.e., unlikely to be the cause of a crash). Therefore, use src.NoXPos for any values that are explicitly inserted into a function's entry block. Passes all tests, including ssa/debug_test.go with both gdb and a fairly recent dlv. Hand-verified that it solves the reported problem; constructed a test that reproduced a problem, and fixed it. Modified test harness to allow injection of slightly more interesting inputs. Fixes #22558. Change-Id: I4476927067846bc4366da7793d2375c111694c55 Reviewed-on: https://go-review.googlesource.com/81215 Run-TryBot: David Chase <drchase@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2017-12-01 07:09:54 +00:00
Austin Clements	afbe646ab4	cmd/compile: report typedslicecopy write barriers Most write barrier calls are inserted by SSA, but copy and append are lowered to runtime.typedslicecopy during walk. Fix these to set Func.WBPos and emit the "write barrier" warning, as done for the write barriers inserted by SSA. As part of this, we refactor setting WBPos and emitting this warning into the frontend so it can be shared by both walk and SSA. Change-Id: I5fe9997d9bdb55e03e01dd58aee28908c35f606b Reviewed-on: https://go-review.googlesource.com/73411 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2017-10-29 20:21:43 +00:00
Keith Randall	770d8d8207	cmd/compile: free value earlier in nilcheck When we remove a nil check, add it back to the free Value pool immediately. Fixes #18732 Change-Id: I8d644faabbfb52157d3f2d071150ff0342ac28dc Reviewed-on: https://go-review.googlesource.com/58810 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>	2017-08-25 06:01:26 +00:00
Josh Bleecher Snyder	46b88c9fbc	cmd/compile: change ssa.Type into types.Type When package ssa was created, Type was in package gc. To avoid circular dependencies, we used an interface (ssa.Type) to represent type information in SSA. In the Go 1.9 cycle, gri extricated the Type type from package gc. As a result, we can now use it in package ssa. Now, instead of package types depending on package ssa, it is the other way. This is a more sensible dependency tree, and helps compiler performance a bit. Though this is a big CL, most of the changes are mechanical and uninteresting. Interesting bits: Add new singleton globals to package types for the special SSA types Memory, Void, Invalid, Flags, and Int128. * Add two new Types, TSSA for the special types, and TTUPLE, for SSA tuple types. ssa.MakeTuple is now types.NewTuple. * Move type comparison result constants CMPlt, CMPeq, and CMPgt to package types. * We had picked the name "types" in our rules for the handy list of types provided by ssa.Config. That conflicted with the types package name, so change it to "typ". * Update the type comparison routine to handle tuples and special types inline. * Teach gc/fmt.go how to print special types. * We can now eliminate ElemTypes in favor of just Elem, and probably also some other duplicated Type methods designed to return ssa.Type instead of types.Type. The ssa tests were using their own dummy types, and they were not particularly careful about types in general. Of necessity, this CL switches them to use *types.Type; it does not make them more type-accurate. Unfortunately, using types.Type means initializing a bit of the types universe. This is prime for refactoring and improvement. This shrinks ssa.Value; it now fits in a smaller size class on 64 bit systems. This doesn't have a giant impact, though, since most Values are preallocated in a chunk. 
name old alloc/op new alloc/op delta Template 37.9MB ± 0% 37.7MB ± 0% -0.57% (p=0.000 n=10+8) Unicode 28.9MB ± 0% 28.7MB ± 0% -0.52% (p=0.000 n=10+10) GoTypes 110MB ± 0% 109MB ± 0% -0.88% (p=0.000 n=10+10) Flate 24.7MB ± 0% 24.6MB ± 0% -0.66% (p=0.000 n=10+10) GoParser 31.1MB ± 0% 30.9MB ± 0% -0.61% (p=0.000 n=10+9) Reflect 73.9MB ± 0% 73.4MB ± 0% -0.62% (p=0.000 n=10+8) Tar 25.8MB ± 0% 25.6MB ± 0% -0.77% (p=0.000 n=9+10) XML 41.2MB ± 0% 40.9MB ± 0% -0.80% (p=0.000 n=10+10) [Geo mean] 40.5MB 40.3MB -0.68% name old allocs/op new allocs/op delta Template 385k ± 0% 386k ± 0% ~ (p=0.356 n=10+9) Unicode 343k ± 1% 344k ± 0% ~ (p=0.481 n=10+10) GoTypes 1.16M ± 0% 1.16M ± 0% -0.16% (p=0.004 n=10+10) Flate 238k ± 1% 238k ± 1% ~ (p=0.853 n=10+10) GoParser 320k ± 0% 320k ± 0% ~ (p=0.720 n=10+9) Reflect 957k ± 0% 957k ± 0% ~ (p=0.460 n=10+8) Tar 252k ± 0% 252k ± 0% ~ (p=0.133 n=9+10) XML 400k ± 0% 400k ± 0% ~ (p=0.796 n=10+10) [Geo mean] 428k 428k -0.01% Removing all the interface calls helps non-trivially with CPU, though. 
name old time/op new time/op delta Template 178ms ± 4% 173ms ± 3% -2.90% (p=0.000 n=94+96) Unicode 85.0ms ± 4% 83.9ms ± 4% -1.23% (p=0.000 n=96+96) GoTypes 543ms ± 3% 528ms ± 3% -2.73% (p=0.000 n=98+96) Flate 116ms ± 3% 113ms ± 4% -2.34% (p=0.000 n=96+99) GoParser 144ms ± 3% 140ms ± 4% -2.80% (p=0.000 n=99+97) Reflect 344ms ± 3% 334ms ± 4% -3.02% (p=0.000 n=100+99) Tar 106ms ± 5% 103ms ± 4% -3.30% (p=0.000 n=98+94) XML 198ms ± 5% 192ms ± 4% -2.88% (p=0.000 n=92+95) [Geo mean] 178ms 173ms -2.65% name old user-time/op new user-time/op delta Template 229ms ± 5% 224ms ± 5% -2.36% (p=0.000 n=95+99) Unicode 107ms ± 6% 106ms ± 5% -1.13% (p=0.001 n=93+95) GoTypes 696ms ± 4% 679ms ± 4% -2.45% (p=0.000 n=97+99) Flate 137ms ± 4% 134ms ± 5% -2.66% (p=0.000 n=99+96) GoParser 176ms ± 5% 172ms ± 8% -2.27% (p=0.000 n=98+100) Reflect 430ms ± 6% 411ms ± 5% -4.46% (p=0.000 n=100+92) Tar 128ms ±13% 123ms ±13% -4.21% (p=0.000 n=100+100) XML 239ms ± 6% 233ms ± 6% -2.50% (p=0.000 n=95+97) [Geo mean] 220ms 213ms -2.76% Change-Id: I15c7d6268347f8358e75066dfdbd77db24e8d0c1 Reviewed-on: https://go-review.googlesource.com/42145 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2017-05-09 23:01:51 +00:00
Josh Bleecher Snyder	4ee934ad27	cmd/compile: remove references to *os.File from ssa package This reduces the size of the ssa export data by 10%, from 76154 to 67886. It doesn't appear that #20084, which would do this automatically, is going to be fixed soon. Do it manually for now. This speeds up compiling cmd/compile/internal/amd64 and presumably its comrades as well: name old time/op new time/op delta CompileAMD64 89.6ms ± 6% 86.7ms ± 5% -3.29% (p=0.000 n=49+47) name old user-time/op new user-time/op delta CompileAMD64 116ms ± 5% 112ms ± 5% -3.51% (p=0.000 n=45+42) name old alloc/op new alloc/op delta CompileAMD64 26.7MB ± 0% 25.8MB ± 0% -3.26% (p=0.008 n=5+5) name old allocs/op new allocs/op delta CompileAMD64 223k ± 0% 213k ± 0% -4.46% (p=0.008 n=5+5) Updates #20084 Change-Id: I49e8951c5bfce63ad2b7f4fc3bfa0868c53114f9 Reviewed-on: https://go-review.googlesource.com/41493 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2017-04-24 23:58:14 +00:00
Josh Bleecher Snyder	0323895cc0	cmd/compile: catch and report nowritebarrier violations later Prior to this CL, the SSA backend reported violations of the //go:nowritebarrier annotation immediately. This necessitated emitting errors during SSA compilation, which is not compatible with a concurrent backend. Instead, check for such violations later. We already save the data required to do a late check for violations of the //go:nowritebarrierrec annotation. Use the same data, and check //go:nowritebarrier at the same time. One downside to doing this is that now only a single violation will be reported per function. Given that this is for the runtime only, and violations are rare, this seems an acceptable cost. While we are here, remove several 'nerrors != 0' checks that are rendered pointless. Updates #15756 Fixes #19250 (as much as it ever can be) Change-Id: Ia44c4ad5b6fd6f804d9f88d9571cec8d23665cb3 Reviewed-on: https://go-review.googlesource.com/38973 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2017-03-31 16:31:20 +00:00
Josh Bleecher Snyder	b3a8beb9d1	cmd/compile: minor cleanup in debug code Change-Id: I9885606801b9c8fcb62c16d0856025c4e83e658b Reviewed-on: https://go-review.googlesource.com/38650 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-03-24 22:21:55 +00:00
Matthew Dempsky	325904fe6a	cmd/compile: port liveness analysis to SSA Passes toolstash-check -all. Change-Id: I92c3c25d6c053f971f346f4fa3bbc76419b58183 Reviewed-on: https://go-review.googlesource.com/38087 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2017-03-20 22:58:50 +00:00
Josh Bleecher Snyder	2cdb7f118a	cmd/compile: move Frontend field from ssa.Config to ssa.Func Suggested by mdempsky in CL 38232. This allows us to use the Frontend field to associate frontend state and information with a function. See the following CL in the series for examples. This is a giant CL, but it is almost entirely routine refactoring. The ssa test API is starting to feel a bit unwieldy. I will clean it up separately, once the dust has settled. Passes toolstash -cmp. Updates #15756 Change-Id: I71c573bd96ff7251935fce1391b06b1f133c3caf Reviewed-on: https://go-review.googlesource.com/38327 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2017-03-17 23:18:57 +00:00
Josh Bleecher Snyder	88e47187c1	cmd/compile: relocate code from config.go to func.go This is a follow-up to CL 38167. Pure code movement. Change-Id: I13e58f7eac6718c77076d89e13fc721a5205ec57 Reviewed-on: https://go-review.googlesource.com/38322 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2017-03-17 05:21:53 +00:00
Josh Bleecher Snyder	a5e3cac895	cmd/compile: rearrange fields between ssa.Func, ssa.Cache, and ssa.Config This makes ssa.Func, ssa.Cache, and ssa.Config fulfill the roles laid out for them in CL 38160. The only non-trivial change in this CL is how cached values and blocks get IDs. Prior to this CL, their IDs were assigned as part of resetting the cache, and only modified IDs were reset. This required knowing how many values and blocks were modified, which required a tight coupling between ssa.Func and ssa.Config. To eliminate that coupling, we now zero values and blocks during reset, and assign their IDs when they are used. Since unused values and blocks have ID == 0, we can efficiently find the last used value/block, to avoid zeroing everything. Bulk zeroing is efficient, but not efficient enough to obviate the need to avoid zeroing everything every time. As a happy side-effect, ssa.Func.Free is no longer necessary. DebugHashMatch and friends now belong in func.go. They have been left in place for clarity and review. I will move them in a subsequent CL. Passes toolstash -cmp. No compiler performance impact. No change in 'go test cmd/compile/internal/ssa' execution time. Change-Id: I2eb7af58da067ef6a36e815a6f386cfe8634d098 Reviewed-on: https://go-review.googlesource.com/38167 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2017-03-17 05:21:42 +00:00
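The ID scheme described in the commit above (zero on reset, assign IDs on use, find the last used entry by looking for ID == 0) could be sketched as follows. The type and field names are hypothetical and the cache is far smaller than a real one; this is an illustration under those assumptions, not the compiler's code.

```go
package main

import (
	"fmt"
	"sort"
)

// value stands in for an SSA value; ID == 0 means "never used".
type value struct {
	ID int
	Op string
}

// cache holds preallocated values that are reused between functions.
type cache struct {
	values [8]value
}

// reset zeroes only the used prefix of the cache and returns its length.
// Used values always form a prefix, so a binary search for the first
// entry with ID == 0 finds where to stop.
func (c *cache) reset() int {
	n := sort.Search(len(c.values), func(i int) bool { return c.values[i].ID == 0 })
	for i := range c.values[:n] {
		c.values[i] = value{}
	}
	return n
}

// fn stands in for a function being compiled with a borrowed cache.
type fn struct {
	cache  *cache
	used   int // cached values handed out so far
	nextID int // last ID assigned; real IDs start at 1
}

// newValue takes the next cached value if one is left and assigns its ID now.
func (f *fn) newValue(op string) *value {
	var v *value
	if f.used < len(f.cache.values) {
		v = &f.cache.values[f.used]
		f.used++
	} else {
		v = new(value)
	}
	f.nextID++
	v.ID = f.nextID
	v.Op = op
	return v
}

func main() {
	c := &cache{}
	f := &fn{cache: c}
	f.newValue("Add")
	f.newValue("Mul")
	fmt.Println("zeroed on reset:", c.reset())
}
```

Because the cache no longer needs to be told how many values were modified, it does not depend on any bookkeeping inside fn.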
Cherry Zhang	c8f38b3398	cmd/compile: use type information in Aux for Store size Remove size AuxInt in Store, and alignment in Move/Zero. We still pass size AuxInt to Move/Zero, as it is used for partial Move/Zero lowering (e.g. cmd/compile/internal/ssa/gen/386.rules:288). SizeAndAlign is gone. Passes "toolstash -cmp" on std. Change-Id: I1ca34652b65dd30de886940e789fcf41d521475d Reviewed-on: https://go-review.googlesource.com/38150 Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2017-03-16 14:25:04 +00:00
Cherry Zhang	1b85300602	cmd/compile: clean up SSA-building code Now that the write barrier insertion is moved to SSA, the SSA building code can be simplified. Updates #17583. Change-Id: I5cacc034b11aa90b0abe6f8dd97e4e3994e2bc25 Reviewed-on: https://go-review.googlesource.com/36840 Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2017-03-16 14:24:40 +00:00
Josh Bleecher Snyder	43afcb5c96	cmd/compile: define roles for ssa.Func, ssa.Config, and ssa.Cache The line between ssa.Func and ssa.Config has blurred. Concurrent compilation in the backend will require more precision. This CL lays out an (aspirational) organization. The implementation will come in follow-up CLs, once the organization is settled. ssa.Config holds basic compiler configuration, mostly arch-specific information. It is configured once, early on, and is readonly, so it is safe for concurrent use. ssa.Func is a single-shot object used for compiling a single Func. It is not concurrency-safe and not re-usable. ssa.Cache is a multi-use object used to avoid expensive allocations during compilation. Each ssa.Func is given an ssa.Cache to use. ssa.Cache is not concurrency-safe. Change-Id: Id02809b6f3541541cac6c27bbb598834888ce1cc Reviewed-on: https://go-review.googlesource.com/38160 Reviewed-by: Keith Randall <khr@golang.org>	2017-03-15 04:27:49 +00:00
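The division of roles laid out in the commit above might look roughly like this in a concurrent setting: one read-only Config shared by all workers, one Cache per worker, and a fresh Func per compiled function. Everything below (field names, compile, compileAll) is a hypothetical sketch, not the real ssa package.

```go
package main

import (
	"fmt"
	"sync"
)

// Config is set up once and only read afterwards, so sharing it is safe.
type Config struct{ PtrSize int }

// Cache is reusable scratch storage; each worker owns exactly one.
type Cache struct{ scratch []int }

// Func is single-shot state for compiling one function.
type Func struct {
	Name   string
	Config *Config
	Cache  *Cache
}

// compile is a stand-in for the backend: it reuses the cache's scratch
// slice and returns how many entries it produced.
func compile(f *Func) int {
	f.Cache.scratch = f.Cache.scratch[:0]
	for i := range f.Name {
		f.Cache.scratch = append(f.Cache.scratch, i*f.Config.PtrSize)
	}
	return len(f.Cache.scratch)
}

// compileAll compiles names using the given number of workers.
func compileAll(cfg *Config, names []string, workers int) map[string]int {
	jobs := make(chan string)
	results := make(map[string]int)
	var mu sync.Mutex
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			cache := &Cache{} // never shared between goroutines
			for name := range jobs {
				f := &Func{Name: name, Config: cfg, Cache: cache}
				n := compile(f)
				mu.Lock()
				results[name] = n
				mu.Unlock()
			}
		}()
	}
	for _, name := range names {
		jobs <- name
	}
	close(jobs)
	wg.Wait()
	return results
}

func main() {
	res := compileAll(&Config{PtrSize: 8}, []string{"a", "bb", "ccc"}, 2)
	fmt.Println(len(res), res["bb"])
}
```

The only synchronization needed is around the shared results map; Config needs none because it is read-only, and Cache needs none because it never crosses goroutines.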
David Chase	886e9e6065	cmd/compile: put spills in better places

Previously we always issued a spill right after the op that was being spilled. This CL pushes spills farther away from the generator, hopefully pushing them into unlikely branches. For example:

  x = ...
  if unlikely {
    call ...
  }
  ... use x ...

Used to compile to

  x = ...
  spill x
  if unlikely {
    call ...
    restore x
  }

It now compiles to

  x = ...
  if unlikely {
    spill x
    call ...
    restore x
  }

This is particularly useful for code which appends, as the only call is an unlikely call to growslice. It also helps for the spills needed around write barrier calls.

The basic algorithm is to walk down the dominator tree following a path where the block still dominates all of the restores. We're looking for a block that:
 1) dominates all restores
 2) has the value being spilled in a register
 3) has a loop depth no deeper than the value being spilled

The walking-down code is iterative. I was forced to limit it to searching 100 blocks so it doesn't become O(n^2). Maybe one day we'll find a better way.

I had to delete most of David's code which pushed spills out of loops. I suspect this CL subsumes most of the cases that his code handled.

Generally positive performance improvements, but hard to tell for sure with all the noise. (compilebench times are unchanged.)
name old time/op new time/op delta BinaryTree17-12 2.91s ±15% 2.80s ±12% ~ (p=0.063 n=10+10) Fannkuch11-12 3.47s ± 0% 3.30s ± 4% -4.91% (p=0.000 n=9+10) FmtFprintfEmpty-12 48.0ns ± 1% 47.4ns ± 1% -1.32% (p=0.002 n=9+9) FmtFprintfString-12 85.6ns ±11% 79.4ns ± 3% -7.27% (p=0.005 n=10+10) FmtFprintfInt-12 91.8ns ±10% 85.9ns ± 4% ~ (p=0.203 n=10+9) FmtFprintfIntInt-12 135ns ±13% 127ns ± 1% -5.72% (p=0.025 n=10+9) FmtFprintfPrefixedInt-12 167ns ± 1% 168ns ± 2% ~ (p=0.580 n=9+10) FmtFprintfFloat-12 249ns ±11% 230ns ± 1% -7.32% (p=0.000 n=10+10) FmtManyArgs-12 504ns ± 7% 506ns ± 1% ~ (p=0.198 n=9+9) GobDecode-12 6.95ms ± 1% 7.04ms ± 1% +1.37% (p=0.001 n=10+10) GobEncode-12 6.32ms ±13% 6.04ms ± 1% ~ (p=0.063 n=10+10) Gzip-12 233ms ± 1% 235ms ± 0% +1.01% (p=0.000 n=10+9) Gunzip-12 40.1ms ± 1% 39.6ms ± 0% -1.12% (p=0.000 n=10+8) HTTPClientServer-12 227µs ± 9% 221µs ± 5% ~ (p=0.114 n=9+8) JSONEncode-12 16.1ms ± 2% 15.8ms ± 1% -2.09% (p=0.002 n=9+8) JSONDecode-12 61.8ms ±11% 57.9ms ± 1% -6.30% (p=0.000 n=10+9) Mandelbrot200-12 4.30ms ± 3% 4.28ms ± 1% ~ (p=0.203 n=10+8) GoParse-12 3.18ms ± 2% 3.18ms ± 2% ~ (p=0.579 n=10+10) RegexpMatchEasy0_32-12 76.7ns ± 1% 77.5ns ± 1% +0.92% (p=0.002 n=9+8) RegexpMatchEasy0_1K-12 239ns ± 3% 239ns ± 1% ~ (p=0.204 n=10+10) RegexpMatchEasy1_32-12 71.4ns ± 1% 70.6ns ± 0% -1.15% (p=0.000 n=10+9) RegexpMatchEasy1_1K-12 383ns ± 2% 390ns ±10% ~ (p=0.181 n=8+9) RegexpMatchMedium_32-12 114ns ± 0% 113ns ± 1% -0.88% (p=0.000 n=9+8) RegexpMatchMedium_1K-12 36.3µs ± 1% 36.8µs ± 1% +1.59% (p=0.000 n=10+8) RegexpMatchHard_32-12 1.90µs ± 1% 1.90µs ± 1% ~ (p=0.341 n=10+10) RegexpMatchHard_1K-12 59.4µs ±11% 57.8µs ± 1% ~ (p=0.968 n=10+9) Revcomp-12 461ms ± 1% 462ms ± 1% ~ (p=1.000 n=9+9) Template-12 67.5ms ± 1% 66.3ms ± 1% -1.77% (p=0.000 n=10+8) TimeParse-12 314ns ± 3% 309ns ± 0% -1.56% (p=0.000 n=9+8) TimeFormat-12 340ns ± 2% 331ns ± 1% -2.79% (p=0.000 n=10+10) The go binary is 0.2% larger. Not really sure why the size would change. 
Change-Id: Ia5116e53a3aeb025ef350ffc51c14ae5cc17871c Reviewed-on: https://go-review.googlesource.com/34822 Reviewed-by: David Chase <drchase@google.com>	2017-03-15 02:09:25 +00:00
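The walk described in the commit above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the compiler's regalloc code: the Block type, the HasReg flag, and the placeSpill, number, and dominates helpers are all hypothetical names invented for illustration. Only the three conditions and the 100-block limit come from the commit message.

```go
package main

import "fmt"

// Block is a toy dominator-tree node (hypothetical, not ssa.Block).
type Block struct {
	Name      string
	Children  []*Block // children in the dominator tree
	LoopDepth int
	HasReg    bool // toy flag: the spilled value is in a register here

	entry, exit int // DFS numbering used for O(1) dominance queries
}

// number assigns entry/exit numbers by walking the dominator tree.
func number(b *Block, n *int) {
	*n++
	b.entry = *n
	for _, c := range b.Children {
		number(c, n)
	}
	*n++
	b.exit = *n
}

// dominates reports whether a dominates b (a block dominates itself).
func dominates(a, b *Block) bool {
	return a.entry <= b.entry && b.exit <= a.exit
}

func dominatesAll(a *Block, bs []*Block) bool {
	for _, b := range bs {
		if !dominates(a, b) {
			return false
		}
	}
	return true
}

// maxWalk mirrors the 100-block search limit mentioned in the commit.
const maxWalk = 100

// placeSpill walks down the dominator tree from def while the current
// block still dominates every restore, and returns the deepest block
// visited that also has the value in a register and is no deeper in
// loops than def.
func placeSpill(def *Block, restores []*Block) *Block {
	best := def
	if len(restores) == 0 {
		return best
	}
	cur := def
	for steps := 0; steps < maxWalk; steps++ {
		var next *Block
		for _, c := range cur.Children {
			if dominatesAll(c, restores) {
				next = c
				break
			}
		}
		if next == nil {
			break
		}
		cur = next
		if cur.HasReg && cur.LoopDepth <= def.LoopDepth {
			best = cur
		}
	}
	return best
}

func main() {
	// entry defines x; the unlikely branch calls and restores x.
	unlikely := &Block{Name: "unlikely", HasReg: true}
	join := &Block{Name: "join", HasReg: true}
	entry := &Block{Name: "entry", HasReg: true, Children: []*Block{unlikely, join}}
	n := 0
	number(entry, &n)

	fmt.Println(placeSpill(entry, []*Block{unlikely}).Name)
	fmt.Println(placeSpill(entry, []*Block{unlikely, join}).Name)
}
```

With a single restore inside the unlikely block, the spill sinks into that block. With restores in both the branch and the join block, no child dominates them all, so the spill stays next to the definition.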
Josh Bleecher Snyder	3dcfce8d19	cmd/compile: add OpOffPtr [c] SP to constant cache They accounted for almost 30% of all CSE'd values. By never creating the duplicates in the first place, we reduce the high water mark of Value IDs, which in turn makes all SSA phases cheaper, particularly regalloc. name old time/op new time/op delta Template 200ms ± 3% 198ms ± 4% -0.87% (p=0.016 n=50+49) Unicode 86.9ms ± 2% 85.5ms ± 3% -1.56% (p=0.000 n=49+50) GoTypes 553ms ± 4% 551ms ± 4% ~ (p=0.183 n=50+49) SSA 3.97s ± 3% 3.93s ± 2% -1.06% (p=0.000 n=48+48) Flate 124ms ± 4% 124ms ± 3% ~ (p=0.545 n=48+50) GoParser 146ms ± 4% 146ms ± 4% ~ (p=0.810 n=49+49) Reflect 357ms ± 3% 355ms ± 3% -0.59% (p=0.049 n=50+48) Tar 106ms ± 4% 107ms ± 5% ~ (p=0.454 n=49+50) XML 203ms ± 4% 203ms ± 4% ~ (p=0.726 n=48+50) name old user-ns/op new user-ns/op delta Template 237M ± 3% 235M ± 4% ~ (p=0.208 n=47+48) Unicode 111M ± 4% 108M ± 9% -2.50% (p=0.000 n=47+50) GoTypes 736M ± 5% 729M ± 4% -0.95% (p=0.017 n=50+46) SSA 5.73G ± 4% 5.74G ± 4% ~ (p=0.765 n=50+50) Flate 150M ± 5% 148M ± 6% -0.89% (p=0.045 n=48+47) GoParser 180M ± 5% 178M ± 7% -1.34% (p=0.012 n=50+50) Reflect 450M ± 4% 444M ± 4% -1.40% (p=0.000 n=50+49) Tar 124M ± 7% 123M ± 7% ~ (p=0.092 n=50+50) XML 248M ± 6% 245M ± 5% ~ (p=0.057 n=50+50) name old alloc/op new alloc/op delta Template 39.4MB ± 0% 39.3MB ± 0% -0.37% (p=0.000 n=50+50) Unicode 30.9MB ± 0% 30.9MB ± 0% -0.27% (p=0.000 n=48+50) GoTypes 114MB ± 0% 113MB ± 0% -1.03% (p=0.000 n=50+49) SSA 882MB ± 0% 865MB ± 0% -1.95% (p=0.000 n=49+49) Flate 25.8MB ± 0% 25.7MB ± 0% -0.21% (p=0.000 n=50+50) GoParser 31.7MB ± 0% 31.6MB ± 0% -0.33% (p=0.000 n=50+50) Reflect 79.7MB ± 0% 79.3MB ± 0% -0.49% (p=0.000 n=44+49) Tar 27.2MB ± 0% 27.1MB ± 0% -0.31% (p=0.000 n=50+50) XML 42.7MB ± 0% 42.3MB ± 0% -1.05% (p=0.000 n=48+49) name old allocs/op new allocs/op delta Template 379k ± 1% 380k ± 1% +0.26% (p=0.000 n=50+50) Unicode 324k ± 1% 324k ± 1% ~ (p=0.964 n=49+50) GoTypes 1.14M ± 0% 1.15M ± 0% +0.14% 
(p=0.000 n=50+49) SSA 7.89M ± 0% 7.89M ± 0% -0.05% (p=0.000 n=49+49) Flate 240k ± 1% 241k ± 1% +0.27% (p=0.001 n=50+50) GoParser 310k ± 1% 311k ± 1% +0.48% (p=0.000 n=50+49) Reflect 1.00M ± 0% 1.00M ± 0% +0.17% (p=0.000 n=48+50) Tar 254k ± 1% 255k ± 1% +0.23% (p=0.005 n=50+50) XML 395k ± 1% 395k ± 1% +0.19% (p=0.002 n=49+47) Change-Id: Iaa8f5f37e23bd81983409f7359f9dcd4dfe2961f Reviewed-on: https://go-review.googlesource.com/38003 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2017-03-10 16:50:58 +00:00
Josh Bleecher Snyder	d11a2184fb	cmd/compile: allow earlier GC of freed constant value Minor fix, because it's the right thing to do. No significant impact. Change-Id: I2138285d397494daa9a88c414149c2a7860edd7e Reviewed-on: https://go-review.googlesource.com/38001 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2017-03-10 01:39:09 +00:00
Josh Bleecher Snyder	c63ad970f6	cmd/compile: rename Func.constVal arg for clarity Values have an Aux and an AuxInt. We're setting AuxInt, not Aux. Say so. Change-Id: I41aa783273bb7e1ba47c941aa4233f818e37dadd Reviewed-on: https://go-review.googlesource.com/37997 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2017-03-09 23:39:01 +00:00
Josh Bleecher Snyder	f791b288d1	cmd/compile: remove some allocs from CSE Pick up a few pennies: * CSE gets run twice for each function, but the set of Aux values doesn't change. Avoid populating it twice. * Don't bother populating auxmap for values that can't be CSE'd anyway. name old alloc/op new alloc/op delta Template 41.0MB ± 0% 40.7MB ± 0% -0.61% (p=0.008 n=5+5) Unicode 32.3MB ± 0% 32.3MB ± 0% -0.22% (p=0.008 n=5+5) GoTypes 122MB ± 0% 121MB ± 0% -0.55% (p=0.008 n=5+5) Compiler 482MB ± 0% 479MB ± 0% -0.58% (p=0.008 n=5+5) SSA 865MB ± 0% 862MB ± 0% -0.35% (p=0.008 n=5+5) Flate 26.5MB ± 0% 26.5MB ± 0% ~ (p=0.056 n=5+5) GoParser 32.6MB ± 0% 32.4MB ± 0% -0.58% (p=0.008 n=5+5) Reflect 84.2MB ± 0% 83.8MB ± 0% -0.57% (p=0.008 n=5+5) Tar 27.7MB ± 0% 27.6MB ± 0% -0.37% (p=0.008 n=5+5) XML 44.7MB ± 0% 44.5MB ± 0% -0.53% (p=0.008 n=5+5) name old allocs/op new allocs/op delta Template 373k ± 0% 373k ± 1% ~ (p=1.000 n=5+5) Unicode 326k ± 0% 325k ± 0% ~ (p=0.548 n=5+5) GoTypes 1.16M ± 0% 1.16M ± 0% ~ (p=0.841 n=5+5) Compiler 4.16M ± 0% 4.15M ± 0% ~ (p=0.222 n=5+5) SSA 7.57M ± 0% 7.56M ± 0% -0.22% (p=0.008 n=5+5) Flate 238k ± 1% 239k ± 1% ~ (p=0.690 n=5+5) GoParser 304k ± 0% 304k ± 0% ~ (p=1.000 n=5+5) Reflect 1.01M ± 0% 1.00M ± 0% -0.31% (p=0.016 n=4+5) Tar 245k ± 0% 245k ± 1% ~ (p=0.548 n=5+5) XML 393k ± 0% 391k ± 1% ~ (p=0.095 n=5+5) Change-Id: I78f1ffe129bd8fd590b7511717dd2bf9f5ecbd6d Reviewed-on: https://go-review.googlesource.com/36690 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Daniel Martí <mvdan@mvdan.cc> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2017-02-09 20:42:46 +00:00
Matthew Dempsky	5c90e1cf8a	cmd/compile/internal/ssa: remove Func.StaticData field Rather than collecting static data nodes to be written out later, just write them out immediately. Change-Id: I51708b690e94bc3e288b4d6ba3307bf738a80f64 Reviewed-on: https://go-review.googlesource.com/36352 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>	2017-02-04 01:09:26 +00:00
Josh Bleecher Snyder	57546d67ec	cmd/compile: add reusable []Location to ssa.Config name old time/op new time/op delta Template 218ms ± 3% 214ms ± 3% -1.70% (p=0.000 n=30+30) Unicode 100ms ± 3% 100ms ± 4% ~ (p=0.614 n=29+30) GoTypes 657ms ± 1% 660ms ± 3% +0.46% (p=0.046 n=29+30) Compiler 2.80s ± 2% 2.80s ± 1% ~ (p=0.451 n=28+29) Flate 131ms ± 2% 132ms ± 4% ~ (p=1.000 n=29+29) GoParser 159ms ± 3% 160ms ± 5% ~ (p=0.341 n=28+30) Reflect 406ms ± 3% 408ms ± 4% ~ (p=0.511 n=28+30) Tar 118ms ± 4% 118ms ± 4% ~ (p=0.827 n=29+30) XML 222ms ± 6% 222ms ± 3% ~ (p=0.532 n=30+30) name old user-ns/op new user-ns/op delta Template 274user-ms ± 3% 272user-ms ± 3% -0.87% (p=0.015 n=29+30) Unicode 140user-ms ± 4% 140user-ms ± 3% ~ (p=0.735 n=29+30) GoTypes 890user-ms ± 1% 897user-ms ± 2% +0.88% (p=0.002 n=29+30) Compiler 3.88user-s ± 2% 3.89user-s ± 1% ~ (p=0.132 n=30+29) Flate 168user-ms ± 2% 157user-ms ± 4% -6.21% (p=0.000 n=25+28) GoParser 211user-ms ± 2% 213user-ms ± 5% ~ (p=0.086 n=28+30) Reflect 539user-ms ± 2% 541user-ms ± 3% ~ (p=0.267 n=27+29) Tar 156user-ms ± 7% 155user-ms ± 5% ~ (p=0.708 n=30+30) XML 291user-ms ± 5% 294user-ms ± 3% +0.83% (p=0.029 n=29+30) name old alloc/op new alloc/op delta Template 40.7MB ± 0% 39.4MB ± 0% -3.26% (p=0.000 n=29+26) Unicode 30.8MB ± 0% 30.7MB ± 0% -0.40% (p=0.000 n=28+30) GoTypes 123MB ± 0% 119MB ± 0% -3.47% (p=0.000 n=30+29) Compiler 472MB ± 0% 455MB ± 0% -3.60% (p=0.000 n=30+30) Flate 26.5MB ± 0% 25.6MB ± 0% -3.21% (p=0.000 n=28+30) GoParser 32.3MB ± 0% 31.4MB ± 0% -2.98% (p=0.000 n=29+30) Reflect 84.4MB ± 0% 82.1MB ± 0% -2.83% (p=0.000 n=30+30) Tar 27.3MB ± 0% 26.5MB ± 0% -2.70% (p=0.000 n=29+29) XML 44.6MB ± 0% 43.1MB ± 0% -3.49% (p=0.000 n=30+30) name old allocs/op new allocs/op delta Template 401k ± 1% 399k ± 0% -0.35% (p=0.000 n=30+28) Unicode 331k ± 0% 331k ± 1% ~ (p=0.907 n=28+30) GoTypes 1.24M ± 0% 1.23M ± 0% -0.43% (p=0.000 n=30+30) Compiler 4.26M ± 0% 4.25M ± 0% -0.34% (p=0.000 n=29+30) Flate 252k ± 1% 251k ± 1% -0.41% (p=0.000 
n=30+30) GoParser 325k ± 1% 324k ± 1% -0.31% (p=0.000 n=27+30) Reflect 1.06M ± 0% 1.05M ± 0% -0.69% (p=0.000 n=30+30) Tar 266k ± 1% 265k ± 1% -0.51% (p=0.000 n=29+30) XML 416k ± 1% 415k ± 1% -0.36% (p=0.002 n=30+30) Change-Id: I8f784001324df83b2764c44f0e83a540e5beab34 Reviewed-on: https://go-review.googlesource.com/36212 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> Reviewed-by: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-02-02 22:39:32 +00:00
Russ Cox	47ce87877b	all: merge dev.inline into master Change-Id: I7715581a04e513dcda9918e853fa6b1ddc703770	2017-02-01 09:47:23 -05:00
Robert Griesemer	472c792e0a	[dev.inline] cmd/internal/src: introduce compact source position representation XPos is a compact (8 instead of 16 bytes on a 64bit machine) source position representation. There is a 1:1 correspondence between each XPos and each regular Pos, translated via a global table. In some sense this brings back the LineHist, though positions can track line and column information; there is a O(1) translation between the representations (no binary search), and the translation is factored out. The size increase with the prior change is brought down again and the compiler speed is in line with the master repo (measured on the same "quiet" machine as for prior change): name old time/op new time/op delta Template 256ms ± 1% 262ms ± 2% ~ (p=0.063 n=5+4) Unicode 132ms ± 1% 135ms ± 2% ~ (p=0.063 n=5+4) GoTypes 891ms ± 1% 871ms ± 1% -2.28% (p=0.016 n=5+4) Compiler 3.84s ± 2% 3.89s ± 2% ~ (p=0.413 n=5+4) MakeBash 47.1s ± 1% 46.2s ± 2% ~ (p=0.095 n=5+5) name old user-ns/op new user-ns/op delta Template 309M ± 1% 314M ± 2% ~ (p=0.111 n=5+4) Unicode 165M ± 1% 172M ± 9% ~ (p=0.151 n=5+5) GoTypes 1.14G ± 2% 1.12G ± 1% ~ (p=0.063 n=5+4) Compiler 5.00G ± 1% 4.96G ± 1% ~ (p=0.286 n=5+4) Change-Id: Icc570cc60ab014d8d9af6976f1f961ab8828cc47 Reviewed-on: https://go-review.googlesource.com/34506 Run-TryBot: Robert Griesemer <gri@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com> Reviewed-by: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-01-09 22:43:22 +00:00
David Chase	7f1ff65c39	cmd/compile: insert scheduling checks on loop backedges Loop breaking with a counter. Benchmarked (see comments), eyeball checked for sanity on popular loops. This code ought to handle loops in general, and properly inserts phi functions in cases where the earlier version might not have. Includes test, plus modifications to test/run.go to deal with timeout and killing looping test. Tests broken by the addition of extra code (branch frequency and live vars) for added checks turn the check insertion off. If GOEXPERIMENT=preemptibleloops, the compiler inserts reschedule checks on every backedge of every reducible loop. Alternately, specifying GO_GCFLAGS=-d=ssa/insert_resched_checks/on will enable it for a single compilation, but because the core Go libraries contain some loops that may run long, this is less likely to have the desired effect. This is intended as a tool to help in the study and diagnosis of GC and other latency problems, now that goal STW GC latency is on the order of 100 microseconds or less. Updates #17831. Updates #10958. Change-Id: I6206c163a5b0248e3f21eb4fc65f73a179e1f639 Reviewed-on: https://go-review.googlesource.com/33910 Run-TryBot: David Chase <drchase@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2017-01-09 21:01:29 +00:00
Robert Griesemer	c10499b539	[dev.inline] cmd/compile/internal/ssa: another round of renames from line -> pos (cleanup) Mostly mechanical renames. Make variable names consistent with use. Change-Id: Iaa89d31deab11eca6e784595b58e779ad525c8a3 Reviewed-on: https://go-review.googlesource.com/34146 Run-TryBot: Robert Griesemer <gri@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2016-12-08 23:10:30 +00:00
Robert Griesemer	cfd17f51c8	[dev.inline] cmd/compile/internal/ssa: rename various fields from Line to Pos This is a mostly mechanical rename followed by manual fixes where necessary. Change-Id: Ie5c670b133db978f15dc03e50dc2da0c80fc8842 Reviewed-on: https://go-review.googlesource.com/34137 Reviewed-by: David Lazar <lazard@golang.org>	2016-12-08 21:36:52 +00:00
Robert Griesemer	24597c080b	[dev.inline] cmd/compile: introduce cmd/internal/src.Pos type for line numbers This is a step toward choosing a different position representation. By introducing an explicit type, it will be easier to make the transition step-wise while ensuring everything keeps running. This has been reviewed via https://go-review.googlesource.com/#/c/34025/. Change-Id: Ibceddcd62d8f346321ac3250e3940e9c436ed684 Reviewed-on: https://go-review.googlesource.com/34132 Run-TryBot: Robert Griesemer <gri@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Lazar <lazard@golang.org>	2016-12-08 21:26:25 +00:00
David Chase	a190f3c8a3	cmd/compile: enable flag-specified dump of specific phase+function For very large input files, use of GOSSAFUNC to obtain a dump after compilation steps can lead to both unwieldy large output files and unwieldy larger processes (because the output is buffered in a string). This flag -d=ssa/<phase>/dump:<function name> provides finer control of what is dumped, into a smaller file, and with less memory overhead in the running compiler. The special phase name "build" is added to allow printing of the just-built ssa before any transformations are applied. This was helpful in making sense of the gogo/protobuf problems. The output format was tweaked to remove gratuitous spaces, and a crude -d=ssa/help help text was added. Change-Id: If7516e22203420eb6ed3614f7cee44cb9260f43e Reviewed-on: https://go-review.googlesource.com/23044 Run-TryBot: David Chase <drchase@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2016-10-20 22:23:56 +00:00
Hajime Hoshi	c5368123fe	cmd/compile: remove redundant function idom Change-Id: Ib14b5421bb5e407bbd4d3cbfc68c92d3dd257cb1 Reviewed-on: https://go-review.googlesource.com/30732 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2016-10-11 16:43:12 +00:00
Keith Randall	5a6e511c61	cmd/compile: Use Sreedhar+Gao phi building algorithm Should be more asymptotically happy. We process each variable in turn to find all the locations where it needs a phi (the dominance frontier of all of its definitions). Then we add all those phis. This takes O(n * #variables), although hopefully much less. Then we do a single tree walk to match all the FwdRefs with the nearest definition or phi. This takes O(n) time. The one remaining inefficiency is that we might end up introducing a bunch of dead phis in the first step. A TODO is to introduce phis only where they might be used by a read. The old algorithm is still faster on small functions, so there's a cutover size (currently 500 blocks). This algorithm supersedes David's sparse phi placement algorithm for large functions. Lowers compile time of example from #14934 from ~10 sec to ~4 sec. Lowers compile time of example from #16361 from ~4.5 sec to ~3 sec. Lowers #16407 from ~20 min to ~30 sec. Update #14934 Update #16361 Fixes #16407 Change-Id: I1cff6364e1623c143190b6a924d7599e309db58f Reviewed-on: https://go-review.googlesource.com/30163 Reviewed-by: David Chase <drchase@google.com>	2016-10-03 20:30:08 +00:00
Keith Randall	75ce89c20d	cmd/compile: cache CFG-dependent computations We compute a lot of stuff based off the CFG: postorder traversal, dominators, dominator tree, loop nest. Multiple phases use this information and we end up recomputing some of it. Add a cache for this information so if the CFG hasn't changed, we can reuse the previous computation. Change-Id: I9b5b58af06830bd120afbee9cfab395a0a2f74b2 Reviewed-on: https://go-review.googlesource.com/29356 Reviewed-by: David Chase <drchase@google.com>	2016-09-19 16:00:13 +00:00
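The caching pattern in the commit above can be sketched like this. It is a minimal illustration assuming a toy Func type whose CFG is an adjacency list; the field and method names (succs, Postorder, invalidateCFG, computations) are hypothetical and are not claimed to match the compiler's own.

```go
package main

import "fmt"

// Func is a toy function whose CFG is an adjacency list.
// Block 0 is the entry block.
type Func struct {
	succs [][]int

	cachedPostorder []int
	postorderValid  bool
	computations    int // counts real recomputations, for illustration
}

// Postorder returns the cached traversal, recomputing it only if the
// CFG changed since the last call.
func (f *Func) Postorder() []int {
	if !f.postorderValid {
		f.cachedPostorder = computePostorder(f.succs)
		f.postorderValid = true
		f.computations++
	}
	return f.cachedPostorder
}

// invalidateCFG must be called by anything that edits the CFG.
func (f *Func) invalidateCFG() {
	f.postorderValid = false
	f.cachedPostorder = nil
}

// AddEdge edits the CFG and drops the cached results.
func (f *Func) AddEdge(from, to int) {
	f.succs[from] = append(f.succs[from], to)
	f.invalidateCFG()
}

// computePostorder does a depth-first walk from the entry block.
func computePostorder(succs [][]int) []int {
	seen := make([]bool, len(succs))
	var order []int
	var visit func(b int)
	visit = func(b int) {
		seen[b] = true
		for _, s := range succs[b] {
			if !seen[s] {
				visit(s)
			}
		}
		order = append(order, b)
	}
	if len(succs) > 0 {
		visit(0)
	}
	return order
}

func main() {
	f := &Func{succs: [][]int{{1, 2}, {3}, {3}, {}}}
	fmt.Println(f.Postorder(), f.computations)
	fmt.Println(f.Postorder(), f.computations) // served from the cache
	f.AddEdge(2, 1)                            // CFG edit invalidates the cache
	fmt.Println(f.Postorder(), f.computations)
}
```

The same validity flag could guard dominators, the dominator tree, and the loop nest. The sketch covers only the postorder to stay short.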
Keith Randall	167e381f40	cmd/compile: make ssa compilation unconditional Rip out the code that allows SSA to be used conditionally. No longer exists: ssa=0 flag GOSSAHASH GOSSAPKG SSATEST GOSSAFUNC now only controls the printing of the IR/html. Still need to rip out all of the old backend. It should no longer be callable after this CL. Update #16357 Change-Id: Ib30cc18fba6ca52232c41689ba610b0a94aa74f5 Reviewed-on: https://go-review.googlesource.com/29155 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>	2016-09-14 17:38:04 +00:00
Keith Randall	84aac622a4	cmd/compile: intrinsify the rest of runtime/internal/atomic for amd64 Atomic swap, add/and/or, compare and swap. Also works on amd64p32. Change-Id: Idf2d8f3e1255f71deba759e6e75e293afe4ab2ba Reviewed-on: https://go-review.googlesource.com/27813 Reviewed-by: Cherry Zhang <cherryyz@google.com>	2016-08-28 16:31:08 +00:00
David Chase	6b99fb5bea	cmd/compile: use sparse algorithm for phis in large program This adds a sparse method for locating nearest ancestors in a dominator tree, and checks blocks with more than one predecessor for differences and inserts phi functions where there are. Uses reversed post order to cut number of passes, running it from first def to last use ("last use" for paramout and mem is end-of-program; last use for a phi input from a backedge is the source of the back edge) Includes a cutover from old algorithm to new to avoid paying large constant factor for small programs. This keeps normal builds running at about the same time, while not running over-long on large machine-generated inputs. Add "phase" flags for ssa/build -- ssa/build/stats prints number of blocks, values (before and after linking references and inserting phis, so expansion can be measured), and their product; the product governs the cutover, where a good value seems to be somewhere between 1 and 5 million. Among the files compiled by make.bash, this is the shape of the tail of the distribution for #blocks, #vars, and their product: #blocks #vars product max 6171 28180 173,898,780 99.9% 1641 6548 10,401,878 99% 463 1909 873,721 95% 152 639 95,235 90% 84 359 30,021 The old algorithm is indeed usually fastest, for 99%ile values of usually. The fix to LookupVarOutgoing ( https://go-review.googlesource.com/#/c/22790/ ) deals with some of the same problems addressed by this CL, but on at least one bug ( #15537 ) this change is still a significant help. With this CL: /tmp/gopath$ rm -rf pkg bin /tmp/gopath$ time go get -v -gcflags -memprofile=y.mprof \ github.com/gogo/protobuf/test/theproto3/combos/... ... real 4m35.200s user 13m16.644s sys 0m36.712s and pprof reports 3.4GB allocated in one of the larger profiles With tip: /tmp/gopath$ rm -rf pkg bin /tmp/gopath$ time go get -v -gcflags -memprofile=y.mprof \ github.com/gogo/protobuf/test/theproto3/combos/... ... 
real 10m36.569s user 25m52.286s sys 4m3.696s and pprof reports 8.3GB allocated in the same larger profile With this CL, most of the compilation time on the benchmarked input is spent in register/stack allocation (cumulative 53%) and in the sparse lookup algorithm itself (cumulative 20%). Fixes #15537. Change-Id: Ia0299dda6a291534d8b08e5f9883216ded677a00 Reviewed-on: https://go-review.googlesource.com/22342 Reviewed-by: Keith Randall <khr@golang.org> Run-TryBot: David Chase <drchase@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-05-16 21:08:05 +00:00
Keith Randall	4fa050024f	cmd/compile: enable constant-time CFG editing Provide indexes along with block pointers for Preds and Succs arrays. This allows us to splice edges in and out of those arrays in constant time. Fixes worst-case O(n^2) behavior in deadcode and fuse. benchmark old ns/op new ns/op delta BenchmarkFuse1-8 2065 2057 -0.39% BenchmarkFuse10-8 9408 9073 -3.56% BenchmarkFuse100-8 105238 76277 -27.52% BenchmarkFuse1000-8 3982562 1026750 -74.22% BenchmarkFuse10000-8 301220329 12824005 -95.74% BenchmarkDeadCode1-8 1588 1566 -1.39% BenchmarkDeadCode10-8 4333 4250 -1.92% BenchmarkDeadCode100-8 32031 32574 +1.70% BenchmarkDeadCode1000-8 590407 468275 -20.69% BenchmarkDeadCode10000-8 17822890 5000818 -71.94% BenchmarkDeadCode100000-8 1388706640 78021127 -94.38% BenchmarkDeadCode200000-8 5372518479 168598762 -96.86% Change-Id: Iccabdbb9343fd1c921ba07bbf673330a1c36ee17 Reviewed-on: https://go-review.googlesource.com/22589 Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com> Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-05-05 15:58:59 +00:00
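The edge representation described above, a block pointer paired with an index into the other block's edge list, can be sketched as follows. This is an illustrative model only: the Edge, Block, AddEdge, and RemoveSucc names are assumptions for this sketch, and it ignores phi arguments, which a real compiler would have to keep in step with predecessor order.

```go
package main

import "fmt"

// Edge points at a neighbouring block and records where the reverse
// edge lives in that neighbour's own edge list.
type Edge struct {
	b *Block
	i int
}

// Block is a toy CFG node.
type Block struct {
	ID    int
	Preds []Edge
	Succs []Edge
}

// AddEdge adds the edge b -> c and cross-links the two entries.
func AddEdge(b, c *Block) {
	i := len(b.Succs)
	j := len(c.Preds)
	b.Succs = append(b.Succs, Edge{c, j})
	c.Preds = append(c.Preds, Edge{b, i})
}

// RemoveSucc removes b's i'th outgoing edge in constant time. Each
// list is patched by moving its last entry into the hole and fixing
// the index stored on the other side of the moved edge.
func RemoveSucc(b *Block, i int) {
	e := b.Succs[i]
	c, j := e.b, e.i

	// Splice the edge out of b.Succs.
	last := len(b.Succs) - 1
	if i != last {
		m := b.Succs[last]
		b.Succs[i] = m
		m.b.Preds[m.i].i = i
	}
	b.Succs = b.Succs[:last]

	// Splice the reverse edge out of c.Preds.
	last = len(c.Preds) - 1
	if j != last {
		m := c.Preds[last]
		c.Preds[j] = m
		m.b.Succs[m.i].i = j
	}
	c.Preds = c.Preds[:last]
}

func main() {
	a, b, c, d := &Block{ID: 0}, &Block{ID: 1}, &Block{ID: 2}, &Block{ID: 3}
	AddEdge(a, b)
	AddEdge(a, c)
	AddEdge(a, d)
	RemoveSucc(a, 0) // drop a -> b without scanning any list
	fmt.Println(len(a.Succs), a.Succs[0].b.ID, d.Preds[0].i, len(b.Preds))
}
```

Without the stored index, removing an edge means searching the neighbour's list for the matching entry, which is where the quadratic worst case in the benchmarks above comes from.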
Josh Bleecher Snyder	2563b6f9fe	cmd/compile/internal/ssa: use Compare instead of Equal They have different semantics. Equal is stricter and is designed for the front-end. Compare is looser and cheaper and is designed for the back-end. To avoid possible regression, remove Equal from ssa.Type. Updates #15043 Change-Id: Ie23ce75ff6b4d01b7982e0a89e6f81b5d099d8d6 Reviewed-on: https://go-review.googlesource.com/21483 Reviewed-by: David Chase <drchase@google.com> Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>	2016-04-17 04:50:45 +00:00
Alexandru Moșoi	9743e4b031	cmd/compile: share dominator tree among many passes These passes do not modify the dominator tree too much. % benchstat old.txt new.txt name old time/op new time/op delta Template 335ms ± 3% 325ms ± 8% ~ (p=0.074 n=8+9) GoTypes 1.05s ± 1% 1.05s ± 3% ~ (p=0.095 n=9+10) Compiler 5.37s ± 4% 5.29s ± 1% -1.42% (p=0.022 n=9+10) MakeBash 34.9s ± 3% 34.4s ± 2% ~ (p=0.095 n=9+10) name old alloc/op new alloc/op delta Template 55.4MB ± 0% 54.9MB ± 0% -0.81% (p=0.000 n=10+10) GoTypes 179MB ± 0% 178MB ± 0% -0.89% (p=0.000 n=10+10) Compiler 807MB ± 0% 798MB ± 0% -1.10% (p=0.000 n=10+10) name old allocs/op new allocs/op delta Template 498k ± 0% 496k ± 0% -0.29% (p=0.000 n=9+9) GoTypes 1.42M ± 0% 1.41M ± 0% -0.24% (p=0.000 n=10+10) Compiler 5.61M ± 0% 5.60M ± 0% -0.12% (p=0.000 n=10+10) Change-Id: I4cd20cfba3f132ebf371e16046ab14d7e42799ec Reviewed-on: https://go-review.googlesource.com/21806 Run-TryBot: Alexandru Moșoi <alexandru@mosoi.ro> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>	2016-04-12 14:44:26 +00:00
Keith Randall	7e40627a0e	cmd/compile: zero all three argstorage slots These changes were missed when going from 2 to 3 argstorage slots. https://go-review.googlesource.com/20296/ Change-Id: I930a307bb0b695bf1ae088030c9bbb6d14ca31d2 Reviewed-on: https://go-review.googlesource.com/21841 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>	2016-04-11 20:49:22 +00:00


85 commits