Stowage/go - Remotebranch.eu

Stowage/go

mirror of https://github.com/golang/go.git synced 2025-12-08 06:10:04 +00:00

Author	SHA1	Message	Date
David Chase	92cb157cf3	[dev.regabi] cmd/compile: late expansion of return values By-hand rebase of earlier CL, because that was easier than letting git try to figure things out. This will naively insert self-moves; in the case that these involve memory, the expander detects these and removes them and their vardefs. Change-Id: Icf72575eb7ae4a186b0de462bc8cf0bedc84d3e9 Reviewed-on: https://go-review.googlesource.com/c/go/+/279519 Trust: David Chase <drchase@google.com> Reviewed-by: Jeremy Faller <jeremy@golang.org>	2021-01-20 19:57:48 +00:00
David Chase	c1370e918f	[dev.regabi] cmd/compile: add code to support register ABI spills around morestack calls This is a selected copy from the register ABI experiment CL, focused on the files and data structures that handle spilling around morestack. Unnecessary code from the experiment was removed, other code was adapted. Would it make sense to leave comments in the experiment as pieces are brought over? Experiment CL (for comparison purposes) https://go-review.googlesource.com/c/go/+/28832 Change-Id: I92136f070351d4fcca1407b52ecf9b80898fed95 Reviewed-on: https://go-review.googlesource.com/c/go/+/279520 Trust: David Chase <drchase@google.com> Run-TryBot: David Chase <drchase@google.com> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Jeremy Faller <jeremy@golang.org>	2021-01-13 15:47:13 +00:00
Matthew Dempsky	1a98ab0e2d	[dev.regabi] cmd/compile: add ssa.Aux tag interface for Value.Aux It's currently hard to automate refactorings around the Value.Aux field, because we don't have any static typing information for it. Adding a tag interface will make subsequent CLs easier and safer. Passes buildall w/ toolstash -cmp. Updates #42982. Change-Id: I41ae8e411a66bda3195a0957b60c2fe8a8002893 Reviewed-on: https://go-review.googlesource.com/c/go/+/275756 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org> Trust: Matthew Dempsky <mdempsky@google.com>	2020-12-08 01:46:31 +00:00
Cuong Manh Le	0ecf769633	cmd/compile: do not mark OpSP, OpSB pos for debugging Fixes #42801 Change-Id: I2080ecacc109479f5820035401ce2b26d72e2ef2 Reviewed-on: https://go-review.googlesource.com/c/go/+/273506 Trust: Cuong Manh Le <cuong.manhle.vn@gmail.com> Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com> Reviewed-by: David Chase <drchase@google.com>	2020-12-01 01:35:19 +00:00
soolaugust	2e7706d1fa	ssa: comment Sdom() with the form "Sdom..." Change-Id: I7ddb3d178e5437a7c3d8e94a089ac7a476a7dc85 GitHub-Last-Rev: `bc27289128` GitHub-Pull-Request: golang/go#41925 Reviewed-on: https://go-review.googlesource.com/c/go/+/261437 Trust: Alberto Donizetti <alb.donizetti@gmail.com> Reviewed-by: Keith Randall <khr@golang.org>	2020-10-12 15:38:05 +00:00
David Chase	44c0586931	cmd/compile: add code to expand calls just before late opt Still needs to generate the calls that will need lowering. Change-Id: Ifd4e510193441a5e27c462c1f1d704f07bf6dec3 Reviewed-on: https://go-review.googlesource.com/c/go/+/242359 Trust: David Chase <drchase@google.com> Run-TryBot: David Chase <drchase@google.com> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2020-09-18 16:14:40 +00:00
David Chase	bf833ead62	cmd/compile: ensure that ssa.Func constant cache is consistent It was not necessarily consistent before, we were just lucky. Change-Id: I3a92dc724e0af7b4d810a6a0b7b1d58844eb8f87 Reviewed-on: https://go-review.googlesource.com/c/go/+/251440 Run-TryBot: David Chase <drchase@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2020-09-05 14:11:21 +00:00
David Chase	4220d67084	cmd/compile: make GOSSAHASH package-sensitive, also append to log files Turns out if your failure is in a function with a name like "Reset()" there will be a lot of hits on the same hashcode. Adding package sensitivity solves this problem. In additionm, it turned out that in the case that a logfile was specified for the GOSSAHASH logging, that it was opened in create mode, which meant that multiple compiler invocations would reset the file to zero length. Opening in append mode works better; the automated harness (github.com/dr2chase/gossahash) takes care of truncating the file before use. Change-Id: I5601bc280faa94cbd507d302448831849db6c842 Reviewed-on: https://go-review.googlesource.com/c/go/+/246937 Run-TryBot: David Chase <drchase@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2020-08-24 02:38:18 +00:00
surechen	17553c6e71	cmd/compile: move dumpFileSeq I noticed that there is a Todo comment here. This variable is only used for filename when dump a function's ssa passes result in details. It is no problem to print a function alone, but may be edited by not only one goroutine if dump multiple functions at the same time. Although it looks only dump one function's ssa passes now. As far as I am concerned this variable can be a member variable of the struct Func. I'm not sure if this change is necessary. Looking forward to your advices, thank you very much. Change-Id: I35dd7247889e0cc7f19c0b400b597206592dee75 Reviewed-on: https://go-review.googlesource.com/c/go/+/244918 Reviewed-by: Keith Randall <khr@golang.org> Run-TryBot: Keith Randall <khr@golang.org>	2020-08-17 21:04:19 +00:00
Keith Randall	32a84c99e1	cmd/compile: fix live variable computation for deferreturn Taking the live variable set from the last return point is problematic. See #40629 for details, but there may not be a return point, or it may be before the final defer. Additionally, keeping track of the last call as a *Value doesn't quite work. If it is dead-code eliminated, the storage for the Value is reused for some other random instruction. Its live variable information, if it is available at all, is wrong. Instead, just mark all the open-defer argument slots as live throughout the function. (They are already zero-initialized.) Fixes #40629 Change-Id: Ie456c7db3082d0de57eaa5234a0f32525a1cce13 Reviewed-on: https://go-review.googlesource.com/c/go/+/247522 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Dan Scales <danscales@google.com>	2020-08-14 21:49:36 +00:00
Dan Scales	1b3a1db19f	cmd/compile: fix liveness for open-coded defer args for infinite loops Once defined, a stack slot holding an open-coded defer arg should always be marked live, since it may be used at any time if there is a panic. These stack slots are typically kept live naturally by the open-defer code inlined at each return/exit point. However, we need to do extra work to make sure that they are kept live if a function has an infinite loop or a panic exit. For this fix, only in the case of a function that is using open-coded defers, we compute the set of blocks (most often empty) that cannot reach a return or a BlockExit (panic) because of an infinite loop. Then, for each block b which cannot reach a return or BlockExit or is a BlockExit block, we mark each defer arg slot as live, as long as the definition of the defer arg slot dominates block b. For this change, had to export (*Func).sdom (-> Sdom) and SparseTree.isAncestorEq (-> IsAncestorEq) Updates #35277 Change-Id: I7b53c9bd38ba384a3794386dd0eb94e4cbde4eb1 Reviewed-on: https://go-review.googlesource.com/c/go/+/204802 Run-TryBot: Dan Scales <danscales@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2019-11-05 17:19:16 +00:00
Dan Scales	be64a19d99	cmd/compile, cmd/link, runtime: make defers low-cost through inline code and extra funcdata Generate inline code at defer time to save the args of defer calls to unique (autotmp) stack slots, and generate inline code at exit time to check which defer calls were made and make the associated function/method/interface calls. We remember that a particular defer statement was reached by storing in the deferBits variable (always stored on the stack). At exit time, we check the bits of the deferBits variable to determine which defer function calls to make (in reverse order). These low-cost defers are only used for functions where no defers appear in loops. In addition, we don't do these low-cost defers if there are too many defer statements or too many exits in a function (to limit code increase). When a function uses open-coded defers, we produce extra FUNCDATA_OpenCodedDeferInfo information that specifies the number of defers, and for each defer, the stack slots where the closure and associated args have been stored. The funcdata also includes the location of the deferBits variable. Therefore, for panics, we can use this funcdata to determine exactly which defers are active, and call the appropriate functions/methods/closures with the correct arguments for each active defer. In order to unwind the stack correctly after a recover(), we need to add an extra code segment to functions with open-coded defers that simply calls deferreturn() and returns. This segment is not reachable by the normal function, but is returned to by the runtime during recovery. We set the liveness information of this deferreturn() to be the same as the liveness at the first function call during the last defer exit code (so all return values and all stack slots needed by the defer calls will be live). I needed to increase the stackguard constant from 880 to 896, because of a small amount of new code in deferreturn(). The -N flag disables open-coded defers. '-d defer' prints out the kind of defer being used at each defer statement (heap-allocated, stack-allocated, or open-coded). Cost of defer statement [ go test -run NONE -bench BenchmarkDefer$ runtime ] With normal (stack-allocated) defers only: 35.4 ns/op With open-coded defers: 5.6 ns/op Cost of function call alone (remove defer keyword): 4.4 ns/op Text size increase (including funcdata) for go binary without/with open-coded defers: 0.09% The average size increase (including funcdata) for only the functions that use open-coded defers is 1.1%. The cost of a panic followed by a recover got noticeably slower, since panic processing now requires a scan of the stack for open-coded defer frames. This scan is required, even if no frames are using open-coded defers: Cost of panic and recover [ go test -run NONE -bench BenchmarkPanicRecover runtime ] Without open-coded defers: 62.0 ns/op With open-coded defers: 255 ns/op A CGO Go-to-C-to-Go benchmark got noticeably faster because of open-coded defers: CGO Go-to-C-to-Go benchmark [cd misc/cgo/test; go test -run NONE -bench BenchmarkCGoCallback ] Without open-coded defers: 443 ns/op With open-coded defers: 347 ns/op Updates #14939 (defer performance) Updates #34481 (design doc) Change-Id: I63b1a60d1ebf28126f55ee9fd7ecffe9cb23d1ff Reviewed-on: https://go-review.googlesource.com/c/go/+/202340 Reviewed-by: Austin Clements <austin@google.com>	2019-10-24 13:54:11 +00:00
Bryan C. Mills	b76e6f8825	Revert "cmd/compile, cmd/link, runtime: make defers low-cost through inline code and extra funcdata" This reverts CL 190098. Reason for revert: broke several builders. Change-Id: I69161352f9ded02537d8815f259c4d391edd9220 Reviewed-on: https://go-review.googlesource.com/c/go/+/201519 Run-TryBot: Bryan C. Mills <bcmills@google.com> Reviewed-by: Austin Clements <austin@google.com> Reviewed-by: Dan Scales <danscales@google.com>	2019-10-16 20:59:53 +00:00
Dan Scales	dad616375f	cmd/compile, cmd/link, runtime: make defers low-cost through inline code and extra funcdata Generate inline code at defer time to save the args of defer calls to unique (autotmp) stack slots, and generate inline code at exit time to check which defer calls were made and make the associated function/method/interface calls. We remember that a particular defer statement was reached by storing in the deferBits variable (always stored on the stack). At exit time, we check the bits of the deferBits variable to determine which defer function calls to make (in reverse order). These low-cost defers are only used for functions where no defers appear in loops. In addition, we don't do these low-cost defers if there are too many defer statements or too many exits in a function (to limit code increase). When a function uses open-coded defers, we produce extra FUNCDATA_OpenCodedDeferInfo information that specifies the number of defers, and for each defer, the stack slots where the closure and associated args have been stored. The funcdata also includes the location of the deferBits variable. Therefore, for panics, we can use this funcdata to determine exactly which defers are active, and call the appropriate functions/methods/closures with the correct arguments for each active defer. In order to unwind the stack correctly after a recover(), we need to add an extra code segment to functions with open-coded defers that simply calls deferreturn() and returns. This segment is not reachable by the normal function, but is returned to by the runtime during recovery. We set the liveness information of this deferreturn() to be the same as the liveness at the first function call during the last defer exit code (so all return values and all stack slots needed by the defer calls will be live). I needed to increase the stackguard constant from 880 to 896, because of a small amount of new code in deferreturn(). The -N flag disables open-coded defers. '-d defer' prints out the kind of defer being used at each defer statement (heap-allocated, stack-allocated, or open-coded). Cost of defer statement [ go test -run NONE -bench BenchmarkDefer$ runtime ] With normal (stack-allocated) defers only: 35.4 ns/op With open-coded defers: 5.6 ns/op Cost of function call alone (remove defer keyword): 4.4 ns/op Text size increase (including funcdata) for go cmd without/with open-coded defers: 0.09% The average size increase (including funcdata) for only the functions that use open-coded defers is 1.1%. The cost of a panic followed by a recover got noticeably slower, since panic processing now requires a scan of the stack for open-coded defer frames. This scan is required, even if no frames are using open-coded defers: Cost of panic and recover [ go test -run NONE -bench BenchmarkPanicRecover runtime ] Without open-coded defers: 62.0 ns/op With open-coded defers: 255 ns/op A CGO Go-to-C-to-Go benchmark got noticeably faster because of open-coded defers: CGO Go-to-C-to-Go benchmark [cd misc/cgo/test; go test -run NONE -bench BenchmarkCGoCallback ] Without open-coded defers: 443 ns/op With open-coded defers: 347 ns/op Updates #14939 (defer performance) Updates #34481 (design doc) Change-Id: I51a389860b9676cfa1b84722f5fb84d3c4ee9e28 Reviewed-on: https://go-review.googlesource.com/c/go/+/190098 Reviewed-by: Austin Clements <austin@google.com>	2019-10-16 18:27:16 +00:00
David Chase	6081a9f7e6	cmd/compile: index line number tables by source file to improve sparsity This reduces allocations and also resolves some lurking inliner/inlinee line-number match problems. However, it does add about 1.5% to compile time. This fixes compiler OOMs seen compiling some large protobuf- derived inputs. For compiling the compiler itself, compilebench -pkg cmd/compile/internal/ssa -memprofile withcl.prof the numberlines-related memory consumption is reduced from 129MB to 29MB (about a 5% overall reduction in allocation). Additionally modified after going over changes with Austin to remove unused code (nobody called size()) and correct the cache-clearing code. I've attempted to speed this up by not using maps, and have not succeeded. I'd rather get correct code in now, speed it up later if I can. Updates #27739. Fixes #29279. Change-Id: I098005de4e45196a5f5b10c0886a49f88e9f8fd5 Reviewed-on: https://go-review.googlesource.com/c/go/+/154617 Run-TryBot: David Chase <drchase@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2019-05-14 18:48:16 +00:00
Josh Bleecher Snyder	0047353c53	cmd/compile: add countRule rewrite rule helper noteRule is useful when you're trying to debug a particular rule, or get a general sense for how often a rule fires overall. It is less useful if you're trying to figure out which functions might be useful to benchmark to ascertain the impact of a newly added rule. Enter countRule. You use it like noteRule, except that you get per-function summaries. Sample output: # runtime (*mspan).sweep: idx1=1 evacuate_faststr: idx1=1 evacuate_fast32: idx1=1 evacuate: idx1=2 evacuate_fast64: idx1=1 sweepone: idx1=1 purgecachedstats: idx1=1 mProf_Free: idx1=1 This suggests that the map benchmarks might be good to run for this added rule. Change-Id: Id471c3231f1736165f2020f6979ff01c29677808 Reviewed-on: https://go-review.googlesource.com/c/go/+/167088 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2019-05-08 22:58:10 +00:00
Keith Randall	2c423f063b	cmd/compile,runtime: provide index information on bounds check failure A few examples (for accessing a slice of length 3): s[-1] runtime error: index out of range [-1] s[3] runtime error: index out of range [3] with length 3 s[-1:0] runtime error: slice bounds out of range [-1:] s[3:0] runtime error: slice bounds out of range [3:0] s[3:-1] runtime error: slice bounds out of range [:-1] s[3:4] runtime error: slice bounds out of range [:4] with capacity 3 s[0:3:4] runtime error: slice bounds out of range [::4] with capacity 3 Note that in cases where there are multiple things wrong with the indexes (e.g. s[3:-1]), we report one of those errors kind of arbitrarily, currently the rightmost one. An exhaustive set of examples is in issue30116[u].out in the CL. The message text has the same prefix as the old message text. That leads to slightly awkward phrasing but hopefully minimizes the chance that code depending on the error text will break. Increases the size of the go binary by 0.5% (amd64). The panic functions take arguments in registers in order to keep the size of the compiled code as small as possible. Fixes #30116 Change-Id: Idb99a827b7888822ca34c240eca87b7e44a04fdd Reviewed-on: https://go-review.googlesource.com/c/go/+/161477 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>	2019-03-18 17:33:38 +00:00
Josh Bleecher Snyder	c9ccdf1f8c	cmd/compile: make deadcode pass cheaper The deadcode pass runs a lot. I'd like it to run even more. This change adds dedicated storage for deadcode to ssa.Cache. In addition to being a nice win now, it makes deadcode easier to add other places in the future. name old time/op new time/op delta Template 210ms ± 3% 209ms ± 2% ~ (p=0.951 n=93+95) Unicode 92.2ms ± 3% 93.0ms ± 3% +0.87% (p=0.000 n=94+94) GoTypes 739ms ± 2% 733ms ± 2% -0.84% (p=0.000 n=92+94) Compiler 3.51s ± 2% 3.49s ± 2% -0.57% (p=0.000 n=94+91) SSA 9.80s ± 2% 9.75s ± 2% -0.57% (p=0.000 n=95+92) Flate 132ms ± 2% 132ms ± 3% ~ (p=0.165 n=94+98) GoParser 160ms ± 3% 159ms ± 3% -0.42% (p=0.005 n=96+94) Reflect 446ms ± 4% 442ms ± 4% -0.91% (p=0.000 n=95+98) Tar 186ms ± 3% 186ms ± 2% ~ (p=0.221 n=94+97) XML 252ms ± 2% 250ms ± 2% -0.55% (p=0.000 n=95+94) [Geo mean] 430ms 429ms -0.34% name old user-time/op new user-time/op delta Template 256ms ± 3% 257ms ± 3% ~ (p=0.521 n=94+98) Unicode 120ms ± 9% 121ms ± 9% ~ (p=0.074 n=99+100) GoTypes 935ms ± 3% 935ms ± 2% ~ (p=0.574 n=82+96) Compiler 4.56s ± 1% 4.55s ± 2% ~ (p=0.247 n=88+90) SSA 13.6s ± 2% 13.6s ± 1% ~ (p=0.277 n=94+95) Flate 155ms ± 3% 156ms ± 3% ~ (p=0.181 n=95+100) GoParser 193ms ± 8% 184ms ± 6% -4.39% (p=0.000 n=100+89) Reflect 549ms ± 3% 552ms ± 3% +0.45% (p=0.036 n=94+96) Tar 230ms ± 4% 230ms ± 4% ~ (p=0.670 n=97+99) XML 315ms ± 5% 309ms ±12% -2.05% (p=0.000 n=99+99) [Geo mean] 540ms 538ms -0.47% name old alloc/op new alloc/op delta Template 40.3MB ± 0% 38.9MB ± 0% -3.36% (p=0.008 n=5+5) Unicode 28.6MB ± 0% 28.4MB ± 0% -0.90% (p=0.008 n=5+5) GoTypes 137MB ± 0% 132MB ± 0% -3.65% (p=0.008 n=5+5) Compiler 637MB ± 0% 609MB ± 0% -4.40% (p=0.008 n=5+5) SSA 2.19GB ± 0% 2.07GB ± 0% -5.63% (p=0.008 n=5+5) Flate 25.0MB ± 0% 24.1MB ± 0% -3.80% (p=0.008 n=5+5) GoParser 30.0MB ± 0% 29.1MB ± 0% -3.17% (p=0.008 n=5+5) Reflect 87.1MB ± 0% 84.4MB ± 0% -3.05% (p=0.008 n=5+5) Tar 37.3MB ± 0% 36.0MB ± 0% -3.31% (p=0.008 n=5+5) XML 49.8MB ± 0% 48.0MB ± 0% -3.69% (p=0.008 n=5+5) [Geo mean] 87.6MB 84.6MB -3.50% name old allocs/op new allocs/op delta Template 387k ± 0% 380k ± 0% -1.76% (p=0.008 n=5+5) Unicode 342k ± 0% 341k ± 0% -0.31% (p=0.008 n=5+5) GoTypes 1.39M ± 0% 1.37M ± 0% -1.64% (p=0.008 n=5+5) Compiler 5.68M ± 0% 5.60M ± 0% -1.41% (p=0.008 n=5+5) SSA 17.1M ± 0% 16.8M ± 0% -1.49% (p=0.008 n=5+5) Flate 240k ± 0% 236k ± 0% -1.99% (p=0.008 n=5+5) GoParser 309k ± 0% 304k ± 0% -1.57% (p=0.008 n=5+5) Reflect 1.01M ± 0% 0.99M ± 0% -2.69% (p=0.008 n=5+5) Tar 360k ± 0% 353k ± 0% -1.91% (p=0.008 n=5+5) XML 447k ± 0% 441k ± 0% -1.26% (p=0.008 n=5+5) [Geo mean] 858k 844k -1.60% Fixes #15306 Change-Id: I9f558adb911efddead3865542fe2ca71f66fe1da Reviewed-on: https://go-review.googlesource.com/c/go/+/166718 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2019-03-11 21:20:01 +00:00
Yury Smolsky	3068fcfa0d	cmd/compile: add control flow graphs to ssa.html This CL adds CFGs to ssa.html. It execs dot to generate SVG, which then gets inlined into the html. Some standard naming and javascript hacks enable integration with the rest of ssa.html. Clicking on blocks highlights the relevant part of the CFG, and vice versa. Sample output and screenshots can be seen in #28177. CFGs can be turned on with the suffix mask: :* - dump CFG for every phase :lower - just the lower phase :lower-layout - lower through layout :w,x-y - phases w and x through y Calling dot after every pass is noticeably slow, instead use the range of phases. Dead blocks are not displayed on CFG. User can zoom and pan individual CFG when the automatic adjustment has failed. Dot-related errors are reported without bringing down the process. Fixes #28177 Change-Id: Id52c42d86c4559ca737288aa10561b67a119c63d Reviewed-on: https://go-review.googlesource.com/c/142517 Run-TryBot: Yury Smolsky <yury@smolsky.by> Reviewed-by: David Chase <drchase@google.com>	2018-11-21 10:22:43 +00:00
Brad Fitzpatrick	3813edf26e	all: use "reports whether" consistently in the few places that didn't Go documentation style for boolean funcs is to say: // Foo reports whether ... func Foo() bool (rather than "returns true if") This CL also replaces 4 uses of "iff" with the same "reports whether" wording, which doesn't lose any meaning, and will prevent people from sending typo fixes when they don't realize it's "if and only if". In the past I think we've had the typo CLs updated to just say "reports whether". So do them all at once. (Inspired by the addition of another "returns true if" in CL 146938 in fd_plan9.go) Created with: $ perl -i -npe 's/returns true if/reports whether/' $(git grep -l "returns true iff" \| grep -v vendor) $ perl -i -npe 's/returns true if/reports whether/' $(git grep -l "returns true if" \| grep -v vendor) Change-Id: Ided502237f5ab0d25cb625dbab12529c361a8b9f Reviewed-on: https://go-review.googlesource.com/c/147037 Reviewed-by: Ian Lance Taylor <iant@golang.org>	2018-11-02 22:47:58 +00:00
David Chase	69c5830c2b	cmd/compile: repair display of values & blocks in prog column This restores the printing of vXX and bYY in the left-hand edge of the last column of ssa.html, where the generated progs appear. Change-Id: I81ab9b2fa5ae28e6e5de1b77665cfbed8d14e000 Reviewed-on: https://go-review.googlesource.com/c/141277 Run-TryBot: David Chase <drchase@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Yury Smolsky <yury@smolsky.by>	2018-10-11 15:29:00 +00:00
David Chase	0029cd479e	cmd/compile: add LocalAddr that takes SP,mem operands Lack of a well-defined order between VarDef and related address operations sometimes causes problems with store order and write barrier transformations; glitches in the order are made irreparable (by later optimizations) if the two parts of the glitch straddle a split in the original block caused by insertion of a write barrier diamond. Fix this by creating a LocalAddr for addresses of locals (what VarDef matters for) that takes a memory input to help make the order explicit. Addr is modified to only be legal for SB operand, so there is no overlap between Addr and LocalAddr uses (there may be some downstream cleanup from this). Changes to generic.rules and rewrite.go ensure that codegen tests continue to pass; CSE of LocalAddr is impaired, not quite sure of the cost. Fixes #26105. Change-Id: Id4192b4440aa4e9d7ba54a465c456df9b530b515 Reviewed-on: https://go-review.googlesource.com/122483 Run-TryBot: David Chase <drchase@google.com> Reviewed-by: Keith Randall <khr@golang.org>	2018-07-12 18:45:31 +00:00
Austin Clements	a367f44c18	cmd/compile: enable stack maps everywhere except unsafe points This modifies issafepoint in liveness analysis to report almost every operation as a safe point. There are four things we don't mark as safe-points: 1. Runtime code (other than at calls). 2. go:nosplit functions (other than at calls). 3. Instructions between the load of the write barrier-enabled flag and the write. 4. Instructions leading up to a uintptr -> unsafe.Pointer conversion. We'll optimize this in later CLs: name old time/op new time/op delta Template 185ms ± 2% 190ms ± 2% +2.95% (p=0.000 n=10+10) Unicode 96.3ms ± 3% 96.4ms ± 1% ~ (p=0.905 n=10+9) GoTypes 658ms ± 0% 669ms ± 1% +1.72% (p=0.000 n=10+9) Compiler 3.14s ± 1% 3.18s ± 1% +1.56% (p=0.000 n=9+10) SSA 7.41s ± 2% 7.59s ± 1% +2.48% (p=0.000 n=9+10) Flate 126ms ± 1% 128ms ± 1% +2.08% (p=0.000 n=10+10) GoParser 153ms ± 1% 157ms ± 2% +2.38% (p=0.000 n=10+10) Reflect 437ms ± 1% 442ms ± 1% +0.98% (p=0.001 n=10+10) Tar 178ms ± 1% 179ms ± 1% +0.67% (p=0.035 n=10+9) XML 223ms ± 1% 229ms ± 1% +2.58% (p=0.000 n=10+10) [Geo mean] 394ms 401ms +1.75% No effect on binary size because we're not yet emitting these extra safe points. For #24543. Change-Id: I16a1eebb9183cad7cef9d53c0fd21a973cad6859 Reviewed-on: https://go-review.googlesource.com/109348 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>	2018-05-22 14:43:37 +00:00
Giovanni Bajo	3c8545c5f6	cmd/compile: reduce allocations in prove by reusing posets In prove, reuse posets between different functions by storing them in the per-worker cache. Allocation count regression caused by prove improvements is down from 5% to 3% after this CL. Updates #25179 Change-Id: I6d14003109833d9b3ef5165fdea00aa9c9e952e8 Reviewed-on: https://go-review.googlesource.com/110455 Run-TryBot: Giovanni Bajo <rasky@develer.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>	2018-05-14 14:44:55 +00:00
David Chase	c2c1822b12	cmd/compile: assign and preserve statement boundaries. A new pass run after ssa building (before any other optimization) identifies the "first" ssa node for each statement. Other "noise" nodes are tagged as being never appropriate for a statement boundary (e.g., VarKill, VarDef, Phi). Rewrite, deadcode, cse, and nilcheck are modified to move the statement boundaries forward whenever possible if a boundary-tagged ssa value is removed; never-boundary nodes are ignored in this search (some operations involving constants are also tagged as never-boundary and also ignored because they are likely to be moved or removed during optimization). Code generation treats all nodes except those explicitly marked as statement boundaries as "not statement" nodes, and floats statement boundaries to the beginning of each same-line run of instructions found within a basic block. Line number html conversion was modified to make statement boundary nodes a bit more obvious by prepending a "+". The code in fuse.go that glued together the value slices of two blocks produced a result that depended on the former capacities (not lengths) of the two slices. This causes differences in the 386 bootstrap, and also can sometimes put values into an order that does a worse job of preserving statement boundaries when values are removed. Portions of two delve tests that had caught problems were incorporated into ssa/debug_test.go. There are some opportunities to do better with optimized code, but the next-ing is not lying or overly jumpy. Over 4 CLs, compilebench geomean measured binary size increase of 3.5% and compile user time increase of 3.8% (this is after optimization to reuse a sparse map instead of creating multiple maps.) This CL worsens the optimized-debugging experience with Delve; we need to work with the delve team so that they can use the is_stmt marks that we're emitting now. The reference output changes from time to time depending on other changes in the compiler, sometimes better, sometimes worse. This CL now includes a test ensuring that 99+% of the lines in the Go command itself (a handy optimized binary) include is_stmt markers. Change-Id: I359c94e06843f1eb41f9da437bd614885aa9644a Reviewed-on: https://go-review.googlesource.com/102435 Run-TryBot: David Chase <drchase@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2018-05-14 14:09:49 +00:00
Daniel Martí	14393c5cd4	cmd: remove a few more unused parameters ssa's pos parameter on the Const* funcs is unused, so remove it. ld's alloc parameter on elfnote is always true, so remove the arguments and simplify the code. Finally, arm's addpltreloc never has its return parameter used, so remove it. Change-Id: I63387ecf6ab7b5f7c20df36be823322bb98427b8 Reviewed-on: https://go-review.googlesource.com/104456 Run-TryBot: Daniel Martí <mvdan@mvdan.cc> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2018-04-09 17:10:25 +00:00
Daniel Martí	cd2cb6e3f5	cmd/compile: cache sparse maps across ssa passes This is done for sparse sets already, but it was missing for sparse maps. Only affects deadstore and regalloc, as they're the only ones that use sparse maps. name old time/op new time/op delta DSEPass-4 247µs ± 0% 216µs ± 0% -12.75% (p=0.008 n=5+5) DSEPassBlock-4 3.05ms ± 1% 2.87ms ± 1% -6.02% (p=0.002 n=6+6) CSEPass-4 2.30ms ± 0% 2.32ms ± 0% +0.53% (p=0.026 n=6+6) CSEPassBlock-4 23.8ms ± 0% 23.8ms ± 0% ~ (p=0.931 n=6+5) DeadcodePass-4 51.7µs ± 1% 51.5µs ± 2% ~ (p=0.429 n=5+6) DeadcodePassBlock-4 734µs ± 1% 742µs ± 3% ~ (p=0.394 n=6+6) MultiPass-4 152µs ± 0% 149µs ± 2% ~ (p=0.082 n=5+6) MultiPassBlock-4 2.67ms ± 1% 2.41ms ± 2% -9.77% (p=0.008 n=5+5) name old alloc/op new alloc/op delta DSEPass-4 41.2kB ± 0% 0.1kB ± 0% -99.68% (p=0.002 n=6+6) DSEPassBlock-4 560kB ± 0% 4kB ± 0% -99.34% (p=0.026 n=5+6) CSEPass-4 189kB ± 0% 189kB ± 0% ~ (all equal) CSEPassBlock-4 3.10MB ± 0% 3.10MB ± 0% ~ (p=0.444 n=5+5) DeadcodePass-4 10.5kB ± 0% 10.5kB ± 0% ~ (all equal) DeadcodePassBlock-4 164kB ± 0% 164kB ± 0% ~ (all equal) MultiPass-4 240kB ± 0% 199kB ± 0% -17.06% (p=0.002 n=6+6) MultiPassBlock-4 3.60MB ± 0% 2.99MB ± 0% -17.06% (p=0.002 n=6+6) name old allocs/op new allocs/op delta DSEPass-4 8.00 ± 0% 4.00 ± 0% -50.00% (p=0.002 n=6+6) DSEPassBlock-4 240 ± 0% 120 ± 0% -50.00% (p=0.002 n=6+6) CSEPass-4 9.00 ± 0% 9.00 ± 0% ~ (all equal) CSEPassBlock-4 1.35k ± 0% 1.35k ± 0% ~ (all equal) DeadcodePass-4 3.00 ± 0% 3.00 ± 0% ~ (all equal) DeadcodePassBlock-4 9.00 ± 0% 9.00 ± 0% ~ (all equal) MultiPass-4 11.0 ± 0% 10.0 ± 0% -9.09% (p=0.002 n=6+6) MultiPassBlock-4 165 ± 0% 150 ± 0% -9.09% (p=0.002 n=6+6) Change-Id: I43860687c88f33605eb1415f36473c5cfe8fde4a Reviewed-on: https://go-review.googlesource.com/98449 Run-TryBot: Daniel Martí <mvdan@mvdan.cc> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>	2018-03-15 17:24:39 +00:00
Austin Clements	491f409a32	cmd/compile: minor comment improvements/corrections Change-Id: Ie0934f1528d58d4971cdef726d3e2d23cf3935d3 Reviewed-on: https://go-review.googlesource.com/87475 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Alexandru Moșoi <alexandru@mosoi.ro>	2018-03-08 22:25:21 +00:00
Keith Randall	4b00d3f4a2	cmd/compile: implement comparisons directly with memory Allow the compiler to generate code like CMPQ 16(AX), $7 It's tricky because it's difficult to spill such a comparison during flagalloc, because the same memory state might not be available at the restore locations. Solve this problem by decomposing the compare+load back into its parts if it needs to be spilled. The big win is that the write barrier test goes from: MOVL runtime.writeBarrier(SB), CX TESTL CX, CX JNE 60 to CMPL runtime.writeBarrier(SB), $0 JNE 59 It's one instruction and one byte smaller. Fixes #19485 Fixes #15245 Update #22460 Binaries are about 0.15% smaller. Change-Id: I4fd8d1111b6b9924d52f9a0901ca1b2e5cce0836 Reviewed-on: https://go-review.googlesource.com/86035 Reviewed-by: Cherry Zhang <cherryyz@google.com> Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>	2018-02-26 23:49:44 +00:00
David Chase	f22cf7131a	cmd/compile: use src.NoXPos for entry-block constants The ssa backend is aggressive about placing constants and certain other values in the Entry block. It's implausible that the original line numbers for these constants makes any sort of sense when it appears to a user stepping in a debugger, and they're also not that useful in dumps since entry-block instructions tend to be constants (i.e., unlikely to be the cause of a crash). Therefore, use src.NoXPos for any values that are explicitly inserted into a function's entry block. Passes all tests, including ssa/debug_test.go with both gdb and a fairly recent dlv. Hand-verified that it solves the reported problem; constructed a test that reproduced a problem, and fixed it. Modified test harness to allow injection of slightly more interesting inputs. Fixes #22558. Change-Id: I4476927067846bc4366da7793d2375c111694c55 Reviewed-on: https://go-review.googlesource.com/81215 Run-TryBot: David Chase <drchase@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2017-12-01 07:09:54 +00:00
Austin Clements	afbe646ab4	cmd/compile: report typedslicecopy write barriers Most write barrier calls are inserted by SSA, but copy and append are lowered to runtime.typedslicecopy during walk. Fix these to set Func.WBPos and emit the "write barrier" warning, as done for the write barriers inserted by SSA. As part of this, we refactor setting WBPos and emitting this warning into the frontend so it can be shared by both walk and SSA. Change-Id: I5fe9997d9bdb55e03e01dd58aee28908c35f606b Reviewed-on: https://go-review.googlesource.com/73411 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2017-10-29 20:21:43 +00:00
Keith Randall	770d8d8207	cmd/compile: free value earlier in nilcheck When we remove a nil check, add it back to the free Value pool immediately. Fixes #18732 Change-Id: I8d644faabbfb52157d3f2d071150ff0342ac28dc Reviewed-on: https://go-review.googlesource.com/58810 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>	2017-08-25 06:01:26 +00:00
Josh Bleecher Snyder	46b88c9fbc	cmd/compile: change ssa.Type into types.Type When package ssa was created, Type was in package gc. To avoid circular dependencies, we used an interface (ssa.Type) to represent type information in SSA. In the Go 1.9 cycle, gri extricated the Type type from package gc. As a result, we can now use it in package ssa. Now, instead of package types depending on package ssa, it is the other way. This is a more sensible dependency tree, and helps compiler performance a bit. Though this is a big CL, most of the changes are mechanical and uninteresting. Interesting bits: Add new singleton globals to package types for the special SSA types Memory, Void, Invalid, Flags, and Int128. * Add two new Types, TSSA for the special types, and TTUPLE, for SSA tuple types. ssa.MakeTuple is now types.NewTuple. * Move type comparison result constants CMPlt, CMPeq, and CMPgt to package types. * We had picked the name "types" in our rules for the handy list of types provided by ssa.Config. That conflicted with the types package name, so change it to "typ". * Update the type comparison routine to handle tuples and special types inline. * Teach gc/fmt.go how to print special types. * We can now eliminate ElemTypes in favor of just Elem, and probably also some other duplicated Type methods designed to return ssa.Type instead of types.Type. The ssa tests were using their own dummy types, and they were not particularly careful about types in general. Of necessity, this CL switches them to use *types.Type; it does not make them more type-accurate. Unfortunately, using types.Type means initializing a bit of the types universe. This is prime for refactoring and improvement. This shrinks ssa.Value; it now fits in a smaller size class on 64 bit systems. This doesn't have a giant impact, though, since most Values are preallocated in a chunk. name old alloc/op new alloc/op delta Template 37.9MB ± 0% 37.7MB ± 0% -0.57% (p=0.000 n=10+8) Unicode 28.9MB ± 0% 28.7MB ± 0% -0.52% (p=0.000 n=10+10) GoTypes 110MB ± 0% 109MB ± 0% -0.88% (p=0.000 n=10+10) Flate 24.7MB ± 0% 24.6MB ± 0% -0.66% (p=0.000 n=10+10) GoParser 31.1MB ± 0% 30.9MB ± 0% -0.61% (p=0.000 n=10+9) Reflect 73.9MB ± 0% 73.4MB ± 0% -0.62% (p=0.000 n=10+8) Tar 25.8MB ± 0% 25.6MB ± 0% -0.77% (p=0.000 n=9+10) XML 41.2MB ± 0% 40.9MB ± 0% -0.80% (p=0.000 n=10+10) [Geo mean] 40.5MB 40.3MB -0.68% name old allocs/op new allocs/op delta Template 385k ± 0% 386k ± 0% ~ (p=0.356 n=10+9) Unicode 343k ± 1% 344k ± 0% ~ (p=0.481 n=10+10) GoTypes 1.16M ± 0% 1.16M ± 0% -0.16% (p=0.004 n=10+10) Flate 238k ± 1% 238k ± 1% ~ (p=0.853 n=10+10) GoParser 320k ± 0% 320k ± 0% ~ (p=0.720 n=10+9) Reflect 957k ± 0% 957k ± 0% ~ (p=0.460 n=10+8) Tar 252k ± 0% 252k ± 0% ~ (p=0.133 n=9+10) XML 400k ± 0% 400k ± 0% ~ (p=0.796 n=10+10) [Geo mean] 428k 428k -0.01% Removing all the interface calls helps non-trivially with CPU, though. name old time/op new time/op delta Template 178ms ± 4% 173ms ± 3% -2.90% (p=0.000 n=94+96) Unicode 85.0ms ± 4% 83.9ms ± 4% -1.23% (p=0.000 n=96+96) GoTypes 543ms ± 3% 528ms ± 3% -2.73% (p=0.000 n=98+96) Flate 116ms ± 3% 113ms ± 4% -2.34% (p=0.000 n=96+99) GoParser 144ms ± 3% 140ms ± 4% -2.80% (p=0.000 n=99+97) Reflect 344ms ± 3% 334ms ± 4% -3.02% (p=0.000 n=100+99) Tar 106ms ± 5% 103ms ± 4% -3.30% (p=0.000 n=98+94) XML 198ms ± 5% 192ms ± 4% -2.88% (p=0.000 n=92+95) [Geo mean] 178ms 173ms -2.65% name old user-time/op new user-time/op delta Template 229ms ± 5% 224ms ± 5% -2.36% (p=0.000 n=95+99) Unicode 107ms ± 6% 106ms ± 5% -1.13% (p=0.001 n=93+95) GoTypes 696ms ± 4% 679ms ± 4% -2.45% (p=0.000 n=97+99) Flate 137ms ± 4% 134ms ± 5% -2.66% (p=0.000 n=99+96) GoParser 176ms ± 5% 172ms ± 8% -2.27% (p=0.000 n=98+100) Reflect 430ms ± 6% 411ms ± 5% -4.46% (p=0.000 n=100+92) Tar 128ms ±13% 123ms ±13% -4.21% (p=0.000 n=100+100) XML 239ms ± 6% 233ms ± 6% -2.50% (p=0.000 n=95+97) [Geo mean] 220ms 213ms -2.76% Change-Id: I15c7d6268347f8358e75066dfdbd77db24e8d0c1 Reviewed-on: https://go-review.googlesource.com/42145 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2017-05-09 23:01:51 +00:00
Josh Bleecher Snyder	4ee934ad27	cmd/compile: remove references to *os.File from ssa package This reduces the size of the ssa export data by 10%, from 76154 to 67886. It doesn't appear that #20084, which would do this automatically, is going to be fixed soon. Do it manually for now. This speeds up compiling cmd/compile/internal/amd64 and presumably its comrades as well: name old time/op new time/op delta CompileAMD64 89.6ms ± 6% 86.7ms ± 5% -3.29% (p=0.000 n=49+47) name old user-time/op new user-time/op delta CompileAMD64 116ms ± 5% 112ms ± 5% -3.51% (p=0.000 n=45+42) name old alloc/op new alloc/op delta CompileAMD64 26.7MB ± 0% 25.8MB ± 0% -3.26% (p=0.008 n=5+5) name old allocs/op new allocs/op delta CompileAMD64 223k ± 0% 213k ± 0% -4.46% (p=0.008 n=5+5) Updates #20084 Change-Id: I49e8951c5bfce63ad2b7f4fc3bfa0868c53114f9 Reviewed-on: https://go-review.googlesource.com/41493 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2017-04-24 23:58:14 +00:00
Josh Bleecher Snyder	0323895cc0	cmd/compile: catch and report nowritebarrier violations later Prior to this CL, the SSA backend reported violations of the //go:nowritebarrier annotation immediately. This necessitated emitting errors during SSA compilation, which is not compatible with a concurrent backend. Instead, check for such violations later. We already save the data required to do a late check for violations of the //go:nowritebarrierrec annotation. Use the same data, and check //go:nowritebarrier at the same time. One downside to doing this is that now only a single violation will be reported per function. Given that this is for the runtime only, and violations are rare, this seems an acceptable cost. While we are here, remove several 'nerrors != 0' checks that are rendered pointless. Updates #15756 Fixes #19250 (as much as it ever can be) Change-Id: Ia44c4ad5b6fd6f804d9f88d9571cec8d23665cb3 Reviewed-on: https://go-review.googlesource.com/38973 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2017-03-31 16:31:20 +00:00
Josh Bleecher Snyder	b3a8beb9d1	cmd/compile: minor cleanup in debug code Change-Id: I9885606801b9c8fcb62c16d0856025c4e83e658b Reviewed-on: https://go-review.googlesource.com/38650 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-03-24 22:21:55 +00:00
Matthew Dempsky	325904fe6a	cmd/compile: port liveness analysis to SSA Passes toolstash-check -all. Change-Id: I92c3c25d6c053f971f346f4fa3bbc76419b58183 Reviewed-on: https://go-review.googlesource.com/38087 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2017-03-20 22:58:50 +00:00
Josh Bleecher Snyder	2cdb7f118a	cmd/compile: move Frontend field from ssa.Config to ssa.Func Suggested by mdempsky in CL 38232. This allows us to use the Frontend field to associate frontend state and information with a function. See the following CL in the series for examples. This is a giant CL, but it is almost entirely routine refactoring. The ssa test API is starting to feel a bit unwieldy. I will clean it up separately, once the dust has settled. Passes toolstash -cmp. Updates #15756 Change-Id: I71c573bd96ff7251935fce1391b06b1f133c3caf Reviewed-on: https://go-review.googlesource.com/38327 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2017-03-17 23:18:57 +00:00
Josh Bleecher Snyder	88e47187c1	cmd/compile: relocate code from config.go to func.go This is a follow-up to CL 38167. Pure code movement. Change-Id: I13e58f7eac6718c77076d89e13fc721a5205ec57 Reviewed-on: https://go-review.googlesource.com/38322 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2017-03-17 05:21:53 +00:00
Josh Bleecher Snyder	a5e3cac895	cmd/compile: rearrange fields between ssa.Func, ssa.Cache, and ssa.Config This makes ssa.Func, ssa.Cache, and ssa.Config fulfill the roles laid out for them in CL 38160. The only non-trivial change in this CL is how cached values and blocks get IDs. Prior to this CL, their IDs were assigned as part of resetting the cache, and only modified IDs were reset. This required knowing how many values and blocks were modified, which required a tight coupling between ssa.Func and ssa.Config. To eliminate that coupling, we now zero values and blocks during reset, and assign their IDs when they are used. Since unused values and blocks have ID == 0, we can efficiently find the last used value/block, to avoid zeroing everything. Bulk zeroing is efficient, but not efficient enough to obviate the need to avoid zeroing everything every time. As a happy side-effect, ssa.Func.Free is no longer necessary. DebugHashMatch and friends now belong in func.go. They have been left in place for clarity and review. I will move them in a subsequent CL. Passes toolstash -cmp. No compiler performance impact. No change in 'go test cmd/compile/internal/ssa' execution time. Change-Id: I2eb7af58da067ef6a36e815a6f386cfe8634d098 Reviewed-on: https://go-review.googlesource.com/38167 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2017-03-17 05:21:42 +00:00
Cherry Zhang	c8f38b3398	cmd/compile: use type information in Aux for Store size Remove size AuxInt in Store, and alignment in Move/Zero. We still pass size AuxInt to Move/Zero, as it is used for partial Move/Zero lowering (e.g. cmd/compile/internal/ssa/gen/386.rules:288). SizeAndAlign is gone. Passes "toolstash -cmp" on std. Change-Id: I1ca34652b65dd30de886940e789fcf41d521475d Reviewed-on: https://go-review.googlesource.com/38150 Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2017-03-16 14:25:04 +00:00
Cherry Zhang	1b85300602	cmd/compile: clean up SSA-building code Now that the write barrier insertion is moved to SSA, the SSA building code can be simplified. Updates #17583. Change-Id: I5cacc034b11aa90b0abe6f8dd97e4e3994e2bc25 Reviewed-on: https://go-review.googlesource.com/36840 Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2017-03-16 14:24:40 +00:00
Josh Bleecher Snyder	43afcb5c96	cmd/compile: define roles for ssa.Func, ssa.Config, and ssa.Cache The line between ssa.Func and ssa.Config has blurred. Concurrent compilation in the backend will require more precision. This CL lays out an (aspirational) organization. The implementation will come in follow-up CLs, once the organization is settled. ssa.Config holds basic compiler configuration, mostly arch-specific information. It is configured once, early on, and is readonly, so it is safe for concurrent use. ssa.Func is a single-shot object used for compiling a single Func. It is not concurrency-safe and not re-usable. ssa.Cache is a multi-use object used to avoid expensive allocations during compilation. Each ssa.Func is given an ssa.Cache to use. ssa.Cache is not concurrency-safe. Change-Id: Id02809b6f3541541cac6c27bbb598834888ce1cc Reviewed-on: https://go-review.googlesource.com/38160 Reviewed-by: Keith Randall <khr@golang.org>	2017-03-15 04:27:49 +00:00
David Chase	886e9e6065	cmd/compile: put spills in better places Previously we always issued a spill right after the op that was being spilled. This CL pushes spills father away from the generator, hopefully pushing them into unlikely branches. For example: x = ... if unlikely { call ... } ... use x ... Used to compile to x = ... spill x if unlikely { call ... restore x } It now compiles to x = ... if unlikely { spill x call ... restore x } This is particularly useful for code which appends, as the only call is an unlikely call to growslice. It also helps for the spills needed around write barrier calls. The basic algorithm is walk down the dominator tree following a path where the block still dominates all of the restores. We're looking for a block that: 1) dominates all restores 2) has the value being spilled in a register 3) has a loop depth no deeper than the value being spilled The walking-down code is iterative. I was forced to limit it to searching 100 blocks so it doesn't become O(n^2). Maybe one day we'll find a better way. I had to delete most of David's code which pushed spills out of loops. I suspect this CL subsumes most of the cases that his code handled. Generally positive performance improvements, but hard to tell for sure with all the noise. (compilebench times are unchanged.) name old time/op new time/op delta BinaryTree17-12 2.91s ±15% 2.80s ±12% ~ (p=0.063 n=10+10) Fannkuch11-12 3.47s ± 0% 3.30s ± 4% -4.91% (p=0.000 n=9+10) FmtFprintfEmpty-12 48.0ns ± 1% 47.4ns ± 1% -1.32% (p=0.002 n=9+9) FmtFprintfString-12 85.6ns ±11% 79.4ns ± 3% -7.27% (p=0.005 n=10+10) FmtFprintfInt-12 91.8ns ±10% 85.9ns ± 4% ~ (p=0.203 n=10+9) FmtFprintfIntInt-12 135ns ±13% 127ns ± 1% -5.72% (p=0.025 n=10+9) FmtFprintfPrefixedInt-12 167ns ± 1% 168ns ± 2% ~ (p=0.580 n=9+10) FmtFprintfFloat-12 249ns ±11% 230ns ± 1% -7.32% (p=0.000 n=10+10) FmtManyArgs-12 504ns ± 7% 506ns ± 1% ~ (p=0.198 n=9+9) GobDecode-12 6.95ms ± 1% 7.04ms ± 1% +1.37% (p=0.001 n=10+10) GobEncode-12 6.32ms ±13% 6.04ms ± 1% ~ (p=0.063 n=10+10) Gzip-12 233ms ± 1% 235ms ± 0% +1.01% (p=0.000 n=10+9) Gunzip-12 40.1ms ± 1% 39.6ms ± 0% -1.12% (p=0.000 n=10+8) HTTPClientServer-12 227µs ± 9% 221µs ± 5% ~ (p=0.114 n=9+8) JSONEncode-12 16.1ms ± 2% 15.8ms ± 1% -2.09% (p=0.002 n=9+8) JSONDecode-12 61.8ms ±11% 57.9ms ± 1% -6.30% (p=0.000 n=10+9) Mandelbrot200-12 4.30ms ± 3% 4.28ms ± 1% ~ (p=0.203 n=10+8) GoParse-12 3.18ms ± 2% 3.18ms ± 2% ~ (p=0.579 n=10+10) RegexpMatchEasy0_32-12 76.7ns ± 1% 77.5ns ± 1% +0.92% (p=0.002 n=9+8) RegexpMatchEasy0_1K-12 239ns ± 3% 239ns ± 1% ~ (p=0.204 n=10+10) RegexpMatchEasy1_32-12 71.4ns ± 1% 70.6ns ± 0% -1.15% (p=0.000 n=10+9) RegexpMatchEasy1_1K-12 383ns ± 2% 390ns ±10% ~ (p=0.181 n=8+9) RegexpMatchMedium_32-12 114ns ± 0% 113ns ± 1% -0.88% (p=0.000 n=9+8) RegexpMatchMedium_1K-12 36.3µs ± 1% 36.8µs ± 1% +1.59% (p=0.000 n=10+8) RegexpMatchHard_32-12 1.90µs ± 1% 1.90µs ± 1% ~ (p=0.341 n=10+10) RegexpMatchHard_1K-12 59.4µs ±11% 57.8µs ± 1% ~ (p=0.968 n=10+9) Revcomp-12 461ms ± 1% 462ms ± 1% ~ (p=1.000 n=9+9) Template-12 67.5ms ± 1% 66.3ms ± 1% -1.77% (p=0.000 n=10+8) TimeParse-12 314ns ± 3% 309ns ± 0% -1.56% (p=0.000 n=9+8) TimeFormat-12 340ns ± 2% 331ns ± 1% -2.79% (p=0.000 n=10+10) The go binary is 0.2% larger. Not really sure why the size would change. Change-Id: Ia5116e53a3aeb025ef350ffc51c14ae5cc17871c Reviewed-on: https://go-review.googlesource.com/34822 Reviewed-by: David Chase <drchase@google.com>	2017-03-15 02:09:25 +00:00
Josh Bleecher Snyder	3dcfce8d19	cmd/compile: add OpOffPtr [c] SP to constant cache They accounted for almost 30% of all CSE'd values. By never creating the duplicates in the first place, we reduce the high water mark of Value IDs, which in turn makes all SSA phases cheaper, particularly regalloc. name old time/op new time/op delta Template 200ms ± 3% 198ms ± 4% -0.87% (p=0.016 n=50+49) Unicode 86.9ms ± 2% 85.5ms ± 3% -1.56% (p=0.000 n=49+50) GoTypes 553ms ± 4% 551ms ± 4% ~ (p=0.183 n=50+49) SSA 3.97s ± 3% 3.93s ± 2% -1.06% (p=0.000 n=48+48) Flate 124ms ± 4% 124ms ± 3% ~ (p=0.545 n=48+50) GoParser 146ms ± 4% 146ms ± 4% ~ (p=0.810 n=49+49) Reflect 357ms ± 3% 355ms ± 3% -0.59% (p=0.049 n=50+48) Tar 106ms ± 4% 107ms ± 5% ~ (p=0.454 n=49+50) XML 203ms ± 4% 203ms ± 4% ~ (p=0.726 n=48+50) name old user-ns/op new user-ns/op delta Template 237M ± 3% 235M ± 4% ~ (p=0.208 n=47+48) Unicode 111M ± 4% 108M ± 9% -2.50% (p=0.000 n=47+50) GoTypes 736M ± 5% 729M ± 4% -0.95% (p=0.017 n=50+46) SSA 5.73G ± 4% 5.74G ± 4% ~ (p=0.765 n=50+50) Flate 150M ± 5% 148M ± 6% -0.89% (p=0.045 n=48+47) GoParser 180M ± 5% 178M ± 7% -1.34% (p=0.012 n=50+50) Reflect 450M ± 4% 444M ± 4% -1.40% (p=0.000 n=50+49) Tar 124M ± 7% 123M ± 7% ~ (p=0.092 n=50+50) XML 248M ± 6% 245M ± 5% ~ (p=0.057 n=50+50) name old alloc/op new alloc/op delta Template 39.4MB ± 0% 39.3MB ± 0% -0.37% (p=0.000 n=50+50) Unicode 30.9MB ± 0% 30.9MB ± 0% -0.27% (p=0.000 n=48+50) GoTypes 114MB ± 0% 113MB ± 0% -1.03% (p=0.000 n=50+49) SSA 882MB ± 0% 865MB ± 0% -1.95% (p=0.000 n=49+49) Flate 25.8MB ± 0% 25.7MB ± 0% -0.21% (p=0.000 n=50+50) GoParser 31.7MB ± 0% 31.6MB ± 0% -0.33% (p=0.000 n=50+50) Reflect 79.7MB ± 0% 79.3MB ± 0% -0.49% (p=0.000 n=44+49) Tar 27.2MB ± 0% 27.1MB ± 0% -0.31% (p=0.000 n=50+50) XML 42.7MB ± 0% 42.3MB ± 0% -1.05% (p=0.000 n=48+49) name old allocs/op new allocs/op delta Template 379k ± 1% 380k ± 1% +0.26% (p=0.000 n=50+50) Unicode 324k ± 1% 324k ± 1% ~ (p=0.964 n=49+50) GoTypes 1.14M ± 0% 1.15M ± 0% +0.14% (p=0.000 n=50+49) SSA 7.89M ± 0% 7.89M ± 0% -0.05% (p=0.000 n=49+49) Flate 240k ± 1% 241k ± 1% +0.27% (p=0.001 n=50+50) GoParser 310k ± 1% 311k ± 1% +0.48% (p=0.000 n=50+49) Reflect 1.00M ± 0% 1.00M ± 0% +0.17% (p=0.000 n=48+50) Tar 254k ± 1% 255k ± 1% +0.23% (p=0.005 n=50+50) XML 395k ± 1% 395k ± 1% +0.19% (p=0.002 n=49+47) Change-Id: Iaa8f5f37e23bd81983409f7359f9dcd4dfe2961f Reviewed-on: https://go-review.googlesource.com/38003 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2017-03-10 16:50:58 +00:00
Josh Bleecher Snyder	d11a2184fb	cmd/compile: allow earlier GC of freed constant value Minor fix, because it's the right thing to do. No significant impact. Change-Id: I2138285d397494daa9a88c414149c2a7860edd7e Reviewed-on: https://go-review.googlesource.com/38001 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2017-03-10 01:39:09 +00:00
Josh Bleecher Snyder	c63ad970f6	cmd/compile: rename Func.constVal arg for clarity Values have an Aux and an AuxInt. We're setting AuxInt, not Aux. Say so. Change-Id: I41aa783273bb7e1ba47c941aa4233f818e37dadd Reviewed-on: https://go-review.googlesource.com/37997 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2017-03-09 23:39:01 +00:00
Josh Bleecher Snyder	f791b288d1	cmd/compile: remove some allocs from CSE Pick up a few pennies: * CSE gets run twice for each function, but the set of Aux values doesn't change. Avoid populating it twice. * Don't bother populating auxmap for values that can't be CSE'd anyway. name old alloc/op new alloc/op delta Template 41.0MB ± 0% 40.7MB ± 0% -0.61% (p=0.008 n=5+5) Unicode 32.3MB ± 0% 32.3MB ± 0% -0.22% (p=0.008 n=5+5) GoTypes 122MB ± 0% 121MB ± 0% -0.55% (p=0.008 n=5+5) Compiler 482MB ± 0% 479MB ± 0% -0.58% (p=0.008 n=5+5) SSA 865MB ± 0% 862MB ± 0% -0.35% (p=0.008 n=5+5) Flate 26.5MB ± 0% 26.5MB ± 0% ~ (p=0.056 n=5+5) GoParser 32.6MB ± 0% 32.4MB ± 0% -0.58% (p=0.008 n=5+5) Reflect 84.2MB ± 0% 83.8MB ± 0% -0.57% (p=0.008 n=5+5) Tar 27.7MB ± 0% 27.6MB ± 0% -0.37% (p=0.008 n=5+5) XML 44.7MB ± 0% 44.5MB ± 0% -0.53% (p=0.008 n=5+5) name old allocs/op new allocs/op delta Template 373k ± 0% 373k ± 1% ~ (p=1.000 n=5+5) Unicode 326k ± 0% 325k ± 0% ~ (p=0.548 n=5+5) GoTypes 1.16M ± 0% 1.16M ± 0% ~ (p=0.841 n=5+5) Compiler 4.16M ± 0% 4.15M ± 0% ~ (p=0.222 n=5+5) SSA 7.57M ± 0% 7.56M ± 0% -0.22% (p=0.008 n=5+5) Flate 238k ± 1% 239k ± 1% ~ (p=0.690 n=5+5) GoParser 304k ± 0% 304k ± 0% ~ (p=1.000 n=5+5) Reflect 1.01M ± 0% 1.00M ± 0% -0.31% (p=0.016 n=4+5) Tar 245k ± 0% 245k ± 1% ~ (p=0.548 n=5+5) XML 393k ± 0% 391k ± 1% ~ (p=0.095 n=5+5) Change-Id: I78f1ffe129bd8fd590b7511717dd2bf9f5ecbd6d Reviewed-on: https://go-review.googlesource.com/36690 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Daniel Martí <mvdan@mvdan.cc> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2017-02-09 20:42:46 +00:00
Matthew Dempsky	5c90e1cf8a	cmd/compile/internal/ssa: remove Func.StaticData field Rather than collecting static data nodes to be written out later, just write them out immediately. Change-Id: I51708b690e94bc3e288b4d6ba3307bf738a80f64 Reviewed-on: https://go-review.googlesource.com/36352 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>	2017-02-04 01:09:26 +00:00
Josh Bleecher Snyder	57546d67ec	cmd/compile: add reusable []Location to ssa.Config name old time/op new time/op delta Template 218ms ± 3% 214ms ± 3% -1.70% (p=0.000 n=30+30) Unicode 100ms ± 3% 100ms ± 4% ~ (p=0.614 n=29+30) GoTypes 657ms ± 1% 660ms ± 3% +0.46% (p=0.046 n=29+30) Compiler 2.80s ± 2% 2.80s ± 1% ~ (p=0.451 n=28+29) Flate 131ms ± 2% 132ms ± 4% ~ (p=1.000 n=29+29) GoParser 159ms ± 3% 160ms ± 5% ~ (p=0.341 n=28+30) Reflect 406ms ± 3% 408ms ± 4% ~ (p=0.511 n=28+30) Tar 118ms ± 4% 118ms ± 4% ~ (p=0.827 n=29+30) XML 222ms ± 6% 222ms ± 3% ~ (p=0.532 n=30+30) name old user-ns/op new user-ns/op delta Template 274user-ms ± 3% 272user-ms ± 3% -0.87% (p=0.015 n=29+30) Unicode 140user-ms ± 4% 140user-ms ± 3% ~ (p=0.735 n=29+30) GoTypes 890user-ms ± 1% 897user-ms ± 2% +0.88% (p=0.002 n=29+30) Compiler 3.88user-s ± 2% 3.89user-s ± 1% ~ (p=0.132 n=30+29) Flate 168user-ms ± 2% 157user-ms ± 4% -6.21% (p=0.000 n=25+28) GoParser 211user-ms ± 2% 213user-ms ± 5% ~ (p=0.086 n=28+30) Reflect 539user-ms ± 2% 541user-ms ± 3% ~ (p=0.267 n=27+29) Tar 156user-ms ± 7% 155user-ms ± 5% ~ (p=0.708 n=30+30) XML 291user-ms ± 5% 294user-ms ± 3% +0.83% (p=0.029 n=29+30) name old alloc/op new alloc/op delta Template 40.7MB ± 0% 39.4MB ± 0% -3.26% (p=0.000 n=29+26) Unicode 30.8MB ± 0% 30.7MB ± 0% -0.40% (p=0.000 n=28+30) GoTypes 123MB ± 0% 119MB ± 0% -3.47% (p=0.000 n=30+29) Compiler 472MB ± 0% 455MB ± 0% -3.60% (p=0.000 n=30+30) Flate 26.5MB ± 0% 25.6MB ± 0% -3.21% (p=0.000 n=28+30) GoParser 32.3MB ± 0% 31.4MB ± 0% -2.98% (p=0.000 n=29+30) Reflect 84.4MB ± 0% 82.1MB ± 0% -2.83% (p=0.000 n=30+30) Tar 27.3MB ± 0% 26.5MB ± 0% -2.70% (p=0.000 n=29+29) XML 44.6MB ± 0% 43.1MB ± 0% -3.49% (p=0.000 n=30+30) name old allocs/op new allocs/op delta Template 401k ± 1% 399k ± 0% -0.35% (p=0.000 n=30+28) Unicode 331k ± 0% 331k ± 1% ~ (p=0.907 n=28+30) GoTypes 1.24M ± 0% 1.23M ± 0% -0.43% (p=0.000 n=30+30) Compiler 4.26M ± 0% 4.25M ± 0% -0.34% (p=0.000 n=29+30) Flate 252k ± 1% 251k ± 1% -0.41% (p=0.000 n=30+30) GoParser 325k ± 1% 324k ± 1% -0.31% (p=0.000 n=27+30) Reflect 1.06M ± 0% 1.05M ± 0% -0.69% (p=0.000 n=30+30) Tar 266k ± 1% 265k ± 1% -0.51% (p=0.000 n=29+30) XML 416k ± 1% 415k ± 1% -0.36% (p=0.002 n=30+30) Change-Id: I8f784001324df83b2764c44f0e83a540e5beab34 Reviewed-on: https://go-review.googlesource.com/36212 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> Reviewed-by: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-02-02 22:39:32 +00:00

1 2 3

102 commits