Stowage/go - Remotebranch.eu

Stowage/go

mirror of https://github.com/golang/go.git synced 2025-12-08 06:10:04 +00:00

Author	SHA1	Message	Date
Josh Bleecher Snyder	46b88c9fbc	cmd/compile: change ssa.Type into types.Type When package ssa was created, Type was in package gc. To avoid circular dependencies, we used an interface (ssa.Type) to represent type information in SSA. In the Go 1.9 cycle, gri extricated the Type type from package gc. As a result, we can now use it in package ssa. Now, instead of package types depending on package ssa, it is the other way. This is a more sensible dependency tree, and helps compiler performance a bit. Though this is a big CL, most of the changes are mechanical and uninteresting. Interesting bits: Add new singleton globals to package types for the special SSA types Memory, Void, Invalid, Flags, and Int128. * Add two new Types, TSSA for the special types, and TTUPLE, for SSA tuple types. ssa.MakeTuple is now types.NewTuple. * Move type comparison result constants CMPlt, CMPeq, and CMPgt to package types. * We had picked the name "types" in our rules for the handy list of types provided by ssa.Config. That conflicted with the types package name, so change it to "typ". * Update the type comparison routine to handle tuples and special types inline. * Teach gc/fmt.go how to print special types. * We can now eliminate ElemTypes in favor of just Elem, and probably also some other duplicated Type methods designed to return ssa.Type instead of types.Type. The ssa tests were using their own dummy types, and they were not particularly careful about types in general. Of necessity, this CL switches them to use *types.Type; it does not make them more type-accurate. Unfortunately, using types.Type means initializing a bit of the types universe. This is prime for refactoring and improvement. This shrinks ssa.Value; it now fits in a smaller size class on 64 bit systems. This doesn't have a giant impact, though, since most Values are preallocated in a chunk. name old alloc/op new alloc/op delta Template 37.9MB ± 0% 37.7MB ± 0% -0.57% (p=0.000 n=10+8) Unicode 28.9MB ± 0% 28.7MB ± 0% -0.52% (p=0.000 n=10+10) GoTypes 110MB ± 0% 109MB ± 0% -0.88% (p=0.000 n=10+10) Flate 24.7MB ± 0% 24.6MB ± 0% -0.66% (p=0.000 n=10+10) GoParser 31.1MB ± 0% 30.9MB ± 0% -0.61% (p=0.000 n=10+9) Reflect 73.9MB ± 0% 73.4MB ± 0% -0.62% (p=0.000 n=10+8) Tar 25.8MB ± 0% 25.6MB ± 0% -0.77% (p=0.000 n=9+10) XML 41.2MB ± 0% 40.9MB ± 0% -0.80% (p=0.000 n=10+10) [Geo mean] 40.5MB 40.3MB -0.68% name old allocs/op new allocs/op delta Template 385k ± 0% 386k ± 0% ~ (p=0.356 n=10+9) Unicode 343k ± 1% 344k ± 0% ~ (p=0.481 n=10+10) GoTypes 1.16M ± 0% 1.16M ± 0% -0.16% (p=0.004 n=10+10) Flate 238k ± 1% 238k ± 1% ~ (p=0.853 n=10+10) GoParser 320k ± 0% 320k ± 0% ~ (p=0.720 n=10+9) Reflect 957k ± 0% 957k ± 0% ~ (p=0.460 n=10+8) Tar 252k ± 0% 252k ± 0% ~ (p=0.133 n=9+10) XML 400k ± 0% 400k ± 0% ~ (p=0.796 n=10+10) [Geo mean] 428k 428k -0.01% Removing all the interface calls helps non-trivially with CPU, though. name old time/op new time/op delta Template 178ms ± 4% 173ms ± 3% -2.90% (p=0.000 n=94+96) Unicode 85.0ms ± 4% 83.9ms ± 4% -1.23% (p=0.000 n=96+96) GoTypes 543ms ± 3% 528ms ± 3% -2.73% (p=0.000 n=98+96) Flate 116ms ± 3% 113ms ± 4% -2.34% (p=0.000 n=96+99) GoParser 144ms ± 3% 140ms ± 4% -2.80% (p=0.000 n=99+97) Reflect 344ms ± 3% 334ms ± 4% -3.02% (p=0.000 n=100+99) Tar 106ms ± 5% 103ms ± 4% -3.30% (p=0.000 n=98+94) XML 198ms ± 5% 192ms ± 4% -2.88% (p=0.000 n=92+95) [Geo mean] 178ms 173ms -2.65% name old user-time/op new user-time/op delta Template 229ms ± 5% 224ms ± 5% -2.36% (p=0.000 n=95+99) Unicode 107ms ± 6% 106ms ± 5% -1.13% (p=0.001 n=93+95) GoTypes 696ms ± 4% 679ms ± 4% -2.45% (p=0.000 n=97+99) Flate 137ms ± 4% 134ms ± 5% -2.66% (p=0.000 n=99+96) GoParser 176ms ± 5% 172ms ± 8% -2.27% (p=0.000 n=98+100) Reflect 430ms ± 6% 411ms ± 5% -4.46% (p=0.000 n=100+92) Tar 128ms ±13% 123ms ±13% -4.21% (p=0.000 n=100+100) XML 239ms ± 6% 233ms ± 6% -2.50% (p=0.000 n=95+97) [Geo mean] 220ms 213ms -2.76% Change-Id: I15c7d6268347f8358e75066dfdbd77db24e8d0c1 Reviewed-on: https://go-review.googlesource.com/42145 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2017-05-09 23:01:51 +00:00
Josh Bleecher Snyder	6a24b2d0c1	cmd/compile: add boolean simplification rules These collectively fire a few hundred times during make.bash, mostly rewriting XOR SETNE -> SETEQ. Fixes #17905. Change-Id: Ic5eb241ee93ed67099da3de11f59e4df9fab64a3 Reviewed-on: https://go-review.googlesource.com/42491 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2017-05-09 21:50:55 +00:00
Marvin Stenger	9aeced650f	cmd/compile/internal/ssa: mark boolean instructions commutative Mark AndB, OrB, EqB, and NeqB as commutative. Change-Id: Ife7cfcb9780cc5dd669617cb52339ab336667da4 Reviewed-on: https://go-review.googlesource.com/42515 Reviewed-by: Giovanni Bajo <rasky@develer.com> Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com> Reviewed-by: Keith Randall <khr@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-05-09 21:31:38 +00:00
Josh Bleecher Snyder	00db0cbf86	cmd/compile: add minor bit twiddling optimizations Noticed while adding to the bitset implementation in cmd/compile/internal/gc. The (Com (Const)) optimizations were already present in the AMD64 lowered optimizations. They trigger 118, 44, 262, and 108 times respectively for int sizes 8, 16, 32, and 64 in a run of make.bash. The (Or (And)) optimization is new. It triggers 3 times for int size 8 and once for int size 64 during make.bash, in packages internal/poll, reflect, encoding/asn1, and go/types, so there is a bit of natural test coverage. Change-Id: I44072864ff88831d5ec7dce37c516d29df056e98 Reviewed-on: https://go-review.googlesource.com/41758 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2017-05-01 17:18:06 +00:00
Josh Bleecher Snyder	dae5389d3d	Revert "cmd/compile: add Type.MustSize and Type.MustAlignment" This reverts commit `94d540a4b6`. Reason for revert: prefer something along the lines of CL 42018. Change-Id: I876fe32e98f37d8d725fe55e0fd0ea429c0198e0 Reviewed-on: https://go-review.googlesource.com/42022 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2017-04-28 01:24:13 +00:00
Josh Bleecher Snyder	94d540a4b6	cmd/compile: add Type.MustSize and Type.MustAlignment Type.Size and Type.Alignment are for the front end: They calculate size and alignment if needed. Type.MustSize and Type.MustAlignment are for the back end: They call Fatal if size and alignment are not already calculated. Most uses are of MustSize and MustAlignment, but that's because the back end is newer, and this API was added to support it. This CL was mostly generated with sed and selective reversion. The only mildly interesting bit is the change of the ssa.Type interface and the supporting ssa dummy types. Follow-up to review feedback on CL 41970. Passes toolstash-check. Change-Id: I0d9b9505e57453dae8fb6a236a07a7a02abd459e Reviewed-on: https://go-review.googlesource.com/42016 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> Reviewed-by: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-04-27 22:57:57 +00:00
Josh Bleecher Snyder	fc7b83d192	cmd/compile: break up large value rewrite functions This makes the cmd/compile/internal/ssa package compile much faster, and has no impact on the speed of the compiler. The chunk size was selected empirically, in that at chunk size 10, the object file was smaller than at chunk size 5 or 20. name old time/op new time/op delta SSA 7.33s ± 5% 5.64s ± 1% -23.10% (p=0.000 n=10+10) name old user-time/op new user-time/op delta SSA 9.70s ± 1% 8.04s ± 2% -17.17% (p=0.000 n=9+10) name old obj-bytes new obj-bytes delta SSA 9.82M ± 0% 8.28M ± 0% -15.67% (p=0.000 n=10+10) Change-Id: Iab472905da3f0e82f3db2c93d06e2759abc9dd44 Reviewed-on: https://go-review.googlesource.com/41296 Reviewed-by: Keith Randall <khr@golang.org>	2017-04-21 13:13:22 +00:00
Josh Bleecher Snyder	eaa198f3d1	cmd/compile: stop generating block successor vars in rewrite rules They are left over from the days before we had BlockKindFirst and swapSuccessors. Change-Id: I9259d53ac2821ca4d5de5dd520ca4b78f52ecad4 Reviewed-on: https://go-review.googlesource.com/41206 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Daniel Martí <mvdan@mvdan.cc> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2017-04-21 04:11:51 +00:00
Matthew Dempsky	1e3570ac86	cmd/internal/objabi: extract shared functionality from obj Now only cmd/asm and cmd/compile depend on cmd/internal/obj. Changing the assembler backends no longer requires reinstalling cmd/link or cmd/addr2line. There's also now one canonical definition of the object file format in cmd/internal/objabi/doc.go, with a warning to update all three implementations. objabi is still something of a grab bag of unrelated code (e.g., flag and environment variable handling probably belong in a separate "tool" package), but this is still progress. Fixes #15165. Fixes #20026. Change-Id: Ic4b92fac7d0d35438e0d20c9579aad4085c5534c Reviewed-on: https://go-review.googlesource.com/40972 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>	2017-04-19 00:00:09 +00:00
Keith Randall	53f8a6aeb0	cmd/compile: automatically handle commuting ops in rewrite rules Note that this is a redo of an undo of the original buggy CL 38666. We have lots of rewrite rules that vary only in the fact that we have 2 versions for the 2 different orderings of various commuting ops. For example: (ADDL x (MOVLconst [c])) -> (ADDLconst [c] x) (ADDL (MOVLconst [c]) x) -> (ADDLconst [c] x) It can get unwieldly quickly, especially when there is more than one commuting op in a rule. Our existing "fix" for this problem is to have rules that canonicalize the operations first. For example: (Eq64 x (Const64 <t> [c])) && x.Op != OpConst64 -> (Eq64 (Const64 <t> [c]) x) Subsequent rules can then assume if there is a constant arg to Eq64, it will be the first one. This fix kinda works, but it is fragile and only works when we remember to include the required extra rules. The fundamental problem is that the rule matcher doesn't know anything about commuting ops. This CL fixes that fact. We already have information about which ops commute. (The register allocator takes advantage of commutivity.) The rule generator now automatically generates multiple rules for a single source rule when there are commutative ops in the rule. We can now drop all of our almost-duplicate source-level rules and the canonicalization rules. I have some CLs in progress that will be a lot less verbose when the rule generator handles commutivity for me. I had to reorganize the load-combining rules a bit. The 8-way OR rules generated 128 different reorderings, which was causing the generator to put too much code in the rewrite*.go files (the big ones were going from 25K lines to 132K lines). Instead I reorganized the rules to combine pairs of loads at a time. The generated rule files are now actually a bit (5%) smaller. Make.bash times are ~unchanged. Compiler benchmarks are not observably different. Probably because we don't spend much compiler time in rule matching anyway. I've also done a pass over all of our ops adding commutative markings for ops which hadn't had them previously. Fixes #18292 Change-Id: Ic1c0e43fbf579539f459971625f69690c9ab8805 Reviewed-on: https://go-review.googlesource.com/38801 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>	2017-04-03 22:03:43 +00:00
Keith Randall	63a72fd447	cmd/compile: strength-reduce floating point x2 -> x+x x/c, c power of 2 -> x(1/c) Fixes #19827 Change-Id: I74c9f0b5b49b2ed26c0990314c7d1d5f9631b6f1 Reviewed-on: https://go-review.googlesource.com/39295 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>	2017-04-03 21:27:03 +00:00
Brad Fitzpatrick	69fe9ea43e	cmd/compile/internal/ssa: use recently agreed upon generated code header Updates #13560 Change-Id: I9bc08ca5cf0627e653d55f748ebb83be8b69ea3b Reviewed-on: https://go-review.googlesource.com/39296 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Rob Pike <r@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-04-03 18:04:41 +00:00
Ben Shi	8577f81a10	cmd/compile/internal: Optimization with RBIT and REV By checking GOARM in ssa/gen/ARM.rules, each intermediate operator can be implemented via different instruction serials. It is up to the user to choose between compitability and efficiency. The Bswap32(x) is optimized to REV(x) when GOARM >= 6. The CTZ(x) is optimized to CLZ(RBIT x) when GOARM == 7. Change-Id: Ie9ee645fa39333fa79ad84ed4d1cefac30422814 Reviewed-on: https://go-review.googlesource.com/35610 Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2017-03-31 15:10:24 +00:00
Keith Randall	68da265c8e	Revert "cmd/compile: automatically handle commuting ops in rewrite rules" This reverts commit `041ecb697f`. Reason for revert: Not working on S390x and some 386 archs. I have a guess why the S390x is failing. No clue on the 386 yet. Revert until I can figure it out. Change-Id: I64f1ce78fa6d1037ebe7ee2a8a8107cb4c1db70c Reviewed-on: https://go-review.googlesource.com/38790 Reviewed-by: Keith Randall <khr@golang.org>	2017-03-29 18:06:44 +00:00
Keith Randall	041ecb697f	cmd/compile: automatically handle commuting ops in rewrite rules We have lots of rewrite rules that vary only in the fact that we have 2 versions for the 2 different orderings of various commuting ops. For example: (ADDL x (MOVLconst [c])) -> (ADDLconst [c] x) (ADDL (MOVLconst [c]) x) -> (ADDLconst [c] x) It can get unwieldly quickly, especially when there is more than one commuting op in a rule. Our existing "fix" for this problem is to have rules that canonicalize the operations first. For example: (Eq64 x (Const64 <t> [c])) && x.Op != OpConst64 -> (Eq64 (Const64 <t> [c]) x) Subsequent rules can then assume if there is a constant arg to Eq64, it will be the first one. This fix kinda works, but it is fragile and only works when we remember to include the required extra rules. The fundamental problem is that the rule matcher doesn't know anything about commuting ops. This CL fixes that fact. We already have information about which ops commute. (The register allocator takes advantage of commutivity.) The rule generator now automatically generates multiple rules for a single source rule when there are commutative ops in the rule. We can now drop all of our almost-duplicate source-level rules and the canonicalization rules. I have some CLs in progress that will be a lot less verbose when the rule generator handles commutivity for me. I had to reorganize the load-combining rules a bit. The 8-way OR rules generated 128 different reorderings, which was causing the generator to put too much code in the rewrite*.go files (the big ones were going from 25K lines to 132K lines). Instead I reorganized the rules to combine pairs of loads at a time. The generated rule files are now actually a bit (5%) smaller. [Note to reviewers: check these carefully. Most of the other rule changes are trivial.] Make.bash times are ~unchanged. Compiler benchmarks are not observably different. Probably because we don't spend much compiler time in rule matching anyway. I've also done a pass over all of our ops adding commutative markings for ops which hadn't had them previously. Fixes #18292 Change-Id: I999b1307272e91965b66754576019dedcbe7527a Reviewed-on: https://go-review.googlesource.com/38666 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>	2017-03-29 16:22:09 +00:00
philhofer	d68bb16b1e	cmd/compile/internal/ssa: recognize constant pointer comparison Teach the backend to recognize that the address of a symbol is equal with itself, and that the addresses of two different symbols are different. Some examples of where this rule hits in the standard library: - inlined uses of (*time.Time).setLoc (e.g. time.UTC) - inlined uses of bufio.NewReader (via type assertion) Change-Id: I23dcb068c2ec333655c1292917bec13bbd908c24 Reviewed-on: https://go-review.googlesource.com/38338 Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com> Reviewed-by: Keith Randall <khr@golang.org> Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-03-20 21:04:44 +00:00
Josh Bleecher Snyder	aea3aff669	cmd/compile: separate ssa.Frontend and ssa.TypeSource Prior to this CL, the ssa.Frontend field was responsible for providing types to the backend during compilation. However, the types needed by the backend are few and static. It makes more sense to use a struct for them and to hang that struct off the ssa.Config, which is the correct home for readonly data. Now that Types is a struct, we can clean up the names a bit as well. This has the added benefit of allowing early construction of all types needed by the backend. This will be useful for concurrent backend compilation. Passes toolstash-check -all. No compiler performance change. Updates #15756 Change-Id: I021658c8cf2836d6a22bbc20cc828ac38c7da08a Reviewed-on: https://go-review.googlesource.com/38336 Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2017-03-19 00:21:23 +00:00
Josh Bleecher Snyder	2cdb7f118a	cmd/compile: move Frontend field from ssa.Config to ssa.Func Suggested by mdempsky in CL 38232. This allows us to use the Frontend field to associate frontend state and information with a function. See the following CL in the series for examples. This is a giant CL, but it is almost entirely routine refactoring. The ssa test API is starting to feel a bit unwieldy. I will clean it up separately, once the dust has settled. Passes toolstash -cmp. Updates #15756 Change-Id: I71c573bd96ff7251935fce1391b06b1f133c3caf Reviewed-on: https://go-review.googlesource.com/38327 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2017-03-17 23:18:57 +00:00
Josh Bleecher Snyder	193510f2f6	cmd/compile: evaluate config as needed in rewrite rules Prior to this CL, config was an explicit argument to the SSA rewrite rules, and rules that needed a Frontend got at it via config. An upcoming CL moves Frontend from Config to Func, so rules can no longer reach Frontend via Config. Passing a Frontend as an argument to the rewrite rules causes a 2-3% regression in compile times. This CL takes a different approach: It treats the variable names "config" and "fe" as special and calculates them as needed. The "as needed part" is also important to performance: If they are calculated eagerly, the nilchecks themselves cause a regression. This introduces a little bit of magic into the rewrite generator. However, from the perspective of the rules, the config variable was already more or less magic. And it makes the upcoming changes much clearer. Passes toolstash -cmp. Change-Id: I173f2bcc124cba43d53138bfa3775e21316a9107 Reviewed-on: https://go-review.googlesource.com/38326 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2017-03-17 22:35:29 +00:00
Cherry Zhang	c8f38b3398	cmd/compile: use type information in Aux for Store size Remove size AuxInt in Store, and alignment in Move/Zero. We still pass size AuxInt to Move/Zero, as it is used for partial Move/Zero lowering (e.g. cmd/compile/internal/ssa/gen/386.rules:288). SizeAndAlign is gone. Passes "toolstash -cmp" on std. Change-Id: I1ca34652b65dd30de886940e789fcf41d521475d Reviewed-on: https://go-review.googlesource.com/38150 Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2017-03-16 14:25:04 +00:00
Cherry Zhang	211c8c9f1a	cmd/compile: pass types on SSA Store/Move/Zero ops For SSA Store/Move/Zero ops, attach the type of the value being stored to the op as the Aux field. This type will be used for write barrier insertion (in a followup CL). Since SSA passes do not accurately propagate types of values (because of type casting), we can't simply use type of the store's arguments for write barrier insertion. Passes "toolstash -cmp" on std. Updates #17583. Change-Id: I051d5e5c482931640d1d7d879b2a6bb91f2e0056 Reviewed-on: https://go-review.googlesource.com/36838 Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2017-03-16 14:22:53 +00:00
philhofer	295307ae78	cmd/compile: de-virtualize interface calls With this change, code like h := sha1.New() h.Write(buf) sum := h.Sum() gets compiled into static calls rather than interface calls, because the compiler is able to prove that 'h' is really a *sha1.digest. The InterCall re-write rule hits a few dozen times during make.bash, and hundreds of times during all.bash. The most common pattern identified by the compiler is a constructor like func New() Interface { return &impl{...} } where the constructor gets inlined into the caller, and the result is used immediately. Examples include {sha1,md5,crc32,crc64,...}.New, base64.NewEncoder, base64.NewDecoder, errors.New, net.Pipe, and so on. Some existing benchmarks that change on darwin/amd64: Crc64/ISO4KB-8 2.67µs ± 1% 2.66µs ± 0% -0.36% (p=0.015 n=10+10) Crc64/ISO1KB-8 694ns ± 0% 690ns ± 1% -0.59% (p=0.001 n=10+10) Adler32KB-8 473ns ± 1% 471ns ± 0% -0.39% (p=0.010 n=10+9) On architectures like amd64, the reduction in code size appears to contribute more to benchmark improvements than just removing the indirect call, since that branch gets predicted accurately when called in a loop. Updates #19361 Change-Id: I57d4dc21ef40a05ec0fbd55a9bb0eb74cdc67a3d Reviewed-on: https://go-review.googlesource.com/38139 Run-TryBot: Philip Hofer <phofer@umich.edu> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>	2017-03-14 18:49:23 +00:00
David Chase	b59a405656	Revert "cmd/compile: de-virtualize interface calls" This reverts commit `4e0c7c3f61`. Reason for revert: The presence-of-optimization test program is fragile, breaks under noopt, and might break if the Go libraries are tweaked. It needs to be (re)written without reference to other packages. Change-Id: I3aaf1ab006a1a255f961a978e9c984341740e3c7 Reviewed-on: https://go-review.googlesource.com/38097 Reviewed-by: Keith Randall <khr@golang.org>	2017-03-13 21:15:32 +00:00
Philip Hofer	4e0c7c3f61	cmd/compile: de-virtualize interface calls With this change, code like h := sha1.New() h.Write(buf) sum := h.Sum() gets compiled into static calls rather than interface calls, because the compiler is able to prove that 'h' is really a *sha1.digest. The InterCall re-write rule hits a few dozen times during make.bash, and hundreds of times during all.bash. The most common pattern identified by the compiler is a constructor like func New() Interface { return &impl{...} } where the constructor gets inlined into the caller, and the result is used immediately. Examples include {sha1,md5,crc32,crc64,...}.New, base64.NewEncoder, base64.NewDecoder, errors.New, net.Pipe, and so on. Some existing benchmarks that change on darwin/amd64: Crc64/ISO4KB-8 2.67µs ± 1% 2.66µs ± 0% -0.36% (p=0.015 n=10+10) Crc64/ISO1KB-8 694ns ± 0% 690ns ± 1% -0.59% (p=0.001 n=10+10) Adler32KB-8 473ns ± 1% 471ns ± 0% -0.39% (p=0.010 n=10+9) On architectures like amd64, the reduction in code size appears to contribute more to benchmark improvements than just removing the indirect call, since that branch gets predicted accurately when called in a loop. Updates #19361 Change-Id: Ia9d30afdd5f6b4d38d38b14b88f308acae8ce7ed Reviewed-on: https://go-review.googlesource.com/37751 Run-TryBot: Philip Hofer <phofer@umich.edu> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2017-03-13 18:24:57 +00:00
Josh Bleecher Snyder	4b3a55ec02	cmd/compile: add 32 bit (AddPtr (Const)) rule This triggers about 50k times during 32 bit make.bash. Change-Id: Ia0c2b1a8246b92173b4b0d94a4037626f76b6e73 Reviewed-on: https://go-review.googlesource.com/37998 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2017-03-10 17:24:06 +00:00
Michael Munday	945180fe2a	cmd/compile: fix OffPtr type in 2-field struct Store rule The type of the OffPtr for the first field was incorrect. It should have been a pointer to the field type, rather than the field type itself. Fixes #19475. Change-Id: I3960b404da0f4bee759331126cce6140d2ce1df7 Reviewed-on: https://go-review.googlesource.com/37869 Run-TryBot: Michael Munday <munday@ca.ibm.com> Reviewed-by: Keith Randall <khr@golang.org>	2017-03-09 19:09:56 +00:00
philhofer	a6bd42f263	cmd/compile: emit OffPtr for first field in SSA'd structs Given (Store [c] (OffPtr <T1> [0] (Addr <T> _)) _ (Store [c] (Addr <T> _) _ _)) dead store elimination doesn't eliminate the inner Store, because it addresses a type of a different width than the first store. When decomposing StructMake operations, always generate an OffPtr to address struct fields so that dead stores to the first field of the struct can be optimized away. benchmarks affected on darwin/amd64: HTTPClientServer-8 73.2µs ± 1% 72.7µs ± 1% -0.69% (p=0.022 n=9+10) TimeParse-8 304ns ± 1% 300ns ± 0% -1.61% (p=0.000 n=9+9) RegexpMatchEasy1_32-8 80.1ns ± 0% 79.5ns ± 1% -0.84% (p=0.000 n=8+9) GobDecode-8 6.78ms ± 0% 6.81ms ± 1% +0.46% (p=0.000 n=9+10) Gunzip-8 36.1ms ± 1% 36.2ms ± 0% +0.37% (p=0.019 n=10+10) JSONEncode-8 15.6ms ± 0% 15.7ms ± 0% +0.69% (p=0.000 n=9+10) Change-Id: Ia80d73fd047f9400c616ca64fdee4f438a0e7f21 Reviewed-on: https://go-review.googlesource.com/37769 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2017-03-06 23:47:59 +00:00
Giovanni Bajo	4fc45ae879	cmd/compile: improve generic rules for BCE based on AND operations. Match more patterns generated by the compiler where the index for a bound check is bounded through a AND operation, with different register sizes. These rules trigger a dozen of times in a bootstrap. Change-Id: Ic9fff16f21d08580f19a366c3ee1a372e58357d1 Reviewed-on: https://go-review.googlesource.com/37442 TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2017-03-04 16:13:07 +00:00
Cherry Zhang	c8eaeb8cba	cmd/compile: remove zeroing after newobject The Zero op right after newobject has been removed. But this rule does not cover Store of constant zero (for SSA-able types). Add rules to cover Store op as well. Updates #19027. Change-Id: I5d2b62eeca0aa9ce8dc7205b264b779de01c660b Reviewed-on: https://go-review.googlesource.com/36836 Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2017-03-03 20:36:54 +00:00
Cherry Zhang	9b480521d8	cmd/compile: fix optimization of Zero newobject on amd64p32 On amd64p32, PtrSize and RegSize don't agree, and function return value is aligned with RegSize. Fix this rule. Other architectures are not affected, where PtrSize and RegSize are the same. Change-Id: If187d3dfde3dc3b931b8e97db5eeff49a781551b Reviewed-on: https://go-review.googlesource.com/37720 Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2017-03-03 20:36:31 +00:00
philhofer	379567aad1	cmd/compile/ssa: more aggressive constant folding Add rewrite rules that canonicalize the location of constants in expressions, and fold conststants that appear in operations that can be trivially reassociated. After this change, the compiler constant-folds expressions like "4 + x - 1" and "4 & x & 1" Benchmarks affected on darwin/amd64: name old time/op new time/op delta FmtFprintfInt-8 82.1ns ± 1% 81.7ns ± 1% -0.46% (p=0.023 n=8+9) FmtFprintfIntInt-8 122ns ± 2% 120ns ± 2% -1.48% (p=0.047 n=10+10) FmtManyArgs-8 493ns ± 0% 486ns ± 1% -1.37% (p=0.000 n=8+10) Gzip-8 230ms ± 0% 229ms ± 1% -0.46% (p=0.001 n=10+9) HTTPClientServer-8 74.5µs ± 1% 73.7µs ± 1% -1.11% (p=0.000 n=10+10) JSONDecode-8 51.7ms ± 0% 51.9ms ± 1% +0.42% (p=0.017 n=10+9) RegexpMatchEasy0_32-8 82.6ns ± 1% 81.7ns ± 0% -1.02% (p=0.000 n=9+8) RegexpMatchMedium_32-8 121ns ± 1% 120ns ± 1% -1.48% (p=0.001 n=10+10) Revcomp-8 426ms ± 1% 400ms ± 1% -6.16% (p=0.000 n=10+10) TimeFormat-8 330ns ± 1% 327ns ± 0% -0.82% (p=0.000 n=10+10) name old speed new speed delta Gzip-8 84.4MB/s ± 0% 84.8MB/s ± 1% +0.47% (p=0.001 n=10+9) JSONDecode-8 37.6MB/s ± 0% 37.4MB/s ± 1% -0.42% (p=0.016 n=10+9) RegexpMatchEasy0_32-8 387MB/s ± 1% 392MB/s ± 0% +1.06% (p=0.000 n=9+8) RegexpMatchMedium_32-8 8.21MB/s ± 1% 8.34MB/s ± 1% +1.58% (p=0.000 n=10+9) Revcomp-8 597MB/s ± 1% 636MB/s ± 1% +6.57% (p=0.000 n=10+10) Change-Id: Ie37ff91605b76a984a8400dfd1e34f50bf61c864 Reviewed-on: https://go-review.googlesource.com/37290 Reviewed-by: Keith Randall <khr@golang.org> Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-02-28 20:25:33 +00:00
Michael Munday	bd8a39b67a	cmd/compile: emit fused multiply-{add,subtract} instructions on s390x Explcitly block fused multiply-add pattern matching when a cast is used after the multiplication, for example: - (a * b) + c // can emit fused multiply-add - float64(a * b) + c // cannot emit fused multiply-add float{32,64} and complex{64,128} casts of matching types are now kept as OCONV operations rather than being replaced with OCONVNOP operations because they now imply a rounding operation (and therefore aren't a no-op anymore). Operations (for example, multiplication) on complex types may utilize fused multiply-add and -subtract instructions internally. There is no way to disable this behavior at the moment. Improves the performance of the floating point implementation of poly1305: name old speed new speed delta 64 246MB/s ± 0% 275MB/s ± 0% +11.48% (p=0.000 n=10+8) 1K 312MB/s ± 0% 357MB/s ± 0% +14.41% (p=0.000 n=10+10) 64Unaligned 246MB/s ± 0% 274MB/s ± 0% +11.43% (p=0.000 n=10+10) 1KUnaligned 312MB/s ± 0% 357MB/s ± 0% +14.39% (p=0.000 n=10+8) Updates #17895. Change-Id: Ia771d275bb9150d1a598f8cc773444663de5ce16 Reviewed-on: https://go-review.googlesource.com/36963 Run-TryBot: Michael Munday <munday@ca.ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2017-02-28 15:34:20 +00:00
Josh Bleecher Snyder	417f49a363	cmd/compile: fold (NegNN (ConstNN ...)) Fix up and enable a few rules. They trigger a handful of times in std, despite the frontend handling. Change-Id: I83378c057cbbc95a4f2b58cd8c36aec0e9dc547f Reviewed-on: https://go-review.googlesource.com/37227 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2017-02-28 04:43:35 +00:00
Michael Munday	72a071c1da	cmd/compile: rewrite pairs of shifts to extensions Replaces pairs of shifts with sign/zero extension where possible. For example: (uint64(x) << 32) >> 32 -> uint64(uint32(x)) Reduces the execution time of the following code by ~4.5% on s390x: for i := 0; i < N; i++ { x += (uint64(i)<<32)>>32 } Change-Id: Idb2d56f27e80a2e1366bc995922ad3fd958c51a7 Reviewed-on: https://go-review.googlesource.com/37292 Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>	2017-02-22 21:31:03 +00:00
Keith Randall	a9292b833b	cmd/compile: fix 32-bit unsigned division on 64-bit machines The type of an intermediate multiply was wrong. When that intermediate multiply was spilled, the top 32 bits were lost. Fixes #19153 Change-Id: Ib29350a4351efa405935b7f7ee3c112668e64108 Reviewed-on: https://go-review.googlesource.com/37212 Run-TryBot: Keith Randall <khr@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2017-02-17 22:21:04 +00:00
Keith Randall	708ba22a0c	cmd/compile: move constant divide strength reduction to SSA rules Currently the conversion from constant divides to multiplies is mostly done during the walk pass. This is suboptimal because SSA can determine that the value being divided by is constant more often (e.g. after inlining). Change-Id: If1a9b993edd71be37396b9167f77da271966f85f Reviewed-on: https://go-review.googlesource.com/37015 Run-TryBot: Keith Randall <khr@golang.org> Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>	2017-02-17 06:16:44 +00:00
Josh Bleecher Snyder	47d2a4dafa	cmd/compile: remove walkmul Replace with generic rewrite rules. Change-Id: I3ee32076cfd9db5801f1a7bdbb73a994255884a9 Reviewed-on: https://go-review.googlesource.com/36323 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> Reviewed-by: Keith Randall <khr@golang.org>	2017-02-06 07:21:14 +00:00
Josh Bleecher Snyder	12c58bbf81	cmd/compile: optimize (ZeroExt (Const [c])) These rules trigger 116 times while running make.bash. And at least for the sample code at https://github.com/golang/go/issues/18906#issuecomment-277174241 they are providing optimizations not already present in amd64. Updates #18906 Change-Id: I410a480f566f5ab176fc573fb5ac74f9cffec225 Reviewed-on: https://go-review.googlesource.com/36217 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> Reviewed-by: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-02-03 06:52:46 +00:00
Cherry Zhang	6ad2d6aa92	cmd/compile: simplify IsNonNil ConstNil Change-Id: I9ed5a2065cef06708e319b16c801da2eff42004e Reviewed-on: https://go-review.googlesource.com/35497 Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2017-02-02 21:28:50 +00:00
Robert Griesemer	cfd17f51c8	[dev.inline] cmd/compile/internal/ssa: rename various fields from Line to Pos This is a mostly mechanical rename followed by manual fixes where necessary. Change-Id: Ie5c670b133db978f15dc03e50dc2da0c80fc8842 Reviewed-on: https://go-review.googlesource.com/34137 Reviewed-by: David Lazar <lazard@golang.org>	2016-12-08 21:36:52 +00:00
Robert Griesemer	82d0caea2c	[dev.inline] cmd/internal/src: make Pos implementation abstract Adjust cmd/compile accordingly. This will make it easier to replace the underlying implementation. Change-Id: I33645850bb18c839b24785b6222a9e028617addb Reviewed-on: https://go-review.googlesource.com/34133 Reviewed-by: David Lazar <lazard@golang.org>	2016-12-08 21:31:28 +00:00
Keith Randall	688995d1e9	cmd/compile: do more type conversion inline The code to do the conversion is smaller than the call to the runtime. The 1-result asserts need to call panic if they fail, but that code is out of line. The only conversions left in the runtime are those which might allocate and those which might need to generate an itab. Given the following types: type E interface{} type I interface { foo() } type I2 iterface { foo(); bar() } type Big [10]int func (b Big) foo() { ... } This CL inlines the following conversions: was assertE2T var e E = ... b := i.(Big) was assertE2T2 var e E = ... b, ok := i.(Big) was assertI2T var i I = ... b := i.(Big) was assertI2T2 var i I = ... b, ok := i.(Big) was assertI2E var i I = ... e := i.(E) was assertI2E2 var i I = ... e, ok := i.(E) These are the remaining runtime calls: convT2E: var b Big = ... var e E = b convT2I: var b Big = ... var i I = b convI2I: var i2 I2 = ... var i I = i2 assertE2I: var e E = ... i := e.(I) assertE2I2: var e E = ... i, ok := e.(I) assertI2I: var i I = ... i2 := i.(I2) assertI2I2: var i I = ... i2, ok := i.(I2) Fixes #17405 Fixes #8422 Change-Id: Ida2367bf8ce3cd2c6bb599a1814f1d275afabe21 Reviewed-on: https://go-review.googlesource.com/32313 Run-TryBot: Keith Randall <khr@golang.org> Reviewed-by: David Chase <drchase@google.com>	2016-11-02 21:33:03 +00:00
Keith Randall	741445068f	cmd/compile: make [0]T and [1]T SSAable types We used to have to keep on-stack copies of these types. Now they can be registerized. [0]T is kind of trivial but might as well handle it. This change enables another change I'm working on to improve how x.(T) expressions are handled (#17405). This CL helps because now all types that are direct interface types are registerizeable (e.g. [1]*byte). No higher-degree arrays for now because non-constant indexes are hard. Update #17405 Change-Id: I2399940965d17b3969ae66f6fe447a8cefdd6edd Reviewed-on: https://go-review.googlesource.com/32416 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>	2016-10-31 19:44:19 +00:00
Keith Randall	ac74225dcc	cmd/compile: remove redundant extension after shift var x uint64 uint8(x >> 56) We don't need to generate any code for the uint8(). Update #15090 Change-Id: Ie1ca4e32022dccf7f7bc42d531a285521fb67872 Reviewed-on: https://go-review.googlesource.com/28191 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>	2016-10-27 21:21:59 +00:00
Keith Randall	deb4177cf0	cmd/compile: use masks instead of branches for slicing When we do var x []byte = ... y := x[i:] We can't just use y.ptr = x.ptr + i, as the new pointer may point to the next object in memory after the backing array. We used to fix this by doing: y.cap = x.cap - i delta := i if y.cap == 0 { delta = 0 } y.ptr = x.ptr + delta That generates a branch in what is otherwise straight-line code. Better to do: y.cap = x.cap - i mask := (y.cap - 1) >> 63 // -1 if y.cap==0, 0 otherwise y.ptr = x.ptr + i &^ mask It's about the same number of instructions (~4, depending on what parts are constant, and the target architecture), but it is all inline. It plays nicely with CSE, and the mask can be computed in parallel with the index (in cases where a multiply is required). It is a minor win in both speed and space. Change-Id: Ied60465a0b8abb683c02208402e5bb7ac0e8370f Reviewed-on: https://go-review.googlesource.com/32022 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2016-10-27 20:22:49 +00:00
David Chase	0f29942489	cmd/compile: Repurpose old sliceopt.go for prove phase. Adapt old test for prove's bounds check elimination. Added missing rule to generic rules that lead to differences between 32 and 64 bit platforms on sliceopt test. Added debugging to prove.go that was helpful-to-necessary to discover that missing rule. Lowered debugging level on prove.go from 3 to 1; no idea why it was previously 3. Change-Id: I09de206aeb2fced9f2796efe2bfd4a59927eda0c Reviewed-on: https://go-review.googlesource.com/23290 Run-TryBot: David Chase <drchase@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2016-10-20 23:50:19 +00:00
Cherry Zhang	4d07d3e29c	cmd/compile: re-enable nilcheck removal for newobject Also add compiler debug ouput and add a test. Fixes #15390. Change-Id: Iceba1414c29bcc213b87837387bf8ded1f3157f1 Reviewed-on: https://go-review.googlesource.com/30011 Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2016-09-28 19:41:49 +00:00
Keith Randall	3134ab3c2d	cmd/compile: redo nil checks Get rid of BlockCheck. Josh goaded me into it, and I went down a rabbithole making it happen. NilCheck now panics if the pointer is nil and returns void, as before. BlockCheck is gone, and NilCheck is no longer a Control value for any block. It just exists (and deadcode knows not to throw it away). I rewrote the nilcheckelim pass to handle this case. In particular, there can now be multiple NilCheck ops per block. I moved all of the arch-dependent nil check elimination done as part of ssaGenValue into its own proper pass, so we don't have to duplicate that code for every architecture. Making the arch-dependent nil check its own pass means I needed to add a bunch of flags to the opcode table so I could write the code without arch-dependent ops everywhere. Change-Id: I419f891ac9b0de313033ff09115c374163416a9f Reviewed-on: https://go-review.googlesource.com/29120 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>	2016-09-15 02:42:13 +00:00
Martin Möhrmann	00459f05e0	cmd/compile: fold negation into comparison operators This allows for example AMD64 ssa to generate (SETNE x) instead of (XORLconst [1] SETE). make.bash trigger count on AMD64: 691 generic.rules:225 1 generic.rules:226 4 generic.rules:228 1 generic.rules:229 8 generic.rules:231 6 generic.rules:238 2 generic.rules:257 Change-Id: I5b9827b2df63c8532675079e5a6026aa47bfd8dc Reviewed-on: https://go-review.googlesource.com/28232 Run-TryBot: Martin Möhrmann <martisch@uos.de> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>	2016-08-31 20:24:03 +00:00
Cherry Zhang	b2e0e9688a	cmd/compile: remove Zero and NilCheck for newobject Recognize runtime.newobject and don't Zero or NilCheck it. Fixes #15914 (?) Updates #15390. TBD: add test Change-Id: Ia3bfa5c2ddbe2c27c92d9f68534a713b5ce95934 Reviewed-on: https://go-review.googlesource.com/27930 Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>	2016-08-30 23:10:43 +00:00

1 2 3 4

153 commits