Stowage/go - Remotebranch.eu

mirror of https://github.com/golang/go.git synced 2025-12-08 06:10:04 +00:00

Author	SHA1	Message	Date
Heschi Kreinick	ac7761e1a4	cmd/compile, cmd/asm: remove Link.Plists Link.Plists never contained more than one Plist, and sometimes none. Passing around the Plist being worked on is straightforward and makes the data flow easier to follow. Change-Id: I79cb30cb2bd3d319fdbb1dfa5d35b27fcb748e5c Reviewed-on: https://go-review.googlesource.com/37169 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2017-03-01 00:29:23 +00:00
Josh Bleecher Snyder	504bc3ed24	cmd/compile, runtime: specialize convT2x, don't alloc for zero vals Prior to this CL, all runtime conversions from a concrete value to an interface went through one of two runtime calls: convT2E or convT2I. However, in practice, basic types are very common. Specializing convT2x for those basic types allows for a more efficient implementation for those types. For basic scalars and strings, allocation and copying can use the same methods as normal code. For pointer-free types, allocation can occur without zeroing, and copying can take place without GC calls. For slices, copying is cheaper and simpler. This CL adds twelve runtime routines: convT2E16, convT2I16, convT2E32, convT2I32, convT2E64, convT2I64, convT2Estring, convT2Istring, convT2Eslice, convT2Islice, convT2Enoptr, convT2Inoptr. While compiling make.bash, 93% of all convT2x calls are now to one of these specialized convT2x calls. Within specialized convT2x routines, it is cheap to check for a zero value, in a way that it is not in general. When we detect a zero value there, we return a pointer to zeroVal, rather than allocating. 
name old time/op new time/op delta ConvT2Ezero/zero/16-8 17.9ns ± 2% 3.0ns ± 3% -83.20% (p=0.000 n=56+56) ConvT2Ezero/zero/32-8 17.8ns ± 2% 3.0ns ± 3% -83.15% (p=0.000 n=59+60) ConvT2Ezero/zero/64-8 20.1ns ± 1% 3.0ns ± 2% -84.98% (p=0.000 n=57+57) ConvT2Ezero/zero/str-8 32.6ns ± 1% 3.0ns ± 4% -90.70% (p=0.000 n=59+60) ConvT2Ezero/zero/slice-8 36.7ns ± 2% 3.0ns ± 2% -91.78% (p=0.000 n=59+59) ConvT2Ezero/zero/big-8 91.9ns ± 2% 85.9ns ± 2% -6.52% (p=0.000 n=57+57) ConvT2Ezero/nonzero/16-8 17.7ns ± 2% 12.7ns ± 3% -28.38% (p=0.000 n=55+60) ConvT2Ezero/nonzero/32-8 17.8ns ± 1% 12.7ns ± 1% -28.44% (p=0.000 n=54+57) ConvT2Ezero/nonzero/64-8 20.0ns ± 1% 15.0ns ± 1% -24.90% (p=0.000 n=56+58) ConvT2Ezero/nonzero/str-8 32.6ns ± 1% 25.7ns ± 1% -21.17% (p=0.000 n=58+55) ConvT2Ezero/nonzero/slice-8 36.8ns ± 2% 30.4ns ± 1% -17.32% (p=0.000 n=60+52) ConvT2Ezero/nonzero/big-8 92.1ns ± 2% 85.9ns ± 2% -6.70% (p=0.000 n=57+59) Benchmarks on a real program (the compiler): name old time/op new time/op delta Template 227ms ± 5% 221ms ± 2% -2.48% (p=0.000 n=30+26) Unicode 102ms ± 5% 100ms ± 3% -1.30% (p=0.009 n=30+26) GoTypes 656ms ± 5% 659ms ± 4% ~ (p=0.208 n=30+30) Compiler 2.82s ± 2% 2.82s ± 1% ~ (p=0.614 n=29+27) Flate 128ms ± 2% 128ms ± 5% ~ (p=0.783 n=27+28) GoParser 158ms ± 3% 158ms ± 3% ~ (p=0.261 n=28+30) Reflect 408ms ± 7% 401ms ± 3% ~ (p=0.075 n=30+30) Tar 123ms ± 6% 121ms ± 8% ~ (p=0.287 n=29+30) XML 220ms ± 2% 220ms ± 4% ~ (p=0.805 n=29+29) name old user-ns/op new user-ns/op delta Template 281user-ms ± 4% 279user-ms ± 3% -0.87% (p=0.044 n=28+28) Unicode 142user-ms ± 4% 141user-ms ± 3% -1.04% (p=0.015 n=30+27) GoTypes 884user-ms ± 3% 886user-ms ± 2% ~ (p=0.532 n=30+30) Compiler 3.94user-s ± 3% 3.92user-s ± 1% ~ (p=0.185 n=30+28) Flate 165user-ms ± 2% 165user-ms ± 4% ~ (p=0.780 n=27+29) GoParser 209user-ms ± 2% 208user-ms ± 3% ~ (p=0.453 n=28+30) Reflect 533user-ms ± 6% 526user-ms ± 3% ~ (p=0.057 n=30+30) Tar 156user-ms ± 6% 154user-ms ± 6% ~ (p=0.133 n=29+30) XML 288user-ms ± 
4% 288user-ms ± 4% ~ (p=0.633 n=30+30) name old alloc/op new alloc/op delta Template 41.0MB ± 0% 40.9MB ± 0% -0.11% (p=0.000 n=29+29) Unicode 32.6MB ± 0% 32.6MB ± 0% ~ (p=0.572 n=29+30) GoTypes 122MB ± 0% 122MB ± 0% -0.10% (p=0.000 n=30+30) Compiler 482MB ± 0% 481MB ± 0% -0.07% (p=0.000 n=30+29) Flate 26.6MB ± 0% 26.6MB ± 0% ~ (p=0.096 n=30+30) GoParser 32.7MB ± 0% 32.6MB ± 0% -0.06% (p=0.011 n=28+28) Reflect 84.2MB ± 0% 84.1MB ± 0% -0.17% (p=0.000 n=29+30) Tar 27.7MB ± 0% 27.7MB ± 0% -0.05% (p=0.032 n=27+28) XML 44.7MB ± 0% 44.7MB ± 0% ~ (p=0.131 n=28+30) name old allocs/op new allocs/op delta Template 373k ± 1% 370k ± 1% -0.76% (p=0.000 n=30+30) Unicode 325k ± 1% 325k ± 1% ~ (p=0.383 n=29+30) GoTypes 1.16M ± 0% 1.15M ± 0% -0.75% (p=0.000 n=29+30) Compiler 4.15M ± 0% 4.13M ± 0% -0.59% (p=0.000 n=30+29) Flate 238k ± 1% 237k ± 1% -0.62% (p=0.000 n=30+30) GoParser 304k ± 1% 302k ± 1% -0.64% (p=0.000 n=30+28) Reflect 1.00M ± 0% 0.99M ± 0% -1.10% (p=0.000 n=29+30) Tar 245k ± 1% 244k ± 1% -0.59% (p=0.000 n=27+29) XML 391k ± 1% 389k ± 1% -0.59% (p=0.000 n=29+30) Change-Id: Id7f456d690567c2b0a96b0d6d64de8784b6e305f Reviewed-on: https://go-review.googlesource.com/36476 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2017-02-28 19:23:33 +00:00
Cherry Zhang	f6fc0dd620	cmd/compile: update signature of runtime.memclr* runtime.memclr* functions have signatures func memclrNoHeapPointers(ptr unsafe.Pointer, n uintptr) func memclrHasPointers(ptr unsafe.Pointer, n uintptr) Update compiler's copy. Also teach gc/mkbuiltin.go to handle unsafe.Pointer. The import statement and its support is not really necessary, but just to make it look like real Go code. Fixes #19185. Change-Id: I251d02571fde2716d4727e31e04d56ec04b6f22a Reviewed-on: https://go-review.googlesource.com/37257 Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2017-02-28 19:22:29 +00:00
Michael Munday	bd8a39b67a	cmd/compile: emit fused multiply-{add,subtract} instructions on s390x Explicitly block fused multiply-add pattern matching when a cast is used after the multiplication, for example: - (a * b) + c // can emit fused multiply-add - float64(a * b) + c // cannot emit fused multiply-add float{32,64} and complex{64,128} casts of matching types are now kept as OCONV operations rather than being replaced with OCONVNOP operations because they now imply a rounding operation (and therefore aren't a no-op anymore). Operations (for example, multiplication) on complex types may utilize fused multiply-add and -subtract instructions internally. There is no way to disable this behavior at the moment. Improves the performance of the floating point implementation of poly1305: name old speed new speed delta 64 246MB/s ± 0% 275MB/s ± 0% +11.48% (p=0.000 n=10+8) 1K 312MB/s ± 0% 357MB/s ± 0% +14.41% (p=0.000 n=10+10) 64Unaligned 246MB/s ± 0% 274MB/s ± 0% +11.43% (p=0.000 n=10+10) 1KUnaligned 312MB/s ± 0% 357MB/s ± 0% +14.39% (p=0.000 n=10+8) Updates #17895. Change-Id: Ia771d275bb9150d1a598f8cc773444663de5ce16 Reviewed-on: https://go-review.googlesource.com/36963 Run-TryBot: Michael Munday <munday@ca.ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2017-02-28 15:34:20 +00:00
Martin Möhrmann	a8f07310e3	cmd/compile: fix assignment order in string range loop Fixes #18376. Change-Id: I4fe24f479311cd4cd1bdad9a966b681e50e3d500 Reviewed-on: https://go-review.googlesource.com/35955 Reviewed-by: Keith Randall <khr@golang.org> Run-TryBot: Martin Möhrmann <moehrmann@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-02-28 08:23:52 +00:00
Josh Bleecher Snyder	1e29cd8c2b	cmd/compile: ignore some dead code during escape analysis This is the escape analysis analog of CL 37499. Fixes #12397 Fixes #16871 The only "moved to heap" decisions eliminated by this CL in std+cmd are: cmd/compile/internal/gc/const.go:1514: moved to heap: ac cmd/compile/internal/gc/const.go:1515: moved to heap: bd cmd/compile/internal/gc/const.go:1516: moved to heap: bc cmd/compile/internal/gc/const.go:1517: moved to heap: ad cmd/compile/internal/gc/const.go:1546: moved to heap: ac cmd/compile/internal/gc/const.go:1547: moved to heap: bd cmd/compile/internal/gc/const.go:1548: moved to heap: bc cmd/compile/internal/gc/const.go:1549: moved to heap: ad cmd/compile/internal/gc/const.go:1550: moved to heap: cc_plus cmd/compile/internal/gc/export.go:162: moved to heap: copy cmd/compile/internal/gc/mpfloat.go:66: moved to heap: b cmd/compile/internal/gc/mpfloat.go:97: moved to heap: b Change-Id: I0d420b69c84a41ba9968c394e8957910bab5edea Reviewed-on: https://go-review.googlesource.com/37508 Reviewed-by: David Chase <drchase@google.com>	2017-02-27 21:31:04 +00:00
Matthew Dempsky	7b8f51188b	cmd/compile/internal/gc: refactor liveness bitmap generation Keep liveness bit vectors as simple live-variable vectors during liveness analysis. We can defer expanding them into runtime heap bitmaps until we're actually writing out the symbol data, and then we only need temporary memory to expand one bitmap at a time. This is logically cleaner (e.g., we no longer depend on stack frame layout during analysis) and saves a little bit on allocations. name old alloc/op new alloc/op delta Template 41.4MB ± 0% 41.3MB ± 0% -0.28% (p=0.000 n=60+60) Unicode 32.6MB ± 0% 32.6MB ± 0% -0.11% (p=0.000 n=59+60) GoTypes 119MB ± 0% 119MB ± 0% -0.35% (p=0.000 n=60+59) Compiler 483MB ± 0% 481MB ± 0% -0.47% (p=0.000 n=59+60) name old allocs/op new allocs/op delta Template 381k ± 1% 380k ± 1% -0.32% (p=0.000 n=60+60) Unicode 325k ± 1% 325k ± 1% ~ (p=0.867 n=60+60) GoTypes 1.16M ± 0% 1.15M ± 0% -0.40% (p=0.000 n=60+59) Compiler 4.22M ± 0% 4.19M ± 0% -0.61% (p=0.000 n=59+60) Passes toolstash -cmp. Change-Id: I8175efe55201ffb5017f79ae6cb90df03f1b7e99 Reviewed-on: https://go-review.googlesource.com/37458 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>	2017-02-27 21:01:20 +00:00
Matthew Dempsky	f7f3514bd8	cmd/compile/internal/gc: simplify ascompatte Passes toolstash -cmp. Change-Id: Ibb51ccaf29ee97c3463543175c9ac7b85ea10a7f Reviewed-on: https://go-review.googlesource.com/37339 Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>	2017-02-27 20:40:31 +00:00
Josh Bleecher Snyder	0df81e8887	cmd/compile: simplify and clean up inlnode Change-Id: I0d14d68b57e8605cdae8a45d6fa97255a42297d8 Reviewed-on: https://go-review.googlesource.com/37521 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> Reviewed-by: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-02-27 19:25:21 +00:00
Josh Bleecher Snyder	566e72d0ce	cmd/compile: ignore some dead code when deciding whether to inline Constant evaluation provides some rudimentary knowledge of dead code at inlining decision time. Use it. This CL addresses only dead code inside if statements. For statements are never inlined anyway, and dead code inside for statements is rare. Analyzing switch statements is worth doing, but it is more complicated, since we would have to evaluate each case; leave it for later. Fixes #9274 After this CL, the following functions in std+cmd can be newly inlined: cmd/internal/obj/x86/asm6.go:3122: can inline subreg cmd/vendor/golang.org/x/arch/x86/x86asm/decode.go:172: can inline instPrefix cmd/vendor/golang.org/x/arch/x86/x86asm/decode.go:202: can inline truncated go/constant/value.go:234: can inline makeFloat go/types/labels.go:52: can inline (block).insert math/big/float.go:231: can inline (Float).Sign math/bits/bits.go:57: can inline OnesCount net/http/server.go:597: can inline (*Server).newConn runtime/hashmap.go:1165: can inline reflect_maplen runtime/proc.go:207: can inline os_beforeExit runtime/signal_unix.go:55: can inline init.5 runtime/stack.go:1081: can inline gostartcallfn Change-Id: I4c92fb96aa0c3d33df7b3f2da548612e79b56b5b Reviewed-on: https://go-review.googlesource.com/37499 Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2017-02-27 19:18:01 +00:00
Josh Bleecher Snyder	e458264aca	cmd/compile: fix dolinkobj flag in TestAssembly Follow-up to CL 37270. This considerably reduces the time to run the test. Before: real 0m7.638s user 0m14.341s sys 0m2.244s After: real 0m4.867s user 0m7.107s sys 0m1.842s Change-Id: I8837a5da0979a1c365e1ce5874d81708249a4129 Reviewed-on: https://go-review.googlesource.com/37461 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Michael Munday <munday@ca.ibm.com>	2017-02-25 14:39:29 +00:00
David Chase	febafe60d4	cmd/compile: added cheapexpr call to simplify operand of CONVIFACE New special case for booleans and byte-sized integer types converted to interfaces needs to ensure that the operand is not too complex, if it were to appear in a parameter list for example. Added test, also increased the recursive node dump depth to a level that was actually useful for an actual bug. Fixes #19275. Change-Id: If36ac3115edf439e886703f32d149ee0a46eb2a5 Reviewed-on: https://go-review.googlesource.com/37470 Run-TryBot: David Chase <drchase@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>	2017-02-25 04:53:23 +00:00
Martin Möhrmann	fdef951116	cmd/compile: make setting and accessing of node slice elements more uniform Add Set3 function to complement existing Set1 and Set2 functions. Consistently use Set1, Set2 and Set3 for []Node instead of Set where applicable. Add SetFirst and SetSecond for setting elements of []Node to mirror First and Second for accessing elements in []*Node. Replace uses of Index by First and Second and SetIndex with SetFirst and SetSecond where applicable. Passes toolstash -cmp. Change-Id: I8255aae768cf245c8f93eec2e9efa05b8112b4e5 Reviewed-on: https://go-review.googlesource.com/37430 Run-TryBot: Martin Möhrmann <moehrmann@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2017-02-24 21:55:24 +00:00
Lorenzo Masini	fb1f47a77c	cmd/compile: speed up TestAssembly TestAssembly was very slow, leading to it being skipped by default. This is not surprising: it separately invoked the compiler and parsed the result many times. Now the test assembles one source file per arch/os combination, containing the relevant functions. Tests for each arch/os run in parallel. Now the test runs approximately 10x faster on my Intel(R) Core(TM) i5-6600 CPU @ 3.30GHz. Fixes #18966 Change-Id: I45ab97630b627a32e17900c109f790eb4c0e90d9 Reviewed-on: https://go-review.googlesource.com/37270 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2017-02-24 21:23:43 +00:00
Josh Bleecher Snyder	d9270ecb3a	cmd/compile: evaluate zero-sized values converted to interfaces CL 35562 substituted zerobase for the pointer for interfaces containing zero-sized values. However, it failed to evaluate the zero-sized value expression for side-effects. Fix that. The other similar interface value optimizations are not affected, because they all actually use the value one way or another. Fixes #19246 Change-Id: I1168a99561477c63c29751d5cd04cf81b5ea509d Reviewed-on: https://go-review.googlesource.com/37395 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2017-02-24 19:09:41 +00:00
David R. Jenni	d55f528826	cmd/compile: silence superfluous assignment error message Avoid printing a second error message when a field of an undefined variable is accessed. Fixes #8440. Change-Id: I3fe0b11fa3423cec3871cb01b5951efa8ea7451a Reviewed-on: https://go-review.googlesource.com/36751 Reviewed-by: Matthew Dempsky <mdempsky@google.com> Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-02-23 21:06:11 +00:00
Josh Bleecher Snyder	005c77dde8	cmd/compile: add -dolinkobj flag When set to false, the -dolinkobj flag instructs the compiler not to generate or emit linker information. This is handy when you need the compiler's export data, e.g. for use with go/importer, but you want to avoid the cost of full compilation. This must be used with care, since the resulting files are unusable for linking. This CL interacts with #18369, where adding gcflags and ldflags to buildid has been mooted. On the one hand, adding gcflags would make safe use of this flag easier, since if the full object files were needed, a simple 'go install' would fix it. On the other hand, this would mean that 'go install -gcflags=-dolinkobj=false' would rebuild the object files, although any existing object files would probably suffice. Change-Id: I8dc75ab5a40095c785c1a4d2260aeb63c4d10f73 Reviewed-on: https://go-review.googlesource.com/37384 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2017-02-23 07:12:23 +00:00
Emmanuel Odeke	19d2061d50	cmd/compile: suppress callsite signatures if any type is unknown Fixes #19012. Fall back to return signatures without detailed types. These error messages will be of the form seen in issues: * https://golang.org/issues/4215 * https://golang.org/issues/6750 So: func f(x int, y uint) { return x > y } f(10, "a" < 3) will give errors: too many errors to return too many arguments in call to f instead of: too many errors to return have (<T>) want () too many arguments in call to f have (number, <T>) want (number, number) Change-Id: I680abc7cdd8444400e234caddf3ff49c2d69f53d Reviewed-on: https://go-review.googlesource.com/36806 Reviewed-by: Robert Griesemer <gri@golang.org>	2017-02-22 17:55:45 +00:00
Michael Munday	094992e22a	cmd/compile: zero extend when replacing load-hit-store on s390x Keith pointed out that these rules should zero extend during the review of CL 36845. In practice the generic rules are responsible for eliminating most load-hit-stores and they do not have this problem. When the s390x rules are triggered any cast following the elided load-hit-store is kept because of the sequence the rules are applied in (i.e. the load is removed before the zero extension gets a chance to be merged into the load). It is therefore not clear that this issue results in any functional bugs. This CL includes a test, but it only tests the generic rules currently. Change-Id: Idbc43c782097a3fb159be293ec3138c5b36858ad Reviewed-on: https://go-review.googlesource.com/37154 Run-TryBot: Michael Munday <munday@ca.ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2017-02-22 16:22:49 +00:00
Ian Lance Taylor	db6e27c38d	cmd/compile: update builtin writeBarrier to match runtime The definition of writeBarrier in the runtime was changed in CL 22855 to include padding. Update the definition built in to the compiler to match. This doesn't affect the generated code, as the compiler sets the type to use anyhow, but having them be different seems clearly wrong. Change-Id: I8eac05bf70a424a0b2338ba5e9e41af231316de0 Reviewed-on: https://go-review.googlesource.com/37377 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2017-02-22 01:32:31 +00:00
Michael Munday	10d718b983	cmd/compile: fix type of OffPtr generated by ODOTPTR The type of the OffPtr should be consistent with the type of the following load. Before this CL it was typed as a pointer to the struct. Fixes #19164. Change-Id: Ibcdec4411c6f719702f76f8dba3cce8691bfbe0c Reviewed-on: https://go-review.googlesource.com/37254 Run-TryBot: Michael Munday <munday@ca.ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>	2017-02-21 19:28:38 +00:00
Martin Möhrmann	3892d50796	cmd/compile: remove unused constant divide strength reduction code Change list https://golang.org/cl/37015/ moved the optimization of division by constants to the generic ssa backend. This removes the old now unused code that was used for this optimization outside of the ssa backend. Change-Id: I86223e56742e48dbb372ba8d779681e66448c513 Reviewed-on: https://go-review.googlesource.com/37198 Run-TryBot: Martin Möhrmann <moehrmann@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Daniel Martí <mvdan@mvdan.cc> Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com> Reviewed-by: Keith Randall <khr@golang.org>	2017-02-19 19:11:45 +00:00
Matthew Dempsky	c61cf5e6b7	cmd/compile/internal/gc: remove Node.IsStatic field We can immediately emit static assignment data rather than queueing them up to be processed during SSA building. Passes toolstash -cmp. Change-Id: I8bcea4b72eafb0cc0b849cd93e9cde9d84f30d5e Reviewed-on: https://go-review.googlesource.com/37024 Run-TryBot: Matthew Dempsky <mdempsky@google.com> Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>	2017-02-17 22:06:52 +00:00
Cherry Zhang	c4b8dadb40	cmd/compile: fix some types in SSA These seem not to really matter, but good to be correct. Change-Id: I02edb9797c3d6739725cfbe4723c75f151acd05e Reviewed-on: https://go-review.googlesource.com/36837 Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>	2017-02-17 19:20:46 +00:00
Cherry Zhang	c4ef597c47	cmd/compile: redo writebarrier pass SSA's writebarrier pass requires that WB store ops are always at the end of a block. If we move write barrier insertion into SSA and emit normal Store ops when building SSA, this requirement becomes impractical -- it will create too many blocks for all the Store ops. Redo SSA's writebarrier pass, explicitly order values in store order, so it no longer needs this requirement. Updates #17583. Fixes #19067. Change-Id: I66e817e526affb7e13517d4245905300a90b7170 Reviewed-on: https://go-review.googlesource.com/36834 Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>	2017-02-17 19:20:25 +00:00
Keith Randall	708ba22a0c	cmd/compile: move constant divide strength reduction to SSA rules Currently the conversion from constant divides to multiplies is mostly done during the walk pass. This is suboptimal because SSA can determine that the value being divided by is constant more often (e.g. after inlining). Change-Id: If1a9b993edd71be37396b9167f77da271966f85f Reviewed-on: https://go-review.googlesource.com/37015 Run-TryBot: Keith Randall <khr@golang.org> Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>	2017-02-17 06:16:44 +00:00
Matthew Dempsky	794f1ebff7	cmd/compile: simplify needwritebarrier Currently, whether we need a write barrier is simply a property of the pointer slot being written to. The only optimization we currently apply using the value being written is that pointers to stack variables can omit write barriers because they're only written to stack slots... but we already omit write barriers for all writes to the stack anyway. Passes toolstash -cmp. Change-Id: I7f16b71ff473899ed96706232d371d5b2b7ae789 Reviewed-on: https://go-review.googlesource.com/37109 Reviewed-by: Cherry Zhang <cherryyz@google.com> Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-02-16 22:42:36 +00:00
Matthew Dempsky	fc456c7f7b	cmd/compile/internal/gc: drop unused src.XPos params in SSA builder Passes toolstash -cmp. Change-Id: I037278404ebf762482557e2b6867cbc595074a83 Reviewed-on: https://go-review.googlesource.com/37023 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2017-02-16 17:34:39 +00:00
Matthew Dempsky	a6b3331236	cmd/compile/internal/gc: skip useless loads for non-SSA params Change-Id: I78ca43a0f0a6a162a2ade1352e2facb29432d4ac Reviewed-on: https://go-review.googlesource.com/37102 Run-TryBot: Matthew Dempsky <mdempsky@google.com> Reviewed-by: Keith Randall <khr@golang.org>	2017-02-15 23:12:43 +00:00
Matthew Dempsky	862fde81fc	cmd/compile/internal/gc: document (*state).checkgoto No behavior change. Change-Id: I595c15ee976adf21bdbabdf24edf203c9e446185 Reviewed-on: https://go-review.googlesource.com/36958 Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com> Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-02-15 22:59:55 +00:00
Robert Griesemer	2770c507a5	cmd/compile: fix position for "missing type in composite literal" error Fixes #18231. Change-Id: If1615da4db0e6f0516369a1dc37340d80c78f237 Reviewed-on: https://go-review.googlesource.com/37018 Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2017-02-15 01:33:44 +00:00
Kirill Smelkov	4477fd097f	cmd/compile/internal/ssa: combine 2 byte loads + shifts into word load + rolw 8 on AMD64 ... and same for stores. This does for binary.BigEndian.Uint16() what was already done for Uint32 and Uint64 with BSWAP in `10f75748` (CL 32222). Here is how generated code changes e.g. for the following function (omitting the unchanged prologue/epilogue): func get16(b [2]byte) uint16 { return binary.BigEndian.Uint16(b[:]) } "".get16 t=1 size=21 args=0x10 locals=0x0 // before 0x0000 00000 (x.go:15) MOVBLZX "".b+9(FP), AX 0x0005 00005 (x.go:15) MOVBLZX "".b+8(FP), CX 0x000a 00010 (x.go:15) SHLL $8, CX 0x000d 00013 (x.go:15) ORL CX, AX // after 0x0000 00000 (x.go:15) MOVWLZX "".b+8(FP), AX 0x0005 00005 (x.go:15) ROLW $8, AX encoding/binary is sped up a bit overall: name old time/op new time/op delta ReadSlice1000Int32s-4 4.83µs ± 0% 4.83µs ± 0% ~ (p=0.206 n=4+5) ReadStruct-4 1.29µs ± 2% 1.28µs ± 1% -1.27% (p=0.032 n=4+5) ReadInts-4 384ns ± 1% 385ns ± 1% ~ (p=0.968 n=4+5) WriteInts-4 534ns ± 3% 526ns ± 0% -1.54% (p=0.048 n=4+5) WriteSlice1000Int32s-4 5.02µs ± 0% 5.11µs ± 3% ~ (p=0.175 n=4+5) PutUint16-4 0.59ns ± 0% 0.49ns ± 2% -16.95% (p=0.016 n=4+5) PutUint32-4 0.52ns ± 0% 0.52ns ± 0% ~ (all equal) PutUint64-4 0.53ns ± 0% 0.53ns ± 0% ~ (all equal) PutUvarint32-4 19.9ns ± 0% 19.9ns ± 1% ~ (p=0.556 n=4+5) PutUvarint64-4 54.5ns ± 1% 54.2ns ± 0% ~ (p=0.333 n=4+5) name old speed new speed delta ReadSlice1000Int32s-4 829MB/s ± 0% 828MB/s ± 0% ~ (p=0.190 n=4+5) ReadStruct-4 58.0MB/s ± 2% 58.7MB/s ± 1% +1.30% (p=0.032 n=4+5) ReadInts-4 78.0MB/s ± 1% 77.8MB/s ± 1% ~ (p=0.968 n=4+5) WriteInts-4 56.1MB/s ± 3% 57.0MB/s ± 0% ~ (p=0.063 n=4+5) WriteSlice1000Int32s-4 797MB/s ± 0% 783MB/s ± 3% ~ (p=0.190 n=4+5) PutUint16-4 3.37GB/s ± 0% 4.07GB/s ± 2% +20.83% (p=0.016 n=4+5) PutUint32-4 7.73GB/s ± 0% 7.72GB/s ± 0% ~ (p=0.556 n=4+5) PutUint64-4 15.1GB/s ± 0% 15.1GB/s ± 0% ~ (p=0.905 n=4+5) PutUvarint32-4 201MB/s ± 0% 201MB/s ± 0% ~ (p=0.905 n=4+5) PutUvarint64-4 147MB/s ± 1% 147MB/s ± 0% ~ (p=0.286 n=4+5) ( "a bit" only because most of the time is spent in reflection-like things there, not actual bytes decoding. Even for direct PutUint16 benchmark the looping adds overhead and lowers visible benefit. For code-generated encoders / decoders actual effect is more than 20% ) Adding Uint32 and Uint64 raw benchmarks too for completeness. NOTE I had to adjust load-combining rule for bswap case to match first 2 bytes loads as result of "2-bytes load+shift" -> "loadw + rorw 8" rewrite. Reason is: for loads+shift, even e.g. into uint16 var var b []byte var v uint16 v = uint16(b[1]) | uint16(b[0])<<8 the compiler eventually generates L(ong) shift - SHLLconst [8], probably because it is more straightforward / other reasons to work on the whole register. This way 2 bytes rewriting rule is using SHLLconst (not SHLWconst) in its pattern, and then it always gets matched first, even if 2-byte rule comes syntactically after 4-byte rule in AMD64.rules because 4-bytes rule seemingly needs more applyRewrite() cycles to trigger. If 2-bytes rule gets matched for inner half of var b []byte var v uint32 v = uint32(b[3]) | uint32(b[2])<<8 | uint32(b[1])<<16 | uint32(b[0])<<24 and we keep 4-byte load rule unchanged, the result will be MOVW + RORW $8 and then series of byte loads and shifts - not one MOVL + BSWAPL. There is no such problem for stores: there compiler, since it probably knows store destination is 2 bytes wide, uses SHRWconst 8 (not SHRLconst 8) and thus 2-byte store rule is not a subset of rule for 4-byte stores. Fixes #17151 (int16 was last missing piece there) Change-Id: Idc03ba965bfce2b94fef456b02ff6742194748f6 Reviewed-on: https://go-review.googlesource.com/34636 Reviewed-by: Ilya Tocar <ilya.tocar@intel.com> Run-TryBot: Ilya Tocar <ilya.tocar@intel.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-02-14 22:17:08 +00:00
Cherry Zhang	78200799a2	cmd/compile: undo special handling of zero-valued STRUCTLIT CL 35261 introduces special handling of zero-valued STRUCTLIT for efficient struct zeroing. But it didn't cover all use cases, for example, CONVNOP STRUCTLIT is not handled. On the other hand, CL 34566 handles zeroing earlier, so we don't need the change in CL 35261 for efficient zeroing. Other uses of zero-valued struct literals are very rare. So undo the change in walk.go in CL 35261. Add a test for efficient zeroing. Fixes #19084. Change-Id: I0807f7423fb44d47bf325b3c1ce9611a14953853 Reviewed-on: https://go-review.googlesource.com/36955 Reviewed-by: Matthew Dempsky <mdempsky@google.com> Reviewed-by: Keith Randall <khr@golang.org>	2017-02-14 18:57:56 +00:00
Kirill Smelkov	bd91e3569a	cmd/compile/internal/ssa: generate bswap/store for indexed bigendian byte stores too on AMD64 Commit `10f75748` (CL 32222) added rewrite rules to combine byte loads/stores + shifts into larger loads/stores + bswap. For loads both MOVBload and MOVBloadidx1 were handled but for stores only MOVBstore was there, without MOVBstoreidx added to the rewrite pattern. Fix it. Here is how generated code changes for the following 2 functions (omitting the unchanged prologue/epilogue): func put32(b []byte, i int, v uint32) { binary.BigEndian.PutUint32(b[i:], v) } func put64(b []byte, i int, v uint64) { binary.BigEndian.PutUint64(b[i:], v) } "".put32 t=1 size=100 args=0x28 locals=0x0 // before 0x0032 00050 (x.go:5) MOVL CX, DX 0x0034 00052 (x.go:5) SHRL $24, CX 0x0037 00055 (x.go:5) MOVQ "".b+8(FP), BX 0x003c 00060 (x.go:5) MOVB CL, (BX)(AX*1) 0x003f 00063 (x.go:5) MOVL DX, CX 0x0041 00065 (x.go:5) SHRL $16, DX 0x0044 00068 (x.go:5) MOVB DL, 1(BX)(AX*1) 0x0048 00072 (x.go:5) MOVL CX, DX 0x004a 00074 (x.go:5) SHRL $8, CX 0x004d 00077 (x.go:5) MOVB CL, 2(BX)(AX*1) 0x0051 00081 (x.go:5) MOVB DL, 3(BX)(AX*1) // after 0x0032 00050 (x.go:5) BSWAPL CX 0x0034 00052 (x.go:5) MOVQ "".b+8(FP), DX 0x0039 00057 (x.go:5) MOVL CX, (DX)(AX*1) "".put64 t=1 size=155 args=0x28 locals=0x0 // before 0x0037 00055 (x.go:9) MOVQ CX, DX 0x003a 00058 (x.go:9) SHRQ $56, CX 0x003e 00062 (x.go:9) MOVQ "".b+8(FP), BX 0x0043 00067 (x.go:9) MOVB CL, (BX)(AX*1) 0x0046 00070 (x.go:9) MOVQ DX, CX 0x0049 00073 (x.go:9) SHRQ $48, DX 0x004d 00077 (x.go:9) MOVB DL, 1(BX)(AX*1) 0x0051 00081 (x.go:9) MOVQ CX, DX 0x0054 00084 (x.go:9) SHRQ $40, CX 0x0058 00088 (x.go:9) MOVB CL, 2(BX)(AX*1) 0x005c 00092 (x.go:9) MOVQ DX, CX 0x005f 00095 (x.go:9) SHRQ $32, DX 0x0063 00099 (x.go:9) MOVB DL, 3(BX)(AX*1) 0x0067 00103 (x.go:9) MOVQ CX, DX 0x006a 00106 (x.go:9) SHRQ $24, CX 0x006e 00110 (x.go:9) MOVB CL, 4(BX)(AX*1) 0x0072 00114 (x.go:9) MOVQ DX, CX 0x0075 00117 (x.go:9) SHRQ $16, DX 0x0079 00121 (x.go:9) MOVB DL, 5(BX)(AX*1) 0x007d 00125 (x.go:9) MOVQ CX, DX 0x0080 00128 (x.go:9) SHRQ $8, CX 0x0084 00132 (x.go:9) MOVB CL, 6(BX)(AX*1) 0x0088 00136 (x.go:9) MOVB DL, 7(BX)(AX*1) // after 0x0033 00051 (x.go:9) BSWAPQ CX 0x0036 00054 (x.go:9) MOVQ "".b+8(FP), DX 0x003b 00059 (x.go:9) MOVQ CX, (DX)(AX*1) Updates #17151 Change-Id: I3f4a7f28f210e62e153e60da5abd1d39508cc6c4 Reviewed-on: https://go-review.googlesource.com/34635 Run-TryBot: Ilya Tocar <ilya.tocar@intel.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>	2017-02-14 18:35:43 +00:00
Matthew Dempsky	02de5ed748	cmd/internal/obj: add AddrName type and cleanup AddrType values Passes toolstash -cmp. Change-Id: Ida3eda9bd9d79a34c1c3f18cb41aea9392698076 Reviewed-on: https://go-review.googlesource.com/36950 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>	2017-02-13 21:56:17 +00:00
Kirill Smelkov	e2948f7efe	cmd/compile: Show arch/os when something in TestAssembly fails It is not always obvious at first glance, when looking at a TestAssembly failure, in which context the code was generated. For example, x86 and x86-64 are similar, and those of us who do not work with assembly every day can even mistake the s390x version for something similar to x86. So when something fails, let's print the whole test context; this includes the os and arch, which were previously missing. An example failure: before: --- FAIL: TestAssembly (40.48s) asm_test.go:46: expected: MOVWZ $.$, go: import "encoding/binary" func f(b []byte) uint32 { return binary.LittleEndian.Uint32(b) } asm:"".f t=1 size=160 args=0x20 locals=0x0 ... after: --- FAIL: TestAssembly (40.43s) asm_test.go:46: linux/s390x: expected: MOVWZ $.$, go: import "encoding/binary" func f(b []byte) uint32 { return binary.LittleEndian.Uint32(b) } asm:"".f t=1 size=160 args=0x20 locals=0x0 Motivated-by: #18946#issuecomment-279491071 Change-Id: I61089ceec05da7a165718a7d69dec4227dd0e993 Reviewed-on: https://go-review.googlesource.com/36881 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-02-13 20:30:31 +00:00
Michael Munday	074b73b1b2	cmd/compile: fix s390x load-combining rules MOVD{reg,nop} operations (added in CL 36256) inserted to preserve type information were blocking the load-combining rules. Fix this by merging type changes into loads wherever possible. Fixes #19059. Change-Id: I8a1df06eb0f231b40ae43107d4a3bd0b9c441b59 Reviewed-on: https://go-review.googlesource.com/36843 Run-TryBot: Michael Munday <munday@ca.ibm.com> Reviewed-by: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-02-13 20:04:14 +00:00
Keith Randall	b548eee3d9	cmd/compile: fix load-combining rules CL 33632 reorders args of commutative ops in order to make CSE for commutative ops more robust. Unfortunately, that broke the load-combining rules which depend on a certain ordering of OR ops' arguments. Introduce some additional rules that order OR ops' arguments consistently so that the load-combining rules fire. Note: there's also something else wrong with the s390x rules. I've filed #19059 for that. Fixes #18946 Change-Id: I0a5447196bd88a55ccee683c69a57b943a9972e1 Reviewed-on: https://go-review.googlesource.com/36911 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>	2017-02-13 18:29:51 +00:00
Josh Bleecher Snyder	c5fed5bb24	cmd/compile: cull some dead arch-specific Ops Change-Id: Iee7daa5b91b7896ce857321e307f2ee47b7f095f Reviewed-on: https://go-review.googlesource.com/36906 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2017-02-13 18:19:24 +00:00
Keith Randall	5a75d6a08e	cmd/compile: optimize non-empty-interface type conversions When doing i.(T) for non-empty-interface i and concrete type T, there's no need to read the type out of the itab. Just compare the itab to the itab we expect for that interface/type pair. Also optimize type switches by putting the type hash of the concrete type in the itab. That way we don't need to load the type pointer out of the itab. Update #18492 Change-Id: I49e280a21e5687e771db5b8a56b685291ac168ce Reviewed-on: https://go-review.googlesource.com/34810 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com> Reviewed-by: David Chase <drchase@google.com>	2017-02-13 18:16:31 +00:00
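For readers unfamiliar with the two source forms this commit optimizes, here is a minimal sketch. The types `Shape`, `Square` and `Rect` and the function `describe` are hypothetical names made up for illustration. The program shows only the language-level constructs (a type assertion `i.(T)` and a type switch on a non-empty interface); the itab comparison itself happens in generated code and is not visible here.

```go
package main

import "fmt"

// Shape is a non-empty interface: it declares at least one method.
type Shape interface {
	Area() float64
}

type Square struct{ Side float64 }
type Rect struct{ W, H float64 }

func (s Square) Area() float64 { return s.Side * s.Side }
func (r Rect) Area() float64   { return r.W * r.H }

// describe uses a type assertion to a concrete type, then a type
// switch, both on a value of the non-empty interface type Shape.
func describe(s Shape) string {
	if sq, ok := s.(Square); ok {
		return fmt.Sprintf("square with side %g", sq.Side)
	}
	switch v := s.(type) {
	case Rect:
		return fmt.Sprintf("rect %g by %g", v.W, v.H)
	default:
		return "unknown shape"
	}
}

func main() {
	fmt.Println(describe(Square{Side: 2}))
	fmt.Println(describe(Rect{W: 2, H: 3}))
}
```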
Josh Bleecher Snyder	2c91bb4c8a	cmd/compile: make panicwrap argument-free When code defines a method on T, the compiler generates a corresponding wrapper method on *T. The first thing the wrapper does is check whether the pointer is nil and if so, call panicwrap. This is done to provide a useful error message. The existing implementation gets its information from arguments set up by the compiler. However, with some trouble, this information can be extracted from the name of the wrapper method itself. Removing the arguments to panicwrap simplifies and shrinks the wrapper method. It also means that the call to panicwrap does not require any stack space. This enables a further optimization on amd64/x86, which is to skip the function prologue if nothing else in the method requires stack space. This is frequently the case in simple, hot methods, such as Less and Swap in sort.Interface implementations. Fixes #19040. Benchmarks for package sort on amd64: name old time/op new time/op delta SearchWrappers-8 104ns ± 1% 104ns ± 1% ~ (p=0.286 n=27+27) SortString1K-8 128µs ± 1% 128µs ± 1% -0.44% (p=0.004 n=30+30) SortString1K_Slice-8 118µs ± 2% 117µs ± 1% ~ (p=0.106 n=30+30) StableString1K-8 18.6µs ± 1% 18.6µs ± 1% ~ (p=0.446 n=28+26) SortInt1K-8 65.9µs ± 1% 60.7µs ± 1% -7.96% (p=0.000 n=28+30) StableInt1K-8 75.3µs ± 2% 72.8µs ± 1% -3.41% (p=0.000 n=30+30) StableInt1K_Slice-8 57.7µs ± 1% 57.7µs ± 1% ~ (p=0.515 n=30+30) SortInt64K-8 6.28ms ± 1% 6.01ms ± 1% -4.19% (p=0.000 n=28+28) SortInt64K_Slice-8 5.04ms ± 1% 5.04ms ± 1% ~ (p=0.927 n=28+27) StableInt64K-8 6.65ms ± 1% 6.38ms ± 1% -3.97% (p=0.000 n=26+30) Sort1e2-8 37.9µs ± 1% 37.2µs ± 1% -1.89% (p=0.000 n=29+27) Stable1e2-8 77.0µs ± 1% 74.7µs ± 1% -3.06% (p=0.000 n=27+30) Sort1e4-8 8.21ms ± 2% 7.98ms ± 1% -2.77% (p=0.000 n=29+30) Stable1e4-8 24.8ms ± 1% 24.3ms ± 1% -2.31% (p=0.000 n=28+30) Sort1e6-8 1.27s ± 4% 1.22s ± 1% -3.42% (p=0.000 n=30+29) Stable1e6-8 5.06s ± 1% 4.92s ± 1% -2.77% (p=0.000 n=25+29) [Geo mean] 731µs 714µs -2.29% Before/after assembly for sort.(*intPairs).Less follows. It can be optimized further, but that's for a follow-up CL. Before: "".(*intPairs).Less t=1 size=214 args=0x20 locals=0x38 0x0000 00000 (<autogenerated>:1) TEXT "".(*intPairs).Less(SB), $56-32 0x0000 00000 (<autogenerated>:1) MOVQ (TLS), CX 0x0009 00009 (<autogenerated>:1) CMPQ SP, 16(CX) 0x000d 00013 (<autogenerated>:1) JLS 204 0x0013 00019 (<autogenerated>:1) SUBQ $56, SP 0x0017 00023 (<autogenerated>:1) MOVQ BP, 48(SP) 0x001c 00028 (<autogenerated>:1) LEAQ 48(SP), BP 0x0021 00033 (<autogenerated>:1) MOVQ 32(CX), BX 0x0025 00037 (<autogenerated>:1) TESTQ BX, BX 0x0028 00040 (<autogenerated>:1) JEQ 55 0x002a 00042 (<autogenerated>:1) LEAQ 64(SP), DI 0x002f 00047 (<autogenerated>:1) CMPQ (BX), DI 0x0032 00050 (<autogenerated>:1) JNE 55 0x0034 00052 (<autogenerated>:1) MOVQ SP, (BX) 0x0037 00055 (<autogenerated>:1) NOP 0x0037 00055 (<autogenerated>:1) FUNCDATA $0, gclocals·4032f753396f2012ad1784f398b170f4(SB) 0x0037 00055 (<autogenerated>:1) FUNCDATA $1, gclocals·69c1753bd5f81501d95132d08af04464(SB) 0x0037 00055 (<autogenerated>:1) MOVQ ""..this+64(FP), AX 0x003c 00060 (<autogenerated>:1) TESTQ AX, AX 0x003f 00063 (<autogenerated>:1) JEQ $0, 135 0x0041 00065 (<autogenerated>:1) MOVQ (AX), CX 0x0044 00068 (<autogenerated>:1) MOVQ 8(AX), AX 0x0048 00072 (<autogenerated>:1) MOVQ "".i+72(FP), DX 0x004d 00077 (<autogenerated>:1) CMPQ DX, AX 0x0050 00080 (<autogenerated>:1) JCC $0, 128 0x0052 00082 (<autogenerated>:1) SHLQ $4, DX 0x0056 00086 (<autogenerated>:1) MOVQ (CX)(DX*1), DX 0x005a 00090 (<autogenerated>:1) MOVQ "".j+80(FP), BX 0x005f 00095 (<autogenerated>:1) CMPQ BX, AX 0x0062 00098 (<autogenerated>:1) JCC $0, 128 0x0064 00100 (<autogenerated>:1) SHLQ $4, BX 0x0068 00104 (<autogenerated>:1) MOVQ (CX)(BX*1), AX 0x006c 00108 (<autogenerated>:1) CMPQ DX, AX 0x006f 00111 (<autogenerated>:1) SETLT AL 0x0072 00114 (<autogenerated>:1) MOVB AL, "".~r2+88(FP) 0x0076 00118 (<autogenerated>:1) MOVQ 48(SP), BP 0x007b 00123 (<autogenerated>:1) ADDQ $56, SP 0x007f 00127 (<autogenerated>:1) RET 0x0080 00128 (<autogenerated>:1) PCDATA $0, $1 0x0080 00128 (<autogenerated>:1) CALL runtime.panicindex(SB) 0x0085 00133 (<autogenerated>:1) UNDEF 0x0087 00135 (<autogenerated>:1) LEAQ go.string."sort_test"(SB), AX 0x008e 00142 (<autogenerated>:1) MOVQ AX, (SP) 0x0092 00146 (<autogenerated>:1) MOVQ $9, 8(SP) 0x009b 00155 (<autogenerated>:1) LEAQ go.string."intPairs"(SB), AX 0x00a2 00162 (<autogenerated>:1) MOVQ AX, 16(SP) 0x00a7 00167 (<autogenerated>:1) MOVQ $8, 24(SP) 0x00b0 00176 (<autogenerated>:1) LEAQ go.string."Less"(SB), AX 0x00b7 00183 (<autogenerated>:1) MOVQ AX, 32(SP) 0x00bc 00188 (<autogenerated>:1) MOVQ $4, 40(SP) 0x00c5 00197 (<autogenerated>:1) PCDATA $0, $1 0x00c5 00197 (<autogenerated>:1) CALL runtime.panicwrap(SB) 0x00ca 00202 (<autogenerated>:1) UNDEF 0x00cc 00204 (<autogenerated>:1) NOP 0x00cc 00204 (<autogenerated>:1) PCDATA $0, $-1 0x00cc 00204 (<autogenerated>:1) CALL runtime.morestack_noctxt(SB) 0x00d1 00209 (<autogenerated>:1) JMP 0 After: "".(*intPairs).Swap t=1 size=147 args=0x18 locals=0x8 0x0000 00000 (<autogenerated>:1) TEXT "".(*intPairs).Swap(SB), $8-24 0x0000 00000 (<autogenerated>:1) MOVQ (TLS), CX 0x0009 00009 (<autogenerated>:1) SUBQ $8, SP 0x000d 00013 (<autogenerated>:1) MOVQ BP, (SP) 0x0011 00017 (<autogenerated>:1) LEAQ (SP), BP 0x0015 00021 (<autogenerated>:1) MOVQ 32(CX), BX 0x0019 00025 (<autogenerated>:1) TESTQ BX, BX 0x001c 00028 (<autogenerated>:1) JEQ 43 0x001e 00030 (<autogenerated>:1) LEAQ 16(SP), DI 0x0023 00035 (<autogenerated>:1) CMPQ (BX), DI 0x0026 00038 (<autogenerated>:1) JNE 43 0x0028 00040 (<autogenerated>:1) MOVQ SP, (BX) 0x002b 00043 (<autogenerated>:1) NOP 0x002b 00043 (<autogenerated>:1) FUNCDATA $0, gclocals·e6397a44f8e1b6e77d0f200b4fba5269(SB) 0x002b 00043 (<autogenerated>:1) FUNCDATA $1, gclocals·69c1753bd5f81501d95132d08af04464(SB) 0x002b 00043 (<autogenerated>:1) MOVQ ""..this+16(FP), AX 0x0030 00048 (<autogenerated>:1) TESTQ AX, AX 0x0033 00051 (<autogenerated>:1) JEQ $0, 140 0x0035 00053 (<autogenerated>:1) MOVQ (AX), CX 0x0038 00056 (<autogenerated>:1) MOVQ 8(AX), AX 0x003c 00060 (<autogenerated>:1) MOVQ "".i+24(FP), DX 0x0041 00065 (<autogenerated>:1) CMPQ DX, AX 0x0044 00068 (<autogenerated>:1) JCC $0, 133 0x0046 00070 (<autogenerated>:1) SHLQ $4, DX 0x004a 00074 (<autogenerated>:1) MOVQ 8(CX)(DX*1), BX 0x004f 00079 (<autogenerated>:1) MOVQ (CX)(DX*1), SI 0x0053 00083 (<autogenerated>:1) MOVQ "".j+32(FP), DI 0x0058 00088 (<autogenerated>:1) CMPQ DI, AX 0x005b 00091 (<autogenerated>:1) JCC $0, 133 0x005d 00093 (<autogenerated>:1) SHLQ $4, DI 0x0061 00097 (<autogenerated>:1) MOVQ 8(CX)(DI*1), AX 0x0066 00102 (<autogenerated>:1) MOVQ (CX)(DI*1), R8 0x006a 00106 (<autogenerated>:1) MOVQ R8, (CX)(DX*1) 0x006e 00110 (<autogenerated>:1) MOVQ AX, 8(CX)(DX*1) 0x0073 00115 (<autogenerated>:1) MOVQ SI, (CX)(DI*1) 0x0077 00119 (<autogenerated>:1) MOVQ BX, 8(CX)(DI*1) 0x007c 00124 (<autogenerated>:1) MOVQ (SP), BP 0x0080 00128 (<autogenerated>:1) ADDQ $8, SP 0x0084 00132 (<autogenerated>:1) RET 0x0085 00133 (<autogenerated>:1) PCDATA $0, $1 0x0085 00133 (<autogenerated>:1) CALL runtime.panicindex(SB) 0x008a 00138 (<autogenerated>:1) UNDEF 0x008c 00140 (<autogenerated>:1) PCDATA $0, $1 0x008c 00140 (<autogenerated>:1) CALL runtime.panicwrap(SB) 0x0091 00145 (<autogenerated>:1) UNDEF Change-Id: I15bb8435f0690badb868799f313ed8817335efd3 Reviewed-on: https://go-review.googlesource.com/36809 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2017-02-11 23:27:35 +00:00
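The wrapper behaviour described in this commit can be observed from ordinary Go code. The following is a minimal sketch, not code from the CL: the interface name `lesser` and the helper `callLess` are hypothetical, and `intPairs` only borrows its name from the assembly listing. It assumes that calling a value-receiver method through an interface holding a nil pointer goes through the compiler-generated pointer wrapper, which panics with a descriptive message instead of running the method body. The exact wording of that message is left unchecked here.

```go
package main

import "fmt"

// lesser is a hypothetical one-method interface, standing in for the
// Less part of sort.Interface.
type lesser interface {
	Less(i, j int) bool
}

// intPairs has a value-receiver method, so the compiler generates a
// wrapper for the pointer type that dereferences and forwards.
type intPairs []struct{ a, b int }

func (d intPairs) Less(i, j int) bool { return d[i].a < d[j].a }

// callLess invokes Less through the interface and reports any panic
// as a string instead of crashing the program.
func callLess(l lesser) (msg string) {
	defer func() {
		if r := recover(); r != nil {
			msg = fmt.Sprint(r)
		}
	}()
	l.Less(0, 1)
	return "no panic"
}

func main() {
	d := intPairs{{1, 2}, {3, 4}}
	fmt.Println(callLess(&d)) // non-nil pointer: wrapper forwards to Less

	var p *intPairs
	fmt.Println(callLess(p)) // nil pointer: the wrapper's nil check fires
}
```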
Dhananjay Nakrani	1cde87b312	cmd/compile: Ensure left-to-right assignment Add temporaries to reorder the assignment for OAS2XXX nodes. This makes orderstmt() rewrite `a, b, c = ...` as `tmp1, tmp2, tmp3 = ...` followed by `a, b, c = tmp1, tmp2, tmp3`, and `a, ok = ...` as `t1, t2 = ...` followed by `a = t1` and `ok = t2`. Fixes #13433. Change-Id: Id0f5956e3a254d0a6f4b89b5f7b0e055b1f0e21f Reviewed-on: https://go-review.googlesource.com/34713 Run-TryBot: Dhananjay Nakrani <dhananjayn@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2017-02-11 21:46:21 +00:00
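The rewrite described in this commit can be written out by hand at the source level. The sketch below is only an illustration of the shape of the transformation, not the compiler's internal representation: `three`, `assignDirect` and `assignViaTemps` are hypothetical names, and the temporaries are ordinary local variables rather than compiler-generated ones.

```go
package main

import "fmt"

// three is a hypothetical multi-value function used as the right-hand side.
func three() (int, int, int) { return 1, 2, 3 }

// assignDirect is the form a programmer writes.
func assignDirect(s []int) {
	s[0], s[1], s[2] = three()
}

// assignViaTemps spells out the rewritten form from the commit message:
// the results land in temporaries first, then are assigned to the
// destinations one by one, left to right.
func assignViaTemps(s []int) {
	tmp1, tmp2, tmp3 := three()
	s[0] = tmp1
	s[1] = tmp2
	s[2] = tmp3
}

func main() {
	a := make([]int, 3)
	b := make([]int, 3)
	assignDirect(a)
	assignViaTemps(b)
	fmt.Println(a, b)
}
```

When nothing panics, the two forms store the same values; the ordering only becomes observable when one of the destination expressions can fail partway through.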
Matthew Dempsky	bdb9b945b9	cmd/compile: eliminate OASWB Instead we can just call needwritebarrier when constructing the SSA representation. Change-Id: I6fefaad49daada9cdb3050f112889e49dca0047b Reviewed-on: https://go-review.googlesource.com/34566 Reviewed-by: Cherry Zhang <cherryyz@google.com>	2017-02-10 22:31:58 +00:00
Hajime Hoshi	249aca5dee	cmd/compile/internal/gc: unexport or remove global functions Change-Id: Ib2109ab773fbf2a35188300cf91a54735f75fc7c Reviewed-on: https://go-review.googlesource.com/36736 Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2017-02-10 17:57:38 +00:00
Josh Bleecher Snyder	5faba3057d	cmd/compile: use constants directly for fast map access calls CL 35554 taught order.go to use static variables for constants that needed to be addressable for runtime routines. However, there is one class of runtime routines that do not actually need an addressable value: fast map access routines. This CL teaches order.go to avoid using static variables for addressability in those cases. Instead, it avoids introducing a temp at all, which the backend would just have to optimize away. Fixes #19015. Change-Id: I5ef780c604fac3fb48dabb23a344435e283cb832 Reviewed-on: https://go-review.googlesource.com/36693 Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2017-02-10 04:57:20 +00:00
Austin Clements	450472989b	cmd/compile: disallow combining nosplit and systemstack go:systemstack works by tweaking the stack check prologue to check against a different bound, while go:nosplit removes the stack check prologue entirely. Hence, they can't be used together. Make the build fail if they are. Change-Id: I2d180c4b1d31ff49ec193291ecdd42921d253359 Reviewed-on: https://go-review.googlesource.com/36710 Run-TryBot: Austin Clements <austin@google.com> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2017-02-09 22:27:17 +00:00
Cherry Zhang	a146dd3a2f	cmd/compile: handle DOT STRUCTLIT for zero-valued struct in SSA CL 35261 makes SSA handle zero-valued STRUCTLIT, but DOT operation was not handled. Fixes #18994. Change-Id: Ic7976036acca1523b0b14afac4d170797e8aee20 Reviewed-on: https://go-review.googlesource.com/36565 Run-TryBot: Cherry Zhang <cherryyz@google.com> Reviewed-by: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-02-08 21:01:51 +00:00
David Lazar	e3efdffacd	cmd/compile: include linknames in export data This lets the compiler inline functions that contain a linknamed symbol. Previously, the net/http tests would fail to build with -l=4 because the compiler inlined functions that call net.byteIndex (which is linknamed to strings.IndexByte). This changes only the compiler-specific export data, so we don't need to bump the export format version number. The following benchmark results show how the size of package export data is impacted by this change. These benchmarks were created by compiling the go1 benchmark and running `go tool pack x` to extract the export data from the resulting .a files. name old bytes new bytes delta bufio 3.48k ± 0% 3.58k ± 0% +2.90% bytes 5.05k ± 0% 5.16k ± 0% +2.16% compress/bzip2 2.61k ± 0% 2.68k ± 0% +2.68% compress/flate 5.07k ± 0% 5.14k ± 0% +1.40% compress/gzip 8.26k ± 0% 8.40k ± 0% +1.70% container/list 1.69k ± 0% 1.76k ± 0% +4.07% context 3.93k ± 0% 4.01k ± 0% +1.86% crypto 1.03k ± 0% 1.03k ± 0% +0.39% crypto/aes 475 ± 0% 475 ± 0% +0.00% crypto/cipher 1.18k ± 0% 1.18k ± 0% +0.00% crypto/des 502 ± 0% 502 ± 0% +0.00% crypto/dsa 5.71k ± 0% 5.77k ± 0% +1.16% crypto/ecdsa 6.67k ± 0% 6.75k ± 0% +1.08% crypto/elliptic 6.28k ± 0% 6.35k ± 0% +1.07% crypto/hmac 464 ± 0% 464 ± 0% +0.00% crypto/internal/cipherhw 313 ± 0% 313 ± 0% +0.00% crypto/md5 691 ± 0% 695 ± 0% +0.58% crypto/rand 5.37k ± 0% 5.43k ± 0% +1.23% crypto/rc4 512 ± 0% 512 ± 0% +0.00% crypto/rsa 7.05k ± 0% 7.12k ± 0% +1.05% crypto/sha1 756 ± 0% 760 ± 0% +0.53% crypto/sha256 523 ± 0% 523 ± 0% +0.00% crypto/sha512 662 ± 0% 662 ± 0% +0.00% crypto/subtle 835 ± 0% 873 ± 0% +4.55% crypto/tls 28.1k ± 0% 28.5k ± 0% +1.30% crypto/x509 17.7k ± 0% 17.9k ± 0% +1.04% crypto/x509/pkix 9.75k ± 0% 9.90k ± 0% +1.50% encoding 473 ± 0% 473 ± 0% +0.00% encoding/asn1 1.41k ± 0% 1.42k ± 0% +1.00% encoding/base64 1.67k ± 0% 1.69k ± 0% +0.90% encoding/binary 2.65k ± 0% 2.76k ± 0% +4.07% encoding/gob 13.3k ± 0% 13.5k ± 0% +1.65% 
encoding/hex 854 ± 0% 857 ± 0% +0.35% encoding/json 11.9k ± 0% 12.1k ± 0% +1.71% encoding/pem 484 ± 0% 484 ± 0% +0.00% errors 360 ± 0% 361 ± 0% +0.28% flag 7.32k ± 0% 7.42k ± 0% +1.48% fmt 1.42k ± 0% 1.42k ± 0% +0.00% go/ast 15.7k ± 0% 15.8k ± 0% +1.07% go/parser 7.48k ± 0% 7.59k ± 0% +1.55% go/scanner 3.88k ± 0% 3.94k ± 0% +1.39% go/token 3.51k ± 0% 3.53k ± 0% +0.60% hash 507 ± 0% 507 ± 0% +0.00% hash/crc32 685 ± 0% 685 ± 0% +0.00% internal/nettrace 474 ± 0% 474 ± 0% +0.00% internal/pprof/profile 8.29k ± 0% 8.36k ± 0% +0.89% internal/race 511 ± 0% 511 ± 0% +0.00% internal/singleflight 966 ± 0% 969 ± 0% +0.31% internal/syscall/unix 427 ± 0% 427 ± 0% +0.00% io 3.48k ± 0% 3.52k ± 0% +1.15% io/ioutil 5.30k ± 0% 5.38k ± 0% +1.53% log 4.46k ± 0% 4.53k ± 0% +1.59% math 3.72k ± 0% 3.75k ± 0% +0.75% math/big 8.91k ± 0% 9.01k ± 0% +1.15% math/rand 1.29k ± 0% 1.30k ± 0% +0.46% mime 2.59k ± 0% 2.63k ± 0% +1.55% mime/multipart 3.61k ± 0% 3.68k ± 0% +1.80% mime/quotedprintable 2.20k ± 0% 2.25k ± 0% +2.50% net 21.1k ± 0% 21.3k ± 0% +1.10% net/http 56.6k ± 0% 57.3k ± 0% +1.28% net/http/httptest 33.6k ± 0% 34.1k ± 0% +1.38% net/http/httptrace 14.4k ± 0% 14.5k ± 0% +1.29% net/http/internal 2.70k ± 0% 2.77k ± 0% +2.59% net/textproto 4.51k ± 0% 4.60k ± 0% +1.82% net/url 1.71k ± 0% 1.73k ± 0% +1.41% os 11.3k ± 0% 11.4k ± 0% +1.36% path 587 ± 0% 589 ± 0% +0.34% path/filepath 4.46k ± 0% 4.55k ± 0% +1.88% reflect 6.39k ± 0% 6.43k ± 0% +0.72% regexp 5.82k ± 0% 5.88k ± 0% +1.12% regexp/syntax 3.22k ± 0% 3.24k ± 0% +0.62% runtime 12.9k ± 0% 13.2k ± 0% +1.94% runtime/cgo 229 ± 0% 229 ± 0% +0.00% runtime/debug 3.66k ± 0% 3.72k ± 0% +1.86% runtime/internal/atomic 905 ± 0% 905 ± 0% +0.00% runtime/internal/sys 2.00k ± 0% 2.05k ± 0% +2.55% runtime/pprof 4.16k ± 0% 4.23k ± 0% +1.66% runtime/pprof/internal/protopprof 11.5k ± 0% 11.7k ± 0% +1.27% runtime/trace 354 ± 0% 354 ± 0% +0.00% sort 1.63k ± 0% 1.68k ± 0% +2.94% strconv 1.84k ± 0% 1.85k ± 0% +0.54% strings 3.87k ± 0% 3.97k ± 0% +2.48% sync 
1.51k ± 0% 1.52k ± 0% +0.33% sync/atomic 1.58k ± 0% 1.60k ± 0% +1.27% syscall 53.2k ± 0% 53.3k ± 0% +0.20% testing 8.14k ± 0% 8.26k ± 0% +1.49% testing/internal/testdeps 597 ± 0% 598 ± 0% +0.17% text/tabwriter 3.09k ± 0% 3.14k ± 0% +1.85% text/template 15.4k ± 0% 15.7k ± 0% +1.89% text/template/parse 8.90k ± 0% 9.12k ± 0% +2.46% time 5.75k ± 0% 5.86k ± 0% +1.86% unicode 4.62k ± 0% 4.62k ± 0% +0.07% unicode/utf16 693 ± 0% 706 ± 0% +1.88% unicode/utf8 1.05k ± 0% 1.07k ± 0% +1.14% vendor/golang_org/x/crypto/chacha20poly1305 1.25k ± 0% 1.26k ± 0% +0.64% vendor/golang_org/x/crypto/curve25519 392 ± 0% 392 ± 0% +0.00% vendor/golang_org/x/crypto/poly1305 426 ± 0% 426 ± 0% +0.00% vendor/golang_org/x/net/http2/hpack 4.19k ± 0% 4.26k ± 0% +1.69% vendor/golang_org/x/net/idna 355 ± 0% 355 ± 0% +0.00% vendor/golang_org/x/net/lex/httplex 609 ± 0% 615 ± 0% +0.99% vendor/golang_org/x/text/transform 1.31k ± 0% 1.31k ± 0% +0.08% vendor/golang_org/x/text/unicode/norm 5.78k ± 0% 5.90k ± 0% +2.06% vendor/golang_org/x/text/width 1.24k ± 0% 1.24k ± 0% +0.16% [Geo mean] 2.49k 2.52k +1.10% Fixes #18167. Change-Id: Ia5b7e70adc9652c7ee9954ca2efc1c59fa79be2b Reviewed-on: https://go-review.googlesource.com/33911 Run-TryBot: David Lazar <lazard@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org> Reviewed-by: David Chase <drchase@google.com>	2017-02-08 20:59:45 +00:00
Matthew Dempsky	7bad00366b	cmd/internal/obj: remove ATYPE In cmd/compile, we can directly construct obj.Auto to represent local variables and attach them to the function's obj.LSym. In preparation for being able to emit more precise DWARF info based on other compiler available information (e.g., lexical scoping). Change-Id: I9c4225ec59306bec42552838493022e0e9d70228 Reviewed-on: https://go-review.googlesource.com/36420 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Heschi Kreinick <heschi@google.com> Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>	2017-02-07 22:38:18 +00:00
Cherry Zhang	a833485828	cmd/compile: do not use statictmp for zeroing Also fixes #18687. Change-Id: I7c6d47c71e632adf4c16937a29074621f771844c Reviewed-on: https://go-review.googlesource.com/35261 Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2017-02-07 21:15:21 +00:00

1538 commits