Stowage/go - Remotebranch.eu

Stowage/go

mirror of https://github.com/golang/go.git synced 2025-11-11 14:11:04 +00:00

Author	SHA1	Message	Date
smasher164	58b031949b	cmd/compile: add fma intrinsic for arm This change introduces an arm intrinsic that generates the FMULAD instruction for the fused-multiply-add operation on systems that support it. System support is detected via cpu.ARM.HasVFPv4. A rewrite rule translates the generic intrinsic to FMULAD. Updates #25819. Change-Id: I8459e5dd1cdbdca35f88a78dbeb7d387f1e20efa Reviewed-on: https://go-review.googlesource.com/c/go/+/142117 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2019-10-21 17:42:47 +00:00
smasher164	7a6da218b1	cmd/compile: add fma intrinsic for amd64 To permit ssa-level optimization, this change introduces an amd64 intrinsic that generates the VFMADD231SD instruction for the fused-multiply-add operation on systems that support it. System support is detected via cpu.X86.HasFMA. A rewrite rule can then translate the generic ssa intrinsic ("Fma") to VFMADD231SD. The benchmark compares the software implementation (old) with the intrinsic (new). name old time/op new time/op delta Fma-4 27.2ns ± 1% 1.0ns ± 9% -96.48% (p=0.008 n=5+5) Updates #25819. Change-Id: I966655e5f96817a5d06dff5942418a3915b09584 Reviewed-on: https://go-review.googlesource.com/c/go/+/137156 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2019-10-21 16:42:10 +00:00
Matthew Dempsky	80a6fedea0	cmd/compile: add -d=checkptr to validate unsafe.Pointer rules This CL adds -d=checkptr as a compile-time option for adding instrumentation to check that Go code is following unsafe.Pointer safety rules dynamically. In particular, it currently checks two things: 1. When converting unsafe.Pointer to *T, make sure the resulting pointer is aligned appropriately for T. 2. When performing pointer arithmetic, if the result points to a Go heap object, make sure we can find an unsafe.Pointer-typed operand that pointed into the same object. These checks are currently disabled for the runtime, and can also be disabled through a new //go:nocheckptr annotation. The latter is necessary for functions like strings.noescape, which intentionally violate safety rules to workaround escape analysis limitations. Fixes #22218. Change-Id: If5a51273881d93048f74bcff10a3275c9c91da6a Reviewed-on: https://go-review.googlesource.com/c/go/+/162237 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2019-10-17 00:40:21 +00:00
Keith Randall	36f30ba289	cmd/compile,runtime: generate hash functions only for types which are map keys Right now we generate hash functions for all types, just in case they are used as map keys. That's a lot of wasted effort and binary size for types which will never be used as a map key. Instead, generate hash functions only for types that we know are map keys. Just doing that is a bit too simple, since maps with an interface type as a key might have to hash any concrete key type that implements that interface. So for that case, implement hashing of such types at runtime (instead of with generated code). It will be slower, but only for maps with interface types as keys, and maybe only a bit slower as the aeshash time probably dominates the dispatch time. Reorg where we keep the equals and hash functions. Move the hash function from the key type to the map type, saving a field in every non-map type. That leaves only one function in the alg structure, so get rid of that and just keep the equal function in the type descriptor itself. cmd/go now has 10 generated hash functions, instead of 504. Makes cmd/go 1.0% smaller. Update #6853. Speed on non-interface keys is unchanged. Speed on interface keys is ~20% slower: name old time/op new time/op delta MapInterfaceString-8 23.0ns ±21% 27.6ns ±14% +20.01% (p=0.002 n=10+10) MapInterfacePtr-8 19.4ns ±16% 23.7ns ± 7% +22.48% (p=0.000 n=10+8) Change-Id: I7c2e42292a46b5d4e288aaec4029bdbb01089263 Reviewed-on: https://go-review.googlesource.com/c/go/+/191198 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Martin Möhrmann <moehrmann@google.com>	2019-09-03 20:41:29 +00:00
Keith Randall	2c423f063b	cmd/compile,runtime: provide index information on bounds check failure A few examples (for accessing a slice of length 3): s[-1] runtime error: index out of range [-1] s[3] runtime error: index out of range [3] with length 3 s[-1:0] runtime error: slice bounds out of range [-1:] s[3:0] runtime error: slice bounds out of range [3:0] s[3:-1] runtime error: slice bounds out of range [:-1] s[3:4] runtime error: slice bounds out of range [:4] with capacity 3 s[0:3:4] runtime error: slice bounds out of range [::4] with capacity 3 Note that in cases where there are multiple things wrong with the indexes (e.g. s[3:-1]), we report one of those errors kind of arbitrarily, currently the rightmost one. An exhaustive set of examples is in issue30116[u].out in the CL. The message text has the same prefix as the old message text. That leads to slightly awkward phrasing but hopefully minimizes the chance that code depending on the error text will break. Increases the size of the go binary by 0.5% (amd64). The panic functions take arguments in registers in order to keep the size of the compiled code as small as possible. Fixes #30116 Change-Id: Idb99a827b7888822ca34c240eca87b7e44a04fdd Reviewed-on: https://go-review.googlesource.com/c/go/+/161477 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>	2019-03-18 17:33:38 +00:00
Keith Randall	585c9e8412	cmd/compile: implement shifts by signed amounts Allow shifts by signed amounts. Panic if the shift amount is negative. TODO: We end up doing two compares per shift, see Ian's comment https://github.com/golang/go/issues/19113#issuecomment-443241799 that we could do it with a single comparison in the normal case. The prove pass mostly handles this code well. For instance, it removes the <0 check for cases like this: if s >= 0 { _ = x << s } _ = x << len(a) This case isn't handled well yet: _ = x << (y & 0xf) I'll do followon CLs for unhandled cases as needed. Update #19113 R=go1.13 Change-Id: I839a5933d94b54ab04deb9dd5149f32c51c90fa1 Reviewed-on: https://go-review.googlesource.com/c/158719 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>	2019-02-15 23:13:09 +00:00
Martin Möhrmann	75798e8ada	runtime: make processor capability variable naming platform specific The current support_XXX variables are specific for the amd64 and 386 platforms. Prefix processor capability variables by architecture to have a consistent naming scheme and avoid reuse of the existing variables for new platforms. This also aligns naming of runtime variables closer with internal/cpu processor capability variable names. Change-Id: I3eabb29a03874678851376185d3a62e73c1aff1d Reviewed-on: https://go-review.googlesource.com/c/91435 Run-TryBot: Martin Möhrmann <martisch@uos.de> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2018-11-14 20:30:31 +00:00
Josh Bleecher Snyder	5848b6c9b8	cmd/compile: shrink specialized convT2x call sites convT2E16 and other specialized type-to-interface routines accept a type/itab argument and return a complete interface value. However, we know enough in the routine to do without the type. And the caller can construct the interface value using the type. Doing so shrinks the call sites of ten of the specialized convT2x routines. It also lets us unify the empty and non-empty interface routines. Cuts 12k off cmd/go. name old time/op new time/op delta ConvT2ESmall-8 2.96ns ± 2% 2.34ns ± 4% -21.01% (p=0.000 n=175+189) ConvT2EUintptr-8 3.00ns ± 3% 2.34ns ± 4% -22.02% (p=0.000 n=189+187) ConvT2ELarge-8 21.3ns ± 7% 21.5ns ± 5% +1.02% (p=0.000 n=200+197) ConvT2ISmall-8 2.99ns ± 4% 2.33ns ± 3% -21.95% (p=0.000 n=193+184) ConvT2IUintptr-8 3.02ns ± 3% 2.33ns ± 3% -22.82% (p=0.000 n=198+190) ConvT2ILarge-8 21.7ns ± 5% 22.2ns ± 4% +2.31% (p=0.000 n=199+198) ConvT2Ezero/zero/16-8 2.96ns ± 2% 2.33ns ± 3% -21.11% (p=0.000 n=174+187) ConvT2Ezero/zero/32-8 2.96ns ± 1% 2.35ns ± 4% -20.62% (p=0.000 n=163+193) ConvT2Ezero/zero/64-8 2.99ns ± 2% 2.34ns ± 4% -21.78% (p=0.000 n=183+188) ConvT2Ezero/zero/str-8 3.27ns ± 3% 2.54ns ± 3% -22.32% (p=0.000 n=195+192) ConvT2Ezero/zero/slice-8 3.46ns ± 4% 2.81ns ± 3% -18.96% (p=0.000 n=197+164) ConvT2Ezero/zero/big-8 88.4ns ±20% 90.0ns ±20% +1.84% (p=0.000 n=196+198) ConvT2Ezero/nonzero/16-8 12.6ns ± 3% 12.3ns ± 3% -2.34% (p=0.000 n=167+196) ConvT2Ezero/nonzero/32-8 12.3ns ± 4% 11.9ns ± 3% -2.95% (p=0.000 n=187+193) ConvT2Ezero/nonzero/64-8 14.2ns ± 6% 13.8ns ± 5% -2.94% (p=0.000 n=198+199) ConvT2Ezero/nonzero/str-8 27.2ns ± 5% 26.8ns ± 5% -1.33% (p=0.000 n=200+198) ConvT2Ezero/nonzero/slice-8 33.3ns ± 8% 33.1ns ± 6% -0.82% (p=0.000 n=199+200) ConvT2Ezero/nonzero/big-8 88.8ns ±22% 90.2ns ±18% +1.58% (p=0.000 n=200+199) Neligible toolspeed impact. name old alloc/op new alloc/op delta Template 35.4MB ± 0% 35.3MB ± 0% -0.06% (p=0.008 n=5+5) Unicode 29.1MB ± 0% 29.1MB ± 0% ~ (p=0.310 n=5+5) GoTypes 122MB ± 0% 122MB ± 0% -0.08% (p=0.008 n=5+5) Compiler 514MB ± 0% 513MB ± 0% -0.02% (p=0.008 n=5+5) SSA 1.94GB ± 0% 1.94GB ± 0% -0.01% (p=0.008 n=5+5) Flate 24.2MB ± 0% 24.2MB ± 0% ~ (p=0.548 n=5+5) GoParser 28.5MB ± 0% 28.5MB ± 0% -0.05% (p=0.016 n=5+5) Reflect 86.3MB ± 0% 86.2MB ± 0% -0.02% (p=0.008 n=5+5) Tar 34.9MB ± 0% 34.9MB ± 0% ~ (p=0.095 n=5+5) XML 47.1MB ± 0% 47.1MB ± 0% -0.05% (p=0.008 n=5+5) [Geo mean] 81.0MB 81.0MB -0.03% name old allocs/op new allocs/op delta Template 349k ± 0% 349k ± 0% -0.08% (p=0.008 n=5+5) Unicode 340k ± 0% 340k ± 0% ~ (p=0.111 n=5+5) GoTypes 1.28M ± 0% 1.28M ± 0% -0.09% (p=0.008 n=5+5) Compiler 4.92M ± 0% 4.92M ± 0% -0.08% (p=0.008 n=5+5) SSA 15.3M ± 0% 15.3M ± 0% -0.03% (p=0.008 n=5+5) Flate 233k ± 0% 233k ± 0% ~ (p=0.500 n=5+5) GoParser 292k ± 0% 292k ± 0% -0.06% (p=0.008 n=5+5) Reflect 1.05M ± 0% 1.05M ± 0% -0.02% (p=0.008 n=5+5) Tar 344k ± 0% 343k ± 0% -0.06% (p=0.008 n=5+5) XML 430k ± 0% 429k ± 0% -0.08% (p=0.008 n=5+5) [Geo mean] 809k 809k -0.05% name old object-bytes new object-bytes delta Template 507kB ± 0% 507kB ± 0% -0.04% (p=0.008 n=5+5) Unicode 225kB ± 0% 225kB ± 0% ~ (all equal) GoTypes 1.85MB ± 0% 1.85MB ± 0% -0.08% (p=0.008 n=5+5) Compiler 6.75MB ± 0% 6.75MB ± 0% +0.01% (p=0.008 n=5+5) SSA 21.4MB ± 0% 21.4MB ± 0% -0.02% (p=0.008 n=5+5) Flate 328kB ± 0% 328kB ± 0% -0.03% (p=0.008 n=5+5) GoParser 403kB ± 0% 402kB ± 0% -0.06% (p=0.008 n=5+5) Reflect 1.41MB ± 0% 1.41MB ± 0% -0.03% (p=0.008 n=5+5) Tar 457kB ± 0% 457kB ± 0% -0.05% (p=0.008 n=5+5) XML 601kB ± 0% 600kB ± 0% -0.16% (p=0.008 n=5+5) [Geo mean] 1.05MB 1.04MB -0.05% Change-Id: I677a4108c0ecd32617549294036aa84f9214c4fe Reviewed-on: https://go-review.googlesource.com/c/147360 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Martin Möhrmann <moehrmann@google.com>	2018-11-06 00:02:14 +00:00
Martin Möhrmann	020a18c545	cmd/compile: move slice construction to callers of makeslice Only return a pointer p to the new slices backing array from makeslice. Makeslice callers then construct sliceheader{p, len, cap} explictly instead of makeslice returning the slice. Reduces go binary size by ~0.2%. Removes 92 (~3.5%) panicindex calls from go binary. Change-Id: I29b7c3b5fe8b9dcec96e2c43730575071cfe8a94 Reviewed-on: https://go-review.googlesource.com/c/141822 Run-TryBot: Martin Möhrmann <moehrmann@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>	2018-10-29 19:23:00 +00:00
Keith Randall	0e9f8a21f8	runtime,cmd/compile: pass strings and slices to convT2{E,I} by value When we pass these types by reference, we usually have to allocate temporaries on the stack, initialize them, then pass their address to the conversion functions. It's simpler to pass these types directly by value. This particularly applies to conversions needed for fmt.Printf (to interface{} for constructing a [...]interface{}). func f(a, b, c string) { fmt.Printf("%s %s\n", a, b) fmt.Printf("%s %s\n", b, c) } This function's stack frame shrinks from 200 to 136 bytes, and its code shrinks from 535 to 453 bytes. The go binary shrinks 0.3%. Update #24286 Aside: for this function f, we don't really need to allocate temporaries for the convT2E function. We could use the address of a, b, and c directly. That might get similar (or maybe better?) improvements. I investigated a bit, but it seemed complicated to do it safely. This change was much easier. Change-Id: I78cbe51b501fb41e1e324ce4203f0de56a1db82d Reviewed-on: https://go-review.googlesource.com/c/135377 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>	2018-10-14 03:46:51 +00:00
Lynn Boger	9e9ff565cd	runtime/race: implement race detector for ppc64le This adds the support to enable the race detector for ppc64le. Added runtime/race_ppc64le.s to manage the calls from Go to the LLVM tsan functions, mostly converting from the Go ABI to the PPC64 ABI expected by Clang generated code. Changed racewalk.go to call racefuncenterfp instead of racefuncenter on ppc64le to allow the caller pc to be obtained in the asm code before calling the tsan version. Changed the set up code for racecallbackthunk so it doesn't use the autogenerated save and restore of the link register since that sequence uses registers inconsistent with the normal ppc64 ABI. Made various changes to recognize that race is supported for ppc64le. Ensured that tls_g is updated and accessible from race_linux_ppc64le.s so that the race ctx can be obtained and passed to tsan functions. This enables the race tests for ppc64le in cmd/dist/test.go and increases the timeout when running the benchmarks with the -race option to avoid timing out. Updates #24354, #23731 Change-Id: Ib97dc7ac313e6313c836dc7d2fb698f9d8fba3ef Reviewed-on: https://go-review.googlesource.com/107935 Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Bryan C. Mills <bcmills@google.com> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-06-11 17:45:36 +00:00
Martin Möhrmann	aee71dd70b	cmd/compile: optimize map-clearing range idiom replace map clears of the form: for k := range m { delete(m, k) } (where m is map with key type that is reflexive for ==) with a new runtime function that clears the maps backing array with a memclr and reinitializes the hmap struct. Map key types that for example contain floats are not replaced by this optimization since NaN keys cannot be deleted from maps using delete. name old time/op new time/op delta GoMapClear/Reflexive/1 92.2ns ± 1% 47.1ns ± 2% -48.89% (p=0.000 n=9+9) GoMapClear/Reflexive/10 108ns ± 1% 48ns ± 2% -55.68% (p=0.000 n=10+10) GoMapClear/Reflexive/100 303ns ± 2% 110ns ± 3% -63.56% (p=0.000 n=10+10) GoMapClear/Reflexive/1000 3.58µs ± 3% 1.23µs ± 2% -65.49% (p=0.000 n=9+10) GoMapClear/Reflexive/10000 28.2µs ± 3% 10.3µs ± 2% -63.55% (p=0.000 n=9+10) GoMapClear/NonReflexive/1 121ns ± 2% 124ns ± 7% ~ (p=0.097 n=10+10) GoMapClear/NonReflexive/10 137ns ± 2% 139ns ± 3% +1.53% (p=0.033 n=10+10) GoMapClear/NonReflexive/100 331ns ± 3% 334ns ± 2% ~ (p=0.342 n=10+10) GoMapClear/NonReflexive/1000 3.64µs ± 3% 3.64µs ± 2% ~ (p=0.887 n=9+10) GoMapClear/NonReflexive/10000 28.1µs ± 2% 28.4µs ± 3% ~ (p=0.247 n=10+10) Fixes #20138 Change-Id: I181332a8ef434a4f0d89659f492d8711db3f3213 Reviewed-on: https://go-review.googlesource.com/110055 Reviewed-by: Keith Randall <khr@golang.org>	2018-05-08 21:15:16 +00:00
Martin Möhrmann	b9a59d9f2e	cmd/compile: optimize len([]rune(string)) Adds a new runtime function to count runes in a string. Modifies the compiler to detect the pattern len([]rune(string)) and replaces it with the new rune counting runtime function. RuneCount/lenruneslice/ASCII 27.8ns ± 2% 14.5ns ± 3% -47.70% (p=0.000 n=10+10) RuneCount/lenruneslice/Japanese 126ns ± 2% 60ns ± 2% -52.03% (p=0.000 n=10+10) RuneCount/lenruneslice/MixedLength 104ns ± 2% 50ns ± 1% -51.71% (p=0.000 n=10+9) Fixes #24923 Change-Id: Ie9c7e7391a4e2cca675c5cdcc1e5ce7d523948b9 Reviewed-on: https://go-review.googlesource.com/108985 Run-TryBot: Martin Möhrmann <moehrmann@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>	2018-05-06 05:31:01 +00:00
Martin Möhrmann	a8a60ac2a7	cmd/compile: optimize append(x, make([]T, y)...) slice extension Changes the compiler to recognize the slice extension pattern append(x, make([]T, y)...) and replace it with growslice and an optional memclr to avoid an allocation for make([]T, y). Memclr is not called in case growslice already allocated a new cleared backing array when T contains pointers. amd64: name old time/op new time/op delta ExtendSlice/IntSlice 103ns ± 4% 57ns ± 4% -44.55% (p=0.000 n=18+18) ExtendSlice/PointerSlice 155ns ± 3% 77ns ± 3% -49.93% (p=0.000 n=20+20) ExtendSlice/NoGrow 50.2ns ± 3% 5.2ns ± 2% -89.67% (p=0.000 n=18+18) name old alloc/op new alloc/op delta ExtendSlice/IntSlice 64.0B ± 0% 32.0B ± 0% -50.00% (p=0.000 n=20+20) ExtendSlice/PointerSlice 64.0B ± 0% 32.0B ± 0% -50.00% (p=0.000 n=20+20) ExtendSlice/NoGrow 32.0B ± 0% 0.0B -100.00% (p=0.000 n=20+20) name old allocs/op new allocs/op delta ExtendSlice/IntSlice 2.00 ± 0% 1.00 ± 0% -50.00% (p=0.000 n=20+20) ExtendSlice/PointerSlice 2.00 ± 0% 1.00 ± 0% -50.00% (p=0.000 n=20+20) ExtendSlice/NoGrow 1.00 ± 0% 0.00 -100.00% (p=0.000 n=20+20) Fixes #21266 Change-Id: Idc3077665f63cbe89762b590c5967a864fd1c07f Reviewed-on: https://go-review.googlesource.com/109517 Run-TryBot: Martin Möhrmann <moehrmann@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>	2018-05-06 04:28:23 +00:00
Matthew Dempsky	6658219b7e	runtime: eliminate scase.receivedp Make selectgo return recvOK as a result parameter instead. Change-Id: Iffd436371d360bf666b76d4d7503e7c3037a9f1d Reviewed-on: https://go-review.googlesource.com/37935 Reviewed-by: Austin Clements <austin@google.com>	2018-05-01 03:17:54 +00:00
Matthew Dempsky	004260afde	cmd/compile: open code select{send,recv,default} Registration now looks like: var cases [4]runtime.scases var order [8]uint16 cases[0].kind = caseSend cases[0].c = c1 cases[0].elem = &v1 if raceenabled \|\| msanenabled { selectsetpc(&cases[0]) } cases[1].kind = caseRecv cases[1].c = c2 cases[1].elem = &v2 if raceenabled \|\| msanenabled { selectsetpc(&cases[1]) } ... Change-Id: Ib9bcf426a4797fe4bfd8152ca9e6e08e39a70b48 Reviewed-on: https://go-review.googlesource.com/37934 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2018-05-01 03:17:44 +00:00
Matthew Dempsky	3aa53b3135	runtime: eliminate runtime.hselect Now the registration phase looks like: var cases [4]runtime.scases var order [8]uint16 selectsend(&cases[0], c1, &v1) selectrecv(&cases[1], c2, &v2, nil) selectrecv(&cases[2], c3, &v3, &ok) selectdefault(&cases[3]) chosen := selectgo(&cases[0], &order[0], 4) Primarily, this is just preparation for having the compiler open-code selectsend, selectrecv, and selectdefault. As a minor benefit, order can now be layed out separately on the stack in the pointer-free segment, so it won't take up space in the function's stack pointer maps. Change-Id: I5552ba594201efd31fcb40084da20b42ea569a45 Reviewed-on: https://go-review.googlesource.com/37933 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2018-05-01 03:17:31 +00:00
ChrisALiles	22ff9521da	cmd/compile: pass arguments to convt2E/I integer functions by value The motivation is avoid generating a pointer to the data being converted so it can be garbage collected. The change also slightly reduces binary size by shrinking call sites. Fixes #24286 Benchmark results: name old time/op new time/op delta ConvT2ESmall-4 2.86ns ± 0% 2.80ns ± 1% -2.12% (p=0.000 n=29+28) ConvT2EUintptr-4 2.88ns ± 1% 2.88ns ± 0% -0.20% (p=0.002 n=28+30) ConvT2ELarge-4 19.6ns ± 0% 20.4ns ± 1% +4.22% (p=0.000 n=19+30) ConvT2ISmall-4 3.01ns ± 0% 2.85ns ± 0% -5.32% (p=0.000 n=24+28) ConvT2IUintptr-4 3.00ns ± 1% 2.87ns ± 0% -4.44% (p=0.000 n=29+25) ConvT2ILarge-4 20.4ns ± 1% 21.3ns ± 1% +4.41% (p=0.000 n=30+26) ConvT2Ezero/zero/16-4 2.84ns ± 1% 2.99ns ± 0% +5.38% (p=0.000 n=30+25) ConvT2Ezero/zero/32-4 2.83ns ± 2% 3.00ns ± 0% +5.91% (p=0.004 n=27+3) Change-Id: I65016ec94c53f97c52113121cab582d0c342b7a8 Reviewed-on: https://go-review.googlesource.com/102636 Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com> Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2018-04-29 15:53:39 +00:00
Matthew Dempsky	b55eedd173	Revert "cmd/compile: cleanup nodpc and nodfp" This reverts commit `dcac984b97`. Reason for revert: broke LR architectures (arm64, ppc64, s390x) Change-Id: I531d311c9053e81503c8c78d6cf044b318fc828b Reviewed-on: https://go-review.googlesource.com/99695 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2018-03-08 21:23:01 +00:00
Matthew Dempsky	dcac984b97	cmd/compile: cleanup nodpc and nodfp Instead of creating a new &nodfp expression for every recover() call, or a new nodpc variable for every function instrumented by the race detector, this CL introduces two new uintptr-typed pseudo-variables callerSP and callerPC. These pseudo-variables act just like calls to the runtime's getcallersp() and getcallerpc() functions. For consistency, change runtime.gorecover's builtin stub's parameter type from "*int32" to "uintptr". Passes toolstash-check, but toolstash-check -race fails because of register allocator changes. Change-Id: I985d644653de2dac8b7b03a28829ad04dfd4f358 Reviewed-on: https://go-review.googlesource.com/99416 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Daniel Martí <mvdan@mvdan.cc> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-03-08 18:22:29 +00:00
Austin Clements	2010189407	runtime: remove legacy eager write barrier Now that the buffered write barrier is implemented for all architectures, we can remove the old eager write barrier implementation. This CL removes the implementation from the runtime, support in the compiler for calling it, and updates some compiler tests that relied on the old eager barrier support. It also makes sure that all of the useful comments from the old write barrier implementation still have a place to live. Fixes #22460. Updates #21640 since this fixes the layering concerns of the write barrier (but not the other things in that issue). Change-Id: I580f93c152e89607e0a72fe43370237ba97bae74 Reviewed-on: https://go-review.googlesource.com/92705 Run-TryBot: Austin Clements <austin@google.com> Reviewed-by: Rick Hudson <rlh@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2018-02-13 16:34:46 +00:00
Keith Randall	48e207d518	cmd/compile: fix mapassign_fast* routines for pointer keys The signature of the mapassign_fast* routines need to distinguish the pointerness of their key argument. If the affected routines suspend part way through, the object pointed to by the key might get garbage collected because the key is typed as a uint{32,64}. This is not a problem for mapaccess or mapdelete because the key in those situations do not live beyond the call involved. If the object referenced by the key is garbage collected prematurely, the code still works fine. Even if that object is subsequently reallocated, it can't be written to the map in time to affect the lookup/delete. Fixes #22781 Change-Id: I0bbbc5e9883d5ce702faf4e655348be1191ee439 Reviewed-on: https://go-review.googlesource.com/79018 Run-TryBot: Keith Randall <khr@golang.org> Reviewed-by: Austin Clements <austin@google.com> Reviewed-by: Martin Möhrmann <moehrmann@google.com>	2017-11-22 04:30:27 +00:00
Martin Möhrmann	fbfc2031a6	cmd/compile: specialize map creation for small hint sizes Handle make(map[any]any) and make(map[any]any, hint) where hint <= BUCKETSIZE special to allow for faster map initialization and to improve binary size by using runtime calls with fewer arguments. Given hint is smaller or equal to BUCKETSIZE in which case overLoadFactor(hint, 0) is false and no buckets would be allocated by makemap: * If hmap needs to be allocated on the stack then only hmap's hash0 field needs to be initialized and no call to makemap is needed. * If hmap needs to be allocated on the heap then a new special makehmap function will allocate hmap and intialize hmap's hash0 field. Reduces size of the godoc by ~36kb. AMD64 name old time/op new time/op delta NewEmptyMap 16.6ns ± 2% 5.5ns ± 2% -66.72% (p=0.000 n=10+10) NewSmallMap 64.8ns ± 1% 56.5ns ± 1% -12.75% (p=0.000 n=9+10) Updates #6853 Change-Id: I624e90da6775afaa061178e95db8aca674f44e9b Reviewed-on: https://go-review.googlesource.com/61190 Run-TryBot: Martin Möhrmann <moehrmann@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2017-11-02 17:03:45 +00:00
Ilya Tocar	94484d8ed5	cmd/compile: intrinsify math.{Trunc/Ceil/Floor} on amd64 This significantly speed-ups Trunc. Ceil/Floor are using the same instruction, so do them too. name old time/op new time/op delta Floor-6 3.33ns ± 1% 3.22ns ± 0% -3.39% (p=0.000 n=10+10) Ceil-6 3.33ns ± 1% 3.22ns ± 0% -3.16% (p=0.000 n=10+7) Trunc-6 4.83ns ± 0% 3.22ns ± 0% -33.36% (p=0.000 n=6+8) Change-Id: If848790e458eedfe38a6a0407bb4f589c68ac254 Reviewed-on: https://go-review.googlesource.com/68630 Run-TryBot: Ilya Tocar <ilya.tocar@intel.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2017-10-31 19:30:54 +00:00
Daniel Martí	99da8730b0	all: remove some double spaces from comments Went mainly for the ones that make no sense, such as the ones mid-sentence or after commas. Change-Id: Ie245d2c19cc7428a06295635cf6a9482ade25ff0 Reviewed-on: https://go-review.googlesource.com/57293 Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Daniel Martí <mvdan@mvdan.cc> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-08-26 15:09:09 +00:00
Martin Möhrmann	cbc4e5d9c4	cmd/compile: generate makemap calls with int arguments Where possible generate calls to runtime makemap with int hint argument during compile time instead of makemap with int64 hint argument. This eliminates converting the hint argument for calls to makemap with int64 hint argument for platforms where int64 values do not fit into an argument of type int. A similar optimization for makeslice was introduced in CL golang.org/cl/27851. 386: name old time/op new time/op delta NewEmptyMap 53.5ns ± 5% 41.9ns ± 5% -21.56% (p=0.000 n=10+10) NewSmallMap 182ns ± 1% 165ns ± 1% -8.92% (p=0.000 n=10+10) Change-Id: Ibd2b4c57b36f171b173bf7a0602b3a59771e6e44 Reviewed-on: https://go-review.googlesource.com/55142 Reviewed-by: Keith Randall <khr@golang.org>	2017-08-22 20:28:21 +00:00
Martin Möhrmann	3216e0cefa	cmd/compile: replace eqstring with memequal eqstring is only called for strings with equal lengths. Instead of pushing a pointer and length for each argument string on the stack we can omit pushing one of the lengths on the stack. Changing eqstrings signature to eqstring(uint8, uint8, int) bool to implement the above optimization would make it very similar to the existing memequal(any, any, uintptr) bool function. Since string lengths are positive we can avoid code redundancy and use memequal instead of using eqstring with an optimized signature. go command binary size reduced by 4128 bytes on amd64. name old time/op new time/op delta CompareStringEqual 6.03ns ± 1% 5.71ns ± 1% -5.23% (p=0.000 n=19+18) CompareStringIdentical 2.88ns ± 1% 3.22ns ± 7% +11.86% (p=0.000 n=20+20) CompareStringSameLength 4.31ns ± 1% 4.01ns ± 1% -7.17% (p=0.000 n=19+19) CompareStringDifferentLength 0.29ns ± 2% 0.29ns ± 2% ~ (p=1.000 n=20+20) CompareStringBigUnaligned 64.3µs ± 2% 64.1µs ± 3% ~ (p=0.164 n=20+19) CompareStringBig 61.9µs ± 1% 61.6µs ± 2% -0.46% (p=0.033 n=20+19) Change-Id: Ice15f3b937c981f0d3bc8479a9ea0d10658ac8df Reviewed-on: https://go-review.googlesource.com/53650 Run-TryBot: Martin Möhrmann <moehrmann@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2017-08-22 17:59:02 +00:00
Martin Möhrmann	06a78b5737	cmd/compile: pass stack allocated bucket to makemap inside hmap name old time/op new time/op delta NewEmptyMap 53.2ns ± 7% 48.0ns ± 5% -9.77% (p=0.000 n=20+20) NewSmallMap 111ns ± 1% 106ns ± 2% -3.78% (p=0.000 n=20+19) Change-Id: I979d21ab16eae9f6893873becca517db57e054b5 Reviewed-on: https://go-review.googlesource.com/56290 Run-TryBot: Martin Möhrmann <moehrmann@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>	2017-08-22 06:01:59 +00:00
Martin Möhrmann	8a6e51aede	cmd/compile: generate makechan calls with int arguments Where possible generate calls to runtime makechan with int arguments during compile time instead of makechan with int64 arguments. This eliminates converting arguments for calls to makechan with int64 arguments for platforms where int64 values do not fit into arguments of type int. A similar optimization for makeslice was introduced in CL golang.org/cl/27851. 386: name old time/op new time/op delta MakeChan/Byte 52.4ns ± 6% 45.0ns ± 1% -14.14% (p=0.000 n=10+10) MakeChan/Int 54.5ns ± 1% 49.1ns ± 1% -9.87% (p=0.000 n=10+10) MakeChan/Ptr 150ns ± 1% 143ns ± 0% -4.38% (p=0.000 n=9+7) MakeChan/Struct/0 49.2ns ± 2% 43.2ns ± 2% -12.27% (p=0.000 n=10+10) MakeChan/Struct/32 81.7ns ± 2% 76.2ns ± 1% -6.71% (p=0.000 n=10+10) MakeChan/Struct/40 88.4ns ± 2% 82.5ns ± 2% -6.60% (p=0.000 n=10+10) AMD64: name old time/op new time/op delta MakeChan/Byte 83.4ns ± 8% 80.8ns ± 3% ~ (p=0.171 n=10+10) MakeChan/Int 101ns ± 3% 101ns ± 2% ~ (p=0.412 n=10+10) MakeChan/Ptr 128ns ± 1% 128ns ± 1% ~ (p=0.191 n=10+10) MakeChan/Struct/0 67.6ns ± 3% 68.7ns ± 4% ~ (p=0.224 n=10+10) MakeChan/Struct/32 138ns ± 1% 139ns ± 1% ~ (p=0.185 n=10+9) MakeChan/Struct/40 154ns ± 1% 154ns ± 1% -0.55% (p=0.027 n=10+9) Change-Id: Ie854cb066007232c5e9f71ea7d6fe27e81a9c050 Reviewed-on: https://go-review.googlesource.com/55140 Run-TryBot: Martin Möhrmann <moehrmann@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2017-08-15 05:54:24 +00:00
Keith Randall	5cadc91b3c	cmd/compile: intrinsics for math/bits.OnesCount Popcount instructions on amd64 are not guaranteed to be present, so we must guard their call. Rewrite rules can't generate control flow at the moment, so the intrinsifier needs to generate that code. name old time/op new time/op delta OnesCount-8 2.47ns ± 5% 1.04ns ± 2% -57.70% (p=0.000 n=10+10) OnesCount16-8 1.05ns ± 1% 0.78ns ± 0% -25.56% (p=0.000 n=9+8) OnesCount32-8 1.63ns ± 5% 1.04ns ± 2% -35.96% (p=0.000 n=10+10) OnesCount64-8 2.45ns ± 0% 1.04ns ± 1% -57.55% (p=0.000 n=6+10) Update #18616 Change-Id: I4aff2cc9aa93787898d7b22055fe272a7cf95673 Reviewed-on: https://go-review.googlesource.com/38320 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org>	2017-04-04 02:40:11 +00:00
Keith Randall	e67d881bc3	cmd/compile: simplify efaceeq and ifaceeq Clean up code that does interface equality. Avoid doing checks in efaceeq/ifaceeq that we already did before calling those routines. No noticeable performance changes for existing benchmarks. name old time/op new time/op delta EfaceCmpDiff-8 604ns ± 1% 553ns ± 1% -8.41% (p=0.000 n=9+10) Fixes #18618 Change-Id: I3bd46db82b96494873045bc3300c56400bc582eb Reviewed-on: https://go-review.googlesource.com/38606 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com> Reviewed-by: David Chase <drchase@google.com>	2017-03-24 23:03:09 +00:00
Daniel Martí	2e29eb57db	runtime: remove unused *chantype parameters The chanrecv funcs don't use it at all. The chansend ones do, but the element type is now part of the hchan struct, which is already a parameter. hchan can be nil in chansend when sending to a nil channel, so when instrumenting we must copy to the stack to be able to read the channel type. name old time/op new time/op delta ChanUncontended 6.42µs ± 1% 6.22µs ± 0% -3.06% (p=0.000 n=19+18) Initially found by github.com/mvdan/unparam. Fixes #19591. Change-Id: I3a5e8a0082e8445cc3f0074695e3593fd9c88412 Reviewed-on: https://go-review.googlesource.com/38351 Run-TryBot: Daniel Martí <mvdan@mvdan.cc> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2017-03-21 17:10:16 +00:00
Hugues Bruant	5d6b7fcaa1	runtime: add mapdelete_fast* Add benchmarks for map delete with int32/int64/string key Benchmark results on darwin/amd64 name old time/op new time/op delta MapDelete/Int32/1-8 151ns ± 8% 99ns ± 3% -34.39% (p=0.008 n=5+5) MapDelete/Int32/2-8 128ns ± 2% 111ns ±15% -13.40% (p=0.040 n=5+5) MapDelete/Int32/4-8 128ns ± 5% 114ns ± 2% -10.82% (p=0.008 n=5+5) MapDelete/Int64/1-8 144ns ± 0% 104ns ± 3% -27.53% (p=0.016 n=4+5) MapDelete/Int64/2-8 153ns ± 1% 126ns ± 3% -17.17% (p=0.008 n=5+5) MapDelete/Int64/4-8 178ns ± 3% 136ns ± 2% -23.60% (p=0.008 n=5+5) MapDelete/Str/1-8 187ns ± 3% 171ns ± 3% -8.54% (p=0.008 n=5+5) MapDelete/Str/2-8 221ns ± 3% 206ns ± 4% -7.18% (p=0.016 n=5+4) MapDelete/Str/4-8 256ns ± 5% 232ns ± 2% -9.36% (p=0.016 n=4+5) name old time/op new time/op delta BinaryTree17-8 2.78s ± 7% 2.70s ± 1% ~ (p=0.151 n=5+5) Fannkuch11-8 3.21s ± 2% 3.19s ± 1% ~ (p=0.310 n=5+5) FmtFprintfEmpty-8 49.1ns ± 3% 50.2ns ± 2% ~ (p=0.095 n=5+5) FmtFprintfString-8 78.6ns ± 4% 80.2ns ± 5% ~ (p=0.460 n=5+5) FmtFprintfInt-8 79.7ns ± 1% 81.0ns ± 3% ~ (p=0.103 n=5+5) FmtFprintfIntInt-8 117ns ± 2% 119ns ± 0% ~ (p=0.079 n=5+4) FmtFprintfPrefixedInt-8 153ns ± 1% 146ns ± 3% -4.19% (p=0.024 n=5+5) FmtFprintfFloat-8 239ns ± 1% 237ns ± 1% ~ (p=0.246 n=5+5) FmtManyArgs-8 506ns ± 2% 509ns ± 2% ~ (p=0.238 n=5+5) GobDecode-8 7.06ms ± 4% 6.86ms ± 1% ~ (p=0.222 n=5+5) GobEncode-8 6.01ms ± 5% 5.87ms ± 2% ~ (p=0.222 n=5+5) Gzip-8 246ms ± 4% 236ms ± 1% -4.12% (p=0.008 n=5+5) Gunzip-8 37.7ms ± 4% 37.3ms ± 1% ~ (p=0.841 n=5+5) HTTPClientServer-8 64.9µs ± 1% 64.4µs ± 0% -0.80% (p=0.032 n=5+4) JSONEncode-8 16.0ms ± 2% 16.2ms ±11% ~ (p=0.548 n=5+5) JSONDecode-8 53.2ms ± 2% 53.1ms ± 4% ~ (p=1.000 n=5+5) Mandelbrot200-8 4.33ms ± 2% 4.32ms ± 2% ~ (p=0.841 n=5+5) GoParse-8 3.24ms ± 2% 3.27ms ± 4% ~ (p=0.690 n=5+5) RegexpMatchEasy0_32-8 86.2ns ± 1% 85.2ns ± 3% ~ (p=0.286 n=5+5) RegexpMatchEasy0_1K-8 198ns ± 2% 199ns ± 1% ~ (p=0.310 n=5+5) RegexpMatchEasy1_32-8 82.6ns ± 2% 81.8ns ± 1% ~ (p=0.294 n=5+5) RegexpMatchEasy1_1K-8 359ns ± 2% 354ns ± 1% -1.39% (p=0.048 n=5+5) RegexpMatchMedium_32-8 123ns ± 2% 123ns ± 1% ~ (p=0.905 n=5+5) RegexpMatchMedium_1K-8 38.2µs ± 2% 38.6µs ± 8% ~ (p=0.690 n=5+5) RegexpMatchHard_32-8 1.92µs ± 2% 1.91µs ± 5% ~ (p=0.460 n=5+5) RegexpMatchHard_1K-8 57.6µs ± 1% 57.0µs ± 2% ~ (p=0.310 n=5+5) Revcomp-8 483ms ± 7% 441ms ± 1% -8.79% (p=0.016 n=5+4) Template-8 58.0ms ± 1% 58.2ms ± 7% ~ (p=0.310 n=5+5) TimeParse-8 324ns ± 6% 312ns ± 2% ~ (p=0.087 n=5+5) TimeFormat-8 330ns ± 1% 329ns ± 1% ~ (p=0.968 n=5+5) name old speed new speed delta GobDecode-8 109MB/s ± 4% 112MB/s ± 1% ~ (p=0.222 n=5+5) GobEncode-8 128MB/s ± 5% 131MB/s ± 2% ~ (p=0.222 n=5+5) Gzip-8 78.9MB/s ± 4% 82.3MB/s ± 1% +4.25% (p=0.008 n=5+5) Gunzip-8 514MB/s ± 4% 521MB/s ± 1% ~ (p=0.841 n=5+5) JSONEncode-8 121MB/s ± 2% 120MB/s ±10% ~ (p=0.548 n=5+5) JSONDecode-8 36.5MB/s ± 2% 36.6MB/s ± 4% ~ (p=1.000 n=5+5) GoParse-8 17.9MB/s ± 2% 17.7MB/s ± 4% ~ (p=0.730 n=5+5) RegexpMatchEasy0_32-8 371MB/s ± 1% 375MB/s ± 3% ~ (p=0.310 n=5+5) RegexpMatchEasy0_1K-8 5.15GB/s ± 1% 5.13GB/s ± 1% ~ (p=0.548 n=5+5) RegexpMatchEasy1_32-8 387MB/s ± 2% 391MB/s ± 1% ~ (p=0.310 n=5+5) RegexpMatchEasy1_1K-8 2.85GB/s ± 2% 2.89GB/s ± 1% ~ (p=0.056 n=5+5) RegexpMatchMedium_32-8 8.07MB/s ± 2% 8.06MB/s ± 1% ~ (p=0.730 n=5+5) RegexpMatchMedium_1K-8 26.8MB/s ± 2% 26.6MB/s ± 7% ~ (p=0.690 n=5+5) RegexpMatchHard_32-8 16.7MB/s ± 2% 16.7MB/s ± 5% ~ (p=0.421 n=5+5) RegexpMatchHard_1K-8 17.8MB/s ± 1% 18.0MB/s ± 2% ~ (p=0.310 n=5+5) Revcomp-8 527MB/s ± 6% 577MB/s ± 1% +9.44% (p=0.016 n=5+4) Template-8 33.5MB/s ± 1% 33.4MB/s ± 7% ~ (p=0.310 n=5+5) Updates #19495 Change-Id: Ib9ece1690813d9b4788455db43d30891e2138df5 Reviewed-on: https://go-review.googlesource.com/38172 Reviewed-by: Hugues Bruant <hugues.bruant@gmail.com> Reviewed-by: Keith Randall <khr@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-03-21 06:07:24 +00:00
Hugues Bruant	ec091b6af2	runtime: add mapassign_fast* Add benchmarks for map assignment with int32/int64/string key Benchmark results on darwin/amd64 name old time/op new time/op delta MapAssignInt32_255-8 24.7ns ± 3% 17.4ns ± 2% -29.75% (p=0.000 n=10+10) MapAssignInt32_64k-8 45.5ns ± 4% 37.6ns ± 4% -17.18% (p=0.000 n=10+10) MapAssignInt64_255-8 26.0ns ± 3% 17.9ns ± 4% -31.03% (p=0.000 n=10+10) MapAssignInt64_64k-8 46.9ns ± 5% 38.7ns ± 2% -17.53% (p=0.000 n=9+10) MapAssignStr_255-8 47.8ns ± 3% 24.8ns ± 4% -48.01% (p=0.000 n=10+10) MapAssignStr_64k-8 83.0ns ± 3% 51.9ns ± 3% -37.45% (p=0.000 n=10+9) name old time/op new time/op delta BinaryTree17-8 3.11s ±19% 2.78s ± 3% ~ (p=0.095 n=5+5) Fannkuch11-8 3.26s ± 1% 3.21s ± 2% ~ (p=0.056 n=5+5) FmtFprintfEmpty-8 50.3ns ± 1% 50.8ns ± 2% ~ (p=0.246 n=5+5) FmtFprintfString-8 82.7ns ± 4% 80.1ns ± 5% ~ (p=0.238 n=5+5) FmtFprintfInt-8 82.6ns ± 2% 81.9ns ± 3% ~ (p=0.508 n=5+5) FmtFprintfIntInt-8 124ns ± 4% 121ns ± 3% ~ (p=0.111 n=5+5) FmtFprintfPrefixedInt-8 158ns ± 6% 160ns ± 2% ~ (p=0.341 n=5+5) FmtFprintfFloat-8 249ns ± 2% 245ns ± 2% ~ (p=0.095 n=5+5) FmtManyArgs-8 513ns ± 2% 519ns ± 3% ~ (p=0.151 n=5+5) GobDecode-8 7.48ms ±12% 7.11ms ± 2% ~ (p=0.222 n=5+5) GobEncode-8 6.25ms ± 1% 6.03ms ± 2% -3.56% (p=0.008 n=5+5) Gzip-8 252ms ± 4% 252ms ± 4% ~ (p=1.000 n=5+5) Gunzip-8 38.4ms ± 3% 38.6ms ± 2% ~ (p=0.690 n=5+5) HTTPClientServer-8 76.9µs ±41% 66.4µs ± 6% ~ (p=0.310 n=5+5) JSONEncode-8 16.5ms ± 3% 16.7ms ± 3% ~ (p=0.421 n=5+5) JSONDecode-8 54.6ms ± 1% 54.3ms ± 2% ~ (p=0.548 n=5+5) Mandelbrot200-8 4.45ms ± 3% 4.47ms ± 1% ~ (p=0.841 n=5+5) GoParse-8 3.43ms ± 1% 3.32ms ± 2% -3.28% (p=0.008 n=5+5) RegexpMatchEasy0_32-8 88.2ns ± 3% 89.4ns ± 2% ~ (p=0.333 n=5+5) RegexpMatchEasy0_1K-8 205ns ± 1% 206ns ± 1% ~ (p=0.905 n=5+5) RegexpMatchEasy1_32-8 85.1ns ± 1% 85.5ns ± 5% ~ (p=0.690 n=5+5) RegexpMatchEasy1_1K-8 365ns ± 1% 371ns ± 9% ~ (p=1.000 n=5+5) RegexpMatchMedium_32-8 129ns ± 2% 128ns ± 3% ~ (p=0.730 n=5+5) RegexpMatchMedium_1K-8 39.8µs ± 0% 39.7µs ± 4% ~ (p=0.730 n=4+5) RegexpMatchHard_32-8 1.99µs ± 3% 2.05µs ±16% ~ (p=0.794 n=5+5) RegexpMatchHard_1K-8 59.3µs ± 1% 60.3µs ± 7% ~ (p=1.000 n=5+5) Revcomp-8 1.36s ±63% 0.52s ± 5% ~ (p=0.095 n=5+5) Template-8 62.6ms ±14% 60.5ms ± 5% ~ (p=0.690 n=5+5) TimeParse-8 330ns ± 2% 324ns ± 2% ~ (p=0.087 n=5+5) TimeFormat-8 350ns ± 3% 340ns ± 1% -2.86% (p=0.008 n=5+5) name old speed new speed delta GobDecode-8 103MB/s ±11% 108MB/s ± 2% ~ (p=0.222 n=5+5) GobEncode-8 123MB/s ± 1% 127MB/s ± 2% +3.71% (p=0.008 n=5+5) Gzip-8 77.1MB/s ± 4% 76.9MB/s ± 3% ~ (p=1.000 n=5+5) Gunzip-8 505MB/s ± 3% 503MB/s ± 2% ~ (p=0.690 n=5+5) JSONEncode-8 118MB/s ± 3% 116MB/s ± 3% ~ (p=0.421 n=5+5) JSONDecode-8 35.5MB/s ± 1% 35.8MB/s ± 2% ~ (p=0.397 n=5+5) GoParse-8 16.9MB/s ± 1% 17.4MB/s ± 2% +3.45% (p=0.008 n=5+5) RegexpMatchEasy0_32-8 363MB/s ± 3% 358MB/s ± 2% ~ (p=0.421 n=5+5) RegexpMatchEasy0_1K-8 4.98GB/s ± 1% 4.97GB/s ± 1% ~ (p=0.548 n=5+5) RegexpMatchEasy1_32-8 376MB/s ± 1% 375MB/s ± 5% ~ (p=0.690 n=5+5) RegexpMatchEasy1_1K-8 2.80GB/s ± 1% 2.76GB/s ± 9% ~ (p=0.841 n=5+5) RegexpMatchMedium_32-8 7.73MB/s ± 1% 7.76MB/s ± 3% ~ (p=0.730 n=5+5) RegexpMatchMedium_1K-8 25.8MB/s ± 0% 25.8MB/s ± 4% ~ (p=0.651 n=4+5) RegexpMatchHard_32-8 16.1MB/s ± 3% 15.7MB/s ±14% ~ (p=0.794 n=5+5) RegexpMatchHard_1K-8 17.3MB/s ± 1% 17.0MB/s ± 7% ~ (p=0.984 n=5+5) Revcomp-8 273MB/s ±83% 488MB/s ± 5% ~ (p=0.095 n=5+5) Template-8 31.1MB/s ±13% 32.1MB/s ± 5% ~ (p=0.690 n=5+5) Updates #19495 Change-Id: I116e9a2a4594769318b22d736464de8a98499909 Reviewed-on: https://go-review.googlesource.com/38091 Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com> Reviewed-by: Keith Randall <khr@golang.org> Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-03-13 23:43:16 +00:00
Matthew Dempsky	c310c688ff	cmd/compile, runtime: simplify multiway select implementation This commit reworks multiway select statements to use normal control flow primitives instead of the previous setjmp/longjmp-like behavior. This simplifies liveness analysis and should prevent issues around "returns twice" function calls within SSA passes. test/live.go is updated because liveness analysis's CFG is more representative of actual control flow. The case bodies are the only real successors of the selectgo call, but previously the selectsend, selectrecv, etc. calls were included in the successors list too. Updates #19331. Change-Id: I7f879b103a4b85e62fc36a270d812f54c0aa3e83 Reviewed-on: https://go-review.googlesource.com/37661 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2017-03-07 20:14:17 +00:00
Josh Bleecher Snyder	504bc3ed24	cmd/compile, runtime: specialize convT2x, don't alloc for zero vals Prior to this CL, all runtime conversions from a concrete value to an interface went through one of two runtime calls: convT2E or convT2I. However, in practice, basic types are very common. Specializing convT2x for those basic types allows for a more efficient implementation for those types. For basic scalars and strings, allocation and copying can use the same methods as normal code. For pointer-free types, allocation can occur without zeroing, and copying can take place without GC calls. For slices, copying is cheaper and simpler. This CL adds twelve runtime routines: convT2E16, convT2I16 convT2E32, convT2I32 convT2E64, convT2I64 convT2Estring, convT2Istring convT2Eslice, convT2Islice convT2Enoptr, convT2Inoptr While compiling make.bash, 93% of all convT2x calls are now to one of these specialized convT2x call. Within specialized convT2x routines, it is cheap to check for a zero value, in a way that it is not in general. When we detect a zero value there, we return a pointer to zeroVal, rather than allocating. name old time/op new time/op delta ConvT2Ezero/zero/16-8 17.9ns ± 2% 3.0ns ± 3% -83.20% (p=0.000 n=56+56) ConvT2Ezero/zero/32-8 17.8ns ± 2% 3.0ns ± 3% -83.15% (p=0.000 n=59+60) ConvT2Ezero/zero/64-8 20.1ns ± 1% 3.0ns ± 2% -84.98% (p=0.000 n=57+57) ConvT2Ezero/zero/str-8 32.6ns ± 1% 3.0ns ± 4% -90.70% (p=0.000 n=59+60) ConvT2Ezero/zero/slice-8 36.7ns ± 2% 3.0ns ± 2% -91.78% (p=0.000 n=59+59) ConvT2Ezero/zero/big-8 91.9ns ± 2% 85.9ns ± 2% -6.52% (p=0.000 n=57+57) ConvT2Ezero/nonzero/16-8 17.7ns ± 2% 12.7ns ± 3% -28.38% (p=0.000 n=55+60) ConvT2Ezero/nonzero/32-8 17.8ns ± 1% 12.7ns ± 1% -28.44% (p=0.000 n=54+57) ConvT2Ezero/nonzero/64-8 20.0ns ± 1% 15.0ns ± 1% -24.90% (p=0.000 n=56+58) ConvT2Ezero/nonzero/str-8 32.6ns ± 1% 25.7ns ± 1% -21.17% (p=0.000 n=58+55) ConvT2Ezero/nonzero/slice-8 36.8ns ± 2% 30.4ns ± 1% -17.32% (p=0.000 n=60+52) ConvT2Ezero/nonzero/big-8 92.1ns ± 2% 85.9ns ± 2% -6.70% (p=0.000 n=57+59) Benchmarks on a real program (the compiler): name old time/op new time/op delta Template 227ms ± 5% 221ms ± 2% -2.48% (p=0.000 n=30+26) Unicode 102ms ± 5% 100ms ± 3% -1.30% (p=0.009 n=30+26) GoTypes 656ms ± 5% 659ms ± 4% ~ (p=0.208 n=30+30) Compiler 2.82s ± 2% 2.82s ± 1% ~ (p=0.614 n=29+27) Flate 128ms ± 2% 128ms ± 5% ~ (p=0.783 n=27+28) GoParser 158ms ± 3% 158ms ± 3% ~ (p=0.261 n=28+30) Reflect 408ms ± 7% 401ms ± 3% ~ (p=0.075 n=30+30) Tar 123ms ± 6% 121ms ± 8% ~ (p=0.287 n=29+30) XML 220ms ± 2% 220ms ± 4% ~ (p=0.805 n=29+29) name old user-ns/op new user-ns/op delta Template 281user-ms ± 4% 279user-ms ± 3% -0.87% (p=0.044 n=28+28) Unicode 142user-ms ± 4% 141user-ms ± 3% -1.04% (p=0.015 n=30+27) GoTypes 884user-ms ± 3% 886user-ms ± 2% ~ (p=0.532 n=30+30) Compiler 3.94user-s ± 3% 3.92user-s ± 1% ~ (p=0.185 n=30+28) Flate 165user-ms ± 2% 165user-ms ± 4% ~ (p=0.780 n=27+29) GoParser 209user-ms ± 2% 208user-ms ± 3% ~ (p=0.453 n=28+30) Reflect 533user-ms ± 6% 526user-ms ± 3% ~ (p=0.057 n=30+30) Tar 156user-ms ± 6% 154user-ms ± 6% ~ (p=0.133 n=29+30) XML 288user-ms ± 4% 288user-ms ± 4% ~ (p=0.633 n=30+30) name old alloc/op new alloc/op delta Template 41.0MB ± 0% 40.9MB ± 0% -0.11% (p=0.000 n=29+29) Unicode 32.6MB ± 0% 32.6MB ± 0% ~ (p=0.572 n=29+30) GoTypes 122MB ± 0% 122MB ± 0% -0.10% (p=0.000 n=30+30) Compiler 482MB ± 0% 481MB ± 0% -0.07% (p=0.000 n=30+29) Flate 26.6MB ± 0% 26.6MB ± 0% ~ (p=0.096 n=30+30) GoParser 32.7MB ± 0% 32.6MB ± 0% -0.06% (p=0.011 n=28+28) Reflect 84.2MB ± 0% 84.1MB ± 0% -0.17% (p=0.000 n=29+30) Tar 27.7MB ± 0% 27.7MB ± 0% -0.05% (p=0.032 n=27+28) XML 44.7MB ± 0% 44.7MB ± 0% ~ (p=0.131 n=28+30) name old allocs/op new allocs/op delta Template 373k ± 1% 370k ± 1% -0.76% (p=0.000 n=30+30) Unicode 325k ± 1% 325k ± 1% ~ (p=0.383 n=29+30) GoTypes 1.16M ± 0% 1.15M ± 0% -0.75% (p=0.000 n=29+30) Compiler 4.15M ± 0% 4.13M ± 0% -0.59% (p=0.000 n=30+29) Flate 238k ± 1% 237k ± 1% -0.62% (p=0.000 n=30+30) GoParser 304k ± 1% 302k ± 1% -0.64% (p=0.000 n=30+28) Reflect 1.00M ± 0% 0.99M ± 0% -1.10% (p=0.000 n=29+30) Tar 245k ± 1% 244k ± 1% -0.59% (p=0.000 n=27+29) XML 391k ± 1% 389k ± 1% -0.59% (p=0.000 n=29+30) Change-Id: Id7f456d690567c2b0a96b0d6d64de8784b6e305f Reviewed-on: https://go-review.googlesource.com/36476 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2017-02-28 19:23:33 +00:00
Cherry Zhang	f6fc0dd620	cmd/compile: update signature of runtime.memclr* runtime.memclr* functions have signatures func memclrNoHeapPointers(ptr unsafe.Pointer, n uintptr) func memclrHasPointers(ptr unsafe.Pointer, n uintptr) Update compiler's copy. Also teach gc/mkbuiltin.go to handle unsafe.Pointer. The import statement and its support is not really necessary, but just to make it look like real Go code. Fixes #19185. Change-Id: I251d02571fde2716d4727e31e04d56ec04b6f22a Reviewed-on: https://go-review.googlesource.com/37257 Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2017-02-28 19:22:29 +00:00
Ian Lance Taylor	db6e27c38d	cmd/compile: update builtin writeBarrier to match runtime The definition of writeBarrier in the runtime was changed in CL 22855 to include padding. Update the definition built in to the compiler to match. This doesn't affect the generated code, as the compiler sets the type to use anyhow, but having them be different seems clearly wrong. Change-Id: I8eac05bf70a424a0b2338ba5e9e41af231316de0 Reviewed-on: https://go-review.googlesource.com/37377 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2017-02-22 01:32:31 +00:00
Keith Randall	5a75d6a08e	cmd/compile: optimize non-empty-interface type conversions When doing i.(T) for non-empty-interface i and concrete type T, there's no need to read the type out of the itab. Just compare the itab to the itab we expect for that interface/type pair. Also optimize type switches by putting the type hash of the concrete type in the itab. That way we don't need to load the type pointer out of the itab. Update #18492 Change-Id: I49e280a21e5687e771db5b8a56b685291ac168ce Reviewed-on: https://go-review.googlesource.com/34810 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com> Reviewed-by: David Chase <drchase@google.com>	2017-02-13 18:16:31 +00:00
Josh Bleecher Snyder	2c91bb4c8a	cmd/compile: make panicwrap argument-free When code defines a method on T, the compiler generates a corresponding wrapper method on T. The first thing the wrapper does is check whether the pointer is nil and if so, call panicwrap. This is done to provide a useful error message. The existing implementation gets its information from arguments set up by the compiler. However, with some trouble, this information can be extracted from the name of the wrapper method itself. Removing the arguments to panicwrap simplifies and shrinks the wrapper method. It also means that the call to panicwrap does not require any stack space. This enables a further optimization on amd64/x86, which is to skip the function prologue if nothing else in the method requires stack space. This is frequently the case in simple, hot methods, such as Less and Swap in sort.Interface implementations. Fixes #19040. Benchmarks for package sort on amd64: name old time/op new time/op delta SearchWrappers-8 104ns ± 1% 104ns ± 1% ~ (p=0.286 n=27+27) SortString1K-8 128µs ± 1% 128µs ± 1% -0.44% (p=0.004 n=30+30) SortString1K_Slice-8 118µs ± 2% 117µs ± 1% ~ (p=0.106 n=30+30) StableString1K-8 18.6µs ± 1% 18.6µs ± 1% ~ (p=0.446 n=28+26) SortInt1K-8 65.9µs ± 1% 60.7µs ± 1% -7.96% (p=0.000 n=28+30) StableInt1K-8 75.3µs ± 2% 72.8µs ± 1% -3.41% (p=0.000 n=30+30) StableInt1K_Slice-8 57.7µs ± 1% 57.7µs ± 1% ~ (p=0.515 n=30+30) SortInt64K-8 6.28ms ± 1% 6.01ms ± 1% -4.19% (p=0.000 n=28+28) SortInt64K_Slice-8 5.04ms ± 1% 5.04ms ± 1% ~ (p=0.927 n=28+27) StableInt64K-8 6.65ms ± 1% 6.38ms ± 1% -3.97% (p=0.000 n=26+30) Sort1e2-8 37.9µs ± 1% 37.2µs ± 1% -1.89% (p=0.000 n=29+27) Stable1e2-8 77.0µs ± 1% 74.7µs ± 1% -3.06% (p=0.000 n=27+30) Sort1e4-8 8.21ms ± 2% 7.98ms ± 1% -2.77% (p=0.000 n=29+30) Stable1e4-8 24.8ms ± 1% 24.3ms ± 1% -2.31% (p=0.000 n=28+30) Sort1e6-8 1.27s ± 4% 1.22s ± 1% -3.42% (p=0.000 n=30+29) Stable1e6-8 5.06s ± 1% 4.92s ± 1% -2.77% (p=0.000 n=25+29) [Geo mean] 731µs 714µs -2.29% Before/after assembly for sort.(intPairs).Less follows. It can be optimized further, but that's for a follow-up CL. Before: "".(intPairs).Less t=1 size=214 args=0x20 locals=0x38 0x0000 00000 (<autogenerated>:1) TEXT "".(intPairs).Less(SB), $56-32 0x0000 00000 (<autogenerated>:1) MOVQ (TLS), CX 0x0009 00009 (<autogenerated>:1) CMPQ SP, 16(CX) 0x000d 00013 (<autogenerated>:1) JLS 204 0x0013 00019 (<autogenerated>:1) SUBQ $56, SP 0x0017 00023 (<autogenerated>:1) MOVQ BP, 48(SP) 0x001c 00028 (<autogenerated>:1) LEAQ 48(SP), BP 0x0021 00033 (<autogenerated>:1) MOVQ 32(CX), BX 0x0025 00037 (<autogenerated>:1) TESTQ BX, BX 0x0028 00040 (<autogenerated>:1) JEQ 55 0x002a 00042 (<autogenerated>:1) LEAQ 64(SP), DI 0x002f 00047 (<autogenerated>:1) CMPQ (BX), DI 0x0032 00050 (<autogenerated>:1) JNE 55 0x0034 00052 (<autogenerated>:1) MOVQ SP, (BX) 0x0037 00055 (<autogenerated>:1) NOP 0x0037 00055 (<autogenerated>:1) FUNCDATA $0, gclocals·4032f753396f2012ad1784f398b170f4(SB) 0x0037 00055 (<autogenerated>:1) FUNCDATA $1, gclocals·69c1753bd5f81501d95132d08af04464(SB) 0x0037 00055 (<autogenerated>:1) MOVQ ""..this+64(FP), AX 0x003c 00060 (<autogenerated>:1) TESTQ AX, AX 0x003f 00063 (<autogenerated>:1) JEQ $0, 135 0x0041 00065 (<autogenerated>:1) MOVQ (AX), CX 0x0044 00068 (<autogenerated>:1) MOVQ 8(AX), AX 0x0048 00072 (<autogenerated>:1) MOVQ "".i+72(FP), DX 0x004d 00077 (<autogenerated>:1) CMPQ DX, AX 0x0050 00080 (<autogenerated>:1) JCC $0, 128 0x0052 00082 (<autogenerated>:1) SHLQ $4, DX 0x0056 00086 (<autogenerated>:1) MOVQ (CX)(DX1), DX 0x005a 00090 (<autogenerated>:1) MOVQ "".j+80(FP), BX 0x005f 00095 (<autogenerated>:1) CMPQ BX, AX 0x0062 00098 (<autogenerated>:1) JCC $0, 128 0x0064 00100 (<autogenerated>:1) SHLQ $4, BX 0x0068 00104 (<autogenerated>:1) MOVQ (CX)(BX1), AX 0x006c 00108 (<autogenerated>:1) CMPQ DX, AX 0x006f 00111 (<autogenerated>:1) SETLT AL 0x0072 00114 (<autogenerated>:1) MOVB AL, "".~r2+88(FP) 0x0076 00118 (<autogenerated>:1) MOVQ 48(SP), BP 0x007b 00123 (<autogenerated>:1) ADDQ $56, SP 0x007f 00127 (<autogenerated>:1) RET 0x0080 00128 (<autogenerated>:1) PCDATA $0, $1 0x0080 00128 (<autogenerated>:1) CALL runtime.panicindex(SB) 0x0085 00133 (<autogenerated>:1) UNDEF 0x0087 00135 (<autogenerated>:1) LEAQ go.string."sort_test"(SB), AX 0x008e 00142 (<autogenerated>:1) MOVQ AX, (SP) 0x0092 00146 (<autogenerated>:1) MOVQ $9, 8(SP) 0x009b 00155 (<autogenerated>:1) LEAQ go.string."intPairs"(SB), AX 0x00a2 00162 (<autogenerated>:1) MOVQ AX, 16(SP) 0x00a7 00167 (<autogenerated>:1) MOVQ $8, 24(SP) 0x00b0 00176 (<autogenerated>:1) LEAQ go.string."Less"(SB), AX 0x00b7 00183 (<autogenerated>:1) MOVQ AX, 32(SP) 0x00bc 00188 (<autogenerated>:1) MOVQ $4, 40(SP) 0x00c5 00197 (<autogenerated>:1) PCDATA $0, $1 0x00c5 00197 (<autogenerated>:1) CALL runtime.panicwrap(SB) 0x00ca 00202 (<autogenerated>:1) UNDEF 0x00cc 00204 (<autogenerated>:1) NOP 0x00cc 00204 (<autogenerated>:1) PCDATA $0, $-1 0x00cc 00204 (<autogenerated>:1) CALL runtime.morestack_noctxt(SB) 0x00d1 00209 (<autogenerated>:1) JMP 0 After: "".(intPairs).Swap t=1 size=147 args=0x18 locals=0x8 0x0000 00000 (<autogenerated>:1) TEXT "".(intPairs).Swap(SB), $8-24 0x0000 00000 (<autogenerated>:1) MOVQ (TLS), CX 0x0009 00009 (<autogenerated>:1) SUBQ $8, SP 0x000d 00013 (<autogenerated>:1) MOVQ BP, (SP) 0x0011 00017 (<autogenerated>:1) LEAQ (SP), BP 0x0015 00021 (<autogenerated>:1) MOVQ 32(CX), BX 0x0019 00025 (<autogenerated>:1) TESTQ BX, BX 0x001c 00028 (<autogenerated>:1) JEQ 43 0x001e 00030 (<autogenerated>:1) LEAQ 16(SP), DI 0x0023 00035 (<autogenerated>:1) CMPQ (BX), DI 0x0026 00038 (<autogenerated>:1) JNE 43 0x0028 00040 (<autogenerated>:1) MOVQ SP, (BX) 0x002b 00043 (<autogenerated>:1) NOP 0x002b 00043 (<autogenerated>:1) FUNCDATA $0, gclocals·e6397a44f8e1b6e77d0f200b4fba5269(SB) 0x002b 00043 (<autogenerated>:1) FUNCDATA $1, gclocals·69c1753bd5f81501d95132d08af04464(SB) 0x002b 00043 (<autogenerated>:1) MOVQ ""..this+16(FP), AX 0x0030 00048 (<autogenerated>:1) TESTQ AX, AX 0x0033 00051 (<autogenerated>:1) JEQ $0, 140 0x0035 00053 (<autogenerated>:1) MOVQ (AX), CX 0x0038 00056 (<autogenerated>:1) MOVQ 8(AX), AX 0x003c 00060 (<autogenerated>:1) MOVQ "".i+24(FP), DX 0x0041 00065 (<autogenerated>:1) CMPQ DX, AX 0x0044 00068 (<autogenerated>:1) JCC $0, 133 0x0046 00070 (<autogenerated>:1) SHLQ $4, DX 0x004a 00074 (<autogenerated>:1) MOVQ 8(CX)(DX1), BX 0x004f 00079 (<autogenerated>:1) MOVQ (CX)(DX1), SI 0x0053 00083 (<autogenerated>:1) MOVQ "".j+32(FP), DI 0x0058 00088 (<autogenerated>:1) CMPQ DI, AX 0x005b 00091 (<autogenerated>:1) JCC $0, 133 0x005d 00093 (<autogenerated>:1) SHLQ $4, DI 0x0061 00097 (<autogenerated>:1) MOVQ 8(CX)(DI1), AX 0x0066 00102 (<autogenerated>:1) MOVQ (CX)(DI1), R8 0x006a 00106 (<autogenerated>:1) MOVQ R8, (CX)(DX1) 0x006e 00110 (<autogenerated>:1) MOVQ AX, 8(CX)(DX1) 0x0073 00115 (<autogenerated>:1) MOVQ SI, (CX)(DI1) 0x0077 00119 (<autogenerated>:1) MOVQ BX, 8(CX)(DI1) 0x007c 00124 (<autogenerated>:1) MOVQ (SP), BP 0x0080 00128 (<autogenerated>:1) ADDQ $8, SP 0x0084 00132 (<autogenerated>:1) RET 0x0085 00133 (<autogenerated>:1) PCDATA $0, $1 0x0085 00133 (<autogenerated>:1) CALL runtime.panicindex(SB) 0x008a 00138 (<autogenerated>:1) UNDEF 0x008c 00140 (<autogenerated>:1) PCDATA $0, $1 0x008c 00140 (<autogenerated>:1) CALL runtime.panicwrap(SB) 0x0091 00145 (<autogenerated>:1) UNDEF Change-Id: I15bb8435f0690badb868799f313ed8817335efd3 Reviewed-on: https://go-review.googlesource.com/36809 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2017-02-11 23:27:35 +00:00
David Chase	7f1ff65c39	cmd/compile: insert scheduling checks on loop backedges Loop breaking with a counter. Benchmarked (see comments), eyeball checked for sanity on popular loops. This code ought to handle loops in general, and properly inserts phi functions in cases where the earlier version might not have. Includes test, plus modifications to test/run.go to deal with timeout and killing looping test. Tests broken by the addition of extra code (branch frequency and live vars) for added checks turn the check insertion off. If GOEXPERIMENT=preemptibleloops, the compiler inserts reschedule checks on every backedge of every reducible loop. Alternately, specifying GO_GCFLAGS=-d=ssa/insert_resched_checks/on will enable it for a single compilation, but because the core Go libraries contain some loops that may run long, this is less likely to have the desired effect. This is intended as a tool to help in the study and diagnosis of GC and other latency problems, now that goal STW GC latency is on the order of 100 microseconds or less. Updates #17831. Updates #10958. Change-Id: I6206c163a5b0248e3f21eb4fc65f73a179e1f639 Reviewed-on: https://go-review.googlesource.com/33910 Run-TryBot: David Chase <drchase@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2017-01-09 21:01:29 +00:00
Keith Randall	688995d1e9	cmd/compile: do more type conversion inline The code to do the conversion is smaller than the call to the runtime. The 1-result asserts need to call panic if they fail, but that code is out of line. The only conversions left in the runtime are those which might allocate and those which might need to generate an itab. Given the following types: type E interface{} type I interface { foo() } type I2 iterface { foo(); bar() } type Big [10]int func (b Big) foo() { ... } This CL inlines the following conversions: was assertE2T var e E = ... b := i.(Big) was assertE2T2 var e E = ... b, ok := i.(Big) was assertI2T var i I = ... b := i.(Big) was assertI2T2 var i I = ... b, ok := i.(Big) was assertI2E var i I = ... e := i.(E) was assertI2E2 var i I = ... e, ok := i.(E) These are the remaining runtime calls: convT2E: var b Big = ... var e E = b convT2I: var b Big = ... var i I = b convI2I: var i2 I2 = ... var i I = i2 assertE2I: var e E = ... i := e.(I) assertE2I2: var e E = ... i, ok := e.(I) assertI2I: var i I = ... i2 := i.(I2) assertI2I2: var i I = ... i2, ok := i.(I2) Fixes #17405 Fixes #8422 Change-Id: Ida2367bf8ce3cd2c6bb599a1814f1d275afabe21 Reviewed-on: https://go-review.googlesource.com/32313 Run-TryBot: Keith Randall <khr@golang.org> Reviewed-by: David Chase <drchase@google.com>	2016-11-02 21:33:03 +00:00
Austin Clements	87e48c5afd	runtime, cmd/compile: rename memclr -> memclrNoHeapPointers Since barrier-less memclr is only safe in very narrow circumstances, this commit renames memclr to avoid accidentally calling memclr on typed memory. This can cause subtle, non-deterministic bugs, so it's worth some effort to prevent. In the near term, this will also prevent bugs creeping in from any concurrent CLs that add calls to memclr; if this happens, whichever patch hits master second will fail to compile. This also adds the other new memclr variants to the compiler's builtin.go to minimize the churn on that binary blob. We'll use these in future commits. Updates #17503. Change-Id: I00eead049f5bd35ca107ea525966831f3d1ed9ca Reviewed-on: https://go-review.googlesource.com/31369 Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2016-10-28 18:20:33 +00:00
Martin Möhrmann	b679665a18	cmd/compile: move stringtoslicebytetmp to the backend - removes the runtime function stringtoslicebytetmp - removes the generation of calls to stringtoslicebytetmp from the frontend - adds handling of OSTRARRAYBYTETMP in the backend This reduces binary sizes and avoids function call overhead. Change-Id: Ib9988d48549cee663b685b4897a483f94727b940 Reviewed-on: https://go-review.googlesource.com/32158 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com> Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com> Run-TryBot: Martin Möhrmann <martisch@uos.de> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-10-28 07:58:47 +00:00
Keith Randall	c78d072c8e	Revert "Revert "cmd/compile: inline convI2E"" This reverts commit `7dd9c385f6`. Reason for revert: Reverting the revert, which will re-enable the convI2E optimization. We originally reverted the convI2E optimization because it was making the builder fail, but the underlying cause was later determined to be unrelated. Original CL: https://go-review.googlesource.com/31260 Revert CL: https://go-review.googlesource.com/31310 Real bug: https://go-review.googlesource.com/c/25159 Real fix: https://go-review.googlesource.com/c/31316 Change-Id: I17237bb577a23a7675a5caab970ccda71a4124f2 Reviewed-on: https://go-review.googlesource.com/32023 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2016-10-25 23:51:34 +00:00
Matthew Dempsky	7dd9c385f6	Revert "cmd/compile: inline convI2E" This reverts commit `395d36a67d`. Appears to be responsible for builder failures. Change-Id: Ic6c6307f662767e529060b88704a9f074785d99e Reviewed-on: https://go-review.googlesource.com/31310 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2016-10-17 20:37:19 +00:00
Keith Randall	395d36a67d	cmd/compile: inline convI2E It's pretty simple. For: e = (interface{})(i) Do: tmp = i.itab if tmp != nil { tmp = tmp.typ_ // load type from itab } e = eface{tmp, i.data} It is smaller and faster than calling the runtime. Change-Id: I0ad27f62f4ec0b6cd53bc8530e4da0eae3e67a6c Reviewed-on: https://go-review.googlesource.com/31260 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2016-10-17 19:09:07 +00:00
Martin Möhrmann	d295174030	runtime: speed up non-ASCII rune decoding Copies utf8 constants and EncodeRune implementation from unicode/utf8. Adds a new decoderune implementation that is used by the compiler in code generated for ranging over strings. It does not handle ASCII runes since these are handled directly before calls to decoderune. The DecodeRuneInString implementation from unicode/utf8 is not used since it uses a lookup table that would increase the use of cpu caches. Adds more tests that check decoding of valid and invalid utf8 sequences. name old time/op new time/op delta RuneIterate/range2/ASCII-4 7.45ns ± 2% 7.45ns ± 1% ~ (p=0.634 n=16+16) RuneIterate/range2/Japanese-4 53.5ns ± 1% 49.2ns ± 2% -8.03% (p=0.000 n=20+20) RuneIterate/range2/MixedLength-4 46.3ns ± 1% 41.0ns ± 2% -11.57% (p=0.000 n=20+20) new: "".decoderune t=1 size=423 args=0x28 locals=0x0 old: "".charntorune t=1 size=666 args=0x28 locals=0x0 Change-Id: I1df1fdb385bb9ea5e5e71b8818ea2bf5ce62de52 Reviewed-on: https://go-review.googlesource.com/28490 Run-TryBot: Martin Möhrmann <martisch@uos.de> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2016-10-17 11:25:22 +00:00
Keith Randall	442de98c14	cmd/compile,runtime: redo how map assignments work To compile: m[k] = v instead of: mapassign(maptype, m, &k, &v), do do: *mapassign(maptype, m, &k) = v mapassign returns a pointer to the value slot in the map. It is just like mapaccess except that it will allocate a new slot if k is not already present in the map. This makes map accesses faster but potentially larger (codewise). It is faster because the write into the map is done when the compiler knows the concrete type, so it can be done with a few store instructions instead of calling typedmemmove. We also potentially avoid stack temporaries to hold v. The code can be larger when the map has pointers in its value type, since there is a write barrier call in addition to the mapassign call. That makes the code at the callsite a bit bigger (go binary is 0.3% bigger). This CL is in preparation for doing operations like m[k] += v with only a single runtime call. That will roughly double the speed of such operations. Update #17133 Update #5147 Change-Id: Ia435f032090a2ed905dac9234e693972fe8c2dc5 Reviewed-on: https://go-review.googlesource.com/30815 Run-TryBot: Keith Randall <khr@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-10-12 20:41:23 +00:00
Keith Randall	6129f37367	cmd/compile: inline convT2{I,E} when result doesn't escape No point in calling a function when we can build the interface using a known type (or itab) and the address of a local. Get rid of third arg (preallocated stack space) to convT2{I,E}. Makes go binary smaller by 0.2% benchmark old ns/op new ns/op delta BenchmarkEfaceInteger-8 16.7 10.1 -39.52% Update #17118 Update #15375 Change-Id: I9724a1f802bfa1e3957bf1856b55558278e198a2 Reviewed-on: https://go-review.googlesource.com/29373 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2016-09-19 02:37:08 +00:00

1 2

67 commits