Stowage/go - Remotebranch.eu

mirror of https://github.com/golang/go.git synced 2025-12-08 06:10:04 +00:00

Author	SHA1	Message	Date
Dan Scales	cc47b0d2cd	cmd/compile: handle some missing cases of non-SSAable values for args of open-coded defers In my experimentation, I had found that most non-SSAable expressions were converted to autotmp variables during AST evaluation. However, this was not true generally, as witnessed by issue #35213, which has a non-SSAable field reference of a struct that is not converted to an autotmp. So, I fixed openDeferSave() to handle non-SSAable nodes more generally, and make sure that these non-SSAable expressions are not evaluated more than once (which could incorrectly repeat side effects). Fixes #35213 Change-Id: I8043d5576b455e94163599e930ca0275e550d594 Reviewed-on: https://go-review.googlesource.com/c/go/+/203888 Run-TryBot: Dan Scales <danscales@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2019-10-29 19:58:24 +00:00
Austin Clements	97592b3c14	cmd/compile: intrinsics for runtime/internal/atomic.Store8 For #10958, #24543, but makes sense on its own. Change-Id: I2a87dab66b82a1863e4b6512b1f8def51463ce2a Reviewed-on: https://go-review.googlesource.com/c/go/+/203284 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2019-10-29 03:18:55 +00:00
Dan Scales	be64a19d99	cmd/compile, cmd/link, runtime: make defers low-cost through inline code and extra funcdata Generate inline code at defer time to save the args of defer calls to unique (autotmp) stack slots, and generate inline code at exit time to check which defer calls were made and make the associated function/method/interface calls. We remember that a particular defer statement was reached by storing in the deferBits variable (always stored on the stack). At exit time, we check the bits of the deferBits variable to determine which defer function calls to make (in reverse order). These low-cost defers are only used for functions where no defers appear in loops. In addition, we don't do these low-cost defers if there are too many defer statements or too many exits in a function (to limit code increase). When a function uses open-coded defers, we produce extra FUNCDATA_OpenCodedDeferInfo information that specifies the number of defers, and for each defer, the stack slots where the closure and associated args have been stored. The funcdata also includes the location of the deferBits variable. Therefore, for panics, we can use this funcdata to determine exactly which defers are active, and call the appropriate functions/methods/closures with the correct arguments for each active defer. In order to unwind the stack correctly after a recover(), we need to add an extra code segment to functions with open-coded defers that simply calls deferreturn() and returns. This segment is not reachable by the normal function, but is returned to by the runtime during recovery. We set the liveness information of this deferreturn() to be the same as the liveness at the first function call during the last defer exit code (so all return values and all stack slots needed by the defer calls will be live). I needed to increase the stackguard constant from 880 to 896, because of a small amount of new code in deferreturn(). The -N flag disables open-coded defers. 
'-d defer' prints out the kind of defer being used at each defer statement (heap-allocated, stack-allocated, or open-coded). Cost of defer statement [ go test -run NONE -bench BenchmarkDefer$ runtime ] With normal (stack-allocated) defers only: 35.4 ns/op With open-coded defers: 5.6 ns/op Cost of function call alone (remove defer keyword): 4.4 ns/op Text size increase (including funcdata) for go binary without/with open-coded defers: 0.09% The average size increase (including funcdata) for only the functions that use open-coded defers is 1.1%. The cost of a panic followed by a recover got noticeably slower, since panic processing now requires a scan of the stack for open-coded defer frames. This scan is required, even if no frames are using open-coded defers: Cost of panic and recover [ go test -run NONE -bench BenchmarkPanicRecover runtime ] Without open-coded defers: 62.0 ns/op With open-coded defers: 255 ns/op A CGO Go-to-C-to-Go benchmark got noticeably faster because of open-coded defers: CGO Go-to-C-to-Go benchmark [cd misc/cgo/test; go test -run NONE -bench BenchmarkCGoCallback ] Without open-coded defers: 443 ns/op With open-coded defers: 347 ns/op Updates #14939 (defer performance) Updates #34481 (design doc) Change-Id: I63b1a60d1ebf28126f55ee9fd7ecffe9cb23d1ff Reviewed-on: https://go-review.googlesource.com/c/go/+/202340 Reviewed-by: Austin Clements <austin@google.com>	2019-10-24 13:54:11 +00:00
smasher164	03fb1f607b	cmd/compile: don't use FMA on plan9 CL 137156 introduces an intrinsic on AMD64 that executes vfmadd231sd when feature detection is successful. However, because floating-point isn't allowed in note handler, the builder disables SSE instructions, and fails when attempting to execute this instruction. This change disables FMA on plan9 to immediately use the software fallback. Fixes #35063. Change-Id: I87d8f0995bd2f15013d203e618938f5079c9eed2 Reviewed-on: https://go-review.googlesource.com/c/go/+/202617 Reviewed-by: Keith Randall <khr@golang.org>	2019-10-22 19:36:42 +00:00
smasher164	58b031949b	cmd/compile: add fma intrinsic for arm This change introduces an arm intrinsic that generates the FMULAD instruction for the fused-multiply-add operation on systems that support it. System support is detected via cpu.ARM.HasVFPv4. A rewrite rule translates the generic intrinsic to FMULAD. Updates #25819. Change-Id: I8459e5dd1cdbdca35f88a78dbeb7d387f1e20efa Reviewed-on: https://go-review.googlesource.com/c/go/+/142117 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2019-10-21 17:42:47 +00:00
smasher164	7a6da218b1	cmd/compile: add fma intrinsic for amd64 To permit ssa-level optimization, this change introduces an amd64 intrinsic that generates the VFMADD231SD instruction for the fused-multiply-add operation on systems that support it. System support is detected via cpu.X86.HasFMA. A rewrite rule can then translate the generic ssa intrinsic ("Fma") to VFMADD231SD. The benchmark compares the software implementation (old) with the intrinsic (new). name old time/op new time/op delta Fma-4 27.2ns ± 1% 1.0ns ± 9% -96.48% (p=0.008 n=5+5) Updates #25819. Change-Id: I966655e5f96817a5d06dff5942418a3915b09584 Reviewed-on: https://go-review.googlesource.com/c/go/+/137156 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2019-10-21 16:42:10 +00:00
smasher164	33425ab8db	cmd/compile: introduce generic ssa intrinsic for fused-multiply-add In order to make math.FMA a compiler intrinsic for ISAs like ARM64, PPC64[le], and S390X, a generic 3-argument opcode "Fma" is provided and rewritten as ARM64: (Fma x y z) -> (FMADDD z x y) PPC64: (Fma x y z) -> (FMADD x y z) S390X: (Fma x y z) -> (FMADD z x y) Updates #25819. Change-Id: Ie5bc628311e6feeb28ddf9adaa6e702c8c291efa Reviewed-on: https://go-review.googlesource.com/c/go/+/131959 Run-TryBot: Akhil Indurti <aindurti@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2019-10-21 16:24:15 +00:00
Bryan C. Mills	b76e6f8825	Revert "cmd/compile, cmd/link, runtime: make defers low-cost through inline code and extra funcdata" This reverts CL 190098. Reason for revert: broke several builders. Change-Id: I69161352f9ded02537d8815f259c4d391edd9220 Reviewed-on: https://go-review.googlesource.com/c/go/+/201519 Run-TryBot: Bryan C. Mills <bcmills@google.com> Reviewed-by: Austin Clements <austin@google.com> Reviewed-by: Dan Scales <danscales@google.com>	2019-10-16 20:59:53 +00:00
Dan Scales	dad616375f	cmd/compile, cmd/link, runtime: make defers low-cost through inline code and extra funcdata Generate inline code at defer time to save the args of defer calls to unique (autotmp) stack slots, and generate inline code at exit time to check which defer calls were made and make the associated function/method/interface calls. We remember that a particular defer statement was reached by storing in the deferBits variable (always stored on the stack). At exit time, we check the bits of the deferBits variable to determine which defer function calls to make (in reverse order). These low-cost defers are only used for functions where no defers appear in loops. In addition, we don't do these low-cost defers if there are too many defer statements or too many exits in a function (to limit code increase). When a function uses open-coded defers, we produce extra FUNCDATA_OpenCodedDeferInfo information that specifies the number of defers, and for each defer, the stack slots where the closure and associated args have been stored. The funcdata also includes the location of the deferBits variable. Therefore, for panics, we can use this funcdata to determine exactly which defers are active, and call the appropriate functions/methods/closures with the correct arguments for each active defer. In order to unwind the stack correctly after a recover(), we need to add an extra code segment to functions with open-coded defers that simply calls deferreturn() and returns. This segment is not reachable by the normal function, but is returned to by the runtime during recovery. We set the liveness information of this deferreturn() to be the same as the liveness at the first function call during the last defer exit code (so all return values and all stack slots needed by the defer calls will be live). I needed to increase the stackguard constant from 880 to 896, because of a small amount of new code in deferreturn(). The -N flag disables open-coded defers. 
'-d defer' prints out the kind of defer being used at each defer statement (heap-allocated, stack-allocated, or open-coded). Cost of defer statement [ go test -run NONE -bench BenchmarkDefer$ runtime ] With normal (stack-allocated) defers only: 35.4 ns/op With open-coded defers: 5.6 ns/op Cost of function call alone (remove defer keyword): 4.4 ns/op Text size increase (including funcdata) for go cmd without/with open-coded defers: 0.09% The average size increase (including funcdata) for only the functions that use open-coded defers is 1.1%. The cost of a panic followed by a recover got noticeably slower, since panic processing now requires a scan of the stack for open-coded defer frames. This scan is required, even if no frames are using open-coded defers: Cost of panic and recover [ go test -run NONE -bench BenchmarkPanicRecover runtime ] Without open-coded defers: 62.0 ns/op With open-coded defers: 255 ns/op A CGO Go-to-C-to-Go benchmark got noticeably faster because of open-coded defers: CGO Go-to-C-to-Go benchmark [cd misc/cgo/test; go test -run NONE -bench BenchmarkCGoCallback ] Without open-coded defers: 443 ns/op With open-coded defers: 347 ns/op Updates #14939 (defer performance) Updates #34481 (design doc) Change-Id: I51a389860b9676cfa1b84722f5fb84d3c4ee9e28 Reviewed-on: https://go-review.googlesource.com/c/go/+/190098 Reviewed-by: Austin Clements <austin@google.com>	2019-10-16 18:27:16 +00:00
Cherry Zhang	c4817f5d4f	cmd/compile: on Wasm and AIX, let deferred nil function panic at invocation The Go spec requires: If a deferred function value evaluates to nil, execution panics when the function is invoked, not when the "defer" statement is executed. On Wasm and AIX, we currently emit a nil check at the point of the defer statement, which makes it panic too early. This CL fixes this. Also, on Wasm, the nil function will now be passed through deferreturn to jmpdefer, which does an explicit nil check and calls sigpanic if it is nil. This sigpanic, being called from assembly, is ABI0. So change the assembler backend to also handle sigpanic in ABI0. Fixes #34926. Updates #8047. Change-Id: I28489a571cee36d2aef041f917b8cfdc31d557d4 Reviewed-on: https://go-review.googlesource.com/c/go/+/201297 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2019-10-16 00:05:37 +00:00
Meng Zhuo	50f1157760	cmd/compile: add math/bits.Mul64 intrinsic on mips64x Benchmark: name old time/op new time/op delta Mul 36.0ns ± 1% 2.8ns ± 0% -92.31% (p=0.000 n=10+10) Mul32 4.37ns ± 0% 4.37ns ± 0% ~ (p=0.429 n=6+10) Mul64 36.4ns ± 0% 2.8ns ± 0% -92.37% (p=0.000 n=10+9) Change-Id: Ic4f4e5958adbf24999abcee721d0180b5413fca7 Reviewed-on: https://go-review.googlesource.com/c/go/+/200582 Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com> Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2019-10-14 21:23:34 +00:00
Matthew Dempsky	06b12e660c	cmd/compile: move some ONAME-specific flags from Node to Name The IsClosureVar, IsOutputParamHeapAddr, Assigned, Addrtaken, InlFormal, and InlLocal flags are only interesting for ONAME nodes, so it's better to set these flags on Name.flags instead of Node.flags. Two caveats though: 1. Previously, we would set Assigned and Addrtaken on the entire expression tree involved in an assignment or addressing operation. However, the rest of the compiler only actually cares about knowing whether the underlying ONAME (if any) was assigned/addressed. 2. This actually requires bumping Name.flags from bitset8 to bitset16, whereas it doesn't allow shrinking Node.flags any. However, Name has some trailing padding bytes, so expanding Name.flags doesn't cost any memory. Passes toolstash-check. Change-Id: I7775d713566a38d5b9723360b1659b79391744c2 Reviewed-on: https://go-review.googlesource.com/c/go/+/200898 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2019-10-14 18:57:11 +00:00
Matthew Dempsky	46be01f4e0	cmd/compile: remove Addable flag This flag is supposed to indicate whether the expression is "addressable"; but in practice, we infer this from other attributes about the expression (e.g., n.Op and n.Class()). Passes toolstash-check. Change-Id: I19352ca07ab5646e232d98e8a7c1c9aec822ddd0 Reviewed-on: https://go-review.googlesource.com/c/go/+/200897 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2019-10-13 01:48:30 +00:00
David Chase	c450ace12c	cmd/compile: remove statement marks from secondary calls Calls are code-generated in an alternate path that inherits its positions from values, not from SSAGenState. The default position on SSAGenState was marked as not-a-statement, but this was not applied to the value itself, leading to spurious "is statement" marks in the output (convention: after code generation in the compiler, everything is either definitely a statement or definitely not a statement, nothing is in the undetermined state). This CL causes a 35 statement regression in ssa/stmtlines_test. This is down from the earlier 150 because of all the other CLs preceding this one that deal with the root causes of the missing lines (repeated lines on nested calls hid missing lines). This also removes some line repeats from ssa/debug_test. Change-Id: Ie9a507bd5447e906b35bbd098e3295211df2ae01 Reviewed-on: https://go-review.googlesource.com/c/go/+/188018 Run-TryBot: David Chase <drchase@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Jeremy Faller <jeremy@golang.org>	2019-10-04 20:41:52 +00:00
Ruixin(Peter) Bao	ac2ceba01a	cmd/compile/internal/gc: intrinsify mulWW on s390x An SSA rule has already been added to intrinsify Mul/Mul64 on s390x. In this CL, we want to let mulWW use that SSA rule as well. Also removed an extra line for formatting. Benchmarks: QuoRem-18 3.59µs ±15% 2.94µs ± 3% -18.06% (p=0.000 n=8+8) ModSqrt225_Tonelli-18 806µs ± 0% 800µs ± 0% -0.85% (p=0.000 n=7+8) ModSqrt225_3Mod4-18 245µs ± 1% 243µs ± 0% -0.81% (p=0.001 n=8+8) ModSqrt231_Tonelli-18 837µs ± 0% 834µs ± 1% -0.36% (p=0.028 n=8+8) ModSqrt231_5Mod8-18 282µs ± 0% 280µs ± 0% -0.76% (p=0.000 n=8+8) Sqrt-18 45.8µs ± 2% 38.6µs ± 0% -15.63% (p=0.000 n=8+8) IntSqr/1-18 19.1ns ± 0% 13.1ns ± 0% -31.41% (p=0.000 n=8+8) IntSqr/2-18 48.3ns ± 2% 48.2ns ± 0% ~ (p=0.094 n=8+8) IntSqr/3-18 70.5ns ± 1% 70.7ns ± 0% ~ (p=0.428 n=8+8) IntSqr/5-18 119ns ± 1% 118ns ± 0% -1.02% (p=0.000 n=7+8) IntSqr/8-18 215ns ± 1% 215ns ± 0% ~ (p=0.320 n=8+7) IntSqr/10-18 302ns ± 1% 301ns ± 0% ~ (p=0.148 n=8+7) IntSqr/20-18 952ns ± 1% 807ns ± 0% -15.28% (p=0.000 n=8+8) IntSqr/30-18 1.74µs ± 0% 1.53µs ± 0% -11.93% (p=0.000 n=8+8) IntSqr/50-18 3.91µs ± 0% 3.57µs ± 0% -8.64% (p=0.000 n=7+8) IntSqr/80-18 8.66µs ± 1% 8.11µs ± 0% -6.39% (p=0.000 n=8+8) IntSqr/100-18 12.8µs ± 0% 12.2µs ± 0% -5.19% (p=0.000 n=8+8) IntSqr/200-18 46.0µs ± 0% 44.5µs ± 0% -3.06% (p=0.000 n=8+8) IntSqr/300-18 81.4µs ± 0% 78.4µs ± 0% -3.71% (p=0.000 n=7+8) IntSqr/500-18 212µs ± 1% 206µs ± 0% -2.66% (p=0.000 n=8+8) IntSqr/800-18 419µs ± 1% 406µs ± 0% -3.07% (p=0.000 n=8+8) IntSqr/1000-18 635µs ± 0% 621µs ± 0% -2.13% (p=0.000 n=8+8) Change-Id: Ib097857186932b902601ab087cbeff3fc9555c3e Reviewed-on: https://go-review.googlesource.com/c/go/+/197639 Run-TryBot: Emmanuel Odeke <emm.odeke@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2019-10-02 11:08:43 +00:00
Mohit Verma	c729116332	cmd/compile: use Node.Right for OAS2* nodes (cleanup) This CL changes cmd/compile to use Node.Right instead of Node.Rlist for OAS2FUNC/OAS2RECV/OAS2MAPR/OAS2DOTTYPE nodes. Fixes #32293 Change-Id: I4c9d9100be2d98d15e016797f934f64d385f5faa Reviewed-on: https://go-review.googlesource.com/c/go/+/197817 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2019-09-28 05:04:49 +00:00
Cuong Manh Le	75da700d0a	cmd/compile: consistently use strlit to access constants string values Passes toolstash-check. Change-Id: Ieaef20b7649787727b69469f93ffc942022bc079 Reviewed-on: https://go-review.googlesource.com/c/go/+/195198 Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2019-09-16 11:41:20 +00:00
Ruixin Bao	98aa97806b	cmd/compile: add math/bits.Mul64 intrinsic on s390x This change adds an intrinsic for Mul64 on s390x. To achieve that, a new assembly instruction, MLGR, is introduced in s390x/asmz.go. This assembly instruction directly uses an existing instruction on Z; it multiplies two 64-bit unsigned integers and stores the result in two separate registers. In this case, we require the multiplicand to be stored in register R3 and the output result (the high and low 64 bits of the product) to be stored in R2 and R3 respectively. A test case is also added. Benchmark: name old time/op new time/op delta Mul-18 11.1ns ± 0% 1.4ns ± 0% -87.39% (p=0.002 n=8+10) Mul32-18 2.07ns ± 0% 2.07ns ± 0% ~ (all equal) Mul64-18 11.1ns ± 1% 1.4ns ± 0% -87.42% (p=0.000 n=10+10) Change-Id: Ieca6ad1f61fff9a48a31d50bbd3f3c6d9e6675c1 Reviewed-on: https://go-review.googlesource.com/c/go/+/194572 Reviewed-by: Michael Munday <mike.munday@ibm.com> Run-TryBot: Michael Munday <mike.munday@ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2019-09-13 09:04:48 +00:00
Ainar Garipov	0efbd10157	all: fix typos Use the following (suboptimal) script to obtain a list of possible typos: #!/usr/bin/env sh set -x git ls-files \|\ grep -e '\.$c\\|cc\\|go$$' \|\ xargs -n 1\ awk\ '/\/\// { gsub(/.\/\//, ""); print; } /\/\/, /\\// { gsub(/.\/\/, ""); gsub(/\\/.*/, ""); }' \|\ hunspell -d en_US -l \|\ grep '^[[:upper:]]\{0,1\}[[:lower:]]\{1,\}$' \|\ grep -v -e '^.\{1,4\}$' -e '^.\{16,\}$' \|\ sort -f \|\ uniq -c \|\ awk '$1 == 1 { print $2; }' Then, go through the results manually and fix the most obvious typos in the non-vendored code. Change-Id: I3cb5830a176850e1a0584b8a40b47bde7b260eae Reviewed-on: https://go-review.googlesource.com/c/go/+/193848 Reviewed-by: Robert Griesemer <gri@golang.org>	2019-09-08 17:28:20 +00:00
Matthew Dempsky	581526ce96	cmd/compile: rewrite untyped constant conversion logic This CL detangles the hairy mess that was convlit+defaultlit. In particular, it makes the following changes: 1. convlit1 now follows the standard typecheck behavior of setting "n.Type = nil" if there's an error. Notably, this means for a lot of test cases, we now avoid reporting useless follow-on error messages. For example, after reporting that "1 << s + 1.0" has an invalid shift, we no longer also report that it can't be assigned to string. 2. Previously, assignconvfn had some extra logic for trying to suppress errors from convlit/defaultlit so that it could provide its own errors with better context information. Instead, this extra context information is now passed down into convlit1 directly. 3. Relatedly, this CL also removes redundant calls to defaultlit prior to assignconv. As a consequence, when an expression doesn't make sense for a particular assignment (e.g., assigning an untyped string to an integer), the error messages now say "untyped string" instead of just "string". This is more consistent with go/types behavior. 4. defaultlit2 is now smarter about only trying to convert pairs of untyped constants when it's likely to succeed. This allows us to report better error messages for things like 3+"x"; instead of "cannot convert 3 to string" we now report "mismatched types untyped number and untyped string". Passes toolstash-check. Change-Id: I26822a02dc35855bd0ac774907b1cf5737e91882 Reviewed-on: https://go-review.googlesource.com/c/go/+/187657 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org>	2019-09-06 23:15:48 +00:00
Cuong Manh Le	d2f958d8d1	cmd/compile: extend ssa.go to handle 1-element array and 1-field struct Assigning to a 1-element array or 1-field struct variable is considered to clobber the whole variable. By emitting OpVarDef in this case, liveness analysis can now know the variable is redefined. Also, isfat is no longer necessary and will be removed in a follow-up CL. Fixes #33916 Change-Id: Iece0d90b05273f333d59d6ee5b12ee7dc71908c2 Reviewed-on: https://go-review.googlesource.com/c/go/+/192979 Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2019-09-03 19:33:04 +00:00
Brian Kessler	b003afe4fe	cmd/compile: intrinsify RotateLeft32 on wasm wasm has 32-bit versions of all integer operations. This change lowers RotateLeft32 to i32.rotl on wasm and intrinsifies the math/bits call. Benchmarking on amd64 under node.js this is ~25% faster. node v10.15.3/amd64 name old time/op new time/op delta RotateLeft 8.37ns ± 1% 8.28ns ± 0% -1.05% (p=0.029 n=4+4) RotateLeft8 11.9ns ± 1% 11.8ns ± 0% ~ (p=0.167 n=5+5) RotateLeft16 11.8ns ± 0% 11.8ns ± 0% ~ (all equal) RotateLeft32 11.9ns ± 1% 8.7ns ± 0% -26.32% (p=0.008 n=5+5) RotateLeft64 8.31ns ± 1% 8.43ns ± 2% ~ (p=0.063 n=5+5) Updates #31265 Change-Id: I5b8e155978faeea536c4f6427ac9564d2f096a46 Reviewed-on: https://go-review.googlesource.com/c/go/+/182359 Run-TryBot: Brian Kessler <brian.m.kessler@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Richard Musiol <neelance@gmail.com>	2019-08-31 17:03:04 +00:00
Ben Shi	8d5197d818	cmd/compile: optimize 386's math.bits.TrailingZeros16 This CL reverts CL 192097 and fixes the issue in CL 189277. Change-Id: Icd271262e1f5019a8e01c91f91c12c1261eeb02b Reviewed-on: https://go-review.googlesource.com/c/go/+/192519 Run-TryBot: Ben Shi <powerman1st@163.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2019-08-30 17:37:00 +00:00
Ben Shi	3cfd003a8a	cmd/compile: optimize ARM's math.bits.RotateLeft32 This CL optimizes math.bits.RotateLeft32 to inline "MOVW Rx@>Ry, Rd" on ARM. The benchmark results of math/bits show some improvements. name old time/op new time/op delta RotateLeft-4 9.42ns ± 0% 6.91ns ± 0% -26.66% (p=0.000 n=40+33) RotateLeft8-4 8.79ns ± 0% 8.79ns ± 0% -0.04% (p=0.000 n=40+31) RotateLeft16-4 8.79ns ± 0% 8.79ns ± 0% -0.04% (p=0.000 n=40+32) RotateLeft32-4 8.16ns ± 0% 7.54ns ± 0% -7.68% (p=0.000 n=40+40) RotateLeft64-4 15.7ns ± 0% 15.7ns ± 0% ~ (all equal) updates #31265 Change-Id: I77bc1c2c702d5323fc7cad5264a8e2d5666bf712 Reviewed-on: https://go-review.googlesource.com/c/go/+/188697 Run-TryBot: Ben Shi <powerman1st@163.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2019-08-28 15:41:58 +00:00
Ben Shi	c683ab8128	cmd/compile: optimize ARM's math.Abs This CL optimizes math.Abs to an inline ABSD instruction on ARM. The benchmark results of src/math/ show big improvements. name old time/op new time/op delta Acos-4 181ns ± 0% 182ns ± 0% +0.30% (p=0.000 n=40+40) Acosh-4 202ns ± 0% 202ns ± 0% ~ (all equal) Asin-4 163ns ± 0% 163ns ± 0% ~ (all equal) Asinh-4 242ns ± 0% 242ns ± 0% ~ (all equal) Atan-4 120ns ± 0% 121ns ± 0% +0.83% (p=0.000 n=40+40) Atanh-4 202ns ± 0% 202ns ± 0% ~ (all equal) Atan2-4 173ns ± 0% 173ns ± 0% ~ (all equal) Cbrt-4 1.06µs ± 0% 1.06µs ± 0% +0.09% (p=0.000 n=39+37) Ceil-4 72.9ns ± 0% 72.8ns ± 0% ~ (p=0.237 n=40+40) Copysign-4 13.2ns ± 0% 13.2ns ± 0% ~ (all equal) Cos-4 193ns ± 0% 183ns ± 0% -5.18% (p=0.000 n=40+40) Cosh-4 254ns ± 0% 239ns ± 0% -5.91% (p=0.000 n=40+40) Erf-4 112ns ± 0% 112ns ± 0% ~ (all equal) Erfc-4 117ns ± 0% 117ns ± 0% ~ (all equal) Erfinv-4 127ns ± 0% 127ns ± 1% ~ (p=0.492 n=40+40) Erfcinv-4 128ns ± 0% 128ns ± 0% ~ (all equal) Exp-4 212ns ± 0% 206ns ± 0% -3.05% (p=0.000 n=40+40) ExpGo-4 216ns ± 0% 209ns ± 0% -3.24% (p=0.000 n=40+40) Expm1-4 142ns ± 0% 142ns ± 0% ~ (all equal) Exp2-4 191ns ± 0% 184ns ± 0% -3.45% (p=0.000 n=40+40) Exp2Go-4 194ns ± 0% 187ns ± 0% -3.61% (p=0.000 n=40+40) Abs-4 14.4ns ± 0% 6.3ns ± 0% -56.39% (p=0.000 n=38+39) Dim-4 12.6ns ± 0% 12.6ns ± 0% ~ (all equal) Floor-4 49.6ns ± 0% 49.6ns ± 0% ~ (all equal) Max-4 27.6ns ± 0% 27.6ns ± 0% ~ (all equal) Min-4 27.0ns ± 0% 27.0ns ± 0% ~ (all equal) Mod-4 349ns ± 0% 305ns ± 1% -12.55% (p=0.000 n=33+40) Frexp-4 54.0ns ± 0% 47.1ns ± 0% -12.78% (p=0.000 n=38+38) Gamma-4 242ns ± 0% 234ns ± 0% -3.16% (p=0.000 n=36+40) Hypot-4 84.8ns ± 0% 67.8ns ± 0% -20.05% (p=0.000 n=31+35) HypotGo-4 88.5ns ± 0% 71.6ns ± 0% -19.12% (p=0.000 n=40+38) Ilogb-4 45.8ns ± 0% 38.9ns ± 0% -15.12% (p=0.000 n=40+32) J0-4 821ns ± 0% 802ns ± 0% -2.33% (p=0.000 n=33+40) J1-4 816ns ± 0% 807ns ± 0% -1.05% (p=0.000 n=40+29) Jn-4 1.67µs ± 0% 1.65µs ± 0% -1.45% (p=0.000 n=40+39) Ldexp-4 61.5ns ± 
0% 54.6ns ± 0% -11.27% (p=0.000 n=40+32) Lgamma-4 188ns ± 0% 188ns ± 0% ~ (all equal) Log-4 154ns ± 0% 147ns ± 0% -4.78% (p=0.000 n=40+40) Logb-4 50.9ns ± 0% 42.7ns ± 0% -16.11% (p=0.000 n=34+39) Log1p-4 160ns ± 0% 159ns ± 0% ~ (p=0.828 n=40+40) Log10-4 173ns ± 0% 166ns ± 0% -4.05% (p=0.000 n=40+40) Log2-4 65.3ns ± 0% 58.4ns ± 0% -10.57% (p=0.000 n=37+37) Modf-4 36.4ns ± 0% 36.4ns ± 0% ~ (all equal) Nextafter32-4 36.4ns ± 0% 36.4ns ± 0% ~ (all equal) Nextafter64-4 32.7ns ± 0% 32.6ns ± 0% ~ (p=0.375 n=40+40) PowInt-4 300ns ± 0% 277ns ± 0% -7.78% (p=0.000 n=40+40) PowFrac-4 676ns ± 0% 635ns ± 0% -6.00% (p=0.000 n=40+35) Pow10Pos-4 17.6ns ± 0% 17.6ns ± 0% ~ (all equal) Pow10Neg-4 22.0ns ± 0% 22.0ns ± 0% ~ (all equal) Round-4 30.1ns ± 0% 30.1ns ± 0% ~ (all equal) RoundToEven-4 38.9ns ± 0% 38.9ns ± 0% ~ (all equal) Remainder-4 291ns ± 0% 263ns ± 0% -9.62% (p=0.000 n=40+40) Signbit-4 11.3ns ± 0% 11.3ns ± 0% ~ (all equal) Sin-4 185ns ± 0% 185ns ± 0% ~ (all equal) Sincos-4 230ns ± 0% 230ns ± 0% ~ (all equal) Sinh-4 253ns ± 0% 246ns ± 0% -2.77% (p=0.000 n=39+39) SqrtIndirect-4 41.4ns ± 0% 41.4ns ± 0% ~ (all equal) SqrtLatency-4 13.8ns ± 0% 13.8ns ± 0% ~ (all equal) SqrtIndirectLatency-4 37.0ns ± 0% 37.0ns ± 0% ~ (p=0.632 n=40+40) SqrtGoLatency-4 911ns ± 0% 911ns ± 0% +0.08% (p=0.000 n=40+40) SqrtPrime-4 13.2µs ± 0% 13.2µs ± 0% +0.01% (p=0.038 n=38+40) Tan-4 205ns ± 0% 205ns ± 0% ~ (all equal) Tanh-4 264ns ± 0% 247ns ± 0% -6.44% (p=0.000 n=39+32) Trunc-4 45.2ns ± 0% 45.2ns ± 0% ~ (all equal) Y0-4 796ns ± 0% 792ns ± 0% -0.55% (p=0.000 n=35+40) Y1-4 804ns ± 0% 797ns ± 0% -0.82% (p=0.000 n=24+40) Yn-4 1.64µs ± 0% 1.62µs ± 0% -1.27% (p=0.000 n=40+39) Float64bits-4 8.16ns ± 0% 8.16ns ± 0% +0.04% (p=0.000 n=35+40) Float64frombits-4 10.7ns ± 0% 10.7ns ± 0% ~ (all equal) Float32bits-4 7.53ns ± 0% 7.53ns ± 0% ~ (p=0.760 n=40+40) Float32frombits-4 6.91ns ± 0% 6.91ns ± 0% -0.04% (p=0.002 n=32+38) [Geo mean] 111ns 106ns -3.98% Change-Id: I54f4fd7f5160db020b430b556bde59cc0fdb996d 
Reviewed-on: https://go-review.googlesource.com/c/go/+/188678 Run-TryBot: Ben Shi <powerman1st@163.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2019-08-28 15:41:28 +00:00
Bryan C. Mills	372b0eed17	Revert "cmd/compile: optimize 386's math.bits.TrailingZeros16" This reverts CL 189277. Reason for revert: broke 32-bit builders. Updates #33902 Change-Id: Ie5f180d0371a90e5057ed578c334372e5fc3a286 Reviewed-on: https://go-review.googlesource.com/c/go/+/192097 Run-TryBot: Bryan C. Mills <bcmills@google.com> Reviewed-by: Daniel Martí <mvdan@mvdan.cc>	2019-08-28 12:57:59 +00:00
Ben Shi	22355d6cd2	cmd/compile: optimize 386's math.bits.TrailingZeros16 This CL optimizes math.bits.TrailingZeros16 on 386 with a pair of BSFL and ORL instructions. The case TrailingZeros16-4 of the benchmark test in math/bits shows a big improvement. name old time/op new time/op delta TrailingZeros16-4 1.55ns ± 1% 0.87ns ± 1% -43.87% (p=0.000 n=50+49) Change-Id: Ia899975b0e46f45dcd20223b713ed632bc32740b Reviewed-on: https://go-review.googlesource.com/c/go/+/189277 Run-TryBot: Ben Shi <powerman1st@163.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2019-08-28 02:29:54 +00:00
LE Manh Cuong	1a432f27d5	cmd/compile: eliminate usage of global Fatalf in ssa.go state and ssafn both have their own Fatalf, so use them instead of global Fatalf. Updates #19683 Change-Id: Ie02a961d4285ab0a3f3b8d889a5b498d926ed567 Reviewed-on: https://go-review.googlesource.com/c/go/+/188539 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2019-08-27 17:05:15 +00:00
Keith Randall	8f296f59de	Revert "Revert "cmd/compile,runtime: allocate defer records on the stack"" This reverts CL 180761 Reason for revert: Reinstate the stack-allocated defer CL. There was nothing wrong with the CL proper, but stack allocation of defers exposed two other issues. Issue #32477: Fix has been submitted as CL 181258. Issue #32498: Possible fix is CL 181377 (not submitted yet). Change-Id: I32b3365d5026600069291b068bbba6cb15295eb3 Reviewed-on: https://go-review.googlesource.com/c/go/+/181378 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2019-06-10 16:19:39 +00:00
Michael Munday	ac8dbe7747	cmd/compile, runtime: make atomic loads/stores sequentially consistent on s390x The z/Architecture does not guarantee that a load following a store will not be reordered with that store, unless they access the same address. Therefore if we want to ensure the sequential consistency of atomic loads and stores we need to perform serialization operations after atomic stores. We do not need to serialize in the runtime when using StoreRel[ease] and LoadAcq[uire]. The z/Architecture already provides sufficient ordering guarantees for these operations. name old time/op new time/op delta AtomicLoad64-16 0.51ns ± 0% 0.51ns ± 0% ~ (all equal) AtomicStore64-16 0.51ns ± 0% 0.60ns ± 9% +16.47% (p=0.000 n=17+20) AtomicLoad-16 0.51ns ± 0% 0.51ns ± 0% ~ (all equal) AtomicStore-16 0.51ns ± 0% 0.60ns ± 9% +16.50% (p=0.000 n=18+20) Fixes #32428. Change-Id: I88d19a4010c46070e4fff4b41587efe4c628d4d9 Reviewed-on: https://go-review.googlesource.com/c/go/+/180439 Run-TryBot: Michael Munday <mike.munday@ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2019-06-06 16:15:43 +00:00
Keith Randall	49200e3f3e	Revert "cmd/compile,runtime: allocate defer records on the stack" This reverts commit `fff4f599fe`. Reason for revert: Seems to still have issues around GC. Fixes #32452 Change-Id: Ibe7af629f9ad6a3d5312acd7b066123f484da7f0 Reviewed-on: https://go-review.googlesource.com/c/go/+/180761 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>	2019-06-05 19:50:09 +00:00
Keith Randall	fff4f599fe	cmd/compile,runtime: allocate defer records on the stack When a defer is executed at most once in a function body, we can allocate the defer record for it on the stack instead of on the heap. This should make defers like this (which are very common) faster. This optimization applies to 363 out of the 370 static defer sites in the cmd/go binary. name old time/op new time/op delta Defer-4 52.2ns ± 5% 36.2ns ± 3% -30.70% (p=0.000 n=10+10) Fixes #6980 Update #14939 Change-Id: I697109dd7aeef9e97a9eeba2ef65ff53d3ee1004 Reviewed-on: https://go-review.googlesource.com/c/go/+/171758 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2019-06-04 17:35:20 +00:00
Cherry Zhang	c10db03cbe	cmd/compile: make sure build works when intrinsics are disabled Some runtime functions, like getcallerpc/sp, don't have Go or assembly implementations and have to be intrinsified. Make sure they are, even if intrinsics are disabled. This makes "go build -gcflags=all=-d=ssa/intrinsics/off hello.go" work. Change-Id: I77caaed7715d3ca7ffef68a3cdc9357f095c6b9f Reviewed-on: https://go-review.googlesource.com/c/go/+/179897 Run-TryBot: Cherry Zhang <cherryyz@google.com> Run-TryBot: Austin Clements <austin@google.com> Reviewed-by: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2019-05-31 21:27:59 +00:00
LE Manh Cuong	d0aca5759e	cmd/compile: fix doc typo in ssa.go Change-Id: Ie299a5eca6f6a7c5a37c00ff0de7ce322450375b Reviewed-on: https://go-review.googlesource.com/c/go/+/178123 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2019-05-21 14:19:32 +00:00
Austin Clements	4a4e05b0b1	cmd/compile,runtime/internal/atomic: add Load8 Change-Id: Id52a5730cf9207ee7ccebac4ef12791dc5720e7c Reviewed-on: https://go-review.googlesource.com/c/go/+/172283 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>	2019-05-03 19:25:37 +00:00
Michael Munday	2c1b5130aa	cmd/compile: add math/bits.{Add,Sub}64 intrinsics on s390x This CL adds intrinsics for the 64-bit addition and subtraction functions in math/bits. These intrinsics use the condition code to propagate the carry or borrow bit. To make the carry chains more efficient I've removed the 'clobberFlags' property from most of the load and store operations. Originally these ops did clobber flags when using offsets that didn't fit in a signed 20-bit integer, however that is no longer true. As with other platforms the intrinsics are faster when executed in a chain rather than a loop because currently we need to spill and restore the carry bit between each loop iteration. We may be able to reduce the need to do this on s390x (e.g. by using compare-and-branch instructions that do not clobber flags) in the future. name old time/op new time/op delta Add64 1.21ns ± 2% 2.03ns ± 2% +67.18% (p=0.000 n=7+10) Add64multiple 2.98ns ± 3% 1.03ns ± 0% -65.39% (p=0.000 n=10+9) Sub64 1.23ns ± 4% 2.03ns ± 1% +64.85% (p=0.000 n=10+10) Sub64multiple 3.73ns ± 4% 1.04ns ± 1% -72.28% (p=0.000 n=10+8) Change-Id: I913bbd5e19e6b95bef52f5bc4f14d6fe40119083 Reviewed-on: https://go-review.googlesource.com/c/go/+/174303 Run-TryBot: Michael Munday <mike.munday@ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2019-05-03 10:41:15 +00:00
Keith Randall	e7d08b6fe6	cmd/compile: fix line numbers for index panics In the statement x = a[i], the index panic should appear to come from the line number of the '['. Previous to this CL we sometimes used the line number of the '=' instead. Fixes #29504 Change-Id: Ie718fd303c1ac2aee33e88d52c9ba9bcf220dea1 Reviewed-on: https://go-review.googlesource.com/c/go/+/174617 Run-TryBot: Keith Randall <khr@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2019-04-30 21:30:30 +00:00
Carlos Eduardo Seo	50ad09418e	cmd/compile: intrinsify math/bits.Add64 for ppc64x This change creates an intrinsic for Add64 for ppc64x and adds a testcase for it. name old time/op new time/op delta Add64-160 1.90ns ±40% 2.29ns ± 0% ~ (p=0.119 n=5+5) Add64multiple-160 6.69ns ± 2% 2.45ns ± 4% -63.47% (p=0.016 n=4+5) Change-Id: I9abe6fb023fdf62eea3c9b46a1820f60bb0a7f97 Reviewed-on: https://go-review.googlesource.com/c/go/+/173758 Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com> Run-TryBot: Carlos Eduardo Seo <cseo@linux.vnet.ibm.com>	2019-04-28 23:51:04 +00:00
Keith Randall	fd788a86b6	cmd/compile: always mark atColumn1 results as statements In 31618, we end up comparing the is-stmt-ness of positions to repurpose real instructions as inline marks. If the is-stmt-ness doesn't match, we end up not being able to remove the inline mark. Always use statement-full positions to do the matching, so we always find a match if there is one. Also always use positions that are statements for inline marks. Fixes #31618 Change-Id: Idaf39bdb32fa45238d5cd52973cadf4504f947d5 Reviewed-on: https://go-review.googlesource.com/c/go/+/173324 Run-TryBot: Keith Randall <khr@golang.org> Reviewed-by: David Chase <drchase@google.com>	2019-04-23 17:39:11 +00:00
erifan01	f8f265b9cf	cmd/compile: intrinsify math/bits.Sub64 for arm64 This CL intrinsifies Sub64 with arm64 instruction sequence NEGS, SBCS, NGC and NEG, and optimizes the case of borrowing chains. Benchmarks: name old time/op new time/op delta Sub-64 2.500000ns +- 0% 2.048000ns +- 1% -18.08% (p=0.000 n=10+10) Sub32-64 2.500000ns +- 0% 2.500000ns +- 0% ~ (all equal) Sub64-64 2.500000ns +- 0% 2.080000ns +- 0% -16.80% (p=0.000 n=10+7) Sub64multiple-64 7.090000ns +- 0% 2.090000ns +- 0% -70.52% (p=0.000 n=10+10) Change-Id: I3d2664e009a9635e13b55d2c4567c7b34c2c0655 Reviewed-on: https://go-review.googlesource.com/c/go/+/159018 Reviewed-by: Cherry Zhang <cherryyz@google.com> Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2019-04-22 14:40:20 +00:00
Josh Bleecher Snyder	5781df421e	all: s/cancelation/cancellation/ Though there is variation in the spelling of canceled, cancellation is always spelled with a double l. Reference: https://www.grammarly.com/blog/canceled-vs-cancelled/ Change-Id: I240f1a297776c8e27e74f3eca566d2bc4c856f2f Reviewed-on: https://go-review.googlesource.com/c/go/+/170060 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2019-04-16 20:27:15 +00:00
Josh Bleecher Snyder	336f951b07	cmd/compile: add ORESULT, remove OINDREGSP This change is mostly cosmetic. OINDREGSP was used only for reading the results of a function call. In recognition of that fact, rename it to ORESULT. Along the way, trim down our handling of it to the bare minimum, and rely on the increased clarity of ORESULT to inline nodarg. Passes toolstash-check. Change-Id: I25b177df4ea54a8e94b1698d044c297b7e453c64 Reviewed-on: https://go-review.googlesource.com/c/go/+/170705 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2019-04-08 21:33:15 +00:00
Keith Randall	ad6c691542	cmd/compile: remove AUNDEF opcode This opcode was only used to mark unreachable code for plive to use. plive now uses the SSA representation, so it knows locations are unreachable because they are ends of Exit blocks. It doesn't need these opcodes any more. These opcodes actually used space in the binary, 2 bytes per undef on x86 and more for other archs. Makes the amd64 go binary 0.2% smaller. Change-Id: I64c84c35db7c7949617a3a5830f09c8e5fcd2620 Reviewed-on: https://go-review.googlesource.com/c/go/+/171058 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2019-04-07 01:15:28 +00:00
Michael Munday	726a9398f7	cmd/compile/internal/gc: minor cleanup of slicing Tidy the code up a little bit to move variable definitions closer to uses, prefer early return to else branches and some other minor tweaks. I'd like to make some more changes to this code in the near future and this CL should make those changes cleaner. Change-Id: Ie7d7f2e4bb1e670347941e255c9cdc1703282db5 Reviewed-on: https://go-review.googlesource.com/c/go/+/170120 Run-TryBot: Michael Munday <mike.munday@ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>	2019-04-01 21:16:31 +00:00
Josh Bleecher Snyder	5ee1d5d39f	cmd/compile: minor cleanup Use constants that are easier to read. Change-Id: I11fd6363b3bd283a4cc7c9908c2327123c64dcf7 Reviewed-on: https://go-review.googlesource.com/c/go/+/169723 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2019-03-28 23:16:26 +00:00
David Chase	591193b01f	cmd/compile: enhance debug_test for infinite loops ssa/debug_test.go already had a step limit; this exposes it to individual tests, and it is then set low for the infinite loop tests. That however is not enough; in an infinite loop debuggers see an unchanging line number, and therefore keep trying until they see a different one. To do this, the concept of a "bogus" line number is introduced, and on output single-instruction infinite loops are detected and a hardware nop with correct line number is inserted into the loop; the branch itself receives a bogus line number. This breaks up the endless stream of same line number and causes both gdb and delve to not hang; Delve complains about the incorrect line number while gdb does a sort of odd step-to-nowhere that then steps back to the loop. Since repeats are suppressed in the reference file, a single line is shown there. (The wrong line number mentioned in previous message was an artifact of debug_test.go, not Delve, and is now fixed.) The bogus line number exposed in Delve is less than wonderful, but compared to hanging, it is better. Fixes #30664. Change-Id: I30c927cf8869a84c6c9b84033ee44d7044aab552 Reviewed-on: https://go-review.googlesource.com/c/go/+/168477 Run-TryBot: David Chase <drchase@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2019-03-27 21:04:34 +00:00
Keith Randall	2034fbab5b	cmd/compile: use existing instructions instead of nops for inline marks Instead of always inserting a nop to use as the target of an inline mark, see if we can instead find an instruction we're issuing anyway with the correct line number, and use that instruction. That way, we don't need to issue a nop. Makes cmd/go 0.3% smaller. Update #29571 Change-Id: If6cfc93ab3352ec2c6e0878f8074a3bf0786b2f8 Reviewed-on: https://go-review.googlesource.com/c/go/+/158021 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>	2019-03-25 16:49:29 +00:00
Josh Bleecher Snyder	23b476a3c8	cmd/compile: port callnew to ssa conversion This is part of a general effort to shrink walk. In an ideal world, we'd have an SSA op for allocation, but we don't yet have a good mechanism for introducing function calling during SSA compilation. In the meantime, SSA conversion is a better place for it. This also makes it easier to introduce new optimizations; instead of doing the typecheck walk dance, we can simply write what we want the backend to do. I introduced a new opcode in this change because: (a) It avoids a class of bugs involving correctly detecting whether this ONEW is a "before walk" ONEW or an "after walk" ONEW. It also means that using ONEW or ONEWOBJ in the wrong context will generally result in a faster failure. (b) Opcodes are cheap. (c) It provides a better place to put documentation. This change is also marginally more performant: name old alloc/op new alloc/op delta Template 39.1MB ± 0% 39.0MB ± 0% -0.14% (p=0.008 n=5+5) Unicode 28.4MB ± 0% 28.4MB ± 0% ~ (p=0.421 n=5+5) GoTypes 132MB ± 0% 132MB ± 0% -0.23% (p=0.008 n=5+5) Compiler 608MB ± 0% 607MB ± 0% -0.25% (p=0.008 n=5+5) SSA 2.04GB ± 0% 2.04GB ± 0% -0.01% (p=0.008 n=5+5) Flate 24.4MB ± 0% 24.3MB ± 0% -0.13% (p=0.008 n=5+5) GoParser 29.3MB ± 0% 29.1MB ± 0% -0.54% (p=0.008 n=5+5) Reflect 84.8MB ± 0% 84.7MB ± 0% -0.21% (p=0.008 n=5+5) Tar 36.7MB ± 0% 36.6MB ± 0% -0.10% (p=0.008 n=5+5) XML 48.7MB ± 0% 48.6MB ± 0% -0.24% (p=0.008 n=5+5) [Geo mean] 85.0MB 84.8MB -0.19% name old allocs/op new allocs/op delta Template 383k ± 0% 382k ± 0% -0.26% (p=0.008 n=5+5) Unicode 341k ± 0% 341k ± 0% ~ (p=0.579 n=5+5) GoTypes 1.37M ± 0% 1.36M ± 0% -0.39% (p=0.008 n=5+5) Compiler 5.59M ± 0% 5.56M ± 0% -0.49% (p=0.008 n=5+5) SSA 16.9M ± 0% 16.9M ± 0% -0.03% (p=0.008 n=5+5) Flate 238k ± 0% 238k ± 0% -0.23% (p=0.008 n=5+5) GoParser 306k ± 0% 303k ± 0% -0.93% (p=0.008 n=5+5) Reflect 990k ± 0% 987k ± 0% -0.33% (p=0.008 n=5+5) Tar 356k ± 0% 355k ± 0% -0.20% (p=0.008 n=5+5) XML 444k ± 0% 442k ± 0% -0.45% (p=0.008 n=5+5) [Geo mean] 848k 845k -0.33% Change-Id: I2c36003a7cbf71b53857b7de734852b698f49310 Reviewed-on: https://go-review.googlesource.com/c/go/+/167957 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2019-03-20 19:38:50 +00:00
erifan01	5714c91b53	cmd/compile: intrinsify math/bits.Add64 for arm64 This CL intrinsifies Add64 with arm64 instruction sequence ADDS, ADCS and ADC, and optimizes the case of carry chains. The CL also changes the test code so that the intrinsic implementation can be tested. Benchmarks: name old time/op new time/op delta Add-224 2.500000ns +- 0% 2.090000ns +- 4% -16.40% (p=0.000 n=9+10) Add32-224 2.500000ns +- 0% 2.500000ns +- 0% ~ (all equal) Add64-224 2.500000ns +- 0% 1.577778ns +- 2% -36.89% (p=0.000 n=10+9) Add64multiple-224 6.000000ns +- 0% 2.000000ns +- 0% -66.67% (p=0.000 n=10+10) Change-Id: I6ee91c9a85c16cc72ade5fd94868c579f16c7615 Reviewed-on: https://go-review.googlesource.com/c/go/+/159017 Run-TryBot: Ben Shi <powerman1st@163.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2019-03-20 05:39:49 +00:00
Keith Randall	2c423f063b	cmd/compile,runtime: provide index information on bounds check failure A few examples (for accessing a slice of length 3): s[-1] runtime error: index out of range [-1] s[3] runtime error: index out of range [3] with length 3 s[-1:0] runtime error: slice bounds out of range [-1:] s[3:0] runtime error: slice bounds out of range [3:0] s[3:-1] runtime error: slice bounds out of range [:-1] s[3:4] runtime error: slice bounds out of range [:4] with capacity 3 s[0:3:4] runtime error: slice bounds out of range [::4] with capacity 3 Note that in cases where there are multiple things wrong with the indexes (e.g. s[3:-1]), we report one of those errors kind of arbitrarily, currently the rightmost one. An exhaustive set of examples is in issue30116[u].out in the CL. The message text has the same prefix as the old message text. That leads to slightly awkward phrasing but hopefully minimizes the chance that code depending on the error text will break. Increases the size of the go binary by 0.5% (amd64). The panic functions take arguments in registers in order to keep the size of the compiled code as small as possible. Fixes #30116 Change-Id: Idb99a827b7888822ca34c240eca87b7e44a04fdd Reviewed-on: https://go-review.googlesource.com/c/go/+/161477 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>	2019-03-18 17:33:38 +00:00
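
The bounds-check messages described in commit 2c423f063b can be observed from ordinary Go code by recovering the runtime panic and reading its text. The following is a minimal sketch, not code from the CL; the helper name `indexMessage` is hypothetical and exists only for illustration.

```go
package main

import "fmt"

// indexMessage is a hypothetical helper (not part of the CL). It indexes s
// at position i and returns the text of the resulting runtime panic, or ""
// if the access is in range.
func indexMessage(s []int, i int) (msg string) {
	defer func() {
		if r := recover(); r != nil {
			// Runtime bounds failures implement the error interface.
			if err, ok := r.(error); ok {
				msg = err.Error()
			} else {
				msg = fmt.Sprint(r)
			}
		}
	}()
	_ = s[i] // bounds-checked access; panics when i is out of range
	return ""
}

func main() {
	s := []int{10, 20, 30} // a slice of length 3, as in the commit message
	fmt.Println(indexMessage(s, 3))
	fmt.Println(indexMessage(s, -1))
}
```

The index is passed as a variable rather than a constant so that the compiler does not reject the out-of-range access at build time. As the commit message notes, the text keeps the old prefix, so code that matches on "index out of range" should continue to work.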

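Several commits above intrinsify `math/bits.Add64` and `math/bits.Sub64` and report the largest gains for the "multiple" benchmarks, where the carry or borrow from one call feeds the next. A rough sketch of such a chain, assuming a 128-bit value represented as two `uint64` words, is shown here; the helper names `add128` and `sub128` are illustrative and do not appear in any of the CLs.

```go
package main

import (
	"fmt"
	"math/bits"
)

// add128 adds two 128-bit values given as (hi, lo) word pairs. The carry out
// of the low words is passed into the high-word addition, forming a carry
// chain of two Add64 calls. Hypothetical helper for illustration.
func add128(aHi, aLo, bHi, bLo uint64) (hi, lo, carry uint64) {
	lo, c := bits.Add64(aLo, bLo, 0)
	hi, carry = bits.Add64(aHi, bHi, c)
	return hi, lo, carry
}

// sub128 subtracts b from a in the same representation, chaining the borrow
// from the low words into the high-word subtraction.
func sub128(aHi, aLo, bHi, bLo uint64) (hi, lo, borrow uint64) {
	lo, b := bits.Sub64(aLo, bLo, 0)
	hi, borrow = bits.Sub64(aHi, bHi, b)
	return hi, lo, borrow
}

func main() {
	// (2^64 - 1) + 1 overflows the low word and carries into the high word.
	hi, lo, carry := add128(0, ^uint64(0), 0, 1)
	fmt.Println(hi, lo, carry)

	// 2^64 - 1 borrows from the high word.
	hi, lo, borrow := sub128(1, 0, 0, 1)
	fmt.Println(hi, lo, borrow)
}
```

Whether a given chain compiles to an add-with-carry sequence depends on the target architecture and compiler version; the code behaves the same either way.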

763 commits