Commit graph

192 commits

Author SHA1 Message Date
Cherry Zhang
6ae3b70ef2 cmd/compile: add clobberdeadreg mode
When -clobberdeadreg flag is set, the compiler inserts code that
clobbers integer registers at call sites. This may be helpful for
debugging register ABI.

Only implemented on AMD64 for now.

Change-Id: Ia203d3f891c30fd95d0103489056fe01d63a2899
Reviewed-on: https://go-review.googlesource.com/c/go/+/302809
Trust: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2021-03-19 23:21:21 +00:00
Josh Bleecher Snyder
d0d38f0f70 cmd/compile: fix whitespace in comment
The whitespace was there to align with the following comment,
but the extra whitespace was unnecessary; it wasn't gofmt'd.
Then the file got gofmt'd, but the whitespace didn't get fixed.

Change-Id: I45aad9605b99d83545e4e611ae3ea1b2ff9e6bf6
Reviewed-on: https://go-review.googlesource.com/c/go/+/301649
Trust: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Emmanuel Odeke <emmanuel@orijtech.com>
2021-03-14 21:50:39 +00:00
David Chase
7240a18adb cmd/compile: test register ABI for method, interface, closure calls
This is enabled with a ridiculous magic name for method,
or for last input type passed, that needs to be changed
to something inutterable before actual release.

Ridiculous method name: MagicMethodNameForTestingRegisterABI
Ridiculous last (input) type name: MagicLastTypeNameForTestingRegisterABI

RLTN is tested with strings.Contains, so you can have
MagicLastTypeNameForTestingRegisterABI1
and
MagicLastTypeNameForTestingRegisterABI2
if that is helpful

Includes test test/abi/fibish2.go

Updates #44816.

Change-Id: I592a6edc71ca9bebdd1d00e24edee1ceebb3e43f
Reviewed-on: https://go-review.googlesource.com/c/go/+/299410
Trust: David Chase <drchase@google.com>
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2021-03-12 21:18:15 +00:00
David Chase
3e524ee65a cmd/compile: make modified Aux type for OpArgXXXX pass ssa/check
For #40724.

Change-Id: I7d1e76139d187cd15a6e0df9d19542b7200589f6
Reviewed-on: https://go-review.googlesource.com/c/go/+/297911
Trust: David Chase <drchase@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2021-03-03 15:01:05 +00:00
fanzha02
2b50ab2aee cmd/compile: optimize single-precision floating point square root
Add generic rule to rewrite the single-precision square root expression
with one single-precision instruction. The optimization will reduce two
times of precision converting between double-precision and single-precision.

On arm64 flatform.

previous:
  FCVTSD F0, F0
  FSQRTD F0, F0
  FCVTDS F0, F0

optimized:
  FSQRTS S0, S0

And this patch adds the test case to check the correctness.

This patch refers to CL 241877, contributed by Alice Xu
(dianhong.xu@arm.com)

Change-Id: I6de5d02281c693017ac4bd4c10963dd55989bd7e
Reviewed-on: https://go-review.googlesource.com/c/go/+/276873
Trust: fannie zhang <Fannie.Zhang@arm.com>
Run-TryBot: fannie zhang <Fannie.Zhang@arm.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2021-03-02 06:38:07 +00:00
David Chase
e25040d162 cmd/compile: change StaticCall to return a "Results"
StaticLECall (multiple value in +mem, multiple value result +mem) ->
StaticCall (multiple ergister value in +mem,
   multiple register-sized-value result +mem) ->
ARCH CallStatic (multiple ergister value in +mem,
   multiple register-sized-value result +mem)

But the architecture-dependent stuff is indifferent to whether
it is mem->mem or (mem)->(mem) until Prog generation.

Deal with OpSelectN -> Prog in ssagen/ssa.go, others, as they
appear.

For #40724.

Change-Id: I1d0436f6371054f1881862641d8e7e418e4a6a16
Reviewed-on: https://go-review.googlesource.com/c/go/+/293391
Trust: David Chase <drchase@google.com>
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2021-02-26 02:52:33 +00:00
David Chase
74cac8d479 cmd/compile: add AMD64 parameter register defs, Arg ops, plumb to ssa.Config
This is partial plumbing recycled from the original register abi test work;
these are the parts that translate easily.  Some other bits are deferred till
later when they are ready to be used.

For #40724.

Change-Id: Ica8c55a4526793446189725a2bc3839124feb38f
Reviewed-on: https://go-review.googlesource.com/c/go/+/260539
Trust: David Chase <drchase@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2021-02-23 18:14:42 +00:00
Cherry Zhang
c167635a6e cmd/compile: gofmt
Change-Id: I2aa44b9976bdac577db44ef76e18fdf5748df8ca
Reviewed-on: https://go-review.googlesource.com/c/go/+/269477
Trust: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2020-11-12 14:38:38 +00:00
Jonathan Swinney
ecc3f5112e cmd/compile: improve atomic swap intrinsics on arm64
ARMv8.1 has added new instructions for atomic memory operations. This
change builds on the previous change which added support for atomic add,
0a7ac93c27, to include similar support for
atomic-compare-and-swap, atomic-swap, atomic-or, and atomic-and
intrinsics. Since the new instructions are not guaranteed to be present,
we guard their usages with a branch on a CPU feature.

Peformance on an ARMv8.1 machine:
name                 old time/op  new time/op  delta
CompareAndSwap-16    37.9ns ±16%  24.1ns ± 4%  -36.44%  (p=0.000 n=10+9)
CompareAndSwap64-16  38.6ns ±15%  24.1ns ± 3%  -37.47%  (p=0.000 n=10+10)

name       old time/op  new time/op  delta
Swap-16    46.9ns ±32%  12.5ns ± 6%  -73.40%  (p=0.000 n=10+10)
Swap64-16  53.4ns ± 1%  12.5ns ± 6%  -76.56%  (p=0.000 n=10+10)

name            old time/op  new time/op  delta
Or8-16          8.81ns ± 0%  5.61ns ± 0%  -36.32%  (p=0.000 n=10+10)
Or-16           7.21ns ± 0%  5.61ns ± 0%  -22.19%  (p=0.000 n=10+10)
Or8Parallel-16  59.8ns ± 3%  12.5ns ± 2%  -79.10%  (p=0.000 n=10+10)
OrParallel-16   51.7ns ± 3%  12.5ns ± 2%  -75.84%  (p=0.000 n=10+10)

name             old time/op  new time/op  delta
And8-16          8.81ns ± 0%  5.61ns ± 0%  -36.32%  (p=0.000 n=10+10)
And-16           7.21ns ± 0%  5.61ns ± 0%  -22.19%  (p=0.000 n=10+10)
And8Parallel-16  59.1ns ± 6%  12.8ns ± 3%  -78.33%  (p=0.000 n=10+10)
AndParallel-16   51.4ns ± 7%  12.8ns ± 3%  -75.03%  (p=0.000 n=10+10)

Performance on an ARMv8.0 machine (no atomics instructions):
name                 old time/op  new time/op  delta
CompareAndSwap-16    61.3ns ± 0%  62.4ns ± 0%  +1.70%  (p=0.000 n=8+9)
CompareAndSwap64-16  62.0ns ± 3%  61.3ns ± 2%    ~     (p=0.093 n=10+10)

name       old time/op  new time/op  delta
Swap-16     127ns ± 2%   131ns ± 2%  +2.91%  (p=0.001 n=10+10)
Swap64-16   128ns ± 1%   131ns ± 2%  +2.43%  (p=0.001 n=10+10)

name            old time/op  new time/op  delta
Or8-16          14.9ns ± 0%  15.3ns ± 0%  +2.68%  (p=0.000 n=10+10)
Or-16           11.8ns ± 0%  12.3ns ± 0%  +4.24%  (p=0.000 n=10+10)
Or8Parallel-16   137ns ± 1%   144ns ± 1%  +4.97%  (p=0.000 n=10+10)
OrParallel-16    128ns ± 1%   136ns ± 1%  +6.34%  (p=0.000 n=10+10)

name             old time/op  new time/op  delta
And8-16          14.9ns ± 0%  15.3ns ± 0%  +2.68%  (p=0.000 n=10+10)
And-16           11.8ns ± 0%  12.3ns ± 0%  +4.24%  (p=0.000 n=10+10)
And8Parallel-16   134ns ± 2%   141ns ± 1%  +5.29%  (p=0.000 n=10+10)
AndParallel-16    125ns ± 2%   134ns ± 1%  +7.10%  (p=0.000 n=10+10)

Fixes #39304

Change-Id: Idaca68701d4751650be6b4bedca3d57f51571712
Reviewed-on: https://go-review.googlesource.com/c/go/+/234217
Run-TryBot: Emmanuel Odeke <emmanuel@orijtech.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Trust: fannie zhang <Fannie.Zhang@arm.com>
2020-11-05 23:21:33 +00:00
David Chase
f2ee58b6bb cmd/compile: using new calls, optimize runtime.memequal(x,constant,1)
Proof of concept; also an actual optimization that fires 180 times
in the Go source base.

Change-Id: I5cb87474be764264cde6e4cbcb471ef109306f08
Reviewed-on: https://go-review.googlesource.com/c/go/+/248404
Trust: David Chase <drchase@google.com>
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-11-02 05:29:59 +00:00
Michael Pratt
44dbeaf356 cmd/compile: intrinsify runtime/internal/atomic.{And,Or} on AMD64
These are identical to And8 and Or8, just using ANDL/ORL instead of
ANDB/ORB.

Change-Id: I99cf90a8b0b5f211fb23325dddd55821875f0c8f
Reviewed-on: https://go-review.googlesource.com/c/go/+/263140
Run-TryBot: Michael Pratt <mpratt@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Trust: Michael Pratt <mpratt@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-10-23 14:18:13 +00:00
Paul E. Murphy
15ead857db cmd/compiler,cmd/go,sync: add internal {LoadAcq,StoreRel}64 on ppc64
Add an internal atomic intrinsic for load with acquire semantics
(extending LoadAcq to 64b) and add LoadAcquintptr for internal
use within the sync package.  For other arches, this remaps to the
appropriate atomic.Load{,64} intrinsic which should not alter code
generation.

Similarly, add StoreRel{uintptr,64} for consistency, and inline.

Finally, add an exception to allow sync to directly use the
runtime/internal/atomic package which avoids more convoluted
workarounds (contributed by Lynn Boger).

In an extreme example, sync.(*Pool).pin consumes 20% of wall time
during fmt tests.  This is reduced to 5% on ppc64le/power9.

From the fmt benchmarks on ppc64le:

name                           old time/op  new time/op  delta
SprintfPadding                  468ns ± 0%   451ns ± 0%   -3.63%
SprintfEmpty                   73.3ns ± 0%  51.9ns ± 0%  -29.20%
SprintfString                   135ns ± 0%   122ns ± 0%   -9.63%
SprintfTruncateString           232ns ± 0%   214ns ± 0%   -7.76%
SprintfTruncateBytes            216ns ± 0%   202ns ± 0%   -6.48%
SprintfSlowParsingPath          162ns ± 0%   142ns ± 0%  -12.35%
SprintfQuoteString             1.00µs ± 0%  0.99µs ± 0%   -1.39%
SprintfInt                      117ns ± 0%   104ns ± 0%  -11.11%
SprintfIntInt                   190ns ± 0%   175ns ± 0%   -7.89%
SprintfPrefixedInt              232ns ± 0%   212ns ± 0%   -8.62%
SprintfFloat                    270ns ± 0%   255ns ± 0%   -5.56%
SprintfComplex                 1.01µs ± 0%  0.99µs ± 0%   -1.68%
SprintfBoolean                  127ns ± 0%   111ns ± 0%  -12.60%
SprintfHexString                220ns ± 0%   198ns ± 0%  -10.00%
SprintfHexBytes                 261ns ± 0%   252ns ± 0%   -3.45%
SprintfBytes                    600ns ± 0%   590ns ± 0%   -1.67%
SprintfStringer                 684ns ± 0%   658ns ± 0%   -3.80%
SprintfStructure               2.57µs ± 0%  2.57µs ± 0%   -0.12%
ManyArgs                        669ns ± 0%   646ns ± 0%   -3.44%
FprintInt                       140ns ± 0%   136ns ± 0%   -2.86%
FprintfBytes                    184ns ± 0%   181ns ± 0%   -1.63%
FprintIntNoAlloc                140ns ± 0%   136ns ± 0%   -2.86%
ScanInts                        929µs ± 0%   921µs ± 0%   -0.79%
ScanRecursiveInt                122ms ± 0%   121ms ± 0%   -0.11%
ScanRecursiveIntReaderWrapper   122ms ± 0%   122ms ± 0%   -0.18%

Change-Id: I4d66780261b57b06ef600229e475462e7313f0d6
Reviewed-on: https://go-review.googlesource.com/c/go/+/253748
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Keith Randall <khr@golang.org>
Trust: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Go Bot <gobot@golang.org>
2020-10-21 14:34:44 +00:00
Michael Pratt
d4a5797b88 cmd/compile: drop SSA atomics TODO
These ops have been used for sync/atomic since golang.org/cl/28076.

Change-Id: Ic1445c073273f6b191b97018e8eb6f7ad9c48922
Reviewed-on: https://go-review.googlesource.com/c/go/+/261077
Trust: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Pratt <mpratt@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2020-10-09 18:49:12 +00:00
David Chase
adef4deeb8 cmd/compile: enable late expansion for interface calls
Includes a few tweaks to Value.copyOf(a) (make it a no-op for
a self-copy) and new pattern hack "___" (3 underscores) is
like ellipsis, except the replacement doesn't need to have
matching ellipsis/underscores.

Moved the arg-length check in generated pattern-matching code
BEFORE the args are probed, because not all instances of
variable length OpFoo will have all the args mentioned in
some rule for OpFoo, and when that happens, the compiler
panics without the early check.

Change-Id: I66de40672b3794a6427890ff96c805a488d783f4
Reviewed-on: https://go-review.googlesource.com/c/go/+/247537
Trust: David Chase <drchase@google.com>
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-10-01 17:47:21 +00:00
David Chase
35e413c537 cmd/compile: add new ops for experiment with late call expansion
Added Dereference, StaticLECall, SelectN, SelectNAddr

Change-Id: I5426ae488e83956539aa07f7415b8acc26c933e6
Reviewed-on: https://go-review.googlesource.com/c/go/+/239082
Trust: David Chase <drchase@google.com>
Run-TryBot: David Chase <drchase@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
2020-09-17 19:18:10 +00:00
David Chase
3c85e995ef cmd/compile: extend ssa.AuxCall to closure and interface calls
Also introduce helper methods.

Change-Id: I11a744ed002bae0ca9ebabba3206e1c14147e03d
Reviewed-on: https://go-review.googlesource.com/c/go/+/239080
Trust: David Chase <drchase@google.com>
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-09-16 20:58:14 +00:00
David Chase
b4ef49e527 cmd/compile: introduce special ssa Aux type for calls
This is prerequisite to moving call expansion later into SSA,
and probably a good idea anyway.  Passes tests.

This is the first minimal CL that does a 1-for-1 substitution
of *ssa.AuxCall for *obj.LSym.  Next step (next CL) is to make
this change for all calls so that additional information can
be stored in AuxCall.

Change-Id: Ia3a7715648fd9fb1a176850767a726e6f5b959eb
Reviewed-on: https://go-review.googlesource.com/c/go/+/237680
Trust: David Chase <drchase@google.com>
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-09-16 20:57:24 +00:00
Austin Clements
2bad2f7eba cmd/compile: mark PanicBounds/Extend as calls
PanicBounds and PanicExtend are lowered to runtime calls (with a
non-Go ABI), but are not currently marked as calls. Since liveness
analysis only emits stack maps at calls in the runtime, this means
these panic call sites in the runtime won't get a stack map. These
almost immediately turn into throws in the runtime, but there's still
a chance they'll try to grow the stack first, which would lead to a
different panic.

To fix this, mark these operations as calls.

Outside the runtime, we currently emit stack maps for everything that
isn't an unsafe-point, so these panic calls get stack maps by default.
However, we're about to move to emitting stack maps only at call
sites, at which point this will start to matter outside the runtime as
well.

I confirmed that this has no effect on anything but PCDATA/FUNCDATA in
runtime and net/http.

For #36365.

Change-Id: Ic5bb463fd152cc320c815dc04cf62005261ae169
Reviewed-on: https://go-review.googlesource.com/c/go/+/230539
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-04-29 21:29:14 +00:00
Josh Bleecher Snyder
6b5ab20b65 cmd/compile: convert devirtualization rule to typed aux
Passes toolstash-check.

Change-Id: Iebcdc35f1a2112d5384c70eb3fdbd92ebb3d248e
Reviewed-on: https://go-review.googlesource.com/c/go/+/229689
Reviewed-by: Keith Randall <khr@golang.org>
2020-04-24 23:12:23 +00:00
Michael Munday
bfd569fcb0 cmd/compile: delete the floating point Greater and Geq ops
Extend CL 220417 (which removed the integer Greater and Geq ops) to
floating point comparisons. Greater and Geq can always be
implemented using Less and Leq.

Fixes #37316.

Change-Id: Ieaddb4877dd0ff9037a1dd11d0a9a9e45ced71e7
Reviewed-on: https://go-review.googlesource.com/c/go/+/222397
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2020-04-07 19:55:05 +00:00
Josh Bleecher Snyder
fff7509d47 cmd/compile: add intrinsic HasCPUFeature for checking cpu features
Before using some CPU instructions, we must check for their presence.
We use global variables in the runtime package to record features.

Prior to this CL, we issued a regular memory load for these features.
The downside to this is that, because it is a regular memory load,
it cannot be hoisted out of loops or otherwise reordered with other loads.

This CL introduces a new intrinsic just for checking cpu features.
It still ends up resulting in a memory load, but that memory load can
now be floated to the entry block and rematerialized as needed.

One downside is that the regular load could be combined with the comparison
into a CMPBconstload+NE. This new intrinsic cannot; it generates MOVB+TESTB+NE.
(It is possible that MOVBQZX+TESTQ+NE would be better.)

This CL does only amd64. It is easy to extend to other architectures.

For the benchmark in #36196, on my machine, this offers a mild speedup.

name      old time/op  new time/op  delta
FMA-8     1.39ns ± 6%  1.29ns ± 9%  -7.19%  (p=0.000 n=97+96)
NonFMA-8  2.03ns ±11%  2.04ns ±12%    ~     (p=0.618 n=99+98)

Updates #15808
Updates #36196

Change-Id: I75e2fcfcf5a6df1bdb80657a7143bed69fca6deb
Reviewed-on: https://go-review.googlesource.com/c/go/+/212360
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Giovanni Bajo <rasky@develer.com>
2020-04-04 01:01:04 +00:00
Russ Cox
877ef86bec cmd/compile: add spectre mitigation mode enabled by -spectre
This commit adds a new cmd/compile flag -spectre,
which accepts a comma-separated list of possible
Spectre mitigations to apply, or the empty string (none),
or "all". The only known mitigation right now is "index",
which uses conditional moves to ensure that x86-64 CPUs
do not speculate past index bounds checks.

Speculating past index bounds checks may be problematic
on systems running privileged servers that accept requests
from untrusted users who can execute their own programs
on the same machine. (And some more constraints that
make it even more unlikely in practice.)

The cases this protects against are analogous to the ones
Microsoft explains in the "Array out of bounds load/store feeding ..."
sections here:
https://docs.microsoft.com/en-us/cpp/security/developer-guidance-speculative-execution?view=vs-2019#array-out-of-bounds-load-feeding-an-indirect-branch

Change-Id: Ib7532d7e12466b17e04c4e2075c2a456dc98f610
Reviewed-on: https://go-review.googlesource.com/c/go/+/222660
Reviewed-by: Keith Randall <khr@golang.org>
2020-03-13 19:05:46 +00:00
Keith Randall
cd9fd640db cmd/compile: don't allow NaNs in floating-point constant ops
Trying this CL again, with a fixed test that allows platforms
to disagree on the exact behavior of converting NaNs.

We store 32-bit floating point constants in a 64-bit field, by
converting that 32-bit float to 64-bit float to store it, and convert
it back to use it.

That works for *almost* all floating-point constants. The exception is
signaling NaNs. The round trip described above means we can't represent
a 32-bit signaling NaN, because conversions strip the signaling bit.

To fix this issue, just forbid NaNs as floating-point constants in SSA
form. This shouldn't affect any real-world code, as people seldom
constant-propagate NaNs (except in test code).

Additionally, NaNs are somewhat underspecified (which of the many NaNs
do you get when dividing 0/0?), so when cross-compiling there's a
danger of using the compiler machine's NaN regime for some math, and
the target machine's NaN regime for other math. Better to use the
target machine's NaN regime always.

Update #36400

Change-Id: Idf203b688a15abceabbd66ba290d4e9f63619ecb
Reviewed-on: https://go-review.googlesource.com/c/go/+/221790
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2020-03-04 04:49:54 +00:00
Josh Bleecher Snyder
d889f0cb10 cmd/compile: use correct types in phiopt
We try to preserve type correctness of generic ops.
phiopt modified a bool to be an int without a conversion.
Add a conversion. There are a few random fluctations in the
generated code as a result, but nothing noteworthy or systematic.

no binary size changes

file                        before   after    Δ       %       
math.s                      35966    35961    -5      -0.014% 
debug/dwarf.s               108141   108147   +6      +0.006% 
crypto/dsa.s                6047     6044     -3      -0.050% 
image/png.s                 42882    42885    +3      +0.007% 
go/parser.s                 80281    80278    -3      -0.004% 
cmd/internal/obj.s          115116   115113   -3      -0.003% 
go/types.s                  322130   322118   -12     -0.004% 
cmd/internal/obj/arm64.s    151679   151685   +6      +0.004% 
go/internal/gccgoimporter.s 56487    56493    +6      +0.011% 
cmd/test2json.s             1650     1647     -3      -0.182% 
cmd/link/internal/loadelf.s 35442    35443    +1      +0.003% 
cmd/go/internal/work.s      305039   305035   -4      -0.001% 
cmd/link/internal/ld.s      544835   544834   -1      -0.000% 
net/http.s                  558777   558774   -3      -0.001% 
cmd/compile/internal/ssa.s  3926551  3926994  +443    +0.011% 
cmd/compile/internal/gc.s   1552320  1552321  +1      +0.000% 
total                       18862241 18862670 +429    +0.002% 


Change-Id: I4289e773be6be534ea3f907d68f614441b8f9b46
Reviewed-on: https://go-review.googlesource.com/c/go/+/221607
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Keith Randall <khr@golang.org>
2020-02-29 17:02:29 +00:00
Josh Bleecher Snyder
4ae1879dda cmd/compile: document Move's type
Fixes #37381

Change-Id: I8abf07d6342c10fc8d52e11c6a70fb0ec09220d2
Reviewed-on: https://go-review.googlesource.com/c/go/+/220683
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-02-27 20:32:29 +00:00
Michael Munday
cb74dcc172 cmd/compile: remove Greater* and Geq* generic integer ops
The generic Greater and Geq ops can always be replaced with the Less and
Leq ops. This CL therefore removes them. This simplifies the compiler since
it reduces the number of operations that need handling in both code and in
rewrite rules. This will be especially true when adding control flow
optimizations such as the integer-in-range optimizations in CL 165998.

Change-Id: If0648b2b19998ac1bddccbf251283f3be4ec3040
Reviewed-on: https://go-review.googlesource.com/c/go/+/220417
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2020-02-26 13:11:53 +00:00
Bryan C. Mills
a9f1ea4a83 Revert "cmd/compile: don't allow NaNs in floating-point constant ops"
This reverts CL 213477.

Reason for revert: tests are failing on linux-mips*-rtrk builders.

Change-Id: I8168f7450890233f1bd7e53930b73693c26d4dc0
Reviewed-on: https://go-review.googlesource.com/c/go/+/220897
Run-TryBot: Bryan C. Mills <bcmills@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2020-02-25 15:49:19 +00:00
Keith Randall
2aa7c6c548 cmd/compile: don't allow NaNs in floating-point constant ops
We store 32-bit floating point constants in a 64-bit field, by
converting that 32-bit float to 64-bit float to store it, and convert
it back to use it.

That works for *almost* all floating-point constants. The exception is
signaling NaNs. The round trip described above means we can't represent
a 32-bit signaling NaN, because conversions strip the signaling bit.

To fix this issue, just forbid NaNs as floating-point constants in SSA
form. This shouldn't affect any real-world code, as people seldom
constant-propagate NaNs (except in test code).

Additionally, NaNs are somewhat underspecified (which of the many NaNs
do you get when dividing 0/0?), so when cross-compiling there's a
danger of using the compiler machine's NaN regime for some math, and
the target machine's NaN regime for other math. Better to use the
target machine's NaN regime always.

This has been a bug since 1.10, and there's an easy workaround
(declare a global varaible containing the signaling NaN pattern, and
use that as the argument to math.Float32frombits) so we'll fix it in
1.15.

Fixes #36400
Update #36399

Change-Id: Icf155e743281560eda2eed953d19a829552ccfda
Reviewed-on: https://go-review.googlesource.com/c/go/+/213477
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2020-02-25 02:21:53 +00:00
Josh Bleecher Snyder
53dede938b cmd/compile: fix typo in comment
Change-Id: I25fbd63f10ea9892589ad44cc45761926aff0648
Reviewed-on: https://go-review.googlesource.com/c/go/+/212841
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2019-12-30 19:26:27 +00:00
Russ Cox
543c6d2e0d math, cmd/compile: rename Fma to FMA
This API was added for #25819, where it was discussed as math.FMA.
The commit adding it used math.Fma, presumably for consistency
with the rest of the unusual names in package math
(Sincos, Acosh, Erfcinv, Float32bits, etc).

I believe that using an idiomatic Go name is more important here
than consistency with these other names, most of which are historical
baggage from C's standard library.

Early additions like Float32frombits happened before "uppercase for export"
(so they were originally like "float32frombits") and they were not properly
reconsidered when we uppercased the symbols to export them.
That's a mistake we live with.

The names of functions we have added since then, and even a few
that were legacy, are more properly Go-cased, such as IsNaN, IsInf,
and RoundToEven, rather than Isnan, Isinf, and Roundtoeven.
And also constants like MaxFloat32.

For new API, we should keep using proper Go-cased symbols
instead of minimally-upper-cased-C symbols.

So math.FMA, not math.Fma.

This API has not yet been released, so this change does not break
the compatibility promise.

This CL also modifies cmd/compile, since the compiler knows
the name of the function. I could have stopped at changing the
string constants, but it seemed to make more sense to use a
consistent casing everywhere.

Change-Id: I0f6f3407f41e99bfa8239467345c33945088896e
Reviewed-on: https://go-review.googlesource.com/c/go/+/205317
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2019-11-07 14:51:06 +00:00
Austin Clements
97592b3c14 cmd/compile: intrinsics for runtime/internal/atomic.Store8
For #10958, #24543, but makes sense on its own.

Change-Id: I2a87dab66b82a1863e4b6512b1f8def51463ce2a
Reviewed-on: https://go-review.googlesource.com/c/go/+/203284
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2019-10-29 03:18:55 +00:00
Josh Bleecher Snyder
e071cc4873 cmd/compile: update comment about OpPhi argLength
We use -1 for Phi argLength, not "a really large number".

Change-Id: I5d4fa364c22c3cf40213f2368a9a20de7e989aac
Reviewed-on: https://go-review.googlesource.com/c/go/+/203883
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2019-10-28 19:32:27 +00:00
smasher164
33425ab8db cmd/compile: introduce generic ssa intrinsic for fused-multiply-add
In order to make math.FMA a compiler intrinsic for ISAs like ARM64,
PPC64[le], and S390X, a generic 3-argument opcode "Fma" is provided and
rewritten as

    ARM64: (Fma x y z) -> (FMADDD z x y)
    PPC64: (Fma x y z) -> (FMADD x y z)
    S390X: (Fma x y z) -> (FMADD z x y)

Updates #25819.

Change-Id: Ie5bc628311e6feeb28ddf9adaa6e702c8c291efa
Reviewed-on: https://go-review.googlesource.com/c/go/+/131959
Run-TryBot: Akhil Indurti <aindurti@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2019-10-21 16:24:15 +00:00
Michael Munday
9c2e7e8bed cmd/compile: allow multiple SSA block control values
Control values are used to choose which successor of a block is
jumped to. Typically a control value takes the form of a 'flags'
value that represents the result of a comparison. Some
architectures however use a variable in a register as a control
value.

Up until now we have managed with a single control value per block.
However some architectures (e.g. s390x and riscv64) have combined
compare-and-branch instructions that take two variables in registers
as parameters. To generate these instructions we need to support 2
control values per block.

This CL allows up to 2 control values to be used in a block in
order to support the addition of compare-and-branch instructions.
I have implemented s390x compare-and-branch instructions in a
different CL.

Passes toolstash-check -all.

Results of compilebench:

name                      old time/op       new time/op       delta
Template                        208ms ± 1%        209ms ± 1%    ~     (p=0.289 n=20+20)
Unicode                        83.7ms ± 1%       83.3ms ± 3%  -0.49%  (p=0.017 n=18+18)
GoTypes                         748ms ± 1%        748ms ± 0%    ~     (p=0.460 n=20+18)
Compiler                        3.47s ± 1%        3.48s ± 1%    ~     (p=0.070 n=19+18)
SSA                             11.5s ± 1%        11.7s ± 1%  +1.64%  (p=0.000 n=19+18)
Flate                           130ms ± 1%        130ms ± 1%    ~     (p=0.588 n=19+20)
GoParser                        160ms ± 1%        161ms ± 1%    ~     (p=0.211 n=20+20)
Reflect                         465ms ± 1%        467ms ± 1%  +0.42%  (p=0.007 n=20+20)
Tar                             184ms ± 1%        185ms ± 2%    ~     (p=0.087 n=18+20)
XML                             253ms ± 1%        253ms ± 1%    ~     (p=0.377 n=20+18)
LinkCompiler                    769ms ± 2%        774ms ± 2%    ~     (p=0.070 n=19+19)
ExternalLinkCompiler            3.59s ±11%        3.68s ± 6%    ~     (p=0.072 n=20+20)
LinkWithoutDebugCompiler        446ms ± 5%        454ms ± 3%  +1.79%  (p=0.002 n=19+20)
StdCmd                          26.0s ± 2%        26.0s ± 2%    ~     (p=0.799 n=20+20)

name                      old user-time/op  new user-time/op  delta
Template                        238ms ± 5%        240ms ± 5%    ~     (p=0.142 n=20+20)
Unicode                         105ms ±11%        106ms ±10%    ~     (p=0.512 n=20+20)
GoTypes                         876ms ± 2%        873ms ± 4%    ~     (p=0.647 n=20+19)
Compiler                        4.17s ± 2%        4.19s ± 1%    ~     (p=0.093 n=20+18)
SSA                             13.9s ± 1%        14.1s ± 1%  +1.45%  (p=0.000 n=18+18)
Flate                           145ms ±13%        146ms ± 5%    ~     (p=0.851 n=20+18)
GoParser                        185ms ± 5%        188ms ± 7%    ~     (p=0.174 n=20+20)
Reflect                         534ms ± 3%        538ms ± 2%    ~     (p=0.105 n=20+18)
Tar                             215ms ± 4%        211ms ± 9%    ~     (p=0.079 n=19+20)
XML                             295ms ± 6%        295ms ± 5%    ~     (p=0.968 n=20+20)
LinkCompiler                    832ms ± 4%        837ms ± 7%    ~     (p=0.707 n=17+20)
ExternalLinkCompiler            1.58s ± 8%        1.60s ± 4%    ~     (p=0.296 n=20+19)
LinkWithoutDebugCompiler        478ms ±12%        489ms ±10%    ~     (p=0.429 n=20+20)

name                      old object-bytes  new object-bytes  delta
Template                        559kB ± 0%        559kB ± 0%    ~     (all equal)
Unicode                         216kB ± 0%        216kB ± 0%    ~     (all equal)
GoTypes                        2.03MB ± 0%       2.03MB ± 0%    ~     (all equal)
Compiler                       8.07MB ± 0%       8.07MB ± 0%  -0.06%  (p=0.000 n=20+20)
SSA                            27.1MB ± 0%       27.3MB ± 0%  +0.89%  (p=0.000 n=20+20)
Flate                           343kB ± 0%        343kB ± 0%    ~     (all equal)
GoParser                        441kB ± 0%        441kB ± 0%    ~     (all equal)
Reflect                        1.36MB ± 0%       1.36MB ± 0%    ~     (all equal)
Tar                             487kB ± 0%        487kB ± 0%    ~     (all equal)
XML                             632kB ± 0%        632kB ± 0%    ~     (all equal)

name                      old export-bytes  new export-bytes  delta
Template                       18.5kB ± 0%       18.5kB ± 0%    ~     (all equal)
Unicode                        7.92kB ± 0%       7.92kB ± 0%    ~     (all equal)
GoTypes                        35.0kB ± 0%       35.0kB ± 0%    ~     (all equal)
Compiler                        109kB ± 0%        110kB ± 0%  +0.72%  (p=0.000 n=20+20)
SSA                             137kB ± 0%        138kB ± 0%  +0.58%  (p=0.000 n=20+20)
Flate                          4.89kB ± 0%       4.89kB ± 0%    ~     (all equal)
GoParser                       8.49kB ± 0%       8.49kB ± 0%    ~     (all equal)
Reflect                        11.4kB ± 0%       11.4kB ± 0%    ~     (all equal)
Tar                            10.5kB ± 0%       10.5kB ± 0%    ~     (all equal)
XML                            16.7kB ± 0%       16.7kB ± 0%    ~     (all equal)

name                      old text-bytes    new text-bytes    delta
HelloSize                       761kB ± 0%        761kB ± 0%    ~     (all equal)
CmdGoSize                      10.8MB ± 0%       10.8MB ± 0%    ~     (all equal)

name                      old data-bytes    new data-bytes    delta
HelloSize                      10.7kB ± 0%       10.7kB ± 0%    ~     (all equal)
CmdGoSize                       312kB ± 0%        312kB ± 0%    ~     (all equal)

name                      old bss-bytes     new bss-bytes     delta
HelloSize                       122kB ± 0%        122kB ± 0%    ~     (all equal)
CmdGoSize                       146kB ± 0%        146kB ± 0%    ~     (all equal)

name                      old exe-bytes     new exe-bytes     delta
HelloSize                      1.13MB ± 0%       1.13MB ± 0%    ~     (all equal)
CmdGoSize                      15.1MB ± 0%       15.1MB ± 0%    ~     (all equal)

Change-Id: I3cc2f9829a109543d9a68be4a21775d2d3e9801f
Reviewed-on: https://go-review.googlesource.com/c/go/+/196557
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
Reviewed-by: Keith Randall <khr@golang.org>
2019-10-02 09:56:36 +00:00
Ainar Garipov
0efbd10157 all: fix typos
Use the following (suboptimal) script to obtain a list of possible
typos:

  #!/usr/bin/env sh

  set -x

  git ls-files |\
    grep -e '\.\(c\|cc\|go\)$' |\
    xargs -n 1\
    awk\
    '/\/\// { gsub(/.*\/\//, ""); print; } /\/\*/, /\*\// { gsub(/.*\/\*/, ""); gsub(/\*\/.*/, ""); }' |\
    hunspell -d en_US -l |\
    grep '^[[:upper:]]\{0,1\}[[:lower:]]\{1,\}$' |\
    grep -v -e '^.\{1,4\}$' -e '^.\{16,\}$' |\
    sort -f |\
    uniq -c |\
    awk '$1 == 1 { print $2; }'

Then, go through the results manually and fix the most obvious typos in
the non-vendored code.

Change-Id: I3cb5830a176850e1a0584b8a40b47bde7b260eae
Reviewed-on: https://go-review.googlesource.com/c/go/+/193848
Reviewed-by: Robert Griesemer <gri@golang.org>
2019-09-08 17:28:20 +00:00
Austin Clements
4a4e05b0b1 cmd/compile,runtime/internal/atomic: add Load8
Change-Id: Id52a5730cf9207ee7ccebac4ef12791dc5720e7c
Reviewed-on: https://go-review.googlesource.com/c/go/+/172283
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2019-05-03 19:25:37 +00:00
Keith Randall
2c423f063b cmd/compile,runtime: provide index information on bounds check failure
A few examples (for accessing a slice of length 3):

   s[-1]    runtime error: index out of range [-1]
   s[3]     runtime error: index out of range [3] with length 3
   s[-1:0]  runtime error: slice bounds out of range [-1:]
   s[3:0]   runtime error: slice bounds out of range [3:0]
   s[3:-1]  runtime error: slice bounds out of range [:-1]
   s[3:4]   runtime error: slice bounds out of range [:4] with capacity 3
   s[0:3:4] runtime error: slice bounds out of range [::4] with capacity 3

Note that in cases where there are multiple things wrong with the
indexes (e.g. s[3:-1]), we report one of those errors kind of
arbitrarily, currently the rightmost one.

An exhaustive set of examples is in issue30116[u].out in the CL.

The message text has the same prefix as the old message text. That
leads to slightly awkward phrasing but hopefully minimizes the chance
that code depending on the error text will break.

Increases the size of the go binary by 0.5% (amd64). The panic functions
take arguments in registers in order to keep the size of the compiled code
as small as possible.

Fixes #30116

Change-Id: Idb99a827b7888822ca34c240eca87b7e44a04fdd
Reviewed-on: https://go-review.googlesource.com/c/go/+/161477
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2019-03-18 17:33:38 +00:00
Keith Randall
69c2c56453 cmd/compile,runtime: redo mid-stack inlining tracebacks
Work involved in getting a stack trace is divided between
runtime.Callers and runtime.CallersFrames.

Before this CL, runtime.Callers returns a pc per runtime frame.
runtime.CallersFrames is responsible for expanding a runtime frame
into potentially multiple user frames.

After this CL, runtime.Callers returns a pc per user frame.
runtime.CallersFrames just maps those to user frame info.

Entries in the result of runtime.Callers are now pcs
of the calls (or of the inline marks), not of the instruction
just after the call.

Fixes #29007
Fixes #28640
Update #26320

Change-Id: I1c9567596ff73dc73271311005097a9188c3406f
Reviewed-on: https://go-review.googlesource.com/c/152537
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2018-12-28 20:55:36 +00:00
Brad Fitzpatrick
3813edf26e all: use "reports whether" consistently in the few places that didn't
Go documentation style for boolean funcs is to say:

    // Foo reports whether ...
    func Foo() bool

(rather than "returns true if")

This CL also replaces 4 uses of "iff" with the same "reports whether"
wording, which doesn't lose any meaning, and will prevent people from
sending typo fixes when they don't realize it's "if and only if". In
the past I think we've had the typo CLs updated to just say "reports
whether". So do them all at once.

(Inspired by the addition of another "returns true if" in CL 146938
in fd_plan9.go)

Created with:

$ perl -i -npe 's/returns true if/reports whether/' $(git grep -l "returns true iff" | grep -v vendor)
$ perl -i -npe 's/returns true if/reports whether/' $(git grep -l "returns true if" | grep -v vendor)

Change-Id: Ided502237f5ab0d25cb625dbab12529c361a8b9f
Reviewed-on: https://go-review.googlesource.com/c/147037
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-11-02 22:47:58 +00:00
Keith Randall
dd789550a7 cmd/compile: intrinsify math/bits.Sub on amd64
name             old time/op  new time/op  delta
Sub-8            1.12ns ± 1%  1.17ns ± 1%   +5.20%          (p=0.008 n=5+5)
Sub32-8          1.11ns ± 0%  1.11ns ± 0%     ~     (all samples are equal)
Sub64-8          1.12ns ± 0%  1.18ns ± 1%   +5.00%          (p=0.016 n=4+5)
Sub64multiple-8  4.10ns ± 1%  0.86ns ± 1%  -78.93%          (p=0.008 n=5+5)

Fixes #28273

Change-Id: Ibcb6f2fd32d987c3bcbae4f4cd9d335a3de98548
Reviewed-on: https://go-review.googlesource.com/c/144258
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-10-25 19:47:27 +00:00
Keith Randall
899f3a2892 cmd/compile: intrinsify math/bits.Add on amd64
name             old time/op  new time/op  delta
Add-8            1.11ns ± 0%  1.18ns ± 0%   +6.31%  (p=0.029 n=4+4)
Add32-8          1.02ns ± 0%  1.02ns ± 1%     ~     (p=0.333 n=4+5)
Add64-8          1.11ns ± 1%  1.17ns ± 0%   +5.79%  (p=0.008 n=5+5)
Add64multiple-8  4.35ns ± 1%  0.86ns ± 0%  -80.22%  (p=0.000 n=5+4)

The individual ops are a bit slower (but still very fast).
Using the ops in carry chains is very fast.

Update #28273

Change-Id: Id975f76df2b930abf0e412911d327b6c5b1befe5
Reviewed-on: https://go-review.googlesource.com/c/144257
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-10-25 19:47:00 +00:00
Carlos Eduardo Seo
5c472132bf cmd/compile, runtime: add new lightweight atomics for ppc64x
This change creates the infrastructure for new lightweight atomics
primitives in runtime/internal/atomic:

- LoadAcq, for load-acquire
- StoreRel, for store-release
- CasRel, for Compare-and-Swap-release

and implements them for ppc64x. There is visible performance improvement
in producer-consumer scenarios, like BenchmarkChanProdCons*:

benchmark                           old ns/op     new ns/op     delta
BenchmarkChanProdCons0-48           2034          2034          +0.00%
BenchmarkChanProdCons10-48          1798          1608          -10.57%
BenchmarkChanProdCons100-48         1596          1585          -0.69%
BenchmarkChanProdConsWork0-48       2084          2046          -1.82%
BenchmarkChanProdConsWork10-48      1829          1668          -8.80%
BenchmarkChanProdConsWork100-48     1650          1650          +0.00%

Fixes #21348

Change-Id: I1f6ce377e4a0fe4bd7f5f775e8036f50070ad8db
Reviewed-on: https://go-review.googlesource.com/c/142277
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2018-10-23 18:10:38 +00:00
ChrisALiles
13d5cd7847 cmd/compile: use proved bounds to remove signed division fix-ups
prove is able to find 94 occurrences in std cmd where a divisor
can't have the value -1. The change removes
the extraneous fix-up code for these cases.

Fixes #25239

Change-Id: Ic184de971f47cc57c702eb72805b8e291c14035d
Reviewed-on: https://go-review.googlesource.com/c/130215
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-10-23 02:29:44 +00:00
Martin Möhrmann
a1ca4893ff cmd/compile: add intrinsics for runtime/internal/math on 386 and amd64
Add generic, 386 and amd64 specific ops and SSA rules for multiplication
with overflow and branching based on overflow flags. Use these to intrinsify
runtime/internal/math.MulUinptr.

On amd64
  mul, overflow := math.MulUintptr(a, b)
  if overflow {
is lowered to two instructions:
  MULQ SI
  JO 0x10ee35c

No codegen tests as codegen can not currently test unexported internal runtime
functions.

amd64:
name              old time/op  new time/op  delta
MulUintptr/small  1.16ns ± 5%  0.88ns ± 6%  -24.36%  (p=0.000 n=19+20)
MulUintptr/large  10.7ns ± 1%   1.1ns ± 1%  -89.28%  (p=0.000 n=17+19)

Change-Id: If60739a86f820e5044d677276c21df90d3c7a86a
Reviewed-on: https://go-review.googlesource.com/c/141820
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-10-15 19:04:09 +00:00
Keith Randall
9a8372f8bd cmd/compile,runtime: remove ambiguously live logic
The previous CL introduced stack objects. This CL removes the old
ambiguously live liveness analysis. After this CL we're relying
on stack objects exclusively.

Update a bunch of liveness tests to reflect the new world.

Fixes #22350

Change-Id: I739b26e015882231011ce6bc1a7f426049e59f31
Reviewed-on: https://go-review.googlesource.com/c/134156
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-10-03 19:54:16 +00:00
Andrew Bonventre
5ac2476748 cmd/compile: make math/bits.RotateLeft* an intrinsic on amd64
Previously, pattern matching was good enough to achieve good performance
for the RotateLeft* functions, but the inlining cost for them was much
too high. Make RotateLeft* intrinsic on amd64 as a stop-gap for now to
reduce inlining costs.

This should be done (or at least looked at) for other architectures
as well.

Updates golang/go#17566

Change-Id: I6a106ff00b6c4e3f490650af3e083ed2be00c819
Reviewed-on: https://go-review.googlesource.com/132435
Run-TryBot: Andrew Bonventre <andybons@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2018-08-30 22:48:28 +00:00
David Chase
0029cd479e cmd/compile: add LocalAddr that takes SP,mem operands
Lack of a well-defined order between VarDef and related
address operations sometimes causes problems with store order
and write barrier transformations; glitches in the order are
made irreparable (by later optimizations) if the two parts of
the glitch straddle a split in the original block caused by
insertion of a write barrier diamond.

Fix this by creating a LocalAddr for addresses of locals
(what VarDef matters for) that takes a memory input to
help make the order explicit.  Addr is modified to only
be legal for SB operand, so there is no overlap between
Addr and LocalAddr uses (there may be some downstream
cleanup from this).

Changes to generic.rules and rewrite.go ensure that codegen
tests continue to pass; CSE of LocalAddr is impaired, not
quite sure of the cost.

Fixes #26105.

Change-Id: Id4192b4440aa4e9d7ba54a465c456df9b530b515
Reviewed-on: https://go-review.googlesource.com/122483
Run-TryBot: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
2018-07-12 18:45:31 +00:00
Wei Xiao
0a7ac93c27 cmd/compile: improve atomic add intrinsics with ARMv8.1 new instruction
ARMv8.1 has added new instruction (LDADDAL) for atomic memory operations. This
CL improves existing atomic add intrinsics with the new instruction. Since the
new instruction is only guaranteed to be present after ARMv8.1, we guard its
usage with a conditional on CPU feature.

Performance result on ARMv8.1 machine:
name        old time/op  new time/op  delta
Xadd-224    1.05µs ± 6%  0.02µs ± 4%  -98.06%  (p=0.000 n=10+8)
Xadd64-224  1.05µs ± 3%  0.02µs ±13%  -98.10%  (p=0.000 n=9+10)
[Geo mean]  1.05µs       0.02µs       -98.08%

Performance result on ARMv8.0 machine:
name        old time/op  new time/op  delta
Xadd-46      538ns ± 1%   541ns ± 1%  +0.62%  (p=0.000 n=9+9)
Xadd64-46    505ns ± 1%   508ns ± 0%  +0.48%  (p=0.003 n=9+8)
[Geo mean]   521ns        524ns       +0.55%

Change-Id: If4b5d8d0e2d6f84fe1492a4f5de0789910ad0ee9
Reviewed-on: https://go-review.googlesource.com/81877
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-06-21 14:52:43 +00:00
Keith Randall
a8a050819b cmd/compile: mention the AuxInt field for bounded shifts
This comment needs updating after moving the "bounded" mark from
Aux to AuxInt.

Change-Id: I924fb22a81fffcd6944b93f0e3357c3aa2c4c49e
Reviewed-on: https://go-review.googlesource.com/111880
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-05-07 21:13:49 +00:00
Josh Bleecher Snyder
9eb4590ad2 cmd/compile: simplify shifts using bounds from prove pass
The prove pass sometimes has bounds information
that later rewrite passes do not.

Use this information to mark shifts as bounded,
and then use that information to generate better code on amd64.
It may prove to be helpful on other architectures, too.

While here, coalesce the existing shift lowering rules.

This triggers 35 times building std+cmd. The full list is below.

Here's an example from runtime.heapBitsSetType:

			if nb < 8 {
				b |= uintptr(*p) << nb
				p = add1(p)
			} else {
				nb -= 8
			}

We now generate better code on amd64 for that left shift.

Updates #25087

vendor/golang_org/x/crypto/curve25519/mont25519_amd64.go:48:20: Proved Rsh8Ux64 bounded
runtime/mbitmap.go:1252:22: Proved Lsh64x64 bounded
runtime/mbitmap.go:1265:16: Proved Lsh64x64 bounded
runtime/mbitmap.go:1275:28: Proved Lsh64x64 bounded
runtime/mbitmap.go:1645:25: Proved Lsh64x64 bounded
runtime/mbitmap.go:1663:25: Proved Lsh64x64 bounded
runtime/mbitmap.go:1808:41: Proved Lsh64x64 bounded
runtime/mbitmap.go:1831:49: Proved Lsh64x64 bounded
syscall/route_bsd.go:227:23: Proved Lsh32x64 bounded
syscall/route_bsd.go:295:23: Proved Lsh32x64 bounded
syscall/route_darwin.go:40:23: Proved Lsh32x64 bounded
compress/bzip2/bzip2.go:384:26: Proved Lsh64x16 bounded
vendor/golang_org/x/net/route/address.go:370:14: Proved Lsh64x64 bounded
compress/flate/inflate.go:201:54: Proved Lsh64x64 bounded
math/big/prime.go:50:25: Proved Lsh64x64 bounded
vendor/golang_org/x/crypto/cryptobyte/asn1.go:464:43: Proved Lsh8x8 bounded
net/ip.go:87:21: Proved Rsh8Ux64 bounded
cmd/internal/goobj/read.go:267:23: Proved Lsh64x64 bounded
cmd/vendor/golang.org/x/arch/arm64/arm64asm/decode.go:534:27: Proved Lsh32x32 bounded
cmd/vendor/golang.org/x/arch/arm64/arm64asm/decode.go:544:27: Proved Lsh32x32 bounded
cmd/internal/obj/arm/asm5.go:1044:16: Proved Lsh32x64 bounded
cmd/internal/obj/arm/asm5.go:1065:10: Proved Lsh32x32 bounded
cmd/internal/obj/mips/obj0.go:1311:21: Proved Lsh32x64 bounded
cmd/compile/internal/syntax/scanner.go:352:23: Proved Lsh64x64 bounded
go/types/expr.go:222:36: Proved Lsh64x64 bounded
crypto/x509/x509.go:1626:9: Proved Rsh8Ux64 bounded
cmd/link/internal/loadelf/ldelf.go:823:22: Proved Lsh8x64 bounded
net/http/h2_bundle.go:1470:17: Proved Lsh8x8 bounded
net/http/h2_bundle.go:1477:46: Proved Lsh8x8 bounded
net/http/h2_bundle.go:1481:31: Proved Lsh64x8 bounded
cmd/compile/internal/ssa/rewriteARM64.go:18759:17: Proved Lsh64x64 bounded
cmd/compile/internal/ssa/sparsemap.go:70:23: Proved Lsh32x64 bounded
cmd/compile/internal/ssa/sparsemap.go:73:45: Proved Lsh32x64 bounded

Change-Id: I58bb72f3e6f12f6ac69be633ea7222c245438142
Reviewed-on: https://go-review.googlesource.com/109776
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Giovanni Bajo <rasky@develer.com>
2018-04-29 15:59:45 +00:00