Keith Randall
c6118af558
cmd/compile: don't do floating point optimization x+0 -> x
...
That optimization is not valid if x == -0.
The test is a bit tricky because 0 == -0. We distinguish
0 from -0 with 1/0 == inf, 1/-0 == -inf.
This has been a bug since CL 24790 in Go 1.8. Probably doesn't
warrant a backport.
Fixes #27718
Note: the optimization x-0 -> x is actually valid.
But it's probably best to take it out, so as to not confuse readers.
Change-Id: I99f16a93b45f7406ec8053c2dc759a13eba035fa
Reviewed-on: https://go-review.googlesource.com/135701
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-09-18 20:27:09 +00:00
fanzha02
a19a83c8ef
cmd/compile: optimize math.Float64(32)bits and math.Float64(32)frombits on arm64
...
Use float <-> int register moves without conversion instead of stores
and loads to move float <-> int values.
Math package benchmark results.
name old time/op new time/op delta
Acosh 153ns ± 0% 147ns ± 0% -3.92% (p=0.000 n=10+10)
Asinh 183ns ± 0% 177ns ± 0% -3.28% (p=0.000 n=10+10)
Atanh 157ns ± 0% 155ns ± 0% -1.27% (p=0.000 n=10+10)
Atan2 118ns ± 0% 117ns ± 1% -0.59% (p=0.003 n=10+10)
Cbrt 119ns ± 0% 114ns ± 0% -4.20% (p=0.000 n=10+10)
Copysign 7.51ns ± 0% 6.51ns ± 0% -13.32% (p=0.000 n=9+10)
Cos 73.1ns ± 0% 70.6ns ± 0% -3.42% (p=0.000 n=10+10)
Cosh 119ns ± 0% 121ns ± 0% +1.68% (p=0.000 n=10+9)
ExpGo 154ns ± 0% 149ns ± 0% -3.05% (p=0.000 n=9+10)
Expm1 101ns ± 0% 99ns ± 0% -1.88% (p=0.000 n=10+10)
Exp2Go 150ns ± 0% 146ns ± 0% -2.67% (p=0.000 n=10+10)
Abs 7.01ns ± 0% 6.01ns ± 0% -14.27% (p=0.000 n=10+9)
Mod 234ns ± 0% 212ns ± 0% -9.40% (p=0.000 n=9+10)
Frexp 34.5ns ± 0% 30.0ns ± 0% -13.04% (p=0.000 n=10+10)
Gamma 112ns ± 0% 111ns ± 0% -0.89% (p=0.000 n=10+10)
Hypot 73.6ns ± 0% 68.6ns ± 0% -6.79% (p=0.000 n=10+10)
HypotGo 77.1ns ± 0% 72.1ns ± 0% -6.49% (p=0.000 n=10+10)
Ilogb 31.0ns ± 0% 28.0ns ± 0% -9.68% (p=0.000 n=10+10)
J0 437ns ± 0% 434ns ± 0% -0.62% (p=0.000 n=10+10)
J1 433ns ± 0% 431ns ± 0% -0.46% (p=0.000 n=10+10)
Jn 927ns ± 0% 922ns ± 0% -0.54% (p=0.000 n=10+10)
Ldexp 41.5ns ± 0% 37.0ns ± 0% -10.84% (p=0.000 n=9+10)
Log 124ns ± 0% 118ns ± 0% -4.84% (p=0.000 n=10+9)
Logb 34.0ns ± 0% 32.0ns ± 0% -5.88% (p=0.000 n=10+10)
Log1p 110ns ± 0% 108ns ± 0% -1.82% (p=0.000 n=10+10)
Log10 136ns ± 0% 132ns ± 0% -2.94% (p=0.000 n=10+10)
Log2 51.6ns ± 0% 47.1ns ± 0% -8.72% (p=0.000 n=10+10)
Nextafter32 33.0ns ± 0% 30.5ns ± 0% -7.58% (p=0.000 n=10+10)
Nextafter64 29.0ns ± 0% 26.5ns ± 0% -8.62% (p=0.000 n=10+10)
PowInt 169ns ± 0% 160ns ± 0% -5.33% (p=0.000 n=10+10)
PowFrac 375ns ± 0% 361ns ± 0% -3.73% (p=0.000 n=10+10)
RoundToEven 14.0ns ± 0% 12.5ns ± 0% -10.71% (p=0.000 n=10+10)
Remainder 206ns ± 0% 192ns ± 0% -6.80% (p=0.000 n=10+9)
Signbit 6.01ns ± 0% 5.51ns ± 0% -8.32% (p=0.000 n=10+9)
Sin 70.1ns ± 0% 69.6ns ± 0% -0.71% (p=0.000 n=10+10)
Sincos 99.1ns ± 0% 99.6ns ± 0% +0.50% (p=0.000 n=9+10)
SqrtGoLatency 178ns ± 0% 146ns ± 0% -17.70% (p=0.000 n=8+10)
SqrtPrime 9.19µs ± 0% 9.20µs ± 0% +0.01% (p=0.000 n=9+9)
Tanh 125ns ± 1% 127ns ± 0% +1.36% (p=0.000 n=10+10)
Y0 428ns ± 0% 426ns ± 0% -0.47% (p=0.000 n=10+10)
Y1 431ns ± 0% 429ns ± 0% -0.46% (p=0.000 n=10+9)
Yn 906ns ± 0% 901ns ± 0% -0.55% (p=0.000 n=10+10)
Float64bits 4.50ns ± 0% 3.50ns ± 0% -22.22% (p=0.000 n=10+10)
Float64frombits 4.00ns ± 0% 3.50ns ± 0% -12.50% (p=0.000 n=10+9)
Float32bits 4.50ns ± 0% 3.50ns ± 0% -22.22% (p=0.002 n=8+10)
Float32frombits 4.00ns ± 0% 3.50ns ± 0% -12.50% (p=0.000 n=10+10)
Change-Id: Iba829e15d5624962fe0c699139ea783efeefabc2
Reviewed-on: https://go-review.googlesource.com/129715
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-09-17 20:49:04 +00:00
Iskander Sharipov
859cf7fc0f
cmd/compile/internal/gc: handle array slice self-assign in esc.go
...
Instead of skipping all OSLICEARR, skip only ones with non-pointer
array type. For pointers to arrays, it's safe to apply the
self-assignment slicing optimizations.
Refactored the matching code into separate function for readability.
This is an extension to already existing optimization.
On its own, it does not improve any code under std, but
it opens some new optimization opportunities. One
of them is described in the referenced issue.
Updates #7921
Change-Id: I08ac660d3ef80eb15fd7933fb73cf53ded9333ad
Reviewed-on: https://go-review.googlesource.com/133375
Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-09-17 11:14:58 +00:00
Keith Randall
b1f656b1ce
cmd/compile: fold address calculations into CMPload[const] ops
...
Makes go binary smaller by 0.2%.
I noticed this in autogenerated equal methods, and there are
probably a lot of those.
Change-Id: I4e04eb3653fbceb9dd6a4eee97ceab1fa4d10b72
Reviewed-on: https://go-review.googlesource.com/135379
Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>
2018-09-14 19:42:09 +00:00
Lynn Boger
8dbd9afbb0
cmd/compile: improve rules for PPC64.rules
...
This adds some improvements to the rules for PPC64 to eliminate
unnecessary zero or sign extends, and fix some rule for truncates
which were not always using the correct sign instruction.
This reduces of size of many functions by 1 or 2 instructions and
can improve performance in cases where the execution time depends
on small loops where at least 1 instruction was removed and where that
loop contributes a significant amount of the total execution time.
Included is a testcase for codegen to verify the sign/zero extend
instructions are omitted.
An example of the improvement (strings):
IndexAnyASCII/256:1-16 392ns ± 0% 369ns ± 0% -5.79% (p=0.000 n=1+10)
IndexAnyASCII/256:2-16 397ns ± 0% 376ns ± 0% -5.23% (p=0.000 n=1+9)
IndexAnyASCII/256:4-16 405ns ± 0% 384ns ± 0% -5.19% (p=1.714 n=1+6)
IndexAnyASCII/256:8-16 427ns ± 0% 403ns ± 0% -5.57% (p=0.000 n=1+10)
IndexAnyASCII/256:16-16 441ns ± 0% 418ns ± 1% -5.33% (p=0.000 n=1+10)
IndexAnyASCII/4096:1-16 5.62µs ± 0% 5.27µs ± 1% -6.31% (p=0.000 n=1+10)
IndexAnyASCII/4096:2-16 5.67µs ± 0% 5.29µs ± 0% -6.67% (p=0.222 n=1+8)
IndexAnyASCII/4096:4-16 5.66µs ± 0% 5.28µs ± 1% -6.66% (p=0.000 n=1+10)
IndexAnyASCII/4096:8-16 5.66µs ± 0% 5.31µs ± 1% -6.10% (p=0.000 n=1+10)
IndexAnyASCII/4096:16-16 5.70µs ± 0% 5.33µs ± 1% -6.43% (p=0.182 n=1+10)
Change-Id: I739a6132b505936d39001aada5a978ff2a5f0500
Reviewed-on: https://go-review.googlesource.com/129875
Reviewed-by: David Chase <drchase@google.com>
2018-09-13 18:24:53 +00:00
erifan01
8149db4f64
cmd/compile: intrinsify math.RoundToEven and math.Abs on arm64
...
math.RoundToEven can be done by one arm64 instruction FRINTND, intrinsify it to improve performance.
The current pure Go implementation of the function Abs is translated into five instructions on arm64:
str, ldr, and, str, ldr. The intrinsic implementation requires only one instruction, so in terms of
performance, intrinsify it is worthwhile.
Benchmarks:
name old time/op new time/op delta
Abs-8 3.50ns ± 0% 1.50ns ± 0% -57.14% (p=0.000 n=10+10)
RoundToEven-8 9.26ns ± 0% 1.50ns ± 0% -83.80% (p=0.000 n=10+10)
Change-Id: I9456b26ab282b544dfac0154fc86f17aed96ac3d
Reviewed-on: https://go-review.googlesource.com/116535
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-09-13 14:52:51 +00:00
fanzha02
d5377c2026
test: fix the wrong test of math.Copysign(c, -1) for arm64
...
The CL 132915 added the wrong codegen test for math.Copysign(c, -1),
it should test that AND is not emitted. This CL fixes this error.
Change-Id: Ida1d3d54ebfc7f238abccbc1f70f914e1b5bfd91
Reviewed-on: https://go-review.googlesource.com/134815
Reviewed-by: Giovanni Bajo <rasky@develer.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Giovanni Bajo <rasky@develer.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-09-12 15:34:20 +00:00
Ben Shi
9f2411894b
cmd/compile: optimize arm's bit operation
...
BFC (Bit Field Clear) was introduced in ARMv7, which can simplify
ANDconst and BICconst. And this CL implements that optimization.
1. The total size of pkg/android_arm decreases about 3KB, excluding
cmd/compile/.
2. There is no regression in the go1 benchmark result, and some
cases (FmtFprintfEmpty-4 and RegexpMatchMedium_32-4) even get
slight improvement.
name old time/op new time/op delta
BinaryTree17-4 25.3s ± 1% 25.2s ± 1% ~ (p=0.072 n=30+29)
Fannkuch11-4 13.3s ± 0% 13.3s ± 0% +0.13% (p=0.000 n=30+26)
FmtFprintfEmpty-4 407ns ± 0% 394ns ± 0% -3.19% (p=0.000 n=26+28)
FmtFprintfString-4 664ns ± 0% 662ns ± 0% -0.22% (p=0.000 n=30+30)
FmtFprintfInt-4 712ns ± 0% 706ns ± 0% -0.79% (p=0.000 n=30+30)
FmtFprintfIntInt-4 1.06µs ± 0% 1.05µs ± 0% -0.38% (p=0.000 n=30+30)
FmtFprintfPrefixedInt-4 1.16µs ± 0% 1.16µs ± 0% -0.13% (p=0.000 n=30+29)
FmtFprintfFloat-4 2.24µs ± 0% 2.23µs ± 0% -0.51% (p=0.000 n=29+21)
FmtManyArgs-4 4.09µs ± 0% 4.06µs ± 0% -0.83% (p=0.000 n=28+30)
GobDecode-4 55.0ms ± 5% 55.4ms ± 5% ~ (p=0.307 n=30+30)
GobEncode-4 51.2ms ± 1% 51.9ms ± 1% +1.23% (p=0.000 n=29+30)
Gzip-4 2.64s ± 0% 2.60s ± 0% -1.35% (p=0.000 n=30+29)
Gunzip-4 309ms ± 0% 308ms ± 0% -0.27% (p=0.000 n=30+30)
HTTPClientServer-4 1.03ms ± 5% 1.02ms ± 4% ~ (p=0.117 n=30+29)
JSONEncode-4 101ms ± 2% 101ms ± 2% ~ (p=0.338 n=29+29)
JSONDecode-4 383ms ± 2% 382ms ± 2% ~ (p=0.751 n=26+30)
Mandelbrot200-4 18.4ms ± 0% 18.4ms ± 0% -0.10% (p=0.000 n=29+29)
GoParse-4 22.6ms ± 0% 22.5ms ± 0% -0.39% (p=0.000 n=30+30)
RegexpMatchEasy0_32-4 761ns ± 0% 750ns ± 0% -1.47% (p=0.000 n=26+29)
RegexpMatchEasy0_1K-4 4.33µs ± 0% 4.34µs ± 0% +0.27% (p=0.000 n=25+28)
RegexpMatchEasy1_32-4 809ns ± 0% 795ns ± 0% -1.74% (p=0.000 n=27+25)
RegexpMatchEasy1_1K-4 5.54µs ± 0% 5.53µs ± 0% -0.18% (p=0.000 n=29+29)
RegexpMatchMedium_32-4 1.11µs ± 0% 1.08µs ± 0% -2.78% (p=0.000 n=27+29)
RegexpMatchMedium_1K-4 255µs ± 0% 255µs ± 0% -0.02% (p=0.029 n=30+30)
RegexpMatchHard_32-4 14.7µs ± 0% 14.7µs ± 0% -0.28% (p=0.000 n=30+29)
RegexpMatchHard_1K-4 439µs ± 0% 439µs ± 0% ~ (p=0.907 n=23+27)
Revcomp-4 41.9ms ± 1% 41.9ms ± 1% ~ (p=0.230 n=28+30)
Template-4 522ms ± 1% 528ms ± 1% +1.25% (p=0.000 n=30+30)
TimeParse-4 3.34µs ± 0% 3.35µs ± 0% +0.23% (p=0.000 n=30+27)
TimeFormat-4 6.06µs ± 0% 6.13µs ± 0% +1.08% (p=0.000 n=29+29)
[Geo mean] 384µs 382µs -0.37%
name old speed new speed delta
GobDecode-4 14.0MB/s ± 5% 13.9MB/s ± 5% ~ (p=0.308 n=30+30)
GobEncode-4 15.0MB/s ± 1% 14.8MB/s ± 1% -1.22% (p=0.000 n=29+30)
Gzip-4 7.36MB/s ± 0% 7.46MB/s ± 0% +1.35% (p=0.000 n=30+30)
Gunzip-4 62.8MB/s ± 0% 63.0MB/s ± 0% +0.27% (p=0.000 n=30+30)
JSONEncode-4 19.2MB/s ± 2% 19.2MB/s ± 2% ~ (p=0.312 n=29+29)
JSONDecode-4 5.05MB/s ± 3% 5.08MB/s ± 2% ~ (p=0.356 n=29+30)
GoParse-4 2.56MB/s ± 0% 2.57MB/s ± 0% +0.39% (p=0.000 n=23+27)
RegexpMatchEasy0_32-4 42.0MB/s ± 0% 42.6MB/s ± 0% +1.50% (p=0.000 n=26+28)
RegexpMatchEasy0_1K-4 236MB/s ± 0% 236MB/s ± 0% -0.27% (p=0.000 n=25+28)
RegexpMatchEasy1_32-4 39.6MB/s ± 0% 40.2MB/s ± 0% +1.73% (p=0.000 n=27+27)
RegexpMatchEasy1_1K-4 185MB/s ± 0% 185MB/s ± 0% +0.18% (p=0.000 n=29+29)
RegexpMatchMedium_32-4 900kB/s ± 0% 920kB/s ± 0% +2.22% (p=0.000 n=29+29)
RegexpMatchMedium_1K-4 4.02MB/s ± 0% 4.02MB/s ± 0% +0.07% (p=0.004 n=30+27)
RegexpMatchHard_32-4 2.17MB/s ± 0% 2.18MB/s ± 0% +0.46% (p=0.000 n=30+26)
RegexpMatchHard_1K-4 2.33MB/s ± 0% 2.33MB/s ± 0% ~ (all equal)
Revcomp-4 60.6MB/s ± 1% 60.7MB/s ± 1% ~ (p=0.207 n=28+30)
Template-4 3.72MB/s ± 1% 3.67MB/s ± 1% -1.23% (p=0.000 n=30+30)
[Geo mean] 12.9MB/s 12.9MB/s +0.29%
Change-Id: I07f497f8bb476c950dc555491d00c9066fb64a4e
Reviewed-on: https://go-review.googlesource.com/134232
Run-TryBot: Ben Shi <powerman1st@163.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-09-11 14:37:51 +00:00
Iskander Sharipov
b7f9c640b7
test: extend noescape bytes.Buffer test suite
...
Added some more cases that should be guarded against regression.
Change-Id: I9f1dda2fd0be9b6e167ef1cc018fc8cce55c066c
Reviewed-on: https://go-review.googlesource.com/134017
Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-09-07 18:49:12 +00:00
erifan01
204cc14bdd
cmd/compile: implement non-constant rotates using ROR on arm64
...
Add some rules to match the Go code like:
y &= 63
x << y | x >> (64-y)
or
y &= 63
x >> y | x << (64-y)
as a ROR instruction. Make math/bits.RotateLeft faster on arm64.
Extends CL 132435 to arm64.
Benchmarks of math/bits.RotateLeftxxN:
name old time/op new time/op delta
RotateLeft-8 3.548750ns +- 1% 2.003750ns +- 0% -43.54% (p=0.000 n=8+8)
RotateLeft8-8 3.925000ns +- 0% 3.925000ns +- 0% ~ (p=1.000 n=8+8)
RotateLeft16-8 3.925000ns +- 0% 3.927500ns +- 0% ~ (p=0.608 n=8+8)
RotateLeft32-8 3.925000ns +- 0% 2.002500ns +- 0% -48.98% (p=0.000 n=8+8)
RotateLeft64-8 3.536250ns +- 0% 2.003750ns +- 0% -43.34% (p=0.000 n=8+8)
Change-Id: I77622cd7f39b917427e060647321f5513973232c
Reviewed-on: https://go-review.googlesource.com/122542
Run-TryBot: Ben Shi <powerman1st@163.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-09-07 14:52:02 +00:00
Ben Shi
031a35ec84
cmd/compile: optimize 386's comparison
...
Optimization of "(CMPconst [0] (ANDL x y)) -> (TESTL x y)" only
get benefits if there is no further use of the result of x&y. A
condition of uses==1 will have slight improvements.
1. The code size of pkg/linux_386 decreases about 300 bytes, excluding
cmd/compile/.
2. The go1 benchmark shows no regression, and even a slight improvement
in test case FmtFprintfEmpty-4, excluding noise.
name old time/op new time/op delta
BinaryTree17-4 3.34s ± 3% 3.32s ± 2% ~ (p=0.197 n=30+30)
Fannkuch11-4 3.48s ± 2% 3.47s ± 1% -0.33% (p=0.015 n=30+30)
FmtFprintfEmpty-4 46.3ns ± 4% 44.8ns ± 4% -3.33% (p=0.000 n=30+30)
FmtFprintfString-4 78.8ns ± 7% 77.3ns ± 5% ~ (p=0.098 n=30+26)
FmtFprintfInt-4 90.2ns ± 1% 90.0ns ± 7% -0.23% (p=0.027 n=18+30)
FmtFprintfIntInt-4 144ns ± 4% 143ns ± 5% ~ (p=0.945 n=30+29)
FmtFprintfPrefixedInt-4 180ns ± 4% 180ns ± 5% ~ (p=0.858 n=30+30)
FmtFprintfFloat-4 409ns ± 4% 406ns ± 3% -0.87% (p=0.028 n=30+30)
FmtManyArgs-4 611ns ± 5% 608ns ± 4% ~ (p=0.812 n=30+30)
GobDecode-4 7.30ms ± 5% 7.26ms ± 5% ~ (p=0.522 n=30+29)
GobEncode-4 6.90ms ± 7% 6.82ms ± 4% ~ (p=0.086 n=29+28)
Gzip-4 396ms ± 4% 400ms ± 4% +0.99% (p=0.026 n=30+30)
Gunzip-4 41.1ms ± 3% 41.2ms ± 3% ~ (p=0.495 n=30+30)
HTTPClientServer-4 63.7µs ± 3% 63.3µs ± 2% ~ (p=0.113 n=29+29)
JSONEncode-4 16.1ms ± 2% 16.1ms ± 2% -0.30% (p=0.041 n=30+30)
JSONDecode-4 60.9ms ± 3% 61.2ms ± 6% ~ (p=0.187 n=30+30)
Mandelbrot200-4 5.17ms ± 2% 5.19ms ± 3% ~ (p=0.676 n=30+30)
GoParse-4 3.28ms ± 3% 3.25ms ± 2% -0.97% (p=0.002 n=30+30)
RegexpMatchEasy0_32-4 103ns ± 4% 104ns ± 4% ~ (p=0.352 n=30+30)
RegexpMatchEasy0_1K-4 849ns ± 2% 845ns ± 2% ~ (p=0.381 n=30+30)
RegexpMatchEasy1_32-4 113ns ± 4% 113ns ± 4% ~ (p=0.795 n=30+30)
RegexpMatchEasy1_1K-4 1.03µs ± 3% 1.03µs ± 4% ~ (p=0.275 n=25+30)
RegexpMatchMedium_32-4 132ns ± 3% 132ns ± 3% ~ (p=0.970 n=30+30)
RegexpMatchMedium_1K-4 41.4µs ± 3% 41.4µs ± 3% ~ (p=0.212 n=30+30)
RegexpMatchHard_32-4 2.22µs ± 4% 2.22µs ± 4% ~ (p=0.399 n=30+30)
RegexpMatchHard_1K-4 67.2µs ± 3% 67.6µs ± 4% ~ (p=0.359 n=30+30)
Revcomp-4 1.84s ± 2% 1.83s ± 2% ~ (p=0.532 n=30+30)
Template-4 69.1ms ± 4% 68.8ms ± 3% ~ (p=0.146 n=30+30)
TimeParse-4 441ns ± 3% 442ns ± 3% ~ (p=0.154 n=30+30)
TimeFormat-4 413ns ± 3% 414ns ± 3% ~ (p=0.275 n=30+30)
[Geo mean] 66.2µs 66.0µs -0.28%
name old speed new speed delta
GobDecode-4 105MB/s ± 5% 106MB/s ± 5% ~ (p=0.514 n=30+29)
GobEncode-4 111MB/s ± 5% 113MB/s ± 4% +1.37% (p=0.046 n=28+28)
Gzip-4 49.1MB/s ± 4% 48.6MB/s ± 4% -0.98% (p=0.028 n=30+30)
Gunzip-4 472MB/s ± 4% 472MB/s ± 3% ~ (p=0.496 n=30+30)
JSONEncode-4 120MB/s ± 2% 121MB/s ± 2% +0.29% (p=0.042 n=30+30)
JSONDecode-4 31.9MB/s ± 3% 31.7MB/s ± 6% ~ (p=0.186 n=30+30)
GoParse-4 17.6MB/s ± 3% 17.8MB/s ± 2% +0.98% (p=0.002 n=30+30)
RegexpMatchEasy0_32-4 309MB/s ± 4% 307MB/s ± 4% ~ (p=0.501 n=30+30)
RegexpMatchEasy0_1K-4 1.21GB/s ± 2% 1.21GB/s ± 2% ~ (p=0.301 n=30+30)
RegexpMatchEasy1_32-4 283MB/s ± 4% 282MB/s ± 3% ~ (p=0.877 n=30+30)
RegexpMatchEasy1_1K-4 1.00GB/s ± 3% 0.99GB/s ± 4% ~ (p=0.276 n=25+30)
RegexpMatchMedium_32-4 7.54MB/s ± 3% 7.55MB/s ± 3% ~ (p=0.528 n=30+30)
RegexpMatchMedium_1K-4 24.7MB/s ± 3% 24.7MB/s ± 3% ~ (p=0.203 n=30+30)
RegexpMatchHard_32-4 14.4MB/s ± 4% 14.4MB/s ± 4% ~ (p=0.407 n=30+30)
RegexpMatchHard_1K-4 15.3MB/s ± 3% 15.1MB/s ± 4% ~ (p=0.306 n=30+30)
Revcomp-4 138MB/s ± 2% 139MB/s ± 2% ~ (p=0.520 n=30+30)
Template-4 28.1MB/s ± 4% 28.2MB/s ± 3% ~ (p=0.149 n=30+30)
[Geo mean] 81.5MB/s 81.5MB/s +0.06%
Change-Id: I7f75425f79eec93cdd8fdd94db13ad4f61b6a2f5
Reviewed-on: https://go-review.googlesource.com/133657
Run-TryBot: Ben Shi <powerman1st@163.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-09-07 01:20:45 +00:00
fanzha02
2e5c32518c
cmd/compile: optimize math.Copysign on arm64
...
Add rewrite rules to optimize math.Copysign() when the second
argument is negative floating point constant.
For example, math.Copysign(c, -2): The previous compile output is
"AND $9223372036854775807, R0, R0; ORR $-9223372036854775808, R0, R0".
The optimized compile output is "ORR $-9223372036854775808, R0, R0"
Math package benchmark results.
name old time/op new time/op delta
Copysign-8 2.61ns ± 2% 2.49ns ± 0% -4.55% (p=0.000 n=10+10)
Cos-8 43.0ns ± 0% 41.5ns ± 0% -3.49% (p=0.000 n=10+10)
Cosh-8 98.6ns ± 0% 98.1ns ± 0% -0.51% (p=0.000 n=10+10)
ExpGo-8 107ns ± 0% 105ns ± 0% -1.87% (p=0.000 n=10+10)
Exp2Go-8 100ns ± 0% 100ns ± 0% +0.39% (p=0.000 n=10+8)
Max-8 6.56ns ± 2% 6.45ns ± 1% -1.63% (p=0.002 n=10+10)
Min-8 6.66ns ± 3% 6.47ns ± 2% -2.82% (p=0.006 n=10+10)
Mod-8 107ns ± 1% 104ns ± 1% -2.72% (p=0.000 n=10+10)
Frexp-8 11.5ns ± 1% 11.0ns ± 0% -4.56% (p=0.000 n=8+10)
HypotGo-8 19.4ns ± 0% 19.4ns ± 0% +0.36% (p=0.019 n=10+10)
Ilogb-8 8.63ns ± 0% 8.51ns ± 0% -1.36% (p=0.000 n=10+10)
Jn-8 584ns ± 0% 585ns ± 0% +0.17% (p=0.000 n=7+8)
Ldexp-8 13.8ns ± 0% 13.5ns ± 0% -2.17% (p=0.002 n=8+10)
Logb-8 10.2ns ± 0% 9.9ns ± 0% -2.65% (p=0.000 n=10+7)
Nextafter64-8 7.54ns ± 0% 7.51ns ± 0% -0.37% (p=0.000 n=10+10)
Remainder-8 73.5ns ± 1% 70.4ns ± 1% -4.27% (p=0.000 n=10+10)
SqrtGoLatency-8 79.6ns ± 0% 76.2ns ± 0% -4.30% (p=0.000 n=9+10)
Yn-8 582ns ± 0% 579ns ± 0% -0.52% (p=0.000 n=10+10)
Change-Id: I0c9cd1ea87435e7b8bab94b4e79e6e29785f25b1
Reviewed-on: https://go-review.googlesource.com/132915
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-09-06 19:57:25 +00:00
Iskander Sharipov
9c2be4c22d
bytes: remove bootstrap array from Buffer
...
Rationale: small buffer optimization does not work and it has
made things slower since 2014. Until we can make it work,
we should prefer simpler code that also turns out to be more
efficient.
With this change, it's possible to use
NewBuffer(make([]byte, 0, bootstrapSize)) to get the desired
stack-allocated initial buffer since escape analysis can
prove the created slice to be non-escaping.
New implementation key points:
- Zero value bytes.Buffer performs better than before
- You can have a truly stack-allocated buffer, and it's not even limited to 64 bytes
- The unsafe.Sizeof(bytes.Buffer{}) is reduced significantly
- Empty writes don't cause allocations
Buffer benchmarks from bytes package:
name old time/op new time/op delta
ReadString-8 9.20µs ± 1% 9.22µs ± 1% ~ (p=0.148 n=10+10)
WriteByte-8 28.1µs ± 0% 26.2µs ± 0% -6.78% (p=0.000 n=10+10)
WriteRune-8 64.9µs ± 0% 65.0µs ± 0% +0.16% (p=0.000 n=10+10)
BufferNotEmptyWriteRead-8 469µs ± 0% 461µs ± 0% -1.76% (p=0.000 n=9+10)
BufferFullSmallReads-8 108µs ± 0% 108µs ± 0% -0.21% (p=0.000 n=10+10)
name old speed new speed delta
ReadString-8 3.56GB/s ± 1% 3.55GB/s ± 1% ~ (p=0.165 n=10+10)
WriteByte-8 146MB/s ± 0% 156MB/s ± 0% +7.26% (p=0.000 n=9+10)
WriteRune-8 189MB/s ± 0% 189MB/s ± 0% -0.16% (p=0.000 n=10+10)
name old alloc/op new alloc/op delta
ReadString-8 32.8kB ± 0% 32.8kB ± 0% ~ (all equal)
WriteByte-8 0.00B 0.00B ~ (all equal)
WriteRune-8 0.00B 0.00B ~ (all equal)
BufferNotEmptyWriteRead-8 4.72kB ± 0% 4.67kB ± 0% -1.02% (p=0.000 n=10+10)
BufferFullSmallReads-8 3.44kB ± 0% 3.33kB ± 0% -3.26% (p=0.000 n=10+10)
name old allocs/op new allocs/op delta
ReadString-8 1.00 ± 0% 1.00 ± 0% ~ (all equal)
WriteByte-8 0.00 0.00 ~ (all equal)
WriteRune-8 0.00 0.00 ~ (all equal)
BufferNotEmptyWriteRead-8 3.00 ± 0% 3.00 ± 0% ~ (all equal)
BufferFullSmallReads-8 3.00 ± 0% 2.00 ± 0% -33.33% (p=0.000 n=10+10)
The most notable thing in go1 benchmarks is reduced allocs in HTTPClientServer (-1 alloc):
HTTPClientServer-8 64.0 ± 0% 63.0 ± 0% -1.56% (p=0.000 n=10+10)
For more explanations and benchmarks see the referenced issue.
Updates #7921
Change-Id: Ica0bf85e1b70fb4f5dc4f6a61045e2cf4ef72aa3
Reviewed-on: https://go-review.googlesource.com/133715
Reviewed-by: Martin Möhrmann <moehrmann@google.com>
Reviewed-by: Robert Griesemer <gri@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-09-06 19:33:18 +00:00
taylorza
4a095b87d3
cmd/compile: don't crash reporting misuse of shadowed built-in function
...
The existing implementation causes a compiler panic if a function parameter shadows a built-in function, and then calling that shadowed name.
Fixes #27356
Change-Id: I1ffb6dc01e63c7f499e5f6f75f77ce2318f35bcd
Reviewed-on: https://go-review.googlesource.com/132876
Reviewed-by: Robert Griesemer <gri@golang.org>
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-09-06 02:49:21 +00:00
Ben Shi
067dfce21f
cmd/compile: optimize ARM's comparision
...
Optimize (CMPconst [0] (ADD x y)) to (CMN x y) will only get benefits
when the result of the addition is no longer used, otherwise there
might be even performance drop. And this CL fixes that issue for
CMP/CMN/TST/TEQ.
There is little regression in the go1 benchmark (excluding noise),
and the test case JSONDecode-4 even gets improvement.
name old time/op new time/op delta
BinaryTree17-4 21.6s ± 1% 21.6s ± 0% -0.22% (p=0.013 n=30+30)
Fannkuch11-4 11.1s ± 0% 11.1s ± 0% +0.11% (p=0.000 n=30+29)
FmtFprintfEmpty-4 297ns ± 0% 297ns ± 0% +0.08% (p=0.007 n=26+28)
FmtFprintfString-4 589ns ± 1% 589ns ± 0% ~ (p=0.659 n=30+25)
FmtFprintfInt-4 644ns ± 1% 650ns ± 0% +0.88% (p=0.000 n=30+24)
FmtFprintfIntInt-4 964ns ± 0% 977ns ± 0% +1.33% (p=0.000 n=30+30)
FmtFprintfPrefixedInt-4 1.06µs ± 0% 1.07µs ± 0% +1.31% (p=0.000 n=29+27)
FmtFprintfFloat-4 1.89µs ± 0% 1.92µs ± 0% +1.25% (p=0.000 n=29+29)
FmtManyArgs-4 3.63µs ± 0% 3.67µs ± 0% +1.33% (p=0.000 n=29+27)
GobDecode-4 38.1ms ± 1% 37.9ms ± 1% -0.60% (p=0.000 n=29+29)
GobEncode-4 35.3ms ± 2% 35.2ms ± 1% ~ (p=0.286 n=30+30)
Gzip-4 2.36s ± 0% 2.37s ± 2% ~ (p=0.277 n=24+28)
Gunzip-4 264ms ± 1% 264ms ± 1% ~ (p=0.104 n=28+30)
HTTPClientServer-4 1.04ms ± 4% 1.02ms ± 4% -1.65% (p=0.000 n=28+28)
JSONEncode-4 78.5ms ± 1% 79.6ms ± 1% +1.34% (p=0.000 n=27+28)
JSONDecode-4 379ms ± 4% 352ms ± 5% -7.09% (p=0.000 n=29+30)
Mandelbrot200-4 17.6ms ± 0% 17.6ms ± 0% ~ (p=0.206 n=28+29)
GoParse-4 21.9ms ± 1% 22.1ms ± 1% +0.87% (p=0.000 n=28+26)
RegexpMatchEasy0_32-4 631ns ± 0% 641ns ± 0% +1.63% (p=0.000 n=29+30)
RegexpMatchEasy0_1K-4 4.11µs ± 0% 4.11µs ± 0% ~ (p=0.700 n=30+30)
RegexpMatchEasy1_32-4 670ns ± 0% 679ns ± 0% +1.37% (p=0.000 n=21+30)
RegexpMatchEasy1_1K-4 5.31µs ± 0% 5.26µs ± 0% -1.03% (p=0.000 n=25+28)
RegexpMatchMedium_32-4 905ns ± 0% 906ns ± 0% +0.14% (p=0.001 n=30+30)
RegexpMatchMedium_1K-4 192µs ± 0% 191µs ± 0% -0.45% (p=0.000 n=29+27)
RegexpMatchHard_32-4 11.8µs ± 0% 11.7µs ± 0% -0.39% (p=0.000 n=29+28)
RegexpMatchHard_1K-4 347µs ± 0% 347µs ± 0% ~ (p=0.084 n=29+30)
Revcomp-4 37.5ms ± 1% 37.5ms ± 1% ~ (p=0.279 n=29+29)
Template-4 519ms ± 2% 519ms ± 2% ~ (p=0.652 n=28+29)
TimeParse-4 2.83µs ± 0% 2.78µs ± 0% -1.90% (p=0.000 n=27+28)
TimeFormat-4 5.79µs ± 0% 5.60µs ± 0% -3.23% (p=0.000 n=29+29)
[Geo mean] 331µs 330µs -0.16%
name old speed new speed delta
GobDecode-4 20.1MB/s ± 1% 20.3MB/s ± 1% +0.61% (p=0.000 n=29+29)
GobEncode-4 21.7MB/s ± 2% 21.8MB/s ± 1% ~ (p=0.294 n=30+30)
Gzip-4 8.23MB/s ± 1% 8.20MB/s ± 2% ~ (p=0.099 n=26+28)
Gunzip-4 73.5MB/s ± 1% 73.4MB/s ± 1% ~ (p=0.107 n=28+30)
JSONEncode-4 24.7MB/s ± 1% 24.4MB/s ± 1% -1.32% (p=0.000 n=27+28)
JSONDecode-4 5.13MB/s ± 4% 5.52MB/s ± 5% +7.65% (p=0.000 n=29+30)
GoParse-4 2.65MB/s ± 1% 2.63MB/s ± 1% -0.87% (p=0.000 n=28+26)
RegexpMatchEasy0_32-4 50.7MB/s ± 0% 49.9MB/s ± 0% -1.58% (p=0.000 n=29+29)
RegexpMatchEasy0_1K-4 249MB/s ± 0% 249MB/s ± 0% ~ (p=0.342 n=30+28)
RegexpMatchEasy1_32-4 47.7MB/s ± 0% 47.1MB/s ± 0% -1.39% (p=0.000 n=26+30)
RegexpMatchEasy1_1K-4 193MB/s ± 0% 195MB/s ± 0% +1.04% (p=0.000 n=25+28)
RegexpMatchMedium_32-4 1.10MB/s ± 0% 1.10MB/s ± 0% -0.42% (p=0.000 n=30+26)
RegexpMatchMedium_1K-4 5.33MB/s ± 0% 5.36MB/s ± 0% +0.43% (p=0.000 n=29+29)
RegexpMatchHard_32-4 2.72MB/s ± 0% 2.73MB/s ± 0% +0.37% (p=0.000 n=29+30)
RegexpMatchHard_1K-4 2.95MB/s ± 0% 2.95MB/s ± 0% ~ (all equal)
Revcomp-4 67.8MB/s ± 1% 67.7MB/s ± 1% ~ (p=0.273 n=29+29)
Template-4 3.74MB/s ± 2% 3.74MB/s ± 2% ~ (p=0.665 n=28+29)
[Geo mean] 15.2MB/s 15.2MB/s +0.21%
Change-Id: Ifed1fb8cc02d5ca52c8bc6c21b6b5bf6dbb2701a
Reviewed-on: https://go-review.googlesource.com/132115
Run-TryBot: Ben Shi <powerman1st@163.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-09-05 14:54:08 +00:00
Iskander Sharipov
4cf33e361a
cmd/compile/internal/gc: fix mayAffectMemory in esc.go
...
For OINDEX and other Left+Right nodes, we want the whole
node to be considered as "may affect memory" if either
of Left or Right affect memory. Initial implementation
only considered node as such if both Left and Right were non-safe.
Change-Id: Icfb965a0b4c24d8f83f3722216db068dad2eba95
Reviewed-on: https://go-review.googlesource.com/133275
Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2018-09-05 14:16:25 +00:00
Iskander Sharipov
bcf3e063cc
test: remove go:noinline from escape_because.go
...
File is compiled with "-l" flag, so go:noinline is redundant.
Change-Id: Ia269f3b9de9466857fc578ba5164613393e82369
Reviewed-on: https://go-review.googlesource.com/133295
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-09-05 10:45:58 +00:00
Tobias Klauser
d7fc2205d4
test: fix nilptr3 check for wasm
...
CL 131735 only updated nilptr3.go for the adjusted nil check. Adjust
nilptr3_wasm.go as well.
Change-Id: I4a6257d32bb212666fe768dac53901ea0b051138
Reviewed-on: https://go-review.googlesource.com/133495
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-09-05 09:57:32 +00:00
Michael Munday
f94de9c9fb
cmd/compile: make math/bits.RotateLeft{32,64} intrinsics on s390x
...
Extends CL 132435 to s390x. s390x has 32- and 64-bit variable
rotate left instructions.
Change-Id: Ic4f1ebb0e0543207ed2fc8c119e0163b428138a5
Reviewed-on: https://go-review.googlesource.com/133035
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-09-05 08:29:02 +00:00
Ben Shi
0e9f1de0b7
cmd/compile: optimize arm64's comparison
...
Add more optimization with TST/CMN.
1. A tiny benchmark shows more than 12% improvement.
TSTCMN-4 378µs ± 0% 332µs ± 0% -12.15% (p=0.000 n=30+27)
(https://github.com/benshi001/ugo1/blob/master/tstcmn_test.go )
2. There is little regression in the go1 benchmark, excluding noise.
name old time/op new time/op delta
BinaryTree17-4 19.1s ± 0% 19.1s ± 0% ~ (p=0.994 n=28+29)
Fannkuch11-4 10.0s ± 0% 10.0s ± 0% ~ (p=0.198 n=30+25)
FmtFprintfEmpty-4 233ns ± 0% 233ns ± 0% +0.14% (p=0.002 n=24+30)
FmtFprintfString-4 428ns ± 0% 428ns ± 0% ~ (all equal)
FmtFprintfInt-4 472ns ± 0% 472ns ± 0% ~ (all equal)
FmtFprintfIntInt-4 725ns ± 0% 725ns ± 0% ~ (all equal)
FmtFprintfPrefixedInt-4 889ns ± 0% 888ns ± 0% ~ (p=0.632 n=28+30)
FmtFprintfFloat-4 1.20µs ± 0% 1.20µs ± 0% +0.05% (p=0.001 n=18+30)
FmtManyArgs-4 3.00µs ± 0% 2.99µs ± 0% -0.07% (p=0.001 n=27+30)
GobDecode-4 42.1ms ± 0% 42.2ms ± 0% +0.29% (p=0.000 n=28+28)
GobEncode-4 38.6ms ± 9% 38.8ms ± 9% ~ (p=0.912 n=30+30)
Gzip-4 2.07s ± 1% 2.05s ± 1% -0.64% (p=0.000 n=29+30)
Gunzip-4 175ms ± 0% 175ms ± 0% -0.15% (p=0.001 n=30+30)
HTTPClientServer-4 872µs ± 5% 880µs ± 6% ~ (p=0.196 n=30+29)
JSONEncode-4 88.5ms ± 1% 89.8ms ± 1% +1.49% (p=0.000 n=23+24)
JSONDecode-4 393ms ± 1% 390ms ± 1% -0.89% (p=0.000 n=28+30)
Mandelbrot200-4 19.5ms ± 0% 19.5ms ± 0% ~ (p=0.405 n=29+28)
GoParse-4 19.9ms ± 0% 20.0ms ± 0% +0.27% (p=0.000 n=30+30)
RegexpMatchEasy0_32-4 431ns ± 0% 431ns ± 0% ~ (p=1.000 n=30+30)
RegexpMatchEasy0_1K-4 1.61µs ± 0% 1.61µs ± 0% ~ (p=0.527 n=26+26)
RegexpMatchEasy1_32-4 443ns ± 0% 443ns ± 0% ~ (all equal)
RegexpMatchEasy1_1K-4 2.58µs ± 1% 2.58µs ± 1% ~ (p=0.578 n=27+25)
RegexpMatchMedium_32-4 740ns ± 0% 740ns ± 0% ~ (p=0.357 n=30+30)
RegexpMatchMedium_1K-4 223µs ± 0% 223µs ± 0% +0.16% (p=0.000 n=30+29)
RegexpMatchHard_32-4 12.3µs ± 0% 12.3µs ± 0% ~ (p=0.236 n=27+27)
RegexpMatchHard_1K-4 371µs ± 0% 371µs ± 0% +0.09% (p=0.000 n=30+27)
Revcomp-4 2.85s ± 0% 2.85s ± 0% ~ (p=0.057 n=28+25)
Template-4 408ms ± 1% 409ms ± 1% ~ (p=0.117 n=29+29)
TimeParse-4 1.93µs ± 0% 1.93µs ± 0% ~ (p=0.535 n=29+28)
TimeFormat-4 1.99µs ± 0% 1.99µs ± 0% ~ (p=0.168 n=29+28)
[Geo mean] 306µs 307µs +0.07%
name old speed new speed delta
GobDecode-4 18.3MB/s ± 0% 18.2MB/s ± 0% -0.31% (p=0.000 n=28+29)
GobEncode-4 19.9MB/s ± 8% 19.8MB/s ± 9% ~ (p=0.923 n=30+30)
Gzip-4 9.39MB/s ± 1% 9.45MB/s ± 1% +0.65% (p=0.000 n=29+30)
Gunzip-4 111MB/s ± 0% 111MB/s ± 0% +0.15% (p=0.001 n=30+30)
JSONEncode-4 21.9MB/s ± 1% 21.6MB/s ± 1% -1.45% (p=0.000 n=23+23)
JSONDecode-4 4.94MB/s ± 1% 4.98MB/s ± 1% +0.84% (p=0.000 n=27+30)
GoParse-4 2.91MB/s ± 0% 2.90MB/s ± 0% -0.34% (p=0.000 n=21+22)
RegexpMatchEasy0_32-4 74.1MB/s ± 0% 74.1MB/s ± 0% ~ (p=0.469 n=29+28)
RegexpMatchEasy0_1K-4 634MB/s ± 0% 634MB/s ± 0% ~ (p=0.978 n=24+28)
RegexpMatchEasy1_32-4 72.2MB/s ± 0% 72.2MB/s ± 0% ~ (p=0.064 n=27+29)
RegexpMatchEasy1_1K-4 396MB/s ± 1% 396MB/s ± 1% ~ (p=0.583 n=27+25)
RegexpMatchMedium_32-4 1.35MB/s ± 0% 1.35MB/s ± 0% ~ (all equal)
RegexpMatchMedium_1K-4 4.60MB/s ± 0% 4.59MB/s ± 0% -0.14% (p=0.000 n=30+26)
RegexpMatchHard_32-4 2.61MB/s ± 0% 2.61MB/s ± 0% ~ (all equal)
RegexpMatchHard_1K-4 2.76MB/s ± 0% 2.76MB/s ± 0% ~ (all equal)
Revcomp-4 89.1MB/s ± 0% 89.1MB/s ± 0% ~ (p=0.059 n=28+25)
Template-4 4.75MB/s ± 1% 4.75MB/s ± 1% ~ (p=0.106 n=29+29)
[Geo mean] 18.3MB/s 18.3MB/s -0.07%
Change-Id: I3cd76ce63e84b0c3cebabf9fa3573b76a7343899
Reviewed-on: https://go-review.googlesource.com/124935
Run-TryBot: Ben Shi <powerman1st@163.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-09-05 02:51:28 +00:00
Ben Shi
b444215116
cmd/compile: optimize ARM64's code with MADD/MSUB
...
MADD does MUL-ADD in a single instruction, and MSUB does the
similiar simplification for MUL-SUB.
The CL implements the optimization with MADD/MSUB.
1. The total size of pkg/android_arm64/ decreases about 20KB,
excluding cmd/compile/.
2. The go1 benchmark shows a little improvement for RegexpMatchHard_32-4
and Template-4, excluding noise.
name old time/op new time/op delta
BinaryTree17-4 16.3s ± 1% 16.5s ± 1% +1.41% (p=0.000 n=26+28)
Fannkuch11-4 8.79s ± 1% 8.76s ± 0% -0.36% (p=0.000 n=26+28)
FmtFprintfEmpty-4 172ns ± 0% 172ns ± 0% ~ (all equal)
FmtFprintfString-4 362ns ± 1% 364ns ± 0% +0.55% (p=0.000 n=30+30)
FmtFprintfInt-4 416ns ± 0% 416ns ± 0% ~ (p=0.099 n=22+30)
FmtFprintfIntInt-4 655ns ± 1% 660ns ± 1% +0.76% (p=0.000 n=30+30)
FmtFprintfPrefixedInt-4 810ns ± 0% 809ns ± 0% -0.08% (p=0.009 n=29+29)
FmtFprintfFloat-4 1.08µs ± 0% 1.09µs ± 0% +0.61% (p=0.000 n=30+29)
FmtManyArgs-4 2.70µs ± 0% 2.69µs ± 0% -0.23% (p=0.000 n=29+28)
GobDecode-4 32.2ms ± 1% 32.1ms ± 1% -0.39% (p=0.000 n=27+26)
GobEncode-4 27.4ms ± 2% 27.4ms ± 1% ~ (p=0.864 n=28+28)
Gzip-4 1.53s ± 1% 1.52s ± 1% -0.30% (p=0.031 n=29+29)
Gunzip-4 146ms ± 0% 146ms ± 0% -0.14% (p=0.001 n=25+30)
HTTPClientServer-4 1.00ms ± 4% 0.98ms ± 6% -1.65% (p=0.001 n=29+30)
JSONEncode-4 67.3ms ± 1% 67.2ms ± 1% ~ (p=0.520 n=28+28)
JSONDecode-4 329ms ± 5% 330ms ± 4% ~ (p=0.142 n=30+30)
Mandelbrot200-4 17.3ms ± 0% 17.3ms ± 0% ~ (p=0.055 n=26+29)
GoParse-4 16.9ms ± 1% 17.0ms ± 1% +0.82% (p=0.000 n=30+30)
RegexpMatchEasy0_32-4 382ns ± 0% 382ns ± 0% ~ (all equal)
RegexpMatchEasy0_1K-4 1.33µs ± 0% 1.33µs ± 0% -0.25% (p=0.000 n=30+27)
RegexpMatchEasy1_32-4 361ns ± 0% 361ns ± 0% -0.08% (p=0.002 n=30+28)
RegexpMatchEasy1_1K-4 2.11µs ± 0% 2.09µs ± 0% -0.54% (p=0.000 n=30+29)
RegexpMatchMedium_32-4 594ns ± 0% 592ns ± 0% -0.32% (p=0.000 n=30+30)
RegexpMatchMedium_1K-4 173µs ± 0% 172µs ± 0% -0.77% (p=0.000 n=29+27)
RegexpMatchHard_32-4 10.4µs ± 0% 10.1µs ± 0% -3.63% (p=0.000 n=28+27)
RegexpMatchHard_1K-4 306µs ± 0% 301µs ± 0% -1.64% (p=0.000 n=29+30)
Revcomp-4 2.51s ± 1% 2.52s ± 0% +0.18% (p=0.017 n=26+27)
Template-4 394ms ± 3% 382ms ± 3% -3.22% (p=0.000 n=28+28)
TimeParse-4 1.67µs ± 0% 1.67µs ± 0% +0.05% (p=0.030 n=27+30)
TimeFormat-4 1.72µs ± 0% 1.70µs ± 0% -0.79% (p=0.000 n=28+26)
[Geo mean] 259µs 259µs -0.33%
name old speed new speed delta
GobDecode-4 23.8MB/s ± 1% 23.9MB/s ± 1% +0.40% (p=0.001 n=27+26)
GobEncode-4 28.0MB/s ± 2% 28.0MB/s ± 1% ~ (p=0.863 n=28+28)
Gzip-4 12.7MB/s ± 1% 12.7MB/s ± 1% +0.32% (p=0.026 n=29+29)
Gunzip-4 133MB/s ± 0% 133MB/s ± 0% +0.15% (p=0.001 n=24+30)
JSONEncode-4 28.8MB/s ± 1% 28.9MB/s ± 1% ~ (p=0.475 n=28+28)
JSONDecode-4 5.89MB/s ± 4% 5.87MB/s ± 5% ~ (p=0.174 n=29+30)
GoParse-4 3.43MB/s ± 0% 3.40MB/s ± 1% -0.83% (p=0.000 n=28+30)
RegexpMatchEasy0_32-4 83.6MB/s ± 0% 83.6MB/s ± 0% ~ (p=0.848 n=28+29)
RegexpMatchEasy0_1K-4 768MB/s ± 0% 770MB/s ± 0% +0.25% (p=0.000 n=30+27)
RegexpMatchEasy1_32-4 88.5MB/s ± 0% 88.5MB/s ± 0% ~ (p=0.086 n=29+29)
RegexpMatchEasy1_1K-4 486MB/s ± 0% 489MB/s ± 0% +0.54% (p=0.000 n=30+29)
RegexpMatchMedium_32-4 1.68MB/s ± 0% 1.69MB/s ± 0% +0.60% (p=0.000 n=30+23)
RegexpMatchMedium_1K-4 5.90MB/s ± 0% 5.95MB/s ± 0% +0.85% (p=0.000 n=18+20)
RegexpMatchHard_32-4 3.07MB/s ± 0% 3.18MB/s ± 0% +3.72% (p=0.000 n=29+26)
RegexpMatchHard_1K-4 3.35MB/s ± 0% 3.40MB/s ± 0% +1.69% (p=0.000 n=30+30)
Revcomp-4 101MB/s ± 0% 101MB/s ± 0% -0.18% (p=0.018 n=26+27)
Template-4 4.92MB/s ± 4% 5.09MB/s ± 3% +3.31% (p=0.000 n=28+28)
[Geo mean] 22.4MB/s 22.6MB/s +0.62%
Change-Id: I8f304b272785739f57b3c8f736316f658f8c1b2a
Reviewed-on: https://go-review.googlesource.com/129119
Run-TryBot: Ben Shi <powerman1st@163.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-09-04 20:41:58 +00:00
Matthew Dempsky
f7a633aa79
cmd/compile: use "N variables but M values" error for OAS
...
Makes the error message more consistent between OAS and OAS2.
Fixes #26616 .
Change-Id: I07ab46c5ef8a37efb2cb557632697f5d1bf789f7
Reviewed-on: https://go-review.googlesource.com/131280
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2018-09-04 19:05:56 +00:00
Alexey Naidonov
669fa8f36a
cmd/compile: remove unnecessary nil-check
...
Removes unnecessary nil-check when referencing offset from an
address. Suggested by Keith Randall in golang/go#27180 .
Updates golang/go#27180
Change-Id: I326ed7fda7cfa98b7e4354c811900707fee26021
Reviewed-on: https://go-review.googlesource.com/131735
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-09-04 17:44:14 +00:00
Iskander Sharipov
67ac554d79
cmd/compile/internal/gc: fix invalid positions for sink nodes in esc.go
...
Make OAS2 and OAS2FUNC sink locations point to the assignment position,
not the nth LHS position.
Fixes #26987
Change-Id: Ibeb9df2da754da8b6638fe1e49e813f37515c13c
Reviewed-on: https://go-review.googlesource.com/129315
Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-09-03 16:46:59 +00:00
Michael Munday
6f9b94ab66
cmd/compile: implement OnesCount{8,16,32,64} intrinsics on s390x
...
This CL implements the math/bits.OnesCount{8,16,32,64} functions
as intrinsics on s390x using the 'population count' (popcnt)
instruction. This instruction was released as the 'population-count'
facility which uses the same facility bit (45) as the
'distinct-operands' facility which is a pre-requisite for Go on
s390x. We can therefore use it without a feature check.
The s390x popcnt instruction treats a 64 bit register as a vector
of 8 bytes, summing the number of ones in each byte individually.
It then writes the results to the corresponding bytes in the
output register. Therefore to implement OnesCount{16,32,64} we
need to sum the individual byte counts using some extra
instructions. To do this efficiently I've added some additional
pseudo operations to the s390x SSA backend.
Unlike other architectures the new instruction sequence is faster
for OnesCount8, so that is implemented using the intrinsic.
name old time/op new time/op delta
OnesCount 3.21ns ± 1% 1.35ns ± 0% -58.00% (p=0.000 n=20+20)
OnesCount8 0.91ns ± 1% 0.81ns ± 0% -11.43% (p=0.000 n=20+20)
OnesCount16 1.51ns ± 3% 1.21ns ± 0% -19.71% (p=0.000 n=20+17)
OnesCount32 1.91ns ± 0% 1.12ns ± 1% -41.60% (p=0.000 n=19+20)
OnesCount64 3.18ns ± 4% 1.35ns ± 0% -57.52% (p=0.000 n=20+20)
Change-Id: Id54f0bd28b6db9a887ad12c0d72fcc168ef9c4e0
Reviewed-on: https://go-review.googlesource.com/114675
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-09-03 14:35:38 +00:00
Iskander Sharipov
ff468a43be
cmd/compile/internal/gc: better handling of self-assignments in esc.go
...
Teach escape analysis to recognize these assignment patterns
as not causing the src to leak:
val.x = val.y
val.x[i] = val.y[j]
val.x1.x2 = val.x1.y2
... etc
Helps to avoid "leaking param" with assignments showed above.
The implementation is based on somewhat similiar xs=xs[a:b]
special case that is ignored by the escape analysis.
We may figure out more generalized version of this,
but this one looks like a safe step into that direction.
Updates #14858
Change-Id: I6fe5bfedec9c03bdc1d7624883324a523bd11fde
Reviewed-on: https://go-review.googlesource.com/126395
Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2018-09-03 14:28:51 +00:00
Giovanni Bajo
dd5e9b32ff
cmd/compile: add testcase for #24876
...
This is still not fixed, the testcase reflects that there are still
a few boundchecks. Let's fix the good alternative with an explicit
test though.
Updates #24876
Change-Id: I4da35eb353e19052bd7b69ea6190a69ced8b9b3d
Reviewed-on: https://go-review.googlesource.com/107355
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Giovanni Bajo <rasky@develer.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-09-02 10:34:51 +00:00
Giovanni Bajo
f02cc88f46
test: relax whitespaces matching in codegen tests
...
The codegen testsuite uses regexp to parse the syntax, but it doesn't
have a way to tell line comments containing checks from line comments
containing English sentences. This means that any syntax error (that
is, non-matching regexp) is currently ignored and not reported.
There were some tests in memcombine.go that had an extraneous space
and were thus effectively disabled. It would be great if we could
report it as a syntax error, but for now we just punt and swallow the
spaces as a workaround, to avoid the same mistake again.
Fixes #25452
Change-Id: Ic7747a2278bc00adffd0c199ce40937acbbc9cf0
Reviewed-on: https://go-review.googlesource.com/113835
Run-TryBot: Giovanni Bajo <rasky@develer.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-09-02 10:31:37 +00:00
Giovanni Bajo
09ea3c08e8
cmd/compile: in prove, fix fence-post implications for unsigned domain
...
Fence-post implications of the form "x-1 >= w && x > min ⇒ x > w"
were not correctly handling unsigned domain, by always checking signed
limits.
This bug was uncovered once we taught prove that len(x) is always
>= 0 in the signed domain.
In the code being miscompiled (s[len(s)-1]), prove checks
whether len(s)-1 >= len(s) in the unsigned domain; if it proves
that this is always false, it can remove the bound check.
Notice that len(s)-1 >= len(s) can be true for len(s) = 0 because
of the wrap-around, so this is something prove should not be
able to deduce.
But because of the bug, the gate condition for the fence-post
implication was len(s) > MinInt64 instead of len(s) > 0; that
condition would be good in the signed domain but not in the
unsigned domain. And since in CL105635 we taught prove that
len(s) >= 0, the condition incorrectly triggered
(len(s) >= 0 > MinInt64) and things were going downfall.
Fixes #27251
Fixes #27289
Change-Id: I3dbcb1955ac5a66a0dcbee500f41e8d219409be5
Reviewed-on: https://go-review.googlesource.com/132495
Reviewed-by: Keith Randall <khr@golang.org>
2018-08-31 08:54:38 +00:00
Andrew Bonventre
5ac2476748
cmd/compile: make math/bits.RotateLeft* an intrinsic on amd64
...
Previously, pattern matching was good enough to achieve good performance
for the RotateLeft* functions, but the inlining cost for them was much
too high. Make RotateLeft* intrinsic on amd64 as a stop-gap for now to
reduce inlining costs.
This should be done (or at least looked at) for other architectures
as well.
Updates golang/go#17566
Change-Id: I6a106ff00b6c4e3f490650af3e083ed2be00c819
Reviewed-on: https://go-review.googlesource.com/132435
Run-TryBot: Andrew Bonventre <andybons@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2018-08-30 22:48:28 +00:00
Cherry Zhang
289dce24c1
test: fix nosplit test on 386
...
The 120->124 change in https://go-review.googlesource.com/c/go/+/61511/21/test/nosplit.go#143
looks accidental. Change back to 120.
Change-Id: I1690a8ae2d32756ba05544d2ed1baabfa64e1704
Reviewed-on: https://go-review.googlesource.com/131958
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-08-30 12:06:34 +00:00
Cherry Zhang
54f9c0416a
cmd/compile: count nil check as use in dead auto elim
...
Nil check is special in that it has no use but we must keep it.
Count it as a use of the auto.
Fixes #27278 .
Change-Id: I857c3d0db2ebdca1bc342b4993c0dac5c01e067f
Reviewed-on: https://go-review.googlesource.com/131955
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-08-30 02:19:37 +00:00
Zheng Xu
8f4fd3f34e
build: support frame-pointer for arm64
...
Supporting frame-pointer makes Linux's perf and other profilers much more useful
because it lets them gather a stack trace efficiently on profiling events. Major
changes include:
1. save FP on the word below where RSP is pointing to (proposed by Cherry and Austin)
2. adjust some specific offsets in runtime assembly and wrapper code
3. add support to FP in goroutine scheduler
4. adjust link stack overflow check to take the extra word into account
5. adjust nosplit test cases to enable frame sizes which are 16 bytes aligned
Performance impacts on go1 benchmarks:
Enable frame-pointer (by default)
name old time/op new time/op delta
BinaryTree17-46 5.94s ± 0% 6.00s ± 0% +1.03% (p=0.029 n=4+4)
Fannkuch11-46 2.84s ± 1% 2.77s ± 0% -2.58% (p=0.008 n=5+5)
FmtFprintfEmpty-46 55.0ns ± 1% 58.9ns ± 1% +7.06% (p=0.008 n=5+5)
FmtFprintfString-46 102ns ± 0% 105ns ± 0% +2.94% (p=0.008 n=5+5)
FmtFprintfInt-46 118ns ± 0% 117ns ± 1% -1.19% (p=0.000 n=4+5)
FmtFprintfIntInt-46 181ns ± 0% 182ns ± 1% ~ (p=0.444 n=5+5)
FmtFprintfPrefixedInt-46 215ns ± 1% 214ns ± 0% ~ (p=0.254 n=5+4)
FmtFprintfFloat-46 292ns ± 0% 296ns ± 0% +1.46% (p=0.029 n=4+4)
FmtManyArgs-46 720ns ± 0% 732ns ± 0% +1.72% (p=0.008 n=5+5)
GobDecode-46 9.82ms ± 1% 10.03ms ± 2% +2.10% (p=0.008 n=5+5)
GobEncode-46 8.14ms ± 0% 8.72ms ± 1% +7.14% (p=0.008 n=5+5)
Gzip-46 420ms ± 0% 424ms ± 0% +0.92% (p=0.008 n=5+5)
Gunzip-46 48.2ms ± 0% 48.4ms ± 0% +0.41% (p=0.008 n=5+5)
HTTPClientServer-46 201µs ± 4% 201µs ± 0% ~ (p=0.730 n=5+4)
JSONEncode-46 17.1ms ± 0% 17.7ms ± 1% +3.80% (p=0.008 n=5+5)
JSONDecode-46 88.0ms ± 0% 90.1ms ± 0% +2.42% (p=0.008 n=5+5)
Mandelbrot200-46 5.06ms ± 0% 5.07ms ± 0% ~ (p=0.310 n=5+5)
GoParse-46 5.04ms ± 0% 5.12ms ± 0% +1.53% (p=0.008 n=5+5)
RegexpMatchEasy0_32-46 117ns ± 0% 117ns ± 0% ~ (all equal)
RegexpMatchEasy0_1K-46 332ns ± 0% 329ns ± 0% -0.78% (p=0.008 n=5+5)
RegexpMatchEasy1_32-46 104ns ± 0% 113ns ± 0% +8.65% (p=0.029 n=4+4)
RegexpMatchEasy1_1K-46 563ns ± 0% 569ns ± 0% +1.10% (p=0.008 n=5+5)
RegexpMatchMedium_32-46 167ns ± 2% 177ns ± 1% +5.74% (p=0.008 n=5+5)
RegexpMatchMedium_1K-46 49.5µs ± 0% 53.4µs ± 0% +7.81% (p=0.008 n=5+5)
RegexpMatchHard_32-46 2.56µs ± 1% 2.72µs ± 0% +6.01% (p=0.008 n=5+5)
RegexpMatchHard_1K-46 77.0µs ± 0% 81.8µs ± 0% +6.24% (p=0.016 n=5+4)
Revcomp-46 631ms ± 1% 627ms ± 1% ~ (p=0.095 n=5+5)
Template-46 81.8ms ± 0% 86.3ms ± 0% +5.55% (p=0.008 n=5+5)
TimeParse-46 423ns ± 0% 432ns ± 0% +2.32% (p=0.008 n=5+5)
TimeFormat-46 478ns ± 2% 497ns ± 1% +3.89% (p=0.008 n=5+5)
[Geo mean] 71.6µs 73.3µs +2.45%
name old speed new speed delta
GobDecode-46 78.1MB/s ± 1% 76.6MB/s ± 2% -2.04% (p=0.008 n=5+5)
GobEncode-46 94.3MB/s ± 0% 88.0MB/s ± 1% -6.67% (p=0.008 n=5+5)
Gzip-46 46.2MB/s ± 0% 45.8MB/s ± 0% -0.91% (p=0.008 n=5+5)
Gunzip-46 403MB/s ± 0% 401MB/s ± 0% -0.41% (p=0.008 n=5+5)
JSONEncode-46 114MB/s ± 0% 109MB/s ± 1% -3.66% (p=0.008 n=5+5)
JSONDecode-46 22.0MB/s ± 0% 21.5MB/s ± 0% -2.35% (p=0.008 n=5+5)
GoParse-46 11.5MB/s ± 0% 11.3MB/s ± 0% -1.51% (p=0.008 n=5+5)
RegexpMatchEasy0_32-46 272MB/s ± 0% 272MB/s ± 1% ~ (p=0.190 n=4+5)
RegexpMatchEasy0_1K-46 3.08GB/s ± 0% 3.11GB/s ± 0% +0.77% (p=0.008 n=5+5)
RegexpMatchEasy1_32-46 306MB/s ± 0% 283MB/s ± 0% -7.63% (p=0.029 n=4+4)
RegexpMatchEasy1_1K-46 1.82GB/s ± 0% 1.80GB/s ± 0% -1.07% (p=0.008 n=5+5)
RegexpMatchMedium_32-46 5.99MB/s ± 0% 5.64MB/s ± 1% -5.77% (p=0.016 n=4+5)
RegexpMatchMedium_1K-46 20.7MB/s ± 0% 19.2MB/s ± 0% -7.25% (p=0.008 n=5+5)
RegexpMatchHard_32-46 12.5MB/s ± 1% 11.8MB/s ± 0% -5.66% (p=0.008 n=5+5)
RegexpMatchHard_1K-46 13.3MB/s ± 0% 12.5MB/s ± 1% -6.01% (p=0.008 n=5+5)
Revcomp-46 402MB/s ± 1% 405MB/s ± 1% ~ (p=0.095 n=5+5)
Template-46 23.7MB/s ± 0% 22.5MB/s ± 0% -5.25% (p=0.008 n=5+5)
[Geo mean] 82.2MB/s 79.6MB/s -3.26%
Disable frame-pointer (GOEXPERIMENT=noframepointer)
name old time/op new time/op delta
BinaryTree17-46 5.94s ± 0% 5.96s ± 0% +0.39% (p=0.029 n=4+4)
Fannkuch11-46 2.84s ± 1% 2.79s ± 1% -1.68% (p=0.008 n=5+5)
FmtFprintfEmpty-46 55.0ns ± 1% 55.2ns ± 3% ~ (p=0.794 n=5+5)
FmtFprintfString-46 102ns ± 0% 103ns ± 0% +0.98% (p=0.016 n=5+4)
FmtFprintfInt-46 118ns ± 0% 115ns ± 0% -2.54% (p=0.029 n=4+4)
FmtFprintfIntInt-46 181ns ± 0% 179ns ± 0% -1.10% (p=0.000 n=5+4)
FmtFprintfPrefixedInt-46 215ns ± 1% 213ns ± 0% ~ (p=0.143 n=5+4)
FmtFprintfFloat-46 292ns ± 0% 300ns ± 0% +2.83% (p=0.029 n=4+4)
FmtManyArgs-46 720ns ± 0% 739ns ± 0% +2.64% (p=0.008 n=5+5)
GobDecode-46 9.82ms ± 1% 9.78ms ± 1% ~ (p=0.151 n=5+5)
GobEncode-46 8.14ms ± 0% 8.12ms ± 1% ~ (p=0.690 n=5+5)
Gzip-46 420ms ± 0% 420ms ± 0% ~ (p=0.548 n=5+5)
Gunzip-46 48.2ms ± 0% 48.0ms ± 0% -0.33% (p=0.032 n=5+5)
HTTPClientServer-46 201µs ± 4% 199µs ± 3% ~ (p=0.548 n=5+5)
JSONEncode-46 17.1ms ± 0% 17.2ms ± 0% ~ (p=0.056 n=5+5)
JSONDecode-46 88.0ms ± 0% 88.6ms ± 0% +0.64% (p=0.008 n=5+5)
Mandelbrot200-46 5.06ms ± 0% 5.07ms ± 0% ~ (p=0.548 n=5+5)
GoParse-46 5.04ms ± 0% 5.07ms ± 0% +0.65% (p=0.008 n=5+5)
RegexpMatchEasy0_32-46 117ns ± 0% 112ns ± 4% -4.27% (p=0.016 n=4+5)
RegexpMatchEasy0_1K-46 332ns ± 0% 330ns ± 1% ~ (p=0.095 n=5+5)
RegexpMatchEasy1_32-46 104ns ± 0% 110ns ± 1% +5.29% (p=0.029 n=4+4)
RegexpMatchEasy1_1K-46 563ns ± 0% 567ns ± 2% ~ (p=0.151 n=5+5)
RegexpMatchMedium_32-46 167ns ± 2% 166ns ± 0% ~ (p=0.333 n=5+4)
RegexpMatchMedium_1K-46 49.5µs ± 0% 49.6µs ± 0% ~ (p=0.841 n=5+5)
RegexpMatchHard_32-46 2.56µs ± 1% 2.49µs ± 0% -2.81% (p=0.008 n=5+5)
RegexpMatchHard_1K-46 77.0µs ± 0% 75.8µs ± 0% -1.55% (p=0.008 n=5+5)
Revcomp-46 631ms ± 1% 628ms ± 0% ~ (p=0.095 n=5+5)
Template-46 81.8ms ± 0% 84.3ms ± 1% +3.05% (p=0.008 n=5+5)
TimeParse-46 423ns ± 0% 425ns ± 0% +0.52% (p=0.008 n=5+5)
TimeFormat-46 478ns ± 2% 478ns ± 1% ~ (p=1.000 n=5+5)
[Geo mean] 71.6µs 71.6µs -0.01%
name old speed new speed delta
GobDecode-46 78.1MB/s ± 1% 78.5MB/s ± 1% ~ (p=0.151 n=5+5)
GobEncode-46 94.3MB/s ± 0% 94.5MB/s ± 1% ~ (p=0.690 n=5+5)
Gzip-46 46.2MB/s ± 0% 46.2MB/s ± 0% ~ (p=0.571 n=5+5)
Gunzip-46 403MB/s ± 0% 404MB/s ± 0% +0.33% (p=0.032 n=5+5)
JSONEncode-46 114MB/s ± 0% 113MB/s ± 0% ~ (p=0.056 n=5+5)
JSONDecode-46 22.0MB/s ± 0% 21.9MB/s ± 0% -0.64% (p=0.008 n=5+5)
GoParse-46 11.5MB/s ± 0% 11.4MB/s ± 0% -0.64% (p=0.008 n=5+5)
RegexpMatchEasy0_32-46 272MB/s ± 0% 285MB/s ± 4% +4.74% (p=0.016 n=4+5)
RegexpMatchEasy0_1K-46 3.08GB/s ± 0% 3.10GB/s ± 1% ~ (p=0.151 n=5+5)
RegexpMatchEasy1_32-46 306MB/s ± 0% 290MB/s ± 1% -5.21% (p=0.029 n=4+4)
RegexpMatchEasy1_1K-46 1.82GB/s ± 0% 1.81GB/s ± 2% ~ (p=0.151 n=5+5)
RegexpMatchMedium_32-46 5.99MB/s ± 0% 6.02MB/s ± 1% ~ (p=0.063 n=4+5)
RegexpMatchMedium_1K-46 20.7MB/s ± 0% 20.7MB/s ± 0% ~ (p=0.659 n=5+5)
RegexpMatchHard_32-46 12.5MB/s ± 1% 12.8MB/s ± 0% +2.88% (p=0.008 n=5+5)
RegexpMatchHard_1K-46 13.3MB/s ± 0% 13.5MB/s ± 0% +1.58% (p=0.008 n=5+5)
Revcomp-46 402MB/s ± 1% 405MB/s ± 0% ~ (p=0.095 n=5+5)
Template-46 23.7MB/s ± 0% 23.0MB/s ± 1% -2.95% (p=0.008 n=5+5)
[Geo mean] 82.2MB/s 82.3MB/s +0.04%
Frame-pointer is enabled on Linux by default but can be disabled by setting: GOEXPERIMENT=noframepointer.
Fixes #10110
Change-Id: I1bfaca6dba29a63009d7c6ab04ed7a1413d9479e
Reviewed-on: https://go-review.googlesource.com/61511
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-08-29 18:28:34 +00:00
Dave Brophy
7c7cecc184
fmt: fix incorrect format of whole-number floats when using %#v
...
This fixes the unwanted behaviour where printing a zero float with the
#v fmt verb outputs "0" - e.g. missing the trailing decimal. This means
that the output would be interpreted as an int rather than a float when
parsed as Go source. After this change the the output is "0.0".
Fixes #26363
Change-Id: Ic5c060522459cd5ce077675d47c848b22ddc34fa
GitHub-Last-Rev: adfb061363
GitHub-Pull-Request: golang/go#26383
Reviewed-on: https://go-review.googlesource.com/123956
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
Reviewed-by: Rob Pike <r@golang.org>
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-08-28 20:15:15 +00:00
Ben Shi
3ca3e89bb6
cmd/compile: optimize arm64 with indexed FP load/store
...
The FP load/store on arm64 have register indexed forms. And this
CL implements this optimization.
1. The total size of pkg/android_arm64 (excluding cmd/compile)
decreases about 400 bytes.
2. There is no regression in the go1 benchmark, the test case
GobEncode even gets slight improvement, excluding noise.
name old time/op new time/op delta
BinaryTree17-4 19.0s ± 0% 19.0s ± 1% ~ (p=0.817 n=29+29)
Fannkuch11-4 9.94s ± 0% 9.95s ± 0% +0.03% (p=0.010 n=24+30)
FmtFprintfEmpty-4 233ns ± 0% 233ns ± 0% ~ (all equal)
FmtFprintfString-4 427ns ± 0% 427ns ± 0% ~ (p=0.649 n=30+30)
FmtFprintfInt-4 471ns ± 0% 471ns ± 0% ~ (all equal)
FmtFprintfIntInt-4 730ns ± 0% 730ns ± 0% ~ (all equal)
FmtFprintfPrefixedInt-4 889ns ± 0% 889ns ± 0% ~ (all equal)
FmtFprintfFloat-4 1.21µs ± 0% 1.21µs ± 0% +0.04% (p=0.012 n=20+30)
FmtManyArgs-4 2.99µs ± 0% 2.99µs ± 0% ~ (p=0.651 n=29+29)
GobDecode-4 42.4ms ± 1% 42.3ms ± 1% -0.27% (p=0.001 n=29+28)
GobEncode-4 37.8ms ±11% 36.0ms ± 0% -4.67% (p=0.000 n=30+26)
Gzip-4 1.98s ± 1% 1.96s ± 1% -1.26% (p=0.000 n=30+30)
Gunzip-4 175ms ± 0% 175ms ± 0% ~ (p=0.988 n=29+29)
HTTPClientServer-4 854µs ± 5% 860µs ± 5% ~ (p=0.236 n=28+29)
JSONEncode-4 88.8ms ± 0% 87.9ms ± 0% -1.00% (p=0.000 n=24+26)
JSONDecode-4 390ms ± 1% 392ms ± 2% +0.48% (p=0.025 n=30+30)
Mandelbrot200-4 19.5ms ± 0% 19.5ms ± 0% ~ (p=0.894 n=24+29)
GoParse-4 20.3ms ± 0% 20.1ms ± 1% -0.94% (p=0.000 n=27+26)
RegexpMatchEasy0_32-4 451ns ± 0% 451ns ± 0% ~ (p=0.578 n=30+30)
RegexpMatchEasy0_1K-4 1.63µs ± 0% 1.63µs ± 0% ~ (p=0.298 n=30+28)
RegexpMatchEasy1_32-4 431ns ± 0% 434ns ± 0% +0.67% (p=0.000 n=30+29)
RegexpMatchEasy1_1K-4 2.60µs ± 0% 2.64µs ± 0% +1.36% (p=0.000 n=28+26)
RegexpMatchMedium_32-4 744ns ± 0% 744ns ± 0% ~ (p=0.474 n=29+29)
RegexpMatchMedium_1K-4 223µs ± 0% 223µs ± 0% -0.08% (p=0.038 n=26+30)
RegexpMatchHard_32-4 12.2µs ± 0% 12.3µs ± 0% +0.27% (p=0.000 n=29+30)
RegexpMatchHard_1K-4 373µs ± 0% 373µs ± 0% ~ (p=0.219 n=29+28)
Revcomp-4 2.84s ± 0% 2.84s ± 0% ~ (p=0.130 n=28+28)
Template-4 394ms ± 1% 392ms ± 1% -0.52% (p=0.001 n=30+30)
TimeParse-4 1.93µs ± 0% 1.93µs ± 0% ~ (p=0.587 n=29+30)
TimeFormat-4 2.00µs ± 0% 2.00µs ± 0% +0.07% (p=0.001 n=28+27)
[Geo mean] 306µs 305µs -0.17%
name old speed new speed delta
GobDecode-4 18.1MB/s ± 1% 18.2MB/s ± 1% +0.27% (p=0.001 n=29+28)
GobEncode-4 20.3MB/s ±10% 21.3MB/s ± 0% +4.64% (p=0.000 n=30+26)
Gzip-4 9.79MB/s ± 1% 9.91MB/s ± 1% +1.28% (p=0.000 n=30+30)
Gunzip-4 111MB/s ± 0% 111MB/s ± 0% ~ (p=0.988 n=29+29)
JSONEncode-4 21.8MB/s ± 0% 22.1MB/s ± 0% +1.02% (p=0.000 n=24+26)
JSONDecode-4 4.97MB/s ± 1% 4.95MB/s ± 2% -0.45% (p=0.031 n=30+30)
GoParse-4 2.85MB/s ± 1% 2.88MB/s ± 1% +1.03% (p=0.000 n=30+26)
RegexpMatchEasy0_32-4 70.9MB/s ± 0% 70.9MB/s ± 0% ~ (p=0.904 n=29+28)
RegexpMatchEasy0_1K-4 627MB/s ± 0% 627MB/s ± 0% ~ (p=0.156 n=30+30)
RegexpMatchEasy1_32-4 74.2MB/s ± 0% 73.7MB/s ± 0% -0.67% (p=0.000 n=30+29)
RegexpMatchEasy1_1K-4 393MB/s ± 0% 388MB/s ± 0% -1.34% (p=0.000 n=28+26)
RegexpMatchMedium_32-4 1.34MB/s ± 0% 1.34MB/s ± 0% ~ (all equal)
RegexpMatchMedium_1K-4 4.59MB/s ± 0% 4.59MB/s ± 0% +0.07% (p=0.035 n=25+30)
RegexpMatchHard_32-4 2.61MB/s ± 0% 2.61MB/s ± 0% -0.11% (p=0.002 n=28+30)
RegexpMatchHard_1K-4 2.75MB/s ± 0% 2.75MB/s ± 0% +0.15% (p=0.001 n=30+24)
Revcomp-4 89.4MB/s ± 0% 89.4MB/s ± 0% ~ (p=0.140 n=28+28)
Template-4 4.93MB/s ± 1% 4.95MB/s ± 1% +0.51% (p=0.001 n=30+30)
[Geo mean] 18.4MB/s 18.4MB/s +0.37%
Change-Id: I9a6b521a971b21cfb51064e8e9b853cef8a1d071
Reviewed-on: https://go-review.googlesource.com/124636
Run-TryBot: Ben Shi <powerman1st@163.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-08-28 02:37:18 +00:00
Ben Shi
2b69ad0b3c
cmd/compile: optimize arm's comparison
...
The CMP/CMN/TST/TEQ perform similar to SUB/ADD/AND/XOR except
the result is abondoned, and only NZCV flags are affected.
This CL implements further optimization with them.
1. A micro benchmark test gets more than 9% improvment.
TSTTEQ-4 6.99ms ± 0% 6.35ms ± 0% -9.15% (p=0.000 n=33+36)
(https://github.com/benshi001/ugo1/blob/master/tstteq2_test.go )
2. The go1 benckmark shows no regression, excluding noise.
name old time/op new time/op delta
BinaryTree17-4 25.7s ± 1% 25.7s ± 1% ~ (p=0.830 n=40+40)
Fannkuch11-4 13.3s ± 0% 13.2s ± 0% -0.65% (p=0.000 n=40+34)
FmtFprintfEmpty-4 394ns ± 0% 394ns ± 0% ~ (p=0.819 n=40+40)
FmtFprintfString-4 677ns ± 0% 677ns ± 0% +0.06% (p=0.039 n=39+40)
FmtFprintfInt-4 707ns ± 0% 706ns ± 0% -0.14% (p=0.000 n=40+39)
FmtFprintfIntInt-4 1.04µs ± 0% 1.04µs ± 0% +0.10% (p=0.000 n=29+31)
FmtFprintfPrefixedInt-4 1.10µs ± 0% 1.11µs ± 0% +0.65% (p=0.000 n=39+37)
FmtFprintfFloat-4 2.27µs ± 0% 2.26µs ± 0% -0.53% (p=0.000 n=39+40)
FmtManyArgs-4 3.96µs ± 0% 3.96µs ± 0% +0.10% (p=0.000 n=39+40)
GobDecode-4 53.4ms ± 1% 52.8ms ± 2% -1.10% (p=0.000 n=39+39)
GobEncode-4 50.3ms ± 3% 50.4ms ± 2% ~ (p=0.089 n=40+39)
Gzip-4 2.62s ± 0% 2.64s ± 0% +0.60% (p=0.000 n=40+39)
Gunzip-4 312ms ± 0% 312ms ± 0% +0.02% (p=0.030 n=40+39)
HTTPClientServer-4 1.01ms ± 7% 0.98ms ± 7% -2.37% (p=0.000 n=40+39)
JSONEncode-4 126ms ± 1% 126ms ± 1% -0.38% (p=0.004 n=39+39)
JSONDecode-4 423ms ± 0% 426ms ± 2% +0.72% (p=0.001 n=39+40)
Mandelbrot200-4 18.4ms ± 0% 18.4ms ± 0% +0.04% (p=0.000 n=38+40)
GoParse-4 22.8ms ± 0% 22.6ms ± 0% -0.68% (p=0.000 n=35+40)
RegexpMatchEasy0_32-4 699ns ± 0% 704ns ± 0% +0.73% (p=0.000 n=27+40)
RegexpMatchEasy0_1K-4 4.27µs ± 0% 4.26µs ± 0% -0.09% (p=0.000 n=35+38)
RegexpMatchEasy1_32-4 741ns ± 0% 735ns ± 0% -0.85% (p=0.000 n=40+35)
RegexpMatchEasy1_1K-4 5.53µs ± 0% 5.49µs ± 0% -0.69% (p=0.000 n=39+40)
RegexpMatchMedium_32-4 1.07µs ± 0% 1.04µs ± 2% -2.34% (p=0.000 n=40+40)
RegexpMatchMedium_1K-4 261µs ± 0% 261µs ± 0% -0.16% (p=0.000 n=40+39)
RegexpMatchHard_32-4 14.9µs ± 0% 14.9µs ± 0% -0.18% (p=0.000 n=39+40)
RegexpMatchHard_1K-4 445µs ± 0% 446µs ± 0% +0.09% (p=0.000 n=36+34)
Revcomp-4 41.8ms ± 1% 41.8ms ± 1% ~ (p=0.595 n=39+38)
Template-4 530ms ± 1% 528ms ± 1% -0.49% (p=0.000 n=40+40)
TimeParse-4 3.39µs ± 0% 3.42µs ± 0% +0.98% (p=0.000 n=36+38)
TimeFormat-4 6.12µs ± 0% 6.07µs ± 0% -0.81% (p=0.000 n=34+38)
[Geo mean] 384µs 383µs -0.24%
name old speed new speed delta
GobDecode-4 14.4MB/s ± 1% 14.5MB/s ± 2% +1.11% (p=0.000 n=39+39)
GobEncode-4 15.3MB/s ± 3% 15.2MB/s ± 2% ~ (p=0.104 n=40+39)
Gzip-4 7.40MB/s ± 1% 7.36MB/s ± 0% -0.60% (p=0.000 n=40+39)
Gunzip-4 62.2MB/s ± 0% 62.1MB/s ± 0% -0.02% (p=0.047 n=40+39)
JSONEncode-4 15.4MB/s ± 1% 15.4MB/s ± 2% +0.39% (p=0.002 n=39+39)
JSONDecode-4 4.59MB/s ± 0% 4.56MB/s ± 2% -0.71% (p=0.000 n=39+40)
GoParse-4 2.54MB/s ± 0% 2.56MB/s ± 0% +0.72% (p=0.000 n=26+40)
RegexpMatchEasy0_32-4 45.8MB/s ± 0% 45.4MB/s ± 0% -0.75% (p=0.000 n=38+40)
RegexpMatchEasy0_1K-4 240MB/s ± 0% 240MB/s ± 0% +0.09% (p=0.000 n=35+38)
RegexpMatchEasy1_32-4 43.1MB/s ± 0% 43.5MB/s ± 0% +0.84% (p=0.000 n=40+39)
RegexpMatchEasy1_1K-4 185MB/s ± 0% 186MB/s ± 0% +0.69% (p=0.000 n=39+40)
RegexpMatchMedium_32-4 936kB/s ± 1% 959kB/s ± 2% +2.38% (p=0.000 n=40+40)
RegexpMatchMedium_1K-4 3.92MB/s ± 0% 3.93MB/s ± 0% +0.18% (p=0.000 n=39+40)
RegexpMatchHard_32-4 2.15MB/s ± 0% 2.15MB/s ± 0% +0.19% (p=0.000 n=40+40)
RegexpMatchHard_1K-4 2.30MB/s ± 0% 2.30MB/s ± 0% ~ (all equal)
Revcomp-4 60.8MB/s ± 1% 60.8MB/s ± 1% ~ (p=0.600 n=39+38)
Template-4 3.66MB/s ± 1% 3.68MB/s ± 1% +0.46% (p=0.000 n=40+40)
[Geo mean] 12.8MB/s 12.8MB/s +0.27%
Change-Id: I849161169ecf0876a04b7c1d3990fa8d1435215e
Reviewed-on: https://go-review.googlesource.com/122855
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-08-27 15:49:46 +00:00
Alberto Donizetti
42cc4ca30a
cmd/compile: prevent overflow in walkinrange
...
In the compiler frontend, walkinrange indiscriminately calls Int64()
on const CTINT nodes, even though Int64's return value is undefined
for anything over 2⁶³ (in practise, it'll return a negative number).
This causes the introduction of bad constants during rewrites of
unsigned expressions, which make the compiler reject valid Go
programs.
This change introduces a preliminary check that Int64() is safe to
call on the consts on hand. If it isn't, walkinrange exits without
doing any rewrite.
Fixes #27143
Change-Id: I2017073cae65468a521ff3262d4ea8ab0d7098d9
Reviewed-on: https://go-review.googlesource.com/130735
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2018-08-26 21:52:27 +00:00
Ben Shi
e03220a594
cmd/compile: optimize 386 code with FLDPI
...
FLDPI pushes the constant pi to 387's register stack, which is
more efficient than MOVSSconst/MOVSDconst.
1. This optimization reduces 0.3KB of the total size of pkg/linux_386
(exlcuding cmd/compile).
2. There is little regression in the go1 benchmark.
name old time/op new time/op delta
BinaryTree17-4 3.30s ± 3% 3.30s ± 2% ~ (p=0.759 n=40+39)
Fannkuch11-4 3.53s ± 1% 3.54s ± 1% ~ (p=0.168 n=40+40)
FmtFprintfEmpty-4 45.5ns ± 3% 45.6ns ± 3% ~ (p=0.553 n=40+40)
FmtFprintfString-4 78.4ns ± 3% 78.3ns ± 3% ~ (p=0.593 n=40+40)
FmtFprintfInt-4 88.8ns ± 2% 89.9ns ± 2% ~ (p=0.083 n=40+33)
FmtFprintfIntInt-4 140ns ± 4% 140ns ± 4% ~ (p=0.656 n=40+40)
FmtFprintfPrefixedInt-4 180ns ± 2% 181ns ± 3% +0.53% (p=0.050 n=40+40)
FmtFprintfFloat-4 408ns ± 4% 411ns ± 3% ~ (p=0.112 n=40+40)
FmtManyArgs-4 599ns ± 3% 602ns ± 3% ~ (p=0.784 n=40+40)
GobDecode-4 7.24ms ± 6% 7.30ms ± 5% ~ (p=0.171 n=40+40)
GobEncode-4 6.98ms ± 5% 6.89ms ± 8% ~ (p=0.107 n=40+40)
Gzip-4 396ms ± 4% 396ms ± 3% ~ (p=0.852 n=40+40)
Gunzip-4 41.3ms ± 3% 41.5ms ± 4% ~ (p=0.221 n=40+40)
HTTPClientServer-4 63.4µs ± 3% 63.4µs ± 2% ~ (p=0.895 n=39+40)
JSONEncode-4 17.5ms ± 2% 17.5ms ± 3% ~ (p=0.090 n=40+40)
JSONDecode-4 60.6ms ± 3% 60.1ms ± 4% ~ (p=0.184 n=40+40)
Mandelbrot200-4 7.80ms ± 3% 7.78ms ± 2% ~ (p=0.512 n=40+40)
GoParse-4 3.30ms ± 3% 3.28ms ± 2% -0.61% (p=0.034 n=40+40)
RegexpMatchEasy0_32-4 104ns ± 4% 103ns ± 4% ~ (p=0.118 n=40+40)
RegexpMatchEasy0_1K-4 850ns ± 2% 848ns ± 2% ~ (p=0.370 n=40+40)
RegexpMatchEasy1_32-4 112ns ± 4% 112ns ± 4% ~ (p=0.848 n=40+40)
RegexpMatchEasy1_1K-4 1.04µs ± 4% 1.03µs ± 4% ~ (p=0.333 n=40+40)
RegexpMatchMedium_32-4 132ns ± 4% 131ns ± 3% ~ (p=0.527 n=40+40)
RegexpMatchMedium_1K-4 43.4µs ± 3% 43.5µs ± 3% ~ (p=0.111 n=40+40)
RegexpMatchHard_32-4 2.24µs ± 4% 2.24µs ± 4% ~ (p=0.441 n=40+40)
RegexpMatchHard_1K-4 67.9µs ± 3% 68.0µs ± 3% ~ (p=0.095 n=40+40)
Revcomp-4 1.84s ± 2% 1.84s ± 2% ~ (p=0.677 n=40+40)
Template-4 68.4ms ± 3% 68.6ms ± 3% ~ (p=0.345 n=40+40)
TimeParse-4 433ns ± 3% 433ns ± 3% ~ (p=0.403 n=40+40)
TimeFormat-4 407ns ± 3% 406ns ± 3% ~ (p=0.900 n=40+40)
[Geo mean] 67.1µs 67.2µs +0.04%
name old speed new speed delta
GobDecode-4 106MB/s ± 5% 105MB/s ± 5% ~ (p=0.173 n=40+40)
GobEncode-4 110MB/s ± 5% 112MB/s ± 9% ~ (p=0.104 n=40+40)
Gzip-4 49.0MB/s ± 4% 49.1MB/s ± 4% ~ (p=0.836 n=40+40)
Gunzip-4 471MB/s ± 3% 468MB/s ± 4% ~ (p=0.218 n=40+40)
JSONEncode-4 111MB/s ± 2% 111MB/s ± 3% ~ (p=0.090 n=40+40)
JSONDecode-4 32.0MB/s ± 3% 32.3MB/s ± 4% ~ (p=0.194 n=40+40)
GoParse-4 17.6MB/s ± 3% 17.7MB/s ± 2% +0.62% (p=0.035 n=40+40)
RegexpMatchEasy0_32-4 307MB/s ± 4% 309MB/s ± 4% +0.70% (p=0.041 n=40+40)
RegexpMatchEasy0_1K-4 1.20GB/s ± 3% 1.21GB/s ± 2% ~ (p=0.353 n=40+40)
RegexpMatchEasy1_32-4 285MB/s ± 3% 284MB/s ± 4% ~ (p=0.384 n=40+40)
RegexpMatchEasy1_1K-4 988MB/s ± 4% 992MB/s ± 3% ~ (p=0.335 n=40+40)
RegexpMatchMedium_32-4 7.56MB/s ± 4% 7.57MB/s ± 4% ~ (p=0.314 n=40+40)
RegexpMatchMedium_1K-4 23.6MB/s ± 3% 23.6MB/s ± 3% ~ (p=0.107 n=40+40)
RegexpMatchHard_32-4 14.3MB/s ± 4% 14.3MB/s ± 4% ~ (p=0.429 n=40+40)
RegexpMatchHard_1K-4 15.1MB/s ± 3% 15.1MB/s ± 3% ~ (p=0.099 n=40+40)
Revcomp-4 138MB/s ± 2% 138MB/s ± 2% ~ (p=0.658 n=40+40)
Template-4 28.4MB/s ± 3% 28.3MB/s ± 3% ~ (p=0.331 n=40+40)
[Geo mean] 80.8MB/s 80.8MB/s +0.09%
Change-Id: I0cb715eead68ade097a302e7fb80ccbd1d1b511e
Reviewed-on: https://go-review.googlesource.com/130975
Run-TryBot: Ben Shi <powerman1st@163.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-08-25 02:39:49 +00:00
Ben Shi
3bc34385fa
cmd/compile: introduce more read-modify-write operations for amd64
...
Add suport of read-modify-write for AND/SUB/AND/OR/XOR on amd64.
1. The total size of pkg/linux_amd64 decreases about 4KB, excluding
cmd/compile.
2. The go1 benchmark shows a little improvement, excluding noise.
name old time/op new time/op delta
BinaryTree17-4 2.63s ± 3% 2.65s ± 4% +1.01% (p=0.037 n=35+35)
Fannkuch11-4 2.33s ± 2% 2.39s ± 2% +2.49% (p=0.000 n=35+35)
FmtFprintfEmpty-4 45.4ns ± 5% 40.8ns ± 6% -10.09% (p=0.000 n=35+35)
FmtFprintfString-4 73.3ns ± 4% 70.9ns ± 3% -3.23% (p=0.000 n=30+35)
FmtFprintfInt-4 79.9ns ± 4% 79.5ns ± 3% ~ (p=0.736 n=34+35)
FmtFprintfIntInt-4 126ns ± 4% 125ns ± 4% ~ (p=0.083 n=35+35)
FmtFprintfPrefixedInt-4 152ns ± 6% 152ns ± 3% ~ (p=0.855 n=34+35)
FmtFprintfFloat-4 215ns ± 4% 213ns ± 4% ~ (p=0.066 n=35+35)
FmtManyArgs-4 522ns ± 3% 506ns ± 3% -3.15% (p=0.000 n=35+35)
GobDecode-4 6.45ms ± 8% 6.51ms ± 7% +0.96% (p=0.026 n=35+35)
GobEncode-4 6.10ms ± 6% 6.02ms ± 8% ~ (p=0.160 n=35+35)
Gzip-4 228ms ± 3% 221ms ± 3% -2.92% (p=0.000 n=35+35)
Gunzip-4 37.5ms ± 4% 37.2ms ± 3% -0.78% (p=0.036 n=35+35)
HTTPClientServer-4 58.7µs ± 2% 59.2µs ± 1% +0.80% (p=0.000 n=33+33)
JSONEncode-4 12.0ms ± 3% 12.2ms ± 3% +1.84% (p=0.008 n=35+35)
JSONDecode-4 57.0ms ± 4% 56.6ms ± 3% ~ (p=0.320 n=35+35)
Mandelbrot200-4 3.82ms ± 3% 3.79ms ± 3% ~ (p=0.074 n=35+35)
GoParse-4 3.21ms ± 5% 3.24ms ± 4% ~ (p=0.119 n=35+35)
RegexpMatchEasy0_32-4 76.3ns ± 4% 75.4ns ± 4% -1.14% (p=0.014 n=34+33)
RegexpMatchEasy0_1K-4 251ns ± 4% 254ns ± 3% +1.28% (p=0.016 n=35+35)
RegexpMatchEasy1_32-4 69.6ns ± 3% 70.1ns ± 3% +0.82% (p=0.005 n=35+35)
RegexpMatchEasy1_1K-4 367ns ± 4% 376ns ± 4% +2.47% (p=0.000 n=35+35)
RegexpMatchMedium_32-4 108ns ± 5% 104ns ± 4% -3.18% (p=0.000 n=35+35)
RegexpMatchMedium_1K-4 33.8µs ± 3% 32.7µs ± 3% -3.27% (p=0.000 n=35+35)
RegexpMatchHard_32-4 1.55µs ± 3% 1.52µs ± 3% -1.64% (p=0.000 n=35+35)
RegexpMatchHard_1K-4 46.6µs ± 3% 46.6µs ± 4% ~ (p=0.149 n=35+35)
Revcomp-4 416ms ± 7% 412ms ± 6% -0.95% (p=0.033 n=33+35)
Template-4 64.3ms ± 3% 62.4ms ± 7% -2.94% (p=0.000 n=35+35)
TimeParse-4 320ns ± 2% 322ns ± 3% ~ (p=0.589 n=35+35)
TimeFormat-4 300ns ± 3% 300ns ± 3% ~ (p=0.597 n=35+35)
[Geo mean] 47.4µs 47.0µs -0.86%
name old speed new speed delta
GobDecode-4 119MB/s ± 7% 118MB/s ± 7% -0.96% (p=0.027 n=35+35)
GobEncode-4 126MB/s ± 7% 127MB/s ± 6% ~ (p=0.157 n=34+34)
Gzip-4 85.3MB/s ± 3% 87.9MB/s ± 3% +3.02% (p=0.000 n=35+35)
Gunzip-4 518MB/s ± 4% 522MB/s ± 3% +0.79% (p=0.037 n=35+35)
JSONEncode-4 162MB/s ± 3% 159MB/s ± 3% -1.81% (p=0.009 n=35+35)
JSONDecode-4 34.1MB/s ± 4% 34.3MB/s ± 3% ~ (p=0.318 n=35+35)
GoParse-4 18.0MB/s ± 5% 17.9MB/s ± 4% ~ (p=0.117 n=35+35)
RegexpMatchEasy0_32-4 419MB/s ± 3% 425MB/s ± 4% +1.46% (p=0.003 n=32+33)
RegexpMatchEasy0_1K-4 4.07GB/s ± 4% 4.02GB/s ± 3% -1.28% (p=0.014 n=35+35)
RegexpMatchEasy1_32-4 460MB/s ± 3% 456MB/s ± 4% -0.82% (p=0.004 n=35+35)
RegexpMatchEasy1_1K-4 2.79GB/s ± 4% 2.72GB/s ± 4% -2.39% (p=0.000 n=35+35)
RegexpMatchMedium_32-4 9.23MB/s ± 4% 9.53MB/s ± 4% +3.16% (p=0.000 n=35+35)
RegexpMatchMedium_1K-4 30.3MB/s ± 3% 31.3MB/s ± 3% +3.38% (p=0.000 n=35+35)
RegexpMatchHard_32-4 20.7MB/s ± 3% 21.0MB/s ± 3% +1.67% (p=0.000 n=35+35)
RegexpMatchHard_1K-4 22.0MB/s ± 3% 21.9MB/s ± 4% ~ (p=0.277 n=35+33)
Revcomp-4 612MB/s ± 7% 618MB/s ± 6% +0.96% (p=0.034 n=33+35)
Template-4 30.2MB/s ± 3% 31.1MB/s ± 6% +3.05% (p=0.000 n=35+35)
[Geo mean] 123MB/s 124MB/s +0.64%
Change-Id: Ia025da272e07d0069413824bfff3471b106d6280
Reviewed-on: https://go-review.googlesource.com/121535
Run-TryBot: Ben Shi <powerman1st@163.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>
Reviewed-by: Keith Randall <khr@golang.org>
2018-08-24 23:38:25 +00:00
Kazuhiro Sera
ad644d2e86
all: fix typos detected by github.com/client9/misspell
...
Change-Id: Iadb3c5de8ae9ea45855013997ed70f7929a88661
GitHub-Last-Rev: ae85bcf82b
GitHub-Pull-Request: golang/go#26920
Reviewed-on: https://go-review.googlesource.com/128955
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-08-23 15:54:07 +00:00
Yury Smolsky
1484270aec
test: restore tests for the reject unsafe code option
...
Tests in test/safe were neglected after moving to the run.go
framework. This change restores them.
These tests are skipped for go/types via -+ option.
Fixes #25668
Change-Id: I8fe26574a76fa7afa8664c467d7c2e6334f1bba9
Reviewed-on: https://go-review.googlesource.com/124660
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-08-22 17:54:09 +00:00
Yury Smolsky
02fecd33f6
test: remove errchk, the perl script
...
gc tests do not depend on errchk.
Fixes #25669
Change-Id: I99eb87bb9677897b9167d4fc9a6321fa66cd9116
Reviewed-on: https://go-review.googlesource.com/115955
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-08-22 16:28:36 +00:00
Ben Shi
a0a7e9fc0c
cmd/compile: implement "OPC $imm, (mem)" for 386
...
New read-modify-write operations are introduced in this CL for 386.
1. The total size of pkg/linux_386 decreases about 10KB (excluding
cmd/compile).
2. The go1 benchmark shows little regression.
name old time/op new time/op delta
BinaryTree17-4 3.32s ± 4% 3.29s ± 2% ~ (p=0.059 n=30+30)
Fannkuch11-4 3.49s ± 1% 3.46s ± 1% -0.92% (p=0.001 n=30+30)
FmtFprintfEmpty-4 47.7ns ± 2% 46.8ns ± 5% -1.93% (p=0.011 n=25+30)
FmtFprintfString-4 79.5ns ± 7% 80.2ns ± 3% +0.89% (p=0.001 n=28+29)
FmtFprintfInt-4 90.5ns ± 2% 92.1ns ± 2% +1.82% (p=0.014 n=22+30)
FmtFprintfIntInt-4 141ns ± 1% 144ns ± 3% +2.23% (p=0.013 n=22+30)
FmtFprintfPrefixedInt-4 183ns ± 2% 184ns ± 3% ~ (p=0.080 n=21+30)
FmtFprintfFloat-4 409ns ± 3% 412ns ± 3% +0.83% (p=0.040 n=30+30)
FmtManyArgs-4 597ns ± 6% 607ns ± 4% +1.71% (p=0.006 n=30+30)
GobDecode-4 7.21ms ± 5% 7.18ms ± 6% ~ (p=0.665 n=30+30)
GobEncode-4 7.17ms ± 6% 7.09ms ± 7% ~ (p=0.117 n=29+30)
Gzip-4 413ms ± 4% 399ms ± 4% -3.48% (p=0.000 n=30+30)
Gunzip-4 41.3ms ± 4% 41.7ms ± 3% +1.05% (p=0.011 n=30+30)
HTTPClientServer-4 63.5µs ± 3% 62.9µs ± 2% -0.97% (p=0.017 n=30+27)
JSONEncode-4 20.3ms ± 5% 20.1ms ± 5% -1.16% (p=0.004 n=30+30)
JSONDecode-4 66.2ms ± 4% 67.7ms ± 4% +2.21% (p=0.000 n=30+30)
Mandelbrot200-4 5.16ms ± 3% 5.18ms ± 3% ~ (p=0.123 n=30+30)
GoParse-4 3.23ms ± 2% 3.27ms ± 2% +1.08% (p=0.006 n=30+30)
RegexpMatchEasy0_32-4 98.9ns ± 5% 97.1ns ± 4% -1.83% (p=0.006 n=30+30)
RegexpMatchEasy0_1K-4 842ns ± 3% 842ns ± 3% ~ (p=0.550 n=30+30)
RegexpMatchEasy1_32-4 107ns ± 4% 105ns ± 4% -1.93% (p=0.012 n=30+30)
RegexpMatchEasy1_1K-4 1.03µs ± 4% 1.04µs ± 4% ~ (p=0.304 n=30+30)
RegexpMatchMedium_32-4 132ns ± 2% 129ns ± 4% -2.02% (p=0.000 n=21+30)
RegexpMatchMedium_1K-4 44.1µs ± 4% 43.8µs ± 3% ~ (p=0.641 n=30+30)
RegexpMatchHard_32-4 2.26µs ± 4% 2.23µs ± 4% -1.28% (p=0.023 n=30+30)
RegexpMatchHard_1K-4 68.1µs ± 3% 68.6µs ± 4% ~ (p=0.089 n=30+30)
Revcomp-4 1.85s ± 2% 1.84s ± 2% ~ (p=0.072 n=30+30)
Template-4 69.2ms ± 3% 68.5ms ± 3% -1.04% (p=0.012 n=30+30)
TimeParse-4 441ns ± 3% 446ns ± 4% +1.21% (p=0.001 n=30+30)
TimeFormat-4 415ns ± 3% 415ns ± 3% ~ (p=0.436 n=30+30)
[Geo mean] 67.0µs 66.9µs -0.17%
name old speed new speed delta
GobDecode-4 107MB/s ± 5% 107MB/s ± 6% ~ (p=0.663 n=30+30)
GobEncode-4 107MB/s ± 6% 108MB/s ± 7% ~ (p=0.117 n=29+30)
Gzip-4 47.0MB/s ± 4% 48.7MB/s ± 4% +3.61% (p=0.000 n=30+30)
Gunzip-4 470MB/s ± 4% 466MB/s ± 4% -1.05% (p=0.011 n=30+30)
JSONEncode-4 95.6MB/s ± 5% 96.7MB/s ± 5% +1.16% (p=0.005 n=30+30)
JSONDecode-4 29.3MB/s ± 4% 28.7MB/s ± 4% -2.17% (p=0.000 n=30+30)
GoParse-4 17.9MB/s ± 2% 17.7MB/s ± 2% -1.06% (p=0.007 n=30+30)
RegexpMatchEasy0_32-4 323MB/s ± 5% 329MB/s ± 4% +1.93% (p=0.006 n=30+30)
RegexpMatchEasy0_1K-4 1.22GB/s ± 3% 1.22GB/s ± 3% ~ (p=0.496 n=30+30)
RegexpMatchEasy1_32-4 298MB/s ± 4% 303MB/s ± 4% +1.84% (p=0.017 n=30+30)
RegexpMatchEasy1_1K-4 995MB/s ± 4% 989MB/s ± 4% ~ (p=0.307 n=30+30)
RegexpMatchMedium_32-4 7.56MB/s ± 4% 7.74MB/s ± 4% +2.46% (p=0.000 n=22+30)
RegexpMatchMedium_1K-4 23.2MB/s ± 4% 23.4MB/s ± 3% ~ (p=0.651 n=30+30)
RegexpMatchHard_32-4 14.2MB/s ± 4% 14.3MB/s ± 4% +1.29% (p=0.021 n=30+30)
RegexpMatchHard_1K-4 15.0MB/s ± 3% 14.9MB/s ± 4% ~ (p=0.069 n=30+29)
Revcomp-4 138MB/s ± 2% 138MB/s ± 2% ~ (p=0.072 n=30+30)
Template-4 28.1MB/s ± 3% 28.4MB/s ± 3% +1.05% (p=0.012 n=30+30)
[Geo mean] 79.7MB/s 80.2MB/s +0.60%
Change-Id: I44a1dfc942c9a385904553c4fe1fa8e509c8aa31
Reviewed-on: https://go-review.googlesource.com/120916
Run-TryBot: Ben Shi <powerman1st@163.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-08-22 04:12:42 +00:00
Ben Shi
90f2fa0037
cmd/compile: optimize 386 code with MULLload/DIVSSload/DIVSDload
...
IMULL/DIVSS/DIVSD all can take the source operand from memory
directly. And this CL implement that optimization.
1. The total size of pkg/linux_386 decreases about 84KB (excluding
cmd/compile).
2. The go1 benchmark shows little regression in total (excluding noise).
name old time/op new time/op delta
BinaryTree17-4 3.29s ± 2% 3.27s ± 4% ~ (p=0.192 n=30+30)
Fannkuch11-4 3.49s ± 2% 3.54s ± 1% +1.48% (p=0.000 n=30+30)
FmtFprintfEmpty-4 45.9ns ± 3% 46.3ns ± 4% +0.89% (p=0.037 n=30+30)
FmtFprintfString-4 78.8ns ± 3% 78.7ns ± 4% ~ (p=0.209 n=30+27)
FmtFprintfInt-4 91.0ns ± 2% 90.3ns ± 2% -0.82% (p=0.031 n=30+27)
FmtFprintfIntInt-4 142ns ± 4% 143ns ± 4% ~ (p=0.136 n=30+30)
FmtFprintfPrefixedInt-4 181ns ± 3% 183ns ± 4% +1.40% (p=0.005 n=30+30)
FmtFprintfFloat-4 404ns ± 4% 408ns ± 3% ~ (p=0.397 n=30+30)
FmtManyArgs-4 601ns ± 3% 609ns ± 5% ~ (p=0.059 n=30+30)
GobDecode-4 7.21ms ± 5% 7.24ms ± 5% ~ (p=0.612 n=30+30)
GobEncode-4 6.91ms ± 6% 6.91ms ± 6% ~ (p=0.797 n=30+30)
Gzip-4 398ms ± 6% 399ms ± 4% ~ (p=0.173 n=30+30)
Gunzip-4 41.7ms ± 3% 41.8ms ± 3% ~ (p=0.423 n=30+30)
HTTPClientServer-4 62.3µs ± 2% 62.7µs ± 3% ~ (p=0.085 n=29+30)
JSONEncode-4 21.0ms ± 4% 20.7ms ± 5% -1.39% (p=0.014 n=30+30)
JSONDecode-4 66.3ms ± 3% 67.4ms ± 1% +1.71% (p=0.003 n=30+24)
Mandelbrot200-4 5.15ms ± 3% 5.16ms ± 3% ~ (p=0.697 n=30+30)
GoParse-4 3.24ms ± 3% 3.27ms ± 4% +0.91% (p=0.032 n=30+30)
RegexpMatchEasy0_32-4 101ns ± 5% 99ns ± 4% -1.82% (p=0.008 n=29+30)
RegexpMatchEasy0_1K-4 848ns ± 4% 841ns ± 2% -0.77% (p=0.043 n=30+30)
RegexpMatchEasy1_32-4 106ns ± 6% 106ns ± 3% ~ (p=0.939 n=29+30)
RegexpMatchEasy1_1K-4 1.02µs ± 3% 1.03µs ± 4% ~ (p=0.297 n=28+30)
RegexpMatchMedium_32-4 129ns ± 4% 127ns ± 4% ~ (p=0.073 n=30+30)
RegexpMatchMedium_1K-4 43.9µs ± 3% 43.8µs ± 3% ~ (p=0.186 n=30+30)
RegexpMatchHard_32-4 2.24µs ± 4% 2.22µs ± 4% ~ (p=0.332 n=30+29)
RegexpMatchHard_1K-4 68.0µs ± 4% 67.5µs ± 3% ~ (p=0.290 n=30+30)
Revcomp-4 1.85s ± 3% 1.85s ± 3% ~ (p=0.358 n=30+30)
Template-4 69.6ms ± 3% 70.0ms ± 4% ~ (p=0.273 n=30+30)
TimeParse-4 445ns ± 3% 441ns ± 3% ~ (p=0.494 n=30+30)
TimeFormat-4 412ns ± 3% 412ns ± 6% ~ (p=0.841 n=30+30)
[Geo mean] 66.7µs 66.8µs +0.13%
name old speed new speed delta
GobDecode-4 107MB/s ± 5% 106MB/s ± 5% ~ (p=0.615 n=30+30)
GobEncode-4 111MB/s ± 6% 111MB/s ± 6% ~ (p=0.790 n=30+30)
Gzip-4 48.8MB/s ± 6% 48.7MB/s ± 4% ~ (p=0.167 n=30+30)
Gunzip-4 465MB/s ± 3% 465MB/s ± 3% ~ (p=0.420 n=30+30)
JSONEncode-4 92.4MB/s ± 4% 93.7MB/s ± 5% +1.42% (p=0.015 n=30+30)
JSONDecode-4 29.3MB/s ± 3% 28.8MB/s ± 1% -1.72% (p=0.003 n=30+24)
GoParse-4 17.9MB/s ± 3% 17.7MB/s ± 4% -0.89% (p=0.037 n=30+30)
RegexpMatchEasy0_32-4 317MB/s ± 8% 324MB/s ± 4% +2.14% (p=0.006 n=30+30)
RegexpMatchEasy0_1K-4 1.21GB/s ± 4% 1.22GB/s ± 2% +0.77% (p=0.036 n=30+30)
RegexpMatchEasy1_32-4 298MB/s ± 7% 299MB/s ± 4% ~ (p=0.511 n=30+30)
RegexpMatchEasy1_1K-4 1.00GB/s ± 3% 1.00GB/s ± 4% ~ (p=0.304 n=28+30)
RegexpMatchMedium_32-4 7.75MB/s ± 4% 7.82MB/s ± 4% ~ (p=0.089 n=30+30)
RegexpMatchMedium_1K-4 23.3MB/s ± 3% 23.4MB/s ± 3% ~ (p=0.181 n=30+30)
RegexpMatchHard_32-4 14.3MB/s ± 4% 14.4MB/s ± 4% ~ (p=0.320 n=30+29)
RegexpMatchHard_1K-4 15.1MB/s ± 4% 15.2MB/s ± 3% ~ (p=0.273 n=30+30)
Revcomp-4 137MB/s ± 3% 137MB/s ± 3% ~ (p=0.352 n=30+30)
Template-4 27.9MB/s ± 3% 27.7MB/s ± 4% ~ (p=0.277 n=30+30)
[Geo mean] 79.9MB/s 80.1MB/s +0.15%
Change-Id: I97333cd8ddabb3c7c88ca5aa9e14a005b74d306d
Reviewed-on: https://go-review.googlesource.com/120695
Run-TryBot: Ben Shi <powerman1st@163.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-08-22 03:38:38 +00:00
Ben Shi
705f3c74e6
cmd/compile: optimize AMD64 with DIVSSload and DIVSDload
...
DIVSSload & DIVSDload directly operate on a memory operand. And
binary size can be reduced by them, while the performance is
not affected.
The total size of pkg/linux_amd64 (excluding cmd/compile) decreases
about 6KB.
There is little regression in the go1 benchmark test (excluding noise).
name old time/op new time/op delta
BinaryTree17-4 2.63s ± 4% 2.62s ± 4% ~ (p=0.809 n=30+30)
Fannkuch11-4 2.40s ± 2% 2.40s ± 2% ~ (p=0.109 n=30+30)
FmtFprintfEmpty-4 43.1ns ± 4% 43.2ns ± 9% ~ (p=0.168 n=30+30)
FmtFprintfString-4 73.6ns ± 4% 74.1ns ± 4% ~ (p=0.069 n=30+30)
FmtFprintfInt-4 81.0ns ± 3% 81.4ns ± 5% ~ (p=0.350 n=30+30)
FmtFprintfIntInt-4 127ns ± 4% 129ns ± 4% +0.99% (p=0.021 n=30+30)
FmtFprintfPrefixedInt-4 156ns ± 4% 155ns ± 4% ~ (p=0.415 n=30+30)
FmtFprintfFloat-4 219ns ± 4% 218ns ± 4% ~ (p=0.071 n=30+30)
FmtManyArgs-4 522ns ± 3% 518ns ± 3% -0.68% (p=0.034 n=30+30)
GobDecode-4 6.49ms ± 6% 6.52ms ± 6% ~ (p=0.832 n=30+30)
GobEncode-4 6.10ms ± 9% 6.14ms ± 7% ~ (p=0.485 n=30+30)
Gzip-4 227ms ± 1% 224ms ± 4% ~ (p=0.484 n=24+30)
Gunzip-4 37.2ms ± 3% 36.8ms ± 4% ~ (p=0.889 n=30+30)
HTTPClientServer-4 58.9µs ± 1% 58.7µs ± 2% -0.42% (p=0.003 n=28+28)
JSONEncode-4 12.0ms ± 3% 12.0ms ± 4% ~ (p=0.523 n=30+30)
JSONDecode-4 54.6ms ± 4% 54.5ms ± 4% ~ (p=0.708 n=30+30)
Mandelbrot200-4 3.78ms ± 4% 3.81ms ± 3% +0.99% (p=0.016 n=30+30)
GoParse-4 3.20ms ± 4% 3.20ms ± 5% ~ (p=0.994 n=30+30)
RegexpMatchEasy0_32-4 77.0ns ± 4% 75.9ns ± 3% -1.39% (p=0.006 n=29+30)
RegexpMatchEasy0_1K-4 255ns ± 4% 253ns ± 4% ~ (p=0.091 n=30+30)
RegexpMatchEasy1_32-4 69.7ns ± 3% 70.3ns ± 4% ~ (p=0.120 n=30+30)
RegexpMatchEasy1_1K-4 373ns ± 2% 378ns ± 3% +1.43% (p=0.000 n=21+26)
RegexpMatchMedium_32-4 107ns ± 2% 108ns ± 4% +1.50% (p=0.012 n=22+30)
RegexpMatchMedium_1K-4 34.0µs ± 1% 34.3µs ± 3% +1.08% (p=0.008 n=24+30)
RegexpMatchHard_32-4 1.53µs ± 3% 1.54µs ± 3% ~ (p=0.234 n=30+30)
RegexpMatchHard_1K-4 46.7µs ± 4% 47.0µs ± 4% ~ (p=0.420 n=30+30)
Revcomp-4 411ms ± 7% 415ms ± 6% ~ (p=0.059 n=30+30)
Template-4 65.5ms ± 5% 66.9ms ± 4% +2.21% (p=0.001 n=30+30)
TimeParse-4 317ns ± 3% 311ns ± 3% -1.97% (p=0.000 n=30+30)
TimeFormat-4 293ns ± 3% 294ns ± 3% ~ (p=0.243 n=30+30)
[Geo mean] 47.4µs 47.5µs +0.17%
name old speed new speed delta
GobDecode-4 118MB/s ± 5% 118MB/s ± 6% ~ (p=0.832 n=30+30)
GobEncode-4 125MB/s ± 7% 125MB/s ± 7% ~ (p=0.625 n=29+30)
Gzip-4 85.3MB/s ± 1% 86.6MB/s ± 4% ~ (p=0.486 n=24+30)
Gunzip-4 522MB/s ± 3% 527MB/s ± 4% ~ (p=0.889 n=30+30)
JSONEncode-4 162MB/s ± 3% 162MB/s ± 4% ~ (p=0.520 n=30+30)
JSONDecode-4 35.5MB/s ± 4% 35.6MB/s ± 4% ~ (p=0.701 n=30+30)
GoParse-4 18.1MB/s ± 4% 18.1MB/s ± 4% ~ (p=0.891 n=29+30)
RegexpMatchEasy0_32-4 416MB/s ± 4% 422MB/s ± 3% +1.43% (p=0.005 n=29+30)
RegexpMatchEasy0_1K-4 4.01GB/s ± 4% 4.04GB/s ± 4% ~ (p=0.091 n=30+30)
RegexpMatchEasy1_32-4 460MB/s ± 3% 456MB/s ± 5% ~ (p=0.123 n=30+30)
RegexpMatchEasy1_1K-4 2.74GB/s ± 2% 2.70GB/s ± 3% -1.33% (p=0.000 n=22+26)
RegexpMatchMedium_32-4 9.39MB/s ± 3% 9.19MB/s ± 4% -2.06% (p=0.001 n=28+30)
RegexpMatchMedium_1K-4 30.1MB/s ± 1% 29.8MB/s ± 3% -1.04% (p=0.008 n=24+30)
RegexpMatchHard_32-4 20.9MB/s ± 3% 20.8MB/s ± 3% ~ (p=0.234 n=30+30)
RegexpMatchHard_1K-4 21.9MB/s ± 4% 21.8MB/s ± 4% ~ (p=0.420 n=30+30)
Revcomp-4 619MB/s ± 7% 612MB/s ± 7% ~ (p=0.059 n=30+30)
Template-4 29.6MB/s ± 4% 29.0MB/s ± 4% -2.16% (p=0.002 n=30+30)
[Geo mean] 123MB/s 123MB/s -0.33%
Change-Id: Ia59e077feae4f2824df79059daea4d0f678e3e4c
Reviewed-on: https://go-review.googlesource.com/120275
Run-TryBot: Ben Shi <powerman1st@163.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>
2018-08-22 02:47:49 +00:00
Ian Lance Taylor
7e64377903
cmd/compile: only support -race and -msan where they work
...
Consolidate decision about whether -race and -msan options are
supported in cmd/internal/sys. Use consolidated functions in
cmd/compile and cmd/go. Use a copy of them in cmd/dist; cmd/dist can't
import cmd/internal/sys because Go 1.4 doesn't have it.
Fixes #24315
Change-Id: I9cecaed4895eb1a2a49379b4848db40de66d32a9
Reviewed-on: https://go-review.googlesource.com/121816
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-08-21 03:38:27 +00:00
Ilya Tocar
a3593685cf
cmd/compile/internal/ssa: remove useless zero extension
...
We generate MOVBLZX for byte-sized LoadReg, so
(MOVBQZX (LoadReg (Arg))) is the same as
(LoadReg (Arg)). Remove those zero extension where possible.
Triggers several times during all.bash.
Fixes #25378
Updates #15300
Change-Id: If50656e66f217832a13ee8f49c47997f4fcc093a
Reviewed-on: https://go-review.googlesource.com/115617
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-08-20 21:38:20 +00:00
Ilya Tocar
c292b32f33
cmd/compile: enable disjoint memmove inlining on amd64
...
Memmove can use AVX/prefetches/other optional instructions, so
only do it for small sizes, when call overhead dominates.
Change-Id: Ice5e93deb11462217f7fb5fc350b703109bb4090
Reviewed-on: https://go-review.googlesource.com/112517
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Munday <mike.munday@ibm.com>
2018-08-20 21:10:12 +00:00
Ben Shi
0a382e0b7f
cmd/compile: optimize ARMv7 code
...
"AND $0xffff0000, Rx" will be encoded to 12 bytes.
1. MOVWload from the constant pool to Rtmp
2. AND Rtmp, Rx
3. a 4-byte item in the constant pool
It can be simplified to 8 bytes on ARMv7, since ARMv7 has
"MOVW $imm-16, Rx".
1. MOVW $0xffff, Rtmp
2. BIC Rtmp, Rx
The above optimization also applies to BICconst, ADDconst and
SUBconst.
1. The total size of pkg/android_arm (excluding cmd/compile)
decreases about 2KB.
2. The go1 benchmark shows no regression, exlcuding noise.
name old time/op new time/op delta
BinaryTree17-4 25.5s ± 1% 25.2s ± 1% -0.85% (p=0.000 n=30+30)
Fannkuch11-4 13.3s ± 0% 13.3s ± 0% +0.16% (p=0.000 n=24+25)
FmtFprintfEmpty-4 397ns ± 0% 394ns ± 0% -0.64% (p=0.000 n=30+30)
FmtFprintfString-4 679ns ± 0% 678ns ± 0% ~ (p=0.093 n=30+29)
FmtFprintfInt-4 708ns ± 0% 707ns ± 0% -0.19% (p=0.000 n=27+28)
FmtFprintfIntInt-4 1.05µs ± 0% 1.05µs ± 0% -0.07% (p=0.001 n=18+30)
FmtFprintfPrefixedInt-4 1.16µs ± 0% 1.15µs ± 0% -0.41% (p=0.000 n=29+30)
FmtFprintfFloat-4 2.26µs ± 0% 2.23µs ± 1% -1.40% (p=0.000 n=30+30)
FmtManyArgs-4 3.96µs ± 0% 3.95µs ± 0% -0.29% (p=0.000 n=29+30)
GobDecode-4 52.9ms ± 2% 53.4ms ± 2% +0.92% (p=0.004 n=28+30)
GobEncode-4 49.7ms ± 2% 49.8ms ± 2% ~ (p=0.890 n=30+26)
Gzip-4 2.61s ± 0% 2.60s ± 0% -0.36% (p=0.000 n=29+29)
Gunzip-4 312ms ± 0% 311ms ± 0% -0.13% (p=0.000 n=30+28)
HTTPClientServer-4 1.02ms ± 8% 1.00ms ± 7% ~ (p=0.224 n=29+26)
JSONEncode-4 125ms ± 1% 124ms ± 3% -1.05% (p=0.000 n=25+30)
JSONDecode-4 432ms ± 1% 436ms ± 2% ~ (p=0.277 n=26+30)
Mandelbrot200-4 18.4ms ± 0% 18.4ms ± 0% +0.02% (p=0.001 n=28+25)
GoParse-4 22.4ms ± 1% 22.3ms ± 1% -0.41% (p=0.000 n=28+28)
RegexpMatchEasy0_32-4 697ns ± 0% 706ns ± 0% +1.23% (p=0.000 n=19+30)
RegexpMatchEasy0_1K-4 4.27µs ± 0% 4.26µs ± 0% -0.06% (p=0.000 n=30+30)
RegexpMatchEasy1_32-4 741ns ± 0% 735ns ± 0% -0.86% (p=0.000 n=26+30)
RegexpMatchEasy1_1K-4 5.49µs ± 0% 5.49µs ± 0% -0.03% (p=0.023 n=25+30)
RegexpMatchMedium_32-4 1.05µs ± 2% 1.04µs ± 2% ~ (p=0.893 n=30+30)
RegexpMatchMedium_1K-4 261µs ± 0% 261µs ± 0% -0.11% (p=0.000 n=29+30)
RegexpMatchHard_32-4 14.9µs ± 0% 14.9µs ± 0% -0.36% (p=0.000 n=23+29)
RegexpMatchHard_1K-4 446µs ± 0% 445µs ± 0% -0.17% (p=0.000 n=30+29)
Revcomp-4 41.6ms ± 1% 41.7ms ± 1% +0.27% (p=0.040 n=28+30)
Template-4 531ms ± 0% 532ms ± 1% ~ (p=0.059 n=30+30)
TimeParse-4 3.40µs ± 0% 3.33µs ± 0% -2.02% (p=0.000 n=30+30)
TimeFormat-4 6.14µs ± 0% 6.11µs ± 0% -0.45% (p=0.000 n=27+29)
[Geo mean] 384µs 383µs -0.27%
name old speed new speed delta
GobDecode-4 14.5MB/s ± 2% 14.4MB/s ± 2% -0.90% (p=0.005 n=28+30)
GobEncode-4 15.4MB/s ± 2% 15.4MB/s ± 2% ~ (p=0.741 n=30+25)
Gzip-4 7.44MB/s ± 0% 7.47MB/s ± 1% +0.37% (p=0.000 n=25+30)
Gunzip-4 62.3MB/s ± 0% 62.4MB/s ± 0% +0.13% (p=0.000 n=30+28)
JSONEncode-4 15.5MB/s ± 1% 15.6MB/s ± 3% +1.07% (p=0.000 n=25+30)
JSONDecode-4 4.48MB/s ± 0% 4.46MB/s ± 2% ~ (p=0.655 n=23+30)
GoParse-4 2.58MB/s ± 1% 2.59MB/s ± 1% +0.42% (p=0.000 n=28+29)
RegexpMatchEasy0_32-4 45.9MB/s ± 0% 45.3MB/s ± 0% -1.23% (p=0.000 n=28+30)
RegexpMatchEasy0_1K-4 240MB/s ± 0% 240MB/s ± 0% +0.07% (p=0.000 n=30+30)
RegexpMatchEasy1_32-4 43.2MB/s ± 0% 43.5MB/s ± 0% +0.85% (p=0.000 n=30+28)
RegexpMatchEasy1_1K-4 186MB/s ± 0% 186MB/s ± 0% +0.03% (p=0.026 n=25+30)
RegexpMatchMedium_32-4 955kB/s ± 2% 960kB/s ± 2% ~ (p=0.084 n=30+30)
RegexpMatchMedium_1K-4 3.92MB/s ± 0% 3.93MB/s ± 0% +0.14% (p=0.000 n=29+30)
RegexpMatchHard_32-4 2.14MB/s ± 0% 2.15MB/s ± 0% +0.31% (p=0.000 n=30+26)
RegexpMatchHard_1K-4 2.30MB/s ± 0% 2.30MB/s ± 0% ~ (all equal)
Revcomp-4 61.1MB/s ± 1% 60.9MB/s ± 1% -0.27% (p=0.039 n=28+30)
Template-4 3.66MB/s ± 0% 3.65MB/s ± 1% -0.14% (p=0.045 n=30+30)
[Geo mean] 12.8MB/s 12.8MB/s +0.04%
Change-Id: I02370e2584b4c041fddd324c97628fd6f0c12183
Reviewed-on: https://go-review.googlesource.com/123179
Run-TryBot: Ben Shi <powerman1st@163.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-08-20 15:13:23 +00:00
Ben Shi
556971316f
cmd/compile: optimize 386's comparison
...
CMPL/CMPW/CMPB can take a memory operand on 386, and this CL
implements that optimization.
1. The total size of pkg/linux_386 decreases about 45KB, excluding
cmd/compile.
2. The go1 benchmark shows a little improvement.
name old time/op new time/op delta
BinaryTree17-4 3.36s ± 2% 3.37s ± 3% ~ (p=0.537 n=40+40)
Fannkuch11-4 3.59s ± 1% 3.53s ± 2% -1.58% (p=0.000 n=40+40)
FmtFprintfEmpty-4 46.0ns ± 3% 45.8ns ± 3% ~ (p=0.249 n=40+40)
FmtFprintfString-4 80.0ns ± 4% 78.8ns ± 3% -1.49% (p=0.001 n=40+40)
FmtFprintfInt-4 89.7ns ± 2% 90.3ns ± 2% +0.74% (p=0.003 n=40+40)
FmtFprintfIntInt-4 144ns ± 3% 143ns ± 3% -0.95% (p=0.003 n=40+40)
FmtFprintfPrefixedInt-4 181ns ± 4% 180ns ± 2% ~ (p=0.103 n=40+40)
FmtFprintfFloat-4 412ns ± 3% 408ns ± 4% -0.97% (p=0.018 n=40+40)
FmtManyArgs-4 607ns ± 4% 605ns ± 4% ~ (p=0.148 n=40+40)
GobDecode-4 7.19ms ± 4% 7.24ms ± 5% ~ (p=0.340 n=40+40)
GobEncode-4 7.04ms ± 9% 6.99ms ± 9% ~ (p=0.289 n=40+40)
Gzip-4 400ms ± 6% 398ms ± 5% ~ (p=0.168 n=40+40)
Gunzip-4 41.2ms ± 3% 41.7ms ± 3% +1.40% (p=0.001 n=40+40)
HTTPClientServer-4 62.5µs ± 1% 62.1µs ± 2% -0.61% (p=0.000 n=37+37)
JSONEncode-4 20.7ms ± 4% 20.4ms ± 3% -1.60% (p=0.000 n=40+40)
JSONDecode-4 69.4ms ± 4% 69.2ms ± 6% ~ (p=0.177 n=40+40)
Mandelbrot200-4 5.22ms ± 6% 5.21ms ± 3% ~ (p=0.531 n=40+40)
GoParse-4 3.29ms ± 3% 3.28ms ± 3% ~ (p=0.321 n=40+39)
RegexpMatchEasy0_32-4 104ns ± 4% 103ns ± 7% -0.89% (p=0.040 n=40+40)
RegexpMatchEasy0_1K-4 852ns ± 3% 853ns ± 2% ~ (p=0.357 n=40+40)
RegexpMatchEasy1_32-4 113ns ± 8% 113ns ± 3% ~ (p=0.906 n=40+40)
RegexpMatchEasy1_1K-4 1.03µs ± 4% 1.03µs ± 5% ~ (p=0.326 n=40+40)
RegexpMatchMedium_32-4 136ns ± 3% 133ns ± 3% -2.31% (p=0.000 n=40+40)
RegexpMatchMedium_1K-4 44.0µs ± 3% 43.7µs ± 3% ~ (p=0.053 n=40+40)
RegexpMatchHard_32-4 2.27µs ± 3% 2.26µs ± 4% ~ (p=0.391 n=40+40)
RegexpMatchHard_1K-4 68.0µs ± 3% 68.9µs ± 3% +1.28% (p=0.000 n=40+40)
Revcomp-4 1.86s ± 5% 1.86s ± 2% ~ (p=0.950 n=40+40)
Template-4 73.4ms ± 4% 69.9ms ± 7% -4.78% (p=0.000 n=40+40)
TimeParse-4 449ns ± 4% 441ns ± 5% -1.76% (p=0.000 n=40+40)
TimeFormat-4 416ns ± 3% 417ns ± 4% ~ (p=0.304 n=40+40)
[Geo mean] 67.7µs 67.3µs -0.55%
name old speed new speed delta
GobDecode-4 107MB/s ± 4% 106MB/s ± 5% ~ (p=0.336 n=40+40)
GobEncode-4 109MB/s ± 5% 110MB/s ± 9% ~ (p=0.142 n=38+40)
Gzip-4 48.5MB/s ± 5% 48.8MB/s ± 5% ~ (p=0.172 n=40+40)
Gunzip-4 472MB/s ± 3% 465MB/s ± 3% -1.39% (p=0.001 n=40+40)
JSONEncode-4 93.6MB/s ± 4% 95.1MB/s ± 3% +1.61% (p=0.000 n=40+40)
JSONDecode-4 28.0MB/s ± 3% 28.1MB/s ± 6% ~ (p=0.181 n=40+40)
GoParse-4 17.6MB/s ± 3% 17.7MB/s ± 3% ~ (p=0.350 n=40+39)
RegexpMatchEasy0_32-4 308MB/s ± 4% 311MB/s ± 6% +0.96% (p=0.025 n=40+40)
RegexpMatchEasy0_1K-4 1.20GB/s ± 3% 1.20GB/s ± 2% ~ (p=0.317 n=40+40)
RegexpMatchEasy1_32-4 282MB/s ± 7% 282MB/s ± 3% ~ (p=0.516 n=40+40)
RegexpMatchEasy1_1K-4 994MB/s ± 4% 991MB/s ± 5% ~ (p=0.319 n=40+40)
RegexpMatchMedium_32-4 7.31MB/s ± 3% 7.49MB/s ± 3% +2.46% (p=0.000 n=40+40)
RegexpMatchMedium_1K-4 23.3MB/s ± 3% 23.4MB/s ± 3% ~ (p=0.052 n=40+40)
RegexpMatchHard_32-4 14.1MB/s ± 3% 14.1MB/s ± 4% ~ (p=0.391 n=40+40)
RegexpMatchHard_1K-4 15.1MB/s ± 3% 14.9MB/s ± 3% -1.27% (p=0.000 n=40+40)
Revcomp-4 137MB/s ± 5% 137MB/s ± 2% ~ (p=0.942 n=40+40)
Template-4 26.5MB/s ± 4% 27.8MB/s ± 7% +5.03% (p=0.000 n=40+40)
[Geo mean] 78.6MB/s 79.0MB/s +0.57%
Change-Id: Idcacc6881ef57cd7dc33aa87b711282842b72a53
Reviewed-on: https://go-review.googlesource.com/126618
Run-TryBot: Ben Shi <powerman1st@163.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-08-20 14:23:22 +00:00