Commit graph

44 commits

Author SHA1 Message Date
Hugues Bruant
ec091b6af2 runtime: add mapassign_fast*
Add benchmarks for map assignment with int32/int64/string key

Benchmark results on darwin/amd64

name                  old time/op  new time/op  delta
MapAssignInt32_255-8  24.7ns ± 3%  17.4ns ± 2%  -29.75%  (p=0.000 n=10+10)
MapAssignInt32_64k-8  45.5ns ± 4%  37.6ns ± 4%  -17.18%  (p=0.000 n=10+10)
MapAssignInt64_255-8  26.0ns ± 3%  17.9ns ± 4%  -31.03%  (p=0.000 n=10+10)
MapAssignInt64_64k-8  46.9ns ± 5%  38.7ns ± 2%  -17.53%  (p=0.000 n=9+10)
MapAssignStr_255-8    47.8ns ± 3%  24.8ns ± 4%  -48.01%  (p=0.000 n=10+10)
MapAssignStr_64k-8    83.0ns ± 3%  51.9ns ± 3%  -37.45%  (p=0.000 n=10+9)

name                     old time/op    new time/op    delta
BinaryTree17-8              3.11s ±19%     2.78s ± 3%    ~     (p=0.095 n=5+5)
Fannkuch11-8                3.26s ± 1%     3.21s ± 2%    ~     (p=0.056 n=5+5)
FmtFprintfEmpty-8          50.3ns ± 1%    50.8ns ± 2%    ~     (p=0.246 n=5+5)
FmtFprintfString-8         82.7ns ± 4%    80.1ns ± 5%    ~     (p=0.238 n=5+5)
FmtFprintfInt-8            82.6ns ± 2%    81.9ns ± 3%    ~     (p=0.508 n=5+5)
FmtFprintfIntInt-8          124ns ± 4%     121ns ± 3%    ~     (p=0.111 n=5+5)
FmtFprintfPrefixedInt-8     158ns ± 6%     160ns ± 2%    ~     (p=0.341 n=5+5)
FmtFprintfFloat-8           249ns ± 2%     245ns ± 2%    ~     (p=0.095 n=5+5)
FmtManyArgs-8               513ns ± 2%     519ns ± 3%    ~     (p=0.151 n=5+5)
GobDecode-8                7.48ms ±12%    7.11ms ± 2%    ~     (p=0.222 n=5+5)
GobEncode-8                6.25ms ± 1%    6.03ms ± 2%  -3.56%  (p=0.008 n=5+5)
Gzip-8                      252ms ± 4%     252ms ± 4%    ~     (p=1.000 n=5+5)
Gunzip-8                   38.4ms ± 3%    38.6ms ± 2%    ~     (p=0.690 n=5+5)
HTTPClientServer-8         76.9µs ±41%    66.4µs ± 6%    ~     (p=0.310 n=5+5)
JSONEncode-8               16.5ms ± 3%    16.7ms ± 3%    ~     (p=0.421 n=5+5)
JSONDecode-8               54.6ms ± 1%    54.3ms ± 2%    ~     (p=0.548 n=5+5)
Mandelbrot200-8            4.45ms ± 3%    4.47ms ± 1%    ~     (p=0.841 n=5+5)
GoParse-8                  3.43ms ± 1%    3.32ms ± 2%  -3.28%  (p=0.008 n=5+5)
RegexpMatchEasy0_32-8      88.2ns ± 3%    89.4ns ± 2%    ~     (p=0.333 n=5+5)
RegexpMatchEasy0_1K-8       205ns ± 1%     206ns ± 1%    ~     (p=0.905 n=5+5)
RegexpMatchEasy1_32-8      85.1ns ± 1%    85.5ns ± 5%    ~     (p=0.690 n=5+5)
RegexpMatchEasy1_1K-8       365ns ± 1%     371ns ± 9%    ~     (p=1.000 n=5+5)
RegexpMatchMedium_32-8      129ns ± 2%     128ns ± 3%    ~     (p=0.730 n=5+5)
RegexpMatchMedium_1K-8     39.8µs ± 0%    39.7µs ± 4%    ~     (p=0.730 n=4+5)
RegexpMatchHard_32-8       1.99µs ± 3%    2.05µs ±16%    ~     (p=0.794 n=5+5)
RegexpMatchHard_1K-8       59.3µs ± 1%    60.3µs ± 7%    ~     (p=1.000 n=5+5)
Revcomp-8                   1.36s ±63%     0.52s ± 5%    ~     (p=0.095 n=5+5)
Template-8                 62.6ms ±14%    60.5ms ± 5%    ~     (p=0.690 n=5+5)
TimeParse-8                 330ns ± 2%     324ns ± 2%    ~     (p=0.087 n=5+5)
TimeFormat-8                350ns ± 3%     340ns ± 1%  -2.86%  (p=0.008 n=5+5)

name                     old speed      new speed      delta
GobDecode-8               103MB/s ±11%   108MB/s ± 2%    ~     (p=0.222 n=5+5)
GobEncode-8               123MB/s ± 1%   127MB/s ± 2%  +3.71%  (p=0.008 n=5+5)
Gzip-8                   77.1MB/s ± 4%  76.9MB/s ± 3%    ~     (p=1.000 n=5+5)
Gunzip-8                  505MB/s ± 3%   503MB/s ± 2%    ~     (p=0.690 n=5+5)
JSONEncode-8              118MB/s ± 3%   116MB/s ± 3%    ~     (p=0.421 n=5+5)
JSONDecode-8             35.5MB/s ± 1%  35.8MB/s ± 2%    ~     (p=0.397 n=5+5)
GoParse-8                16.9MB/s ± 1%  17.4MB/s ± 2%  +3.45%  (p=0.008 n=5+5)
RegexpMatchEasy0_32-8     363MB/s ± 3%   358MB/s ± 2%    ~     (p=0.421 n=5+5)
RegexpMatchEasy0_1K-8    4.98GB/s ± 1%  4.97GB/s ± 1%    ~     (p=0.548 n=5+5)
RegexpMatchEasy1_32-8     376MB/s ± 1%   375MB/s ± 5%    ~     (p=0.690 n=5+5)
RegexpMatchEasy1_1K-8    2.80GB/s ± 1%  2.76GB/s ± 9%    ~     (p=0.841 n=5+5)
RegexpMatchMedium_32-8   7.73MB/s ± 1%  7.76MB/s ± 3%    ~     (p=0.730 n=5+5)
RegexpMatchMedium_1K-8   25.8MB/s ± 0%  25.8MB/s ± 4%    ~     (p=0.651 n=4+5)
RegexpMatchHard_32-8     16.1MB/s ± 3%  15.7MB/s ±14%    ~     (p=0.794 n=5+5)
RegexpMatchHard_1K-8     17.3MB/s ± 1%  17.0MB/s ± 7%    ~     (p=0.984 n=5+5)
Revcomp-8                 273MB/s ±83%   488MB/s ± 5%    ~     (p=0.095 n=5+5)
Template-8               31.1MB/s ±13%  32.1MB/s ± 5%    ~     (p=0.690 n=5+5)

Updates #19495

Change-Id: I116e9a2a4594769318b22d736464de8a98499909
Reviewed-on: https://go-review.googlesource.com/38091
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-03-13 23:43:16 +00:00
Matthew Dempsky
c310c688ff cmd/compile, runtime: simplify multiway select implementation
This commit reworks multiway select statements to use normal control
flow primitives instead of the previous setjmp/longjmp-like behavior.
This simplifies liveness analysis and should prevent issues around
"returns twice" function calls within SSA passes.

test/live.go is updated because liveness analysis's CFG is more
representative of actual control flow. The case bodies are the only
real successors of the selectgo call, but previously the selectsend,
selectrecv, etc. calls were included in the successors list too.

Updates #19331.

Change-Id: I7f879b103a4b85e62fc36a270d812f54c0aa3e83
Reviewed-on: https://go-review.googlesource.com/37661
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2017-03-07 20:14:17 +00:00
Josh Bleecher Snyder
504bc3ed24 cmd/compile, runtime: specialize convT2x, don't alloc for zero vals
Prior to this CL, all runtime conversions
from a concrete value to an interface went
through one of two runtime calls: convT2E or convT2I.
However, in practice, basic types are very common.
Specializing convT2x for those basic types allows
for a more efficient implementation for those types.
For basic scalars and strings, allocation and copying
can use the same methods as normal code.
For pointer-free types, allocation can occur without
zeroing, and copying can take place without GC calls.
For slices, copying is cheaper and simpler.

This CL adds twelve runtime routines:

convT2E16, convT2I16
convT2E32, convT2I32
convT2E64, convT2I64
convT2Estring, convT2Istring
convT2Eslice, convT2Islice
convT2Enoptr, convT2Inoptr

While compiling make.bash, 93% of all convT2x calls
are now to one of these specialized convT2x call.

Within specialized convT2x routines, it is cheap to check
for a zero value, in a way that it is not in general.
When we detect a zero value there, we return a pointer
to zeroVal, rather than allocating.

name                         old time/op  new time/op  delta
ConvT2Ezero/zero/16-8        17.9ns ± 2%   3.0ns ± 3%  -83.20%  (p=0.000 n=56+56)
ConvT2Ezero/zero/32-8        17.8ns ± 2%   3.0ns ± 3%  -83.15%  (p=0.000 n=59+60)
ConvT2Ezero/zero/64-8        20.1ns ± 1%   3.0ns ± 2%  -84.98%  (p=0.000 n=57+57)
ConvT2Ezero/zero/str-8       32.6ns ± 1%   3.0ns ± 4%  -90.70%  (p=0.000 n=59+60)
ConvT2Ezero/zero/slice-8     36.7ns ± 2%   3.0ns ± 2%  -91.78%  (p=0.000 n=59+59)
ConvT2Ezero/zero/big-8       91.9ns ± 2%  85.9ns ± 2%   -6.52%  (p=0.000 n=57+57)
ConvT2Ezero/nonzero/16-8     17.7ns ± 2%  12.7ns ± 3%  -28.38%  (p=0.000 n=55+60)
ConvT2Ezero/nonzero/32-8     17.8ns ± 1%  12.7ns ± 1%  -28.44%  (p=0.000 n=54+57)
ConvT2Ezero/nonzero/64-8     20.0ns ± 1%  15.0ns ± 1%  -24.90%  (p=0.000 n=56+58)
ConvT2Ezero/nonzero/str-8    32.6ns ± 1%  25.7ns ± 1%  -21.17%  (p=0.000 n=58+55)
ConvT2Ezero/nonzero/slice-8  36.8ns ± 2%  30.4ns ± 1%  -17.32%  (p=0.000 n=60+52)
ConvT2Ezero/nonzero/big-8    92.1ns ± 2%  85.9ns ± 2%   -6.70%  (p=0.000 n=57+59)

Benchmarks on a real program (the compiler):

name       old time/op      new time/op      delta
Template        227ms ± 5%       221ms ± 2%  -2.48%  (p=0.000 n=30+26)
Unicode         102ms ± 5%       100ms ± 3%  -1.30%  (p=0.009 n=30+26)
GoTypes         656ms ± 5%       659ms ± 4%    ~     (p=0.208 n=30+30)
Compiler        2.82s ± 2%       2.82s ± 1%    ~     (p=0.614 n=29+27)
Flate           128ms ± 2%       128ms ± 5%    ~     (p=0.783 n=27+28)
GoParser        158ms ± 3%       158ms ± 3%    ~     (p=0.261 n=28+30)
Reflect         408ms ± 7%       401ms ± 3%    ~     (p=0.075 n=30+30)
Tar             123ms ± 6%       121ms ± 8%    ~     (p=0.287 n=29+30)
XML             220ms ± 2%       220ms ± 4%    ~     (p=0.805 n=29+29)

name       old user-ns/op   new user-ns/op   delta
Template   281user-ms ± 4%  279user-ms ± 3%  -0.87%  (p=0.044 n=28+28)
Unicode    142user-ms ± 4%  141user-ms ± 3%  -1.04%  (p=0.015 n=30+27)
GoTypes    884user-ms ± 3%  886user-ms ± 2%    ~     (p=0.532 n=30+30)
Compiler   3.94user-s ± 3%  3.92user-s ± 1%    ~     (p=0.185 n=30+28)
Flate      165user-ms ± 2%  165user-ms ± 4%    ~     (p=0.780 n=27+29)
GoParser   209user-ms ± 2%  208user-ms ± 3%    ~     (p=0.453 n=28+30)
Reflect    533user-ms ± 6%  526user-ms ± 3%    ~     (p=0.057 n=30+30)
Tar        156user-ms ± 6%  154user-ms ± 6%    ~     (p=0.133 n=29+30)
XML        288user-ms ± 4%  288user-ms ± 4%    ~     (p=0.633 n=30+30)

name       old alloc/op     new alloc/op     delta
Template       41.0MB ± 0%      40.9MB ± 0%  -0.11%  (p=0.000 n=29+29)
Unicode        32.6MB ± 0%      32.6MB ± 0%    ~     (p=0.572 n=29+30)
GoTypes         122MB ± 0%       122MB ± 0%  -0.10%  (p=0.000 n=30+30)
Compiler        482MB ± 0%       481MB ± 0%  -0.07%  (p=0.000 n=30+29)
Flate          26.6MB ± 0%      26.6MB ± 0%    ~     (p=0.096 n=30+30)
GoParser       32.7MB ± 0%      32.6MB ± 0%  -0.06%  (p=0.011 n=28+28)
Reflect        84.2MB ± 0%      84.1MB ± 0%  -0.17%  (p=0.000 n=29+30)
Tar            27.7MB ± 0%      27.7MB ± 0%  -0.05%  (p=0.032 n=27+28)
XML            44.7MB ± 0%      44.7MB ± 0%    ~     (p=0.131 n=28+30)

name       old allocs/op    new allocs/op    delta
Template         373k ± 1%        370k ± 1%  -0.76%  (p=0.000 n=30+30)
Unicode          325k ± 1%        325k ± 1%    ~     (p=0.383 n=29+30)
GoTypes         1.16M ± 0%       1.15M ± 0%  -0.75%  (p=0.000 n=29+30)
Compiler        4.15M ± 0%       4.13M ± 0%  -0.59%  (p=0.000 n=30+29)
Flate            238k ± 1%        237k ± 1%  -0.62%  (p=0.000 n=30+30)
GoParser         304k ± 1%        302k ± 1%  -0.64%  (p=0.000 n=30+28)
Reflect         1.00M ± 0%       0.99M ± 0%  -1.10%  (p=0.000 n=29+30)
Tar              245k ± 1%        244k ± 1%  -0.59%  (p=0.000 n=27+29)
XML              391k ± 1%        389k ± 1%  -0.59%  (p=0.000 n=29+30)

Change-Id: Id7f456d690567c2b0a96b0d6d64de8784b6e305f
Reviewed-on: https://go-review.googlesource.com/36476
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2017-02-28 19:23:33 +00:00
Cherry Zhang
f6fc0dd620 cmd/compile: update signature of runtime.memclr*
runtime.memclr* functions have signatures

func memclrNoHeapPointers(ptr unsafe.Pointer, n uintptr)
func memclrHasPointers(ptr unsafe.Pointer, n uintptr)

Update compiler's copy. Also teach gc/mkbuiltin.go to handle
unsafe.Pointer. The import statement and its support is not
really necessary, but just to make it look like real Go code.

Fixes #19185.

Change-Id: I251d02571fde2716d4727e31e04d56ec04b6f22a
Reviewed-on: https://go-review.googlesource.com/37257
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2017-02-28 19:22:29 +00:00
Ian Lance Taylor
db6e27c38d cmd/compile: update builtin writeBarrier to match runtime
The definition of writeBarrier in the runtime was changed in CL 22855
to include padding. Update the definition built in to the compiler to match.
This doesn't affect the generated code, as the compiler sets the type
to use anyhow, but having them be different seems clearly wrong.

Change-Id: I8eac05bf70a424a0b2338ba5e9e41af231316de0
Reviewed-on: https://go-review.googlesource.com/37377
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-02-22 01:32:31 +00:00
Keith Randall
5a75d6a08e cmd/compile: optimize non-empty-interface type conversions
When doing i.(T) for non-empty-interface i and concrete type T,
there's no need to read the type out of the itab. Just compare the
itab to the itab we expect for that interface/type pair.

Also optimize type switches by putting the type hash of the
concrete type in the itab. That way we don't need to load the
type pointer out of the itab.

Update #18492

Change-Id: I49e280a21e5687e771db5b8a56b685291ac168ce
Reviewed-on: https://go-review.googlesource.com/34810
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: David Chase <drchase@google.com>
2017-02-13 18:16:31 +00:00
Josh Bleecher Snyder
2c91bb4c8a cmd/compile: make panicwrap argument-free
When code defines a method on T,
the compiler generates a corresponding wrapper method on *T.
The first thing the wrapper does is check whether
the pointer is nil and if so, call panicwrap.
This is done to provide a useful error message.

The existing implementation gets its information
from arguments set up by the compiler.
However, with some trouble, this information can
be extracted from the name of the wrapper method itself.

Removing the arguments to panicwrap simplifies and
shrinks the wrapper method.
It also means that the call to panicwrap does not
require any stack space.
This enables a further optimization on amd64/x86,
which is to skip the function prologue if nothing
else in the method requires stack space.
This is frequently the case in simple, hot methods,
such as Less and Swap in sort.Interface implementations.

Fixes #19040.

Benchmarks for package sort on amd64:

name                  old time/op  new time/op  delta
SearchWrappers-8       104ns ± 1%   104ns ± 1%    ~     (p=0.286 n=27+27)
SortString1K-8         128µs ± 1%   128µs ± 1%  -0.44%  (p=0.004 n=30+30)
SortString1K_Slice-8   118µs ± 2%   117µs ± 1%    ~     (p=0.106 n=30+30)
StableString1K-8      18.6µs ± 1%  18.6µs ± 1%    ~     (p=0.446 n=28+26)
SortInt1K-8           65.9µs ± 1%  60.7µs ± 1%  -7.96%  (p=0.000 n=28+30)
StableInt1K-8         75.3µs ± 2%  72.8µs ± 1%  -3.41%  (p=0.000 n=30+30)
StableInt1K_Slice-8   57.7µs ± 1%  57.7µs ± 1%    ~     (p=0.515 n=30+30)
SortInt64K-8          6.28ms ± 1%  6.01ms ± 1%  -4.19%  (p=0.000 n=28+28)
SortInt64K_Slice-8    5.04ms ± 1%  5.04ms ± 1%    ~     (p=0.927 n=28+27)
StableInt64K-8        6.65ms ± 1%  6.38ms ± 1%  -3.97%  (p=0.000 n=26+30)
Sort1e2-8             37.9µs ± 1%  37.2µs ± 1%  -1.89%  (p=0.000 n=29+27)
Stable1e2-8           77.0µs ± 1%  74.7µs ± 1%  -3.06%  (p=0.000 n=27+30)
Sort1e4-8             8.21ms ± 2%  7.98ms ± 1%  -2.77%  (p=0.000 n=29+30)
Stable1e4-8           24.8ms ± 1%  24.3ms ± 1%  -2.31%  (p=0.000 n=28+30)
Sort1e6-8              1.27s ± 4%   1.22s ± 1%  -3.42%  (p=0.000 n=30+29)
Stable1e6-8            5.06s ± 1%   4.92s ± 1%  -2.77%  (p=0.000 n=25+29)
[Geo mean]             731µs        714µs       -2.29%

Before/after assembly for sort.(*intPairs).Less follows.
It can be optimized further, but that's for a follow-up CL.

Before:

"".(*intPairs).Less t=1 size=214 args=0x20 locals=0x38
	0x0000 00000 (<autogenerated>:1)	TEXT	"".(*intPairs).Less(SB), $56-32
	0x0000 00000 (<autogenerated>:1)	MOVQ	(TLS), CX
	0x0009 00009 (<autogenerated>:1)	CMPQ	SP, 16(CX)
	0x000d 00013 (<autogenerated>:1)	JLS	204
	0x0013 00019 (<autogenerated>:1)	SUBQ	$56, SP
	0x0017 00023 (<autogenerated>:1)	MOVQ	BP, 48(SP)
	0x001c 00028 (<autogenerated>:1)	LEAQ	48(SP), BP
	0x0021 00033 (<autogenerated>:1)	MOVQ	32(CX), BX
	0x0025 00037 (<autogenerated>:1)	TESTQ	BX, BX
	0x0028 00040 (<autogenerated>:1)	JEQ	55
	0x002a 00042 (<autogenerated>:1)	LEAQ	64(SP), DI
	0x002f 00047 (<autogenerated>:1)	CMPQ	(BX), DI
	0x0032 00050 (<autogenerated>:1)	JNE	55
	0x0034 00052 (<autogenerated>:1)	MOVQ	SP, (BX)
	0x0037 00055 (<autogenerated>:1)	NOP
	0x0037 00055 (<autogenerated>:1)	FUNCDATA	$0, gclocals·4032f753396f2012ad1784f398b170f4(SB)
	0x0037 00055 (<autogenerated>:1)	FUNCDATA	$1, gclocals·69c1753bd5f81501d95132d08af04464(SB)
	0x0037 00055 (<autogenerated>:1)	MOVQ	""..this+64(FP), AX
	0x003c 00060 (<autogenerated>:1)	TESTQ	AX, AX
	0x003f 00063 (<autogenerated>:1)	JEQ	$0, 135
	0x0041 00065 (<autogenerated>:1)	MOVQ	(AX), CX
	0x0044 00068 (<autogenerated>:1)	MOVQ	8(AX), AX
	0x0048 00072 (<autogenerated>:1)	MOVQ	"".i+72(FP), DX
	0x004d 00077 (<autogenerated>:1)	CMPQ	DX, AX
	0x0050 00080 (<autogenerated>:1)	JCC	$0, 128
	0x0052 00082 (<autogenerated>:1)	SHLQ	$4, DX
	0x0056 00086 (<autogenerated>:1)	MOVQ	(CX)(DX*1), DX
	0x005a 00090 (<autogenerated>:1)	MOVQ	"".j+80(FP), BX
	0x005f 00095 (<autogenerated>:1)	CMPQ	BX, AX
	0x0062 00098 (<autogenerated>:1)	JCC	$0, 128
	0x0064 00100 (<autogenerated>:1)	SHLQ	$4, BX
	0x0068 00104 (<autogenerated>:1)	MOVQ	(CX)(BX*1), AX
	0x006c 00108 (<autogenerated>:1)	CMPQ	DX, AX
	0x006f 00111 (<autogenerated>:1)	SETLT	AL
	0x0072 00114 (<autogenerated>:1)	MOVB	AL, "".~r2+88(FP)
	0x0076 00118 (<autogenerated>:1)	MOVQ	48(SP), BP
	0x007b 00123 (<autogenerated>:1)	ADDQ	$56, SP
	0x007f 00127 (<autogenerated>:1)	RET
	0x0080 00128 (<autogenerated>:1)	PCDATA	$0, $1
	0x0080 00128 (<autogenerated>:1)	CALL	runtime.panicindex(SB)
	0x0085 00133 (<autogenerated>:1)	UNDEF
	0x0087 00135 (<autogenerated>:1)	LEAQ	go.string."sort_test"(SB), AX
	0x008e 00142 (<autogenerated>:1)	MOVQ	AX, (SP)
	0x0092 00146 (<autogenerated>:1)	MOVQ	$9, 8(SP)
	0x009b 00155 (<autogenerated>:1)	LEAQ	go.string."intPairs"(SB), AX
	0x00a2 00162 (<autogenerated>:1)	MOVQ	AX, 16(SP)
	0x00a7 00167 (<autogenerated>:1)	MOVQ	$8, 24(SP)
	0x00b0 00176 (<autogenerated>:1)	LEAQ	go.string."Less"(SB), AX
	0x00b7 00183 (<autogenerated>:1)	MOVQ	AX, 32(SP)
	0x00bc 00188 (<autogenerated>:1)	MOVQ	$4, 40(SP)
	0x00c5 00197 (<autogenerated>:1)	PCDATA	$0, $1
	0x00c5 00197 (<autogenerated>:1)	CALL	runtime.panicwrap(SB)
	0x00ca 00202 (<autogenerated>:1)	UNDEF
	0x00cc 00204 (<autogenerated>:1)	NOP
	0x00cc 00204 (<autogenerated>:1)	PCDATA	$0, $-1
	0x00cc 00204 (<autogenerated>:1)	CALL	runtime.morestack_noctxt(SB)
	0x00d1 00209 (<autogenerated>:1)	JMP	0

After:

"".(*intPairs).Swap t=1 size=147 args=0x18 locals=0x8
	0x0000 00000 (<autogenerated>:1)	TEXT	"".(*intPairs).Swap(SB), $8-24
	0x0000 00000 (<autogenerated>:1)	MOVQ	(TLS), CX
	0x0009 00009 (<autogenerated>:1)	SUBQ	$8, SP
	0x000d 00013 (<autogenerated>:1)	MOVQ	BP, (SP)
	0x0011 00017 (<autogenerated>:1)	LEAQ	(SP), BP
	0x0015 00021 (<autogenerated>:1)	MOVQ	32(CX), BX
	0x0019 00025 (<autogenerated>:1)	TESTQ	BX, BX
	0x001c 00028 (<autogenerated>:1)	JEQ	43
	0x001e 00030 (<autogenerated>:1)	LEAQ	16(SP), DI
	0x0023 00035 (<autogenerated>:1)	CMPQ	(BX), DI
	0x0026 00038 (<autogenerated>:1)	JNE	43
	0x0028 00040 (<autogenerated>:1)	MOVQ	SP, (BX)
	0x002b 00043 (<autogenerated>:1)	NOP
	0x002b 00043 (<autogenerated>:1)	FUNCDATA	$0, gclocals·e6397a44f8e1b6e77d0f200b4fba5269(SB)
	0x002b 00043 (<autogenerated>:1)	FUNCDATA	$1, gclocals·69c1753bd5f81501d95132d08af04464(SB)
	0x002b 00043 (<autogenerated>:1)	MOVQ	""..this+16(FP), AX
	0x0030 00048 (<autogenerated>:1)	TESTQ	AX, AX
	0x0033 00051 (<autogenerated>:1)	JEQ	$0, 140
	0x0035 00053 (<autogenerated>:1)	MOVQ	(AX), CX
	0x0038 00056 (<autogenerated>:1)	MOVQ	8(AX), AX
	0x003c 00060 (<autogenerated>:1)	MOVQ	"".i+24(FP), DX
	0x0041 00065 (<autogenerated>:1)	CMPQ	DX, AX
	0x0044 00068 (<autogenerated>:1)	JCC	$0, 133
	0x0046 00070 (<autogenerated>:1)	SHLQ	$4, DX
	0x004a 00074 (<autogenerated>:1)	MOVQ	8(CX)(DX*1), BX
	0x004f 00079 (<autogenerated>:1)	MOVQ	(CX)(DX*1), SI
	0x0053 00083 (<autogenerated>:1)	MOVQ	"".j+32(FP), DI
	0x0058 00088 (<autogenerated>:1)	CMPQ	DI, AX
	0x005b 00091 (<autogenerated>:1)	JCC	$0, 133
	0x005d 00093 (<autogenerated>:1)	SHLQ	$4, DI
	0x0061 00097 (<autogenerated>:1)	MOVQ	8(CX)(DI*1), AX
	0x0066 00102 (<autogenerated>:1)	MOVQ	(CX)(DI*1), R8
	0x006a 00106 (<autogenerated>:1)	MOVQ	R8, (CX)(DX*1)
	0x006e 00110 (<autogenerated>:1)	MOVQ	AX, 8(CX)(DX*1)
	0x0073 00115 (<autogenerated>:1)	MOVQ	SI, (CX)(DI*1)
	0x0077 00119 (<autogenerated>:1)	MOVQ	BX, 8(CX)(DI*1)
	0x007c 00124 (<autogenerated>:1)	MOVQ	(SP), BP
	0x0080 00128 (<autogenerated>:1)	ADDQ	$8, SP
	0x0084 00132 (<autogenerated>:1)	RET
	0x0085 00133 (<autogenerated>:1)	PCDATA	$0, $1
	0x0085 00133 (<autogenerated>:1)	CALL	runtime.panicindex(SB)
	0x008a 00138 (<autogenerated>:1)	UNDEF
	0x008c 00140 (<autogenerated>:1)	PCDATA	$0, $1
	0x008c 00140 (<autogenerated>:1)	CALL	runtime.panicwrap(SB)
	0x0091 00145 (<autogenerated>:1)	UNDEF

Change-Id: I15bb8435f0690badb868799f313ed8817335efd3
Reviewed-on: https://go-review.googlesource.com/36809
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-02-11 23:27:35 +00:00
David Chase
7f1ff65c39 cmd/compile: insert scheduling checks on loop backedges
Loop breaking with a counter.  Benchmarked (see comments),
eyeball checked for sanity on popular loops.  This code
ought to handle loops in general, and properly inserts phi
functions in cases where the earlier version might not have.

Includes test, plus modifications to test/run.go to deal with
timeout and killing looping test.  Tests broken by the addition
of extra code (branch frequency and live vars) for added
checks turn the check insertion off.

If GOEXPERIMENT=preemptibleloops, the compiler inserts reschedule
checks on every backedge of every reducible loop.  Alternately,
specifying GO_GCFLAGS=-d=ssa/insert_resched_checks/on will
enable it for a single compilation, but because the core Go
libraries contain some loops that may run long, this is less
likely to have the desired effect.

This is intended as a tool to help in the study and diagnosis
of GC and other latency problems, now that goal STW GC latency
is on the order of 100 microseconds or less.

Updates #17831.
Updates #10958.

Change-Id: I6206c163a5b0248e3f21eb4fc65f73a179e1f639
Reviewed-on: https://go-review.googlesource.com/33910
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2017-01-09 21:01:29 +00:00
Keith Randall
688995d1e9 cmd/compile: do more type conversion inline
The code to do the conversion is smaller than the
call to the runtime.
The 1-result asserts need to call panic if they fail, but that
code is out of line.

The only conversions left in the runtime are those which
might allocate and those which might need to generate an itab.

Given the following types:
  type E interface{}
  type I interface { foo() }
  type I2 iterface { foo(); bar() }
  type Big [10]int
  func (b Big) foo() { ... }

This CL inlines the following conversions:

was assertE2T
  var e E = ...
  b := i.(Big)
was assertE2T2
  var e E = ...
  b, ok := i.(Big)
was assertI2T
  var i I = ...
  b := i.(Big)
was assertI2T2
  var i I = ...
  b, ok := i.(Big)
was assertI2E
  var i I = ...
  e := i.(E)
was assertI2E2
  var i I = ...
  e, ok := i.(E)

These are the remaining runtime calls:

convT2E:
  var b Big = ...
  var e E = b
convT2I:
  var b Big = ...
  var i I = b
convI2I:
  var i2 I2 = ...
  var i I = i2
assertE2I:
  var e E = ...
  i := e.(I)
assertE2I2:
  var e E = ...
  i, ok := e.(I)
assertI2I:
  var i I = ...
  i2 := i.(I2)
assertI2I2:
  var i I = ...
  i2, ok := i.(I2)

Fixes #17405
Fixes #8422

Change-Id: Ida2367bf8ce3cd2c6bb599a1814f1d275afabe21
Reviewed-on: https://go-review.googlesource.com/32313
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2016-11-02 21:33:03 +00:00
Austin Clements
87e48c5afd runtime, cmd/compile: rename memclr -> memclrNoHeapPointers
Since barrier-less memclr is only safe in very narrow circumstances,
this commit renames memclr to avoid accidentally calling memclr on
typed memory. This can cause subtle, non-deterministic bugs, so it's
worth some effort to prevent. In the near term, this will also prevent
bugs creeping in from any concurrent CLs that add calls to memclr; if
this happens, whichever patch hits master second will fail to compile.

This also adds the other new memclr variants to the compiler's
builtin.go to minimize the churn on that binary blob. We'll use these
in future commits.

Updates #17503.

Change-Id: I00eead049f5bd35ca107ea525966831f3d1ed9ca
Reviewed-on: https://go-review.googlesource.com/31369
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
2016-10-28 18:20:33 +00:00
Martin Möhrmann
b679665a18 cmd/compile: move stringtoslicebytetmp to the backend
- removes the runtime function stringtoslicebytetmp
- removes the generation of calls to stringtoslicebytetmp from the frontend
- adds handling of OSTRARRAYBYTETMP in the backend

This reduces binary sizes and avoids function call overhead.

Change-Id: Ib9988d48549cee663b685b4897a483f94727b940
Reviewed-on: https://go-review.googlesource.com/32158
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Martin Möhrmann <martisch@uos.de>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-10-28 07:58:47 +00:00
Keith Randall
c78d072c8e Revert "Revert "cmd/compile: inline convI2E""
This reverts commit 7dd9c385f6.

Reason for revert: Reverting the revert, which will re-enable the convI2E optimization.  We originally reverted the convI2E optimization because it was making the builder fail, but the underlying cause was later determined to be unrelated.

Original CL: https://go-review.googlesource.com/31260
Revert CL: https://go-review.googlesource.com/31310
Real bug: https://go-review.googlesource.com/c/25159
Real fix: https://go-review.googlesource.com/c/31316

Change-Id: I17237bb577a23a7675a5caab970ccda71a4124f2
Reviewed-on: https://go-review.googlesource.com/32023
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2016-10-25 23:51:34 +00:00
Matthew Dempsky
42b37819a1 cmd/compile: rework mkbuiltin.go to generate code
Generating binary export data requires a working Go compiler. Even
trickier to change the export data format itself requires a careful
bootstrapping procedure.

Instead, simply generate normal Go code that lets us directly
construct the builtin runtime declarations.

Passes toolstash -cmp.

Fixes #17508.

Change-Id: I4f6078a3c7507ba40072580695d57c87a5604baf
Reviewed-on: https://go-review.googlesource.com/31493
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2016-10-19 19:58:00 +00:00
Matthew Dempsky
3f2cb493e5 cmd/compile: handle unsafe builtins like universal builtins
Reuse the same mechanisms for handling universal builtins like len to
handle unsafe.Sizeof, etc. Allows us to drop package unsafe's export
data, and simplifies some code.

Updates #17508.

Change-Id: I620e0617c24e57e8a2d7cccd0e2de34608779656
Reviewed-on: https://go-review.googlesource.com/31433
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2016-10-18 22:34:44 +00:00
Matthew Dempsky
7dd9c385f6 Revert "cmd/compile: inline convI2E"
This reverts commit 395d36a67d.

Appears to be responsible for builder failures.

Change-Id: Ic6c6307f662767e529060b88704a9f074785d99e
Reviewed-on: https://go-review.googlesource.com/31310
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2016-10-17 20:37:19 +00:00
Keith Randall
395d36a67d cmd/compile: inline convI2E
It's pretty simple.  For:
  e = (interface{})(i)
Do:
  tmp = i.itab
  if tmp != nil {
    tmp = tmp.typ_ // load type from itab
  }
  e = eface{tmp, i.data}

It is smaller and faster than calling the runtime.

Change-Id: I0ad27f62f4ec0b6cd53bc8530e4da0eae3e67a6c
Reviewed-on: https://go-review.googlesource.com/31260
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2016-10-17 19:09:07 +00:00
Martin Möhrmann
d295174030 runtime: speed up non-ASCII rune decoding
Copies utf8 constants and EncodeRune implementation from unicode/utf8.

Adds a new decoderune implementation that is used by the compiler
in code generated for ranging over strings. It does not handle
ASCII runes since these are handled directly before calls to decoderune.

The DecodeRuneInString implementation from unicode/utf8 is not used
since it uses a lookup table that would increase the use of cpu caches.

Adds more tests that check decoding of valid and invalid utf8 sequences.

name                              old time/op  new time/op  delta
RuneIterate/range2/ASCII-4        7.45ns ± 2%  7.45ns ± 1%     ~     (p=0.634 n=16+16)
RuneIterate/range2/Japanese-4     53.5ns ± 1%  49.2ns ± 2%   -8.03%  (p=0.000 n=20+20)
RuneIterate/range2/MixedLength-4  46.3ns ± 1%  41.0ns ± 2%  -11.57%  (p=0.000 n=20+20)

new:
"".decoderune t=1 size=423 args=0x28 locals=0x0
old:
"".charntorune t=1 size=666 args=0x28 locals=0x0

Change-Id: I1df1fdb385bb9ea5e5e71b8818ea2bf5ce62de52
Reviewed-on: https://go-review.googlesource.com/28490
Run-TryBot: Martin Möhrmann <martisch@uos.de>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-10-17 11:25:22 +00:00
Keith Randall
442de98c14 cmd/compile,runtime: redo how map assignments work
To compile:
  m[k] = v
instead of:
  mapassign(maptype, m, &k, &v), do
do:
  *mapassign(maptype, m, &k) = v

mapassign returns a pointer to the value slot in the map.  It is just
like mapaccess except that it will allocate a new slot if k is not
already present in the map.

This makes map accesses faster but potentially larger (codewise).

It is faster because the write into the map is done when the compiler
knows the concrete type, so it can be done with a few store
instructions instead of calling typedmemmove.  We also potentially
avoid stack temporaries to hold v.

The code can be larger when the map has pointers in its value type,
since there is a write barrier call in addition to the mapassign call.
That makes the code at the callsite a bit bigger (go binary is 0.3%
bigger).

This CL is in preparation for doing operations like m[k] += v with
only a single runtime call.  That will roughly double the speed of
such operations.

Update #17133
Update #5147

Change-Id: Ia435f032090a2ed905dac9234e693972fe8c2dc5
Reviewed-on: https://go-review.googlesource.com/30815
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-10-12 20:41:23 +00:00
Keith Randall
6129f37367 cmd/compile: inline convT2{I,E} when result doesn't escape
No point in calling a function when we can build the interface
using a known type (or itab) and the address of a local.

Get rid of third arg (preallocated stack space) to convT2{I,E}.

Makes go binary smaller by 0.2%

benchmark                   old ns/op     new ns/op     delta
BenchmarkEfaceInteger-8     16.7          10.1          -39.52%

Update #17118
Update #15375

Change-Id: I9724a1f802bfa1e3957bf1856b55558278e198a2
Reviewed-on: https://go-review.googlesource.com/29373
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2016-09-19 02:37:08 +00:00
Matthew Dempsky
27eebbabc2 cmd/compile, runtime: remove throwreturn
Change-Id: If8d27cf1cd8d650ed0ba332448d3174d80b6b0ca
Reviewed-on: https://go-review.googlesource.com/29217
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-09-15 12:13:34 +00:00
Matthew Dempsky
6758eedf89 cmd/compile: remove Pointer from builtin/unsafe.go
We already explicitly construct the "unsafe.Pointer" type in typeinit
because we need it for Types[TUNSAFEPTR]. No point in also having it
in builtin/unsafe.go if it just means (*importer).importtype needs to
fix it.

Change-Id: Ife8a5a73cbbe2bfcabe8b25ee4f7e0f5fd0570b4
Reviewed-on: https://go-review.googlesource.com/29082
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-09-12 20:44:28 +00:00
Martin Möhrmann
0dae9dfb08 cmd/compile: improve string iteration performance
Generate a for loop for ranging over strings that only needs to call
the runtime function charntorune for non ASCII characters.

This provides faster iteration over ASCII characters and slightly
faster iteration for other characters.

The runtime function charntorune is changed to take an index from where
to start decoding and returns the index after the last byte belonging
to the decoded rune.

All call sites of charntorune in the runtime are replaced by a for loop
that will be transformed by the compiler instead of calling the charntorune
function directly.

go binary size decreases by 80 bytes.
godoc binary size increases by around 4 kilobytes.

runtime:

name                           old time/op  new time/op  delta
RuneIterate/range/ASCII-4      43.7ns ± 3%  10.3ns ± 4%  -76.33%  (p=0.000 n=44+45)
RuneIterate/range/Japanese-4   72.5ns ± 2%  62.8ns ± 2%  -13.41%  (p=0.000 n=49+50)
RuneIterate/range1/ASCII-4     43.5ns ± 2%  10.4ns ± 3%  -76.18%  (p=0.000 n=50+50)
RuneIterate/range1/Japanese-4  72.5ns ± 2%  62.9ns ± 2%  -13.26%  (p=0.000 n=50+49)
RuneIterate/range2/ASCII-4     43.5ns ± 3%  10.3ns ± 2%  -76.22%  (p=0.000 n=48+47)
RuneIterate/range2/Japanese-4  72.4ns ± 2%  62.7ns ± 2%  -13.47%  (p=0.000 n=50+50)

strings:

name                 old time/op    new time/op    delta
IndexRune-4            64.7ns ± 5%    22.4ns ± 3%  -65.43%  (p=0.000 n=25+21)
MapNoChanges-4          269ns ± 2%     157ns ± 2%  -41.46%  (p=0.000 n=23+24)
Fields-4               23.0ms ± 2%    19.7ms ± 2%  -14.35%  (p=0.000 n=25+25)
FieldsFunc-4           23.1ms ± 2%    19.6ms ± 2%  -14.94%  (p=0.000 n=25+24)

name                 old speed      new speed      delta
Fields-4             45.6MB/s ± 2%  53.2MB/s ± 2%  +16.87%  (p=0.000 n=24+25)
FieldsFunc-4         45.5MB/s ± 2%  53.5MB/s ± 2%  +17.57%  (p=0.000 n=25+24)

Updates #13162

Change-Id: I79ffaf828d82bf9887592f08e5cad883e9f39701
Reviewed-on: https://go-review.googlesource.com/27853
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Martin Möhrmann <martisch@uos.de>
2016-08-30 18:17:20 +00:00
Martin Möhrmann
e6f9f39ce5 cmd/compile: generate makeslice calls with int arguments
Where possible generate calls to runtime makeslice with int arguments
during compile time instead of makeslice with int64 arguments.

This eliminates converting arguments for calls to makeslice with
int64 arguments for platforms where int64 values do not fit into
arguments of type int.

godoc 386 binary shrinks by approximately 12 kilobyte.

amd64:
name         old time/op  new time/op  delta
MakeSlice-2  29.8ns ± 1%  29.8ns ± 1%   ~     (p=1.000 n=24+24)

386:
name         old time/op  new time/op  delta
MakeSlice-2  52.3ns ± 0%  45.9ns ± 0%  -12.17%  (p=0.000 n=25+22)

Fixes  #15357

Change-Id: Icb8701bb63c5a83877d26c8a4b78e782ba76de7c
Reviewed-on: https://go-review.googlesource.com/27851
Run-TryBot: Martin Möhrmann <martisch@uos.de>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2016-08-29 18:25:33 +00:00
Robert Griesemer
595cebb055 cmd/compile: remove ignored bool from exported ODCL nodes
This shortens the export format by 1 byte for each exported ODCL
node in inlined function bodies.

Maintain backward compatibility by updating format version and
continue to accept older format.

Change-Id: I549bb3ade90bc0f146decf8016d5c9c3f14eb293
Reviewed-on: https://go-review.googlesource.com/27999
Run-TryBot: Robert Griesemer <gri@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2016-08-29 18:07:47 +00:00
Robert Griesemer
c043e90e55 cmd/compile: clean up encoding of export version info
Replace ad-hoc encoding of export version info with a
more systematic approach.

Continue to read (but not write) the Go1.7 format for backward-
compatibility. This will avoid spurious errors with old installed
packages.

Fixes #16244.

Change-Id: I945e79ffd5e22b883250f6f9fac218370c2505a2
Reviewed-on: https://go-review.googlesource.com/27452
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2016-08-22 22:28:52 +00:00
Robert Griesemer
f50ced6d73 cmd/compile: remove encoding of safemode bit from export data
Removes the encoding of this bit which was ignored but left behind
for 1.7 to minimize pre-1.7 export format changes. See the issue
for more details.

Fixes #15772.

Change-Id: I46cd7a66ad4c6003b78c64295cf3bda503ebf2dd
Reviewed-on: https://go-review.googlesource.com/27201
Run-TryBot: Robert Griesemer <gri@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2016-08-16 23:59:38 +00:00
Keith Randall
df2f813bd2 [dev.ssa] cmd/compile: 386 port now works
GOARCH=386 SSATEST=1 ./all.bash passes

Caveat: still needs changes to test/ files to use *_ssa.go versions.  I
won't check those changes in with this CL because the builders will
complain as they don't have SSATEST=1.

Mostly minor fixes.

Implement float <-> uint32 in assembly.  It seems the simplest option
for now.

GO386=387 does not work.  That's why I can't make SSA the default for
386 yet.

Change-Id: Ic4d4402104d32bcfb1fd612f5bb6539f9acb8ae0
Reviewed-on: https://go-review.googlesource.com/25119
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2016-07-21 20:41:18 +00:00
Robert Griesemer
575a871662 cmd/compile: don't lose //go:nointerface pragma in export data
Fixes #16243.

Change-Id: I207d1e8aa48abe453a23c709ccf4f8e07368595b
Reviewed-on: https://go-review.googlesource.com/24648
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2016-07-01 19:20:11 +00:00
Robert Griesemer
394ac818b0 cmd/compile: add and enable (internal) option to only track named types
The new export format keeps track of all types that are exported.
If a type is seen that was exported before, only a reference to
that type is emitted. The importer maintains a list of all the
seen types and uses that list to resolve type references.

The existing compiler infrastructure's invariants assumes that
only named types are referred to before they are fully set up.
Referring to unnamed incomplete types causes problems. One of
the issues was #15548.

Added a new internal flag 'trackAllTypes' to enable/disable
this type tracking. With this change only named types are
tracked.

Verified that this fix also addresses #15548, even w/o the
prior fix for that issue (in fact that prior fix is turned
off if trackAllTypes is disabled because it's not needed).

The test for #15548 covers also this change.

For #15548.

Change-Id: Id0b3ff983629703d025a442823f99649fd728a56
Reviewed-on: https://go-review.googlesource.com/22839
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2016-05-07 23:56:02 +00:00
Robert Griesemer
7538b1db8e cmd/compile: switch to compact export format by default
builtin.go was auto-generated via go generate; all other
changes were manual.

The new format reduces the export data size by ~65% on average
for the std library packages (and there is still quite a bit of
room for improvement).

The average time to write export data is reduced by (at least)
62% as measured in one run over the std lib, it is likely more.

The average time to read import data is reduced by (at least)
37% as measured in one run over the std lib, it is likely more.
There is also room to improve this time.

The compiler transparently handles both packages using the old
and the new format.

Comparing the -S output of the go build for each package via
the cmp.bash script (added) shows identical assembly code for
all packages, but 6 files show file:line differences:

The following files have differences because they use cgo
and cgo uses different temp. directories for different builds.
Harmless.

	src/crypto/x509
	src/net
	src/os/user
	src/runtime/cgo

The following files have file:line differences that are not yet
fully explained; however the differences exist w/ and w/o new export
format (pre-existing condition). See issue #15453.

	src/go/internal/gccgoimporter
	src/go/internal/gcimporter

In summary, switching to the new export format produces the same
package files as before for all practical purposes.

How can you tell which one you have (if you care): Open a package
(.a) file in an editor. Textual export data starts with a $$ after
the header and is more or less legible; binary export data starts
with a $$B after the header and is mostly unreadable. A stand-alone
decoder (for debugging) is in the works.

In case of a problem, please first try reverting back to the old
textual format to determine if the cause is the new export format:

For a stand-alone compiler invocation:
- go tool compile -newexport=0 <files>

For a single package:
- go build -gcflags="-newexport=0" <pkg>

For make/all.bash:
- (export GO_GCFLAGS="-newexport=0"; sh make.bash)

Fixes #13241.

Change-Id: I2588cb463be80af22446bf80c225e92ab79878b8
Reviewed-on: https://go-review.googlesource.com/22123
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Robert Griesemer <gri@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2016-04-27 16:59:55 +00:00
Keith Randall
60fd32a47f cmd/compile: change the way we handle large map values
mapaccess{1,2} returns a pointer to the value.  When the key
is not in the map, it returns a pointer to zeroed memory.
Currently, for large map values we have a complicated scheme which
dynamically allocates zeroed memory for this purpose.  It is ugly
code and requires an atomic.Load in a bunch of places we'd rather
not have it.

Switch to a scheme where callsites of mapaccess{1,2} which expect
large return values pass in a pointer to zeroed memory that
mapaccess can return if the key is not found.  This avoids the
atomic.Load on all map accesses with a few extra instructions only
for the large value acccesses, plus a bit of bss space.

There was a time (1.4 & 1.5?) where we did something like this but
all the tricks to make the right size zero value were done by the
linker.  That scheme broke in the presence of dyamic linking.
The scheme in this CL works even when dynamic linking.

Fixes #12337

Change-Id: Ic2d0319944af33bbb59785938d9ab80958d1b4b1
Reviewed-on: https://go-review.googlesource.com/22221
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Hudson-Doyle <michael.hudson@canonical.com>
2016-04-20 21:15:31 +00:00
Michel Lespinasse
859b63cc09 cmd/compile: optimize remaining convT2I calls
See #14874
Updates #6853

This change adds a compiler optimization for non pointer shaped convT2I.
Since itab symbols are now emitted by the compiler, the itab address can
be passed directly to convT2I instead of passing the iface type and a
cache pointer argument.

Compilebench results for the 5-commits series ending here:

name       old time/op     new time/op     delta
Template       336ms ± 4%      344ms ± 4%   +2.61%          (p=0.027 n=9+8)
Unicode        165ms ± 6%      173ms ± 7%   +5.11%          (p=0.014 n=9+9)
GoTypes        1.09s ± 1%      1.06s ± 2%   -3.29%          (p=0.000 n=9+9)
Compiler       5.09s ±10%      4.75s ±10%   -6.64%        (p=0.011 n=10+10)
MakeBash       31.1s ± 5%      30.3s ± 3%     ~           (p=0.089 n=10+10)

name       old text-bytes  new text-bytes  delta
HelloSize       558k ± 0%       558k ± 0%   +0.02%        (p=0.000 n=10+10)
CmdGoSize      6.24M ± 0%      6.11M ± 0%   -2.11%        (p=0.000 n=10+10)

name       old data-bytes  new data-bytes  delta
HelloSize      3.66k ± 0%      3.74k ± 0%   +2.41%        (p=0.000 n=10+10)
CmdGoSize       134k ± 0%       162k ± 0%  +20.76%        (p=0.000 n=10+10)

name       old bss-bytes   new bss-bytes   delta
HelloSize       126k ± 0%       126k ± 0%     ~     (all samples are equal)
CmdGoSize       149k ± 0%       146k ± 0%   -2.17%        (p=0.000 n=10+10)

name       old exe-bytes   new exe-bytes   delta
HelloSize       924k ± 0%       924k ± 0%   +0.05%        (p=0.000 n=10+10)
CmdGoSize      9.77M ± 0%      9.62M ± 0%   -1.47%        (p=0.000 n=10+10)

Change-Id: Ib230ddc04988824035c32287ae544a965fedd344
Reviewed-on: https://go-review.googlesource.com/20902
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: David Crawshaw <crawshaw@golang.org>
Run-TryBot: Michel Lespinasse <walken@google.com>
2016-03-29 02:21:50 +00:00
Matthew Dempsky
deb83d0639 cmd/compile: remove unused write barrier helpers
These have been unused since CL 10316.

Passes toolstash -cmp.

Change-Id: Icc19f3fcc7275fbee1c665f704e10a110ecce2a5
Reviewed-on: https://go-review.googlesource.com/21242
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2016-03-29 00:16:55 +00:00
Matthew Dempsky
d1341d6cf3 cmd/compile, runtime: eliminate growslice_n
Fixes #11419.

Change-Id: I7935a253e3e96191a33f5041bab203ecc5f0c976
Reviewed-on: https://go-review.googlesource.com/20647
Reviewed-by: Keith Randall <khr@golang.org>
2016-03-13 23:34:31 +00:00
Matthew Dempsky
9877900c8c Revert "cmd/compile: move hiter, hmap, and scase definitions into builtin.go"
This reverts commit f28bbb776a.

Change-Id: I82fb81dcff3ddcaefef72949f1ef3a41bcd22301
Reviewed-on: https://go-review.googlesource.com/19849
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2016-02-23 19:42:52 +00:00
Matthew Dempsky
f28bbb776a cmd/compile: move hiter, hmap, and scase definitions into builtin.go
Also eliminates per-maptype hiter and hmap types, since they're not
really needed anyway.  Update packages reflect and runtime
accordingly.

Reduces golang.org/x/tools/cmd/godoc's text segment by ~170kB:

   text	   data	    bss	    dec	    hex	filename
13085702	 140640	 151520	13377862	 cc2146	godoc.before
12915382	 140640	 151520	13207542	 c987f6	godoc.after

Updates #6853.

Change-Id: I948b2bc1f22d477c1756204996b4e3e1fb568d81
Reviewed-on: https://go-review.googlesource.com/16610
Reviewed-by: Keith Randall <khr@golang.org>
2016-02-22 07:42:37 +00:00
Keith Randall
d0c11577b9 cmd/compile: inline {i,e}facethash
These functions are really simple, the overhead of calling
them (in both time and code size) is larger than the inlined versions.

Reorganize how the nil case in a type switch is handled, as we have
to check for nil explicitly now anyway.

Saves about 0.8% in the binary size of the go tool.

Change-Id: I8501b62d72fde43650b79f52b5f699f1fbd0e7e7
Reviewed-on: https://go-review.googlesource.com/19814
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2016-02-22 05:09:25 +00:00
Matthew Dempsky
9a184b22ee cmd/compile: refresh builtin.go
The export data format was augmented with a new "unsafe-uintptr" tag
in https://golang.org/cl/18584, but builtin.go was not regenerated.

While here, add a test to make sure builtin.go stays up to date in the
future.

Change-Id: I4ae17da29f0855bef6ec0fcc10e7082c8427d39c
Reviewed-on: https://go-review.googlesource.com/19681
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2016-02-19 19:08:47 +00:00
Matthew Dempsky
7555f7f2bf cmd/compile: cleanup mkbuiltin.go
Changes largely in preparation for eventually switching the builtin
export data to use the new binary format.

Replace fancy incremental line-by-line scanning with simply reading
the entire object file into memory, finding the export data section,
and processing it that way.

Just use "package runtime" and "package unsafe" in the builtin Go
source files so we don't need to rewrite references to "PACKAGE".

Stop looking for init_PACKAGE_function; it doesn't exist anyway.

Compile package runtime with -u so that its export data marks it as a
"safe" package.

Eliminate requirement to pass "runtime" and "unsafe" as command-line
arguments so that just "go run mkbuiltin.go" works.

Only rewrite builtin.go when successful.

Change-Id: I4addfde9e0cfb30607c7a83de686bde0ad1f035a
Reviewed-on: https://go-review.googlesource.com/19624
Reviewed-by: Robert Griesemer <gri@golang.org>
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-02-18 21:41:51 +00:00
Ian Lance Taylor
be1ef46775 runtime: add optional expensive check for invalid cgo pointer passing
If you set GODEBUG=cgocheck=2 the runtime package will use the write
barrier to detect cases where a Go program writes a Go pointer into
non-Go memory.  In conjunction with the existing cgo checks, and the
not-yet-implemented cgo check for exported functions, this should
reliably detect all cases (that do not import the unsafe package) in
which a Go pointer is incorrectly shared with C code.  This check is
optional because it turns on the write barrier at all times, which is
known to be expensive.

Update #12416.

Change-Id: I549d8b2956daa76eac853928e9280e615d6365f4
Reviewed-on: https://go-review.googlesource.com/16899
Reviewed-by: Russ Cox <rsc@golang.org>
2015-11-16 18:39:06 +00:00
Didier Spezia
0624fd3f14 cmd/compile: regenerate builtin.go
Following a recent change, file builtin.go is not up-to-date.
Generate it again by running go generate.

Fixes #13203

Change-Id: Ib91c5ccc93665c043da95c7d3783ce5d94e48466
Reviewed-on: https://go-review.googlesource.com/16821
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2015-11-14 22:17:20 +00:00
Ian Lance Taylor
0c69f1303f cmd/compile: add -msan option
The -msan option causes the compiler to add instrumentation for the
C/C++ memory sanitizer.  Every memory read/write will be preceded by
a call to msanread/msanwrite.

This CL passes tests but is not usable by itself.  The actual
implementation of msanread/msanwrite in the runtime package, and support
for -msan in the go tool and the linker, and tests, will follow in
subsequent CLs.

Change-Id: I3d517fb3e6e65d9bf9433db070a420fd11f57816
Reviewed-on: https://go-review.googlesource.com/16160
Reviewed-by: David Crawshaw <crawshaw@golang.org>
2015-10-21 14:58:53 +00:00
Russ Cox
32fddadd98 runtime: reduce slice growth during append to 2x
The new inlined code for append assumed that it could pass the
desired new cap to growslice, not the number of new elements.
But growslice still interpreted the argument as the number of new elements,
making it always grow by >2x (more precisely, 2x+1 rounded up
to the next malloc block size). At the time, I had intended to change
the other callers to use the new cap as well, but it's too late for that.
Instead, introduce growslice_n for the old callers and keep growslice
for the inlined (common case) caller.

Fixes #11403.

Filed #11419 to merge them.

Change-Id: I1338b1e5b352f3be4e43641f44b652ef7195251b
Reviewed-on: https://go-review.googlesource.com/11541
Reviewed-by: Austin Clements <austin@google.com>
2015-06-26 17:49:33 +00:00
Russ Cox
17eba6e6b7 cmd/compile, cmd/link: create from 5g, 5l, etc
Trivial merging of 5g, 6g, ... into go tool compile,
and similarlly 5l, 6l, ... into go tool link.
The files compile/main.go and link/main.go are new.
Everything else in those directories is a move followed by
change of imports and package name.

This CL breaks the build. Manual fixups are in the next CL.

See golang-dev thread titled "go tool compile, etc" for background.

Change-Id: Id35ff5a5859ad9037c61275d637b1bd51df6828b
Reviewed-on: https://go-review.googlesource.com/10287
Reviewed-by: Dave Cheney <dave@cheney.net>
Reviewed-by: Rob Pike <r@golang.org>
2015-05-21 17:31:51 +00:00
Renamed from src/cmd/internal/gc/builtin.go (Browse further)