This change introduces an arm intrinsic that generates the FMULAD
instruction for the fused-multiply-add operation on systems that
support it. System support is detected via cpu.ARM.HasVFPv4. A rewrite
rule translates the generic intrinsic to FMULAD.
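For reference, a fragment showing the fused semantics (assumes Go 1.14+,
where the math.Fma API is available; not part of this change):

	package main

	import (
		"fmt"
		"math"
	)

	func main() {
		x, y, z := 0.1, 10.0, -1.0
		fmt.Println(math.Fma(x, y, z)) // fused: one rounding
		fmt.Println(x*y + z)           // unfused: two roundings; can differ
	}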
Updates #25819.
Change-Id: I8459e5dd1cdbdca35f88a78dbeb7d387f1e20efa
Reviewed-on: https://go-review.googlesource.com/c/go/+/142117
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
To permit ssa-level optimization, this change introduces an amd64 intrinsic
that generates the VFMADD231SD instruction for the fused-multiply-add
operation on systems that support it. System support is detected via
cpu.X86.HasFMA. A rewrite rule can then translate the generic ssa intrinsic
("Fma") to VFMADD231SD.
The benchmark compares the software implementation (old) with the intrinsic
(new).
name old time/op new time/op delta
Fma-4 27.2ns ± 1% 1.0ns ± 9% -96.48% (p=0.008 n=5+5)
Updates #25819.
Change-Id: I966655e5f96817a5d06dff5942418a3915b09584
Reviewed-on: https://go-review.googlesource.com/c/go/+/137156
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
This CL adds -d=checkptr as a compile-time option for adding
instrumentation to check that Go code is following unsafe.Pointer
safety rules dynamically. In particular, it currently checks two
things:
1. When converting unsafe.Pointer to *T, make sure the resulting
pointer is aligned appropriately for T.
2. When performing pointer arithmetic, if the result points to a Go
heap object, make sure we can find an unsafe.Pointer-typed operand
that pointed into the same object.
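For illustration, a hedged sketch of code these checks are meant to
catch (not from the CL; built with -gcflags=-d=checkptr):

	package main

	import "unsafe"

	func main() {
		b := make([]byte, 16)

		// Check 1: converting to *int64 requires 8-byte alignment;
		// &b[1] is misaligned, so this is flagged.
		_ = (*int64)(unsafe.Pointer(&b[1]))

		// Check 2: arithmetic whose result leaves the original
		// object is flagged.
		p := unsafe.Pointer(&b[0])
		_ = unsafe.Pointer(uintptr(p) + 100)
	}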
These checks are currently disabled for the runtime, and can also be
disabled through a new //go:nocheckptr annotation. The latter is
necessary for functions like strings.noescape, which intentionally
violate safety rules to work around escape analysis limitations.
Fixes #22218.
Change-Id: If5a51273881d93048f74bcff10a3275c9c91da6a
Reviewed-on: https://go-review.googlesource.com/c/go/+/162237
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Right now we generate hash functions for all types, just in case they
are used as map keys. That's a lot of wasted effort and binary size
for types which will never be used as a map key. Instead, generate
hash functions only for types that we know are map keys.
Just doing that is a bit too simple, since maps with an interface type
as a key might have to hash any concrete key type that implements that
interface. So for that case, implement hashing of such types at
runtime (instead of with generated code). It will be slower, but only
for maps with interface types as keys, and maybe only a bit slower as
the aeshash time probably dominates the dispatch time.
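For example, a single map with an interface key type may have to hash
many concrete key types (illustrative fragment):

	m := map[interface{}]int{}
	m["key"] = 1 // must hash a string at runtime
	m[42] = 2    // must hash an int at runtime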
Reorganize where we keep the equal and hash functions. Move the hash function
from the key type to the map type, saving a field in every non-map type.
That leaves only one function in the alg structure, so get rid of that and
just keep the equal function in the type descriptor itself.
cmd/go now has 10 generated hash functions, instead of 504. Makes
cmd/go 1.0% smaller. Update #6853.
Speed on non-interface keys is unchanged. Speed on interface keys
is ~20% slower:
name old time/op new time/op delta
MapInterfaceString-8 23.0ns ±21% 27.6ns ±14% +20.01% (p=0.002 n=10+10)
MapInterfacePtr-8 19.4ns ±16% 23.7ns ± 7% +22.48% (p=0.000 n=10+8)
Change-Id: I7c2e42292a46b5d4e288aaec4029bdbb01089263
Reviewed-on: https://go-review.googlesource.com/c/go/+/191198
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Martin Möhrmann <moehrmann@google.com>
A few examples (for accessing a slice of length 3):
s[-1] runtime error: index out of range [-1]
s[3] runtime error: index out of range [3] with length 3
s[-1:0] runtime error: slice bounds out of range [-1:]
s[3:0] runtime error: slice bounds out of range [3:0]
s[3:-1] runtime error: slice bounds out of range [:-1]
s[3:4] runtime error: slice bounds out of range [:4] with capacity 3
s[0:3:4] runtime error: slice bounds out of range [::4] with capacity 3
Note that in cases where there are multiple things wrong with the
indexes (e.g. s[3:-1]), we report one of those errors somewhat
arbitrarily (currently the rightmost one).
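For instance, the s[3] case above can be reproduced with this fragment
(the index is kept in a variable for illustration):

	s := make([]int, 3)
	i := 3
	_ = s[i] // panics: runtime error: index out of range [3] with length 3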
An exhaustive set of examples is in issue30116[u].out in the CL.
The message text has the same prefix as the old message text. That
leads to slightly awkward phrasing but hopefully minimizes the chance
that code depending on the error text will break.
Increases the size of the go binary by 0.5% (amd64). The panic functions
take arguments in registers in order to keep the size of the compiled code
as small as possible.
Fixes #30116
Change-Id: Idb99a827b7888822ca34c240eca87b7e44a04fdd
Reviewed-on: https://go-review.googlesource.com/c/go/+/161477
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Allow shifts by signed amounts. Panic if the shift amount is negative.
TODO: we end up doing two compares per shift; see Ian's comment
https://github.com/golang/go/issues/19113#issuecomment-443241799
suggesting that we could do it with a single comparison in the normal case.
The prove pass mostly handles this code well. For instance, it removes the
<0 check for cases like this:
if s >= 0 { _ = x << s }
_ = x << len(a)
This case isn't handled well yet:
_ = x << (y & 0xf)
I'll do follow-on CLs for unhandled cases as needed.
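A small illustration of the new semantics:

	x := 1
	s := -1
	_ = x << s       // panics: runtime error: negative shift amount
	_ = x << uint(3) // unsigned shifts are unchanged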
Update #19113
R=go1.13
Change-Id: I839a5933d94b54ab04deb9dd5149f32c51c90fa1
Reviewed-on: https://go-review.googlesource.com/c/158719
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
The current support_XXX variables are specific to the
amd64 and 386 platforms.
Prefix processor capability variables by architecture to have a
consistent naming scheme and avoid reuse of the existing
variables for new platforms.
This also aligns naming of runtime variables closer with internal/cpu
processor capability variable names.
Change-Id: I3eabb29a03874678851376185d3a62e73c1aff1d
Reviewed-on: https://go-review.googlesource.com/c/91435
Run-TryBot: Martin Möhrmann <martisch@uos.de>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Only return a pointer p to the new slice's backing array from makeslice.
Makeslice callers then construct sliceheader{p, len, cap} explicitly
instead of makeslice returning the slice.
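Roughly, the lowering changes like this (a sketch, not the literal
compiler output):

	// before: s = makeslice(typ, len, cap) // runtime returns the whole slice
	// after:  p = makeslice(typ, len, cap) // runtime returns only the array pointer
	//         s = sliceheader{p, len, cap} // header constructed by the caller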
Reduces go binary size by ~0.2%.
Removes 92 (~3.5%) panicindex calls from go binary.
Change-Id: I29b7c3b5fe8b9dcec96e2c43730575071cfe8a94
Reviewed-on: https://go-review.googlesource.com/c/141822
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
When we pass these types by reference, we usually have to allocate
temporaries on the stack, initialize them, then pass their address
to the conversion functions. It's simpler to pass these types
directly by value.
This particularly applies to conversions needed for fmt.Printf
(to interface{} for constructing a [...]interface{}).
func f(a, b, c string) {
fmt.Printf("%s %s\n", a, b)
fmt.Printf("%s %s\n", b, c)
}
This function's stack frame shrinks from 200 to 136 bytes, and
its code shrinks from 535 to 453 bytes.
The go binary shrinks 0.3%.
Update #24286
Aside: for this function f, we don't really need to allocate
temporaries for the convT2E function. We could use the address
of a, b, and c directly. That might get similar (or maybe better?)
improvements. I investigated a bit, but it seemed complicated
to do it safely. This change was much easier.
Change-Id: I78cbe51b501fb41e1e324ce4203f0de56a1db82d
Reviewed-on: https://go-review.googlesource.com/c/135377
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
This adds the support to enable the race detector for ppc64le.
Added runtime/race_ppc64le.s to manage the calls from Go to the
LLVM tsan functions, mostly converting from the Go ABI to the
PPC64 ABI expected by Clang-generated code.
Changed racewalk.go to call racefuncenterfp instead of racefuncenter
on ppc64le to allow the caller pc to be obtained in the asm code
before calling the tsan version.
Changed the setup code for racecallbackthunk so it doesn't use
the autogenerated save and restore of the link register since that
sequence uses registers inconsistent with the normal ppc64 ABI.
Made various changes to recognize that race is supported for
ppc64le.
Ensured that tls_g is updated and accessible from race_linux_ppc64le.s
so that the race ctx can be obtained and passed to tsan functions.
This enables the race tests for ppc64le in cmd/dist/test.go and
increases the timeout when running the benchmarks with the -race
option to avoid timing out.
Updates #24354, #23731
Change-Id: Ib97dc7ac313e6313c836dc7d2fb698f9d8fba3ef
Reviewed-on: https://go-review.googlesource.com/107935
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Bryan C. Mills <bcmills@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Replace map clears of the form:
for k := range m {
delete(m, k)
}
(where m is a map whose key type is reflexive for ==)
with a new runtime function that clears the map's backing
array with a memclr and reinitializes the hmap struct.
Map key types that, for example, contain floats are not
handled by this optimization, since NaN keys cannot be
deleted from maps using delete.
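The reflexivity requirement exists because of NaN keys; an illustrative
fragment (not from the CL):

	m := map[float64]bool{math.NaN(): true}
	for k := range m {
		delete(m, k) // k != k, so the NaN entry survives the loop
	}
	// len(m) == 1 afterwards; a memclr would drop the entry,
	// changing behavior, so such key types keep the loop.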
name old time/op new time/op delta
GoMapClear/Reflexive/1 92.2ns ± 1% 47.1ns ± 2% -48.89% (p=0.000 n=9+9)
GoMapClear/Reflexive/10 108ns ± 1% 48ns ± 2% -55.68% (p=0.000 n=10+10)
GoMapClear/Reflexive/100 303ns ± 2% 110ns ± 3% -63.56% (p=0.000 n=10+10)
GoMapClear/Reflexive/1000 3.58µs ± 3% 1.23µs ± 2% -65.49% (p=0.000 n=9+10)
GoMapClear/Reflexive/10000 28.2µs ± 3% 10.3µs ± 2% -63.55% (p=0.000 n=9+10)
GoMapClear/NonReflexive/1 121ns ± 2% 124ns ± 7% ~ (p=0.097 n=10+10)
GoMapClear/NonReflexive/10 137ns ± 2% 139ns ± 3% +1.53% (p=0.033 n=10+10)
GoMapClear/NonReflexive/100 331ns ± 3% 334ns ± 2% ~ (p=0.342 n=10+10)
GoMapClear/NonReflexive/1000 3.64µs ± 3% 3.64µs ± 2% ~ (p=0.887 n=9+10)
GoMapClear/NonReflexive/10000 28.1µs ± 2% 28.4µs ± 3% ~ (p=0.247 n=10+10)
Fixes #20138
Change-Id: I181332a8ef434a4f0d89659f492d8711db3f3213
Reviewed-on: https://go-review.googlesource.com/110055
Reviewed-by: Keith Randall <khr@golang.org>
Make selectgo return recvOK as a result parameter instead.
Change-Id: Iffd436371d360bf666b76d4d7503e7c3037a9f1d
Reviewed-on: https://go-review.googlesource.com/37935
Reviewed-by: Austin Clements <austin@google.com>
Now the registration phase looks like:
var cases [4]runtime.scase
var order [8]uint16
selectsend(&cases[0], c1, &v1)
selectrecv(&cases[1], c2, &v2, nil)
selectrecv(&cases[2], c3, &v3, &ok)
selectdefault(&cases[3])
chosen := selectgo(&cases[0], &order[0], 4)
Primarily, this is just preparation for having the compiler open-code
selectsend, selectrecv, and selectdefault.
As a minor benefit, order can now be laid out separately on the stack
in the pointer-free segment, so it won't take up space in the
function's stack pointer maps.
Change-Id: I5552ba594201efd31fcb40084da20b42ea569a45
Reviewed-on: https://go-review.googlesource.com/37933
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Instead of creating a new &nodfp expression for every recover() call,
or a new nodpc variable for every function instrumented by the race
detector, this CL introduces two new uintptr-typed pseudo-variables
callerSP and callerPC. These pseudo-variables act just like calls to
the runtime's getcallersp() and getcallerpc() functions.
For consistency, change runtime.gorecover's builtin stub's parameter
type from "*int32" to "uintptr".
Passes toolstash-check, but toolstash-check -race fails because of
register allocator changes.
Change-Id: I985d644653de2dac8b7b03a28829ad04dfd4f358
Reviewed-on: https://go-review.googlesource.com/99416
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Now that the buffered write barrier is implemented for all
architectures, we can remove the old eager write barrier
implementation. This CL removes the implementation from the runtime,
support in the compiler for calling it, and updates some compiler
tests that relied on the old eager barrier support. It also makes sure
that all of the useful comments from the old write barrier
implementation still have a place to live.
Fixes #22460.
Updates #21640 since this fixes the layering concerns of the write
barrier (but not the other things in that issue).
Change-Id: I580f93c152e89607e0a72fe43370237ba97bae74
Reviewed-on: https://go-review.googlesource.com/92705
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Rick Hudson <rlh@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
The signatures of the mapassign_fast* routines need to distinguish
the pointerness of their key argument. If the affected routines
suspend part way through, the object pointed to by the key might
get garbage collected because the key is typed as a uint{32,64}.
This is not a problem for mapaccess or mapdelete because the key
in those situations does not live beyond the call involved. If the
object referenced by the key is garbage collected prematurely, the
code still works fine. Even if that object is subsequently reallocated,
it can't be written to the map in time to affect the lookup/delete.
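A hedged sketch of the hazard (types illustrative):

	func store(m map[*int]string, k *int) {
		// With the key typed as a plain uint64 inside
		// mapassign_fast64, a GC during the call could fail to
		// keep *k alive. The pointer-typed variant
		// (mapassign_fast64ptr) keeps the key visible to the GC.
		m[k] = "v"
	}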
Fixes #22781
Change-Id: I0bbbc5e9883d5ce702faf4e655348be1191ee439
Reviewed-on: https://go-review.googlesource.com/79018
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Martin Möhrmann <moehrmann@google.com>
Handle make(map[any]any) and make(map[any]any, hint) where
hint <= BUCKETSIZE specially, to allow for faster map initialization
and to improve binary size by using runtime calls with fewer arguments.
When the hint is smaller than or equal to BUCKETSIZE, overLoadFactor(hint, 0)
is false and no buckets would be allocated by makemap:
* If the hmap needs to be allocated on the stack then only hmap's hash0
field needs to be initialized and no call to makemap is needed.
* If the hmap needs to be allocated on the heap then a new special
makehmap function will allocate the hmap and initialize its
hash0 field.
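Illustrative cases covered by this (BUCKETSIZE is 8):

	m1 := make(map[string]int)    // no hint
	m2 := make(map[string]int, 8) // hint <= BUCKETSIZE
	// Neither needs buckets up front; if the hmap is stack-allocated,
	// only hash0 is initialized and no runtime call is made at all.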
Reduces the size of the godoc binary by ~36kb.
AMD64
name old time/op new time/op delta
NewEmptyMap 16.6ns ± 2% 5.5ns ± 2% -66.72% (p=0.000 n=10+10)
NewSmallMap 64.8ns ± 1% 56.5ns ± 1% -12.75% (p=0.000 n=9+10)
Updates #6853
Change-Id: I624e90da6775afaa061178e95db8aca674f44e9b
Reviewed-on: https://go-review.googlesource.com/61190
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Went mainly for the ones that make no sense, such as the ones
mid-sentence or after commas.
Change-Id: Ie245d2c19cc7428a06295635cf6a9482ade25ff0
Reviewed-on: https://go-review.googlesource.com/57293
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Where possible, generate calls to the runtime makemap with an int hint
argument at compile time instead of makemap with an int64 hint argument.
This eliminates converting the hint argument for calls to makemap
on platforms where int64 values do not fit into an argument of type int.
A similar optimization for makeslice was introduced in CL
golang.org/cl/27851.
386:
name old time/op new time/op delta
NewEmptyMap 53.5ns ± 5% 41.9ns ± 5% -21.56% (p=0.000 n=10+10)
NewSmallMap 182ns ± 1% 165ns ± 1% -8.92% (p=0.000 n=10+10)
Change-Id: Ibd2b4c57b36f171b173bf7a0602b3a59771e6e44
Reviewed-on: https://go-review.googlesource.com/55142
Reviewed-by: Keith Randall <khr@golang.org>
eqstring is only called for strings with equal lengths.
Instead of pushing a pointer and length for each argument string
on the stack, we can omit pushing one of the lengths.
Changing eqstring's signature to eqstring(*uint8, *uint8, int) bool
to implement the above optimization would make it very similar to the
existing memequal(*any, *any, uintptr) bool function.
Since string lengths are non-negative, we can avoid code redundancy and
use memequal instead of eqstring with an optimized signature.
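Sketch of the resulting lowering for a == b on strings (pseudocode;
ptr denotes the string data pointer):

	eq := len(a) == len(b) && memequal(a.ptr, b.ptr, uintptr(len(a)))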
go command binary size reduced by 4128 bytes on amd64.
name old time/op new time/op delta
CompareStringEqual 6.03ns ± 1% 5.71ns ± 1% -5.23% (p=0.000 n=19+18)
CompareStringIdentical 2.88ns ± 1% 3.22ns ± 7% +11.86% (p=0.000 n=20+20)
CompareStringSameLength 4.31ns ± 1% 4.01ns ± 1% -7.17% (p=0.000 n=19+19)
CompareStringDifferentLength 0.29ns ± 2% 0.29ns ± 2% ~ (p=1.000 n=20+20)
CompareStringBigUnaligned 64.3µs ± 2% 64.1µs ± 3% ~ (p=0.164 n=20+19)
CompareStringBig 61.9µs ± 1% 61.6µs ± 2% -0.46% (p=0.033 n=20+19)
Change-Id: Ice15f3b937c981f0d3bc8479a9ea0d10658ac8df
Reviewed-on: https://go-review.googlesource.com/53650
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Popcount instructions on amd64 are not guaranteed to be
present, so we must guard calls to them. Rewrite rules can't
generate control flow at the moment, so the intrinsifier
needs to generate that code.
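Sketch of the emitted control flow (names illustrative):

	if cpuHasPopcnt { // feature flag initialized at startup
		r = popcnt64(x) // single POPCNTQ instruction
	} else {
		r = onesCount64Fallback(x) // out-of-line pure-Go fallback
	}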
name old time/op new time/op delta
OnesCount-8 2.47ns ± 5% 1.04ns ± 2% -57.70% (p=0.000 n=10+10)
OnesCount16-8 1.05ns ± 1% 0.78ns ± 0% -25.56% (p=0.000 n=9+8)
OnesCount32-8 1.63ns ± 5% 1.04ns ± 2% -35.96% (p=0.000 n=10+10)
OnesCount64-8 2.45ns ± 0% 1.04ns ± 1% -57.55% (p=0.000 n=6+10)
Update #18616
Change-Id: I4aff2cc9aa93787898d7b22055fe272a7cf95673
Reviewed-on: https://go-review.googlesource.com/38320
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
Clean up code that does interface equality. Avoid doing checks
in efaceeq/ifaceeq that we already did before calling those routines.
No noticeable performance changes for existing benchmarks.
name old time/op new time/op delta
EfaceCmpDiff-8 604ns ± 1% 553ns ± 1% -8.41% (p=0.000 n=9+10)
Fixes #18618
Change-Id: I3bd46db82b96494873045bc3300c56400bc582eb
Reviewed-on: https://go-review.googlesource.com/38606
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: David Chase <drchase@google.com>
The chanrecv funcs don't use the chantype parameter at all. The chansend ones do, but the
element type is now part of the hchan struct, which is already a
parameter.
hchan can be nil in chansend when sending to a nil channel, so when
instrumenting we must copy to the stack to be able to read the channel
type.
name old time/op new time/op delta
ChanUncontended 6.42µs ± 1% 6.22µs ± 0% -3.06% (p=0.000 n=19+18)
Initially found by github.com/mvdan/unparam.
Fixes #19591.
Change-Id: I3a5e8a0082e8445cc3f0074695e3593fd9c88412
Reviewed-on: https://go-review.googlesource.com/38351
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
This commit reworks multiway select statements to use normal control
flow primitives instead of the previous setjmp/longjmp-like behavior.
This simplifies liveness analysis and should prevent issues around
"returns twice" function calls within SSA passes.
test/live.go is updated because liveness analysis's CFG is more
representative of actual control flow. The case bodies are the only
real successors of the selectgo call, but previously the selectsend,
selectrecv, etc. calls were included in the successors list too.
Updates #19331.
Change-Id: I7f879b103a4b85e62fc36a270d812f54c0aa3e83
Reviewed-on: https://go-review.googlesource.com/37661
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
The runtime.memclr* functions have signatures
func memclrNoHeapPointers(ptr unsafe.Pointer, n uintptr)
func memclrHasPointers(ptr unsafe.Pointer, n uintptr)
Update the compiler's copy. Also teach gc/mkbuiltin.go to handle
unsafe.Pointer. The import statement and its support are not strictly
necessary; they just make the generated code look like real Go code.
Fixes #19185.
Change-Id: I251d02571fde2716d4727e31e04d56ec04b6f22a
Reviewed-on: https://go-review.googlesource.com/37257
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
The definition of writeBarrier in the runtime was changed in CL 22855
to include padding. Update the definition built in to the compiler to match.
This doesn't affect the generated code, as the compiler sets the type
to use anyhow, but having them be different seems clearly wrong.
Change-Id: I8eac05bf70a424a0b2338ba5e9e41af231316de0
Reviewed-on: https://go-review.googlesource.com/37377
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
When doing i.(T) for non-empty-interface i and concrete type T,
there's no need to read the type out of the itab. Just compare the
itab to the itab we expect for that interface/type pair.
Also optimize type switches by putting the type hash of the
concrete type in the itab. That way we don't need to load the
type pointer out of the itab.
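A hedged sketch of the difference (field and symbol names illustrative):

	// before: nil-check the itab, then load and compare its type field:
	//   ok = i.tab != nil && i.tab._type == T's type descriptor
	// after: compare against the unique itab for the (T, interface) pair:
	//   ok = i.tab == go.itab.T,I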
Update #18492
Change-Id: I49e280a21e5687e771db5b8a56b685291ac168ce
Reviewed-on: https://go-review.googlesource.com/34810
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: David Chase <drchase@google.com>
Loop breaking with a counter. Benchmarked (see comments) and
eyeball-checked for sanity on popular loops. This code
ought to handle loops in general, and properly inserts phi
functions in cases where the earlier version might not have.
Includes a test, plus modifications to test/run.go to deal with
timing out and killing a looping test. Tests that were broken by the
extra code added for the checks (branch frequency and live vars)
now turn check insertion off.
If GOEXPERIMENT=preemptibleloops, the compiler inserts reschedule
checks on every backedge of every reducible loop. Alternately,
specifying GO_GCFLAGS=-d=ssa/insert_resched_checks/on will
enable it for a single compilation, but because the core Go
libraries contain some loops that may run long, this is less
likely to have the desired effect.
This is intended as a tool to help in the study and diagnosis
of GC and other latency problems, now that goal STW GC latency
is on the order of 100 microseconds or less.
Updates #17831.
Updates #10958.
Change-Id: I6206c163a5b0248e3f21eb4fc65f73a179e1f639
Reviewed-on: https://go-review.googlesource.com/33910
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
The code to do the conversion is smaller than the
call to the runtime.
The 1-result asserts need to call panic if they fail, but that
code is out of line.
The only conversions left in the runtime are those which
might allocate and those which might need to generate an itab.
Given the following types:
type E interface{}
type I interface { foo() }
type I2 interface { foo(); bar() }
type Big [10]int
func (b Big) foo() { ... }
This CL inlines the following conversions:
was assertE2T
	var e E = ...
	b := e.(Big)
was assertE2T2
	var e E = ...
	b, ok := e.(Big)
was assertI2T
	var i I = ...
	b := i.(Big)
was assertI2T2
	var i I = ...
	b, ok := i.(Big)
was assertI2E
	var i I = ...
	e := i.(E)
was assertI2E2
	var i I = ...
	e, ok := i.(E)
These are the remaining runtime calls:
convT2E:
	var b Big = ...
	var e E = b
convT2I:
	var b Big = ...
	var i I = b
convI2I:
	var i2 I2 = ...
	var i I = i2
assertE2I:
	var e E = ...
	i := e.(I)
assertE2I2:
	var e E = ...
	i, ok := e.(I)
assertI2I:
	var i I = ...
	i2 := i.(I2)
assertI2I2:
	var i I = ...
	i2, ok := i.(I2)
Fixes #17405
Fixes #8422
Change-Id: Ida2367bf8ce3cd2c6bb599a1814f1d275afabe21
Reviewed-on: https://go-review.googlesource.com/32313
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Since barrier-less memclr is only safe in very narrow circumstances,
this commit renames memclr to avoid accidentally calling memclr on
typed memory. This can cause subtle, non-deterministic bugs, so it's
worth some effort to prevent. In the near term, this will also prevent
bugs creeping in from any concurrent CLs that add calls to memclr; if
this happens, whichever patch hits master second will fail to compile.
This also adds the other new memclr variants to the compiler's
builtin.go to minimize the churn on that binary blob. We'll use these
in future commits.
Updates #17503.
Change-Id: I00eead049f5bd35ca107ea525966831f3d1ed9ca
Reviewed-on: https://go-review.googlesource.com/31369
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
- removes the runtime function stringtoslicebytetmp
- removes the generation of calls to stringtoslicebytetmp from the frontend
- adds handling of OSTRARRAYBYTETMP in the backend
This reduces binary sizes and avoids function call overhead.
Change-Id: Ib9988d48549cee663b685b4897a483f94727b940
Reviewed-on: https://go-review.googlesource.com/32158
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Martin Möhrmann <martisch@uos.de>
TryBot-Result: Gobot Gobot <gobot@golang.org>
It's pretty simple. For:
e = (interface{})(i)
Do:
tmp = i.itab
if tmp != nil {
tmp = tmp.typ_ // load type from itab
}
e = eface{tmp, i.data}
It is smaller and faster than calling the runtime.
Change-Id: I0ad27f62f4ec0b6cd53bc8530e4da0eae3e67a6c
Reviewed-on: https://go-review.googlesource.com/31260
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Copies utf8 constants and EncodeRune implementation from unicode/utf8.
Adds a new decoderune implementation that is used by the compiler
in code generated for ranging over strings. It does not handle
ASCII runes since these are handled directly before calls to decoderune.
The DecodeRuneInString implementation from unicode/utf8 is not used
since it uses a lookup table that would increase CPU cache pressure.
Adds more tests that check decoding of valid and invalid utf8 sequences.
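Roughly, ranging over a string s is compiled like this sketch
(decoderune's result layout paraphrased):

	var r rune
	for i := 0; i < len(s); {
		if c := s[i]; c < 0x80 {
			r, i = rune(c), i+1 // ASCII fast path: no runtime call
		} else {
			r, i = decoderune(s, i) // runtime call for multi-byte runes
		}
		_ = r // loop body runs with (index, r)
	}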
name old time/op new time/op delta
RuneIterate/range2/ASCII-4 7.45ns ± 2% 7.45ns ± 1% ~ (p=0.634 n=16+16)
RuneIterate/range2/Japanese-4 53.5ns ± 1% 49.2ns ± 2% -8.03% (p=0.000 n=20+20)
RuneIterate/range2/MixedLength-4 46.3ns ± 1% 41.0ns ± 2% -11.57% (p=0.000 n=20+20)
new:
"".decoderune t=1 size=423 args=0x28 locals=0x0
old:
"".charntorune t=1 size=666 args=0x28 locals=0x0
Change-Id: I1df1fdb385bb9ea5e5e71b8818ea2bf5ce62de52
Reviewed-on: https://go-review.googlesource.com/28490
Run-TryBot: Martin Möhrmann <martisch@uos.de>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
To compile:
m[k] = v
instead of:
mapassign(maptype, m, &k, &v)
do:
*mapassign(maptype, m, &k) = v
mapassign returns a pointer to the value slot in the map. It is just
like mapaccess except that it will allocate a new slot if k is not
already present in the map.
This makes map assignments faster but potentially larger (code-wise).
It is faster because the write into the map is done when the compiler
knows the concrete type, so it can be done with a few store
instructions instead of calling typedmemmove. We also potentially
avoid stack temporaries to hold v.
The code can be larger when the map has pointers in its value type,
since there is a write barrier call in addition to the mapassign call.
That makes the code at the callsite a bit bigger (go binary is 0.3%
bigger).
This CL is in preparation for doing operations like m[k] += v with
only a single runtime call. That will roughly double the speed of
such operations.
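For example, m[k] += v could then compile to a single call (sketch):

	p := mapassign(maptype, m, &k) // inserts k with a zero value if absent
	*p += v                        // update happens directly in the map's slot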
Update #17133
Update #5147
Change-Id: Ia435f032090a2ed905dac9234e693972fe8c2dc5
Reviewed-on: https://go-review.googlesource.com/30815
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
No point in calling a function when we can build the interface
using a known type (or itab) and the address of a local.
Get rid of third arg (preallocated stack space) to convT2{I,E}.
Makes the go binary smaller by 0.2%.
benchmark old ns/op new ns/op delta
BenchmarkEfaceInteger-8 16.7 10.1 -39.52%
Update #17118
Update #15375
Change-Id: I9724a1f802bfa1e3957bf1856b55558278e198a2
Reviewed-on: https://go-review.googlesource.com/29373
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>