Previously the compiler rewrote constant division into OHMUL
operations, but that rewriting was moved to SSA in CL 37015. Now OHMUL
is unused, so we can get rid of it.
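For context, a minimal sketch of the strength reduction involved (not
the compiler's code; the magic constant is the standard one for dividing
a uint32 by 3): the division becomes a widening multiply whose high bits
give the quotient, which is the operation OHMUL named.

	package main

	import "fmt"

	// div3 divides a uint32 by 3 without a divide instruction, using the
	// multiply-high trick: 0xAAAAAAAB is ceil(2^33 / 3), so the quotient
	// is the top of the widening product, shifted down by 33.
	func div3(x uint32) uint32 {
		return uint32((uint64(x) * 0xAAAAAAAB) >> 33)
	}

	func main() {
		for _, x := range []uint32{0, 1, 2, 3, 100, 1<<32 - 1} {
			fmt.Println(x, div3(x), x/3)
		}
	}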
Change-Id: Ib6fc7c2b6435510bafb5735b3b4f42cfd8ed8cdb
Reviewed-on: https://go-review.googlesource.com/37750
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Add Set3 function to complement existing Set1 and Set2 functions.
Consistently use Set1, Set2 and Set3 for []*Node instead of Set where applicable.
Add SetFirst and SetSecond for setting elements of []*Node to mirror
First and Second for accessing elements in []*Node.
Replace uses of Index with First and Second, and uses of
SetIndex with SetFirst and SetSecond, where applicable.
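For reference, an illustrative sketch of the accessor shape (the real
gc.Nodes type and method set differ in detail):

	package gc

	type Node struct{ /* ... */ }

	// Nodes wraps a []*Node, as in the compiler.
	type Nodes struct{ slice *[]*Node }

	func (n *Nodes) Set3(a, b, c *Node) { s := []*Node{a, b, c}; n.slice = &s }

	func (n Nodes) First() *Node  { return (*n.slice)[0] }
	func (n Nodes) Second() *Node { return (*n.slice)[1] }

	func (n Nodes) SetFirst(x *Node)  { (*n.slice)[0] = x }
	func (n Nodes) SetSecond(x *Node) { (*n.slice)[1] = x }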
Passes toolstash -cmp.
Change-Id: I8255aae768cf245c8f93eec2e9efa05b8112b4e5
Reviewed-on: https://go-review.googlesource.com/37430
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
We can emit static assignment data immediately rather than queueing
it up to be processed during SSA building.
Passes toolstash -cmp.
Change-Id: I8bcea4b72eafb0cc0b849cd93e9cde9d84f30d5e
Reviewed-on: https://go-review.googlesource.com/37024
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Instead we can just call needwritebarrier when constructing the SSA
representation.
Change-Id: I6fefaad49daada9cdb3050f112889e49dca0047b
Reviewed-on: https://go-review.googlesource.com/34566
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Remove rotate generation from walk. Remove OLROT and ssa.Lrot* opcodes.
Generate rotates during SSA lowering for architectures that have them.
This CL will allow rotates to be generated in more situations,
like when the shift values are determined to be constant
only after some analysis.
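A shift/or pair like the one below is the kind of pattern now matched
during lowering and turned into a single rotate instruction where the
target architecture has one (illustrative sketch; the constant is
arbitrary):

	package rot

	// rotl7 rotates x left by 7 bits; the two shifts and the OR can be
	// lowered to one rotate instruction.
	func rotl7(x uint32) uint32 {
		return x<<7 | x>>(32-7)
	}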
Fixes #18254
Change-Id: I8d6d684ff5ce2511aceaddfda98b908007851079
Reviewed-on: https://go-review.googlesource.com/34232
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
A Func's Shortname is just an identifier. No need for an entire ONAME
Node.
Change-Id: Ie4d397e8d694c907fdf924ce57bd96bdb4aaabca
Reviewed-on: https://go-review.googlesource.com/35574
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
Conversion to Nodes still happens sequentially at the moment.
Change-Id: I3407ba0711b8b92e22ece0a06fefaff863c3ccc9
Reviewed-on: https://go-review.googlesource.com/35126
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Known issues:
- needs many more tests
- duplicate method declarations via type alias names are not detected (see the example after this list)
- type alias cycle error messages need to be improved
- need to review setup of byte/rune type aliases
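For example, a minimal case of the undetected duplicate-method issue
(hypothetical names):

	package p

	type T struct{}
	type A = T // alias declaration: A and T denote the same type

	func (T) M() {}
	func (A) M() {} // duplicate method on the same type, currently not rejected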
For #18130.
Change-Id: Icc2fefad6214e5e56539a9dcb3fe537bf58029f8
Reviewed-on: https://go-review.googlesource.com/35121
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
XPos is a compact (8 instead of 16 bytes on a 64bit machine) source
position representation. There is a 1:1 correspondence between each
XPos and each regular Pos, translated via a global table.
In some sense this brings back the LineHist, though positions can
track line and column information; there is an O(1) translation
between the representations (no binary search), and the translation
is factored out.
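An illustrative sketch of the shape of the representation (field layout
and bit widths here are assumptions for exposition, not the actual
cmd/internal/src types):

	package src

	// PosBase carries per-file information too large to store in every
	// position.
	type PosBase struct{ filename string }

	// Pos is the full (16-byte) position.
	type Pos struct {
		base      *PosBase
		line, col uint32
	}

	// XPos packs an index into a global table of bases together with line
	// and column, fitting in 8 bytes.
	type XPos struct {
		index   int32  // index of the base in PosTable.bases
		lineCol uint32 // line in the high bits, column in the low bits
	}

	// PosTable provides the O(1), table-driven translation back to Pos.
	type PosTable struct{ bases []*PosBase }

	func (t *PosTable) Pos(x XPos) Pos {
		return Pos{base: t.bases[x.index], line: x.lineCol >> 10, col: x.lineCol & (1<<10 - 1)}
	}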
The size increase from the prior change is brought down again, and
compiler speed is in line with the master repo (measured on the
same "quiet" machine as for the prior change):
name old time/op new time/op delta
Template 256ms ± 1% 262ms ± 2% ~ (p=0.063 n=5+4)
Unicode 132ms ± 1% 135ms ± 2% ~ (p=0.063 n=5+4)
GoTypes 891ms ± 1% 871ms ± 1% -2.28% (p=0.016 n=5+4)
Compiler 3.84s ± 2% 3.89s ± 2% ~ (p=0.413 n=5+4)
MakeBash 47.1s ± 1% 46.2s ± 2% ~ (p=0.095 n=5+5)
name old user-ns/op new user-ns/op delta
Template 309M ± 1% 314M ± 2% ~ (p=0.111 n=5+4)
Unicode 165M ± 1% 172M ± 9% ~ (p=0.151 n=5+5)
GoTypes 1.14G ± 2% 1.12G ± 1% ~ (p=0.063 n=5+4)
Compiler 5.00G ± 1% 4.96G ± 1% ~ (p=0.286 n=5+4)
Change-Id: Icc570cc60ab014d8d9af6976f1f961ab8828cc47
Reviewed-on: https://go-review.googlesource.com/34506
Run-TryBot: Robert Griesemer <gri@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Various minor adjustments.
Change-Id: Iedfb97989f7bedaa3e9e8993b167e05f162434a7
Reviewed-on: https://go-review.googlesource.com/34136
Reviewed-by: David Lazar <lazard@golang.org>
This is a step toward choosing a different position representation.
By introducing an explicit type, it will be easier to make the
transition step-wise while ensuring everything keeps running.
This has been reviewed via https://go-review.googlesource.com/#/c/34025/.
Change-Id: Ibceddcd62d8f346321ac3250e3940e9c436ed684
Reviewed-on: https://go-review.googlesource.com/34132
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Lazar <lazard@golang.org>
Consider a function that does nothing but call another function:

	func f() {
		g()
	}
We mistakenly don't add a frame pointer for f. This means f
isn't seen when walking the frame pointer linked list. That
matters for kernel-gathered profiles, and is an impediment for
issues like #16638.
To fix, allocate a stack frame even for otherwise frameless functions
like f. It is a bit tricky because we need to avoid some runtime
internals that really, really don't want one.
No test at the moment, as only kernel CPU profiles would catch it.
Tests will come with the implementation of #16638.
Fixes #18103
Change-Id: I411206cc9de4c8fdd265bee2e4fa61d161ad1847
Reviewed-on: https://go-review.googlesource.com/33754
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
It's never set anywhere, and even if it was, it would just Fatalf.
Change-Id: I84ade6d2068c623a8c85f84d8cdce38984996ddd
Reviewed-on: https://go-review.googlesource.com/32489
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
This is an extension of
https://go-review.googlesource.com/c/31662/
to mark all the temporaries, not just the ssa-generated ones.
Before-and-after ls -l `go tool -n compile` shows a 3%
reduction in size (or rather, removal of a prior 3% inflation
from failing to filter temps out properly).
Replaced name-dependent "is it a temp?" tests with calls to
*Node.IsAutoTmp(), which depends on AutoTemp. Also replace
calls to istemp(n) with n.IsAutoTmp(), to reduce duplication
and clean up function name space. Generated temporaries
now come with a "." prefix to avoid (apparently harmless)
clashes with legal Go variable names.
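A sketch of the flag-based check (field and constant names simplified;
the real Node layout differs):

	package gc

	type Op uint8

	const ONAME Op = 1

	type Name struct{ AutoTemp bool }

	type Node struct {
		Op   Op
		Name *Name
	}

	// IsAutoTmp reports whether n is a compiler-generated temporary, using
	// the AutoTemp flag rather than inspecting the variable's name.
	func (n *Node) IsAutoTmp() bool {
		return n != nil && n.Op == ONAME && n.Name != nil && n.Name.AutoTemp
	}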
Fixes #17644.
Fixes #17240.
Change-Id: If1417f29c79a7275d7303ddf859b51472890fd43
Reviewed-on: https://go-review.googlesource.com/32255
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Use a local map during inlining instead.
Change-Id: I10cd19885e7124f812bb04a79dbda52bfebfe1a1
Reviewed-on: https://go-review.googlesource.com/32225
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reuse the same mechanisms for handling universal builtins like len to
handle unsafe.Sizeof, etc. Allows us to drop package unsafe's export
data, and simplifies some code.
Updates #17508.
Change-Id: I620e0617c24e57e8a2d7cccd0e2de34608779656
Reviewed-on: https://go-review.googlesource.com/31433
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
This adds a //go:notinheap pragma for declarations of types that must
not be heap allocated. We enforce these rules by disallowing new(T),
make([]T), append([]T), or implicit allocation of T, by disallowing
conversions to notinheap types, and by propagating notinheap to any
struct or array that contains notinheap elements.
The utility of this pragma is that we can eliminate write barriers for
writes to pointers to go:notinheap types, since the write barrier is
guaranteed to be a no-op. This will let us mark several scheduler and
memory allocator structures as go:notinheap, which will let us
disallow write barriers in the scheduler and memory allocator much
more thoroughly and also eliminate some problematic hybrid write
barriers.
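A sketch of how the pragma is used in runtime-style code (the type name
here is made up):

	// offHeapNode is only ever allocated by the runtime's own allocators,
	// never on the Go heap.
	//
	//go:notinheap
	type offHeapNode struct {
		next *offHeapNode
	}

With the pragma in place, the compiler rejects new(offHeapNode),
make([]offHeapNode, n), and implicit allocations of the type, and a
store like p.next = q compiles without a write barrier.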
This also makes go:nowritebarrierrec and go:yeswritebarrierrec much
more powerful. Currently we use go:nowritebarrier all over the place,
but it's almost never what you actually want: when write barriers are
illegal, they're typically illegal for a whole dynamic scope. Partly
this is because go:nowritebarrier has been around longer, but it's
also because go:nowritebarrierrec couldn't be used in situations that
had no-op write barriers or where some nested scope did allow write
barriers. go:notinheap eliminates many no-op write barriers and
go:yeswritebarrierrec makes it possible to opt back in to write
barriers, so these two changes will let us use go:nowritebarrierrec
far more liberally.
This updates #13386, which is about controlling pointers from non-GC'd
memory to GC'd memory. That would require some additional pragma (or
pragmas), but could build on this pragma.
Change-Id: I6314f8f4181535dd166887c9ec239977b54940bd
Reviewed-on: https://go-review.googlesource.com/30939
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Previously, we used OKEY nodes to represent keyed struct literal
elements. The field names were represented by an ONAME node, but this
is clumsy because it's the only remaining case where ONAME was used to
represent a bare identifier and not a variable.
This CL introduces a new OSTRUCTKEY node op for use in struct
literals. These ops instead store the field name in the node's own Sym
field. This is similar in spirit to golang.org/cl/20890.
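A keyed struct literal of the kind affected (using a type from package
unicode purely for illustration):

	package p

	import "unicode"

	// Each "Field: value" element below is now represented by an
	// OSTRUCTKEY node carrying the field name in its own Sym field,
	// rather than an OKEY node whose key is a bare ONAME identifier.
	var latinUpper = unicode.Range16{Lo: 'A', Hi: 'Z', Stride: 1}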
Significant reduction in allocations for struct literal heavy code
like package unicode:
name old time/op new time/op delta
Template 345ms ± 6% 341ms ± 6% ~ (p=0.141 n=29+28)
Unicode 200ms ± 9% 184ms ± 7% -7.77% (p=0.000 n=29+30)
GoTypes 1.04s ± 3% 1.05s ± 3% ~ (p=0.096 n=30+30)
Compiler 4.47s ± 9% 4.49s ± 6% ~ (p=0.890 n=29+29)
name old user-ns/op new user-ns/op delta
Template 523M ±13% 516M ±17% ~ (p=0.400 n=29+30)
Unicode 334M ±27% 314M ±30% ~ (p=0.093 n=30+30)
GoTypes 1.53G ±10% 1.52G ±10% ~ (p=0.572 n=30+30)
Compiler 6.28G ± 7% 6.34G ±11% ~ (p=0.300 n=30+30)
name old alloc/op new alloc/op delta
Template 44.5MB ± 0% 44.4MB ± 0% -0.35% (p=0.000 n=27+30)
Unicode 39.2MB ± 0% 34.5MB ± 0% -11.79% (p=0.000 n=26+30)
GoTypes 125MB ± 0% 125MB ± 0% -0.12% (p=0.000 n=29+30)
Compiler 515MB ± 0% 515MB ± 0% -0.10% (p=0.000 n=29+30)
name old allocs/op new allocs/op delta
Template 426k ± 0% 424k ± 0% -0.39% (p=0.000 n=29+30)
Unicode 374k ± 0% 323k ± 0% -13.67% (p=0.000 n=29+30)
GoTypes 1.21M ± 0% 1.21M ± 0% -0.14% (p=0.000 n=29+29)
Compiler 4.40M ± 0% 4.39M ± 0% -0.13% (p=0.000 n=29+30)
Passes toolstash/buildall.
Change-Id: Iba4ee765dd1748f67e52fcade1cd75c9f6e13fa9
Reviewed-on: https://go-review.googlesource.com/30974
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
To compile:
	m[k] = v
the compiler now generates:
	*mapassign(maptype, m, &k) = v
instead of:
	mapassign(maptype, m, &k, &v)
mapassign returns a pointer to the value slot in the map. It is just
like mapaccess except that it will allocate a new slot if k is not
already present in the map.
This makes map assignments faster but potentially larger (codewise).
It is faster because the write into the map is done when the compiler
knows the concrete type, so it can be done with a few store
instructions instead of calling typedmemmove. We also potentially
avoid stack temporaries to hold v.
The code can be larger when the map has pointers in its value type,
since there is a write barrier call in addition to the mapassign call.
That makes the code at the callsite a bit bigger (go binary is 0.3%
bigger).
This CL is in preparation for doing operations like m[k] += v with
only a single runtime call. That will roughly double the speed of
such operations.
Update #17133
Update #5147
Change-Id: Ia435f032090a2ed905dac9234e693972fe8c2dc5
Reviewed-on: https://go-review.googlesource.com/30815
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Does not pass toolstash, but only because it causes ATYPE instructions
to be emitted in a different order, and it avoids emitting type
metadata for unused variables.
Change-Id: I3ec8f66a40b5af9213e0d6e852b267a8dd995838
Reviewed-on: https://go-review.googlesource.com/30217
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Identify live stack variables during SSA and compute the stack frame
layout earlier so that we can emit instructions with the correct
offsets upfront.
Passes toolstash/buildall.
Change-Id: I191100dba274f1e364a15bdcfdc1d1466cdd1db5
Reviewed-on: https://go-review.googlesource.com/30216
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
We're dropping this behavior in favor of runtime.KeepAlive.
Implement runtime.KeepAlive as an intrinsic.
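Typical use of runtime.KeepAlive looks like this (sketch; the resource
type and finalizer are made up):

	package main

	import "runtime"

	type resource struct{ handle uintptr }

	func main() {
		r := &resource{handle: 42}
		runtime.SetFinalizer(r, func(r *resource) { println("finalized") })

		h := r.handle
		// ... pass h to code the GC cannot see through, e.g. a raw syscall ...
		_ = h

		// Without this call, r's last use is above, so it could be finalized
		// while h is still in use; KeepAlive keeps r live until this point.
		runtime.KeepAlive(r)
	}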
Update #15843
Change-Id: Ib60225bd30d6770ece1c3c7d1339a06aa25b1cbc
Reviewed-on: https://go-review.googlesource.com/28310
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Does not pass toolstash -cmp due to changed export data,
but the cmd/go binary (which doesn't contain export data)
is bit-for-bit identical.
Change-Id: I6b12f9de18cf7da528e9207dccbf8f08c969f142
Reviewed-on: https://go-review.googlesource.com/26753
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Crawshaw <crawshaw@golang.org>
Consider a switch statement like:
	switch x {
	case 1:
		// ...
	case 2, 3, 4, 5, 6:
		// ...
	case 7:
		// ...
	}
Prior to this CL, the generated code treated
2, 3, 4, 5, and 6 independently in a binary search.
With this CL, the generated code checks whether
2 <= x && x <= 6.
walkinrange then optimizes that range check
into a single unsigned comparison.
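The unsigned-comparison form produced for the range check is roughly
(a sketch, not the generated code):

	package p

	// inRange reports whether 2 <= x && x <= 6 using one unsigned compare:
	// subtracting the lower bound makes any out-of-range value wrap around
	// to a large unsigned number.
	func inRange(x int32) bool {
		return uint32(x-2) <= 6-2
	}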
Experiments suggest that the best minimum range size is 2, using
binary size as a proxy for the benefit of the optimization.
Binary sizes before/after this CL:
cmd/compile: 14209728 / 14165360
cmd/go: 9543100 / 9539004
Change-Id: If2f7fb97ca80468fa70351ef540866200c4c996c
Reviewed-on: https://go-review.googlesource.com/26770
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Switch lowering splits each case expression out
into its own OCASE node.
Change-Id: Ifcb72b99975ed36da8540f6e43343e9aa2058572
Reviewed-on: https://go-review.googlesource.com/26769
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
When T is a scalar, there are no runtime calls
required, which makes this a clear win.
encoding/binary:
WriteInts-8 958ns ± 3% 864ns ± 2% -9.80% (p=0.000 n=15+15)
This also considerably shrinks a core fmt
routine:
Before: "".(*pp).printArg t=1 size=3952 args=0x20 locals=0xf0
After: "".(*pp).printArg t=1 size=2624 args=0x20 locals=0x98
Unfortunately, I find it very hard to get stable
numbers out of the fmt benchmarks due to thermal scaling.
Change-Id: I1278006b030253bf8e48dc7631d18985cdaa143d
Reviewed-on: https://go-review.googlesource.com/26659
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Keith Randall <khr@golang.org>
Might as well tell the runtime how large the map is going to be.
This avoids grow work and allocations while the map is being built.
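For example, a literal like the one below can now be compiled with a
size hint of 3 passed to the runtime, instead of starting from an empty
map and growing it (sketch):

	package p

	var m = map[string]int{
		"a": 1,
		"b": 2,
		"c": 3,
	}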
Will wait for 1.8.
Fixes #15880
Fixes #16279
Change-Id: I377e3e5ec1e2e76ea2a50cc00810adda20ad0e79
Reviewed-on: https://go-review.googlesource.com/23558
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Make sure the pointer to the heap copy of an output parameter is kept
live throughout the function. The function could panic at any point,
and then a defer could recover. Thus, we need the pointer to the heap
copy always available so the post-deferreturn code can copy the return
value back to the stack.
Before this CL, the pointer to the heap copy could be considered dead in
certain situations, like code which is reverse dominated by a panic call.
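The kind of function affected looks roughly like this (hypothetical
example):

	package p

	// The output parameter gets a heap copy because its address is taken
	// by the deferred closure, and the body ends in a panic, so the return
	// is only reached after a recover. The pointer to the heap copy must
	// stay live so the recovered value can be copied back on return.
	func f() (result int) {
		defer func() {
			if recover() != nil {
				result = 42
			}
		}()
		panic("boom")
	}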
Fixes #16095.
Change-Id: Ic3800423e563670e5b567b473bf4c84cddb49a4c
Reviewed-on: https://go-review.googlesource.com/24213
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>