Previously, genMatch0 and genResult0 contained
lots of duplication: locating the op, parsing
the value, validation, etc.
Parsing and validation was mixed in with code gen.
Extract a helper, parseValue. It is responsible
for parsing the value, locating the op, and doing
shared validation.
As a bonus (and possibly as my original motivation),
make op selection pay attention to the number
of args present.
This allows arch-specific ops to share a name
with generic ops as long as there is no ambiguity.
It also detects and reports unresolved ambiguity,
unlike before, where it would simply always
pick the generic op, with no warning.
Also use parseValue when generating the top-level
op dispatch, to ensure its opinion about ops
matches genMatch0 and genResult0.
The order of statements in the generated code used
to depend on the exact rule. It is now somewhat
independent of the rule. That is the source
of some of the generated code changes in this CL.
See rewritedec64 and rewritegeneric for examples.
It is a one-time change.
The op dispatch switch and functions used to be
sorted by opname without architecture. The sort
now includes the architecture, leading to further
generated code changes.
See rewriteARM and rewriteAMD64 for examples.
Again, it is a one-time change.
There are no functional changes.
Change-Id: I22c989183ad5651741ebdc0566349c5fd6c6b23c
Reviewed-on: https://go-review.googlesource.com/24649
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Add some simplification rules for floating point ops.
cmd/internal/obj/arm supports instructions that compare FP register
to 0, but runtime softfloat simulator does not. This CL adds these
instructions to softfloat simulator as well.
Updates #15365.
Change-Id: I29405b2bfcb4c8cf106cb7a1a811409fec91b170
Reviewed-on: https://go-review.googlesource.com/24790
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
These were helpful for some autogenerated code
I'm working with.
Change-Id: I7b89c69552ca99bf560a14bfbcd6bd238595ddf6
Reviewed-on: https://go-review.googlesource.com/24742
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Encode the size and the alignment into AuxInt of Zero and Move ops.
On AMD64, we simply don't look at the alignment. On ARM and PPC64, we
only generate aligned stores.
Updates #15365.
Change-Id: Ifdcc205c364f67c4516b9adebfe7d50d223b6863
Reviewed-on: https://go-review.googlesource.com/24511
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Break really long lines.
Add spacing to line up columns.
In AMD64, put all the optimization rules after all the
lowering rules.
Change-Id: I45cc7368bf278416e67f89e74358db1bd4326a93
Reviewed-on: https://go-review.googlesource.com/22470
Reviewed-by: David Chase <drchase@google.com>
Make sure ops have the right number of args, set
aux and auxint only if allowed, etc.
Normalize error reporting format.
Change-Id: Ie545fcc5990c8c7d62d40d9a0a55885f941eb645
Reviewed-on: https://go-review.googlesource.com/22320
Reviewed-by: David Chase <drchase@google.com>
Make sure the results of unsigned constant-folded
shifts are sign-extended into the AuxInt field.
Fixes#15175
Change-Id: I3490d1bc3d9b2e1578ed30964645508577894f58
Reviewed-on: https://go-review.googlesource.com/21586
Reviewed-by: Alexandru Moșoi <alexandru@mosoi.ro>
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
In b we only need the division by 0 check.
func b(i uint, v []byte) byte {
return v[i%uint(len(v))]
}
Updates #15079.
Change-Id: Ic7491e677dd57cd6ba577efbce576dbb6e023cbd
Reviewed-on: https://go-review.googlesource.com/21502
Run-TryBot: Alexandru Moșoi <alexandru@mosoi.ro>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Ahmed Waheed <oneofone@gmail.com>
Compound AUTO types weren't named previously. That was because live
variable analysis (plive.go) doesn't handle spilling to compound types.
It can't handle them because there is no valid place to put VARDEFs when
regalloc is spilling compound types.
compound types = multiword builtin types: complex, string, slice, and
interface.
Instead, we split named AUTOs into individual one-word variables. For
example, a string s gets split into a byte ptr s.ptr and an integer
s.len. Those two variables can be spilled to / restored from
independently. As a result, live variable analysis can handle them
because they are one-word objects.
This CL will change how AUTOs are described in DWARF information.
Consider the code:
func f(s string, i int) int {
x := s[i:i+5]
g()
return lookup(x)
}
The old compiler would spill x to two consecutive slots on the stack,
both named x (at offsets 0 and 8). The new compiler spills the pointer
of x to a slot named x.ptr. It doesn't spill x.len at all, as it is a
constant (5) and can be rematerialized for the call to lookup.
So compound objects may not be spilled in their entirety, and even if
they are they won't necessarily be contiguous. Such is the price of
optimization.
Re-enable live variable analysis tests. One test remains disabled, it
fails because of #14904.
Change-Id: I8ef2b5ab91e43a0d2136bfc231c05d100ec0b801
Reviewed-on: https://go-review.googlesource.com/21233
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Previously if we were only using the low bits of AuxInt,
the high bits were ignored and could be junk. This CL
changes that behavior to define the high bits to be the
sign-extended version of the low bits for all cases.
There are 2 main benefits:
- Deterministic representation. This helps with CSE.
(Const8 [0x1]) and (Const8 [0x101]) used to be the same "value"
but CSE couldn't see them as such.
- Testability. We can check that all ops leave AuxInt in a state
consistent with the new rule. In the old scheme, it was hard
to check whether a rule correctly used only the low-order bits.
Side benefits:
- ==0 and !=0 tests are easier.
Drawbacks:
- This differs from the runtime representation in registers,
where it is important that we allow upper bits to be undefined
(so we're not sign/zero-extending all the time).
- Ops that treat AuxInt as unsigned (shifts, mostly) need to be
a bit more careful.
Change-Id: I9a685ff27e36dc03287c9ab1cecd6c0b4045c819
Reviewed-on: https://go-review.googlesource.com/21256
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
khr: Lifting the nil check out of the loop altogether is an admirable
goal, and this rewrite is one step on the way. But without lifting it
out of the loop, the rewrite is just hurting us.
Fixes#14917
Change-Id: Idb917f37d89f50f8e046d5ebd7c092b1e0eb0633
Reviewed-on: https://go-review.googlesource.com/21040
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Alexandru Moșoi <alexandru@mosoi.ro>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Shows up occassionally, especially after p = p[:8:len(p)]
Updates #14905
Change-Id: Iab35ef2eac57817e6a10c6aaeeb84709e8021641
Reviewed-on: https://go-review.googlesource.com/21025
Run-TryBot: Alexandru Moșoi <alexandru@mosoi.ro>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Allow names to be used for subexpressions of match rules.
For example:
(OpA x:(OpB y)) -> ..use x here to refer to the OpB value..
This gets rid of the .Args[0].Args[0]... way of naming we
used to use.
While we're here, give all subexpression matches names instead
of recomputing them with .Args[i] sequences each time they
are referenced. Makes the generated rule code a bit smaller.
Change-Id: Ie42139f6f208933b75bd2ae8bd34e95419bc0e4e
Reviewed-on: https://go-review.googlesource.com/20997
Run-TryBot: Todd Neal <todd@tneal.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Todd Neal <todd@tneal.org>
For the following example, but there are a few more in the stdlib:
func histogram(b []byte, h *[256]int32) {
for _, t := range b {
h[t]++
}
}
Change-Id: I56615f341ae52e02ef34025588dc6d1c52122295
Reviewed-on: https://go-review.googlesource.com/20924
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Alexandru Moșoi <alexandru@mosoi.ro>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Keep track of how many uses each Value has. Each appearance in
Value.Args and in Block.Control counts once.
The number of uses of a value is generically useful to
constrain rewrite rules. For instance, we might want to
prevent merging index operations into loads if the same
index expression is used lots of times.
But I have one use in particular for which the use count is required.
We must make sure we don't combine ops with loads if the load has
more than one use. Otherwise, we may split a single load
into multiple loads and that breaks perceived behavior in
the presence of races. In particular, the load of m.state
in sync/mutex.go:Lock can't be done twice. (I have a separate
CL which triggers the mutex failure. This CL has a test which
demonstrates a similar failure.)
Change-Id: Icaafa479239f48632a069d0c3f624e6ebc6b1f0e
Reviewed-on: https://go-review.googlesource.com/20790
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Todd Neal <todd@tneal.org>
All of a struct's fields have to fit into memory anyway, so index them
with int instead of int64. This also makes it nicer for
cmd/compile/internal/gc to reuse the same NumFields function.
Change-Id: I210be804a0c33370ec9977414918c02c675b0fbe
Reviewed-on: https://go-review.googlesource.com/20691
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Split the auxFloat type into 32/64 bit versions and perform checking for
exactly representable float32 values. Perform const folding on
float32/64. Comment out some const negation rules that the frontend
already performs.
Change-Id: Ib3f8d59fa8b30e50fe0267786cfb3c50a06169d2
Reviewed-on: https://go-review.googlesource.com/20568
Run-TryBot: Todd Neal <todd@tneal.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
* Refacts a bit saving and restoring parents restrictions
* Shaves ~100k from pkg/tools/linux_amd64,
but most of the savings come from the rewrite rules.
* Improves on the following artificial test case:
func f1(a4 bool, a6 bool) bool {
return a6 || (a6 || (a6 || a4)) || (a6 || (a4 || a6 || (false || a6)))
}
Change-Id: I714000f75a37a3a6617c6e6834c75bd23674215f
Reviewed-on: https://go-review.googlesource.com/20306
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Alexandru Moșoi <alexandru@mosoi.ro>
TryBot-Result: Gobot Gobot <gobot@golang.org>
I would like to add a
func (t *Type) Elem() *Type
method to package gc, but that would collide with the existing
func (t *Type) Elem() ssa.Type
method needed to make *gc.Type implement ssa.Type. Because the latter
is much less widely used right now than the former will be, this CL
renames it to ElemType.
Longer term, hopefully gc and ssa will share a common Type interface,
and ElemType can go away.
Change-Id: I270008515dc4c01ef531cf715637a924659c4735
Reviewed-on: https://go-review.googlesource.com/20546
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
* Move lowering into a separate pass.
* SliceLen/SliceCap is now available to various intermediate passes
which use useful for bounds checking.
* Add a second opt pass to handle the new opportunities
Decreases the code size of binaries in pkg/tool/linux_amd64
by ~45K.
Updates #14564#14606
Change-Id: I5b2bd6202181c50623a3585fbf15c0d6db6d4685
Reviewed-on: https://go-review.googlesource.com/20172
Run-TryBot: Alexandru Moșoi <alexandru@mosoi.ro>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Converting an and-K into a pair of shifts for K that will
fit in a one-byte argument is probably not an optimization,
and it also interferes with other patterns that we want to
see fire, like (<< (AND K)) [for small K] and bounds check
elimination for masked indices.
Turns out that on Intel, even 32-bit signed immediates beat
the shift pair; the size reduction of tool binaries is 0.09%
vs 0.07% for only the 8-bit immediates.
RLH found this one working on the new/next GC.
Change-Id: I2414a8de1dd58d680d18587577fbadb7ff4f67d9
Reviewed-on: https://go-review.googlesource.com/20410
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: David Chase <drchase@google.com>
* Simplify the nilcheck generated by
for _, e := range a {}
* No effect on the generated code because these nil checks
don't end up in the generated code.
* Useful for other analysis, e.g. it'll remove one dependecy
on the induction variable.
Change-Id: I6ee66ddfdc010ae22aea8dca48163303d93de7a9
Reviewed-on: https://go-review.googlesource.com/20307
Run-TryBot: Alexandru Moșoi <alexandru@mosoi.ro>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
This triggers an astonishing 160k times
during make.bash. The second biggest
generic rewrite triggers 100k times.
However, this is really just moving
rewrites that were happening at the
architecture level to the generic level.
Change-Id: Ife06fe5234f31433328460cb2e0741c071deda41
Reviewed-on: https://go-review.googlesource.com/20235
Run-TryBot: Todd Neal <todd@tneal.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Todd Neal <todd@tneal.org>
This only deals with the loads themselves. The bounds checks
are a separate issue. Also doesn't handle stores, those are
harder because we need to make sure intermediate memory states
aren't observed (which is hard to do with rewrite rules).
Use one byte shorter instructions for zero-extending loads.
Update #14267
Change-Id: I40af25ab5208488151ba7db32bf96081878fa7d9
Reviewed-on: https://go-review.googlesource.com/20218
Reviewed-by: Alexandru Moșoi <alexandru@mosoi.ro>
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
uint8(s.b & 0xff) ought to produce same code as uint8(s.b)
but it did not. RLH found this one looking for moles to
whack in the GC code.
Change-Id: I883d68ec7a5746d652712be84a274a11256b3b33
Reviewed-on: https://go-review.googlesource.com/20141
Reviewed-by: Keith Randall <khr@golang.org>
Do some easy TODOs.
Move a bunch of other TODOs into bugs.
Change-Id: Iaba9dad6221a2af11b3cbcc512875f4a85842873
Reviewed-on: https://go-review.googlesource.com/20114
Run-TryBot: Todd Neal <todd@tneal.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Todd Neal <todd@tneal.org>
Add a blank line before the "package ssa" lines so the "autogenerated
don't edit" comments don't end up in godoc output.
Change-Id: I82bf90d52d426ce1a8e21483fc8f47b3689259c7
Reviewed-on: https://go-review.googlesource.com/20086
Reviewed-by: Keith Randall <khr@golang.org>
* This is a very basic form of straight line strength reduction.
* Removes one multiplication from a[b].c++; a[b+1].c++
* It increases pressure on the register allocator because
CSE creates more copies of the multiplication sizeof(a[0])*b.
Change-Id: I686a18e9c24cc6f8bdfa925713afed034f7d36d0
Reviewed-on: https://go-review.googlesource.com/20091
Run-TryBot: Alexandru Moșoi <alexandru@mosoi.ro>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
The @ directive used to read the target block after some value
structure had already changed. I don't think it was ever really
a bug, but it's confusing.
It might fail like this:
(Foo x y) -> @v.Args[0].Block (Bar y (Baz ...))
v.Op = Bar
v.Args[0] = y
v.Args[1] = v.Args[0].Block.NewValue(Baz, ...)
That new value is allocated in the block of y, not the
block of x.
Anyway, read the destination block first so this
potential bug can't happen.
Change-Id: Ie41d2fc349b35cefaa319fa9327808bcb781b4e2
Reviewed-on: https://go-review.googlesource.com/19900
Run-TryBot: Todd Neal <todd@tneal.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Todd Neal <todd@tneal.org>
Saves about 2k for binaries in pkg/tool/linux_amd64.
Also useful when opt runs after cse (as in 12960) which reorders
arguments for commutative operations such as Add64.
Change-Id: I49ad53afa53db9736bd35c425f4fb35fb511fd63
Reviewed-on: https://go-review.googlesource.com/19827
Run-TryBot: Alexandru Moșoi <alexandru@mosoi.ro>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Loads of stores from the same pointer with compatible types
can be replaced with a copy.
Change-Id: I514b3ed8e5b6a9c432946880eac67a51b1607932
Reviewed-on: https://go-review.googlesource.com/19743
Run-TryBot: Todd Neal <todd@tneal.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Also throw in a few more shift constant folding.
Change-Id: Iabe00596987f594e0686fbac3d76376d94612340
Reviewed-on: https://go-review.googlesource.com/19543
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
ANDs of constants whose only set bits are leading or trailing can be
rewritten as two shifts instead. This is slightly faster for 32 or
64 bit operands.
Change-Id: Id5c1ff27e5a4df22fac67b03b9bddb944871145d
Reviewed-on: https://go-review.googlesource.com/19485
Run-TryBot: Todd Neal <todd@tneal.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Found by inspecting random generated code.
Change-Id: I57d0fed7c3a8dc91fd13cdccb4819101f9976ec9
Reviewed-on: https://go-review.googlesource.com/19413
Reviewed-by: Keith Randall <khr@golang.org>
* Phis can have variable number of arguments, but rulegen assumed that
each operation has fixed number of arguments.
* Rewriting Phis is necessary to handle the following case:
func f1_ssa(a bool, x int) int {
v := 0
if a {
v = -1
} else {
v = -1
}
return x|v
}
Change-Id: Iff6bd411b854f3d1d6d3ce21934bf566757094f2
Reviewed-on: https://go-review.googlesource.com/19412
Reviewed-by: Keith Randall <khr@golang.org>
The frontend does this for 32 bits and below, but SSA needs
to do it for 64 bits. The algorithms are all copied from
cgen.go:cgen_div.
Speeds up TimeFormat substantially: ~40% slower to ~10% slower.
Change-Id: I023ea2eb6040df98ccd9105e15ca6ea695610a7a
Reviewed-on: https://go-review.googlesource.com/19302
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Todd Neal <todd@tneal.org>
* Enclose each rule's code in a for with no condition
* The loop is ran at most once because it's always terminated by a return.
* Use break when matching condition fails
* Drop rule hashes
* Shaves about 3 lines of code per rule
The binary size is not afected.
Change-Id: I27c3e40dc8cae98dcd50739342dc38db2ef9c247
Reviewed-on: https://go-review.googlesource.com/19220
Reviewed-by: Keith Randall <khr@golang.org>
Shaves about 3 lines per generated rule.
Change-Id: I94adc94ab79f90ac5fd033f896ece3b1eddf0f3d
Reviewed-on: https://go-review.googlesource.com/19197
Reviewed-by: Keith Randall <khr@golang.org>