This reverts commit 6616573982, corresponding to CL 436915.
Reason for revert: this is causing some bootstrap build problems with older versions of Go 1.17, as I understand it. Still under investigation.
Change-Id: Idb6e17ff7b47004cbf87f967af6d84f214d8abb5
Reviewed-on: https://go-review.googlesource.com/c/go/+/435471
Reviewed-by: David Chase <drchase@google.com>
The minimum bootstrap version for Go ≥ 1.20 is Go 1.17. That version
supports the new style //go:build lines. Thus the old style //+build
lines can be dropped in this part of the tree as well. Leave the
//+build lines in cmd/dist which will ensure the minimum Go version
during bootstrap.
As suggested by Cherry during review of CL 430496
For #44505
Change-Id: If53c0b02cacbfb055a33e73cfd38578dfd3aa340
Reviewed-on: https://go-review.googlesource.com/c/go/+/436915
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
Auto-Submit: Tobias Klauser <tobias.klauser@gmail.com>
Reviewed-by: Than McIntosh <thanm@google.com>
For #45557
Change-Id: I56824135d86452603dd4ed4bab0e24c201bb0683
Reviewed-on: https://go-review.googlesource.com/c/go/+/426257
Run-TryBot: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Run-TryBot: Andy Pan <panjf2000@gmail.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Performance is hard to quantify exactly.
One big difference between jump tables and the old binary search
scheme is that there's only 1 branch statement instead of O(n) of
them. That can be both a blessing and a curse, and can make evaluating
jump tables very hard to do.
The single branch can become a choke point for the hardware branch
predictor. A branch table jump must fit all of its state in a single
branch predictor entry (technically, a branch target predictor entry).
With binary search that predictor state can be spread among lots of
entries. In cases where the case selection is repetitive and thus
predictable, binary search can perform better.
The big win for a jump table is that it doesn't consume so much of the
branch predictor's resources. But that benefit is essentially never
observed in microbenchmarks, because the branch predictor can easily
keep state for all the binary search branches in a microbenchmark. So
that benefit is really hard to measure.
So predictable switch microbenchmarks are ~useless - they will almost
always favor the binary search scheme. Fully unpredictable switch
microbenchmarks are better, as they aren't lying to us quite so
much. In a perfectly unpredictable situation, a jump table will expect
to incur 1-1/N branch mispredicts, where a binary search would incur
lg(N)/2 of them. That makes the crossover point at about N=4. But of
course switches in real programs are seldom fully unpredictable, so
we'll use a higher crossover point.
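The crossover estimate above can be checked with a few lines of Go. This
is only a sketch of the arithmetic in this message (1-1/N versus lg(N)/2
expected mispredicts for a fully unpredictable switch); the function
names are made up for illustration and are not part of the compiler.

```go
package main

import (
	"fmt"
	"math"
)

// jumpTableMispredicts is the expected number of mispredicts per switch for
// a jump table with n equally likely, fully unpredictable cases: 1 - 1/n.
func jumpTableMispredicts(n int) float64 {
	return 1 - 1/float64(n)
}

// binarySearchMispredicts is the expected number of mispredicts per switch
// for a binary search over n cases under the same assumption: lg(n)/2.
func binarySearchMispredicts(n int) float64 {
	return math.Log2(float64(n)) / 2
}

func main() {
	for _, n := range []int{2, 4, 8, 16, 32} {
		fmt.Printf("N=%2d  jump table=%.3f  binary search=%.3f\n",
			n, jumpTableMispredicts(n), binarySearchMispredicts(n))
	}
}
```

The two estimates are equal at N=2, and from N=4 onward the jump table
is expected to mispredict less often.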
Beyond the branch predictor, jump tables tend to execute more
instructions per switch but have no additional instructions per case,
which also argues for a larger crossover.
As far as code size goes, with this CL cmd/go has a slightly smaller
code segment and a slightly larger overall size (from the jump tables
themselves which live in the data segment).
This is a case where some FDO (feedback-directed optimization) would
be really nice to have. #28262
Some large-program benchmarks might help make the case for this
CL. Especially if we can turn on branch mispredict counters so we can
see how much using jump tables can free up branch prediction resources
that can be gainfully used elsewhere in the program.
name                   old time/op  new time/op  delta
Switch8Predictable     1.89ns ± 2%  1.27ns ± 3%  -32.58%  (p=0.000 n=9+10)
Switch8Unpredictable   9.33ns ± 1%  7.50ns ± 1%  -19.60%  (p=0.000 n=10+9)
Switch32Predictable    2.20ns ± 2%  1.64ns ± 1%  -25.39%  (p=0.000 n=10+9)
Switch32Unpredictable  10.0ns ± 2%   7.6ns ± 2%  -24.04%  (p=0.000 n=10+10)
Fixes #5496
Update #34381
Change-Id: I3ff56011d02be53f605ca5fd3fb96b905517c34f
Reviewed-on: https://go-review.googlesource.com/c/go/+/357330
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Keith Randall <khr@google.com>
The go/build package needs access to this configuration,
so move it into a new package available to the standard library.
Change-Id: I868a94148b52350c76116451f4ad9191246adcff
Reviewed-on: https://go-review.googlesource.com/c/go/+/310731
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Jay Conrod <jayconrod@google.com>
If I change a rule in ARM64.rules to use the variable name "b" in a
conflicting way, rulegen would previously not complain, and the compiler
would later give a confusing error:
$ go run *.go && go build cmd/compile/internal/ssa
# cmd/compile/internal/ssa
../rewriteARM64.go:24236:10: b.NewValue0 undefined (type int64 has no field or method NewValue0)
Make rulegen complain early about those cases. Sometimes they might
happen to be harmless, but in general they can easily cause confusion or
unintended effect due to shadowing.
After the change, with the same conflicting rule:
$ go run *.go && go build cmd/compile/internal/ssa
2021/03/22 11:31:49 rule ARM64.rules:495 uses the reserved name b
exit status 1
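A minimal sketch of such an early check, assuming the generator keeps a
set of names that the generated code uses for itself. Only "b" and "typ"
are taken from this message; the helper name checkRuleVars and its
signature are hypothetical, not rulegen's actual code.

```go
package main

import "fmt"

// reserved holds names assumed to be used by the generated code itself.
// Only "b" and "typ" come from the commit message; the real set may differ.
var reserved = map[string]bool{
	"b":   true,
	"typ": true,
}

// checkRuleVars returns an error if any variable bound by a rule would
// shadow a reserved name. loc is a "file:line" string for the rule.
func checkRuleVars(loc string, vars []string) error {
	for _, name := range vars {
		if reserved[name] {
			return fmt.Errorf("rule %s uses the reserved name %s", loc, name)
		}
	}
	return nil
}

func main() {
	fmt.Println(checkRuleVars("ARM64.rules:495", []string{"x", "b"}))
	fmt.Println(checkRuleVars("ARM64.rules:495", []string{"x", "y", "z"}))
}
```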
Note that 24 existing rules were using reserved names. It seems like the
shadowing was harmless, as it wasn't causing typechecking issues nor did
it seem to cause unintended behavior when the rule rewrite code ran.
The bool values "b" were renamed "t", since that seems to have a
precedent in other rules and in the fmt package.
Sequential values like "a b c" were renamed to "x y z", since "b" is
reserved.
Finally, "typ" was renamed to "_typ", since there doesn't seem to be an
obviously better answer.
Passes all three of:
$ GOARCH=amd64 go build -toolexec 'toolstash -cmp' -a std
$ GOARCH=arm64 go build -toolexec 'toolstash -cmp' -a std
$ GOARCH=mips64 go build -toolexec 'toolstash -cmp' -a std
Fixes #45154.
Change-Id: I1cce194dc7b477886a9c218c17973e996bcedccf
Reviewed-on: https://go-review.googlesource.com/c/go/+/303549
Trust: Daniel Martí <mvdan@mvdan.cc>
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
This doesn't change any behavior, but should help the compiler realise
that these funcs really do nothing at all.
Change-Id: Ib26c02ef264691acac983538ec300f91d6ff98db
Reviewed-on: https://go-review.googlesource.com/c/go/+/280314
Trust: Daniel Martí <mvdan@mvdan.cc>
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
This is a proof-of-concept change for using the g register on
AMD64. getg is now lowered to R14 in the new ABI. The g register
is not yet used in all places where it can be used (e.g. stack
bounds check, runtime assembly code).
Change-Id: I10123ddf38e31782cf58bafcdff170aee0ff0d1b
Reviewed-on: https://go-review.googlesource.com/c/go/+/289196
Trust: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Than McIntosh <thanm@google.com>
Reviewed-by: David Chase <drchase@google.com>
The type of the shift count must be an unsigned integer. Some s390x
rules for shift have their auxint type being int8. This results in a
compilation failure on s390x with an invalid operation when running
make.bash using older versions of go (e.g: go1.10.4).
This CL adds an auxint type of uint8 and changes the ops for shift and
rotate to use auxint with type uint8. The related rules are also
modified to address this change.
Fixes #43090
Change-Id: I594274b6e3d9b23092fc9e9f4b354870164f2f19
Reviewed-on: https://go-review.googlesource.com/c/go/+/277078
Reviewed-by: Keith Randall <khr@golang.org>
Trust: Dmitri Shuralyov <dmitshur@golang.org>
This change removes support in rulegen for untyped -> ssa rules.
Change-Id: I202018e191fc74f027243351bc8cf96145f2482c
Reviewed-on: https://go-review.googlesource.com/c/go/+/264679
Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Trust: Alberto Donizetti <alb.donizetti@gmail.com>
Includes a few tweaks to Value.copyOf(a) (make it a no-op for
a self-copy) and a new pattern hack: "___" (3 underscores) is
like an ellipsis, except the replacement doesn't need to have
matching ellipsis/underscores.
Moved the arg-length check in generated pattern-matching code
BEFORE the args are probed, because not all instances of
variable length OpFoo will have all the args mentioned in
some rule for OpFoo, and when that happens, the compiler
panics without the early check.
Change-Id: I66de40672b3794a6427890ff96c805a488d783f4
Reviewed-on: https://go-review.googlesource.com/c/go/+/247537
Trust: David Chase <drchase@google.com>
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
This is prerequisite to moving call expansion later into SSA,
and probably a good idea anyway. Passes tests.
This is the first minimal CL that does a 1-for-1 substitution
of *ssa.AuxCall for *obj.LSym. Next step (next CL) is to make
this change for all calls so that additional information can
be stored in AuxCall.
Change-Id: Ia3a7715648fd9fb1a176850767a726e6f5b959eb
Reviewed-on: https://go-review.googlesource.com/c/go/+/237680
Trust: David Chase <drchase@google.com>
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
The current implementation stores the comparison pseudo-ops of arm64
conditional instructions (CSEL/CSEL0) in Aux. This patch stores them
in AuxInt instead, which avoids the allocation.
Change-Id: I0b69e51f63acd84c6878c6a59ccf6417501a8cfc
Reviewed-on: https://go-review.googlesource.com/c/go/+/252517
Run-TryBot: fannie zhang <Fannie.Zhang@arm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
This patch adds the ARM64Bitfield auxInt to auxIntType() and
returns its Go type as "arm64Bitfield", which is defined
as int16.
The Go type of the SymOff auxInt is int32, but some functions
(such as min(), areAdjacentOffsets() and read16/32/64())
take it as an input parameter and treat it as int64, so
this patch adds type conversions to the affected rules.
Passes toolstash-check -all.
Change-Id: Ib234b48d0a97ef244dd37878e06b5825316dd782
Reviewed-on: https://go-review.googlesource.com/c/go/+/234378
Reviewed-by: Keith Randall <khr@golang.org>
Encode the flag results in an auxint field instead of having
one opcode per flag state. This helps us handle the new *noov
branches in a unified manner.
This is only for arm, arm64 is in a subsequent CL.
We could extend to other architectures as well, although it would
only be cleanup, no behavioral change.
Update #39505
Change-Id: Ia46cea596faad540d1496c5915ab1274571543f0
Reviewed-on: https://go-review.googlesource.com/c/go/+/238077
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
This CL changes the arm64 TBZ/TBNZ block from using Aux to using
a (typed) AuxInt. The corresponding rules have also been changed
to be typed.
Passes
GOARCH=arm64 gotip build -toolexec 'toolstash -cmp' -a std
Change-Id: I98d0cd2a791948f1db13259c17fb1b9b2807a043
Reviewed-on: https://go-review.googlesource.com/c/go/+/230839
Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
We only generate if statements via CondBreak, which is nice as the
control flow is simple and easy to work with. It seems like the If type
was added but never used, so remove it to avoid confusion.
We had a TODO about replacing CondBreak with If instead. I gave that a
try, but it doesn't seem worth the effort. The code gets more complex
and we don't really win anything in return.
While at it, don't use op strings as format strings in exprf. This
doesn't cause any issue at the moment, but it's best to be explicit
about the operator not containing any formatting verbs.
Change-Id: Ib59ad72d3628bf91594efc609e222232ad1e8748
Reviewed-on: https://go-review.googlesource.com/c/go/+/230257
Reviewed-by: Keith Randall <khr@golang.org>
The commuteDepth variable is no longer necessary; remove it.
Else branches after a log.Fatal call are unnecessary.
Also make the unbalanced return an integer, so we can differentiate
positive from negative cases. We only want to continue a rule with the
following lines if this balance is positive, for example.
While at it, make the balance loop stop when it goes negative, to not
let ")(" seem balanced.
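A minimal sketch of a balance function with the behaviour described
above: it returns an integer rather than a bool, and it stops as soon as
the count goes negative. The exact name and return values in rulegen may
differ; this is an illustration only.

```go
package main

import "fmt"

// balance returns the number of unclosed '(' in s. It returns -1 as soon as
// a ')' appears without a matching '(', so that ")(" is not reported as
// balanced.
func balance(s string) int {
	n := 0
	for _, c := range s {
		switch c {
		case '(':
			n++
		case ')':
			n--
			if n < 0 {
				return -1
			}
		}
	}
	return n
}

func main() {
	fmt.Println(balance("(Add x"))    // positive: rule continues on the next line
	fmt.Println(balance("(Add x y)")) // zero: rule is complete
	fmt.Println(balance(")("))        // negative: malformed
}
```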
Change-Id: I8aa313343ca5a2f07f638b62a0398fdf108fc9eb
Reviewed-on: https://go-review.googlesource.com/c/go/+/228822
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Keep track of all expressions encountered while
generating a rewrite result, and re-use them whenever possible.
Named expressions may still be used for clarity when desired.
Change-Id: I640dca108763eb8baeff8f9a4169300af3445b82
Reviewed-on: https://go-review.googlesource.com/c/go/+/229800
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
When generating code for unsigned equals (==) and not equals (!=)
comparisons we currently, on s390x, always use signed comparisons.
This mostly works well, however signed comparisons on s390x sign
extend their immediates and unsigned comparisons zero extend them.
For compare-and-branch instructions which can only have 8-bit
immediates this significantly changes the range of immediate values
we can represent: [-128, 127] for signed comparisons and [0, 255]
for unsigned comparisons.
When generating equals and not equals checks we don't need to worry
about whether the comparison is signed or unsigned. This CL
therefore adds rules to allow us to switch signedness for such
comparisons if it means that it brings a constant into range for an
8-bit immediate.
For example, a signed equals with an integer in the range [128, 255]
will now be implemented using an unsigned compare-and-branch
instruction rather than separate compare and branch instructions.
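The range reasoning above can be sketched as follows. This models only
the immediate-range check for == and != comparisons against a constant;
the function names and the string results are hypothetical and do not
correspond to actual s390x rules or opcodes.

```go
package main

import "fmt"

// fitsInt8 reports whether c fits a sign-extended 8-bit immediate.
func fitsInt8(c int64) bool { return c >= -128 && c <= 127 }

// fitsUint8 reports whether c fits a zero-extended 8-bit immediate.
func fitsUint8(c int64) bool { return c >= 0 && c <= 255 }

// pickEqCompare says which kind of compare-and-branch could encode an
// equality test against c. Signedness does not affect == or !=, so either
// encoding is acceptable as long as the constant is in range.
func pickEqCompare(c int64) string {
	switch {
	case fitsInt8(c):
		return "signed compare-and-branch"
	case fitsUint8(c):
		return "unsigned compare-and-branch"
	default:
		return "separate compare and branch"
	}
}

func main() {
	for _, c := range []int64{-5, 100, 200, 300} {
		fmt.Printf("x == %d: %s\n", c, pickEqCompare(c))
	}
}
```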
As part of this change I've also added support for adding a name
to block control values using the same `x:(...)` syntax we use for
value rules.
Triggers 792 times when compiling cmd and std.
Change-Id: I77fa80a128f0a8ce51a2888d1e384bd5e9b61a77
Reviewed-on: https://go-review.googlesource.com/c/go/+/228642
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
The types used while generating code, such as Rule and File, have been
exported for a while. This is harmless for a main package, and lets us
easily differentiate types from variables and functions, as well as use
names like "If" since "if" is a keyword.
However, the fields remained unexported. This was a bit inconsistent,
and also meant that we couldn't use some intuitive names like If.else.
Export them.
Besides the capitalization, the only change is that the If type now has
the fields Then and Else, instead of stmt and alt.
Change-Id: I426ff140c6ca186fec394f17b29165861da5fd98
Reviewed-on: https://go-review.googlesource.com/c/go/+/228821
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
This first pass makes the rules using the condition code mask
(CCMask) and rotate parameters (RotateParams) aux values strongly
typed. This required adding strongly typed aux handling to the
block rulegen.
More CLs like this to follow, but this is probably the most
complex.
Passes toolstash-check -all.
Change-Id: Ie513b07d527f0c1b398d7748331442dcb5f7b17d
Reviewed-on: https://go-review.googlesource.com/c/go/+/228518
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Otherwise, just copying the aux and auxint fields doesn't make much sense.
(Although there's no bug - it just means it isn't typechecked correctly.)
Change-Id: I4e21ac67f0c7bfd04ed5af1713cd24bca08af092
Reviewed-on: https://go-review.googlesource.com/c/go/+/227962
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Surprisingly many rules needed no modification.
Use wrapper functions for aux like we did for auxint.
Simplifies things a bit.
Change-Id: I2e852e77f1585dcb306a976ab9335f1ac5b4a770
Reviewed-on: https://go-review.googlesource.com/c/go/+/227961
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Munday <mike.munday@ibm.com>
Move a lot of the constant folding rules to use strongly
typed AuxInt fields.
We need more than a cast to convert AuxInt to, e.g., float32.
Make conversion functions for converting back and forth.
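A sketch of why a plain cast is not enough, assuming the float constant
is kept in the int64 AuxInt as the bit pattern of a float64. The helper
names mirror the idea of the conversion functions but should be read as
illustrative, not as the exact functions added by this CL.

```go
package main

import (
	"fmt"
	"math"
)

// float32ToAuxInt stores f in an int64 as the bits of the equivalent float64
// (assumed encoding).
func float32ToAuxInt(f float32) int64 {
	return int64(math.Float64bits(float64(f)))
}

// auxIntToFloat32 reverses float32ToAuxInt.
func auxIntToFloat32(i int64) float32 {
	return float32(math.Float64frombits(uint64(i)))
}

func main() {
	aux := float32ToAuxInt(1.5)
	fmt.Println(auxIntToFloat32(aux)) // round-trips through the bit pattern
	fmt.Println(float32(aux))         // a numeric cast gives an unrelated value
}
```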
Change-Id: Ia3d95ee3583ee2179a10938e20210a7617358c88
Reviewed-on: https://go-review.googlesource.com/c/go/+/227866
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Giovanni Bajo <rasky@develer.com>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Will help with strongly typed rewrite rules.
Change-Id: Ifbf316a49f4081322b3b8f13bc962713437d9aba
Reviewed-on: https://go-review.googlesource.com/c/go/+/227785
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
Right now the Aux and AuxInt fields of ssa.Values are typed as
interface{} and int64, respectively. Each rule that uses these values
must cast them to the type they actually are (*obj.LSym, or int32, or
ValAndOff, etc.), use them, and then cast them back to interface{} or
int64.
We know for each opcode what the types of the Aux and AuxInt fields
should be. So let's modify the rule generator to declare the types to
be what we know they should be, autoconverting to and from the generic
types for us. That way we can make the rules more type safe.
It's difficult to make a single CL for this, so I've coopted the "=>"
token to indicate a rule that is strongly typed. "->" rules are
processed as before. That will let us migrate a few rules at a time in
separate CLs. Hopefully we can reach a state where all rules are
strongly typed and we can drop the distinction.
This CL changes just a few rules to get a feel for what this
transition would look like.
I've decided not to put explicit types in the rules. I think it
makes the rules somewhat clearer, but definitely more verbose.
In particular, the passthrough rules that don't modify the fields
in question are verbose for no real reason.
Change-Id: I63a1b789ac5702e7caf7934cd49f784235d1d73d
Reviewed-on: https://go-review.googlesource.com/c/go/+/190197
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Retrying CL 222782, with a fix that will hopefully stop the random crashing.
The issue with the previous CL is that it does pointer arithmetic
in a way that may briefly generate an out-of-bounds pointer. If an
interrupt happens to occur in that state, the referenced object may
be collected incorrectly.
Suppose there was code that did s[x+c]. The previous CL had a rule
to the effect of ptr + (x + c) -> c + (ptr + x). But ptr+x is not
guaranteed to point to the same object as ptr. In contrast,
ptr+(x+c) is guaranteed to point to the same object as ptr, because
we would have already checked that x+c is in bounds.
For example, strconv.trim used to have this code:
MOVZX -0x1(BX)(DX*1), BP
CMPL $0x30, AL
After CL 222782, it had this code:
LEAL 0(BX)(DX*1), BP
CMPB $0x30, -0x1(BP)
An interrupt between those last two instructions could see BP pointing
outside the backing store of the slice involved.
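The problem can be illustrated with plain integer offsets instead of real
pointers. This is a sketch only: inObject is a hypothetical helper, and
the values are chosen to show the case where x+c is in bounds while x
alone is not.

```go
package main

import "fmt"

// inObject reports whether byte offset off lies inside an object of n bytes.
func inObject(off, n int) bool {
	return off >= 0 && off < n
}

func main() {
	const n = 8   // length of the byte slice s
	x, c := 8, -1 // s[x+c] is s[7], which passes the bounds check

	// ptr+(x+c): the only pointer ever formed is in bounds.
	fmt.Println(inObject(x+c, n)) // true

	// c+(ptr+x): the intermediate ptr+x is one past the end of s.
	fmt.Println(inObject(x, n)) // false
}
```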
It's really hard to actually demonstrate a bug. First, you need to
have an interrupt occur at exactly the right time. Then, there must
be no other pointers to the object in question. Since the interrupted
frame will be scanned conservatively, there can't even be a dead
pointer in another register or on the stack. (In the example above,
a bug can't happen because BX still holds the original pointer.)
Then, the object in question needs to be collected (or at least
scanned?) before the interrupted code continues.
This CL needs to handle load combining somewhat differently than CL 222782
because of the new restriction on arithmetic. That's the only real
difference (other than removing the bad rules) from that old CL.
This bug is also present in the amd64 rewrite rules, and we haven't
seen any crashing as a result. I will fix up that code similarly to
this one in a separate CL.
Update #37881
Change-Id: I5f0d584d9bef4696bfe89a61ef0a27c8d507329f
Reviewed-on: https://go-review.googlesource.com/c/go/+/225798
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
In cases in which we had a named value whose args were all _,
like this rule from ARM.rules:
(MOVBUreg x:(MOVBUload _ _)) -> (MOVWreg x)
We previously inserted
_ = x.Args[1]
even though it is unnecessary.
This change eliminates this pointless bounds check.
And in other cases, we now check bounds just as far as strictly necessary.
No significant movement on any compiler metrics.
Just nicer (and less) code.
Passes toolstash-check -all.
Change-Id: I075dfe9f926cc561cdc705e9ddaab563164bed3a
Reviewed-on: https://go-review.googlesource.com/c/go/+/221781
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
This:
* Simplifies and shortens the generated code for rewrite rules.
* Shrinks cmd/compile by 86k (0.4%) and makes it easier to compile.
* Removes the stmt boundary code wrangling from Value.reset,
in favor of doing it in the one place where it actually does some work,
namely the writebarrier pass. (This was ascertained by inspecting the
code for cases in which notStmtBoundary values were generated.)
Passes toolstash-check -all.
Change-Id: I25671d4c4bbd772f235195d11da090878ea2cc07
Reviewed-on: https://go-review.googlesource.com/c/go/+/221421
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
The goal here is improved AuxInt printing in ssa.html.
Instead of displaying an inscrutable encoded integer,
it displays something like
v25 (28) = UBFX <int> [lsb=4,width=8] v52
which is much nicer for debugging.
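A sketch of how such a readable form might be produced with a String
method. The packing used here (lsb in the high byte, width in the low
byte of a 16-bit value) and the type name bitfield are assumptions made
for the example, not necessarily the compiler's encoding.

```go
package main

import "fmt"

// bitfield packs an lsb and a width into 16 bits (assumed layout:
// lsb<<8 | width).
type bitfield int16

// makeBitfield builds a bitfield; lsb and width are assumed to be in [0, 127].
func makeBitfield(lsb, width int) bitfield {
	return bitfield(width | lsb<<8)
}

func (b bitfield) lsb() int   { return int(uint16(b) >> 8) }
func (b bitfield) width() int { return int(b) & 0xff }

// String prints the decoded fields instead of the raw integer.
func (b bitfield) String() string {
	return fmt.Sprintf("lsb=%d,width=%d", b.lsb(), b.width())
}

func main() {
	bf := makeBitfield(4, 8)
	fmt.Printf("raw=%d decoded=[%v]\n", int16(bf), bf)
}
```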
Change-Id: I40713ff7f4a857c4557486cdf73c2dff137511ca
Reviewed-on: https://go-review.googlesource.com/c/go/+/221420
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
These detect opportunities to convert a rule to use an ellipsis,
and provide better error messages when something goes wrong.
This change was used to generate all the preceding changes
converting rules to use ellipses. This change is at the end of those
changes rather than the beginning in order to avoid log spam during rule
generation (say during a git bisection).
The preceding changes collectively shrink the cmd/compile binary by ~2.2%.
Part of this detection is also warning when the presence of an
unmentioned aux or auxint could cause conversion to an ellipsis
rule to change the semantics of the rule.
For example:
(Div64 x y) -> (DIV x y)
looks like a promising rule for an ellipsis. However, Div64 has an auxint,
and (on most platforms) DIV does not. An ellipsis rule would keep the
auxint intact, rather than zeroing it, which can interfere with CSE.
So this change flags this rule as doing implicit zeroing;
it should be replaced by
(Div64 [a] x y) -> (DIV x y)
which makes it clear that the auxint is being zeroed.
This detection is not foolproof, but it currently has no false positives.
If false positives arise in the future, we will need to gate the output.
Change-Id: Ie21f284579e5d6e75aa304d0deb024d41ede528b
Reviewed-on: https://go-review.googlesource.com/c/go/+/217014
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
Since rulegen is only tested by inspecting and running its output code,
we have no good way to see if any chunks of its source are actually
being unused.
Code coverage only works as part of 'go test', since it needs to
instrument our code. Add a script that sets up a tiny test for that
purpose, with a quick example on how to use it.
We need to use a script, because there's no other way to make this work
without breaking 'go run *.go'. It's far more common to run the
generator than to obtain a coverage profile, so this solution seems like
the right tradeoff, and we don't break existing users.
The script isn't terribly portable, but that's okay for now.
At the time of writing, coverage sits at 89.7%. I've manually skimmed
main.go and rulegen.go, and practically all unused code is either error
handling, or optional code like *genLog and "if false". A couple of
small exceptions stand out, though I'm not paying attention to them in
this CL.
While at it, inline a couple of tiny unusedInspector methods that were
only needed once or twice.
Change-Id: I78c5fb47c8536d70e546a437637d4428ec7adfaa
Reviewed-on: https://go-review.googlesource.com/c/go/+/212760
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
This requires threading location information through varCount.
This provides much more useful error messages.
Change-Id: If5ff942cbbbf386724eda15a523c181c137fac20
Reviewed-on: https://go-review.googlesource.com/c/go/+/216221
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
We had three implementations.
Refactor, and document the shared implementation.
While we're here, improve the docs for func unbalanced.
Change-Id: I612cce79de15a864247afe377d3739d04a56b9bc
Reviewed-on: https://go-review.googlesource.com/c/go/+/216219
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
CL 213703 converted generated rewrite rules for commutative ops
to use loops instead of duplicated code.
However, it loaded args using expressions like
v.Args[i] and v.Args[i^1], which the compiler could
not eliminate bounds for (including with all outstanding
prove CLs).
Also, given a series of separate rewrite rules for the same op,
we generated bounds checks for every rewrite rule, even though
we were repeatedly loading the same set of args.
This change reduces both sets of bounds checks.
Instead of loading v.Args[i] and v.Args[i^1] for commutative loops,
we now preload v.Args[0] and v.Args[1] into local variables,
and then swap them (as needed) in the commutative loop post statement.
And we now load all top level v.Args into local variables
at the beginning of every rewrite rule function.
The second optimization is the more significant,
but the first helps a little, and they play together
nicely from the perspective of generating the code.
This does increase register pressure, but the reduced bounds
checks more than compensate.
Note that the vast majority of rewrite rules evaluated
are not applied, so the prologue is the most important
part of the rewrite rules.
There is one subtle aspect to the new generated code.
Because the top level v.Args are shared across rewrite rules,
and rule evaluation can swap v_0 and v_1, v_0 and v_1
can end up being swapped from one rule to the next.
That is OK, because any time a rule does not get applied,
they will have been swapped exactly twice.
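The swap-in-the-post-statement pattern can be sketched like this.
Strings stand in for SSA values, and tryBothOrders and its match callback
are hypothetical names. The point is that a loop that finds no match
leaves the two operands in their original order.

```go
package main

import "fmt"

// tryBothOrders tries match on (a, b) and then on (b, a). The operands are
// swapped in the loop post statement, so when nothing matches they have been
// swapped exactly twice and are returned in their original order.
func tryBothOrders(a, b string, match func(x, y string) bool) (matched bool, v0, v1 string) {
	v0, v1 = a, b
	for i := 0; i <= 1; i, v0, v1 = i+1, v1, v0 {
		if match(v0, v1) {
			return true, v0, v1
		}
	}
	return false, v0, v1
}

func main() {
	constFirst := func(x, _ string) bool { return x == "const" }
	ok, v0, v1 := tryBothOrders("x", "const", constFirst)
	fmt.Println(ok, v0, v1) // matched on the second ordering

	never := func(_, _ string) bool { return false }
	ok, v0, v1 = tryBothOrders("x", "const", never)
	fmt.Println(ok, v0, v1) // no match, original order restored
}
```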
Passes toolstash-check -all.
name        old time/op  new time/op  delta
Template     213ms ± 2%   211ms ± 2%  -0.85%  (p=0.000 n=92+96)
Unicode     83.5ms ± 2%  83.2ms ± 2%  -0.41%  (p=0.004 n=95+90)
GoTypes      737ms ± 2%   733ms ± 2%  -0.51%  (p=0.000 n=91+94)
Compiler     3.45s ± 2%   3.43s ± 2%  -0.44%  (p=0.000 n=99+100)
SSA          8.54s ± 1%   8.32s ± 2%  -2.56%  (p=0.000 n=96+99)
Flate        136ms ± 2%   135ms ± 1%  -0.47%  (p=0.000 n=96+96)
GoParser     169ms ± 1%   168ms ± 1%  -0.33%  (p=0.000 n=96+93)
Reflect      456ms ± 3%   455ms ± 3%    ~     (p=0.261 n=95+94)
Tar          186ms ± 2%   185ms ± 2%  -0.48%  (p=0.000 n=94+95)
XML          251ms ± 1%   250ms ± 1%  -0.51%  (p=0.000 n=91+94)
[Geo mean]   424ms        421ms       -0.68%
name        old user-time/op  new user-time/op  delta
Template          275ms ± 1%        274ms ± 2%  -0.55%  (p=0.000 n=95+98)
Unicode           118ms ± 4%        118ms ± 4%    ~     (p=0.642 n=98+90)
GoTypes           983ms ± 1%        980ms ± 1%  -0.30%  (p=0.000 n=93+93)
Compiler          4.56s ± 6%        4.52s ± 6%  -0.72%  (p=0.003 n=100+100)
SSA               11.4s ± 1%        11.1s ± 1%  -2.50%  (p=0.000 n=96+97)
Flate             168ms ± 1%        167ms ± 1%  -0.49%  (p=0.000 n=92+92)
GoParser          204ms ± 1%        204ms ± 2%  -0.27%  (p=0.003 n=99+96)
Reflect           599ms ± 2%        598ms ± 2%    ~     (p=0.116 n=95+92)
Tar               227ms ± 2%        225ms ± 2%  -0.57%  (p=0.000 n=95+98)
XML               313ms ± 2%        312ms ± 1%  -0.37%  (p=0.000 n=89+95)
[Geo mean]        547ms             544ms       -0.61%
file     before     after      Δ      %
compile  21113112   21109016   -4096  -0.019%
total    131704940  131700844  -4096  -0.003%
Change-Id: Id6c39e0367e597c0c75b8a4b1eb14cc3cbd11956
Reviewed-on: https://go-review.googlesource.com/c/go/+/216218
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
This change introduces a new syntax for rewrite rules
that only change a Value's Op. See #36380 for more discussion.
Updating rewrite rules to use ellipses will happen
in follow-up CLs.
Change-Id: I8c56e85de24607579d79729575c89ca80805ba5c
Reviewed-on: https://go-review.googlesource.com/c/go/+/213898
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
It's much easier to debug when you can see
the contents in order to interpret the error message.
Change-Id: I03bbb9dd3071aeca9577cc725a60d43f78118cf4
Reviewed-on: https://go-review.googlesource.com/c/go/+/215717
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
This documentation was lost in CL 213703.
This change restores it.
Change-Id: I544f15771d8a7390893efbda93478b46095ccf3c
Reviewed-on: https://go-review.googlesource.com/c/go/+/215541
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
I noticed some instances of "[ " and " ]" in the rewrite rules.
Normalizing them helps catch possible future duplicate rules.
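A minimal sketch of such a normalization, assuming it is enough to
collapse whitespace and drop the spaces just inside square brackets. The
function name and the exact set of replacements are illustrative, not
necessarily what rulegen does.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeSpaces collapses runs of whitespace to a single space and removes
// the space after "[" and before "]", so that rules differing only in
// spacing compare equal.
func normalizeSpaces(s string) string {
	s = strings.Join(strings.Fields(s), " ")
	s = strings.ReplaceAll(s, "[ ", "[")
	s = strings.ReplaceAll(s, " ]", "]")
	return s
}

func main() {
	fmt.Println(normalizeSpaces("(ADDconst [ c ]  x)"))
	fmt.Println(normalizeSpaces("(ADDconst [c] x)"))
}
```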
Change-Id: I892fd7e9b4019ed304f0a61fa2bb7f7e47ef8f38
Reviewed-on: https://go-review.googlesource.com/c/go/+/213682
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Prior to this change, we generated additional rules at rulegen time
for all possible combinations of args to commutative ops.
This is simple and works well, but leads to lots of generated rules.
This in turn has increased the size of the compiler,
made it hard to compile package ssa on small machines,
and provided a disincentive to mark some ops as commutative.
This change reworks how we handle commutative ops.
Instead of generating a rule per argument permutation,
we generate a series of nested loops, one for each commutative op.
Each loop tries both possible argument orderings.
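A sketch of the loop shape for a single commutative op, using a toy value
type rather than the real ssa.Value. The rule being matched, roughly
(Add (Const) x) in either argument order, and all names are made up for
the example.

```go
package main

import "fmt"

// value is a toy stand-in for an SSA value.
type value struct {
	op   string
	args []*value
}

// matchAddConst matches an Add with a Const operand in either position and
// returns the other operand. The loop replaces two separately generated
// rules, one per argument ordering.
func matchAddConst(v *value) (x *value, ok bool) {
	if v.op != "Add" || len(v.args) != 2 {
		return nil, false
	}
	for i := 0; i <= 1; i++ {
		c, other := v.args[i], v.args[i^1]
		if c.op == "Const" {
			return other, true
		}
	}
	return nil, false
}

func main() {
	x := &value{op: "Arg"}
	c := &value{op: "Const"}
	_, ok1 := matchAddConst(&value{op: "Add", args: []*value{c, x}})
	_, ok2 := matchAddConst(&value{op: "Add", args: []*value{x, c}})
	fmt.Println(ok1, ok2)
}
```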
I also considered attempting to canonicalize the inputs to the
rewrite rules. However, because either or both arguments might be
nothing more than an identifier, and because there can be arbitrary
conditions to evaluate during matching, I did not see how to proceed.
The duplicate rule detection now sorts arguments to commutative ops,
so that it can detect commutative-only duplicates.
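One way to picture the duplicate detection is a canonical key that sorts
the arguments of commutative ops. This sketch handles only flat rules
with plain identifier arguments; canonicalKey and its parameters are
hypothetical.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// canonicalKey builds a comparison key for a flat match expression. For a
// commutative op the arguments are sorted, so "(Add64 y x)" and
// "(Add64 x y)" produce the same key.
func canonicalKey(op string, args []string, commutative bool) string {
	sorted := append([]string(nil), args...) // do not modify the caller's slice
	if commutative {
		sort.Strings(sorted)
	}
	return "(" + op + " " + strings.Join(sorted, " ") + ")"
}

func main() {
	fmt.Println(canonicalKey("Add64", []string{"y", "x"}, true))
	fmt.Println(canonicalKey("Sub64", []string{"y", "x"}, false))
}
```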
There may be further optimizations to the new generated code.
In particular, we may not be removing as many bounds checks as before;
I have not investigated deeply. If more work here is needed,
we could do it with more hints or with improvements to the prove pass.
This change has almost no impact on the generated code.
It does not pass toolstash-check, however. In a handful of functions,
for reasons I do not understand, there are minor position changes.
For the entire series ending at this change,
there is negligible compiler performance impact.
The compiler binary shrinks by about 15%,
and package ssa shrinks by about 25%.
Package ssa also compiles ~25% faster with ~25% less memory.
Change-Id: Ia2ee9ceae7be08a17342319d4e31b0bb238a2ee4
Reviewed-on: https://go-review.googlesource.com/c/go/+/213703
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
The commutative rule generator has an optimization
where given (Add x y) with no other uses of x and y in the matching rule,
it doesn't generate the commutative match (Add y x).
However, if there is also a condition referring to x or y,
such as (Add x y) && isFoo(x), then we should generate the commutative rule.
This change parses the condition, extracts all idents, and takes them
into consideration.
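Since rule conditions are Go expressions, one way to extract the
identifiers is with go/parser and go/ast. This is a minimal sketch;
condIdents is a hypothetical helper and the real rulegen code may collect
identifiers differently.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"sort"
)

// condIdents parses a rule condition as a Go expression and returns the
// sorted set of identifier names that appear in it.
func condIdents(cond string) ([]string, error) {
	expr, err := parser.ParseExpr(cond)
	if err != nil {
		return nil, err
	}
	seen := map[string]bool{}
	ast.Inspect(expr, func(n ast.Node) bool {
		if id, ok := n.(*ast.Ident); ok {
			seen[id.Name] = true
		}
		return true
	})
	names := make([]string, 0, len(seen))
	for name := range seen {
		names = append(names, name)
	}
	sort.Strings(names)
	return names, nil
}

func main() {
	names, err := condIdents("isFoo(x)")
	fmt.Println(names, err)
}
```

With this, a condition such as isFoo(x) reports x as used, so the
commutative variant of (Add x y) would still be generated.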
This doesn't yield any new optimizations now.
However, it is the right thing to do;
otherwise we'll have to track it down and fix it again later.
It is also expensive now, in terms of additional generated code.
However, it will be much, much less expensive soon,
once our generated code for commutative ops gets smaller.
Change-Id: I52c2016c884bbc7789bf8dfe9b9c56061bc028ad
Reviewed-on: https://go-review.googlesource.com/c/go/+/213702
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Keith Randall <khr@golang.org>
When working on rulegen, I often find myself
searching the rules files to find the source of
generated code. Add a flag to make that easier.
The flag needs to be off by default,
so that adding a single rule doesn't cause a massive diff.
Change-Id: I5a6f09129dc6fceef7c9cd1ad7eee24f3880ba91
Reviewed-on: https://go-review.googlesource.com/c/go/+/213700
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Keith Randall <khr@golang.org>