The new conversions can be activated (or bisected) with
-gcflags=all=-d=converthash=PATTERN
where PATTERN is either a hash string or n, qn, y, qy for
no, quietly no, yes, quietly yes.
This CL makes the default pattern be "qn" instead of the
default-default which is an efficient encoding of "qy".
Updates #75834
Change-Id: I88a9fd7880bc999132420c8d0a22a8fdc1e95a2a
Reviewed-on: https://go-review.googlesource.com/c/go/+/711845
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Bypass: David Chase <drchase@google.com>
Leave those constant foldings for runtime, similar to how we do it
for NaN generation.
These are the only instances I could find in cmd/compile/..., using
objdump -d ../pkg/tool/darwin_arm64/compile| egrep "(fcvtz|>:)" | grep -B1 fcvt
(There are instances in other places, like runtime and reflect, but I don't
think those places would affect compiler output.)
Change-Id: I4113fe4570115e4765825cf442cb1fde97cf2f27
Reviewed-on: https://go-review.googlesource.com/c/go/+/711281
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@google.com>
This change creates calls to size-specialized malloc functions instead
of calls to newObject when we know the size of the allocation at
compilation time. Most of it is a matter of calling the newObject
function (which will create calls to the size-specialized functions)
rather then the newObjectNonSpecialized function (which won't). In the
newHeapaddr, small, non-pointer case, we'll create a non specialized
newObject and transform that into the appropriate size-specialized
function when we produce the mallocgc in flushPendingHeapAllocations.
We have to update some of the rewrites in generic.rules to also apply to
the size-specialized functions when they apply to newObject.
The messiest thing is we have to adjust the offset we use to save the
memory profiler stack, because the depth of the call to profilealloc is
two frames fewer in the size-specialized malloc functions compared to
when newObject calls mallocgc. A bunch of tests have been adjusted to
account for that.
Change-Id: I6a6a6964c9037fb6719e392c4a498ed700b617d7
Reviewed-on: https://go-review.googlesource.com/c/go/+/707856
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Matloob <matloob@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@golang.org>
Some tests make assignments to an argument without reading it.
With CL 708865, they are treated as dead stores and are removed.
Make sure the results are used.
Fixes#75745.
Fixes#75746.
Change-Id: I05580beb1006505ec1550e5fa245b54dcefd10b9
Reviewed-on: https://go-review.googlesource.com/c/go/+/708916
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Currently, we remove stores to local variables that are not read.
We don't do that for arguments. But arguments and locals are
essentially the same. Arguments are passed by value, and are not
expected to be read in the caller's frame. So we can remove the
writes to them as well. One exception is the cgo_unsafe_arg
directive, which makes all the arguments effectively address-taken.
cgo_unsafe_arg implies ABI0, so we just skip ABI0 functions'
arguments.
Cherry-picked from the dev.simd branch. This CL is not
necessarily SIMD specific. Apply early to reduce risk.
Change-Id: I8999fc50da6a87f22c1ec23e9a0c15483b6f7df8
Reviewed-on: https://go-review.googlesource.com/c/go/+/705815
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
Reviewed-on: https://go-review.googlesource.com/c/go/+/708865
While here, reorder Float32ConstantStore/Float64ConstantStore for
consistency.
Change-Id: Ic1b3e9f9474965d15bc94518d78d1a4a7bda93f3
Reviewed-on: https://go-review.googlesource.com/c/go/+/703756
Reviewed-by: Keith Randall <khr@golang.org>
Auto-Submit: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
Auto-Submit: Joel Sing <joel@sing.id.au>
Reviewed-by: Keith Randall <khr@google.com>
Change-Id: I4bee2770fedf97e35b5a5b9187a8ba3c41f9ec2e
Reviewed-on: https://go-review.googlesource.com/c/go/+/702697
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Joel Sing <joel@sing.id.au>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Keith Randall <khr@google.com>
This CL also adds riscv64 checks
Change-Id: I693e4e606f470615f6b49085592d6d5ca61473d3
Reviewed-on: https://go-review.googlesource.com/c/go/+/703716
Reviewed-by: Pengcheng Wang <wangpengcheng.pp@bytedance.com>
Auto-Submit: Keith Randall <khr@google.com>
Reviewed-by: Mark Freeman <markfreeman@google.com>
Reviewed-by: Joel Sing <joel@sing.id.au>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Previous CLs optimized direct use of abi.Type, but reflect.Type is
indirected, so was not benefiting.
For TypeFor, we can use toRType directly without a nil check because the
types are statically known.
Normally, I'd think SSA would remove the nil check, but due to some
oddity (specifically, late fuse being required to remove the nil check,
but opt doesn't run that late) means that the nil check persists and
gets in the way.
Manually writing the code in this instance seems to fix the problem.
It also exposed another problem; depending on the ordering, writeType
could get to a type symbol before SSA, thereby preventing Extra from
being created on the symbol for later lookups that don't go through
TypeLinksym directly. In writeType, for non-shape types, call
TypeLinksym to ensure that the type is set up for later callers. That
change itself passed toolstash -cmp.
All up, this stack put through compilecmp shows a lot of improvement in
various reflect-using packages, and reflect itself. It is too big to fit
in the commit message but here's some info:
compilecmp master -> HEAD
master (d767064170): cmd/compile: mark abi.PtrType.Elem sym as used
HEAD (846a94c568): cmd/compile, reflect: further allow inlining of TypeFor
file before after Δ %
addr2line 3735911 3735391 -520 -0.014%
asm 6382235 6382091 -144 -0.002%
buildid 3608568 3608360 -208 -0.006%
cgo 5951816 5951480 -336 -0.006%
compile 28362080 28339772 -22308 -0.079%
cover 6668686 6661414 -7272 -0.109%
dist 4311961 4311425 -536 -0.012%
fix 3771706 3771474 -232 -0.006%
link 8686073 8684993 -1080 -0.012%
nm 3715923 3715459 -464 -0.012%
objdump 6074366 6073774 -592 -0.010%
pack 3025653 3025277 -376 -0.012%
pprof 18269485 18261653 -7832 -0.043%
test2json 3442726 3438390 -4336 -0.126%
trace 16984831 16981767 -3064 -0.018%
vet 10701931 10696355 -5576 -0.052%
total 133693951 133639075 -54876 -0.041%
runtime
runtime.stkobjinit 240 -> 165 (-31.25%)
runtime [cmd/compile]
runtime.stkobjinit 240 -> 165 (-31.25%)
reflect
reflect.Value.Seq2.func3 309 -> 245 (-20.71%)
reflect.Value.Seq2.func1.1 281 -> 198 (-29.54%)
reflect.Value.Seq.func1.1 242 -> 165 (-31.82%)
reflect.Value.Seq2.func2 360 -> 285 (-20.83%)
reflect.Value.Seq.func4 281 -> 239 (-14.95%)
reflect.Value.Seq2.func4 399 -> 284 (-28.82%)
reflect.Value.Seq.func2 271 -> 230 (-15.13%)
reflect.TypeFor[go.shape.uint64] 33 -> 18 (-45.45%)
reflect.Value.Seq.func3 219 -> 178 (-18.72%)
reflect [cmd/compile]
reflect.Value.Seq2.func2 360 -> 285 (-20.83%)
reflect.Value.Seq.func4 281 -> 239 (-14.95%)
reflect.Value.Seq.func2 271 -> 230 (-15.13%)
reflect.Value.Seq.func1.1 242 -> 165 (-31.82%)
reflect.Value.Seq2.func1.1 281 -> 198 (-29.54%)
reflect.Value.Seq2.func3 309 -> 245 (-20.71%)
reflect.Value.Seq.func3 219 -> 178 (-18.72%)
reflect.TypeFor[go.shape.uint64] 33 -> 18 (-45.45%)
reflect.Value.Seq2.func4 399 -> 284 (-28.82%)
fmt
fmt.(*pp).fmtBytes 1723 -> 1691 (-1.86%)
database/sql/driver
reflect.TypeFor[go.shape.interface 33 -> 18 (-45.45%)
database/sql/driver.init 72 -> 57 (-20.83%)
Change-Id: I9eb750cf0b7ebf532589f939431feb0a899e42ff
Reviewed-on: https://go-review.googlesource.com/c/go/+/701301
Reviewed-by: Mark Freeman <markfreeman@google.com>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
This reverts commit 4c63d798cb.
Reason for revert: Causes miscompilations. See issue 75365.
Change-Id: Icd1fcfeb23d2ec524b16eb556030f43875e1c90d
Reviewed-on: https://go-review.googlesource.com/c/go/+/702455
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Mark Freeman <markfreeman@google.com>
On loong64, the addi.d instruction can only directly handle 12-bit
immediate numbers. If a larger immediate number needs to be processed,
it must first be placed in a register, and then the add.d instruction
is used to complete the processing of the larger immediate number.
If a larger immediate number c satisfies is32Bit(c) && c&0xffff == 0,
then the ADDV16 instruction can be used to complete the addition operation.
Removes 164 instructions from the go binary on loong64.
Change-Id: I404de93cc4eaaa12fe424f5a0d61b03231215d1a
Reviewed-on: https://go-review.googlesource.com/c/go/+/700536
Reviewed-by: Meidan Li <limeidan@loongson.cn>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
This CL makes the generated code for reflect.TypeFor as simple as an
intrinsic function.
Fixes#75203
Change-Id: I7bb48787101f07e77ab5c583292e834c28a028d6
Reviewed-on: https://go-review.googlesource.com/c/go/+/700336
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Auto-Submit: Keith Randall <khr@golang.org>
Refer to CL 633075, loong64 has a zero(R0) register that can be used to do this.
Change-Id: I846c6bdfcfd6dbfa18338afc13e34e350580ead4
Reviewed-on: https://go-review.googlesource.com/c/go/+/693876
Reviewed-by: Carlos Amedee <carlos@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Auto-Submit: Keith Randall <khr@golang.org>
Pattern1: (the type of c is uint16)
c>>8 | c<<8
To:
revb2h c
Pattern2: (the type of c is uint32)
(c & 0xff00ff00)>>8 | (c & 0x00ff00ff)<<8
To:
revb2h c
Pattern3: (the type of c is uint64)
(c & 0xff00ff00ff00ff00)>>8 | (c & 0x00ff00ff00ff00ff)<<8
To:
revb4h c
Change-Id: Ic6231a3f476cbacbea4bd00e31193d107cb86cda
Reviewed-on: https://go-review.googlesource.com/c/go/+/696335
Reviewed-by: Meidan Li <limeidan@loongson.cn>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Carlos Amedee <carlos@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Change-Id: I782f93510bba92ba60b298c1c1cde456c8bcec38
Reviewed-on: https://go-review.googlesource.com/c/go/+/697956
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@golang.org>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Carlos Amedee <carlos@golang.org>
We now (as of CL 678620) use float registers other than X0 for copying.
Change-Id: Ifdecd5df7519663742eed0f292c98453754d4b25
Reviewed-on: https://go-review.googlesource.com/c/go/+/695275
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Jorropo <jorropo.pgm@gmail.com>
Although InlMark takes a memory argument it ultimately becomes a
NOP and therefore is safe to speculatively execute.
Fixes#74915
Change-Id: I64317dd433e300ac28de2bcf201845083ec2ac82
Reviewed-on: https://go-review.googlesource.com/c/go/+/693795
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Add the VECTOR FP (MINIMUM|MAXIMUM) instructions to the assembler and
use them in the compiler to implement min and max.
Note: I've allowed floating point registers to be used with the single
element instructions (those with the W instead of V prefix) to allow
easier integration into the compiler.
Change-Id: I5f80a510bd248cf483cce95f1979bf63fbae7de6
Reviewed-on: https://go-review.googlesource.com/c/go/+/684715
Reviewed-by: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Mark Freeman <mark@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
After CL 628075, do not rely on the memory arg of an OpLocalAddr.
Fixes#74788
Change-Id: I4e893241e3949bb8f2d93c8b88cc102e155b725d
Reviewed-on: https://go-review.googlesource.com/c/go/+/691275
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Jorropo <jorropo.pgm@gmail.com>
Reviewed-by: Mark Freeman <mark@golang.org>
Same as CL 689815, but for modulus instead of division.
Updates #74485
Change-Id: I73000231c886a987a1093669ff207fd9117a8160
Reviewed-on: https://go-review.googlesource.com/c/go/+/689895
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
These recently added tests failed when using the -all_codgen flag.
Fixes#74770
Change-Id: Idea1ea02af2bd9f45c7d0a28d633c7442328e6df
Reviewed-on: https://go-review.googlesource.com/c/go/+/690715
Reviewed-by: Jorropo <jorropo.pgm@gmail.com>
Run-TryBot: Michael Munday <mikemndy@gmail.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Mark Freeman <mark@golang.org>
Auto-Submit: Jorropo <jorropo.pgm@gmail.com>
TryBot-Bypass: Michael Knyszek <mknyszek@google.com>
For performance see CL 685676.
This allows something like:
if y { x *= 2 }
To be compiled to:
SHLXQ BX, AX, AX
Instead of:
MOVQ AX, CX
SHLQ $1, CX
MOVBLZX BL, DX
TESTQ DX, DX
CMOVQNE CX, AX
While ./make.bash uniqued per LOC, there is 2 doublings and 4 halvings.
Change-Id: Ic0727cbf429528a2dbf17cbfc3b0121db8387444
Reviewed-on: https://go-review.googlesource.com/c/go/+/685695
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
This allows something like:
if y { x++ }
To be compiled to:
MOVBLZX BX, CX
ADDQ CX, AX
Instead of:
LEAQ 1(AX), CX
MOVBLZX BL, DX
TESTQ DX, DX
CMOVQNE CX, AX
While ./make.bash uniqued per LOC, there is 100 additions and 75 substractions.
See benchmark here: https://go.dev/play/p/DJf5COjwhd_s
Either it's a performance no-op or it is faster:
goos: linux
goarch: amd64
cpu: AMD Ryzen 5 3600 6-Core Processor
│ /tmp/old.logs │ /tmp/new.logs │
│ sec/op │ sec/op vs base │
CmovInlineConditionAddLatency-12 0.5443n ± 5% 0.5339n ± 3% -1.90% (p=0.004 n=10)
CmovInlineConditionAddThroughputBy6-12 1.492n ± 1% 1.494n ± 1% ~ (p=0.955 n=10)
CmovInlineConditionSubLatency-12 0.5419n ± 3% 0.5282n ± 3% -2.52% (p=0.019 n=10)
CmovInlineConditionSubThroughputBy6-12 1.587n ± 1% 1.584n ± 2% ~ (p=0.492 n=10)
CmovOutlineConditionAddLatency-12 0.5223n ± 1% 0.2639n ± 4% -49.47% (p=0.000 n=10)
CmovOutlineConditionAddThroughputBy6-12 1.159n ± 1% 1.097n ± 2% -5.35% (p=0.000 n=10)
CmovOutlineConditionSubLatency-12 0.5271n ± 3% 0.2654n ± 2% -49.66% (p=0.000 n=10)
CmovOutlineConditionSubThroughputBy6-12 1.053n ± 1% 1.050n ± 1% ~ (p=1.000 n=10)
geomean
There are other benefits not tested by this benchmark:
- the math form is usually a couple bytes shorter (ICACHE)
- the math form is usually 0~2 uops shorter (UCACHE)
- the math form has usually less register pressure*
- the math form can sometimes be optimized further
*regalloc rarely find how it can use less registers
As far as pass ordering goes there are many possible options,
I've decided to reorder branchelim before late opt since:
- unlike running exclusively the CondSelect rules after branchelim,
some extra optimizations might trigger on the adds or subs.
- I don't want to maintain a second generic.rules file of only the stuff,
that can trigger after branchelim.
- rerunning all of opt a third time increase compilation time for little gains.
By elimination moving branchelim seems fine.
Change-Id: I869adf57e4d109948ee157cfc47144445146bafd
Reviewed-on: https://go-review.googlesource.com/c/go/+/685676
Reviewed-by: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Fold a shift through AND when the AND gets a zero-or-one operand (e.g.
from arithmetic shift by 63 of a 64-bit value) for a common case with
slice operations:
ASR $63, R2, R2
AND R3<<3, R2, R2
ADD R2, R0, R2
As the operands are 64-bit, we can transform it to:
AND R2->63, R3, R2
ADD R2<<3, R0, R2
Code size improvement:
compile: .text: 9088004 -> 9086292 (-0.02%)
etcd: .text: 10500276 -> 10498964 (-0.01%)
Change-Id: Ibcd5e67173da39b77ceff77ca67812fb8be5a7b5
Reviewed-on: https://go-review.googlesource.com/c/go/+/679895
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Mark Freeman <mark@golang.org>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Optimize ARM64 code generation for slice bounds checking by recognizing
patterns where comparisons to zero involve SUB or SUBconst operations.
This change adds SSA opt rules to simplify:
(CMPconst [0] (SUB x y)) => (CMP x y)
The optimizations apply to EQ, NE, ULE, and UGT comparisons, enabling
more efficient bounds checking for slice operations.
Code size improvement:
compile: .text: 9088004 -> 9065988 (-0.24%)
etcd: .text: 10500276 -> 10497092 (-0.03%)
Change-Id: I467cb27674351652bcacc52b87e1f19677bd46a8
Reviewed-on: https://go-review.googlesource.com/c/go/+/679915
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Auto-Submit: Keith Randall <khr@golang.org>
CL 622236 forgot to check the mask was also a 32 bit rotate mask. Add
a modified version of isPPC64WordRotateMask which valids the mask is
contiguous and fits inside a uint32.
I don't this is possible when merging SRDconst, the first check should
always reject such combines. But, be extra careful and do it there
too.
Fixes#73153
Change-Id: Ie95f74ec5e7d89dc761511126db814f886a7a435
Reviewed-on: https://go-review.googlesource.com/c/go/+/679775
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Jayanth Krishnamurthy <jayanth.krishnamurthy@ibm.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@golang.org>
unique.Make always copies strings passed into it, so it's safe to not
copy byte slices converted to strings either. Handle this just like map
accesses with string(b) as keys.
This CL only handles unique.Make(string(b)), not nested cases like
unique.Make([2]string{string(b1), string(b2)}); this could be done in a
followup CL but the map lookup code in walk is sufficiently different
than the call handling code that I didn't attempt it. (SSA is much
easier).
Fixes#71926
Change-Id: Ic2f82f2f91963d563b4ddb1282bd49fc40da8b85
Reviewed-on: https://go-review.googlesource.com/c/go/+/672135
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Today, this interface conversion causes the struct literal
to be heap allocated:
var sink any
func example1() {
sink = S{1, 1}
}
For basic literals like integers that are directly used in
an interface conversion that would otherwise allocate, the compiler
is able to use read-only global storage (see #18704).
This CL extends that to struct and array literals as well by creating
read-only global storage that is able to represent for example S{1, 1},
and then using a pointer to that storage in the interface
when the interface conversion happens.
A more challenging example is:
func example2() {
v := S{1, 1}
sink = v
}
In this case, the struct literal is not directly part of the
interface conversion, but is instead assigned to a local variable.
To still avoid heap allocation in cases like this, in walk we
construct a cache that maps from expressions used in interface
conversions to earlier expressions that can be used to represent the
same value (via ir.ReassignOracle.StaticValue). This is somewhat
analogous to how we avoided heap allocation for basic literals in
CL 649077 earlier in our stack, though here we also need to do a
little more work to create the read-only global.
CL 649076 (also earlier in our stack) added most of the tests
along with debug diagnostics in convert.go to make it easier
to test this change.
See the writeup in #71359 for details.
Fixes#71359Fixes#71323
Updates #62653
Updates #53465
Updates #8618
Change-Id: I8924f0c69ff738ea33439bd6af7b4066af493b90
Reviewed-on: https://go-review.googlesource.com/c/go/+/649555
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@google.com>
In the loong64 instruction set, there is no NORI instruction,
so the immediate value in NORconst need to be stored in register
and then use the three-register NOR instruction.
Change-Id: I5ef697450619317218cb3ef47fc07e238bdc2139
Reviewed-on: https://go-review.googlesource.com/c/go/+/673836
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
This CL implements the TODO in combineStores to allow combining
stores of different sizes, as long as the total size aligns to
2, 4, 8.
Fixes#72832.
Change-Id: I6d1d471335da90d851ad8f3b5a0cf10bdcfa17c4
Reviewed-on: https://go-review.googlesource.com/c/go/+/661855
Reviewed-by: Keith Randall <khr@golang.org>
Auto-Submit: Junyang Shao <shaojunyang@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>