mirror of
https://github.com/golang/go.git
synced 2025-12-08 06:10:04 +00:00
9 commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
592c2db868 |
cmd/compile: improve loopRotate to handle nested loops
Enhance loop rotation of nested loops. Currently, loops are processed
independently, resulting in unnecessary jumps between outer and inner
loops. By processing inner loops before their parent loop, we ensure
nested loop blocks are properly placed within their parent loop's block
sequence.
There is some code size improvement (as measured on amd64) because jumps
to and from inner loops are removed by the updated loopRotate block order:
Executable Old .text New .text Change
-------------------------------------------------------
asm 2147569 2146481 -0.05%
cgo 1977457 1975761 -0.09%
compile 10447345 10441905 -0.05%
cover 2110097 2108977 -0.05%
link 2930289 2929041 -0.04%
preprofile 927345 926769 -0.06%
vet 3279057 3277009 -0.06%
Change-Id: I4b9e993c2be07fad735e6bcf32d062d099d9cfb5
Reviewed-on: https://go-review.googlesource.com/c/go/+/684335
Reviewed-by: Keith Randall <khr@golang.org>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> |
||
|
|
18d0e6a14f |
cmd/compile, cmd/internal: fine-grained fiddling with loop alignment
This appears to be useful only on amd64, and was specifically
benchmarked on Apple Silicon and did not produce any benefit there.
This CL adds the assembly instruction `PCALIGNMAX align,amount`
which aligns to `align` if that can be achieved with `amount`
or fewer bytes of padding. (0 means never, but will align the
enclosing function.)
Specifically, alignment padding is emitted if low-order-address-bits + amount
is greater than or equal to align; thus, `PCALIGNMAX 64,63` is
the same as `PCALIGN 64` and `PCALIGNMAX 64,0` will never
emit any alignment, but will still cause the function itself
to be aligned to (at least) 64 bytes.
Change-Id: Id51a056f1672f8095e8f755e01f72836c9686aa3
Reviewed-on: https://go-review.googlesource.com/c/go/+/577935
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org> |
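The padding rule in the message above can be modeled with a few lines of Go. This is only a sketch of the arithmetic as the commit message describes it, not the assembler's implementation; `pcAlignMaxPadding` and its parameter names are hypothetical, and `align` is assumed to be a power of two.

```go
package main

import "fmt"

// pcAlignMaxPadding is a hypothetical helper modeling the described rule:
// pad pc up to the next multiple of align (assumed to be a power of two),
// but only when that takes at most maxPad bytes. It returns the number of
// padding bytes that would be emitted.
func pcAlignMaxPadding(pc, align, maxPad uint64) uint64 {
	low := pc & (align - 1) // low-order address bits
	if low == 0 {
		return 0 // already aligned, nothing to emit
	}
	need := align - low
	if need > maxPad { // equivalent to low+maxPad < align
		return 0 // too expensive, skip the alignment
	}
	return need
}

func main() {
	for _, pc := range []uint64{0x1000, 0x1001, 0x1030, 0x103f} {
		fmt.Printf("pc=%#x  (64,63): %d  (64,16): %d  (64,0): %d\n",
			pc,
			pcAlignMaxPadding(pc, 64, 63),
			pcAlignMaxPadding(pc, 64, 16),
			pcAlignMaxPadding(pc, 64, 0))
	}
}
```

With `maxPad` set to 63 the helper always pads to a 64-byte boundary, and with 0 it never pads, which matches the two special cases the message calls out. The function-level alignment side effect is not modeled here.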
||
|
|
68bd383368 |
cmd/compile: add cache of sizeable objects so they can be reused
We kind of have this mechanism already, just normalizing it and
using it in a bunch of places. Previously a bunch of places cached
slices only for the duration of a single function compilation. Now
we can reuse slices across a whole compiler run.
Use a sync.Pool of powers-of-two sizes. This lets us use not
too much memory, and avoid holding onto memory we're no longer
using when a GC happens.
There are a few different types we need, so generate the code for it.
Generics would be useful here, but we can't use generics in the
compiler because of bootstrapping.
Change-Id: I6cf37e7b7b2e802882aaa723a0b29770511ccd82
Reviewed-on: https://go-review.googlesource.com/c/go/+/444820
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Heschi Kreinick <heschi@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com> |
||
|
|
19309779ac |
all: gofmt main repo
[This CL is part of a sequence implementing the proposal #51082.
The design doc is at https://go.dev/s/godocfmt-design.]
Run the updated gofmt, which reformats doc comments,
on the main repository. Vendored files are excluded.
For #51082.
Change-Id: I7332f099b60f716295fb34719c98c04eb1a85407
Reviewed-on: https://go-review.googlesource.com/c/go/+/384268
Reviewed-by: Jonathan Amsterdam <jba@google.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org> |
||
|
|
600259b099 |
cmd/compile: use depth first topological sort algorithm for layout
The current layout algorithm tries to put consecutive blocks together,
so the priority of the successor block is higher than the priority of
the zero indegree block. This algorithm is beneficial for subsequent
register allocation, but will result in more branch instructions.
The depth-first topological sorting algorithm is a well-known layout
algorithm, which has applications in many languages, and it helps to
reduce branch instructions. This CL applies it to the layout pass.
The test results show that it helps to reduce the code size.
This CL also includes the following changes:
1. Removed the primary predecessor mechanism. The new layout algorithm is
not very friendly to the register allocator in some cases, so a new primary
predecessor selection strategy is introduced to adapt to it.
2. Since the new layout implementation may place non-loop blocks between
loop blocks, some adaptive modifications have also been made to the
looprotate pass.
3. The layout also affects the results of codegen, so this CL also adjusted
several codegen tests accordingly.
It is inevitable that this CL will cause the code size or performance of a
few functions to regress, but it improves far more cases than it regresses.
Statistical data from compilecmp on linux/amd64 is as follows:
name old time/op new time/op delta
Template 382ms ± 4% 382ms ± 4% ~ (p=0.497 n=49+50)
Unicode 170ms ± 9% 169ms ± 8% ~ (p=0.344 n=48+50)
GoTypes 2.01s ± 4% 2.01s ± 4% ~ (p=0.628 n=50+48)
Compiler 190ms ±10% 189ms ± 9% ~ (p=0.734 n=50+50)
SSA 11.8s ± 2% 11.8s ± 3% ~ (p=0.877 n=50+50)
Flate 241ms ± 9% 241ms ± 8% ~ (p=0.897 n=50+49)
GoParser 366ms ± 3% 361ms ± 4% -1.21% (p=0.004 n=47+50)
Reflect 835ms ± 3% 838ms ± 3% ~ (p=0.275 n=50+49)
Tar 336ms ± 4% 335ms ± 3% ~ (p=0.454 n=48+48)
XML 433ms ± 4% 431ms ± 3% ~ (p=0.071 n=49+48)
LinkCompiler 706ms ± 4% 705ms ± 4% ~ (p=0.608 n=50+49)
ExternalLinkCompiler 1.85s ± 3% 1.83s ± 2% -1.47% (p=0.000 n=49+48)
LinkWithoutDebugCompiler 437ms ± 5% 437ms ± 6% ~ (p=0.953 n=49+50)
[Geo mean] 615ms 613ms -0.37%
name old alloc/op new alloc/op delta
Template 38.7MB ± 1% 38.7MB ± 1% ~ (p=0.834 n=50+50)
Unicode 28.1MB ± 0% 28.1MB ± 0% -0.22% (p=0.000 n=49+50)
GoTypes 168MB ± 1% 168MB ± 1% ~ (p=0.054 n=47+47)
Compiler 23.0MB ± 1% 23.0MB ± 1% ~ (p=0.432 n=50+50)
SSA 1.54GB ± 0% 1.54GB ± 0% +0.21% (p=0.000 n=50+50)
Flate 23.6MB ± 1% 23.6MB ± 1% ~ (p=0.153 n=43+46)
GoParser 35.1MB ± 1% 35.1MB ± 2% ~ (p=0.202 n=50+50)
Reflect 84.7MB ± 1% 84.7MB ± 1% ~ (p=0.333 n=48+49)
Tar 34.5MB ± 1% 34.5MB ± 1% ~ (p=0.406 n=46+49)
XML 44.3MB ± 2% 44.2MB ± 3% ~ (p=0.981 n=50+50)
LinkCompiler 131MB ± 0% 128MB ± 0% -2.74% (p=0.000 n=50+50)
ExternalLinkCompiler 120MB ± 0% 120MB ± 0% +0.01% (p=0.007 n=50+50)
LinkWithoutDebugCompiler 77.3MB ± 0% 77.3MB ± 0% -0.02% (p=0.000 n=50+50)
[Geo mean] 69.3MB 69.1MB -0.22%
file before after Δ %
addr2line 4104220 4043684 -60536 -1.475%
api 5342502 5249678 -92824 -1.737%
asm 4973785 4858257 -115528 -2.323%
buildid 2667844 2625660 -42184 -1.581%
cgo 4686849 4616313 -70536 -1.505%
compile 23667431 23268406 -399025 -1.686%
cover 4959676 4874108 -85568 -1.725%
dist 3515934 3450422 -65512 -1.863%
doc 3995581 3925469 -70112 -1.755%
fix 3379202 3318522 -60680 -1.796%
link 6743249 6629913 -113336 -1.681%
nm 4047529 3991777 -55752 -1.377%
objdump 4456151 4388151 -68000 -1.526%
pack 2435040 2398072 -36968 -1.518%
pprof 13804080 13565808 -238272 -1.726%
test2json 2690043 2645987 -44056 -1.638%
trace 10418492 10232716 -185776 -1.783%
vet 7258259 7121259 -137000 -1.888%
total 113145867 111204202 -1941665 -1.716%
The situation on linux/arm64 is as follows:
name old time/op new time/op delta
Template 280ms ± 1% 282ms ± 1% +0.75% (p=0.000 n=46+48)
Unicode 124ms ± 2% 124ms ± 2% +0.37% (p=0.045 n=50+50)
GoTypes 1.69s ± 1% 1.70s ± 1% +0.56% (p=0.000 n=49+50)
Compiler 122ms ± 1% 123ms ± 1% +0.93% (p=0.000 n=50+50)
SSA 12.6s ± 1% 12.7s ± 0% +0.72% (p=0.000 n=50+50)
Flate 170ms ± 1% 172ms ± 1% +0.97% (p=0.000 n=49+49)
GoParser 262ms ± 1% 263ms ± 1% +0.39% (p=0.000 n=49+48)
Reflect 639ms ± 1% 650ms ± 1% +1.63% (p=0.000 n=49+49)
Tar 243ms ± 1% 245ms ± 1% +0.82% (p=0.000 n=50+50)
XML 324ms ± 1% 327ms ± 1% +0.72% (p=0.000 n=50+49)
LinkCompiler 597ms ± 1% 596ms ± 1% -0.27% (p=0.001 n=48+47)
ExternalLinkCompiler 1.90s ± 1% 1.88s ± 1% -1.00% (p=0.000 n=50+50)
LinkWithoutDebugCompiler 364ms ± 1% 363ms ± 1% ~ (p=0.220 n=49+50)
[Geo mean] 485ms 488ms +0.49%
name old alloc/op new alloc/op delta
Template 38.7MB ± 0% 38.8MB ± 1% ~ (p=0.093 n=43+49)
Unicode 28.4MB ± 0% 28.4MB ± 0% +0.03% (p=0.000 n=49+45)
GoTypes 169MB ± 1% 169MB ± 1% +0.23% (p=0.010 n=50+50)
Compiler 23.2MB ± 1% 23.2MB ± 1% +0.11% (p=0.000 n=40+44)
SSA 1.54GB ± 0% 1.55GB ± 0% +0.45% (p=0.000 n=47+49)
Flate 23.8MB ± 2% 23.8MB ± 1% ~ (p=0.543 n=50+50)
GoParser 35.3MB ± 1% 35.4MB ± 1% ~ (p=0.792 n=50+50)
Reflect 85.2MB ± 1% 85.2MB ± 0% ~ (p=0.055 n=50+47)
Tar 34.5MB ± 1% 34.5MB ± 1% +0.06% (p=0.015 n=50+50)
XML 43.8MB ± 2% 43.9MB ± 2% +0.19% (p=0.000 n=48+48)
LinkCompiler 137MB ± 0% 136MB ± 0% -0.92% (p=0.000 n=50+50)
ExternalLinkCompiler 127MB ± 0% 127MB ± 0% ~ (p=0.516 n=50+50)
LinkWithoutDebugCompiler 84.0MB ± 0% 84.0MB ± 0% ~ (p=0.057 n=50+50)
[Geo mean] 70.4MB 70.4MB +0.01%
file before after Δ %
addr2line 4021557 4002933 -18624 -0.463%
api 5127847 5028503 -99344 -1.937%
asm 5034716 4936836 -97880 -1.944%
buildid 2608118 2594094 -14024 -0.538%
cgo 4488592 4398320 -90272 -2.011%
compile 22501129 22213592 -287537 -1.278%
cover 4742301 4713573 -28728 -0.606%
dist 3388071 3365311 -22760 -0.672%
doc 3802250 3776082 -26168 -0.688%
fix 3306147 3216939 -89208 -2.698%
link 6404483 6363699 -40784 -0.637%
nm 3941026 3921930 -19096 -0.485%
objdump 4383330 4295122 -88208 -2.012%
pack 2404547 2389515 -15032 -0.625%
pprof 12996234 12856818 -139416 -1.073%
test2json 2668500 2586788 -81712 -3.062%
trace 9816276 9609580 -206696 -2.106%
vet 6900682 6787338 -113344 -1.643%
total 108535806 107056973 -1478833 -1.363%
Change-Id: Iaec1cdcaacca8025e9babb0fb8a532fddb70c87d
Reviewed-on: https://go-review.googlesource.com/c/go/+/255239
Reviewed-by: eric fang <eric.fang@arm.com>
Reviewed-by: Keith Randall <khr@golang.org>
Trust: eric fang <eric.fang@arm.com> |
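For readers unfamiliar with the term used in the message above, one common way to obtain a depth-first topological order of a control-flow graph is reverse postorder. The sketch below shows only that general technique on a toy graph; it is not the compiler's layout pass, and `layoutOrder` and the integer block IDs are hypothetical.

```go
package main

import "fmt"

// layoutOrder returns block IDs in reverse postorder of a depth-first walk
// starting at entry. If back edges are ignored, every block appears before
// its successors, so the result is a topological order of the graph.
func layoutOrder(succs map[int][]int, entry int) []int {
	seen := map[int]bool{}
	var post []int
	var visit func(b int)
	visit = func(b int) {
		seen[b] = true
		for _, s := range succs[b] {
			if !seen[s] {
				visit(s)
			}
		}
		post = append(post, b) // emitted after all reachable successors
	}
	visit(entry)
	// Reverse the postorder in place.
	for i, j := 0, len(post)-1; i < j; i, j = i+1, j-1 {
		post[i], post[j] = post[j], post[i]
	}
	return post
}

func main() {
	// A diamond: 0 branches to 1 and 2, both join at 3, which falls into 4.
	cfg := map[int][]int{0: {1, 2}, 1: {3}, 2: {3}, 3: {4}}
	fmt.Println(layoutOrder(cfg, 0))
}
```

In this toy diamond, block 3 is placed directly after one of its predecessors and block 4 directly after block 3, so those edges can be fall-throughs rather than jumps.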
||
|
|
9e21e9c5cb |
cmd/compile: make loop finder more aware of irreducible loops
The loop finder doesn't return good information if it
encounters an irreducible loop. Make a start on improving
this, and set a function-level flag to indicate when there
is such a loop (and the returned information might be flaky).
Use that flag to prevent the loop rotater from getting
confused; the existing code seems to depend on artifacts of
the previous loop-finding algorithm. (There is one irreducible
loop in the Go library, in "inflate.go".)
Change-Id: If6e26feab38d9b009d2252d556e1470c803bde40
Reviewed-on: https://go-review.googlesource.com/42150
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com> |
||
|
|
ea5e3bd2a1 |
all: fix easy-to-miss typos
Using the wonderful https://github.com/client9/misspell tool.
Change-Id: Icdbc75a5559854f4a7a61b5271bcc7e3f99a1a24
Reviewed-on: https://go-review.googlesource.com/57851
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org> |
||
|
|
638ebb04f2 |
cmd/compile: don't break up contiguous blocks in looprotate
looprotate finds loop headers and arranges for them to be placed
after the body of the loop. This eliminates a jump from the body.
However, if the loop header is a series of contiguously laid out blocks,
the rotation introduces a new jump in that series.
This CL expands the "loop header" to move to be the entire
run of contiguously laid out blocks in the same loop.
This shrinks object files a little, and actually speeds up
the compiler noticeably. Numbers below.
Fannkuch performance seems to vary a lot by machine. On my laptop:
name old time/op new time/op delta
Fannkuch11-8 2.89s ± 2% 2.85s ± 3% -1.22% (p=0.000 n=50+50)
This has a significant effect on the append benchmarks in #14758:
name old time/op new time/op delta
Foo-8 312ns ± 3% 276ns ± 2% -11.37% (p=0.000 n=30+29)
Bar-8 565ns ± 2% 456ns ± 2% -19.27% (p=0.000 n=27+28)
Updates #18977
Fixes #20355
name old time/op new time/op delta
Template 205ms ± 5% 204ms ± 8% ~ (p=0.903 n=92+99)
Unicode 85.3ms ± 4% 85.1ms ± 3% ~ (p=0.191 n=92+94)
GoTypes 512ms ± 4% 507ms ± 4% -0.93% (p=0.000 n=95+97)
Compiler 2.38s ± 3% 2.35s ± 3% -1.27% (p=0.000 n=98+95)
SSA 4.67s ± 3% 4.64s ± 3% -0.62% (p=0.000 n=95+96)
Flate 117ms ± 3% 117ms ± 3% ~ (p=0.099 n=84+86)
GoParser 139ms ± 4% 137ms ± 4% -0.90% (p=0.000 n=97+98)
Reflect 329ms ± 5% 326ms ± 6% -0.97% (p=0.002 n=99+98)
Tar 102ms ± 6% 101ms ± 5% -0.97% (p=0.006 n=97+97)
XML 198ms ±10% 196ms ±13% ~ (p=0.087 n=100+100)
[Geo mean] 318ms 316ms -0.72%
name old user-time/op new user-time/op delta
Template 250ms ± 7% 250ms ± 7% ~ (p=0.850 n=94+92)
Unicode 107ms ± 8% 106ms ± 5% -0.76% (p=0.005 n=98+91)
GoTypes 665ms ± 5% 659ms ± 5% -0.85% (p=0.003 n=93+98)
Compiler 3.15s ± 3% 3.10s ± 3% -1.60% (p=0.000 n=99+98)
SSA 6.82s ± 3% 6.72s ± 4% -1.55% (p=0.000 n=94+98)
Flate 138ms ± 8% 138ms ± 6% ~ (p=0.369 n=94+92)
GoParser 170ms ± 5% 168ms ± 6% -1.13% (p=0.002 n=96+98)
Reflect 412ms ± 8% 416ms ± 8% ~ (p=0.169 n=100+100)
Tar 123ms ±18% 123ms ±14% ~ (p=0.896 n=100+100)
XML 236ms ± 9% 234ms ±11% ~ (p=0.124 n=100+100)
[Geo mean] 401ms 398ms -0.63%
name old alloc/op new alloc/op delta
Template 38.8MB ± 0% 38.8MB ± 0% ~ (p=0.222 n=5+5)
Unicode 28.7MB ± 0% 28.7MB ± 0% ~ (p=0.421 n=5+5)
GoTypes 109MB ± 0% 109MB ± 0% ~ (p=0.056 n=5+5)
Compiler 457MB ± 0% 457MB ± 0% +0.07% (p=0.008 n=5+5)
SSA 1.10GB ± 0% 1.10GB ± 0% +0.05% (p=0.008 n=5+5)
Flate 24.5MB ± 0% 24.5MB ± 0% ~ (p=0.222 n=5+5)
GoParser 30.9MB ± 0% 31.0MB ± 0% +0.21% (p=0.016 n=5+5)
Reflect 73.4MB ± 0% 73.4MB ± 0% ~ (p=0.421 n=5+5)
Tar 25.5MB ± 0% 25.5MB ± 0% ~ (p=0.548 n=5+5)
XML 40.9MB ± 0% 40.9MB ± 0% ~ (p=0.151 n=5+5)
[Geo mean] 71.6MB 71.6MB +0.07%
name old allocs/op new allocs/op delta
Template 394k ± 0% 394k ± 0% ~ (p=1.000 n=5+5)
Unicode 344k ± 0% 343k ± 0% ~ (p=0.310 n=5+5)
GoTypes 1.16M ± 0% 1.16M ± 0% ~ (p=1.000 n=5+5)
Compiler 4.42M ± 0% 4.42M ± 0% ~ (p=1.000 n=5+5)
SSA 9.80M ± 0% 9.80M ± 0% ~ (p=0.095 n=5+5)
Flate 237k ± 1% 238k ± 1% ~ (p=0.310 n=5+5)
GoParser 320k ± 0% 322k ± 1% +0.50% (p=0.032 n=5+5)
Reflect 958k ± 0% 957k ± 0% ~ (p=0.548 n=5+5)
Tar 252k ± 1% 252k ± 0% ~ (p=1.000 n=5+5)
XML 400k ± 0% 400k ± 0% ~ (p=0.841 n=5+5)
[Geo mean] 741k 742k +0.06%
name old object-bytes new object-bytes delta
Template 386k ± 0% 386k ± 0% -0.05% (p=0.008 n=5+5)
Unicode 202k ± 0% 202k ± 0% -0.01% (p=0.008 n=5+5)
GoTypes 1.16M ± 0% 1.16M ± 0% -0.06% (p=0.008 n=5+5)
Compiler 3.91M ± 0% 3.91M ± 0% -0.06% (p=0.008 n=5+5)
SSA 7.91M ± 0% 7.92M ± 0% +0.01% (p=0.008 n=5+5)
Flate 228k ± 0% 227k ± 0% -0.04% (p=0.008 n=5+5)
GoParser 283k ± 0% 283k ± 0% -0.06% (p=0.008 n=5+5)
Reflect 952k ± 0% 951k ± 0% -0.02% (p=0.008 n=5+5)
Tar 187k ± 0% 187k ± 0% -0.04% (p=0.008 n=5+5)
XML 406k ± 0% 406k ± 0% -0.05% (p=0.008 n=5+5)
[Geo mean] 648k 648k -0.04%
Change-Id: I8630c4291a0eb2f7e7927bc04d7cc0efef181094
Reviewed-on: https://go-review.googlesource.com/43491
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org> |
||
|
|
39ce5907ca |
cmd/compile: rotate loops so conditional branch is at the end
Old loops look like this:
   loop:
     CMPQ ...
     JGE exit
     ...
     JMP loop
   exit:
New loops look like this:
     JMP entry
   loop:
     ...
   entry:
     CMPQ ...
     JLT loop
This removes one instruction (the unconditional jump) from
the inner loop.
Kinda surprisingly, it matters.
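The effect on executed branch instructions can be illustrated with a small Go program that mirrors the two shapes using `goto` and counts branches by hand. This is a toy model, not compiler output; `oldShape`, `rotatedShape`, and the `branches` counter are hypothetical names, and the loop body (summing integers) is made up for the example.

```go
package main

import "fmt"

// oldShape mirrors the "old" layout: test at the top, unconditional
// jump at the bottom. It counts every branch instruction executed.
func oldShape(n int) (sum, branches int) {
	i := 0
loop:
	branches++ // JGE exit
	if i >= n {
		goto exit
	}
	sum += i
	i++
	branches++ // JMP loop
	goto loop
exit:
	return sum, branches
}

// rotatedShape mirrors the "new" layout: one jump to the test on entry,
// then a single conditional branch per iteration at the bottom.
func rotatedShape(n int) (sum, branches int) {
	i := 0
	branches++ // JMP entry
	goto entry
loop:
	sum += i
	i++
entry:
	branches++ // JLT loop
	if i < n {
		goto loop
	}
	return sum, branches
}

func main() {
	for _, n := range []int{0, 1, 10, 1000} {
		s1, b1 := oldShape(n)
		s2, b2 := rotatedShape(n)
		fmt.Printf("n=%d sums=%d/%d branches old=%d rotated=%d\n", n, s1, s2, b1, b2)
	}
}
```

For n iterations the old shape executes 2n+1 branches and the rotated shape n+2, so the rotated form pays one extra jump up front and saves one branch on every iteration.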
This is a bit different than the peeling that the old obj
library did in that we don't duplicate the loop exit test.
We just jump to the test. I'm not sure if it is better or
worse to do that (peeling gets rid of the JMP but means more
code duplication), but this CL is certainly a much simpler
compiler change, so I'll try this way first.
The obj library used to do peeling before
CL https://go-review.googlesource.com/c/36205 turned it off.
Fixes #15837 (remove obj instruction reordering)
The reordering is already removed; this CL implements the only
part of that reordering that we'd like to keep.
Fixes #14758 (append loop)
name old time/op new time/op delta
Foo-12 817ns ± 4% 538ns ± 0% -34.08% (p=0.000 n=10+9)
Bar-12 850ns ±11% 570ns ±13% -32.88% (p=0.000 n=10+10)
Update #19595 (BLAS slowdown)
name old time/op new time/op delta
DgemvMedMedNoTransIncN-12 13.2µs ± 9% 10.2µs ± 1% -22.26% (p=0.000 n=9+9)
Fixes #19633 (append loop)
name old time/op new time/op delta
Foo-12 810ns ± 1% 540ns ± 0% -33.30% (p=0.000 n=8+9)
Update #18977 (Fannkuch11 regression)
name old time/op new time/op delta
Fannkuch11-8 2.80s ± 0% 3.01s ± 0% +7.47% (p=0.000 n=9+10)
This one makes no sense. There's strictly 1 less instruction in the
inner loop (17 instead of 18). They are exactly the same instructions
except for the JMP that has been elided.
go1 benchmarks generally don't look very impressive. But the gains for the
specific issues above make this CL still probably worth it.
name old time/op new time/op delta
BinaryTree17-8 2.32s ± 0% 2.34s ± 0% +1.14% (p=0.000 n=9+7)
Fannkuch11-8 2.80s ± 0% 3.01s ± 0% +7.47% (p=0.000 n=9+10)
FmtFprintfEmpty-8 44.1ns ± 1% 46.1ns ± 1% +4.53% (p=0.000 n=10+10)
FmtFprintfString-8 67.8ns ± 0% 74.4ns ± 1% +9.80% (p=0.000 n=10+9)
FmtFprintfInt-8 74.9ns ± 0% 78.4ns ± 0% +4.67% (p=0.000 n=8+10)
FmtFprintfIntInt-8 117ns ± 1% 123ns ± 1% +4.69% (p=0.000 n=9+10)
FmtFprintfPrefixedInt-8 160ns ± 1% 146ns ± 0% -8.22% (p=0.000 n=8+10)
FmtFprintfFloat-8 214ns ± 0% 206ns ± 0% -3.91% (p=0.000 n=8+8)
FmtManyArgs-8 468ns ± 0% 497ns ± 1% +6.09% (p=0.000 n=8+10)
GobDecode-8 6.16ms ± 0% 6.21ms ± 1% +0.76% (p=0.000 n=9+10)
GobEncode-8 4.90ms ± 0% 4.92ms ± 1% +0.37% (p=0.028 n=9+10)
Gzip-8 209ms ± 0% 212ms ± 0% +1.33% (p=0.000 n=10+10)
Gunzip-8 36.6ms ± 0% 38.0ms ± 1% +4.03% (p=0.000 n=9+9)
HTTPClientServer-8 84.2µs ± 0% 86.0µs ± 1% +2.14% (p=0.000 n=9+9)
JSONEncode-8 13.6ms ± 3% 13.8ms ± 1% +1.55% (p=0.003 n=9+10)
JSONDecode-8 53.2ms ± 5% 52.9ms ± 0% ~ (p=0.280 n=10+10)
Mandelbrot200-8 3.78ms ± 0% 3.78ms ± 1% ~ (p=0.661 n=10+9)
GoParse-8 2.89ms ± 0% 2.94ms ± 2% +1.50% (p=0.000 n=10+10)
RegexpMatchEasy0_32-8 68.5ns ± 2% 68.9ns ± 1% ~ (p=0.136 n=10+10)
RegexpMatchEasy0_1K-8 220ns ± 1% 225ns ± 1% +2.41% (p=0.000 n=10+10)
RegexpMatchEasy1_32-8 64.7ns ± 0% 64.5ns ± 0% -0.28% (p=0.042 n=10+10)
RegexpMatchEasy1_1K-8 348ns ± 1% 355ns ± 0% +1.90% (p=0.000 n=10+10)
RegexpMatchMedium_32-8 102ns ± 1% 105ns ± 1% +2.95% (p=0.000 n=10+10)
RegexpMatchMedium_1K-8 33.1µs ± 3% 32.5µs ± 0% -1.75% (p=0.000 n=10+10)
RegexpMatchHard_32-8 1.71µs ± 1% 1.70µs ± 1% -0.84% (p=0.002 n=10+9)
RegexpMatchHard_1K-8 51.1µs ± 0% 50.8µs ± 1% -0.48% (p=0.004 n=10+10)
Revcomp-8 411ms ± 1% 402ms ± 0% -2.22% (p=0.000 n=10+9)
Template-8 61.8ms ± 1% 59.7ms ± 0% -3.44% (p=0.000 n=9+9)
TimeParse-8 306ns ± 0% 318ns ± 0% +3.83% (p=0.000 n=10+10)
TimeFormat-8 320ns ± 0% 318ns ± 1% -0.53% (p=0.012 n=7+10)
Change-Id: Ifaf29abbe5874e437048e411ba8f7cfbc9e1c94b
Reviewed-on: https://go-review.googlesource.com/38431
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
|