4 commits

Author SHA1 Message Date
Russ Cox
c007ce824d build: move package sources from src/pkg to src
Preparation was in CL 134570043.
This CL contains only the effect of 'hg mv src/pkg/* src'.
For more about the move, see golang.org/s/go14nopkg.
2014-09-08 00:08:51 -04:00
Brad Fitzpatrick
994f59666f bytes: don't grow Buffer if capacity is available
Also added a new benchmark from the same test:

benchmark                           old ns/op    new ns/op    delta
BenchmarkBufferNotEmptyWriteRead      2643698       709189  -73.17%

Fixes #5154

R=golang-dev, r, gri
CC=golang-dev
https://golang.org/cl/8164043
2013-03-29 12:39:19 -07:00
Russ Cox
9b875bc037 bytes: faster Count, Index, Equal
Benchmarks are from GOARCH=amd64 on a MacPro5,1.

benchmark                                    old MB/s     new MB/s  speedup
bytes_test.BenchmarkEqual32                    452.89       891.07    1.97x
bytes_test.BenchmarkEqual4K                    852.71      1700.44    1.99x
bytes_test.BenchmarkEqual4M                    841.53      1587.93    1.89x
bytes_test.BenchmarkEqual64M                   838.22      1578.14    1.88x

bytes_test.BenchmarkIndex32                     58.02        48.99    0.84x
bytes_test.BenchmarkIndex4K                     48.26        41.32    0.86x
bytes_test.BenchmarkIndex4M                     48.20        41.24    0.86x
bytes_test.BenchmarkIndex64M                    48.08        41.21    0.86x
bytes_test.BenchmarkIndexEasy32                410.04       546.82    1.33x
bytes_test.BenchmarkIndexEasy4K                849.26     14257.37   16.79x
bytes_test.BenchmarkIndexEasy4M                854.54     17222.15   20.15x
bytes_test.BenchmarkIndexEasy64M               843.57     11060.40   13.11x

bytes_test.BenchmarkCount32                     57.24        50.68    0.89x
bytes_test.BenchmarkCount4K                     48.19        41.82    0.87x
bytes_test.BenchmarkCount4M                     48.18        41.74    0.87x
bytes_test.BenchmarkCount64M                    48.17        41.71    0.87x
bytes_test.BenchmarkCountEasy32                433.11       547.44    1.26x
bytes_test.BenchmarkCountEasy4K               1130.59     14194.06   12.55x
bytes_test.BenchmarkCountEasy4M               1131.23     17231.18   15.23x
bytes_test.BenchmarkCountEasy64M              1111.40     11068.88    9.96x

The non-easy Count/Index benchmarks use a worst-case input.

regexp.BenchmarkMatchEasy0_32                  237.46       221.47    0.93x
regexp.BenchmarkMatchEasy0_1K                  553.53      1019.72    1.84x
regexp.BenchmarkMatchEasy0_32K                 693.99      1672.06    2.41x
regexp.BenchmarkMatchEasy0_1M                  688.72      1611.68    2.34x
regexp.BenchmarkMatchEasy0_32M                 680.70      1565.05    2.30x
regexp.BenchmarkMatchEasy1_32                  165.56       243.08    1.47x
regexp.BenchmarkMatchEasy1_1K                  336.45       496.32    1.48x
regexp.BenchmarkMatchEasy1_32K                 302.80       425.63    1.41x
regexp.BenchmarkMatchEasy1_1M                  300.42       414.20    1.38x
regexp.BenchmarkMatchEasy1_32M                 299.64       413.47    1.38x

R=golang-dev, r, iant
CC=golang-dev
https://golang.org/cl/5451116
2011-12-07 15:09:56 -05:00
Russ Cox
d6b3f37e1e bytes: asm for bytes.IndexByte
PERFORMANCE DIFFERENCE

SUMMARY

                                                   amd64           386
2.2 GHz AMD Opteron 8214 HE (Linux)             3.0x faster    8.2x faster
3.60 GHz Intel Xeon (Linux)                     2.2x faster    6.3x faster
2.53 GHz Intel Core2 Duo E7200 (Linux)          1.5x faster    4.4x faster
2.66 GHz Intel Xeon 5150 (Mac Pro, OS X)        1.5x SLOWER    3.0x faster
2.33 GHz Intel Xeon E5435 (Linux)               1.5x SLOWER    3.0x faster
2.33 GHz Intel Core2 T7600 (MacBook Pro, OS X)  1.4x SLOWER    3.0x faster
1.83 GHz Intel Core2 T5600 (Mac Mini, OS X)        none*       3.0x faster

* but yesterday I consistently saw 1.4x SLOWER.

DETAILS

2.2 GHz AMD Opteron 8214 HE (Linux)

amd64 (3x faster)

IndexByte4K            500000           3733 ns/op     1097.24 MB/s
IndexByte4M               500        4328042 ns/op      969.10 MB/s
IndexByte64M               50       67866160 ns/op      988.84 MB/s

IndexBytePortable4K    200000          11161 ns/op      366.99 MB/s
IndexBytePortable4M       100       11795880 ns/op      355.57 MB/s
IndexBytePortable64M       10      188675000 ns/op      355.68 MB/s

386 (8.2x faster)

IndexByte4K            500000           3734 ns/op     1096.95 MB/s
IndexByte4M               500        4209954 ns/op      996.28 MB/s
IndexByte64M               50       68031980 ns/op      986.43 MB/s

IndexBytePortable4K     50000          30670 ns/op      133.55 MB/s
IndexBytePortable4M        50       31868220 ns/op      131.61 MB/s
IndexBytePortable64M        2      508851500 ns/op      131.88 MB/s

3.60 GHz Intel Xeon (Linux)

amd64 (2.2x faster)

IndexByte4K            500000           4612 ns/op      888.12 MB/s
IndexByte4M               500        4835250 ns/op      867.44 MB/s
IndexByte64M               20       77388450 ns/op      867.17 MB/s

IndexBytePortable4K    200000          10306 ns/op      397.44 MB/s
IndexBytePortable4M       100       11201460 ns/op      374.44 MB/s
IndexBytePortable64M       10      179456800 ns/op      373.96 MB/s

386 (6.3x faster)

IndexByte4K            500000           4631 ns/op      884.47 MB/s
IndexByte4M               500        4846388 ns/op      865.45 MB/s
IndexByte64M               20       78691200 ns/op      852.81 MB/s

IndexBytePortable4K    100000          28989 ns/op      141.29 MB/s
IndexBytePortable4M        50       31183180 ns/op      134.51 MB/s
IndexBytePortable64M        5      498347200 ns/op      134.66 MB/s

2.53 GHz Intel Core2 Duo E7200  (Linux)

amd64 (1.5x faster)

IndexByte4K            500000           6502 ns/op      629.96 MB/s
IndexByte4M               500        6692208 ns/op      626.74 MB/s
IndexByte64M               10      107410400 ns/op      624.79 MB/s

IndexBytePortable4K    200000           9721 ns/op      421.36 MB/s
IndexBytePortable4M       100       10013680 ns/op      418.86 MB/s
IndexBytePortable64M       10      160460800 ns/op      418.23 MB/s

386 (4.4x faster)

IndexByte4K            500000           6505 ns/op      629.67 MB/s
IndexByte4M               500        6694078 ns/op      626.57 MB/s
IndexByte64M               10      107397600 ns/op      624.86 MB/s

IndexBytePortable4K    100000          28835 ns/op      142.05 MB/s
IndexBytePortable4M        50       29562680 ns/op      141.88 MB/s
IndexBytePortable64M        5      473221400 ns/op      141.81 MB/s

2.66 GHz Intel Xeon 5150  (Mac Pro, OS X)

amd64 (1.5x SLOWER)

IndexByte4K            200000           9290 ns/op      440.90 MB/s
IndexByte4M               200        9568925 ns/op      438.33 MB/s
IndexByte64M               10      154473600 ns/op      434.44 MB/s

IndexBytePortable4K    500000           6202 ns/op      660.43 MB/s
IndexBytePortable4M       500        6583614 ns/op      637.08 MB/s
IndexBytePortable64M       20      107166250 ns/op      626.21 MB/s

386 (3x faster)

IndexByte4K            200000           9301 ns/op      440.38 MB/s
IndexByte4M               200        9568025 ns/op      438.37 MB/s
IndexByte64M               10      154391000 ns/op      434.67 MB/s

IndexBytePortable4K    100000          27526 ns/op      148.80 MB/s
IndexBytePortable4M       100       28302490 ns/op      148.20 MB/s
IndexBytePortable64M        5      454170200 ns/op      147.76 MB/s

2.33 GHz Intel Xeon E5435  (Linux)

amd64 (1.5x SLOWER)

IndexByte4K            200000          10601 ns/op      386.38 MB/s
IndexByte4M               100       10827240 ns/op      387.38 MB/s
IndexByte64M               10      173175500 ns/op      387.52 MB/s

IndexBytePortable4K    500000           7082 ns/op      578.37 MB/s
IndexBytePortable4M       500        7391792 ns/op      567.43 MB/s
IndexBytePortable64M       20      122618550 ns/op      547.30 MB/s

386 (3x faster)

IndexByte4K            200000          11074 ns/op      369.88 MB/s
IndexByte4M               100       10902620 ns/op      384.71 MB/s
IndexByte64M               10      181292800 ns/op      370.17 MB/s

IndexBytePortable4K     50000          31725 ns/op      129.11 MB/s
IndexBytePortable4M        50       32564880 ns/op      128.80 MB/s
IndexBytePortable64M        2      545926000 ns/op      122.93 MB/s

2.33 GHz Intel Core2 T7600 (MacBook Pro, OS X)

amd64 (1.4x SLOWER)

IndexByte4K            200000          11120 ns/op      368.35 MB/s
IndexByte4M               100       11531950 ns/op      363.71 MB/s
IndexByte64M               10      184819000 ns/op      363.11 MB/s

IndexBytePortable4K    500000           7419 ns/op      552.10 MB/s
IndexBytePortable4M       200        8018710 ns/op      523.06 MB/s
IndexBytePortable64M       10      127614900 ns/op      525.87 MB/s

386 (3x faster)

IndexByte4K            200000          11114 ns/op      368.54 MB/s
IndexByte4M               100       11443530 ns/op      366.52 MB/s
IndexByte64M               10      185212000 ns/op      362.34 MB/s

IndexBytePortable4K     50000          32891 ns/op      124.53 MB/s
IndexBytePortable4M        50       33930580 ns/op      123.61 MB/s
IndexBytePortable64M        2      545400500 ns/op      123.05 MB/s

1.83 GHz Intel Core2 T5600  (Mac Mini, OS X)

amd64 (no difference)

IndexByte4K            200000          13497 ns/op      303.47 MB/s
IndexByte4M               100       13890650 ns/op      301.95 MB/s
IndexByte64M                5      222358000 ns/op      301.81 MB/s

IndexBytePortable4K    200000          13584 ns/op      301.53 MB/s
IndexBytePortable4M       100       13913280 ns/op      301.46 MB/s
IndexBytePortable64M       10      222572600 ns/op      301.51 MB/s

386 (3x faster)

IndexByte4K            200000          13565 ns/op      301.95 MB/s
IndexByte4M               100       13882640 ns/op      302.13 MB/s
IndexByte64M                5      221411600 ns/op      303.10 MB/s

IndexBytePortable4K     50000          39978 ns/op      102.46 MB/s
IndexBytePortable4M        50       41038160 ns/op      102.20 MB/s
IndexBytePortable64M        2      656362500 ns/op      102.24 MB/s

R=r
CC=golang-dev
https://golang.org/cl/166055
2009-12-04 10:23:43 -08:00