1
0
mirror of https://github.com/golang/go synced 2024-09-29 03:24:29 -06:00
Commit Graph

8 Commits

Author SHA1 Message Date
Ben Shi
067dfce21f cmd/compile: optimize ARM's comparision
Optimize (CMPconst [0] (ADD x y)) to (CMN x y) will only get benefits
when the result of the addition is no longer used, otherwise there
might be even performance drop. And this CL fixes that issue for
CMP/CMN/TST/TEQ.

There is little regression in the go1 benchmark (excluding noise),
and the test case JSONDecode-4 even gets improvement.

name                     old time/op    new time/op    delta
BinaryTree17-4              21.6s ± 1%     21.6s ± 0%  -0.22%  (p=0.013 n=30+30)
Fannkuch11-4                11.1s ± 0%     11.1s ± 0%  +0.11%  (p=0.000 n=30+29)
FmtFprintfEmpty-4           297ns ± 0%     297ns ± 0%  +0.08%  (p=0.007 n=26+28)
FmtFprintfString-4          589ns ± 1%     589ns ± 0%    ~     (p=0.659 n=30+25)
FmtFprintfInt-4             644ns ± 1%     650ns ± 0%  +0.88%  (p=0.000 n=30+24)
FmtFprintfIntInt-4          964ns ± 0%     977ns ± 0%  +1.33%  (p=0.000 n=30+30)
FmtFprintfPrefixedInt-4    1.06µs ± 0%    1.07µs ± 0%  +1.31%  (p=0.000 n=29+27)
FmtFprintfFloat-4          1.89µs ± 0%    1.92µs ± 0%  +1.25%  (p=0.000 n=29+29)
FmtManyArgs-4              3.63µs ± 0%    3.67µs ± 0%  +1.33%  (p=0.000 n=29+27)
GobDecode-4                38.1ms ± 1%    37.9ms ± 1%  -0.60%  (p=0.000 n=29+29)
GobEncode-4                35.3ms ± 2%    35.2ms ± 1%    ~     (p=0.286 n=30+30)
Gzip-4                      2.36s ± 0%     2.37s ± 2%    ~     (p=0.277 n=24+28)
Gunzip-4                    264ms ± 1%     264ms ± 1%    ~     (p=0.104 n=28+30)
HTTPClientServer-4         1.04ms ± 4%    1.02ms ± 4%  -1.65%  (p=0.000 n=28+28)
JSONEncode-4               78.5ms ± 1%    79.6ms ± 1%  +1.34%  (p=0.000 n=27+28)
JSONDecode-4                379ms ± 4%     352ms ± 5%  -7.09%  (p=0.000 n=29+30)
Mandelbrot200-4            17.6ms ± 0%    17.6ms ± 0%    ~     (p=0.206 n=28+29)
GoParse-4                  21.9ms ± 1%    22.1ms ± 1%  +0.87%  (p=0.000 n=28+26)
RegexpMatchEasy0_32-4       631ns ± 0%     641ns ± 0%  +1.63%  (p=0.000 n=29+30)
RegexpMatchEasy0_1K-4      4.11µs ± 0%    4.11µs ± 0%    ~     (p=0.700 n=30+30)
RegexpMatchEasy1_32-4       670ns ± 0%     679ns ± 0%  +1.37%  (p=0.000 n=21+30)
RegexpMatchEasy1_1K-4      5.31µs ± 0%    5.26µs ± 0%  -1.03%  (p=0.000 n=25+28)
RegexpMatchMedium_32-4      905ns ± 0%     906ns ± 0%  +0.14%  (p=0.001 n=30+30)
RegexpMatchMedium_1K-4      192µs ± 0%     191µs ± 0%  -0.45%  (p=0.000 n=29+27)
RegexpMatchHard_32-4       11.8µs ± 0%    11.7µs ± 0%  -0.39%  (p=0.000 n=29+28)
RegexpMatchHard_1K-4        347µs ± 0%     347µs ± 0%    ~     (p=0.084 n=29+30)
Revcomp-4                  37.5ms ± 1%    37.5ms ± 1%    ~     (p=0.279 n=29+29)
Template-4                  519ms ± 2%     519ms ± 2%    ~     (p=0.652 n=28+29)
TimeParse-4                2.83µs ± 0%    2.78µs ± 0%  -1.90%  (p=0.000 n=27+28)
TimeFormat-4               5.79µs ± 0%    5.60µs ± 0%  -3.23%  (p=0.000 n=29+29)
[Geo mean]                  331µs          330µs       -0.16%

name                     old speed      new speed      delta
GobDecode-4              20.1MB/s ± 1%  20.3MB/s ± 1%  +0.61%  (p=0.000 n=29+29)
GobEncode-4              21.7MB/s ± 2%  21.8MB/s ± 1%    ~     (p=0.294 n=30+30)
Gzip-4                   8.23MB/s ± 1%  8.20MB/s ± 2%    ~     (p=0.099 n=26+28)
Gunzip-4                 73.5MB/s ± 1%  73.4MB/s ± 1%    ~     (p=0.107 n=28+30)
JSONEncode-4             24.7MB/s ± 1%  24.4MB/s ± 1%  -1.32%  (p=0.000 n=27+28)
JSONDecode-4             5.13MB/s ± 4%  5.52MB/s ± 5%  +7.65%  (p=0.000 n=29+30)
GoParse-4                2.65MB/s ± 1%  2.63MB/s ± 1%  -0.87%  (p=0.000 n=28+26)
RegexpMatchEasy0_32-4    50.7MB/s ± 0%  49.9MB/s ± 0%  -1.58%  (p=0.000 n=29+29)
RegexpMatchEasy0_1K-4     249MB/s ± 0%   249MB/s ± 0%    ~     (p=0.342 n=30+28)
RegexpMatchEasy1_32-4    47.7MB/s ± 0%  47.1MB/s ± 0%  -1.39%  (p=0.000 n=26+30)
RegexpMatchEasy1_1K-4     193MB/s ± 0%   195MB/s ± 0%  +1.04%  (p=0.000 n=25+28)
RegexpMatchMedium_32-4   1.10MB/s ± 0%  1.10MB/s ± 0%  -0.42%  (p=0.000 n=30+26)
RegexpMatchMedium_1K-4   5.33MB/s ± 0%  5.36MB/s ± 0%  +0.43%  (p=0.000 n=29+29)
RegexpMatchHard_32-4     2.72MB/s ± 0%  2.73MB/s ± 0%  +0.37%  (p=0.000 n=29+30)
RegexpMatchHard_1K-4     2.95MB/s ± 0%  2.95MB/s ± 0%    ~     (all equal)
Revcomp-4                67.8MB/s ± 1%  67.7MB/s ± 1%    ~     (p=0.273 n=29+29)
Template-4               3.74MB/s ± 2%  3.74MB/s ± 2%    ~     (p=0.665 n=28+29)
[Geo mean]               15.2MB/s       15.2MB/s       +0.21%

Change-Id: Ifed1fb8cc02d5ca52c8bc6c21b6b5bf6dbb2701a
Reviewed-on: https://go-review.googlesource.com/132115
Run-TryBot: Ben Shi <powerman1st@163.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-09-05 14:54:08 +00:00
Ben Shi
0e9f1de0b7 cmd/compile: optimize arm64's comparison
Add more optimization with TST/CMN.

1. A tiny benchmark shows more than 12% improvement.
TSTCMN-4                    378µs ± 0%     332µs ± 0%  -12.15%  (p=0.000 n=30+27)
(https://github.com/benshi001/ugo1/blob/master/tstcmn_test.go)

2. There is little regression in the go1 benchmark, excluding noise.

name                     old time/op    new time/op    delta
BinaryTree17-4              19.1s ± 0%     19.1s ± 0%    ~     (p=0.994 n=28+29)
Fannkuch11-4                10.0s ± 0%     10.0s ± 0%    ~     (p=0.198 n=30+25)
FmtFprintfEmpty-4           233ns ± 0%     233ns ± 0%  +0.14%  (p=0.002 n=24+30)
FmtFprintfString-4          428ns ± 0%     428ns ± 0%    ~     (all equal)
FmtFprintfInt-4             472ns ± 0%     472ns ± 0%    ~     (all equal)
FmtFprintfIntInt-4          725ns ± 0%     725ns ± 0%    ~     (all equal)
FmtFprintfPrefixedInt-4     889ns ± 0%     888ns ± 0%    ~     (p=0.632 n=28+30)
FmtFprintfFloat-4          1.20µs ± 0%    1.20µs ± 0%  +0.05%  (p=0.001 n=18+30)
FmtManyArgs-4              3.00µs ± 0%    2.99µs ± 0%  -0.07%  (p=0.001 n=27+30)
GobDecode-4                42.1ms ± 0%    42.2ms ± 0%  +0.29%  (p=0.000 n=28+28)
GobEncode-4                38.6ms ± 9%    38.8ms ± 9%    ~     (p=0.912 n=30+30)
Gzip-4                      2.07s ± 1%     2.05s ± 1%  -0.64%  (p=0.000 n=29+30)
Gunzip-4                    175ms ± 0%     175ms ± 0%  -0.15%  (p=0.001 n=30+30)
HTTPClientServer-4          872µs ± 5%     880µs ± 6%    ~     (p=0.196 n=30+29)
JSONEncode-4               88.5ms ± 1%    89.8ms ± 1%  +1.49%  (p=0.000 n=23+24)
JSONDecode-4                393ms ± 1%     390ms ± 1%  -0.89%  (p=0.000 n=28+30)
Mandelbrot200-4            19.5ms ± 0%    19.5ms ± 0%    ~     (p=0.405 n=29+28)
GoParse-4                  19.9ms ± 0%    20.0ms ± 0%  +0.27%  (p=0.000 n=30+30)
RegexpMatchEasy0_32-4       431ns ± 0%     431ns ± 0%    ~     (p=1.000 n=30+30)
RegexpMatchEasy0_1K-4      1.61µs ± 0%    1.61µs ± 0%    ~     (p=0.527 n=26+26)
RegexpMatchEasy1_32-4       443ns ± 0%     443ns ± 0%    ~     (all equal)
RegexpMatchEasy1_1K-4      2.58µs ± 1%    2.58µs ± 1%    ~     (p=0.578 n=27+25)
RegexpMatchMedium_32-4      740ns ± 0%     740ns ± 0%    ~     (p=0.357 n=30+30)
RegexpMatchMedium_1K-4      223µs ± 0%     223µs ± 0%  +0.16%  (p=0.000 n=30+29)
RegexpMatchHard_32-4       12.3µs ± 0%    12.3µs ± 0%    ~     (p=0.236 n=27+27)
RegexpMatchHard_1K-4        371µs ± 0%     371µs ± 0%  +0.09%  (p=0.000 n=30+27)
Revcomp-4                   2.85s ± 0%     2.85s ± 0%    ~     (p=0.057 n=28+25)
Template-4                  408ms ± 1%     409ms ± 1%    ~     (p=0.117 n=29+29)
TimeParse-4                1.93µs ± 0%    1.93µs ± 0%    ~     (p=0.535 n=29+28)
TimeFormat-4               1.99µs ± 0%    1.99µs ± 0%    ~     (p=0.168 n=29+28)
[Geo mean]                  306µs          307µs       +0.07%

name                     old speed      new speed      delta
GobDecode-4              18.3MB/s ± 0%  18.2MB/s ± 0%  -0.31%  (p=0.000 n=28+29)
GobEncode-4              19.9MB/s ± 8%  19.8MB/s ± 9%    ~     (p=0.923 n=30+30)
Gzip-4                   9.39MB/s ± 1%  9.45MB/s ± 1%  +0.65%  (p=0.000 n=29+30)
Gunzip-4                  111MB/s ± 0%   111MB/s ± 0%  +0.15%  (p=0.001 n=30+30)
JSONEncode-4             21.9MB/s ± 1%  21.6MB/s ± 1%  -1.45%  (p=0.000 n=23+23)
JSONDecode-4             4.94MB/s ± 1%  4.98MB/s ± 1%  +0.84%  (p=0.000 n=27+30)
GoParse-4                2.91MB/s ± 0%  2.90MB/s ± 0%  -0.34%  (p=0.000 n=21+22)
RegexpMatchEasy0_32-4    74.1MB/s ± 0%  74.1MB/s ± 0%    ~     (p=0.469 n=29+28)
RegexpMatchEasy0_1K-4     634MB/s ± 0%   634MB/s ± 0%    ~     (p=0.978 n=24+28)
RegexpMatchEasy1_32-4    72.2MB/s ± 0%  72.2MB/s ± 0%    ~     (p=0.064 n=27+29)
RegexpMatchEasy1_1K-4     396MB/s ± 1%   396MB/s ± 1%    ~     (p=0.583 n=27+25)
RegexpMatchMedium_32-4   1.35MB/s ± 0%  1.35MB/s ± 0%    ~     (all equal)
RegexpMatchMedium_1K-4   4.60MB/s ± 0%  4.59MB/s ± 0%  -0.14%  (p=0.000 n=30+26)
RegexpMatchHard_32-4     2.61MB/s ± 0%  2.61MB/s ± 0%    ~     (all equal)
RegexpMatchHard_1K-4     2.76MB/s ± 0%  2.76MB/s ± 0%    ~     (all equal)
Revcomp-4                89.1MB/s ± 0%  89.1MB/s ± 0%    ~     (p=0.059 n=28+25)
Template-4               4.75MB/s ± 1%  4.75MB/s ± 1%    ~     (p=0.106 n=29+29)
[Geo mean]               18.3MB/s       18.3MB/s       -0.07%

Change-Id: I3cd76ce63e84b0c3cebabf9fa3573b76a7343899
Reviewed-on: https://go-review.googlesource.com/124935
Run-TryBot: Ben Shi <powerman1st@163.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-09-05 02:51:28 +00:00
Ben Shi
2b69ad0b3c cmd/compile: optimize arm's comparison
The CMP/CMN/TST/TEQ perform similar to SUB/ADD/AND/XOR except
the result is abondoned, and only NZCV flags are affected.

This CL implements further optimization with them.

1. A micro benchmark test gets more than 9% improvment.
TSTTEQ-4                   6.99ms ± 0%    6.35ms ± 0%  -9.15%  (p=0.000 n=33+36)
(https://github.com/benshi001/ugo1/blob/master/tstteq2_test.go)

2. The go1 benckmark shows no regression, excluding noise.
name                     old time/op    new time/op    delta
BinaryTree17-4              25.7s ± 1%     25.7s ± 1%    ~     (p=0.830 n=40+40)
Fannkuch11-4                13.3s ± 0%     13.2s ± 0%  -0.65%  (p=0.000 n=40+34)
FmtFprintfEmpty-4           394ns ± 0%     394ns ± 0%    ~     (p=0.819 n=40+40)
FmtFprintfString-4          677ns ± 0%     677ns ± 0%  +0.06%  (p=0.039 n=39+40)
FmtFprintfInt-4             707ns ± 0%     706ns ± 0%  -0.14%  (p=0.000 n=40+39)
FmtFprintfIntInt-4         1.04µs ± 0%    1.04µs ± 0%  +0.10%  (p=0.000 n=29+31)
FmtFprintfPrefixedInt-4    1.10µs ± 0%    1.11µs ± 0%  +0.65%  (p=0.000 n=39+37)
FmtFprintfFloat-4          2.27µs ± 0%    2.26µs ± 0%  -0.53%  (p=0.000 n=39+40)
FmtManyArgs-4              3.96µs ± 0%    3.96µs ± 0%  +0.10%  (p=0.000 n=39+40)
GobDecode-4                53.4ms ± 1%    52.8ms ± 2%  -1.10%  (p=0.000 n=39+39)
GobEncode-4                50.3ms ± 3%    50.4ms ± 2%    ~     (p=0.089 n=40+39)
Gzip-4                      2.62s ± 0%     2.64s ± 0%  +0.60%  (p=0.000 n=40+39)
Gunzip-4                    312ms ± 0%     312ms ± 0%  +0.02%  (p=0.030 n=40+39)
HTTPClientServer-4         1.01ms ± 7%    0.98ms ± 7%  -2.37%  (p=0.000 n=40+39)
JSONEncode-4                126ms ± 1%     126ms ± 1%  -0.38%  (p=0.004 n=39+39)
JSONDecode-4                423ms ± 0%     426ms ± 2%  +0.72%  (p=0.001 n=39+40)
Mandelbrot200-4            18.4ms ± 0%    18.4ms ± 0%  +0.04%  (p=0.000 n=38+40)
GoParse-4                  22.8ms ± 0%    22.6ms ± 0%  -0.68%  (p=0.000 n=35+40)
RegexpMatchEasy0_32-4       699ns ± 0%     704ns ± 0%  +0.73%  (p=0.000 n=27+40)
RegexpMatchEasy0_1K-4      4.27µs ± 0%    4.26µs ± 0%  -0.09%  (p=0.000 n=35+38)
RegexpMatchEasy1_32-4       741ns ± 0%     735ns ± 0%  -0.85%  (p=0.000 n=40+35)
RegexpMatchEasy1_1K-4      5.53µs ± 0%    5.49µs ± 0%  -0.69%  (p=0.000 n=39+40)
RegexpMatchMedium_32-4     1.07µs ± 0%    1.04µs ± 2%  -2.34%  (p=0.000 n=40+40)
RegexpMatchMedium_1K-4      261µs ± 0%     261µs ± 0%  -0.16%  (p=0.000 n=40+39)
RegexpMatchHard_32-4       14.9µs ± 0%    14.9µs ± 0%  -0.18%  (p=0.000 n=39+40)
RegexpMatchHard_1K-4        445µs ± 0%     446µs ± 0%  +0.09%  (p=0.000 n=36+34)
Revcomp-4                  41.8ms ± 1%    41.8ms ± 1%    ~     (p=0.595 n=39+38)
Template-4                  530ms ± 1%     528ms ± 1%  -0.49%  (p=0.000 n=40+40)
TimeParse-4                3.39µs ± 0%    3.42µs ± 0%  +0.98%  (p=0.000 n=36+38)
TimeFormat-4               6.12µs ± 0%    6.07µs ± 0%  -0.81%  (p=0.000 n=34+38)
[Geo mean]                  384µs          383µs       -0.24%

name                     old speed      new speed      delta
GobDecode-4              14.4MB/s ± 1%  14.5MB/s ± 2%  +1.11%  (p=0.000 n=39+39)
GobEncode-4              15.3MB/s ± 3%  15.2MB/s ± 2%    ~     (p=0.104 n=40+39)
Gzip-4                   7.40MB/s ± 1%  7.36MB/s ± 0%  -0.60%  (p=0.000 n=40+39)
Gunzip-4                 62.2MB/s ± 0%  62.1MB/s ± 0%  -0.02%  (p=0.047 n=40+39)
JSONEncode-4             15.4MB/s ± 1%  15.4MB/s ± 2%  +0.39%  (p=0.002 n=39+39)
JSONDecode-4             4.59MB/s ± 0%  4.56MB/s ± 2%  -0.71%  (p=0.000 n=39+40)
GoParse-4                2.54MB/s ± 0%  2.56MB/s ± 0%  +0.72%  (p=0.000 n=26+40)
RegexpMatchEasy0_32-4    45.8MB/s ± 0%  45.4MB/s ± 0%  -0.75%  (p=0.000 n=38+40)
RegexpMatchEasy0_1K-4     240MB/s ± 0%   240MB/s ± 0%  +0.09%  (p=0.000 n=35+38)
RegexpMatchEasy1_32-4    43.1MB/s ± 0%  43.5MB/s ± 0%  +0.84%  (p=0.000 n=40+39)
RegexpMatchEasy1_1K-4     185MB/s ± 0%   186MB/s ± 0%  +0.69%  (p=0.000 n=39+40)
RegexpMatchMedium_32-4    936kB/s ± 1%   959kB/s ± 2%  +2.38%  (p=0.000 n=40+40)
RegexpMatchMedium_1K-4   3.92MB/s ± 0%  3.93MB/s ± 0%  +0.18%  (p=0.000 n=39+40)
RegexpMatchHard_32-4     2.15MB/s ± 0%  2.15MB/s ± 0%  +0.19%  (p=0.000 n=40+40)
RegexpMatchHard_1K-4     2.30MB/s ± 0%  2.30MB/s ± 0%    ~     (all equal)
Revcomp-4                60.8MB/s ± 1%  60.8MB/s ± 1%    ~     (p=0.600 n=39+38)
Template-4               3.66MB/s ± 1%  3.68MB/s ± 1%  +0.46%  (p=0.000 n=40+40)
[Geo mean]               12.8MB/s       12.8MB/s       +0.27%

Change-Id: I849161169ecf0876a04b7c1d3990fa8d1435215e
Reviewed-on: https://go-review.googlesource.com/122855
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-08-27 15:49:46 +00:00
Ben Shi
556971316f cmd/compile: optimize 386's comparison
CMPL/CMPW/CMPB can take a memory operand on 386, and this CL
implements that optimization.

1. The total size of pkg/linux_386 decreases about 45KB, excluding
cmd/compile.

2. The go1 benchmark shows a little improvement.
name                     old time/op    new time/op    delta
BinaryTree17-4              3.36s ± 2%     3.37s ± 3%    ~     (p=0.537 n=40+40)
Fannkuch11-4                3.59s ± 1%     3.53s ± 2%  -1.58%  (p=0.000 n=40+40)
FmtFprintfEmpty-4          46.0ns ± 3%    45.8ns ± 3%    ~     (p=0.249 n=40+40)
FmtFprintfString-4         80.0ns ± 4%    78.8ns ± 3%  -1.49%  (p=0.001 n=40+40)
FmtFprintfInt-4            89.7ns ± 2%    90.3ns ± 2%  +0.74%  (p=0.003 n=40+40)
FmtFprintfIntInt-4          144ns ± 3%     143ns ± 3%  -0.95%  (p=0.003 n=40+40)
FmtFprintfPrefixedInt-4     181ns ± 4%     180ns ± 2%    ~     (p=0.103 n=40+40)
FmtFprintfFloat-4           412ns ± 3%     408ns ± 4%  -0.97%  (p=0.018 n=40+40)
FmtManyArgs-4               607ns ± 4%     605ns ± 4%    ~     (p=0.148 n=40+40)
GobDecode-4                7.19ms ± 4%    7.24ms ± 5%    ~     (p=0.340 n=40+40)
GobEncode-4                7.04ms ± 9%    6.99ms ± 9%    ~     (p=0.289 n=40+40)
Gzip-4                      400ms ± 6%     398ms ± 5%    ~     (p=0.168 n=40+40)
Gunzip-4                   41.2ms ± 3%    41.7ms ± 3%  +1.40%  (p=0.001 n=40+40)
HTTPClientServer-4         62.5µs ± 1%    62.1µs ± 2%  -0.61%  (p=0.000 n=37+37)
JSONEncode-4               20.7ms ± 4%    20.4ms ± 3%  -1.60%  (p=0.000 n=40+40)
JSONDecode-4               69.4ms ± 4%    69.2ms ± 6%    ~     (p=0.177 n=40+40)
Mandelbrot200-4            5.22ms ± 6%    5.21ms ± 3%    ~     (p=0.531 n=40+40)
GoParse-4                  3.29ms ± 3%    3.28ms ± 3%    ~     (p=0.321 n=40+39)
RegexpMatchEasy0_32-4       104ns ± 4%     103ns ± 7%  -0.89%  (p=0.040 n=40+40)
RegexpMatchEasy0_1K-4       852ns ± 3%     853ns ± 2%    ~     (p=0.357 n=40+40)
RegexpMatchEasy1_32-4       113ns ± 8%     113ns ± 3%    ~     (p=0.906 n=40+40)
RegexpMatchEasy1_1K-4      1.03µs ± 4%    1.03µs ± 5%    ~     (p=0.326 n=40+40)
RegexpMatchMedium_32-4      136ns ± 3%     133ns ± 3%  -2.31%  (p=0.000 n=40+40)
RegexpMatchMedium_1K-4     44.0µs ± 3%    43.7µs ± 3%    ~     (p=0.053 n=40+40)
RegexpMatchHard_32-4       2.27µs ± 3%    2.26µs ± 4%    ~     (p=0.391 n=40+40)
RegexpMatchHard_1K-4       68.0µs ± 3%    68.9µs ± 3%  +1.28%  (p=0.000 n=40+40)
Revcomp-4                   1.86s ± 5%     1.86s ± 2%    ~     (p=0.950 n=40+40)
Template-4                 73.4ms ± 4%    69.9ms ± 7%  -4.78%  (p=0.000 n=40+40)
TimeParse-4                 449ns ± 4%     441ns ± 5%  -1.76%  (p=0.000 n=40+40)
TimeFormat-4                416ns ± 3%     417ns ± 4%    ~     (p=0.304 n=40+40)
[Geo mean]                 67.7µs         67.3µs       -0.55%

name                     old speed      new speed      delta
GobDecode-4               107MB/s ± 4%   106MB/s ± 5%    ~     (p=0.336 n=40+40)
GobEncode-4               109MB/s ± 5%   110MB/s ± 9%    ~     (p=0.142 n=38+40)
Gzip-4                   48.5MB/s ± 5%  48.8MB/s ± 5%    ~     (p=0.172 n=40+40)
Gunzip-4                  472MB/s ± 3%   465MB/s ± 3%  -1.39%  (p=0.001 n=40+40)
JSONEncode-4             93.6MB/s ± 4%  95.1MB/s ± 3%  +1.61%  (p=0.000 n=40+40)
JSONDecode-4             28.0MB/s ± 3%  28.1MB/s ± 6%    ~     (p=0.181 n=40+40)
GoParse-4                17.6MB/s ± 3%  17.7MB/s ± 3%    ~     (p=0.350 n=40+39)
RegexpMatchEasy0_32-4     308MB/s ± 4%   311MB/s ± 6%  +0.96%  (p=0.025 n=40+40)
RegexpMatchEasy0_1K-4    1.20GB/s ± 3%  1.20GB/s ± 2%    ~     (p=0.317 n=40+40)
RegexpMatchEasy1_32-4     282MB/s ± 7%   282MB/s ± 3%    ~     (p=0.516 n=40+40)
RegexpMatchEasy1_1K-4     994MB/s ± 4%   991MB/s ± 5%    ~     (p=0.319 n=40+40)
RegexpMatchMedium_32-4   7.31MB/s ± 3%  7.49MB/s ± 3%  +2.46%  (p=0.000 n=40+40)
RegexpMatchMedium_1K-4   23.3MB/s ± 3%  23.4MB/s ± 3%    ~     (p=0.052 n=40+40)
RegexpMatchHard_32-4     14.1MB/s ± 3%  14.1MB/s ± 4%    ~     (p=0.391 n=40+40)
RegexpMatchHard_1K-4     15.1MB/s ± 3%  14.9MB/s ± 3%  -1.27%  (p=0.000 n=40+40)
Revcomp-4                 137MB/s ± 5%   137MB/s ± 2%    ~     (p=0.942 n=40+40)
Template-4               26.5MB/s ± 4%  27.8MB/s ± 7%  +5.03%  (p=0.000 n=40+40)
[Geo mean]               78.6MB/s       79.0MB/s       +0.57%

Change-Id: Idcacc6881ef57cd7dc33aa87b711282842b72a53
Reviewed-on: https://go-review.googlesource.com/126618
Run-TryBot: Ben Shi <powerman1st@163.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-08-20 14:23:22 +00:00
Michael Munday
b65122f99a cmd/compile: optimize comparisons using load merging where available
Multi-byte comparison operations were used on amd64, arm64, i386
and s390x for comparisons with constant arrays, but only amd64 and
i386 for comparisons with string constants. This CL combines the
check for platform capability, since they have the same requirements,
and also enables both on ppc64le which also supports load merging.

Note that these optimizations currently use little endian byte order
which results in byte reversal instructions on s390x. This should
be fixed at some point.

Change-Id: Ie612d13359b50c77f4d7c6e73fea4a59fa11f322
Reviewed-on: https://go-review.googlesource.com/102558
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-04-09 21:16:47 +00:00
Alberto Donizetti
a27cd4fd31 test/codegen: port tbz/tbnz arm64 tests
And delete them from asm_test.

Change-Id: I34fcf85ae8ce09cd146fe4ce6a0ae7616bd97e2d
Reviewed-on: https://go-review.googlesource.com/102296
Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Giovanni Bajo <rasky@develer.com>
2018-03-24 09:35:53 +00:00
Alberto Donizetti
fc6280d4b0 test/codegen: port direct comparisons with memory tests
And remove them from asm_test.

Change-Id: I1ca29b40546d6de06f20bfd550ed8ff87f495454
Reviewed-on: https://go-review.googlesource.com/102115
Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-03-22 17:20:09 +00:00
Alberto Donizetti
be371edd67 test/codegen: port comparisons tests to codegen
And delete them from asm_test.

Change-Id: I64c512bfef3b3da6db5c5d29277675dade28b8ab
Reviewed-on: https://go-review.googlesource.com/101595
Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Giovanni Bajo <rasky@develer.com>
2018-03-20 19:38:06 +00:00