1
0
mirror of https://github.com/golang/go synced 2024-11-17 05:35:00 -07:00
Commit Graph

35574 Commits

Author SHA1 Message Date
Giovanni Bajo
0bcf8bcd99 test: in asmcheck, regexp must match from beginning of line
This avoid simple bugs like "ADD" matching "FADD". Obviously
"ADD" will still match "ADDQ" so some care is still required
in this regard, but at least a first class of possible errors
is taken care of.

Change-Id: I7deb04c31de30bedac9c026d9889ace4a1d2adcb
Reviewed-on: https://go-review.googlesource.com/97817
Reviewed-by: Giovanni Bajo <rasky@develer.com>
Reviewed-by: Keith Randall <khr@golang.org>
2018-03-01 18:14:54 +00:00
Giovanni Bajo
879a1ff1e4 test: improve asmcheck syntax
asmcheck comments now support a compact form of specifying
multiple checks for each platform, using the following syntax:

   amd64:"SHL\t[$]4","SHR\t[$]4"

Negative checks are also parsed using the following syntax:

   amd64:-"ROR"

though they are still not working.

Moreover, out-of-line comments have been implemented. This
allows to specify asmchecks on comment-only lines, that will
be matched on the first subsequent non-comment non-empty line.

    // amd64:"XOR"
    // arm:"EOR"

    x ^= 1

Change-Id: I110c7462fc6a5c70fd4af0d42f516016ae7f2760
Reviewed-on: https://go-review.googlesource.com/97816
Reviewed-by: Keith Randall <khr@golang.org>
2018-03-01 18:10:48 +00:00
Josh Bleecher Snyder
9372e3f5ef runtime: don't allocate to build strings of length 1
Use staticbytes instead.
Instrumenting make.bash shows approx 0.5%
of all slicebytetostrings have a buffer of length 1.

name                     old time/op  new time/op  delta
SliceByteToString/1-8    14.1ns ± 1%   4.1ns ± 1%  -71.13%  (p=0.000 n=17+20)
SliceByteToString/2-8    15.5ns ± 2%  15.5ns ± 1%     ~     (p=0.061 n=20+18)
SliceByteToString/4-8    14.9ns ± 1%  15.0ns ± 2%   +1.25%  (p=0.000 n=20+20)
SliceByteToString/8-8    17.1ns ± 1%  17.5ns ± 1%   +2.16%  (p=0.000 n=19+19)
SliceByteToString/16-8   23.6ns ± 1%  23.9ns ± 1%   +1.41%  (p=0.000 n=20+18)
SliceByteToString/32-8   26.0ns ± 1%  25.8ns ± 0%   -1.05%  (p=0.000 n=19+16)
SliceByteToString/64-8   30.0ns ± 0%  30.2ns ± 0%   +0.56%  (p=0.000 n=16+18)
SliceByteToString/128-8  38.9ns ± 0%  39.0ns ± 0%   +0.23%  (p=0.019 n=19+15)

Fixes #24172

Change-Id: I3dfa14eefbf9fb4387114e20c9cb40e186abe962
Reviewed-on: https://go-review.googlesource.com/97717
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-03-01 17:38:06 +00:00
Josh Bleecher Snyder
aa9c1a8f80 runtime: fix amd64p32 indexbytes in presence of overflow
When the slice/string length is very large,
probably artifically large as in CL 97523,
adding BX (length) to R11 (pointer) overflows.
As a result, checking DI < R11 yields the wrong result.
Since they will be equal when the loop is done,
just check DI != R11 instead.
Yes, the pointer itself could overflow, but if that happens,
something else has gone pretty wrong; not our concern here.

Fixes #24187

Change-Id: I2f60fc6ccae739345d01bc80528560726ad4f8c6
Reviewed-on: https://go-review.googlesource.com/97802
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-03-01 16:53:33 +00:00
Chad Rosier
77ba071ec6 cmd/compile/internal/ssa: combine consecutive LittleEndian stores on arm64
This optimization mirrors that which is already implemented for AMD64.  The
optimization specifically targets the binary.LittleEndian.PutUint* functions.

encoding/binary results on Amberwing:
name                   old time/op    new time/op    delta
ReadSlice1000Int32s      9.67µs ± 1%    9.64µs ± 1%     ~     (p=0.185 n=9+9)
ReadStruct               5.24µs ± 2%    5.36µs ± 2%   +2.24%  (p=0.002 n=10+8)
ReadInts                 8.69µs ± 5%    8.88µs ± 5%     ~     (p=0.083 n=10+10)
WriteInts                3.90µs ±10%    3.71µs ± 9%     ~     (p=0.077 n=10+10)
WriteSlice1000Int32s     10.9µs ± 1%    10.9µs ± 1%     ~     (p=0.701 n=9+9)
PutUint16                 572ns ±14%     505ns ±11%  -11.75%  (p=0.006 n=9+10)
PutUint32                 550ns ±18%     540ns ±11%     ~     (p=0.692 n=10+10)
PutUint64                 565ns ±15%     540ns ±17%     ~     (p=0.248 n=10+10)
LittleEndianPutUint16     540ns ±11%     500ns ±10%     ~     (p=0.094 n=10+10)
LittleEndianPutUint32     520ns ±15%     480ns ±15%     ~     (p=0.087 n=10+10)
LittleEndianPutUint64     505ns ±29%     470ns ±17%     ~     (p=0.208 n=10+10)
PutUvarint32              700ns ±21%     635ns ±10%   -9.29%  (p=0.028 n=10+10)
PutUvarint64              740ns ± 8%     740ns ± 8%     ~     (p=0.713 n=10+10)
[Geo mean]               1.53µs         1.47µs        -3.93%

name                   old speed      new speed      delta
ReadSlice1000Int32s     414MB/s ± 1%   415MB/s ± 1%     ~     (p=0.185 n=9+9)
ReadStruct             14.3MB/s ± 2%  14.0MB/s ± 2%   -2.21%  (p=0.000 n=10+8)
ReadInts               3.45MB/s ± 4%  3.38MB/s ± 6%     ~     (p=0.085 n=10+10)
WriteInts              7.71MB/s ± 9%  8.09MB/s ± 8%   +4.93%  (p=0.048 n=10+10)
WriteSlice1000Int32s    367MB/s ± 1%   366MB/s ± 1%     ~     (p=0.701 n=9+9)
PutUint16              3.51MB/s ±14%  3.99MB/s ±11%  +13.47%  (p=0.009 n=9+10)
PutUint32              7.35MB/s ±21%  7.44MB/s ±10%     ~     (p=0.692 n=10+10)
PutUint64              14.3MB/s ±14%  15.0MB/s ±19%     ~     (p=0.248 n=10+10)
LittleEndianPutUint16  3.72MB/s ±11%  4.03MB/s ±10%     ~     (p=0.094 n=10+10)
LittleEndianPutUint32  7.75MB/s ±15%  8.39MB/s ±13%     ~     (p=0.087 n=10+10)
LittleEndianPutUint64  16.1MB/s ±23%  17.2MB/s ±16%     ~     (p=0.208 n=10+10)
PutUvarint32           5.76MB/s ±18%  6.32MB/s ±10%   +9.72%  (p=0.028 n=10+10)
PutUvarint64           10.8MB/s ± 8%  10.8MB/s ± 8%     ~     (p=0.713 n=10+10)
[Geo mean]             13.7MB/s       14.3MB/s        +4.02%

go1 results on Amberwing:
name                   old time/op    new time/op    delta
RegexpMatchEasy0_32       249ns ± 0%     249ns ± 0%    ~     (p=0.087 n=10+10)
RegexpMatchEasy0_1K       584ns ± 0%     584ns ± 0%    ~     (all equal)
RegexpMatchEasy1_32       246ns ± 0%     246ns ± 0%    ~     (p=1.000 n=10+10)
RegexpMatchEasy1_1K       806ns ± 0%     806ns ± 0%    ~     (p=0.706 n=10+9)
RegexpMatchMedium_32      314ns ± 0%     314ns ± 0%    ~     (all equal)
RegexpMatchMedium_1K     52.1µs ± 0%    52.1µs ± 0%    ~     (p=0.245 n=10+8)
RegexpMatchHard_32       2.75µs ± 1%    2.75µs ± 1%    ~     (p=0.690 n=10+10)
RegexpMatchHard_1K       78.9µs ± 0%    78.9µs ± 1%    ~     (p=0.295 n=9+9)
FmtFprintfEmpty          58.5ns ± 0%    58.5ns ± 0%    ~     (all equal)
FmtFprintfString          112ns ± 0%     112ns ± 0%    ~     (all equal)
FmtFprintfInt             117ns ± 0%     116ns ± 0%  -0.85%  (p=0.000 n=10+10)
FmtFprintfIntInt          181ns ± 0%     181ns ± 0%    ~     (all equal)
FmtFprintfPrefixedInt     222ns ± 0%     224ns ± 0%  +0.90%  (p=0.000 n=9+10)
FmtFprintfFloat           318ns ± 1%     322ns ± 0%    ~     (p=0.059 n=10+8)
FmtManyArgs               736ns ± 1%     735ns ± 0%    ~     (p=0.206 n=9+9)
Gzip                      437ms ± 0%     436ms ± 0%  -0.25%  (p=0.000 n=10+10)
HTTPClientServer         89.8µs ± 1%    90.2µs ± 2%    ~     (p=0.393 n=10+10)
JSONEncode               20.1ms ± 1%    20.2ms ± 1%    ~     (p=0.065 n=9+10)
JSONDecode               94.2ms ± 1%    93.9ms ± 1%  -0.42%  (p=0.043 n=10+10)
GobDecode                12.7ms ± 1%    12.8ms ± 2%  +0.94%  (p=0.019 n=10+10)
GobEncode                12.1ms ± 0%    12.1ms ± 0%    ~     (p=0.052 n=10+10)
Mandelbrot200            5.06ms ± 0%    5.05ms ± 0%  -0.04%  (p=0.000 n=9+10)
TimeParse                 450ns ± 3%     446ns ± 0%    ~     (p=0.238 n=10+9)
TimeFormat                485ns ± 1%     483ns ± 1%    ~     (p=0.073 n=10+10)
Template                 90.4ms ± 0%    90.7ms ± 0%  +0.29%  (p=0.000 n=8+10)
GoParse                  6.01ms ± 0%    6.03ms ± 0%  +0.35%  (p=0.000 n=10+10)
BinaryTree17              11.7s ± 0%     11.7s ± 0%    ~     (p=0.481 n=10+10)
Revcomp                   669ms ± 0%     669ms ± 0%    ~     (p=0.315 n=10+10)
Fannkuch11                3.40s ± 0%     3.37s ± 0%  -0.92%  (p=0.000 n=10+10)
[Geo mean]               67.9µs         67.9µs       +0.02%

name                   old speed      new speed      delta
RegexpMatchEasy0_32     128MB/s ± 0%   128MB/s ± 0%  -0.08%  (p=0.003 n=8+10)
RegexpMatchEasy0_1K    1.75GB/s ± 0%  1.75GB/s ± 0%    ~     (p=0.642 n=8+10)
RegexpMatchEasy1_32     130MB/s ± 0%   130MB/s ± 0%    ~     (p=0.690 n=10+9)
RegexpMatchEasy1_1K    1.27GB/s ± 0%  1.27GB/s ± 0%    ~     (p=0.661 n=10+9)
RegexpMatchMedium_32   3.18MB/s ± 0%  3.18MB/s ± 0%    ~     (all equal)
RegexpMatchMedium_1K   19.7MB/s ± 0%  19.6MB/s ± 0%    ~     (p=0.190 n=10+9)
RegexpMatchHard_32     11.6MB/s ± 0%  11.6MB/s ± 1%    ~     (p=0.669 n=10+10)
RegexpMatchHard_1K     13.0MB/s ± 0%  13.0MB/s ± 0%    ~     (p=0.718 n=9+9)
Gzip                   44.4MB/s ± 0%  44.5MB/s ± 0%  +0.24%  (p=0.000 n=10+10)
JSONEncode             96.5MB/s ± 1%  96.1MB/s ± 1%    ~     (p=0.065 n=9+10)
JSONDecode             20.6MB/s ± 1%  20.7MB/s ± 1%  +0.42%  (p=0.041 n=10+10)
GobDecode              60.6MB/s ± 1%  60.0MB/s ± 2%  -0.92%  (p=0.016 n=10+10)
GobEncode              63.4MB/s ± 0%  63.6MB/s ± 0%    ~     (p=0.055 n=10+10)
Template               21.5MB/s ± 0%  21.4MB/s ± 0%  -0.30%  (p=0.000 n=9+10)
GoParse                9.64MB/s ± 0%  9.61MB/s ± 0%  -0.36%  (p=0.000 n=10+10)
Revcomp                 380MB/s ± 0%   380MB/s ± 0%    ~     (p=0.323 n=10+10)
[Geo mean]             56.0MB/s       55.9MB/s       -0.07%

Change-Id: I79a4978d42d01a5f72ed5ceec07f5e78ac6b3859
Reviewed-on: https://go-review.googlesource.com/97175
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-03-01 16:40:19 +00:00
Wei Xiao
562346b7d0 bytes: add asm version of Index for short strings on arm64
Currently we have special case for 1-byte strings,
this extends it to strings shorter than 9 bytes on arm64.

Benchmark results:
name                              old time/op    new time/op    delta
IndexByte/10-32                     18.6ns ± 0%    18.1ns ± 0%    -2.69%  (p=0.008 n=5+5)
IndexByte/32-32                     16.8ns ± 1%    16.9ns ± 1%      ~     (p=0.762 n=5+5)
IndexByte/4K-32                      464ns ± 0%     464ns ± 0%      ~     (all equal)
IndexByte/4M-32                      528µs ± 1%     506µs ± 1%    -4.17%  (p=0.008 n=5+5)
IndexByte/64M-32                    18.7ms ± 0%    18.7ms ± 1%      ~     (p=0.730 n=4+5)
IndexBytePortable/10-32             33.8ns ± 0%    34.9ns ± 3%      ~     (p=0.167 n=5+5)
IndexBytePortable/32-32             65.3ns ± 0%    66.1ns ± 2%      ~     (p=0.444 n=5+5)
IndexBytePortable/4K-32             5.88µs ± 0%    5.88µs ± 0%      ~     (p=0.325 n=5+5)
IndexBytePortable/4M-32             6.03ms ± 0%    6.03ms ± 0%      ~     (p=1.000 n=5+5)
IndexBytePortable/64M-32            98.8ms ± 0%    98.9ms ± 0%    +0.10%  (p=0.008 n=5+5)
IndexRune/10-32                     57.7ns ± 0%    49.2ns ± 0%   -14.73%  (p=0.000 n=5+4)
IndexRune/32-32                     57.7ns ± 0%    58.6ns ± 0%    +1.56%  (p=0.008 n=5+5)
IndexRune/4K-32                      511ns ± 0%     513ns ± 0%    +0.39%  (p=0.008 n=5+5)
IndexRune/4M-32                      527µs ± 1%     527µs ± 1%      ~     (p=0.690 n=5+5)
IndexRune/64M-32                    18.7ms ± 0%    18.7ms ± 1%      ~     (p=0.190 n=4+5)
IndexRuneASCII/10-32                23.8ns ± 0%    23.8ns ± 0%      ~     (all equal)
IndexRuneASCII/32-32                24.3ns ± 0%    24.3ns ± 0%      ~     (all equal)
IndexRuneASCII/4K-32                 468ns ± 0%     468ns ± 0%      ~     (all equal)
IndexRuneASCII/4M-32                 521µs ± 1%     531µs ± 2%    +1.91%  (p=0.016 n=5+5)
IndexRuneASCII/64M-32               18.6ms ± 1%    18.5ms ± 0%      ~     (p=0.730 n=5+4)
Index/10-32                         89.1ns ±13%    25.2ns ± 0%   -71.72%  (p=0.008 n=5+5)
Index/32-32                          225ns ± 2%     226ns ± 3%      ~     (p=0.683 n=5+5)
Index/4K-32                         11.9µs ± 0%    11.8µs ± 0%    -0.22%  (p=0.008 n=5+5)
Index/4M-32                         12.1ms ± 0%    12.1ms ± 0%      ~     (p=0.548 n=5+5)
Index/64M-32                         197ms ± 0%     197ms ± 0%      ~     (p=0.690 n=5+5)
IndexEasy/10-32                     46.2ns ± 0%    22.1ns ± 8%   -52.16%  (p=0.008 n=5+5)
IndexEasy/32-32                     46.2ns ± 0%    47.2ns ± 0%    +2.16%  (p=0.008 n=5+5)
IndexEasy/4K-32                      499ns ± 0%     502ns ± 0%    +0.44%  (p=0.008 n=5+5)
IndexEasy/4M-32                      529µs ± 2%     529µs ± 1%      ~     (p=0.841 n=5+5)
IndexEasy/64M-32                    18.6ms ± 1%    18.7ms ± 1%      ~     (p=0.222 n=5+5)
IndexAnyASCII/1:1-32                15.7ns ± 0%    15.7ns ± 0%      ~     (all equal)
IndexAnyASCII/1:2-32                17.2ns ± 0%    17.2ns ± 0%      ~     (all equal)
IndexAnyASCII/1:4-32                20.0ns ± 0%    20.0ns ± 0%      ~     (all equal)
IndexAnyASCII/1:8-32                34.8ns ± 0%    34.8ns ± 0%      ~     (all equal)
IndexAnyASCII/1:16-32               48.1ns ± 0%    48.1ns ± 0%      ~     (all equal)
IndexAnyASCII/16:1-32               97.9ns ± 1%    97.7ns ± 0%      ~     (p=0.857 n=5+5)
IndexAnyASCII/16:2-32                102ns ± 0%     102ns ± 0%      ~     (all equal)
IndexAnyASCII/16:4-32                116ns ± 1%     116ns ± 1%      ~     (p=1.000 n=5+5)
IndexAnyASCII/16:8-32                141ns ± 1%     141ns ± 0%      ~     (p=0.571 n=5+4)
IndexAnyASCII/16:16-32               178ns ± 0%     178ns ± 0%      ~     (all equal)
IndexAnyASCII/256:1-32              1.09µs ± 0%    1.09µs ± 0%      ~     (all equal)
IndexAnyASCII/256:2-32              1.09µs ± 0%    1.10µs ± 0%    +0.27%  (p=0.008 n=5+5)
IndexAnyASCII/256:4-32              1.11µs ± 0%    1.11µs ± 0%      ~     (p=0.397 n=5+5)
IndexAnyASCII/256:8-32              1.10µs ± 0%    1.10µs ± 0%      ~     (p=0.444 n=5+5)
IndexAnyASCII/256:16-32             1.14µs ± 0%    1.14µs ± 0%      ~     (all equal)
IndexAnyASCII/4096:1-32             16.5µs ± 0%    16.5µs ± 0%      ~     (p=1.000 n=5+5)
IndexAnyASCII/4096:2-32             17.0µs ± 0%    17.0µs ± 0%      ~     (p=0.159 n=5+4)
IndexAnyASCII/4096:4-32             17.1µs ± 0%    17.1µs ± 0%      ~     (p=0.921 n=4+5)
IndexAnyASCII/4096:8-32             16.5µs ± 0%    16.5µs ± 0%      ~     (p=0.460 n=5+5)
IndexAnyASCII/4096:16-32            16.5µs ± 0%    16.5µs ± 0%      ~     (p=0.794 n=5+4)
IndexPeriodic/IndexPeriodic2-32      189µs ± 0%     189µs ± 0%      ~     (p=0.841 n=5+5)
IndexPeriodic/IndexPeriodic4-32      189µs ± 0%     189µs ± 0%    -0.03%  (p=0.016 n=5+4)
IndexPeriodic/IndexPeriodic8-32      189µs ± 0%     189µs ± 0%      ~     (p=0.651 n=5+5)
IndexPeriodic/IndexPeriodic16-32     175µs ± 9%     174µs ± 7%      ~     (p=1.000 n=5+5)
IndexPeriodic/IndexPeriodic32-32    75.1µs ± 0%    75.1µs ± 0%      ~     (p=0.690 n=5+5)
IndexPeriodic/IndexPeriodic64-32    42.6µs ± 0%    44.7µs ± 0%    +4.98%  (p=0.008 n=5+5)

name                              old speed      new speed      delta
IndexByte/10-32                    538MB/s ± 0%   552MB/s ± 0%    +2.65%  (p=0.008 n=5+5)
IndexByte/32-32                   1.90GB/s ± 1%  1.90GB/s ± 1%      ~     (p=0.548 n=5+5)
IndexByte/4K-32                   8.82GB/s ± 0%  8.81GB/s ± 0%      ~     (p=0.548 n=5+5)
IndexByte/4M-32                   7.95GB/s ± 1%  8.29GB/s ± 1%    +4.35%  (p=0.008 n=5+5)
IndexByte/64M-32                  3.58GB/s ± 0%  3.60GB/s ± 1%      ~     (p=0.730 n=4+5)
IndexBytePortable/10-32            296MB/s ± 0%   286MB/s ± 3%      ~     (p=0.381 n=4+5)
IndexBytePortable/32-32            490MB/s ± 0%   485MB/s ± 2%      ~     (p=0.286 n=5+5)
IndexBytePortable/4K-32            697MB/s ± 0%   697MB/s ± 0%      ~     (p=0.413 n=5+5)
IndexBytePortable/4M-32            696MB/s ± 0%   695MB/s ± 0%      ~     (p=0.897 n=5+5)
IndexBytePortable/64M-32           679MB/s ± 0%   678MB/s ± 0%    -0.10%  (p=0.008 n=5+5)
IndexRune/10-32                    173MB/s ± 0%   203MB/s ± 0%   +17.24%  (p=0.016 n=5+4)
IndexRune/32-32                    555MB/s ± 0%   546MB/s ± 0%    -1.62%  (p=0.008 n=5+5)
IndexRune/4K-32                   8.01GB/s ± 0%  7.98GB/s ± 0%    -0.38%  (p=0.008 n=5+5)
IndexRune/4M-32                   7.97GB/s ± 1%  7.95GB/s ± 1%      ~     (p=0.690 n=5+5)
IndexRune/64M-32                  3.59GB/s ± 0%  3.58GB/s ± 1%      ~     (p=0.190 n=4+5)
IndexRuneASCII/10-32               420MB/s ± 0%   420MB/s ± 0%      ~     (p=0.190 n=5+4)
IndexRuneASCII/32-32              1.32GB/s ± 0%  1.32GB/s ± 0%      ~     (p=0.333 n=5+5)
IndexRuneASCII/4K-32              8.75GB/s ± 0%  8.75GB/s ± 0%      ~     (p=0.690 n=5+5)
IndexRuneASCII/4M-32              8.04GB/s ± 1%  7.89GB/s ± 2%    -1.87%  (p=0.016 n=5+5)
IndexRuneASCII/64M-32             3.61GB/s ± 1%  3.62GB/s ± 0%      ~     (p=0.730 n=5+4)
Index/10-32                        113MB/s ±14%   397MB/s ± 0%  +249.76%  (p=0.008 n=5+5)
Index/32-32                        142MB/s ± 2%   141MB/s ± 3%      ~     (p=0.794 n=5+5)
Index/4K-32                        345MB/s ± 0%   346MB/s ± 0%    +0.22%  (p=0.008 n=5+5)
Index/4M-32                        345MB/s ± 0%   345MB/s ± 0%      ~     (p=0.619 n=5+5)
Index/64M-32                       341MB/s ± 0%   341MB/s ± 0%      ~     (p=0.595 n=5+5)
IndexEasy/10-32                    216MB/s ± 0%   453MB/s ± 8%  +109.60%  (p=0.008 n=5+5)
IndexEasy/32-32                    692MB/s ± 0%   678MB/s ± 0%    -2.01%  (p=0.008 n=5+5)
IndexEasy/4K-32                   8.19GB/s ± 0%  8.16GB/s ± 0%    -0.45%  (p=0.008 n=5+5)
IndexEasy/4M-32                   7.93GB/s ± 2%  7.93GB/s ± 1%      ~     (p=0.841 n=5+5)
IndexEasy/64M-32                  3.60GB/s ± 1%  3.59GB/s ± 1%      ~     (p=0.222 n=5+5)

Change-Id: I4ca69378a2df6f9ba748c6a2706953ee1bd07343
Reviewed-on: https://go-review.googlesource.com/96555
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-03-01 15:24:33 +00:00
Marcel van Lohuizen
4c1aff87f1 testing: gracefully handle subtest failing parent’s T
Don’t panic if a subtest inadvertently calls FailNow
on a parent’s T.  Instead, report the offending subtest
while still reporting the error with the ancestor test and
keep exiting goroutines.

Note that this implementation has a race if parallel
subtests are failing the parent concurrently.
This is fine:
Calling FailNow on a parent is considered an error
in principle, at the moment, and is reported if it is
detected. Having the race allows the race detector
to detect the error as well.

Fixes #22882

Change-Id: Ifa6d5e55bb88f6bcbb562fc8c99f1f77e320015a
Reviewed-on: https://go-review.googlesource.com/97635
Run-TryBot: Marcel van Lohuizen <mpvl@golang.org>
Reviewed-by: Kunpei Sakai <namusyaka@gmail.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-03-01 10:17:22 +00:00
Giovanni Bajo
c9438cb198 test: add support for code generation tests (asmcheck)
The top-level test harness is modified to support a new kind
of test: "asmcheck". This is meant to replace asm_test.go
as an easier and more readable way to test code generation.

I've added a couple of codegen tests to get initial feedback
on the syntax. I've created them under a common "codegen"
subdirectory, so that it's easier to run them all with
"go run run.go -v codegen".

The asmcheck syntax allows to insert line comments that
can specify a regular expression to match in the assembly code,
for multiple architectures (the testsuite will automatically
build each testfile multiple times, one per mentioned architecture).

Negative matches are unsupported for now, so this cannot fully
replace asm_test yet.

Change-Id: Ifdbba389f01d55e63e73c99e5f5449e642101d55
Reviewed-on: https://go-review.googlesource.com/97355
Run-TryBot: Giovanni Bajo <rasky@develer.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Alberto Donizetti <alb.donizetti@gmail.com>
2018-03-01 07:59:54 +00:00
Tobias Klauser
c7c01efd96 runtime: clean up libc_* definitions on Solaris
All functions defined in syscall2_solaris.go have the respective libc_*
var in syscall_solaris.go, except for libc_close. Move it from
os3_solaris.go

Remove unused libc_fstat.

Order go:cgo_import_dynamic and go:linkname lists in
syscall2_solaris.go alphabetically.

Change-Id: I9f12fa473cf1ae351448ac45597c82a67d799c31
Reviewed-on: https://go-review.googlesource.com/97736
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-03-01 07:31:53 +00:00
Joe Tsai
4338518da8 encoding/json: avoid assuming side-effect free reflect.Value.Addr().Elem()
Consider the following:
	type child struct{ Field string }
	type parent struct{ child }

	p := new(parent)
	v := reflect.ValueOf(p).Elem().Field(0)
	v.Field(0).SetString("hello")           // v.Field = "hello"
	v = v.Addr().Elem()                     // v = *(&v)
	v.Field(0).SetString("goodbye")         // v.Field = "goodbye"

It would appear that v.Addr().Elem() should have the same value, and
that it would be safe to set "goodbye".
However, after CL 66331, any interspersed calls between Field calls
causes the RO flag to be set.
Thus, setting to "goodbye" actually causes a panic.

That CL affects decodeState.indirect which assumes that back-to-back
Value.Addr().Elem() is side-effect free. We fix that logic to keep
track of the Addr() and Elem() calls and set v back to the original
after a full round-trip has occured.

Fixes #24152
Updates #24153

Change-Id: Ie50f8fe963f00cef8515d89d1d5cbc43b76d9f9c
Reviewed-on: https://go-review.googlesource.com/97796
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-03-01 00:16:20 +00:00
erifan01
8c3c8332cd cmd/asm: enable several arm64 load & store instructions
Instructions LDARB, LDARH, LDAXPW, LDAXP, STLRB, STLRH, STLXP, STLXPW, STXP,
STXPW have been added before, but they are not enabled. This CL enabled them.

Change the form of LDXP and LDXPW to the form of LDP, and fix a bug of STLXP.

Change-Id: I5d2b51494b92451bf6b072c65cfdd8acf07e9b54
Reviewed-on: https://go-review.googlesource.com/96215
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-02-28 23:46:21 +00:00
Ben Shi
1057624985 cmd/compile: optimize ARM64 code with EON/ORN
EON and ORN are efficient ARM64 instructions. EON combines (x ^ ^y)
into a single operation, and so ORN does for (x | ^y).

This CL implements that optimization. And here are benchmark results
with RaspberryPi3/ArchLinux.

1. A specific test gets about 13% improvement.
EONORN                      181µs ± 0%     157µs ± 0%  -13.26%  (p=0.000 n=26+23)
(https://github.com/benshi001/ugo1/blob/master/eonorn_test.go)

2. There is little change in the go1 benchmark, excluding noise.
name                     old time/op    new time/op    delta
BinaryTree17-4              44.1s ± 2%     44.0s ± 2%    ~     (p=0.513 n=30+30)
Fannkuch11-4                32.9s ± 3%     32.8s ± 3%  -0.12%  (p=0.024 n=30+30)
FmtFprintfEmpty-4           561ns ± 9%     558ns ± 9%    ~     (p=0.654 n=30+30)
FmtFprintfString-4         1.09µs ± 4%    1.09µs ± 3%    ~     (p=0.158 n=30+30)
FmtFprintfInt-4            1.12µs ± 0%    1.12µs ± 0%    ~     (p=0.917 n=23+28)
FmtFprintfIntInt-4         1.73µs ± 0%    1.76µs ± 4%    ~     (p=0.665 n=23+30)
FmtFprintfPrefixedInt-4    2.15µs ± 1%    2.15µs ± 0%    ~     (p=0.389 n=27+26)
FmtFprintfFloat-4          3.18µs ± 4%    3.13µs ± 0%  -1.50%  (p=0.003 n=30+23)
FmtManyArgs-4              7.32µs ± 4%    7.21µs ± 0%    ~     (p=0.220 n=30+25)
GobDecode-4                99.1ms ± 9%    97.0ms ± 0%  -2.07%  (p=0.000 n=30+23)
GobEncode-4                83.3ms ± 3%    82.4ms ± 4%    ~     (p=0.321 n=30+30)
Gzip-4                      4.39s ± 4%     4.32s ± 2%  -1.42%  (p=0.017 n=30+23)
Gunzip-4                    440ms ± 0%     447ms ± 4%  +1.54%  (p=0.006 n=24+30)
HTTPClientServer-4          547µs ± 1%     537µs ± 1%  -1.91%  (p=0.000 n=30+30)
JSONEncode-4                211ms ± 0%     211ms ± 0%  +0.04%  (p=0.000 n=23+24)
JSONDecode-4                847ms ± 0%     847ms ± 0%    ~     (p=0.158 n=25+25)
Mandelbrot200-4            46.5ms ± 0%    46.5ms ± 0%  -0.04%  (p=0.000 n=25+24)
GoParse-4                  43.4ms ± 0%    43.4ms ± 0%    ~     (p=0.494 n=24+25)
RegexpMatchEasy0_32-4      1.03µs ± 0%    1.03µs ± 0%    ~     (all equal)
RegexpMatchEasy0_1K-4      4.02µs ± 3%    3.98µs ± 0%  -0.95%  (p=0.003 n=30+24)
RegexpMatchEasy1_32-4      1.01µs ± 3%    1.01µs ± 2%    ~     (p=0.629 n=30+30)
RegexpMatchEasy1_1K-4      6.39µs ± 0%    6.39µs ± 0%    ~     (p=0.564 n=24+23)
RegexpMatchMedium_32-4     1.80µs ± 3%    1.78µs ± 0%    ~     (p=0.155 n=30+24)
RegexpMatchMedium_1K-4      555µs ± 0%     563µs ± 3%  +1.55%  (p=0.004 n=27+30)
RegexpMatchHard_32-4       31.0µs ± 4%    30.5µs ± 1%  -1.58%  (p=0.000 n=30+23)
RegexpMatchHard_1K-4        947µs ± 4%     931µs ± 0%  -1.66%  (p=0.009 n=30+24)
Revcomp-4                   7.71s ± 4%     7.71s ± 4%    ~     (p=0.196 n=29+30)
Template-4                  877ms ± 0%     878ms ± 0%  +0.16%  (p=0.018 n=23+27)
TimeParse-4                4.75µs ± 1%    4.74µs ± 0%    ~     (p=0.895 n=24+23)
TimeFormat-4               4.83µs ± 4%    4.83µs ± 4%    ~     (p=0.767 n=30+30)
[Geo mean]                  709µs          707µs       -0.35%

name                     old speed      new speed      delta
GobDecode-4              7.75MB/s ± 8%  7.91MB/s ± 0%  +2.03%  (p=0.001 n=30+23)
GobEncode-4              9.22MB/s ± 3%  9.32MB/s ± 4%    ~     (p=0.389 n=30+30)
Gzip-4                   4.43MB/s ± 4%  4.43MB/s ± 4%    ~     (p=0.888 n=30+30)
Gunzip-4                 44.1MB/s ± 0%  43.4MB/s ± 4%  -1.46%  (p=0.009 n=24+30)
JSONEncode-4             9.18MB/s ± 0%  9.18MB/s ± 0%    ~     (p=0.308 n=16+24)
JSONDecode-4             2.29MB/s ± 0%  2.29MB/s ± 0%    ~     (all equal)
GoParse-4                1.33MB/s ± 0%  1.33MB/s ± 0%    ~     (all equal)
RegexpMatchEasy0_32-4    30.9MB/s ± 0%  30.9MB/s ± 0%    ~     (p=1.000 n=23+24)
RegexpMatchEasy0_1K-4     255MB/s ± 3%   257MB/s ± 0%  +0.92%  (p=0.004 n=30+24)
RegexpMatchEasy1_32-4    31.7MB/s ± 3%  31.6MB/s ± 2%    ~     (p=0.603 n=30+30)
RegexpMatchEasy1_1K-4     160MB/s ± 0%   160MB/s ± 0%    ~     (p=0.435 n=24+23)
RegexpMatchMedium_32-4    554kB/s ± 3%   560kB/s ± 0%  +1.08%  (p=0.004 n=30+24)
RegexpMatchMedium_1K-4   1.85MB/s ± 0%  1.82MB/s ± 3%  -1.48%  (p=0.001 n=27+30)
RegexpMatchHard_32-4     1.03MB/s ± 4%  1.05MB/s ± 1%  +1.51%  (p=0.027 n=30+23)
RegexpMatchHard_1K-4     1.08MB/s ± 4%  1.10MB/s ± 0%  +1.69%  (p=0.002 n=30+25)
Revcomp-4                33.0MB/s ± 4%  33.0MB/s ± 4%    ~     (p=0.272 n=29+30)
Template-4               2.21MB/s ± 0%  2.21MB/s ± 0%    ~     (all equal)
[Geo mean]               7.75MB/s       7.77MB/s       +0.29%

3. There is little regression in the compilecmp benchmark.
name        old time/op       new time/op       delta
Template          2.28s ± 3%        2.28s ± 4%    ~     (p=0.739 n=10+10)
Unicode           1.34s ± 4%        1.32s ± 3%    ~     (p=0.113 n=10+9)
GoTypes           8.10s ± 3%        8.18s ± 3%    ~     (p=0.393 n=10+10)
Compiler          39.0s ± 3%        39.2s ± 3%    ~     (p=0.393 n=10+10)
SSA                114s ± 3%         115s ± 2%    ~     (p=0.631 n=10+10)
Flate             1.41s ± 2%        1.42s ± 3%    ~     (p=0.353 n=10+10)
GoParser          1.81s ± 1%        1.83s ± 2%    ~     (p=0.211 n=10+9)
Reflect           5.06s ± 2%        5.06s ± 2%    ~     (p=0.912 n=10+10)
Tar               2.19s ± 3%        2.20s ± 3%    ~     (p=0.247 n=10+10)
XML               2.65s ± 2%        2.67s ± 5%    ~     (p=0.796 n=10+10)
[Geo mean]        4.92s             4.93s       +0.27%

name        old user-time/op  new user-time/op  delta
Template          2.81s ± 2%        2.81s ± 3%    ~     (p=0.971 n=10+10)
Unicode           1.70s ± 3%        1.67s ± 5%    ~     (p=0.315 n=10+10)
GoTypes           9.71s ± 1%        9.78s ± 1%  +0.71%  (p=0.023 n=10+10)
Compiler          47.3s ± 1%        47.1s ± 3%    ~     (p=0.579 n=10+10)
SSA                143s ± 2%         143s ± 2%    ~     (p=0.280 n=10+10)
Flate             1.70s ± 3%        1.71s ± 3%    ~     (p=0.481 n=10+10)
GoParser          2.21s ± 3%        2.21s ± 1%    ~     (p=0.549 n=10+9)
Reflect           5.89s ± 1%        5.87s ± 2%    ~     (p=0.739 n=10+10)
Tar               2.66s ± 2%        2.63s ± 2%    ~     (p=0.105 n=10+10)
XML               3.16s ± 3%        3.18s ± 2%    ~     (p=0.143 n=10+10)
[Geo mean]        5.97s             5.97s       -0.06%

name        old text-bytes    new text-bytes    delta
HelloSize         637kB ± 0%        637kB ± 0%    ~     (all equal)

name        old data-bytes    new data-bytes    delta
HelloSize        9.46kB ± 0%       9.46kB ± 0%    ~     (all equal)

name        old bss-bytes     new bss-bytes     delta
HelloSize         125kB ± 0%        125kB ± 0%    ~     (all equal)

name        old exe-bytes     new exe-bytes     delta
HelloSize        1.24MB ± 0%       1.24MB ± 0%    ~     (all equal)

Change-Id: Ie27357d65c5ce9d07afdffebe1e2daadcaa3369f
Reviewed-on: https://go-review.googlesource.com/97036
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-02-28 23:42:40 +00:00
Balaram Makam
094258408d cmd/compile: improve fractional word zeroing
This change improves fractional word zeroing by
using overlapping MOVDs for the fractions.

Performance of go1 benchmarks on Amberwing was all noise:
name                   old time/op    new time/op    delta
RegexpMatchEasy0_32       247ns ± 0%     246ns ± 0%  -0.40%  (p=0.008 n=5+5)
RegexpMatchEasy0_1K       581ns ± 0%     579ns ± 0%  -0.34%  (p=0.000 n=5+4)
RegexpMatchEasy1_32       244ns ± 0%     242ns ± 0%    ~     (p=0.079 n=4+5)
RegexpMatchEasy1_1K       804ns ± 0%     805ns ± 0%    ~     (p=0.238 n=5+4)
RegexpMatchMedium_32      313ns ± 0%     311ns ± 0%  -0.64%  (p=0.008 n=5+5)
RegexpMatchMedium_1K     52.2µs ± 0%    51.9µs ± 0%  -0.52%  (p=0.016 n=5+4)
RegexpMatchHard_32       2.75µs ± 0%    2.74µs ± 0%    ~     (p=0.603 n=5+5)
RegexpMatchHard_1K       78.8µs ± 0%    78.9µs ± 0%  +0.05%  (p=0.008 n=5+5)
FmtFprintfEmpty          58.6ns ± 0%    58.6ns ± 0%    ~     (p=0.159 n=5+5)
FmtFprintfString          118ns ± 0%     119ns ± 0%  +0.85%  (p=0.008 n=5+5)
FmtFprintfInt             119ns ± 0%     123ns ± 0%  +3.36%  (p=0.016 n=5+4)
FmtFprintfIntInt          192ns ± 0%     200ns ± 0%  +4.17%  (p=0.008 n=5+5)
FmtFprintfPrefixedInt     224ns ± 0%     209ns ± 0%  -6.70%  (p=0.008 n=5+5)
FmtFprintfFloat           335ns ± 0%     335ns ± 0%    ~     (all equal)
FmtManyArgs               775ns ± 0%     811ns ± 1%  +4.67%  (p=0.016 n=4+5)
Gzip                      437ms ± 0%     438ms ± 0%  +0.19%  (p=0.008 n=5+5)
HTTPClientServer         88.7µs ± 1%    90.3µs ± 1%  +1.75%  (p=0.016 n=5+5)
JSONEncode               20.1ms ± 1%    20.1ms ± 0%    ~     (p=1.000 n=5+5)
JSONDecode               94.7ms ± 1%    94.8ms ± 1%    ~     (p=0.548 n=5+5)
GobDecode                12.8ms ± 1%    12.8ms ± 1%    ~     (p=0.548 n=5+5)
GobEncode                12.1ms ± 0%    12.1ms ± 0%    ~     (p=0.151 n=5+5)
Mandelbrot200            5.37ms ± 0%    5.37ms ± 0%  -0.03%  (p=0.008 n=5+5)
TimeParse                 450ns ± 0%     451ns ± 1%    ~     (p=0.635 n=4+5)
TimeFormat                485ns ± 0%     484ns ± 0%    ~     (p=0.508 n=5+5)
Template                 90.4ms ± 0%    90.2ms ± 0%  -0.24%  (p=0.016 n=5+5)
GoParse                  5.98ms ± 0%    5.98ms ± 0%    ~     (p=1.000 n=5+5)
BinaryTree17              11.8s ± 0%     11.8s ± 0%    ~     (p=0.841 n=5+5)
Revcomp                   669ms ± 0%     669ms ± 0%    ~     (p=0.310 n=5+5)
Fannkuch11                3.28s ± 0%     3.34s ± 0%  +1.64%  (p=0.008 n=5+5)

name                   old speed      new speed      delta
RegexpMatchEasy0_32     129MB/s ± 0%   130MB/s ± 0%  +0.30%  (p=0.016 n=4+5)
RegexpMatchEasy0_1K    1.76GB/s ± 0%  1.77GB/s ± 0%  +0.27%  (p=0.016 n=5+4)
RegexpMatchEasy1_32     131MB/s ± 0%   132MB/s ± 0%  +0.71%  (p=0.016 n=4+5)
RegexpMatchEasy1_1K    1.27GB/s ± 0%  1.27GB/s ± 0%  -0.17%  (p=0.016 n=5+4)
RegexpMatchMedium_32   3.19MB/s ± 0%  3.21MB/s ± 0%  +0.63%  (p=0.008 n=5+5)
RegexpMatchMedium_1K   19.6MB/s ± 0%  19.7MB/s ± 0%  +0.52%  (p=0.016 n=5+4)
RegexpMatchHard_32     11.7MB/s ± 0%  11.7MB/s ± 0%    ~     (p=0.643 n=5+5)
RegexpMatchHard_1K     13.0MB/s ± 0%  13.0MB/s ± 0%    ~     (p=0.079 n=4+5)
Gzip                   44.4MB/s ± 0%  44.3MB/s ± 0%  -0.19%  (p=0.008 n=5+5)
JSONEncode             96.3MB/s ± 1%  96.4MB/s ± 0%    ~     (p=1.000 n=5+5)
JSONDecode             20.5MB/s ± 1%  20.5MB/s ± 1%    ~     (p=0.460 n=5+5)
GobDecode              60.1MB/s ± 1%  59.9MB/s ± 1%    ~     (p=0.548 n=5+5)
GobEncode              63.5MB/s ± 0%  63.7MB/s ± 0%    ~     (p=0.135 n=5+5)
Template               21.5MB/s ± 0%  21.5MB/s ± 0%  +0.24%  (p=0.016 n=5+5)
GoParse                9.68MB/s ± 0%  9.69MB/s ± 0%    ~     (p=0.786 n=5+5)
Revcomp                 380MB/s ± 0%   380MB/s ± 0%    ~     (p=0.310 n=5+5)
Change-Id: I596eee6421cdbad1a0189cdb9fe0628bba534eaf
Reviewed-on: https://go-review.googlesource.com/96775
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-02-28 23:28:39 +00:00
Hana Kim
413d8a833d cmd/trace: skip tests if parsing fails with timestamp error
runtime/trace test already skips tests in case of the timestamp
error.

Moreover, relax TestAnalyzeAnnotationGC test condition to
deal with the inaccuracy caused from use of cputicks in tracing.

Fixes #24081
Updates #16755

Change-Id: I708ecc6da202eaec07e431085a75d3dbfbf4cc06
Reviewed-on: https://go-review.googlesource.com/97757
Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com>
Reviewed-by: Heschi Kreinick <heschi@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-02-28 22:09:34 +00:00
Matthew Dempsky
b3f00c6985 cmd/compile: fix unexpected type alias crash
OCOMPLIT stores the pre-typechecked type in n.Right, and then moves it
to n.Type. However, it wasn't clearing n.Right, so n.Right continued
to point to the OTYPE node. (Exception: slice literals reused n.Right
to store the array length.)

When exporting inline function bodies, we don't expect to need to save
any type aliases. Doing so wouldn't be wrong per se, but it's
completely unnecessary and would just bloat the export data.

However, reexportdep (whose role is to identify types needed by inline
function bodies) uses a generic tree traversal mechanism, which visits
n.Right even for O{ARRAY,MAP,STRUCT}LIT nodes. This means it finds the
OTYPE node, and mistakenly interpreted that the type alias needs to be
exported.

The straight forward fix is to just clear n.Right when typechecking
composite literals.

Fixes #24173.

Change-Id: Ia2d556bfdd806c83695b08e18b6cd71eff0772fc
Reviewed-on: https://go-review.googlesource.com/97719
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Robert Griesemer <gri@golang.org>
2018-02-28 20:18:37 +00:00
Daniel Martí
1e308fbc1a cmd/compile: improved error message when calling a shadowed builtin
Otherwise, the error can be confusing if one forgets or doesn't know
that the builtin is being shadowed, which is not common practice.

Fixes #22822.

Change-Id: I735393b5ce28cb83815a1c3f7cd2e7bb5080a32d
Reviewed-on: https://go-review.googlesource.com/97455
Reviewed-by: Robert Griesemer <gri@golang.org>
2018-02-28 19:39:52 +00:00
Adam Langley
4b1d704d14 crypto/x509: parse invalid DNS names and email addresses.
Go 1.10 requires that SANs in certificates are valid. However, a
non-trivial number of (generally non-WebPKI) certificates have invalid
strings in dnsName fields and some have even put those dnsName SANs in
CA certificates.

This change defers validity checking until name constraints are checked.

Fixes #23995, #23711.

Change-Id: I2e0ebb0898c047874a3547226b71e3029333b7f1
Reviewed-on: https://go-review.googlesource.com/96378
Run-TryBot: Adam Langley <agl@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-02-28 19:14:11 +00:00
Robert Griesemer
c1359db9cc go/types: fix empty interface optimization (minor performance bug)
The tests checking for empty interfaces so that they can be fast-
tracked in the code actually didn't test the right field and the
fast track code never executed. Doing it now.

Change-Id: I58b2951efb3fb40b3366874c79fd653591ae0e99
Reviewed-on: https://go-review.googlesource.com/97519
Reviewed-by: Alan Donovan <adonovan@google.com>
2018-02-28 18:22:21 +00:00
Robert Griesemer
e2b5e6038b go/types: fix incorrect context when type-checking interfaces
Regression, introduced by https://go-review.googlesource.com/c/go/+/79575
which meant to be more conservative but ended up destroying an important
context.

Fixes #24140.

Change-Id: Id428dbb295ce9f11ab7cd54ec5ab51ef4291ac3f
Reviewed-on: https://go-review.googlesource.com/97535
Reviewed-by: Alan Donovan <adonovan@google.com>
2018-02-28 18:22:16 +00:00
Josh Bleecher Snyder
91a05b92be cmd/compile: prevent memmove in copy when dst == src
This causes a nominal increase in binary size.

name        old object-bytes  new object-bytes  delta
Template          399kB ± 0%        399kB ± 0%    ~     (all equal)
Unicode           207kB ± 0%        207kB ± 0%    ~     (all equal)
GoTypes          1.23MB ± 0%       1.23MB ± 0%    ~     (all equal)
Compiler         4.35MB ± 0%       4.35MB ± 0%  +0.01%  (p=0.008 n=5+5)
SSA              9.77MB ± 0%       9.77MB ± 0%  +0.00%  (p=0.008 n=5+5)
Flate             236kB ± 0%        236kB ± 0%  +0.04%  (p=0.008 n=5+5)
GoParser          298kB ± 0%        298kB ± 0%    ~     (all equal)
Reflect          1.03MB ± 0%       1.03MB ± 0%  +0.01%  (p=0.008 n=5+5)
Tar               333kB ± 0%        334kB ± 0%  +0.22%  (p=0.008 n=5+5)
XML               414kB ± 0%        414kB ± 0%  +0.02%  (p=0.008 n=5+5)
[Geo mean]        730kB             731kB       +0.03%

Change-Id: I381809fd9cfbfd6db44bd342b06285e62a3a21f1
Reviewed-on: https://go-review.googlesource.com/94596
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Keith Randall <khr@golang.org>
2018-02-28 17:37:22 +00:00
Richard Miller
a379b7d9ac syscall: reduce redundant getwd tracking in Plan 9
In Plan 9, each M is implemented as a separate OS process with
its own working directory.  To keep the wd consistent across
goroutines (or rescheduling of the same goroutine), CL 6350
introduced a Fixwd procedure which checks using getwd and calls
chdir if necessary before any syscall operating on a pathname.

This wd checking will not be necessary if the pathname is absolute
(starts with '/' or '#').  Getwd is a fairly expensive operation
in Plan 9 (implemented by opening "." and calling Fd2path on the
file descriptor).  Eliminating the redundant getwd calls can
significantly reduce overhead for common operations like
"dist test --list" which perform many syscalls on absolute pathnames.

Updates #9428.

Change-Id: I13fd9380779de27b0ac2f2b488229778d6839255
Reviewed-on: https://go-review.googlesource.com/97675
Reviewed-by: David du Colombier <0intro@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: David du Colombier <0intro@gmail.com>
2018-02-28 16:26:49 +00:00
Richard Miller
c2cdfbd1a7 runtime: don't try to shrink address space with brk in Plan 9
Plan 9 won't let brk shrink the data segment if it's shared with
other processes (which it is in the go runtime).  So we keep track
of the notional end of the segment as it moves up and down, and
call brk only when it grows.

Corrects CL 94776.

Updates #23860.
Fixes #24013.

Change-Id: I754232decab81dfd71d690f77ee6097a17d9be11
Reviewed-on: https://go-review.googlesource.com/97595
Reviewed-by: David du Colombier <0intro@gmail.com>
Reviewed-by: Austin Clements <austin@google.com>
Run-TryBot: David du Colombier <0intro@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-02-28 15:57:10 +00:00
Rob Pike
1be58dcda8 doc/faq: add a Q&A about virus scanners
Fixes #23759.

Change-Id: I0407ebfea507991fc205f7b04bc5798808a5c5f6
Reviewed-on: https://go-review.googlesource.com/97496
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: David Symonds <dsymonds@golang.org>
2018-02-28 05:55:31 +00:00
Robert Griesemer
0c884d0810 cmd/compile, cmd/compile/internal/syntax: print relative column info
This change enables printing of relative column information if a
prior line directive specified a valid column. If there was no
line directive, or the line directive didn't specify a column
(or the -C flag is specified), no column information is shown in
file positions.

Implementation: Column values (and line values, for that matter)
that are zero are interpreted as "unknown". A line directive that
doesn't specify a column records that as a zero column in the
respective PosBase data structure. When computing relative columns,
a relative value is zero of the base's column value is zero.
When formatting a position, a zero column value is not printed.

To make this work without special cases, the PosBase for a file
is given a concrete (non-0:0) position 1:1 with the PosBase's
line and column also being 1:1. In other words, at the position
1:1 of a file, it's relative positions are starting with 1:1 as
one would expect.

In the package syntax, this requires self-recursive PosBases for
file bases, matching what cmd/internal/src.PosBase was already
doing. In src.PosBase, file and inlining bases also need to be
based at 1:1 to indicate "known" positions.

This change completes the cmd/compiler part of the issue below.

Fixes #22662.

Change-Id: I6c3d2dee26709581fba0d0261b1d12e93f1cba1a
Reviewed-on: https://go-review.googlesource.com/97375
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2018-02-28 03:51:23 +00:00
Hana Kim
b5bd5bfbc7 cmd/trace: fix overlappingDuration
Update #24081

Change-Id: Ieccfb03c51e86f35d4629a42959c80570bd93c33
Reviewed-on: https://go-review.googlesource.com/97555
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-02-28 02:42:15 +00:00
Heschi Kreinick
f8973fcafb cmd/link: revert CL 89535: "fix up location lists for dsymutil"
This reverts commit 230b0bad1f.

Reason for revert: breaking the build.

Fixes #24165

Change-Id: I9d8dda59f97a47e5c436f1c061b34ced82bde8ec
Reviewed-on: https://go-review.googlesource.com/97575
Run-TryBot: Heschi Kreinick <heschi@google.com>
Reviewed-by: Robert Griesemer <gri@golang.org>
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-02-28 01:53:43 +00:00
Kunpei Sakai
21343e07d6 cmd/compile: remove duplicates by using finishcompare
Updates #23834

Change-Id: If05001f9fd6b97d72069f440102eec6e371908dd
Reviewed-on: https://go-review.googlesource.com/97016
Run-TryBot: Kunpei Sakai <namusyaka@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2018-02-28 00:50:06 +00:00
Michael Fraenkel
a375a6b363 cmd/compile: convert untyped bool during walkCases
Updates #23834.

Change-Id: I1789525a992d37aae9e9b69c1e9d91437d3d0d3b
Reviewed-on: https://go-review.googlesource.com/97001
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2018-02-27 23:26:36 +00:00
Keith Randall
2413b54888 cmd/compile: mark the first word of an interface as a uintptr
The first word of an interface is a pointer, but for the purposes
of GC we don't need to treat it as such.
 1. If it is a non-empty interface, the pointer points to an itab
    which is always in persistentalloc space.
 2. If it is an empty interface, the pointer points to a _type.
   a. If it is a compile-time-allocated type, it points into
      the read-only data section.
   b. If it is a reflect-allocated type, it points into the Go heap.
      Reflect is responsible for keeping a reference to
      the underlying type so it won't be GCd.

If we ever have a moving GC, we need to change this for 2b (as
well as scan itabs to update their itab._type fields).

Write barriers on the first word of interfaces have already been removed.

Change-Id: I643e91d7ac4de980ac2717436eff94097c65d959
Reviewed-on: https://go-review.googlesource.com/97518
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2018-02-27 22:58:32 +00:00
isharipo
b80b4a23d1 cmd/internal/obj/x86: add missing legacy insts
Minimizes the amount of "TODO" stuff in test suite
of cmd/asm/internal/asm/testdata/amd64enc.s.

Some instructions were already implemented, but
test cases for them were commented-out.

Does not enable MMX instructions, calls/jumps and some
segment registers instructions.

-- Affected instructions --
BLENDVPD, BLENDVPS
BSWAPW
CBW
CDQE
CLAC
CLFLUSHOPT
CMPXCHG16B
CRC32B, CRC32L, CRC32W
CWDE
FBLD
FBSTP
FCMOVB
FCMOVBE
FCMOVE
FCMOVNB
FCMOVNBE
FCMOVU
FCOMI
FCOMIP
IMUL3L, IMUL3Q, IMUL3W
ICEBP, INT
INVPCID
LARQ
LGDT, LIDT, LLDT
LMSW
LTR
LZCNTL, LZCNTQ, LZCNTW
MONITOR
MOVBELL, MOVBEQQ, MOVBEWW
MOVBQZX
MOVQ
MOVSWW, MOVZWW
MWAIT
NOPL, NOPW
PBLENDVB
PEXTRW
RDPKRU
RDRANDL, RDRANDQ, RDRANDW
RDSEEDL, RDSEEDQ, RDSEEDW
RDTSCP
SAHF
SGDT, SIDT
SLDTL, SLDTQ, SLDTW
SMSWL, SMSWQ, SMSWW
STAC
STRL, STRQ, STRW
SYSENTER, SYSENTER64
SYSEXIT, SYSEXIT64
SHA256RNDS2
TZCNTL, TZCNTQ, TZCNTW
UD1, UD2
WRPKRU
XRSTOR, XRSTOR64
XRSTORS, XRSTORS64
XSAVE, XSAVE64
XSAVEC, XSAVEC64
XSAVEOPT, XSAVEOPT64
XSAVES, XSAVES64
XSETBV

Fixes #6739

Change-Id: I8b125d9a5ea39bb4b9da7e66a63a16f609cef376
Reviewed-on: https://go-review.googlesource.com/97235
Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-02-27 21:55:25 +00:00
Daniel Martí
c55505bae2 cmd/vet: type conversions never have side effects
Make the hasSideEffects func use type information to see if a CallExpr
is a type conversion or not. In case it is, there cannot be any side
effects.

Now that vet always has type information, we can afford to use it here.
Update the tests and remove the TODO there too.

Change-Id: I74fdacf830aedf2371e67ba833802c414178caf1
Reviewed-on: https://go-review.googlesource.com/79536
Reviewed-by: Robert Griesemer <gri@golang.org>
2018-02-27 21:48:10 +00:00
Ilya Tocar
c2ccc48165 cmd/compile/internal/ssa: refactor zeroUpper32Bits
Explicitly whitelist args of OpSelect{1|2} that zero upper 32 bits.
Use better values in corresponding test.
This should have been a part of  CL 96815, but it was submitted, before
relevant comments.

Change-Id: Ic85d90a4471a17f6d64f8f5c405f21378bf3a30d
Reviewed-on: https://go-review.googlesource.com/97295
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-02-27 20:38:32 +00:00
Darshan Parajuli
ccaa2bc5c0 fmt: change some unexported method names to camel case
Change-Id: I12f96a9397cfaebe11e616543d804bd62cd72da0
Reviewed-on: https://go-review.googlesource.com/97379
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-02-27 20:12:04 +00:00
Josh Bleecher Snyder
15b0d1376a cmd/compile: clean up comments
Follow-up to CL 94256.

Change-Id: I61c450dee5975492192453738f734f772e95c1a5
Reviewed-on: https://go-review.googlesource.com/97515
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-02-27 20:06:22 +00:00
ChrisALiles
4f5389c321 cmd/compile: move the SSA local type definitions to a single location
Fixes #20304

Change-Id: I52ee02d1602ed7fffc96b27fd60990203c771aaf
Reviewed-on: https://go-review.googlesource.com/94256
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2018-02-27 19:40:36 +00:00
Ilya Tocar
0f2ef0ad44 cmd/compile/internal/ssa: combine byte stores on amd64
On amd64 we optimize  encoding/binary.BigEndian.PutUint{16,32,64}
into bswap + single store, but strangely enough not LittleEndian.PutUint{16,32}.
We have similar rules, but they use 64-bit shifts everywhere,
and fail for 16/32-bit case. Add rules that matchLittleEndian.PutUint,
and relevant tests. Performance results:

LittleEndianPutUint16-6    1.43ns ± 0%    1.07ns ± 0%   -25.17%  (p=0.000 n=9+9)
LittleEndianPutUint32-6    2.14ns ± 0%    0.94ns ± 0%   -56.07%  (p=0.019 n=6+8)

LittleEndianPutUint16-6  1.40GB/s ± 0%  1.87GB/s ± 0%   +33.24%  (p=0.000 n=9+9)
LittleEndianPutUint32-6  1.87GB/s ± 0%  4.26GB/s ± 0%  +128.54%  (p=0.000 n=8+8)

Discovered, while looking at ethereum_ethash from community benchmarks

Change-Id: Id86d5443687ecddd2803edf3203dbdd1246f61fe
Reviewed-on: https://go-review.googlesource.com/95475
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-02-27 19:38:50 +00:00
Matthew Dempsky
d7cd61ceaa cmd/compile: fix inlining of constant if statements
We accidentally overlooked needing to still visit Ninit for OIF
statements with constant conditions in golang.org/cl/96778.

Fixes #24120.

Change-Id: I5b341913065ff90e1163fb872b9e8d47e2a789d2
Reviewed-on: https://go-review.googlesource.com/97475
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-02-27 19:27:32 +00:00
Heschi Kreinick
230b0bad1f cmd/link: fix up location lists for dsymutil
LLVM tools, particularly lldb and dsymutil, don't support base address
selection entries in location lists. When targeting GOOS=darwin,
mode, have the linker translate location lists to CU-relative form
instead.

Technically, this isn't necessary when linking internally, as long as
nobody plans to use anything other than Delve to look at the DWARF. But
someone might want to use lldb, and it's really confusing when dwarfdump
shows gibberish for the location entries. The performance cost isn't
noticeable, so enable it even for internal linking.

Doing this in the linker is a little weird, but it was more expensive in
the compiler, probably because the compiler is much more stressful to
the GC. Also, if we decide to only do it for external linking, the
compiler can't see the link mode.

Benchmark before and after this commit on Mac with -dwarflocationlists=1:

name        old time/op       new time/op       delta
StdCmd            21.3s ± 1%        21.3s ± 1%    ~     (p=0.310 n=27+27)

Only StdCmd is relevant, because only StdCmd runs the linker. Whatever
the cost is here, it's not very large.

Change-Id: I200246dedaee4f824966f7551ac95f8d7123d3b1
Reviewed-on: https://go-review.googlesource.com/89535
Reviewed-by: David Chase <drchase@google.com>
2018-02-27 18:55:23 +00:00
Tobias Klauser
5b21bf6f81 runtime: simplify walltime/nanotime on linux/{386,amd64}
Avoid an unnecessary MOVL/MOVQ.

Follow CL 97377

Change-Id: Ic43976d6b0cece3ed455496d18aedd67e0337d3f
Reviewed-on: https://go-review.googlesource.com/97358
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-02-27 18:42:41 +00:00
Tobias Klauser
2013ad897d os, syscall: use pipe2 instead of pipe syscall on OpenBSD
The pipe2 syscall is part of OpenBSD since version 5.7 and thus exists in
all officially supported versions.

Follows CL 38426 and CL 94035

Change-Id: I8f93ecbc89664241f1b6b0d069e948776941b1d0
Reviewed-on: https://go-review.googlesource.com/97356
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-02-27 18:37:36 +00:00
Philip Hofer
81786649c5 cmd/compile/internal/ssa: clear branch likeliness in clobberBlock
The branchelim pass makes some blocks unreachable, but does not
remove them from Func.Values. Consequently, ssacheck complains
when it finds a block with a non-zero likeliness value but no
successors.

Fixes #24014

Change-Id: I2dcf1d8f4e769a2f363508dab3b11198ead336b6
Reviewed-on: https://go-review.googlesource.com/96075
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Philip Hofer <phofer@umich.edu>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-02-27 18:14:05 +00:00
Josh Bleecher Snyder
c5d6c42d35 runtime: improve 386/amd64 systemstack
Minor improvements, noticed while investigating other things.

Shorten the prologue.

Make branch direction better for static branch prediction;
the most common case by far is switching stacks (g==curg).

Change-Id: Ib2211d3efecb60446355cda56194221ccb78057d
Reviewed-on: https://go-review.googlesource.com/97377
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-02-27 18:10:38 +00:00
Joe Tsai
f399af3114 go/doc: replace unexported values with underscore if necessary
When a var or const declaration contains a mixture of exported and unexported
identifiers, replace the unexported identifiers with underscore.
Otherwise, the LHS and the RHS may mismatch or the declaration may mismatch
with an iota from above.

Fixes #22426

Change-Id: Icd5fb81b4ece647232a9f7d05cb140227091e9cb
Reviewed-on: https://go-review.googlesource.com/94877
Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2018-02-27 16:29:17 +00:00
erifan01
ed6c6c9c11 math: optimize sinh and cosh
Improve performance by reducing unnecessary function calls

Benchmarks:

Tme    old time/op  new time/op  delta
Cosh-8   229ns ± 0%   138ns ± 0%  -39.74%  (p=0.008 n=5+5)
Sinh-8   231ns ± 0%   139ns ± 0%  -39.83%  (p=0.008 n=5+5)

Change-Id: Icab5485849bbfaafca8429d06b67c558101f4f3c
Reviewed-on: https://go-review.googlesource.com/85477
Reviewed-by: Robert Griesemer <gri@golang.org>
2018-02-27 04:34:37 +00:00
Josh Bleecher Snyder
486caa26d7 runtime: short-circuit typedmemmove when dst==src
Change-Id: I855268a4c0d07ad602ec90f5da66422d3d87c5f2
Reviewed-on: https://go-review.googlesource.com/94595
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Keith Randall <khr@golang.org>
2018-02-27 00:56:18 +00:00
Giovanni Bajo
68def82008 cmd/compile: fix bit-test rules for highest bit
Bit-test rules failed to match when matching the highest bit
of a word because operands in SSA are signed int64. Fix
them by treating them as unsigned (and correctly handling
32-bit operands as well).

Tests will be added in next CL.

Change-Id: I491c4e88e7e2f87e9bb72bd0d9fa5d4025b90736
Reviewed-on: https://go-review.googlesource.com/94765
Reviewed-by: Keith Randall <khr@golang.org>
2018-02-27 00:51:40 +00:00
Giovanni Bajo
098208a0d9 cmd/compile: fold bit masking on bits that have been shifted away
Spotted while working on #18943, it triggers once during bootstrap.

Change-Id: Ia4330ccc6395627c233a8eb4dcc0e3e2a770bea7
Reviewed-on: https://go-review.googlesource.com/94764
Reviewed-by: Keith Randall <khr@golang.org>
2018-02-27 00:51:19 +00:00
Chad Rosier
ecd9e8a2fe cmd/compile/internal/ssa: combine zero stores into larger stores on arm64
This reduces the go tool binary on arm64 by 12k.

go1 results on Amberwing:
name                   old time/op    new time/op    delta
RegexpMatchEasy0_32       249ns ± 0%     249ns ± 0%    ~     (p=0.087 n=10+10)
RegexpMatchEasy0_1K       584ns ± 0%     584ns ± 0%    ~     (all equal)
RegexpMatchEasy1_32       246ns ± 0%     246ns ± 0%    ~     (p=1.000 n=10+10)
RegexpMatchEasy1_1K       806ns ± 0%     806ns ± 0%    ~     (p=0.706 n=10+9)
RegexpMatchMedium_32      314ns ± 0%     314ns ± 0%    ~     (all equal)
RegexpMatchMedium_1K     52.1µs ± 0%    52.1µs ± 0%    ~     (p=0.245 n=10+8)
RegexpMatchHard_32       2.75µs ± 1%    2.75µs ± 1%    ~     (p=0.690 n=10+10)
RegexpMatchHard_1K       78.9µs ± 0%    78.9µs ± 1%    ~     (p=0.295 n=9+9)
FmtFprintfEmpty          58.5ns ± 0%    58.5ns ± 0%    ~     (all equal)
FmtFprintfString          112ns ± 0%     112ns ± 0%    ~     (all equal)
FmtFprintfInt             117ns ± 0%     116ns ± 0%  -0.85%  (p=0.000 n=10+10)
FmtFprintfIntInt          181ns ± 0%     181ns ± 0%    ~     (all equal)
FmtFprintfPrefixedInt     222ns ± 0%     224ns ± 0%  +0.90%  (p=0.000 n=9+10)
FmtFprintfFloat           318ns ± 1%     322ns ± 0%    ~     (p=0.059 n=10+8)
FmtManyArgs               736ns ± 1%     735ns ± 0%    ~     (p=0.206 n=9+9)
Gzip                      437ms ± 0%     436ms ± 0%  -0.25%  (p=0.000 n=10+10)
HTTPClientServer         89.8µs ± 1%    90.2µs ± 2%    ~     (p=0.393 n=10+10)
JSONEncode               20.1ms ± 1%    20.2ms ± 1%    ~     (p=0.065 n=9+10)
JSONDecode               94.2ms ± 1%    93.9ms ± 1%  -0.42%  (p=0.043 n=10+10)
GobDecode                12.7ms ± 1%    12.8ms ± 2%  +0.94%  (p=0.019 n=10+10)
GobEncode                12.1ms ± 0%    12.1ms ± 0%    ~     (p=0.052 n=10+10)
Mandelbrot200            5.06ms ± 0%    5.05ms ± 0%  -0.04%  (p=0.000 n=9+10)
TimeParse                 450ns ± 3%     446ns ± 0%    ~     (p=0.238 n=10+9)
TimeFormat                485ns ± 1%     483ns ± 1%    ~     (p=0.073 n=10+10)
Template                 90.4ms ± 0%    90.7ms ± 0%  +0.29%  (p=0.000 n=8+10)
GoParse                  6.01ms ± 0%    6.03ms ± 0%  +0.35%  (p=0.000 n=10+10)
BinaryTree17              11.7s ± 0%     11.7s ± 0%    ~     (p=0.481 n=10+10)
Revcomp                   669ms ± 0%     669ms ± 0%    ~     (p=0.315 n=10+10)
Fannkuch11                3.40s ± 0%     3.37s ± 0%  -0.92%  (p=0.000 n=10+10)
[Geo mean]               67.9µs         67.9µs       +0.02%

name                   old speed      new speed      delta
RegexpMatchEasy0_32     128MB/s ± 0%   128MB/s ± 0%  -0.08%  (p=0.003 n=8+10)
RegexpMatchEasy0_1K    1.75GB/s ± 0%  1.75GB/s ± 0%    ~     (p=0.642 n=8+10)
RegexpMatchEasy1_32     130MB/s ± 0%   130MB/s ± 0%    ~     (p=0.690 n=10+9)
RegexpMatchEasy1_1K    1.27GB/s ± 0%  1.27GB/s ± 0%    ~     (p=0.661 n=10+9)
RegexpMatchMedium_32   3.18MB/s ± 0%  3.18MB/s ± 0%    ~     (all equal)
RegexpMatchMedium_1K   19.7MB/s ± 0%  19.6MB/s ± 0%    ~     (p=0.190 n=10+9)
RegexpMatchHard_32     11.6MB/s ± 0%  11.6MB/s ± 1%    ~     (p=0.669 n=10+10)
RegexpMatchHard_1K     13.0MB/s ± 0%  13.0MB/s ± 0%    ~     (p=0.718 n=9+9)
Gzip                   44.4MB/s ± 0%  44.5MB/s ± 0%  +0.24%  (p=0.000 n=10+10)
JSONEncode             96.5MB/s ± 1%  96.1MB/s ± 1%    ~     (p=0.065 n=9+10)
JSONDecode             20.6MB/s ± 1%  20.7MB/s ± 1%  +0.42%  (p=0.041 n=10+10)
GobDecode              60.6MB/s ± 1%  60.0MB/s ± 2%  -0.92%  (p=0.016 n=10+10)
GobEncode              63.4MB/s ± 0%  63.6MB/s ± 0%    ~     (p=0.055 n=10+10)
Template               21.5MB/s ± 0%  21.4MB/s ± 0%  -0.30%  (p=0.000 n=9+10)
GoParse                9.64MB/s ± 0%  9.61MB/s ± 0%  -0.36%  (p=0.000 n=10+10)
Revcomp                 380MB/s ± 0%   380MB/s ± 0%    ~     (p=0.323 n=10+10)
[Geo mean]             56.0MB/s       55.9MB/s       -0.07%

Change-Id: Ia732fa57fbcf4767d72382516d9f16705d177736
Reviewed-on: https://go-review.googlesource.com/96435
Run-TryBot: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-02-27 00:07:25 +00:00
Josh Bleecher Snyder
3a9e4440fd cmd/compile: tighten after lowering
Moving tighten after lowering benefits from the removal of values by
lowering and lowered CSE. It lets us make better decisions about
which values are rematerializable and which generate flags.
Empirically, it lowers stack usage (by avoiding spills)
and generates slightly smaller and faster binaries.


Fixes #19853
Fixes #21041

name        old time/op       new time/op       delta
Template          195ms ± 4%        193ms ± 4%  -1.33%  (p=0.000 n=92+97)
Unicode          94.1ms ± 9%       92.5ms ± 8%  -1.66%  (p=0.002 n=97+95)
GoTypes           572ms ± 5%        566ms ± 7%  -0.92%  (p=0.001 n=95+98)
Compiler          2.56s ± 4%        2.52s ± 3%  -1.41%  (p=0.000 n=94+97)
SSA               6.52s ± 2%        6.47s ± 3%  -0.82%  (p=0.000 n=96+94)
Flate             117ms ± 5%        116ms ± 7%  -0.72%  (p=0.018 n=97+97)
GoParser          148ms ± 6%        146ms ± 4%  -0.97%  (p=0.002 n=98+95)
Reflect           370ms ± 7%        363ms ± 6%  -1.79%  (p=0.000 n=99+98)
Tar               175ms ± 6%        173ms ± 6%  -1.11%  (p=0.001 n=94+95)
XML               204ms ± 6%        201ms ± 5%  -1.49%  (p=0.000 n=97+96)
[Geo mean]        363ms             359ms       -1.22%

name        old user-time/op  new user-time/op  delta
Template          251ms ± 5%        245ms ± 5%  -2.40%  (p=0.000 n=97+93)
Unicode           131ms ±10%        128ms ± 9%  -1.93%  (p=0.001 n=100+99)
GoTypes           760ms ± 4%        752ms ± 4%  -0.96%  (p=0.000 n=97+95)
Compiler          3.51s ± 3%        3.48s ± 2%  -1.04%  (p=0.000 n=96+95)
SSA               9.57s ± 4%        9.52s ± 2%  -0.50%  (p=0.004 n=97+96)
Flate             149ms ± 6%        147ms ± 6%  -1.46%  (p=0.000 n=98+96)
GoParser          184ms ± 5%        181ms ± 7%  -1.84%  (p=0.000 n=98+97)
Reflect           469ms ± 6%        461ms ± 6%  -1.69%  (p=0.000 n=100+98)
Tar               219ms ± 8%        217ms ± 7%  -0.90%  (p=0.035 n=96+96)
XML               255ms ± 5%        251ms ± 6%  -1.48%  (p=0.000 n=98+98)
[Geo mean]        476ms             469ms       -1.42%

name        old alloc/op      new alloc/op      delta
Template         37.8MB ± 0%       37.8MB ± 0%  -0.17%  (p=0.000 n=100+100)
Unicode          28.8MB ± 0%       28.8MB ± 0%  -0.02%  (p=0.000 n=100+95)
GoTypes           112MB ± 0%        112MB ± 0%  -0.20%  (p=0.000 n=100+97)
Compiler          466MB ± 0%        464MB ± 0%  -0.27%  (p=0.000 n=100+100)
SSA              1.49GB ± 0%       1.49GB ± 0%  -0.08%  (p=0.000 n=100+99)
Flate            24.4MB ± 0%       24.3MB ± 0%  -0.25%  (p=0.000 n=98+99)
GoParser         30.7MB ± 0%       30.6MB ± 0%  -0.26%  (p=0.000 n=99+100)
Reflect          76.4MB ± 0%       76.4MB ± 0%    ~     (p=0.253 n=100+100)
Tar              38.9MB ± 0%       38.8MB ± 0%  -0.20%  (p=0.000 n=100+97)
XML              41.5MB ± 0%       41.4MB ± 0%  -0.19%  (p=0.000 n=100+98)
[Geo mean]       77.5MB            77.4MB       -0.16%

name        old allocs/op     new allocs/op     delta
Template           381k ± 0%         381k ± 0%  -0.15%  (p=0.000 n=100+100)
Unicode            342k ± 0%         342k ± 0%  -0.01%  (p=0.000 n=100+98)
GoTypes           1.19M ± 0%        1.18M ± 0%  -0.24%  (p=0.000 n=100+100)
Compiler          4.52M ± 0%        4.50M ± 0%  -0.29%  (p=0.000 n=100+100)
SSA               12.3M ± 0%        12.3M ± 0%  -0.11%  (p=0.000 n=100+100)
Flate              234k ± 0%         234k ± 0%  -0.26%  (p=0.000 n=99+96)
GoParser           318k ± 0%         317k ± 0%  -0.21%  (p=0.000 n=99+100)
Reflect            974k ± 0%         974k ± 0%  -0.03%  (p=0.000 n=100+100)
Tar                392k ± 0%         391k ± 0%  -0.17%  (p=0.000 n=100+99)
XML                404k ± 0%         403k ± 0%  -0.24%  (p=0.000 n=99+99)
[Geo mean]         794k              792k       -0.17%

name        old object-bytes  new object-bytes  delta
Template          393kB ± 0%        392kB ± 0%  -0.19%  (p=0.008 n=5+5)
Unicode           207kB ± 0%        207kB ± 0%    ~     (all equal)
GoTypes          1.23MB ± 0%       1.22MB ± 0%  -0.11%  (p=0.008 n=5+5)
Compiler         4.34MB ± 0%       4.33MB ± 0%  -0.15%  (p=0.008 n=5+5)
SSA              9.85MB ± 0%       9.85MB ± 0%  -0.07%  (p=0.008 n=5+5)
Flate             235kB ± 0%        234kB ± 0%  -0.59%  (p=0.008 n=5+5)
GoParser          297kB ± 0%        296kB ± 0%  -0.22%  (p=0.008 n=5+5)
Reflect          1.03MB ± 0%       1.03MB ± 0%  -0.00%  (p=0.008 n=5+5)
Tar               332kB ± 0%        331kB ± 0%  -0.15%  (p=0.008 n=5+5)
XML               413kB ± 0%        412kB ± 0%  -0.19%  (p=0.008 n=5+5)
[Geo mean]        728kB             727kB       -0.17%

Change-Id: I9b5cdb668ed102a001897a05e833105acba220a2
Reviewed-on: https://go-review.googlesource.com/95995
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-02-27 00:03:24 +00:00
Keith Randall
4b00d3f4a2 cmd/compile: implement comparisons directly with memory
Allow the compiler to generate code like CMPQ 16(AX), $7

It's tricky because it's difficult to spill such a comparison during
flagalloc, because the same memory state might not be available at
the restore locations.

Solve this problem by decomposing the compare+load back into its parts
if it needs to be spilled.

The big win is that the write barrier test goes from:

MOVL	runtime.writeBarrier(SB), CX
TESTL	CX, CX
JNE	60

to

CMPL	runtime.writeBarrier(SB), $0
JNE	59

It's one instruction and one byte smaller.

Fixes #19485
Fixes #15245
Update #22460

Binaries are about 0.15% smaller.

Change-Id: I4fd8d1111b6b9924d52f9a0901ca1b2e5cce0836
Reviewed-on: https://go-review.googlesource.com/86035
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>
2018-02-26 23:49:44 +00:00