1
0
mirror of https://github.com/golang/go synced 2024-09-29 07:14:29 -06:00
Commit Graph

35645 Commits

Author SHA1 Message Date
Alberto Donizetti
8516ecd05f test/codegen: port math/bits.ReverseBytes tests to codegen
And remove them from ssa_test.

Change-Id: If767af662801219774d1bdb787c77edfa6067770
Reviewed-on: https://go-review.googlesource.com/98976
Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Giovanni Bajo <rasky@develer.com>
2018-03-06 20:34:33 +00:00
Wei Xiao
05962561ae cmd/compile/internal/ssa: improve store combine optimization on arm64
Current implementation doesn't consider MOVDreg type operand and fail to combine
it into larger store. This patch fixes the issue.

Fixes #24242

Change-Id: I7d68697f80e76f48c3528ece01a602bf513248ec
Reviewed-on: https://go-review.googlesource.com/98397
Run-TryBot: Giovanni Bajo <rasky@develer.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-03-06 20:29:04 +00:00
Josh Bleecher Snyder
b85433975a encoding/binary: use an offset instead of slicing
While running make.bash, over 5% of all pointer writes
come from encoding/binary doing struct reads.

This change replaces slicing during such reads with an offset.
This avoids updating the slice pointer with every
struct field read or write.

This has no impact when the write barrier is off.
Running the benchmarks with GOGC=1, however,
shows significant improvement:

name          old time/op    new time/op    delta
ReadStruct-8    13.2µs ± 6%    10.1µs ± 5%  -23.24%  (p=0.000 n=10+10)

name          old speed      new speed      delta
ReadStruct-8  5.69MB/s ± 6%  7.40MB/s ± 5%  +30.18%  (p=0.000 n=10+10)

Change-Id: I22904263196bfeddc38abe8989428e263aee5253
Reviewed-on: https://go-review.googlesource.com/98757
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-03-06 18:59:03 +00:00
Josh Bleecher Snyder
f7739c07c8 runtime: skip pointless writes in freedefer
Change-Id: I501a0e5c87ec88616c7dcdf1b723758b6df6c088
Reviewed-on: https://go-review.googlesource.com/98758
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-03-06 18:58:57 +00:00
Josh Bleecher Snyder
4599419e69 debug/macho: use bytes.IndexByte instead of a loop
Simpler, and no doubt faster.

Change-Id: Idd401918da07a257de365087721e9ff061e6fd07
Reviewed-on: https://go-review.googlesource.com/98759
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-03-06 18:58:50 +00:00
Balaram Makam
0e8b7110f6 cmd/compile/internal/ssa: inline small memmove for arm64
This patch enables the optimization for arm64 target.

Performance results on Amberwing for strconv benchmark:
name             old time/op  new time/op  delta
Quote             721ns ± 0%   617ns ± 0%  -14.40%  (p=0.016 n=5+4)
QuoteRune         118ns ± 0%   117ns ± 0%   -0.85%  (p=0.008 n=5+5)
AppendQuote       436ns ± 2%   321ns ± 0%  -26.31%  (p=0.008 n=5+5)
AppendQuoteRune  34.7ns ± 0%  28.4ns ± 0%  -18.16%  (p=0.000 n=5+4)
[Geo mean]        189ns        160ns       -15.41%

Change-Id: I5714c474e7483d07ca338fbaf49beb4bbcc11c44
Reviewed-on: https://go-review.googlesource.com/98735
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Giovanni Bajo <rasky@develer.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-03-06 18:37:19 +00:00
Alberto Donizetti
18ae5eca3b test/codegen: port math/bits.OnesCount tests to codegen
And remove them from ssa_test.

Change-Id: I3efac5fea529bb0efa2dae32124530482ba5058e
Reviewed-on: https://go-review.googlesource.com/98815
Reviewed-by: Keith Randall <khr@golang.org>
2018-03-06 17:53:00 +00:00
Cherry Zhang
f624445473 cmd/internal/obj/arm64: gofmt
Change-Id: Ica778fef2d0245fbb14f595597e45c7cf6adef84
Reviewed-on: https://go-review.googlesource.com/98895
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-03-06 16:35:20 +00:00
Elias Naur
f5f16d1ec1 iostest.bash: don't build std library twice
Instead, mirror androidtest.bash and build once, then run run.bash.

Change-Id: I174ae30b2a429a62b20bb290a70cb07ed712b1e4
Reviewed-on: https://go-review.googlesource.com/98915
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
Run-TryBot: Elias Naur <elias.naur@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-03-06 16:08:20 +00:00
Elias Naur
ad87a67cdf cmd/dist: default to GOARM=7 on android
Auto-detecting GOARM on Android makes as little sense as for nacl/arm
and darwin/arm.

Also update androidtest.sh to not require GOARM set.

Change-Id: Id409ce1573d3c668d00fa4b7e3562ad7ece6fef5
Reviewed-on: https://go-review.googlesource.com/98875
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
Run-TryBot: Elias Naur <elias.naur@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-03-06 16:08:04 +00:00
Cherry Zhang
084143d844 math/big: don't use R18 in ARM64 assembly
R18 seems reserved on Apple platforms.

May fix darwin/arm64 build.

Change-Id: Ia2c1de550a64827c85a64affa53b94c62aacce8e
Reviewed-on: https://go-review.googlesource.com/98896
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Elias Naur <elias.naur@gmail.com>
2018-03-06 15:34:00 +00:00
Tobias Klauser
9745397e1d runtime: fix stack switch check in walltime/nanotime on linux/arm
CL 98095 got the check wrong. We should be testing
'getg() == getg().m.curg', not 'getg().m == getg().m.curg'.

Change-Id: I32f6238b00409b67afa8efe732513d542aec5bc7
Reviewed-on: https://go-review.googlesource.com/98855
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-03-06 14:24:19 +00:00
Alberto Donizetti
85dcc709a8 test/codegen: port math/bits.TrailingZeros tests to codegen
And remove them from ssa_test.

Change-Id: Ib5de5c0d908f23915e0847eca338cacf2fa5325b
Reviewed-on: https://go-review.googlesource.com/98795
Reviewed-by: Giovanni Bajo <rasky@develer.com>
2018-03-06 11:48:37 +00:00
as
df8c2b905b net/http: correct subtle transposition of offset and whence in test
Change-Id: I788972bdf85c0225397c0e74901bf9c33c6d30c7
GitHub-Last-Rev: 57737fe782
GitHub-Pull-Request: golang/go#24265
Reviewed-on: https://go-review.googlesource.com/98761
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-03-06 06:13:17 +00:00
Meng Zhuo
8916773a3d runtime, cmd/compile: use ldp for DUFFCOPY on ARM64
name         old time/op  new time/op  delta
CopyFat8     2.15ns ± 1%  2.19ns ± 6%     ~     (p=0.171 n=8+9)
CopyFat12    2.15ns ± 0%  2.17ns ± 2%     ~     (p=0.137 n=8+10)
CopyFat16    2.17ns ± 3%  2.15ns ± 0%     ~     (p=0.211 n=10+10)
CopyFat24    2.16ns ± 1%  2.15ns ± 0%     ~     (p=0.087 n=10+10)
CopyFat32    11.5ns ± 0%  12.8ns ± 2%  +10.87%  (p=0.000 n=8+10)
CopyFat64    20.2ns ± 2%  12.9ns ± 0%  -36.11%  (p=0.000 n=10+10)
CopyFat128   37.2ns ± 0%  21.5ns ± 0%  -42.20%  (p=0.000 n=10+10)
CopyFat256   71.6ns ± 0%  38.7ns ± 0%  -45.95%  (p=0.000 n=10+10)
CopyFat512    140ns ± 0%    73ns ± 0%  -47.86%  (p=0.000 n=10+9)
CopyFat520    142ns ± 0%    74ns ± 0%  -47.54%  (p=0.000 n=10+10)
CopyFat1024   277ns ± 0%   141ns ± 0%  -49.10%  (p=0.000 n=10+10)

Change-Id: If54bc571add5db674d5e081579c87e80153d0a5a
Reviewed-on: https://go-review.googlesource.com/97395
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-03-06 04:14:59 +00:00
Rob Pike
baf3eb1625 cmd/doc: make local dot-slash path names work
Before, an argument that started ./ or ../ was not treated as
a package relative to the current directory. Thus

	$ cd $GOROOT/src/text
	$ go doc ./template

could find html/template as $GOROOT/src/html/./template
is a valid Go source directory.

Fix this by catching such paths and making them absolute before
processing.

Fixes #23383.

Change-Id: Ic2a92eaa3a6328f728635657f9de72ac3ee82afb
Reviewed-on: https://go-review.googlesource.com/98396
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-03-06 01:11:26 +00:00
Fangming.Fang
917e72697e crypto/aes: optimize arm64 AES implementation
This patch makes use of arm64 AES instructions to accelerate AES computation
and only supports optimization on Linux for arm64

name        old time/op    new time/op     delta
Encrypt-32     255ns ± 0%       26ns ± 0%   -89.73%
Decrypt-32     256ns ± 0%       26ns ± 0%   -89.77%
Expand-32      990ns ± 5%      901ns ± 0%    -9.05%

name        old speed      new speed       delta
Encrypt-32  62.5MB/s ± 0%  610.4MB/s ± 0%  +876.39%
Decrypt-32  62.3MB/s ± 0%  610.2MB/s ± 0%  +879.6%

Fixes #18498

Change-Id: If416e5a151785325527b32ff72f6da3812493ed0
Reviewed-on: https://go-review.googlesource.com/64490
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-03-06 00:44:29 +00:00
erifan01
c4f3fe95c6 math/big: optimize addVV and subVV on arm64
The biggest hot spot of the existing implementation is "load" operations, which lead to poor performance.
By unrolling the cycle 4x and 2x, and using "LDP", "STP" instructions, this CL can reduce the "load" cost and improve performance.

Benchmarks:

name                              old time/op    new time/op     delta
AddVV/1-8                           21.5ns ± 0%     11.5ns ± 0%   -46.51%  (p=0.008 n=5+5)
AddVV/2-8                           13.5ns ± 0%     12.0ns ± 0%   -11.11%  (p=0.008 n=5+5)
AddVV/3-8                           15.5ns ± 0%     13.0ns ± 0%   -16.13%  (p=0.008 n=5+5)
AddVV/4-8                           17.5ns ± 0%     13.5ns ± 0%   -22.86%  (p=0.008 n=5+5)
AddVV/5-8                           19.5ns ± 0%     14.5ns ± 0%   -25.64%  (p=0.008 n=5+5)
AddVV/10-8                          29.5ns ± 0%     18.0ns ± 0%   -38.98%  (p=0.008 n=5+5)
AddVV/100-8                          217ns ± 0%       94ns ± 0%   -56.64%  (p=0.008 n=5+5)
AddVV/1000-8                        2.02µs ± 0%     1.03µs ± 0%   -48.85%  (p=0.008 n=5+5)
AddVV/10000-8                       20.5µs ± 0%     11.3µs ± 0%   -44.70%  (p=0.008 n=5+5)
AddVV/100000-8                       247µs ± 3%      154µs ± 0%   -37.52%  (p=0.008 n=5+5)
SubVV/1-8                           21.5ns ± 0%     11.5ns ± 0%      ~     (p=0.079 n=4+5)
SubVV/2-8                           13.5ns ± 0%     12.0ns ± 0%   -11.11%  (p=0.008 n=5+5)
SubVV/3-8                           15.5ns ± 0%     13.0ns ± 0%   -16.13%  (p=0.008 n=5+5)
SubVV/4-8                           17.5ns ± 0%     13.5ns ± 0%   -22.86%  (p=0.008 n=5+5)
SubVV/5-8                           19.5ns ± 0%     14.5ns ± 0%   -25.64%  (p=0.008 n=5+5)
SubVV/10-8                          29.5ns ± 0%     18.0ns ± 0%   -38.98%  (p=0.008 n=5+5)
SubVV/100-8                          217ns ± 0%       94ns ± 0%   -56.64%  (p=0.008 n=5+5)
SubVV/1000-8                        2.02µs ± 0%     0.80µs ± 0%   -60.50%  (p=0.008 n=5+5)
SubVV/10000-8                       20.5µs ± 0%     11.3µs ± 0%   -44.99%  (p=0.008 n=5+5)
SubVV/100000-8                       221µs ±11%      223µs ±16%      ~     (p=0.690 n=5+5)
AddVW/1-8                           9.32ns ± 0%     9.32ns ± 0%      ~     (all equal)
AddVW/2-8                           19.7ns ± 1%     19.7ns ± 0%      ~     (p=0.381 n=5+4)
AddVW/3-8                           11.5ns ± 0%     11.5ns ± 0%      ~     (all equal)
AddVW/4-8                           13.0ns ± 0%     13.0ns ± 0%      ~     (all equal)
AddVW/5-8                           14.5ns ± 0%     14.5ns ± 0%      ~     (all equal)
AddVW/10-8                          22.0ns ± 0%     22.0ns ± 0%      ~     (all equal)
AddVW/100-8                          167ns ± 0%      167ns ± 0%      ~     (all equal)
AddVW/1000-8                        1.52µs ± 0%     1.52µs ± 0%    +0.40%  (p=0.008 n=5+5)
AddVW/10000-8                       15.1µs ± 0%     15.1µs ± 0%      ~     (p=0.556 n=5+4)
AddVW/100000-8                       152µs ± 1%      152µs ± 1%      ~     (p=0.690 n=5+5)
AddMulVVW/1-8                       33.3ns ± 0%     32.7ns ± 1%    -1.86%  (p=0.008 n=5+5)
AddMulVVW/2-8                       59.3ns ± 1%     56.9ns ± 1%    -4.15%  (p=0.008 n=5+5)
AddMulVVW/3-8                       80.5ns ± 1%     85.4ns ± 3%    +6.19%  (p=0.008 n=5+5)
AddMulVVW/4-8                        127ns ± 0%      111ns ± 1%   -13.19%  (p=0.008 n=5+5)
AddMulVVW/5-8                        144ns ± 0%      149ns ± 0%    +3.47%  (p=0.016 n=4+5)
AddMulVVW/10-8                       298ns ± 1%      283ns ± 0%    -4.77%  (p=0.008 n=5+5)
AddMulVVW/100-8                     3.06µs ± 0%     2.99µs ± 0%    -2.21%  (p=0.008 n=5+5)
AddMulVVW/1000-8                    31.3µs ± 0%     26.9µs ± 0%   -14.17%  (p=0.008 n=5+5)
AddMulVVW/10000-8                    316µs ± 0%      305µs ± 0%    -3.51%  (p=0.008 n=5+5)
AddMulVVW/100000-8                  3.17ms ± 0%     3.17ms ± 1%      ~     (p=0.690 n=5+5)
DecimalConversion-8                  316µs ± 1%      313µs ± 2%      ~     (p=0.095 n=5+5)
FloatString/100-8                   2.53µs ± 1%     2.56µs ± 2%      ~     (p=0.222 n=5+5)
FloatString/1000-8                  58.4µs ± 0%     58.5µs ± 0%      ~     (p=0.206 n=5+5)
FloatString/10000-8                 4.59ms ± 0%     4.58ms ± 0%    -0.31%  (p=0.008 n=5+5)
FloatString/100000-8                 446ms ± 0%      444ms ± 0%    -0.31%  (p=0.008 n=5+5)
FloatAdd/10-8                        184ns ± 0%      172ns ± 0%    -6.30%  (p=0.008 n=5+5)
FloatAdd/100-8                       189ns ± 2%      191ns ± 4%      ~     (p=0.381 n=5+5)
FloatAdd/1000-8                      371ns ± 0%      347ns ± 1%    -6.42%  (p=0.008 n=5+5)
FloatAdd/10000-8                    1.87µs ± 0%     1.68µs ± 0%   -10.16%  (p=0.008 n=5+5)
FloatAdd/100000-8                   17.1µs ± 0%     15.6µs ± 0%    -8.74%  (p=0.016 n=5+4)
FloatSub/10-8                        152ns ± 0%      138ns ± 0%    -9.47%  (p=0.000 n=4+5)
FloatSub/100-8                       148ns ± 0%      142ns ± 0%    -4.05%  (p=0.000 n=5+4)
FloatSub/1000-8                      245ns ± 1%      217ns ± 0%   -11.28%  (p=0.000 n=5+4)
FloatSub/10000-8                    1.07µs ± 0%     0.88µs ± 1%   -18.14%  (p=0.008 n=5+5)
FloatSub/100000-8                   9.58µs ± 0%     7.96µs ± 0%   -16.84%  (p=0.008 n=5+5)
ParseFloatSmallExp-8                28.8µs ± 1%     29.0µs ± 1%      ~     (p=0.095 n=5+5)
ParseFloatLargeExp-8                 126µs ± 1%      126µs ± 1%      ~     (p=0.841 n=5+5)
GCD10x10/WithoutXY-8                 277ns ± 2%      281ns ± 4%      ~     (p=0.746 n=5+5)
GCD10x10/WithXY-8                   2.10µs ± 1%     2.12µs ± 3%      ~     (p=0.548 n=5+5)
GCD10x100/WithoutXY-8                615ns ± 3%      607ns ± 2%      ~     (p=0.135 n=5+5)
GCD10x100/WithXY-8                  3.50µs ± 2%     3.62µs ± 5%      ~     (p=0.151 n=5+5)
GCD10x1000/WithoutXY-8              1.39µs ± 2%     1.39µs ± 3%      ~     (p=0.690 n=5+5)
GCD10x1000/WithXY-8                 7.39µs ± 1%     7.34µs ± 2%      ~     (p=0.135 n=5+5)
GCD10x10000/WithoutXY-8             8.66µs ± 1%     8.68µs ± 1%      ~     (p=0.421 n=5+5)
GCD10x10000/WithXY-8                28.1µs ± 2%     27.0µs ± 2%    -3.81%  (p=0.008 n=5+5)
GCD10x100000/WithoutXY-8            79.3µs ± 1%     79.3µs ± 1%      ~     (p=0.841 n=5+5)
GCD10x100000/WithXY-8                238µs ± 0%      227µs ± 1%    -4.74%  (p=0.008 n=5+5)
GCD100x100/WithoutXY-8              1.89µs ± 1%     1.88µs ± 2%      ~     (p=0.968 n=5+5)
GCD100x100/WithXY-8                 26.7µs ± 1%     27.0µs ± 1%    +1.44%  (p=0.032 n=5+5)
GCD100x1000/WithoutXY-8             4.48µs ± 1%     4.45µs ± 2%      ~     (p=0.341 n=5+5)
GCD100x1000/WithXY-8                36.3µs ± 1%     35.1µs ± 1%    -3.27%  (p=0.008 n=5+5)
GCD100x10000/WithoutXY-8            22.8µs ± 0%     22.7µs ± 1%      ~     (p=0.056 n=5+5)
GCD100x10000/WithXY-8                145µs ± 1%      133µs ± 1%    -8.33%  (p=0.008 n=5+5)
GCD100x100000/WithoutXY-8            198µs ± 0%      195µs ± 0%    -1.56%  (p=0.008 n=5+5)
GCD100x100000/WithXY-8              1.11ms ± 0%     1.00ms ± 0%   -10.04%  (p=0.008 n=5+5)
GCD1000x1000/WithoutXY-8            25.2µs ± 1%     24.8µs ± 1%    -1.63%  (p=0.016 n=5+5)
GCD1000x1000/WithXY-8                513µs ± 0%      517µs ± 2%      ~     (p=0.421 n=5+5)
GCD1000x10000/WithoutXY-8           57.0µs ± 0%     52.7µs ± 1%    -7.56%  (p=0.008 n=5+5)
GCD1000x10000/WithXY-8              1.20ms ± 0%     1.10ms ± 0%    -8.70%  (p=0.008 n=5+5)
GCD1000x100000/WithoutXY-8           358µs ± 0%      318µs ± 1%   -11.03%  (p=0.008 n=5+5)
GCD1000x100000/WithXY-8             8.71ms ± 0%     7.65ms ± 0%   -12.19%  (p=0.008 n=5+5)
GCD10000x10000/WithoutXY-8           690µs ± 0%      630µs ± 0%    -8.71%  (p=0.008 n=5+5)
GCD10000x10000/WithXY-8             16.0ms ± 1%     14.9ms ± 0%    -6.85%  (p=0.008 n=5+5)
GCD10000x100000/WithoutXY-8         2.09ms ± 0%     1.75ms ± 0%   -16.09%  (p=0.016 n=5+4)
GCD10000x100000/WithXY-8            86.8ms ± 0%     76.3ms ± 0%   -12.09%  (p=0.008 n=5+5)
GCD100000x100000/WithoutXY-8        51.1ms ± 0%     46.0ms ± 0%    -9.97%  (p=0.008 n=5+5)
GCD100000x100000/WithXY-8            1.25s ± 0%      1.15s ± 0%    -7.92%  (p=0.008 n=5+5)
Hilbert-8                           2.45ms ± 1%     2.49ms ± 1%    +1.99%  (p=0.008 n=5+5)
Binomial-8                          4.98µs ± 3%     4.90µs ± 2%      ~     (p=0.421 n=5+5)
QuoRem-8                            7.10µs ± 0%     6.21µs ± 0%   -12.55%  (p=0.016 n=5+4)
Exp-8                                161ms ± 0%      161ms ± 0%      ~     (p=0.421 n=5+5)
Exp2-8                               161ms ± 0%      161ms ± 0%      ~     (p=0.151 n=5+5)
Bitset-8                            40.4ns ± 0%     40.3ns ± 0%      ~     (p=0.190 n=5+5)
BitsetNeg-8                          163ns ± 3%      137ns ± 2%   -15.91%  (p=0.008 n=5+5)
BitsetOrig-8                         377ns ± 1%      372ns ± 1%    -1.22%  (p=0.024 n=5+5)
BitsetNegOrig-8                      631ns ± 1%      605ns ± 1%    -4.09%  (p=0.008 n=5+5)
ModSqrt225_Tonelli-8                7.26ms ± 0%     7.26ms ± 0%      ~     (p=0.548 n=5+5)
ModSqrt224_3Mod4-8                  2.24ms ± 0%     2.24ms ± 0%      ~     (p=1.000 n=5+5)
ModSqrt5430_Tonelli-8                62.4s ± 0%      62.4s ± 0%      ~     (p=0.841 n=5+5)
ModSqrt5430_3Mod4-8                  20.8s ± 0%      20.7s ± 0%      ~     (p=0.056 n=5+5)
Sqrt-8                               101µs ± 0%       89µs ± 0%   -12.17%  (p=0.008 n=5+5)
IntSqr/1-8                          32.5ns ± 1%     32.7ns ± 1%      ~     (p=0.056 n=5+5)
IntSqr/2-8                           160ns ± 5%      158ns ± 0%      ~     (p=0.397 n=5+4)
IntSqr/3-8                           298ns ± 4%      296ns ± 4%      ~     (p=0.667 n=5+5)
IntSqr/5-8                           737ns ± 5%      761ns ± 3%    +3.34%  (p=0.016 n=5+5)
IntSqr/8-8                          1.87µs ± 4%     1.90µs ± 3%      ~     (p=0.222 n=5+5)
IntSqr/10-8                         2.96µs ± 4%     2.92µs ± 6%      ~     (p=0.310 n=5+5)
IntSqr/20-8                         6.28µs ± 3%     6.21µs ± 2%      ~     (p=0.310 n=5+5)
IntSqr/30-8                         14.0µs ± 2%     13.9µs ± 2%      ~     (p=0.548 n=5+5)
IntSqr/50-8                         37.7µs ± 3%     38.3µs ± 2%      ~     (p=0.095 n=5+5)
IntSqr/80-8                         95.9µs ± 2%     95.1µs ± 1%      ~     (p=0.310 n=5+5)
IntSqr/100-8                         148µs ± 1%      148µs ± 1%      ~     (p=0.841 n=5+5)
IntSqr/200-8                         586µs ± 1%      587µs ± 1%      ~     (p=1.000 n=5+5)
IntSqr/300-8                        1.32ms ± 0%     1.31ms ± 1%    -0.73%  (p=0.032 n=5+5)
IntSqr/500-8                        2.48ms ± 0%     2.45ms ± 0%    -1.15%  (p=0.008 n=5+5)
IntSqr/800-8                        4.68ms ± 0%     4.62ms ± 0%    -1.23%  (p=0.008 n=5+5)
IntSqr/1000-8                       7.57ms ± 0%     7.50ms ± 0%    -0.84%  (p=0.008 n=5+5)
Mul-8                                311ms ± 0%      308ms ± 0%    -0.81%  (p=0.008 n=5+5)
Exp3Power/0x10-8                     574ns ± 1%      578ns ± 2%      ~     (p=0.500 n=5+5)
Exp3Power/0x40-8                     640ns ± 1%      646ns ± 0%      ~     (p=0.056 n=5+5)
Exp3Power/0x100-8                   1.42µs ± 1%     1.42µs ± 1%      ~     (p=0.246 n=5+5)
Exp3Power/0x400-8                   8.30µs ± 1%     8.29µs ± 1%      ~     (p=0.802 n=5+5)
Exp3Power/0x1000-8                  60.0µs ± 0%     59.9µs ± 0%    -0.24%  (p=0.016 n=5+5)
Exp3Power/0x4000-8                   817µs ± 0%      816µs ± 0%    -0.17%  (p=0.008 n=5+5)
Exp3Power/0x10000-8                 7.80ms ± 1%     7.70ms ± 0%    -1.23%  (p=0.008 n=5+5)
Exp3Power/0x40000-8                 73.4ms ± 0%     72.5ms ± 0%    -1.28%  (p=0.008 n=5+5)
Exp3Power/0x100000-8                 665ms ± 0%      656ms ± 0%    -1.34%  (p=0.008 n=5+5)
Exp3Power/0x400000-8                 5.99s ± 0%      5.90s ± 0%    -1.40%  (p=0.008 n=5+5)
Fibo-8                               116ms ± 0%       50ms ± 0%   -57.09%  (p=0.008 n=5+5)
NatSqr/1-8                           112ns ± 4%      112ns ± 2%      ~     (p=0.968 n=5+5)
NatSqr/2-8                           251ns ± 2%      250ns ± 1%      ~     (p=0.571 n=5+5)
NatSqr/3-8                           378ns ± 2%      379ns ± 2%      ~     (p=0.794 n=5+5)
NatSqr/5-8                           829ns ± 3%      827ns ± 2%      ~     (p=1.000 n=5+5)
NatSqr/8-8                          1.97µs ± 2%     1.95µs ± 2%      ~     (p=0.310 n=5+5)
NatSqr/10-8                         3.02µs ± 2%     2.99µs ± 2%      ~     (p=0.421 n=5+5)
NatSqr/20-8                         6.51µs ± 2%     6.49µs ± 1%      ~     (p=0.841 n=5+5)
NatSqr/30-8                         14.1µs ± 2%     14.0µs ± 2%      ~     (p=0.841 n=5+5)
NatSqr/50-8                         38.1µs ± 2%     38.3µs ± 3%      ~     (p=0.690 n=5+5)
NatSqr/80-8                         95.5µs ± 2%     96.0µs ± 1%      ~     (p=0.421 n=5+5)
NatSqr/100-8                         150µs ± 1%      148µs ± 2%      ~     (p=0.095 n=5+5)
NatSqr/200-8                         588µs ± 1%      590µs ± 1%      ~     (p=0.421 n=5+5)
NatSqr/300-8                        1.32ms ± 1%     1.31ms ± 1%      ~     (p=0.841 n=5+5)
NatSqr/500-8                        2.50ms ± 0%     2.47ms ± 0%    -1.03%  (p=0.008 n=5+5)
NatSqr/800-8                        4.70ms ± 0%     4.64ms ± 0%    -1.31%  (p=0.008 n=5+5)
NatSqr/1000-8                       7.60ms ± 0%     7.52ms ± 0%    -1.01%  (p=0.008 n=5+5)
ScanPi-8                             326µs ± 0%      326µs ± 0%      ~     (p=0.841 n=5+5)
StringPiParallel-8                  70.3µs ± 5%     63.8µs ±10%      ~     (p=0.056 n=5+5)
Scan/10/Base2-8                     1.09µs ± 0%     1.09µs ± 0%      ~     (p=0.317 n=5+5)
Scan/100/Base2-8                    7.79µs ± 0%     7.78µs ± 0%      ~     (p=0.063 n=5+5)
Scan/1000/Base2-8                   79.0µs ± 0%     78.9µs ± 0%    -0.18%  (p=0.008 n=5+5)
Scan/10000/Base2-8                  1.22ms ± 0%     1.22ms ± 0%    -0.15%  (p=0.008 n=5+5)
Scan/100000/Base2-8                 55.1ms ± 0%     55.2ms ± 0%    +0.20%  (p=0.008 n=5+5)
Scan/10/Base8-8                      512ns ± 0%      512ns ± 1%      ~     (p=0.810 n=5+5)
Scan/100/Base8-8                    2.89µs ± 0%     2.89µs ± 0%      ~     (p=0.810 n=5+5)
Scan/1000/Base8-8                   31.0µs ± 0%     31.0µs ± 0%      ~     (p=0.151 n=5+5)
Scan/10000/Base8-8                   740µs ± 0%      741µs ± 0%    +0.10%  (p=0.008 n=5+5)
Scan/100000/Base8-8                 50.6ms ± 0%     50.6ms ± 0%    +0.08%  (p=0.008 n=5+5)
Scan/10/Base10-8                     487ns ± 0%      487ns ± 0%      ~     (p=0.571 n=5+5)
Scan/100/Base10-8                   2.67µs ± 0%     2.67µs ± 0%      ~     (p=0.810 n=5+5)
Scan/1000/Base10-8                  28.7µs ± 0%     28.7µs ± 0%    +0.06%  (p=0.008 n=5+5)
Scan/10000/Base10-8                  716µs ± 0%      717µs ± 0%      ~     (p=0.222 n=5+5)
Scan/100000/Base10-8                50.3ms ± 0%     50.3ms ± 0%    +0.10%  (p=0.008 n=5+5)
Scan/10/Base16-8                     438ns ± 0%      437ns ± 1%      ~     (p=0.786 n=5+5)
Scan/100/Base16-8                   2.47µs ± 0%     2.47µs ± 0%    -0.19%  (p=0.048 n=5+5)
Scan/1000/Base16-8                  27.2µs ± 0%     27.3µs ± 0%      ~     (p=0.087 n=5+5)
Scan/10000/Base16-8                  722µs ± 0%      722µs ± 0%    +0.11%  (p=0.008 n=5+5)
Scan/100000/Base16-8                52.6ms ± 0%     52.7ms ± 0%    +0.15%  (p=0.008 n=5+5)
String/10/Base2-8                    247ns ± 2%      248ns ± 1%      ~     (p=0.437 n=5+5)
String/100/Base2-8                  1.51µs ± 0%     1.51µs ± 0%    -0.37%  (p=0.024 n=5+5)
String/1000/Base2-8                 13.6µs ± 1%     13.5µs ± 0%      ~     (p=0.095 n=5+5)
String/10000/Base2-8                 135µs ± 0%      135µs ± 1%      ~     (p=0.841 n=5+5)
String/100000/Base2-8               1.32ms ± 1%     1.32ms ± 1%      ~     (p=0.690 n=5+5)
String/10/Base8-8                    169ns ± 1%      169ns ± 1%      ~     (p=1.000 n=5+5)
String/100/Base8-8                   636ns ± 0%      634ns ± 1%      ~     (p=0.413 n=5+5)
String/1000/Base8-8                 5.33µs ± 1%     5.32µs ± 0%      ~     (p=0.222 n=5+5)
String/10000/Base8-8                50.9µs ± 1%     50.7µs ± 0%      ~     (p=0.151 n=5+5)
String/100000/Base8-8                500µs ± 1%      497µs ± 0%      ~     (p=0.421 n=5+5)
String/10/Base10-8                   516ns ± 1%      513ns ± 0%    -0.62%  (p=0.016 n=5+4)
String/100/Base10-8                 1.97µs ± 0%     1.96µs ± 0%      ~     (p=0.667 n=4+5)
String/1000/Base10-8                12.5µs ± 0%     11.5µs ± 0%    -7.92%  (p=0.008 n=5+5)
String/10000/Base10-8               57.7µs ± 0%     52.5µs ± 0%    -8.93%  (p=0.008 n=5+5)
String/100000/Base10-8              25.6ms ± 0%     21.6ms ± 0%   -15.94%  (p=0.008 n=5+5)
String/10/Base16-8                   150ns ± 1%      149ns ± 0%      ~     (p=0.413 n=5+4)
String/100/Base16-8                  514ns ± 1%      514ns ± 1%      ~     (p=0.849 n=5+5)
String/1000/Base16-8                4.01µs ± 0%     4.01µs ± 0%      ~     (p=0.421 n=5+5)
String/10000/Base16-8               37.8µs ± 1%     37.8µs ± 1%      ~     (p=0.841 n=5+5)
String/100000/Base16-8               373µs ± 2%      373µs ± 0%      ~     (p=0.421 n=5+5)
LeafSize/0-8                        6.63ms ± 0%     6.63ms ± 0%      ~     (p=0.730 n=4+5)
LeafSize/1-8                        74.0µs ± 0%     67.7µs ± 1%    -8.53%  (p=0.008 n=5+5)
LeafSize/2-8                        74.2µs ± 0%     68.3µs ± 1%    -7.99%  (p=0.008 n=5+5)
LeafSize/3-8                         379µs ± 0%      309µs ± 0%   -18.52%  (p=0.008 n=5+5)
LeafSize/4-8                        72.7µs ± 1%     66.7µs ± 0%    -8.37%  (p=0.008 n=5+5)
LeafSize/5-8                         471µs ± 0%      384µs ± 0%   -18.55%  (p=0.008 n=5+5)
LeafSize/6-8                         378µs ± 0%      308µs ± 0%   -18.59%  (p=0.008 n=5+5)
LeafSize/7-8                         245µs ± 0%      204µs ± 1%   -16.75%  (p=0.008 n=5+5)
LeafSize/8-8                        73.4µs ± 0%     66.9µs ± 1%    -8.79%  (p=0.008 n=5+5)
LeafSize/9-8                         538µs ± 0%      437µs ± 0%   -18.75%  (p=0.008 n=5+5)
LeafSize/10-8                        472µs ± 0%      396µs ± 1%   -16.01%  (p=0.008 n=5+5)
LeafSize/11-8                        460µs ± 0%      374µs ± 0%   -18.58%  (p=0.008 n=5+5)
LeafSize/12-8                        378µs ± 0%      308µs ± 0%   -18.38%  (p=0.008 n=5+5)
LeafSize/13-8                        343µs ± 0%      284µs ± 0%   -17.30%  (p=0.008 n=5+5)
LeafSize/14-8                        248µs ± 0%      206µs ± 0%   -16.94%  (p=0.008 n=5+5)
LeafSize/15-8                        169µs ± 0%      144µs ± 0%   -14.69%  (p=0.008 n=5+5)
LeafSize/16-8                       72.9µs ± 0%     66.8µs ± 1%    -8.27%  (p=0.008 n=5+5)
LeafSize/32-8                       82.5µs ± 0%     76.7µs ± 0%    -7.04%  (p=0.008 n=5+5)
LeafSize/64-8                        134µs ± 0%      129µs ± 0%    -3.80%  (p=0.008 n=5+5)
ProbablyPrime/n=0-8                 44.2ms ± 0%     43.4ms ± 0%    -1.95%  (p=0.008 n=5+5)
ProbablyPrime/n=1-8                 64.9ms ± 0%     64.0ms ± 0%    -1.27%  (p=0.008 n=5+5)
ProbablyPrime/n=5-8                  147ms ± 0%      146ms ± 0%    -0.58%  (p=0.008 n=5+5)
ProbablyPrime/n=10-8                 250ms ± 0%      249ms ± 0%    -0.35%  (p=0.008 n=5+5)
ProbablyPrime/n=20-8                 456ms ± 0%      455ms ± 0%    -0.18%  (p=0.008 n=5+5)
ProbablyPrime/Lucas-8               23.6ms ± 0%     22.7ms ± 0%    -3.74%  (p=0.008 n=5+5)
ProbablyPrime/MillerRabinBase2-8    20.7ms ± 0%     20.6ms ± 0%      ~     (p=0.421 n=5+5)
FloatSqrt/64-8                      2.25µs ± 1%     2.29µs ± 0%    +1.48%  (p=0.008 n=5+5)
FloatSqrt/128-8                     4.86µs ± 1%     4.92µs ± 1%    +1.21%  (p=0.032 n=5+5)
FloatSqrt/256-8                     13.6µs ± 0%     13.7µs ± 1%    +1.31%  (p=0.032 n=5+5)
FloatSqrt/1000-8                    70.0µs ± 1%     70.1µs ± 0%      ~     (p=0.690 n=5+5)
FloatSqrt/10000-8                   1.92ms ± 0%     1.90ms ± 0%    -0.59%  (p=0.008 n=5+5)
FloatSqrt/100000-8                  55.3ms ± 0%     54.8ms ± 0%    -1.01%  (p=0.008 n=5+5)
FloatSqrt/1000000-8                  4.56s ± 0%      4.50s ± 0%    -1.28%  (p=0.008 n=5+5)

name                              old speed      new speed       delta
AddVV/1-8                         2.97GB/s ± 0%   5.56GB/s ± 0%   +86.85%  (p=0.008 n=5+5)
AddVV/2-8                         9.47GB/s ± 0%  10.66GB/s ± 0%   +12.50%  (p=0.008 n=5+5)
AddVV/3-8                         12.4GB/s ± 0%   14.7GB/s ± 0%   +19.10%  (p=0.008 n=5+5)
AddVV/4-8                         14.6GB/s ± 0%   18.9GB/s ± 0%   +29.63%  (p=0.016 n=4+5)
AddVV/5-8                         16.4GB/s ± 0%   22.0GB/s ± 0%   +34.47%  (p=0.016 n=5+4)
AddVV/10-8                        21.7GB/s ± 0%   35.5GB/s ± 0%   +63.89%  (p=0.008 n=5+5)
AddVV/100-8                       29.4GB/s ± 0%   68.0GB/s ± 0%  +131.38%  (p=0.008 n=5+5)
AddVV/1000-8                      31.7GB/s ± 0%   61.9GB/s ± 0%   +95.43%  (p=0.008 n=5+5)
AddVV/10000-8                     31.2GB/s ± 0%   56.4GB/s ± 0%   +80.83%  (p=0.008 n=5+5)
AddVV/100000-8                    25.9GB/s ± 3%   41.4GB/s ± 0%   +59.98%  (p=0.008 n=5+5)
SubVV/1-8                         2.97GB/s ± 0%   5.56GB/s ± 0%   +86.97%  (p=0.016 n=4+5)
SubVV/2-8                         9.47GB/s ± 0%  10.66GB/s ± 0%   +12.51%  (p=0.008 n=5+5)
SubVV/3-8                         12.4GB/s ± 0%   14.8GB/s ± 0%   +19.23%  (p=0.016 n=4+5)
SubVV/4-8                         14.6GB/s ± 0%   18.9GB/s ± 0%   +29.56%  (p=0.008 n=5+5)
SubVV/5-8                         16.4GB/s ± 0%   22.0GB/s ± 0%   +34.47%  (p=0.016 n=4+5)
SubVV/10-8                        21.7GB/s ± 0%   35.5GB/s ± 0%   +63.89%  (p=0.008 n=5+5)
SubVV/100-8                       29.4GB/s ± 0%   68.0GB/s ± 0%  +131.38%  (p=0.008 n=5+5)
SubVV/1000-8                      31.6GB/s ± 0%   80.1GB/s ± 0%  +153.08%  (p=0.008 n=5+5)
SubVV/10000-8                     31.2GB/s ± 0%   56.7GB/s ± 0%   +81.79%  (p=0.008 n=5+5)
SubVV/100000-8                    29.1GB/s ±10%   29.0GB/s ±18%      ~     (p=0.690 n=5+5)
AddVW/1-8                          859MB/s ± 0%    859MB/s ± 0%    -0.01%  (p=0.008 n=5+5)
AddVW/2-8                          811MB/s ± 1%    814MB/s ± 0%      ~     (p=0.413 n=5+4)
AddVW/3-8                         2.08GB/s ± 0%   2.08GB/s ± 0%      ~     (p=0.206 n=5+5)
AddVW/4-8                         2.46GB/s ± 0%   2.46GB/s ± 0%      ~     (p=0.056 n=5+5)
AddVW/5-8                         2.75GB/s ± 0%   2.75GB/s ± 0%      ~     (p=0.508 n=5+5)
AddVW/10-8                        3.63GB/s ± 0%   3.63GB/s ± 0%      ~     (p=0.214 n=5+5)
AddVW/100-8                       4.79GB/s ± 0%   4.79GB/s ± 0%      ~     (p=0.500 n=5+5)
AddVW/1000-8                      5.27GB/s ± 0%   5.25GB/s ± 0%    -0.43%  (p=0.008 n=5+5)
AddVW/10000-8                     5.30GB/s ± 0%   5.30GB/s ± 0%      ~     (p=0.397 n=5+5)
AddVW/100000-8                    5.27GB/s ± 1%   5.25GB/s ± 1%      ~     (p=0.690 n=5+5)
AddMulVVW/1-8                     1.92GB/s ± 0%   1.96GB/s ± 1%    +1.95%  (p=0.008 n=5+5)
AddMulVVW/2-8                     2.16GB/s ± 1%   2.25GB/s ± 1%    +4.32%  (p=0.008 n=5+5)
AddMulVVW/3-8                     2.39GB/s ± 1%   2.25GB/s ± 3%    -5.79%  (p=0.008 n=5+5)
AddMulVVW/4-8                     2.00GB/s ± 0%   2.31GB/s ± 1%   +15.31%  (p=0.008 n=5+5)
AddMulVVW/5-8                     2.22GB/s ± 0%   2.14GB/s ± 0%    -3.86%  (p=0.008 n=5+5)
AddMulVVW/10-8                    2.15GB/s ± 1%   2.25GB/s ± 0%    +5.03%  (p=0.008 n=5+5)
AddMulVVW/100-8                   2.09GB/s ± 0%   2.14GB/s ± 0%    +2.25%  (p=0.008 n=5+5)
AddMulVVW/1000-8                  2.04GB/s ± 0%   2.38GB/s ± 0%   +16.52%  (p=0.008 n=5+5)
AddMulVVW/10000-8                 2.03GB/s ± 0%   2.10GB/s ± 0%    +3.64%  (p=0.008 n=5+5)
AddMulVVW/100000-8                2.02GB/s ± 0%   2.02GB/s ± 1%      ~     (p=0.690 n=5+5)

Change-Id: Ie482d67a7dbb5af6f5d81af2b3d9d14bd66336db
Reviewed-on: https://go-review.googlesource.com/77831
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-03-06 00:22:08 +00:00
Yury Smolsky
adcf2d59ec os/exec: document Process.Kill behaviour
It is not clear from documentation what the Process.Kill does. And it
leads to reccuring confusion about Cmd.Start/Wait methods.

Fixes #24220

Change-Id: I66609d21d2954e195d13648014681530eed8ea6c
Reviewed-on: https://go-review.googlesource.com/98715
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-03-05 23:47:41 +00:00
Mostyn Bramley-Moore
32e459a09c path/filepath: use a temp dir in path_test.go
We should avoid writing temp files to GOROOT, since it might be readonly.

Fixes #23881

Change-Id: Iaa38ec404b303f0cf27fdfb7daf1ddd60fd5d1c9
GitHub-Last-Rev: de0211df84
GitHub-Pull-Request: golang/go#24238
Reviewed-on: https://go-review.googlesource.com/98517
Run-TryBot: Giovanni Bajo <rasky@develer.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-03-05 23:38:39 +00:00
Matthew Dempsky
26708439ec cmd/compile: refactor order.go into methods
No functional changes, just changing all the orderfoo functions
into (*Order).foo methods.

Passes toolstash-check.

Change-Id: Ib9833daa98aff3c645ce56794a414f8472689152
Reviewed-on: https://go-review.googlesource.com/98617
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-03-05 21:25:24 +00:00
Andrey Mirtchovski
43b6476262 os/signal: disable loading of history during test
This change modifies Go to disable loading of users' shell history for
TestTerminalSignal tests. TestTerminalSignal, as part of its workload,
will execute a new interactive bash shell. Bash will attempt to load the
user's history from the file pointed to by the HISTFILE environment
variable. For users with large histories that may take up to several
seconds, pushing the whole test past the 5 second timeout and causing
it to fail.

Change-Id: I11b2f83ee91f51fa1e9774a39181ab365f9a6b3a
GitHub-Last-Rev: 7efdf616a2
GitHub-Pull-Request: golang/go#24255
Reviewed-on: https://go-review.googlesource.com/98616
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-03-05 21:03:33 +00:00
Hana Kim
d3946f75d3 internal/trace: remove backlinks from span/task end to start
This is an updated version of golang.org/cl/96395, with the fix to
TestUserSpan.

This reverts commit 7b6f6267e90a8e4eab37a3f2164ba882e6222adb.

Change-Id: I31eec8ba0997f9178dffef8dac608e731ab70872
Reviewed-on: https://go-review.googlesource.com/98236
Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Heschi Kreinick <heschi@google.com>
2018-03-05 20:10:22 +00:00
Alberto Donizetti
83e41b3e76 test/codegen: port math/bits.Leadingzero tests to codegen
Change-Id: Ic21d25db5d56ce77516c53082dfbc010e5875b81
Reviewed-on: https://go-review.googlesource.com/98655
Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-03-05 19:52:04 +00:00
Ian Lance Taylor
7178267b59 runtime: rename vdso symbols to use camel case
This was originally C code using names with underscores, which were
retained when the code was rewritten into Go. Change the code to use
Go-like camel case names.

The names that come from the ELF ABI are left unchanged.

Change-Id: I181bc5dd81284c07bc67b7df4635f4734b41d646
Reviewed-on: https://go-review.googlesource.com/98520
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Tobias Klauser <tobias.klauser@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-03-05 19:12:32 +00:00
Tobias Klauser
5f80e70912 runtime: remove unused SYS_* definitions on Linux
Also fix the indentation of the SYS_* definitions in sys_linux_mipsx.s
and order them numerically.

Change-Id: I0c454301c329a163e7db09dcb25d4e825149858c
Reviewed-on: https://go-review.googlesource.com/98448
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-03-05 18:32:08 +00:00
Alberto Donizetti
c1806906d8 test: port bits.Len intrinsics tests to the new codegen harness
This change move bits.Len* intrinsification tests to the new codegen
test harness, removing them from the old ssa_test file. Five different
test functions (one for each bit.Len function tested) was used, to
avoid possible unwanted interactions between multiple calls inside one
function.

Change-Id: Iffd5be55b58e88597fa30a562a28dacb01236d8b
Reviewed-on: https://go-review.googlesource.com/98156
Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Giovanni Bajo <rasky@develer.com>
2018-03-05 18:01:19 +00:00
Keith Randall
63bcabed49 internal/bytealg: fix arm64 Index function
Missed removing the argument loading from the indexbody function.

Change-Id: Ia1391231fc99771d00410a09fe80a09f08ceed02
Reviewed-on: https://go-review.googlesource.com/98575
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Elias Naur <elias.naur@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-03-05 15:38:29 +00:00
Giovanni Bajo
29fcd57a9f cmd/compile: fold offsets into memory ops
Fold offsets for:

  {ADD,SUB,MUL}[SD]mem
  ADD[LQ]constmem
  {ADD,SUB,AND,OR,XOR}[LQ]mem

Cumulatively, the rules trigger ~900 times in all.bash.

Fixes #23325

Change-Id: If6c701f68fa0b57907a353a07a516b914127d0d8
Reviewed-on: https://go-review.googlesource.com/98035
Run-TryBot: Giovanni Bajo <rasky@develer.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-03-04 22:53:50 +00:00
Keith Randall
ee58eccc56 internal/bytealg: move short string Index implementations into bytealg
Also move the arm64 CountByte implementation while we're here.

Fixes #19792

Change-Id: I1e0fdf1e03e3135af84150a2703b58dad1b0d57e
Reviewed-on: https://go-review.googlesource.com/98518
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-03-04 19:49:44 +00:00
Keith Randall
f6332bb84a internal/bytealg: move compare functions to bytealg
Move bytes.Compare and runtime·cmpstring to bytealg.

Update #19792

Change-Id: I139e6d7c59686bef7a3017e3dec99eba5fd10447
Reviewed-on: https://go-review.googlesource.com/98515
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-03-04 17:49:39 +00:00
Keith Randall
45964e4f9c internal/bytealg: move Count to bytealg
Move bytes.Count and strings.Count to bytealg.

Update #19792

Change-Id: I3e4e14b504a0b71758885bb131e5656e342cf8cb
Reviewed-on: https://go-review.googlesource.com/98495
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-03-04 17:49:25 +00:00
Giovanni Bajo
89ae7045f3 test: convert all math-related tests from asm_test
Change-Id: If542f0b5c5754e6eb2f9b302fe5a148ba9a57338
Reviewed-on: https://go-review.googlesource.com/98443
Run-TryBot: Giovanni Bajo <rasky@develer.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-03-04 16:52:33 +00:00
Giovanni Bajo
fad31e513d test: move load/store combines into asmcheck
This CL moves the load/store combining tests into asmcheck.
In addition at being more compact, it's also now easier to
spot what it is missing in each architecture.

While doing so, I think I uncovered a bug in ppc64le and arm64
rules, because they fail to load/store combine in non-trivial
functions. Not sure why, I'll open an issue.

Change-Id: Ia1572d53c0553d9104f3e52b95e4d1768a8440a3
Reviewed-on: https://go-review.googlesource.com/98441
Run-TryBot: Giovanni Bajo <rasky@develer.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-03-04 16:52:03 +00:00
Giovanni Bajo
80bfb75c42 test: in asmcheck, dump only the functions which fail
Before this change, in case of any failure, asmcheck was
dumping to stderr the whole output of compile -S, which
can be very long if it contains multiple functions.

Make it so it filters the output to only display the
assembly output of functions for which at least one opcode
check failed. This greatly simplifies debugging.

Change-Id: I1bbf54473b8252a3384e2c1dade82d926afc119d
Reviewed-on: https://go-review.googlesource.com/98444
Run-TryBot: Giovanni Bajo <rasky@develer.com>
Reviewed-by: Keith Randall <khr@golang.org>
2018-03-04 01:05:02 +00:00
Giovanni Bajo
8ce74b7d11 test: port a nil-check interface test from asm_test
Change-Id: I69c1688506d1aeca655047acf35d1bff966fc01e
Reviewed-on: https://go-review.googlesource.com/98442
Run-TryBot: Giovanni Bajo <rasky@develer.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-03-03 20:20:54 +00:00
Giovanni Bajo
ec0b8c0585 test: use the version of Go used to run run.go
Currently, the top-level testsuite always uses whatever version
of Go is found in the PATH to execute all the tests. This
forces the developers to tweak the PATH to run the testsuite.

Change it to use the same version of Go used to run run.go.
This allows developers to run the testsuite using the tip
compiler by simply saying "../bin/go run run.go".

I think this is a better solution compared to always forcing
"../bin/go", because it allows developers to run the testsuite
using different Go versions, for instance to check if a new
test is fixed in tip compared to the installed compiler.

Fixes #24217

Change-Id: I41b299c753b6e77c41e28be9091b2b630efea9d2
Reviewed-on: https://go-review.googlesource.com/98439
Run-TryBot: Giovanni Bajo <rasky@develer.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-03-03 19:52:00 +00:00
Pascal S. de Kloe
74a92b8e8d encoding/json: apply conventional error handling in decoder
name                            old time/op    new time/op    delta
CodeEncoder-12                    1.89ms ± 1%    1.91ms ± 0%   +1.16%  (p=0.000 n=20+19)
CodeMarshal-12                    2.09ms ± 1%    2.12ms ± 0%   +1.63%  (p=0.000 n=17+18)
CodeDecoder-12                    8.43ms ± 1%    8.32ms ± 1%   -1.35%  (p=0.000 n=18+20)
UnicodeDecoder-12                  399ns ± 0%     339ns ± 0%  -15.00%  (p=0.000 n=20+19)
DecoderStream-12                   281ns ± 1%     231ns ± 0%  -17.91%  (p=0.000 n=20+16)
CodeUnmarshal-12                  9.35ms ± 2%    9.15ms ± 2%   -2.11%  (p=0.000 n=20+20)
CodeUnmarshalReuse-12             8.41ms ± 2%    8.29ms ± 2%   -1.34%  (p=0.000 n=20+20)
UnmarshalString-12                81.2ns ± 2%    74.0ns ± 4%   -8.89%  (p=0.000 n=20+20)
UnmarshalFloat64-12               71.1ns ± 2%    64.3ns ± 1%   -9.60%  (p=0.000 n=20+19)
UnmarshalInt64-12                 60.6ns ± 2%    53.2ns ± 0%  -12.28%  (p=0.000 n=18+18)
Issue10335-12                     96.9ns ± 0%    87.7ns ± 1%   -9.52%  (p=0.000 n=17+20)
Unmapped-12                        247ns ± 4%     231ns ± 3%   -6.34%  (p=0.000 n=20+20)
TypeFieldsCache/MissTypes1-12     11.1µs ± 0%    11.1µs ± 0%     ~     (p=0.376 n=19+20)
TypeFieldsCache/MissTypes10-12    33.9µs ± 0%    33.8µs ± 0%   -0.32%  (p=0.000 n=18+9)

name                            old speed      new speed      delta
CodeEncoder-12                  1.03GB/s ± 1%  1.01GB/s ± 0%   -1.15%  (p=0.000 n=20+19)
CodeMarshal-12                   930MB/s ± 1%   915MB/s ± 0%   -1.60%  (p=0.000 n=17+18)
CodeDecoder-12                   230MB/s ± 1%   233MB/s ± 1%   +1.37%  (p=0.000 n=18+20)
UnicodeDecoder-12               35.0MB/s ± 0%  41.2MB/s ± 0%  +17.60%  (p=0.000 n=20+19)
CodeUnmarshal-12                 208MB/s ± 2%   212MB/s ± 2%   +2.16%  (p=0.000 n=20+20)

name                            old alloc/op   new alloc/op   delta
Issue10335-12                       184B ± 0%      184B ± 0%     ~     (all equal)
Unmapped-12                         216B ± 0%      216B ± 0%     ~     (all equal)

name                            old allocs/op  new allocs/op  delta
Issue10335-12                       3.00 ± 0%      3.00 ± 0%     ~     (all equal)
Unmapped-12                         4.00 ± 0%      4.00 ± 0%     ~     (all equal)

Change-Id: I4b1a87a205da2ef9a572f86f85bc833653c61570
Reviewed-on: https://go-review.googlesource.com/98440
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-03-03 17:16:47 +00:00
Tobias Klauser
51b027116c runtime: use vDSO for clock_gettime on linux/arm
Use the __vdso_clock_gettime fast path via the vDSO on linux/arm to
speed up nanotime and walltime. This results in the following
performance improvement for time.Now on a RaspberryPi 3 (running
32bit Raspbian, i.e. GOOS=linux/GOARCH=arm):

name     old time/op  new time/op  delta
TimeNow  0.99µs ± 0%  0.39µs ± 1%  -60.74%  (p=0.000 n=12+20)

Change-Id: I3598278a6c88d7f6a6ce66c56b9d25f9dd2f4c9a
Reviewed-on: https://go-review.googlesource.com/98095
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-03-03 12:12:58 +00:00
Tobias Klauser
c69f60d071 runtime: remove unused __vdso_time_sym
It's unused since https://golang.org/cl/99320043

Change-Id: I74d69ff894aa2fb556f1c2083406c118c559d91b
Reviewed-on: https://go-review.googlesource.com/98195
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-03-03 12:11:38 +00:00
Keith Randall
1dfa380e3d internal/bytealg: move equal functions to bytealg
Move bytes.Equal, runtime.memequal, and runtime.memequal_varlen
to the bytealg package.

Update #19792

Change-Id: Ic4175e952936016ea0bda6c7c3dbb33afdc8e4ac
Reviewed-on: https://go-review.googlesource.com/98355
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-03-03 04:18:27 +00:00
Joe Tsai
f0756ca2ea encoding/json: use sync.Map for field cache
The previous type cache is quadratic in time in the situation where
new types are continually encountered. Now that it is possible to dynamically
create new types with the reflect package, this can cause json to
perform very poorly.

Switch to sync.Map which does well when the cache has hit steady state,
but also handles occasional updates in better than quadratic time.

benchmark                                     old ns/op      new ns/op     delta
BenchmarkTypeFieldsCache/MissTypes1-8         14817          16202         +9.35%
BenchmarkTypeFieldsCache/MissTypes10-8        70926          69144         -2.51%
BenchmarkTypeFieldsCache/MissTypes100-8       976467         208973        -78.60%
BenchmarkTypeFieldsCache/MissTypes1000-8      79520162       1750371       -97.80%
BenchmarkTypeFieldsCache/MissTypes10000-8     6873625837     16847806      -99.75%
BenchmarkTypeFieldsCache/HitTypes1000-8       7.51           8.80          +17.18%
BenchmarkTypeFieldsCache/HitTypes10000-8      7.58           8.68          +14.51%

The old implementation takes 12 minutes just to build a cache of size 1e5
due to the quadratic behavior. I did not bother benchmark sizes above that.

Change-Id: I5e6facc1eb8e1b80e5ca285e4dd2cc8815618dad
Reviewed-on: https://go-review.googlesource.com/76850
Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-03-03 00:08:09 +00:00
Shamil Garatuev
e658b85f26 internal/syscall/windows/registry: improve ReadSubKeyNames permissions
Make ReadSubKeyNames work even if key is opened with only
ENUMERATE_SUB_KEYs access rights mask.

Fixes #23869

Change-Id: I138bd51715fdbc3bda05607c64bde1150f4fe6b2
Reviewed-on: https://go-review.googlesource.com/97435
Reviewed-by: Alex Brainman <alex.brainman@gmail.com>
2018-03-02 22:58:33 +00:00
Keith Randall
403ab0f221 internal/bytealg: move IndexByte asssembly to the new bytealg package
Move the IndexByte function from the runtime to a new bytealg package.
The new package will eventually hold all the optimized assembly for
groveling through byte slices and strings. It seems a better home for
this code than randomly keeping it in runtime.

Once this is in, the next step is to move the other functions
(Compare, Equal, ...).

Update #19792

This change seems complicated enough that we might just declare
"not worth it" and abandon.  Opinions welcome.

The core assembly is all unchanged, except minor modifications where
the code reads cpu feature bits.

The wrapper functions have been cleaned up as they are now actually
checked by vet.

Change-Id: I9fa75bee5d85db3a65b3fd3b7997e60367523796
Reviewed-on: https://go-review.googlesource.com/98016
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-03-02 22:46:15 +00:00
Brad Fitzpatrick
dcedcaa5fb net: skip flaky TestLookupLongTXT for now
Flaky tests failing trybots help nobody.

Updates #22857

Change-Id: I87bc018651ab4fe02560a6d24c08a1d7ccd8ba37
Reviewed-on: https://go-review.googlesource.com/97416
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-03-02 22:05:44 +00:00
Damien Mathieu
2fd1b52395 net/http: lock the read-only mutex in shouldRedirect
Since that method uses 'mux.m', we need to lock the mutex to avoid data races.

Change-Id: I998448a6e482b5d6a1b24f3354bb824906e23172
GitHub-Last-Rev: 163a7d4942
GitHub-Pull-Request: golang/go#23994
Reviewed-on: https://go-review.googlesource.com/96575
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
2018-03-02 22:03:44 +00:00
David du Colombier
1c9297c365 cmd/compile: skip TestEmptyDwarfRanges on Plan 9
TestEmptyDwarfRanges has been added in CL 94816.
This test is failing on Plan 9 because executables
don't have a DWARF symbol table.

Fixes #24226.

Change-Id: Iff7e34b8c2703a2f19ee8087a4d64d0bb98496cd
Reviewed-on: https://go-review.googlesource.com/98275
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-03-02 21:23:07 +00:00
Hana Kim
d3562c9db9 internal/trace: Revert "remove backlinks from span/task end to start"
This reverts commit 16398894dc.
This broke TestUserTaskSpan test.

Change-Id: If5ff8bdfe84e8cb30787b03ead87205ece3d5601
Reviewed-on: https://go-review.googlesource.com/98235
Reviewed-by: Heschi Kreinick <heschi@google.com>
2018-03-02 20:32:08 +00:00
Hana Kim
16398894dc internal/trace: remove backlinks from span/task end to start
Even though undocumented, the assumption is the Event's link field
points to the following event in the future. The new span/task event
processing breaks the assumption.

Change-Id: I4ce2f30c67c4f525ec0a121a7e43d8bdd2ec3f77
Reviewed-on: https://go-review.googlesource.com/96395
Reviewed-by: Heschi Kreinick <heschi@google.com>
2018-03-02 20:15:57 +00:00
Alberto Donizetti
644b2dafc2 test/codegen: add copyright headers to new codegen files
Change-Id: I9fe6572d1043ef9ee09c0925059ded554ad24c6b
Reviewed-on: https://go-review.googlesource.com/98215
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-03-02 20:13:13 +00:00