1
0
mirror of https://github.com/golang/go synced 2024-11-23 17:30:02 -07:00
Commit Graph

36642 Commits

Author SHA1 Message Date
Anit Gandhi
3f2039e28d crypto/{aes,internal/cipherhw,tls}: use common internal/cpu in place of cipherhw
When the internal/cpu package was introduced, the AES package still used
the custom crypto/internal/cipherhw package for amd64 and s390x. This
change removes that package entirely in favor of directly referencing the
cpu feature flags set and exposed by the internal/cpu package. In
addition, 5 new flags have been added to the internal/cpu s390x struct
for detecting various cipher message (KM) features.

Change-Id: I77cdd8bc1b04ab0e483b21bf1879b5801a4ba5f4
GitHub-Last-Rev: a611e3ecb1
GitHub-Pull-Request: golang/go#24766
Reviewed-on: https://go-review.googlesource.com/105695
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-05-23 22:22:09 +00:00
Richard Musiol
482d241936 cmd/compile: add wasm stack optimization
Go's SSA instructions only operate on registers. For example, an add
instruction would read two registers, do the addition and then write
to a register. WebAssembly's instructions, on the other hand, operate
on the stack. The add instruction first pops two values from the stack,
does the addition, then pushes the result to the stack. To fulfill
Go's semantics, one needs to map Go's single add instruction to
4 WebAssembly instructions:
- Push the value of local variable A to the stack
- Push the value of local variable B to the stack
- Do addition
- Write value from stack to local variable C

Now consider that B was set to the constant 42 before the addition:
- Push constant 42 to the stack
- Write value from stack to local variable B

This works, but is inefficient. Instead, the stack is used directly
by inlining instructions if possible. With inlining it becomes:
- Push the value of local variable A to the stack (add)
- Push constant 42 to the stack (constant)
- Do addition (add)
- Write value from stack to local variable C (add)

Note that the two SSA instructions can not be generated sequentially
anymore, because their WebAssembly instructions are interleaved.

Design doc: https://docs.google.com/document/d/131vjr4DH6JFnb-blm_uRdaC0_Nv3OUwjEY5qVCxCup4

Updates #18892

Change-Id: Ie35e1c0bebf4985fddda0d6330eb2066f9ad6dec
Reviewed-on: https://go-review.googlesource.com/103535
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-05-23 21:37:59 +00:00
Hana Kim
02399fa65c cmd/vendor/.../unix: pick up upstream fixes for broken tests
Update golang/go#25528
Update golang/go#25529

Change-Id: I47ec282e76eb7740547e220ac00d6a7992e17b9e
Reviewed-on: https://go-review.googlesource.com/114094
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-05-23 21:11:41 +00:00
Alexander Döring
1dc20e9124 math/big: specialize Karatsuba implementation for squaring
Currently we use three different algorithms for squaring:
1. basic multiplication for small numbers
2. basic squaring for medium numbers
3. Karatsuba multiplication for large numbers

Change 3. to a version of Karatsuba multiplication specialized
for x == y.

Increasing the performance of 3. lets us lower the threshold
between 2. and 3.

Adapt TestCalibrate to the change that 3. isn't independent
of the threshold between 1. and 2. any more.

Fixes #23221.

benchstat old.txt new.txt
name           old time/op  new time/op  delta
NatSqr/1-4     29.6ns ± 7%  29.5ns ± 5%     ~     (p=0.103 n=50+50)
NatSqr/2-4     51.9ns ± 1%  51.9ns ± 1%     ~     (p=0.693 n=42+49)
NatSqr/3-4     64.3ns ± 1%  64.1ns ± 0%   -0.26%  (p=0.000 n=46+43)
NatSqr/5-4     93.5ns ± 2%  93.1ns ± 1%   -0.39%  (p=0.000 n=48+49)
NatSqr/8-4      131ns ± 1%   131ns ± 1%     ~     (p=0.870 n=46+49)
NatSqr/10-4     175ns ± 1%   175ns ± 1%   +0.38%  (p=0.000 n=49+47)
NatSqr/20-4     426ns ± 1%   429ns ± 1%   +0.84%  (p=0.000 n=46+48)
NatSqr/30-4     702ns ± 2%   699ns ± 1%   -0.38%  (p=0.011 n=46+44)
NatSqr/50-4    1.44µs ± 2%  1.43µs ± 1%   -0.54%  (p=0.010 n=48+48)
NatSqr/80-4    2.85µs ± 1%  2.87µs ± 1%   +0.68%  (p=0.000 n=47+47)
NatSqr/100-4   4.06µs ± 1%  4.07µs ± 1%   +0.29%  (p=0.000 n=46+45)
NatSqr/200-4   13.4µs ± 1%  13.5µs ± 1%   +0.73%  (p=0.000 n=48+48)
NatSqr/300-4   28.5µs ± 1%  28.2µs ± 1%   -1.22%  (p=0.000 n=46+48)
NatSqr/500-4   81.9µs ± 1%  67.0µs ± 1%  -18.25%  (p=0.000 n=48+48)
NatSqr/800-4    161µs ± 1%   140µs ± 1%  -13.29%  (p=0.000 n=47+48)
NatSqr/1000-4   245µs ± 1%   207µs ± 1%  -15.17%  (p=0.000 n=49+49)

go test -v -calibrate --run TestCalibrate
...
Calibrating threshold between basicSqr(x) and karatsubaSqr(x)
Looking for a timing difference for x between 200 - 500 words by 10 step
words = 200 deltaT =     -980ns (  -7%) is karatsubaSqr(x) better: false
words = 210 deltaT =     -773ns (  -5%) is karatsubaSqr(x) better: false
words = 220 deltaT =     -695ns (  -4%) is karatsubaSqr(x) better: false
words = 230 deltaT =     -570ns (  -3%) is karatsubaSqr(x) better: false
words = 240 deltaT =     -458ns (  -2%) is karatsubaSqr(x) better: false
words = 250 deltaT =      -63ns (   0%) is karatsubaSqr(x) better: false
words = 260 deltaT =      118ns (   0%) is karatsubaSqr(x) better: true  threshold  found
words = 270 deltaT =      377ns (   1%) is karatsubaSqr(x) better: true
words = 280 deltaT =      765ns (   3%) is karatsubaSqr(x) better: true
words = 290 deltaT =      673ns (   2%) is karatsubaSqr(x) better: true
words = 300 deltaT =      502ns (   1%) is karatsubaSqr(x) better: true
words = 310 deltaT =      629ns (   2%) is karatsubaSqr(x) better: true
words = 320 deltaT =    1.011µs (   3%) is karatsubaSqr(x) better: true
words = 330 deltaT =     1.36µs (   4%) is karatsubaSqr(x) better: true
words = 340 deltaT =    3.001µs (   8%) is karatsubaSqr(x) better: true
words = 350 deltaT =    3.178µs (   8%) is karatsubaSqr(x) better: true
...

Change-Id: I6f13c23d94d042539ac28e77fd2618cdc37a429e
Reviewed-on: https://go-review.googlesource.com/105075
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2018-05-23 20:46:02 +00:00
Andrew Bonventre
723f4286b9 doc: update Code of Conduct
Change-Id: I82c03dd026bb797a49b7361389373924acf6366c
Reviewed-on: https://go-review.googlesource.com/114085
Reviewed-by: Russ Cox <rsc@golang.org>
2018-05-23 20:16:46 +00:00
Hana (Hyang-Ah) Kim
3f89214940 cmd/pprof: add readline support similar to upstream
The upstream pprof implements the readline feature using
the github.com/chzyer/readline package in its pprof.go main.

It would be ideal to use the same readline support package as
the upstream for better user experience and code maintenance.
However, bringing in third-party packages requires more work
than I envisioned (e.g. clean up the vendored code to meet the
expected standard - iow don't break builders).

As a result, this change implements the similar feature
for the pprof command included in the go distribution
(cmd/pprof/pprof.go) using golang.org/x/crypto/ssh/terminal
for now.

Auto-completion is not yet supported (same in the upstream).

The feature is enabled only in linux, windows, darwin, and
only when terminal support is available.

This change brings in new vendored packages,
golang.org/x/crypto/ssh/terminal and
golang.org/x/sys/{unix,windows}.

For #14041

Change-Id: If4a790796acf2ab20f7e81268b9d9354c5a5cd2b
Reviewed-on: https://go-review.googlesource.com/112436
Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-05-23 19:01:58 +00:00
Richard Musiol
392ff18b8f syscall: partially revert "enable some nacl code to be shared with js/wasm"
This partially reverts commit 3bdbb5df76.
The latest CL of js/wasm's file system support does not use
file descriptor mapping any more.

Change-Id: Iaec9c84b392366282cddc69acc75c8a3eb556824
Reviewed-on: https://go-review.googlesource.com/114195
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-05-23 18:54:00 +00:00
Adam Medzinski
645d4726f0 net: add example for net.UDPConn.WriteTo function
The current documentation of the WriteTo function is very poor and it
is difficult to deduce how to use it correctly. A good example will
make things much easier.

Fixes #25456

Change-Id: Ibf0c0e153afae8f3e0d7d765d0dc9bcbfd69bfb1
Reviewed-on: https://go-review.googlesource.com/113775
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-05-23 18:24:17 +00:00
Elias Naur
7ba1c91dd9 misc/android: forward SIGQUIT to the process running on the device
When a test binary runs for too long, the go command sends it a
SIGQUIT to force a backtrace dump. On Android, the exec wrapper
will instead receive the signal and dump its backtrace.

Forward SIGQUIT signals from the wrapper to the wrapped process
to gain useful backtraces.

Inspired by issuse 25519; this CL would have revealed the hanging
test directly in the builder log.

Change-Id: Ic362d06940d261374343a1dc09366ef54edaa631
Reviewed-on: https://go-review.googlesource.com/114137
Run-TryBot: Elias Naur <elias.naur@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-05-23 18:18:39 +00:00
David Chase
77c51c6c62 cmd/compile: grow stack before test() to avoid gdb misbehavior
While next-ing over a call in gdb, if execution of that call
causes a goroutine's stack to grow (i.e., be moved), gdb loses
track and runs ahead to the next breakpoint, or to the end of
the program, whichever comes first.

Prevent this by preemptively growing the stack so that
ssa/debug_test.go will reliably measure what is intended,
the goodness of line number placement and variable printing.

Fixes #25497.

Change-Id: I8daf931650292a8c8faad2285d7fd405f2157bd2
Reviewed-on: https://go-review.googlesource.com/114080
Run-TryBot: David Chase <drchase@google.com>
Reviewed-by: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-05-23 18:05:07 +00:00
Ian Gudger
11b3ee6fec net: fix DNS NXDOMAIN performance regression
golang.org/cl/37879 unintentionally changed the way NXDOMAIN errors were
handled. Before that change, resolution would fail on the first NXDOMAIN
error and return to the user. After that change, the next server would
be consulted and resolution would fail only after all servers had been
consulted. This change restores the old behavior.

Go 10.10.2:
BenchmarkGoLookupIP-12                        	   10000	    174883 ns/op   11450 B/op	     163 allocs/op
BenchmarkGoLookupIPNoSuchHost-12              	    3000	    670140 ns/op   52189 B/op	     544 allocs/op
BenchmarkGoLookupIPWithBrokenNameServer-12    	       1	5002568137 ns/op  163792 B/op	     375 allocs/op

before this change:
BenchmarkGoLookupIP-12                        	   10000	    165501 ns/op    8585 B/op	      94 allocs/op
BenchmarkGoLookupIPNoSuchHost-12              	    1000	   1204117 ns/op   83661 B/op	     674 allocs/op
BenchmarkGoLookupIPWithBrokenNameServer-12    	       1	5002629186 ns/op  159128 B/op	     275 allocs/op

after this change:
BenchmarkGoLookupIP-12                        	   10000	    158102 ns/op    8585 B/op	      94 allocs/op
BenchmarkGoLookupIPNoSuchHost-12              	    2000	    645364 ns/op   42990 B/op	     356 allocs/op
BenchmarkGoLookupIPWithBrokenNameServer-12    	       1	5002163437 ns/op  159144 B/op	     275 allocs/op

Fixes #25336

Change-Id: I315cd70330d1f66e54ce5a189a61c99f095bc138
Reviewed-on: https://go-review.googlesource.com/113815
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-05-23 18:01:23 +00:00
dchenk
0979767664 net/http: fix doc comment on PostFormValue function
This function checks Request.PostForm, which now includes values parsed
from a PATCH request.

Change-Id: I5d0af58d9c0e9111d4e822c45f0fb1f511bbf0d5
Reviewed-on: https://go-review.googlesource.com/114009
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-05-23 16:55:51 +00:00
azat
ef99381676 os: Add example for Expand function.
Change-Id: I581492c29158e57ca2f98b75f47870791965a7ff
Reviewed-on: https://go-review.googlesource.com/81155
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-05-23 15:37:22 +00:00
isharipo
67fe8b5677 cmd/internal/obj/x86: add missing Yi8 for VEX ytabs
This change adds Yi8 forms for every ytab that had them before AVX-512 patch.
The rationale is backwards-compatibility.

EVEX forms remain strict and unchanged as they're not bound to any
backwards-compatibility issues.

Fixes #25510

Change-Id: Icd692266010ed64c9fe47cc837afc2edf2ad2d1d
Reviewed-on: https://go-review.googlesource.com/114136
Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>
2018-05-23 15:34:19 +00:00
Peter Weinberger
1764609b8b internal/trace: change Less to make sorting events deterministice
The existing code just used timestamps. The new code uses more fields
    when timestamps are equal.

	Revised to shorten code per reviewer comments.

Change-Id: Ibd0824d0acd7644484d536b1a754a0da156fac3d
Reviewed-on: https://go-review.googlesource.com/113721
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
2018-05-23 14:18:18 +00:00
Alex Brainman
ef880a2f61 cmd/go: skip TestLinkerTmpDirIsDeleted if cgo is not enabled
Fixes builders that do not have cgo installed.

Change-Id: I719b7959226b0e67c3ffc11e071784787cabc5ab
Reviewed-on: https://go-review.googlesource.com/114235
Run-TryBot: Alex Brainman <alex.brainman@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-05-23 14:13:24 +00:00
Tobias Klauser
4bb649fba8 debug/pe: gofmt
CL 110555 introduced some changes which were not properly gofmt'ed.
Because the CL was sent via Github the gofmt checks usually performed by
git-codereview didn't catch this (see #24946).

Change-Id: I65c1271620690dbeec88b4ce482d158f7d6df45d
Reviewed-on: https://go-review.googlesource.com/114255
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Alex Brainman <alex.brainman@gmail.com>
2018-05-23 12:26:19 +00:00
Alex Brainman
1174ad3a8f cmd/link: close go.o before deleting it
Windows does not allow to delete opened file.

Fixes #24704

Change-Id: Idfca2d00a2c46bdd9bd2a721478bfd003c474ece
Reviewed-on: https://go-review.googlesource.com/113935
Run-TryBot: Alex Brainman <alex.brainman@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-05-23 09:44:27 +00:00
Ben Shi
5776bd558f misc/android: add a NL at the of README
There is no NL at the end of README, and that make it strange
when doing "cat misc/android/README".

Change-Id: Ib47953d7b16e8927a4d6be7d5be8de8f2ddbcc39
Reviewed-on: https://go-review.googlesource.com/114010
Reviewed-by: Elias Naur <elias.naur@gmail.com>
2018-05-23 05:54:14 +00:00
Ben Burkert
92bdfab795 internal/poll: disable splice on old linux versions
The splice syscall is buggy prior to linux 2.6.29. Instead of returning
0 when reading a closed socket, it returns EAGAIN.  While it is possible
to detect this (HAProxy falls back to recv), it is simpiler to avoid
using splice all together. the "fcntl(fd, F_GETPIPE_SZ)" syscall is used
detect buggy versions of splice as the syscall returns EINVAL on
versions prior to 2.6.35.

Fixes #25486

Change-Id: I860c029f13de2b09e95a7ba39b76ac7fca91a195
Reviewed-on: https://go-review.googlesource.com/113999
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-05-23 01:45:00 +00:00
Austin Clements
132900982c cmd/compile: ignore g register in liveness analysis
In rare circumstances that we don't yet fully understand, the g
register can be spilled to the stack and then reloaded. If this
happens, liveness analysis sees a pointer load into a
non-general-purpose register and panics.

We should fix the root cause of this, but fix the build for now by
ignoring pointer loads into the g register.

For #25504.

Change-Id: I0dfee1af9750c8e9157c7637280cdf07118ef2ca
Reviewed-on: https://go-review.googlesource.com/114081
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-05-22 21:49:36 +00:00
David Chase
cfbf375a81 cmd/compile: common up code in fuse for joining blocks
There's semantically-but-not-literally equivalent code in
two cases for joining blocks' value lists in ssa/fuse.go.
It can be made literally equivalent, then commoned up.

Updates #25426.

Change-Id: Id1819366c9d22e5126f9203dcd4c622423994110
Reviewed-on: https://go-review.googlesource.com/113719
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2018-05-22 21:46:36 +00:00
Austin Clements
f11e9aac4f cmd/compile: reuse liveness structures
Currently liveness analysis is a significant source of allocations in
the compiler. This CL mitigates this by moving the main sources of
allocation to the ssa.Cache, allowing them to be reused between
different liveness runs.

Passes toolstash -cmp.

name        old time/op       new time/op       delta
Template          194ms ± 1%        193ms ± 1%    ~     (p=0.156 n=10+9)
Unicode          99.1ms ± 1%       99.3ms ± 2%    ~     (p=0.853 n=10+10)
GoTypes           689ms ± 0%        687ms ± 0%  -0.27% 	(p=0.022 n=10+9)
Compiler          3.29s ± 1%        3.30s ± 1%    ~ 	(p=0.489 n=9+9)
SSA               8.02s ± 2%        7.97s ± 1%  -0.71%  (p=0.011 n=10+10)
Flate             131ms ± 1%        130ms ± 1%  -0.59%  (p=0.043 n=9+10)
GoParser          162ms ± 1%        160ms ± 1%  -1.53%  (p=0.000 n=10+10)
Reflect           454ms ± 0%        454ms ± 0%    ~    	(p=0.959 n=8+8)
Tar               185ms ± 1%        185ms ± 2%    ~ 	(p=0.905 n=9+10)
XML               235ms ± 1%        232ms ± 1%  -1.15% 	(p=0.001 n=9+10)
[Geo mean]        414ms             412ms       -0.39%

name        old alloc/op      new alloc/op      delta
Template         35.6MB ± 0%       34.2MB ± 0%  -3.75%  (p=0.000 n=10+10)
Unicode          29.5MB ± 0%       29.4MB ± 0%  -0.26%  (p=0.000 n=10+9)
GoTypes           117MB ± 0%        112MB ± 0%  -3.78%  (p=0.000 n=9+10)
Compiler          532MB ± 0%        512MB ± 0%  -3.80%  (p=0.000 n=10+10)
SSA              1.55GB ± 0%       1.48GB ± 0%  -4.82%  (p=0.000 n=10+10)
Flate            24.5MB ± 0%       23.6MB ± 0%  -3.61%  (p=0.000 n=10+9)
GoParser         28.7MB ± 0%       27.7MB ± 0%  -3.43%  (p=0.000 n=10+10)
Reflect          80.5MB ± 0%       78.1MB ± 0%  -2.96%  (p=0.000 n=10+10)
Tar              35.1MB ± 0%       33.9MB ± 0%  -3.49%  (p=0.000 n=10+10)
XML              43.7MB ± 0%       42.4MB ± 0%  -3.05%  (p=0.000 n=10+10)
[Geo mean]       78.4MB            75.8MB       -3.30%

name        old allocs/op     new allocs/op     delta
Template           335k ± 0%         335k ± 0%  -0.12%  (p=0.000 n=10+10)
Unicode            339k ± 0%         339k ± 0%  -0.01%  (p=0.001 n=10+10)
GoTypes           1.18M ± 0%        1.17M ± 0%  -0.12%  (p=0.000 n=10+10)
Compiler          4.94M ± 0%        4.94M ± 0%  -0.06%  (p=0.000 n=10+10)
SSA               12.5M ± 0%        12.5M ± 0%  -0.07%  (p=0.000 n=10+10)
Flate              223k ± 0%         223k ± 0%  -0.11%  (p=0.000 n=10+10)
GoParser           281k ± 0%         281k ± 0%  -0.08%  (p=0.000 n=10+10)
Reflect            963k ± 0%         960k ± 0%  -0.23%  (p=0.000 n=10+9)
Tar                330k ± 0%         330k ± 0%  -0.12%  (p=0.000 n=10+10)
XML                392k ± 0%         392k ± 0%  -0.08%  (p=0.000 n=10+10)
[Geo mean]         761k              760k       -0.10%

Compared to just before "cmd/internal/obj: consolidate emitting entry
stack map", the cumulative effect of adding stack maps everywhere and
register maps, plus these optimizations, is:

name        old time/op       new time/op       delta
Template          186ms ± 1%        194ms ± 1%  +4.41%  (p=0.000 n=9+10)
Unicode          96.5ms ± 1%       99.1ms ± 1%  +2.76%  (p=0.000 n=9+10)
GoTypes           659ms ± 1%        689ms ± 0%  +4.56%  (p=0.000 n=9+10)
Compiler          3.14s ± 2%        3.29s ± 1%  +4.95%  (p=0.000 n=9+9)
SSA               7.68s ± 3%        8.02s ± 2%  +4.41%  (p=0.000 n=10+10)
Flate             126ms ± 0%        131ms ± 1%  +4.14%  (p=0.000 n=10+9)
GoParser          153ms ± 1%        162ms ± 1%  +5.90%  (p=0.000 n=10+10)
Reflect           436ms ± 1%        454ms ± 0%  +4.14%  (p=0.000 n=10+8)
Tar               177ms ± 1%        185ms ± 1%  +4.28%  (p=0.000 n=8+9)
XML               224ms ± 1%        235ms ± 1%  +5.23%  (p=0.000 n=10+9)
[Geo mean]        396ms             414ms       +4.47%

name        old alloc/op      new alloc/op      delta
Template         34.5MB ± 0%       35.6MB ± 0%  +3.24%  (p=0.000 n=10+10)
Unicode          29.3MB ± 0%       29.5MB ± 0%  +0.51%  (p=0.000 n=9+10)
GoTypes           113MB ± 0%        117MB ± 0%  +3.31%  (p=0.000 n=8+9)
Compiler          509MB ± 0%        532MB ± 0%  +4.46%  (p=0.000 n=10+10)
SSA              1.49GB ± 0%       1.55GB ± 0%  +4.10%  (p=0.000 n=10+10)
Flate            23.8MB ± 0%       24.5MB ± 0%  +2.92%  (p=0.000 n=10+10)
GoParser         27.9MB ± 0%       28.7MB ± 0%  +2.88%  (p=0.000 n=10+10)
Reflect          77.4MB ± 0%       80.5MB ± 0%  +4.01%  (p=0.000 n=10+10)
Tar              34.1MB ± 0%       35.1MB ± 0%  +3.12%  (p=0.000 n=10+10)
XML              42.6MB ± 0%       43.7MB ± 0%  +2.65%  (p=0.000 n=10+10)
[Geo mean]       76.1MB            78.4MB       +3.11%

name        old allocs/op     new allocs/op     delta
Template           320k ± 0%         335k ± 0%  +4.60%  (p=0.000 n=10+10)
Unicode            336k ± 0%         339k ± 0%  +0.96%  (p=0.000 n=9+10)
GoTypes           1.12M ± 0%        1.18M ± 0%  +4.55%  (p=0.000 n=10+10)
Compiler          4.66M ± 0%        4.94M ± 0%  +6.18%  (p=0.000 n=10+10)
SSA               11.9M ± 0%        12.5M ± 0%  +5.37%  (p=0.000 n=10+10)
Flate              214k ± 0%         223k ± 0%  +4.15%  (p=0.000 n=9+10)
GoParser           270k ± 0%         281k ± 0%  +4.15%  (p=0.000 n=10+10)
Reflect            921k ± 0%         963k ± 0%  +4.49%  (p=0.000 n=10+10)
Tar                317k ± 0%         330k ± 0%  +4.25%  (p=0.000 n=10+10)
XML                375k ± 0%         392k ± 0%  +4.75%  (p=0.000 n=10+10)
[Geo mean]         729k              761k       +4.34%

Updates #24543.

Change-Id: Ia951fdb3c17ae1c156e1d05fc42e69caba33c91a
Reviewed-on: https://go-review.googlesource.com/110179
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2018-05-22 20:44:06 +00:00
Austin Clements
75dadbec1e cmd/compile: make LivenessMap dense
Currently liveness information is kept in a map keyed by *ssa.Value.
This made sense when liveness information was sparse, but now we have
liveness for nearly every ssa.Value. There's a fair amount of memory
and CPU overhead to this map now.

This CL replaces this map with a slice indexed by value ID.

Passes toolstash -cmp.

name        old time/op       new time/op       delta
Template          197ms ± 1%        194ms ± 1%  -1.60%  (p=0.000 n=9+10)
Unicode           100ms ± 2%         99ms ± 1%  -1.31%  (p=0.012 n=8+10)
GoTypes           695ms ± 1%        689ms ± 0%  -0.94%  (p=0.000 n=10+10)
Compiler          3.34s ± 2%        3.29s ± 1%  -1.26%  (p=0.000 n=10+9)
SSA               8.08s ± 0%        8.02s ± 2%  -0.70%  (p=0.034 n=8+10)
Flate             133ms ± 1%        131ms ± 1%  -1.04%  (p=0.006 n=10+9)
GoParser          163ms ± 1%        162ms ± 1%  -0.79%  (p=0.034 n=8+10)
Reflect           459ms ± 1%        454ms ± 0%  -1.06%  (p=0.000 n=10+8)
Tar               186ms ± 1%        185ms ± 1%  -0.87%  (p=0.003 n=9+9)
XML               238ms ± 1%        235ms ± 1%  -1.01%  (p=0.004 n=8+9)
[Geo mean]        418ms             414ms       -1.06%

name        old alloc/op      new alloc/op      delta
Template         36.4MB ± 0%       35.6MB ± 0%  -2.29%  (p=0.000 n=9+10)
Unicode          29.7MB ± 0%       29.5MB ± 0%  -0.68%  (p=0.000 n=10+10)
GoTypes           119MB ± 0%        117MB ± 0%  -2.30%  (p=0.000 n=9+9)
Compiler          546MB ± 0%        532MB ± 0%  -2.47%  (p=0.000 n=10+10)
SSA              1.59GB ± 0%       1.55GB ± 0%  -2.41%  (p=0.000 n=10+10)
Flate            24.9MB ± 0%       24.5MB ± 0%  -1.77%  (p=0.000 n=8+10)
GoParser         29.5MB ± 0%       28.7MB ± 0%  -2.60%  (p=0.000 n=9+10)
Reflect          81.7MB ± 0%       80.5MB ± 0%  -1.49%  (p=0.000 n=10+10)
Tar              35.7MB ± 0%       35.1MB ± 0%  -1.64%  (p=0.000 n=10+10)
XML              45.0MB ± 0%       43.7MB ± 0%  -2.76%  (p=0.000 n=9+10)
[Geo mean]       80.1MB            78.4MB       -2.04%

name        old allocs/op     new allocs/op     delta
Template           336k ± 0%         335k ± 0%  -0.31%  (p=0.000 n=9+10)
Unicode            339k ± 0%         339k ± 0%  -0.05%  (p=0.000 n=10+10)
GoTypes           1.18M ± 0%        1.18M ± 0%  -0.26%  (p=0.000 n=10+10)
Compiler          4.96M ± 0%        4.94M ± 0%  -0.24%  (p=0.000 n=10+10)
SSA               12.6M ± 0%        12.5M ± 0%  -0.30%  (p=0.000 n=10+10)
Flate              224k ± 0%         223k ± 0%  -0.30%  (p=0.000 n=10+10)
GoParser           282k ± 0%         281k ± 0%  -0.32%  (p=0.000 n=10+10)
Reflect            965k ± 0%         963k ± 0%  -0.27%  (p=0.000 n=9+10)
Tar                331k ± 0%         330k ± 0%  -0.27%  (p=0.000 n=10+10)
XML                393k ± 0%         392k ± 0%  -0.26%  (p=0.000 n=10+10)
[Geo mean]         763k              761k       -0.26%

Updates #24543.

Change-Id: I4cfd2461510d3c026a262760bca225dc37482341
Reviewed-on: https://go-review.googlesource.com/110178
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-05-22 20:44:05 +00:00
Austin Clements
3c36b8be66 cmd/compile: incrementally compact liveness maps
The per-Value slice of liveness maps is currently one of the largest
sources of allocation in the compiler. On cmd/compile/internal/ssa,
it's 5% of overall allocation, or 75MB in total. Enabling liveness
maps everywhere significantly increased this allocation footprint,
which in turn slowed down the compiler.

Improve this by compacting the liveness maps after every block is
processed. There are typically very few distinct liveness maps, so
compacting the maps after every block, rather than at the end of the
function, can significantly reduce these allocations.

Passes toolstash -cmp.

name        old time/op       new time/op       delta
Template          198ms ± 2%        196ms ± 1%  -1.11%  (p=0.008 n=9+10)
Unicode           100ms ± 1%         99ms ± 1%  -0.94%  (p=0.015 n=8+9)
GoTypes           703ms ± 2%        695ms ± 1%  -1.15%  (p=0.000 n=10+10)
Compiler          3.38s ± 3%        3.33s ± 0%  -1.66%  (p=0.000 n=10+9)
SSA               7.96s ± 1%        7.93s ± 1%    ~ 	(p=0.113 n=9+10)
Flate             134ms ± 1%        132ms ± 1%  -1.30%  (p=0.000 n=8+10)
GoParser          165ms ± 2%        163ms ± 1%  -1.32%  (p=0.013 n=9+10)
Reflect           462ms ± 2%        459ms ± 0%  -0.65%  (p=0.036 n=9+8)
Tar               188ms ± 2%        186ms ± 1%    ~     (p=0.173 n=8+10)
XML               243ms ± 7%        239ms ± 1%    ~     (p=0.684 n=10+10)
[Geo mean]        421ms             416ms       -1.10%

name        old alloc/op      new alloc/op      delta
Template         38.0MB ± 0%       36.5MB ± 0%  -3.98%  (p=0.000 n=10+10)
Unicode          30.3MB ± 0%       29.6MB ± 0%  -2.21% 	(p=0.000 n=10+10)
GoTypes           125MB ± 0%        120MB ± 0%  -4.51% 	(p=0.000 n=10+9)
Compiler          575MB ± 0%        546MB ± 0%  -5.06% 	(p=0.000 n=10+10)
SSA              1.64GB ± 0%       1.55GB ± 0%  -4.97% 	(p=0.000 n=10+10)
Flate            25.9MB ± 0%       25.0MB ± 0%  -3.41% 	(p=0.000 n=10+10)
GoParser         30.7MB ± 0%       29.5MB ± 0%  -3.97% 	(p=0.000 n=10+10)
Reflect          84.1MB ± 0%       81.9MB ± 0%  -2.64% 	(p=0.000 n=10+10)
Tar              37.0MB ± 0%       35.8MB ± 0%  -3.27% 	(p=0.000 n=10+9)
XML              47.2MB ± 0%       45.0MB ± 0%  -4.57% 	(p=0.000 n=10+10)
[Geo mean]       83.2MB            79.9MB       -3.86%

name        old allocs/op     new allocs/op     delta
Template           337k ± 0%         337k ± 0%  -0.06%  (p=0.000 n=10+10)
Unicode            340k ± 0%         340k ± 0%  -0.01% 	(p=0.014 n=10+10)
GoTypes           1.18M ± 0%        1.18M ± 0%  -0.04% 	(p=0.000 n=10+10)
Compiler          4.97M ± 0%        4.97M ± 0%  -0.03% 	(p=0.000 n=10+10)
SSA               12.3M ± 0%        12.3M ± 0%  -0.01% 	(p=0.000 n=10+10)
Flate              226k ± 0%         225k ± 0%  -0.09% 	(p=0.000 n=10+10)
GoParser           283k ± 0%         283k ± 0%  -0.06% 	(p=0.000 n=10+9)
Reflect            972k ± 0%         971k ± 0%  -0.04% 	(p=0.000 n=10+8)
Tar                333k ± 0%         332k ± 0%  -0.05% 	(p=0.000 n=10+9)
XML                395k ± 0%         395k ± 0%  -0.04% 	(p=0.000 n=10+10)
[Geo mean]         764k              764k       -0.04%

Updates #24543.

Change-Id: I6fdc46e4ddb6a8eea95d38242345205eb8397f0b
Reviewed-on: https://go-review.googlesource.com/110177
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-05-22 20:44:03 +00:00
Austin Clements
3c4aaf8a32 cmd/compile: abstract bvec sets
This moves the bvec hash table logic out of Liveness.compact and into
a bvecSet type. Furthermore, the bvecSet type has the ability to grow
dynamically, which the current implementation doesn't. In addition to
making the code cleaner, this will make it possible to incrementally
compact liveness bitmaps.

Passes toolstash -cmp

Updates #24543.

Change-Id: I46c53e504494206061a1f790ae4a02d768a65681
Reviewed-on: https://go-review.googlesource.com/110176
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-05-22 20:44:02 +00:00
Austin Clements
577c05ca1c cmd/compile: single pass over Blocks in Liveness.epilogue
Currently Liveness.epilogue makes three passes over the Blocks, but
there's no need to do this. Combine them into a single pass. This
eliminates the need for blockEffects.lastbitmapindex, but, more
importantly, will let us incrementally compact the liveness bitmaps
and significantly reduce allocatons in Liveness.epilogue.

Passes toolstash -cmp.

Updates #24543.

Change-Id: I27802bcd00d23aa122a7ec16cdfd739ae12dd7aa
Reviewed-on: https://go-review.googlesource.com/110175
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2018-05-22 20:44:01 +00:00
Hana Kim
51be90a251 cmd/vendor/README: temporary instruction for update
Until vgo sorts out and cleans up the vendoring process.

Ran govendor to update packages the cmd/pprof depends on
which resulted in deletion of some of unnecessary files.

Change-Id: Idfba53e94414e90a5e280222750a6df77e979a16
Reviewed-on: https://go-review.googlesource.com/114079
Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com>
Reviewed-by: Daniel Theophanes <kardianos@gmail.com>
2018-05-22 20:38:19 +00:00
David Chase
402dd10a99 cmd/link: revert DWARF version to 2 for .debug_lines
On OSX 10.12 and earlier, paired with XCode 9.0,
specifying DWARF version 3 causes dsymutil to misbehave.
Version 2 appears to be good enough to allow processing
of the prologue_end opcode on (at least one version of)
Linux and OSX 10.13.

Fixes #25451.

Change-Id: Ic760e34248393a5386be96351c8e492da1d3413b
Reviewed-on: https://go-review.googlesource.com/114015
Reviewed-by: Alessandro Arzilli <alessandro.arzilli@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-05-22 20:27:52 +00:00
Martin Möhrmann
685ecc7f02 internal/cpu: fix test build on ppc64
The runtime import is unused.

Change-Id: I37fe210256ddafa579d9e6d64f3f0db78581974e
Reviewed-on: https://go-review.googlesource.com/114175
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
Reviewed-by: Austin Clements <austin@google.com>
2018-05-22 20:14:05 +00:00
Austin Clements
e391faded7 runtime: fix defer matching of leaf functions on LR machines
Traceback matches the defer stack with the function call stack using
the SP recorded in defer frames when the defer frame is created.
However, on LR machines this is ambiguous: if function A pushes a
defer and then calls function B, where B is a leaf function with a
zero-sized frame, then both A and B have the same SP and will *both*
match the defer on the defer stack. Since traceback unwinds through B
first, it will incorrectly match up the defer with B's frame instead
of A's frame.

Where this goes particularly wrong is if function B causes a signal
that turns into a panic (e.g., a nil pointer dereference). In order to
handle the fact that we may not have a liveness map at the location
that caused the signal and injected a sigpanic call, traceback has
logic to unwind the panicking frame's continuation PC to the PC where
the most recent defer was pushed (this is safe because the frame is
dead other than any defers it pushed). However, if traceback
mis-matches the defer stack, it winds up reporting the B's
continuation PC is in A. If the runtime then uses this continuation PC
to look up PCDATA in B, it will panic because the PC is out of range
for B. This failure mode can be seen in
sync/atomic/atomic_test.go:TestNilDeref. An example failure is:
https://build.golang.org/log/8e07a762487839252af902355f6b1379dbd463c5

This CL fixes all of this by recognizing that a function that pushes a
defer must also have a non-zero-sized frame and using this fact to
refine the defer matching logic.

Fixes the build for arm64, mips, mipsle, ppc64, ppc64le, and s390x.

Fixes #25499.

Change-Id: Iff7c01d08ad42f3de22b3a73658cc2f674900101
Reviewed-on: https://go-review.googlesource.com/114078
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-05-22 19:38:03 +00:00
Martin Möhrmann
f045ddc624 internal/cpu: add experiment to disable CPU features with GODEBUGCPU
Needs the go compiler to be build with GOEXPERIMENT=debugcpu to be active.

The GODEBUGCPU environment variable can be used to disable usage of
specific processor features in the Go standard library.
This is useful for testing and benchmarking different code paths that
are guarded by internal/cpu variable checks.

Use of processor features can not be enabled through GODEBUGCPU.

To disable usage of AVX and SSE41 cpu features on GOARCH amd64 use:
GODEBUGCPU=avx=0,sse41=0

The special "all" option can be used to disable all options:
GODEBUGCPU=all=0

Updates #12805
Updates #15403

Change-Id: I699c2e6f74d98472b6fb4b1e5ffbf29b15697aab
Reviewed-on: https://go-review.googlesource.com/91737
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-05-22 18:49:31 +00:00
Zhongpeng Lin
3d15f76881 go/build: call ctxt.match for checking file name constraints
This makes the checking of build tags in file names consistent to that of the build tags in `// +build` line.

Fixed #25461

Change-Id: Iba14d1050f8aba44e7539ab3b8711af1980ccfe4
GitHub-Last-Rev: 11b14e239d
GitHub-Pull-Request: golang/go#25480
Reviewed-on: https://go-review.googlesource.com/113818
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-05-22 18:47:03 +00:00
Martin Sucha
a10d390676 crypto/x509: document fields used in CreateCertificate
The added fields are used in buildExtensions so
should be documented too.

Fixes #21363

Change-Id: Ifcc11da5b690327946c2488bcf4c79c60175a339
Reviewed-on: https://go-review.googlesource.com/113916
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-05-22 18:29:47 +00:00
Martin Sucha
5c8f65b9ca crypto/x509: reformat template members in docs
It's easier to skim a list of items visually when the
items are each on a separate line. Separate lines also
help reduce diff size when items are added/removed.

The list is indented so that it's displayed preformatted
in HTML output as godoc doesn't support formatting lists
natively yet (see #7873).

Change-Id: Ibf9e92437e4b464ba58ea3ccef579e8df4745d75
Reviewed-on: https://go-review.googlesource.com/113915
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-05-22 18:29:16 +00:00
Alberto Donizetti
37c11dc0aa log/syslog: skip tests that depend on daemon on builders
Some functions in log/syslog depend on syslogd running. Instead of
treating errors caused by the daemon not running as test failures,
ignore them and skip the test.

Fixes the longtest builder.

Change-Id: I628fe4aab5f1a505edfc0748861bb976ed5917ea
Reviewed-on: https://go-review.googlesource.com/113838
Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-05-22 18:26:03 +00:00
dchenk
faa69b906f encoding/base32: remove redundant conditional
Immediately following the conditional block removed here is a loop
which checks exactly what the conditional already checked, so the
entire conditional is redundant.

Change-Id: I892fd9f2364d87e2c1cacb0407531daec6643183
Reviewed-on: https://go-review.googlesource.com/114000
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-05-22 17:12:05 +00:00
Keith Randall
31ef3846a7 cmd/compile: add rulegen diagnostic
When rulegen complains about a missing type, report the line number
in the rules file.

Change-Id: Ic7c19e1d5f29547911909df5788945848a6080ff
Reviewed-on: https://go-review.googlesource.com/114004
Reviewed-by: David Chase <drchase@google.com>
2018-05-22 16:40:34 +00:00
Austin Clements
c5ed10f3be runtime: support for debugger function calls
This adds a mechanism for debuggers to safely inject calls to Go
functions on amd64. Debuggers must participate in a protocol with the
runtime, and need to know how to lay out a call frame, but the runtime
support takes care of the details of handling live pointers in
registers, stack growth, and detecting the trickier conditions when it
is unsafe to inject a user function call.

Fixes #21678.
Updates derekparker/delve#119.

Change-Id: I56d8ca67700f1f77e19d89e7fc92ab337b228834
Reviewed-on: https://go-review.googlesource.com/109699
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-05-22 15:55:05 +00:00
Austin Clements
9f95c9db23 cmd/compile, cmd/internal/obj: record register maps in binary
This adds FUNCDATA and PCDATA that records the register maps much like
the existing live arguments maps and live locals maps. The register
map is indexed independently from the argument and locals maps since
changes in register liveness tend not to correlate with changes to
argument and local liveness.

This is the final CL toward adding safe-points everywhere. The
following CLs will optimize liveness analysis to bring down the cost.
The effect of this CL is:

name        old time/op       new time/op       delta
Template          195ms ± 2%        197ms ± 1%    ~     (p=0.136 n=9+9)
Unicode          98.4ms ± 2%       99.7ms ± 1%  +1.39%  (p=0.004 n=10+10)
GoTypes           685ms ± 1%        700ms ± 1%  +2.06%  (p=0.000 n=9+9)
Compiler          3.28s ± 1%        3.34s ± 0%  +1.71%  (p=0.000 n=9+8)
SSA               7.79s ± 1%        7.91s ± 1%  +1.55%  (p=0.000 n=10+9)
Flate             133ms ± 2%        133ms ± 2%    ~     (p=0.190 n=10+10)
GoParser          161ms ± 2%        164ms ± 3%  +1.83%  (p=0.015 n=10+10)
Reflect           450ms ± 1%        457ms ± 1%  +1.62%  (p=0.000 n=10+10)
Tar               183ms ± 2%        185ms ± 1%  +0.91%  (p=0.008 n=9+10)
XML               234ms ± 1%        238ms ± 1%  +1.60%  (p=0.000 n=9+9)
[Geo mean]        411ms             417ms       +1.40%

name        old exe-bytes     new exe-bytes     delta
HelloSize         1.47M ± 0%        1.51M ± 0%  +2.79%  (p=0.000 n=10+10)

Compared to just before "cmd/internal/obj: consolidate emitting entry
stack map", the cumulative effect of adding stack maps everywhere and
register maps is:

name        old time/op       new time/op       delta
Template          185ms ± 2%        197ms ± 1%   +6.42%  (p=0.000 n=10+9)
Unicode          96.3ms ± 3%       99.7ms ± 1%   +3.60%  (p=0.000 n=10+10)
GoTypes           658ms ± 0%        700ms ± 1%   +6.37%  (p=0.000 n=10+9)
Compiler          3.14s ± 1%        3.34s ± 0%   +6.53%  (p=0.000 n=9+8)
SSA               7.41s ± 2%        7.91s ± 1%   +6.71%  (p=0.000 n=9+9)
Flate             126ms ± 1%        133ms ± 2%   +6.15%  (p=0.000 n=10+10)
GoParser          153ms ± 1%        164ms ± 3%   +6.89%  (p=0.000 n=10+10)
Reflect           437ms ± 1%        457ms ± 1%   +4.59%  (p=0.000 n=10+10)
Tar               178ms ± 1%        185ms ± 1%   +4.18%  (p=0.000 n=10+10)
XML               223ms ± 1%        238ms ± 1%   +6.39%  (p=0.000 n=10+9)
[Geo mean]        394ms             417ms        +5.78%

name        old alloc/op      new alloc/op      delta
Template         34.5MB ± 0%       38.0MB ± 0%  +10.19%  (p=0.000 n=10+10)
Unicode          29.3MB ± 0%       30.3MB ± 0%   +3.56%  (p=0.000 n=8+9)
GoTypes           113MB ± 0%        125MB ± 0%  +10.89%  (p=0.000 n=10+10)
Compiler          510MB ± 0%        575MB ± 0%  +12.79%  (p=0.000 n=10+10)
SSA              1.46GB ± 0%       1.64GB ± 0%  +12.40%  (p=0.000 n=10+10)
Flate            23.9MB ± 0%       25.9MB ± 0%   +8.56%  (p=0.000 n=10+10)
GoParser         28.0MB ± 0%       30.8MB ± 0%  +10.08%  (p=0.000 n=10+10)
Reflect          77.6MB ± 0%       84.3MB ± 0%   +8.63%  (p=0.000 n=10+10)
Tar              34.1MB ± 0%       37.0MB ± 0%   +8.44%  (p=0.000 n=10+10)
XML              42.7MB ± 0%       47.2MB ± 0%  +10.75%  (p=0.000 n=10+10)
[Geo mean]       76.0MB            83.3MB        +9.60%

name        old allocs/op     new allocs/op     delta
Template           321k ± 0%         337k ± 0%   +4.98%  (p=0.000 n=10+10)
Unicode            337k ± 0%         340k ± 0%   +1.04%  (p=0.000 n=10+9)
GoTypes           1.13M ± 0%        1.18M ± 0%   +4.85%  (p=0.000 n=10+10)
Compiler          4.67M ± 0%        4.96M ± 0%   +6.25%  (p=0.000 n=10+10)
SSA               11.7M ± 0%        12.3M ± 0%   +5.69%  (p=0.000 n=10+10)
Flate              216k ± 0%         226k ± 0%   +4.52%  (p=0.000 n=10+9)
GoParser           271k ± 0%         283k ± 0%   +4.52%  (p=0.000 n=10+10)
Reflect            927k ± 0%         972k ± 0%   +4.78%  (p=0.000 n=10+10)
Tar                318k ± 0%         333k ± 0%   +4.56%  (p=0.000 n=10+10)
XML                376k ± 0%         395k ± 0%   +5.04%  (p=0.000 n=10+10)
[Geo mean]         730k              764k        +4.61%

name        old exe-bytes     new exe-bytes     delta
HelloSize         1.46M ± 0%        1.51M ± 0%   +3.66%  (p=0.000 n=10+10)

For #24543.

Change-Id: I91e003dc64151916b384274884bf02a2d6862547
Reviewed-on: https://go-review.googlesource.com/109353
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-05-22 15:55:03 +00:00
Austin Clements
61158c162f cmd/compile: compute register liveness maps
This extends the liveness analysis to track registers containing live
pointers. We do this by tracking bitmaps for live pointer registers
in parallel with bitmaps for stack variables.

This does not yet do anything with these liveness maps, though they do
appear in the debug output for -live=2.

We'll optimize this in later CLs:

name        old time/op       new time/op       delta
Template          193ms ± 5%        195ms ± 2%    ~     (p=0.050 n=9+9)
Unicode          97.7ms ± 2%       98.4ms ± 2%    ~     (p=0.315 n=9+10)
GoTypes           674ms ± 2%        685ms ± 1%  +1.72%  (p=0.001 n=9+9)
Compiler          3.21s ± 1%        3.28s ± 1%  +2.28%  (p=0.000 n=10+9)
SSA               7.70s ± 1%        7.79s ± 1%  +1.07%  (p=0.015 n=10+10)
Flate             130ms ± 3%        133ms ± 2%  +2.19%  (p=0.003 n=10+10)
GoParser          159ms ± 3%        161ms ± 2%  +1.51%  (p=0.019 n=10+10)
Reflect           444ms ± 1%        450ms ± 1%  +1.43%  (p=0.000 n=9+10)
Tar               181ms ± 2%        183ms ± 2%  +1.45%  (p=0.010 n=10+9)
XML               230ms ± 1%        234ms ± 1%  +1.56%  (p=0.000 n=8+9)
[Geo mean]        405ms             411ms       +1.48%

No effect on binary size because we're not yet emitting the register
maps.

For #24543.

Change-Id: Ieb022f0aea89c0ea9a6f035195bce2f0e67dbae4
Reviewed-on: https://go-review.googlesource.com/109352
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-05-22 15:55:02 +00:00
Austin Clements
840f25bedd cmd/compile: dense numbering for GP registers
For register maps, we need a dense numbering of registers that may
contain pointers of interest to the garbage collector. Add this to
Register and compute it from the GP register set.

For #24543.

Change-Id: If6f0521effca5eca4d17895468b1fc52d67e0f32
Reviewed-on: https://go-review.googlesource.com/109351
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-05-22 15:55:01 +00:00
Austin Clements
4f765b18f8 cmd/compile: fix ARM64 build
Write barrier unsafe-point analysis needs to flow through
OpARM64MOVWUload in c-shared mode.

Change-Id: I4f06f54d9e74a739a1b4fcb9ab0a1ae9b7b88a95
Reviewed-on: https://go-review.googlesource.com/114077
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: David Chase <drchase@google.com>
2018-05-22 15:38:30 +00:00
Austin Clements
97bea97065 cmd/compile: fix unsafe-point analysis with -N
Compiling without optimizations (-N) can result in write barrier
blocks that have been optimized away but not actually pruned from the
block set. Fix unsafe-point analysis to recognize and ignore these.

For #24543.

Change-Id: I2ca86fb1a0346214ec71d7d6c17b6a121857b01d
Reviewed-on: https://go-review.googlesource.com/114076
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-05-22 15:26:32 +00:00
isharipo
5437cde96c cmd/asm: enable AVX512
- Uncomment tests for AVX512 encoder
- Permit instruction suffixes for x86
- Permit limited reg list [reg-reg] syntax for x86 for multi-source ops
- EVEX encoding support in obj/x86 (Z-cases, asmevex, etc.)
- optabs and ytabs generated by x86avxgen (https://golang.org/cl/107216)

Note: suffix formatting implemented with updated CConv function.
Now arch asm backend should register formatting function by
calling RegisterOpSuffix.

Updates #22779

Change-Id: I076a167ee49582700e058c56ad74e6696710c8c8
Reviewed-on: https://go-review.googlesource.com/113315
Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-05-22 14:57:15 +00:00
Ben Shi
8a85bce215 test/codegen: improve test cases for arm64
1. Some incorrect test cases are disabled.
2. Some wrong test cases are corrected.
3. Some new test cases are added.

Change-Id: Ib5d0473d55159f233ddab79f96967eaec7b08597
Reviewed-on: https://go-review.googlesource.com/113736
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-05-22 14:50:41 +00:00
Austin Clements
ed7a0682e2 cmd/compile: output stack map index everywhere it changes
Currently, the code generator only considers outputting stack map
indexes at CALL instructions. Raise this into the code generator loop
itself so that changes in the stack map index at any instruction emit
a PCDATA Prog before the actual instruction.

We'll optimize this in later CLs:

name        old time/op       new time/op       delta
Template          190ms ± 2%        191ms ± 2%    ~     (p=0.529 n=10+10)
Unicode          96.4ms ± 1%       98.5ms ± 3%  +2.18%  (p=0.001 n=9+10)
GoTypes           669ms ± 1%        673ms ± 1%  +0.62%  (p=0.004 n=9+9)
Compiler          3.18s ± 1%        3.22s ± 1%  +1.06%  (p=0.000 n=10+9)
SSA               7.59s ± 1%        7.64s ± 1%  +0.66%  (p=0.023 n=10+10)
Flate             128ms ± 1%        130ms ± 2%  +1.07%  (p=0.043 n=10+10)
GoParser          157ms ± 2%        158ms ± 3%    ~     (p=0.123 n=10+10)
Reflect           442ms ± 1%        445ms ± 1%  +0.73%  (p=0.017 n=10+9)
Tar               179ms ± 1%        180ms ± 1%  +0.58%  (p=0.019 n=9+9)
XML               229ms ± 1%        232ms ± 2%  +1.27%  (p=0.009 n=10+10)
[Geo mean]        401ms             405ms       +0.94%

name        old exe-bytes     new exe-bytes     delta
HelloSize         1.46M ± 0%        1.47M ± 0%  +0.84%  (p=0.000 n=10+10)
[Geo mean]        1.46M             1.47M       +0.84%

For #24543.

Change-Id: I4bfe45b767c9d9db47308a27763b303fa75bfa54
Reviewed-on: https://go-review.googlesource.com/109350
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-05-22 14:43:38 +00:00
Austin Clements
a367f44c18 cmd/compile: enable stack maps everywhere except unsafe points
This modifies issafepoint in liveness analysis to report almost every
operation as a safe point. There are four things we don't mark as
safe-points:

1. Runtime code (other than at calls).

2. go:nosplit functions (other than at calls).

3. Instructions between the load of the write barrier-enabled flag and
   the write.

4. Instructions leading up to a uintptr -> unsafe.Pointer conversion.

We'll optimize this in later CLs:

name        old time/op       new time/op       delta
Template          185ms ± 2%        190ms ± 2%   +2.95%  (p=0.000 n=10+10)
Unicode          96.3ms ± 3%       96.4ms ± 1%     ~     (p=0.905 n=10+9)
GoTypes           658ms ± 0%        669ms ± 1%   +1.72%  (p=0.000 n=10+9)
Compiler          3.14s ± 1%        3.18s ± 1%   +1.56%  (p=0.000 n=9+10)
SSA               7.41s ± 2%        7.59s ± 1%   +2.48%  (p=0.000 n=9+10)
Flate             126ms ± 1%        128ms ± 1%   +2.08%  (p=0.000 n=10+10)
GoParser          153ms ± 1%        157ms ± 2%   +2.38%  (p=0.000 n=10+10)
Reflect           437ms ± 1%        442ms ± 1%   +0.98%  (p=0.001 n=10+10)
Tar               178ms ± 1%        179ms ± 1%   +0.67%  (p=0.035 n=10+9)
XML               223ms ± 1%        229ms ± 1%   +2.58%  (p=0.000 n=10+10)
[Geo mean]        394ms             401ms        +1.75%

No effect on binary size because we're not yet emitting these extra
safe points.

For #24543.

Change-Id: I16a1eebb9183cad7cef9d53c0fd21a973cad6859
Reviewed-on: https://go-review.googlesource.com/109348
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2018-05-22 14:43:37 +00:00
Austin Clements
d7d9df8a5b cmd/compile: introduce LivenessMap and LivenessIndex
Currently liveness only produces a stack map index at each safe point,
so the information is summarized in a map[*ssa.Value]int. We're about
to have both a stack map index and a register map index, so replace
the int with a LivenessIndex type we can extend, and replace the map
with a LivenessMap that we can also change more easily in the future.

This also gives us an easy hook for defining the value that means "not
a safe point".

Passes toolstash -cmp.

For #24543.

Change-Id: Ic4c069839635efed4fd0f603899b80f8be3b56ec
Reviewed-on: https://go-review.googlesource.com/109347
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-05-22 14:43:36 +00:00
Austin Clements
02495da6b6 cmd/internal/obj: consolidate emitting entry stack map
The obj package needs to emit the PCDATA to select the entry stack map
before calling morestack. Currently this is copied for every
architecture. Since we're about to change how this works, consolidate
all of these copies into a single helper function.

For #24543.

Change-Id: Ia92d94de78f8e23fd06dba747c43e03e5989f67b
Reviewed-on: https://go-review.googlesource.com/109346
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-05-22 14:43:35 +00:00