1
0
mirror of https://github.com/golang/go synced 2024-11-05 21:36:12 -07:00
Commit Graph

3445 Commits

Author SHA1 Message Date
Keith Randall
7b916243d9 cmd/compile: rename init function from init.ializers back to init
The name change init -> init.ializers was initially required for
initialization code.

With CL 161337 there's no wrapper code any more, there's a data
structure instead (named .inittask). So we can go back to just
plain init appearing in tracebacks.

RELNOTE=yes

Update #29919. Followon to CL 161337.

Change-Id: I5a4a49d286df24b53b2baa193dfda482f3ea82a5
Reviewed-on: https://go-review.googlesource.com/c/go/+/167780
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2019-03-18 20:41:12 +00:00
Keith Randall
d949d0b925 cmd/compile: reorganize init functions
Instead of writing an init function per package that does the same
thing for every package, just write that implementation once in the
runtime. Change the compiler to generate a data structure that encodes
the required initialization operations.

Reduces cmd/go binary size by 0.3%+.  Most of the init code is gone,
including all the corresponding stack map info. The .inittask
structures that replace them are quite a bit smaller.

Most usefully to me, there is no longer an init function in every -S output.
(There is an .inittask global there, but it's much less distracting.)

After this CL we could change the name of the "init.ializers" function
back to just "init".

Update #6853

R=go1.13

Change-Id: Iec82b205cc52fe3ade4d36406933c97dbc9c01b1
Reviewed-on: https://go-review.googlesource.com/c/go/+/161337
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2019-03-18 20:10:55 +00:00
Matthew Dempsky
916e861fd9 cmd/compile: fix importing rewritten f(g()) calls
golang.org/cl/166983 started serializing the Ninit field of OCALL
nodes within function inline bodies (necessary to fix a regression in
building crypto/ecdsa with -gcflags=-l=4), but this means the Ninit
field needs to be typechecked when the imported function body is used.

It's unclear why this wasn't necessary for the crypto/ecdsa
regression.

Fixes #30907.

Change-Id: Id5f0bf3c4d17bbd6d5318913b859093c93a0a20c
Reviewed-on: https://go-review.googlesource.com/c/go/+/168199
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2019-03-18 19:43:38 +00:00
Keith Randall
2c423f063b cmd/compile,runtime: provide index information on bounds check failure
A few examples (for accessing a slice of length 3):

   s[-1]    runtime error: index out of range [-1]
   s[3]     runtime error: index out of range [3] with length 3
   s[-1:0]  runtime error: slice bounds out of range [-1:]
   s[3:0]   runtime error: slice bounds out of range [3:0]
   s[3:-1]  runtime error: slice bounds out of range [:-1]
   s[3:4]   runtime error: slice bounds out of range [:4] with capacity 3
   s[0:3:4] runtime error: slice bounds out of range [::4] with capacity 3

Note that in cases where there are multiple things wrong with the
indexes (e.g. s[3:-1]), we report one of those errors kind of
arbitrarily, currently the rightmost one.

An exhaustive set of examples is in issue30116[u].out in the CL.

The message text has the same prefix as the old message text. That
leads to slightly awkward phrasing but hopefully minimizes the chance
that code depending on the error text will break.

Increases the size of the go binary by 0.5% (amd64). The panic functions
take arguments in registers in order to keep the size of the compiled code
as small as possible.

Fixes #30116

Change-Id: Idb99a827b7888822ca34c240eca87b7e44a04fdd
Reviewed-on: https://go-review.googlesource.com/c/go/+/161477
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2019-03-18 17:33:38 +00:00
Than McIntosh
64b1889e2d test: new test for issue 30862
New test case, inspired by gccgo issue 30862.

Updates #30862.

Change-Id: I5e494b877e4fd142b8facb527471fe1fdef39c61
Reviewed-on: https://go-review.googlesource.com/c/go/+/167744
Run-TryBot: Than McIntosh <thanm@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2019-03-15 19:05:53 +00:00
Tobias Klauser
156c830bea cmd/compile: eliminate unnecessary type conversions in TrailingZeros(16|8) for arm
This follows CL 156999 which did the same for arm64.

name               old time/op  new time/op  delta
TrailingZeros-4    7.30ns ± 1%  7.30ns ± 0%     ~     (p=0.413 n=9+9)
TrailingZeros8-4   8.32ns ± 0%  7.17ns ± 0%  -13.77%  (p=0.000 n=10+9)
TrailingZeros16-4  8.30ns ± 0%  7.18ns ± 0%  -13.50%  (p=0.000 n=9+10)
TrailingZeros32-4  6.46ns ± 1%  6.47ns ± 1%     ~     (p=0.325 n=10+10)
TrailingZeros64-4  16.3ns ± 0%  16.2ns ± 0%   -0.61%  (p=0.000 n=7+10)

Change-Id: I7e9e1abf7e30d811aa474d272b2824ec7cbbaa98
Reviewed-on: https://go-review.googlesource.com/c/go/+/167797
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2019-03-15 18:37:22 +00:00
Matthew Dempsky
c0cfe9687f cmd/compile: rewrite f(g()) for multi-value g() during typecheck
This is a re-attempt at CL 153841, which caused two regressions:

1. crypto/ecdsa failed to build with -gcflags=-l=4. This was because
when "t1, t2, ... := g(); f(t1, t2, ...)" was exported, we were losing
the first assignment from the call's Ninit field.

2. net/http/pprof failed to run with -gcflags=-N. This is due to a
conflict with CL 159717: as of that CL, package-scope initialization
statements are executed within the "init.ializer" function, rather
than the "init" function, and the generated temp variables need to be
moved accordingly too.

[Rest of description is as before.]

This CL moves order.go's copyRet logic for rewriting f(g()) into t1,
t2, ... := g(); f(t1, t2, ...) earlier into typecheck. This allows the
rest of the compiler to stop worrying about multi-value functions
appearing outside of OAS2FUNC nodes.

This changes compiler behavior in a few observable ways:

1. Typechecking error messages for builtin functions now use general
case error messages rather than unnecessarily differing ones.

2. Because f(g()) is rewritten before inlining, saved inline bodies
now see the rewritten form too. This could be addressed, but doesn't
seem worthwhile.

3. Most notably, this simplifies escape analysis and fixes a memory
corruption issue in esc.go. See #29197 for details.

Fixes #15992.
Fixes #29197.

Change-Id: I930b10f7e27af68a0944d6c9bfc8707c3fab27a4
Reviewed-on: https://go-review.googlesource.com/c/go/+/166983
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2019-03-14 21:00:20 +00:00
Richard Musiol
5ee1b84959 math, math/bits: add intrinsics for wasm
This commit adds compiler intrinsics for the packages math and
math/bits on the wasm architecture for better performance.

benchmark                        old ns/op     new ns/op     delta
BenchmarkCeil                    8.31          3.21          -61.37%
BenchmarkCopysign                5.24          3.88          -25.95%
BenchmarkAbs                     5.42          3.34          -38.38%
BenchmarkFloor                   8.29          3.18          -61.64%
BenchmarkRoundToEven             9.76          3.26          -66.60%
BenchmarkSqrtLatency             8.13          4.88          -39.98%
BenchmarkSqrtPrime               5246          3535          -32.62%
BenchmarkTrunc                   8.29          3.15          -62.00%
BenchmarkLeadingZeros            13.0          4.23          -67.46%
BenchmarkLeadingZeros8           4.65          4.42          -4.95%
BenchmarkLeadingZeros16          7.60          4.38          -42.37%
BenchmarkLeadingZeros32          10.7          4.48          -58.13%
BenchmarkLeadingZeros64          12.9          4.31          -66.59%
BenchmarkTrailingZeros           6.52          4.04          -38.04%
BenchmarkTrailingZeros8          4.57          4.14          -9.41%
BenchmarkTrailingZeros16         6.69          4.16          -37.82%
BenchmarkTrailingZeros32         6.97          4.23          -39.31%
BenchmarkTrailingZeros64         6.59          4.00          -39.30%
BenchmarkOnesCount               7.93          3.30          -58.39%
BenchmarkOnesCount8              3.56          3.19          -10.39%
BenchmarkOnesCount16             4.85          3.19          -34.23%
BenchmarkOnesCount32             7.27          3.19          -56.12%
BenchmarkOnesCount64             8.08          3.28          -59.41%
BenchmarkRotateLeft              4.88          3.80          -22.13%
BenchmarkRotateLeft64            5.03          3.63          -27.83%

Change-Id: Ic1e0c2984878be8defb6eb7eb6ee63765c793222
Reviewed-on: https://go-review.googlesource.com/c/go/+/165177
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2019-03-14 19:46:19 +00:00
Josh Bleecher Snyder
61945fc502 cmd/compile: don't generate panicshift for masked int shifts
We know that a & 31 is non-negative for all a, signed or not.
We can avoid checking that and needing to write out an
unreachable call to panicshift.

Change-Id: I32f32fb2c950d2b2b35ac5c0e99b7b2dbd47f917
Reviewed-on: https://go-review.googlesource.com/c/go/+/167499
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Keith Randall <khr@golang.org>
2019-03-14 00:03:29 +00:00
Josh Bleecher Snyder
870cfe6484 test/codegen: gofmt
Change-Id: I33f5b5051e5f75aa264ec656926223c5a3c09c1b
Reviewed-on: https://go-review.googlesource.com/c/go/+/167498
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Matt Layher <mdlayher@gmail.com>
2019-03-13 21:44:45 +00:00
Matthew Dempsky
a1b5cb1d04 cmd/compile: simplify isGoConst
The only ways to construct an OLITERAL node are (1) a basic literal
from the source package, (2) constant folding within evconst (which
only folds Go language constants), (3) the universal "nil" constant,
and (4) implicit conversions of nil to some concrete type.

Passes toolstash-check.

Change-Id: I30fc6b07ebede7adbdfa4ed562436cbb7078a2ff
Reviewed-on: https://go-review.googlesource.com/c/go/+/166981
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Robert Griesemer <gri@golang.org>
2019-03-13 18:24:02 +00:00
Carlos Eduardo Seo
4ba69a9a17 cmd/compile: add processor level selection support to ppc64{,le}
ppc64{,le} processor level selection allows the compiler to generate instructions
targeting newer processors and processor-specific optimizations without breaking
compatibility with our current baseline. This feature introduces a new environment
variable, GOPPC64.

GOPPC64 is a GOARCH=ppc64{,le} specific option, for a choice between different
processor levels (i.e. Instruction Set Architecture versions) for which the
compiler will target. The default is 'power8'.

Change-Id: Ic152e283ae1c47084ece4346fa002a3eabb3bb9e
Reviewed-on: https://go-review.googlesource.com/c/go/+/163758
Run-TryBot: Carlos Eduardo Seo <cseo@linux.vnet.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2019-03-13 14:44:02 +00:00
Robert Griesemer
bea58ef352 cmd/compile: don't report redundant error for invalid integer literals
Fixes #30722.

Change-Id: Ia4c6e37282edc44788cd8af3f6cfa10895a19e4f
Reviewed-on: https://go-review.googlesource.com/c/go/+/166519
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2019-03-12 22:59:12 +00:00
Bryan C. Mills
af13cfc3a2 ../test: set GOPATH in nosplit.go
This test invokes 'go build', so in module mode it needs a module
cache to guard edits to go.mod.

Fixes #30776

Change-Id: I89ebef1fad718247e7f972cd830e31d6f4a83e4c
Reviewed-on: https://go-review.googlesource.com/c/go/+/167085
Run-TryBot: Bryan C. Mills <bcmills@google.com>
Reviewed-by: Jay Conrod <jayconrod@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2019-03-12 20:46:29 +00:00
fanzha02
a85afef277 cmd/compile: add handling for new floating-point comparisons flags
The CL 164718 adds new condition flags for floating-point comparisons
in arm64 backend, but dose not add the handling in rewrite.go for
corresponding Ops, which causes issue 30679. And this CL fixes this
issue.

Fixes #30679

Change-Id: I8acc749f78227c3e9e74fa7938f05fb442fb62c6
Reviewed-on: https://go-review.googlesource.com/c/go/+/166579
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2019-03-12 14:01:26 +00:00
Than McIntosh
30cc8a46c4 test: add new test for gccgo compilation problem
New test for issue 30659 (compilation error due to bad
export data).

Updates #30659.

Change-Id: I2541ee3c379e5b22033fea66bb4ebaf720cc5e1f
Reviewed-on: https://go-review.googlesource.com/c/go/+/166917
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2019-03-12 00:04:59 +00:00
Keith Randall
486ca37b14 test: fix memcombine tests
Two tests (load_le_byte8_uint64_inv and load_be_byte8_uint64)
pass but the generated code isn't actually correct.

The test regexp provides a false negative, as it matches the
MOVQ (SP), BP instruction in the epilogue.

Combined loads never worked for these cases - the test was added in error
as part of a batch and not noticed because of the above false match.

Normalize the amd64/386 tests to always negative match on narrower
loads and OR.

Change-Id: I256861924774d39db0e65723866c81df5ab5076f
Reviewed-on: https://go-review.googlesource.com/c/go/+/166837
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2019-03-11 19:18:03 +00:00
Carlo Alberto Ferraris
05051b56a0 sync: allow inlining the RWMutex.RUnlock fast path
RWMutex.RLock is already inlineable, so add a test for it as well.

name                    old time/op  new time/op  delta
RWMutexUncontended      66.5ns ± 0%  60.3ns ± 1%  -9.38%  (p=0.000 n=12+20)
RWMutexUncontended-4    16.7ns ± 0%  15.3ns ± 1%  -8.49%  (p=0.000 n=17+20)
RWMutexUncontended-16   7.86ns ± 0%  7.69ns ± 0%  -2.08%  (p=0.000 n=18+15)
RWMutexWrite100         25.1ns ± 0%  24.0ns ± 1%  -4.28%  (p=0.000 n=20+18)
RWMutexWrite100-4       46.7ns ± 5%  44.1ns ± 4%  -5.53%  (p=0.000 n=20+20)
RWMutexWrite100-16      68.3ns ±11%  65.7ns ± 8%  -3.81%  (p=0.003 n=20+20)
RWMutexWrite10          26.7ns ± 1%  25.7ns ± 0%  -3.75%  (p=0.000 n=17+14)
RWMutexWrite10-4        34.9ns ± 2%  33.8ns ± 2%  -3.15%  (p=0.000 n=20+20)
RWMutexWrite10-16       37.4ns ± 2%  36.1ns ± 2%  -3.51%  (p=0.000 n=18+20)
RWMutexWorkWrite100      163ns ± 0%   162ns ± 0%  -0.89%  (p=0.000 n=18+20)
RWMutexWorkWrite100-4    189ns ± 4%   184ns ± 4%  -2.89%  (p=0.000 n=19+20)
RWMutexWorkWrite100-16   207ns ± 4%   200ns ± 2%  -3.07%  (p=0.000 n=19+20)
RWMutexWorkWrite10       153ns ± 0%   151ns ± 1%  -0.75%  (p=0.000 n=20+20)
RWMutexWorkWrite10-4     177ns ± 1%   176ns ± 2%  -0.63%  (p=0.004 n=17+20)
RWMutexWorkWrite10-16    191ns ± 2%   189ns ± 1%  -0.83%  (p=0.015 n=20+17)

linux/amd64 bin/go 14688201 (previous commit 14675861, +12340/+0.08%)

The cumulative effect of this and the previous 3 commits is:

name                    old time/op  new time/op  delta
MutexUncontended        19.3ns ± 1%  16.4ns ± 1%  -15.13%  (p=0.000 n=20+20)
MutexUncontended-4      5.24ns ± 0%  4.09ns ± 0%  -21.95%  (p=0.000 n=20+18)
MutexUncontended-16     2.10ns ± 0%  2.12ns ± 0%   +0.95%  (p=0.000 n=15+17)
Mutex                   19.6ns ± 0%  16.3ns ± 1%  -17.12%  (p=0.000 n=20+20)
Mutex-4                 54.6ns ± 5%  45.6ns ±10%  -16.51%  (p=0.000 n=20+19)
Mutex-16                 133ns ± 5%   130ns ± 3%   -1.99%  (p=0.002 n=20+20)
MutexSlack              33.4ns ± 2%  16.2ns ± 0%  -51.44%  (p=0.000 n=19+20)
MutexSlack-4             206ns ± 5%   209ns ± 9%     ~     (p=0.154 n=20+20)
MutexSlack-16           89.4ns ± 1%  90.9ns ± 2%   +1.70%  (p=0.000 n=18+17)
MutexWork               60.5ns ± 0%  55.3ns ± 1%   -8.59%  (p=0.000 n=12+20)
MutexWork-4              105ns ± 5%    97ns ±11%   -7.95%  (p=0.000 n=20+20)
MutexWork-16             157ns ± 1%   158ns ± 1%   +0.66%  (p=0.001 n=18+17)
MutexWorkSlack          70.2ns ± 5%  55.3ns ± 0%  -21.30%  (p=0.000 n=19+18)
MutexWorkSlack-4         277ns ±13%   260ns ±15%   -6.35%  (p=0.002 n=20+18)
MutexWorkSlack-16        156ns ± 0%   146ns ± 1%   -6.40%  (p=0.000 n=16+19)
MutexNoSpin              966ns ± 0%   976ns ± 1%   +0.97%  (p=0.000 n=15+17)
MutexNoSpin-4            269ns ± 4%   272ns ± 4%   +1.15%  (p=0.048 n=20+18)
MutexNoSpin-16           122ns ± 0%   119ns ± 1%   -2.63%  (p=0.000 n=19+15)
MutexSpin               3.13µs ± 0%  3.12µs ± 0%   -0.17%  (p=0.000 n=18+18)
MutexSpin-4              826ns ± 1%   833ns ± 1%   +0.84%  (p=0.000 n=19+17)
MutexSpin-16             397ns ± 1%   394ns ± 1%   -0.78%  (p=0.000 n=19+19)
Once                    5.67ns ± 0%  2.07ns ± 2%  -63.43%  (p=0.000 n=20+20)
Once-4                  1.47ns ± 2%  0.54ns ± 3%  -63.49%  (p=0.000 n=19+20)
Once-16                 0.58ns ± 0%  0.17ns ± 5%  -70.49%  (p=0.000 n=17+17)
RWMutexUncontended      71.4ns ± 0%  60.3ns ± 1%  -15.60%  (p=0.000 n=16+20)
RWMutexUncontended-4    18.4ns ± 4%  15.3ns ± 1%  -17.14%  (p=0.000 n=20+20)
RWMutexUncontended-16   8.01ns ± 0%  7.69ns ± 0%   -3.91%  (p=0.000 n=18+15)
RWMutexWrite100         24.9ns ± 0%  24.0ns ± 1%   -3.57%  (p=0.000 n=19+18)
RWMutexWrite100-4       46.5ns ± 3%  44.1ns ± 4%   -5.09%  (p=0.000 n=17+20)
RWMutexWrite100-16      68.9ns ± 3%  65.7ns ± 8%   -4.65%  (p=0.000 n=18+20)
RWMutexWrite10          27.1ns ± 0%  25.7ns ± 0%   -5.25%  (p=0.000 n=17+14)
RWMutexWrite10-4        34.8ns ± 1%  33.8ns ± 2%   -2.96%  (p=0.000 n=20+20)
RWMutexWrite10-16       37.5ns ± 2%  36.1ns ± 2%   -3.72%  (p=0.000 n=20+20)
RWMutexWorkWrite100      164ns ± 0%   162ns ± 0%   -1.49%  (p=0.000 n=12+20)
RWMutexWorkWrite100-4    186ns ± 3%   184ns ± 4%     ~     (p=0.097 n=20+20)
RWMutexWorkWrite100-16   204ns ± 2%   200ns ± 2%   -1.58%  (p=0.000 n=18+20)
RWMutexWorkWrite10       153ns ± 0%   151ns ± 1%   -1.21%  (p=0.000 n=20+20)
RWMutexWorkWrite10-4     179ns ± 1%   176ns ± 2%   -1.25%  (p=0.000 n=19+20)
RWMutexWorkWrite10-16    191ns ± 1%   189ns ± 1%   -0.94%  (p=0.000 n=15+17)

Change-Id: I9269bf2ac42a04c610624f707d3268dcb17390f8
Reviewed-on: https://go-review.googlesource.com/c/go/+/152698
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2019-03-09 16:34:17 +00:00
Carlo Alberto Ferraris
ca8354843e sync: allow inlining the Once.Do fast path
Using Once.Do is now extremely cheap because the fast path is just an inlined
atomic load of a variable that is written only once and a conditional jump.
This is very beneficial for Once.Do because, due to its nature, the fast path
will be used for every call after the first one.

In a attempt to mimize code size increase, reorder the fields so that the
pointer to Once is also the pointer to Once.done, that is the only field used
in the hot path. This allows to use more compact instruction encodings or less
instructions in the hot path (that is inlined at every callsite).

name     old time/op  new time/op  delta
Once     4.54ns ± 0%  2.06ns ± 0%  -54.59%  (p=0.000 n=19+16)
Once-4   1.18ns ± 0%  0.55ns ± 0%  -53.39%  (p=0.000 n=15+16)
Once-16  0.53ns ± 0%  0.17ns ± 0%  -67.92%  (p=0.000 n=18+17)

linux/amd64 bin/go 14675861 (previous commit 14663387, +12474/+0.09%)

Change-Id: Ie2708103ab473787875d66746d2f20f1d90a6916
Reviewed-on: https://go-review.googlesource.com/c/go/+/152697
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2019-03-09 05:58:31 +00:00
Carlo Alberto Ferraris
41cb0aedff sync: allow inlining the Mutex.Lock fast path
name                    old time/op  new time/op  delta
MutexUncontended        18.9ns ± 0%  16.2ns ± 0%  -14.29%  (p=0.000 n=19+19)
MutexUncontended-4      4.75ns ± 1%  4.08ns ± 0%  -14.20%  (p=0.000 n=20+19)
MutexUncontended-16     2.05ns ± 0%  2.11ns ± 0%   +2.93%  (p=0.000 n=19+16)
Mutex                   19.3ns ± 1%  16.2ns ± 0%  -15.86%  (p=0.000 n=17+19)
Mutex-4                 52.4ns ± 4%  48.6ns ± 9%   -7.22%  (p=0.000 n=20+20)
Mutex-16                 139ns ± 2%   140ns ± 3%   +1.03%  (p=0.011 n=16+20)
MutexSlack              18.9ns ± 1%  16.2ns ± 1%  -13.96%  (p=0.000 n=20+20)
MutexSlack-4             225ns ± 8%   211ns ±10%   -5.94%  (p=0.000 n=18+19)
MutexSlack-16           98.4ns ± 1%  90.9ns ± 1%   -7.60%  (p=0.000 n=17+18)
MutexWork               58.2ns ± 3%  55.4ns ± 0%   -4.82%  (p=0.000 n=20+17)
MutexWork-4              103ns ± 7%    95ns ±18%   -8.03%  (p=0.000 n=20+20)
MutexWork-16             163ns ± 2%   155ns ± 2%   -4.47%  (p=0.000 n=18+18)
MutexWorkSlack          57.7ns ± 1%  55.4ns ± 0%   -3.99%  (p=0.000 n=20+13)
MutexWorkSlack-4         276ns ±13%   260ns ±10%   -5.64%  (p=0.001 n=19+19)
MutexWorkSlack-16        147ns ± 0%   156ns ± 1%   +5.87%  (p=0.000 n=14+19)
MutexNoSpin              968ns ± 0%   900ns ± 1%   -6.98%  (p=0.000 n=20+18)
MutexNoSpin-4            270ns ± 2%   255ns ± 2%   -5.74%  (p=0.000 n=19+20)
MutexNoSpin-16           120ns ± 4%   112ns ± 0%   -6.99%  (p=0.000 n=19+14)
MutexSpin               3.13µs ± 1%  3.19µs ± 6%     ~     (p=0.401 n=20+20)
MutexSpin-4              832ns ± 2%   831ns ± 1%   -0.17%  (p=0.023 n=16+18)
MutexSpin-16             395ns ± 0%   399ns ± 0%   +0.94%  (p=0.000 n=17+19)
RWMutexUncontended      69.5ns ± 0%  68.4ns ± 0%   -1.59%  (p=0.000 n=20+20)
RWMutexUncontended-4    17.5ns ± 0%  16.7ns ± 0%   -4.30%  (p=0.000 n=18+17)
RWMutexUncontended-16   7.92ns ± 0%  7.87ns ± 0%   -0.61%  (p=0.000 n=18+17)
RWMutexWrite100         24.9ns ± 1%  25.0ns ± 1%   +0.32%  (p=0.000 n=20+20)
RWMutexWrite100-4       46.2ns ± 4%  46.2ns ± 5%     ~     (p=0.840 n=19+20)
RWMutexWrite100-16      69.9ns ± 5%  69.9ns ± 3%     ~     (p=0.545 n=20+19)
RWMutexWrite10          27.0ns ± 2%  26.8ns ± 2%   -0.98%  (p=0.001 n=20+20)
RWMutexWrite10-4        34.7ns ± 2%  35.0ns ± 4%     ~     (p=0.191 n=18+20)
RWMutexWrite10-16       37.2ns ± 4%  37.3ns ± 2%     ~     (p=0.438 n=20+19)
RWMutexWorkWrite100      164ns ± 0%   163ns ± 0%   -0.24%  (p=0.025 n=20+20)
RWMutexWorkWrite100-4    193ns ± 3%   191ns ± 2%   -1.06%  (p=0.027 n=20+20)
RWMutexWorkWrite100-16   210ns ± 3%   207ns ± 3%   -1.22%  (p=0.038 n=20+20)
RWMutexWorkWrite10       153ns ± 0%   153ns ± 0%     ~     (all equal)
RWMutexWorkWrite10-4     178ns ± 2%   179ns ± 2%     ~     (p=0.186 n=20+20)
RWMutexWorkWrite10-16    192ns ± 2%   192ns ± 2%     ~     (p=0.731 n=19+20)

linux/amd64 bin/go 14663387 (previous commit 14630572, +32815/+0.22%)

Change-Id: I98171006dce14069b1a62da07c3d165455a7906b
Reviewed-on: https://go-review.googlesource.com/c/go/+/148959
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2019-03-09 05:08:04 +00:00
Keith Randall
83a33d3855 cmd/compile: reverse order of slice bounds checks
Turns out this makes the fix for 28797 unnecessary, because this order
ensures that the RHS of IsSliceInBounds ops are always nonnegative.

The real reason for this change is that it also makes dealing with
<0 values easier for reporting values in bounds check panics (issue #30116).

Makes cmd/go negligibly smaller.

Update #28797

Change-Id: I1f25ba6d2b3b3d4a72df3105828aa0a4b629ce85
Reviewed-on: https://go-review.googlesource.com/c/go/+/166377
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2019-03-09 00:52:45 +00:00
fanzha02
6efd51c6b7 cmd/compile: change the condition flags of floating-point comparisons in arm64 backend
Current compiler reverses operands to work around NaN in
"less than" and "less equal than" comparisons. But if we
want to use "FCMPD/FCMPS $(0.0), Fn" to do some optimization,
the workaround way does not work. Because assembler does
not support instruction "FCMPD/FCMPS Fn, $(0.0)".

This CL sets condition flags for floating-point comparisons
to resolve this problem.

Change-Id: Ia48076a1da95da64596d6e68304018cb301ebe33
Reviewed-on: https://go-review.googlesource.com/c/go/+/164718
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2019-03-07 21:23:52 +00:00
Raul Silvera
2dd066d4a7 test: improve test coverage for heap sampling
Update the test in test/heapsampling.go to more thoroughly validate heap sampling.
Lower the sampling rate on the test to ensure allocations both smaller and
larger than the sampling rate are tested.

Tighten up the validation check to a 10% difference between the unsampled and correct value.
Because of the nature of random sampling, it is possible that the unsampled value fluctuates
over that range. To avoid flakes, run the experiment three times and only report an issue if the
same location consistently falls out of range on all experiments.

This tests the sampling fix in cl/158337.

Change-Id: I54a709e5c75827b8b1c2d87cdfb425ab09759677
GitHub-Last-Rev: 7c04f12603
GitHub-Pull-Request: golang/go#26944
Reviewed-on: https://go-review.googlesource.com/c/go/+/129117
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2019-03-07 21:05:15 +00:00
erifan01
4e2b0dda8c cmd/compile: eliminate unnecessary type conversions in TrailingZeros(16|8) for arm64
This CL eliminates unnecessary type conversion operations: OpZeroExt16to64 and OpZeroExt8to64.
If the input argrument is a nonzero value, then ORconst operation can also be eliminated.

Benchmarks:

name               old time/op  new time/op  delta
TrailingZeros-8    2.75ns ± 0%  2.75ns ± 0%     ~     (all equal)
TrailingZeros8-8   3.49ns ± 1%  2.93ns ± 0%  -16.00%  (p=0.000 n=10+10)
TrailingZeros16-8  3.49ns ± 1%  2.93ns ± 0%  -16.05%  (p=0.000 n=9+10)
TrailingZeros32-8  2.67ns ± 1%  2.68ns ± 1%     ~     (p=0.468 n=10+10)
TrailingZeros64-8  2.67ns ± 1%  2.65ns ± 0%   -0.62%  (p=0.022 n=10+9)

code:

func f16(x uint) { z = bits.TrailingZeros16(uint16(x)) }

Before:

"".f16 STEXT size=48 args=0x8 locals=0x0 leaf
        0x0000 00000 (test.go:7)        TEXT    "".f16(SB), LEAF|NOFRAME|ABIInternal, $0-8
        0x0000 00000 (test.go:7)        FUNCDATA        ZR, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
        0x0000 00000 (test.go:7)        FUNCDATA        $1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
        0x0000 00000 (test.go:7)        FUNCDATA        $3, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
        0x0000 00000 (test.go:7)        PCDATA  $2, ZR
        0x0000 00000 (test.go:7)        PCDATA  ZR, ZR
        0x0000 00000 (test.go:7)        MOVD    "".x(FP), R0
        0x0004 00004 (test.go:7)        MOVHU   R0, R0
        0x0008 00008 (test.go:7)        ORR     $65536, R0, R0
        0x000c 00012 (test.go:7)        RBIT    R0, R0
        0x0010 00016 (test.go:7)        CLZ     R0, R0
        0x0014 00020 (test.go:7)        MOVD    R0, "".z(SB)
        0x0020 00032 (test.go:7)        RET     (R30)

This line of code is unnecessary:
        0x0004 00004 (test.go:7)        MOVHU   R0, R0

After:

"".f16 STEXT size=32 args=0x8 locals=0x0 leaf
        0x0000 00000 (test.go:7)        TEXT    "".f16(SB), LEAF|NOFRAME|ABIInternal, $0-8
        0x0000 00000 (test.go:7)        FUNCDATA        ZR, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
        0x0000 00000 (test.go:7)        FUNCDATA        $1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
        0x0000 00000 (test.go:7)        FUNCDATA        $3, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
        0x0000 00000 (test.go:7)        PCDATA  $2, ZR
        0x0000 00000 (test.go:7)        PCDATA  ZR, ZR
        0x0000 00000 (test.go:7)        MOVD    "".x(FP), R0
        0x0004 00004 (test.go:7)        ORR     $65536, R0, R0
        0x0008 00008 (test.go:7)        RBITW   R0, R0
        0x000c 00012 (test.go:7)        CLZW    R0, R0
        0x0010 00016 (test.go:7)        MOVD    R0, "".z(SB)
        0x001c 00028 (test.go:7)        RET     (R30)

The situation of TrailingZeros8 is similar to TrailingZeros16.

Change-Id: I473bdca06be8460a0be87abbae6fe640017e4c9d
Reviewed-on: https://go-review.googlesource.com/c/go/+/156999
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2019-03-07 14:24:56 +00:00
erifan01
fee84cc905 cmd/compile: add an optimization rule for math/bits.ReverseBytes16 on arm
This CL adds two rules to turn patterns like ((x<<8) | (x>>8)) (the type of
x is uint16, "|" can also be "+" or "^") to a REV16 instruction on arm v6+.
This optimization rule can be used for math/bits.ReverseBytes16.

Benchmarks on arm v6:
name               old time/op  new time/op  delta
ReverseBytes-32    2.86ns ± 0%  2.86ns ± 0%   ~     (all equal)
ReverseBytes16-32  2.86ns ± 0%  2.86ns ± 0%   ~     (all equal)
ReverseBytes32-32  1.29ns ± 0%  1.29ns ± 0%   ~     (all equal)
ReverseBytes64-32  1.43ns ± 0%  1.43ns ± 0%   ~     (all equal)

Change-Id: I819e633c9a9d308f8e476fb0c82d73fb73dd019f
Reviewed-on: https://go-review.googlesource.com/c/go/+/159019
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2019-03-07 13:37:54 +00:00
Keith Randall
9dc3b8b722 reflect: fix more issues with StructOf GC programs
First the insidious bug:

  var n uintptr
  for n := elemPtrs; n > 120; n -= 120 {
    prog = append(prog, 120)
    prog = append(prog, mask[:15]...)
    mask = mask[15:]
  }
  prog = append(prog, byte(n))
  prog = append(prog, mask[:(n+7)/8]...)

The := breaks this code, because the n after the loop is always 0!

We also do need to handle field padding correctly. In particular
the old padding code doesn't correctly handle fields that are not
a multiple of a pointer in size.

Fixes #30606.

Change-Id: Ifcab9494dc25c20116753c5d7e0145d6c2053ed8
Reviewed-on: https://go-review.googlesource.com/c/go/+/165860
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2019-03-07 00:06:12 +00:00
Keith Randall
05b3db24c1 reflect: fix StructOf GC programs
They are missing a stop byte at the end.

Normally this doesn't matter, but when including a GC program
in another GC program, we strip the last byte. If that last byte
wasn't a stop byte, then we've thrown away part of the program
we actually need.

Fixes #30606

Change-Id: Ie9604beeb84f7f9442e77d31fe64c374ca132cce
Reviewed-on: https://go-review.googlesource.com/c/go/+/165857
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2019-03-06 20:22:14 +00:00
Keith Randall
4a9064ef41 cmd/compile: fix ordering for short-circuiting ops
Make sure the side effects inside short-circuited operations (&& and ||)
happen correctly.

Before this CL, we attached the side effects to the node itself using
exprInPlace. That caused other side effects in sibling expressions
to get reordered with respect to the short circuit side effect.

Instead, rewrite a && b like:

r := a
if r {
  r = b
}

That code we can keep correctly ordered with respect to other
side-effects extracted from part of a big expression.

exprInPlace seems generally unsafe. But this was the only case where
exprInPlace is called not at the top level of an expression, so I
don't think the other uses can actually trigger an issue (there can't
be a sibling expression). TODO: maybe those cases don't need "in
place", and we can retire that function generally.

This CL needed a small tweak to the SSA generation of OIF so that the
short circuit optimization still triggers. The short circuit optimization
looks for triangle but not diamonds, so don't bother allocating a block
if it will be empty.

Go 1 benchmarks are in the noise.

Fixes #30566

Change-Id: I19c04296bea63cbd6ad05f87a63b005029123610
Reviewed-on: https://go-review.googlesource.com/c/go/+/165617
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2019-03-06 20:04:07 +00:00
Bryan C. Mills
b1a783df87 test/bench/go1: add go.mod file
cmd/dist executes 'go test' within this directory, so it needs a
go.mod file to tell the compiler what package path to use in
diagnostic and debug information.

Updates #30228

Change-Id: Ia313ac06bc0ec4631d415faa20c56cce2ac8dbc5
Reviewed-on: https://go-review.googlesource.com/c/go/+/165498
Run-TryBot: Bryan C. Mills <bcmills@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2019-03-06 18:53:12 +00:00
Alberto Donizetti
2f8d2427d9 test: skip mutex Unlock inlining tests on a few builders
Fix builder breakage from CL 148958.

This is an inlining test that should be skipped on -N -l.

The inlining also doesn't happen on arm and wasm, so skip the test
there too.

Fixes the noopt builder, the linux-arm builder, and the wasm builder.

Updates #30605

Change-Id: I06b90d595be7185df61db039dd225dc90d6f678f
Reviewed-on: https://go-review.googlesource.com/c/go/+/165339
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2019-03-05 18:23:18 +00:00
Carlo Alberto Ferraris
4c3f26076b sync: allow inlining the Mutex.Unlock fast path
Make use of the newly-enabled limited midstack inlining.
Similar changes will be done in followup CLs.

name                    old time/op  new time/op  delta
MutexUncontended        19.3ns ± 1%  18.9ns ± 0%   -1.92%  (p=0.000 n=20+19)
MutexUncontended-4      5.24ns ± 0%  4.75ns ± 1%   -9.25%  (p=0.000 n=20+20)
MutexUncontended-16     2.10ns ± 0%  2.05ns ± 0%   -2.38%  (p=0.000 n=15+19)
Mutex                   19.6ns ± 0%  19.3ns ± 1%   -1.92%  (p=0.000 n=20+17)
Mutex-4                 54.6ns ± 5%  52.4ns ± 4%   -4.09%  (p=0.000 n=20+20)
Mutex-16                 133ns ± 5%   139ns ± 2%   +4.23%  (p=0.000 n=20+16)
MutexSlack              33.4ns ± 2%  18.9ns ± 1%  -43.56%  (p=0.000 n=19+20)
MutexSlack-4             206ns ± 5%   225ns ± 8%   +9.12%  (p=0.000 n=20+18)
MutexSlack-16           89.4ns ± 1%  98.4ns ± 1%  +10.10%  (p=0.000 n=18+17)
MutexWork               60.5ns ± 0%  58.2ns ± 3%   -3.75%  (p=0.000 n=12+20)
MutexWork-4              105ns ± 5%   103ns ± 7%   -1.68%  (p=0.007 n=20+20)
MutexWork-16             157ns ± 1%   163ns ± 2%   +3.90%  (p=0.000 n=18+18)
MutexWorkSlack          70.2ns ± 5%  57.7ns ± 1%  -17.81%  (p=0.000 n=19+20)
MutexWorkSlack-4         277ns ±13%   276ns ±13%     ~     (p=0.682 n=20+19)
MutexWorkSlack-16        156ns ± 0%   147ns ± 0%   -5.62%  (p=0.000 n=16+14)
MutexNoSpin              966ns ± 0%   968ns ± 0%   +0.11%  (p=0.029 n=15+20)
MutexNoSpin-4            269ns ± 4%   270ns ± 2%     ~     (p=0.807 n=20+19)
MutexNoSpin-16           122ns ± 0%   120ns ± 4%   -1.63%  (p=0.000 n=19+19)
MutexSpin               3.13µs ± 0%  3.13µs ± 1%   +0.16%  (p=0.004 n=18+20)
MutexSpin-4              826ns ± 1%   832ns ± 2%   +0.74%  (p=0.000 n=19+16)
MutexSpin-16             397ns ± 1%   395ns ± 0%   -0.50%  (p=0.000 n=19+17)
RWMutexUncontended      71.4ns ± 0%  69.5ns ± 0%   -2.72%  (p=0.000 n=16+20)
RWMutexUncontended-4    18.4ns ± 4%  17.5ns ± 0%   -4.92%  (p=0.000 n=20+18)
RWMutexUncontended-16   8.01ns ± 0%  7.92ns ± 0%   -1.15%  (p=0.000 n=18+18)
RWMutexWrite100         24.9ns ± 0%  24.9ns ± 1%     ~     (p=0.099 n=19+20)
RWMutexWrite100-4       46.5ns ± 3%  46.2ns ± 4%     ~     (p=0.253 n=17+19)
RWMutexWrite100-16      68.9ns ± 3%  69.9ns ± 5%   +1.46%  (p=0.012 n=18+20)
RWMutexWrite10          27.1ns ± 0%  27.0ns ± 2%     ~     (p=0.128 n=17+20)
RWMutexWrite10-4        34.8ns ± 1%  34.7ns ± 2%     ~     (p=0.180 n=20+18)
RWMutexWrite10-16       37.5ns ± 2%  37.2ns ± 4%   -0.89%  (p=0.023 n=20+20)
RWMutexWorkWrite100      164ns ± 0%   164ns ± 0%     ~     (p=0.106 n=12+20)
RWMutexWorkWrite100-4    186ns ± 3%   193ns ± 3%   +3.46%  (p=0.000 n=20+20)
RWMutexWorkWrite100-16   204ns ± 2%   210ns ± 3%   +2.96%  (p=0.000 n=18+20)
RWMutexWorkWrite10       153ns ± 0%   153ns ± 0%   -0.20%  (p=0.017 n=20+19)
RWMutexWorkWrite10-4     179ns ± 1%   178ns ± 2%     ~     (p=0.215 n=19+20)
RWMutexWorkWrite10-16    191ns ± 1%   192ns ± 2%     ~     (p=0.166 n=15+19)

linux/amd64 bin/go 14630572 (previous commit 14605947, +24625/+0.17%)

Change-Id: I3f9d1765801fe0b8deb1bc2728b8bba8a7508e23
Reviewed-on: https://go-review.googlesource.com/c/go/+/148958
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2019-03-05 14:59:31 +00:00
erifan01
159b2de442 cmd/compile: optimize math/bits.Div32 for arm64
Benchmark:
name     old time/op  new time/op  delta
Div-8    22.0ns ± 0%  22.0ns ± 0%     ~     (all equal)
Div32-8  6.51ns ± 0%  3.00ns ± 0%  -53.90%  (p=0.000 n=10+8)
Div64-8  22.5ns ± 0%  22.5ns ± 0%     ~     (all equal)

Code:
func div32(hi, lo, y uint32) (q, r uint32) {return bits.Div32(hi, lo, y)}

Before:
        0x0020 00032 (test.go:24)       MOVWU   "".y+8(FP), R0
        0x0024 00036 ($GOROOT/src/math/bits/bits.go:472)        CBZW    R0, 132
        0x0028 00040 ($GOROOT/src/math/bits/bits.go:472)        MOVWU   "".hi(FP), R1
        0x002c 00044 ($GOROOT/src/math/bits/bits.go:472)        CMPW    R1, R0
        0x0030 00048 ($GOROOT/src/math/bits/bits.go:472)        BLS     96
        0x0034 00052 ($GOROOT/src/math/bits/bits.go:475)        MOVWU   "".lo+4(FP), R2
        0x0038 00056 ($GOROOT/src/math/bits/bits.go:475)        ORR     R1<<32, R2, R1
        0x003c 00060 ($GOROOT/src/math/bits/bits.go:476)        CBZ     R0, 140
        0x0040 00064 ($GOROOT/src/math/bits/bits.go:476)        UDIV    R0, R1, R2
        0x0044 00068 (test.go:24)       MOVW    R2, "".q+16(FP)
        0x0048 00072 ($GOROOT/src/math/bits/bits.go:476)        UREM    R0, R1, R0
        0x0050 00080 (test.go:24)       MOVW    R0, "".r+20(FP)
        0x0054 00084 (test.go:24)       MOVD    -8(RSP), R29
        0x0058 00088 (test.go:24)       MOVD.P  32(RSP), R30
        0x005c 00092 (test.go:24)       RET     (R30)

After:
        0x001c 00028 (test.go:24)       MOVWU   "".y+8(FP), R0
        0x0020 00032 (test.go:24)       CBZW    R0, 92
        0x0024 00036 (test.go:24)       MOVWU   "".hi(FP), R1
        0x0028 00040 (test.go:24)       CMPW    R0, R1
        0x002c 00044 (test.go:24)       BHS     84
        0x0030 00048 (test.go:24)       MOVWU   "".lo+4(FP), R2
        0x0034 00052 (test.go:24)       ORR     R1<<32, R2, R4
        0x0038 00056 (test.go:24)       UDIV    R0, R4, R3
        0x003c 00060 (test.go:24)       MSUB    R3, R4, R0, R4
        0x0040 00064 (test.go:24)       MOVW    R3, "".q+16(FP)
        0x0044 00068 (test.go:24)       MOVW    R4, "".r+20(FP)
        0x0048 00072 (test.go:24)       MOVD    -8(RSP), R29
        0x004c 00076 (test.go:24)       MOVD.P  16(RSP), R30
        0x0050 00080 (test.go:24)       RET     (R30)

UREM instruction in the previous assembly code will be converted to UDIV and MSUB instructions
on arm64. However the UDIV instruction in UREM is unnecessary, because it's a duplicate of the
previous UDIV. This CL adds a rule to have this extra UDIV instruction removed by CSE.

Change-Id: Ie2508784320020b2de022806d09f75a7871bb3d7
Reviewed-on: https://go-review.googlesource.com/c/159577
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Bryan C. Mills <bcmills@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2019-03-03 20:20:10 +00:00
erifan01
192b675f17 cmd/compile: add an optimaztion rule for math/bits.ReverseBytes16 on arm64
On amd64 ReverseBytes16 is lowered to a rotate instruction. However arm64 doesn't
have 16-bit rotate instruction, but has a REV16W instruction which can be used
for ReverseBytes16. This CL adds a rule to turn the patterns like (x<<8) | (x>>8)
(the type of x is uint16, and "|" can also be "^" or "+") to a REV16W instruction.

Code:
func reverseBytes16(i uint16) uint16 { return bits.ReverseBytes16(i) }

Before:
        0x0004 00004 (test.go:6)        MOVHU   "".i(FP), R0
        0x0008 00008 ($GOROOT/src/math/bits/bits.go:262)        UBFX    $8, R0, $8, R1
        0x000c 00012 ($GOROOT/src/math/bits/bits.go:262)        ORR     R0<<8, R1, R0
        0x0010 00016 (test.go:6)        MOVH    R0, "".~r1+8(FP)
        0x0014 00020 (test.go:6)        RET     (R30)

After:
        0x0000 00000 (test.go:6)        MOVHU   "".i(FP), R0
        0x0004 00004 (test.go:6)        REV16W  R0, R0
        0x0008 00008 (test.go:6)        MOVH    R0, "".~r1+8(FP)
        0x000c 00012 (test.go:6)        RET     (R30)

Benchmarks:
name                old time/op       new time/op       delta
ReverseBytes-224    1.000000ns +- 0%  1.000000ns +- 0%     ~     (all equal)
ReverseBytes16-224  1.500000ns +- 0%  1.000000ns +- 0%  -33.33%  (p=0.000 n=9+10)
ReverseBytes32-224  1.000000ns +- 0%  1.000000ns +- 0%     ~     (all equal)
ReverseBytes64-224  1.000000ns +- 0%  1.000000ns +- 0%     ~     (all equal)

Change-Id: I87cd41b2d8e549bf39c601f185d5775bd42d739c
Reviewed-on: https://go-review.googlesource.com/c/157757
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2019-03-01 15:42:19 +00:00
Cherry Zhang
40df9cc606 cmd/compile: make KeepAlive work on stack object
Currently, runtime.KeepAlive applied on a stack object doesn't
actually keeps the stack object alive, and the heap object
referenced from it could be collected. This is because the
address of the stack object is rematerializeable, and we just
ignored KeepAlive on rematerializeable values. This CL fixes it.

Fixes #30476.

Change-Id: Ic1f75ee54ed94ea79bd46a8ddcd9e81d01556d1d
Reviewed-on: https://go-review.googlesource.com/c/164537
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2019-03-01 15:36:52 +00:00
Matthew Dempsky
38642b9fce Revert "cmd/compile: rewrite f(g()) for multi-value g() during typecheck"
This reverts commit d96b7fbf98.

Reason for revert: broke noopt and longtest builders.

Change-Id: Ifaec64d817c4336cb255a2e9db00526b7bc5606a
Reviewed-on: https://go-review.googlesource.com/c/164757
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2019-03-01 05:58:51 +00:00
Matthew Dempsky
d96b7fbf98 cmd/compile: rewrite f(g()) for multi-value g() during typecheck
This CL moves order.go's copyRet logic for rewriting f(g()) into t1,
t2, ... = g(); f(t1, t2, ...) earlier into typecheck. This allows the
rest of the compiler to stop worrying about multi-value functions
appearing outside of OAS2FUNC nodes.

This changes compiler behavior in a few observable ways:

1. Typechecking error messages for builtin functions now use general
case error messages rather than unnecessarily differing ones.

2. Because f(g()) is rewritten before inlining, saved inline bodies
now see the rewritten form too. This could be addressed, but doesn't
seem worthwhile.

3. Most notably, this simplifies escape analysis and fixes a memory
corruption issue in esc.go. See #29197 for details.

Fixes #15992.
Fixes #29197.

Change-Id: I86a70668301efeec8fbd11fe2d242e359a3ad0af
Reviewed-on: https://go-review.googlesource.com/c/153841
Reviewed-by: Robert Griesemer <gri@golang.org>
2019-02-28 22:50:08 +00:00
Matthew Dempsky
7fa195c1b9 cmd/compile: fix false positives in isGoConst
isGoConst could spuriously return true for variables that shadow a
constant declaration with the same name.

Because even named constants are always represented by OLITERAL nodes,
the easy fix is to just ignore ONAME nodes in isGoConst. We can
similarly ignore ONONAME nodes.

Confirmed that k8s.io/kubernetes/test/e2e/storage builds again with
this fix.

Fixes #30430.

Change-Id: I899400d749982d341dc248a7cd5a18277c2795ec
Reviewed-on: https://go-review.googlesource.com/c/164319
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2019-02-28 20:31:53 +00:00
erifan01
dd91269b7c cmd/compile: optimize math/bits Len32 intrinsic on arm64
Arm64 has a 32-bit CLZ instruction CLZW, which can be used for intrinsic Len32.
Function LeadingZeros32 calls Len32, with this change, the assembly code of
LeadingZeros32 becomes more concise.

Go code:

func f32(x uint32) { z = bits.LeadingZeros32(x) }

Before:

"".f32 STEXT size=32 args=0x8 locals=0x0 leaf
        0x0000 00000 (test.go:7)        TEXT    "".f32(SB), LEAF|NOFRAME|ABIInternal, $0-8
        0x0004 00004 (test.go:7)        MOVWU   "".x(FP), R0
        0x0008 00008 ($GOROOT/src/math/bits/bits.go:30) CLZ     R0, R0
        0x000c 00012 ($GOROOT/src/math/bits/bits.go:30) SUB     $32, R0, R0
        0x0010 00016 (test.go:7)        MOVD    R0, "".z(SB)
        0x001c 00028 (test.go:7)        RET     (R30)

After:

"".f32 STEXT size=32 args=0x8 locals=0x0 leaf
        0x0000 00000 (test.go:7)        TEXT    "".f32(SB), LEAF|NOFRAME|ABIInternal, $0-8
        0x0004 00004 (test.go:7)        MOVWU   "".x(FP), R0
        0x0008 00008 ($GOROOT/src/math/bits/bits.go:30) CLZW    R0, R0
        0x000c 00012 (test.go:7)        MOVD    R0, "".z(SB)
        0x0018 00024 (test.go:7)        RET     (R30)

Benchmarks:
name              old time/op  new time/op  delta
LeadingZeros-8    2.53ns ± 0%  2.55ns ± 0%   +0.67%  (p=0.000 n=10+10)
LeadingZeros8-8   3.56ns ± 0%  3.56ns ± 0%     ~     (all equal)
LeadingZeros16-8  3.55ns ± 0%  3.56ns ± 0%     ~     (p=0.465 n=10+10)
LeadingZeros32-8  3.55ns ± 0%  2.96ns ± 0%  -16.71%  (p=0.000 n=10+7)
LeadingZeros64-8  2.53ns ± 0%  2.54ns ± 0%     ~     (p=0.059 n=8+10)

Change-Id: Ie5666bb82909e341060e02ffd4e86c0e5d67e90a
Reviewed-on: https://go-review.googlesource.com/c/157000
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2019-02-27 16:09:33 +00:00
Gergely Brautigam
8ee9bca272 cmd/compile: suppress typecheck errors in a type switch case with broken type
If a type switch case expression has failed typechecking, the case body is
likely to also fail with confusing or spurious errors. Suppress
typechecking the case body when this happens.

Fixes #28926

Change-Id: Idfdb9d5627994f2fd90154af1659e9a92bf692c4
Reviewed-on: https://go-review.googlesource.com/c/158617
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2019-02-27 08:22:03 +00:00
Matthew Dempsky
6fa7669fd7 cmd/compile: unify duplicate const detection logic
Consistent logic for handling both duplicate map keys and case values,
and eliminates ad hoc value hashing code.

Also makes cmd/compile consistent with go/types's handling of
duplicate constants (see #28085), which is at least an improvement
over the status quo even if we settle on something different for the
spec.

As a side effect, this also suppresses cmd/compile's warnings about
duplicate nils in (non-interface expression) switch statements, which
was technically never allowed by the spec anyway.

Updates #28085.
Updates #28378.

Change-Id: I176a251e770c3c5bc11c2bf8d1d862db8f252a17
Reviewed-on: https://go-review.googlesource.com/c/152544
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2019-02-27 01:08:13 +00:00
Iskander Sharipov
c1050a8e54 cmd/compile: don't generate newobject call for 0-sized types
Emit &runtime.zerobase instead of a call to newobject for
allocations of zero sized objects in walk.go.

Fixes #29446

Change-Id: I11b67981d55009726a17c2e582c12ce0c258682e
Reviewed-on: https://go-review.googlesource.com/c/155840
Run-TryBot: Iskander Sharipov <quasilyte@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
2019-02-26 23:08:15 +00:00
Ian Lance Taylor
c3c90d0132 test: add test case that caused a gccgo compiler crash
Change-Id: Icdc980e0dcb5639c49aba5f4f252f33bd207e4fa
Reviewed-on: https://go-review.googlesource.com/c/162617
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2019-02-26 23:02:54 +00:00
Keith Randall
933e34ac99 cmd/compile: treat slice pointers as non-nil
var a []int = ...
p := &a[0]
_ = *p

We don't need to nil check on the 3rd line. If the bounds check on the 2nd
line passes, we know p is non-nil.

We rely on the fact that any cap>0 slice has a non-nil pointer as its
pointer to the backing array. This is true for all safely-constructed slices,
and I don't see any reason why someone would violate this rule using unsafe.

R=go1.13

Fixes #30366

Change-Id: I3ed764fcb72cfe1fbf963d8c1a82e24e3b6dead7
Reviewed-on: https://go-review.googlesource.com/c/163740
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2019-02-26 20:44:52 +00:00
Agniva De Sarker
39fa3f171c cmd/compile: fix a typo in assignment mismatch error
Fixes #30087

Change-Id: Ic6d80f8e6e1831886af8613420b1bd129a1b4850
Reviewed-on: https://go-review.googlesource.com/c/161577
Reviewed-by: Robert Griesemer <gri@golang.org>
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2019-02-26 18:50:48 +00:00
Michael Fraenkel
6d781decad cmd/compile: confusing error if composite literal field is a method
When looking for the field specified in a composite literal, check that
the specified name is actually a field and not a method.

Fixes #29855.

Change-Id: Id77666e846f925907b1eec64213b1d25af8a2466
Reviewed-on: https://go-review.googlesource.com/c/158938
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2019-02-26 18:42:07 +00:00
Bryan C. Mills
1670da9ee4 test: add a go.mod file in the working directory of nosplit.go
Updates #30228

Change-Id: I41bbedf15fa51242f69a3b1ecafd0d3191271799
Reviewed-on: https://go-review.googlesource.com/c/163518
Reviewed-by: Jay Conrod <jayconrod@google.com>
2019-02-26 02:44:45 +00:00
Cherry Zhang
0349f29a55 cmd/compile: flow interface data to heap if CONVIFACE of a non-direct interface escapes
Consider the following code:

func f(x []*T) interface{} {
	return x
}

It returns an interface that holds a heap copy of x (by calling
convT2I or friend), therefore x escape to heap. The current
escape analysis only recognizes that x flows to the result. This
is not sufficient, since if the result does not escape, x's
content may be stack allocated and this will result a
heap-to-stack pointer, which is bad.

Fix this by realizing that if a CONVIFACE escapes and we're
converting from a non-direct interface type, the data needs to
escape to heap.

Running "toolstash -cmp" on std & cmd, the generated machine code
are identical for all packages. However, the export data (escape
tags) differ in the following packages. It looks to me that all
are similar to the "f" above, where the parameter should escape
to heap.

io/ioutil/ioutil.go:118
	old: leaking param: r to result ~r1 level=0
	new: leaking param: r

image/image.go:943
	old: leaking param: p to result ~r0 level=1
	new: leaking param content: p

net/url/url.go:200
	old: leaking param: s to result ~r2 level=0
	new: leaking param: s

(as a consequence)
net/url/url.go:183
	old: leaking param: s to result ~r1 level=0
	new: leaking param: s

net/url/url.go:194
	old: leaking param: s to result ~r1 level=0
	new: leaking param: s

net/url/url.go:699
	old: leaking param: u to result ~r0 level=1
	new: leaking param: u

net/url/url.go:775
	old: (*URL).String u does not escape
	new: leaking param content: u

net/url/url.go:1038
	old: leaking param: u to result ~r0 level=1
	new: leaking param: u

net/url/url.go:1099
	old: (*URL).MarshalBinary u does not escape
	new: leaking param content: u

flag/flag.go:235
	old: leaking param: s to result ~r0 level=1
	new: leaking param content: s

go/scanner/errors.go:105
	old: leaking param: p to result ~r0 level=0
	new: leaking param: p

database/sql/sql.go:204
	old: leaking param: ns to result ~r0 level=0
	new: leaking param: ns

go/constant/value.go:303
	old: leaking param: re to result ~r2 level=0, leaking param: im to result ~r2 level=0
	new: leaking param: re, leaking param: im

go/constant/value.go:846
	old: leaking param: x to result ~r1 level=0
	new: leaking param: x

encoding/xml/xml.go:518
	old: leaking param: d to result ~r1 level=2
	new: leaking param content: d

encoding/xml/xml.go:122
	old: leaking param: leaking param: t to result ~r1 level=0
	new: leaking param: t

crypto/x509/verify.go:506
	old: leaking param: c to result ~r8 level=0
	new: leaking param: c

crypto/x509/verify.go:563
	old: leaking param: c to result ~r3 level=0, leaking param content: c
	new: leaking param: c

crypto/x509/verify.go:615
	old: (nothing)
	new: leaking closure reference c

crypto/x509/verify.go:996
	old: leaking param: c to result ~r1 level=0, leaking param content: c
	new: leaking param: c

net/http/filetransport.go:30
	old: leaking param: fs to result ~r1 level=0
	new: leaking param: fs

net/http/h2_bundle.go:2684
	old: leaking param: mh to result ~r0 level=2
	new: leaking param content: mh

net/http/h2_bundle.go:7352
	old: http2checkConnHeaders req does not escape
	new: leaking param content: req

net/http/pprof/pprof.go:221
	old: leaking param: name to result ~r1 level=0
	new: leaking param: name

cmd/internal/bio/must.go:21
	old: leaking param: w to result ~r1 level=0
	new: leaking param: w

Fixes #29353.

Change-Id: I7e7798ae773728028b0dcae5bccb3ada51189c68
Reviewed-on: https://go-review.googlesource.com/c/162829
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: David Chase <drchase@google.com>
2019-02-21 15:14:45 +00:00
Robert Griesemer
4ad5537bfa cmd/compile: accept 'i' suffix orthogonally on all numbers
This change accepts the 'i' suffix on binary and octal integer
literals as well as hexadecimal floats. The suffix was already
accepted on decimal integers and floats.

Note that 0123i == 123i for backward-compatibility (and 09i is
valid).

See also the respective language in the spec change:
https://golang.org/cl/161098

Change-Id: I9d2d755cba36a3fa7b9e24308c73754d4568daaf
Reviewed-on: https://go-review.googlesource.com/c/162878
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2019-02-19 22:45:09 +00:00
Robert Griesemer
041d31b882 cmd/compile: don't mix internal float/complex constants of different precision
There are several places where a new (internal) complex constant is allocated
via new(Mpcplx) rather than newMpcmplx(). The problem with using new() is that
the Mpcplx data structure's Real and Imag components don't get initialized with
an Mpflt of the correct precision (they have precision 0, which may be adjusted
later).

In all cases but one, the components of those complex constants are set using
a Set operation which "inherits" the correct precision from the value that is
being set.

But when creating a complex value for an imaginary literal, the imaginary
component is set via SetString which assumes 64bits of precision by default.
As a result, the internal representation of 0.01i and complex(0, 0.01) was
not correct.

Replaced all used of new(Mpcplx) with newMpcmplx() and added a new test.

Fixes #30243.

Change-Id: Ife7fd6ccd42bf887a55c6ce91727754657e6cb2d
Reviewed-on: https://go-review.googlesource.com/c/163000
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2019-02-19 21:05:17 +00:00
Cherry Zhang
dca707b2a0 cmd/compile: guard against loads with negative offset from readonly constants
CL 154057 adds guards agaist out-of-bound reads from readonly
constants. It turns out that in dead code, the offset can also
be negative. Guard against negative offset as well.

Fixes #30257.

Change-Id: I47c2a2e434dd466c08ae6f50f213999a358c796e
Reviewed-on: https://go-review.googlesource.com/c/162819
Reviewed-by: Keith Randall <khr@golang.org>
2019-02-16 02:02:31 +00:00
Keith Randall
585c9e8412 cmd/compile: implement shifts by signed amounts
Allow shifts by signed amounts. Panic if the shift amount is negative.

TODO: We end up doing two compares per shift, see Ian's comment
https://github.com/golang/go/issues/19113#issuecomment-443241799 that
we could do it with a single comparison in the normal case.

The prove pass mostly handles this code well. For instance, it removes the
<0 check for cases like this:
    if s >= 0 { _ = x << s }
    _ = x << len(a)

This case isn't handled well yet:
    _ = x << (y & 0xf)
I'll do followon CLs for unhandled cases as needed.

Update #19113

R=go1.13

Change-Id: I839a5933d94b54ab04deb9dd5149f32c51c90fa1
Reviewed-on: https://go-review.googlesource.com/c/158719
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2019-02-15 23:13:09 +00:00
Robert Griesemer
ceb849dd97 cmd/compile: accept new Go2 number literals
This CL introduces compiler support for the new binary and octal integer
literals, hexadecimal floats, and digit separators for all number literals.

The new Go 2 number literal scanner accepts the following liberal format:

number   = [ prefix ] digits [ "." digits ] [ exponent ] [ "i" ] .
prefix   = "0" [ "b" |"B" | "o" | "O" | "x" | "X" ] .
digits   = { digit | "_" } .
exponent = ( "e" | "E" | "p" | "P" ) [ "+" | "-" ] digits .

If the number starts with "0x" or "0X", digit is any hexadecimal digit;
otherwise, digit is any decimal digit. If the accepted number is not valid,
errors are reported accordingly.

See the new test cases in scanner_test.go for a selection of valid and
invalid numbers and the respective error messages.

R=Go1.13

Updates #12711.
Updates #19308.
Updates #28493.
Updates #29008.

Change-Id: Ic8febc7bd4dc5186b16a8c8897691e81125cf0ca
Reviewed-on: https://go-review.googlesource.com/c/157677
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
2019-02-11 23:22:50 +00:00
Yasser Abdolmaleki
aa161ad17e test/chan: fix broken link to Squinting at Power Series
Change-Id: Idee94a1d93555d53442098dd7479982e3f5afbba
Reviewed-on: https://go-review.googlesource.com/c/161339
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2019-02-06 20:54:16 +00:00
Keith Randall
8f854244ad cmd/compile: fix crash when memmove argument is not the right type
Make sure the argument to memmove is of pointer type before we try to
get the element type.

This has been noticed for code that uses unsafe+linkname so it can
call runtime.memmove. Probably not the best thing to allow, but the
code is out there and we'd rather not break it unnecessarily.

Fixes #30061

Change-Id: I334a8453f2e293959fd742044c43fbe93f0b3d31
Reviewed-on: https://go-review.googlesource.com/c/160826
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2019-02-01 23:43:09 +00:00
Cherry Zhang
7e987b7b33 reflect: eliminate write barrier for copying result in callReflect
We are copying the results to uninitialized stack space. Write
barrier is not needed.

Fixes #30041.

Change-Id: Ia91d74dbafd96dc2bd92de0cb479808991dda03e
Reviewed-on: https://go-review.googlesource.com/c/160737
Run-TryBot: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
2019-02-01 19:23:02 +00:00
Keith Randall
eafe9a186c cmd/compile: hide init functions in tracebacks
Treat compiler-generated init functions as wrappers, so they will not
be shown in tracebacks.

The exception to this rule is that we'd like to show the line number
of initializers for global variables in tracebacks. In order to
preserve line numbers for those cases, separate out the code for those
initializers into a separate function (which is not marked as
autogenerated).

This CL makes the go binary 0.2% bigger.

Fixes #29919

Change-Id: I0f1fbfc03d10d764ce3a8ddb48fb387ca8453386
Reviewed-on: https://go-review.googlesource.com/c/159717
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2019-01-27 04:02:46 +00:00
Ian Lance Taylor
d585f04fd3 cmd/compile: base PPC64 trunc rules on final type, not op type
Whether a truncation should become a MOVWreg or a MOVWZreg doesn't
depend on the type of the operand, it depends on the type of the final
result.  If the final result is unsigned, we can use MOVWZreg.  If the
final result is signed, we can use MOVWreg.  Checking the type of the
operand does the wrong thing if truncating an unsigned value to a
signed value, or vice-versa.

Fixes #29943

Change-Id: Ia6fc7d006486fa02cffd0bec4d910bdd5b6365f8
Reviewed-on: https://go-review.googlesource.com/c/159760
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2019-01-27 01:43:05 +00:00
Keith Randall
1fb596143c cmd/compile: don't bother compiling functions named "_"
They can't be used, so we don't need code generated for them. We just
need to report errors in their bodies.

This is the minimal CL for 1.12. For 1.13, CL 158845 will remove
a bunch of special cases sprinkled about the compiler to handle "_"
functions, which should (after this CL) be unnecessary.

Update #29870

Change-Id: Iaa1c194bd0017dffdce86589fe2d36726ee83c13
Reviewed-on: https://go-review.googlesource.com/c/158820
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2019-01-22 22:06:11 +00:00
Keith Randall
462e90259a runtime: keep FuncForPC from crashing for PCs between functions
Reuse the strict mechanism from FileLine for FuncForPC, so we don't
crash when asking the pcln table about bad pcs.

Fixes #29735

Change-Id: Iaffb32498b8586ecf4eae03823e8aecef841aa68
Reviewed-on: https://go-review.googlesource.com/c/157799
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2019-01-14 23:37:39 +00:00
Keith Randall
7502ed3b90 cmd/compile: when merging instructions, prefer line number of faulting insn
Normally this happens when combining a sign extension and a load.  We
want the resulting combo-instruction to get the line number of the
load, not the line number of the sign extension.

For each rule, compute where we should get its line number by finding
a value on the match side that can fault.  Use that line number for
all the new values created on the right-hand side.

Fixes #27201

Change-Id: I19b3c6f468fff1a3c0bfbce2d6581828557064a3
Reviewed-on: https://go-review.googlesource.com/c/156937
Reviewed-by: David Chase <drchase@google.com>
2019-01-14 22:41:33 +00:00
Austin Clements
a2e79571a9 cmd/compile: separate data and function LSyms
Currently, obj.Ctxt's symbol table does not distinguish between ABI0
and ABIInternal symbols. This is *almost* okay, since a given symbol
name in the final object file is only going to belong to one ABI or
the other, but it requires that the compiler mark a Sym as being a
function symbol before it retrieves its LSym. If it retrieves the LSym
first, that LSym will be created as ABI0, and later marking the Sym as
a function symbol won't change the LSym's ABI.

Marking a Sym as a function symbol before looking up its LSym sounds
easy, except Syms have a dual purpose: they are used just as interned
strings (every function, variable, parameter, etc with the same
textual name shares a Sym), and *also* to store state for whatever
package global has that name. As a result, it's easy to slip up and
look up an LSym when a Sym is serving as the name of a local variable,
and then later mark it as a function when it's serving as the global
with the name.

In general, we were careful to avoid this, but #29610 demonstrates one
case where we messed up. Because of on-demand importing from indexed
export data, it's possible to compile a method wrapper for a type
imported from another package before importing an init function from
that package. If the argument of the method is named "init", the
"init" LSym will be created as a data symbol when compiling the
wrapper, before it gets marked as a function symbol.

To fix this, we separate obj.Ctxt's symbol tables for ABI0 and
ABIInternal symbols. This way, the compiler will simply get a
different LSym once the Sym takes on its package-global meaning as a
function.

This fixes the above ordering issue, and means we no longer need to go
out of our way to create the "init" function early and mark it as a
function symbol.

Fixes #29610.
Updates #27539.

Change-Id: Id9458b40017893d46ef9e4a3f9b47fc49e1ce8df
Reviewed-on: https://go-review.googlesource.com/c/157017
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2019-01-11 00:45:49 +00:00
Keith Randall
956879dd0b runtime: make FuncForPC return the innermost inlined frame
Returning the innermost frame instead of the outermost
makes code that walks the results of runtime.Caller{,s}
still work correctly in the presence of mid-stack inlining.

Fixes #29582

Change-Id: I2392e3dd5636eb8c6f58620a61cef2194fe660a7
Reviewed-on: https://go-review.googlesource.com/c/156364
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2019-01-08 21:54:04 +00:00
David Chase
28fb8c6987 cmd/compile: modify swt.go to skip repeated walks of switch
The compiler appears to contain several squirrelly corner
cases where nodes are double walked, some where new nodes
are created from walked parts.  Rather than trust that we
had searched hard enough for the last one, change
exprSwitch.walk() to return immediately if it has already
been walked.  This appears to be the only case where
double-walking a node is actually harmful.

Fixes #29562.

Change-Id: I0667e8769aba4c3236666cd836a934e256c0bfc5
Reviewed-on: https://go-review.googlesource.com/c/156317
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2019-01-04 21:51:37 +00:00
Keith Randall
af134b17da runtime: proper panic tracebacks with mid-stack inlining
As a followon to CL 152537, modify the panic-printing traceback
to also handle mid-stack inlining correctly.

Also declare -fm functions (aka method functions) as wrappers, so that
they get elided during traceback. This fixes part 2 of #26839.

Fixes #28640
Fixes #24488
Update #26839

Change-Id: I1c535a9b87a9a1ea699621be1e6526877b696c21
Reviewed-on: https://go-review.googlesource.com/c/153477
Reviewed-by: David Chase <drchase@google.com>
2019-01-04 00:00:24 +00:00
Keith Randall
688667716e runtime: don't scan go'd function args past length of ptr bitmap
Use the length of the bitmap to decide how much to pass to the
write barrier, not the total length of the arguments.

The test needs enough arguments so that two distinct bitmaps
get interpreted as a single longer bitmap.

Update #29362

Change-Id: I78f3f7f9ec89c2ad4678f0c52d3d3def9cac8e72
Reviewed-on: https://go-review.googlesource.com/c/156123
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2019-01-03 23:37:42 +00:00
Keith Randall
af4320350b runtime: add test for go function argument scanning
Derived	from Naoki's reproducer.

Update #29362

Change-Id: I1cbd33b38a2f74905dbc22c5ecbad4a87a24bdd1
Reviewed-on: https://go-review.googlesource.com/c/156122
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2019-01-03 20:17:01 +00:00
Cherry Zhang
2e217fa726 cmd/compile: fix deriving from x+d >= w on overflow in prove pass
In the case of x+d >= w, where d and w are constants, we are
deriving x is within the bound of min=w-d and max=maxInt-d. When
there is an overflow (min >= max), we know only one of x >= min
or x <= max is true, and we derive this by excluding the other.
When excluding x >= min, we did not consider the equal case, so
we could incorrectly derive x <= max when x == min.

Fixes #29502.

Change-Id: Ia9f7d814264b1a3ddf78f52e2ce23377450e6e8a
Reviewed-on: https://go-review.googlesource.com/c/156019
Reviewed-by: David Chase <drchase@google.com>
2019-01-02 19:28:06 +00:00
Alberto Donizetti
efbd01f1dc test: disable issue 29329 test when cgo is not enabled
CL 155917 added a -race test that shouldn't be run when cgo is not
enabled. Enforce this in the test file, with a buildflag.

Fixes the nocgo builder.

Change-Id: I9fe0d8f21da4d6e2de3f8fe9395e1fa7e9664b02
Reviewed-on: https://go-review.googlesource.com/c/155957
Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-12-29 14:18:11 +00:00
Keith Randall
ed15e82413 runtime: panic on uncomparable map key, even if map is empty
Reorg map flags a bit so we don't need any extra space for the extra flag.

Fixes #23734

Change-Id: I436812156240ae90de53d0943fe1aabf3ea37417
Reviewed-on: https://go-review.googlesource.com/c/155918
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-12-29 01:00:54 +00:00
Keith Randall
14bdcc76fd cmd/compile: fix racewalk{enter,exit} removal
We can't remove race instrumentation unless there are no calls,
not just no static calls. Closure and interface calls also count.

The problem in issue 29329 is that there was a racefuncenter, an
InterCall, and a racefuncexit.  The racefuncenter was removed, then
the InterCall was rewritten to a StaticCall. That prevented the
racefuncexit from being removed. That caused an imbalance in
racefuncenter/racefuncexit calls, which made the race detector barf.

Bug introduced at CL 121235

Fixes #29329

Change-Id: I2c94ac6cf918dd910b74b2a0de5dc2480d236f16
Reviewed-on: https://go-review.googlesource.com/c/155917
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-12-29 00:04:59 +00:00
Keith Randall
69c2c56453 cmd/compile,runtime: redo mid-stack inlining tracebacks
Work involved in getting a stack trace is divided between
runtime.Callers and runtime.CallersFrames.

Before this CL, runtime.Callers returns a pc per runtime frame.
runtime.CallersFrames is responsible for expanding a runtime frame
into potentially multiple user frames.

After this CL, runtime.Callers returns a pc per user frame.
runtime.CallersFrames just maps those to user frame info.

Entries in the result of runtime.Callers are now pcs
of the calls (or of the inline marks), not of the instruction
just after the call.

Fixes #29007
Fixes #28640
Update #26320

Change-Id: I1c9567596ff73dc73271311005097a9188c3406f
Reviewed-on: https://go-review.googlesource.com/c/152537
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2018-12-28 20:55:36 +00:00
Cherry Zhang
6a64efc250 cmd/compile: fix MIPS SGTconst-with-shift rules
(SGTconst [c] (SRLconst _ [d])) && 0 <= int32(c) && uint32(d) <= 31 && 1<<(32-uint32(d)) <= int32(c) -> (MOVWconst [1])

This rule is problematic. 1<<(32-uint32(d)) <= int32(c) meant to
say that it is true if c is greater than the largest possible
value of the right shift. But when d==1, 1<<(32-1) is negative
and results in the wrong comparison.

Rewrite the rules in a more direct way.

Fixes #29402.

Change-Id: I5940fc9538d9bc3a4bcae8aa34672867540dc60e
Reviewed-on: https://go-review.googlesource.com/c/155798
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-12-27 00:07:53 +00:00
Keith Randall
c5414457c6 cmd/compile: pad zero-sized stack variables
If someone takes a pointer to a zero-sized stack variable, it can
be incorrectly interpreted as a pointer to the next object in the
stack frame. To avoid this, add some padding after zero-sized variables.

We only need to pad if the next variable in memory (which is the
previous variable in the order in which we allocate variables to the
stack frame) has pointers. If the next variable has no pointers, it
won't hurt to have a pointer to it.

Because we allocate all pointer-containing variables before all
non-pointer-containing variables, we should only have to pad once per
frame.

Fixes #24993

Change-Id: Ife561cdfdf964fdbf69af03ae6ba97d004e6193c
Reviewed-on: https://go-review.googlesource.com/c/155698
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-12-22 01:16:00 +00:00
Keith Randall
debca77971 cmd/compile: fix line number for implicitly declared method expressions
Method expressions where the method is implicitly declared have no
line number. The Error method of the built-in error type is one such
method.  We leave the line number at the use of the method expression
in this case.

Fixes #29389

Change-Id: I29c64bb47b1a704576abf086599eb5af7b78df53
Reviewed-on: https://go-review.googlesource.com/c/155639
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-12-22 01:08:39 +00:00
Matthew Dempsky
706b54bb85 cmd/compile: fix ICE due to bad rune width
It was possible that

    var X interface{} = 'x'

could cause a compilation failure due to having not calculated rune's
width yet. typecheck.go normally calculates the width of things, but
it doesn't for implicit conversions to default type. We already
compute the width of all of the standard numeric types in universe.go,
but we failed to calculate it for the rune alias type. So we could
later crash if the code never otherwise explicitly mentioned 'rune'.

While here, explicitly compute widths for 'byte' and 'error' for
consistency.

Fixes #29350.

Change-Id: Ifedd4899527c983ee5258dcf75aaf635b6f812f8
Reviewed-on: https://go-review.googlesource.com/c/155380
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Robert Griesemer <gri@golang.org>
2018-12-20 19:35:04 +00:00
Keith Randall
443990742e cmd/compile: ignore out-of-bounds reads from readonly constants
Out-of-bounds reads of globals can happen in dead code. For code
like this:

s := "a"
if len(s) == 3 {
   load s[0], s[1], and s[2]
}

The out-of-bounds loads are dead code, but aren't removed yet
when lowering. We need to not panic when compile-time evaluating
those loads. This can only happen for dead code, so the result
doesn't matter.

Fixes #29215

Change-Id: I7fb765766328b9524c6f2a1e6ab8d8edd9875097
Reviewed-on: https://go-review.googlesource.com/c/154057
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Alberto Donizetti <alb.donizetti@gmail.com>
2018-12-20 17:34:32 +00:00
Robert Griesemer
99e4ddd053 cmd/compile: increase nesting depth limit for type descriptors
The formatting routines for types use a depth limit as primitive
mechanism to detect cycles. For now, increase the limit from 100
to 250 and file #29312 so we don't drop this on the floor.

Also, adjust some fatal error messages elsewhere to use
better formatting.

Fixes #29264.
Updates #29312.

Change-Id: Idd529f6682d478e0dcd2d469cb802192190602f6
Reviewed-on: https://go-review.googlesource.com/c/154583
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-12-18 00:13:58 +00:00
David Chase
d924c3336c cmd/compile: prevent double-walk of switch for OPRINT/OPRINTN
When a println arg contains a call to an inlineable function
that itself contains a switch, that switch statement will be
walked twice, once by the walkexprlist formerly in the
OPRINT/OPRINTN case, then by walkexprlistcheap in walkprint.

Remove the first walkexprlist, it is not necessary.
walkexprlist =
		s[i] = walkexpr(s[i], init)
walkexprlistcheap = {
		s[i] = cheapexpr(n, init)
		s[i] = walkexpr(s[i], init)
}

Seems like this might be possible in other places, i.e.,
calls to inlineable switch-containing functions.

See also #25776.
Fixes #29220.

Change-Id: I3781e86aad6688711597b8bee9bc7ebd3af93601
Reviewed-on: https://go-review.googlesource.com/c/154497
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-12-17 22:49:21 +00:00
Robert Griesemer
a1aafd8b28 cmd/compile: generate interface method expression wrapper for error.Error
A prior optimization (https://golang.org/cl/106175) removed the
generation of unnecessary method expression wrappers, but also
eliminated the generation of the wrapper for error.Error which
was still required.

Special-case error type in the optimization.

Fixes #29304.

Change-Id: I54c8afc88a2c6d1906afa2d09c68a0a3f3e2f1e3
Reviewed-on: https://go-review.googlesource.com/c/154578
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-12-17 19:48:36 +00:00
Martin Möhrmann
38e7177c94 cmd/compile: fix length overflow when appending elements to a slice
Instead of testing len(slice)+numNewElements > cap(slice) use
uint(len(slice)+numNewElements) > uint(cap(slice)) to test
if a slice needs to be grown in an append operation.

This prevents a possible overflow when len(slice) is near the maximum
int value and the addition of a constant number of new elements
makes it overflow and wrap around to a negative number which is
smaller than the capacity of the slice.

Appending a slice to a slice with append(s1, s2...) already used
a uint comparison to test slice capacity and therefore was not
vulnerable to the same overflow issue.

Fixes: #29190

Change-Id: I41733895838b4f80a44f827bf900ce931d8be5ca
Reviewed-on: https://go-review.googlesource.com/c/154037
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-12-14 05:48:18 +00:00
Keith Randall
05bbec7357 cmd/compile: don't combine load+op if the op has SymAddr arguments
By combining the load+op, we may force the op to happen earlier in
the store chain. That might force the SymAddr operation earlier, and
in particular earlier than its corresponding VarDef. That leads to
an invalid schedule, so avoid that.

This is kind of a hack to work around the issue presented. I think
the underlying problem, that LEAQ is not directly ordered with respect
to its vardef, is the real problem. The benefit of this CL is that
it fixes the immediate issue, is small, and obviously won't break
anything. A real fix for this issue is much more invasive.

The go binary is unchanged in size.
This situation just doesn't occur very often.

Fixes #28445

Change-Id: I13a765e13f075d5b6808a355ef3c43cdd7cd47b6
Reviewed-on: https://go-review.googlesource.com/c/153641
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-12-12 21:13:15 +00:00
Keith Randall
01a1eaa10c cmd/compile: use innermost line number for -S
When functions are inlined, for instructions in the inlined body, does
-S print the location of the call, or the location of the body? Right
now, we do the former. I'd like to do the latter by default, it makes
much more sense when reading disassembly. With mid-stack inlining
enabled in more cases, this quandry will come up more often.

The original behavior is still available with -S=2. Some tests
use this mode (so they can find assembly generated by a particular
source line).

This helped me with understanding what the compiler was doing
while fixing #29007.

Change-Id: Id14a3a41e1b18901e7c5e460aa4caf6d940ed064
Reviewed-on: https://go-review.googlesource.com/c/153241
Reviewed-by: David Chase <drchase@google.com>
2018-12-11 20:24:45 +00:00
David Chase
ea6259d5e9 cmd/compile: check for negative upper bound to IsSliceInBounds
IsSliceInBounds(x, y) asserts that y is not negative, but
there were cases where this is not true.  Change code
generation to ensure that this is true when it's not obviously
true.  Prove phase cleans a few of these out.

With this change the compiler text section is 0.06% larger,
that is, not very much.  Benchmarking still TBD, may need
to wait for access to a benchmarking box (next week).

Also corrected run.go to handle '?' in -update_errors output.

Fixes #28797.

Change-Id: Ia8af90bc50a91ae6e934ef973def8d3f398fac7b
Reviewed-on: https://go-review.googlesource.com/c/152477
Run-TryBot: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-12-07 23:04:58 +00:00
Austin Clements
6454a09a97 cmd/compile: omit write barriers for slice clears of go:notinheap pointers
Currently,

  for i := range a {
    a[i] = nil
  }

will compile to have write barriers even if a is a slice of pointers
to go:notinheap types. This happens because the optimization that
transforms this into a memclr only asks it a's element type has
pointers, and not if it specifically has heap pointers.

Fix this by changing arrayClear to use HasHeapPointer instead of
types.Haspointers. We probably shouldn't have both of these functions,
since a pointer to a notinheap type is effectively a uintptr, but
that's not going to change in this CL.

Change-Id: I284b85bdec6ae1e641f894e8f577989facdb0cf1
Reviewed-on: https://go-review.googlesource.com/c/152723
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2018-12-05 21:54:54 +00:00
Robert Griesemer
cd47e8944d cmd/compile: avoid multiple errors regarding misuse of ... in signatures
Follow-up on #28450 (golang.org/cl/152417).

Updates #28450.
Fixes #29107.

Change-Id: Ib4b4fe582c35315a4f71cf6dbc7f7f2f24b37ec1
Reviewed-on: https://go-review.googlesource.com/c/152758
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2018-12-05 20:59:58 +00:00
smasher164
a7af474359 cmd/compile: improve error message for non-final variadic parameter
Previously, when a function signature had defined a non-final variadic
parameter, the error message always referred to the type associated with that
parameter. However, if the offending parameter's name was part of an identifier
list with a variadic type, one could misinterpret the message, thinking the
problem had been with one of the other names in the identifer list.

    func bar(a, b ...int) {}
clear ~~~~~~~^       ^~~~~~~~ confusing

This change updates the error message and sets the column position to that of
the offending parameter's name, if it exists.

Fixes #28450.

Change-Id: I076f560925598ed90e218c25d70f9449ffd9b3ea
Reviewed-on: https://go-review.googlesource.com/c/152417
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Robert Griesemer <gri@golang.org>
2018-12-05 01:35:59 +00:00
Cherry Zhang
be09bdf589 cmd/compile: fix unnamed parameter handling in escape analysis
For recursive functions, the parameters were iterated using
fn.Name.Defn.Func.Dcl, which does not include unnamed/blank
parameters. This results in a mismatch in formal-actual
assignments, for example,

func f(_ T, x T)

f(a, b) should result in { _=a, x=b }, but the escape analysis
currently sees only { x=a } and drops b on the floor. This may
cause b to not escape when it should (or a escape when it should
not).

Fix this by using fntype.Params().FieldSlice() instead, which
does include unnamed parameters.

Also add a sanity check that ensures all the actual parameters
are consumed.

Fixes #29000

Change-Id: Icd86f2b5d71e7ebbab76e375b7702f62efcf59ae
Reviewed-on: https://go-review.googlesource.com/c/152617
Reviewed-by: Keith Randall <khr@golang.org>
2018-12-04 23:01:00 +00:00
Keith Randall
4a1a783dda cmd/compile: fix static initializer
staticcopy of a struct or array should recursively call itself, not
staticassign.

This fixes an issue where a struct with a slice in it is copied during
static initialization. In this case, the backing array for the slice
is duplicated, and each copy of the slice refers to a different
backing array, which is incorrect.  That issue has existed since at
least Go 1.2.

I'm not sure why this was never noticed. It seems like a pretty
obvious bug if anyone modifies the resulting slice.

In any case, we started to notice when the optimization in CL 140301
landed.  Here is basically what happens in issue29013b.go:
1) The error above happens, so we get two backing stores for what
   should be the same slice.
2) The code for initializing those backing stores is reused.
   But not duplicated: they are the same Node structure.
3) The order pass allocates temporaries for the map operations.
   For the first instance, things work fine and two temporaries are
   allocated and stored in the OKEY nodes. For the second instance,
   the order pass decides new temporaries aren't needed, because
   the OKEY nodes already have temporaries in them.
   But the order pass also puts a VARKILL of the temporaries between
   the two instance initializations.
4) In this state, the code is technically incorrect. But before
   CL 140301 it happens to work because the temporaries are still
   correctly initialized when they are used for the second time. But then...
5) The new CL 140301 sees the VARKILLs and decides to reuse the
   temporary for instance 1 map 2 to initialize the instance 2 map 1
   map. Because the keys aren't re-initialized, instance 2 map 1
   gets the wrong key inserted into it.

Fixes #29013

Change-Id: I840ce1b297d119caa706acd90e1517a5e47e9848
Reviewed-on: https://go-review.googlesource.com/c/152081
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2018-12-03 16:48:21 +00:00
David Chase
624e197c71 cmd/compile: decrease inlining call cost from 60 to 57
A Go user made a well-documented request for a slightly
lower threshold.  I tested against a selection of other
people's benchmarks, and saw a tiny benefit (possibly noise)
at equally tiny cost, and no unpleasant surprises observed
in benchmarking.

I.e., might help, doesn't hurt, low risk, request was
delivered on a silver platter.

It did, however, change the behavior of one test because
now bytes.Buffer.Grow is eligible for inlining.

Updates #19348.

Change-Id: I85e3088a4911290872b8c6bda9601b5354c48695
Reviewed-on: https://go-review.googlesource.com/c/151977
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-12-01 15:03:28 +00:00
Ben Shi
c042fedbc8 test/codegen: add arithmetic tests for 386/amd64/arm/arm64
This CL adds several test cases of arithmetic operations for
386/amd64/arm/arm64.

Change-Id: I362687c06249f31091458a1d8c45fc4d006b616a
Reviewed-on: https://go-review.googlesource.com/c/151897
Run-TryBot: Ben Shi <powerman1st@163.com>
Reviewed-by: Keith Randall <khr@golang.org>
2018-12-01 05:17:44 +00:00
Robert Griesemer
a37d95c74a cmd/compile: fix constant index bounds check and error message
While here, rename nonnegintconst to indexconst (because that's
what it is) and add Fatalf calls where we are not expecting the
indexconst call to fail, and fixed wrong comparison in smallintconst.

Fixes #23781.

Change-Id: I86eb13081c450943b1806dfe3ae368872f76639a
Reviewed-on: https://go-review.googlesource.com/c/151599
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2018-11-30 23:48:00 +00:00
Keith Randall
2140975ebd cmd/compile: eliminate write barriers when writing non-heap ptrs
We don't need a write barrier if:
1) The location we're writing to doesn't hold a heap pointer, and
2) The value we're writing isn't a heap pointer.

The freshly returned value from runtime.newobject satisfies (1).
Pointers to globals, and the contents of the read-only data section satisfy (2).

This is particularly helpful for code like:
p := []string{"abc", "def", "ghi"}

Where the compiler generates:
   a := new([3]string)
   move(a, statictmp_)  // eliminates write barriers here
   p := a[:]

For big slice literals, this makes the code a smaller and faster to
compile.

Update #13554. Reduces the compile time by ~10% and RSS by ~30%.

Change-Id: Icab81db7591c8777f68e5d528abd48c7e44c87eb
Reviewed-on: https://go-review.googlesource.com/c/151498
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-11-29 22:23:02 +00:00
Keith Randall
2b4f24a2d2 cmd/compile: randomize value order in block for testing
A little bit of compiler stress testing. Randomize the order
of the values in a block before every phase. This randomization
makes sure that we're not implicitly depending on that order.

Currently the random seed is a hash of the function name.
It provides determinism, but sacrifices some coverage.
Other arrangements are possible (env var, ...) but require
more setup.

Fixes #20178

Change-Id: Idae792a23264bd9a3507db6ba49b6d591a608e83
Reviewed-on: https://go-review.googlesource.com/c/33909
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-11-28 17:13:46 +00:00
Keith Randall
0b79dde112 cmd/compile: don't use CMOV ops to compute load addresses
We want to issue loads as soon as possible, especially when they
are going to miss in the cache. Using a conditional move (CMOV) here:

i := ...
if cond {
   i++
}
... = a[i]

means that we have to wait for cond to be computed before the load
is issued. Without a CMOV, if the branch is predicted correctly the
load can be issued in parallel with computing cond.
Even if the branch is predicted incorrectly, maybe the speculative
load is close to the real load, and we get a prefetch for free.
In the worst case, when the prediction is wrong and the address is
way off, we only lose by the time difference between the CMOV
latency (~2 cycles) and the mispredict restart latency (~15 cycles).

We only squash CMOVs that affect load addresses. Results of CMOVs
that are used for other things (store addresses, store values) we
use as before.

Fixes #26306

Change-Id: I82ca14b664bf05e1d45e58de8c4d9c775a127ca1
Reviewed-on: https://go-review.googlesource.com/c/145717
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2018-11-27 17:22:37 +00:00
Clément Chigot
5680874e0c test: fix nilptr5 for AIX
This commit fixes a mistake made in CL 144538.
This nilcheck can be removed because OpPPC64LoweredMove will fault if
arg0 is nil, as it's used to store. Further information can be found in
cmd/compile/internal/ssa/nilcheck.go.

Change-Id: Ifec0080c00eb1f94a8c02f8bf60b93308e71b119
Reviewed-on: https://go-review.googlesource.com/c/151298
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-11-27 15:36:08 +00:00
Brian Kessler
319787a528 cmd/compile: intrinsify math/bits.Div on amd64
Note that the intrinsic implementation panics separately for overflow and
divide by zero, which matches the behavior of the pure go implementation.
There is a modest performance improvement after intrinsic implementation.

name     old time/op  new time/op  delta
Div-4    53.0ns ± 1%  47.0ns ± 0%  -11.28%  (p=0.008 n=5+5)
Div32-4  18.4ns ± 0%  18.5ns ± 1%     ~     (p=0.444 n=5+5)
Div64-4  53.3ns ± 0%  47.5ns ± 4%  -10.77%  (p=0.008 n=5+5)

Updates #28273

Change-Id: Ic1688ecc0964acace2e91bf44ef16f5fb6b6bc82
Reviewed-on: https://go-review.googlesource.com/c/144378
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-11-27 05:04:25 +00:00
Keith Randall
eb6c433eb3 cmd/compile: don't convert non-Go-constants to OLITERALs
Don't convert values that aren't Go constants, like
uintptr(unsafe.Pointer(nil)), to a literal constant. This avoids
assuming they are constants for things like indexing, array sizes,
case duplication, etc.

Also, nil is an allowed duplicate in switches. CTNILs aren't Go constants.

Fixes #28078
Fixes #28079

Change-Id: I9ab8af47098651ea09ef10481787eae2ae2fb445
Reviewed-on: https://go-review.googlesource.com/c/151320
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-11-27 01:21:41 +00:00
Keith Randall
6fff980cf1 cmd/compile: initialize sparse slice literals dynamically
When a slice composite literal is sparse, initialize it dynamically
instead of statically.

s := []int{5:5, 20:20}

To initialize the backing store for s, use 2 constant writes instead
of copying from a static array with 21 entries.

This CL also fixes pathologies in the compiler when the slice is
*very* sparse.

Fixes #23780

Change-Id: Iae95c6e6f6a0e2994675cbc750d7a4dd6436b13b
Reviewed-on: https://go-review.googlesource.com/c/151319
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-11-26 22:50:48 +00:00
Keith Randall
1602e49701 cmd/compile: don't constant-fold non-Go constants in the frontend
Abort evconst if its argument isn't a Go constant. The SSA backend
will do the optimizations in question later. They tend to be weird
cases, like uintptr(unsafe.Pointer(uintptr(1))).

Fix OADDSTR and OCOMPLEX cases in isGoConst.
OADDSTR has its arguments in n.List, not n.Left and n.Right.
OCOMPLEX might have a 2-result function as its arg in List[0]
(in which case it isn't a Go constant).

Fixes #24760

Change-Id: Iab312d994240d99b3f69bfb33a443607e872b01d
Reviewed-on: https://go-review.googlesource.com/c/151338
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-11-26 22:49:57 +00:00
Keith Randall
ca3749230b cmd/compile: allow bodyless function if it is linkname'd
In assembly free packages (aka "complete" or "pure go"), allow
bodyless functions if they are linkname'd to something else.

Presumably the thing the function is linkname'd to has a definition.
If not, the linker will complain. And linkname is unsafe, so we expect
users to know what they are doing.

Note this handles only one direction, where the linkname directive
is in the local package. If the linkname directive is in the remote
package, this CL won't help. (See os/signal/sig.s for an example.)

Fixes #23311

Change-Id: I824361b4b582ee05976d94812e5b0e8b0f7a18a6
Reviewed-on: https://go-review.googlesource.com/c/151318
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2018-11-26 20:00:59 +00:00
Clément Chigot
9fe9853ae5 cmd/compile: fix nilcheck for AIX
This commit adapts compile tool to create correct nilchecks for AIX.

AIX allows to load a nil pointer. Therefore, the default nilcheck
which issues a load must be replaced by a CMP instruction followed by a
store at 0x0 if the value is nil. The store will trigger a SIGSEGV as on
others OS.

The nilcheck algorithm must be adapted to do not remove nilcheck if it's
only a read. Stores are detected with v.Type.IsMemory().

Tests related to nilptr must be adapted to the previous changements.
nilptr.go cannot be used as it's because the AIX address space starts at
1<<32.

Change-Id: I9f5aaf0b7e185d736a9b119c0ed2fe4e5bd1e7af
Reviewed-on: https://go-review.googlesource.com/c/144538
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-11-26 14:13:53 +00:00
Clément Chigot
041526c6ef runtime: handle 64bits addresses for AIX
This commit allows the runtime to handle 64bits addresses returned by
mmap syscall on AIX.

Mmap syscall returns addresses on 59bits on AIX. But the Arena
implementation only allows addresses with less than 48 bits.
This commit increases the arena size up to 1<<60 for aix/ppc64.

Update: #25893

Change-Id: Iea72e8a944d10d4f00be915785e33ae82dd6329e
Reviewed-on: https://go-review.googlesource.com/c/138736
Reviewed-by: Austin Clements <austin@google.com>
2018-11-26 14:06:28 +00:00
Austin Clements
9255688610 cmd/asm: rename -symabis to -gensymabis
Currently, both asm and compile have a -symabis flag, but in asm it's
a boolean flag that means to generate a symbol ABIs file and in the
compiler its a string flag giving the path of the symbol ABIs file to
consume. I'm worried about this false symmetry biting us in the
future, so rename asm's flag to -gensymabis.

Updates #27539.

Change-Id: I8b9c18a852d2838099718f8989813f19d82e7434
Reviewed-on: https://go-review.googlesource.com/c/149818
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-11-16 21:57:50 +00:00
Martin Möhrmann
75798e8ada runtime: make processor capability variable naming platform specific
The current support_XXX variables are specific for the
amd64 and 386 platforms.

Prefix processor capability variables by architecture to have a
consistent naming scheme and avoid reuse of the existing
variables for new platforms.

This also aligns naming of runtime variables closer with internal/cpu
processor capability variable names.

Change-Id: I3eabb29a03874678851376185d3a62e73c1aff1d
Reviewed-on: https://go-review.googlesource.com/c/91435
Run-TryBot: Martin Möhrmann <martisch@uos.de>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-11-14 20:30:31 +00:00
Milan Knezevic
c92e73b702 cmd/compile/internal/gc: OMUL should be evaluated when using soft-float
When using soft-float, OMUL might be rewritten to function call
so we should ensure it was evaluated first.

Fixes #28688

Change-Id: I30b87501782fff62d35151f394a1c22b0d490c6c
Reviewed-on: https://go-review.googlesource.com/c/148837
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-11-14 18:52:15 +00:00
Emmanuel T Odeke
6d620fc42e test: move empty header file in builddir, buildrundir to temp directory
Move the empty header file created by "builddir", "buildrundir"
directives to t.tempDir. The file was accidentally placed in the
same directory as the source code and this was a vestige of CL 146999.

Fixes #28781

Change-Id: I3d2ada5f9e8bf4ce4f015b9bd379b311592fe3ce
Reviewed-on: https://go-review.googlesource.com/c/149458
Run-TryBot: Emmanuel Odeke <emm.odeke@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-11-14 00:22:40 +00:00
Austin Clements
a3c70e28ed test: fix ABI mismatch in fixedbugs/issue19507
Because run.go doesn't pass the package being compiled to the compiler
via the -p flag, it can't match up the main·f symbol from the
assembler with the "func f" stub in Go, so it doesn't produce the
correct assembly stub.

Fix this by removing the package prefix from the assembly definition.

Alternatively, we could make run.go pass -p to the compiler, but it's
nicer to remove these package prefixes anyway.

Should fix the linux-arm builder, which was broken by the introduction
of function ABIs in CL 147160.

Updates #27539.

Change-Id: Id62b7701e1108a21a5ad48ffdb5dad4356c273a6
Reviewed-on: https://go-review.googlesource.com/c/149483
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
2018-11-13 23:44:52 +00:00
Keith Randall
0098f8aeac runtime: when using explicit argmap, also use arglen
When we set an explicit argmap, we may want only a prefix of that
argmap.  Argmap is set when the function is reflect.makeFuncStub or
reflect.methodValueCall. In this case, arglen specifies how much of
the args section is actually live. (It could be either all the args +
results, or just the args.)

Fixes #28750

Change-Id: Idf060607f15a298ac591016994e58e22f7f92d83
Reviewed-on: https://go-review.googlesource.com/c/149217
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2018-11-13 22:52:09 +00:00
Austin Clements
0f5dfbcfd7 cmd/go, cmd/dist: plumb symabis from assembler to compiler
For #27539.

Change-Id: I0e27f142224e820205fb0e65ad03be7eba93da14
Reviewed-on: https://go-review.googlesource.com/c/146999
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-11-12 20:46:41 +00:00
Austin Clements
7f1dd3ae4d test: minor simplification to run.go
This is a little clearer, and we're about to need the .s file list in
one more place, so this will cut down on duplication.

Change-Id: I4da8bf03a0469fb97565b0841c40d505657b574e
Reviewed-on: https://go-review.googlesource.com/c/146998
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-11-12 20:46:39 +00:00
Josh Bleecher Snyder
8607b2e825 cmd/compile: optimize A->B->C Moves that include VarDefs
We have an existing optimization that recognizes
memory moves of the form A -> B -> C and converts
them into A -> C, in the hopes that the store to
B will be end up being dead and thus eliminated.

However, when A, B, and C are large types,
the front end sometimes emits VarDef ops for the moves.
This change adds an optimization to match that pattern.

This required changing an old compiler test.
The test assumed that a temporary was required
to deal with a large return value.
With this optimization in place, that temporary
ended up being eliminated.

Triggers 649 times during 'go build -a std cmd'.

Cuts 16k off cmd/go.

name        old object-bytes  new object-bytes  delta
Template          507kB ± 0%        507kB ± 0%  -0.15%  (p=0.008 n=5+5)
Unicode           225kB ± 0%        225kB ± 0%    ~     (all equal)
GoTypes          1.85MB ± 0%       1.85MB ± 0%    ~     (all equal)
Flate             328kB ± 0%        328kB ± 0%    ~     (all equal)
GoParser          402kB ± 0%        402kB ± 0%  -0.00%  (p=0.008 n=5+5)
Reflect          1.41MB ± 0%       1.41MB ± 0%  -0.20%  (p=0.008 n=5+5)
Tar               458kB ± 0%        458kB ± 0%    ~     (all equal)
XML               601kB ± 0%        599kB ± 0%  -0.21%  (p=0.008 n=5+5)

Change-Id: I9b5f25c8663a0b772ad1ee51fa61f74b74d26dd3
Reviewed-on: https://go-review.googlesource.com/c/143479
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Munday <mike.munday@ibm.com>
2018-11-11 14:18:33 +00:00
Lynn Boger
4ae49b5921 cmd/compile: use ANDCC, ORCC, XORCC to avoid CMP on ppc64x
This change makes use of the cc versions of the AND, OR, XOR
instructions, omitting the need for a CMP instruction.

In many test programs and in the go binary, this reduces the
size of 20-30 functions by at least 1 instruction, many in
runtime.

Testcase added to test/codegen/comparisons.go

Change-Id: I6cc1ca8b80b065d7390749c625bc9784b0039adb
Reviewed-on: https://go-review.googlesource.com/c/143059
Reviewed-by: Carlos Eduardo Seo <cseo@linux.vnet.ibm.com>
Reviewed-by: Michael Munday <mike.munday@ibm.com>
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-11-09 19:40:52 +00:00
Keith Randall
13baf4b2cd cmd/compile: encourage inlining of functions with single-call bodies
This is a simple tweak to allow a bit more mid-stack inlining.
In cases like this:

func f() {
    g()
}

We'd really like to inline f into its callers. It can't hurt.

We implement this optimization by making calls a bit cheaper, enough
to afford a single call in the function body, but not 2.
The remaining budget allows for some argument modification, or perhaps
a wrapping conditional:

func f(x int) {
    g(x, 0)
}
func f(x int) {
    if x > 0 {
        g()
    }
}

Update #19348

Change-Id: Ifb1ea0dd1db216c3fd5c453c31c3355561fe406f
Reviewed-on: https://go-review.googlesource.com/c/147361
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: David Chase <drchase@google.com>
2018-11-08 17:29:23 +00:00
Keith Randall
95a4f793c0 cmd/compile: don't deadcode eliminate labels
Dead-code eliminating labels is tricky because there might
be gotos that can still reach them.

Bug probably introduced with CL 91056

Fixes #28616

Change-Id: I6680465134e3486dcb658896f5172606cc51b104
Reviewed-on: https://go-review.googlesource.com/c/147817
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Iskander Sharipov <iskander.sharipov@intel.com>
2018-11-06 18:50:16 +00:00
Ian Lance Taylor
a540aa338a test: add test that gccgo failed to compile
Updates #28601

Change-Id: I734fc5ded153126d384f0df912ecd4d208005e49
Reviewed-on: https://go-review.googlesource.com/c/147537
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-11-05 20:54:58 +00:00
Robert Griesemer
e6305380a0 cmd/compile: reintroduce work-around for cyclic alias declarations
This change re-introduces (temporarily) a work-around for recursive
alias type declarations, originally in https://golang.org/cl/35831/
(intended as fix for #18640). The work-around was removed later
for a more comprehensive cycle detection check. That check
contained a subtle error which made the code appear to work,
while in fact creating incorrect types internally. See #25838
for details.

By re-introducing the original work-around, we eliminate problems
with many simple recursive type declarations involving aliases;
specifically cases such as #27232 and #27267. However, the more
general problem remains.

This CL also fixes the subtle error (incorrect variable use when
analyzing a type cycle) mentioned above and now issues a fatal
error with a reference to the relevant issue (rather than crashing
later during the compilation). While not great, this is better
than the current status. The long-term solution will need to
address these cycles (see #25838).

As a consequence, several old test cases are not accepted anymore
by the compiler since they happened to work accidentally only.
This CL disables parts or all code of those test cases. The issues
are: #18640, #23823, and #24939.

One of the new test cases (fixedbugs/issue27232.go) exposed a
go/types issue. The test case is excluded from the go/types test
suite and an issue was filed (#28576).

Updates #18640.
Updates #23823.
Updates #24939.
Updates #25838.
Updates #28576.

Fixes #27232.
Fixes #27267.

Change-Id: I6c2d10da98bfc6f4f445c755fcaab17fc7b214c5
Reviewed-on: https://go-review.googlesource.com/c/147286
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2018-11-05 20:30:19 +00:00
Matthew Dempsky
b2c397e537 cmd/compile: disallow converting string to notinheap slice
Unlikely to happen in practice, but easy enough to prevent and might
as well do so for completeness.

Fixes #28243.

Change-Id: I848c3af49cb923f088e9490c6a79373e182fad08
Reviewed-on: https://go-review.googlesource.com/c/142719
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Austin Clements <austin@google.com>
2018-11-02 19:53:59 +00:00
Clément Chigot
85525c56ab all: skip unsupported tests on AIX
This commit skips tests which aren't yet supported on AIX.

nosplit.go is disabled because stackGuardMultiplier is increased for
syscalls.

Change-Id: Ib5ff9a4539c7646bcb6caee159f105ff8a160ad7
Reviewed-on: https://go-review.googlesource.com/c/146939
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
2018-11-02 16:12:08 +00:00
Keith Randall
0ad332d80c cmd/compile: implement some moves using non-overlapping reads&writes
For moves >8,<16 bytes, do a move using non-overlapping loads/stores
if it would require no more instructions.

This helps a bit with the case when the move is from a static
constant, because then the code to materialize the value being moved
is smaller.

Change-Id: Ie47a5a7c654afeb4973142b0a9922faea13c9b54
Reviewed-on: https://go-review.googlesource.com/c/146019
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-10-30 20:27:03 +00:00
Keith Randall
f14067f3c1 cmd/compile: when comparing 0-size types, make sure expr side-effects survive
Fixes #23837

Change-Id: I53f524d87946a0065f28a4ddbe47b40f2b43c459
Reviewed-on: https://go-review.googlesource.com/c/145757
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2018-10-30 17:45:19 +00:00
Ben Shi
455ef3f6bc test/codegen: improve arithmetic tests
This CL fixes several typos and adds two more cases
to arithmetic test.

Change-Id: I086560162ea351e2166866e444e2317da36c1729
Reviewed-on: https://go-review.googlesource.com/c/145210
Run-TryBot: Ben Shi <powerman1st@163.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-10-30 14:39:53 +00:00
Ben Shi
5f5ea3fd4d cmd/compile: optimize amd64's ADDQconstmodify/ADDLconstmodify
This CL optimize amd64's code:
"ADDQ $-1, MEM_OP" -> "DECQ MEM_OP"
"ADDL $-1, MEM_OP" -> "DECL MEM_OP"

1. The total size of pkg/linux_amd64 (excluding cmd/compile)
decreases about 0.1KB.

2. The go1 benchmark shows little regression, excluding noise.
name                     old time/op    new time/op    delta
BinaryTree17-4              2.60s ± 5%     2.64s ± 3%  +1.53%  (p=0.000 n=38+39)
Fannkuch11-4                2.37s ± 2%     2.38s ± 2%    ~     (p=0.950 n=40+40)
FmtFprintfEmpty-4          40.4ns ± 5%    40.5ns ± 5%    ~     (p=0.711 n=40+40)
FmtFprintfString-4         72.4ns ± 5%    72.3ns ± 3%    ~     (p=0.485 n=40+40)
FmtFprintfInt-4            79.7ns ± 3%    80.1ns ± 3%    ~     (p=0.124 n=40+40)
FmtFprintfIntInt-4          126ns ± 3%     127ns ± 3%  +0.71%  (p=0.027 n=40+40)
FmtFprintfPrefixedInt-4     153ns ± 4%     153ns ± 2%    ~     (p=0.604 n=40+40)
FmtFprintfFloat-4           206ns ± 5%     210ns ± 5%  +1.79%  (p=0.002 n=40+40)
FmtManyArgs-4               498ns ± 3%     496ns ± 3%    ~     (p=0.099 n=40+40)
GobDecode-4                6.48ms ± 6%    6.47ms ± 7%    ~     (p=0.686 n=39+40)
GobEncode-4                5.95ms ± 7%    5.96ms ± 6%    ~     (p=0.670 n=40+34)
Gzip-4                      224ms ± 6%     223ms ± 5%    ~     (p=0.143 n=40+40)
Gunzip-4                   36.5ms ± 4%    36.5ms ± 4%    ~     (p=0.556 n=40+40)
HTTPClientServer-4         60.7µs ± 2%    59.9µs ± 3%  -1.20%  (p=0.000 n=39+39)
JSONEncode-4               9.03ms ± 4%    9.04ms ± 4%    ~     (p=0.589 n=40+40)
JSONDecode-4               49.4ms ± 4%    49.2ms ± 4%    ~     (p=0.276 n=40+40)
Mandelbrot200-4            3.80ms ± 4%    3.79ms ± 4%    ~     (p=0.837 n=40+40)
GoParse-4                  3.15ms ± 5%    3.13ms ± 5%    ~     (p=0.240 n=40+40)
RegexpMatchEasy0_32-4      72.9ns ± 3%    72.0ns ± 8%  -1.25%  (p=0.003 n=40+40)
RegexpMatchEasy0_1K-4       229ns ± 5%     230ns ± 4%    ~     (p=0.318 n=40+40)
RegexpMatchEasy1_32-4      66.9ns ± 3%    67.3ns ± 7%    ~     (p=0.817 n=40+40)
RegexpMatchEasy1_1K-4       371ns ± 5%     370ns ± 4%    ~     (p=0.275 n=40+40)
RegexpMatchMedium_32-4      106ns ± 4%     104ns ± 7%  -2.28%  (p=0.000 n=40+40)
RegexpMatchMedium_1K-4     32.0µs ± 2%    31.4µs ± 3%  -2.08%  (p=0.000 n=40+40)
RegexpMatchHard_32-4       1.54µs ± 7%    1.52µs ± 3%  -1.80%  (p=0.007 n=39+40)
RegexpMatchHard_1K-4       45.8µs ± 4%    45.5µs ± 3%    ~     (p=0.707 n=40+40)
Revcomp-4                   401ms ± 5%     401ms ± 6%    ~     (p=0.935 n=40+40)
Template-4                 62.4ms ± 4%    61.2ms ± 3%  -1.85%  (p=0.000 n=40+40)
TimeParse-4                 315ns ± 2%     318ns ± 3%  +1.10%  (p=0.002 n=40+40)
TimeFormat-4                297ns ± 3%     298ns ± 3%    ~     (p=0.238 n=40+40)
[Geo mean]                 45.8µs         45.7µs       -0.22%

name                     old speed      new speed      delta
GobDecode-4               119MB/s ± 6%   119MB/s ± 7%    ~     (p=0.684 n=39+40)
GobEncode-4               129MB/s ± 7%   128MB/s ± 6%    ~     (p=0.413 n=40+34)
Gzip-4                   86.6MB/s ± 6%  87.0MB/s ± 6%    ~     (p=0.145 n=40+40)
Gunzip-4                  532MB/s ± 4%   532MB/s ± 4%    ~     (p=0.556 n=40+40)
JSONEncode-4              215MB/s ± 4%   215MB/s ± 4%    ~     (p=0.583 n=40+40)
JSONDecode-4             39.3MB/s ± 4%  39.5MB/s ± 4%    ~     (p=0.277 n=40+40)
GoParse-4                18.4MB/s ± 5%  18.5MB/s ± 5%    ~     (p=0.229 n=40+40)
RegexpMatchEasy0_32-4     439MB/s ± 3%   445MB/s ± 8%  +1.28%  (p=0.003 n=40+40)
RegexpMatchEasy0_1K-4    4.46GB/s ± 4%  4.45GB/s ± 4%    ~     (p=0.343 n=40+40)
RegexpMatchEasy1_32-4     479MB/s ± 3%   476MB/s ± 7%    ~     (p=0.855 n=40+40)
RegexpMatchEasy1_1K-4    2.76GB/s ± 5%  2.77GB/s ± 4%    ~     (p=0.250 n=40+40)
RegexpMatchMedium_32-4   9.36MB/s ± 4%  9.58MB/s ± 6%  +2.31%  (p=0.001 n=40+40)
RegexpMatchMedium_1K-4   32.0MB/s ± 2%  32.7MB/s ± 3%  +2.12%  (p=0.000 n=40+40)
RegexpMatchHard_32-4     20.7MB/s ± 7%  21.1MB/s ± 3%  +1.95%  (p=0.005 n=40+40)
RegexpMatchHard_1K-4     22.4MB/s ± 4%  22.5MB/s ± 3%    ~     (p=0.689 n=40+40)
Revcomp-4                 634MB/s ± 5%   634MB/s ± 6%    ~     (p=0.935 n=40+40)
Template-4               31.1MB/s ± 3%  31.7MB/s ± 3%  +1.88%  (p=0.000 n=40+40)
[Geo mean]                129MB/s        130MB/s       +0.62%

Change-Id: I9d61ee810d900920c572cbe89e2f1626bfed12b7
Reviewed-on: https://go-review.googlesource.com/c/145209
Run-TryBot: Ben Shi <powerman1st@163.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-10-30 00:22:58 +00:00
Robert Griesemer
70dd90c4a9 cmd/compile: revert "typecheck types and funcs before consts"
This reverts commit 9ce87a63b9.

The fix addresses the specific test case, but not the general
problem.

Updates #24755.

Change-Id: I0ba8463b41b099b1ebf49759f88a423b40f70d58
Reviewed-on: https://go-review.googlesource.com/c/145617
Run-TryBot: Robert Griesemer <gri@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-10-29 19:51:57 +00:00
Daniel Martí
9ce87a63b9 cmd/compile: typecheck types and funcs before consts
This way, once the constant declarations are typechecked, all named
types are fully typechecked and have all of their methods added.

Usually this isn't important, as methods and interfaces cannot be used
in constant declarations. However, it can lead to confusing and
incorrect errors, such as:

	$ cat f.go
	package p

	type I interface{ F() }
	type T struct{}

	const _ = I(T{})

	func (T) F() {}
	$ go build f.go
	./f.go:6:12: cannot convert T literal (type T) to type I:
		T does not implement I (missing F method)

The error is clearly wrong, as T does have an F method. If we ensure
that all funcs are typechecked before all constant declarations, we get
the correct error:

	$ go build f2.go
	# command-line-arguments
	./f.go:6:7: const initializer I(T literal) is not a constant

Fixes #24755.

Change-Id: I182b60397b9cac521d9a9ffadb11b42fd42e42fe
Reviewed-on: https://go-review.googlesource.com/c/115096
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2018-10-29 18:10:54 +00:00
Josh Bleecher Snyder
15c4575293 cmd/compile: convert arguments as needed
CL 114797 reworked how arguments get written to the stack.
Some type conversions got lost in the process. Restore them.

Fixes #28390
Updates #28430

Change-Id: Ia0d37428d7d615c865500bbd1a7a4167554ee34f
Reviewed-on: https://go-review.googlesource.com/c/144598
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-10-28 18:22:36 +00:00
Keith Randall
9f291d1fc3 cmd/compile: fix rule for combining loads with compares
Unlike normal load+op opcodes, the load+compare opcode does
not clobber its non-load argument. Allow the load+compare merge
to happen even if the non-load argument is used elsewhere.

Noticed when investigating issue #28417.

Change-Id: Ibc48d1f2e06ae76034c59f453815d263e8ec7288
Reviewed-on: https://go-review.googlesource.com/c/145097
Reviewed-by: Ainar Garipov <gugl.zadolbal@gmail.com>
Reviewed-by: Ben Shi <powerman1st@163.com>
2018-10-27 00:59:54 +00:00
Keith Randall
dd789550a7 cmd/compile: intrinsify math/bits.Sub on amd64
name             old time/op  new time/op  delta
Sub-8            1.12ns ± 1%  1.17ns ± 1%   +5.20%          (p=0.008 n=5+5)
Sub32-8          1.11ns ± 0%  1.11ns ± 0%     ~     (all samples are equal)
Sub64-8          1.12ns ± 0%  1.18ns ± 1%   +5.00%          (p=0.016 n=4+5)
Sub64multiple-8  4.10ns ± 1%  0.86ns ± 1%  -78.93%          (p=0.008 n=5+5)

Fixes #28273

Change-Id: Ibcb6f2fd32d987c3bcbae4f4cd9d335a3de98548
Reviewed-on: https://go-review.googlesource.com/c/144258
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-10-25 19:47:27 +00:00
Keith Randall
899f3a2892 cmd/compile: intrinsify math/bits.Add on amd64
name             old time/op  new time/op  delta
Add-8            1.11ns ± 0%  1.18ns ± 0%   +6.31%  (p=0.029 n=4+4)
Add32-8          1.02ns ± 0%  1.02ns ± 1%     ~     (p=0.333 n=4+5)
Add64-8          1.11ns ± 1%  1.17ns ± 0%   +5.79%  (p=0.008 n=5+5)
Add64multiple-8  4.35ns ± 1%  0.86ns ± 0%  -80.22%  (p=0.000 n=5+4)

The individual ops are a bit slower (but still very fast).
Using the ops in carry chains is very fast.

Update #28273

Change-Id: Id975f76df2b930abf0e412911d327b6c5b1befe5
Reviewed-on: https://go-review.googlesource.com/c/144257
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-10-25 19:47:00 +00:00
Robert Griesemer
6761b1eb1b cmd/compile: better errors for structs with conflicting fields and methods
If a field and method have the same name, mark the respective struct field
so that we don't report follow-on errors when the field/method is accessed.

Per suggestion of @mdempsky.

Fixes #28268.

Change-Id: Ia1ca4cdfe9bacd3739d1fd7ca5e014ca094245ee
Reviewed-on: https://go-review.googlesource.com/c/144259
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2018-10-24 20:39:37 +00:00
Robert Griesemer
5538ecadca cmd/compile: better error for embedded field referring to missing import
Fixes #27938.

Change-Id: I16263ac6c0b8903b8a16f02e8db0e1a16d1c95b4
Reviewed-on: https://go-review.googlesource.com/c/144261
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2018-10-24 20:08:18 +00:00
ChrisALiles
13d5cd7847 cmd/compile: use proved bounds to remove signed division fix-ups
prove is able to find 94 occurrences in std cmd where a divisor
can't have the value -1. The change removes
the extraneous fix-up code for these cases.

Fixes #25239

Change-Id: Ic184de971f47cc57c702eb72805b8e291c14035d
Reviewed-on: https://go-review.googlesource.com/c/130215
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-10-23 02:29:44 +00:00
Keith Randall
dca769dca9 cmd/compile: in append(f()), type convert appended items
The second and subsequent return values from f() need to be
converted to the element type of the first return value from f()
(which must be a slice).

Fixes #22327

Change-Id: I5c0a424812c82c1b95b6d124c5626cfc4408bdb6
Reviewed-on: https://go-review.googlesource.com/c/142718
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2018-10-22 17:30:57 +00:00
Ben Shi
95dda75bde cmd/compile: optimize store combination on 386/amd64
This CL add 3 rules to combine byte-store to word-store on386 and
amd64.

Change-Id: Iffd9cda42f1961680c81def4edc773ad58f211b3
Reviewed-on: https://go-review.googlesource.com/c/143057
Run-TryBot: Ben Shi <powerman1st@163.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-10-19 02:21:04 +00:00
Ian Lance Taylor
0a8e347751 test: update issue5089.go for recent gccgo changes
As of https://golang.org/cl/43456 gccgo now gives a better error
message for this test.

Before:
    fixedbugs/issue5089.go:13:1: error: redefinition of ‘bufio.Buffered’: receiver name changed
     func (b *bufio.Reader) Buffered() int { // ERROR "non-local|redefinition"
     ^
    fixedbugs/issue5089.go:11:13: note: previous definition of ‘bufio.Buffered’ was here
     import "bufio" // GCCGO_ERROR "previous"
                 ^

Now:
    fixedbugs/issue5089.go:13:7: error: may not define methods on non-local type
     func (b *bufio.Reader) Buffered() int { // ERROR "non-local|redefinition"
           ^

Change-Id: I4112ca8d91336f6369f780c1d45b8915b5e8e235
Reviewed-on: https://go-review.googlesource.com/c/130955
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-10-18 17:53:42 +00:00
Ian Lance Taylor
8ccafb1ac7 test: add fixedbugs/bug506 for gccgo
Building with gccgo failed with an undefined symbol error from an
unnecessary hash function.

Updates #19773

Change-Id: Ic78bf1b086ff5ee26d464089c0e14987d3fe8b02
Reviewed-on: https://go-review.googlesource.com/c/130956
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-10-18 04:57:41 +00:00
Ben Shi
4158734097 test/codegen: add more combined load/store test cases
This CL adds more combined load/store test cases for 386/amd64.

Change-Id: I0a483a6ed0212b65c5e84d67ed8c9f50c389ce2d
Reviewed-on: https://go-review.googlesource.com/c/142878
Run-TryBot: Ben Shi <powerman1st@163.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-10-18 01:57:54 +00:00
Matthew Dempsky
5185744962 cmd/compile: remove obsolete "safe" mode
Nowadays there are better ways to safely run untrusted Go programs, like
NaCl and gVisor.

Change-Id: I20c45f13a50dbcf35c343438b720eb93e7b4e13a
Reviewed-on: https://go-review.googlesource.com/c/142717
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
2018-10-17 19:00:37 +00:00
Josh Bleecher Snyder
f2a676536f test: limit runoutput concurrency with -v
This appears to have simply been an oversight.

Change-Id: Ia5d1309b3ebc99c9abbf0282397693272d8178aa
Reviewed-on: https://go-review.googlesource.com/c/142885
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-10-17 16:42:42 +00:00
Robert Griesemer
e861c3e003 cmd/compile: simplified test case (cleanup)
Follow-up on https://golang.org/cl/124595; no semantic changes.

Updates #26411.

Change-Id: Ic1c4622dbf79529ff61530f9c25ec742c2abe5ca
Reviewed-on: https://go-review.googlesource.com/c/142720
Run-TryBot: Robert Griesemer <gri@golang.org>
Reviewed-by: Emmanuel Odeke <emm.odeke@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-10-16 23:11:02 +00:00
Emmanuel T Odeke
0b63086f64 cmd/compile: fix label redefinition error column numbers
Ensure that label redefinition error column numbers
print the actual start of the label instead of the
position of the label's delimiting token ":".

For example, given this program:

package main

func main() {

            foo:
   foo:
foo:
foo            :
}

* Before:
main.go:5:13: label foo defined and not used
main.go:6:7: label foo already defined at main.go:5:13
main.go:7:4: label foo already defined at main.go:5:13
main.go:8:16: label foo already defined at main.go:5:13

* After:
main.go:5:13: label foo defined and not used
main.go:6:4: label foo already defined at main.go:5:13
main.go:7:1: label foo already defined at main.go:5:13
main.go:8:1: label foo already defined at main.go:5:13

Fixes #26411

Change-Id: I8eb874b97fdc8862547176d57ac2fa0f075f2367
Reviewed-on: https://go-review.googlesource.com/c/124595
Run-TryBot: Emmanuel Odeke <emm.odeke@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2018-10-16 22:32:14 +00:00
Filippo Valsorda
a52289ef2b Revert "fmt: fix incorrect format of whole-number floats when using %#v"
Numbers without decimals are valid Go representations of whole-number
floats. That is, "var x float64 = 5" is valid Go. Avoid breakage in
tests that expect a certain output from %#v by reverting to it.

To guarantee the right type is generated by a print use %T(%#v) instead.

Added a test to lock in this behavior.

This reverts commit 7c7cecc184.

Fixes #27634
Updates #26363

Change-Id: I544c400a0903777dd216452a7e86dfe60b0b0283
Reviewed-on: https://go-review.googlesource.com/c/142597
Run-TryBot: Filippo Valsorda <filippo@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rob Pike <r@golang.org>
Reviewed-by: Bryan C. Mills <bcmills@google.com>
2018-10-16 21:54:35 +00:00
Lynn Boger
39fa301bdc test/codegen: enable more tests for ppc64/ppc64le
Adding cases for ppc64,ppc64le to the codegen tests
where appropriate.

Change-Id: Idf8cbe88a4ab4406a4ef1ea777bd15a58b68f3ed
Reviewed-on: https://go-review.googlesource.com/c/142557
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-10-16 19:00:53 +00:00
Ben Shi
4b78fe57a8 cmd/compile: optimize 386's load/store combination
This CL adds more combinations of two consequtive MOVBload/MOVBstore
to a unique MOVWload/MOVWstore.

1. The size of the go executable decreases about 4KB, and the total
size of pkg/linux_386 (excluding cmd/compile) decreases about 1.5KB.

2. There is no regression in the go1 benchmark result, excluding noise.
name                     old time/op    new time/op    delta
BinaryTree17-4              3.28s ± 2%     3.29s ± 2%    ~     (p=0.151 n=40+40)
Fannkuch11-4                3.52s ± 1%     3.51s ± 1%  -0.28%  (p=0.002 n=40+40)
FmtFprintfEmpty-4          45.4ns ± 4%    45.0ns ± 4%  -0.89%  (p=0.019 n=40+40)
FmtFprintfString-4         81.9ns ± 7%    81.3ns ± 1%    ~     (p=0.660 n=40+25)
FmtFprintfInt-4            91.9ns ± 9%    91.4ns ± 9%    ~     (p=0.249 n=40+40)
FmtFprintfIntInt-4          143ns ± 4%     143ns ± 4%    ~     (p=0.760 n=40+40)
FmtFprintfPrefixedInt-4     184ns ± 3%     183ns ± 4%    ~     (p=0.485 n=40+40)
FmtFprintfFloat-4           408ns ± 3%     409ns ± 3%    ~     (p=0.961 n=40+40)
FmtManyArgs-4               597ns ± 4%     602ns ± 3%    ~     (p=0.413 n=40+40)
GobDecode-4                7.13ms ± 6%    7.14ms ± 6%    ~     (p=0.859 n=40+40)
GobEncode-4                6.86ms ± 9%    6.94ms ± 7%    ~     (p=0.162 n=40+40)
Gzip-4                      395ms ± 4%     396ms ± 3%    ~     (p=0.099 n=40+40)
Gunzip-4                   40.9ms ± 4%    41.1ms ± 3%    ~     (p=0.064 n=40+40)
HTTPClientServer-4         63.6µs ± 2%    63.6µs ± 3%    ~     (p=0.832 n=36+39)
JSONEncode-4               16.1ms ± 3%    15.8ms ± 3%  -1.60%  (p=0.001 n=40+40)
JSONDecode-4               61.0ms ± 3%    61.5ms ± 4%    ~     (p=0.065 n=40+40)
Mandelbrot200-4            5.16ms ± 3%    5.18ms ± 3%    ~     (p=0.056 n=40+40)
GoParse-4                  3.25ms ± 2%    3.23ms ± 3%    ~     (p=0.727 n=40+40)
RegexpMatchEasy0_32-4      90.2ns ± 3%    89.3ns ± 6%  -0.98%  (p=0.002 n=40+40)
RegexpMatchEasy0_1K-4       812ns ± 3%     815ns ± 3%    ~     (p=0.309 n=40+40)
RegexpMatchEasy1_32-4       103ns ± 6%     103ns ± 5%    ~     (p=0.680 n=40+40)
RegexpMatchEasy1_1K-4      1.01µs ± 4%    1.02µs ± 3%    ~     (p=0.326 n=40+33)
RegexpMatchMedium_32-4      120ns ± 4%     120ns ± 5%    ~     (p=0.834 n=40+40)
RegexpMatchMedium_1K-4     40.1µs ± 3%    39.5µs ± 4%  -1.35%  (p=0.000 n=40+40)
RegexpMatchHard_32-4       2.27µs ± 6%    2.23µs ± 4%  -1.67%  (p=0.011 n=40+40)
RegexpMatchHard_1K-4       67.2µs ± 3%    67.2µs ± 3%    ~     (p=0.149 n=40+40)
Revcomp-4                   1.84s ± 2%     1.86s ± 3%  +0.70%  (p=0.020 n=40+40)
Template-4                 69.0ms ± 4%    69.8ms ± 3%  +1.20%  (p=0.003 n=40+40)
TimeParse-4                 438ns ± 3%     439ns ± 4%    ~     (p=0.650 n=40+40)
TimeFormat-4                412ns ± 3%     412ns ± 3%    ~     (p=0.888 n=40+40)
[Geo mean]                 65.2µs         65.2µs       -0.04%

name                     old speed      new speed      delta
GobDecode-4               108MB/s ± 6%   108MB/s ± 6%    ~     (p=0.855 n=40+40)
GobEncode-4               112MB/s ± 9%   111MB/s ± 8%    ~     (p=0.159 n=40+40)
Gzip-4                   49.2MB/s ± 4%  49.1MB/s ± 3%    ~     (p=0.102 n=40+40)
Gunzip-4                  474MB/s ± 3%   472MB/s ± 3%    ~     (p=0.063 n=40+40)
JSONEncode-4              121MB/s ± 3%   123MB/s ± 3%  +1.62%  (p=0.001 n=40+40)
JSONDecode-4             31.9MB/s ± 3%  31.6MB/s ± 4%    ~     (p=0.070 n=40+40)
GoParse-4                17.9MB/s ± 2%  17.9MB/s ± 3%    ~     (p=0.696 n=40+40)
RegexpMatchEasy0_32-4     355MB/s ± 3%   358MB/s ± 5%  +0.99%  (p=0.002 n=40+40)
RegexpMatchEasy0_1K-4    1.26GB/s ± 3%  1.26GB/s ± 3%    ~     (p=0.381 n=40+40)
RegexpMatchEasy1_32-4     310MB/s ± 5%   310MB/s ± 4%    ~     (p=0.655 n=40+40)
RegexpMatchEasy1_1K-4    1.01GB/s ± 4%  1.01GB/s ± 3%    ~     (p=0.351 n=40+33)
RegexpMatchMedium_32-4   8.32MB/s ± 4%  8.34MB/s ± 5%    ~     (p=0.696 n=40+40)
RegexpMatchMedium_1K-4   25.6MB/s ± 3%  25.9MB/s ± 4%  +1.36%  (p=0.000 n=40+40)
RegexpMatchHard_32-4     14.1MB/s ± 6%  14.3MB/s ± 4%  +1.64%  (p=0.011 n=40+40)
RegexpMatchHard_1K-4     15.2MB/s ± 3%  15.2MB/s ± 3%    ~     (p=0.147 n=40+40)
Revcomp-4                 138MB/s ± 2%   137MB/s ± 3%  -0.70%  (p=0.021 n=40+40)
Template-4               28.1MB/s ± 4%  27.8MB/s ± 3%  -1.19%  (p=0.003 n=40+40)
[Geo mean]               83.7MB/s       83.7MB/s       +0.03%

Change-Id: I2a2b3a942b5c45467491515d201179fd192e65c9
Reviewed-on: https://go-review.googlesource.com/c/141650
Run-TryBot: Ben Shi <powerman1st@163.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-10-16 07:17:11 +00:00
Ben Shi
3785be3093 test/codegen: fix confusing test cases
ARMv7's MULAF/MULSF/MULAD/MULSD are not fused,
this CL fixes the confusing test cases.

Change-Id: I35022e207e2f0d24a23a7f6f188e41ba8eee9886
Reviewed-on: https://go-review.googlesource.com/c/142439
Run-TryBot: Ben Shi <powerman1st@163.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Akhil Indurti <aindurti@gmail.com>
Reviewed-by: Giovanni Bajo <rasky@develer.com>
2018-10-16 07:17:02 +00:00
Daniel Martí
7f3313133e cmd/compile: don't panic on invalid map key declarations
In golang.org/cl/75310, the compiler's typechecker was changed so that
map key types were validated at a later stage, to make sure that all the
necessary type information was present.

This still worked for map type declarations, but caused a regression for
top-level map variable declarations. These now caused a fatal panic
instead of a typechecking error.

The cause was that checkMapKeys was run too early, before all
typechecking was done. In particular, top-level map variable
declarations are typechecked as external declarations, much later than
where checkMapKeys was run.

Add a test case for both exported and unexported top-level map
declarations, and add a second call to checkMapKeys at the actual end of
typechecking. Simply moving the one call isn't a good solution either;
the comments expand on that.

Fixes #28058.

Change-Id: Ia5febb01a1d877447cf66ba44fb49a7e0f4f18a5
Reviewed-on: https://go-review.googlesource.com/c/140417
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2018-10-15 22:11:26 +00:00
Martin Möhrmann
a0f57c3fd0 cmd/compile: avoid string allocations when map key is struct or array literal
x = map[string(byteslice)] is already optimized by the compiler to avoid a
string allocation. This CL generalizes this optimization to:

x = map[T1{ ... Tn{..., string(byteslice), ...} ... }]
where T1 to Tn is a nesting of struct and array literals.

Found in a hot code path that used a struct of strings made from []byte
slices to make a map lookup.

There are no uses of the more generalized optimization in the standard library.
Passes toolstash -cmp.

MapStringConversion/32/simple    21.9ns ± 2%    21.9ns ± 3%      ~     (p=0.995 n=17+20)
MapStringConversion/32/struct    28.8ns ± 3%    22.0ns ± 2%   -23.80%  (p=0.000 n=20+20)
MapStringConversion/32/array     28.5ns ± 2%    21.9ns ± 2%   -23.14%  (p=0.000 n=19+16)
MapStringConversion/64/simple    21.0ns ± 2%    21.1ns ± 3%      ~     (p=0.072 n=19+18)
MapStringConversion/64/struct    72.4ns ± 3%    21.3ns ± 2%   -70.53%  (p=0.000 n=20+20)
MapStringConversion/64/array     72.8ns ± 1%    21.0ns ± 2%   -71.13%  (p=0.000 n=17+19)

name                           old allocs/op  new allocs/op  delta
MapStringConversion/32/simple      0.00           0.00           ~     (all equal)
MapStringConversion/32/struct      0.00           0.00           ~     (all equal)
MapStringConversion/32/array       0.00           0.00           ~     (all equal)
MapStringConversion/64/simple      0.00           0.00           ~     (all equal)
MapStringConversion/64/struct      1.00 ± 0%      0.00       -100.00%  (p=0.000 n=20+20)
MapStringConversion/64/array       1.00 ± 0%      0.00       -100.00%  (p=0.000 n=20+20)

Change-Id: I483b4d84d8d74b1025b62c954da9a365e79b7a3a
Reviewed-on: https://go-review.googlesource.com/c/116275
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-10-15 19:22:07 +00:00
Alberto Donizetti
7c96d87eda test/codegen: test ppc64 TrailingZeros, OnesCount codegen
This change adds codegen tests for the intrinsification on ppc64 of
the OnesCount{64,32,16,8}, and TrailingZeros{64,32,16,8} math/bits
functions.

Change-Id: Id3364921fbd18316850e15c8c71330c906187fdb
Reviewed-on: https://go-review.googlesource.com/c/141897
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
2018-10-15 16:53:03 +00:00
Keith Randall
63e964e174 cmd/compile: provide types for all order-allocated temporaries
Ensure that we correctly type the stack temps for regular closures,
method function closures, and slice literals.

Then we don't need to override the dummy types later.
Furthermore, this allows order to reuse temporaries of these types.

OARRAYLIT doesn't need a temporary as far as I can tell, so I
removed that case from order.

Change-Id: Ic58520fa50c90639393ff78f33d3c831d5c4acb9
Reviewed-on: https://go-review.googlesource.com/c/140306
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-10-15 16:07:52 +00:00
Ben Shi
93e27e01af test/codegen: add tests of FMA for arm/arm64
This CL adds tests of fused multiplication-accumulation
on arm/arm64.

Change-Id: Ic85d5277c0d6acb7e1e723653372dfaf96824a39
Reviewed-on: https://go-review.googlesource.com/c/141652
Run-TryBot: Ben Shi <powerman1st@163.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-10-15 14:51:30 +00:00
Ben Shi
c3208842e1 test/codegen: add tests for multiplication-subtraction
This CL adds tests for armv7's MULS and arm64's MSUBW.

Change-Id: Id0fd5d26fd477e4ed14389b0d33cad930423eb5b
Reviewed-on: https://go-review.googlesource.com/c/141651
Run-TryBot: Ben Shi <powerman1st@163.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-10-15 02:41:33 +00:00
Keith Randall
389e942745 cmd/compile: reuse temporaries in order pass
Instead of allocating a new temporary each time one
is needed, keep a list of temporaries which are free
(have already been VARKILLed on every path) and use
one of them.

Should save a lot of stack space. In a function like this:

func main() {
     fmt.Printf("%d %d\n", 2, 3)
     fmt.Printf("%d %d\n", 4, 5)
     fmt.Printf("%d %d\n", 6, 7)
}

The three [2]interface{} arrays used to hold the ... args
all use the same autotmp, instead of 3 different autotmps
as happened previous to this CL.

Change-Id: I2d728e226f81e05ae68ca8247af62014a1b032d3
Reviewed-on: https://go-review.googlesource.com/c/140301
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-10-14 05:21:00 +00:00
Keith Randall
0e9f8a21f8 runtime,cmd/compile: pass strings and slices to convT2{E,I} by value
When we pass these types by reference, we usually have to allocate
temporaries on the stack, initialize them, then pass their address
to the conversion functions. It's simpler to pass these types
directly by value.

This particularly applies to conversions needed for fmt.Printf
(to interface{} for constructing a [...]interface{}).

func f(a, b, c string) {
     fmt.Printf("%s %s\n", a, b)
     fmt.Printf("%s %s\n", b, c)
}

This function's stack frame shrinks from 200 to 136 bytes, and
its code shrinks from 535 to 453 bytes.

The go binary shrinks 0.3%.

Update #24286

Aside: for this function f, we don't really need to allocate
temporaries for the convT2E function. We could use the address
of a, b, and c directly. That might get similar (or maybe better?)
improvements. I investigated a bit, but it seemed complicated
to do it safely. This change was much easier.

Change-Id: I78cbe51b501fb41e1e324ce4203f0de56a1db82d
Reviewed-on: https://go-review.googlesource.com/c/135377
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2018-10-14 03:46:51 +00:00
Keith Randall
653a4bd8d4 cmd/compile: optimize loads from readonly globals into constants
Instead of
   MOVB go.string."foo"(SB), AX
do
   MOVB $102, AX

When we know the global we're loading from is readonly, we can
do that read at compile time.

I've made this arch-dependent mostly because the cases where this
happens often are memory->memory moves, and those don't get
decomposed until lowering.

Did amd64/386/arm/arm64. Other architectures could follow.

Update #26498

Change-Id: I41b1dc831b2cd0a52dac9b97f4f4457888a46389
Reviewed-on: https://go-review.googlesource.com/c/141118
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2018-10-14 02:54:40 +00:00
Ben Shi
bac6a2925c test/codegen: add more arm64 test cases
This CL adds 3 combined load test cases for arm64.

Change-Id: I2c67308c40fd8a18f9f2d16c6d12911dcdc583e2
Reviewed-on: https://go-review.googlesource.com/c/140700
Run-TryBot: Ben Shi <powerman1st@163.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-10-11 15:14:06 +00:00
Ben Shi
27965c1436 cmd/compile: optimize 386's ADDLconstmodifyidx4
This CL optimize ADDLconstmodifyidx4 to INCL/DECL, when the
constant is +1/-1.

1. The total size of pkg/linux_386/ decreases 28 bytes, excluding
cmd/compile.

2. There is no regression in the go1 benchmark test, excluding noise.
name                     old time/op    new time/op    delta
BinaryTree17-4              3.25s ± 2%     3.23s ± 3%  -0.70%  (p=0.040 n=30+30)
Fannkuch11-4                3.50s ± 1%     3.47s ± 1%  -0.68%  (p=0.000 n=30+30)
FmtFprintfEmpty-4          44.6ns ± 3%    44.8ns ± 3%  +0.46%  (p=0.029 n=30+30)
FmtFprintfString-4         79.0ns ± 3%    78.7ns ± 3%    ~     (p=0.053 n=30+30)
FmtFprintfInt-4            89.2ns ± 2%    89.4ns ± 3%    ~     (p=0.665 n=30+29)
FmtFprintfIntInt-4          142ns ± 3%     142ns ± 3%    ~     (p=0.435 n=30+30)
FmtFprintfPrefixedInt-4     182ns ± 2%     182ns ± 2%    ~     (p=0.964 n=30+30)
FmtFprintfFloat-4           407ns ± 3%     411ns ± 4%    ~     (p=0.080 n=30+30)
FmtManyArgs-4               597ns ± 3%     593ns ± 4%    ~     (p=0.222 n=30+30)
GobDecode-4                7.09ms ± 6%    7.07ms ± 7%    ~     (p=0.633 n=30+30)
GobEncode-4                6.81ms ± 9%    6.81ms ± 8%    ~     (p=0.982 n=30+30)
Gzip-4                      398ms ± 4%     400ms ± 6%    ~     (p=0.177 n=30+30)
Gunzip-4                   41.3ms ± 3%    40.6ms ± 4%  -1.71%  (p=0.005 n=30+30)
HTTPClientServer-4         63.4µs ± 3%    63.4µs ± 4%    ~     (p=0.646 n=30+28)
JSONEncode-4               16.0ms ± 3%    16.1ms ± 3%    ~     (p=0.057 n=30+30)
JSONDecode-4               63.3ms ± 8%    63.1ms ± 7%    ~     (p=0.786 n=30+30)
Mandelbrot200-4            5.17ms ± 3%    5.15ms ± 8%    ~     (p=0.654 n=30+30)
GoParse-4                  3.24ms ± 3%    3.23ms ± 2%    ~     (p=0.091 n=30+30)
RegexpMatchEasy0_32-4       103ns ± 4%     103ns ± 4%    ~     (p=0.575 n=30+30)
RegexpMatchEasy0_1K-4       823ns ± 2%     821ns ± 3%    ~     (p=0.827 n=30+30)
RegexpMatchEasy1_32-4       113ns ± 3%     112ns ± 3%    ~     (p=0.076 n=30+30)
RegexpMatchEasy1_1K-4      1.02µs ± 4%    1.01µs ± 5%    ~     (p=0.087 n=30+30)
RegexpMatchMedium_32-4      129ns ± 3%     127ns ± 4%  -1.55%  (p=0.009 n=30+30)
RegexpMatchMedium_1K-4     39.3µs ± 4%    39.7µs ± 3%    ~     (p=0.054 n=30+30)
RegexpMatchHard_32-4       2.15µs ± 4%    2.15µs ± 4%    ~     (p=0.712 n=30+30)
RegexpMatchHard_1K-4       66.0µs ± 3%    65.1µs ± 3%  -1.32%  (p=0.002 n=30+30)
Revcomp-4                   1.85s ± 2%     1.85s ± 3%    ~     (p=0.168 n=30+30)
Template-4                 69.5ms ± 7%    68.9ms ± 6%    ~     (p=0.250 n=28+28)
TimeParse-4                 434ns ± 3%     432ns ± 4%    ~     (p=0.629 n=30+30)
TimeFormat-4                403ns ± 4%     408ns ± 3%  +1.23%  (p=0.019 n=30+29)
[Geo mean]                 65.5µs         65.3µs       -0.20%

name                     old speed      new speed      delta
GobDecode-4               108MB/s ± 6%   109MB/s ± 6%    ~     (p=0.636 n=30+30)
GobEncode-4               113MB/s ±10%   113MB/s ± 9%    ~     (p=0.982 n=30+30)
Gzip-4                   48.8MB/s ± 4%  48.6MB/s ± 5%    ~     (p=0.178 n=30+30)
Gunzip-4                  470MB/s ± 3%   479MB/s ± 4%  +1.72%  (p=0.006 n=30+30)
JSONEncode-4              121MB/s ± 3%   120MB/s ± 3%    ~     (p=0.057 n=30+30)
JSONDecode-4             30.7MB/s ± 8%  30.8MB/s ± 8%    ~     (p=0.784 n=30+30)
GoParse-4                17.9MB/s ± 3%  17.9MB/s ± 2%    ~     (p=0.090 n=30+30)
RegexpMatchEasy0_32-4     309MB/s ± 4%   309MB/s ± 3%    ~     (p=0.530 n=30+30)
RegexpMatchEasy0_1K-4    1.24GB/s ± 2%  1.25GB/s ± 3%    ~     (p=0.976 n=30+30)
RegexpMatchEasy1_32-4     282MB/s ± 3%   284MB/s ± 3%  +0.81%  (p=0.041 n=30+30)
RegexpMatchEasy1_1K-4    1.00GB/s ± 3%  1.01GB/s ± 4%    ~     (p=0.091 n=30+30)
RegexpMatchMedium_32-4   7.71MB/s ± 3%  7.84MB/s ± 4%  +1.71%  (p=0.000 n=30+30)
RegexpMatchMedium_1K-4   26.1MB/s ± 4%  25.8MB/s ± 3%    ~     (p=0.051 n=30+30)
RegexpMatchHard_32-4     14.9MB/s ± 4%  14.9MB/s ± 4%    ~     (p=0.712 n=30+30)
RegexpMatchHard_1K-4     15.5MB/s ± 3%  15.7MB/s ± 3%  +1.34%  (p=0.003 n=30+30)
Revcomp-4                 138MB/s ± 2%   137MB/s ± 3%    ~     (p=0.174 n=30+30)
Template-4               28.0MB/s ± 6%  28.2MB/s ± 6%    ~     (p=0.251 n=28+28)
[Geo mean]               82.3MB/s       82.6MB/s       +0.36%

Change-Id: I389829699ffe9500a013fcf31be58a97e98043e1
Reviewed-on: https://go-review.googlesource.com/c/140701
Run-TryBot: Ben Shi <powerman1st@163.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-10-11 01:20:34 +00:00
Keith Randall
ceb0c371d9 cmd/compile: make []byte("...") more efficient
Do []byte(string) conversions more efficiently when the string
is a constant. Instead of calling stringtobyteslice, allocate
just the space we need and encode the initialization directly.

[]byte("foo") rewrites to the following pseudocode:

var s [3]byte // on heap or stack, depending on whether b escapes
s = *(*[3]byte)(&"foo"[0]) // initialize s from the string
b = s[:]

which generates this assembly:

	0x001d 00029 (tmp1.go:9)	LEAQ	type.[3]uint8(SB), AX
	0x0024 00036 (tmp1.go:9)	MOVQ	AX, (SP)
	0x0028 00040 (tmp1.go:9)	CALL	runtime.newobject(SB)
	0x002d 00045 (tmp1.go:9)	MOVQ	8(SP), AX
	0x0032 00050 (tmp1.go:9)	MOVBLZX	go.string."foo"+2(SB), CX
	0x0039 00057 (tmp1.go:9)	MOVWLZX	go.string."foo"(SB), DX
	0x0040 00064 (tmp1.go:9)	MOVW	DX, (AX)
	0x0043 00067 (tmp1.go:9)	MOVB	CL, 2(AX)
// Then the slice is b = {AX, 3, 3}

The generated code is still not optimal, as it still does load/store
from read-only memory instead of constant stores.  Next CL...

Update #26498
Fixes #10170

Change-Id: I4b990b19f9a308f60c8f4f148934acffefe0a5bd
Reviewed-on: https://go-review.googlesource.com/c/140698
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-10-10 16:10:40 +00:00
Ben Shi
3933302550 cmd/compile: add indexed form for several 386 instructions
This CL implements indexed memory operands for the following instructions.
(ADD|SUB|MUL|AND|OR|XOR)Lload -> (ADD|SUB|MUL|AND|OR|XOR)Lloadidx4
(ADD|SUB|AND|OR|XOR)Lmodify -> (ADD|SUB|AND|OR|XOR)Lmodifyidx4
(ADD|AND|OR|XOR)Lconstmodify -> (ADD|AND|OR|XOR)Lconstmodifyidx4

1. The total size of pkg/linux_386/ decreases about 2.5KB, excluding
cmd/compile/ .

2. There is little regression in the go1 benchmark test, excluding noise.
name                     old time/op    new time/op    delta
BinaryTree17-4              3.25s ± 3%     3.25s ± 3%    ~     (p=0.218 n=40+40)
Fannkuch11-4                3.53s ± 1%     3.53s ± 1%    ~     (p=0.303 n=40+40)
FmtFprintfEmpty-4          44.9ns ± 3%    45.6ns ± 3%  +1.48%  (p=0.030 n=40+36)
FmtFprintfString-4         78.7ns ± 5%    80.1ns ± 7%    ~     (p=0.217 n=36+40)
FmtFprintfInt-4            90.2ns ± 6%    89.8ns ± 5%    ~     (p=0.659 n=40+38)
FmtFprintfIntInt-4          140ns ± 5%     141ns ± 5%  +1.00%  (p=0.027 n=40+40)
FmtFprintfPrefixedInt-4     185ns ± 3%     183ns ± 3%    ~     (p=0.104 n=40+40)
FmtFprintfFloat-4           411ns ± 4%     406ns ± 3%  -1.37%  (p=0.005 n=40+40)
FmtManyArgs-4               590ns ± 4%     598ns ± 4%  +1.35%  (p=0.008 n=40+40)
GobDecode-4                7.16ms ± 5%    7.10ms ± 5%    ~     (p=0.335 n=40+40)
GobEncode-4                6.85ms ± 7%    6.74ms ± 9%    ~     (p=0.058 n=38+40)
Gzip-4                      400ms ± 4%     399ms ± 2%  -0.34%  (p=0.003 n=40+33)
Gunzip-4                   41.4ms ± 3%    41.4ms ± 4%  -0.12%  (p=0.020 n=40+40)
HTTPClientServer-4         64.1µs ± 4%    63.5µs ± 2%  -1.07%  (p=0.000 n=39+37)
JSONEncode-4               15.9ms ± 2%    15.9ms ± 3%    ~     (p=0.103 n=40+40)
JSONDecode-4               62.2ms ± 4%    61.6ms ± 3%  -0.98%  (p=0.006 n=39+40)
Mandelbrot200-4            5.18ms ± 3%    5.14ms ± 4%    ~     (p=0.125 n=40+40)
GoParse-4                  3.29ms ± 2%    3.27ms ± 2%  -0.66%  (p=0.006 n=40+40)
RegexpMatchEasy0_32-4       103ns ± 4%     103ns ± 4%    ~     (p=0.632 n=40+40)
RegexpMatchEasy0_1K-4       830ns ± 3%     828ns ± 3%    ~     (p=0.563 n=40+40)
RegexpMatchEasy1_32-4       113ns ± 4%     113ns ± 4%    ~     (p=0.494 n=40+40)
RegexpMatchEasy1_1K-4      1.03µs ± 4%    1.03µs ± 4%    ~     (p=0.665 n=40+40)
RegexpMatchMedium_32-4      130ns ± 4%     129ns ± 3%    ~     (p=0.458 n=40+40)
RegexpMatchMedium_1K-4     39.4µs ± 3%    39.7µs ± 3%    ~     (p=0.825 n=40+40)
RegexpMatchHard_32-4       2.16µs ± 4%    2.15µs ± 4%    ~     (p=0.137 n=40+40)
RegexpMatchHard_1K-4       65.2µs ± 3%    65.4µs ± 4%    ~     (p=0.160 n=40+40)
Revcomp-4                   1.87s ± 2%     1.87s ± 1%  +0.17%  (p=0.019 n=33+33)
Template-4                 69.4ms ± 3%    69.8ms ± 3%  +0.60%  (p=0.009 n=40+40)
TimeParse-4                 437ns ± 4%     438ns ± 4%    ~     (p=0.234 n=40+40)
TimeFormat-4                408ns ± 3%     408ns ± 3%    ~     (p=0.904 n=40+40)
[Geo mean]                 65.7µs         65.6µs       -0.08%

name                     old speed      new speed      delta
GobDecode-4               107MB/s ± 5%   108MB/s ± 5%    ~     (p=0.336 n=40+40)
GobEncode-4               112MB/s ± 6%   114MB/s ± 9%  +1.95%  (p=0.036 n=37+40)
Gzip-4                   48.5MB/s ± 4%  48.6MB/s ± 2%  +0.28%  (p=0.003 n=40+33)
Gunzip-4                  469MB/s ± 4%   469MB/s ± 4%  +0.11%  (p=0.021 n=40+40)
JSONEncode-4              122MB/s ± 2%   122MB/s ± 3%    ~     (p=0.105 n=40+40)
JSONDecode-4             31.2MB/s ± 4%  31.5MB/s ± 4%  +0.99%  (p=0.007 n=39+40)
GoParse-4                17.6MB/s ± 2%  17.7MB/s ± 2%  +0.66%  (p=0.007 n=40+40)
RegexpMatchEasy0_32-4     310MB/s ± 4%   310MB/s ± 4%    ~     (p=0.384 n=40+40)
RegexpMatchEasy0_1K-4    1.23GB/s ± 3%  1.24GB/s ± 3%    ~     (p=0.186 n=40+40)
RegexpMatchEasy1_32-4     283MB/s ± 3%   281MB/s ± 4%    ~     (p=0.855 n=40+40)
RegexpMatchEasy1_1K-4    1.00GB/s ± 4%  1.00GB/s ± 4%    ~     (p=0.665 n=40+40)
RegexpMatchMedium_32-4   7.68MB/s ± 4%  7.73MB/s ± 3%    ~     (p=0.359 n=40+40)
RegexpMatchMedium_1K-4   26.0MB/s ± 3%  25.8MB/s ± 3%    ~     (p=0.825 n=40+40)
RegexpMatchHard_32-4     14.8MB/s ± 3%  14.9MB/s ± 4%    ~     (p=0.136 n=40+40)
RegexpMatchHard_1K-4     15.7MB/s ± 3%  15.7MB/s ± 4%    ~     (p=0.150 n=40+40)
Revcomp-4                 136MB/s ± 1%   136MB/s ± 1%  -0.09%  (p=0.028 n=32+33)
Template-4               28.0MB/s ± 3%  27.8MB/s ± 3%  -0.59%  (p=0.010 n=40+40)
[Geo mean]               82.1MB/s       82.3MB/s       +0.25%

Change-Id: Ifa387a251056678326d3508aa02753b70bf7e5d0
Reviewed-on: https://go-review.googlesource.com/c/140303
Run-TryBot: Ben Shi <powerman1st@163.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-10-09 03:55:08 +00:00
Igor Zhilianin
04dc1b2443 all: fix a bunch of misspellings
Change-Id: I94cebca86706e072fbe3be782d3edbe0e22b9432
GitHub-Last-Rev: 8e15a40545
GitHub-Pull-Request: golang/go#28067
Reviewed-on: https://go-review.googlesource.com/c/140437
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-10-08 03:12:03 +00:00
Keith Randall
6933d76a7e cmd/compile: allow VARDEF at top level
This was missed as part of adding a top-level VARDEF
for stack tracing (CL 134156).

Fixes #28055

Change-Id: Id14748dfccb119197d788867d2ec6a3b3c9835cf
Reviewed-on: https://go-review.googlesource.com/c/140304
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Alberto Donizetti <alb.donizetti@gmail.com>
2018-10-06 16:28:04 +00:00
Igor Zhilianin
f90e89e675 all: fix a bunch of misspellings
Change-Id: If2954bdfc551515403706b2cd0dde94e45936e08
GitHub-Last-Rev: d4cfc41a55
GitHub-Pull-Request: golang/go#28049
Reviewed-on: https://go-review.googlesource.com/c/140299
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-10-06 15:40:03 +00:00
uropek
f1973f3164 test: fix spelling of caught be the compiler to caught by the compiler
Change-Id: Id21cdce35963dcdb96cc06252170590224c5aa17
GitHub-Last-Rev: 429dad0ceb
GitHub-Pull-Request: golang/go#28000
Reviewed-on: https://go-review.googlesource.com/c/139424
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-10-04 00:49:49 +00:00
Keith Randall
c91ce3cc7b test: stress test for stack objects
Allocate a long linked list on the stack. This tests both
lots of live stack objects, and lots of intra-stack pointers
to those objects.

Change-Id: I169e067416455737774851633b1e5367e10e1cf2
Reviewed-on: https://go-review.googlesource.com/c/135296
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-10-03 19:54:29 +00:00
Keith Randall
9dac0a8132 runtime: on a signal, set traceback address to a deferreturn call
When a function triggers a signal (like a segfault which translates to
a nil pointer exception) during execution, a sigpanic handler is just
below it on the stack.  The function itself did not stop at a
safepoint, so we have to figure out what safepoint we should use to
scan its stack frame.

Previously we used the site of the most recent defer to get the live
variables at the signal site. That answer is not quite correct, as
explained in #27518. Instead, use the site of a deferreturn call.
It has all the right variables marked as live (no args, all the return
values, except those that escape to the heap, in which case the
corresponding PAUTOHEAP variables will be live instead).

This CL requires stack objects, so that all the local variables
and args referenced by the deferred closures keep the right variables alive.

Fixes #27518

Change-Id: Id45d8a8666759986c203181090b962e2981e48ca
Reviewed-on: https://go-review.googlesource.com/c/134637
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-10-03 19:54:23 +00:00
Keith Randall
9a8372f8bd cmd/compile,runtime: remove ambiguously live logic
The previous CL introduced stack objects. This CL removes the old
ambiguously live liveness analysis. After this CL we're relying
on stack objects exclusively.

Update a bunch of liveness tests to reflect the new world.

Fixes #22350

Change-Id: I739b26e015882231011ce6bc1a7f426049e59f31
Reviewed-on: https://go-review.googlesource.com/c/134156
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-10-03 19:54:16 +00:00
Cherry Zhang
c96e3bcc97 cmd/compile: fix type of OffPtr in some optimization rules
In some optimization rules the type of generated OffPtr was
incorrectly set to the type of the pointee, instead of the
pointer. When the OffPtr value is spilled, this may generate
a spill of the wrong type, e.g. a floating point spill of an
integer (pointer) value. On Wasm, this leads to invalid
bytecode.

Fixes #27961.

Change-Id: I5d464847eb900ed90794105c0013a1a7330756cc
Reviewed-on: https://go-review.googlesource.com/c/139257
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Richard Musiol <neelance@gmail.com>
2018-10-03 15:01:47 +00:00
Carlos Eduardo Seo
9aed4cc395 cmd/compile: instrinsify math/bits.Mul on ppc64x
Add SSA rules to intrinsify Mul/Mul64 on ppc64x.

benchmark             old ns/op     new ns/op     delta
BenchmarkMul-40       8.80          0.93          -89.43%
BenchmarkMul32-40     1.39          1.39          +0.00%
BenchmarkMul64-40     5.39          0.93          -82.75%

Updates #24813

Change-Id: I6e95bfbe976a2278bd17799df184a7fbc0e57829
Reviewed-on: https://go-review.googlesource.com/138917
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
2018-10-02 18:56:06 +00:00
Keith Randall
ef50373983 reflect: ensure correct scanning of return values
During a call to a reflect-generated function or method (via
makeFuncStub or methodValueCall), when should we scan the return
values?

When we're starting a reflect call, the space on the stack for the
return values is not initialized yet, as it contains whatever junk was
on the stack of the caller at the time. The return space must not be
scanned during a GC.

When we're finishing a reflect call, the return values are
initialized, and must be scanned during a GC to make sure that any
pointers in the return values are found and their referents retained.

When the GC stack walk comes across a reflect call in progress on the
stack, it needs to know whether to scan the results or not. It doesn't
know the progress of the reflect call, so it can't decide by
itself. The reflect package needs to tell it.

This CL adds another slot in the frame of makeFuncStub and
methodValueCall so we can put a boolean in there which tells the
runtime whether to scan the results or not.

This CL also adds the args length to reflectMethodValue so the
runtime can restrict its scanning to only the args section (not the
results) if the reflect package says the results aren't ready yet.

Do a delicate dance in the reflect package to set the "results are
valid" bit. We need to make sure we set the bit only after we've
copied the results back to the stack. But we must set the bit before
we drop reflect's copy of the results. Otherwise, we might have a
state where (temporarily) no one has a live copy of the results.
That's the state we were observing in issue #27695 before this CL.

The bitmap used by the runtime currently contains only the args.
(Actually, it contains all the bits, but the size is set so we use
only the args portion.) This is safe for early in a reflect call, but
unsafe late in a reflect call. The test issue27695.go demonstrates
this unsafety. We change the bitmap to always include both args
and results, and decide at runtime which portion to use.

issue27695.go only has a test for method calls. Function calls were ok
because there wasn't a safepoint between when reflect dropped its copy
of the return values and when the caller is resumed. This may change
when we introduce safepoints everywhere.

This truncate-to-only-the-args was part of CL 9888 (in 2015). That
part of the CL fixed the problem demonstrated in issue27695b.go but
introduced the problem demonstrated in issue27695.go.

TODO, in another CL: simplify FuncLayout and its test. stack return
value is now identical to frametype.ptrdata + frametype.gcdata.

Fixes #27695

Change-Id: I2d49b34e34a82c6328b34f02610587a291b25c5f
Reviewed-on: https://go-review.googlesource.com/137440
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2018-09-29 20:25:24 +00:00
Ben Shi
5aeecc4530 cmd/compile: optimize arm64's code with more shifted operations
This CL optimizes arm64's NEG/MVN/TST/CMN with a shifted operand.

1. The total size of pkg/android_arm64 decreases about 0.2KB, excluding
cmd/compile/ .

2. The go1 benchmark shows no regression, excluding noise.
name                     old time/op    new time/op    delta
BinaryTree17-4              16.4s ± 1%     16.4s ± 1%    ~     (p=0.914 n=29+29)
Fannkuch11-4                8.72s ± 0%     8.72s ± 0%    ~     (p=0.274 n=30+29)
FmtFprintfEmpty-4           174ns ± 0%     174ns ± 0%    ~     (all equal)
FmtFprintfString-4          370ns ± 0%     370ns ± 0%    ~     (all equal)
FmtFprintfInt-4             419ns ± 0%     419ns ± 0%    ~     (all equal)
FmtFprintfIntInt-4          672ns ± 1%     675ns ± 2%    ~     (p=0.217 n=28+30)
FmtFprintfPrefixedInt-4     806ns ± 0%     806ns ± 0%    ~     (p=0.402 n=30+28)
FmtFprintfFloat-4          1.09µs ± 0%    1.09µs ± 0%  +0.02%  (p=0.011 n=22+27)
FmtManyArgs-4              2.67µs ± 0%    2.68µs ± 0%    ~     (p=0.279 n=29+30)
GobDecode-4                33.1ms ± 1%    33.1ms ± 0%    ~     (p=0.052 n=28+29)
GobEncode-4                29.6ms ± 0%    29.6ms ± 0%  +0.08%  (p=0.013 n=28+29)
Gzip-4                      1.38s ± 2%     1.39s ± 2%    ~     (p=0.071 n=29+29)
Gunzip-4                    139ms ± 0%     139ms ± 0%    ~     (p=0.265 n=29+29)
HTTPClientServer-4          789µs ± 4%     785µs ± 4%    ~     (p=0.206 n=29+28)
JSONEncode-4               49.7ms ± 0%    49.6ms ± 0%  -0.24%  (p=0.000 n=30+30)
JSONDecode-4                266ms ± 1%     267ms ± 1%  +0.34%  (p=0.000 n=30+30)
Mandelbrot200-4            16.6ms ± 0%    16.6ms ± 0%    ~     (p=0.835 n=28+30)
GoParse-4                  15.9ms ± 0%    15.8ms ± 0%  -0.29%  (p=0.000 n=27+30)
RegexpMatchEasy0_32-4       380ns ± 0%     381ns ± 0%  +0.18%  (p=0.000 n=30+30)
RegexpMatchEasy0_1K-4      1.18µs ± 0%    1.19µs ± 0%  +0.23%  (p=0.000 n=30+30)
RegexpMatchEasy1_32-4       357ns ± 0%     358ns ± 0%  +0.28%  (p=0.000 n=29+29)
RegexpMatchEasy1_1K-4      2.04µs ± 0%    2.04µs ± 0%  +0.06%  (p=0.006 n=30+30)
RegexpMatchMedium_32-4      589ns ± 0%     590ns ± 0%  +0.24%  (p=0.000 n=28+30)
RegexpMatchMedium_1K-4      162µs ± 0%     162µs ± 0%  -0.01%  (p=0.027 n=26+29)
RegexpMatchHard_32-4       9.58µs ± 0%    9.58µs ± 0%    ~     (p=0.935 n=30+30)
RegexpMatchHard_1K-4        287µs ± 0%     287µs ± 0%    ~     (p=0.387 n=29+30)
Revcomp-4                   2.50s ± 0%     2.50s ± 0%  -0.10%  (p=0.020 n=28+28)
Template-4                  310ms ± 0%     310ms ± 1%    ~     (p=0.406 n=30+30)
TimeParse-4                1.68µs ± 0%    1.68µs ± 0%  +0.03%  (p=0.014 n=30+17)
TimeFormat-4               1.65µs ± 0%    1.66µs ± 0%  +0.32%  (p=0.000 n=27+29)
[Geo mean]                  247µs          247µs       +0.05%

name                     old speed      new speed      delta
GobDecode-4              23.2MB/s ± 0%  23.2MB/s ± 0%  -0.08%  (p=0.032 n=27+29)
GobEncode-4              26.0MB/s ± 0%  25.9MB/s ± 0%  -0.10%  (p=0.011 n=29+29)
Gzip-4                   14.1MB/s ± 2%  14.0MB/s ± 2%    ~     (p=0.081 n=29+29)
Gunzip-4                  139MB/s ± 0%   139MB/s ± 0%    ~     (p=0.290 n=29+29)
JSONEncode-4             39.0MB/s ± 0%  39.1MB/s ± 0%  +0.25%  (p=0.000 n=29+30)
JSONDecode-4             7.30MB/s ± 1%  7.28MB/s ± 1%  -0.33%  (p=0.000 n=30+30)
GoParse-4                3.65MB/s ± 0%  3.66MB/s ± 0%  +0.29%  (p=0.000 n=27+30)
RegexpMatchEasy0_32-4    84.1MB/s ± 0%  84.0MB/s ± 0%  -0.17%  (p=0.000 n=30+28)
RegexpMatchEasy0_1K-4     864MB/s ± 0%   862MB/s ± 0%  -0.24%  (p=0.000 n=30+30)
RegexpMatchEasy1_32-4    89.5MB/s ± 0%  89.3MB/s ± 0%  -0.18%  (p=0.000 n=28+24)
RegexpMatchEasy1_1K-4     502MB/s ± 0%   502MB/s ± 0%  -0.05%  (p=0.008 n=30+29)
RegexpMatchMedium_32-4   1.70MB/s ± 0%  1.69MB/s ± 0%  -0.59%  (p=0.000 n=29+30)
RegexpMatchMedium_1K-4   6.31MB/s ± 0%  6.31MB/s ± 0%  +0.05%  (p=0.005 n=30+26)
RegexpMatchHard_32-4     3.34MB/s ± 0%  3.34MB/s ± 0%    ~     (all equal)
RegexpMatchHard_1K-4     3.57MB/s ± 0%  3.57MB/s ± 0%    ~     (all equal)
Revcomp-4                 102MB/s ± 0%   102MB/s ± 0%  +0.10%  (p=0.022 n=28+28)
Template-4               6.26MB/s ± 0%  6.26MB/s ± 1%    ~     (p=0.768 n=30+30)
[Geo mean]               24.2MB/s       24.1MB/s       -0.08%

Change-Id: I494f9db7f8a568a00e9c74ae25086a58b2221683
Reviewed-on: https://go-review.googlesource.com/137976
Run-TryBot: Ben Shi <powerman1st@163.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-09-28 15:05:17 +00:00
Ben Shi
d60cf39f8e cmd/compile: optimize arm64's MADD and MSUB
This CL implements constant folding for MADD/MSUB on arm64.

1. The total size of pkg/android_arm64/ decreases about 4KB,
   excluding cmd/compile/ .

2. There is no regression in the go1 benchmark, excluding noise.
name                     old time/op    new time/op    delta
BinaryTree17-4              16.4s ± 1%     16.5s ± 1%  +0.24%  (p=0.008 n=29+29)
Fannkuch11-4                8.73s ± 0%     8.71s ± 0%  -0.15%  (p=0.000 n=29+29)
FmtFprintfEmpty-4           174ns ± 0%     174ns ± 0%    ~     (all equal)
FmtFprintfString-4          370ns ± 0%     372ns ± 2%  +0.53%  (p=0.007 n=24+30)
FmtFprintfInt-4             419ns ± 0%     419ns ± 0%    ~     (all equal)
FmtFprintfIntInt-4          673ns ± 1%     661ns ± 1%  -1.81%  (p=0.000 n=30+27)
FmtFprintfPrefixedInt-4     806ns ± 0%     805ns ± 0%    ~     (p=0.957 n=28+27)
FmtFprintfFloat-4          1.09µs ± 0%    1.09µs ± 0%  -0.04%  (p=0.001 n=22+30)
FmtManyArgs-4              2.67µs ± 0%    2.68µs ± 0%  +0.03%  (p=0.045 n=29+28)
GobDecode-4                33.2ms ± 1%    32.5ms ± 1%  -2.11%  (p=0.000 n=29+29)
GobEncode-4                29.5ms ± 0%    29.2ms ± 0%  -1.04%  (p=0.000 n=28+28)
Gzip-4                      1.39s ± 2%     1.38s ± 1%  -0.48%  (p=0.023 n=30+30)
Gunzip-4                    139ms ± 0%     139ms ± 0%    ~     (p=0.616 n=30+28)
HTTPClientServer-4          766µs ± 4%     758µs ± 3%  -1.03%  (p=0.013 n=28+29)
JSONEncode-4               49.7ms ± 0%    49.6ms ± 0%  -0.24%  (p=0.000 n=30+30)
JSONDecode-4                266ms ± 0%     268ms ± 1%  +1.07%  (p=0.000 n=29+30)
Mandelbrot200-4            16.6ms ± 0%    16.6ms ± 0%    ~     (p=0.248 n=30+29)
GoParse-4                  15.9ms ± 0%    16.0ms ± 0%  +0.76%  (p=0.000 n=29+29)
RegexpMatchEasy0_32-4       381ns ± 0%     380ns ± 0%  -0.14%  (p=0.000 n=30+30)
RegexpMatchEasy0_1K-4      1.18µs ± 0%    1.19µs ± 1%  +0.30%  (p=0.000 n=29+30)
RegexpMatchEasy1_32-4       357ns ± 0%     357ns ± 0%    ~     (all equal)
RegexpMatchEasy1_1K-4      2.04µs ± 0%    2.05µs ± 0%  +0.50%  (p=0.000 n=26+28)
RegexpMatchMedium_32-4      590ns ± 0%     589ns ± 0%  -0.12%  (p=0.000 n=30+23)
RegexpMatchMedium_1K-4      162µs ± 0%     162µs ± 0%    ~     (p=0.318 n=28+25)
RegexpMatchHard_32-4       9.56µs ± 0%    9.56µs ± 0%    ~     (p=0.072 n=30+29)
RegexpMatchHard_1K-4        287µs ± 0%     287µs ± 0%  -0.02%  (p=0.005 n=28+28)
Revcomp-4                   2.50s ± 0%     2.51s ± 0%    ~     (p=0.246 n=29+29)
Template-4                  312ms ± 1%     313ms ± 1%  +0.46%  (p=0.002 n=30+30)
TimeParse-4                1.68µs ± 0%    1.67µs ± 0%  -0.31%  (p=0.000 n=27+29)
TimeFormat-4               1.66µs ± 0%    1.64µs ± 0%  -0.92%  (p=0.000 n=29+26)
[Geo mean]                  247µs          246µs       -0.15%

name                     old speed      new speed      delta
GobDecode-4              23.1MB/s ± 1%  23.6MB/s ± 0%  +2.17%  (p=0.000 n=29+28)
GobEncode-4              26.0MB/s ± 0%  26.3MB/s ± 0%  +1.05%  (p=0.000 n=28+28)
Gzip-4                   14.0MB/s ± 2%  14.1MB/s ± 1%  +0.47%  (p=0.026 n=30+30)
Gunzip-4                  139MB/s ± 0%   139MB/s ± 0%    ~     (p=0.624 n=30+28)
JSONEncode-4             39.1MB/s ± 0%  39.2MB/s ± 0%  +0.24%  (p=0.000 n=30+30)
JSONDecode-4             7.31MB/s ± 0%  7.23MB/s ± 1%  -1.07%  (p=0.000 n=28+30)
GoParse-4                3.65MB/s ± 0%  3.62MB/s ± 0%  -0.77%  (p=0.000 n=29+29)
RegexpMatchEasy0_32-4    84.0MB/s ± 0%  84.1MB/s ± 0%  +0.18%  (p=0.000 n=28+30)
RegexpMatchEasy0_1K-4     864MB/s ± 0%   861MB/s ± 1%  -0.29%  (p=0.000 n=29+30)
RegexpMatchEasy1_32-4    89.5MB/s ± 0%  89.5MB/s ± 0%    ~     (p=0.841 n=28+28)
RegexpMatchEasy1_1K-4     502MB/s ± 0%   500MB/s ± 0%  -0.51%  (p=0.000 n=29+29)
RegexpMatchMedium_32-4   1.69MB/s ± 0%  1.70MB/s ± 0%  +0.41%  (p=0.000 n=26+30)
RegexpMatchMedium_1K-4   6.31MB/s ± 0%  6.30MB/s ± 0%    ~     (p=0.129 n=30+25)
RegexpMatchHard_32-4     3.35MB/s ± 0%  3.35MB/s ± 0%    ~     (p=0.657 n=30+29)
RegexpMatchHard_1K-4     3.57MB/s ± 0%  3.57MB/s ± 0%    ~     (all equal)
Revcomp-4                 102MB/s ± 0%   101MB/s ± 0%    ~     (p=0.213 n=29+29)
Template-4               6.22MB/s ± 1%  6.19MB/s ± 1%  -0.42%  (p=0.005 n=30+29)
[Geo mean]               24.1MB/s       24.2MB/s       +0.08%

Change-Id: I6c02d3c9975f6bd8bc215cb1fc14d29602b45649
Reviewed-on: https://go-review.googlesource.com/138095
Run-TryBot: Ben Shi <powerman1st@163.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-09-28 15:03:17 +00:00
Than McIntosh
31d19c0ba3 test: add testcase for gccgo compile failure
Also includes a small tweak to test/run.go to allow package names
with Unicode letters (as opposed to just ASCII chars).

Updates #27836

Change-Id: Idbf0bdea24174808cddcb69974dab820eb13e521
Reviewed-on: https://go-review.googlesource.com/138075
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-09-27 15:01:24 +00:00
David Heuschmann
ae9c822f78 cmd/compile: use more specific error message for assignment mismatch
Show a more specifc error message in the form of "%d variables but %v
returns %d values" if an assignment mismatch occurs with a function
or method call on the right.

Fixes #27595

Change-Id: Ibc97d070662b08f150ac22d686059cf224e012ab
Reviewed-on: https://go-review.googlesource.com/135575
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2018-09-27 00:35:06 +00:00
Brian Kessler
9eb53ab9bc cmd/compile: intrinsify math/bits.Mul
Add SSA rules to intrinsify Mul/Mul64 (AMD64 and ARM64).
SSA rules for other functions and architectures are left as a future
optimization.  Benchmark results on AMD64/ARM64 before and after SSA
implementation are below.

amd64
name     old time/op  new time/op  delta
Add-4    1.78ns ± 0%  1.85ns ±12%     ~     (p=0.397 n=4+5)
Add32-4  1.71ns ± 1%  1.70ns ± 0%     ~     (p=0.683 n=5+5)
Add64-4  1.80ns ± 2%  1.77ns ± 0%   -1.22%  (p=0.048 n=5+5)
Sub-4    1.78ns ± 0%  1.78ns ± 0%     ~     (all equal)
Sub32-4  1.78ns ± 1%  1.78ns ± 0%     ~     (p=1.000 n=5+5)
Sub64-4  1.78ns ± 1%  1.78ns ± 0%     ~     (p=0.968 n=5+4)
Mul-4    11.5ns ± 1%   1.8ns ± 2%  -84.39%  (p=0.008 n=5+5)
Mul32-4  1.39ns ± 0%  1.38ns ± 3%     ~     (p=0.175 n=5+5)
Mul64-4  6.85ns ± 1%  1.78ns ± 1%  -73.97%  (p=0.008 n=5+5)
Div-4    57.1ns ± 1%  56.7ns ± 0%     ~     (p=0.087 n=5+5)
Div32-4  18.0ns ± 0%  18.0ns ± 0%     ~     (all equal)
Div64-4  56.4ns ±10%  53.6ns ± 1%     ~     (p=0.071 n=5+5)

arm64
name      old time/op  new time/op  delta
Add-96    5.51ns ± 0%  5.51ns ± 0%     ~     (all equal)
Add32-96  5.51ns ± 0%  5.51ns ± 0%     ~     (all equal)
Add64-96  5.52ns ± 0%  5.51ns ± 0%     ~     (p=0.444 n=5+5)
Sub-96    5.51ns ± 0%  5.51ns ± 0%     ~     (all equal)
Sub32-96  5.51ns ± 0%  5.51ns ± 0%     ~     (all equal)
Sub64-96  5.51ns ± 0%  5.51ns ± 0%     ~     (all equal)
Mul-96    34.6ns ± 0%   5.0ns ± 0%  -85.52%  (p=0.008 n=5+5)
Mul32-96  4.51ns ± 0%  4.51ns ± 0%     ~     (all equal)
Mul64-96  21.1ns ± 0%   5.0ns ± 0%  -76.26%  (p=0.008 n=5+5)
Div-96    64.7ns ± 0%  64.7ns ± 0%     ~     (all equal)
Div32-96  17.0ns ± 0%  17.0ns ± 0%     ~     (all equal)
Div64-96  53.1ns ± 0%  53.1ns ± 0%     ~     (all equal)

Updates #24813

Change-Id: I9bda6d2102f65cae3d436a2087b47ed8bafeb068
Reviewed-on: https://go-review.googlesource.com/129415
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-09-26 20:35:57 +00:00
Alberto Donizetti
8c1c6702f1 test: restore binary.BigEndian use in checkbce
CL 136855 removed the encoding/binary dependency from the checkbce.go
test by defining a local Uint64 to fix the noopt builder; then a more
general mechanism to skip tests on the noopt builder was introduced in
CL 136898, so we can now restore the binary.Uint64 calls in testbce.

Change-Id: I3efbb41be0bfc446a7e638ce6a593371ead2684f
Reviewed-on: https://go-review.googlesource.com/137056
Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
Reviewed-by: Giovanni Bajo <rasky@develer.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-09-24 21:20:51 +00:00
Brad Fitzpatrick
b3369063e5 test: skip some tests on noopt builder
Adds a new build tag "gcflags_noopt" that can be used in test/*.go
tests.

Fixes #27833

Change-Id: I4ea0ccd9e9e58c4639de18645fec81eb24a3a929
Reviewed-on: https://go-review.googlesource.com/136898
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-09-24 20:56:48 +00:00
Keith Randall
9774fa6f40 cmd/compile: fix precedence order bug
&^ and << have equal precedence.  Add some parentheses to make sure
we shift before we andnot.

Fixes #27829

Change-Id: Iba8576201f0f7c52bf9795aaa75d15d8f9a76811
Reviewed-on: https://go-review.googlesource.com/136899
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-09-24 17:43:55 +00:00
Jongmin Kim
c22c7607d3 test/bench/garbage: update Benchmarks Game URL to new page
The existing URL in comment points to an Alioth page which was
deprecated (and not working), so use the new Benchmarks Game URL.

Change-Id: Ifd694382a44a24c44acbed3fe1b17bca6dab998f
Reviewed-on: https://go-review.googlesource.com/136835
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-09-24 17:11:42 +00:00
Alberto Donizetti
6054fef17f test: fix bcecheck test on noopt builder
The noopt builder is configured by setting GO_GCFLAGS=-N -l, but the
test/run.go test harness doesn't look at GO_GCFLAGS when processing
"errorcheck" files, it just calls compile:

  cmdline := []string{goTool(), "tool", "compile", /* etc */}

This is working as intended, since it makes the tests more robust and
independent from the environment; errorcheck files are supposed to set
additional building flags, when needed, like in:

  // errorcheck -0 -N -l

The test/bcecheck.go test used to work on the noopt builder (even if
bce is not active on -N -l) because the test was auto-contained and
the file always compiled with optimizations enabled.

In CL 107355, a new bce test dependent on an external package
(encoding.binary) was added. On the noopt builder the external package
is built using -N -l, and this causes a test failure that broke the
noopt builder:

  https://build.golang.org/log/b2be319536285e5807ee9d66d6d0ec4d57433768

To reproduce the failure, one can do:

  $ go install -a -gcflags="-N -l" std
  $ go run run.go -- checkbce.go

This change fixes the noopt builder breakage by removing the bce test
dependency on encoding/binary by defining a local Uint64() function to
be used in the test.

Change-Id: Ife71aab662001442e715c32a0b7d758349a63ff1
Reviewed-on: https://go-review.googlesource.com/136855
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-09-24 16:54:52 +00:00
Iskander Sharipov
499fbb1a8a cmd/compile/internal/gc: unify self-assignment checks in esc.go
Move slice self-assign check into isSelfAssign function.
Make debug output consistent for all self-assignment cases.

Change-Id: I0e4cc7b3c1fcaeace7226dd80a0dc1ea97347a55
Reviewed-on: https://go-review.googlesource.com/136276
Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-09-20 09:46:11 +00:00
Robert Griesemer
ae37f5a397 cmd/compile: fix error message for &T{} literal mismatch
See the change and comment in typecheck.go for a detailed explanation.

Fixes #26855.

Change-Id: I7867f948490fc0873b1bd849048cda6acbc36e76
Reviewed-on: https://go-review.googlesource.com/136395
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2018-09-20 00:07:35 +00:00
Iskander Sharipov
c03d0e4fec cmd/compile/internal/gc: handle arith ops in samesafeexpr
Teach samesafeexpr to handle arithmetic unary and binary ops.

It makes map lookup optimization possible in

	m[k+1] = append(m[k+1], ...)
	m[-k] = append(m[-k], ...)
	... etc

Does not cover "+" for strings (concatenation).

Change-Id: Ibbb16ac3faf176958da344be1471b06d7cf33a6c
Reviewed-on: https://go-review.googlesource.com/135795
Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-09-19 12:03:58 +00:00
Ben Shi
c6bf9a8109 cmd/compile: optimize AMD64's bit wise operation
Currently "arr[idx] |= 0x80" is compiled to MOVLload->BTSL->MOVLstore.
And this CL optimizes it to a single BTSLconstmodify. Other bit wise
operations with a direct memory operand are also implemented.

1. The size of the executable bin/go decreases about 4KB, and the total size
of pkg/linux_amd64 (excluding cmd/compile) decreases about 0.6KB.

2. There a little improvement in the go1 benchmark test (excluding noise).
name                     old time/op    new time/op    delta
BinaryTree17-4              2.66s ± 4%     2.66s ± 3%    ~     (p=0.596 n=49+49)
Fannkuch11-4                2.38s ± 2%     2.32s ± 2%  -2.69%  (p=0.000 n=50+50)
FmtFprintfEmpty-4          42.7ns ± 4%    43.2ns ± 7%  +1.31%  (p=0.009 n=50+50)
FmtFprintfString-4         71.0ns ± 5%    72.0ns ± 3%  +1.33%  (p=0.000 n=50+50)
FmtFprintfInt-4            80.7ns ± 4%    80.6ns ± 3%    ~     (p=0.931 n=50+50)
FmtFprintfIntInt-4          125ns ± 3%     126ns ± 4%    ~     (p=0.051 n=50+50)
FmtFprintfPrefixedInt-4     158ns ± 1%     142ns ± 3%  -9.84%  (p=0.000 n=36+50)
FmtFprintfFloat-4           215ns ± 4%     212ns ± 4%  -1.23%  (p=0.002 n=50+50)
FmtManyArgs-4               519ns ± 3%     510ns ± 3%  -1.77%  (p=0.000 n=50+50)
GobDecode-4                6.49ms ± 6%    6.52ms ± 5%    ~     (p=0.866 n=50+50)
GobEncode-4                5.93ms ± 8%    6.01ms ± 7%    ~     (p=0.076 n=50+50)
Gzip-4                      222ms ± 4%     224ms ± 8%  +0.80%  (p=0.001 n=50+50)
Gunzip-4                   36.6ms ± 5%    36.4ms ± 4%    ~     (p=0.093 n=50+50)
HTTPClientServer-4         59.1µs ± 1%    58.9µs ± 2%  -0.24%  (p=0.039 n=49+48)
JSONEncode-4               9.23ms ± 4%    9.21ms ± 5%    ~     (p=0.244 n=50+50)
JSONDecode-4               48.8ms ± 4%    48.7ms ± 4%    ~     (p=0.653 n=50+50)
Mandelbrot200-4            3.81ms ± 4%    3.80ms ± 3%    ~     (p=0.834 n=50+50)
GoParse-4                  3.20ms ± 5%    3.19ms ± 5%    ~     (p=0.494 n=50+50)
RegexpMatchEasy0_32-4      78.1ns ± 2%    77.4ns ± 3%  -0.86%  (p=0.005 n=50+50)
RegexpMatchEasy0_1K-4       233ns ± 3%     233ns ± 3%    ~     (p=0.074 n=50+50)
RegexpMatchEasy1_32-4      74.2ns ± 3%    73.4ns ± 3%  -1.06%  (p=0.000 n=50+50)
RegexpMatchEasy1_1K-4       369ns ± 2%     364ns ± 4%  -1.41%  (p=0.000 n=36+50)
RegexpMatchMedium_32-4      109ns ± 4%     107ns ± 3%  -2.06%  (p=0.001 n=50+50)
RegexpMatchMedium_1K-4     31.5µs ± 3%    30.8µs ± 3%  -2.20%  (p=0.000 n=50+50)
RegexpMatchHard_32-4       1.57µs ± 3%    1.56µs ± 2%  -0.57%  (p=0.016 n=50+50)
RegexpMatchHard_1K-4       47.4µs ± 4%    47.0µs ± 3%  -0.82%  (p=0.008 n=50+50)
Revcomp-4                   414ms ± 7%     412ms ± 7%    ~     (p=0.285 n=50+50)
Template-4                 64.3ms ± 4%    62.7ms ± 3%  -2.44%  (p=0.000 n=50+50)
TimeParse-4                 316ns ± 3%     313ns ± 3%    ~     (p=0.122 n=50+50)
TimeFormat-4                291ns ± 3%     293ns ± 3%  +0.80%  (p=0.001 n=50+50)
[Geo mean]                 46.5µs         46.2µs       -0.81%

name                     old speed      new speed      delta
GobDecode-4               118MB/s ± 6%   118MB/s ± 5%    ~     (p=0.863 n=50+50)
GobEncode-4               130MB/s ± 9%   128MB/s ± 8%    ~     (p=0.076 n=50+50)
Gzip-4                   87.4MB/s ± 4%  86.8MB/s ± 7%  -0.78%  (p=0.002 n=50+50)
Gunzip-4                  531MB/s ± 5%   533MB/s ± 4%    ~     (p=0.093 n=50+50)
JSONEncode-4              210MB/s ± 4%   211MB/s ± 5%    ~     (p=0.247 n=50+50)
JSONDecode-4             39.8MB/s ± 4%  39.9MB/s ± 4%    ~     (p=0.654 n=50+50)
GoParse-4                18.1MB/s ± 5%  18.2MB/s ± 5%    ~     (p=0.493 n=50+50)
RegexpMatchEasy0_32-4     410MB/s ± 2%   413MB/s ± 3%  +0.86%  (p=0.004 n=50+50)
RegexpMatchEasy0_1K-4    4.39GB/s ± 3%  4.38GB/s ± 3%    ~     (p=0.063 n=50+50)
RegexpMatchEasy1_32-4     432MB/s ± 3%   436MB/s ± 3%  +1.07%  (p=0.000 n=50+50)
RegexpMatchEasy1_1K-4    2.77GB/s ± 2%  2.81GB/s ± 4%  +1.46%  (p=0.000 n=36+50)
RegexpMatchMedium_32-4   9.16MB/s ± 3%  9.35MB/s ± 4%  +2.09%  (p=0.001 n=50+50)
RegexpMatchMedium_1K-4   32.5MB/s ± 3%  33.2MB/s ± 3%  +2.25%  (p=0.000 n=50+50)
RegexpMatchHard_32-4     20.4MB/s ± 3%  20.5MB/s ± 2%  +0.56%  (p=0.017 n=50+50)
RegexpMatchHard_1K-4     21.6MB/s ± 4%  21.8MB/s ± 3%  +0.83%  (p=0.008 n=50+50)
Revcomp-4                 613MB/s ± 4%   618MB/s ± 7%    ~     (p=0.152 n=48+50)
Template-4               30.2MB/s ± 4%  30.9MB/s ± 3%  +2.49%  (p=0.000 n=50+50)
[Geo mean]                127MB/s        128MB/s       +0.64%

Change-Id: If405198283855d75697f66cf894b2bef458f620e
Reviewed-on: https://go-review.googlesource.com/135422
Reviewed-by: Keith Randall <khr@golang.org>
2018-09-19 03:00:58 +00:00
Keith Randall
c6118af558 cmd/compile: don't do floating point optimization x+0 -> x
That optimization is not valid if x == -0.

The test is a bit tricky because 0 == -0. We distinguish
0 from -0 with 1/0 == inf, 1/-0 == -inf.

This has been a bug since CL 24790 in Go 1.8. Probably doesn't
warrant a backport.

Fixes #27718

Note: the optimization x-0 -> x is actually valid.
But it's probably best to take it out, so as to not confuse readers.

Change-Id: I99f16a93b45f7406ec8053c2dc759a13eba035fa
Reviewed-on: https://go-review.googlesource.com/135701
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-09-18 20:27:09 +00:00
fanzha02
a19a83c8ef cmd/compile: optimize math.Float64(32)bits and math.Float64(32)frombits on arm64
Use float <-> int register moves without conversion instead of stores
and loads to move float <-> int values.

Math package benchmark results.
name                 old time/op  new time/op  delta
Acosh                 153ns ± 0%   147ns ± 0%   -3.92%  (p=0.000 n=10+10)
Asinh                 183ns ± 0%   177ns ± 0%   -3.28%  (p=0.000 n=10+10)
Atanh                 157ns ± 0%   155ns ± 0%   -1.27%  (p=0.000 n=10+10)
Atan2                 118ns ± 0%   117ns ± 1%   -0.59%  (p=0.003 n=10+10)
Cbrt                  119ns ± 0%   114ns ± 0%   -4.20%  (p=0.000 n=10+10)
Copysign             7.51ns ± 0%  6.51ns ± 0%  -13.32%  (p=0.000 n=9+10)
Cos                  73.1ns ± 0%  70.6ns ± 0%   -3.42%  (p=0.000 n=10+10)
Cosh                  119ns ± 0%   121ns ± 0%   +1.68%  (p=0.000 n=10+9)
ExpGo                 154ns ± 0%   149ns ± 0%   -3.05%  (p=0.000 n=9+10)
Expm1                 101ns ± 0%    99ns ± 0%   -1.88%  (p=0.000 n=10+10)
Exp2Go                150ns ± 0%   146ns ± 0%   -2.67%  (p=0.000 n=10+10)
Abs                  7.01ns ± 0%  6.01ns ± 0%  -14.27%  (p=0.000 n=10+9)
Mod                   234ns ± 0%   212ns ± 0%   -9.40%  (p=0.000 n=9+10)
Frexp                34.5ns ± 0%  30.0ns ± 0%  -13.04%  (p=0.000 n=10+10)
Gamma                 112ns ± 0%   111ns ± 0%   -0.89%  (p=0.000 n=10+10)
Hypot                73.6ns ± 0%  68.6ns ± 0%   -6.79%  (p=0.000 n=10+10)
HypotGo              77.1ns ± 0%  72.1ns ± 0%   -6.49%  (p=0.000 n=10+10)
Ilogb                31.0ns ± 0%  28.0ns ± 0%   -9.68%  (p=0.000 n=10+10)
J0                    437ns ± 0%   434ns ± 0%   -0.62%  (p=0.000 n=10+10)
J1                    433ns ± 0%   431ns ± 0%   -0.46%  (p=0.000 n=10+10)
Jn                    927ns ± 0%   922ns ± 0%   -0.54%  (p=0.000 n=10+10)
Ldexp                41.5ns ± 0%  37.0ns ± 0%  -10.84%  (p=0.000 n=9+10)
Log                   124ns ± 0%   118ns ± 0%   -4.84%  (p=0.000 n=10+9)
Logb                 34.0ns ± 0%  32.0ns ± 0%   -5.88%  (p=0.000 n=10+10)
Log1p                 110ns ± 0%   108ns ± 0%   -1.82%  (p=0.000 n=10+10)
Log10                 136ns ± 0%   132ns ± 0%   -2.94%  (p=0.000 n=10+10)
Log2                 51.6ns ± 0%  47.1ns ± 0%   -8.72%  (p=0.000 n=10+10)
Nextafter32          33.0ns ± 0%  30.5ns ± 0%   -7.58%  (p=0.000 n=10+10)
Nextafter64          29.0ns ± 0%  26.5ns ± 0%   -8.62%  (p=0.000 n=10+10)
PowInt                169ns ± 0%   160ns ± 0%   -5.33%  (p=0.000 n=10+10)
PowFrac               375ns ± 0%   361ns ± 0%   -3.73%  (p=0.000 n=10+10)
RoundToEven          14.0ns ± 0%  12.5ns ± 0%  -10.71%  (p=0.000 n=10+10)
Remainder             206ns ± 0%   192ns ± 0%   -6.80%  (p=0.000 n=10+9)
Signbit              6.01ns ± 0%  5.51ns ± 0%   -8.32%  (p=0.000 n=10+9)
Sin                  70.1ns ± 0%  69.6ns ± 0%   -0.71%  (p=0.000 n=10+10)
Sincos               99.1ns ± 0%  99.6ns ± 0%   +0.50%  (p=0.000 n=9+10)
SqrtGoLatency         178ns ± 0%   146ns ± 0%  -17.70%  (p=0.000 n=8+10)
SqrtPrime            9.19µs ± 0%  9.20µs ± 0%   +0.01%  (p=0.000 n=9+9)
Tanh                  125ns ± 1%   127ns ± 0%   +1.36%  (p=0.000 n=10+10)
Y0                    428ns ± 0%   426ns ± 0%   -0.47%  (p=0.000 n=10+10)
Y1                    431ns ± 0%   429ns ± 0%   -0.46%  (p=0.000 n=10+9)
Yn                    906ns ± 0%   901ns ± 0%   -0.55%  (p=0.000 n=10+10)
Float64bits          4.50ns ± 0%  3.50ns ± 0%  -22.22%  (p=0.000 n=10+10)
Float64frombits      4.00ns ± 0%  3.50ns ± 0%  -12.50%  (p=0.000 n=10+9)
Float32bits          4.50ns ± 0%  3.50ns ± 0%  -22.22%  (p=0.002 n=8+10)
Float32frombits      4.00ns ± 0%  3.50ns ± 0%  -12.50%  (p=0.000 n=10+10)

Change-Id: Iba829e15d5624962fe0c699139ea783efeefabc2
Reviewed-on: https://go-review.googlesource.com/129715
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-09-17 20:49:04 +00:00
Iskander Sharipov
859cf7fc0f cmd/compile/internal/gc: handle array slice self-assign in esc.go
Instead of skipping all OSLICEARR, skip only ones with non-pointer
array type. For pointers to arrays, it's safe to apply the
self-assignment slicing optimizations.

Refactored the matching code into separate function for readability.

This is an extension to already existing optimization.

On its own, it does not improve any code under std, but
it opens some new optimization opportunities. One
of them is described in the referenced issue.

Updates #7921

Change-Id: I08ac660d3ef80eb15fd7933fb73cf53ded9333ad
Reviewed-on: https://go-review.googlesource.com/133375
Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-09-17 11:14:58 +00:00
Keith Randall
b1f656b1ce cmd/compile: fold address calculations into CMPload[const] ops
Makes go binary smaller by 0.2%.

I noticed this in autogenerated equal methods, and there are
probably a lot of those.

Change-Id: I4e04eb3653fbceb9dd6a4eee97ceab1fa4d10b72
Reviewed-on: https://go-review.googlesource.com/135379
Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>
2018-09-14 19:42:09 +00:00
Lynn Boger
8dbd9afbb0 cmd/compile: improve rules for PPC64.rules
This adds some improvements to the rules for PPC64 to eliminate
unnecessary zero or sign extends, and fix some rule for truncates
which were not always using the correct sign instruction.

This reduces of size of many functions by 1 or 2 instructions and
can improve performance in cases where the execution time depends
on small loops where at least 1 instruction was removed and where that
loop contributes a significant amount of the total execution time.

Included is a testcase for codegen to verify the sign/zero extend
instructions are omitted.

An example of the improvement (strings):
IndexAnyASCII/256:1-16     392ns ± 0%   369ns ± 0%  -5.79%  (p=0.000 n=1+10)
IndexAnyASCII/256:2-16     397ns ± 0%   376ns ± 0%  -5.23%  (p=0.000 n=1+9)
IndexAnyASCII/256:4-16     405ns ± 0%   384ns ± 0%  -5.19%  (p=1.714 n=1+6)
IndexAnyASCII/256:8-16     427ns ± 0%   403ns ± 0%  -5.57%  (p=0.000 n=1+10)
IndexAnyASCII/256:16-16    441ns ± 0%   418ns ± 1%  -5.33%  (p=0.000 n=1+10)
IndexAnyASCII/4096:1-16   5.62µs ± 0%  5.27µs ± 1%  -6.31%  (p=0.000 n=1+10)
IndexAnyASCII/4096:2-16   5.67µs ± 0%  5.29µs ± 0%  -6.67%  (p=0.222 n=1+8)
IndexAnyASCII/4096:4-16   5.66µs ± 0%  5.28µs ± 1%  -6.66%  (p=0.000 n=1+10)
IndexAnyASCII/4096:8-16   5.66µs ± 0%  5.31µs ± 1%  -6.10%  (p=0.000 n=1+10)
IndexAnyASCII/4096:16-16  5.70µs ± 0%  5.33µs ± 1%  -6.43%  (p=0.182 n=1+10)

Change-Id: I739a6132b505936d39001aada5a978ff2a5f0500
Reviewed-on: https://go-review.googlesource.com/129875
Reviewed-by: David Chase <drchase@google.com>
2018-09-13 18:24:53 +00:00
erifan01
8149db4f64 cmd/compile: intrinsify math.RoundToEven and math.Abs on arm64
math.RoundToEven can be done by one arm64 instruction FRINTND, intrinsify it to improve performance.
The current pure Go implementation of the function Abs is translated into five instructions on arm64:
str, ldr, and, str, ldr. The intrinsic implementation requires only one instruction, so in terms of
performance, intrinsify it is worthwhile.

Benchmarks:
name           old time/op  new time/op  delta
Abs-8          3.50ns ± 0%  1.50ns ± 0%  -57.14%  (p=0.000 n=10+10)
RoundToEven-8  9.26ns ± 0%  1.50ns ± 0%  -83.80%  (p=0.000 n=10+10)

Change-Id: I9456b26ab282b544dfac0154fc86f17aed96ac3d
Reviewed-on: https://go-review.googlesource.com/116535
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-09-13 14:52:51 +00:00
fanzha02
d5377c2026 test: fix the wrong test of math.Copysign(c, -1) for arm64
The CL 132915 added the wrong codegen test for math.Copysign(c, -1),
it should test that AND is not emitted. This CL fixes this error.

Change-Id: Ida1d3d54ebfc7f238abccbc1f70f914e1b5bfd91
Reviewed-on: https://go-review.googlesource.com/134815
Reviewed-by: Giovanni Bajo <rasky@develer.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Giovanni Bajo <rasky@develer.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-09-12 15:34:20 +00:00
Ben Shi
9f2411894b cmd/compile: optimize arm's bit operation
BFC (Bit Field Clear) was introduced in ARMv7, which can simplify
ANDconst and BICconst. And this CL implements that optimization.

1. The total size of pkg/android_arm decreases about 3KB, excluding
cmd/compile/.

2. There is no regression in the go1 benchmark result, and some
cases (FmtFprintfEmpty-4 and RegexpMatchMedium_32-4) even get
slight improvement.

name                     old time/op    new time/op    delta
BinaryTree17-4              25.3s ± 1%     25.2s ± 1%    ~     (p=0.072 n=30+29)
Fannkuch11-4                13.3s ± 0%     13.3s ± 0%  +0.13%  (p=0.000 n=30+26)
FmtFprintfEmpty-4           407ns ± 0%     394ns ± 0%  -3.19%  (p=0.000 n=26+28)
FmtFprintfString-4          664ns ± 0%     662ns ± 0%  -0.22%  (p=0.000 n=30+30)
FmtFprintfInt-4             712ns ± 0%     706ns ± 0%  -0.79%  (p=0.000 n=30+30)
FmtFprintfIntInt-4         1.06µs ± 0%    1.05µs ± 0%  -0.38%  (p=0.000 n=30+30)
FmtFprintfPrefixedInt-4    1.16µs ± 0%    1.16µs ± 0%  -0.13%  (p=0.000 n=30+29)
FmtFprintfFloat-4          2.24µs ± 0%    2.23µs ± 0%  -0.51%  (p=0.000 n=29+21)
FmtManyArgs-4              4.09µs ± 0%    4.06µs ± 0%  -0.83%  (p=0.000 n=28+30)
GobDecode-4                55.0ms ± 5%    55.4ms ± 5%    ~     (p=0.307 n=30+30)
GobEncode-4                51.2ms ± 1%    51.9ms ± 1%  +1.23%  (p=0.000 n=29+30)
Gzip-4                      2.64s ± 0%     2.60s ± 0%  -1.35%  (p=0.000 n=30+29)
Gunzip-4                    309ms ± 0%     308ms ± 0%  -0.27%  (p=0.000 n=30+30)
HTTPClientServer-4         1.03ms ± 5%    1.02ms ± 4%    ~     (p=0.117 n=30+29)
JSONEncode-4                101ms ± 2%     101ms ± 2%    ~     (p=0.338 n=29+29)
JSONDecode-4                383ms ± 2%     382ms ± 2%    ~     (p=0.751 n=26+30)
Mandelbrot200-4            18.4ms ± 0%    18.4ms ± 0%  -0.10%  (p=0.000 n=29+29)
GoParse-4                  22.6ms ± 0%    22.5ms ± 0%  -0.39%  (p=0.000 n=30+30)
RegexpMatchEasy0_32-4       761ns ± 0%     750ns ± 0%  -1.47%  (p=0.000 n=26+29)
RegexpMatchEasy0_1K-4      4.33µs ± 0%    4.34µs ± 0%  +0.27%  (p=0.000 n=25+28)
RegexpMatchEasy1_32-4       809ns ± 0%     795ns ± 0%  -1.74%  (p=0.000 n=27+25)
RegexpMatchEasy1_1K-4      5.54µs ± 0%    5.53µs ± 0%  -0.18%  (p=0.000 n=29+29)
RegexpMatchMedium_32-4     1.11µs ± 0%    1.08µs ± 0%  -2.78%  (p=0.000 n=27+29)
RegexpMatchMedium_1K-4      255µs ± 0%     255µs ± 0%  -0.02%  (p=0.029 n=30+30)
RegexpMatchHard_32-4       14.7µs ± 0%    14.7µs ± 0%  -0.28%  (p=0.000 n=30+29)
RegexpMatchHard_1K-4        439µs ± 0%     439µs ± 0%    ~     (p=0.907 n=23+27)
Revcomp-4                  41.9ms ± 1%    41.9ms ± 1%    ~     (p=0.230 n=28+30)
Template-4                  522ms ± 1%     528ms ± 1%  +1.25%  (p=0.000 n=30+30)
TimeParse-4                3.34µs ± 0%    3.35µs ± 0%  +0.23%  (p=0.000 n=30+27)
TimeFormat-4               6.06µs ± 0%    6.13µs ± 0%  +1.08%  (p=0.000 n=29+29)
[Geo mean]                  384µs          382µs       -0.37%

name                     old speed      new speed      delta
GobDecode-4              14.0MB/s ± 5%  13.9MB/s ± 5%    ~     (p=0.308 n=30+30)
GobEncode-4              15.0MB/s ± 1%  14.8MB/s ± 1%  -1.22%  (p=0.000 n=29+30)
Gzip-4                   7.36MB/s ± 0%  7.46MB/s ± 0%  +1.35%  (p=0.000 n=30+30)
Gunzip-4                 62.8MB/s ± 0%  63.0MB/s ± 0%  +0.27%  (p=0.000 n=30+30)
JSONEncode-4             19.2MB/s ± 2%  19.2MB/s ± 2%    ~     (p=0.312 n=29+29)
JSONDecode-4             5.05MB/s ± 3%  5.08MB/s ± 2%    ~     (p=0.356 n=29+30)
GoParse-4                2.56MB/s ± 0%  2.57MB/s ± 0%  +0.39%  (p=0.000 n=23+27)
RegexpMatchEasy0_32-4    42.0MB/s ± 0%  42.6MB/s ± 0%  +1.50%  (p=0.000 n=26+28)
RegexpMatchEasy0_1K-4     236MB/s ± 0%   236MB/s ± 0%  -0.27%  (p=0.000 n=25+28)
RegexpMatchEasy1_32-4    39.6MB/s ± 0%  40.2MB/s ± 0%  +1.73%  (p=0.000 n=27+27)
RegexpMatchEasy1_1K-4     185MB/s ± 0%   185MB/s ± 0%  +0.18%  (p=0.000 n=29+29)
RegexpMatchMedium_32-4    900kB/s ± 0%   920kB/s ± 0%  +2.22%  (p=0.000 n=29+29)
RegexpMatchMedium_1K-4   4.02MB/s ± 0%  4.02MB/s ± 0%  +0.07%  (p=0.004 n=30+27)
RegexpMatchHard_32-4     2.17MB/s ± 0%  2.18MB/s ± 0%  +0.46%  (p=0.000 n=30+26)
RegexpMatchHard_1K-4     2.33MB/s ± 0%  2.33MB/s ± 0%    ~     (all equal)
Revcomp-4                60.6MB/s ± 1%  60.7MB/s ± 1%    ~     (p=0.207 n=28+30)
Template-4               3.72MB/s ± 1%  3.67MB/s ± 1%  -1.23%  (p=0.000 n=30+30)
[Geo mean]               12.9MB/s       12.9MB/s       +0.29%

Change-Id: I07f497f8bb476c950dc555491d00c9066fb64a4e
Reviewed-on: https://go-review.googlesource.com/134232
Run-TryBot: Ben Shi <powerman1st@163.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-09-11 14:37:51 +00:00
Iskander Sharipov
b7f9c640b7 test: extend noescape bytes.Buffer test suite
Added some more cases that should be guarded against regression.

Change-Id: I9f1dda2fd0be9b6e167ef1cc018fc8cce55c066c
Reviewed-on: https://go-review.googlesource.com/134017
Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-09-07 18:49:12 +00:00
erifan01
204cc14bdd cmd/compile: implement non-constant rotates using ROR on arm64
Add some rules to match the Go code like:
	y &= 63
	x << y | x >> (64-y)
or
	y &= 63
	x >> y | x << (64-y)
as a ROR instruction. Make math/bits.RotateLeft faster on arm64.

Extends CL 132435 to arm64.

Benchmarks of math/bits.RotateLeftxxN:
name            old time/op       new time/op       delta
RotateLeft-8    3.548750ns +- 1%  2.003750ns +- 0%  -43.54%  (p=0.000 n=8+8)
RotateLeft8-8   3.925000ns +- 0%  3.925000ns +- 0%     ~     (p=1.000 n=8+8)
RotateLeft16-8  3.925000ns +- 0%  3.927500ns +- 0%     ~     (p=0.608 n=8+8)
RotateLeft32-8  3.925000ns +- 0%  2.002500ns +- 0%  -48.98%  (p=0.000 n=8+8)
RotateLeft64-8  3.536250ns +- 0%  2.003750ns +- 0%  -43.34%  (p=0.000 n=8+8)

Change-Id: I77622cd7f39b917427e060647321f5513973232c
Reviewed-on: https://go-review.googlesource.com/122542
Run-TryBot: Ben Shi <powerman1st@163.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-09-07 14:52:02 +00:00
Ben Shi
031a35ec84 cmd/compile: optimize 386's comparison
Optimization of "(CMPconst [0] (ANDL x y)) -> (TESTL x y)" only
get benefits if there is no further use of the result of x&y. A
condition of uses==1 will have slight improvements.

1. The code size of pkg/linux_386 decreases about 300 bytes, excluding
cmd/compile/.

2. The go1 benchmark shows no regression, and even a slight improvement
in test case FmtFprintfEmpty-4, excluding noise.

name                     old time/op    new time/op    delta
BinaryTree17-4              3.34s ± 3%     3.32s ± 2%    ~     (p=0.197 n=30+30)
Fannkuch11-4                3.48s ± 2%     3.47s ± 1%  -0.33%  (p=0.015 n=30+30)
FmtFprintfEmpty-4          46.3ns ± 4%    44.8ns ± 4%  -3.33%  (p=0.000 n=30+30)
FmtFprintfString-4         78.8ns ± 7%    77.3ns ± 5%    ~     (p=0.098 n=30+26)
FmtFprintfInt-4            90.2ns ± 1%    90.0ns ± 7%  -0.23%  (p=0.027 n=18+30)
FmtFprintfIntInt-4          144ns ± 4%     143ns ± 5%    ~     (p=0.945 n=30+29)
FmtFprintfPrefixedInt-4     180ns ± 4%     180ns ± 5%    ~     (p=0.858 n=30+30)
FmtFprintfFloat-4           409ns ± 4%     406ns ± 3%  -0.87%  (p=0.028 n=30+30)
FmtManyArgs-4               611ns ± 5%     608ns ± 4%    ~     (p=0.812 n=30+30)
GobDecode-4                7.30ms ± 5%    7.26ms ± 5%    ~     (p=0.522 n=30+29)
GobEncode-4                6.90ms ± 7%    6.82ms ± 4%    ~     (p=0.086 n=29+28)
Gzip-4                      396ms ± 4%     400ms ± 4%  +0.99%  (p=0.026 n=30+30)
Gunzip-4                   41.1ms ± 3%    41.2ms ± 3%    ~     (p=0.495 n=30+30)
HTTPClientServer-4         63.7µs ± 3%    63.3µs ± 2%    ~     (p=0.113 n=29+29)
JSONEncode-4               16.1ms ± 2%    16.1ms ± 2%  -0.30%  (p=0.041 n=30+30)
JSONDecode-4               60.9ms ± 3%    61.2ms ± 6%    ~     (p=0.187 n=30+30)
Mandelbrot200-4            5.17ms ± 2%    5.19ms ± 3%    ~     (p=0.676 n=30+30)
GoParse-4                  3.28ms ± 3%    3.25ms ± 2%  -0.97%  (p=0.002 n=30+30)
RegexpMatchEasy0_32-4       103ns ± 4%     104ns ± 4%    ~     (p=0.352 n=30+30)
RegexpMatchEasy0_1K-4       849ns ± 2%     845ns ± 2%    ~     (p=0.381 n=30+30)
RegexpMatchEasy1_32-4       113ns ± 4%     113ns ± 4%    ~     (p=0.795 n=30+30)
RegexpMatchEasy1_1K-4      1.03µs ± 3%    1.03µs ± 4%    ~     (p=0.275 n=25+30)
RegexpMatchMedium_32-4      132ns ± 3%     132ns ± 3%    ~     (p=0.970 n=30+30)
RegexpMatchMedium_1K-4     41.4µs ± 3%    41.4µs ± 3%    ~     (p=0.212 n=30+30)
RegexpMatchHard_32-4       2.22µs ± 4%    2.22µs ± 4%    ~     (p=0.399 n=30+30)
RegexpMatchHard_1K-4       67.2µs ± 3%    67.6µs ± 4%    ~     (p=0.359 n=30+30)
Revcomp-4                   1.84s ± 2%     1.83s ± 2%    ~     (p=0.532 n=30+30)
Template-4                 69.1ms ± 4%    68.8ms ± 3%    ~     (p=0.146 n=30+30)
TimeParse-4                 441ns ± 3%     442ns ± 3%    ~     (p=0.154 n=30+30)
TimeFormat-4                413ns ± 3%     414ns ± 3%    ~     (p=0.275 n=30+30)
[Geo mean]                 66.2µs         66.0µs       -0.28%

name                     old speed      new speed      delta
GobDecode-4               105MB/s ± 5%   106MB/s ± 5%    ~     (p=0.514 n=30+29)
GobEncode-4               111MB/s ± 5%   113MB/s ± 4%  +1.37%  (p=0.046 n=28+28)
Gzip-4                   49.1MB/s ± 4%  48.6MB/s ± 4%  -0.98%  (p=0.028 n=30+30)
Gunzip-4                  472MB/s ± 4%   472MB/s ± 3%    ~     (p=0.496 n=30+30)
JSONEncode-4              120MB/s ± 2%   121MB/s ± 2%  +0.29%  (p=0.042 n=30+30)
JSONDecode-4             31.9MB/s ± 3%  31.7MB/s ± 6%    ~     (p=0.186 n=30+30)
GoParse-4                17.6MB/s ± 3%  17.8MB/s ± 2%  +0.98%  (p=0.002 n=30+30)
RegexpMatchEasy0_32-4     309MB/s ± 4%   307MB/s ± 4%    ~     (p=0.501 n=30+30)
RegexpMatchEasy0_1K-4    1.21GB/s ± 2%  1.21GB/s ± 2%    ~     (p=0.301 n=30+30)
RegexpMatchEasy1_32-4     283MB/s ± 4%   282MB/s ± 3%    ~     (p=0.877 n=30+30)
RegexpMatchEasy1_1K-4    1.00GB/s ± 3%  0.99GB/s ± 4%    ~     (p=0.276 n=25+30)
RegexpMatchMedium_32-4   7.54MB/s ± 3%  7.55MB/s ± 3%    ~     (p=0.528 n=30+30)
RegexpMatchMedium_1K-4   24.7MB/s ± 3%  24.7MB/s ± 3%    ~     (p=0.203 n=30+30)
RegexpMatchHard_32-4     14.4MB/s ± 4%  14.4MB/s ± 4%    ~     (p=0.407 n=30+30)
RegexpMatchHard_1K-4     15.3MB/s ± 3%  15.1MB/s ± 4%    ~     (p=0.306 n=30+30)
Revcomp-4                 138MB/s ± 2%   139MB/s ± 2%    ~     (p=0.520 n=30+30)
Template-4               28.1MB/s ± 4%  28.2MB/s ± 3%    ~     (p=0.149 n=30+30)
[Geo mean]               81.5MB/s       81.5MB/s       +0.06%

Change-Id: I7f75425f79eec93cdd8fdd94db13ad4f61b6a2f5
Reviewed-on: https://go-review.googlesource.com/133657
Run-TryBot: Ben Shi <powerman1st@163.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-09-07 01:20:45 +00:00
fanzha02
2e5c32518c cmd/compile: optimize math.Copysign on arm64
Add rewrite rules to optimize math.Copysign() when the second
argument is negative floating point constant.

For example, math.Copysign(c, -2): The previous compile output is
"AND $9223372036854775807, R0, R0; ORR $-9223372036854775808, R0, R0".
The optimized compile output is "ORR $-9223372036854775808, R0, R0"

Math package benchmark results.
name                   old time/op  new time/op  delta
Copysign-8             2.61ns ± 2%  2.49ns ± 0%  -4.55%  (p=0.000 n=10+10)
Cos-8                  43.0ns ± 0%  41.5ns ± 0%  -3.49%  (p=0.000 n=10+10)
Cosh-8                 98.6ns ± 0%  98.1ns ± 0%  -0.51%  (p=0.000 n=10+10)
ExpGo-8                 107ns ± 0%   105ns ± 0%  -1.87%  (p=0.000 n=10+10)
Exp2Go-8                100ns ± 0%   100ns ± 0%  +0.39%  (p=0.000 n=10+8)
Max-8                  6.56ns ± 2%  6.45ns ± 1%  -1.63%  (p=0.002 n=10+10)
Min-8                  6.66ns ± 3%  6.47ns ± 2%  -2.82%  (p=0.006 n=10+10)
Mod-8                   107ns ± 1%   104ns ± 1%  -2.72%  (p=0.000 n=10+10)
Frexp-8                11.5ns ± 1%  11.0ns ± 0%  -4.56%  (p=0.000 n=8+10)
HypotGo-8              19.4ns ± 0%  19.4ns ± 0%  +0.36%  (p=0.019 n=10+10)
Ilogb-8                8.63ns ± 0%  8.51ns ± 0%  -1.36%  (p=0.000 n=10+10)
Jn-8                    584ns ± 0%   585ns ± 0%  +0.17%  (p=0.000 n=7+8)
Ldexp-8                13.8ns ± 0%  13.5ns ± 0%  -2.17%  (p=0.002 n=8+10)
Logb-8                 10.2ns ± 0%   9.9ns ± 0%  -2.65%  (p=0.000 n=10+7)
Nextafter64-8          7.54ns ± 0%  7.51ns ± 0%  -0.37%  (p=0.000 n=10+10)
Remainder-8            73.5ns ± 1%  70.4ns ± 1%  -4.27%  (p=0.000 n=10+10)
SqrtGoLatency-8        79.6ns ± 0%  76.2ns ± 0%  -4.30%  (p=0.000 n=9+10)
Yn-8                    582ns ± 0%   579ns ± 0%  -0.52%  (p=0.000 n=10+10)

Change-Id: I0c9cd1ea87435e7b8bab94b4e79e6e29785f25b1
Reviewed-on: https://go-review.googlesource.com/132915
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-09-06 19:57:25 +00:00
Iskander Sharipov
9c2be4c22d bytes: remove bootstrap array from Buffer
Rationale: small buffer optimization does not work and it has
made things slower since 2014. Until we can make it work,
we should prefer simpler code that also turns out to be more
efficient.

With this change, it's possible to use
NewBuffer(make([]byte, 0, bootstrapSize)) to get the desired
stack-allocated initial buffer since escape analysis can
prove the created slice to be non-escaping.

New implementation key points:

    - Zero value bytes.Buffer performs better than before
    - You can have a truly stack-allocated buffer, and it's not even limited to 64 bytes
    - The unsafe.Sizeof(bytes.Buffer{}) is reduced significantly
    - Empty writes don't cause allocations

Buffer benchmarks from bytes package:

    name                       old time/op    new time/op    delta
    ReadString-8                 9.20µs ± 1%    9.22µs ± 1%     ~     (p=0.148 n=10+10)
    WriteByte-8                  28.1µs ± 0%    26.2µs ± 0%   -6.78%  (p=0.000 n=10+10)
    WriteRune-8                  64.9µs ± 0%    65.0µs ± 0%   +0.16%  (p=0.000 n=10+10)
    BufferNotEmptyWriteRead-8     469µs ± 0%     461µs ± 0%   -1.76%  (p=0.000 n=9+10)
    BufferFullSmallReads-8        108µs ± 0%     108µs ± 0%   -0.21%  (p=0.000 n=10+10)

    name                       old speed      new speed      delta
    ReadString-8               3.56GB/s ± 1%  3.55GB/s ± 1%     ~     (p=0.165 n=10+10)
    WriteByte-8                 146MB/s ± 0%   156MB/s ± 0%   +7.26%  (p=0.000 n=9+10)
    WriteRune-8                 189MB/s ± 0%   189MB/s ± 0%   -0.16%  (p=0.000 n=10+10)

    name                       old alloc/op   new alloc/op   delta
    ReadString-8                 32.8kB ± 0%    32.8kB ± 0%     ~     (all equal)
    WriteByte-8                   0.00B          0.00B          ~     (all equal)
    WriteRune-8                   0.00B          0.00B          ~     (all equal)
    BufferNotEmptyWriteRead-8    4.72kB ± 0%    4.67kB ± 0%   -1.02%  (p=0.000 n=10+10)
    BufferFullSmallReads-8       3.44kB ± 0%    3.33kB ± 0%   -3.26%  (p=0.000 n=10+10)

    name                       old allocs/op  new allocs/op  delta
    ReadString-8                   1.00 ± 0%      1.00 ± 0%     ~     (all equal)
    WriteByte-8                    0.00           0.00          ~     (all equal)
    WriteRune-8                    0.00           0.00          ~     (all equal)
    BufferNotEmptyWriteRead-8      3.00 ± 0%      3.00 ± 0%     ~     (all equal)
    BufferFullSmallReads-8         3.00 ± 0%      2.00 ± 0%  -33.33%  (p=0.000 n=10+10)

The most notable thing in go1 benchmarks is reduced allocs in HTTPClientServer (-1 alloc):

    HTTPClientServer-8           64.0 ± 0%      63.0 ± 0%  -1.56%  (p=0.000 n=10+10)

For more explanations and benchmarks see the referenced issue.

Updates #7921

Change-Id: Ica0bf85e1b70fb4f5dc4f6a61045e2cf4ef72aa3
Reviewed-on: https://go-review.googlesource.com/133715
Reviewed-by: Martin Möhrmann <moehrmann@google.com>
Reviewed-by: Robert Griesemer <gri@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-09-06 19:33:18 +00:00
taylorza
4a095b87d3 cmd/compile: don't crash reporting misuse of shadowed built-in function
The existing implementation causes a compiler panic if a function parameter shadows a built-in function, and then calling that shadowed name.

Fixes #27356
Change-Id: I1ffb6dc01e63c7f499e5f6f75f77ce2318f35bcd
Reviewed-on: https://go-review.googlesource.com/132876
Reviewed-by: Robert Griesemer <gri@golang.org>
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-09-06 02:49:21 +00:00
Ben Shi
067dfce21f cmd/compile: optimize ARM's comparision
Optimize (CMPconst [0] (ADD x y)) to (CMN x y) will only get benefits
when the result of the addition is no longer used, otherwise there
might be even performance drop. And this CL fixes that issue for
CMP/CMN/TST/TEQ.

There is little regression in the go1 benchmark (excluding noise),
and the test case JSONDecode-4 even gets improvement.

name                     old time/op    new time/op    delta
BinaryTree17-4              21.6s ± 1%     21.6s ± 0%  -0.22%  (p=0.013 n=30+30)
Fannkuch11-4                11.1s ± 0%     11.1s ± 0%  +0.11%  (p=0.000 n=30+29)
FmtFprintfEmpty-4           297ns ± 0%     297ns ± 0%  +0.08%  (p=0.007 n=26+28)
FmtFprintfString-4          589ns ± 1%     589ns ± 0%    ~     (p=0.659 n=30+25)
FmtFprintfInt-4             644ns ± 1%     650ns ± 0%  +0.88%  (p=0.000 n=30+24)
FmtFprintfIntInt-4          964ns ± 0%     977ns ± 0%  +1.33%  (p=0.000 n=30+30)
FmtFprintfPrefixedInt-4    1.06µs ± 0%    1.07µs ± 0%  +1.31%  (p=0.000 n=29+27)
FmtFprintfFloat-4          1.89µs ± 0%    1.92µs ± 0%  +1.25%  (p=0.000 n=29+29)
FmtManyArgs-4              3.63µs ± 0%    3.67µs ± 0%  +1.33%  (p=0.000 n=29+27)
GobDecode-4                38.1ms ± 1%    37.9ms ± 1%  -0.60%  (p=0.000 n=29+29)
GobEncode-4                35.3ms ± 2%    35.2ms ± 1%    ~     (p=0.286 n=30+30)
Gzip-4                      2.36s ± 0%     2.37s ± 2%    ~     (p=0.277 n=24+28)
Gunzip-4                    264ms ± 1%     264ms ± 1%    ~     (p=0.104 n=28+30)
HTTPClientServer-4         1.04ms ± 4%    1.02ms ± 4%  -1.65%  (p=0.000 n=28+28)
JSONEncode-4               78.5ms ± 1%    79.6ms ± 1%  +1.34%  (p=0.000 n=27+28)
JSONDecode-4                379ms ± 4%     352ms ± 5%  -7.09%  (p=0.000 n=29+30)
Mandelbrot200-4            17.6ms ± 0%    17.6ms ± 0%    ~     (p=0.206 n=28+29)
GoParse-4                  21.9ms ± 1%    22.1ms ± 1%  +0.87%  (p=0.000 n=28+26)
RegexpMatchEasy0_32-4       631ns ± 0%     641ns ± 0%  +1.63%  (p=0.000 n=29+30)
RegexpMatchEasy0_1K-4      4.11µs ± 0%    4.11µs ± 0%    ~     (p=0.700 n=30+30)
RegexpMatchEasy1_32-4       670ns ± 0%     679ns ± 0%  +1.37%  (p=0.000 n=21+30)
RegexpMatchEasy1_1K-4      5.31µs ± 0%    5.26µs ± 0%  -1.03%  (p=0.000 n=25+28)
RegexpMatchMedium_32-4      905ns ± 0%     906ns ± 0%  +0.14%  (p=0.001 n=30+30)
RegexpMatchMedium_1K-4      192µs ± 0%     191µs ± 0%  -0.45%  (p=0.000 n=29+27)
RegexpMatchHard_32-4       11.8µs ± 0%    11.7µs ± 0%  -0.39%  (p=0.000 n=29+28)
RegexpMatchHard_1K-4        347µs ± 0%     347µs ± 0%    ~     (p=0.084 n=29+30)
Revcomp-4                  37.5ms ± 1%    37.5ms ± 1%    ~     (p=0.279 n=29+29)
Template-4                  519ms ± 2%     519ms ± 2%    ~     (p=0.652 n=28+29)
TimeParse-4                2.83µs ± 0%    2.78µs ± 0%  -1.90%  (p=0.000 n=27+28)
TimeFormat-4               5.79µs ± 0%    5.60µs ± 0%  -3.23%  (p=0.000 n=29+29)
[Geo mean]                  331µs          330µs       -0.16%

name                     old speed      new speed      delta
GobDecode-4              20.1MB/s ± 1%  20.3MB/s ± 1%  +0.61%  (p=0.000 n=29+29)
GobEncode-4              21.7MB/s ± 2%  21.8MB/s ± 1%    ~     (p=0.294 n=30+30)
Gzip-4                   8.23MB/s ± 1%  8.20MB/s ± 2%    ~     (p=0.099 n=26+28)
Gunzip-4                 73.5MB/s ± 1%  73.4MB/s ± 1%    ~     (p=0.107 n=28+30)
JSONEncode-4             24.7MB/s ± 1%  24.4MB/s ± 1%  -1.32%  (p=0.000 n=27+28)
JSONDecode-4             5.13MB/s ± 4%  5.52MB/s ± 5%  +7.65%  (p=0.000 n=29+30)
GoParse-4                2.65MB/s ± 1%  2.63MB/s ± 1%  -0.87%  (p=0.000 n=28+26)
RegexpMatchEasy0_32-4    50.7MB/s ± 0%  49.9MB/s ± 0%  -1.58%  (p=0.000 n=29+29)
RegexpMatchEasy0_1K-4     249MB/s ± 0%   249MB/s ± 0%    ~     (p=0.342 n=30+28)
RegexpMatchEasy1_32-4    47.7MB/s ± 0%  47.1MB/s ± 0%  -1.39%  (p=0.000 n=26+30)
RegexpMatchEasy1_1K-4     193MB/s ± 0%   195MB/s ± 0%  +1.04%  (p=0.000 n=25+28)
RegexpMatchMedium_32-4   1.10MB/s ± 0%  1.10MB/s ± 0%  -0.42%  (p=0.000 n=30+26)
RegexpMatchMedium_1K-4   5.33MB/s ± 0%  5.36MB/s ± 0%  +0.43%  (p=0.000 n=29+29)
RegexpMatchHard_32-4     2.72MB/s ± 0%  2.73MB/s ± 0%  +0.37%  (p=0.000 n=29+30)
RegexpMatchHard_1K-4     2.95MB/s ± 0%  2.95MB/s ± 0%    ~     (all equal)
Revcomp-4                67.8MB/s ± 1%  67.7MB/s ± 1%    ~     (p=0.273 n=29+29)
Template-4               3.74MB/s ± 2%  3.74MB/s ± 2%    ~     (p=0.665 n=28+29)
[Geo mean]               15.2MB/s       15.2MB/s       +0.21%

Change-Id: Ifed1fb8cc02d5ca52c8bc6c21b6b5bf6dbb2701a
Reviewed-on: https://go-review.googlesource.com/132115
Run-TryBot: Ben Shi <powerman1st@163.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-09-05 14:54:08 +00:00
Iskander Sharipov
4cf33e361a cmd/compile/internal/gc: fix mayAffectMemory in esc.go
For OINDEX and other Left+Right nodes, we want the whole
node to be considered as "may affect memory" if either
of Left or Right affect memory. Initial implementation
only considered node as such if both Left and Right were non-safe.

Change-Id: Icfb965a0b4c24d8f83f3722216db068dad2eba95
Reviewed-on: https://go-review.googlesource.com/133275
Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2018-09-05 14:16:25 +00:00
Iskander Sharipov
bcf3e063cc test: remove go:noinline from escape_because.go
File is compiled with "-l" flag, so go:noinline is redundant.

Change-Id: Ia269f3b9de9466857fc578ba5164613393e82369
Reviewed-on: https://go-review.googlesource.com/133295
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-09-05 10:45:58 +00:00
Tobias Klauser
d7fc2205d4 test: fix nilptr3 check for wasm
CL 131735 only updated nilptr3.go for the adjusted nil check. Adjust
nilptr3_wasm.go as well.

Change-Id: I4a6257d32bb212666fe768dac53901ea0b051138
Reviewed-on: https://go-review.googlesource.com/133495
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-09-05 09:57:32 +00:00
Michael Munday
f94de9c9fb cmd/compile: make math/bits.RotateLeft{32,64} intrinsics on s390x
Extends CL 132435 to s390x. s390x has 32- and 64-bit variable
rotate left instructions.

Change-Id: Ic4f1ebb0e0543207ed2fc8c119e0163b428138a5
Reviewed-on: https://go-review.googlesource.com/133035
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-09-05 08:29:02 +00:00