1
0
mirror of https://github.com/golang/go synced 2024-09-30 06:24:33 -06:00
Commit Graph

3309 Commits

Author SHA1 Message Date
David Chase
d8f60eea64 cmd/compile: enhance induction variable detection for unrolled loops
Would suggest extending capabilities (32-bit, unsigned, etc)
in separate CLs because prove bugs are so mystifying.

This implements the suggestion in this comment
https://go-review.googlesource.com/c/go/+/104041/10/src/cmd/compile/internal/ssa/loopbce.go#164
for inferring properly bounded iteration for loops of the form

for i := K0; i < KNN-(K-1); i += K
for i := K0; i <= KNN-K;    i += K

Where KNN is "known non negative" (i.e., len or cap) and K
is also not negative.  Because i <= KNN-K, i+K <= KNN and
no overflow occurs.

Also handles decreasing case (K1 > 0)
for i := KNN; i >= K0; i -= K1
which works when MININT+K1 < K0
(i.e. MININT < K0-K1, no overflow)

Signed only, also only 64 bit for now.

Change-Id: I5da6015aba2f781ec76c4ad59c9c48d952325fdc
Reviewed-on: https://go-review.googlesource.com/c/go/+/136375
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Alexandru Moșoi <alexandru@mosoi.ro>
2019-03-29 20:08:07 +00:00
zdjones
ddef157813 cmd/compile: make prove use poset to check non-negatives
Prove currently fails to remove bounds checks of the form:

if i >= 0 {              // hint that i is non-negative
    for i < len(data) {  // i becomes Phi in the loop SSA
        _ = data[i]      // data[Phi]; bounds check!!
	i++
    }
}

addIndVarRestrictions fails to identify that the loop induction
variable, (Phi), is non-negative. As a result, the restrictions,
i <= Phi < len(data), are only added for the signed domain. When
testing the bounds check, addBranchRestrictions is similarly unable
to infer that Phi is non-negative. As a result, the restriction,
Phi >= len(data), is only added/tested for the unsigned domain.

This CL changes the isNonNegative method to utilise the factTable's
partially ordered set (poset). It also adds field factTable.zero to
allow isNonNegative to query the poset using the zero(0) constant
found or created early in prove.

Fixes #28956

Change-Id: I792f886c652eeaa339b0d57d5faefbf5922fe44f
Reviewed-on: https://go-review.googlesource.com/c/go/+/161437
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Giovanni Bajo <rasky@develer.com>
2019-03-29 07:17:49 +00:00
Matthew Dempsky
9f6b21caea cmd/compile: fix ICE from invalid operations on float/complex constants
Typechecking treats all untyped numbers as integers for the purposes
of validating operators. However, when I refactoring constant
operation evalution in golang.org/cl/139901, I mistakenly interpreted
that the only invalid case that needed to be preserved was % (modulo)
on floating-point values.

This CL restores the other remaining cases that were dropped from that
CL. It also uses the phrasing "invalid operation" instead of "illegal
constant expression" for better consistency with the rest of
cmd/compile and with go/types.

Lastly, this CL extends setconst to recognize failed constant folding
(e.g., division by zero) so that we can properly mark those
expressions as broken rather than continuing forward with bogus values
that might lead to further spurious errors.

Fixes #31060.

Change-Id: I1ab6491371925e22bc8b95649f1a0eed010abca6
Reviewed-on: https://go-review.googlesource.com/c/go/+/169719
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2019-03-28 17:46:55 +00:00
Than McIntosh
ad8b7a70c5 test: fix fixedbugs/issue30908.go to work with no-opt builder
Update the issue 30908 test to work with the no-opt builder
(this requires a corresponding change in the linker as well).
As part of this change, 'rundir' tests are now linked without
passing "-w" to the linker.

Updates #30908.
Fixes #31034.

Change-Id: Ic776e1607075c295e409e1c8230aaf55a79a6323
Reviewed-on: https://go-review.googlesource.com/c/go/+/169161
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2019-03-27 17:53:36 +00:00
LE Manh Cuong
d47db6dc0c cmd/compile: fix literal struct interface {} lost passing by value
CL 135377 introduces pass strings and slices to convT2{E,I} by value.
Before that CL, all types, except interface will be allocated temporary
address. The CL changes the logic that only constant and type which
needs address (determine by convFuncName) will be allocated.

It fails to cover the case where type is static composite literal.
Adding condition to check that case fixes the issue.

Also, static composite literal node implies constant type, so consttype
checking can be removed.

Fixes #30956

Change-Id: Ifc750a029fb4889c2d06e73e44bf85e6ef4ce881
Reviewed-on: https://go-review.googlesource.com/c/go/+/168858
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2019-03-27 17:28:06 +00:00
Matthew Dempsky
c4f87ed26f cmd/compile: fix "append outside assignment" ICE
Some special-case code paths in order.go didn't expect OCALLFUNC to
have Ninit; in particular, OAS2FUNC and ODEFER/OGO failed to call
o.init on their child OCALLFUNC node. This resulted in not all of the
AST being properly ordered.

This was noticed because order is responsible for introducing an
invariant around how OAPPEND is used, which is enforced by walk.
However, there were perhaps simpler cases (e.g., simple order of
evaluation) that were being silently miscompiled.

Fixes #31010.

Change-Id: Ib928890ab5ec2aebd8e30a030bc2b404387f9123
Reviewed-on: https://go-review.googlesource.com/c/go/+/169257
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2019-03-26 02:20:09 +00:00
Than McIntosh
6582ee9cba test: new test for issue 30908
New test case designed to mimic the code in issue 30908, which
features duplicate but non-indentical DWARF abstract subprogram DIEs.

Updates #30908.

Change-Id: Iacb4b53e6a988e46c801cdac236cef883c553f8f
Reviewed-on: https://go-review.googlesource.com/c/go/+/168957
Run-TryBot: Than McIntosh <thanm@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Reviewed-by: David Chase <drchase@google.com>
2019-03-25 15:00:08 +00:00
Than McIntosh
fbd74a8922 test: support -ldflags for "rundir" tests, new -P option
For "rundir" tests, allow users to add in linker flags as well as
compiler flags, e.g.

// rundir -m -ldflags -w

The directive above will pass "-m" to the compiler on each package compilation
and "-w" to the linker for the final link.

In addition, if "-P" is specified with 'rundir', then for each compile
pass in "-p <X>" to set the packagepath explicitly, which is closer to
how the compiler is run by 'go build'.

Change-Id: I04720011a89d1bd8dcb4f2ccb4af1d74f6a01da1
Reviewed-on: https://go-review.googlesource.com/c/go/+/168977
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2019-03-25 14:59:58 +00:00
erifan01
d0cbf9bf53 cmd/compile: follow up intrinsifying math/bits.Add64 for arm64
This CL deals with the additional comments of CL 159017.

Change-Id: I4ad3c60c834646d58dc0c544c741b92bfe83fb8b
Reviewed-on: https://go-review.googlesource.com/c/go/+/168857
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2019-03-22 15:09:47 +00:00
Cherry Zhang
f23c601bf9 cmd/compile: copy volatile values before emitting write barrier call
It is possible that a "volatile" value (one that can be clobbered
by preparing args of a call) to be used in multiple write barrier
calls. We used to copy the volatile value right before each call.
But this doesn't work if the value is used the second time, after
the first call where it is already clobbered. Copy it before
emitting any call.

Fixes #30977.

Change-Id: Iedcc91ad848d5ded547bf37a8359c125d32e994c
Reviewed-on: https://go-review.googlesource.com/c/go/+/168677
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2019-03-21 21:05:13 +00:00
Carlos Eduardo Seo
3023d7da49 cmd/compile/internal, cmd/internal/obj/ppc64: generate new count trailing zeros instructions on POWER9
This change adds new POWER9 instructions for counting trailing zeros (CNTTZW/CNTTZD)
to the assembler and generates them in SSA when GOPPC64=power9.

name                 old time/op  new time/op  delta
TrailingZeros-160    1.59ns ±20%  1.45ns ±10%  -8.81%  (p=0.000 n=14+13)
TrailingZeros8-160   1.55ns ±23%  1.62ns ±44%    ~     (p=0.593 n=13+15)
TrailingZeros16-160  1.78ns ±23%  1.62ns ±38%  -9.31%  (p=0.003 n=14+14)
TrailingZeros32-160  1.64ns ±10%  1.49ns ± 9%  -9.15%  (p=0.000 n=13+14)
TrailingZeros64-160  1.53ns ± 6%  1.45ns ± 5%  -5.38%  (p=0.000 n=15+13)

Change-Id: I365e6ff79f3ce4d8ebe089a6a86b1771853eb596
Reviewed-on: https://go-review.googlesource.com/c/go/+/167517
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
2019-03-20 20:27:00 +00:00
erifan01
5714c91b53 cmd/compile: intrinsify math/bits.Add64 for arm64
This CL instrinsifies Add64 with arm64 instruction sequence ADDS, ADCS
and ADC, and optimzes the case of carry chains.The CL also changes the
test code so that the intrinsic implementation can be tested.

Benchmarks:
name               old time/op       new time/op       delta
Add-224            2.500000ns +- 0%  2.090000ns +- 4%  -16.40%  (p=0.000 n=9+10)
Add32-224          2.500000ns +- 0%  2.500000ns +- 0%     ~     (all equal)
Add64-224          2.500000ns +- 0%  1.577778ns +- 2%  -36.89%  (p=0.000 n=10+9)
Add64multiple-224  6.000000ns +- 0%  2.000000ns +- 0%  -66.67%  (p=0.000 n=10+10)

Change-Id: I6ee91c9a85c16cc72ade5fd94868c579f16c7615
Reviewed-on: https://go-review.googlesource.com/c/go/+/159017
Run-TryBot: Ben Shi <powerman1st@163.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2019-03-20 05:39:49 +00:00
Josh Bleecher Snyder
250b96a7bf cmd/compile: slightly optimize adding 128
'SUBQ $-0x80, r' is shorter to encode than 'ADDQ $0x80, r',
and functionally equivalent. Use it instead.

Shaves off a few bytes here and there:

file    before    after     Δ       %       
compile 25935856  25927664  -8192   -0.032% 
nm      4251840   4247744   -4096   -0.096% 

Change-Id: Ia9e02ea38cbded6a52a613b92e3a914f878d931e
Reviewed-on: https://go-review.googlesource.com/c/go/+/168344
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2019-03-19 19:44:52 +00:00
Keith Randall
ca36af215f cmd/compile: better write barrier removal when initializing new objects
When initializing a new object, we're often writing
1) to a location that doesn't have a pointer to a heap object
2) a pointer that doesn't point to a heap object

When both those conditions are true, we can avoid the write barrier.

This CL detects case 1 by looking for writes to known-zeroed
locations.  The results of runtime.newobject are zeroed, and we
perform a simple tracking of which parts of that object are written so
we can determine what part remains zero at each write.

This CL detects case 2 by looking for addresses of globals (including
the types and itabs which are used in interfaces) and for nil pointers.

Makes cmd/go 0.3% smaller. Some particular cases, like the slice
literal in #29573, can get much smaller.

TODO: we can remove actual zero writes also with this mechanism.

Update #29573

Change-Id: Ie74a3533775ea88da0495ba02458391e5db26cb9
Reviewed-on: https://go-review.googlesource.com/c/go/+/156363
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2019-03-18 21:16:19 +00:00
Keith Randall
7b916243d9 cmd/compile: rename init function from init.ializers back to init
The name change init -> init.ializers was initially required for
initialization code.

With CL 161337 there's no wrapper code any more, there's a data
structure instead (named .inittask). So we can go back to just
plain init appearing in tracebacks.

RELNOTE=yes

Update #29919. Followon to CL 161337.

Change-Id: I5a4a49d286df24b53b2baa193dfda482f3ea82a5
Reviewed-on: https://go-review.googlesource.com/c/go/+/167780
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2019-03-18 20:41:12 +00:00
Keith Randall
d949d0b925 cmd/compile: reorganize init functions
Instead of writing an init function per package that does the same
thing for every package, just write that implementation once in the
runtime. Change the compiler to generate a data structure that encodes
the required initialization operations.

Reduces cmd/go binary size by 0.3%+.  Most of the init code is gone,
including all the corresponding stack map info. The .inittask
structures that replace them are quite a bit smaller.

Most usefully to me, there is no longer an init function in every -S output.
(There is an .inittask global there, but it's much less distracting.)

After this CL we could change the name of the "init.ializers" function
back to just "init".

Update #6853

R=go1.13

Change-Id: Iec82b205cc52fe3ade4d36406933c97dbc9c01b1
Reviewed-on: https://go-review.googlesource.com/c/go/+/161337
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2019-03-18 20:10:55 +00:00
Matthew Dempsky
916e861fd9 cmd/compile: fix importing rewritten f(g()) calls
golang.org/cl/166983 started serializing the Ninit field of OCALL
nodes within function inline bodies (necessary to fix a regression in
building crypto/ecdsa with -gcflags=-l=4), but this means the Ninit
field needs to be typechecked when the imported function body is used.

It's unclear why this wasn't necessary for the crypto/ecdsa
regression.

Fixes #30907.

Change-Id: Id5f0bf3c4d17bbd6d5318913b859093c93a0a20c
Reviewed-on: https://go-review.googlesource.com/c/go/+/168199
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2019-03-18 19:43:38 +00:00
Keith Randall
2c423f063b cmd/compile,runtime: provide index information on bounds check failure
A few examples (for accessing a slice of length 3):

   s[-1]    runtime error: index out of range [-1]
   s[3]     runtime error: index out of range [3] with length 3
   s[-1:0]  runtime error: slice bounds out of range [-1:]
   s[3:0]   runtime error: slice bounds out of range [3:0]
   s[3:-1]  runtime error: slice bounds out of range [:-1]
   s[3:4]   runtime error: slice bounds out of range [:4] with capacity 3
   s[0:3:4] runtime error: slice bounds out of range [::4] with capacity 3

Note that in cases where there are multiple things wrong with the
indexes (e.g. s[3:-1]), we report one of those errors kind of
arbitrarily, currently the rightmost one.

An exhaustive set of examples is in issue30116[u].out in the CL.

The message text has the same prefix as the old message text. That
leads to slightly awkward phrasing but hopefully minimizes the chance
that code depending on the error text will break.

Increases the size of the go binary by 0.5% (amd64). The panic functions
take arguments in registers in order to keep the size of the compiled code
as small as possible.

Fixes #30116

Change-Id: Idb99a827b7888822ca34c240eca87b7e44a04fdd
Reviewed-on: https://go-review.googlesource.com/c/go/+/161477
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2019-03-18 17:33:38 +00:00
Than McIntosh
64b1889e2d test: new test for issue 30862
New test case, inspired by gccgo issue 30862.

Updates #30862.

Change-Id: I5e494b877e4fd142b8facb527471fe1fdef39c61
Reviewed-on: https://go-review.googlesource.com/c/go/+/167744
Run-TryBot: Than McIntosh <thanm@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2019-03-15 19:05:53 +00:00
Tobias Klauser
156c830bea cmd/compile: eliminate unnecessary type conversions in TrailingZeros(16|8) for arm
This follows CL 156999 which did the same for arm64.

name               old time/op  new time/op  delta
TrailingZeros-4    7.30ns ± 1%  7.30ns ± 0%     ~     (p=0.413 n=9+9)
TrailingZeros8-4   8.32ns ± 0%  7.17ns ± 0%  -13.77%  (p=0.000 n=10+9)
TrailingZeros16-4  8.30ns ± 0%  7.18ns ± 0%  -13.50%  (p=0.000 n=9+10)
TrailingZeros32-4  6.46ns ± 1%  6.47ns ± 1%     ~     (p=0.325 n=10+10)
TrailingZeros64-4  16.3ns ± 0%  16.2ns ± 0%   -0.61%  (p=0.000 n=7+10)

Change-Id: I7e9e1abf7e30d811aa474d272b2824ec7cbbaa98
Reviewed-on: https://go-review.googlesource.com/c/go/+/167797
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2019-03-15 18:37:22 +00:00
Matthew Dempsky
c0cfe9687f cmd/compile: rewrite f(g()) for multi-value g() during typecheck
This is a re-attempt at CL 153841, which caused two regressions:

1. crypto/ecdsa failed to build with -gcflags=-l=4. This was because
when "t1, t2, ... := g(); f(t1, t2, ...)" was exported, we were losing
the first assignment from the call's Ninit field.

2. net/http/pprof failed to run with -gcflags=-N. This is due to a
conflict with CL 159717: as of that CL, package-scope initialization
statements are executed within the "init.ializer" function, rather
than the "init" function, and the generated temp variables need to be
moved accordingly too.

[Rest of description is as before.]

This CL moves order.go's copyRet logic for rewriting f(g()) into t1,
t2, ... := g(); f(t1, t2, ...) earlier into typecheck. This allows the
rest of the compiler to stop worrying about multi-value functions
appearing outside of OAS2FUNC nodes.

This changes compiler behavior in a few observable ways:

1. Typechecking error messages for builtin functions now use general
case error messages rather than unnecessarily differing ones.

2. Because f(g()) is rewritten before inlining, saved inline bodies
now see the rewritten form too. This could be addressed, but doesn't
seem worthwhile.

3. Most notably, this simplifies escape analysis and fixes a memory
corruption issue in esc.go. See #29197 for details.

Fixes #15992.
Fixes #29197.

Change-Id: I930b10f7e27af68a0944d6c9bfc8707c3fab27a4
Reviewed-on: https://go-review.googlesource.com/c/go/+/166983
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
2019-03-14 21:00:20 +00:00
Richard Musiol
5ee1b84959 math, math/bits: add intrinsics for wasm
This commit adds compiler intrinsics for the packages math and
math/bits on the wasm architecture for better performance.

benchmark                        old ns/op     new ns/op     delta
BenchmarkCeil                    8.31          3.21          -61.37%
BenchmarkCopysign                5.24          3.88          -25.95%
BenchmarkAbs                     5.42          3.34          -38.38%
BenchmarkFloor                   8.29          3.18          -61.64%
BenchmarkRoundToEven             9.76          3.26          -66.60%
BenchmarkSqrtLatency             8.13          4.88          -39.98%
BenchmarkSqrtPrime               5246          3535          -32.62%
BenchmarkTrunc                   8.29          3.15          -62.00%
BenchmarkLeadingZeros            13.0          4.23          -67.46%
BenchmarkLeadingZeros8           4.65          4.42          -4.95%
BenchmarkLeadingZeros16          7.60          4.38          -42.37%
BenchmarkLeadingZeros32          10.7          4.48          -58.13%
BenchmarkLeadingZeros64          12.9          4.31          -66.59%
BenchmarkTrailingZeros           6.52          4.04          -38.04%
BenchmarkTrailingZeros8          4.57          4.14          -9.41%
BenchmarkTrailingZeros16         6.69          4.16          -37.82%
BenchmarkTrailingZeros32         6.97          4.23          -39.31%
BenchmarkTrailingZeros64         6.59          4.00          -39.30%
BenchmarkOnesCount               7.93          3.30          -58.39%
BenchmarkOnesCount8              3.56          3.19          -10.39%
BenchmarkOnesCount16             4.85          3.19          -34.23%
BenchmarkOnesCount32             7.27          3.19          -56.12%
BenchmarkOnesCount64             8.08          3.28          -59.41%
BenchmarkRotateLeft              4.88          3.80          -22.13%
BenchmarkRotateLeft64            5.03          3.63          -27.83%

Change-Id: Ic1e0c2984878be8defb6eb7eb6ee63765c793222
Reviewed-on: https://go-review.googlesource.com/c/go/+/165177
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2019-03-14 19:46:19 +00:00
Josh Bleecher Snyder
61945fc502 cmd/compile: don't generate panicshift for masked int shifts
We know that a & 31 is non-negative for all a, signed or not.
We can avoid checking that and needing to write out an
unreachable call to panicshift.

Change-Id: I32f32fb2c950d2b2b35ac5c0e99b7b2dbd47f917
Reviewed-on: https://go-review.googlesource.com/c/go/+/167499
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Keith Randall <khr@golang.org>
2019-03-14 00:03:29 +00:00
Josh Bleecher Snyder
870cfe6484 test/codegen: gofmt
Change-Id: I33f5b5051e5f75aa264ec656926223c5a3c09c1b
Reviewed-on: https://go-review.googlesource.com/c/go/+/167498
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Matt Layher <mdlayher@gmail.com>
2019-03-13 21:44:45 +00:00
Matthew Dempsky
a1b5cb1d04 cmd/compile: simplify isGoConst
The only ways to construct an OLITERAL node are (1) a basic literal
from the source package, (2) constant folding within evconst (which
only folds Go language constants), (3) the universal "nil" constant,
and (4) implicit conversions of nil to some concrete type.

Passes toolstash-check.

Change-Id: I30fc6b07ebede7adbdfa4ed562436cbb7078a2ff
Reviewed-on: https://go-review.googlesource.com/c/go/+/166981
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Robert Griesemer <gri@golang.org>
2019-03-13 18:24:02 +00:00
Carlos Eduardo Seo
4ba69a9a17 cmd/compile: add processor level selection support to ppc64{,le}
ppc64{,le} processor level selection allows the compiler to generate instructions
targeting newer processors and processor-specific optimizations without breaking
compatibility with our current baseline. This feature introduces a new environment
variable, GOPPC64.

GOPPC64 is a GOARCH=ppc64{,le} specific option, for a choice between different
processor levels (i.e. Instruction Set Architecture versions) for which the
compiler will target. The default is 'power8'.

Change-Id: Ic152e283ae1c47084ece4346fa002a3eabb3bb9e
Reviewed-on: https://go-review.googlesource.com/c/go/+/163758
Run-TryBot: Carlos Eduardo Seo <cseo@linux.vnet.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2019-03-13 14:44:02 +00:00
Robert Griesemer
bea58ef352 cmd/compile: don't report redundant error for invalid integer literals
Fixes #30722.

Change-Id: Ia4c6e37282edc44788cd8af3f6cfa10895a19e4f
Reviewed-on: https://go-review.googlesource.com/c/go/+/166519
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2019-03-12 22:59:12 +00:00
Bryan C. Mills
af13cfc3a2 ../test: set GOPATH in nosplit.go
This test invokes 'go build', so in module mode it needs a module
cache to guard edits to go.mod.

Fixes #30776

Change-Id: I89ebef1fad718247e7f972cd830e31d6f4a83e4c
Reviewed-on: https://go-review.googlesource.com/c/go/+/167085
Run-TryBot: Bryan C. Mills <bcmills@google.com>
Reviewed-by: Jay Conrod <jayconrod@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2019-03-12 20:46:29 +00:00
fanzha02
a85afef277 cmd/compile: add handling for new floating-point comparisons flags
The CL 164718 adds new condition flags for floating-point comparisons
in arm64 backend, but dose not add the handling in rewrite.go for
corresponding Ops, which causes issue 30679. And this CL fixes this
issue.

Fixes #30679

Change-Id: I8acc749f78227c3e9e74fa7938f05fb442fb62c6
Reviewed-on: https://go-review.googlesource.com/c/go/+/166579
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2019-03-12 14:01:26 +00:00
Than McIntosh
30cc8a46c4 test: add new test for gccgo compilation problem
New test for issue 30659 (compilation error due to bad
export data).

Updates #30659.

Change-Id: I2541ee3c379e5b22033fea66bb4ebaf720cc5e1f
Reviewed-on: https://go-review.googlesource.com/c/go/+/166917
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2019-03-12 00:04:59 +00:00
Keith Randall
486ca37b14 test: fix memcombine tests
Two tests (load_le_byte8_uint64_inv and load_be_byte8_uint64)
pass but the generated code isn't actually correct.

The test regexp provides a false negative, as it matches the
MOVQ (SP), BP instruction in the epilogue.

Combined loads never worked for these cases - the test was added in error
as part of a batch and not noticed because of the above false match.

Normalize the amd64/386 tests to always negative match on narrower
loads and OR.

Change-Id: I256861924774d39db0e65723866c81df5ab5076f
Reviewed-on: https://go-review.googlesource.com/c/go/+/166837
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2019-03-11 19:18:03 +00:00
Carlo Alberto Ferraris
05051b56a0 sync: allow inlining the RWMutex.RUnlock fast path
RWMutex.RLock is already inlineable, so add a test for it as well.

name                    old time/op  new time/op  delta
RWMutexUncontended      66.5ns ± 0%  60.3ns ± 1%  -9.38%  (p=0.000 n=12+20)
RWMutexUncontended-4    16.7ns ± 0%  15.3ns ± 1%  -8.49%  (p=0.000 n=17+20)
RWMutexUncontended-16   7.86ns ± 0%  7.69ns ± 0%  -2.08%  (p=0.000 n=18+15)
RWMutexWrite100         25.1ns ± 0%  24.0ns ± 1%  -4.28%  (p=0.000 n=20+18)
RWMutexWrite100-4       46.7ns ± 5%  44.1ns ± 4%  -5.53%  (p=0.000 n=20+20)
RWMutexWrite100-16      68.3ns ±11%  65.7ns ± 8%  -3.81%  (p=0.003 n=20+20)
RWMutexWrite10          26.7ns ± 1%  25.7ns ± 0%  -3.75%  (p=0.000 n=17+14)
RWMutexWrite10-4        34.9ns ± 2%  33.8ns ± 2%  -3.15%  (p=0.000 n=20+20)
RWMutexWrite10-16       37.4ns ± 2%  36.1ns ± 2%  -3.51%  (p=0.000 n=18+20)
RWMutexWorkWrite100      163ns ± 0%   162ns ± 0%  -0.89%  (p=0.000 n=18+20)
RWMutexWorkWrite100-4    189ns ± 4%   184ns ± 4%  -2.89%  (p=0.000 n=19+20)
RWMutexWorkWrite100-16   207ns ± 4%   200ns ± 2%  -3.07%  (p=0.000 n=19+20)
RWMutexWorkWrite10       153ns ± 0%   151ns ± 1%  -0.75%  (p=0.000 n=20+20)
RWMutexWorkWrite10-4     177ns ± 1%   176ns ± 2%  -0.63%  (p=0.004 n=17+20)
RWMutexWorkWrite10-16    191ns ± 2%   189ns ± 1%  -0.83%  (p=0.015 n=20+17)

linux/amd64 bin/go 14688201 (previous commit 14675861, +12340/+0.08%)

The cumulative effect of this and the previous 3 commits is:

name                    old time/op  new time/op  delta
MutexUncontended        19.3ns ± 1%  16.4ns ± 1%  -15.13%  (p=0.000 n=20+20)
MutexUncontended-4      5.24ns ± 0%  4.09ns ± 0%  -21.95%  (p=0.000 n=20+18)
MutexUncontended-16     2.10ns ± 0%  2.12ns ± 0%   +0.95%  (p=0.000 n=15+17)
Mutex                   19.6ns ± 0%  16.3ns ± 1%  -17.12%  (p=0.000 n=20+20)
Mutex-4                 54.6ns ± 5%  45.6ns ±10%  -16.51%  (p=0.000 n=20+19)
Mutex-16                 133ns ± 5%   130ns ± 3%   -1.99%  (p=0.002 n=20+20)
MutexSlack              33.4ns ± 2%  16.2ns ± 0%  -51.44%  (p=0.000 n=19+20)
MutexSlack-4             206ns ± 5%   209ns ± 9%     ~     (p=0.154 n=20+20)
MutexSlack-16           89.4ns ± 1%  90.9ns ± 2%   +1.70%  (p=0.000 n=18+17)
MutexWork               60.5ns ± 0%  55.3ns ± 1%   -8.59%  (p=0.000 n=12+20)
MutexWork-4              105ns ± 5%    97ns ±11%   -7.95%  (p=0.000 n=20+20)
MutexWork-16             157ns ± 1%   158ns ± 1%   +0.66%  (p=0.001 n=18+17)
MutexWorkSlack          70.2ns ± 5%  55.3ns ± 0%  -21.30%  (p=0.000 n=19+18)
MutexWorkSlack-4         277ns ±13%   260ns ±15%   -6.35%  (p=0.002 n=20+18)
MutexWorkSlack-16        156ns ± 0%   146ns ± 1%   -6.40%  (p=0.000 n=16+19)
MutexNoSpin              966ns ± 0%   976ns ± 1%   +0.97%  (p=0.000 n=15+17)
MutexNoSpin-4            269ns ± 4%   272ns ± 4%   +1.15%  (p=0.048 n=20+18)
MutexNoSpin-16           122ns ± 0%   119ns ± 1%   -2.63%  (p=0.000 n=19+15)
MutexSpin               3.13µs ± 0%  3.12µs ± 0%   -0.17%  (p=0.000 n=18+18)
MutexSpin-4              826ns ± 1%   833ns ± 1%   +0.84%  (p=0.000 n=19+17)
MutexSpin-16             397ns ± 1%   394ns ± 1%   -0.78%  (p=0.000 n=19+19)
Once                    5.67ns ± 0%  2.07ns ± 2%  -63.43%  (p=0.000 n=20+20)
Once-4                  1.47ns ± 2%  0.54ns ± 3%  -63.49%  (p=0.000 n=19+20)
Once-16                 0.58ns ± 0%  0.17ns ± 5%  -70.49%  (p=0.000 n=17+17)
RWMutexUncontended      71.4ns ± 0%  60.3ns ± 1%  -15.60%  (p=0.000 n=16+20)
RWMutexUncontended-4    18.4ns ± 4%  15.3ns ± 1%  -17.14%  (p=0.000 n=20+20)
RWMutexUncontended-16   8.01ns ± 0%  7.69ns ± 0%   -3.91%  (p=0.000 n=18+15)
RWMutexWrite100         24.9ns ± 0%  24.0ns ± 1%   -3.57%  (p=0.000 n=19+18)
RWMutexWrite100-4       46.5ns ± 3%  44.1ns ± 4%   -5.09%  (p=0.000 n=17+20)
RWMutexWrite100-16      68.9ns ± 3%  65.7ns ± 8%   -4.65%  (p=0.000 n=18+20)
RWMutexWrite10          27.1ns ± 0%  25.7ns ± 0%   -5.25%  (p=0.000 n=17+14)
RWMutexWrite10-4        34.8ns ± 1%  33.8ns ± 2%   -2.96%  (p=0.000 n=20+20)
RWMutexWrite10-16       37.5ns ± 2%  36.1ns ± 2%   -3.72%  (p=0.000 n=20+20)
RWMutexWorkWrite100      164ns ± 0%   162ns ± 0%   -1.49%  (p=0.000 n=12+20)
RWMutexWorkWrite100-4    186ns ± 3%   184ns ± 4%     ~     (p=0.097 n=20+20)
RWMutexWorkWrite100-16   204ns ± 2%   200ns ± 2%   -1.58%  (p=0.000 n=18+20)
RWMutexWorkWrite10       153ns ± 0%   151ns ± 1%   -1.21%  (p=0.000 n=20+20)
RWMutexWorkWrite10-4     179ns ± 1%   176ns ± 2%   -1.25%  (p=0.000 n=19+20)
RWMutexWorkWrite10-16    191ns ± 1%   189ns ± 1%   -0.94%  (p=0.000 n=15+17)

Change-Id: I9269bf2ac42a04c610624f707d3268dcb17390f8
Reviewed-on: https://go-review.googlesource.com/c/go/+/152698
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2019-03-09 16:34:17 +00:00
Carlo Alberto Ferraris
ca8354843e sync: allow inlining the Once.Do fast path
Using Once.Do is now extremely cheap because the fast path is just an inlined
atomic load of a variable that is written only once and a conditional jump.
This is very beneficial for Once.Do because, due to its nature, the fast path
will be used for every call after the first one.

In a attempt to mimize code size increase, reorder the fields so that the
pointer to Once is also the pointer to Once.done, that is the only field used
in the hot path. This allows to use more compact instruction encodings or less
instructions in the hot path (that is inlined at every callsite).

name     old time/op  new time/op  delta
Once     4.54ns ± 0%  2.06ns ± 0%  -54.59%  (p=0.000 n=19+16)
Once-4   1.18ns ± 0%  0.55ns ± 0%  -53.39%  (p=0.000 n=15+16)
Once-16  0.53ns ± 0%  0.17ns ± 0%  -67.92%  (p=0.000 n=18+17)

linux/amd64 bin/go 14675861 (previous commit 14663387, +12474/+0.09%)

Change-Id: Ie2708103ab473787875d66746d2f20f1d90a6916
Reviewed-on: https://go-review.googlesource.com/c/go/+/152697
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2019-03-09 05:58:31 +00:00
Carlo Alberto Ferraris
41cb0aedff sync: allow inlining the Mutex.Lock fast path
name                    old time/op  new time/op  delta
MutexUncontended        18.9ns ± 0%  16.2ns ± 0%  -14.29%  (p=0.000 n=19+19)
MutexUncontended-4      4.75ns ± 1%  4.08ns ± 0%  -14.20%  (p=0.000 n=20+19)
MutexUncontended-16     2.05ns ± 0%  2.11ns ± 0%   +2.93%  (p=0.000 n=19+16)
Mutex                   19.3ns ± 1%  16.2ns ± 0%  -15.86%  (p=0.000 n=17+19)
Mutex-4                 52.4ns ± 4%  48.6ns ± 9%   -7.22%  (p=0.000 n=20+20)
Mutex-16                 139ns ± 2%   140ns ± 3%   +1.03%  (p=0.011 n=16+20)
MutexSlack              18.9ns ± 1%  16.2ns ± 1%  -13.96%  (p=0.000 n=20+20)
MutexSlack-4             225ns ± 8%   211ns ±10%   -5.94%  (p=0.000 n=18+19)
MutexSlack-16           98.4ns ± 1%  90.9ns ± 1%   -7.60%  (p=0.000 n=17+18)
MutexWork               58.2ns ± 3%  55.4ns ± 0%   -4.82%  (p=0.000 n=20+17)
MutexWork-4              103ns ± 7%    95ns ±18%   -8.03%  (p=0.000 n=20+20)
MutexWork-16             163ns ± 2%   155ns ± 2%   -4.47%  (p=0.000 n=18+18)
MutexWorkSlack          57.7ns ± 1%  55.4ns ± 0%   -3.99%  (p=0.000 n=20+13)
MutexWorkSlack-4         276ns ±13%   260ns ±10%   -5.64%  (p=0.001 n=19+19)
MutexWorkSlack-16        147ns ± 0%   156ns ± 1%   +5.87%  (p=0.000 n=14+19)
MutexNoSpin              968ns ± 0%   900ns ± 1%   -6.98%  (p=0.000 n=20+18)
MutexNoSpin-4            270ns ± 2%   255ns ± 2%   -5.74%  (p=0.000 n=19+20)
MutexNoSpin-16           120ns ± 4%   112ns ± 0%   -6.99%  (p=0.000 n=19+14)
MutexSpin               3.13µs ± 1%  3.19µs ± 6%     ~     (p=0.401 n=20+20)
MutexSpin-4              832ns ± 2%   831ns ± 1%   -0.17%  (p=0.023 n=16+18)
MutexSpin-16             395ns ± 0%   399ns ± 0%   +0.94%  (p=0.000 n=17+19)
RWMutexUncontended      69.5ns ± 0%  68.4ns ± 0%   -1.59%  (p=0.000 n=20+20)
RWMutexUncontended-4    17.5ns ± 0%  16.7ns ± 0%   -4.30%  (p=0.000 n=18+17)
RWMutexUncontended-16   7.92ns ± 0%  7.87ns ± 0%   -0.61%  (p=0.000 n=18+17)
RWMutexWrite100         24.9ns ± 1%  25.0ns ± 1%   +0.32%  (p=0.000 n=20+20)
RWMutexWrite100-4       46.2ns ± 4%  46.2ns ± 5%     ~     (p=0.840 n=19+20)
RWMutexWrite100-16      69.9ns ± 5%  69.9ns ± 3%     ~     (p=0.545 n=20+19)
RWMutexWrite10          27.0ns ± 2%  26.8ns ± 2%   -0.98%  (p=0.001 n=20+20)
RWMutexWrite10-4        34.7ns ± 2%  35.0ns ± 4%     ~     (p=0.191 n=18+20)
RWMutexWrite10-16       37.2ns ± 4%  37.3ns ± 2%     ~     (p=0.438 n=20+19)
RWMutexWorkWrite100      164ns ± 0%   163ns ± 0%   -0.24%  (p=0.025 n=20+20)
RWMutexWorkWrite100-4    193ns ± 3%   191ns ± 2%   -1.06%  (p=0.027 n=20+20)
RWMutexWorkWrite100-16   210ns ± 3%   207ns ± 3%   -1.22%  (p=0.038 n=20+20)
RWMutexWorkWrite10       153ns ± 0%   153ns ± 0%     ~     (all equal)
RWMutexWorkWrite10-4     178ns ± 2%   179ns ± 2%     ~     (p=0.186 n=20+20)
RWMutexWorkWrite10-16    192ns ± 2%   192ns ± 2%     ~     (p=0.731 n=19+20)

linux/amd64 bin/go 14663387 (previous commit 14630572, +32815/+0.22%)

Change-Id: I98171006dce14069b1a62da07c3d165455a7906b
Reviewed-on: https://go-review.googlesource.com/c/go/+/148959
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2019-03-09 05:08:04 +00:00
Keith Randall
83a33d3855 cmd/compile: reverse order of slice bounds checks
Turns out this makes the fix for 28797 unnecessary, because this order
ensures that the RHS of IsSliceInBounds ops are always nonnegative.

The real reason for this change is that it also makes dealing with
<0 values easier for reporting values in bounds check panics (issue #30116).

Makes cmd/go negligibly smaller.

Update #28797

Change-Id: I1f25ba6d2b3b3d4a72df3105828aa0a4b629ce85
Reviewed-on: https://go-review.googlesource.com/c/go/+/166377
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2019-03-09 00:52:45 +00:00
fanzha02
6efd51c6b7 cmd/compile: change the condition flags of floating-point comparisons in arm64 backend
Current compiler reverses operands to work around NaN in
"less than" and "less equal than" comparisons. But if we
want to use "FCMPD/FCMPS $(0.0), Fn" to do some optimization,
the workaround way does not work. Because assembler does
not support instruction "FCMPD/FCMPS Fn, $(0.0)".

This CL sets condition flags for floating-point comparisons
to resolve this problem.

Change-Id: Ia48076a1da95da64596d6e68304018cb301ebe33
Reviewed-on: https://go-review.googlesource.com/c/go/+/164718
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2019-03-07 21:23:52 +00:00
Raul Silvera
2dd066d4a7 test: improve test coverage for heap sampling
Update the test in test/heapsampling.go to more thoroughly validate heap sampling.
Lower the sampling rate on the test to ensure allocations both smaller and
larger than the sampling rate are tested.

Tighten up the validation check to a 10% difference between the unsampled and correct value.
Because of the nature of random sampling, it is possible that the unsampled value fluctuates
over that range. To avoid flakes, run the experiment three times and only report an issue if the
same location consistently falls out of range on all experiments.

This tests the sampling fix in cl/158337.

Change-Id: I54a709e5c75827b8b1c2d87cdfb425ab09759677
GitHub-Last-Rev: 7c04f12603
GitHub-Pull-Request: golang/go#26944
Reviewed-on: https://go-review.googlesource.com/c/go/+/129117
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2019-03-07 21:05:15 +00:00
erifan01
4e2b0dda8c cmd/compile: eliminate unnecessary type conversions in TrailingZeros(16|8) for arm64
This CL eliminates unnecessary type conversion operations: OpZeroExt16to64 and OpZeroExt8to64.
If the input argrument is a nonzero value, then ORconst operation can also be eliminated.

Benchmarks:

name               old time/op  new time/op  delta
TrailingZeros-8    2.75ns ± 0%  2.75ns ± 0%     ~     (all equal)
TrailingZeros8-8   3.49ns ± 1%  2.93ns ± 0%  -16.00%  (p=0.000 n=10+10)
TrailingZeros16-8  3.49ns ± 1%  2.93ns ± 0%  -16.05%  (p=0.000 n=9+10)
TrailingZeros32-8  2.67ns ± 1%  2.68ns ± 1%     ~     (p=0.468 n=10+10)
TrailingZeros64-8  2.67ns ± 1%  2.65ns ± 0%   -0.62%  (p=0.022 n=10+9)

code:

func f16(x uint) { z = bits.TrailingZeros16(uint16(x)) }

Before:

"".f16 STEXT size=48 args=0x8 locals=0x0 leaf
        0x0000 00000 (test.go:7)        TEXT    "".f16(SB), LEAF|NOFRAME|ABIInternal, $0-8
        0x0000 00000 (test.go:7)        FUNCDATA        ZR, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
        0x0000 00000 (test.go:7)        FUNCDATA        $1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
        0x0000 00000 (test.go:7)        FUNCDATA        $3, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
        0x0000 00000 (test.go:7)        PCDATA  $2, ZR
        0x0000 00000 (test.go:7)        PCDATA  ZR, ZR
        0x0000 00000 (test.go:7)        MOVD    "".x(FP), R0
        0x0004 00004 (test.go:7)        MOVHU   R0, R0
        0x0008 00008 (test.go:7)        ORR     $65536, R0, R0
        0x000c 00012 (test.go:7)        RBIT    R0, R0
        0x0010 00016 (test.go:7)        CLZ     R0, R0
        0x0014 00020 (test.go:7)        MOVD    R0, "".z(SB)
        0x0020 00032 (test.go:7)        RET     (R30)

This line of code is unnecessary:
        0x0004 00004 (test.go:7)        MOVHU   R0, R0

After:

"".f16 STEXT size=32 args=0x8 locals=0x0 leaf
        0x0000 00000 (test.go:7)        TEXT    "".f16(SB), LEAF|NOFRAME|ABIInternal, $0-8
        0x0000 00000 (test.go:7)        FUNCDATA        ZR, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
        0x0000 00000 (test.go:7)        FUNCDATA        $1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
        0x0000 00000 (test.go:7)        FUNCDATA        $3, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
        0x0000 00000 (test.go:7)        PCDATA  $2, ZR
        0x0000 00000 (test.go:7)        PCDATA  ZR, ZR
        0x0000 00000 (test.go:7)        MOVD    "".x(FP), R0
        0x0004 00004 (test.go:7)        ORR     $65536, R0, R0
        0x0008 00008 (test.go:7)        RBITW   R0, R0
        0x000c 00012 (test.go:7)        CLZW    R0, R0
        0x0010 00016 (test.go:7)        MOVD    R0, "".z(SB)
        0x001c 00028 (test.go:7)        RET     (R30)

The situation of TrailingZeros8 is similar to TrailingZeros16.

Change-Id: I473bdca06be8460a0be87abbae6fe640017e4c9d
Reviewed-on: https://go-review.googlesource.com/c/go/+/156999
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2019-03-07 14:24:56 +00:00
erifan01
fee84cc905 cmd/compile: add an optimization rule for math/bits.ReverseBytes16 on arm
This CL adds two rules to turn patterns like ((x<<8) | (x>>8)) (the type of
x is uint16, "|" can also be "+" or "^") to a REV16 instruction on arm v6+.
This optimization rule can be used for math/bits.ReverseBytes16.

Benchmarks on arm v6:
name               old time/op  new time/op  delta
ReverseBytes-32    2.86ns ± 0%  2.86ns ± 0%   ~     (all equal)
ReverseBytes16-32  2.86ns ± 0%  2.86ns ± 0%   ~     (all equal)
ReverseBytes32-32  1.29ns ± 0%  1.29ns ± 0%   ~     (all equal)
ReverseBytes64-32  1.43ns ± 0%  1.43ns ± 0%   ~     (all equal)

Change-Id: I819e633c9a9d308f8e476fb0c82d73fb73dd019f
Reviewed-on: https://go-review.googlesource.com/c/go/+/159019
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2019-03-07 13:37:54 +00:00
Keith Randall
9dc3b8b722 reflect: fix more issues with StructOf GC programs
First the insidious bug:

  var n uintptr
  for n := elemPtrs; n > 120; n -= 120 {
    prog = append(prog, 120)
    prog = append(prog, mask[:15]...)
    mask = mask[15:]
  }
  prog = append(prog, byte(n))
  prog = append(prog, mask[:(n+7)/8]...)

The := breaks this code, because the n after the loop is always 0!

We also do need to handle field padding correctly. In particular
the old padding code doesn't correctly handle fields that are not
a multiple of a pointer in size.

Fixes #30606.

Change-Id: Ifcab9494dc25c20116753c5d7e0145d6c2053ed8
Reviewed-on: https://go-review.googlesource.com/c/go/+/165860
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2019-03-07 00:06:12 +00:00
Keith Randall
05b3db24c1 reflect: fix StructOf GC programs
They are missing a stop byte at the end.

Normally this doesn't matter, but when including a GC program
in another GC program, we strip the last byte. If that last byte
wasn't a stop byte, then we've thrown away part of the program
we actually need.

Fixes #30606

Change-Id: Ie9604beeb84f7f9442e77d31fe64c374ca132cce
Reviewed-on: https://go-review.googlesource.com/c/go/+/165857
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2019-03-06 20:22:14 +00:00
Keith Randall
4a9064ef41 cmd/compile: fix ordering for short-circuiting ops
Make sure the side effects inside short-circuited operations (&& and ||)
happen correctly.

Before this CL, we attached the side effects to the node itself using
exprInPlace. That caused other side effects in sibling expressions
to get reordered with respect to the short circuit side effect.

Instead, rewrite a && b like:

r := a
if r {
  r = b
}

That code we can keep correctly ordered with respect to other
side-effects extracted from part of a big expression.

exprInPlace seems generally unsafe. But this was the only case where
exprInPlace is called not at the top level of an expression, so I
don't think the other uses can actually trigger an issue (there can't
be a sibling expression). TODO: maybe those cases don't need "in
place", and we can retire that function generally.

This CL needed a small tweak to the SSA generation of OIF so that the
short circuit optimization still triggers. The short circuit optimization
looks for triangle but not diamonds, so don't bother allocating a block
if it will be empty.

Go 1 benchmarks are in the noise.

Fixes #30566

Change-Id: I19c04296bea63cbd6ad05f87a63b005029123610
Reviewed-on: https://go-review.googlesource.com/c/go/+/165617
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2019-03-06 20:04:07 +00:00
Bryan C. Mills
b1a783df87 test/bench/go1: add go.mod file
cmd/dist executes 'go test' within this directory, so it needs a
go.mod file to tell the compiler what package path to use in
diagnostic and debug information.

Updates #30228

Change-Id: Ia313ac06bc0ec4631d415faa20c56cce2ac8dbc5
Reviewed-on: https://go-review.googlesource.com/c/go/+/165498
Run-TryBot: Bryan C. Mills <bcmills@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2019-03-06 18:53:12 +00:00
Alberto Donizetti
2f8d2427d9 test: skip mutex Unlock inlining tests on a few builders
Fix builder breakage from CL 148958.

This is an inlining test that should be skipped on -N -l.

The inlining also doesn't happen on arm and wasm, so skip the test
there too.

Fixes the noopt builder, the linux-arm builder, and the wasm builder.

Updates #30605

Change-Id: I06b90d595be7185df61db039dd225dc90d6f678f
Reviewed-on: https://go-review.googlesource.com/c/go/+/165339
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2019-03-05 18:23:18 +00:00
Carlo Alberto Ferraris
4c3f26076b sync: allow inlining the Mutex.Unlock fast path
Make use of the newly-enabled limited midstack inlining.
Similar changes will be done in followup CLs.

name                    old time/op  new time/op  delta
MutexUncontended        19.3ns ± 1%  18.9ns ± 0%   -1.92%  (p=0.000 n=20+19)
MutexUncontended-4      5.24ns ± 0%  4.75ns ± 1%   -9.25%  (p=0.000 n=20+20)
MutexUncontended-16     2.10ns ± 0%  2.05ns ± 0%   -2.38%  (p=0.000 n=15+19)
Mutex                   19.6ns ± 0%  19.3ns ± 1%   -1.92%  (p=0.000 n=20+17)
Mutex-4                 54.6ns ± 5%  52.4ns ± 4%   -4.09%  (p=0.000 n=20+20)
Mutex-16                 133ns ± 5%   139ns ± 2%   +4.23%  (p=0.000 n=20+16)
MutexSlack              33.4ns ± 2%  18.9ns ± 1%  -43.56%  (p=0.000 n=19+20)
MutexSlack-4             206ns ± 5%   225ns ± 8%   +9.12%  (p=0.000 n=20+18)
MutexSlack-16           89.4ns ± 1%  98.4ns ± 1%  +10.10%  (p=0.000 n=18+17)
MutexWork               60.5ns ± 0%  58.2ns ± 3%   -3.75%  (p=0.000 n=12+20)
MutexWork-4              105ns ± 5%   103ns ± 7%   -1.68%  (p=0.007 n=20+20)
MutexWork-16             157ns ± 1%   163ns ± 2%   +3.90%  (p=0.000 n=18+18)
MutexWorkSlack          70.2ns ± 5%  57.7ns ± 1%  -17.81%  (p=0.000 n=19+20)
MutexWorkSlack-4         277ns ±13%   276ns ±13%     ~     (p=0.682 n=20+19)
MutexWorkSlack-16        156ns ± 0%   147ns ± 0%   -5.62%  (p=0.000 n=16+14)
MutexNoSpin              966ns ± 0%   968ns ± 0%   +0.11%  (p=0.029 n=15+20)
MutexNoSpin-4            269ns ± 4%   270ns ± 2%     ~     (p=0.807 n=20+19)
MutexNoSpin-16           122ns ± 0%   120ns ± 4%   -1.63%  (p=0.000 n=19+19)
MutexSpin               3.13µs ± 0%  3.13µs ± 1%   +0.16%  (p=0.004 n=18+20)
MutexSpin-4              826ns ± 1%   832ns ± 2%   +0.74%  (p=0.000 n=19+16)
MutexSpin-16             397ns ± 1%   395ns ± 0%   -0.50%  (p=0.000 n=19+17)
RWMutexUncontended      71.4ns ± 0%  69.5ns ± 0%   -2.72%  (p=0.000 n=16+20)
RWMutexUncontended-4    18.4ns ± 4%  17.5ns ± 0%   -4.92%  (p=0.000 n=20+18)
RWMutexUncontended-16   8.01ns ± 0%  7.92ns ± 0%   -1.15%  (p=0.000 n=18+18)
RWMutexWrite100         24.9ns ± 0%  24.9ns ± 1%     ~     (p=0.099 n=19+20)
RWMutexWrite100-4       46.5ns ± 3%  46.2ns ± 4%     ~     (p=0.253 n=17+19)
RWMutexWrite100-16      68.9ns ± 3%  69.9ns ± 5%   +1.46%  (p=0.012 n=18+20)
RWMutexWrite10          27.1ns ± 0%  27.0ns ± 2%     ~     (p=0.128 n=17+20)
RWMutexWrite10-4        34.8ns ± 1%  34.7ns ± 2%     ~     (p=0.180 n=20+18)
RWMutexWrite10-16       37.5ns ± 2%  37.2ns ± 4%   -0.89%  (p=0.023 n=20+20)
RWMutexWorkWrite100      164ns ± 0%   164ns ± 0%     ~     (p=0.106 n=12+20)
RWMutexWorkWrite100-4    186ns ± 3%   193ns ± 3%   +3.46%  (p=0.000 n=20+20)
RWMutexWorkWrite100-16   204ns ± 2%   210ns ± 3%   +2.96%  (p=0.000 n=18+20)
RWMutexWorkWrite10       153ns ± 0%   153ns ± 0%   -0.20%  (p=0.017 n=20+19)
RWMutexWorkWrite10-4     179ns ± 1%   178ns ± 2%     ~     (p=0.215 n=19+20)
RWMutexWorkWrite10-16    191ns ± 1%   192ns ± 2%     ~     (p=0.166 n=15+19)

linux/amd64 bin/go 14630572 (previous commit 14605947, +24625/+0.17%)

Change-Id: I3f9d1765801fe0b8deb1bc2728b8bba8a7508e23
Reviewed-on: https://go-review.googlesource.com/c/go/+/148958
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2019-03-05 14:59:31 +00:00
erifan01
159b2de442 cmd/compile: optimize math/bits.Div32 for arm64
Benchmark:
name     old time/op  new time/op  delta
Div-8    22.0ns ± 0%  22.0ns ± 0%     ~     (all equal)
Div32-8  6.51ns ± 0%  3.00ns ± 0%  -53.90%  (p=0.000 n=10+8)
Div64-8  22.5ns ± 0%  22.5ns ± 0%     ~     (all equal)

Code:
func div32(hi, lo, y uint32) (q, r uint32) {return bits.Div32(hi, lo, y)}

Before:
        0x0020 00032 (test.go:24)       MOVWU   "".y+8(FP), R0
        0x0024 00036 ($GOROOT/src/math/bits/bits.go:472)        CBZW    R0, 132
        0x0028 00040 ($GOROOT/src/math/bits/bits.go:472)        MOVWU   "".hi(FP), R1
        0x002c 00044 ($GOROOT/src/math/bits/bits.go:472)        CMPW    R1, R0
        0x0030 00048 ($GOROOT/src/math/bits/bits.go:472)        BLS     96
        0x0034 00052 ($GOROOT/src/math/bits/bits.go:475)        MOVWU   "".lo+4(FP), R2
        0x0038 00056 ($GOROOT/src/math/bits/bits.go:475)        ORR     R1<<32, R2, R1
        0x003c 00060 ($GOROOT/src/math/bits/bits.go:476)        CBZ     R0, 140
        0x0040 00064 ($GOROOT/src/math/bits/bits.go:476)        UDIV    R0, R1, R2
        0x0044 00068 (test.go:24)       MOVW    R2, "".q+16(FP)
        0x0048 00072 ($GOROOT/src/math/bits/bits.go:476)        UREM    R0, R1, R0
        0x0050 00080 (test.go:24)       MOVW    R0, "".r+20(FP)
        0x0054 00084 (test.go:24)       MOVD    -8(RSP), R29
        0x0058 00088 (test.go:24)       MOVD.P  32(RSP), R30
        0x005c 00092 (test.go:24)       RET     (R30)

After:
        0x001c 00028 (test.go:24)       MOVWU   "".y+8(FP), R0
        0x0020 00032 (test.go:24)       CBZW    R0, 92
        0x0024 00036 (test.go:24)       MOVWU   "".hi(FP), R1
        0x0028 00040 (test.go:24)       CMPW    R0, R1
        0x002c 00044 (test.go:24)       BHS     84
        0x0030 00048 (test.go:24)       MOVWU   "".lo+4(FP), R2
        0x0034 00052 (test.go:24)       ORR     R1<<32, R2, R4
        0x0038 00056 (test.go:24)       UDIV    R0, R4, R3
        0x003c 00060 (test.go:24)       MSUB    R3, R4, R0, R4
        0x0040 00064 (test.go:24)       MOVW    R3, "".q+16(FP)
        0x0044 00068 (test.go:24)       MOVW    R4, "".r+20(FP)
        0x0048 00072 (test.go:24)       MOVD    -8(RSP), R29
        0x004c 00076 (test.go:24)       MOVD.P  16(RSP), R30
        0x0050 00080 (test.go:24)       RET     (R30)

UREM instruction in the previous assembly code will be converted to UDIV and MSUB instructions
on arm64. However the UDIV instruction in UREM is unnecessary, because it's a duplicate of the
previous UDIV. This CL adds a rule to have this extra UDIV instruction removed by CSE.

Change-Id: Ie2508784320020b2de022806d09f75a7871bb3d7
Reviewed-on: https://go-review.googlesource.com/c/159577
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Bryan C. Mills <bcmills@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2019-03-03 20:20:10 +00:00
erifan01
192b675f17 cmd/compile: add an optimaztion rule for math/bits.ReverseBytes16 on arm64
On amd64 ReverseBytes16 is lowered to a rotate instruction. However arm64 doesn't
have 16-bit rotate instruction, but has a REV16W instruction which can be used
for ReverseBytes16. This CL adds a rule to turn the patterns like (x<<8) | (x>>8)
(the type of x is uint16, and "|" can also be "^" or "+") to a REV16W instruction.

Code:
func reverseBytes16(i uint16) uint16 { return bits.ReverseBytes16(i) }

Before:
        0x0004 00004 (test.go:6)        MOVHU   "".i(FP), R0
        0x0008 00008 ($GOROOT/src/math/bits/bits.go:262)        UBFX    $8, R0, $8, R1
        0x000c 00012 ($GOROOT/src/math/bits/bits.go:262)        ORR     R0<<8, R1, R0
        0x0010 00016 (test.go:6)        MOVH    R0, "".~r1+8(FP)
        0x0014 00020 (test.go:6)        RET     (R30)

After:
        0x0000 00000 (test.go:6)        MOVHU   "".i(FP), R0
        0x0004 00004 (test.go:6)        REV16W  R0, R0
        0x0008 00008 (test.go:6)        MOVH    R0, "".~r1+8(FP)
        0x000c 00012 (test.go:6)        RET     (R30)

Benchmarks:
name                old time/op       new time/op       delta
ReverseBytes-224    1.000000ns +- 0%  1.000000ns +- 0%     ~     (all equal)
ReverseBytes16-224  1.500000ns +- 0%  1.000000ns +- 0%  -33.33%  (p=0.000 n=9+10)
ReverseBytes32-224  1.000000ns +- 0%  1.000000ns +- 0%     ~     (all equal)
ReverseBytes64-224  1.000000ns +- 0%  1.000000ns +- 0%     ~     (all equal)

Change-Id: I87cd41b2d8e549bf39c601f185d5775bd42d739c
Reviewed-on: https://go-review.googlesource.com/c/157757
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2019-03-01 15:42:19 +00:00
Cherry Zhang
40df9cc606 cmd/compile: make KeepAlive work on stack object
Currently, runtime.KeepAlive applied on a stack object doesn't
actually keeps the stack object alive, and the heap object
referenced from it could be collected. This is because the
address of the stack object is rematerializeable, and we just
ignored KeepAlive on rematerializeable values. This CL fixes it.

Fixes #30476.

Change-Id: Ic1f75ee54ed94ea79bd46a8ddcd9e81d01556d1d
Reviewed-on: https://go-review.googlesource.com/c/164537
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2019-03-01 15:36:52 +00:00
Matthew Dempsky
38642b9fce Revert "cmd/compile: rewrite f(g()) for multi-value g() during typecheck"
This reverts commit d96b7fbf98.

Reason for revert: broke noopt and longtest builders.

Change-Id: Ifaec64d817c4336cb255a2e9db00526b7bc5606a
Reviewed-on: https://go-review.googlesource.com/c/164757
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2019-03-01 05:58:51 +00:00
Matthew Dempsky
d96b7fbf98 cmd/compile: rewrite f(g()) for multi-value g() during typecheck
This CL moves order.go's copyRet logic for rewriting f(g()) into t1,
t2, ... = g(); f(t1, t2, ...) earlier into typecheck. This allows the
rest of the compiler to stop worrying about multi-value functions
appearing outside of OAS2FUNC nodes.

This changes compiler behavior in a few observable ways:

1. Typechecking error messages for builtin functions now use general
case error messages rather than unnecessarily differing ones.

2. Because f(g()) is rewritten before inlining, saved inline bodies
now see the rewritten form too. This could be addressed, but doesn't
seem worthwhile.

3. Most notably, this simplifies escape analysis and fixes a memory
corruption issue in esc.go. See #29197 for details.

Fixes #15992.
Fixes #29197.

Change-Id: I86a70668301efeec8fbd11fe2d242e359a3ad0af
Reviewed-on: https://go-review.googlesource.com/c/153841
Reviewed-by: Robert Griesemer <gri@golang.org>
2019-02-28 22:50:08 +00:00