1
0
mirror of https://github.com/golang/go synced 2024-11-22 20:40:03 -07:00
Commit Graph

4708 Commits

Author SHA1 Message Date
Cuong Manh Le
5a96386922 test: add regression test for issue 54911
It was fixed by CL 422196, and have been already worked in unified IR.

Fixes #54911

Change-Id: Ie69044a64b296f6961e667e7661d8c4d1a24d84e
Reviewed-on: https://go-review.googlesource.com/c/go/+/428758
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2022-09-07 13:58:00 +00:00
Ian Lance Taylor
b53471a655 Revert "sync: convert Once.done to atomic type"
This reverts commit CL 427140.

Reason for revert: Comments say that done should be the first field.

Change-Id: Id131da064146b44e1182289546aeb877867e63cc
Reviewed-on: https://go-review.googlesource.com/c/go/+/428638
Run-TryBot: Ian Lance Taylor <iant@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-09-07 13:22:04 +00:00
Cuong Manh Le
c82304b712 cmd/compile: do not devirtualize defer/go calls
For defer/go calls, the function/method value are evaluated immediately.
So after devirtualizing, it may trigger a panic when implicitly deref
a nil pointer receiver, causing the program behaves unexpectedly.

It's safer to not devirtualizing defer/go calls at all.

Fixes #52072

Change-Id: I562c2860e3e577b36387dc0a12ae5077bc0766bf
Reviewed-on: https://go-review.googlesource.com/c/go/+/428495
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
2022-09-06 22:14:56 +00:00
cuiweixie
1110222bee sync: convert Once.done to atomic type
Change-Id: I49f8c764d49cabaad4d6859c219ba7220a389c1f
Reviewed-on: https://go-review.googlesource.com/c/go/+/427140
Run-TryBot: xie cui <523516579@qq.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2022-09-06 16:53:51 +00:00
Austin Clements
8be94b82ab runtime: drop function context from traceback
Currently, gentraceback tracks the closure context of the outermost
frame. This used to be important for "unstarted" calls to reflect
function stubs, where "unstarted" calls are either deferred functions
or the entry-point of a goroutine that hasn't run. Because reflect
function stubs have a dynamic argument map, we have to reach into
their closure context to fetch to map, and how to do this differs
depending on whether the function has started. This was discovered in
issue #25897.

However, as part of the register ABI, "go" and "defer" were made much
simpler, and any "go" or "defer" of a function that takes arguments or
returns results gets wrapped in a closure that provides those
arguments (and/or discards the results). Hence, we'll see that closure
instead of a direct call to a reflect stub, and can get its static
argument map without any trouble.

The one case where we may still see an unstarted reflect stub is if
the function takes no arguments and has no results, in which case the
compiler can optimize away the wrapper closure. But in this case we
know the argument map is empty: the compiler can apply this
optimization precisely because the target function has no argument
frame.

As a result, we no longer need to track the closure context during
traceback, so this CL drops all of that mechanism.

We still have to be careful about the unstarted case because we can't
reach into the function's locals frame to pull out its context
(because it has no locals frame). We double-check that in this case
we're at the function entry.

I would prefer to do this with some in-code PCDATA annotations of
where to find the dynamic argument map, but that's a lot of mechanism
to introduce for just this. It might make sense to consider this along
with #53609.

Finally, we beef up the test for this so it more reliably forces the
runtime down this path. It's fundamentally probabilistic, but this
tweak makes it better. Scheduler testing hooks (#54475) would make it
possible to write a reliable test for this.

For #54466, but it's a nice clean-up all on its own.

Change-Id: I16e4f2364ba2ea4b1fec1e27f971b06756e7b09f
Reviewed-on: https://go-review.googlesource.com/c/go/+/424254
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Auto-Submit: Austin Clements <austin@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2022-09-02 19:04:48 +00:00
Keith Randall
5b1fbfba1c cmd/compile: rewrite >>c<<c to &^(1<<c-1)
Fixes #54496

Change-Id: I3c2ed8cd55836d5b07c8cdec00d3b584885aca79
Reviewed-on: https://go-review.googlesource.com/c/go/+/424856
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Heschi Kreinick <heschi@google.com>
Run-TryBot: Martin Möhrmann <martin@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Martin Möhrmann <martin@golang.org>
2022-09-02 18:51:37 +00:00
Matthew Dempsky
34f0029a85 cmd/compile/internal/noder: allow OCONVNOP for identical iface conversions
In go.dev/cl/421821, I included a hack to force OCONVNOP back to
OCONVIFACE for conversions involving shape types and non-empty
interfaces. The comment correctly noted that this was only needed for
conversions between non-identical types, but the code was conservative
and applied to even conversions between identical types.

This CL adds an extra bool to record whether the conversion is between
identical types, so we can keep OCONVNOP instead of forcing back to
OCONVIFACE. This has a small improvement to generated code, because we
no longer need a convI2I call (as demonstrated by codegen/ifaces.go).

But more usefully, this is relevant to pruning unnecessary itab slots
in runtime dictionaries (next CL).

Change-Id: I94f89e961cd26629b925037fea58d283140766ff
Reviewed-on: https://go-review.googlesource.com/c/go/+/427678
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Keith Randall <khr@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-09-02 18:26:02 +00:00
Derek Parker
6605686e3b cmd/compile: new inline heuristic for struct compares
This CL changes the heuristic used to determine whether we can inline a
struct equality check or if we must generate a function and call that
function for equality.

The old method was to count struct fields, but this can lead to poor
in lining decisions. We should really be determining the cost of the
equality check and use that to determine if we should inline or generate
a function.

The new benchmark provided in this CL returns the following when compared
against tip:

```
name         old time/op  new time/op  delta
EqStruct-32  2.46ns ± 4%  0.25ns ±10%  -89.72%  (p=0.000 n=39+39)
```

Fixes #38494

Change-Id: Ie06b80a2b2a03a3fd0978bcaf7715f9afb66e0ab
GitHub-Last-Rev: e9a18d9389
GitHub-Pull-Request: golang/go#53326
Reviewed-on: https://go-review.googlesource.com/c/go/+/411674
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
Auto-Submit: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Heschi Kreinick <heschi@google.com>
2022-09-02 17:48:22 +00:00
ruinan
121344ac33 cmd/compile: optimize RotateLeft8/16 on arm64
This CL optimizes RotateLeft8/16 on arm64.

For 16 bits, we form a 32 bits register by duplicating two 16 bits
registers, then use RORW instruction to do the rotate shift.

For 8 bits, we just use LSR and LSL instead of RORW because the code is
simpler.

Benchmark          Old          ThisCL       delta
RotateLeft8-46     2.16 ns/op   1.73 ns/op   -19.70%
RotateLeft16-46    2.16 ns/op   1.54 ns/op   -28.53%

Change-Id: I09cde4383d12e31876a57f8cdfd3bb4f324fadb0
Reviewed-on: https://go-review.googlesource.com/c/go/+/420976
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Heschi Kreinick <heschi@google.com>
Run-TryBot: Keith Randall <khr@golang.org>
2022-09-02 17:46:31 +00:00
Cuong Manh Le
f45c2d7e47 go/types,types2: move notinheap tests to fixedbugs directory
So they can be added to ignored list, since the tests now require
cgo.Incomplete, which is not recognized by go/types and types2.

Updates #46731

Change-Id: I9f24e3c8605424d1f5f42ae4409437198f4c1326
Reviewed-on: https://go-review.googlesource.com/c/go/+/427142
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Heschi Kreinick <heschi@google.com>
2022-09-02 17:46:15 +00:00
ruinan
54c7bc9cff cmd/compile: optimize shift ops on arm64 when the shift value is v&63
For the following code case:

  var x uint64
  x >> (shift & 63)

We can directly genereta `x >> shift` on arm64, since the hardware will
only use the bottom 6 bits of the shift amount.

Benchmark               old time/op  new time/op    delta
ShiftArithmeticRight-8  0.40ns       0.31ns        -21.7%

Change-Id: Id58c8a5b2f6dd5c30c3876f4a36e11b4d81e2dc9
Reviewed-on: https://go-review.googlesource.com/c/go/+/425294
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Keith Randall <khr@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Heschi Kreinick <heschi@google.com>
2022-09-02 17:44:29 +00:00
Cherry Mui
321a220d50 cmd/link: only add dummy XCOFF reference if the symbol exists
On AIX when external linking, for some symbols we need to add
dummy references to prevent the external linker from discarding
them. Currently we add the reference unconditionally. But if the
symbol doesn't exist, the linking fails in a later stage for
generating external relocation of a nonexistent symbol. The
symbols are special symbols that almost always exist, except that
go:buildid may not exist if the linker is invoked without the
-buildid flag. The go command invokes the linker with the flag, so
this can only happen with manual linker invocation. Specifically,
test/run.go does this in some cases.

Fix this by checking the symbol existence before adding the
reference. Re-enable tests on AIX.

Perhaps the linker should always emit a dummy buildid even if the
flag is not set...

Fixes #54814.

Change-Id: I43d81587151595309e189e38960cbda9a1c5ca32
Reviewed-on: https://go-review.googlesource.com/c/go/+/427620
Run-TryBot: Cherry Mui <cherryyz@google.com>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Than McIntosh <thanm@google.com>
2022-09-02 15:27:18 +00:00
Cuong Manh Le
ec2ea40b31 cmd/compile: restrict //go:notinheap to runtime/internal/sys
So it won't be visible outside of runtime package. There are changes to
make tests happy:

 - For test/directive*.go files, using "go:noinline" for testing misplaced
 directives instead.
 - Restrict test/fixedbugs/bug515.go for gccgo only.
 - For test/notinheap{2,3}.go, using runtime/cgo.Incomplete for marking
 the type as not-in-heap. Though it's somewhat clumsy, it's the easiest
 way to keep the test errors for not-in-heap types until we can cleanup
 further.
 - test/typeparam/mdempsky/11.go is about defined type in user code marked
 as go:notinheap, which can't happen after this CL, though.

Fixes #46731

Change-Id: I869f5b2230c8a2a363feeec042e7723bbc416e8e
Reviewed-on: https://go-review.googlesource.com/c/go/+/421882
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Joedian Reid <joedian@golang.org>
Reviewed-by: David Chase <drchase@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2022-09-02 06:22:23 +00:00
Robert Griesemer
489f508ccf cmd/compile: avoid "not used" errors due to bad go/defer statements
The syntax for go and defer specifies an arbitrary expression, not
a call; the call requirement is spelled out in prose. Don't to the
call check in the parser; instead move it to the type checker. This
is simpler and also allows the type checker to check expressions that
are not calls, and avoid "not used" errors due to such expressions.

We would like to make the same change in go/parser and go/types
but the change requires Go/DeferStmt nodes to hold an ast.Expr
rather than an *ast.CallExpr. We cannot change that for backward-
compatibility reasons. Since we don't test this behavior for the
type checkers alone (only for the compiler), we get away with it
for now.

Follow-up on CL 425675 which introduced the extra errors in the
first place.

Change-Id: I90890b3079d249bdeeb76d5673246ba44bec1a7b
Reviewed-on: https://go-review.googlesource.com/c/go/+/425794
Reviewed-by: Robert Griesemer <gri@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Robert Griesemer <gri@google.com>
Auto-Submit: Robert Griesemer <gri@google.com>
Reviewed-by: Alan Donovan <adonovan@google.com>
2022-09-01 23:17:52 +00:00
Robert Griesemer
aa5ff29dab go/parser: adjustments to error messages
- Use "expected X" rather then "expecting X".
- Report a better error when a type argument list is expected.
- Adjust various tests.

For #54511.

Change-Id: I0c5ca66ecbbdcae1a8f67377682aae6b0b6ab89a
Reviewed-on: https://go-review.googlesource.com/c/go/+/425734
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Alan Donovan <adonovan@google.com>
Run-TryBot: Robert Griesemer <gri@google.com>
Reviewed-by: Robert Griesemer <gri@google.com>
Auto-Submit: Robert Griesemer <gri@google.com>
2022-09-01 22:37:04 +00:00
Robert Griesemer
c801e4b10f cmd/compile/internal/syntax: use BadExpr instead of fake CallExpr in bad go/defer
If the go/defer syntax is bad, using a fake CallExpr may produce
a follow-on error in the type checker. Instead store a BadExpr
in the syntax tree (since an error has already been reported).

Adjust various tests.

For #54511.

Change-Id: Ib2d25f8eab7d5745275188d83d11620cad6ef47c
Reviewed-on: https://go-review.googlesource.com/c/go/+/425675
Reviewed-by: Alan Donovan <adonovan@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Robert Griesemer <gri@google.com>
Auto-Submit: Robert Griesemer <gri@google.com>
Reviewed-by: Robert Griesemer <gri@google.com>
2022-09-01 22:37:03 +00:00
Cuong Manh Le
b5b2cf519f go/types,types2: exclude tests that need cgo.Incomplete
Since when go/types,types2 do not know about build constraints, and
runtime/cgo.Incomplete is only available on platforms that support cgo.

These tests are also failing on aix with failure from linker, so disable
them on aix to make builder green. The fix for aix is tracked in #54814

Updates #46731
Updates #54814

Change-Id: I5d6f6e29a8196efc6c457ea64525350fc6b20309
Reviewed-on: https://go-review.googlesource.com/c/go/+/427394
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
2022-09-01 15:27:07 +00:00
Cuong Manh Le
64b260dbde test: use cgo.Incomplete instead of go:notinheap for "run" tests
Same as CL 421880, but for test directory.

Updates #46731

Change-Id: If8d18df013a6833adcbd40acc1a721bbc23ca6b2
Reviewed-on: https://go-review.googlesource.com/c/go/+/421881
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Heschi Kreinick <heschi@google.com>
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Keith Randall <khr@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-09-01 00:42:27 +00:00
Matthew Dempsky
ca634fa2c5 cmd/compile: reject not-in-heap types as type arguments
After running the types2 type checker, walk info.Instances to reject
any not-in-heap type arguments. This is feasible to check using the
types2 API now, thanks to #46731.

Fixes #54765.

Change-Id: Idd2acc124d102d5a76f128f13c21a6e593b6790b
Reviewed-on: https://go-review.googlesource.com/c/go/+/427235
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Matthew Dempsky <mdempsky@google.com>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Keith Randall <khr@google.com>
2022-08-31 23:52:00 +00:00
Keith Randall
33a7e5a4b4 cmd/compile: combine multiple rotate instructions
Rotating by c, then by d, is the same as rotating by c+d.

Change-Id: I36df82261460ff80f7c6d39bcdf0e840cef1c91a
Reviewed-on: https://go-review.googlesource.com/c/go/+/424894
Reviewed-by: Wayne Zuo <wdvxdr@golangcn.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Ruinan Sun <Ruinan.Sun@arm.com>
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Heschi Kreinick <heschi@google.com>
2022-08-31 22:10:52 +00:00
Keith Randall
69aed4712d cmd/compile: use better splitting condition for string binary search
Currently we use a full cmpstring to do the comparison for each
split in the binary search for a string switch.

Instead, split by comparing a single byte of the input string with a
constant. That will give us a much faster split (although it might be
not quite as good a split).

Fixes #53333

R=go1.20

Change-Id: I28c7209342314f367071e4aa1f2beb6ec9ff7123
Reviewed-on: https://go-review.googlesource.com/c/go/+/414894
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Heschi Kreinick <heschi@google.com>
2022-08-31 22:08:26 +00:00
Keith Randall
af7f067e0d cmd/compile: tighten bounds for induction variables in strided loops
for i := 0; i < 9; i += 3

Currently we compute bounds of [0,8]. Really we know that it is [0,6].

CL 415874 computed the better bound as part of overflow detection.
This CL just incorporates that better info to the prove pass.

R=go1.20

Change-Id: Ife82cc415321f6652c2b5d132a40ec23e3385766
Reviewed-on: https://go-review.googlesource.com/c/go/+/415937
Reviewed-by: Heschi Kreinick <heschi@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Keith Randall <khr@golang.org>
2022-08-31 22:04:55 +00:00
Wayne Zuo
da6556968f cmd/compile: simplify bounded shift on riscv64
The prove pass will mark some shifts bounded, and then we can use that
information to generate better code on riscv64.

Change-Id: Ia22f43d0598453c9417adac7017db28d7240948b
Reviewed-on: https://go-review.googlesource.com/c/go/+/422616
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Joel Sing <joel@sing.id.au>
Run-TryBot: Wayne Zuo <wdvxdr@golangcn.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Keith Randall <khr@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2022-08-31 20:21:00 +00:00
cuiweixie
c708532936 cmd/compile: add support for unsafe.{String,StringData,SliceData}
For #53003

Change-Id: I13a761daca8b433b271a1feb711c103d9820772d
Reviewed-on: https://go-review.googlesource.com/c/go/+/423774
Reviewed-by: Heschi Kreinick <heschi@google.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: hopehook <hopehook@golangcn.org>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-08-31 17:15:15 +00:00
Wayne Zuo
3680b5e9c4 cmd/compile: teach prove about bitwise OR operation
Fixes #45928.

Change-Id: Ifbb0effbca4ab7c0eb56069fee40edb564553c35
Reviewed-on: https://go-review.googlesource.com/c/go/+/410336
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Run-TryBot: Wayne Zuo <wdvxdr@golangcn.org>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Than McIntosh <thanm@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-08-31 09:35:45 +00:00
Wayne Zuo
d2e0587f77 cmd/compile: derive relation between x+delta and x in prove
If x+delta cannot overflow/underflow, we can derive:
  x+delta < x if delta<0 (this CL included)
  x+delta > x if delta>0 (this CL not included due to
  a recursive stack overflow)

Remove 95 bounds checks during ./make.bat

Fixes #51622

Change-Id: I60d9bd84c5d7e81bbf808508afd09be596644f09
Reviewed-on: https://go-review.googlesource.com/c/go/+/406175
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Wayne Zuo <wdvxdr@golangcn.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Heschi Kreinick <heschi@google.com>
2022-08-31 09:35:10 +00:00
Cuong Manh Le
f21514c7f8 cmd/compile: only inline method wrapper if method don't contain closures
CL 327871 changes methodWrapper to always perform inlining after global
escape analysis. However, inlining the method may reveal closures, which
require walking all function bodies to decide whether to capture free
variables by value or by ref.

To fix it, just not doing inline if the method contains any closures.

Fixes #53702

Change-Id: I4b0255b86257cc6fe7e5fafbc545cc5cff9113e1
Reviewed-on: https://go-review.googlesource.com/c/go/+/426334
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Heschi Kreinick <heschi@google.com>
2022-08-30 18:02:22 +00:00
Cuong Manh Le
ddc93a536f cmd/compile: fix unified IR shapifying recursive instantiated types
Shape-based stenciling in unified IR is done by converting type argument
to its underlying type. So it agressively check that type argument is
not a TFORW. However, for recursive instantiated type argument, it may
still be a TFORW when shapifying happens. Thus the assertion failed,
causing the compiler crashing.

To fix it, just allow fully instantiated type when shapifying.

Fixes #54512
Fixes #54722

Change-Id: I527e3fd696388c8a37454e738f0324f0c2ec16cb
Reviewed-on: https://go-review.googlesource.com/c/go/+/426335
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Heschi Kreinick <heschi@google.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2022-08-30 17:23:27 +00:00
Wayne Zuo
e8f0340fa4 cmd/compile: intrinsify RotateLeft{32,64} on loong64
Benchmark on crypto/sha256 (provided by Xiaodong Liu):
name               old time/op    new time/op    delta
Hash8Bytes/New       1.19µs ± 0%    0.97µs ± 0%  -18.75%  (p=0.000 n=9+9)
Hash8Bytes/Sum224    1.21µs ± 0%    0.97µs ± 0%  -20.04%  (p=0.000 n=9+10)
Hash8Bytes/Sum256    1.21µs ± 0%    0.98µs ± 0%  -19.16%  (p=0.000 n=10+7)
Hash1K/New           15.9µs ± 0%    12.4µs ± 0%  -22.10%  (p=0.000 n=10+10)
Hash1K/Sum224        15.9µs ± 0%    12.4µs ± 0%  -22.18%  (p=0.000 n=8+10)
Hash1K/Sum256        15.9µs ± 0%    12.4µs ± 0%  -22.15%  (p=0.000 n=10+9)
Hash8K/New            119µs ± 0%      92µs ± 0%  -22.40%  (p=0.000 n=10+9)
Hash8K/Sum224         119µs ± 0%      92µs ± 0%  -22.41%  (p=0.000 n=9+10)
Hash8K/Sum256         119µs ± 0%      92µs ± 0%  -22.40%  (p=0.000 n=9+9)

name               old speed      new speed      delta
Hash8Bytes/New     6.70MB/s ± 0%  8.25MB/s ± 0%  +23.13%  (p=0.000 n=10+10)
Hash8Bytes/Sum224  6.60MB/s ± 0%  8.26MB/s ± 0%  +25.06%  (p=0.000 n=10+10)
Hash8Bytes/Sum256  6.59MB/s ± 0%  8.15MB/s ± 0%  +23.67%  (p=0.000 n=10+7)
Hash1K/New         64.3MB/s ± 0%  82.5MB/s ± 0%  +28.36%  (p=0.000 n=10+10)
Hash1K/Sum224      64.3MB/s ± 0%  82.6MB/s ± 0%  +28.51%  (p=0.000 n=10+10)
Hash1K/Sum256      64.3MB/s ± 0%  82.6MB/s ± 0%  +28.46%  (p=0.000 n=9+9)
Hash8K/New         69.0MB/s ± 0%  89.0MB/s ± 0%  +28.87%  (p=0.000 n=10+8)
Hash8K/Sum224      69.0MB/s ± 0%  89.0MB/s ± 0%  +28.88%  (p=0.000 n=9+10)
Hash8K/Sum256      69.0MB/s ± 0%  88.9MB/s ± 0%  +28.87%  (p=0.000 n=8+9)

Benchmark on crypto/sha512 (provided by Xiaodong Liu):
name               old time/op    new time/op     delta
Hash8Bytes/New       1.55µs ± 0%     1.31µs ± 0%  -15.67%  (p=0.000 n=10+10)
Hash8Bytes/Sum384    1.59µs ± 0%     1.35µs ± 0%  -14.97%  (p=0.000 n=10+10)
Hash8Bytes/Sum512    1.62µs ± 0%     1.39µs ± 0%  -14.02%  (p=0.000 n=10+10)
Hash1K/New           10.7µs ± 0%      8.6µs ± 0%  -19.60%  (p=0.000 n=8+8)
Hash1K/Sum384        10.8µs ± 0%      8.7µs ± 0%  -19.40%  (p=0.000 n=9+9)
Hash1K/Sum512        10.8µs ± 0%      8.7µs ± 0%  -19.35%  (p=0.000 n=9+10)
Hash8K/New           74.6µs ± 0%     59.6µs ± 0%  -20.08%  (p=0.000 n=10+9)
Hash8K/Sum384        74.7µs ± 0%     59.7µs ± 0%  -20.04%  (p=0.000 n=9+8)
Hash8K/Sum512        74.7µs ± 0%     59.7µs ± 0%  -20.01%  (p=0.000 n=10+10)

name               old speed      new speed       delta
Hash8Bytes/New     5.16MB/s ± 0%   6.12MB/s ± 0%  +18.60%  (p=0.000 n=10+8)
Hash8Bytes/Sum384  5.02MB/s ± 0%   5.90MB/s ± 0%  +17.56%  (p=0.000 n=10+10)
Hash8Bytes/Sum512  4.94MB/s ± 0%   5.74MB/s ± 0%  +16.29%  (p=0.000 n=10+9)
Hash1K/New         95.4MB/s ± 0%  118.6MB/s ± 0%  +24.38%  (p=0.000 n=10+10)
Hash1K/Sum384      95.0MB/s ± 0%  117.9MB/s ± 0%  +24.06%  (p=0.000 n=8+9)
Hash1K/Sum512      94.8MB/s ± 0%  117.5MB/s ± 0%  +23.99%  (p=0.000 n=8+9)
Hash8K/New          110MB/s ± 0%    137MB/s ± 0%  +25.11%  (p=0.000 n=9+6)
Hash8K/Sum384       110MB/s ± 0%    137MB/s ± 0%  +25.07%  (p=0.000 n=9+8)
Hash8K/Sum512       110MB/s ± 0%    137MB/s ± 0%  +25.01%  (p=0.000 n=10+10)

Change-Id: I28ccfce634659305a336c8e0a3f8589f7361d661
Reviewed-on: https://go-review.googlesource.com/c/go/+/422317
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Wayne Zuo <wdvxdr@golangcn.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: David Chase <drchase@google.com>
2022-08-30 03:21:44 +00:00
Wayne Zuo
a6219737e3 cmd/compile: intrinsify Sub64 on riscv64
After this CL, the performance difference in crypto/elliptic
benchmarks on linux/riscv64 are:

name                 old time/op    new time/op    delta
ScalarBaseMult/P256    1.64ms ± 1%    1.60ms ± 1%   -2.36%  (p=0.008 n=5+5)
ScalarBaseMult/P224    1.53ms ± 1%    1.47ms ± 2%   -4.24%  (p=0.008 n=5+5)
ScalarBaseMult/P384    5.12ms ± 2%    5.03ms ± 2%     ~     (p=0.095 n=5+5)
ScalarBaseMult/P521    22.3ms ± 2%    13.8ms ± 1%  -37.89%  (p=0.008 n=5+5)
ScalarMult/P256        4.49ms ± 2%    4.26ms ± 2%   -5.13%  (p=0.008 n=5+5)
ScalarMult/P224        4.33ms ± 1%    4.09ms ± 1%   -5.59%  (p=0.008 n=5+5)
ScalarMult/P384        16.3ms ± 1%    15.5ms ± 2%   -4.78%  (p=0.008 n=5+5)
ScalarMult/P521         101ms ± 0%      47ms ± 2%  -53.36%  (p=0.008 n=5+5)

Change-Id: I31cf0506e27f9d85f576af1813630a19c20dda8a
Reviewed-on: https://go-review.googlesource.com/c/go/+/420095
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Joel Sing <joel@sing.id.au>
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Wayne Zuo <wdvxdr@golangcn.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-08-27 05:43:59 +00:00
Wayne Zuo
969f48a3a2 cmd/compile: intrinsify Add64 on riscv64
According to RISCV instruction set manual v2.2 Sec 2.4, we can
implement overflowing check for unsigned addition cheaply using
SLTU instructions.

After this CL, the performance difference in crypto/elliptic
benchmarks on linux/riscv64 are:

name                 old time/op    new time/op    delta
ScalarBaseMult/P256    1.93ms ± 1%    1.64ms ± 1%  -14.96%  (p=0.008 n=5+5)
ScalarBaseMult/P224    1.80ms ± 2%    1.53ms ± 1%  -14.89%  (p=0.008 n=5+5)
ScalarBaseMult/P384    6.15ms ± 2%    5.12ms ± 2%  -16.73%  (p=0.008 n=5+5)
ScalarBaseMult/P521    25.9ms ± 1%    22.3ms ± 2%  -13.78%  (p=0.008 n=5+5)
ScalarMult/P256        5.59ms ± 1%    4.49ms ± 2%  -19.79%  (p=0.008 n=5+5)
ScalarMult/P224        5.42ms ± 1%    4.33ms ± 1%  -20.01%  (p=0.008 n=5+5)
ScalarMult/P384        19.9ms ± 2%    16.3ms ± 1%  -18.15%  (p=0.008 n=5+5)
ScalarMult/P521        97.3ms ± 1%   100.7ms ± 0%   +3.48%  (p=0.008 n=5+5)

Change-Id: Ic4c82ced4b072a4a6575343fa9f29dd09b0cabc4
Reviewed-on: https://go-review.googlesource.com/c/go/+/420094
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Wayne Zuo <wdvxdr@golangcn.org>
Reviewed-by: Joel Sing <joel@sing.id.au>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-08-27 05:43:32 +00:00
Cherry Mui
1211a62bdc cmd/compile: align stack offset to alignment larger than PtrSize
In typebits.Set we check that the offset is a multiple of the
alignment, which makes perfect sense. But for values like
atomic.Int64, which has 8-byte alignment even on 32-bit platforms
(i.e. the alignment is larger than PtrSize), if it is on stack it
may be under-aligned, as the stack frame is only PtrSize aligned.

Normally we would prevent such values on stack, as the escape
analysis force values with higher alignment to heap. But for a
composite literal assignment like x = AlignedType{...}, the
compiler creates an autotmp for the RHS then copies it to the LHS.
The autotmp is on stack and may be under-aligned. Currently this
may cause an ICE in the typebits.Set check.

This CL makes it align the _offset_ of the autotmp to 8 bytes,
which satisfies the check. Note that this is actually lying: the
actual address at run time may not necessarily be 8-byte
aligned as we only align SP to 4 bytes.

The under-alignment is probably okay. The only purpose for the
autotmp is to copy the value to the LHS, and the copying code we
generate (at least currently) doesn't care the alignment beyond
stack alignment.

Fixes #54638.

Change-Id: I13c16afde2eea017479ff11dfc24092bcb8aba6a
Reviewed-on: https://go-review.googlesource.com/c/go/+/425256
Run-TryBot: Cherry Mui <cherryyz@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: David Chase <drchase@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-08-26 15:24:31 +00:00
Matthew Dempsky
6a801d3082 cmd/compile/internal/noder: fix inlined function literal positions
When inlining function calls, we rewrite the position information on
all of the nodes to keep track of the inlining context. This is
necessary so that at runtime, we can synthesize additional stack
frames so that the inlining is transparent to the user.

However, for function literals, we *don't* want to apply this
rewriting to the underlying function. Because within the function
literal (when it's not itself inlined), the inlining context (if any)
will have already be available at the caller PC instead.

Unified IR was already getting this right in the case of user-written
statements within the function literal, which is what the unit test
for #46234 tested. However, it was still using inline-adjusted
positions for the function declaration and its parameters, which
occasionally end up getting used for generated code (e.g., loading
captured values from the closure record).

I've manually verified that this fixes the hang in
https://go.dev/play/p/avQ0qgRzOgt, and spot-checked the
-d=pctab=pctoinline output for kube-apiserver and kubelet and they
seem better.

However, I'm still working on a more robust test for this (hence
"Updates" not "Fixes") and internal assertions to verify that we're
emitting correct inline trees. In particular, there are still other
cases (even in the non-unified frontend) where we're producing
corrupt (but at least acyclic) inline trees.

Updates #54625.

Change-Id: Iacfd2e1eb06ae8dc299c0679f377461d3d46c15a
Reviewed-on: https://go-review.googlesource.com/c/go/+/425395
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Auto-Submit: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2022-08-25 18:46:22 +00:00
Wayne Zuo
b60432df14 cmd/compile: deadcode for LoweredMuluhilo on riscv64
This is a follow up of CL 425101 on RISCV64.

According to RISCV Volume 1, Unprivileged Spec v. 20191213 Chapter 7.1:
If both the high and low bits of the same product are required, then the
recommended code sequence is: MULH[[S]U] rdh, rs1, rs2; MUL rdl, rs1, rs2
(source register specifiers must be in same order and rdh cannot be the
same as rs1 or rs2). Microarchitectures can then fuse these into a single
multiply operation instead of performing two separate multiplies.

So we should not split Muluhilo to separate instructions.

Updates #54607

Change-Id: If47461f3aaaf00e27cd583a9990e144fb8bcdb17
Reviewed-on: https://go-review.googlesource.com/c/go/+/425203
Auto-Submit: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Wayne Zuo <wdvxdr@golangcn.org>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2022-08-24 18:08:33 +00:00
Matthew Dempsky
f983a9340d cmd/compile: defer transitive inlining until after AST is edited
This CL changes the inliner to process transitive inlining iteratively
after the AST has actually been edited, rather than recursively and
immediately. This is important for handling indirect function calls
correctly, because ir.reassigned walks the function body looking for
reassignments; whereas previously the inlined reassignments might not
have been actually added to the AST yet.

Fixes #54632.

Change-Id: I0dd69813c8a70b965174e0072335bc00afedf286
Reviewed-on: https://go-review.googlesource.com/c/go/+/425257
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: David Chase <drchase@google.com>
2022-08-24 14:31:08 +00:00
Keith Randall
60ad3c48f5 cmd/compile: move SSA rotate instruction detection to arch-independent rules
Detect rotate instructions while still in architecture-independent form.
It's easier to do here, and we don't need to repeat it in each
architecture file.

Change-Id: I9396954b3f3b3bfb96c160d064a02002309935bb
Reviewed-on: https://go-review.googlesource.com/c/go/+/421195
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Eric Fang <eric.fang@arm.com>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Joedian Reid <joedian@golang.org>
Reviewed-by: Ruinan Sun <Ruinan.Sun@arm.com>
Run-TryBot: Keith Randall <khr@golang.org>
2022-08-23 21:24:14 +00:00
Keith Randall
332a5981d0 cmd/compile: handle partially overlapping assignments
Normally, when moving Go values of type T from one location to another,
we don't need to worry about partial overlaps. The two Ts must either be
in disjoint (nonoverlapping) memory or in exactly the same location.
There are 2 cases where this isn't true:
 1) Using unsafe you can arrange partial overlaps.
 2) Since Go 1.17, you can use a cast from a slice to a ptr-to-array.
    https://go.dev/ref/spec#Conversions_from_slice_to_array_pointer
    This feature can be used to construct partial overlaps of array types.
      var a [3]int
      p := (*[2]int)(a[:])
      q := (*[2]int)(a[1:])
      *p = *q
We don't care about solving 1. Or at least, we haven't historically
and no one has complained.
For 2, we need to ensure that if there might be partial overlap,
then we can't use OpMove; we must use memmove instead.
(memmove handles partial overlap by copying in the correct
direction. OpMove does not.)

Note that we have to be careful here not to introduce a call when
we're marshaling arguments to a call or unmarshaling results from a call.

Fixes #54467

Change-Id: I1ca6aba8041576849c1d85f1fa33ae61b80a373d
Reviewed-on: https://go-review.googlesource.com/c/go/+/425076
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
2022-08-23 19:56:32 +00:00
Matthew Dempsky
503de697cb Revert "cmd/compile: restore test/nested.go test cases"
This reverts CL 424854.

Reason for revert: broke misc/cgo/stdio.TestTestRun on several builders.

Will re-land after CL 421879 is submitted.

Change-Id: I2548c70d33d7c178cc71c1d491cd81c22660348f
Reviewed-on: https://go-review.googlesource.com/c/go/+/425214
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Bryan Mills <bcmills@google.com>
Auto-Submit: Matthew Dempsky <mdempsky@google.com>
2022-08-23 19:37:34 +00:00
Matthew Dempsky
6985ab27df cmd/compile: fix unified IR's pointer-shaping
In CL 424734, I implemented pointer shaping for unified IR. Evidently
though, we didn't have any test cases that check that uses of
pointer-shaped expressions were handled correctly.

In the reported test case, the struct field "children items[*node[T]]"
gets shaped to "children items[go.shape.*uint8]" (underlying type
"[]go.shape.*uint8"); and so the expression "n.children[i]" has type
"go.shape.*uint8" and the ".items" field selection expression fails.

The fix implemented in this CL is that any expression of derived type
now gets an explicit "reshape" operation applied to it, to ensure it
has the appropriate type for its context. E.g., the "n.children[i]"
OINDEX expression above gets "reshaped" from "go.shape.*uint8" to
"*node[go.shape.int]", allowing the field selection to succeed.

This CL also adds a "-d=reshape" compiler debugging flag, because I
anticipate debugging reshaping operations will be something to come up
again in the future.

Fixes #54535.

Change-Id: Id847bd8f51300d2491d679505ee4d2e974ca972a
Reviewed-on: https://go-review.googlesource.com/c/go/+/424936
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: hopehook <hopehook@qq.com>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-08-23 18:14:10 +00:00
Matthew Dempsky
0a6e1fa986 cmd/compile: fix "expression has untyped type" ICE in generic code
During walk, we sometimes desugar OEQ nodes into multiple "untyped
bool" expressions, and then use typecheck.Conv to convert back to the
original OEQ node's type.

However, typecheck.Conv had a short-circuit path that if the type is
already identical to the target type according to types.Identical,
then we skipped the conversion. This short-circuit is normally fine;
but with generic code and shape types, it considers "untyped bool" and
"go.shape.bool" to be identical types. And we could end up leaving an
expression of "untyped bool", which then fails an internal consistency
check later.

The simple fix is to change Conv to use types.IdenticalStrict, so that
we ensure "untyped bool" gets converted to "go.shape.bool". And for
good measure, make the same change to ConvNop.

This issue was discovered and reported against unified IR, but the
issue was latent within the non-unified frontend too.

Fixes #54537.

Change-Id: I7559a346b063349b35749e8a2da704be18e51654
Reviewed-on: https://go-review.googlesource.com/c/go/+/424937
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: David Chase <drchase@google.com>
2022-08-23 18:13:59 +00:00
Matthew Dempsky
aa6a7fa775 cmd/compile: fix reflect naming of local generic types
To disambiguate local types, we append a "·N" suffix to their name and
then trim it off again when producing their runtime type descriptors.

However, if a local type is generic, then we were further appending
the type arguments after this suffix, and the code in types/fmt.go
responsible for trimming didn't know to handle this.

We could extend the types/fmt.go code to look for the "·N" suffix
elsewhere in the type name, but this is risky because it could
legitimately (albeit unlikely) appear in struct field tags.

Instead, the most robust solution is to just change the mangling logic
to keep the "·N" suffix at the end, where types/fmt.go can easily and
reliably trim it.

Note: the "·N" suffix is still visible within the type arguments
list (e.g., the "·3" suffixes in nested.out), because we currently use
the link strings in the type arguments list.

Fixes #54456.

Change-Id: Ie9beaf7e5330982f539bff57b8d48868a3674a37
Reviewed-on: https://go-review.googlesource.com/c/go/+/424901
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Than McIntosh <thanm@google.com>
2022-08-23 18:13:48 +00:00
Matthew Dempsky
72a76ca1f9 cmd/compile: restore test/nested.go test cases
When handling a type declaration like:

```
type B A
```

unified IR has been writing out that B's underlying type is A, rather
than the underlying type of A.

This is a bit awkward to implement and adds complexity to importers,
who need to handle resolving the underlying type themselves. But it
was necessary to handle when A was declared like:

```
//go:notinheap
type A int
```

Because we expected A's not-in-heap'ness to be conferred to B, which
required knowing that A was on the path from B to its actual
underlying type int.

However, since #46731 was accepted, we no longer need to support this
case. Instead we can write out B's actual underlying type.

One stumbling point though is the existing code for exporting
interfaces doesn't work for the underlying type of `comparable`, which
is now needed to implement `type C comparable`. As a bit of a hack, we
we instead export its underlying type as `interface{ comparable }`.

Fixes #54512.

Change-Id: I0fb892068d656f1e87bb8ef97da27756051126d5
Reviewed-on: https://go-review.googlesource.com/c/go/+/424854
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2022-08-23 18:13:38 +00:00
eric fang
9f0f87c806 cmd/internal/obj/arm64: remove the transition from $0 to ZR
Previously we convert $0 to the ZR register for some reasons, which causes
two problems:
1. Confusion, the special case of the ZR register needs to be considered
when dealing with constants. For encoding, some places we encode ZR, and
some places we encode $0, although we have converted $0 to ZR.
2. Unexpected instruction format. All instructions that support ZR register
operands can be replaced by $0.

This patch removes this conversion. Note that this patch may cause previously
unintendedly supported instruction formats to no longer be supported.

Change-Id: I3d8d2c06711b7614a38191397da7776417f1861c
Reviewed-on: https://go-review.googlesource.com/c/go/+/404316
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Eric Fang <eric.fang@arm.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-08-23 06:11:32 +00:00
eric fang
0a52d80666 cmd/compile/internal/ssa: optimize memory moving on arm64
This CL optimizes memory moving with LDP and STP on arm64.

Benchmarks:
name              old time/op  new time/op  delta
ClearFat7-160     1.08ns ± 0%  0.95ns ± 0%  -11.41%  (p=0.008 n=5+5)
ClearFat8-160     0.84ns ± 0%  0.84ns ± 0%   -0.95%  (p=0.008 n=5+5)
ClearFat11-160    1.08ns ± 0%  0.95ns ± 0%  -11.46%  (p=0.008 n=5+5)
ClearFat12-160    0.95ns ± 0%  0.95ns ± 0%     ~     (p=0.063 n=4+5)
ClearFat13-160    1.08ns ± 0%  0.95ns ± 0%  -11.45%  (p=0.008 n=5+5)
ClearFat14-160    1.08ns ± 0%  0.95ns ± 0%  -11.47%  (p=0.008 n=5+5)
ClearFat15-160    1.24ns ± 0%  0.95ns ± 0%  -22.98%  (p=0.029 n=4+4)
ClearFat16-160    0.84ns ± 0%  0.83ns ± 0%   -0.11%  (p=0.008 n=5+5)
ClearFat24-160    2.15ns ± 0%  2.15ns ± 0%     ~     (all equal)
ClearFat32-160    2.86ns ± 0%  2.86ns ± 0%     ~     (p=0.333 n=5+4)
ClearFat40-160    2.15ns ± 0%  2.15ns ± 0%     ~     (all equal)
ClearFat48-160    3.32ns ± 1%  3.31ns ± 1%     ~     (p=0.690 n=5+5)
ClearFat56-160    2.15ns ± 0%  2.15ns ± 0%     ~     (all equal)
ClearFat64-160    3.25ns ± 1%  3.26ns ± 1%     ~     (p=0.841 n=5+5)
ClearFat72-160    2.22ns ± 0%  2.22ns ± 0%     ~     (p=0.444 n=5+5)
ClearFat128-160   4.03ns ± 0%  4.04ns ± 0%   +0.32%  (p=0.008 n=5+5)
ClearFat256-160   6.44ns ± 0%  6.44ns ± 0%   +0.08%  (p=0.016 n=4+5)
ClearFat512-160   12.2ns ± 0%  12.2ns ± 0%   +0.13%  (p=0.008 n=5+5)
ClearFat1024-160  24.3ns ± 0%  24.3ns ± 0%     ~     (p=0.167 n=5+5)
ClearFat1032-160  24.5ns ± 0%  24.5ns ± 0%     ~     (p=0.238 n=4+5)
ClearFat1040-160  29.2ns ± 0%  29.3ns ± 0%   +0.34%  (p=0.008 n=5+5)
CopyFat7-160      1.43ns ± 0%  1.07ns ± 0%  -24.97%  (p=0.008 n=5+5)
CopyFat8-160      0.89ns ± 0%  0.89ns ± 0%     ~     (p=0.238 n=5+5)
CopyFat11-160     1.43ns ± 0%  1.07ns ± 0%  -24.97%  (p=0.008 n=5+5)
CopyFat12-160     1.07ns ± 0%  1.07ns ± 0%     ~     (p=0.238 n=5+4)
CopyFat13-160     1.43ns ± 0%  1.07ns ± 0%     ~     (p=0.079 n=4+5)
CopyFat14-160     1.43ns ± 0%  1.07ns ± 0%  -24.95%  (p=0.008 n=5+5)
CopyFat15-160     1.79ns ± 0%  1.07ns ± 0%     ~     (p=0.079 n=4+5)
CopyFat16-160     1.07ns ± 0%  1.07ns ± 0%     ~     (p=0.444 n=5+5)
CopyFat24-160     1.84ns ± 2%  1.67ns ± 0%   -9.28%  (p=0.008 n=5+5)
CopyFat32-160     3.22ns ± 0%  2.92ns ± 0%   -9.40%  (p=0.008 n=5+5)
CopyFat64-160     3.64ns ± 0%  3.57ns ± 0%   -1.96%  (p=0.008 n=5+5)
CopyFat72-160     3.56ns ± 0%  3.11ns ± 0%  -12.89%  (p=0.008 n=5+5)
CopyFat128-160    5.06ns ± 0%  5.06ns ± 0%   +0.04%  (p=0.048 n=5+5)
CopyFat256-160    9.13ns ± 0%  9.13ns ± 0%     ~     (p=0.659 n=5+5)
CopyFat512-160    17.4ns ± 0%  17.4ns ± 0%     ~     (p=0.167 n=5+5)
CopyFat520-160    17.2ns ± 0%  17.3ns ± 0%   +0.37%  (p=0.008 n=5+5)
CopyFat1024-160   34.1ns ± 0%  34.0ns ± 0%     ~     (p=0.127 n=5+5)
CopyFat1032-160   80.9ns ± 0%  34.2ns ± 0%  -57.74%  (p=0.008 n=5+5)
CopyFat1040-160   94.4ns ± 0%  41.7ns ± 0%  -55.78%  (p=0.016 n=5+4)

Change-Id: I14186f9f82b0ecf8b6c02191dc5da566b9a21e6c
Reviewed-on: https://go-review.googlesource.com/c/go/+/421654
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Eric Fang <eric.fang@arm.com>
Reviewed-by: Keith Randall <khr@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-08-23 03:09:07 +00:00
Cherry Mui
8bf9e01473 cmd/compile: split Muluhilo op on ARM64
On ARM64 we use two separate instructions to compute the hi and lo
results of a 64x64->128 multiplication. Lower to two separate ops
so if only one result is needed we can deadcode the other.

Fixes #54607.

Change-Id: Ib023e77eb2b2b0bcf467b45471cb8a294bce6f90
Reviewed-on: https://go-review.googlesource.com/c/go/+/425101
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
2022-08-22 21:29:31 +00:00
Ian Lance Taylor
6001c043dc test: add test that caused gccgo crash
For #23870

Change-Id: I3bbe0f751254d1354a59a88b45e6f944c7a2fb4d
Reviewed-on: https://go-review.googlesource.com/c/go/+/417874
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Than McIntosh <thanm@google.com>
2022-08-19 03:32:27 +00:00
Ian Lance Taylor
011a525b21 test: add test that caused gccgo to crash
For #23868

Change-Id: I07b001836e8d1411609ab84786398a5b575bf8d5
Reviewed-on: https://go-review.googlesource.com/c/go/+/417481
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Than McIntosh <thanm@google.com>
Reviewed-by: Benny Siegert <bsiegert@gmail.com>
2022-08-19 03:32:04 +00:00
Keith Randall
908499adec cmd/compile: stop using VARKILL
With the introduction of stack objects, VARKILL information is
no longer needed.

With stack objects, an object is dead when there are no more static
references to it, and the stack scanner can't find any live pointers
to it. VARKILL information isn't used to establish live ranges for
address-taken variables any more. In effect, the last static reference
*is* the VARKILL, and there's an additional dynamic liveness check
during stack scanning.

Next CL will actually rip out the VARKILL opcodes.

Change-Id: I030a2ab867445cf4e0e69397911f8a2e2f0ed07b
Reviewed-on: https://go-review.googlesource.com/c/go/+/419234
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Keith Randall <khr@golang.org>
2022-08-18 17:36:38 +00:00
Keith Randall
661146bc0b cmd/compile: don't use OFORUNTIL when implementing range loops
We don't need this special loop construct anymore now that we do
conservative GC scanning of the top of stack. Rewrite instead to a simple
pointer increment on every iteration. This leads to having a potential
past-the-end pointer at the end of the last iteration, but that value
immediately goes dead after the loop condition fails, and the past-the-end
pointer is never live across any call.

This simplifies and speeds up loops.

R=go1.20

TODO: actually delete all support for OFORUNTIL. It is now never generated,
but code to handle it (e.g. in ssagen) is still around.

TODO: in "for _, x := range" loops, we could get rid of the index
altogether and use a "pointer to the last element" reference to determine
when the loop is complete.

Fixes #53409

Change-Id: Ifc141600ff898a8bc6a75f793e575f8862679ba1
Reviewed-on: https://go-review.googlesource.com/c/go/+/414876
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Heschi Kreinick <heschi@google.com>
Reviewed-by: Keith Randall <khr@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-08-18 17:32:44 +00:00
Matthew Dempsky
52016be3f4 cmd/compile: enable more inlining for unified IR
The non-unified frontend had repeated issues with inlining and
generics (#49309, #51909, #52907), which led us to substantially
restrict inlining when shape types were present.

However, these issues are evidently not present in unified IR's
inliner, and the safety restrictions added for the non-unified
frontend can simply be disabled in unified mode.

Fixes #54497.

Change-Id: I8e6ac9f3393c588bfaf14c6452891b9640a9d1bd
Reviewed-on: https://go-review.googlesource.com/c/go/+/424775
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
2022-08-18 17:26:40 +00:00