1
0
mirror of https://github.com/golang/go synced 2024-11-23 06:50:05 -07:00
Commit Graph

5175 Commits

Author SHA1 Message Date
Cuong Manh Le
477ad7dd51 cmd/compile: support generic alias type
Type parameters on aliases are now allowed after #46477 accepted.

Updates #46477
Fixes #68054

Change-Id: Ic2e3b6f960a898163f47666e3a6bfe43b8cc22e2
Reviewed-on: https://go-review.googlesource.com/c/go/+/593715
Reviewed-by: Robert Griesemer <gri@google.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Robert Griesemer <gri@google.com>
2024-06-20 16:46:54 +00:00
Paul E. Murphy
d5e5b14305 cmd/compile/ssa: fix (MOVWZreg (RLWINM)) folding on PPC64
RLIWNM does not clear the upper 32 bits of the target register if
the mask wraps around (e.g 0xF000000F). Don't elide MOVWZreg for
such masks. All other usage clears the upper 32 bits.

Fixes #67844.

Change-Id: I11b89f1da9ae077624369bfe2bf25e9b7c9b79bc
Reviewed-on: https://go-review.googlesource.com/c/go/+/590896
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-06-07 19:02:52 +00:00
Cherry Mui
0b72631a82 cmd/compile: generate args_stackmap for ABI0 assembly func regardless of linkname
Currently, the compiler generates the argument stack map based on
the function signature for bodyless function declarations, if it
is not linknamed. The assumption is that linknamed function is
provided by (Go code in) another package, so its args stack map
will be generated when compiling that package.

Now we have linknames added to declarations of assembly functions,
to signal that this function is accessed externally. Examples
include runtime.morestack_noctxt, math/big.addVV. In the current
implementation the compiler does not generate its args stack map.
That causes the assembly function's args stack map missing.
Instead, change it to generate the stack map if it is a
declaration of an ABI0 function, which can only be defined in
assembly and passed to the compiler through the -symabis flag. The
stack map generation currently only works with ABI0 layout anyway,
so we don't need to handle ABIInternal assembly functions.

Change-Id: Ic9da3b4854c604e64ed01584da3865994f5b95b8
Reviewed-on: https://go-review.googlesource.com/c/go/+/587928
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Than McIntosh <thanm@google.com>
2024-06-07 15:22:22 +00:00
Robert Griesemer
cc95d85fe4 cmd/compile: remove quoting in favor of clearer prose in error messages
In an attempt to address issue #65790 (confusing error messages),
quoting of names was introduced for some (but not all) names used
in error messages.

That CL solved the issue at hand at the cost of extra punctuation
(the quotes) plus some inconsistency (not all names were quoted).

This CL removes the quoting again in favor or adding a qualifying noun
(such as "name", "label", "package", "built-in" etc.) before a user-
specified name where needed.

For instance, instead of

        invalid argument to `max'

we now say

        invalid argument to built-in max

There's still a chance for confusion. For instance, before an error
might have been

        `sadly' not exported by package X

and now it would be

        name sadly not exported by package X

but adverbs (such as "sadly") seem unlikely names in programs.

This change touches a lot of files but only affects error messages.

Fixes #67685.

Change-Id: I95435b388f92cade316e2844d59ecf6953b178bc
Reviewed-on: https://go-review.googlesource.com/c/go/+/589118
Auto-Submit: Robert Griesemer <gri@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Robert Findley <rfindley@google.com>
Reviewed-by: Robert Griesemer <gri@google.com>
2024-05-30 19:19:55 +00:00
Meng Zhuo
019353d532 test/codegen: add Mul test for riscv64
Change-Id: I51e9832317e5dee1e3fe0772e7592b3dae95a625
Reviewed-on: https://go-review.googlesource.com/c/go/+/586797
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-05-23 18:51:17 +00:00
Paul E. Murphy
c6d142c4a7 cmd/compile/internal/ssa: fix ppc64 merging of (CLRLSLDI (SRD ...))
The rotate value was not correctly converted from a 64 bit to 32
bit rotate. This caused a miscompile of
golang.org/x/text/unicode/runenames.Names.

Fixes #67526

Change-Id: Ief56fbab27ccc71cd4c01117909bfee7f60a2ea1
Reviewed-on: https://go-review.googlesource.com/c/go/+/586915
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
2024-05-21 18:53:43 +00:00
Robert Griesemer
b8a410c434 cmd/compile: initialize posBaseMap correctly
The posBaseMap is used to identify a file's syntax tree node
given a source position. The position is mapped to the file
base which is then used to look up the file node in posBaseMap.

When posBaseMap is initialized, the file position base
is not the file base if there's a line directive before
the package clause. This can happen in cgo-generated files,
for instance due to an import "C" declaration.

If the wrong file position base is used during initialization,
looking up a file given a position will not find the file.

If a version error occurs and the corresponding file is
not found, the old code panicked with a null pointer exception.

Make sure to consistently initialize the posBaseMap by factoring
out the code computing the file base from a given position.

While at it, check for a nil file pointer. This should not happen
anymore, but don't crash if it happens (at the cost of a slightly
less informative error message).

Fixes #67141.

Change-Id: I4a6af88699c32ad01fffce124b06bb7f9e06f43d
Reviewed-on: https://go-review.googlesource.com/c/go/+/586238
Reviewed-by: Robert Findley <rfindley@google.com>
Reviewed-by: Robert Griesemer <gri@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Robert Griesemer <gri@google.com>
2024-05-20 20:44:21 +00:00
Paul E. Murphy
d11e417285 cmd/compile/internal/ssa: cleanup ANDCCconst rewrite rules on PPC64
Avoid creating duplicate usages of ANDCCconst. This is preparation for
a patch to reintroduce ANDconst to simplify the lower pass while
treating ANDCCconst like other *CC* ssa opcodes.

Also, move many of the similar rules wich retarget ANDCCconst users
to the flag result to a common rule for all compares against zero.

Change-Id: Ida86efe17ff413cb82c349d8ef69d2899361f4c0
Reviewed-on: https://go-review.googlesource.com/c/go/+/585400
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
2024-05-17 15:28:00 +00:00
Cuong Manh Le
8ce2fedaeb cmd/compile: add test case for using Alias types
CL 579935 disabled usage of Alias types in the compiler, and tracks
the problem with issue #66873. The test case in #65893 passes now
with the current tip. This CL adds a test case to ensure there is no
regression once Alias types are enabled for the compiler.

Updates #66873
Fixes #65893

Change-Id: I51b51bb13ca59549bc5925dd95f73da40465556d
Reviewed-on: https://go-review.googlesource.com/c/go/+/568455
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Robert Griesemer <gri@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
2024-05-16 01:45:48 +00:00
Cherry Mui
849770dec9 cmd/compile: disallow linkname referring to instantiations
Linknaming an instantiated generic symbol isn't particularly
useful: it doesn't guarantee the instantiation exists, and the
instantiated symbol name may be subject to change. Checked with a
large code corpus, currently there is no occurrance of linkname
to an instantiated generic symbol (or symbol with a bracket in its
name). This also suggests that it is not very useful. Linkname is
already an unsafe mechanism. We don't need to allow it to do more
unsafe things without justification.

Change-Id: Ifaa20c98166b28a9d7dc3290c013c2b5bb7682e7
Reviewed-on: https://go-review.googlesource.com/c/go/+/585458
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Than McIntosh <thanm@google.com>
2024-05-15 19:27:25 +00:00
Matthew Dempsky
31c8150082 cmd/compile/internal/noder: enable type aliases in type checker
This CL fixes an initialization loop during IR construction, that
stems from IR lacking first-class support for aliases. As a
workaround, we avoid publishing alias declarations until the RHS type
expression has been constructed.

Thanks to gri@ for investigating while I was out.

Fixes #66873.

Change-Id: I11e0d96ea6c357c295da47f44b6ec408edef89b7
Reviewed-on: https://go-review.googlesource.com/c/go/+/585399
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Auto-Submit: Matthew Dempsky <mdempsky@google.com>
2024-05-15 15:09:14 +00:00
Paul E. Murphy
0222a028f1 cmd/compile/internal/ssa: combine more shift and masking on PPC64
Investigating binaries, these patterns seem to show up frequently.

Change-Id: I987251e4070e35c25e98da321e444ccaa1526912
Reviewed-on: https://go-review.googlesource.com/c/go/+/583302
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
2024-05-15 13:27:41 +00:00
Keith Randall
93e3696b5d cmd/compile: avoid past-the-end pointer when zeroing
When we optimize append(s, make([]T, n)...), we have to be careful
not to pass &s[0] + len(s)*sizeof(T) as the argument to memclr, as that
pointer might be past-the-end. This can only happen if n is zero, so
just special-case n==0 in the generated code.

Fixes #67255

Change-Id: Ic680711bb8c38440eba5e759363ef65f5945658b
Reviewed-on: https://go-review.googlesource.com/c/go/+/584116
Reviewed-by: Austin Clements <austin@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Keith Randall <khr@google.com>
2024-05-08 17:09:06 +00:00
khr@golang.org
3c72dd513c cmd/compile: don't combine loads in generated equality functions
... if the architecture can't do unaligned loads.
We already handle this in a few places, but this particular place
was added in CL 399542 and missed this additional restriction.

Fixes #67160

Change-Id: I45988f11ff3ed45df1c4da3f0931ab1fdb22dbfe
Reviewed-on: https://go-review.googlesource.com/c/go/+/583175
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Auto-Submit: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Derek Parker <parkerderek86@gmail.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2024-05-06 15:34:04 +00:00
Paul E. Murphy
7994da4cc1 cmd/compile/internal/ssa: on PPC64, try combining CLRLSLDI and SRDconst into RLWINM
This provides a small performance bump to crc64 as measured on ppc64le/power10:

name              old time/op    new time/op    delta
Crc64/ISO64KB       49.6µs ± 0%    46.6µs ± 0%  -6.18%
Crc64/ISO4KB        3.16µs ± 0%    2.97µs ± 0%  -5.83%
Crc64/ISO1KB         840ns ± 0%     794ns ± 0%  -5.46%
Crc64/ECMA64KB      49.6µs ± 0%    46.5µs ± 0%  -6.20%
Crc64/Random64KB    53.1µs ± 0%    49.9µs ± 0%  -6.04%
Crc64/Random16KB    15.9µs ± 1%    15.0µs ± 0%  -5.73%

Change-Id: I302b5431c7dc46dfd2d211545c483bdcdfe011f1
Cq-Include-Trybots: luci.golang.try:gotip-linux-ppc64_power10,gotip-linux-ppc64_power8,gotip-linux-ppc64le_power8,gotip-linux-ppc64le_power9,gotip-linux-ppc64le_power10
Reviewed-on: https://go-review.googlesource.com/c/go/+/581937
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Eli Bendersky <eliben@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
2024-05-03 21:12:29 +00:00
khr@golang.org
1a0b86375f cmd/compile: remove redundant calls to cmpstring
The results of cmpstring are reuseable if the second call has the
same arguments and memory.

Note that this gets rid of cmpstring, but we still generate a
redundant </<= test and branch afterwards, because the compiler
doesn't know that cmpstring only ever returns -1,0,1.

Update #61725

Change-Id: I93a0d1ccca50d90b1e1a888240ffb75a3b10b59b
Reviewed-on: https://go-review.googlesource.com/c/go/+/578835
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2024-04-19 16:31:02 +00:00
Than McIntosh
875332b8a2 cmd/compile/internal: merge stack slots for selected local auto vars
[This is a partial roll-forward of CL 553055, the main change here
is that the stack slot overlap operation is flagged off by default
(can be enabled by hand with -gcflags=-d=mergelocals=1) ]

Preliminary compiler support for merging/overlapping stack slots of
local variables whose access patterns are disjoint.

This patch includes changes in AllocFrame to do the actual
merging/overlapping based on information returned from a new
liveness.MergeLocals helper. The MergeLocals helper identifies
candidates by looking for sets of AUTO variables that either A) have
the same size and GC shape (if types contain pointers), or B) have the
same size (but potentially different types as long as those types have
no pointers). Variables must be greater than (3*types.PtrSize) in size
to be considered for merging.

After forming candidates, MergeLocals collects variables into "can be
overlapped" equivalence classes or partitions; this process is driven
by an additional liveness analysis pass. Ideally it would be nice to
move the existing stackmap liveness pass up before AllocFrame
and "widen" it to include merge candidates so that we can do just a
single liveness as opposed to two passes, however this may be difficult
given that the merge-locals liveness has to take into account
writes corresponding to dead stores.

This patch also required a change to the way ssa.OpVarDef pseudo-ops
are generated; prior to this point they would only be created for
variables whose type included pointers; if stack slot merging is
enabled then the ssagen code creates OpVarDef ops for all auto vars
that are merge candidates.

Note that some temporaries created late in the compilation process
(e.g. during ssa backend) are difficult to reason about, especially in
cases where we take the address of a temp and pass it to the runtime.
For the time being we mark most of the vars created post-ssagen as
"not a merge candidate".

Stack slot merging for locals/autos is enabled by default if "-N" is
not in effect, and can be disabled via "-gcflags=-d=mergelocals=0".

Fixmes/todos/restrictions:
- try lowering size restrictions
- re-evaluate the various skips that happen in SSA-created autotmps

Updates #62737.
Updates #65532.
Updates #65495.

Change-Id: Ifda26bc48cde5667de245c8a9671b3f0a30bb45d
Reviewed-on: https://go-review.googlesource.com/c/go/+/575415
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2024-04-09 16:41:23 +00:00
Alan Donovan
b24ec88bb9 cmd/compile: export/import materialized aliases
This CL changes the compiler's type import/export logic
to create and preserve materialized Alias types
when GODEBUG=gotypesaliases=1.

In conjunction with CL 574717, it allows the x/tools
tests to pass with GODEBUG=gotypesaliases=1.

Updates #65294
Updates #64581
Fixes #66550

Change-Id: I70b9279f4e0ae7a1f95ad153c4e6909a878915a4
Reviewed-on: https://go-review.googlesource.com/c/go/+/574737
Auto-Submit: Alan Donovan <adonovan@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Robert Findley <rfindley@google.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2024-04-05 16:29:58 +00:00
Paul E. Murphy
ebf7747dbe cmd/internal/obj/ppc64: on Power10, use xxspltidp for float constants
Any normal float32 constant can be generated by this instruction;
use xxspltidp when possible. This prefixed instruction is much
faster than the two instruction load sequence from the
float32/float64 constant pool.

Change-Id: Id751d9ffdae71463adbde66427b986f0b2ef74c2
Reviewed-on: https://go-review.googlesource.com/c/go/+/575555
Reviewed-by: Than McIntosh <thanm@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Paul Murphy <murp@ibm.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
2024-04-04 15:24:29 +00:00
cui fliter
7dd8f39eba all: fix some comments
Change-Id: I0ee85161846c13d938213ef04d3a34f690a93e48
Reviewed-on: https://go-review.googlesource.com/c/go/+/553435
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@golang.org>
2024-04-04 14:29:45 +00:00
Cuong Manh Le
5038ce82b6 cmd/compile: add missing OASOP case in mayModifyPkgVar
CL 395541 made staticopy safe, stop applying the optimization once
seeing an expression that may modify global variables. However, it
misses the case for OASOP expression, causing the static init
mis-recognizes the modification and think it's safe.

Fixing this by adding missing OASOP case.

Fixes #66585

Change-Id: I603cec018d3b5a09825c14e1f066a0e16f8bde23
Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest
Reviewed-on: https://go-review.googlesource.com/c/go/+/575216
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Than McIntosh <thanm@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-04-02 17:16:14 +00:00
Cuong Manh Le
973befe714 cmd/compile: check ODEREF for safe lhs in assignment during static init
For #66585

Change-Id: Iddc407e3ef4c3b6ecf5173963b66b3e65e43c92d
Reviewed-on: https://go-review.googlesource.com/c/go/+/575336
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@golang.org>
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
2024-04-02 17:12:59 +00:00
Paul E. Murphy
dfb17c126c cmd/compile: support float min/max instructions on PPC64
This enables efficient use of the builtin min/max function
for float64 and float32 types on GOPPC64 >= power9.

Extend the assembler to support xsminjdp/xsmaxjdp and use
them to implement float min/max.

Simplify the VSX xx3 opcode rules to allow FPR arguments,
if all arguments are an FPR.

Change-Id: I15882a4ce5dc46eba71d683cf1d184dc4236a328
Reviewed-on: https://go-review.googlesource.com/c/go/+/574535
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Paul Murphy <murp@ibm.com>
Reviewed-by: Than McIntosh <thanm@google.com>
2024-04-01 18:50:29 +00:00
Cuong Manh Le
0bf6071066 Revert "cmd/compile/internal: merge stack slots for selected local auto vars"
This reverts CL 553055.

Reason for revert: causes crypto/ecdsa failures on linux ppc64/s390x builders

Change-Id: I9266b030693a5b6b1e667a009de89d613755b048
Reviewed-on: https://go-review.googlesource.com/c/go/+/575236
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Than McIntosh <thanm@google.com>
Auto-Submit: Than McIntosh <thanm@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-03-30 00:11:00 +00:00
Than McIntosh
89f7805c2e cmd/compile/internal: merge stack slots for selected local auto vars
Preliminary compiler support for merging/overlapping stack
slots of local variables whose access patterns are disjoint.

This patch includes changes in AllocFrame to do the actual
merging/overlapping based on information returned from a new
liveness.MergeLocals helper. The MergeLocals helper identifies
candidates by looking for sets of AUTO variables that either A) have
the same size and GC shape (if types contain pointers), or B) have the
same size (but potentially different types as long as those types have
no pointers). Variables must be greater than (3*types.PtrSize) in size
to be considered for merging.

After forming candidates, MergeLocals collects variables into "can be
overlapped" equivalence classes or partitions; this process is driven
by an additional liveness analysis pass. Ideally it would be nice to
move the existing stackmap liveness pass up before AllocFrame
and "widen" it to include merge candidates so that we can do just a
single liveness as opposed to two passes, however this may be difficult
given that the merge-locals liveness has to take into account
writes corresponding to dead stores.

This patch also required a change to the way ssa.OpVarDef pseudo-ops
are generated; prior to this point they would only be created for
variables whose type included pointers; if stack slot merging is
enabled then the ssagen code creates OpVarDef ops for all auto vars
that are merge candidates.

Note that some temporaries created late in the compilation process
(e.g. during ssa backend) are difficult to reason about, especially in
cases where we take the address of a temp and pass it to the runtime.
For the time being we mark most of the vars created post-ssagen as
"not a merge candidate".

Stack slot merging for locals/autos is enabled by default if "-N" is
not in effect, and can be disabled via "-gcflags=-d=mergelocals=0".

Fixmes/todos/restrictions:
- try lowering size restrictions
- re-evaluate the various skips that happen in SSA-created autotmps

Fixes #62737.
Updates #65532.
Updates #65495.

Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest
Change-Id: Ibc22e8a76c87e47bc9fafe4959804d9ea923623d
Reviewed-on: https://go-review.googlesource.com/c/go/+/553055
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-03-29 23:09:29 +00:00
Than McIntosh
29fcd1569a Revert "cmd/compile: add missing OINLCAll case in mayModifyPkgVar"
This reverts CL 575175.

Reason for revert: causes crypto/ecdh failures on longtest builders.

Change-Id: Ieed326fedf91760ac73095a42ba0237cf969843b
Reviewed-on: https://go-review.googlesource.com/c/go/+/575316
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Than McIntosh <thanm@google.com>
Reviewed-by: David Chase <drchase@google.com>
Auto-Submit: Than McIntosh <thanm@google.com>
2024-03-29 21:53:14 +00:00
Cuong Manh Le
9a028e14a5 cmd/compile: add missing OINLCAll case in mayModifyPkgVar
CL 395541 made staticopy safe, stop applying the optimization once
seeing an expression that may modify global variables.

However, if a call expression was inlined, the analyzer mis-recognizes
and think that the expression is safe. For example:

	var x = 0
	var a = f()
	var b = x

are re-written to:

	var x = 0
	var a = ~r0
	var b = 0

even though it's not safe because "f()" may modify "x".

Fixing this by recognizing OINLCALL and mark the initialization as
not safe for staticopy.

Fixes #66585

Change-Id: Id930c0b7e74274195f54a498cc4c5a91c4e6d84d
Reviewed-on: https://go-review.googlesource.com/c/go/+/575175
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Than McIntosh <thanm@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-03-29 16:46:47 +00:00
Keith Randall
8f618c1f53 cmd/compile: put constants before variables in initialization order
Fixes #66575

Change-Id: I03f4d4577b88ad0a92b260b2efd0cb9fe5082b2f
Reviewed-on: https://go-review.googlesource.com/c/go/+/575075
Reviewed-by: Robert Griesemer <gri@google.com>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-03-28 22:06:51 +00:00
Ian Lance Taylor
132f9fa9f8 test: issue16016: use fewer goroutines for gccgo
For https://gcc.gnu.org/PR114453

Change-Id: If41d9fca6288b18ed47b0f21ff224c74ddb34958
Reviewed-on: https://go-review.googlesource.com/c/go/+/574536
Reviewed-by: Than McIntosh <thanm@google.com>
Auto-Submit: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Emmanuel Odeke <emmanuel@orijtech.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-03-27 18:39:06 +00:00
cui fliter
1e12eab870 all: fix a large number of comments
Partial typo corrections, following https://go.dev/wiki/Spelling

Change-Id: I2357906ff2ea04305c6357418e4e9556e20375d1
Reviewed-on: https://go-review.googlesource.com/c/go/+/573776
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Than McIntosh <thanm@google.com>
Run-TryBot: shuang cui <imcusg@gmail.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
2024-03-26 19:58:28 +00:00
Andy Pan
4c2b1e0feb runtime: migrate internal/atomic to internal/runtime
For #65355

Change-Id: I65dd090fb99de9b231af2112c5ccb0eb635db2be
Reviewed-on: https://go-review.googlesource.com/c/go/+/560155
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Ibrahim Bazoka <ibrahimbazoka729@gmail.com>
Auto-Submit: Emmanuel Odeke <emmanuel@orijtech.com>
2024-03-25 19:53:03 +00:00
guoguangwu
b37fb8c6ca test/stress: fix typo in comment
Change-Id: I0f67801ef2d3af65c39a27b8db6ebaa769ff7f92
GitHub-Last-Rev: feb7f79ea5
GitHub-Pull-Request: golang/go#66508
Reviewed-on: https://go-review.googlesource.com/c/go/+/574075
Auto-Submit: Ian Lance Taylor <iant@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2024-03-25 19:21:35 +00:00
cui fliter
27fdef6168 test: put type declaration back inside the function
Because issue #47631 has been fixed, remove TODO.

Change-Id: Ic476616729f47485a18a5145bd28c87dd18b4492
Reviewed-on: https://go-review.googlesource.com/c/go/+/573775
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: David Chase <drchase@google.com>
Auto-Submit: Ian Lance Taylor <iant@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-03-25 09:04:48 +00:00
Andrey Bokhanko
0ae8468b20 cmd/compile,cmd/go,cmd/internal,runtime: remove dynamic checks for atomics for ARM64 targets that support LSE
Remove dynamic checks for atomic instructions for ARM64 targets that support LSE extension.

For #66131

Change-Id: I0ec1b183a3f4ea4c8a537430646e6bc4b4f64271
Reviewed-on: https://go-review.googlesource.com/c/go/+/569536
Reviewed-by: Mauri de Souza Meneguzzo <mauri870@gmail.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Fannie Zhang <Fannie.Zhang@arm.com>
Reviewed-by: Shu-Chun Weng <scw@google.com>
2024-03-21 20:08:06 +00:00
Keith Randall
802473cfda cmd/compile: include constant bools in memcombine
Constant bools are like constant 1-byte values, they memcombine just fine.

(There are still trickier cases that this pass doesn't catch
yet, see TODO at memcombine.go:503.)

Fixes #66413

Change-Id: Ia67cf72ed1c416e27ac22da443bd88a3f09a6cc8
Reviewed-on: https://go-review.googlesource.com/c/go/+/573416
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Joseph Tsai <joetsai@digital-static.net>
Reviewed-by: Mauri de Souza Meneguzzo <mauri870@gmail.com>
Reviewed-by: Keith Randall <khr@google.com>
2024-03-21 19:45:41 +00:00
Keith Randall
6bf8b76b95 cmd/compile: don't assume args are always zero-extended
On amd64, we always zero-extend when loading arguments from the stack.
On arm64, we extend based on the type. This causes problems with
zeroUpper*Bits, which reports the top bits are zero when they aren't.

Fix it to use the type to decide if the top bits are really zero.

For tests, only f32 currently fails on arm64. Added other tests
just for future-proofing.

Update #66066

Change-Id: I2f13fb47198e139ef13c9a34eb1edc932eea3ee3
Reviewed-on: https://go-review.googlesource.com/c/go/+/571135
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Than McIntosh <thanm@google.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-03-20 17:35:29 +00:00
Robert Griesemer
dc6a5cfca1 go/types, types2: quote user-supplied names in error messages
Use `' quotes (as in `foo') to differentiate from Go quotes.
Quoting prevents confusion when user-supplied names alter
the meaning of the error message.

For instance, report

        duplicate method `wanted'

rather than

        duplicate method wanted

Exceptions:
- don't quote _:
        `_' is ugly and not necessary
- don't quote after a ":":
        undefined name: foo
- don't quote if the name is used correctly in a statement:
        goto L jumps over variable declaration

Quoting is done with a helper function and can be centrally adjusted
and fine-tuned as needed.

Adjusted some test cases to explicitly include the quoted names.

Fixes #65790.

Change-Id: Icce667215f303ab8685d3e5cb00d540a2fd372ca
Reviewed-on: https://go-review.googlesource.com/c/go/+/571396
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@google.com>
Reviewed-by: Robert Findley <rfindley@google.com>
Run-TryBot: Emmanuel Odeke <emmanuel@orijtech.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Robert Griesemer <gri@google.com>
2024-03-18 18:59:40 +00:00
Paul E. Murphy
c7065bb9db cmd/compile/internal: generate ADDZE on PPC64
This usage shows up in quite a few places, and helps reduce
register pressure in several complex cryto functions by
removing a MOVD $0,... instruction.

Change-Id: I9444ea8f9d19bfd68fb71ea8dc34e109681b3802
Reviewed-on: https://go-review.googlesource.com/c/go/+/571055
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Paul Murphy <murp@ibm.com>
2024-03-15 17:57:45 +00:00
Matthew Dempsky
412623c53f test/fixedbugs: add regress test for inlining failure
Still investigating, but adding the minimized reproducer as a regress
test case for now.

Updates #66261.

Change-Id: I20715b731f8c5b95616513d4a13e3ae083709031
Reviewed-on: https://go-review.googlesource.com/c/go/+/571815
Reviewed-by: Than McIntosh <thanm@google.com>
Auto-Submit: Matthew Dempsky <mdempsky@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-03-14 22:47:28 +00:00
Paul E. Murphy
6e5398bad1 cmd/asm,cmd/compile: generate less instructions for most 32 bit constant adds on ppc64x
For GOPPC64 < 10 targets, most large 32 bit constants (those
exceeding int16 capacity) can be added using two instructions
instead of 3.

This cannot be done for values greater than 0x7FFF7FFF, so this
must be done during asm preprocessing as the optab matching
rules cannot differentiate this special case.

Likewise, constants 0x8000 <= x < 0x10000 are not converted. The
assembler currently generates 2 instructions sequences for these
constants.

Change-Id: I1ccc839c6c28fc32f15d286b2e52e2d22a2a06d4
Reviewed-on: https://go-review.googlesource.com/c/go/+/568116
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Run-TryBot: Paul Murphy <murp@ibm.com>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2024-03-13 13:58:44 +00:00
Keith Randall
a46ecdca36 cmd/compile: fix sign/zero-extension removal
When an opcode generates a known high bit state (typically, a sub-word
operation that zeros the high bits), we can remove any subsequent
extension operation that would be a no-op.

x = (OP ...)
y = (ZeroExt32to64 x)

If OP zeros the high 32 bits, then we can replace y with x, as the
zero extension doesn't do anything.

However, x in this situation normally has a sub-word-sized type.  The
semantics of values in registers is typically that the high bits
beyond the value's type size are junk. So although the opcode
generating x *currently* zeros the high bits, after x is rewritten to
another opcode it may not - rewrites of sub-word-sized values can
trash the high bits.

To fix, move the extension-removing rules to late lower. That ensures
that their arguments won't be rewritten to change their high bits.

I am also worried about spilling and restoring. Spilling and restoring
doesn't preserve the high bits, but instead sets them to a known value
(often 0, but in some cases it could be sign-extended).  I am unable
to come up with a case that would cause a problem here, so leaving for
another time.

Fixes #66066

Change-Id: I3b5c091b3b3278ccbb7f11beda8b56f4b6d3fde7
Reviewed-on: https://go-review.googlesource.com/c/go/+/568616
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-03-12 19:38:41 +00:00
Cuong Manh Le
5a329c3bfb cmd/compile: fix copying SSA-able variables optimization
CL 541715 added an optimization to copy SSA-able variables.

When handling m[k] = append(m[k], ...) case, it uses ir.SameSafeExpr to
check that m[k] expressions are the same, then doing type assertion to
convert the map index to ir.IndexExpr node. However, this assertion is
not safe for m[k] expression in append(m[k], ...), since it may be
wrapped by ir.OCONVNOP node.

Fixing this by un-wrapping any ir.OCONVNOP before doing type assertion.

Fixes #66096

Change-Id: I9ff7165ab97bc7f88d0e9b7b31604da19a8ca206
Reviewed-on: https://go-review.googlesource.com/c/go/+/569716
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2024-03-08 02:00:33 +00:00
Joel Sing
997636760e cmd/compile,cmd/internal/obj: provide rotation pseudo-instructions for riscv64
Provide and use rotation pseudo-instructions for riscv64. The RISC-V bitmanip
extension adds support for hardware rotation instructions in the form of ROL,
ROLW, ROR, RORI, RORIW and RORW. These are easily implemented in the assembler
as pseudo-instructions for CPUs that do not support the bitmanip extension.

This approach provides a number of advantages, including reducing the rewrite
rules needed in the compiler, simplifying codegen tests and most importantly,
allowing these instructions to be used in assembly (for example, riscv64
optimised versions of SHA-256 and SHA-512). When bitmanip support is added,
these instruction sequences can simply be replaced with a single instruction
if permitted by the GORISCV64 profile.

Change-Id: Ia23402e1a82f211ac760690deb063386056ae1fa
Reviewed-on: https://go-review.googlesource.com/c/go/+/565015
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: M Zhuo <mengzhuo1203@gmail.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Run-TryBot: Joel Sing <joel@sing.id.au>
2024-03-07 14:57:07 +00:00
David Chase
6f5d77454e cmd/compile: add 0-sized-value simplification to copyelim
The problem was caused by faulty handling of unSSA-able
operations on zero-sized data in expand calls, but there
is no point to operations on zero-sized data.  This CL adds
a simplify step to the first place in SSA where all values
are processed and replaces anything producing a 0-sized
struct/array with the corresponding Struct/Array Make0
operation (of the appropriate type).

I attempted not generating them in ssagen, but that was a
larger change, and also had bugs. This is simple and obvious.
The only question is whether it would be worthwhile to do it
earlier (in numberlines or phielem).

Fixes #65808.

Change-Id: I0a596b3d272798015e7bb6b1a20411241759fe0e
Reviewed-on: https://go-review.googlesource.com/c/go/+/568258
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-03-02 14:01:52 +00:00
Cuong Manh Le
cda1e40b44 cmd/compile: add missing Unalias call when writing type alias
Fixes #65778

Change-Id: I93af42967c7976d63b4f460b7ffbcb9a9c05ffe7
Reviewed-on: https://go-review.googlesource.com/c/go/+/565995
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Robert Griesemer <gri@google.com>
Reviewed-by: Jorropo <jorropo.pgm@gmail.com>
2024-03-01 01:40:00 +00:00
khr@golang.org
e930413331 cmd/compile: soften type matching when allocating stack slots
Currently we use pointer equality on types when deciding whether we can
reuse a stack slot. That's too strict, as we don't guarantee pointer
equality for the same type. In particular, it can vary based on whether
PtrTo has been called in the frontend or not.

Instead, use the type's LinkString, which is guaranteed to both be
unique for a type, and to not vary given two different type structures
describing the same type.

Update #65783

Change-Id: I64f55138475f04bfa30cfb819b786b7cc06aebe4
Reviewed-on: https://go-review.googlesource.com/c/go/+/565436
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
2024-02-29 21:29:41 +00:00
khr@golang.org
2589a89468 runtime: don't re-initialize itab while looking for missing function
The itab we're initializing again, just to figure out which method
is missing, might be stored in read-only memory.
This can only happen in certain weird generics situations, so it is
pretty rare, but it causes a runtime crash when it does happen.

Fixes #65962

Change-Id: Ia86e216fe33950a794ad8e475e76317f799e9136
Reviewed-on: https://go-review.googlesource.com/c/go/+/567615
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
2024-02-29 18:30:40 +00:00
zuojunwei.1024
b8c76effd9 cmd/compile: mark pointer to noalg type as noalg
When the compiler writes PtrToThis field of noalg type, it generates
its pointer type. Mark them as noalg to prevent put them in typelinks.

Fixes #65957

Change-Id: Icbc3b18bc866f9138c7648e42dd500a80326f72b
Reviewed-on: https://go-review.googlesource.com/c/go/+/567335
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Auto-Submit: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
2024-02-28 05:32:14 +00:00
Robert Griesemer
5e00352b9b go/types, types2: consistently use singular when reporting version errors
Change-Id: I39af932b789cd18dc4bfc84f9667b1c32c9825f4
Reviewed-on: https://go-review.googlesource.com/c/go/+/567476
Reviewed-by: Robert Findley <rfindley@google.com>
Reviewed-by: Robert Griesemer <gri@google.com>
Auto-Submit: Robert Griesemer <gri@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-02-28 02:54:13 +00:00
Cuong Manh Le
b847d4cd2c cmd/compile: fix early deadcode with label statement
CL 517775 moved early deadcode into unified writer. with new way to
handle dead code with label statement involved: any statements after
terminating statement will be considered dead until next label
statement.

However, this is not safe, because code after label statement may still
refer to dead statements between terminating and label statement.

It's only safe to remove statements after terminating *and* label one.

Fixes #65593

Change-Id: Idb630165240931fad50789304a9e4535f51f56e2
Reviewed-on: https://go-review.googlesource.com/c/go/+/565596
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
2024-02-27 21:07:10 +00:00
Zxilly
856355a913 cmd/compile: use quotes to wrap user-supplied token
Use quotes to wrap user-supplied token in the syntax error message.
Updates #65790

Change-Id: I631a63df4a6bb8615b7850a324d812190bc15f30
GitHub-Last-Rev: f291e1d5a6
GitHub-Pull-Request: golang/go#65840
Reviewed-on: https://go-review.googlesource.com/c/go/+/565518
Reviewed-by: Carlos Amedee <carlos@golang.org>
Reviewed-by: Robert Griesemer <gri@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-02-27 16:22:24 +00:00
Cuong Manh Le
5428cc4f14 cmd/compile/internal/typecheck: remove constant bounds check
types2 handles all constant-related bounds checks in user Go code now,
so it's safe to remove the check from typecheck, avoid the inconsistency
with type parameter.

Fixes #65417

Change-Id: I82dd197b78e271725d132b5a20450ae3e90f9abc
Reviewed-on: https://go-review.googlesource.com/c/go/+/560575
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Robert Griesemer <gri@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-02-20 22:29:14 +00:00
Joel Sing
daa58db486 cmd/compile: improve rotations for riscv64
Enable canRotate for riscv64, enable rotation intrinsics and provide
better rewrite implementations for rotations. By avoiding Lsh*x64
and Rsh*Ux64 we can produce better code, especially for 32 and 64
bit rotations. By enabling canRotate we also benefit from the generic
rotation rewrite rules.

Benchmark on a StarFive VisionFive 2:

               │   rotate.1   │              rotate.2               │
               │    sec/op    │   sec/op     vs base                │
RotateLeft-4     14.700n ± 0%   8.016n ± 0%  -45.47% (p=0.000 n=10)
RotateLeft8-4     14.70n ± 0%   10.69n ± 0%  -27.28% (p=0.000 n=10)
RotateLeft16-4    14.70n ± 0%   12.02n ± 0%  -18.23% (p=0.000 n=10)
RotateLeft32-4   13.360n ± 0%   8.016n ± 0%  -40.00% (p=0.000 n=10)
RotateLeft64-4   13.360n ± 0%   8.016n ± 0%  -40.00% (p=0.000 n=10)
geomean           14.15n        9.208n       -34.92%

Change-Id: I1a2036fdc57cf88ebb6617eb8d92e1d187e183b2
Reviewed-on: https://go-review.googlesource.com/c/go/+/560315
Reviewed-by: M Zhuo <mengzhuo1203@gmail.com>
Run-TryBot: Joel Sing <joel@sing.id.au>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Mark Ryan <markdryan@rivosinc.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: David Chase <drchase@google.com>
2024-02-16 11:59:07 +00:00
Matthew Dempsky
b158ca9ae3 cmd/compile: separate inline cost analysis from applying inlining
This CL separates the pass that computes inlinability from the pass
that performs inlinability. In particular, the latter can now happen
in any flat order, rather than bottom-up order. This also allows
inlining of calls exposed by devirtualization.

Change-Id: I389c0665fdc8288a6e25129a6744bfb1ace1eff7
Reviewed-on: https://go-review.googlesource.com/c/go/+/562319
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Than McIntosh <thanm@google.com>
2024-02-09 17:21:38 +00:00
Meng Zhuo
1400b26852 test/codegen: add float max/min codegen test
As CL 514596 and CL 514775 adds hardware implement of float
max/min, we should add codegen test for these two CL.

Change-Id: I347331032fe9f67a2e6fdb5d3cfe20203296b81c
Reviewed-on: https://go-review.googlesource.com/c/go/+/561295
Reviewed-by: Joel Sing <joel@sing.id.au>
TryBot-Result: Gopher Robot <gobot@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: M Zhuo <mengzhuo1203@gmail.com>
Reviewed-by: David Chase <drchase@google.com>
2024-02-08 03:02:00 +00:00
Robert Griesemer
f81e498673 go/types, types2: better errors for non-existing fields or methods
This CL improves the error messages reported when a field or method
name is used that doesn't exist. It brings the error messges on par
(or better) with the respective errors reported before Go 1.18 (i.e.
before switching to the new type checker):

Make case distinctions based on whether a field/method is exported
and how it is spelled. Factor out that logic into a new function
(lookupError) in a new file (errsupport.go), which is generated for
go/types. Use lookupError when reporting selector lookup errors
and missing struct field keys.

Add a comprehensive set of tests (lookup2.go) and spot tests for
the two cases brought up by the issue at hand.

Adjusted existing tests as needed.

Fixes #49736.

Change-Id: I2f439948dcd12f9bd1a258367862d8ff96e32305
Reviewed-on: https://go-review.googlesource.com/c/go/+/560055
Run-TryBot: Robert Griesemer <gri@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Robert Findley <rfindley@google.com>
Reviewed-by: Robert Griesemer <gri@google.com>
2024-02-07 16:41:56 +00:00
Robert Griesemer
8e02e7b26a go/types, types2: use existing case-insensitive lookup (remove TODO)
Rather than implementing a new, less complete mechanism to check
if a selector exists with different capitalization, use the
existing mechanism in lookupFieldOrMethodImpl by making it
available for internal use.

Pass foldCase parameter all the way trough to Object.sameId and
thus make it consistently available where Object.sameId is used.

From sameId, factor out samePkg functionality into stand-alone
predicate.

Do better case distinction when reporting an error for an undefined
selector expression.

Cleanup.

Change-Id: I7be3cecb4976a4dce3264c7e0c49a320c87101e9
Reviewed-on: https://go-review.googlesource.com/c/go/+/558315
Reviewed-by: Robert Griesemer <gri@google.com>
Reviewed-by: Robert Findley <rfindley@google.com>
Auto-Submit: Robert Griesemer <gri@google.com>
Run-TryBot: Robert Griesemer <gri@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2024-01-26 14:04:26 +00:00
Jorropo
bbd0bc22ea cmd/compile: improve integer comparisons with numeric bounds
This do:
- Fold always false or always true comparisons for ints and uint.
- Reduce < and <= where the true set is only one value to == with such value.

Change-Id: Ie9e3f70efd1845bef62db56543f051a50ad2532e
Reviewed-on: https://go-review.googlesource.com/c/go/+/555135
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-01-23 00:02:36 +00:00
Cuong Manh Le
881869dde0 cmd/compile: handle defined iter func type correctly
Fixed #64930

Change-Id: I916de7f97116fb20cb2f3f0b425ac34409afd494
Reviewed-on: https://go-review.googlesource.com/c/go/+/553436
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
2024-01-08 16:00:53 +00:00
Keith Randall
f6509cf5cd cmd/compile: handle constant-folding of an out-of-range jump table index
The input index to a jump table can be out of range for unreachable code.

Dynamically the compiler ensures that an out-of-range index can never
reach a jump table, but that guarantee doesn't extend to the static
realm.

Fixes #64826

Change-Id: I5829f3933ae5124ffad8337dfd7dd75e67a8ec33
Reviewed-on: https://go-review.googlesource.com/c/go/+/552055
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
2023-12-21 00:15:58 +00:00
Robert Griesemer
6fe0d3758b cmd/compile: remove interfacecycles debug flag
Per the discussion on the issue, since no problems related to this
appeared since Go 1.20, remove the ability to disable the check for
anonymous interface cycles permanently.

Adjust various tests accordingly.

For #56103.

Change-Id: Ica2b28752dca08934bbbc163a9b062ae1eb2a834
Reviewed-on: https://go-review.googlesource.com/c/go/+/550896
Run-TryBot: Robert Griesemer <gri@google.com>
Auto-Submit: Robert Griesemer <gri@google.com>
Reviewed-by: Robert Griesemer <gri@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2023-12-19 04:39:56 +00:00
Than McIntosh
5b84d50038 test: skip rangegen.go on 32-bit platforms
Add a skip for this test that effectively disables it for 32-bit platforms,
so as to not run into timeouts or OOMs on smaller machines.

Fixes #64789.

Change-Id: I2d428e1dccae62b8bb1a69c5f95699692a282bbb
Reviewed-on: https://go-review.googlesource.com/c/go/+/550975
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
2023-12-18 23:35:19 +00:00
Keith Randall
b60bf8f8e1 cmd/asm: fix encoding for arm right shift by constant 0
Right shifts, for some odd reasons, can encode shifts of constant
1-32 instead of 0-31. Left shifts, however, can encode shifts 0-31.
When the shift amount is 0, arm recommends encoding right shifts
using left shifts.

Fixes #64715

Change-Id: Id3825349aa7195028037893dfe01fa0e405eaa51
Reviewed-on: https://go-review.googlesource.com/c/go/+/549955
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
2023-12-15 19:35:21 +00:00
Danil Timerbulatov
527829a7cb all: remove newline characters after return statements
This commit is aimed at improving the readability and consistency
of the code base. Extraneous newline characters were present after
some return statements, creating unnecessary separation in the code.

Fixes #64610

Change-Id: Ic1b05bf11761c4dff22691c2f1c3755f66d341f7
Reviewed-on: https://go-review.googlesource.com/c/go/+/548316
Auto-Submit: Dmitri Shuralyov <dmitshur@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
2023-12-14 17:22:18 +00:00
Raghvender
4bf1ca4b0c cmd/compile: fix error message for mismatch between the number of type params and arguments
Fixes #64276

Change-Id: Ib6651669904e6ea0daf275d85d8bd008b8b21cc6
Reviewed-on: https://go-review.googlesource.com/c/go/+/544018
Reviewed-by: raghvender sundarjee <raghvenders@gmail.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Auto-Submit: Robert Griesemer <gri@google.com>
Reviewed-by: Robert Griesemer <gri@google.com>
Run-TryBot: Robert Griesemer <gri@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-12-08 00:50:55 +00:00
Keith Randall
788a227759 cmd/compile: fix SCCP propagation into jump tables
We can't delete all the outgoing edges and then add one back in, because
then we've lost the argument of any phi at the target. Instead, move
the important target to the front of the list and delete the rest.

This normally isn't a problem, because there is never normally a phi
at the target of a jump table. But this isn't quite true when in race
build mode, because there is a phi of the result of a bunch of raceread
calls.

The reason this happens is that each case is written like this (where e
is the runtime.eface we're switching on):

if e.type == $type.int32 {
   m = raceread(e.data, m1)
}
m2 = phi(m1, m)
if e.type == $type.int32 {
   .. do case ..
   goto blah
}

so that if e.type is not $type.int32, it falls through to the default
case. This default case will have a memory phi for all the (jumped around
and not actually called) raceread calls.

If we instead did it like

if e.type == $type.int32 {
  raceread(e.data)
  .. do case ..
  goto blah
}

That would paper over this bug, as it is the only way to construct
a jump table whose target is a block with a phi in it. (Yet.)

But we'll fix the underlying bug in this CL. Maybe we can do the
rewrite mentioned above later.  (It is an optimization for -race mode,
which isn't particularly important.)

Fixes #64606

Change-Id: I6f6e3c90eb1e2638112920ee2e5b6581cef04ea4
Reviewed-on: https://go-review.googlesource.com/c/go/+/548356
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2023-12-08 00:29:01 +00:00
Matthew Dempsky
34416d7f6f cmd/compile: fix escape analysis of string min/max
When I was plumbing min/max support through the compiler, I was
thinking mostly about numeric argument types. As a result, I forgot
that escape analysis would need to be aware that min/max can operate
on string values, which contain pointers.

Fixes #64565.

Change-Id: I36127ce5a2da942401910fa0f9de922726c9f94d
Reviewed-on: https://go-review.googlesource.com/c/go/+/547715
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Mauri de Souza Meneguzzo <mauri870@gmail.com>
Auto-Submit: Matthew Dempsky <mdempsky@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2023-12-05 22:06:07 +00:00
Russ Cox
c29444ef39 math/rand, math/rand/v2: use ChaCha8 for global rand
Move ChaCha8 code into internal/chacha8rand and use it to implement
runtime.rand, which is used for the unseeded global source for
both math/rand and math/rand/v2. This also affects the calculation of
the start point for iteration over very very large maps (when the
32-bit fastrand is not big enough).

The benefit is that misuse of the global random number generators
in math/rand and math/rand/v2 in contexts where non-predictable
randomness is important for security reasons is no longer a
security problem, removing a common mistake among programmers
who are unaware of the different kinds of randomness.

The cost is an extra 304 bytes per thread stored in the m struct
plus 2-3ns more per random uint64 due to the more sophisticated
algorithm. Using PCG looks like it would cost about the same,
although I haven't benchmarked that.

Before this, the math/rand and math/rand/v2 global generator
was wyrand (https://github.com/wangyi-fudan/wyhash).
For math/rand, using wyrand instead of the Mitchell/Reeds/Thompson
ALFG was justifiable, since the latter was not any better.
But for math/rand/v2, the global generator really should be
at least as good as one of the well-studied, specific algorithms
provided directly by the package, and it's not.

(Wyrand is still reasonable for scheduling and cache decisions.)

Good randomness does have a cost: about twice wyrand.

Also rationalize the various runtime rand references.

goos: linux
goarch: amd64
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
                        │ bbb48afeb7.amd64 │           5cf807d1ea.amd64           │
                        │      sec/op      │    sec/op     vs base                │
ChaCha8-32                     1.862n ± 2%    1.861n ± 2%        ~ (p=0.825 n=20)
PCG_DXSM-32                    1.471n ± 1%    1.460n ± 2%        ~ (p=0.153 n=20)
SourceUint64-32                1.636n ± 2%    1.582n ± 1%   -3.30% (p=0.000 n=20)
GlobalInt64-32                 2.087n ± 1%    3.663n ± 1%  +75.54% (p=0.000 n=20)
GlobalInt64Parallel-32        0.1042n ± 1%   0.2026n ± 1%  +94.48% (p=0.000 n=20)
GlobalUint64-32                2.263n ± 2%    3.724n ± 1%  +64.57% (p=0.000 n=20)
GlobalUint64Parallel-32       0.1019n ± 1%   0.1973n ± 1%  +93.67% (p=0.000 n=20)
Int64-32                       1.771n ± 1%    1.774n ± 1%        ~ (p=0.449 n=20)
Uint64-32                      1.863n ± 2%    1.866n ± 1%        ~ (p=0.364 n=20)
GlobalIntN1000-32              3.134n ± 3%    4.730n ± 2%  +50.95% (p=0.000 n=20)
IntN1000-32                    2.489n ± 1%    2.489n ± 1%        ~ (p=0.683 n=20)
Int64N1000-32                  2.521n ± 1%    2.516n ± 1%        ~ (p=0.394 n=20)
Int64N1e8-32                   2.479n ± 1%    2.478n ± 2%        ~ (p=0.743 n=20)
Int64N1e9-32                   2.530n ± 2%    2.514n ± 2%        ~ (p=0.193 n=20)
Int64N2e9-32                   2.501n ± 1%    2.494n ± 1%        ~ (p=0.616 n=20)
Int64N1e18-32                  3.227n ± 1%    3.205n ± 1%        ~ (p=0.101 n=20)
Int64N2e18-32                  3.647n ± 1%    3.599n ± 1%        ~ (p=0.019 n=20)
Int64N4e18-32                  5.135n ± 1%    5.069n ± 2%        ~ (p=0.034 n=20)
Int32N1000-32                  2.657n ± 1%    2.637n ± 1%        ~ (p=0.180 n=20)
Int32N1e8-32                   2.636n ± 1%    2.636n ± 1%        ~ (p=0.763 n=20)
Int32N1e9-32                   2.660n ± 2%    2.638n ± 1%        ~ (p=0.358 n=20)
Int32N2e9-32                   2.662n ± 2%    2.618n ± 2%        ~ (p=0.064 n=20)
Float32-32                     2.272n ± 2%    2.239n ± 2%        ~ (p=0.194 n=20)
Float64-32                     2.272n ± 1%    2.286n ± 2%        ~ (p=0.763 n=20)
ExpFloat64-32                  3.762n ± 1%    3.744n ± 1%        ~ (p=0.171 n=20)
NormFloat64-32                 3.706n ± 1%    3.655n ± 2%        ~ (p=0.066 n=20)
Perm3-32                       32.93n ± 3%    34.62n ± 1%   +5.13% (p=0.000 n=20)
Perm30-32                      202.9n ± 1%    204.0n ± 1%        ~ (p=0.482 n=20)
Perm30ViaShuffle-32            115.0n ± 1%    114.9n ± 1%        ~ (p=0.358 n=20)
ShuffleOverhead-32             112.8n ± 1%    112.7n ± 1%        ~ (p=0.692 n=20)
Concurrent-32                  2.107n ± 0%    3.725n ± 1%  +76.75% (p=0.000 n=20)

goos: darwin
goarch: arm64
pkg: math/rand/v2
                       │ bbb48afeb7.arm64 │           5cf807d1ea.arm64            │
                       │      sec/op      │    sec/op     vs base                 │
ChaCha8-8                     2.480n ± 0%    2.429n ± 0%    -2.04% (p=0.000 n=20)
PCG_DXSM-8                    2.531n ± 0%    2.530n ± 0%         ~ (p=0.877 n=20)
SourceUint64-8                2.534n ± 0%    2.533n ± 0%         ~ (p=0.732 n=20)
GlobalInt64-8                 2.172n ± 1%    4.794n ± 0%  +120.67% (p=0.000 n=20)
GlobalInt64Parallel-8        0.4320n ± 0%   0.9605n ± 0%  +122.32% (p=0.000 n=20)
GlobalUint64-8                2.182n ± 0%    4.770n ± 0%  +118.58% (p=0.000 n=20)
GlobalUint64Parallel-8       0.4307n ± 0%   0.9583n ± 0%  +122.51% (p=0.000 n=20)
Int64-8                       4.107n ± 0%    4.104n ± 0%         ~ (p=0.416 n=20)
Uint64-8                      4.080n ± 0%    4.080n ± 0%         ~ (p=0.052 n=20)
GlobalIntN1000-8              2.814n ± 2%    5.643n ± 0%  +100.50% (p=0.000 n=20)
IntN1000-8                    4.141n ± 0%    4.139n ± 0%         ~ (p=0.140 n=20)
Int64N1000-8                  4.140n ± 0%    4.140n ± 0%         ~ (p=0.313 n=20)
Int64N1e8-8                   4.140n ± 0%    4.139n ± 0%         ~ (p=0.103 n=20)
Int64N1e9-8                   4.139n ± 0%    4.140n ± 0%         ~ (p=0.761 n=20)
Int64N2e9-8                   4.140n ± 0%    4.140n ± 0%         ~ (p=0.636 n=20)
Int64N1e18-8                  5.266n ± 0%    5.326n ± 1%    +1.14% (p=0.001 n=20)
Int64N2e18-8                  6.052n ± 0%    6.167n ± 0%    +1.90% (p=0.000 n=20)
Int64N4e18-8                  8.826n ± 0%    9.051n ± 0%    +2.55% (p=0.000 n=20)
Int32N1000-8                  4.127n ± 0%    4.132n ± 0%    +0.12% (p=0.000 n=20)
Int32N1e8-8                   4.126n ± 0%    4.131n ± 0%    +0.12% (p=0.000 n=20)
Int32N1e9-8                   4.127n ± 0%    4.132n ± 0%    +0.12% (p=0.000 n=20)
Int32N2e9-8                   4.132n ± 0%    4.131n ± 0%         ~ (p=0.017 n=20)
Float32-8                     4.109n ± 0%    4.105n ± 0%         ~ (p=0.379 n=20)
Float64-8                     4.107n ± 0%    4.106n ± 0%         ~ (p=0.867 n=20)
ExpFloat64-8                  5.339n ± 0%    5.383n ± 0%    +0.82% (p=0.000 n=20)
NormFloat64-8                 5.735n ± 0%    5.737n ± 1%         ~ (p=0.856 n=20)
Perm3-8                       26.65n ± 0%    26.80n ± 1%    +0.58% (p=0.000 n=20)
Perm30-8                      194.8n ± 1%    197.0n ± 0%    +1.18% (p=0.000 n=20)
Perm30ViaShuffle-8            156.6n ± 0%    157.6n ± 1%    +0.61% (p=0.000 n=20)
ShuffleOverhead-8             124.9n ± 0%    125.5n ± 0%    +0.52% (p=0.000 n=20)
Concurrent-8                  2.434n ± 3%    5.066n ± 0%  +108.09% (p=0.000 n=20)

goos: linux
goarch: 386
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
                        │ bbb48afeb7.386 │            5cf807d1ea.386             │
                        │     sec/op     │    sec/op     vs base                 │
ChaCha8-32                  11.295n ± 1%    4.748n ± 2%   -57.96% (p=0.000 n=20)
PCG_DXSM-32                  7.693n ± 1%    7.738n ± 2%         ~ (p=0.542 n=20)
SourceUint64-32              7.658n ± 2%    7.622n ± 2%         ~ (p=0.344 n=20)
GlobalInt64-32               3.473n ± 2%    7.526n ± 2%  +116.73% (p=0.000 n=20)
GlobalInt64Parallel-32      0.3198n ± 0%   0.5444n ± 0%   +70.22% (p=0.000 n=20)
GlobalUint64-32              3.612n ± 0%    7.575n ± 1%  +109.69% (p=0.000 n=20)
GlobalUint64Parallel-32     0.3168n ± 0%   0.5403n ± 0%   +70.51% (p=0.000 n=20)
Int64-32                     7.673n ± 2%    7.789n ± 1%         ~ (p=0.122 n=20)
Uint64-32                    7.773n ± 1%    7.827n ± 2%         ~ (p=0.920 n=20)
GlobalIntN1000-32            6.268n ± 1%    9.581n ± 1%   +52.87% (p=0.000 n=20)
IntN1000-32                  10.33n ± 2%    10.45n ± 1%         ~ (p=0.233 n=20)
Int64N1000-32                10.98n ± 2%    11.01n ± 1%         ~ (p=0.401 n=20)
Int64N1e8-32                 11.19n ± 2%    10.97n ± 1%         ~ (p=0.033 n=20)
Int64N1e9-32                 11.06n ± 1%    11.08n ± 1%         ~ (p=0.498 n=20)
Int64N2e9-32                 11.10n ± 1%    11.01n ± 2%         ~ (p=0.995 n=20)
Int64N1e18-32                15.23n ± 2%    15.04n ± 1%         ~ (p=0.973 n=20)
Int64N2e18-32                15.89n ± 1%    15.85n ± 1%         ~ (p=0.409 n=20)
Int64N4e18-32                18.96n ± 2%    19.34n ± 2%         ~ (p=0.048 n=20)
Int32N1000-32                10.46n ± 2%    10.44n ± 2%         ~ (p=0.480 n=20)
Int32N1e8-32                 10.46n ± 2%    10.49n ± 2%         ~ (p=0.951 n=20)
Int32N1e9-32                 10.28n ± 2%    10.26n ± 1%         ~ (p=0.431 n=20)
Int32N2e9-32                 10.50n ± 2%    10.44n ± 2%         ~ (p=0.249 n=20)
Float32-32                   13.80n ± 2%    13.80n ± 2%         ~ (p=0.751 n=20)
Float64-32                   23.55n ± 2%    23.87n ± 0%         ~ (p=0.408 n=20)
ExpFloat64-32                15.36n ± 1%    15.29n ± 2%         ~ (p=0.316 n=20)
NormFloat64-32               13.57n ± 1%    13.79n ± 1%    +1.66% (p=0.005 n=20)
Perm3-32                     45.70n ± 2%    46.99n ± 2%    +2.81% (p=0.001 n=20)
Perm30-32                    399.0n ± 1%    403.8n ± 1%    +1.19% (p=0.006 n=20)
Perm30ViaShuffle-32          349.0n ± 1%    350.4n ± 1%         ~ (p=0.909 n=20)
ShuffleOverhead-32           322.3n ± 1%    323.8n ± 1%         ~ (p=0.410 n=20)
Concurrent-32                3.331n ± 1%    7.312n ± 1%  +119.50% (p=0.000 n=20)

For #61716.

Change-Id: Ibdddeed85c34d9ae397289dc899e04d4845f9ed2
Reviewed-on: https://go-review.googlesource.com/c/go/+/516860
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Filippo Valsorda <filippo@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2023-12-05 20:34:30 +00:00
Joel Sing
70c7fb75e9 cmd/compile: correct code generation for right shifts on riscv64
The code generation on riscv64 will currently result in incorrect
assembly when a 32 bit integer is right shifted by an amount that
exceeds the size of the type. In particular, this occurs when an
int32 or uint32 is cast to a 64 bit type and right shifted by a
value larger than 31.

Fix this by moving the SRAW/SRLW conversion into the right shift
rules and removing the SignExt32to64/ZeroExt32to64. Add additional
rules that rewrite to SRAIW/SRLIW when the shift is less than the
size of the type, or replace/eliminate the shift when it exceeds
the size of the type.

Add SSA tests that would have caught this issue. Also add additional
codegen tests to ensure that the resulting assembly is what we
expect in these overflow cases.

Fixes #64285

Change-Id: Ie97b05668597cfcb91413afefaab18ee1aa145ec
Reviewed-on: https://go-review.googlesource.com/c/go/+/545035
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: M Zhuo <mzh@golangcn.org>
Reviewed-by: Mark Ryan <markdryan@rivosinc.com>
Run-TryBot: Joel Sing <joel@sing.id.au>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-12-01 19:30:59 +00:00
Cuong Manh Le
fbfe62bc80 cmd/compile: fix typecheck range over rune literal
With range over int, the rune literal in range expression will be left
as untyped rune, but idealType is not handling this case, causing ICE.

Fixing this by setting the concrete type for untyped rune expresison.

Fixes #64471

Change-Id: I07a151c54ea1d9e1b92e4d96cdfb6e73dca13862
Reviewed-on: https://go-review.googlesource.com/c/go/+/546296
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2023-12-01 17:20:08 +00:00
Keith Randall
bda1ef13f8 cmd/compile: fix memcombine pass for big endian, > 1 byte elements
The shift amounts were wrong in this case, leading to miscompilation
of load combining.

Also the store combining was not triggering when it should.

Fixes #64468

Change-Id: Iaeb08972c5fc1d6f628800334789c6af7216e87b
Reviewed-on: https://go-review.googlesource.com/c/go/+/546355
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Mauri de Souza Meneguzzo <mauri870@gmail.com>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2023-11-30 18:35:50 +00:00
Matthew Dempsky
00715d089d cmd/compile/internal/walk: copy SSA-able variables
order.go ensures expressions that are passed to the runtime by address
are in fact addressable. However, in the case of local variables, if the
variable hasn't already been marked as addrtaken, then taking its
address here will effectively prevent the variable from being converted
to SSA form.

Instead, it's better to just copy the variable into a new temporary,
which we can pass by address instead. This ensures the original variable
can still be converted to SSA form.

Fixes #63332.

Change-Id: I182376d98d419df8bf07c400d84c344c9b82c0fb
Reviewed-on: https://go-review.googlesource.com/c/go/+/541715
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Matthew Dempsky <mdempsky@google.com>
2023-11-21 20:34:12 +00:00
Matthew Dempsky
4a90cdb03d cmd/compile: interleave devirtualization and inlining
This CL interleaves devirtualization and inlining, so that
devirtualized calls can be inlined.

Fixes #52193.

Change-Id: I681e7c55bdb90ebf6df315d334e7a58f05110d9c
Reviewed-on: https://go-review.googlesource.com/c/go/+/528321
Auto-Submit: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Than McIntosh <thanm@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Bypass: Matthew Dempsky <mdempsky@google.com>
2023-11-20 18:09:45 +00:00
Matthew Dempsky
28a8896d57 cmd/compile/internal/inline: allow inlining of checkptr arguments
The early return here is meant to suppress inlining of the function
call itself. However, it also suppresses recursing to visit the call
arguments, which are safe to inline.

Change-Id: I75887574c00931cb622277d04a822bc84c29bfa2
Reviewed-on: https://go-review.googlesource.com/c/go/+/543658
Auto-Submit: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Than McIntosh <thanm@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2023-11-20 15:12:49 +00:00
Than McIntosh
3f04f959d2 cmd/compile/internal/inline: refactor AnalyzeFunc
This patch reworks how inlheur.AnalyzeFunc is called by the top level
inliner. Up until this point the strategy was to analyze a function at
the point where CanInline is invoked on it, however it simplifies
things to instead make the call outside of CanInline (for example, so
that directly recursive functions can be analyzed).

Also as part of this patch, change things so that we no longer run
some of the more compile-time intensive analysis on functions that
haven't been marked inlinable (so as to safe compile time), and add a
teardown/cleanup hook in the inlheur package to be invoked by the
inliner when we're done inlining.

Change-Id: Id0772a285d891b0bed66dd86adaffa69d973c26a
Reviewed-on: https://go-review.googlesource.com/c/go/+/539318
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2023-11-16 20:15:25 +00:00
Michael Anthony Knyszek
0011342590 test: ignore MemProfileRecords with no live objects in finprofiled.go
This test erroneously assumes that there will always be at least one
live object accounted for in a MemProfileRecord. This is not true; all
memory allocated from a particular location could be dead.

Fixes #64153.

Change-Id: Iadb783ea9b247823439ddc74b62a4c8b2ce8e33e
Reviewed-on: https://go-review.googlesource.com/c/go/+/542736
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2023-11-16 05:48:00 +00:00
Paul E. Murphy
773039ed5c cmd/compile/internal/ssa: on PPC64, merge (CMPconst [0] (op ...)) more aggressively
Generate the CC version of many opcodes whose result is compared against
signed 0. The approach taken here works even if the opcode result is used in
multiple places too.

Add support for ADD, ADDconst, ANDN, SUB, NEG, CNTLZD, NOR conversions
to their CC opcode variant. These are the most commonly used variants.

Also, do not set clobberFlags of CNTLZD and CNTLZW, they do not clobber
flags.

This results in about 1% smaller text sections in kubernetes binaries,
and no regressions in the crypto benchmarks.

Change-Id: I9e0381944869c3774106bf348dead5ecb96dffda
Reviewed-on: https://go-review.googlesource.com/c/go/+/538636
Run-TryBot: Paul Murphy <murp@ibm.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Jayanth Krishnamurthy <jayanth.krishnamurthy@ibm.com>
Reviewed-by: Heschi Kreinick <heschi@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
2023-11-13 22:12:32 +00:00
Paul E. Murphy
3128aeec87 cmd/internal/obj/ppc64: remove C_UCON optab matching class
This optab matching rule was used to match signed 16 bit values shifted
left by 16 bits. Unsigned 16 bit values greater than 0x7FFF<<16 were
classified as C_U32CON which led to larger than necessary codegen.

Instead, rewrite logical/arithmetic operations in the preprocessor pass
to use the 16 bit shifted immediate operation (e.g ADDIS vs ADD). This
simplifies the optab matching rules, while also minimizing codegen size
for large unsigned values.

Note, ADDIS sign-extends the constant argument, all others do not.

For matching opcodes, this means:
	MOVD $is<<16,Rx becomes ADDIS $is,Rx or ORIS $is,Rx
	MOVW $is<<16,Rx becomes ADDIS $is,Rx
	ADD $is<<16,[Rx,]Ry becomes ADDIS $is[Rx,]Ry
	OR $is<<16,[Rx,]Ry becomes ORIS $is[Rx,]Ry
	XOR $is<<16,[Rx,]Ry becomes XORIS $is[Rx,]Ry

Change-Id: I1a988d9f52517a04bb8dc2e41d7caf3d5fff867c
Reviewed-on: https://go-review.googlesource.com/c/go/+/536735
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Run-TryBot: Paul Murphy <murp@ibm.com>
Reviewed-by: Heschi Kreinick <heschi@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
2023-11-09 18:41:18 +00:00
Cherry Mui
45320c23f2 all: rename GOEXPERIMENT=range to rangefunc
Range over integer is enabled now without the GOEXPERIMENT. The
GOEXPERIMENT is only for range over func. Rename it to rangefunc.

For #61405.

Change-Id: I9405fb8e2e30827875716786823856090a1a0cad
Reviewed-on: https://go-review.googlesource.com/c/go/+/539277
Reviewed-by: Russ Cox <rsc@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2023-11-08 15:32:14 +00:00
Jorropo
8b4e1259d0 cmd/compile: fix findIndVar so it does not match disjointed loop headers
Fix #63955

parseIndVar, prove and maybe more are on the assumption that the loop header
is a single block. This can be wrong, ensure we don't match theses cases we
don't know how to handle.

In the future we could update them so that they know how to handle such cases
but theses cases seems rare so I don't think the value would be really high.
We could also run a loop canonicalization pass first which could handle this.

The repro case looks weird because I massaged it so it would crash with the
previous compiler.

Change-Id: I4aa8afae9e90a17fa1085832250fc1139c97faa6
Reviewed-on: https://go-review.googlesource.com/c/go/+/539977
Reviewed-by: Heschi Kreinick <heschi@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2023-11-07 17:37:47 +00:00
Robert Griesemer
6a7ef36466 test: run range-over-integer tests without need for -goexperiment
Move the range-over-function tests into range4.go.

Change-Id: Idccf30a0c7d7e8d2a17fb1c5561cf21e00506135
Reviewed-on: https://go-review.googlesource.com/c/go/+/539095
Reviewed-by: Cherry Mui <cherryyz@google.com>
Auto-Submit: Robert Griesemer <gri@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@google.com>
Run-TryBot: Robert Griesemer <gri@google.com>
2023-11-02 04:17:21 +00:00
Keith Randall
962ccbef91 cmd/compile: ensure pointer arithmetic happens after the nil check
Have nil checks return a pointer that is known non-nil. Users of
that pointer can use the result, ensuring that they are ordered
after the nil check itself.

The order dependence goes away after scheduling, when we've fixed
an order. At that point we move uses back to the original pointer
so it doesn't change regalloc any.

This prevents pointer arithmetic on nil from being spilled to the
stack and then observed by a stack scan.

Fixes #63657

Change-Id: I1a5fa4f2e6d9000d672792b4f90dfc1b7b67f6ea
Reviewed-on: https://go-review.googlesource.com/c/go/+/537775
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
2023-10-31 20:45:54 +00:00
Ubuntu
8fc043ccfa cmd/compile: optimize right shifts of int32 on riscv64
The compiler is currently sign extending 32 bit signed integers to
64 bits before right shifting them using a 64 bit shift instruction.
There's no need to do this as RISC-V has instructions for right
shifting 32 bit signed values (sraw and sraiw) which sign extend
the result of the shift to 64 bits.  Change the compiler so that
it uses sraw and sraiw for shifts of signed 32 bit integers reducing
in most cases the number of instructions needed to perform the shift.

Here are some examples of code sequences that are changed by this
patch:

int32(a) >> 2

  before:

    sll     x5,x10,0x20
    sra     x10,x5,0x22

  after:

    sraw    x10,x10,0x2

int32(v) >> int(s)

  before:

    sext.w  x5,x10
    sltiu   x6,x11,64
    add     x6,x6,-1
    or      x6,x11,x6
    sra     x10,x5,x6

  after:

    sltiu   x5,x11,32
    add     x5,x5,-1
    or      x5,x11,x5
    sraw    x10,x10,x5

int32(v) >> (int(s) & 31)

  before:

    sext.w  x5,x10
    and     x6,x11,63
    sra     x10,x5,x6

after:

    and     x5,x11,31
    sraw    x10,x10,x5

int32(100) >> int(a)

  before:

    bltz    x10,<target address calls runtime.panicshift>
    sltiu   x5,x10,64
    add     x5,x5,-1
    or      x5,x10,x5
    li      x6,100
    sra     x10,x6,x5

  after:

    bltz    x10,<target address calls runtime.panicshift>
    sltiu   x5,x10,32
    add     x5,x5,-1
    or      x5,x10,x5
    li      x6,100
    sraw    x10,x6,x5

int32(v) >> (int(s) & 63)

  before:

    sext.w  x5,x10
    and     x6,x11,63
    sra     x10,x5,x6

  after:

    and     x5,x11,63
    sltiu   x6,x5,32
    add     x6,x6,-1
    or      x5,x5,x6
    sraw    x10,x10,x5

In most cases we eliminate one instruction.  In the case where
we shift a int32 constant by a variable the number of instructions
generated is identical.  A sra is simply replaced by a sraw.  In the
unusual case where we shift right by a variable anded with a constant
> 31 but < 64, we generate two additional instructions.  As this is
an unusual case we do not try to optimize for it.

Some improvements can be seen in some of the existing benchmarks,
notably in the utf8 package which performs right shifts of runes
which are signed 32 bit integers.

                      |  utf8-old   |              utf8-new            |
                      |   sec/op    |   sec/op     vs base             |
EncodeASCIIRune-4       17.68n ± 0%   17.67n ± 0%       ~ (p=0.312 n=10)
EncodeJapaneseRune-4    35.34n ± 0%   34.53n ± 1%  -2.31% (p=0.000 n=10)
AppendASCIIRune-4       3.213n ± 0%   3.213n ± 0%       ~ (p=0.318 n=10)
AppendJapaneseRune-4    36.14n ± 0%   35.35n ± 0%  -2.19% (p=0.000 n=10)
DecodeASCIIRune-4       28.11n ± 0%   27.36n ± 0%  -2.69% (p=0.000 n=10)
DecodeJapaneseRune-4    38.55n ± 0%   38.58n ± 0%       ~ (p=0.612 n=10)

Change-Id: I60a91cbede9ce65597571c7b7dd9943eeb8d3cc2
Reviewed-on: https://go-review.googlesource.com/c/go/+/535115
Run-TryBot: Joel Sing <joel@sing.id.au>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Joel Sing <joel@sing.id.au>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: M Zhuo <mzh@golangcn.org>
Reviewed-by: David Chase <drchase@google.com>
2023-10-30 14:47:06 +00:00
Ian Lance Taylor
2968f5623c test: add tests that gofrontend failed
I will shortly be sending CLs to let the gofrontend code pass
these tests.

Change-Id: I53ccbdac3ac224a4fdc9577270f48136ca73a62c
Reviewed-on: https://go-review.googlesource.com/c/go/+/536537
Reviewed-by: Cherry Mui <cherryyz@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
2023-10-20 03:00:06 +00:00
Dmitri Shuralyov
b2fd76ab8d test: migrate remaining files to go:build syntax
Most of the test cases in the test directory use the new go:build syntax
already. Convert the rest. In general, try to place the build constraint
line below the test directive comment in more places.

For #41184.
For #60268.

Change-Id: I11c41a0642a8a26dc2eda1406da908645bbc005b
Cq-Include-Trybots: luci.golang.try:gotip-linux-386-longtest,gotip-linux-amd64-longtest,gotip-windows-amd64-longtest
Reviewed-on: https://go-review.googlesource.com/c/go/+/536236
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Auto-Submit: Dmitri Shuralyov <dmitshur@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2023-10-19 23:33:25 +00:00
Cuong Manh Le
1f25f96463 cmd/compile: report mismatched version set by //go:build
Fixes #63489

Change-Id: I5e02dc5165ada7f5c292d56203dc670e96eaf2c1
Reviewed-on: https://go-review.googlesource.com/c/go/+/534755
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Than McIntosh <thanm@google.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2023-10-19 11:56:10 +00:00
Paul E. Murphy
1c5862cc0a test/codegen: fix PPC64 AddLargeConst test
Commit 061d77cb70 was published in parallel with another commit
36ecff0893 which changed how certain constants were generated.

Update the test to account for the changes.

Change-Id: I314b735a34857efa02392b7a0dd9fd634e4ee428
Reviewed-on: https://go-review.googlesource.com/c/go/+/536256
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Paul Murphy <murp@ibm.com>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Paul Murphy <murp@ibm.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
2023-10-19 00:40:54 +00:00
Paul E. Murphy
061d77cb70 cmd/compile/internal/ssa: on PPC64, generate large constant paddi
This is only supported power10/linux/PPC64. This generates smaller,
faster code by merging a pli + add into paddi.

Change-Id: I1f4d522fce53aea4c072713cc119a9e0d7065acc
Reviewed-on: https://go-review.googlesource.com/c/go/+/531717
Run-TryBot: Paul Murphy <murp@ibm.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Than McIntosh <thanm@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
2023-10-18 18:04:48 +00:00
Cuong Manh Le
da7ac77380 cmd/compile: fix funcdata encode for functions with large frame size
The funcdata is encoded as varint, with the upper limit set to 1e9.
However, the stack offsets could be up to 1<<30. Thus emitOpenDeferInfo
will trigger an ICE for function with large frame size.

By using binary.PutUvarint, the frame offset could be encoded correctly
for value larger than 1<<35, allow the compiler to report the error.

Further, the runtime also do validation when reading in the funcdata
value, so a bad offset won't likely cause mis-behavior.

Fixes #52697

Change-Id: I084c243c5d24c5d31cc22d5b439f0889e42b107c
Reviewed-on: https://go-review.googlesource.com/c/go/+/535077
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Than McIntosh <thanm@google.com>
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
2023-10-18 16:32:07 +00:00
Lynn Boger
80834af206 cmd/compile: avoid ANDCCconst on PPC64 if condition not needed
In the PPC64 ISA, the instruction to do an 'and' operation
using an immediate constant is only available in the form that
also sets CR0 (i.e. clobbers the condition register.) This means
CR0 is being clobbered unnecessarily in many cases. That
affects some decisions made during some compiler passes
that check for it.

In those cases when the constant used by the ANDCC is a right
justified consecutive set of bits, a shift instruction can
be used which has the same effect if CR0 does not need to be
set. The rule to do that has been added to the late rules file
after other rules using ANDCCconst have been processed in the
main rules file.

Some codegen tests had to be updated since ANDCC is no
longer generated for some cases. A new test case was added to
verify the ANDCC is present if the results for both the AND
and CR0 are used.

Change-Id: I304f607c039a458e2d67d25351dd00aea72ba542
Reviewed-on: https://go-review.googlesource.com/c/go/+/531435
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Paul Murphy <murp@ibm.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
Reviewed-by: Jayanth Krishnamurthy <jayanth.krishnamurthy@ibm.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
2023-10-18 15:56:53 +00:00
Keith Randall
657c885fb9 cmd/compile: when combining stores, use line number of first store
var p *[2]uint32 = ...
p[0] = 0
p[1] = 0

When we combine these two 32-bit stores into a single 64-bit store,
use the line number of the first store, not the second one.
This differs from the default behavior because usually with the combining
that the compiler does, we use the line number of the last instruction
in the combo (e.g. load+add, we use the line number of the add).

This is the same behavior that gcc does in C (picking the line
number of the first of a set of combined stores).

Change-Id: Ie70bf6151755322d33ecd50e4d9caf62f7881784
Reviewed-on: https://go-review.googlesource.com/c/go/+/521678
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: David Chase <drchase@google.com>
2023-10-12 18:09:26 +00:00
David Chase
eb832afb23 cmd/compiler: make decompose shortcuts apply for PtrShaped, not just Ptr
The immediate-data interface shortcuts also apply to pointer-shaped
things like maps, channels, and functions.

Fixes #63505.

Change-Id: I43982441bf523f53e9e5d183e95aea7c6655dd6b
Reviewed-on: https://go-review.googlesource.com/c/go/+/534657
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: David Chase <drchase@google.com>
Auto-Submit: David Chase <drchase@google.com>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2023-10-12 00:48:31 +00:00
David Chase
1493dd3cfc cmd/compile: get rid of zero-sized values in call expansion
Do this by removing all stores of zero-sized anything.

Fixes #63433.

Change-Id: I5d8271edab992d15d02005fa3fe31835f2eff8fa
Reviewed-on: https://go-review.googlesource.com/c/go/+/534296
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Run-TryBot: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-10-10 16:50:46 +00:00
Cuong Manh Le
ad9e6edfdd cmd/compile: fix wrong argument of OpSelectN during expand_calls
Fixes #63462

Change-Id: I5ddf831eab630e23156f8f27a079b4ca4bb3a261
Reviewed-on: https://go-review.googlesource.com/c/go/+/533795
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
2023-10-09 22:23:06 +00:00
Keith Randall
e0948d825d cmd/compile: use type hash from itab field instead of type field
It is one less dependent load away, and right next to another
field in the itab we also load as part of the type switch or
type assert.

Change-Id: If7aaa7814c47bd79a6c7ed4232ece0bc1d63550e
Reviewed-on: https://go-review.googlesource.com/c/go/+/533117
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2023-10-09 18:39:50 +00:00
Cuong Manh Le
778880b008 cmd/compile: fix typecheck range over negative integer
Before range over integer, types2 leaves constant expression in RHS of
non-constant shift untyped, so idealType do the validation to ensure
that constant value must be an int >= 0.

With range over int, the range expression can also be left untyped, and
can be an negative integer, causing the validation false.

Fixing this by relaxing the validation in idealType, and moving the
check to Unified IR reader.

Fixes #63378

Change-Id: I43042536c09afd98d52c5981adff5dbc5e7d882a
Reviewed-on: https://go-review.googlesource.com/c/go/+/532835
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@golang.org>
2023-10-09 18:09:34 +00:00
Keith Randall
afd7c15c7f cmd/compile: use cache in front of convI2I
This is the last of the getitab users to receive a cache.
We should now no longer see getitab (and callees) in profiles.
Hopefully.

Change-Id: I2ed72b9943095bbe8067c805da7f08e00706c98c
Reviewed-on: https://go-review.googlesource.com/c/go/+/531055
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2023-10-09 17:28:22 +00:00
Cuong Manh Le
2744155d36 cmd/compile: fix ICE with parenthesized builtin calls
CL 419456 starts using lookupObj to find types2.Object associated with
builtin functions. However, the new code does not un-parenthesized the
callee expression, causing an ICE because of nil obj returned.

Un-parenthesizing the callee expression fixes the problem.

Fixes #63436

Change-Id: Iebb4fbc08575e7d0b1dbd026c98e8f949ca16460
Reviewed-on: https://go-review.googlesource.com/c/go/+/533476
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
2023-10-08 23:15:25 +00:00
Mark Ryan
561bf0457f cmd/compile: optimize right shifts of uint32 on riscv
The compiler is currently zero extending 32 bit unsigned integers to
64 bits before right shifting them using a 64 bit shift instruction.
There's no need to do this as RISC-V has instructions for right
shifting 32 bit unsigned values (srlw and srliw) which zero extend
the result of the shift to 64 bits.  Change the compiler so that
it uses srlw and srliw for 32 bit unsigned shifts reducing in most
cases the number of instructions needed to perform the shift.

Here are some examples of code sequences that are changed by this
patch:

uint32(a) >> 2

  before:

    sll     x5,x10,0x20
    srl     x10,x5,0x22

  after:

    srlw    x10,x10,0x2

uint32(a) >> int(b)

  before:

    sll     x5,x10,0x20
    srl     x5,x5,0x20
    srl     x5,x5,x11
    sltiu   x6,x11,64
    neg     x6,x6
    and     x10,x5,x6

  after:

    srlw    x5,x10,x11
    sltiu   x6,x11,32
    neg     x6,x6
    and     x10,x5,x6

bits.RotateLeft32(uint32(a), 1)

  before:

    sll     x5,x10,0x1
    sll     x6,x10,0x20
    srl     x7,x6,0x3f
    or      x5,x5,x7

  after:

   sll     x5,x10,0x1
   srlw    x6,x10,0x1f
   or      x10,x5,x6

bits.RotateLeft32(uint32(a), int(b))

  before:
    and     x6,x11,31
    sll     x7,x10,x6
    sll     x8,x10,0x20
    srl     x8,x8,0x20
    add     x6,x6,-32
    neg     x6,x6
    srl     x9,x8,x6
    sltiu   x6,x6,64
    neg     x6,x6
    and     x6,x9,x6
    or      x6,x6,x7

  after:

    and     x5,x11,31
    sll     x6,x10,x5
    add     x5,x5,-32
    neg     x5,x5
    srlw    x7,x10,x5
    sltiu   x5,x5,32
    neg     x5,x5
    and     x5,x7,x5
    or      x10,x6,x5

The one regression observed is the following case, an unbounded right
shift of a uint32 where the value we're shifting by is known to be
< 64 but > 31.  As this is an unusual case this commit does not
optimize for it, although the existing code does.

uint32(a) >> (b & 63)

  before:

    sll     x5,x10,0x20
    srl     x5,x5,0x20
    and     x6,x11,63
    srl     x10,x5,x6

  after

    and     x5,x11,63
    srlw    x6,x10,x5
    sltiu   x5,x5,32
    neg     x5,x5
    and     x10,x6,x5

Here we have one extra instruction.

Some benchmark highlights, generated on a VisionFive2 8GB running
Ubuntu 23.04.

pkg: math/bits
LeadingZeros32-4    18.64n ± 0%     17.32n ± 0%   -7.11% (p=0.000 n=10)
LeadingZeros64-4    15.47n ± 0%     15.51n ± 0%   +0.26% (p=0.027 n=10)
TrailingZeros16-4   18.48n ± 0%     17.68n ± 0%   -4.33% (p=0.000 n=10)
TrailingZeros32-4   16.87n ± 0%     16.07n ± 0%   -4.74% (p=0.000 n=10)
TrailingZeros64-4   15.26n ± 0%     15.27n ± 0%   +0.07% (p=0.043 n=10)
OnesCount32-4       20.08n ± 0%     19.29n ± 0%   -3.96% (p=0.000 n=10)
RotateLeft-4        8.864n ± 0%     8.838n ± 0%   -0.30% (p=0.006 n=10)
RotateLeft32-4      8.837n ± 0%     8.032n ± 0%   -9.11% (p=0.000 n=10)
Reverse32-4         29.77n ± 0%     26.52n ± 0%  -10.93% (p=0.000 n=10)
ReverseBytes32-4    9.640n ± 0%     8.838n ± 0%   -8.32% (p=0.000 n=10)
Sub32-4             8.835n ± 0%     8.035n ± 0%   -9.06% (p=0.000 n=10)
geomean             11.50n          11.33n        -1.45%

pkg: crypto/md5
Hash8Bytes-4             1.486µ ± 0%   1.426µ ± 0%  -4.04% (p=0.000 n=10)
Hash64-4                 2.079µ ± 0%   1.968µ ± 0%  -5.36% (p=0.000 n=10)
Hash128-4                2.720µ ± 0%   2.557µ ± 0%  -5.99% (p=0.000 n=10)
Hash256-4                3.996µ ± 0%   3.733µ ± 0%  -6.58% (p=0.000 n=10)
Hash512-4                6.541µ ± 0%   6.072µ ± 0%  -7.18% (p=0.000 n=10)
Hash1K-4                 11.64µ ± 0%   10.75µ ± 0%  -7.58% (p=0.000 n=10)
Hash8K-4                 82.95µ ± 0%   76.32µ ± 0%  -7.99% (p=0.000 n=10)
Hash1M-4                10.436m ± 0%   9.591m ± 0%  -8.10% (p=0.000 n=10)
Hash8M-4                 83.50m ± 0%   76.73m ± 0%  -8.10% (p=0.000 n=10)
Hash8BytesUnaligned-4    1.494µ ± 0%   1.434µ ± 0%  -4.02% (p=0.000 n=10)
Hash1KUnaligned-4        11.64µ ± 0%   10.76µ ± 0%  -7.52% (p=0.000 n=10)
Hash8KUnaligned-4        83.01µ ± 0%   76.32µ ± 0%  -8.07% (p=0.000 n=10)
geomean                  28.32µ        26.42µ       -6.72%

Change-Id: I20483a6668cca1b53fe83944bee3706aadcf8693
Reviewed-on: https://go-review.googlesource.com/c/go/+/528975
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Joel Sing <joel@sing.id.au>
Run-TryBot: Joel Sing <joel@sing.id.au>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-10-07 12:31:38 +00:00
mstmdev
be3d5fb6e6 sync: use atomic.Uint32 in Once
Change-Id: If9089f8afd78de3e62cd575f642ff96ab69e2099
GitHub-Last-Rev: 14165018d6
GitHub-Pull-Request: golang/go#63386
Reviewed-on: https://go-review.googlesource.com/c/go/+/532895
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Jorropo <jorropo.pgm@gmail.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
2023-10-06 21:01:50 +00:00
David Chase
b72bbaebf9 cmd/compile: expand calls cleanup
Convert expand calls into a smaller number of focused
recursive rewrites, and rely on an enhanced version of
"decompose" to clean up afterwards.

Debugging information seems to emerge intact.

Change-Id: Ic46da4207e3a4da5c8e2c47b637b0e35abbe56bb
Reviewed-on: https://go-review.googlesource.com/c/go/+/507295
Run-TryBot: David Chase <drchase@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-10-06 20:57:33 +00:00
Keith Randall
cbcf8efa5f cmd/compile: use cache in front of type assert runtime call
That way we don't need to call into the runtime for every
type assertion (to an interface type).

name           old time/op  new time/op  delta
TypeAssert-24  3.78ns ± 3%  1.00ns ± 1%  -73.53%  (p=0.000 n=10+8)

Change-Id: I0ba308aaf0f24a5495b4e13c814d35af0c58bfde
Reviewed-on: https://go-review.googlesource.com/c/go/+/529316
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
2023-10-06 17:02:53 +00:00
Keith Randall
39263f34a3 cmd/compile: add a cache to interface type switches
That way we don't need to call into the runtime when the type being
switched on has been seen many times before.

The cache is just a hash table of a sample of all the concrete types
that have been switched on at that source location.  We record the
matching case number and the resulting itab for each concrete input
type.

The caches seldom get large. The only two in a run of all.bash that
get more than 100 entries, even with the sampling rate set to 1, are

test/fixedbugs/issue29264.go, with 101
test/fixedbugs/issue29312.go, with 254

Both happen at the type switch in fmt.(*pp).handleMethods, perhaps
unsurprisingly.

name                                 old time/op  new time/op  delta
SwitchInterfaceTypePredictable-24    25.8ns ± 2%   2.5ns ± 3%  -90.43%  (p=0.000 n=10+10)
SwitchInterfaceTypeUnpredictable-24  37.5ns ± 2%  11.2ns ± 1%  -70.02%  (p=0.000 n=10+10)

Change-Id: I4961ac9547b7f15b03be6f55cdcb972d176955eb
Reviewed-on: https://go-review.googlesource.com/c/go/+/526658
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Keith Randall <khr@google.com>
2023-10-06 15:44:08 +00:00
Keith Randall
28f4ea16a2 cmd/compile: improve interface type switches
For type switches where the targets are interface types,
call into the runtime once instead of doing a sequence
of assert* calls.

name                                 old time/op  new time/op  delta
SwitchInterfaceTypePredictable-24    26.6ns ± 1%  25.8ns ± 2%  -2.86%  (p=0.000 n=10+10)
SwitchInterfaceTypeUnpredictable-24  39.3ns ± 1%  37.5ns ± 2%  -4.57%  (p=0.000 n=10+10)

Not super helpful by itself, but this code organization allows
followon CLs that add caching to the lookups.

Change-Id: I7967f85a99171faa6c2550690e311bea8b54b01c
Reviewed-on: https://go-review.googlesource.com/c/go/+/526657
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Keith Randall <khr@google.com>
2023-10-06 15:42:30 +00:00
Cuong Manh Le
38db0316f4 cmd/compile: do not fatal when typechecking conversion expression
The types2 typechecker already reported all invalid conversions required
by the Go language spec. However, the conversion involves go pragma is
not specified in the spec, so is not checked by types2.

Fixing this by handling the error gracefully during typecheck, just like
how old typechecker did before CL 394575.

Fixes #63333

Change-Id: I04c4121971c62d96f75ded1794ab4bdf3a6cd0ea
Reviewed-on: https://go-review.googlesource.com/c/go/+/532515
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Keith Randall <khr@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
2023-10-05 19:44:52 +00:00
Paul E. Murphy
dcd018b5c5 cmd/internal/obj/ppc64: generate MOVD mask constants in register
Add a new form of RLDC which maps directly to the ISA definition
of rldc: RLDC Rs, $sh, $mb, Ra. This is used to generate mask
constants described below.

Using MOVD $-1, Rx; RLDC Rx, $sh, $mb, Rx, any mask constant
can be generated. A mask is a contiguous series of 1 bits, which
may wrap.

Change-Id: Ifcaae1114080ad58b5fdaa3e5fc9019e2051f282
Reviewed-on: https://go-review.googlesource.com/c/go/+/531120
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
Run-TryBot: Paul Murphy <murp@ibm.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-10-05 14:03:32 +00:00
Paul E. Murphy
36ecff0893 cmd/internal/obj/ppc64: generate small, shifted constants in register
Check for shifted 16b constants, and transform them to avoid the load
penalty. This should be much faster than loading, and reduce binary
size by reducing the constant pool size.

Change-Id: I6834e08be7ca88e3b77449d226d08d199db84299
Reviewed-on: https://go-review.googlesource.com/c/go/+/531119
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Run-TryBot: Paul Murphy <murp@ibm.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2023-10-04 19:10:19 +00:00
Jes Cok
81c5d92f52 all: use the indefinite article an in comments
Change-Id: I8787458f9ccd3b5cdcdda820d8a45deb4f77eade
GitHub-Last-Rev: be865d67ef
GitHub-Pull-Request: golang/go#63165
Reviewed-on: https://go-review.googlesource.com/c/go/+/530120
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Than McIntosh <thanm@google.com>
2023-09-25 14:29:30 +00:00
Paul E. Murphy
c8caad423c cmd/compile/internal/ssa: optimize (AND (MOVDconst [-1] x)) on PPC64
This sequence can show up in the lowering pass on PPC64. If it
makes it to the latelower pass, it will cause an error because
it looks like it can be turned into RLDICL, but -1 isn't an
accepted mask.

Also, print more debug info if panic is called from
encodePPC64RotateMask.

Fixes #62698

Change-Id: I0f3322e2205357abe7fc28f96e05e3f7ad65567c
Reviewed-on: https://go-review.googlesource.com/c/go/+/529195
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
Run-TryBot: Paul Murphy <murp@ibm.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2023-09-22 14:09:29 +00:00
eric fang
ace1494d92 cmd/compile: optimize absorbing InvertFlags into Noov comparisons for arm64
Previously (LessThanNoov (InvertFlags x)) is lowered as:
CSET
CSET
BIC
With this CL it's lowered as:
CSET
CSEL
This saves one instruction.

Similarly (GreaterEqualNoov (InvertFlags x)) is now lowered as:
CSET
CSINC

$ benchstat old.bench new.bench
goos: linux
goarch: arm64
                       │  old.bench  │             new.bench              │
                       │   sec/op    │   sec/op     vs base               │
InvertLessThanNoov-160   2.249n ± 2%   2.190n ± 1%  -2.62% (p=0.003 n=10)

Change-Id: Idd8979b7f4fe466e74b1a201c4aba7f1b0cffb0b
Reviewed-on: https://go-review.googlesource.com/c/go/+/526237
Reviewed-by: Heschi Kreinick <heschi@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Eric Fang <eric.fang@arm.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2023-09-21 02:36:06 +00:00
Russ Cox
2fba42cb52 cmd/compile: implement range over func
Add compiler support for range over functions.
See the large comment at the top of
cmd/compile/internal/rangefunc/rewrite.go for details.

This is only reachable if GOEXPERIMENT=range is set,
because otherwise type checking will fail.

For proposal #61405 (but behind a GOEXPERIMENT).
For #61717.

Change-Id: I05717f94e63089c503acc49b28b47edeb4e011b4
Reviewed-on: https://go-review.googlesource.com/c/go/+/510541
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Auto-Submit: Russ Cox <rsc@golang.org>
2023-09-20 14:52:38 +00:00
Russ Cox
a94347a05c cmd/compile: implement range over integer
Add compiler implementation of range over integers.
This is only reachable if GOEXPERIMENT=range is set,
because otherwise type checking will fail.

For proposal #61405 (but behind a GOEXPERIMENT).
For #61717.

Change-Id: I4e35a73c5df1ac57f61ffb54033a433967e5be51
Reviewed-on: https://go-review.googlesource.com/c/go/+/510538
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Auto-Submit: Russ Cox <rsc@golang.org>
2023-09-20 14:52:33 +00:00
Russ Cox
8b727f856e cmd/compile, go/types: typechecking of range over int, func
Add type-checking logic for range over integers and functions,
behind GOEXPERIMENT=range.

For proposal #61405 (but behind a GOEXPERIMENT).
For #61717.

Change-Id: Ibf78cf381798b450dbe05eb922df82af2b009403
Reviewed-on: https://go-review.googlesource.com/c/go/+/510537
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Russ Cox <rsc@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Robert Findley <rfindley@google.com>
2023-09-20 14:52:29 +00:00
Than McIntosh
0b07bbd2be cmd/compile/internal/inl: inline based on scoring when GOEXPERIMENT=newinliner
This patch changes the inliner to use callsite scores when deciding to
inline as opposed to looking only at callee cost/hairyness.

For this to work, we have to relax the inline budget cutoff as part of
CanInline to allow for the possibility that a given function might
start off with a cost of N where N > 80, but then be called from a
callsites whose score is less than 80. Once a given function F in
package P has been approved by CanInline (based on the relaxed budget)
it will then be emitted as part of the export data, meaning that other
packages importing P will need to also need to compute callsite scores
appropriately.

For a function F that calls function G, if G is marked as potentially
inlinable then the hairyness computation for F will use G's cost for
the call to G as opposed to the default call cost; for this to work
with the new scheme (given relaxed cost change described above) we
use G's cost only if it falls below inlineExtraCallCost, otherwise
just use inlineExtraCallCost.

Included in this patch are a bunch of skips and workarounds to
selected 'errorcheck' tests in the <GOROOT>/test directory to deal
with the additional "can inline" messages emitted when the new inliner
is turned on.

Change-Id: I9be5f8cd0cd8676beb4296faf80d2f6be7246335
Reviewed-on: https://go-review.googlesource.com/c/go/+/519197
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2023-09-14 19:43:26 +00:00
Matthew Dempsky
d18e9407b0 cmd/compile/internal/ir: add Func.DeclareParams
There's several copies of this function. We only need one.

While here, normalize so that we always declare parameters, and always
use the names ~pNN for params and ~rNN for results.

Change-Id: I49e90d3fd1820f3c07936227ed5cfefd75d49a1c
Reviewed-on: https://go-review.googlesource.com/c/go/+/528415
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Auto-Submit: Matthew Dempsky <mdempsky@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Than McIntosh <thanm@google.com>
2023-09-14 13:15:50 +00:00
Yi Yang
4ee1d542ed cmd/compile: sparse conditional constant propagation
sparse conditional constant propagation can discover optimization
opportunities that cannot be found by just combining constant folding
and constant propagation and dead code elimination separately.

This is a re-submit of PR#59575, which fix a broken dominance relationship caught by ssacheck

Updates https://github.com/golang/go/issues/59399

Change-Id: I57482dee38f8e80a610aed4f64295e60c38b7a47
GitHub-Last-Rev: 830016f24e
GitHub-Pull-Request: golang/go#60469
Reviewed-on: https://go-review.googlesource.com/c/go/+/498795
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Heschi Kreinick <heschi@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
2023-09-12 21:01:50 +00:00
Matthew Dempsky
2e457b3868 cmd/compile/internal/noder: handle unsafe.Sizeof, etc in unified IR
Previously, the unified frontend implemented unsafe.Sizeof, etc that
involved derived types by constructing a normal OSIZEOF, etc
expression, including fully instantiating their argument. (When
unsafe.Sizeof is applied to a non-generic type, types2 handles
constant folding it.)

This worked, but involves unnecessary work, since all we really need
to track is the argument type (and the field selections, for
unsafe.Offsetof).

Further, the argument expression could generate temporary variables,
which would then go unused after typecheck replaced the OSIZEOF
expression with an OLITERAL. This results in compiler failures after
CL 523315, which made later passes stricter about expecting the
frontend to not construct unused temporaries.

Fixes #62515.

Change-Id: I37baed048fd2e35648c59243f66c97c24413aa94
Reviewed-on: https://go-review.googlesource.com/c/go/+/527097
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Auto-Submit: Matthew Dempsky <mdempsky@google.com>
2023-09-11 20:48:07 +00:00
Matthew Dempsky
afa3f8e104 cmd/compile/internal/staticinit: make staticopy safe
Currently, cmd/compile optimizes `var a = true; var b = a` into `var a
= true; var b = true`. But this may not be safe if we need to
initialize any other global variables between `a` and `b`, and the
initialization involves calling a user-defined function that reassigns
`a`.

This CL changes staticinit to keep track of the initialization
expressions that we've seen so far, and to stop applying the
staticcopy optimization once we've seen an initialization expression
that might have modified another global variable within this package.

To help identify affected initializers, this CL adds a -d=staticcopy
flag to warn when a staticcopy is suppressed and turned into a dynamic
copy.

Currently, `go build -gcflags=all=-d=staticcopy std` reports only four
instances:

```
encoding/xml/xml.go:1600:5: skipping static copy of HTMLEntity+0 with map[string]string{...}
encoding/xml/xml.go:1869:5: skipping static copy of HTMLAutoClose+0 with []string{...}
net/net.go:661:5: skipping static copy of .stmp_31+0 with poll.ErrNetClosing
net/http/transport.go:2566:5: skipping static copy of errRequestCanceled+0 with ~R0
```

Fixes #51913.

Change-Id: Iab41cf6f84c44f7f960e4e62c28a8aeaade4fbcf
Reviewed-on: https://go-review.googlesource.com/c/go/+/395541
Auto-Submit: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Heschi Kreinick <heschi@google.com>
2023-09-11 20:12:05 +00:00
Matthew Dempsky
e3ce312621 cmd/compile/internal/typecheck: fix closure field naming
When creating the struct type to hold variables captured by a function
literal, we currently reuse the captured variable names as fields.

However, there's no particular reason to do this: these struct types
aren't visible to users, and it adds extra complexity in making sure
fields belong to the correct packages.

Further, it turns out we were getting that subtly wrong. If two
function literals from different packages capture variables with
identical names starting with an uppercase letter (and in the same
order and with corresponding identical types) end up in the same
function (e.g., due to inlining), then we could end up creating
closure struct types that are "different" (i.e., not types.Identical)
yet end up with equal LinkString representations (which violates
LinkString's contract).

The easy fix is to just always use simple, exported, generated field
names in the struct. This should allow further struct reuse across
packages too, and shrink binary sizes slightly.

Fixes #62498.

Change-Id: I9c973f5087bf228649a8f74f7dc1522d84a26b51
Reviewed-on: https://go-review.googlesource.com/c/go/+/527135
Auto-Submit: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2023-09-11 16:02:11 +00:00
Matthew Dempsky
18c6ec1e4a cmd/compile/internal/noder: stop preserving original const strings
One of the more tedious quirks of the original frontend (i.e.,
typecheck) to preserve was that it preserved the original
representation of constants into the backend. To fit into the unified
IR model, I ended up implementing a fairly heavyweight workaround:
simply record the original constant's string expression in the export
data, so that diagnostics could still report it back, and match the
old test expectations.

But now that there's just a single frontend to support, it's easy
enough to just update the test expectations and drop this support for
"raw" constant expressions.

Change-Id: I1d859c5109d679879d937a2b213e777fbddf4f2f
Reviewed-on: https://go-review.googlesource.com/c/go/+/526376
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
2023-09-08 18:50:24 +00:00
Keith Randall
fb5bdb4cc9 cmd/compile: absorb InvertFlags into Noov comparisons
Unfortunately, there isn't a single op that provides the resulting
computation.
At least, I couldn't find one.

Fixes #62469

Change-Id: I236f3965b827aaeb3d70ef9fe89be66b116494f5
Reviewed-on: https://go-review.googlesource.com/c/go/+/526276
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Keith Randall <khr@google.com>
2023-09-07 15:14:39 +00:00
Paul E. Murphy
5cdb132228 cmd/compile/internal/ssa: improve masking codegen on PPC64
Generate RLDIC[LR] instead of MOVD mask, Rx; AND Rx, Ry, Rz.
This helps reduce code size, and reduces the latency caused
by the constant load.

Similarly, for smaller-than-register values, truncate constants
which exceed the range of the value's type to avoid needing to
load a constant.

Change-Id: I6019684795eb8962d4fd6d9585d08b17c15e7d64
Reviewed-on: https://go-review.googlesource.com/c/go/+/515576
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Run-TryBot: Paul Murphy <murp@ibm.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2023-09-06 16:34:20 +00:00
Keith Randall
0e02baa59a Revert "cmd/compile: use shorter ANDL/TESTL if upper 32 bits are known to be zero"
This reverts commit c1dfbf72e1.

Reason for revert: TESTL rule is wrong when the result is used for an ordered comparison.

Fixes #62360

Change-Id: I4d5b6aca24389b0a2bf767bfbc0a9d085359eb38
Reviewed-on: https://go-review.googlesource.com/c/go/+/524255
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Jakub Ciolek <jakub@ciolek.dev>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
2023-08-30 15:15:28 +00:00
Matthew Dempsky
4f97a7e6ea cmd/compile/internal/ir: set Addrtaken on Canonical ONAME too
In CL 522879, I moved the logic for setting Addrtaken from typecheck's
markAddrOf and ComputeAddrtaken directly into ir.NewAddrExpr. However,
I took the logic from markAddrOf, and failed to notice that
ComputeAddrtaken also set Addrtaken on the canonical ONAME.

The result is that if the only address-of expressions were within a
function literal, the canonical variable never got marked Addrtaken.
In turn, this could cause the consistency check in ir.Reassigned to
fail. (Yay for consistency checks turning mistakes into ICEs, rather
than miscompilation.)

Fixes #62313.

Change-Id: Ieab2854cd7fcc1b6c5d1e61de66453add9890a4f
Reviewed-on: https://go-review.googlesource.com/c/go/+/523375
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Auto-Submit: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-08-28 14:32:14 +00:00
Keith Randall
b303fb4855 runtime: fix maps.Clone bug when cloning a map mid-grow
Fixes #62203

Change-Id: I0459d3f481b0cd20102f6d9fd3ea84335a7739a8
Reviewed-on: https://go-review.googlesource.com/c/go/+/522317
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-08-25 16:10:03 +00:00
Bryan C. Mills
5374c1aaf5 cmd/internal/testdir: parse past gofmt'd //go:build lines
Also gofmt a test file to make sure the parser works.

Fixes #62267.

Change-Id: I9b9f12b06bae7df626231000879b5ed7df3cd9ba
Reviewed-on: https://go-review.googlesource.com/c/go/+/522635
Reviewed-by: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Bryan Mills <bcmills@google.com>
Run-TryBot: Bryan Mills <bcmills@google.com>
2023-08-24 17:52:48 +00:00
Bryan C. Mills
fe2c686b63 test/fixedbugs: require cgo for issue10607.go
This test passes "-linkmode=external" to 'go run' to link the binary
using the system C linker.

CGO_ENABLED=0 explicitly tells cmd/go not to use the C toolchain,
so the test should not be run in that configuration.

Updates #46330.

Change-Id: I16ac66aac91178045f9decaeb28134061e9711f7
Reviewed-on: https://go-review.googlesource.com/c/go/+/522495
Reviewed-by: Heschi Kreinick <heschi@google.com>
Reviewed-by: Than McIntosh <thanm@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Bryan Mills <bcmills@google.com>
Run-TryBot: Bryan Mills <bcmills@google.com>
2023-08-24 17:19:11 +00:00
Matthew Dempsky
f92741b1d8 cmd/compile/internal/noder: elide statically known "if" statements
In go.dev/cl/517775, I moved the frontend's deadcode elimination pass
into unified IR. But I also made a small enhancement: a branch like
"if x || true" is now detected as always taken, so the else branch can
be eliminated.

However, the inliner also has an optimization for delaying the
introduction of the result temporary variables when there's a single
return statement (added in go.dev/cl/266199). Consequently, the
inliner turns "if x || true { return true }; return true" into:

	if x || true {
		~R0 := true
		goto .i0
	}
	.i0:
	// code that uses ~R0

In turn, this confuses phi insertion, because it doesn't recognize
that the "if" statement is always taken, and so ~R0 will always be
initialized.

With this CL, after inlining we instead produce:

	_ = x || true
	~R0 := true
	goto .i0
	.i0:

Fixes #62211.

Change-Id: Ic8a12c9eb85833ee4e5d114f60e6c47817fce538
Reviewed-on: https://go-review.googlesource.com/c/go/+/522096
Reviewed-by: Than McIntosh <thanm@google.com>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
2023-08-23 18:01:55 +00:00
Keith Randall
4711299661 cmd/compile: use jump tables for large type switches
For large interface -> concrete type switches, we can use a jump
table on some bits of the type hash instead of a binary search on
the type hash.

name                        old time/op  new time/op  delta
SwitchTypePredictable-24    1.99ns ± 2%  1.78ns ± 5%  -10.87%  (p=0.000 n=10+10)
SwitchTypeUnpredictable-24  11.0ns ± 1%   9.1ns ± 2%  -17.55%  (p=0.000 n=7+9)

Change-Id: Ida4768e5d62c3ce1c2701288b72664aaa9e64259
Reviewed-on: https://go-review.googlesource.com/c/go/+/521497
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Keith Randall <khr@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Keith Randall <khr@golang.org>
2023-08-23 00:30:54 +00:00
Keith Randall
556e9c5f3e cmd/compile: allow non-pointer writes in the middle of a write barrier
This lets us combine more write barriers, getting rid of some of the
test+branch and gcWriteBarrier* calls.
With the new write barriers, it's easy to add a few non-pointer writes
to the set of values written.

We allow up to 2 non-pointer writes between pointer writes. This is enough
for, for example, adjacent slice fields.

Fixes #62126

Change-Id: I872d0fa9cc4eb855e270ffc0223b39fde1723c4b
Reviewed-on: https://go-review.googlesource.com/c/go/+/521498
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Keith Randall <khr@google.com>
2023-08-23 00:16:06 +00:00
Austin Clements
596120fdc6 cmd/compile: redo IsRuntimePkg/IsReflectPkg predicate
Currently, the types package has IsRuntimePkg and IsReflectPkg
predicates for testing if a Pkg is the runtime or reflect packages.
IsRuntimePkg returns "true" for any "CompilingRuntime" package, which
includes all of the packages imported by the runtime. This isn't
inherently wrong, except that all but one use of it is of the form "is
this Sym a specific runtime.X symbol?" for which we clearly only want
the package "runtime" itself. IsRuntimePkg was introduced (as
isRuntime) in CL 37538 as part of separating the real runtime package
from the compiler built-in fake runtime package. As of that CL, the
"runtime" package couldn't import any other packages, so this was
adequate at the time.

We could fix this by just changing the implementation of IsRuntimePkg,
but the meaning of this API is clearly somewhat ambiguous. Instead, we
replace it with a new RuntimeSymName function that returns the name of
a symbol if it's in package "runtime", or "" if not. This is what
every call site (except one) actually wants, which lets us simplify
the callers, and also more clearly addresses the ambiguity between
package "runtime" and the general concept of a runtime package.

IsReflectPkg doesn't have the same issue of ambiguity, but it
parallels IsRuntimePkg and is used in the same way, so we replace it
with a new ReflectSymName for consistency.

Change-Id: If3a81d7d11732a9ab2cac9488d17508415cfb597
Reviewed-on: https://go-review.googlesource.com/c/go/+/521696
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-08-22 19:18:21 +00:00
Meng Zhuo
63ab68ddc5 cmd/compile: add single-precision FMA code generation for riscv64
This CL adds FMADDS,FMSUBS,FNMADDS,FNMSUBS SSA support for riscv

Change-Id: I1e7dd322b46b9e0f4923dbba256303d69ed12066
Reviewed-on: https://go-review.googlesource.com/c/go/+/506616
Reviewed-by: Joel Sing <joel@sing.id.au>
Reviewed-by: David Chase <drchase@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Run-TryBot: M Zhuo <mzh@golangcn.org>
2023-08-22 12:05:36 +00:00
Meng Zhuo
05f9511582 cmd/compile: improve FP FMA performance on riscv64
FMADD/FMSUB/FNSUB are an efficient FP FMA instructions, which can
be used by the compiler to improve FP performance.

Erf               188.0n ± 2%   139.5n ± 2%  -25.82% (p=0.000 n=10)
Erfc              193.6n ± 1%   143.2n ± 1%  -26.01% (p=0.000 n=10)
Erfinv            244.4n ± 2%   172.6n ± 0%  -29.40% (p=0.000 n=10)
Erfcinv           244.7n ± 2%   173.0n ± 1%  -29.31% (p=0.000 n=10)
geomean           216.0n        156.3n       -27.65%

Ref: The RISC-V Instruction Set Manual Volume I: Unprivileged ISA
11.6 Single-Precision Floating-Point Computational Instructions

Change-Id: I89aa3a4df7576fdd47f4a6ee608ac16feafd093c
Reviewed-on: https://go-review.googlesource.com/c/go/+/506036
Reviewed-by: Joel Sing <joel@sing.id.au>
Run-TryBot: M Zhuo <mzh@golangcn.org>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-08-22 08:38:08 +00:00
Matthew Dempsky
fecf51717f cmd/compile/internal/walk: reuse runtime.scase
Shaves ~1.5kB off cmd/go binary.

Change-Id: I8ad85aa4a24bc197b009c8e1ea9201957222152a
Reviewed-on: https://go-review.googlesource.com/c/go/+/521677
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Auto-Submit: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
2023-08-21 23:34:34 +00:00
Matthew Dempsky
42f4ccb6f9 cmd/compile/internal/reflectdata: share hmap and hiter types
There's no need for distinct hmap and hiter types for each map.

Shaves 9kB off cmd/go binary size.

Change-Id: I7bc3b2d8ec82e7fcd78c1cb17733ebd8b615990a
Reviewed-on: https://go-review.googlesource.com/c/go/+/521615
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Matthew Dempsky <mdempsky@google.com>
2023-08-21 23:29:33 +00:00
Keith Randall
0b47b94a62 cmd/compile: remove more extension ops when not needed
If we're not using the upper bits, don't bother issuing a
sign/zero extension operation.

For arm64, after CL 520916 which fixed a correctness bug with
extensions but as a side effect leaves many unnecessary ones
still in place.

Change-Id: I5f4fe4efbf2e9f80969ab5b9a6122fb812dc2ec0
Reviewed-on: https://go-review.googlesource.com/c/go/+/521496
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Keith Randall <khr@google.com>
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-08-21 20:52:15 +00:00
Matthew Dempsky
d63c88d695 cmd/compile: enable -d=zerocopy by default
Fixes #2205.

Change-Id: Ib0802fee2b274798b35f0ebbd0b736b1be5ae00a
Reviewed-on: https://go-review.googlesource.com/c/go/+/520600
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Than McIntosh <thanm@google.com>
Auto-Submit: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
2023-08-18 11:59:42 +00:00
Matthew Dempsky
925d2fb36c cmd/compile: restore zero-copy string->[]byte optimization
This CL implements the remainder of the zero-copy string->[]byte
conversion optimization initially attempted in go.dev/cl/520395, but
fixes the tracking of mutations due to ODEREF/ODOTPTR assignments, and
adds more comprehensive tests that I should have included originally.

However, this CL also keeps it behind the -d=zerocopy flag. The next
CL will enable it by default (for easier rollback).

Updates #2205.

Change-Id: Ic330260099ead27fc00e2680a59c6ff23cb63c2b
Reviewed-on: https://go-review.googlesource.com/c/go/+/520599
Auto-Submit: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Than McIntosh <thanm@google.com>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
2023-08-18 11:58:37 +00:00
Matthew Dempsky
b75ac7a8d5 Revert "cmd/compile: enable zero-copy string->[]byte conversions"
This reverts CL 520395.

Reason for revert: thanm@ pointed out failure cases.

Change-Id: I3fd60b73118be3652be2c08b77ab39e793b42110
Reviewed-on: https://go-review.googlesource.com/c/go/+/520596
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Than McIntosh <thanm@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Auto-Submit: Matthew Dempsky <mdempsky@google.com>
2023-08-17 20:02:02 +00:00
Matthew Dempsky
c8adb3004f cmd/compile: enable zero-copy string->[]byte conversions
This CL enables the latent support for string->[]byte conversions
added go.dev/cl/520259.

One catch is that we need to make sure []byte("") evaluates to a
non-nil slice, even if "" is (nil, 0). This CL addresses that by
adding a "ptr != nil" check for OSTR2BYTESTMP, unless the NonNil flag
is set.

The existing uses of OSTR2BYTESTMP (which aren't concerned about
[]byte("") evaluating to nil) are updated to set this flag.

Fixes #2205.

Change-Id: I35a9cb16c164cd86156b7560915aba5108d8b523
Reviewed-on: https://go-review.googlesource.com/c/go/+/520395
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
2023-08-17 19:37:36 +00:00
Matthew Dempsky
2c51ea11b0 cmd/compile/internal/typecheck: push ONEW into go/defer wrappers
Currently, we rewrite:

	go f(new(T))

into:

	tmp := new(T)
	go func() { f(tmp) }()

However, we can both shrink the closure and improve escape analysis by
instead rewriting it into:

	go func() { f(new(T)) }()

This CL does that.

Change-Id: Iae16a476368da35123052ca9ff41c49159980458
Reviewed-on: https://go-review.googlesource.com/c/go/+/520340
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Auto-Submit: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
2023-08-17 19:37:04 +00:00
Matthew Dempsky
7e2e648a2d cmd/compile/internal/typecheck: normalize go/defer statements earlier
Normalizing go/defer statements to always use functions with zero
parameters and zero results was added to escape analysis, because that
was the earliest point at which all three frontends converged. Now
that we only have the unified frontend, we can do it during typecheck,
which is where we perform all other desugaring and normalization
rewrites.

Change-Id: Iebf7679b117fd78b1dffee2974bbf85ebc923b23
Reviewed-on: https://go-review.googlesource.com/c/go/+/520260
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Auto-Submit: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-08-17 19:36:58 +00:00
Matthew Dempsky
f4e6815652 cmd/compile/internal/noder: remove inlined closure naming hack
I previously used a clumsy hack to copy Closgen back and forth while
inlining, to handle when an inlined function contains closures, which
need to each be uniquely numbered.

The real solution was to name the closures using r.inlCaller, rather
than r.curfn. This CL adds a helper method to do exactly this.

Change-Id: I510553b5d7a8f6581ea1d21604e834fd6338cb06
Reviewed-on: https://go-review.googlesource.com/c/go/+/520339
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Auto-Submit: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
2023-08-17 19:36:29 +00:00
Matthew Dempsky
ff47dd1d66 cmd/compile/internal/escape: optimize indirect closure calls
This CL extends escape analysis in two ways.

First, we already optimize directly called closures. For example,
given:

	var x int  // already stack allocated today
	p := func() *int { return &x }()

we don't need to move x to the heap, because we can statically track
where &x flows. This CL extends the same idea to work for indirectly
called closures too, as long as we know everywhere that they're
called. For example:

	var x int  // stack allocated after this CL
	f := func() *int { return &x }
	p := f()

This will allow a subsequent CL to move the generation of go/defer
wrappers earlier.

Second, this CL adds tracking to detect when pointer values flow to
the pointee operand of an indirect assignment statement (i.e., flows
to p in "*p = x") or to builtins that modify memory (append, copy,
clear). This isn't utilized in the current CL, but a subsequent CL
will make use of it to better optimize string->[]byte conversions.

Updates #2205.

Change-Id: I610f9c531e135129c947684833e288ce64406f35
Reviewed-on: https://go-review.googlesource.com/c/go/+/520259
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
2023-08-17 16:36:09 +00:00
David Chase
e72ecc6a6b cmd/compile: in expandCalls, move all arg marshalling into call block
For aggregate-typed arguments passed to a call, expandCalls
decomposed them into parts in the same block where the value
was created.  This is not necessarily the call block, and in
the case where stores are involved, can change the memory
leaving that block, and getting that right is problematic.

Instead, do all the expanding in the same block as the call,
which avoids the problems of (1) not being able to reorder
loads/stores across a block boundary to conform to memory
order and (2) (incorrectly, not) exposing the new memory to
consumers in other blocks.  Putting it all in the same block
as the call allows reordering, and the call creates its own
new memory (which is already dealt with correctly).

Fixes #61992.

Change-Id: Icc7918f0d2dd3c480cc7f496cdcd78edeca7f297
Reviewed-on: https://go-review.googlesource.com/c/go/+/519276
Reviewed-by: Keith Randall <khr@google.com>
Run-TryBot: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-08-16 15:40:52 +00:00
Cuong Manh Le
b2a56b7053 cmd/compile: make backingArrayPtrLen to return typecheck-ed nodes
Fixes #61908

Change-Id: Ief8e3a6c42c0644c9f71ebef5f28a294cd7c153f
Reviewed-on: https://go-review.googlesource.com/c/go/+/517936
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2023-08-09 19:30:45 +00:00
Matthew Dempsky
8c5a54f698 cmd/compile: keep all open-coded defer slots as used
Open-coded defer slots are assigned indices upfront, so they're
logically like elements in an array. Without reassigning the indices,
we need to keep all of the elements alive so their relative offsets
are correct.

Fixes #61895.

Change-Id: Ie0191fdb33276f4e8ed0becb69086524fff022b2
Reviewed-on: https://go-review.googlesource.com/c/go/+/517856
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-08-09 16:43:33 +00:00
Cuong Manh Le
ea565f156b cmd/compile: fix missing init nodes for len(string([]byte)) optimization
CL 497276 added optimization for len(string([]byte)) by avoiding call to
slicebytetostring. However, the bytes to string expression may contain
init nodes, which need to be preserved. Otherwise, it would make the
liveness analysis confusing about the lifetime of temporary variables
created by init nodes.

Fixes #61778

Change-Id: I6d1280a7d61bcc75f11132af41bda086f084ab54
Reviewed-on: https://go-review.googlesource.com/c/go/+/516375
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2023-08-07 03:12:29 +00:00
Keith Randall
611706b171 cmd/compile: don't use BTS when OR works, add direct memory BTS operations
Stop using BTSconst and friends when ORLconst can be used instead.
OR can be issued by more function units than BTS can, so it could
lead to better IPC. OR might take a few more bytes to encode, but
not a lot more.

Still use BTSconst for cases where the constant otherwise wouldn't
fit and would require a separate movabs instruction to materialize
the constant. This happens when setting bits 31-63 of 64-bit targets.

Add BTS-to-memory operations so we don't need to load/bts/store.

Fixes #61694

Change-Id: I00379608df8fb0167cb01466e97d11dec7c1596c
Reviewed-on: https://go-review.googlesource.com/c/go/+/515755
Reviewed-by: Keith Randall <khr@google.com>
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2023-08-04 16:40:24 +00:00
Jorropo
bac4e2f241 cmd/compile: try to rewrite loops to count down
Fixes #61629

This reduce the pressure on regalloc because then the loop only keep alive
one value (the iterator) instead of the iterator and the upper bound since
the comparison now acts against an immediate, often zero which can be skipped.

This optimize things like:
  for i := 0; i < n; i++ {
Or a range over a slice where the index is not used:
  for _, v := range someSlice {
Or the new range over int from #61405:
  for range n {

It is hit in 975 unique places while doing ./make.bash.

Change-Id: I5facff8b267a0b60ea3c1b9a58c4d74cdb38f03f
Reviewed-on: https://go-review.googlesource.com/c/go/+/512935
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Jorropo <jorropo.pgm@gmail.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Auto-Submit: Keith Randall <khr@golang.org>
2023-07-31 18:33:29 +00:00