mirror of https://github.com/golang/go synced 2024-09-29 05:24:32 -06:00
Commit Graph

56305 Commits

Author SHA1 Message Date
Brad Fitzpatrick
fa80fe7b1c cmd/go: simplify code that still assumed the build cache could be nil
cache.Default always returns a non-nil value since Go 1.12; the docs were
updated in https://go.dev/cl/465555.

This updates all the callers of cache.Default that were checking whether
the result was nil so the code isn't misleading/confusing to readers.

Change-Id: Ia63567dd70affef6041c744259f65cea79a2752e
Reviewed-on: https://go-review.googlesource.com/c/go/+/489355
Auto-Submit: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Bryan Mills <bcmills@google.com>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
2023-04-26 20:37:24 +00:00
Michael Pratt
7b874619be runtime/cgo: store M for C-created thread in pthread key
This reapplies CL 481061, with the followup fixes in CL 482975, CL 485315, and
CL 485316 incorporated.

CL 481061, by doujiang24 <doujiang24@gmail.com>, speeds up C to Go
calls by binding the M to the C thread. See below for its
description.
CL 482975 is a followup fix to a C declaration in testprogcgo.
CL 485315 is a followup fix for x_cgo_getstackbound on Illumos.
CL 485316 is a followup cleanup for ppc64 assembly.

[Original CL 481061 description]

This reapplies CL 392854, with the followup fixes in CL 479255,
CL 479915, and CL 481057 incorporated.

CL 392854, by doujiang24 <doujiang24@gmail.com>, speeds up C to Go
calls by binding the M to the C thread. See below for its
description.
CL 479255 is a followup fix for a small bug in ARM assembly code.
CL 479915 is another followup fix to address C to Go calls after
the C code uses some stack, but that CL is also buggy.
CL 481057, by Michael Knyszek, is a followup fix for a memory leak
bug of CL 479915.

[Original CL 392854 description]

In a C thread, invoking a Go function from C requires acquiring an extra M with needm. But needm and dropm are expensive because of their signal-related syscalls.
So, we no longer dropm when returning to C. Instead, we bind the extra M to the C thread until the thread exits, which avoids needm and dropm on each C to Go call.
We only dropm when the C thread exits, so the extra M won't leak.

When invoking a Go function from C:
Allocate a pthread variable using pthread_key_create, only once per shared object, and register a thread-exit-time destructor.
And store the g0 of the current m into the thread-specific value of the pthread key, only once per C thread, so that the destructor will put the extra M back onto the extra M list when the C thread exits.

When returning back to C:
Skip dropm in cgocallback when the pthread variable has been created, so that the extra M will be reused the next time a Go function is invoked from C.

This is purely a performance optimization. The old version, in which needm & dropm happen on each cgo call, is still correct too, and we have to keep the old version on systems with cgo but without pthreads, like Windows.

This optimization is significant. The exact speedup depends on the OS and CPU, but in general a simple Go function call from a C thread can be considered about 10x faster.

For the newly added BenchmarkCGoInCThread, some benchmark results:
1. it's 28x faster, from 3395 ns/op to 121 ns/op, on darwin with an Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
2. it's 6.5x faster, from 1495 ns/op to 230 ns/op, on Linux with an Intel(R) Xeon(R) CPU E5-2630 0 @ 2.30GHz

[CL 479915 description]

Currently, when C calls into Go the first time, we grab an M
using needm, which sets m.g0's stack bounds using the SP. We don't
know how big the stack is, so we simply assume 32K. Previously,
when the Go function returns to C, we drop the M, and the next
time C calls into Go, we put a new stack bound on the g0 based on
the current SP. After CL 392854, we don't drop the M, and the next
time C calls into Go, we reuse the same g0, without recomputing
the stack bounds. If the C code uses quite a bit of stack space
before calling into Go, the SP may be well below the 32K stack
bound we assumed, so the runtime thinks the g0 stack overflows.

This CL makes needm get a more accurate stack bound from
pthread. On some platforms this may still be a guess (as we don't
know exactly where we are in the C stack), but it is probably
better than simply assuming 32K.

[CL 485500 description]

CL 479915 passed the G to _cgo_getstackbound for direct updates to
gp.stack.lo. A G can be reused on a new thread after the previous thread
exited. This could trigger the C TSAN race detector because it couldn't
see the synchronization in Go (lockextra) preventing the same G from
being used on multiple threads at the same time.

We work around this by passing the address of a stack variable to
_cgo_getstackbound rather than the G. The stack is generally unique per
thread, so TSAN won't see the same address from multiple threads. Even
if stacks are reused across threads by pthread, C TSAN should see the
synchronization in the stack allocator.

A regression test is added to misc/cgo/testsanitizer.

Fixes #51676.
Fixes #59294.
Fixes #59678.

Change-Id: Ic62be31a06ee83568215e875a891df37084e08ca
Reviewed-on: https://go-review.googlesource.com/c/go/+/485500
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Michael Pratt <mpratt@google.com>
2023-04-26 19:25:46 +00:00
Paul E. Murphy
d816f85f78 cmd/link/internal/loadelf: set AttrExternal on text section symbols
PPC64 processes external object relocations against the section
symbols. This needs to be set correctly to determine the type of
PLT stub to generate when both Go and External code make PLT calls.

Change-Id: I5abdd5a0473866164083c33e80324dffcc1707f0
Reviewed-on: https://go-review.googlesource.com/c/go/+/488895
Reviewed-by: Than McIntosh <thanm@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Run-TryBot: Paul Murphy <murp@ibm.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-04-26 15:25:47 +00:00
Than McIntosh
39957b5d89 coverage: fix count vs emit discrepancy in coverage counter data writing
This patch revises the way coverage counter data writing takes place
to avoid problems where useful counter data (for user-written functions)
is skipped in favor of counter data from stdlib functions that are
executed "late in the game", during the counter writing process itself.

Reading counter values from a running "--coverpkg=all" program is an
inherently racy operation; while the code that scans the coverage
counter segment is reading values, the program is still executing,
potentially updating those values, and updates can include execution
of previously un-executed functions. The existing counter data writing
code was using a two-pass model (initial sweep over the counter
segment to count live functions, second sweep to actually write data),
and wasn't properly accounting for the fact that the second pass could
see more functions than the first.

In the bug in question, the first pass discovered an initial set of
1240 functions, but by the time the second pass kicked in, several
additional new functions were also live. The second pass scanned the
counter segment again to write out exactly 1240 functions, but since
some of the counters for the newly executed functions were earlier in
the segment (due to linker layout quirks) than the user's selected
function, the sweep terminated before writing out counters for the
function of interest.

The fix rewrites the counter data file encoder to make a single sweep
over the counter segment instead of using a two-pass scheme.
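
The single-sweep idea can be sketched in isolation. This is a minimal illustration, not the actual counter data file encoder: the `funcCounter` type, the `snapshotLive` helper, and the function names are all hypothetical. The point it shows is that the number of functions to emit is derived from the same sweep that collects the data, so the two cannot disagree.

```go
package main

import "fmt"

// funcCounter is a hypothetical stand-in for one function's entry in the
// coverage counter segment; it is not the real internal/coverage layout.
type funcCounter struct {
	name  string
	count uint32
}

// snapshotLive makes a single sweep over the segment and returns every entry
// that is live (count > 0) at the moment it is visited. The number of
// functions to emit is len(result), so it always matches what gets written.
func snapshotLive(segment []funcCounter) []funcCounter {
	var live []funcCounter
	for _, fc := range segment {
		if fc.count > 0 {
			live = append(live, fc)
		}
	}
	return live
}

func main() {
	segment := []funcCounter{
		{"stdlib.late", 0},
		{"user.Interesting", 3},
		{"stdlib.other", 0},
	}
	live := snapshotLive(segment)
	fmt.Printf("emitting %d function(s): %v\n", len(live), live)
}
```

In a two-pass scheme the count would come from one sweep and the data from another, which is where the mismatch described above can arise.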

Fixes #59563.

Change-Id: I5e908e226bb224adb90a2fb783013e52deb341da
Reviewed-on: https://go-review.googlesource.com/c/go/+/484535
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Than McIntosh <thanm@google.com>
2023-04-26 12:44:34 +00:00
Than McIntosh
7b89531860 internal/coverage/slicewriter: fix off-by-1 error in seek utilities
The slicewriter Seek method was being too restrictive on offsets
accepted, due to an off-by-one problem in the error checking code.
This fixes the problem and touches up the unit tests.

Change-Id: I75d6121551de19ec9275f0e331810db231db6ea9
Reviewed-on: https://go-review.googlesource.com/c/go/+/488116
Run-TryBot: Than McIntosh <thanm@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-04-26 12:44:26 +00:00
ioworker0
ada0eec827 runtime: add an alignment check
The Linux implementation requires that the address addr be
page-aligned, and allows length to be zero.
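
A page-alignment test of this kind can be sketched as below. The `isPageAligned` helper is hypothetical and is not the runtime's code; it assumes the page size is a power of two, which lets the check be a single mask.

```go
package main

import (
	"fmt"
	"os"
)

// isPageAligned reports whether addr is a multiple of pageSize.
// pageSize must be a power of two.
func isPageAligned(addr, pageSize uintptr) bool {
	return addr&(pageSize-1) == 0
}

func main() {
	pageSize := uintptr(os.Getpagesize())
	fmt.Println(isPageAligned(4*pageSize, pageSize))   // a page boundary
	fmt.Println(isPageAligned(4*pageSize+1, pageSize)) // one byte past it
}
```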

See Linux notes:
https://man7.org/linux/man-pages/man2/madvise.2.html

Change-Id: Ic49960c32991ef12f23de2de76e9689567c82d03
GitHub-Last-Rev: 35e7f8e5cc
GitHub-Pull-Request: golang/go#59793
Reviewed-on: https://go-review.googlesource.com/c/go/+/488015
Auto-Submit: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
2023-04-26 04:06:08 +00:00
Ian Lance Taylor
3c59639b90 crypto/sha512: add WriteString and WriteByte methods
This can reduce allocations when hashing a string or byte
rather than []byte.

For #38776

Change-Id: I4926ae2749f6b167edbebb73d8f68763ffb2f0c1
Reviewed-on: https://go-review.googlesource.com/c/go/+/483816
Reviewed-by: Ian Lance Taylor <iant@google.com>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Bryan Mills <bcmills@google.com>
Reviewed-by: Joel Sing <joel@sing.id.au>
Auto-Submit: Ian Lance Taylor <iant@google.com>
2023-04-25 22:06:33 +00:00
Ian Lance Taylor
6c1792d1ff crypto/sha1: add WriteString and WriteByte methods
This can reduce allocations when hashing a string or byte
rather than []byte.

For #38776

Change-Id: I7c1fbdf15abf79d2faf360f75adf4bc550a607e9
Reviewed-on: https://go-review.googlesource.com/c/go/+/483815
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Joel Sing <joel@sing.id.au>
2023-04-25 22:06:06 +00:00
Ian Lance Taylor
bb079efbdc crypto/sha256: add WriteString and WriteByte methods
This can reduce allocations when hashing a string or byte
rather than []byte.

For #38776

Change-Id: I1c6dd1bc018220784a05939e92b47558c0562110
Reviewed-on: https://go-review.googlesource.com/c/go/+/481478
Reviewed-by: Joel Sing <joel@sing.id.au>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
Run-TryBot: Ian Lance Taylor <iant@google.com>
2023-04-25 22:05:33 +00:00
Austin Clements
ce0b914312 cmd/dist: actually only compile tests with -compile-only
Currently, "dist test -compile-only" still runs the test binaries,
just with -run=^$ so no tests are run. It does this because, until
recently, "go test -c" would fail if passed multiple test packages.
But this has some unexpected consequences: init code still runs,
TestMain still runs, and we generally can't test cross-compiling of
tests.

Now that #15513 is fixed, we can pass multiple packages to "go test
-c". Hence, this CL makes dist just use "go test -c" as one would
expect.

Found in the course of working on #37486, though it doesn't really
affect that.

Change-Id: If7d3c72c9e0f74d4ea0dd422411e5ee93b314be4
Reviewed-on: https://go-review.googlesource.com/c/go/+/488275
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
Reviewed-by: Bryan Mills <bcmills@google.com>
Run-TryBot: Austin Clements <austin@google.com>
2023-04-25 19:49:28 +00:00
qmuntal
715d53c090 runtime: fallback to TEB arbitrary pointer when TLS slots are full
The Go runtime allocates the TLS slot in the TEB TLS slots instead of
using the TEB arbitrary pointer. See CL 431775 for more context.

The problem is that the TEB TLS slots array only has capacity for 64
indices, allocating more requires some complex logic that we don't
support yet.

Although the Go runtime only allocates one index, a Go DLL can be
loaded in a process with more than 64 TLS slots allocated,
in which case it aborts.

This CL avoids aborting by falling back to the older behavior, that
is to use the TEB arbitrary pointer.

Fixes #59213

Change-Id: I39c73286fe2da95aa9c5ec5657ee0979ecbec533
Reviewed-on: https://go-review.googlesource.com/c/go/+/486816
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Quim Muntal <quimmuntal@gmail.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
Reviewed-by: Alex Brainman <alex.brainman@gmail.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-04-25 15:37:00 +00:00
Nayef Ghattas
14f833f117 runtime/metrics: specify that bucket counts increase monotonically for histogram metrics
Make it explicit in the documentation that the histogram metrics
are cumulative (i.e. each bucket count increases monotonically).

Change-Id: I89119ba816ac46a63f36e607e695fad3695057ce
Reviewed-on: https://go-review.googlesource.com/c/go/+/487315
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Felix Geisendörfer <felix.geisendoerfer@datadoghq.com>
Reviewed-by: Felix Geisendörfer <felix.geisendoerfer@datadoghq.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-04-25 14:24:19 +00:00
fanzha02
e4b03f9425 internal/cpu: add a detection for Neoverse(N2, V2) cores
The memmove implementation relies on the variable
runtime.arm64UseAlignedLoads to select fastest code
path. Considering Neoverse N2 and V2 cores prefer aligned
loads, this patch adds code to detect them for
memmove performance.

And this patch uses a new variable ARM64.IsNeoverse to
represent all Neoverse cores, removing the more specific
versions.

Change-Id: I9e06eae01a0325a0b604ac6af1e55711dd6133f7
Reviewed-on: https://go-review.googlesource.com/c/go/+/487815
Reviewed-by: Joel Sing <joel@sing.id.au>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Run-TryBot: Fannie Zhang <Fannie.Zhang@arm.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-04-25 14:08:20 +00:00
Alan Donovan
a1284d0185 go/ast: add IsGenerated(*File) predicate
See https://go.dev/s/generatedcode for spec.
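
A minimal usage sketch of the predicate, assuming a Go toolchain that includes this change. The `isGeneratedSource` helper, the file name, and the generator name `examplegen` are hypothetical.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// isGeneratedSource parses src and reports whether it carries the
// conventional "Code generated ... DO NOT EDIT." marker before the
// package clause.
func isGeneratedSource(src string) (bool, error) {
	fset := token.NewFileSet()
	// Comments must be retained for the marker to be visible in the AST.
	f, err := parser.ParseFile(fset, "example.go", src, parser.ParseComments)
	if err != nil {
		return false, err
	}
	return ast.IsGenerated(f), nil
}

func main() {
	generated := "// Code generated by examplegen. DO NOT EDIT.\n\npackage p\n"
	handWritten := "// Package p is written by hand.\npackage p\n"
	fmt.Println(isGeneratedSource(generated))
	fmt.Println(isGeneratedSource(handWritten))
}
```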

Fixes #28089

Change-Id: Ic9bb138bdd180f136f9e8e74e187319acca5dbac
Reviewed-on: https://go-review.googlesource.com/c/go/+/487935
Run-TryBot: Alan Donovan <adonovan@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Robert Findley <rfindley@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
2023-04-25 13:57:33 +00:00
cui fliter
22d94dfdc8 html/template: fix unavailable url
The previous link is no longer accessible. Use the latest link.

Change-Id: I76411ee00785f3d92014c5012e4efb446924adaf
Reviewed-on: https://go-review.googlesource.com/c/go/+/487835
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Andrew Polukhin <andrewmathematics2003@gmail.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Run-TryBot: shuang cui <imcusg@gmail.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
2023-04-25 01:14:08 +00:00
Ian Lance Taylor
f00e947cdf runtime: add raceFiniLock to lock ranking
Also preserve the PC/SP in reentersyscall when doing lock ranking.
The test is TestDestructorCallbackRace with the staticlockranking
experiment enabled.

For #59711

Change-Id: I87ac1d121ec0d399de369666834891ab9e7d11b0
Reviewed-on: https://go-review.googlesource.com/c/go/+/487955
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Run-TryBot: Ian Lance Taylor <iant@google.com>
2023-04-24 21:37:06 +00:00
Bryan C. Mills
3f987ae61d internal/testenv: actually try to exec on ios and wasm
Due to a stray edit in CL 486275, the assignment to tryExecOk
in tryExec on ios would be immediately overwritten back to false.
This change fixes the stray edit.

Change-Id: I4f45fbf130dc912305e5f453b0d1a622ba199ad4
Reviewed-on: https://go-review.googlesource.com/c/go/+/488076
Reviewed-by: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Bryan Mills <bcmills@google.com>
Run-TryBot: Bryan Mills <bcmills@google.com>
2023-04-24 20:39:25 +00:00
Roland Shoemaker
a8af76284d archive/zip: reject overflowing directorySize & directoryOffset
We added a check for incorrect baseOffset in CL 408734, but in doing so
we introduced a panic when directoryOffset overflowed an int64. The zip
spec uses uint64, but since io.SectionReader requires int64 we convert,
and possibly introduce an overflow. If offset < 0 && size-offset < 0,
SectionReader will panic when we attempt to read from it.

Since it's extremely unlikely we're ever going to process a zip file
larger than 1<<63-1 bytes, just limit directory size and offset to the
max int64.

Change-Id: I1aaa755cf4da927a6e12ef59f97dfc83a3426d86
Reviewed-on: https://go-review.googlesource.com/c/go/+/488195
Reviewed-by: Ian Lance Taylor <iant@google.com>
Run-TryBot: Roland Shoemaker <roland@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Roland Shoemaker <roland@golang.org>
2023-04-24 20:28:37 +00:00
Sameer Ajmani
1d00dc3985 doc: fix typo in Go 1.21 release notes
Change-Id: Ib32567fdd12079cd171a4e1bc118ce27d8ce2a5d
Reviewed-on: https://go-review.googlesource.com/c/go/+/488035
Run-TryBot: Sameer Ajmani <sameer@golang.org>
Reviewed-by: Damien Neil <dneil@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-04-24 18:13:34 +00:00
Bryan C. Mills
5256f90c98 runtime: fix CgoRaceprof and CgoRaceSignal build failures
TestRaceProf and TestRaceSignal were changed to run on all platforms
that support the race detector as of CL 487575, but the testprogcgo
source files needed to run the test rely on POSIX threads and were
still build-constrained to only linux/amd64 and freebsd/amd64.

Since the C test program appears to require only POSIX APIs, update
the constraint to build the source file on all Unix platforms, and
update the tests to skip on Windows.

This may slightly increase testprogcgo build time on Unix platforms
that do not support the race detector.

Change-Id: I704dd496d475a3cd2e2da2a09c7d2e3bb8e96d02
Reviewed-on: https://go-review.googlesource.com/c/go/+/488115
Auto-Submit: Bryan Mills <bcmills@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Bryan Mills <bcmills@google.com>
2023-04-24 18:13:14 +00:00
Jonathan Amsterdam
c80cedec93 testing/slogtest: tests for slog handlers
Add a package for testing that a slog.Handler implementation
satisfies that interface's documented requirements.
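
A sketch of how such a check might be run against slog's built-in JSON handler, assuming a Go release where the packages are available as log/slog and testing/slogtest. The `checkJSONHandler` helper is hypothetical.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"log/slog"
	"testing/slogtest"
)

// checkJSONHandler runs the slogtest checks against a JSON handler that
// writes into an in-memory buffer.
func checkJSONHandler() error {
	var buf bytes.Buffer
	h := slog.NewJSONHandler(&buf, nil)
	// results parses each JSON line the handler wrote into a map, which is
	// the form TestHandler expects.
	results := func() []map[string]any {
		var ms []map[string]any
		for _, line := range bytes.Split(buf.Bytes(), []byte{'\n'}) {
			if len(line) == 0 {
				continue
			}
			var m map[string]any
			if err := json.Unmarshal(line, &m); err != nil {
				panic(err)
			}
			ms = append(ms, m)
		}
		return ms
	}
	return slogtest.TestHandler(h, results)
}

func main() {
	if err := checkJSONHandler(); err != nil {
		fmt.Println("handler violates documented requirements:", err)
		return
	}
	fmt.Println("handler passed")
}
```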

Code copied from x/exp/slog/slogtest.

Updates #56345.

Change-Id: I89e94d93bfbe58e3c524758f7ac3c3fba2a2ea96
Reviewed-on: https://go-review.googlesource.com/c/go/+/487895
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Jonathan Amsterdam <jba@google.com>
Reviewed-by: Alan Donovan <adonovan@google.com>
2023-04-24 18:07:26 +00:00
Cherry Mui
ddd822e5ca cmd/link: don't sort data symbols by name
For data symbols, we currently sort them by size, then by name if
the size is the same. Sorting by name is not really necessary.
Instead, we sort by symbol index. Like name, the symbol index is
deterministic, and pretty stable if only a small portion of the
input is changed, and also naturally partitioned by packages. This
reduces the CPU time for reading the symbol names and comparing
strings.
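
The ordering can be sketched with a simplified symbol record. The `sym` type and `sortBySizeThenIndex` are hypothetical and are not the linker's data structures; the sketch only shows a size-then-index comparison that never touches a name string.

```go
package main

import (
	"fmt"
	"sort"
)

// sym is a hypothetical, simplified data symbol: just an index and a size.
type sym struct {
	idx  uint32
	size int64
}

// sortBySizeThenIndex orders symbols by size, breaking ties by index.
// Indices are unique, so the resulting order is fully determined.
func sortBySizeThenIndex(syms []sym) {
	sort.Slice(syms, func(i, j int) bool {
		if syms[i].size != syms[j].size {
			return syms[i].size < syms[j].size
		}
		return syms[i].idx < syms[j].idx
	})
}

func main() {
	syms := []sym{{idx: 3, size: 8}, {idx: 1, size: 8}, {idx: 2, size: 4}}
	sortBySizeThenIndex(syms)
	fmt.Println(syms)
}
```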

Linking cmd/compile (on macOS/amd64),

Dodata    57.2ms ± 6%    54.5ms ± 4%   -4.74%  (p=0.000 n=19+17)

Change-Id: I1c4f2b83dbbb4b984b2c8ab4a7e8543b9f7f22b4
Reviewed-on: https://go-review.googlesource.com/c/go/+/487515
Reviewed-by: Than McIntosh <thanm@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-04-24 16:50:28 +00:00
Cherry Mui
d33a5136e1 cmd/link: use uint32 as symbol index
Currently, a symbol's global index, the Sym type, is defined as an
int, which is 64-bit on 64-bit machines. We're unlikely to have
more than 4 billion symbols in the near future. Even if we will,
we will probably hit some other limit (e.g. section size) before
the symbol number limit. Use a 32-bit type to reduce memory usage.

E.g., linking cmd/compile in external linking mode (on macOS/amd64)

Munmap_GC    43.2M ± 0%     35.5M ± 1%   -17.74%  (p=0.000 n=16+20)

This brings the memory usage back to where it was before the previous
CL, and even lower.

Change-Id: Ie185f1586638fe70d8121312bfa9410942d518c7
Reviewed-on: https://go-review.googlesource.com/c/go/+/487416
Reviewed-by: Austin Clements <austin@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Than McIntosh <thanm@google.com>
2023-04-24 16:49:08 +00:00
Bryan C. Mills
33c06ee12a cmd/go: declare net hosts in script tests
Although we aren't precise about enforcing the hosts just yet,
we can eventually use the declared hostnames to selectively skip
tests (for example, if an external service has an outage while
a Go release is being tested).

Also relax the constraint to [short] in tests that require only
vcs-test.golang.org, which has redirected to an in-process server
since around CL 427914.

Also enforce that tests that use the network actually use the [net]
constraint, by setting TESTGONETWORK=panic in the test environment
until the condition is evaluated.

For #52545.
For #54503.
Updates #27494.

Change-Id: I13be6b42a9beee97657eb45424882e787ac164c3
Reviewed-on: https://go-review.googlesource.com/c/go/+/473276
Run-TryBot: Bryan Mills <bcmills@google.com>
TryBot-Bypass: Bryan Mills <bcmills@google.com>
Reviewed-by: Russ Cox <rsc@golang.org>
Auto-Submit: Bryan Mills <bcmills@google.com>
2023-04-24 15:54:04 +00:00
Nayef Ghattas
c2c787d73e runtime/metrics: set /sched/latencies:seconds as cumulative
The current implementation for this metric populates a histogram
that is never reset, i.e. where each bucket count increases
monotonically.

The comment in the definition of the Cumulative attribute calls
out that cumulative means that if the metric is a distribution,
then each bucket count increases monotonically.

In that sense, the cumulative attribute should be set to true for
this metric.
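
What "cumulative" means for a reader of this metric can be sketched with the public runtime/metrics API. The `totalCount` helper is hypothetical; it sums the bucket counts of a histogram sample, and for a cumulative histogram that total should never decrease between two reads.

```go
package main

import (
	"fmt"
	"runtime/metrics"
)

// totalCount reads a histogram metric and returns the sum of its bucket
// counts. ok is false if the metric is unsupported or not a histogram.
func totalCount(name string) (total uint64, ok bool) {
	sample := []metrics.Sample{{Name: name}}
	metrics.Read(sample)
	if sample[0].Value.Kind() != metrics.KindFloat64Histogram {
		return 0, false
	}
	for _, c := range sample[0].Value.Float64Histogram().Counts {
		total += c
	}
	return total, true
}

func main() {
	const name = "/sched/latencies:seconds"
	before, ok := totalCount(name)
	if !ok {
		fmt.Println("metric not available as a histogram")
		return
	}
	after, _ := totalCount(name)
	fmt.Println("total never decreases:", after >= before)
}
```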

Change-Id: Ifc34e965a62f2d7881b5c8e8cbb8b7207a4d5757
Reviewed-on: https://go-review.googlesource.com/c/go/+/486755
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Felix Geisendörfer <felix.geisendoerfer@datadoghq.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Felix Geisendörfer <felix.geisendoerfer@datadoghq.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
2023-04-24 14:10:02 +00:00
Cherry Mui
21c2fdd91c cmd/link: use slice and bitmap for some attributes
Currently, a symbol's outer symbol, the "special" attribute, and
whether a symbol is a generator symbol are represented as maps,
and are accessed in some loops over nearly all reachable symbols.
The map lookups are a bit expensive.

For outer symbol, a non-trivial portion of the symbols have outer
symbol set (e.g. type symbols, which we put into container symbols
like "type:*"). Use a slice to access them more efficiently.

For the special and generator symbol attributes, use a bitmap.
Not many symbols have those attributes, so the bitmap is
quite sparse. The bitmap is not too large anyway, so use it for
now. If we want to further reduce memory usage we could consider
some other data structure like a Bloom filter.
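
A bitmap indexed by symbol index can be sketched as follows. The `bitmap` type here is a minimal illustration written for this note, not the linker's own type: membership is one shift and one mask instead of a map lookup.

```go
package main

import "fmt"

// bitmap is a minimal fixed-size bit set indexed by symbol index.
type bitmap []uint32

// makeBitmap returns a bitmap able to hold n bits.
func makeBitmap(n int) bitmap { return make(bitmap, (n+31)/32) }

// set marks bit i.
func (b bitmap) set(i uint32) { b[i/32] |= 1 << (i % 32) }

// has reports whether bit i is marked.
func (b bitmap) has(i uint32) bool { return b[i/32]&(1<<(i%32)) != 0 }

func main() {
	special := makeBitmap(100) // room for 100 hypothetical symbols
	special.set(37)
	fmt.Println(special.has(37), special.has(36))
}
```

The cost is one bit per symbol whether or not the attribute is set, which is the memory trade-off noted above.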

Linking cmd/compile in external linking mode (on macOS/amd64)

Symtab   12.9ms ± 9%     6.4ms ± 5%   -50.08%  (p=0.000 n=19+18)
Dodata   64.9ms ±12%    57.1ms ±12%   -11.90%  (p=0.000 n=20+20)
Asmb     36.7ms ±11%    32.8ms ± 9%   -10.61%  (p=0.000 n=20+18)
Asmb2    26.6ms ±15%    21.9ms ±12%   -17.75%  (p=0.000 n=20+18)

There is some increase of memory usage

Munmap_GC   40.9M ± 1%     43.2M ± 0%    +5.54%  (p=0.000 n=20+19)

The next CL will bring the memory usage back.

Change-Id: Ie4347eb96c51f008b9284270de37fc880bb52d2c
Reviewed-on: https://go-review.googlesource.com/c/go/+/487415
Run-TryBot: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2023-04-24 14:00:39 +00:00
Tero Saarni
2c70690451 crypto/tls: fix PSK binder calculation
When the server and client have a mismatch in curve preference, the server
sends a HelloRetryRequest during TLS 1.3 PSK resumption. A bug present in
Go 1.19.6 and later, and in Go 1.20.1 and later, makes the client
calculate the PSK binder hash incorrectly. The server then rejects the TLS
handshake by sending the alert: invalid PSK binder.

Fixes #59424

Change-Id: I2ca8948474275740a36d991c057b62a13392dbb9
GitHub-Last-Rev: 1aad9bcf27
GitHub-Pull-Request: golang/go#59425
Reviewed-on: https://go-review.googlesource.com/c/go/+/481955
Reviewed-by: Roland Shoemaker <roland@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Run-TryBot: Roland Shoemaker <roland@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Roland Shoemaker <roland@golang.org>
2023-04-24 13:35:52 +00:00
Ian Lance Taylor
6bbbc5dc70 runtime: call _exit, not exit, on AIX and Solaris
This is the AIX and Solaris equivalent of CL 269378.

On AIX and Solaris, where we use libc for syscalls, when the runtime exits,
it calls the libc exit function, which may call back into user code,
such as invoking functions registered with atexit. In particular, it
may call back into Go. But at this point, the Go runtime is
already exiting, so this wouldn't work.

On non-libc platforms we use exit syscall directly, which doesn't
invoke any callbacks. Use _exit on AIX and Solaris to achieve the same
behavior.

Test is TestDestructorCallback.

For #59711

Change-Id: I666f75538d3e3d8cf3b697b4c32f3ecde8332890
Reviewed-on: https://go-review.googlesource.com/c/go/+/487635
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Run-TryBot: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
2023-04-24 05:12:17 +00:00
Ian Lance Taylor
a5297f59a7 runtime: use platform.RaceDetectorSupported for -race tests
Don't try to duplicate the list of targets that support -race.

Change-Id: I889d5c2f4884de89d88f8efdc89608aa73584a8a
Reviewed-on: https://go-review.googlesource.com/c/go/+/487575
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
2023-04-24 05:06:55 +00:00
Bryan C. Mills
5a10d8a204 internal/testenv: in HasExec, try to actually exec on ios and wasm platforms
Some iOS environments may support exec. wasip1 and js do not, but
trying to exec on those platforms is inexpensive anyway and gives
better test coverage for the ios path.

Change-Id: I4baffb2ef5dc7d81e6a260f69033bfb229f13d92
Reviewed-on: https://go-review.googlesource.com/c/go/+/486275
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Bryan Mills <bcmills@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Auto-Submit: Bryan Mills <bcmills@google.com>
2023-04-22 00:41:24 +00:00
Xiaodong Liu
d5fea5078b cmd/compile: support -buildmode=c-shared on linux/mips64{,le}
The modification of these rules is an optimization for loads/stores of
global variables. If there is a sequence of loads/stores near a global
variable address, the address can be loaded from the GOT only once instead
of every time.

For #43264

Change-Id: Idedaf6c81f085955371320f51bca148ffb42a2d8
Reviewed-on: https://go-review.googlesource.com/c/go/+/348732
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
2023-04-21 23:06:11 +00:00
Keith Randall
00401835c1 flag: panic if a flag is defined after being set
As part of developing #57411, we ran into cases where a flag was
defined in one package init and Set in another package init, and there
was no init ordering implied by the spec between those two
packages. Changes in initialization ordering as part of #57411 caused
a Set to happen before the definition, which makes the Set silently
fail.

This CL makes the Set fail loudly in that situation.

Currently Set *does* fail kinda quietly in that situation, in that it
returns an error. (It seems that no one checks the error from Set,
at least for string flags.) Ian suggested that instead we panic at
the definition site if there was previously a Set called on that
(at the time undefined) flag.

So Set on an undefined flag is ok and returns an error (as before),
but defining a flag which has already been Set causes a panic.  (The
API for flag definition has no way to return an error, and does
already panic in some situations like a duplicate definition.)
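
The two behaviors can be observed on a private FlagSet. This is a sketch with a hypothetical flag name and helper; the Set error is long-standing behavior, while the panic at definition time only occurs on Go releases that include this change.

```go
package main

import (
	"flag"
	"fmt"
	"io"
)

// defineAfterSet sets an undefined flag on a fresh FlagSet, then defines it.
// It returns the error from Set and whether the later definition panicked.
func defineAfterSet() (setErr error, panicked bool) {
	fs := flag.NewFlagSet("example", flag.ContinueOnError)
	fs.SetOutput(io.Discard)
	setErr = fs.Set("mode", "fast") // "mode" is not defined yet
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	fs.String("mode", "slow", "hypothetical flag used for illustration")
	return setErr, false
}

func main() {
	setErr, panicked := defineAfterSet()
	fmt.Println("Set error:", setErr)
	fmt.Println("definition panicked:", panicked)
}
```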

Update #57411

Change-Id: I39b5a49006f9469de0b7f3fe092afe3a352e4fcb
Reviewed-on: https://go-review.googlesource.com/c/go/+/480215
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
2023-04-21 22:25:02 +00:00
Johan Brandhorst-Satzkorn
1c17981f4a misc/wasm: support wasmtime in wasip1
Allow switching to wasmtime through the GOWASIRUNTIME variable. This
will allow builders to run the wasip1 standard library tests against
the wasmtime WASI runtime.

For #59583

Change-Id: I4d5200df7bb27b66e041f00e89d4c2e585f5da7c
Reviewed-on: https://go-review.googlesource.com/c/go/+/485615
Reviewed-by: Bryan Mills <bcmills@google.com>
TryBot-Bypass: Johan Brandhorst-Satzkorn <johan.brandhorst@gmail.com>
Run-TryBot: Johan Brandhorst-Satzkorn <johan.brandhorst@gmail.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2023-04-21 21:39:27 +00:00
Ian Lance Taylor
30886390c2 runtime: in __tsan_fini tell scheduler we are entering non-Go code
__tsan_fini will call exit which will call destructors which
may in principle call back into Go functions. Prepare the scheduler
by calling entersyscall before __tsan_fini.

Fixes #59711

Change-Id: Ic4df8fba3014bafa516739408ccfc30aba4f22ad
Reviewed-on: https://go-review.googlesource.com/c/go/+/486615
Reviewed-by: Michael Pratt <mpratt@google.com>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Ian Lance Taylor <iant@google.com>
2023-04-21 21:26:05 +00:00
Keith Randall
cedf5008a8 cmd/compile: introduce separate memory op combining pass
Memory op combining is currently done using arch-specific rewrite rules.
Instead, do them as an arch-independent rewrite pass. This ensures that
all architectures (with unaligned loads & stores) get equal treatment.

This removes a lot of rewrite rules.

The new pass is a bit more comprehensive. It handles things like out-of-order
writes and is careful not to apply partial optimizations that then block
further optimizations.
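
As an illustration of the kind of source pattern such a pass targets, the sketch below assembles a 32-bit little-endian value from four adjacent byte loads. On architectures that permit unaligned loads the compiler may merge these into one wider load; whether it does so for this exact code is an assumption, and the helper name load32le is hypothetical.

```go
package main

import "fmt"

// load32le reads four adjacent bytes and assembles them into a
// little-endian uint32. (Hypothetical helper, for illustration only.)
func load32le(b []byte) uint32 {
	_ = b[3] // early bounds check so the four loads below need no further checks
	return uint32(b[0]) | uint32(b[1])<<8 | uint32(b[2])<<16 | uint32(b[3])<<24
}

func main() {
	fmt.Printf("%#x\n", load32le([]byte{0x78, 0x56, 0x34, 0x12}))
}
```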

Change-Id: I780ff3bb052475cd725a923309616882d25b8d9e
Reviewed-on: https://go-review.googlesource.com/c/go/+/478475
Reviewed-by: Keith Randall <khr@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2023-04-21 21:05:46 +00:00
Bryan C. Mills
e9c2607ab4 runtime: skip TestG0StackOverflow on ios
This test fails when run on ios. (Although ios does not normally
support "exec", in the corellium environment it does.)

For #26061.

Change-Id: Idfdc53758aaabf0cb87ae50f9a4666deebf57fd6
Reviewed-on: https://go-review.googlesource.com/c/go/+/487355
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Bryan Mills <bcmills@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Run-TryBot: Bryan Mills <bcmills@google.com>
2023-04-21 21:03:17 +00:00
Lucien Coffe
f229787aff runtime: prevent double lock in checkdead by unlocking before throws
This change resolves an issue where checkdead could result in a double lock when schedtrace is enabled. The fix adds unlocks before all throws in the checkdead function to ensure the scheduler lock is properly released.
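
The pattern can be modelled outside the runtime. The sketch below is a simplified user-level analogy, not the runtime code: the state type, dump, and check are hypothetical, a sync.Mutex stands in for the scheduler lock, and returning an error string stands in for throw.

```go
package main

import (
	"fmt"
	"sync"
)

// state stands in for scheduler state guarded by a lock (hypothetical).
type state struct {
	mu      sync.Mutex
	running int
}

// dump takes the lock itself, like a tracing hook that runs while a
// fatal error is being reported.
func (s *state) dump() string {
	s.mu.Lock()
	defer s.mu.Unlock()
	return fmt.Sprintf("running=%d", s.running)
}

// check must be called with s.mu held and always returns with it
// released. On the failure path it unlocks before reporting, so dump
// can acquire the lock without deadlocking on itself.
func (s *state) check() (msg string, ok bool) {
	if s.running < 0 {
		s.mu.Unlock() // unlock first: dump would otherwise block forever
		return "inconsistent state: " + s.dump(), false
	}
	s.mu.Unlock()
	return "", true
}

func main() {
	s := &state{running: -1}
	s.mu.Lock()
	fmt.Println(s.check())
}
```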

Fixes #59758

Change-Id: If3ddf9969f4582c3c88dee9b9ecc355a63958103
Reviewed-on: https://go-review.googlesource.com/c/go/+/487375
Run-TryBot: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
2023-04-21 20:14:08 +00:00
Bryan C. Mills
6328e445c3 log: avoid leaking goroutines in TestOutputRace
Leaked goroutines are the only explanation I can think of for excess
allocs in TestDiscard, and TestOutputRace is the only place I can see
where the log package leaks goroutines. Let's fix that leak and see if
it eliminates the TestDiscard flakes.
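
A minimal sketch of the general technique, assuming a sync.WaitGroup is used to wait for the workers; the actual change to TestOutputRace may differ, and logConcurrently and countLines are hypothetical names.

```go
package main

import (
	"bytes"
	"fmt"
	"log"
	"sync"
)

// logConcurrently starts n goroutines that each write one line through
// a shared logger, and returns only after all of them have finished.
func logConcurrently(l *log.Logger, n int) {
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			l.Printf("worker %d", id)
		}(i)
	}
	wg.Wait() // without this, goroutines could outlive the caller
}

// countLines runs logConcurrently against an in-memory buffer and
// reports how many lines had been written by the time it returned.
func countLines(n int) int {
	var buf bytes.Buffer
	l := log.New(&buf, "", 0)
	logConcurrently(l, n)
	return bytes.Count(buf.Bytes(), []byte("\n"))
}

func main() {
	fmt.Println(countLines(100))
}
```

Because the function waits before returning, a later test that counts allocations does not race with leftover workers.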

Fixes #58797 (maybe).

Change-Id: I2d54dcba3eb52bd10a62cd1c380131add6a2f651
Reviewed-on: https://go-review.googlesource.com/c/go/+/487356
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Bryan Mills <bcmills@google.com>
Run-TryBot: Bryan Mills <bcmills@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
2023-04-21 20:04:37 +00:00
Austin Clements
87272bd1a1 runtime: tidy _Stack* constant naming
For #59670.

Change-Id: I0efa743edc08e48dc8d906803ba45e9f641369db
Reviewed-on: https://go-review.googlesource.com/c/go/+/486977
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Austin Clements <austin@google.com>
Run-TryBot: Austin Clements <austin@google.com>
2023-04-21 19:29:00 +00:00
Austin Clements
0f099a4bc5 runtime, cmd: rationalize StackLimit and StackGuard
The current definitions of StackLimit and StackGuard only indirectly
specify the NOSPLIT stack limit and duplicate a literal constant
(928). Currently, they define the stack guard delta, and from there
compute the NOSPLIT limit.

Rationalize these by defining a new constant, abi.StackNosplitBase,
which consolidates and directly specifies the NOSPLIT stack limit (in
the default case). From this we then compute the stack guard delta,
inverting the relationship between these two constants. While we're
here, we rename StackLimit to StackNosplit to make it clearer what's
being limited.

This change does not affect the values of these constants in the
default configuration. It does slightly change how
StackGuardMultiplier values other than 1 affect the constants, but
this multiplier is a pretty rough heuristic anyway.

                    before after
stackNosplit           800   800
_StackGuard            928   928
stackNosplit -race    1728  1600
_StackGuard -race     1856  1728
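
The table can be reproduced with a short sketch. It assumes that -race corresponds to a multiplier of 2, that the gap between the two limits is 128 bytes (928 - 800 in the default row), and that any platform-specific system stack adjustment is zero. The function names before and after are hypothetical.

```go
package main

import "fmt"

// stackSmall is the gap between the two limits (assumed: 928 - 800).
const stackSmall = 128

// before models the old relationship: the guard is scaled by the
// multiplier and the NOSPLIT limit is derived from it.
func before(mult int) (nosplit, guard int) {
	guard = 928 * mult
	nosplit = guard - stackSmall
	return
}

// after models the new relationship: the NOSPLIT base is scaled by the
// multiplier and the guard is derived from it.
func after(mult int) (nosplit, guard int) {
	nosplit = 800 * mult
	guard = nosplit + stackSmall
	return
}

func main() {
	for _, mult := range []int{1, 2} {
		bn, bg := before(mult)
		an, ag := after(mult)
		fmt.Printf("mult=%d nosplit %d -> %d, guard %d -> %d\n", mult, bn, an, bg, ag)
	}
}
```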

For #59670.

Change-Id: Ia94094c5e47897e7c088d24b4a5e33f5c2768db5
Reviewed-on: https://go-review.googlesource.com/c/go/+/486976
Auto-Submit: Austin Clements <austin@google.com>
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-04-21 19:28:56 +00:00
Austin Clements
03ad1f1a34 internal/abi, runtime, cmd: merge StackSmall, StackBig consts into internal/abi
For #59670.

Change-Id: I91448363be2fc678964ce119d85cd5fae34a14da
Reviewed-on: https://go-review.googlesource.com/c/go/+/486975
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
Auto-Submit: Austin Clements <austin@google.com>
2023-04-21 19:28:53 +00:00
Austin Clements
7843ca83e7 internal/abi, runtime, cmd: merge PCDATA_* and FUNCDATA_* consts into internal/abi
We also rename the constants related to unsafe-points: currently, they
follow the same naming scheme as the PCDATA table indexes, but are not
PCDATA table indexes.

For #59670.

Change-Id: I06529fecfae535be5fe7d9ac56c886b9106c74fd
Reviewed-on: https://go-review.googlesource.com/c/go/+/485497
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: David Chase <drchase@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Austin Clements <austin@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2023-04-21 19:28:49 +00:00
Austin Clements
2668a190ba internal/abi, runtime, cmd: merge funcFlag_* consts into internal/abi
For #59670.

Change-Id: Ie784ba4dd2701e4f455e1abde4a6bfebee4b1387
Reviewed-on: https://go-review.googlesource.com/c/go/+/485496
Reviewed-by: David Chase <drchase@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
Auto-Submit: Austin Clements <austin@google.com>
2023-04-21 19:28:46 +00:00
Austin Clements
9754521157 internal/abi, runtime, cmd: merge funcID_* consts into internal/abi
For #59670.

Change-Id: I517e97ea74cf232e5cfbb77b127fa8804f74d84b
Reviewed-on: https://go-review.googlesource.com/c/go/+/485495
Reviewed-by: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Austin Clements <austin@google.com>
Run-TryBot: Austin Clements <austin@google.com>
2023-04-21 19:28:44 +00:00
Bryan C. Mills
eaecd64200 cmd/go: assert on more of the version string in TestScript/gotoolchain
The previous assert triggers whenever the 40-character git commit
contains the substring "999", which happens with a probability
decidedly greater than zero.
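
To put a rough number on that probability, the sketch below computes the chance that a string of 40 hex digits contains "999", assuming each digit is uniform and independent (an idealization of a commit hash). Under that assumption the result is a little under 1%. The function name probContains999 is hypothetical.

```go
package main

import "fmt"

// probContains999 returns the probability that a string of n uniform,
// independent hex digits contains the substring "999".
func probContains999(n int) float64 {
	// s[k] is the probability that no "999" has appeared yet and the
	// string currently ends in exactly k consecutive '9' digits.
	s := [3]float64{1, 0, 0}
	const p9 = 1.0 / 16 // chance that the next digit is '9'
	for i := 0; i < n; i++ {
		alive := s[0] + s[1] + s[2]
		// A non-'9' digit resets the run; a '9' extends it. A run that
		// would reach length 3 drops out of the "alive" states.
		s = [3]float64{alive * (1 - p9), s[0] * p9, s[1] * p9}
	}
	return 1 - (s[0] + s[1] + s[2])
}

func main() {
	fmt.Printf("%.4f%%\n", 100*probContains999(40))
}
```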

For #57001.

Change-Id: If0f1bc1a3dd0e6b7e66768d0cf3a79545ee4e5ed
Reviewed-on: https://go-review.googlesource.com/c/go/+/486399
Reviewed-by: Michael Pratt <mpratt@google.com>
Auto-Submit: Bryan Mills <bcmills@google.com>
Run-TryBot: Bryan Mills <bcmills@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-04-21 18:08:08 +00:00
Oleksandr Redko
e7d238aaf4 cmd/go/internal/vcweb: replace ioutil with os and io
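
For context, a sketch of the standard replacements for deprecated io/ioutil calls. Which of these calls vcweb actually used is not stated here, and roundTrip is a hypothetical helper.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"path/filepath"
	"strings"
)

// roundTrip writes content to a temporary file and reads it back,
// using the os and io functions that supersede io/ioutil.
func roundTrip(content string) (string, error) {
	dir, err := os.MkdirTemp("", "example") // was ioutil.TempDir
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	name := filepath.Join(dir, "f.txt")
	if err := os.WriteFile(name, []byte(content), 0o644); err != nil { // was ioutil.WriteFile
		return "", err
	}
	data, err := os.ReadFile(name) // was ioutil.ReadFile
	if err != nil {
		return "", err
	}
	all, err := io.ReadAll(strings.NewReader(string(data))) // was ioutil.ReadAll
	if err != nil {
		return "", err
	}
	return string(all), nil
}

func main() {
	fmt.Println(roundTrip("hello"))
}
```
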
Change-Id: I251788cbbb6d740ef24e7561cc4bee880b7bdff8
Reviewed-on: https://go-review.googlesource.com/c/go/+/485017
Reviewed-by: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Pratt <mpratt@google.com>
Run-TryBot: Bryan Mills <bcmills@google.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
Auto-Submit: Bryan Mills <bcmills@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-04-21 17:47:26 +00:00
Lynn Boger
e23322e2cc cmd/internal/obj/ppc64: modify PCALIGN to ensure alignment
The initial purpose of PCALIGN was to identify code
where it would be beneficial to align code for performance,
but avoid cases where too many NOPs were added. On p10, it
is now necessary to enforce a certain alignment in some
cases, so the behavior of PCALIGN needs to be slightly
different. Code will now be aligned to the value specified
on the PCALIGN instruction regardless of the number of NOPs added,
which is more intuitive and consistent with Power assembler
alignment directives.

This also adds 64 as a possible alignment value.
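
The padding implied by the new behavior can be sketched as follows, assuming a power-of-two alignment and 4-byte NOP instructions. padBytes is a hypothetical helper, not the assembler's implementation.

```go
package main

import "fmt"

// padBytes returns how many bytes of padding are needed to advance pc
// to the next multiple of align. align is assumed to be a power of two.
func padBytes(pc, align uint64) uint64 {
	return (align - pc%align) % align
}

func main() {
	for _, align := range []uint64{8, 16, 32, 64} {
		pad := padBytes(0x1004, align)
		// Assuming each NOP is one 4-byte instruction.
		fmt.Printf("align=%d pad=%d bytes (%d NOPs)\n", align, pad, pad/4)
	}
}
```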

The existing values used in PCALIGN were modified according to
the new behavior.

A testcase was updated and performance testing was done to
verify that this does not adversely affect performance.

Change-Id: Iad1cf5ff112e5bfc0514f0805be90e24095e932b
Reviewed-on: https://go-review.googlesource.com/c/go/+/485056
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Archana Ravindar <aravind5@in.ibm.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Paul Murphy <murp@ibm.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
2023-04-21 16:47:45 +00:00
Paul E. Murphy
de788efeac cmd/link/internal/ppc64: Use PCrel relocs in runtime.addmoduledata if supported
This is another step towards supporting TOC-free operations.

Change-Id: I77edcf066c757b8ec815c701d7f6d72cd645eca9
Reviewed-on: https://go-review.googlesource.com/c/go/+/483437
Reviewed-by: Bryan Mills <bcmills@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
Run-TryBot: Paul Murphy <murp@ibm.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
2023-04-21 16:11:03 +00:00
Paul E. Murphy
db32aba508 internal/bytealg: rewrite indexbytebody on PPC64
Use P8 instructions throughout to remain backwards compatible without
otherwise impeding performance. Use overlapping loads where
possible, and prioritize larger checks over smaller checks.

However, some newer instructions can be used surgically when
targeting a newer GOPPC64. These can lead to noticeable
performance improvements with minimal impact to readability.

All tests run below on a Power10/ppc64le, and use a small
modification to BenchmarkIndexByte to ensure the IndexByte
wrapper call is inlined (as it likely is under realistic usage).
This wrapper adds substantial overhead if not inlined.
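
The overlapping-load idea can be illustrated in portable Go. This is only a sketch of the technique: the actual change is PPC64 assembly, whereas the code below substitutes 8-byte integer loads and a bit trick for detecting a matching byte. The names firstMatch and indexByte8to16 are hypothetical.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math/bits"
)

const (
	lo = 0x0101010101010101
	hi = 0x8080808080808080
)

// firstMatch returns the index (0-7) of the first byte in the
// little-endian word w equal to c, or -1 if there is none.
func firstMatch(w uint64, c byte) int {
	x := w ^ (lo * uint64(c)) // matching bytes become zero
	m := (x - lo) &^ x & hi   // lowest set bit marks the first zero byte
	if m == 0 {
		return -1
	}
	return bits.TrailingZeros64(m) / 8
}

// indexByte8to16 handles inputs of 8 to 16 bytes with two 8-byte loads.
// The second load is anchored at the end of the slice, so it overlaps
// the first whenever len(b) < 16 and no byte-at-a-time tail is needed.
func indexByte8to16(b []byte, c byte) int {
	n := len(b)
	if n < 8 || n > 16 {
		panic("indexByte8to16: length out of range")
	}
	if i := firstMatch(binary.LittleEndian.Uint64(b), c); i >= 0 {
		return i
	}
	if i := firstMatch(binary.LittleEndian.Uint64(b[n-8:]), c); i >= 0 {
		return n - 8 + i
	}
	return -1
}

func main() {
	fmt.Println(indexByte8to16([]byte("abcdefghijk"), 'j'))
}
```

Bytes in the overlap are checked twice, but the first load has already ruled them out, so the first match reported by the second load is still the first match overall.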

Previous (power9 path, GOPPC64=power8) vs. GOPPC64=power8:

IndexByte/1       3.81ns ± 8%     3.11ns ± 5%  -18.39%
IndexByte/2       3.82ns ± 3%     3.20ns ± 6%  -16.23%
IndexByte/3       3.61ns ± 4%     3.25ns ± 6%  -10.13%
IndexByte/4       3.66ns ± 5%     3.08ns ± 1%  -15.91%
IndexByte/5       3.82ns ± 0%     3.75ns ± 2%   -1.94%
IndexByte/6       3.83ns ± 0%     3.87ns ± 4%   +1.04%
IndexByte/7       3.83ns ± 0%     3.82ns ± 0%   -0.27%
IndexByte/8       3.82ns ± 0%     2.92ns ±11%  -23.70%
IndexByte/9       3.70ns ± 2%     3.08ns ± 2%  -16.87%
IndexByte/10      3.74ns ± 2%     3.04ns ± 0%  -18.75%
IndexByte/11      3.75ns ± 0%     3.31ns ± 8%  -11.79%
IndexByte/12      3.74ns ± 0%     3.04ns ± 0%  -18.86%
IndexByte/13      3.83ns ± 4%     3.04ns ± 0%  -20.64%
IndexByte/14      3.80ns ± 1%     3.30ns ± 8%  -13.18%
IndexByte/15      3.77ns ± 1%     3.04ns ± 0%  -19.33%
IndexByte/16      3.81ns ± 0%     2.78ns ± 7%  -26.88%
IndexByte/17      4.12ns ± 0%     3.04ns ± 1%  -26.11%
IndexByte/18      4.27ns ± 6%     3.05ns ± 0%  -28.64%
IndexByte/19      4.30ns ± 4%     3.02ns ± 2%  -29.65%
IndexByte/20      4.43ns ± 7%     3.45ns ± 7%  -22.15%
IndexByte/21      4.12ns ± 0%     3.03ns ± 1%  -26.35%
IndexByte/22      4.40ns ± 6%     3.05ns ± 0%  -30.82%
IndexByte/23      4.40ns ± 6%     3.01ns ± 2%  -31.48%
IndexByte/24      4.32ns ± 5%     3.07ns ± 0%  -28.98%
IndexByte/25      4.76ns ± 2%     3.04ns ± 1%  -36.11%
IndexByte/26      4.82ns ± 0%     3.05ns ± 0%  -36.66%
IndexByte/27      4.82ns ± 0%     2.97ns ± 3%  -38.39%
IndexByte/28      4.82ns ± 0%     2.96ns ± 3%  -38.57%
IndexByte/29      4.82ns ± 0%     3.34ns ± 9%  -30.71%
IndexByte/30      4.82ns ± 0%     3.05ns ± 0%  -36.77%
IndexByte/31      4.81ns ± 0%     3.05ns ± 0%  -36.70%
IndexByte/32      3.52ns ± 0%     3.44ns ± 1%   -2.15%
IndexByte/33      4.77ns ± 1%     3.35ns ± 0%  -29.81%
IndexByte/34      5.01ns ± 5%     3.35ns ± 0%  -33.15%
IndexByte/35      4.92ns ± 9%     3.35ns ± 0%  -31.89%
IndexByte/36      4.81ns ± 5%     3.35ns ± 0%  -30.37%
IndexByte/37      4.99ns ± 6%     3.35ns ± 0%  -32.86%
IndexByte/38      5.06ns ± 5%     3.35ns ± 0%  -33.84%
IndexByte/39      5.02ns ± 5%     3.48ns ± 9%  -30.58%
IndexByte/40      5.21ns ± 9%     3.55ns ± 4%  -31.82%
IndexByte/41      5.18ns ± 0%     3.42ns ± 2%  -33.98%
IndexByte/42      5.19ns ± 0%     3.55ns ±11%  -31.56%
IndexByte/43      5.18ns ± 0%     3.45ns ± 5%  -33.46%
IndexByte/44      5.18ns ± 0%     3.39ns ± 0%  -34.56%
IndexByte/45      5.18ns ± 0%     3.43ns ± 4%  -33.74%
IndexByte/46      5.18ns ± 0%     3.47ns ± 1%  -33.03%
IndexByte/47      5.18ns ± 0%     3.44ns ± 2%  -33.54%
IndexByte/48      5.18ns ± 0%     3.39ns ± 0%  -34.52%
IndexByte/49      5.69ns ± 0%     3.79ns ± 0%  -33.45%
IndexByte/50      5.70ns ± 0%     3.70ns ± 3%  -34.98%
IndexByte/51      5.70ns ± 0%     3.70ns ± 2%  -35.05%
IndexByte/52      5.69ns ± 0%     3.80ns ± 1%  -33.35%
IndexByte/53      5.69ns ± 0%     3.78ns ± 0%  -33.54%
IndexByte/54      5.69ns ± 0%     3.78ns ± 1%  -33.51%
IndexByte/55      5.69ns ± 0%     3.78ns ± 0%  -33.61%
IndexByte/56      5.69ns ± 0%     3.81ns ± 3%  -33.12%
IndexByte/57      6.20ns ± 0%     3.79ns ± 4%  -38.89%
IndexByte/58      6.20ns ± 0%     3.74ns ± 2%  -39.58%
IndexByte/59      6.20ns ± 0%     3.69ns ± 2%  -40.47%
IndexByte/60      6.20ns ± 0%     3.79ns ± 1%  -38.81%
IndexByte/61      6.20ns ± 0%     3.77ns ± 1%  -39.23%
IndexByte/62      6.20ns ± 0%     3.79ns ± 0%  -38.89%
IndexByte/63      6.20ns ± 0%     3.79ns ± 0%  -38.90%
IndexByte/64      4.17ns ± 0%     3.47ns ± 3%  -16.70%
IndexByte/65      5.38ns ± 0%     4.21ns ± 0%  -21.59%
IndexByte/66      5.38ns ± 0%     4.21ns ± 0%  -21.58%
IndexByte/67      5.38ns ± 0%     4.22ns ± 0%  -21.58%
IndexByte/68      5.38ns ± 0%     4.22ns ± 0%  -21.59%
IndexByte/69      5.38ns ± 0%     4.22ns ± 0%  -21.56%
IndexByte/70      5.38ns ± 0%     4.21ns ± 0%  -21.59%
IndexByte/71      5.37ns ± 0%     4.21ns ± 0%  -21.51%
IndexByte/72      5.37ns ± 0%     4.22ns ± 0%  -21.46%
IndexByte/73      5.71ns ± 0%     4.22ns ± 0%  -26.20%
IndexByte/74      5.71ns ± 0%     4.21ns ± 0%  -26.21%
IndexByte/75      5.71ns ± 0%     4.21ns ± 0%  -26.17%
IndexByte/76      5.71ns ± 0%     4.22ns ± 0%  -26.22%
IndexByte/77      5.71ns ± 0%     4.22ns ± 0%  -26.22%
IndexByte/78      5.71ns ± 0%     4.21ns ± 0%  -26.22%
IndexByte/79      5.71ns ± 0%     4.22ns ± 0%  -26.21%
IndexByte/80      5.71ns ± 0%     4.21ns ± 0%  -26.19%
IndexByte/81      6.20ns ± 0%     4.39ns ± 0%  -29.13%
IndexByte/82      6.20ns ± 0%     4.36ns ± 0%  -29.67%
IndexByte/83      6.20ns ± 0%     4.36ns ± 0%  -29.63%
IndexByte/84      6.20ns ± 0%     4.39ns ± 0%  -29.21%
IndexByte/85      6.20ns ± 0%     4.36ns ± 0%  -29.64%
IndexByte/86      6.20ns ± 0%     4.36ns ± 0%  -29.63%
IndexByte/87      6.20ns ± 0%     4.39ns ± 0%  -29.21%
IndexByte/88      6.20ns ± 0%     4.36ns ± 0%  -29.65%
IndexByte/89      6.74ns ± 0%     4.36ns ± 0%  -35.33%
IndexByte/90      6.75ns ± 0%     4.37ns ± 0%  -35.22%
IndexByte/91      6.74ns ± 0%     4.36ns ± 0%  -35.30%
IndexByte/92      6.74ns ± 0%     4.36ns ± 0%  -35.34%
IndexByte/93      6.74ns ± 0%     4.37ns ± 0%  -35.20%
IndexByte/94      6.74ns ± 0%     4.36ns ± 0%  -35.33%
IndexByte/95      6.75ns ± 0%     4.36ns ± 0%  -35.32%
IndexByte/96      4.83ns ± 0%     4.34ns ± 2%  -10.24%
IndexByte/97      5.91ns ± 0%     4.65ns ± 0%  -21.24%
IndexByte/98      5.91ns ± 0%     4.65ns ± 0%  -21.24%
IndexByte/99      5.91ns ± 0%     4.65ns ± 0%  -21.23%
IndexByte/100     5.90ns ± 0%     4.65ns ± 0%  -21.21%
IndexByte/101     5.90ns ± 0%     4.65ns ± 0%  -21.22%
IndexByte/102     5.90ns ± 0%     4.65ns ± 0%  -21.23%
IndexByte/103     5.91ns ± 0%     4.65ns ± 0%  -21.23%
IndexByte/104     5.91ns ± 0%     4.65ns ± 0%  -21.24%
IndexByte/105     6.25ns ± 0%     4.65ns ± 0%  -25.59%
IndexByte/106     6.25ns ± 0%     4.65ns ± 0%  -25.59%
IndexByte/107     6.25ns ± 0%     4.65ns ± 0%  -25.60%
IndexByte/108     6.25ns ± 0%     4.65ns ± 0%  -25.58%
IndexByte/109     6.24ns ± 0%     4.65ns ± 0%  -25.50%
IndexByte/110     6.25ns ± 0%     4.65ns ± 0%  -25.56%
IndexByte/111     6.25ns ± 0%     4.65ns ± 0%  -25.60%
IndexByte/112     6.25ns ± 0%     4.65ns ± 0%  -25.59%
IndexByte/113     6.76ns ± 0%     5.05ns ± 0%  -25.37%
IndexByte/114     6.76ns ± 0%     5.05ns ± 0%  -25.31%
IndexByte/115     6.76ns ± 0%     5.05ns ± 0%  -25.38%
IndexByte/116     6.76ns ± 0%     5.05ns ± 0%  -25.31%
IndexByte/117     6.76ns ± 0%     5.05ns ± 0%  -25.38%
IndexByte/118     6.76ns ± 0%     5.05ns ± 0%  -25.31%
IndexByte/119     6.76ns ± 0%     5.05ns ± 0%  -25.38%
IndexByte/120     6.76ns ± 0%     5.05ns ± 0%  -25.36%
IndexByte/121     7.35ns ± 0%     5.05ns ± 0%  -31.33%
IndexByte/122     7.36ns ± 0%     5.05ns ± 0%  -31.42%
IndexByte/123     7.38ns ± 0%     5.05ns ± 0%  -31.60%
IndexByte/124     7.38ns ± 0%     5.05ns ± 0%  -31.59%
IndexByte/125     7.38ns ± 0%     5.05ns ± 0%  -31.60%
IndexByte/126     7.38ns ± 0%     5.05ns ± 0%  -31.58%
IndexByte/128     5.28ns ± 0%     5.10ns ± 0%   -3.41%
IndexByte/256     7.27ns ± 0%     7.28ns ± 2%   +0.13%
IndexByte/512     12.1ns ± 0%     11.8ns ± 0%   -2.51%
IndexByte/1K      23.1ns ± 3%     22.0ns ± 0%   -4.66%
IndexByte/2K      42.6ns ± 0%     42.4ns ± 0%   -0.41%
IndexByte/4K      90.3ns ± 0%     89.4ns ± 0%   -0.98%
IndexByte/8K       170ns ± 0%      170ns ± 0%   -0.59%
IndexByte/16K      331ns ± 0%      330ns ± 0%   -0.27%
IndexByte/32K      660ns ± 0%      660ns ± 0%   -0.08%
IndexByte/64K     1.30µs ± 0%     1.30µs ± 0%   -0.08%
IndexByte/128K    2.58µs ± 0%     2.58µs ± 0%   -0.04%
IndexByte/256K    5.15µs ± 0%     5.15µs ± 0%   -0.04%
IndexByte/512K    10.3µs ± 0%     10.3µs ± 0%   -0.03%
IndexByte/1M      20.6µs ± 0%     20.5µs ± 0%   -0.03%
IndexByte/2M      41.1µs ± 0%     41.1µs ± 0%   -0.03%
IndexByte/4M      82.2µs ± 0%     82.1µs ± 0%   -0.02%
IndexByte/8M       164µs ± 0%      164µs ± 0%   -0.01%
IndexByte/16M      328µs ± 0%      328µs ± 0%   -0.01%
IndexByte/32M      657µs ± 0%      657µs ± 0%   -0.00%

GOPPC64=power8 vs GOPPC64=power9. The improvement is
most noticeable between 16 and 64B, and goes away around
128B.

IndexByte/16      2.78ns ± 7%     2.65ns ±15%   -4.74%
IndexByte/17      3.04ns ± 1%     2.80ns ± 3%   -7.85%
IndexByte/18      3.05ns ± 0%     2.71ns ± 4%  -11.00%
IndexByte/19      3.02ns ± 2%     2.76ns ±10%   -8.74%
IndexByte/20      3.45ns ± 7%     2.91ns ± 0%  -15.46%
IndexByte/21      3.03ns ± 1%     2.84ns ± 9%   -6.33%
IndexByte/22      3.05ns ± 0%     2.67ns ± 1%  -12.38%
IndexByte/23      3.01ns ± 2%     2.67ns ± 1%  -11.24%
IndexByte/24      3.07ns ± 0%     2.92ns ±12%   -4.79%
IndexByte/25      3.04ns ± 1%     3.15ns ±15%   +3.63%
IndexByte/26      3.05ns ± 0%     2.83ns ±13%   -7.33%
IndexByte/27      2.97ns ± 3%     2.98ns ±10%   +0.56%
IndexByte/28      2.96ns ± 3%     2.96ns ± 9%   -0.05%
IndexByte/29      3.34ns ± 9%     3.03ns ±12%   -9.33%
IndexByte/30      3.05ns ± 0%     2.68ns ± 1%  -12.05%
IndexByte/31      3.05ns ± 0%     2.83ns ±12%   -7.27%
IndexByte/32      3.44ns ± 1%     3.21ns ±10%   -6.78%
IndexByte/33      3.35ns ± 0%     3.41ns ± 2%   +1.95%
IndexByte/34      3.35ns ± 0%     3.13ns ± 0%   -6.53%
IndexByte/35      3.35ns ± 0%     3.13ns ± 0%   -6.54%
IndexByte/36      3.35ns ± 0%     3.13ns ± 0%   -6.52%
IndexByte/37      3.35ns ± 0%     3.13ns ± 0%   -6.52%
IndexByte/38      3.35ns ± 0%     3.24ns ± 4%   -3.30%
IndexByte/39      3.48ns ± 9%     3.44ns ± 2%   -1.19%
IndexByte/40      3.55ns ± 4%     3.46ns ± 2%   -2.44%
IndexByte/41      3.42ns ± 2%     3.39ns ± 4%   -0.86%
IndexByte/42      3.55ns ±11%     3.46ns ± 1%   -2.65%
IndexByte/43      3.45ns ± 5%     3.44ns ± 2%   -0.31%
IndexByte/44      3.39ns ± 0%     3.43ns ± 3%   +1.23%
IndexByte/45      3.43ns ± 4%     3.50ns ± 1%   +2.07%
IndexByte/46      3.47ns ± 1%     3.46ns ± 2%   -0.31%
IndexByte/47      3.44ns ± 2%     3.47ns ± 1%   +0.78%
IndexByte/48      3.39ns ± 0%     3.46ns ± 2%   +1.96%
IndexByte/49      3.79ns ± 0%     3.47ns ± 0%   -8.41%
IndexByte/50      3.70ns ± 3%     3.64ns ± 5%   -1.66%
IndexByte/51      3.70ns ± 2%     3.75ns ± 0%   +1.40%
IndexByte/52      3.80ns ± 1%     3.77ns ± 0%   -0.70%
IndexByte/53      3.78ns ± 0%     3.77ns ± 0%   -0.46%
IndexByte/54      3.78ns ± 1%     3.53ns ± 7%   -6.74%
IndexByte/55      3.78ns ± 0%     3.47ns ± 0%   -8.17%
IndexByte/56      3.81ns ± 3%     3.45ns ± 0%   -9.43%
IndexByte/57      3.79ns ± 4%     3.47ns ± 0%   -8.45%
IndexByte/58      3.74ns ± 2%     3.55ns ± 4%   -5.16%
IndexByte/59      3.69ns ± 2%     3.61ns ± 4%   -2.01%
IndexByte/60      3.79ns ± 1%     3.45ns ± 0%   -9.09%
IndexByte/61      3.77ns ± 1%     3.47ns ± 0%   -7.93%
IndexByte/62      3.79ns ± 0%     3.45ns ± 0%   -8.97%
IndexByte/63      3.79ns ± 0%     3.47ns ± 0%   -8.44%
IndexByte/64      3.47ns ± 3%     3.18ns ± 0%   -8.41%

GOPPC64=power9 vs GOPPC64=power10. Only sizes <16 will
show meaningful changes.

IndexByte/1       3.27ns ± 8%     2.36ns ± 2%  -27.58%
IndexByte/2       3.06ns ± 4%     2.34ns ± 1%  -23.42%
IndexByte/3       3.77ns ±11%     2.48ns ± 7%  -34.03%
IndexByte/4       3.18ns ± 8%     2.33ns ± 1%  -26.69%
IndexByte/5       3.18ns ± 5%     2.34ns ± 4%  -26.26%
IndexByte/6       3.13ns ± 3%     2.35ns ± 1%  -24.97%
IndexByte/7       3.25ns ± 1%     2.33ns ± 1%  -28.22%
IndexByte/8       2.79ns ± 2%     2.36ns ± 1%  -15.32%
IndexByte/9       2.90ns ± 0%     2.34ns ± 2%  -19.36%
IndexByte/10      2.99ns ± 3%     2.31ns ± 1%  -22.70%
IndexByte/11      3.13ns ± 7%     2.31ns ± 0%  -26.08%
IndexByte/12      3.01ns ± 4%     2.32ns ± 1%  -22.91%
IndexByte/13      2.98ns ± 3%     2.31ns ± 1%  -22.72%
IndexByte/14      2.92ns ± 2%     2.61ns ±16%  -10.58%
IndexByte/15      3.02ns ± 5%     2.69ns ± 7%  -10.90%
IndexByte/16      2.65ns ±15%     2.29ns ± 1%  -13.61%

Change-Id: I4482f762d25eabf60def4981a0b2bc0c10ccf50c
Reviewed-on: https://go-review.googlesource.com/c/go/+/478656
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
Run-TryBot: Paul Murphy <murp@ibm.com>
Reviewed-by: Archana Ravindar <aravind5@in.ibm.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-04-21 16:10:29 +00:00
Paul E. Murphy
ccb8db88c5 cmd/link,cmd/internal/obj/ppc64: enable PCrel on power10/ppc64/linux
A CI machine has been set up to verify GOPPC64=power10 on ppc64/linux.

This should be sufficient to verify the PCrel relocation support works
for BE.

Note, power10/ppc64/linux is an oddball case. Today, it can only link
internally. Furthermore, all PCrel relocs are resolved at link time,
so it works despite ELFv1 having no official support for PCrel relocs
today.

Change-Id: Ibf79df69406ec6f9352c9d7d941ad946dba74e73
Reviewed-on: https://go-review.googlesource.com/c/go/+/485075
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Paul Murphy <murp@ibm.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
2023-04-21 16:09:34 +00:00