1
0
mirror of https://github.com/golang/go synced 2024-11-06 13:36:12 -07:00
Commit Graph

5616 Commits

Author SHA1 Message Date
nimelehin
5370494577 runtime: add BenchmarkMemclrRange
This benchmark is added to test improvements in memclr_amd64.
As it is stated in Intel Optimization Manual 15.16.3.3, AVX2-implemented
memclr can produce a skewed result with the branch predictor being
trained by the large loop iteration count.

This benchmark generates sizes between some specified range. This should
help to measure how memclr works when branch predictors may be incorrectly
trained.

Change-Id: I14d173cafe43ca47198ed920e655547a66b3909f
Reviewed-on: https://go-review.googlesource.com/c/go/+/373362
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
Auto-Submit: Michael Pratt <mpratt@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Keith Randall <khr@google.com>
2022-05-20 13:51:52 +00:00
Keith Randall
d8762b2f45 runtime: fix overflow in PingPongHog test
On 32-bit systems the result of hogCount*factor can overflow.
Use division instead to do comparison.

Update #52207

Change-Id: I429fb9dc009af645acb535cee5c70887527ba207
Reviewed-on: https://go-review.googlesource.com/c/go/+/407415
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Keith Randall <khr@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
2022-05-19 21:33:15 +00:00
Xiaodong Liu
34507e879d runtime/cgo: add cgo function call support for loong64
Contributors to the loong64 port are:
  Weining Lu <luweining@loongson.cn>
  Lei Wang <wanglei@loongson.cn>
  Lingqin Gong <gonglingqin@loongson.cn>
  Xiaolin Zhao <zhaoxiaolin@loongson.cn>
  Meidan Li <limeidan@loongson.cn>
  Xiaojuan Zhai <zhaixiaojuan@loongson.cn>
  Qiyuan Pu <puqiyuan@loongson.cn>
  Guoqi Chen <chenguoqi@loongson.cn>

This port has been updated to Go 1.15.6:
  https://github.com/loongson/go

Updates #46229

Change-Id: I8ef0e7f17d6ada3d2f07c81524136b78457e7795
Reviewed-on: https://go-review.googlesource.com/c/go/+/342319
Run-TryBot: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Run-TryBot: David Chase <drchase@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2022-05-19 21:15:04 +00:00
Xiaodong Liu
a7cc865edb runtime/internal/atomic: add atomic support for loong64
Contributors to the loong64 port are:
  Weining Lu <luweining@loongson.cn>
  Lei Wang <wanglei@loongson.cn>
  Lingqin Gong <gonglingqin@loongson.cn>
  Xiaolin Zhao <zhaoxiaolin@loongson.cn>
  Meidan Li <limeidan@loongson.cn>
  Xiaojuan Zhai <zhaixiaojuan@loongson.cn>
  Qiyuan Pu <puqiyuan@loongson.cn>
  Guoqi Chen <chenguoqi@loongson.cn>

This port has been updated to Go 1.15.6:
  https://github.com/loongson/go

Updates #46229

Change-Id: I0333503db044c6f39df2d7f8d9dff213b1361d6c
Reviewed-on: https://go-review.googlesource.com/c/go/+/342320
Reviewed-by: David Chase <drchase@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
2022-05-19 21:14:49 +00:00
Keith Randall
6b6813fdb7 runtime: test alignment of fields targeted by 64-bit atomics
Make sure that all the targets of 64-bit atomic operations
are actually aligned to 8 bytes. This has been a source of
bugs on 32-bit systems. (e.g. CL 399754)

The strategy is to have a simple test that just checks the
alignment of some explicitly listed fields and global variables.

Then there's a more complicated test that makes sure the list
used in the simple test is exhaustive. That test has some
limitations, but it should catch most cases, particularly new
uses of atomic operations on new or existing fields.

Unlike a runtime assert, this check is free and will catch
accesses that occur even in very unlikely code paths.

Change-Id: I25ac78df471ac33b57cb91375bd8453d6ce2814f
Reviewed-on: https://go-review.googlesource.com/c/go/+/407034
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: David Chase <drchase@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-05-19 20:10:40 +00:00
Michael Anthony Knyszek
a6c75aa5c3 runtime: use correct heap goal in GC traces
Currently gctrace and gcpacertrace recompute the heap goal for
end-of-cycle information but this is incorrect.

Because both of these traces are printing stats from the previous cycle
in this case, they should print the heap goal at the end of the previous
cycle.

Change-Id: I967621cbaff9f331cd3e361de8850ddfe0cfc099
Reviewed-on: https://go-review.googlesource.com/c/go/+/407138
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: David Chase <drchase@google.com>
2022-05-19 19:57:28 +00:00
Xiaodong Liu
8542bd8938 runtime: support vdso for linux/loong64
Contributors to the loong64 port are:
  Weining Lu <luweining@loongson.cn>
  Lei Wang <wanglei@loongson.cn>
  Lingqin Gong <gonglingqin@loongson.cn>
  Xiaolin Zhao <zhaoxiaolin@loongson.cn>
  Meidan Li <limeidan@loongson.cn>
  Xiaojuan Zhai <zhaixiaojuan@loongson.cn>
  Qiyuan Pu <puqiyuan@loongson.cn>
  Guoqi Chen <chenguoqi@loongson.cn>

This port has been updated to Go 1.15.6:
  https://github.com/loongson/go

Updates #46229

Change-Id: Ie9bb5ccfc28e65036e2088c232bb333dcb259a60
Reviewed-on: https://go-review.googlesource.com/c/go/+/368076
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Auto-Submit: Ian Lance Taylor <iant@golang.org>
2022-05-19 19:32:35 +00:00
Xiaodong Liu
0084706528 runtime: implement signal for linux/loong64
Contributors to the loong64 port are:
  Weining Lu <luweining@loongson.cn>
  Lei Wang <wanglei@loongson.cn>
  Lingqin Gong <gonglingqin@loongson.cn>
  Xiaolin Zhao <zhaoxiaolin@loongson.cn>
  Meidan Li <limeidan@loongson.cn>
  Xiaojuan Zhai <zhaixiaojuan@loongson.cn>
  Qiyuan Pu <puqiyuan@loongson.cn>
  Guoqi Chen <chenguoqi@loongson.cn>

This port has been updated to Go 1.15.6:
  https://github.com/loongson/go

Updates #46229

Change-Id: Ifa0229d2044dd53683de4a2b3ab965b16263f267
Reviewed-on: https://go-review.googlesource.com/c/go/+/368075
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
2022-05-19 19:32:33 +00:00
Xiaodong Liu
c8de7628cb runtime: implement syscalls for runtime bootstrap on linux/loong64
Contributors to the loong64 port are:
  Weining Lu <luweining@loongson.cn>
  Lei Wang <wanglei@loongson.cn>
  Lingqin Gong <gonglingqin@loongson.cn>
  Xiaolin Zhao <zhaoxiaolin@loongson.cn>
  Meidan Li <limeidan@loongson.cn>
  Xiaojuan Zhai <zhaixiaojuan@loongson.cn>
  Qiyuan Pu <puqiyuan@loongson.cn>
  Guoqi Chen <chenguoqi@loongson.cn>

This port has been updated to Go 1.15.6:
  https://github.com/loongson/go

Updates #46229

Change-Id: I848608267932717895d5cff9e33040029c3f3c4b
Reviewed-on: https://go-review.googlesource.com/c/go/+/368080
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Auto-Submit: Ian Lance Taylor <iant@golang.org>
2022-05-19 19:28:24 +00:00
Xiaodong Liu
96a10b9c84 runtime: implement duffzero/duffcopy for linux/loong64
Contributors to the loong64 port are:
  Weining Lu <luweining@loongson.cn>
  Lei Wang <wanglei@loongson.cn>
  Lingqin Gong <gonglingqin@loongson.cn>
  Xiaolin Zhao <zhaoxiaolin@loongson.cn>
  Meidan Li <limeidan@loongson.cn>
  Xiaojuan Zhai <zhaixiaojuan@loongson.cn>
  Qiyuan Pu <puqiyuan@loongson.cn>
  Guoqi Chen <chenguoqi@loongson.cn>

This port has been updated to Go 1.15.6:
  https://github.com/loongson/go

Updates #46229

Change-Id: Ida040e76dc8172f60e6aee1ea2b5bce13ab3581e
Reviewed-on: https://go-review.googlesource.com/c/go/+/368077
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Auto-Submit: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
2022-05-19 19:13:17 +00:00
Xiaodong Liu
2a5114f49a runtime: support memclr/memmove for linux/loong64
Contributors to the loong64 port are:
  Weining Lu <luweining@loongson.cn>
  Lei Wang <wanglei@loongson.cn>
  Lingqin Gong <gonglingqin@loongson.cn>
  Xiaolin Zhao <zhaoxiaolin@loongson.cn>
  Meidan Li <limeidan@loongson.cn>
  Xiaojuan Zhai <zhaixiaojuan@loongson.cn>
  Qiyuan Pu <puqiyuan@loongson.cn>
  Guoqi Chen <chenguoqi@loongson.cn>

This port has been updated to Go 1.15.6:
  https://github.com/loongson/go

Updates #46229

Change-Id: I7c1f39670034db6714630d479bc41b6620ba2b1a
Reviewed-on: https://go-review.googlesource.com/c/go/+/368079
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Auto-Submit: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
2022-05-19 19:13:01 +00:00
Russ Cox
946b4baaf6 all: gofmt main repo
Excluding vendor and testdata.
CL 384268 already reformatted most, but these slipped past.

The struct in the doc comment in debug/dwarf/type.go
was fixed up by hand to indent the first and last lines as well.

For #51082.

Change-Id: Iad020f83aafd671ff58238fe491907e85923d0c7
Reviewed-on: https://go-review.googlesource.com/c/go/+/407137
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Russ Cox <rsc@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
2022-05-19 15:49:05 +00:00
Michael Anthony Knyszek
b93ceefa7b runtime: use osyield in runqgrab on netbsd
NetBSD appears to have the same issue OpenBSD had in runqgrab. See
issue #52475 for more details.

For #35166.

Change-Id: Ie53192d26919b4717bc0d61cadd88d688ff38bb4
Reviewed-on: https://go-review.googlesource.com/c/go/+/407139
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
2022-05-19 14:18:41 +00:00
Xiaodong Liu
740b0ebb76 runtime: implement asyncPreempt for linux/loong64
Contributors to the loong64 port are:
  Weining Lu <luweining@loongson.cn>
  Lei Wang <wanglei@loongson.cn>
  Lingqin Gong <gonglingqin@loongson.cn>
  Xiaolin Zhao <zhaoxiaolin@loongson.cn>
  Meidan Li <limeidan@loongson.cn>
  Xiaojuan Zhai <zhaixiaojuan@loongson.cn>
  Qiyuan Pu <puqiyuan@loongson.cn>
  Guoqi Chen <chenguoqi@loongson.cn>

This port has been updated to Go 1.15.6:
  https://github.com/loongson/go

Updates #46229

Change-Id: I7a64e38b15a99816bd74262c02f62dad021cc166
Reviewed-on: https://go-review.googlesource.com/c/go/+/368078
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-05-19 02:40:14 +00:00
Cherry Mui
c087121daf runtime: relax the threshold for TestPingPongHog
The test checks that the scheduling of the goroutines are within
a small factor, to ensure the scheduler handing off the P
correctly. There have been flaky failures on the builder (probably
due to OS scheduling delays). Increase the threshold to make it
less flaky. The gap would be much bigger if the scheduler doesn't
work correctly.

For the long term maybe it is better to test it more directly
with the scheduler, e.g. with scheduler instrumentation.

May fix #52207.

Change-Id: I50278b70ab21b7f04761fdc8b38dd13304c67879
Reviewed-on: https://go-review.googlesource.com/c/go/+/407134
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Bryan Mills <bcmills@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2022-05-18 19:52:53 +00:00
Leonard Wang
1f9f7db2bf runtime: remove useless constant definition in malloc.go
Change-Id: I060c867d89a06b5a44fbe77804c19299385802d9
Reviewed-on: https://go-review.googlesource.com/c/go/+/311250
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Run-TryBot: Dmitri Shuralyov <dmitshur@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Dmitri Shuralyov <dmitshur@golang.org>
2022-05-18 15:25:04 +00:00
John Bampton
20db15ce12 all: fix spelling
Change-Id: I63eb42f3ce5ca452279120a5b33518f4ce16be45
GitHub-Last-Rev: a88f2f72be
GitHub-Pull-Request: golang/go#52951
Reviewed-on: https://go-review.googlesource.com/c/go/+/406843
Run-TryBot: Robert Griesemer <gri@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Robert Griesemer <gri@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
2022-05-18 00:47:29 +00:00
Cherry Mui
c6965ad63f runtime: deflake TestCgoPprofThread
In TestCgoPprofThread, the (fake) cgo traceback function pretends
all C CPU samples are in cpuHogThread. But if a profiling signal
lands in C code but outside of that thread, e.g. before/when the
thread is created, we will get a sample which looks like Go calls
into cpuHogThread. This CL makes the cgo traceback function only
return cpuHogThread PCs when a signal lands on that thread.

May fix #52726.

Change-Id: I21c40f974d1882508626faf3ac45e8347fec31c4
Reviewed-on: https://go-review.googlesource.com/c/go/+/406934
Run-TryBot: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
2022-05-17 22:59:31 +00:00
John Bampton
b2116f748a all: fix spelling
Change-Id: Iee18987c495d1d4bde9da888d454eea8079d3ebc
GitHub-Last-Rev: ff5e01599d
GitHub-Pull-Request: golang/go#52949
Reviewed-on: https://go-review.googlesource.com/c/go/+/406915
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Reviewed-by: Robert Griesemer <gri@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
2022-05-17 21:46:33 +00:00
Xiaodong Liu
7607888df0 runtime: load/save TLS variable g on loong64
Contributors to the loong64 port are:
  Weining Lu <luweining@loongson.cn>
  Lei Wang <wanglei@loongson.cn>
  Lingqin Gong <gonglingqin@loongson.cn>
  Xiaolin Zhao <zhaoxiaolin@loongson.cn>
  Meidan Li <limeidan@loongson.cn>
  Xiaojuan Zhai <zhaixiaojuan@loongson.cn>
  Qiyuan Pu <puqiyuan@loongson.cn>
  Guoqi Chen <chenguoqi@loongson.cn>

This port has been updated to Go 1.15.6:
  https://github.com/loongson/go

Updates #46229

Change-Id: I5e09759ce9201596e89a01fc4a6f7fd7e205449f
Reviewed-on: https://go-review.googlesource.com/c/go/+/368074
Reviewed-by: David Chase <drchase@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Run-TryBot: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-05-17 20:18:25 +00:00
John Bampton
13147f744c runtime: fix code span element
Change-Id: I99c593573b3bec560ab3af49ac2f486ee442ee1c
GitHub-Last-Rev: e399ec50f9
GitHub-Pull-Request: golang/go#52946
Reviewed-on: https://go-review.googlesource.com/c/go/+/406837
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
2022-05-17 20:05:58 +00:00
John Bampton
a6f3f8d973 all: fix spelling
Change-Id: I68538a50c22b02cdb5aa2a889f9440fed7b94c54
GitHub-Last-Rev: aaac9e7834
GitHub-Pull-Request: golang/go#52944
Reviewed-on: https://go-review.googlesource.com/c/go/+/406835
Reviewed-by: Bryan Mills <bcmills@google.com>
Reviewed-by: Robert Griesemer <gri@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Bryan Mills <bcmills@google.com>
Auto-Submit: Bryan Mills <bcmills@google.com>
2022-05-17 19:51:29 +00:00
Rhys Hiltner
6ec46f4707 runtime/pprof: slow new goroutine launches in test
The goroutine profiler tests include one that launches a steady stream
of goroutines. That creates a scheduler busy loop that can prevent
forward progress in the rest of the program. Slow down the launches a
bit so other goroutines have a chance to run.

Fixes #52916
For #52934

Change-Id: I748557201b94918b1fa4960544a51a48d9cacc6b
Reviewed-on: https://go-review.googlesource.com/c/go/+/406654
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Bryan Mills <bcmills@google.com>
Run-TryBot: Bryan Mills <bcmills@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2022-05-16 18:30:07 +00:00
Bryan C. Mills
db875f4d1b runtime/pprof: eliminate arbitrary deadline in testCPUProfile
The testCPUProfile helper function iterates until the profile contains
enough samples. However, in general very slow builders may need longer
to complete tests, and may have less-responsive schedulers (leading to
longer durations required to collect profiles with enough samples).
To compensate, slower builders generally run tests with longer timeouts.

Since this test helper already dynamically scales the profile duration
based on the collected samples, allow it to continue to retry and
rescale until it would exceed the test's deadline.

Fixes #52656 (hopefully).

Change-Id: I4561e721927503f33a6d23336efa979bb9d3221f
Reviewed-on: https://go-review.googlesource.com/c/go/+/406614
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Bryan Mills <bcmills@google.com>
Auto-Submit: Bryan Mills <bcmills@google.com>
2022-05-16 16:15:58 +00:00
Matthew Dempsky
568590b085 runtime: mark panicshift and panicdivide as //go:yeswritebarrierrec
When compiling package runtime, cmd/compile logically has two copies
of package runtime: the actual source files being compiled, and the
internal description used for emitting compiler-generated calls.

Notably, CL 393715 will cause the compiler's write barrier validation
to start recognizing that compiler-generated calls are actually calls
to the corresponding functions from the source package. And today,
there are some code paths in nowritebarrierrec code paths that
actually end up generating code to call panicshift or panicdivide.

In preparation, this CL marks those functions as
//go:yeswritebarrierrec. We probably want to actually cleanup those
code paths to avoid these calls actually (e.g., explicitly convert
shift count expressions to an unsigned integer type). But for now,
this at least unblocks CL 393715 while preserving the status quo.

Updates #51734.

Change-Id: I01f89adb72466c0260a9cd363e3e09246e39cff9
Reviewed-on: https://go-review.googlesource.com/c/go/+/406316
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-05-16 09:31:07 +00:00
Meng Zhuo
9956996f6e runtime: add address sanitizer support for riscv64
Updates #44853

Change-Id: I3ba6ec0cfc6c7f311b586deedb1cda0f87a637aa
Reviewed-on: https://go-review.googlesource.com/c/go/+/375256
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Joel Sing <joel@sing.id.au>
Run-TryBot: Zhuo Meng <mzh@golangcn.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: David Chase <drchase@google.com>
2022-05-16 06:55:54 +00:00
Michael Anthony Knyszek
08d6c4c2b9 runtime: account for idle mark time in the GC CPU limiter
Currently the GC CPU limiter doesn't account for idle application time
at all. This means that the GC could start thrashing, for example if the
live heap exceeds the max heap set by the memory limit, but the limiter
will fail to kick in when there's a lot of available idle time. User
goroutines will still be assisting at a really high rate because of
assist pacing rules, but the GC CPU limiter will fail to kick in because
the actual fraction of GC CPU time will be low if there's a lot of
otherwise idle time (for example, on an overprovisioned system).

Luckily, that idle time is usually eaten up entirely by idle mark
workers, at least during the GC cycle. And in these cases where we're
GCing continuously, that's all of our idle time. So we can take idle
mark work time and subtract it from the mutator time accumulated in the
GC CPU limiter, and that will give us a more accurate picture of how
much CPU is being spent by user goroutines on GC. This will allow the GC
CPU limiter to kick in, and reduce the impact of the thrashing.

There is a corner case here if the idle mark workers are disabled, for
example for the periodic GC, but in the case of the periodic GC, I don't
think it's possible for us to be thrashing at all, so it doesn't really
matter.

Fixes #52890.

Change-Id: Ie133a7d1f89b603434b415d51eb8733c2708a858
Reviewed-on: https://go-review.googlesource.com/c/go/+/405898
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-05-13 22:25:51 +00:00
Michael Anthony Knyszek
cfccb5cb7c runtime/metrics: add the last GC cycle that had the limiter enabled
This metric exports the the last GC cycle index that the GC limiter was
enabled. This metric is useful for debugging and identifying the root
cause of OOMs, especially when SetMemoryLimit is in use.

For #48409.

Change-Id: Ic6383b19e88058366a74f6ede1683b8ffb30a69c
Reviewed-on: https://go-review.googlesource.com/c/go/+/403614
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
2022-05-13 20:45:19 +00:00
Michael Anthony Knyszek
9bd6e2776f runtime/metrics: add the number of Go-to-C calls
For #47216.

Change-Id: I1c2cd518e6ff510cc3ac8d8f72fd52eadcabc16c
Reviewed-on: https://go-review.googlesource.com/c/go/+/404306
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
2022-05-13 20:45:09 +00:00
David Chase
364ced6255 runtime: tweak js and plan9 to avoid/disable write barrier & gc problems
runtime code for js contains possible write barriers that fail
the nowritebarrierrec check when internal local package naming
conventions are changed.  The problem was there all already; this
allows the code to compile, and it seems to work anyway in the
(single-threaded) js/wasm environment.  The offending operations
are noted with TODO, which is an improvement.

runtime code for plan9 contained an apparent allocation that was
not really an allocation; rewrite to remove the potential allocation
to avoid nowritebarrierrec problems.

This CL is a prerequisite for a pending code cleanup,
https://go-review.googlesource.com/c/go/+/393715

Updates #51734.

Change-Id: I93f31831ff9b92632137dd7b0055eaa721c81556
Reviewed-on: https://go-review.googlesource.com/c/go/+/405901
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2022-05-13 19:51:37 +00:00
Rhys Hiltner
ba8310cf29 runtime/pprof: fix allFrames cache
The compiler may choose to inline multiple layers of function call, such
that A calling B calling C may end up with all of the instructions for B
and C written as part of A's function body.

Within that function body, some PCs will represent code from function A.
Some will represent code from function B, and for each of those the
runtime will have an instruction attributable to A that it can report as
its caller. Others will represent code from function C, and for each of
those the runtime will have an instruction attributable to B and an
instruction attributable to A that it can report as callers.

When a profiling signal arrives at an instruction in B (as inlined in A)
that the runtime also uses to describe calls to C, the profileBuilder
ends up with an incorrect cache of allFrames results. That PC should
lead to a location record in the profile that represents the frames
B<-A, but the allFrames cache's view should expand the PC only to the B
frame.

Otherwise, when a profiling signal arrives at an instruction in C (as
inlined in B in A), the PC stack C,B,A can get expanded to the frames
C,B<-A,A as follows: The inlining deck starts empty. The first tryAdd
call proposes PC C and frames C, which the deck accepts. The second
tryAdd call proposes PC B and, due to the incorrect caching, frames B,A.
(A fresh call to allFrames with PC B would return the frame list B.) The
deck accepts that PC and frames. The third tryAdd call proposes PC A and
frames A. The deck rejects those because a call from A to A cannot
possibly have been inlined. This results in a new location record in the
profile representing the frames C<-B<-A (good), as called by A (bad).

The bug is the cached expansion of PC B to frames B<-A. That mapping is
only appropriate for the resulting protobuf-format profile. The cache
needs to reflect the results of a call to allFrames, which expands the
PC B to the single frame B.

For #50996
For #52693
Fixes #52764

Change-Id: I36d080f3c8a05650cdc13ced262189c33b0083b0
Reviewed-on: https://go-review.googlesource.com/c/go/+/404995
Reviewed-by: Bryan Mills <bcmills@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Rhys Hiltner <rhys@justin.tv>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2022-05-13 19:45:43 +00:00
David Chase
81d567146e runtime: add go:yeswritebarrierrec to panic functions
Panic avoids any write barriers in the runtime by checking first
and throwing if called inappropriately, so it is "okay".  Adding
this annotation repairs recursive write barrier checking, which
becomes more thorough when the local package naming convention
is changed from "" to the actual package name.

This CL is a prerequisite for a pending code cleanup,
https://go-review.googlesource.com/c/go/+/393715

Updates #51734.

Change-Id: If831a3598c6c8cd37a8e9ba269f822cd81464a13
Reviewed-on: https://go-review.googlesource.com/c/go/+/405900
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Run-TryBot: David Chase <drchase@google.com>
2022-05-13 19:29:44 +00:00
Michael Anthony Knyszek
ece6ac4d4d runtime/metrics: add gomaxprocs metric
For #47216.

Change-Id: Ib2d48c4583570a2dae9510a52d4c6ffc20161b31
Reviewed-on: https://go-review.googlesource.com/c/go/+/404305
Reviewed-by: David Chase <drchase@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
2022-05-13 16:02:59 +00:00
Michael Anthony Knyszek
0f715f1ac9 runtime/internal/atomic: align 64-bit types to 8 bytes everywhere
This change extends https://go.dev/cl/381317 to the
runtime/internal/atomic package in terms of aligning 64-bit types to 8
bytes, even on 32-bit platforms.

Change-Id: Id8c45577d07b256e3144d88b31f201264295cfcd
Reviewed-on: https://go-review.googlesource.com/c/go/+/404096
Reviewed-by: David Chase <drchase@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
2022-05-13 16:02:27 +00:00
Michael Anthony Knyszek
2eb8b6eec6 runtime: make CPU limiter assist time much less error-prone
At the expense of performance (having to update another atomic counter)
this change makes CPU limiter assist time much less error-prone to
manage. There are currently a number of issues with respect to how
scavenge assist time is treated, and this change resolves those by just
having the limiter maintain its own internal pool that's drained on each
update.

While we're here, clear the measured assist time each cycle, which was
the impetus for the change.

Change-Id: I84c513a9f012b4007362a33cddb742c5779782b7
Reviewed-on: https://go-review.googlesource.com/c/go/+/404304
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
2022-05-13 16:02:20 +00:00
Keith Randall
016d755213 runtime: measure stack usage; start stacks larger if needed
Measure the average stack size used by goroutines at every GC. When
starting a new goroutine, allocate an initial goroutine stack of that
average size. Intuition is that we'll waste at most 2x in stack space
because only half the goroutines can be below average. In turn, we
avoid some of the early stack growth / copying needed in the average
case.

More details in the design doc at: https://docs.google.com/document/d/1YDlGIdVTPnmUiTAavlZxBI1d9pwGQgZT7IKFKlIXohQ/edit?usp=sharing

name        old time/op  new time/op  delta
Issue18138  95.3µs ± 0%  67.3µs ±13%  -29.35%  (p=0.000 n=9+10)

Fixes #18138

Change-Id: Iba34d22ed04279da7e718bbd569bbf2734922eaa
Reviewed-on: https://go-review.googlesource.com/c/go/+/345889
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
2022-05-12 22:32:42 +00:00
Axel Busch
636c5f0208 runtime: enable vDSO support for s390x architecture
This change adds support for vDSO for s390x architecture. This avoids the use of system calls in nanotime and walltime and accelerates them by factor 4-5.

Benchmarks:
100,000,000 x time.Now():
syscall fallback	13923ms		139.23 ns/op
vDSO enabled		2640ms	 	26.40 ns/op

Change-Id: Ic679fe31048379e59ccf83b400140f13c9d49696
GitHub-Last-Rev: 8f6e918a45
GitHub-Pull-Request: golang/go#49717
Reviewed-on: https://go-review.googlesource.com/c/go/+/365995
Run-TryBot: Paul Murphy <murp@ibm.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Paul Murphy <murp@ibm.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Jonathan Albrecht <jonathan.albrecht@ibm.com>
Reviewed-by: Heschi Kreinick <heschi@google.com>
Reviewed-by: Bill O'Farrell <billotosyr@gmail.com>
2022-05-11 13:30:43 +00:00
Cuong Manh Le
4388faf964 runtime: use unsafe.Slice in getStackMap
CL 362934 added open code for unsafe.Slice, so using it now no longer
negatively impacts the performance.

Updates #48798

Change-Id: Ifbabe8bc1cc4349c5bcd11586a11fc99bcb388b1
Reviewed-on: https://go-review.googlesource.com/c/go/+/404974
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-05-11 04:25:28 +00:00
Cuong Manh Le
579902d0b1 cmd/compile,runtime: open code unsafe.Slice
So prevent heavy runtime call overhead, and the compiler will have a
chance to optimize the bound check.

With this optimization, changing runtime/stack.go to use unsafe.Slice
no longer negatively impacts stack copying performance:

name                   old time/op    new time/op    delta
StackCopyWithStkobj-8    16.3ms ± 6%    16.5ms ± 5%   ~     (p=0.382 n=8+8)

name                   old alloc/op   new alloc/op   delta
StackCopyWithStkobj-8     17.0B ± 0%     17.0B ± 0%   ~     (all equal)

name                   old allocs/op  new allocs/op  delta
StackCopyWithStkobj-8      1.00 ± 0%      1.00 ± 0%   ~     (all equal)

Fixes #48798

Change-Id: I731a9a4abd6dd6846f44eece7f86025b7bb1141b
Reviewed-on: https://go-review.googlesource.com/c/go/+/362934
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Keith Randall <khr@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
2022-05-11 04:25:16 +00:00
Matthew Dempsky
ccb798741b runtime: change maxSearchAddr into a helper function
This avoids a dependency on the compiler statically initializing
maxSearchAddr, which is necessary so we can disable the (overly
aggressive and spec non-conforming) optimizations in cmd/compile and
gccgo.

Updates #51913.

Change-Id: I424e62c81c722bb179ed8d2d8e188274a1aeb7b6
Reviewed-on: https://go-review.googlesource.com/c/go/+/396194
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Austin Clements <austin@google.com>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-05-11 03:28:01 +00:00
nimelehin
e0f99775f2 runtime: store pointer-size words in memclr
GC requires the whole zeroed word to be visible for a memory subsystem.
While the implementation of Enhanced REP STOSB tries to use as efficient
stores as possible, e.g writing the whole cache line and not byte-after-byte,
we should use REP STOSQ to guarantee the requirements of the GC.

The performance is not affected.

Change-Id: I1b0fd1444a40bfbb661541291ab96eba11bcc762
Reviewed-on: https://go-review.googlesource.com/c/go/+/405274
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
2022-05-10 20:52:34 +00:00
Cherry Mui
aeaf4b0e5b runtime: not mark save_g NOFRAME on ARM
On ARM, when GOARM<=6 the TLS pointer is fetched via a call to a
kernel helper. This call clobbers LR, even just temporarily. If
the function is NOFRAME, if a profiling signal lands right after
the call returns, the unwinder will find the wrong LR. Not mark it
NOFRAME, so the LR will be saved in the usual way and stack
unwinding should work.

May fix #52829.

Change-Id: I419a31dcf4afbcff8d7ab8f179eec3c477589e60
Reviewed-on: https://go-review.googlesource.com/c/go/+/405482
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
2022-05-10 20:38:07 +00:00
Michael Anthony Knyszek
482669db62 runtime: add lock partial order edge for trace and wbufSpans and mheap
This edge represents the case of executing a write barrier under the
trace lock: we might use the wbufSpans lock to get a new trace buffer,
or mheap to allocate a totally new one.

Fixes #52794.

Change-Id: Ia1ac2c744b8284ae29f4745373df3f9675ab1168
Reviewed-on: https://go-review.googlesource.com/c/go/+/405476
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
2022-05-10 16:40:27 +00:00
Michael Anthony Knyszek
8fdd277fe6 runtime: profile finalizer G more carefully in goroutine profile
If SetFinalizer is never called, we might readgstatus on a nil fing
variable, resulting in a crash. This change guards code that accesses
fing by a nil check.

Fixes #52821.

Change-Id: I3e8e7004f97f073dc622e801a1d37003ea153a29
Reviewed-on: https://go-review.googlesource.com/c/go/+/405475
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Rhys Hiltner <rhys@justin.tv>
2022-05-10 15:43:40 +00:00
nimelehin
dcfe57b8c2 runtime: use ERMS in memclr_amd64
This patch adds support for REP STOSB in memclr(). The current
implementation uses REP STOSB when 1) ERMS is supported
2) size is bigger than 2kb and less than 32mb.

The threshold of 2kb is chosen based on benchmark results and is
close to what Intel mentioned in their comparison of ERMSB and AVX
(Table 3-4. Relative Performance of Memcpy() Using ERMSB Vs. 128-bit
AVX in the Intel Optimization Guide).

While REP STOS uses a no-RFO write protocol, ERMS could show the
same or slower performance comparing to Non-Temporal Stores when the
size is bigger than LLC depending on hardware.

Benchmarks (including MemclrRange from CL373362)
goos: darwin
goarch: amd64
pkg: runtime
cpu: Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
name                           old time/op    new time/op    delta
Memclr/5-12                      1.90ns ± 2%    2.13ns ± 2%  +11.72%  (p=0.001 n=7+7)
Memclr/16-12                     2.33ns ± 4%    2.36ns ± 4%     ~     (p=0.259 n=7+7)
Memclr/64-12                     2.58ns ± 2%    2.61ns ± 3%     ~     (p=0.091 n=7+7)
Memclr/256-12                    4.89ns ± 4%    4.94ns ± 3%     ~     (p=0.620 n=7+7)
Memclr/4096-12                   38.4ns ± 2%    39.5ns ± 5%     ~     (p=0.078 n=7+7)
Memclr/65536-12                   929ns ± 2%    1040ns ±19%     ~     (p=0.268 n=5+7)
Memclr/1M-12                     24.2µs ± 5%    19.0µs ± 9%  -21.62%  (p=0.001 n=7+7)
Memclr/4M-12                     93.3µs ± 3%    73.2µs ± 4%  -21.50%  (p=0.001 n=7+7)
Memclr/8M-12                      209µs ± 6%     164µs ± 3%  -21.55%  (p=0.001 n=7+7)
Memclr/16M-12                     731µs ± 4%     507µs ± 6%  -30.71%  (p=0.001 n=7+7)
Memclr/64M-12                    1.79ms ± 1%    1.83ms ± 3%   +2.47%  (p=0.041 n=6+6)
MemclrRange/1_2_47K-12            873ns ± 3%     899ns ± 5%     ~     (p=0.053 n=7+7)
MemclrRange/2_8_166K-12          2.98µs ± 4%    2.90µs ± 5%     ~     (p=0.165 n=7+7)
MemclrRange/4_16_315K-12         6.81µs ± 4%    5.31µs ± 9%  -22.01%  (p=0.001 n=7+7)
MemclrRange/128_256_1623K-12     37.5µs ± 4%    28.1µs ± 4%  -25.19%  (p=0.001 n=7+6)
[Geo mean]                       1.56µs         1.43µs        -8.43%

name                           old speed      new speed      delta
Memclr/5-12                    2.63GB/s ± 2%  2.35GB/s ± 2%  -10.50%  (p=0.001 n=7+7)
Memclr/16-12                   6.86GB/s ± 4%  6.79GB/s ± 4%     ~     (p=0.259 n=7+7)
Memclr/64-12                   24.8GB/s ± 2%  24.5GB/s ± 3%     ~     (p=0.097 n=7+7)
Memclr/256-12                  52.4GB/s ± 4%  51.9GB/s ± 3%     ~     (p=0.620 n=7+7)
Memclr/4096-12                  107GB/s ± 2%   104GB/s ± 5%     ~     (p=0.073 n=7+7)
Memclr/65536-12                70.6GB/s ± 2%  64.2GB/s ±21%     ~     (p=0.268 n=5+7)
Memclr/1M-12                   43.4GB/s ± 5%  55.5GB/s ±10%  +28.04%  (p=0.001 n=7+7)
Memclr/4M-12                   45.0GB/s ± 4%  57.3GB/s ± 4%  +27.38%  (p=0.001 n=7+7)
Memclr/8M-12                   40.1GB/s ± 5%  51.1GB/s ± 3%  +27.37%  (p=0.001 n=7+7)
Memclr/16M-12                  23.0GB/s ± 4%  33.1GB/s ± 6%  +44.39%  (p=0.001 n=7+7)
Memclr/64M-12                  37.6GB/s ± 1%  36.7GB/s ± 3%   -2.38%  (p=0.041 n=6+6)
MemclrRange/1_2_47K-12         55.9GB/s ± 3%  54.3GB/s ± 5%     ~     (p=0.053 n=7+7)
MemclrRange/2_8_166K-12        57.4GB/s ± 5%  58.9GB/s ± 5%     ~     (p=0.165 n=7+7)
MemclrRange/4_16_315K-12       47.4GB/s ± 4%  60.9GB/s ± 9%  +28.40%  (p=0.001 n=7+7)
MemclrRange/128_256_1623K-12   44.3GB/s ± 4%  58.4GB/s ± 9%  +31.73%  (p=0.001 n=7+7)
[Geo mean]                     33.6GB/s       36.8GB/s        +9.27%

goos: linux
goarch: amd64
pkg: runtime
cpu: Intel(R) Xeon(R) Gold 6230N CPU @ 2.30GHz
name                     old time/op    new time/op     delta
Memclr/5-2                 2.53ns ± 0%     2.52ns ± 0%   -0.25%  (p=0.001 n=7+7)
Memclr/16-2                2.77ns ± 0%     2.55ns ± 0%   -7.97%  (p=0.000 n=5+7)
Memclr/64-2                3.16ns ± 0%     3.16ns ± 0%     ~     (p=0.432 n=7+7)
Memclr/256-2               7.26ns ± 0%     7.26ns ± 0%     ~     (p=0.220 n=7+7)
Memclr/4096-2              49.3ns ± 0%     43.5ns ± 0%  -11.80%  (p=0.001 n=7+7)
Memclr/65536-2             1.32µs ± 1%     1.24µs ± 0%   -6.31%  (p=0.001 n=7+7)
Memclr/1M-2                27.3µs ± 0%     26.6µs ± 5%     ~     (p=0.195 n=7+7)
Memclr/4M-2                 195µs ± 0%      148µs ± 4%  -24.22%  (p=0.001 n=7+7)
Memclr/8M-2                 391µs ± 0%      308µs ± 0%  -21.09%  (p=0.001 n=7+6)
Memclr/16M-2                782µs ± 0%      639µs ± 1%  -18.31%  (p=0.001 n=7+7)
Memclr/64M-2               2.83ms ± 1%     2.84ms ± 1%     ~     (p=0.620 n=7+7)
MemclrRange/1K_2K-2        1.24µs ± 0%     1.24µs ± 0%     ~     (p=1.000 n=7+6)
MemclrRange/2K_8K-2        3.89µs ± 0%     3.11µs ± 0%  -20.00%  (p=0.001 n=6+7)
MemclrRange/4K_16K-2       3.63µs ± 0%     2.37µs ± 0%  -34.61%  (p=0.001 n=7+7)
MemclrRange/160K_228K-2    31.0µs ± 0%     30.6µs ± 1%   -1.50%  (p=0.001 n=7+7)
[Geo mean]                 1.97µs          1.76µs       -10.59%

name                     old speed      new speed       delta
Memclr/5-2               1.98GB/s ± 0%   1.98GB/s ± 0%   +0.27%  (p=0.001 n=7+7)
Memclr/16-2              5.78GB/s ± 0%   6.28GB/s ± 0%   +8.67%  (p=0.001 n=7+7)
Memclr/64-2              20.2GB/s ± 0%   20.3GB/s ± 0%     ~     (p=0.535 n=7+7)
Memclr/256-2             35.3GB/s ± 0%   35.2GB/s ± 0%     ~     (p=0.259 n=7+7)
Memclr/4096-2            83.1GB/s ± 0%   94.2GB/s ± 0%  +13.39%  (p=0.001 n=7+7)
Memclr/65536-2           49.7GB/s ± 1%   53.0GB/s ± 0%   +6.73%  (p=0.001 n=7+7)
Memclr/1M-2              38.4GB/s ± 0%   39.4GB/s ± 4%     ~     (p=0.209 n=7+7)
Memclr/4M-2              21.5GB/s ± 0%   28.4GB/s ± 4%  +32.02%  (p=0.001 n=7+7)
Memclr/8M-2              21.5GB/s ± 0%   27.2GB/s ± 0%  +26.73%  (p=0.001 n=7+6)
Memclr/16M-2             21.4GB/s ± 0%   26.2GB/s ± 1%  +22.42%  (p=0.001 n=7+7)
Memclr/64M-2             23.7GB/s ± 1%   23.7GB/s ± 1%     ~     (p=0.620 n=7+7)
MemclrRange/1K_2K-2      77.3GB/s ± 0%   77.3GB/s ± 0%     ~     (p=0.710 n=7+7)
MemclrRange/2K_8K-2      85.7GB/s ± 0%  107.1GB/s ± 0%  +25.00%  (p=0.001 n=6+7)
MemclrRange/4K_16K-2     89.0GB/s ± 0%  136.1GB/s ± 0%  +52.92%  (p=0.001 n=7+7)
MemclrRange/160K_228K-2  53.6GB/s ± 0%   54.4GB/s ± 1%   +1.52%  (p=0.001 n=7+7)
[Geo mean]               29.2GB/s        32.7GB/s       +11.86%

Change-Id: I8f3533f88ebd303ae1666a77391fec304bea9724
Reviewed-on: https://go-review.googlesource.com/c/go/+/374396
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-05-09 17:41:45 +00:00
Keith Randall
f566fe3910 runtime: make racereadrange ABIinternal
CL 266638 marked racewriterange (and some other race functions) as
ABIinternal but missed racereadrange.

arm64 and ppc64le (the other two register ABI platforms at the moment)
already have racereadrange marked as such.

The other two instrumented calls are to racefuncenter/racefuncexit.
Do you think they would need this treatment as well? arm64 already does,
but amd64 and ppc64le do not.

Fixes #51459

Change-Id: I3f54e1298433b6d67bfe18120d9f86205ff66a73
Reviewed-on: https://go-review.googlesource.com/c/go/+/393154
Reviewed-by: Than McIntosh <thanm@google.com>
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
2022-05-09 16:59:57 +00:00
Ryan Leung
dd8d425fed all: fix some lint issues
Make some code more simple.

Change-Id: I801adf0dba5f6c515681345c732dbb907f945419
GitHub-Last-Rev: a505146bac
GitHub-Pull-Request: golang/go#49626
Reviewed-on: https://go-review.googlesource.com/c/go/+/364634
Run-TryBot: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2022-05-08 17:27:54 +00:00
Paul E. Murphy
9c9090eb1d cmd/link: generate PPC64 ABI register save/restore functions if needed
They are usually needed when internally linking gcc code
compiled with -Os. These are typically generated by ld
or gold, but are missing when linking internally.

The PPC64 ELF ABI describes a set of functions to save/restore
non-volatile, callee-save registers using R1/R0/R12:

 _savegpr0_n: Save Rn-R31 relative to R1, save LR (in R0), return
 _restgpr0_n: Restore Rn-R31 from R1, and return to saved LR
  _savefpr_n: Save Fn-F31 based on R1, and save LR (in R0), return
  _restfpr_n: Restore Fn-F31 from R1, and return to 16(R1)
 _savegpr1_n: Save Rn-R31 based on R12, return
 _restgpr1_n: Restore Rn-R31 based on R12, return
   _savevr_m: Save VRm-VR31 based on R0, R12 is scratch, return
   _restvr_m: Restore VRm-VR31 based on R0, R12 is scratch, return

 m is a value 20<=m<=31
 n is a value 14<=n<=31

Add several new functions similar to those suggested by the
PPC64 ELFv2 ABI. And update the linker to scan external relocs
for these calls, and redirect them to runtime.elf_<func>+offset
in runtime/asm_ppc64x.go.

Similarly, code which generates plt stubs is moved into
a dedicated function. This avoids an extra scan of relocs.

fixes #52336

Change-Id: I2f0f8b5b081a7b294dff5c92b4b1db8eba9a9400
Reviewed-on: https://go-review.googlesource.com/c/go/+/400796
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Paul Murphy <murp@ibm.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-05-06 17:56:28 +00:00
Tobias Klauser
7f71e1fc28 runtime/cgo: remove memset in _cgo_sys_thread_start on linux
pthread_attr_init in glibc and musl libc already explicitly clear the
pthread_attr argument before setting it, see
https://sourceware.org/git/?p=glibc.git;a=blob;f=nptl/pthread_attr_init.c
and https://git.musl-libc.org/cgit/musl/log/src/thread/pthread_attr_init.c

It looks like pthread_attr_init has been implemented like this for a
long time in both libcs. The comment and memset in _cgo_sys_thread_start
probably stem from a time where not all libcs did the explicit memset in
pthread_attr_init.

Also, the memset in _cgo_sys_thread_start is not performed on all linux
platforms further indicating that this isn't an issue anymore.

Change-Id: I920148b5d24751195ced7af5bb7c52a7f8293259
Reviewed-on: https://go-review.googlesource.com/c/go/+/404275
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
2022-05-06 15:32:04 +00:00
Rhys Hiltner
7b1f8b62be runtime: prefer curg for execution trace profile
The CPU profiler adds goroutine labels to its samples based on
getg().m.curg. That allows the profile to correctly attribute work that
the runtime does on behalf of that goroutine on the M's g0 stack via
systemstack calls, such as using runtime.Callers to record the call
stack.

Those labels also cover work on the g0 stack via mcall. When the active
goroutine calls runtime.Gosched, it will receive attribution of its
share of the scheduler work necessary to find the next runnable
goroutine.

The execution tracer's attribution of CPU samples to specific goroutines
should match. When curg is set, attribute the CPU samples to that
goroutine's ID.

Fixes #52693

Change-Id: Ic9af92e153abd8477559e48bc8ebaf3739527b94
Reviewed-on: https://go-review.googlesource.com/c/go/+/404055
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
Run-TryBot: Rhys Hiltner <rhys@justin.tv>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-05-05 18:17:08 +00:00