This benchmark is added to test improvements in memclr_amd64.
As it is stated in Intel Optimization Manual 15.16.3.3, AVX2-implemented
memclr can produce a skewed result with the branch predictor being
trained by the large loop iteration count.
This benchmark generates sizes between some specified range. This should
help to measure how memclr works when branch predictors may be incorrectly
trained.
Change-Id: I14d173cafe43ca47198ed920e655547a66b3909f
Reviewed-on: https://go-review.googlesource.com/c/go/+/373362
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
Auto-Submit: Michael Pratt <mpratt@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Keith Randall <khr@google.com>
On 32-bit systems the result of hogCount*factor can overflow.
Use division instead to do comparison.
Update #52207
Change-Id: I429fb9dc009af645acb535cee5c70887527ba207
Reviewed-on: https://go-review.googlesource.com/c/go/+/407415
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Keith Randall <khr@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Contributors to the loong64 port are:
Weining Lu <luweining@loongson.cn>
Lei Wang <wanglei@loongson.cn>
Lingqin Gong <gonglingqin@loongson.cn>
Xiaolin Zhao <zhaoxiaolin@loongson.cn>
Meidan Li <limeidan@loongson.cn>
Xiaojuan Zhai <zhaixiaojuan@loongson.cn>
Qiyuan Pu <puqiyuan@loongson.cn>
Guoqi Chen <chenguoqi@loongson.cn>
This port has been updated to Go 1.15.6:
https://github.com/loongson/go
Updates #46229
Change-Id: I8ef0e7f17d6ada3d2f07c81524136b78457e7795
Reviewed-on: https://go-review.googlesource.com/c/go/+/342319
Run-TryBot: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Run-TryBot: David Chase <drchase@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Contributors to the loong64 port are:
Weining Lu <luweining@loongson.cn>
Lei Wang <wanglei@loongson.cn>
Lingqin Gong <gonglingqin@loongson.cn>
Xiaolin Zhao <zhaoxiaolin@loongson.cn>
Meidan Li <limeidan@loongson.cn>
Xiaojuan Zhai <zhaixiaojuan@loongson.cn>
Qiyuan Pu <puqiyuan@loongson.cn>
Guoqi Chen <chenguoqi@loongson.cn>
This port has been updated to Go 1.15.6:
https://github.com/loongson/go
Updates #46229
Change-Id: I0333503db044c6f39df2d7f8d9dff213b1361d6c
Reviewed-on: https://go-review.googlesource.com/c/go/+/342320
Reviewed-by: David Chase <drchase@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Make sure that all the targets of 64-bit atomic operations
are actually aligned to 8 bytes. This has been a source of
bugs on 32-bit systems. (e.g. CL 399754)
The strategy is to have a simple test that just checks the
alignment of some explicitly listed fields and global variables.
Then there's a more complicated test that makes sure the list
used in the simple test is exhaustive. That test has some
limitations, but it should catch most cases, particularly new
uses of atomic operations on new or existing fields.
Unlike a runtime assert, this check is free and will catch
accesses that occur even in very unlikely code paths.
Change-Id: I25ac78df471ac33b57cb91375bd8453d6ce2814f
Reviewed-on: https://go-review.googlesource.com/c/go/+/407034
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: David Chase <drchase@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Currently gctrace and gcpacertrace recompute the heap goal for
end-of-cycle information but this is incorrect.
Because both of these traces are printing stats from the previous cycle
in this case, they should print the heap goal at the end of the previous
cycle.
Change-Id: I967621cbaff9f331cd3e361de8850ddfe0cfc099
Reviewed-on: https://go-review.googlesource.com/c/go/+/407138
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: David Chase <drchase@google.com>
Contributors to the loong64 port are:
Weining Lu <luweining@loongson.cn>
Lei Wang <wanglei@loongson.cn>
Lingqin Gong <gonglingqin@loongson.cn>
Xiaolin Zhao <zhaoxiaolin@loongson.cn>
Meidan Li <limeidan@loongson.cn>
Xiaojuan Zhai <zhaixiaojuan@loongson.cn>
Qiyuan Pu <puqiyuan@loongson.cn>
Guoqi Chen <chenguoqi@loongson.cn>
This port has been updated to Go 1.15.6:
https://github.com/loongson/go
Updates #46229
Change-Id: Ie9bb5ccfc28e65036e2088c232bb333dcb259a60
Reviewed-on: https://go-review.googlesource.com/c/go/+/368076
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Auto-Submit: Ian Lance Taylor <iant@golang.org>
Contributors to the loong64 port are:
Weining Lu <luweining@loongson.cn>
Lei Wang <wanglei@loongson.cn>
Lingqin Gong <gonglingqin@loongson.cn>
Xiaolin Zhao <zhaoxiaolin@loongson.cn>
Meidan Li <limeidan@loongson.cn>
Xiaojuan Zhai <zhaixiaojuan@loongson.cn>
Qiyuan Pu <puqiyuan@loongson.cn>
Guoqi Chen <chenguoqi@loongson.cn>
This port has been updated to Go 1.15.6:
https://github.com/loongson/go
Updates #46229
Change-Id: Ifa0229d2044dd53683de4a2b3ab965b16263f267
Reviewed-on: https://go-review.googlesource.com/c/go/+/368075
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Contributors to the loong64 port are:
Weining Lu <luweining@loongson.cn>
Lei Wang <wanglei@loongson.cn>
Lingqin Gong <gonglingqin@loongson.cn>
Xiaolin Zhao <zhaoxiaolin@loongson.cn>
Meidan Li <limeidan@loongson.cn>
Xiaojuan Zhai <zhaixiaojuan@loongson.cn>
Qiyuan Pu <puqiyuan@loongson.cn>
Guoqi Chen <chenguoqi@loongson.cn>
This port has been updated to Go 1.15.6:
https://github.com/loongson/go
Updates #46229
Change-Id: I848608267932717895d5cff9e33040029c3f3c4b
Reviewed-on: https://go-review.googlesource.com/c/go/+/368080
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Auto-Submit: Ian Lance Taylor <iant@golang.org>
Contributors to the loong64 port are:
Weining Lu <luweining@loongson.cn>
Lei Wang <wanglei@loongson.cn>
Lingqin Gong <gonglingqin@loongson.cn>
Xiaolin Zhao <zhaoxiaolin@loongson.cn>
Meidan Li <limeidan@loongson.cn>
Xiaojuan Zhai <zhaixiaojuan@loongson.cn>
Qiyuan Pu <puqiyuan@loongson.cn>
Guoqi Chen <chenguoqi@loongson.cn>
This port has been updated to Go 1.15.6:
https://github.com/loongson/go
Updates #46229
Change-Id: Ida040e76dc8172f60e6aee1ea2b5bce13ab3581e
Reviewed-on: https://go-review.googlesource.com/c/go/+/368077
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Auto-Submit: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Contributors to the loong64 port are:
Weining Lu <luweining@loongson.cn>
Lei Wang <wanglei@loongson.cn>
Lingqin Gong <gonglingqin@loongson.cn>
Xiaolin Zhao <zhaoxiaolin@loongson.cn>
Meidan Li <limeidan@loongson.cn>
Xiaojuan Zhai <zhaixiaojuan@loongson.cn>
Qiyuan Pu <puqiyuan@loongson.cn>
Guoqi Chen <chenguoqi@loongson.cn>
This port has been updated to Go 1.15.6:
https://github.com/loongson/go
Updates #46229
Change-Id: I7c1f39670034db6714630d479bc41b6620ba2b1a
Reviewed-on: https://go-review.googlesource.com/c/go/+/368079
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Auto-Submit: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Excluding vendor and testdata.
CL 384268 already reformatted most, but these slipped past.
The struct in the doc comment in debug/dwarf/type.go
was fixed up by hand to indent the first and last lines as well.
For #51082.
Change-Id: Iad020f83aafd671ff58238fe491907e85923d0c7
Reviewed-on: https://go-review.googlesource.com/c/go/+/407137
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Russ Cox <rsc@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
NetBSD appears to have the same issue OpenBSD had in runqgrab. See
issue #52475 for more details.
For #35166.
Change-Id: Ie53192d26919b4717bc0d61cadd88d688ff38bb4
Reviewed-on: https://go-review.googlesource.com/c/go/+/407139
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
Contributors to the loong64 port are:
Weining Lu <luweining@loongson.cn>
Lei Wang <wanglei@loongson.cn>
Lingqin Gong <gonglingqin@loongson.cn>
Xiaolin Zhao <zhaoxiaolin@loongson.cn>
Meidan Li <limeidan@loongson.cn>
Xiaojuan Zhai <zhaixiaojuan@loongson.cn>
Qiyuan Pu <puqiyuan@loongson.cn>
Guoqi Chen <chenguoqi@loongson.cn>
This port has been updated to Go 1.15.6:
https://github.com/loongson/go
Updates #46229
Change-Id: I7a64e38b15a99816bd74262c02f62dad021cc166
Reviewed-on: https://go-review.googlesource.com/c/go/+/368078
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
The test checks that the scheduling of the goroutines are within
a small factor, to ensure the scheduler handing off the P
correctly. There have been flaky failures on the builder (probably
due to OS scheduling delays). Increase the threshold to make it
less flaky. The gap would be much bigger if the scheduler doesn't
work correctly.
For the long term maybe it is better to test it more directly
with the scheduler, e.g. with scheduler instrumentation.
May fix#52207.
Change-Id: I50278b70ab21b7f04761fdc8b38dd13304c67879
Reviewed-on: https://go-review.googlesource.com/c/go/+/407134
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Bryan Mills <bcmills@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Change-Id: I63eb42f3ce5ca452279120a5b33518f4ce16be45
GitHub-Last-Rev: a88f2f72be
GitHub-Pull-Request: golang/go#52951
Reviewed-on: https://go-review.googlesource.com/c/go/+/406843
Run-TryBot: Robert Griesemer <gri@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Robert Griesemer <gri@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
In TestCgoPprofThread, the (fake) cgo traceback function pretends
all C CPU samples are in cpuHogThread. But if a profiling signal
lands in C code but outside of that thread, e.g. before/when the
thread is created, we will get a sample which looks like Go calls
into cpuHogThread. This CL makes the cgo traceback function only
return cpuHogThread PCs when a signal lands on that thread.
May fix#52726.
Change-Id: I21c40f974d1882508626faf3ac45e8347fec31c4
Reviewed-on: https://go-review.googlesource.com/c/go/+/406934
Run-TryBot: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Change-Id: Iee18987c495d1d4bde9da888d454eea8079d3ebc
GitHub-Last-Rev: ff5e01599d
GitHub-Pull-Request: golang/go#52949
Reviewed-on: https://go-review.googlesource.com/c/go/+/406915
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Reviewed-by: Robert Griesemer <gri@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Contributors to the loong64 port are:
Weining Lu <luweining@loongson.cn>
Lei Wang <wanglei@loongson.cn>
Lingqin Gong <gonglingqin@loongson.cn>
Xiaolin Zhao <zhaoxiaolin@loongson.cn>
Meidan Li <limeidan@loongson.cn>
Xiaojuan Zhai <zhaixiaojuan@loongson.cn>
Qiyuan Pu <puqiyuan@loongson.cn>
Guoqi Chen <chenguoqi@loongson.cn>
This port has been updated to Go 1.15.6:
https://github.com/loongson/go
Updates #46229
Change-Id: I5e09759ce9201596e89a01fc4a6f7fd7e205449f
Reviewed-on: https://go-review.googlesource.com/c/go/+/368074
Reviewed-by: David Chase <drchase@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Run-TryBot: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Change-Id: I99c593573b3bec560ab3af49ac2f486ee442ee1c
GitHub-Last-Rev: e399ec50f9
GitHub-Pull-Request: golang/go#52946
Reviewed-on: https://go-review.googlesource.com/c/go/+/406837
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
The goroutine profiler tests include one that launches a steady stream
of goroutines. That creates a scheduler busy loop that can prevent
forward progress in the rest of the program. Slow down the launches a
bit so other goroutines have a chance to run.
Fixes#52916
For #52934
Change-Id: I748557201b94918b1fa4960544a51a48d9cacc6b
Reviewed-on: https://go-review.googlesource.com/c/go/+/406654
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Bryan Mills <bcmills@google.com>
Run-TryBot: Bryan Mills <bcmills@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
The testCPUProfile helper function iterates until the profile contains
enough samples. However, in general very slow builders may need longer
to complete tests, and may have less-responsive schedulers (leading to
longer durations required to collect profiles with enough samples).
To compensate, slower builders generally run tests with longer timeouts.
Since this test helper already dynamically scales the profile duration
based on the collected samples, allow it to continue to retry and
rescale until it would exceed the test's deadline.
Fixes#52656 (hopefully).
Change-Id: I4561e721927503f33a6d23336efa979bb9d3221f
Reviewed-on: https://go-review.googlesource.com/c/go/+/406614
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Bryan Mills <bcmills@google.com>
Auto-Submit: Bryan Mills <bcmills@google.com>
When compiling package runtime, cmd/compile logically has two copies
of package runtime: the actual source files being compiled, and the
internal description used for emitting compiler-generated calls.
Notably, CL 393715 will cause the compiler's write barrier validation
to start recognizing that compiler-generated calls are actually calls
to the corresponding functions from the source package. And today,
there are some code paths in nowritebarrierrec code paths that
actually end up generating code to call panicshift or panicdivide.
In preparation, this CL marks those functions as
//go:yeswritebarrierrec. We probably want to actually cleanup those
code paths to avoid these calls actually (e.g., explicitly convert
shift count expressions to an unsigned integer type). But for now,
this at least unblocks CL 393715 while preserving the status quo.
Updates #51734.
Change-Id: I01f89adb72466c0260a9cd363e3e09246e39cff9
Reviewed-on: https://go-review.googlesource.com/c/go/+/406316
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Currently the GC CPU limiter doesn't account for idle application time
at all. This means that the GC could start thrashing, for example if the
live heap exceeds the max heap set by the memory limit, but the limiter
will fail to kick in when there's a lot of available idle time. User
goroutines will still be assisting at a really high rate because of
assist pacing rules, but the GC CPU limiter will fail to kick in because
the actual fraction of GC CPU time will be low if there's a lot of
otherwise idle time (for example, on an overprovisioned system).
Luckily, that idle time is usually eaten up entirely by idle mark
workers, at least during the GC cycle. And in these cases where we're
GCing continuously, that's all of our idle time. So we can take idle
mark work time and subtract it from the mutator time accumulated in the
GC CPU limiter, and that will give us a more accurate picture of how
much CPU is being spent by user goroutines on GC. This will allow the GC
CPU limiter to kick in, and reduce the impact of the thrashing.
There is a corner case here if the idle mark workers are disabled, for
example for the periodic GC, but in the case of the periodic GC, I don't
think it's possible for us to be thrashing at all, so it doesn't really
matter.
Fixes#52890.
Change-Id: Ie133a7d1f89b603434b415d51eb8733c2708a858
Reviewed-on: https://go-review.googlesource.com/c/go/+/405898
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
This metric exports the the last GC cycle index that the GC limiter was
enabled. This metric is useful for debugging and identifying the root
cause of OOMs, especially when SetMemoryLimit is in use.
For #48409.
Change-Id: Ic6383b19e88058366a74f6ede1683b8ffb30a69c
Reviewed-on: https://go-review.googlesource.com/c/go/+/403614
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
runtime code for js contains possible write barriers that fail
the nowritebarrierrec check when internal local package naming
conventions are changed. The problem was there all already; this
allows the code to compile, and it seems to work anyway in the
(single-threaded) js/wasm environment. The offending operations
are noted with TODO, which is an improvement.
runtime code for plan9 contained an apparent allocation that was
not really an allocation; rewrite to remove the potential allocation
to avoid nowritebarrierrec problems.
This CL is a prerequisite for a pending code cleanup,
https://go-review.googlesource.com/c/go/+/393715
Updates #51734.
Change-Id: I93f31831ff9b92632137dd7b0055eaa721c81556
Reviewed-on: https://go-review.googlesource.com/c/go/+/405901
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
The compiler may choose to inline multiple layers of function call, such
that A calling B calling C may end up with all of the instructions for B
and C written as part of A's function body.
Within that function body, some PCs will represent code from function A.
Some will represent code from function B, and for each of those the
runtime will have an instruction attributable to A that it can report as
its caller. Others will represent code from function C, and for each of
those the runtime will have an instruction attributable to B and an
instruction attributable to A that it can report as callers.
When a profiling signal arrives at an instruction in B (as inlined in A)
that the runtime also uses to describe calls to C, the profileBuilder
ends up with an incorrect cache of allFrames results. That PC should
lead to a location record in the profile that represents the frames
B<-A, but the allFrames cache's view should expand the PC only to the B
frame.
Otherwise, when a profiling signal arrives at an instruction in C (as
inlined in B in A), the PC stack C,B,A can get expanded to the frames
C,B<-A,A as follows: The inlining deck starts empty. The first tryAdd
call proposes PC C and frames C, which the deck accepts. The second
tryAdd call proposes PC B and, due to the incorrect caching, frames B,A.
(A fresh call to allFrames with PC B would return the frame list B.) The
deck accepts that PC and frames. The third tryAdd call proposes PC A and
frames A. The deck rejects those because a call from A to A cannot
possibly have been inlined. This results in a new location record in the
profile representing the frames C<-B<-A (good), as called by A (bad).
The bug is the cached expansion of PC B to frames B<-A. That mapping is
only appropriate for the resulting protobuf-format profile. The cache
needs to reflect the results of a call to allFrames, which expands the
PC B to the single frame B.
For #50996
For #52693Fixes#52764
Change-Id: I36d080f3c8a05650cdc13ced262189c33b0083b0
Reviewed-on: https://go-review.googlesource.com/c/go/+/404995
Reviewed-by: Bryan Mills <bcmills@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Rhys Hiltner <rhys@justin.tv>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Panic avoids any write barriers in the runtime by checking first
and throwing if called inappropriately, so it is "okay". Adding
this annotation repairs recursive write barrier checking, which
becomes more thorough when the local package naming convention
is changed from "" to the actual package name.
This CL is a prerequisite for a pending code cleanup,
https://go-review.googlesource.com/c/go/+/393715
Updates #51734.
Change-Id: If831a3598c6c8cd37a8e9ba269f822cd81464a13
Reviewed-on: https://go-review.googlesource.com/c/go/+/405900
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Run-TryBot: David Chase <drchase@google.com>
This change extends https://go.dev/cl/381317 to the
runtime/internal/atomic package in terms of aligning 64-bit types to 8
bytes, even on 32-bit platforms.
Change-Id: Id8c45577d07b256e3144d88b31f201264295cfcd
Reviewed-on: https://go-review.googlesource.com/c/go/+/404096
Reviewed-by: David Chase <drchase@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
At the expense of performance (having to update another atomic counter)
this change makes CPU limiter assist time much less error-prone to
manage. There are currently a number of issues with respect to how
scavenge assist time is treated, and this change resolves those by just
having the limiter maintain its own internal pool that's drained on each
update.
While we're here, clear the measured assist time each cycle, which was
the impetus for the change.
Change-Id: I84c513a9f012b4007362a33cddb742c5779782b7
Reviewed-on: https://go-review.googlesource.com/c/go/+/404304
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Measure the average stack size used by goroutines at every GC. When
starting a new goroutine, allocate an initial goroutine stack of that
average size. Intuition is that we'll waste at most 2x in stack space
because only half the goroutines can be below average. In turn, we
avoid some of the early stack growth / copying needed in the average
case.
More details in the design doc at: https://docs.google.com/document/d/1YDlGIdVTPnmUiTAavlZxBI1d9pwGQgZT7IKFKlIXohQ/edit?usp=sharing
name old time/op new time/op delta
Issue18138 95.3µs ± 0% 67.3µs ±13% -29.35% (p=0.000 n=9+10)
Fixes#18138
Change-Id: Iba34d22ed04279da7e718bbd569bbf2734922eaa
Reviewed-on: https://go-review.googlesource.com/c/go/+/345889
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
This change adds support for vDSO for s390x architecture. This avoids the use of system calls in nanotime and walltime and accelerates them by factor 4-5.
Benchmarks:
100,000,000 x time.Now():
syscall fallback 13923ms 139.23 ns/op
vDSO enabled 2640ms 26.40 ns/op
Change-Id: Ic679fe31048379e59ccf83b400140f13c9d49696
GitHub-Last-Rev: 8f6e918a45
GitHub-Pull-Request: golang/go#49717
Reviewed-on: https://go-review.googlesource.com/c/go/+/365995
Run-TryBot: Paul Murphy <murp@ibm.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Paul Murphy <murp@ibm.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Jonathan Albrecht <jonathan.albrecht@ibm.com>
Reviewed-by: Heschi Kreinick <heschi@google.com>
Reviewed-by: Bill O'Farrell <billotosyr@gmail.com>
CL 362934 added open code for unsafe.Slice, so using it now no longer
negatively impacts the performance.
Updates #48798
Change-Id: Ifbabe8bc1cc4349c5bcd11586a11fc99bcb388b1
Reviewed-on: https://go-review.googlesource.com/c/go/+/404974
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
So prevent heavy runtime call overhead, and the compiler will have a
chance to optimize the bound check.
With this optimization, changing runtime/stack.go to use unsafe.Slice
no longer negatively impacts stack copying performance:
name old time/op new time/op delta
StackCopyWithStkobj-8 16.3ms ± 6% 16.5ms ± 5% ~ (p=0.382 n=8+8)
name old alloc/op new alloc/op delta
StackCopyWithStkobj-8 17.0B ± 0% 17.0B ± 0% ~ (all equal)
name old allocs/op new allocs/op delta
StackCopyWithStkobj-8 1.00 ± 0% 1.00 ± 0% ~ (all equal)
Fixes#48798
Change-Id: I731a9a4abd6dd6846f44eece7f86025b7bb1141b
Reviewed-on: https://go-review.googlesource.com/c/go/+/362934
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Keith Randall <khr@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
This avoids a dependency on the compiler statically initializing
maxSearchAddr, which is necessary so we can disable the (overly
aggressive and spec non-conforming) optimizations in cmd/compile and
gccgo.
Updates #51913.
Change-Id: I424e62c81c722bb179ed8d2d8e188274a1aeb7b6
Reviewed-on: https://go-review.googlesource.com/c/go/+/396194
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Austin Clements <austin@google.com>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
GC requires the whole zeroed word to be visible for a memory subsystem.
While the implementation of Enhanced REP STOSB tries to use as efficient
stores as possible, e.g writing the whole cache line and not byte-after-byte,
we should use REP STOSQ to guarantee the requirements of the GC.
The performance is not affected.
Change-Id: I1b0fd1444a40bfbb661541291ab96eba11bcc762
Reviewed-on: https://go-review.googlesource.com/c/go/+/405274
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
On ARM, when GOARM<=6 the TLS pointer is fetched via a call to a
kernel helper. This call clobbers LR, even just temporarily. If
the function is NOFRAME, if a profiling signal lands right after
the call returns, the unwinder will find the wrong LR. Not mark it
NOFRAME, so the LR will be saved in the usual way and stack
unwinding should work.
May fix#52829.
Change-Id: I419a31dcf4afbcff8d7ab8f179eec3c477589e60
Reviewed-on: https://go-review.googlesource.com/c/go/+/405482
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
This edge represents the case of executing a write barrier under the
trace lock: we might use the wbufSpans lock to get a new trace buffer,
or mheap to allocate a totally new one.
Fixes#52794.
Change-Id: Ia1ac2c744b8284ae29f4745373df3f9675ab1168
Reviewed-on: https://go-review.googlesource.com/c/go/+/405476
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
If SetFinalizer is never called, we might readgstatus on a nil fing
variable, resulting in a crash. This change guards code that accesses
fing by a nil check.
Fixes#52821.
Change-Id: I3e8e7004f97f073dc622e801a1d37003ea153a29
Reviewed-on: https://go-review.googlesource.com/c/go/+/405475
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Rhys Hiltner <rhys@justin.tv>
CL 266638 marked racewriterange (and some other race functions) as
ABIinternal but missed racereadrange.
arm64 and ppc64le (the other two register ABI platforms at the moment)
already have racereadrange marked as such.
The other two instrumented calls are to racefuncenter/racefuncexit.
Do you think they would need this treatment as well? arm64 already does,
but amd64 and ppc64le do not.
Fixes#51459
Change-Id: I3f54e1298433b6d67bfe18120d9f86205ff66a73
Reviewed-on: https://go-review.googlesource.com/c/go/+/393154
Reviewed-by: Than McIntosh <thanm@google.com>
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Make some code more simple.
Change-Id: I801adf0dba5f6c515681345c732dbb907f945419
GitHub-Last-Rev: a505146bac
GitHub-Pull-Request: golang/go#49626
Reviewed-on: https://go-review.googlesource.com/c/go/+/364634
Run-TryBot: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
They are usually needed when internally linking gcc code
compiled with -Os. These are typically generated by ld
or gold, but are missing when linking internally.
The PPC64 ELF ABI describes a set of functions to save/restore
non-volatile, callee-save registers using R1/R0/R12:
_savegpr0_n: Save Rn-R31 relative to R1, save LR (in R0), return
_restgpr0_n: Restore Rn-R31 from R1, and return to saved LR
_savefpr_n: Save Fn-F31 based on R1, and save LR (in R0), return
_restfpr_n: Restore Fn-F31 from R1, and return to 16(R1)
_savegpr1_n: Save Rn-R31 based on R12, return
_restgpr1_n: Restore Rn-R31 based on R12, return
_savevr_m: Save VRm-VR31 based on R0, R12 is scratch, return
_restvr_m: Restore VRm-VR31 based on R0, R12 is scratch, return
m is a value 20<=m<=31
n is a value 14<=n<=31
Add several new functions similar to those suggested by the
PPC64 ELFv2 ABI. And update the linker to scan external relocs
for these calls, and redirect them to runtime.elf_<func>+offset
in runtime/asm_ppc64x.go.
Similarly, code which generates plt stubs is moved into
a dedicated function. This avoids an extra scan of relocs.
fixes#52336
Change-Id: I2f0f8b5b081a7b294dff5c92b4b1db8eba9a9400
Reviewed-on: https://go-review.googlesource.com/c/go/+/400796
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Paul Murphy <murp@ibm.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
pthread_attr_init in glibc and musl libc already explicitly clear the
pthread_attr argument before setting it, see
https://sourceware.org/git/?p=glibc.git;a=blob;f=nptl/pthread_attr_init.c
and https://git.musl-libc.org/cgit/musl/log/src/thread/pthread_attr_init.c
It looks like pthread_attr_init has been implemented like this for a
long time in both libcs. The comment and memset in _cgo_sys_thread_start
probably stem from a time where not all libcs did the explicit memset in
pthread_attr_init.
Also, the memset in _cgo_sys_thread_start is not performed on all linux
platforms further indicating that this isn't an issue anymore.
Change-Id: I920148b5d24751195ced7af5bb7c52a7f8293259
Reviewed-on: https://go-review.googlesource.com/c/go/+/404275
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
The CPU profiler adds goroutine labels to its samples based on
getg().m.curg. That allows the profile to correctly attribute work that
the runtime does on behalf of that goroutine on the M's g0 stack via
systemstack calls, such as using runtime.Callers to record the call
stack.
Those labels also cover work on the g0 stack via mcall. When the active
goroutine calls runtime.Gosched, it will receive attribution of its
share of the scheduler work necessary to find the next runnable
goroutine.
The execution tracer's attribution of CPU samples to specific goroutines
should match. When curg is set, attribute the CPU samples to that
goroutine's ID.
Fixes#52693
Change-Id: Ic9af92e153abd8477559e48bc8ebaf3739527b94
Reviewed-on: https://go-review.googlesource.com/c/go/+/404055
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
Run-TryBot: Rhys Hiltner <rhys@justin.tv>
TryBot-Result: Gopher Robot <gobot@golang.org>