Currently, newArena holds the gcBitsArenas lock across allocating
memory from the OS for a new gcBits arena. This is a global lock and
allocating physical memory can be expensive, so this has the potential
to cause high lock contention, especially since every single span
sweep operation calls newArena (via newMarkBits).
Improve the situation by temporarily dropping the lock across
allocation. This means the caller now has to revalidate its
assumptions after the lock is dropped, so this also factors out that
code path and reinvokes it after the lock is acquired.
Change-Id: I1113200a954ab4aad16b5071512583cfac744bdc
Reviewed-on: https://go-review.googlesource.com/34594
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Rick Hudson <rlh@golang.org>
Change-Id: I6343c162e27e2e492547c96f1fc504909b1c03c0
Reviewed-on: https://go-review.googlesource.com/37793
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Currently ReadMemStats stops the world for ~1.7 ms/GB of heap because
it collects statistics from every single span. For large heaps, this
can be quite costly. This is particularly unfortunate because many
production infrastructures call this function regularly to collect and
report statistics.
Fix this by tracking the necessary cumulative statistics in the
mcaches. ReadMemStats still has to stop the world to stabilize these
statistics, but there are only O(GOMAXPROCS) mcaches to collect
statistics from, so this pause is only 25µs even at GOMAXPROCS=100.
Fixes#13613.
Change-Id: I3c0a4e14833f4760dab675efc1916e73b4c0032a
Reviewed-on: https://go-review.googlesource.com/34937
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
The gcstats structure is no longer consumed by anything and no longer
tracks statistics that are particularly relevant to the concurrent
garbage collector. Remove it. (Having statistics is probably a good
idea, but these aren't the stats we need these days and we don't have
a way to get them out of the runtime.)
In preparation for #13613.
Change-Id: Ib63e2f9067850668f9dcbfd4ed89aab4a6622c3f
Reviewed-on: https://go-review.googlesource.com/34936
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Rick Hudson <rlh@golang.org>
The iOS test harness was recently changed in response to lldb bugs
to replace breakpoints with the SIGUSR2 signal (CL 34926), and to
pass the current directory in the test binary arguments (CL 35152).
Both the signal sending and working directory setup is done from
the go test driver.
However, the new method doesn't work with tests where a C program is
the test driver instead of go test: the current working directory
will not be changed and SIGUSR2 is not raised.
Instead of copying that logic into any C test program, rework the
test harness (again) to move the setup logic to the early runtime
cgo setup code. That way, the harness will run even in the library
build modes.
Then, use the app Info.plist file to pass the working
directory, removing the need to alter the arguments after running.
Finally, use the SIGINT signal instead of SIGUSR2 to avoid
manipulating the signal masks or handlers.
Fixes the testcarchive tests on iOS.
With this CL, both darwin/arm and darwin/arm64 passes all.bash.
This CL replaces CL 34926, CL 35152 as well as the fixup CL
35123 and CL 35255. They are reverted in CLs earlier in the
relation chain.
Change-Id: I8485c7db1404fbd8daa261efd1ea89e905121a3e
Reviewed-on: https://go-review.googlesource.com/36090
Run-TryBot: Elias Naur <elias.naur@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Crawshaw <crawshaw@golang.org>
In order to generate accurate tracebacks, the runtime needs to know the
inlined call stack for a given PC. This creates two tables per function
for this purpose. The first table is the inlining tree (stored in the
function's funcdata), which has a node containing the file, line, and
function name for every inlined call. The second table is a PC-value
table that maps each PC to a node in the inlining tree (or -1 if the PC
is not the result of inlining).
To give the appearance that inlining hasn't happened, the runtime also
needs the original source position information of inlined AST nodes.
Previously the compiler plastered over the line numbers of inlined AST
nodes with the line number of the call. This meant that the PC-line
table mapped each PC to line number of the outermost call in its inlined
call stack, with no way to access the innermost line number.
Now the compiler retains line numbers of inlined AST nodes and writes
the innermost source position information to the PC-line and PC-file
tables. Some tools and tests expect to see outermost line numbers, so we
provide the OutermostLine function for displaying line info.
To keep track of the inlined call stack for an AST node, we extend the
src.PosBase type with an index into a global inlining tree. Every time
the compiler inlines a call, it creates a node in the global inlining
tree for the call, and writes its index to the PosBase of every inlined
AST node. The parent of this node is the inlining tree index of the
call. -1 signifies no parent.
For each function, the compiler creates a local inlining tree and a
PC-value table mapping each PC to an index in the local tree. These are
written to an object file, which is read by the linker. The linker
re-encodes these tables compactly by deduplicating function names and
file names.
This change increases the size of binaries by 4-5%. For example, this is
how the go1 benchmark binary is impacted by this change:
section old bytes new bytes delta
.text 3.49M ± 0% 3.49M ± 0% +0.06%
.rodata 1.12M ± 0% 1.21M ± 0% +8.21%
.gopclntab 1.50M ± 0% 1.68M ± 0% +11.89%
.debug_line 338k ± 0% 435k ± 0% +28.78%
Total 9.21M ± 0% 9.58M ± 0% +4.01%
Updates #19348.
Change-Id: Ic4f180c3b516018138236b0c35e0218270d957d3
Reviewed-on: https://go-review.googlesource.com/37231
Run-TryBot: David Lazar <lazard@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
There are two accesses to mheap_.busy that are guarded by checks
against len(mheap_.free). This works because both lists are (and must
be) the same length, but it makes the code less clear. Change these to
use len(mheap_.busy) so the access more clearly parallels the check.
Fixes#18944.
Change-Id: I9bacbd3663988df351ed4396ae9018bc71018311
Reviewed-on: https://go-review.googlesource.com/36354
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
Currently sweep counts the number of allocated objects, computes the
number of free objects from that, then re-computes the number of
allocated objects from that. Simplify and clean this up by skipping
these intermediate steps.
Change-Id: I3ed98e371eb54bbcab7c8530466c4ab5fde35f0a
Reviewed-on: https://go-review.googlesource.com/34935
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Marvin Stenger <marvin.stenger94@gmail.com>
Reviewed-by: Rick Hudson <rlh@golang.org>
Currently we scan the finalizers queue both during concurrent mark and
during mark termination. This costs roughly 20ns per queued finalizer
and about 1ns per unused finalizer queue slot (allocated queue length
never decreases), which can drive up STW time if there are many
finalizers.
However, we only add finalizers to this queue during sweeping, which
means that the second scan will never find anything new. Hence, we can
fix this by simply not scanning the finalizers queue during mark
termination. This brings the STW time under the 100µs goal even with
1,000,000 queued finalizers.
Fixes#18869.
Change-Id: I4ce5620c66fb7f13ebeb39ca313ce57047d1d0fb
Reviewed-on: https://go-review.googlesource.com/36013
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
Since workbuf is now marked go:notinheap, the write barrier-preventing
wrapper type wbufptr is no longer necessary. Remove it.
Change-Id: I3e5b5803a1547d65de1c1a9c22458a38e08549b7
Reviewed-on: https://go-review.googlesource.com/35971
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
Some debugging code was recently added to:
1) provide more detail for the stale reason when it is
determined that a package is stale
2) provide file and package time and date information when
it is determined that runtime.a is stale
This backs out those those debugging messages.
Fixes#19116
Change-Id: I8dd0cbe29324820275b481d8bbb78ff2c5fbc362
Reviewed-on: https://go-review.googlesource.com/37382
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
The comments in cmd/internal/obj/funcdata.go are identical to the
comments in runtime/funcdata.h, but the majority of the definitions
they refer to don't apply to Go sources and have been stripped out of
funcdata.go.
Remove these stale comments from funcdata.go and clean up the
references to other copies of the PCDATA and FUNCDATA indexes.
Change-Id: I5d6e49a6e586cc9aecd7c3ce1567679f2a605884
Reviewed-on: https://go-review.googlesource.com/37330
Reviewed-by: Keith Randall <khr@golang.org>
These functions are not defined and are not used.
Fixes#19290
Change-Id: I2978147220af83cf319f7439f076c131870fb9ee
Reviewed-on: https://go-review.googlesource.com/37448
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Dmitry Vyukov <dvyukov@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
If the caller passes a large number to Profile.Add,
the list of pcs is empty, which results in junk
(a nil pc) being recorded. Check for that explicitly,
and replace such stack traces with a lostProfileEvent.
Fixes#18836.
Change-Id: I99c96aa67dd5525cd239ea96452e6e8fcb25ce02
Reviewed-on: https://go-review.googlesource.com/36891
Reviewed-by: Russ Cox <rsc@golang.org>
The profiles are self-contained now.
Check that they work by themselves in the tests that invoke pprof,
but also keep checking that the old command lines work.
Change-Id: I24c74b5456f0b50473883c3640625c6612f72309
Reviewed-on: https://go-review.googlesource.com/37166
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Matloob <matloob@golang.org>
The existing code builds a full profile in memory.
Then it translates that profile into a data structure (in memory).
Then it marshals that data structure into a protocol buffer (in memory).
Then it gzips that marshaled form into the underlying writer.
So there are three copies of the full profile data in memory
at the same time before we're done. This is obviously dumb.
This CL implements a fully streaming conversion from
the original in-memory profile to the underlying writer.
There is now only one copy of the profile in memory.
For the non-CPU profiles, this is optimal, since we have to
have a full copy in memory to start with.
For the CPU profiles, we could still try to bound the profile
size stored in memory and stream fragments out during
the actual profiling, as Go 1.7 did (with a simpler format),
but so far that hasn't been necessary.
Change-Id: Ic36141021857791bf0cd1fce84178fb5e744b989
Reviewed-on: https://go-review.googlesource.com/37164
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Michael Matloob <matloob@golang.org>
The old hash table was a place holder that allocates memory
during every lookup for key generation, even for keys that hit
in the the table.
Change-Id: I4f601bbfd349f0be76d6259a8989c9c17ccfac21
Reviewed-on: https://go-review.googlesource.com/37163
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Matloob <matloob@golang.org>
This doesn't change the functionality of the current code,
but it sets us up for exporting the profiling labels into the profile.
The old code had a hash table of profile samples maintained
during the signal handler, with evictions going into a log.
The new code just logs every sample directly, leaving the
hash-based deduplication to an ordinary goroutine.
The new code also avoids storing the entire profile in two
forms in memory, an unfortunate regression introduced
when binary profile support was added. After this CL the
entire profile is only stored once in memory. We'd still like
to get back down to storing it zero times (streaming it to
the underlying io.Writer).
Change-Id: I0893a1788267c564aa1af17970d47377b2a43457
Reviewed-on: https://go-review.googlesource.com/36712
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Matloob <matloob@golang.org>
It's common for some goroutines to loop calling time.Sleep.
Allocate once per goroutine, not every time.
This comes up in runtime/pprof's background reader.
Change-Id: I89d17dc7379dca266d2c9cd3aefc2382f5bdbade
Reviewed-on: https://go-review.googlesource.com/37162
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
The existing CPU profiling buffer is a slice of uintptr, but we want to
start including profiling label data in the profiles, and those labels need
to be pointers in order to let them describe rich information.
This CL implements a new profBuf type that holds both a slice of uint64
for data and a slice of unsafe.Pointer for profiling labels (aka tags).
Making the runtime use these buffers will happen in followup CLs.
Change-Id: I9ff16b532d8edaf4ce0cbba1098229a561834efc
Reviewed-on: https://go-review.googlesource.com/36713
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
This updates the testcase to display the timestamps for the
runtime.a, it dependent packages atomic.a and sys.a, and
source files.
Change-Id: Id2901b4e8aa8eb9775c4f404ac01cc07b394ba91
Reviewed-on: https://go-review.googlesource.com/37332
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Suggested by Dmitry in CL 36792 review.
Clearly safe since there are many different semaRoots
that could all have profiled sudogs calling mutexevent.
Change-Id: I45eed47a5be3e513b2dad63b60afcd94800e16d1
Reviewed-on: https://go-review.googlesource.com/37104
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
We have seen one instance of a production job suddenly spinning to
100% CPU and becoming unresponsive. In that one instance, a SIGQUIT
was sent after 328 minutes of spinning, and the stacks showed a single
goroutine in "IO wait (scan)" state.
Looking for things that might get stuck if a goroutine got stuck in
scanning a stack, we found that injectglist does:
lock(&sched.lock)
var n int
for n = 0; glist != nil; n++ {
gp := glist
glist = gp.schedlink.ptr()
casgstatus(gp, _Gwaiting, _Grunnable)
globrunqput(gp)
}
unlock(&sched.lock)
and that casgstatus spins on gp.atomicstatus until the _Gscan bit goes
away. Essentially, this code locks sched.lock and then while holding
sched.lock, waits to lock gp.atomicstatus.
The code that is doing the scan is:
if castogscanstatus(gp, s, s|_Gscan) {
if !gp.gcscandone {
scanstack(gp, gcw)
gp.gcscandone = true
}
restartg(gp)
break loop
}
More analysis showed that scanstack can, in a rare case, end up
calling back into code that acquires sched.lock. For example:
runtime.scanstack at proc.go:866
calls runtime.gentraceback at mgcmark.go:842
calls runtime.scanstack$1 at traceback.go:378
calls runtime.scanframeworker at mgcmark.go:819
calls runtime.scanblock at mgcmark.go:904
calls runtime.greyobject at mgcmark.go:1221
calls (*runtime.gcWork).put at mgcmark.go:1412
calls (*runtime.gcControllerState).enlistWorker at mgcwork.go:127
calls runtime.wakep at mgc.go:632
calls runtime.startm at proc.go:1779
acquires runtime.sched.lock at proc.go:1675
This path was found with an automated deadlock-detecting tool.
There are many such paths but they all go through enlistWorker -> wakep.
The evidence strongly suggests that one of these paths is what caused
the deadlock we observed. We're running those jobs with
GOTRACEBACK=crash now to try to get more information if it happens
again.
Further refinement and analysis shows that if we drop the wakep call
from enlistWorker, the remaining few deadlock cycles found by the tool
are all false positives caused by not understanding the effect of calls
to func variables.
The enlistWorker -> wakep call was intended only as a performance
optimization, it rarely executes, and if it does execute at just the
wrong time it can (and plausibly did) cause the deadlock we saw.
Comment it out, to avoid the potential deadlock.
Fixes#19112.
Unfixes #14179.
Change-Id: I6f7e10b890b991c11e79fab7aeefaf70b5d5a07b
Reviewed-on: https://go-review.googlesource.com/37093
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
This changes the os package to use the runtime poller for file I/O
where possible. When a system call blocks on a pollable descriptor,
the goroutine will be blocked on the poller but the thread will be
released to run other goroutines. When using a non-pollable
descriptor, the os package will continue to use thread-blocking system
calls as before.
For example, on GNU/Linux, the runtime poller uses epoll. epoll does
not support ordinary disk files, so they will continue to use blocking
I/O as before. The poller will be used for pipes.
Since this means that the poller is used for many more programs, this
modifies the runtime to only block waiting for the poller if there is
some goroutine that is waiting on the poller. Otherwise, there is no
point, as the poller will never make any goroutine ready. This
preserves the runtime's current simple deadlock detection.
This seems to crash FreeBSD systems, so it is disabled on FreeBSD.
This is issue 19093.
Using the poller on Windows requires opening the file with
FILE_FLAG_OVERLAPPED. We should only do that if we can remove that
flag if the program calls the Fd method. This is issue 19098.
Update #6817.
Update #7903.
Update #15021.
Update #18507.
Update #19093.
Update #19098.
Change-Id: Ia5197dcefa7c6fbcca97d19a6f8621b2abcbb1fe
Reviewed-on: https://go-review.googlesource.com/36800
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
Since we're no longer stealing space for the stack barrier array from
the stack allocation, the stack allocation is simply
g.stack.hi-g.stack.lo.
Updates #17503.
Change-Id: Id9b450ae12c3df9ec59cfc4365481a0a16b7c601
Reviewed-on: https://go-review.googlesource.com/36621
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
Now that we don't rescan stacks, stack barriers are unnecessary. This
removes all of the code and structures supporting them as well as
tests that were specifically for stack barriers.
Updates #17503.
Change-Id: Ia29221730e0f2bbe7beab4fa757f31a032d9690c
Reviewed-on: https://go-review.googlesource.com/36620
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
With the hybrid barrier, rescanning stacks is no longer necessary so
the rescan list is no longer necessary. Remove it.
This leaves the gcrescanstacks GODEBUG variable, since it's useful for
debugging, but changes it to simply walk all of the Gs to rescan
stacks rather than using the rescan list.
We could also remove g.gcscanvalid, which is effectively a distributed
rescan list. However, it's still useful for gcrescanstacks mode and it
adds little complexity, so we'll leave it in.
Fixes#17099.
Updates #17503.
Change-Id: I776d43f0729567335ef1bfd145b75c74de2cc7a9
Reviewed-on: https://go-review.googlesource.com/36619
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
The wbshadow implementation was removed a year and a half ago in
1635ab7dfe, but the GODEBUG setting remained. Remove the GODEBUG
setting since it doesn't do anything.
Change-Id: I19cde324a79472aff60acb5cc9f7d4aa86c0c0ed
Reviewed-on: https://go-review.googlesource.com/36618
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
For vet. There are more. This is a start.
Change-Id: Ibbbb2b20b5db60ee3fac4a1b5913d18fab01f6b9
Reviewed-on: https://go-review.googlesource.com/36939
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Extend period of fastrand from (1<<31)-1 to (1<<32)-1 by
choosing other polynom and reacting on high bit before shift.
Polynomial is taken at https://users.ece.cmu.edu/~koopman/lfsr/index.html
from 32.dat.gz . It is referred as F7711115 cause this list of
polynomials is for LFSR with shift to right (and fastrand uses shift to
left). (old polynomial is referred in 31.dat.gz as 7BB88888).
There were couple of places with conversation of fastrand to int, which
leads to negative values on 32bit platforms. They are fixed.
Change-Id: Ibee518a3f9103e0aea220ada494b3aec77babb72
Reviewed-on: https://go-review.googlesource.com/36875
Run-TryBot: Minux Ma <minux@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Minux Ma <minux@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
This will make it possible to use the poller with the os package.
This is a lot of code movement but the behavior is intended to be
unchanged.
Update #6817.
Update #7903.
Update #15021.
Update #18507.
Change-Id: I1413685928017c32df5654ded73a2643820977ae
Reviewed-on: https://go-review.googlesource.com/36799
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Crawshaw <crawshaw@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
When doing i.(T) for non-empty-interface i and concrete type T,
there's no need to read the type out of the itab. Just compare the
itab to the itab we expect for that interface/type pair.
Also optimize type switches by putting the type hash of the
concrete type in the itab. That way we don't need to load the
type pointer out of the itab.
Update #18492
Change-Id: I49e280a21e5687e771db5b8a56b685291ac168ce
Reviewed-on: https://go-review.googlesource.com/34810
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: David Chase <drchase@google.com>
Based on sample code from iant.
Fixes#18788.
Change-Id: I6bb33ed05af2538fbde42ddcac629280ef7c00a6
Reviewed-on: https://go-review.googlesource.com/36892
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>