When building a shared library, all functions that are declared must actually
be defined.
Change-Id: I1488690cecfb66e62d9fdb3b8d257a4dc31d202a
Reviewed-on: https://go-review.googlesource.com/14187
Reviewed-by: Dave Cheney <dave@cheney.net>
This is generated during fp code when -shared is active.
Change-Id: Ia1092299b9c3b63ff771ca4842158b42c34bd008
Reviewed-on: https://go-review.googlesource.com/14286
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Dave Cheney <dave@cheney.net>
Also simplifies some silliness around making the .tbss section wrt internal
vs external linking. The "make TLS make sense" project has quite a few more
steps to go.
Issue #11270
Change-Id: Ia4fa135cb22d916728ead95bdbc0ebc1ae06f05c
Reviewed-on: https://go-review.googlesource.com/13990
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: David Crawshaw <crawshaw@golang.org>
Run-TryBot: David Crawshaw <crawshaw@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Building for shared libraries requires that all functions that are declared
have an implementation and vice versa so make that so on arm64.
It would be nicer to not require the stub sigreturn (it will never be called)
but that seems a bit awkward.
Change-Id: I3cec81697161b452af81fa35939f748bd1acf7fd
Reviewed-on: https://go-review.googlesource.com/13995
Reviewed-by: David Crawshaw <crawshaw@golang.org>
The hash tests generate occasional failures, quiet them some more.
In particular we can get 1 collision when the expected number is
.001 or so. That shouldn't be a dealbreaker.
Fixes#12311
Change-Id: I784e91b5d21f4f1f166dc51bde2d1cd3a7a3bfea
Reviewed-on: https://go-review.googlesource.com/13902
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Currently the stack barrier stub blindly unwinds the next stack
barrier from the G's stack barrier array without checking that it's
the right stack barrier. If through some bug the stack barrier array
position gets out of sync with where we actually are on the stack,
this could return to the wrong PC, which would lead to difficult to
debug crashes. To address this, this commit adds a check to the amd64
stack barrier stub that it's unwinding the correct stack barrier.
Updates #12238.
Change-Id: If824d95191d07e2512dc5dba0d9978cfd9f54e02
Reviewed-on: https://go-review.googlesource.com/13948
Reviewed-by: Russ Cox <rsc@golang.org>
Currently enabling the debugging mode where stack barriers are
installed at every frame requires recompiling the runtime. However,
this is potentially useful for field debugging and for runtime tests,
so make this mode a GODEBUG.
Updates #12238.
Change-Id: I6fb128f598b19568ae723a612e099c0ed96917f5
Reviewed-on: https://go-review.googlesource.com/13947
Reviewed-by: Russ Cox <rsc@golang.org>
Currently the runtime can install stack barriers in any frame.
However, the frame of cgocallback_gofunc is special: it's the one
function that switches from a regular G stack to the system stack on
return. Hence, the return PC slot in its frame on the G stack is
actually used to save getg().sched.pc (so tracebacks appear to unwind
to the last Go function running on that G), and not as an actual
return PC for cgocallback_gofunc.
Because of this, if we install a stack barrier in cgocallback_gofunc's
return PC slot, when cgocallback_gofunc does return, it will move the
stack barrier stub PC in to getg().sched.pc and switch back to the
system stack. The rest of the runtime doesn't know how to deal with a
stack barrier stub in sched.pc: nothing knows how to match it up with
the G's stack barrier array and, when the runtime removes stack
barriers, it doesn't know to undo the one in sched.pc. Hence, if the C
code later returns back in to Go code, it will attempt to return
through the stack barrier saved in sched.pc, which may no longer have
correct unwinding information.
Fix this by blacklisting cgocallback_gofunc's frame so the runtime
won't install a stack barrier in it's return PC slot.
Fixes#12238.
Change-Id: I46aa2155df2fd050dd50de3434b62987dc4947b8
Reviewed-on: https://go-review.googlesource.com/13944
Reviewed-by: Russ Cox <rsc@golang.org>
Right now we find out implicitly if stack barriers are in place,
or defers. This change makes sure we find out about short
unwinds always.
Change-Id: Ibdde1ba9c79eb792660dcb7aa6f186e4e4d559b3
Reviewed-on: https://go-review.googlesource.com/13966
Reviewed-by: Austin Clements <austin@google.com>
Change-Id: I63bf6d2fdf41b49ff8783052d5d6c53b20e2f050
Reviewed-on: https://go-review.googlesource.com/13760
Reviewed-by: David Crawshaw <crawshaw@golang.org>
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
I noticed that they were unimplemented on arm64 but then that they were
in fact not used at all.
Change-Id: Iee579feda2a5e374fa571bcc8c89e4ef607d50f6
Reviewed-on: https://go-review.googlesource.com/13951
Run-TryBot: Minux Ma <minux@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
s is a uint32 and can never be zero. It's max value is already tested
against sig.wanted, whose size is derived from _NSIG. This also
matches the test in signal_enable.
Fixes#11282
Change-Id: I8eec9c7df8eb8682433616462fe51b264c092475
Reviewed-on: https://go-review.googlesource.com/13940
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
No longer used after previous hashmap change.
Change-Id: I558470f872281e84a78406132df4e391d077b833
Reviewed-on: https://go-review.googlesource.com/13785
Run-TryBot: Michael Hudson-Doyle <michael.hudson@canonical.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Previously t.zero always pointed to runtime.zerovalue. Change the hashmap code
to always return a runtime pointer directly, and change that pointer to point
to a larger buffer if one is needed.
(It might be better to only copy from the pointer returned by the mapaccess
functions when the value type is small enough and have the compiler insert
explicit zeroing for larger value types, but I tried and failed to do this).
This removes all uses of the zero field of the type data; the field itself can
be removed in a separate change.
Fixes#11491
Change-Id: I5b81752ff4067d74a5a281c41e88f151bae0171e
Reviewed-on: https://go-review.googlesource.com/13784
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
A comparison of the form l == r where l is an interface and r is
concrete performs a type assertion on l to convert it to r's type.
However, the compiler fails to zero the temporary where the result of
the type assertion is written, so if the type is a pointer type and a
stack scan occurs while in the type assertion, it may see an invalid
pointer on the stack.
Fix this by zeroing the temporary. This is equivalent to the fix for
type switches from c4092ac.
Fixes#12253.
Change-Id: Iaf205d456b856c056b317b4e888ce892f0c555b9
Reviewed-on: https://go-review.googlesource.com/13872
Reviewed-by: Russ Cox <rsc@golang.org>
nmspinning has a value range of [0, 2^31-1]. Update the comment to
indicate this and fix the comparison so it's not always false.
Fixes#11280
Change-Id: Iedaf0654dcba5e2c800645f26b26a1a781ea1991
Reviewed-on: https://go-review.googlesource.com/13877
Reviewed-by: Minux Ma <minux@golang.org>
gobuf.g is a guintptr, so without hex(), it will be printed as
a decimal, which is not very helpful and inconsistent with how
other pointers are printed.
Change-Id: I7c0432e9709e90a5c3b3e22ce799551a6242d017
Reviewed-on: https://go-review.googlesource.com/13879
Reviewed-by: Ian Lance Taylor <iant@golang.org>
I cannot find where it's being used.
This addresses a duplicate symbol issue encountered in golang/go#9327.
Change-Id: I8efda45a006ad3e19423748210c78bd5831215e0
Reviewed-on: https://go-review.googlesource.com/13615
Reviewed-by: Ian Lance Taylor <iant@golang.org>
CL 9184 changed the runtime and syscall packages to link Solaris binaries
directly instead of using dlopen/dlsym but did not remove the unused (and
now broken) references to dlopen, dlclose, and dlsym.
Fixes#11923
Change-Id: I36345ce5e7b371bd601b7d48af000f4ccacd62c0
Reviewed-on: https://go-review.googlesource.com/13410
Reviewed-by: Aram Hăvărneanu <aram@mgk.ro>
Changes the torture test in #12068 from failing about 1/10 times
to not failing in almost 2,000 runs.
This was only happening in -race mode because functions are
bigger in -race mode, so a few of the helpers for heapBitsBulkBarrier
were not being inlined, and they were not marked nosplit,
so (only in -race mode) the write barrier was being preempted by GC,
causing missed pointer updates.
Filed issue #12069 for diagnosis of any other similar errors.
Fixes#12068.
Change-Id: Ic174d9b050ba278b18b08ab0d85a73c33bd5b175
Reviewed-on: https://go-review.googlesource.com/13364
Reviewed-by: Austin Clements <austin@google.com>
Also, crash early on non-Linux SMP ARM systems when GOARM < 7;
without the proper synchronization, SMP cannot work.
Linux is okay because we call kernel-provided routines for
synchronization and barriers, and the kernel takes care of
providing the right routines for the current system.
On non-Linux systems we are left to fend for ourselves.
It is possible to use different synchronization on GOARM=6,
but it's too late to do that in the Go 1.5 cycle.
We don't believe there are any non-Linux SMP GOARM=6 systems anyway.
Fixes#12067.
Change-Id: I771a556e47893ed540ec2cd33d23c06720157ea3
Reviewed-on: https://go-review.googlesource.com/13363
Reviewed-by: Austin Clements <austin@google.com>
Currently, runtime.Goexit() calls goexit()—the goroutine exit stub—to
terminate the goroutine. This *mostly* works, but can cause a
"leftover stack barriers" panic if the following happens:
1. Goroutine A has a reasonably large stack.
2. The garbage collector scan phase runs and installs stack barriers
in A's stack. The top-most stack barrier happens to fall at address X.
3. Goroutine A unwinds the stack far enough to be a candidate for
stack shrinking, but not past X.
4. Goroutine A calls runtime.Goexit(), which calls goexit(), which
calls goexit1().
5. The garbage collector enters mark termination.
6. Goroutine A is preempted right at the prologue of goexit1() and
performs a stack shrink, which calls gentraceback.
gentraceback stops as soon as it sees goexit on the stack, which is
only two frames up at this point, even though there may really be many
frames above it. More to the point, the stack barrier at X is above
the goexit frame, so gentraceback never sees that stack barrier. At
the end of gentraceback, it checks that it saw all of the stack
barriers and panics because it didn't see the one at X.
The fix is simple: call goexit1, which actually implements the process
of exiting a goroutine, rather than goexit, the exit stub.
To make sure this doesn't happen again in the future, we also add an
argument to the stub prototype of goexit so you really, really have to
want to call it in order to call it. We were able to reliably
reproduce the above sequence with a fair amount of awful code inserted
at the right places in the runtime, but chose to change the goexit
prototype to ensure this wouldn't happen again rather than pollute the
runtime with ugly testing code.
Change-Id: Ifb6fb53087e09a252baddadc36eebf954468f2a8
Reviewed-on: https://go-review.googlesource.com/13323
Reviewed-by: Russ Cox <rsc@golang.org>
This makes TestTraceStressStartStop much less flaky.
Running under stress, it changes the failure rate from
above 1/100 to under 1/50000. That very unlikely
failure happens when an unexpected GoSysExit is
written. Not sure how that happens yet, but it is much
less important.
Fixes#11953.
Change-Id: I034671936334b4f3ab733614ef239aa121d20247
Reviewed-on: https://go-review.googlesource.com/13321
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
88e945f introduced a non-speculative double check of the heap trigger
before actually starting a concurrent GC. This was necessary to fix a
race for heap-triggered GC, but broke sysmon-triggered periodic GC,
since the heap check will of course fail for periodically triggered
GC.
Fix this by telling startGC whether or not this GC was triggered by
heap size or a timer and only doing the heap size double check for GCs
triggered by heap size.
Fixes#12026.
Change-Id: I7c3f6ec364545c36d619f2b4b3bf3b758e3bcbd6
Reviewed-on: https://go-review.googlesource.com/13168
Reviewed-by: Russ Cox <rsc@golang.org>
This is what is causing freebsd/arm to crash mysteriously when using cgo.
The bug was introduced in golang.org/cl/4030, which moved this code out
of rt0_go and into its own function. The ARM ABI says that calls must
be made with the stack pointer at an 8-byte boundary, but only FreeBSD
seems to crash when this is violated.
Fixes#10119.
Change-Id: Ibdbe76b2c7b80943ab66b8abbb38b47acb70b1e5
Reviewed-on: https://go-review.googlesource.com/13161
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Dave Cheney <dave@cheney.net>
When commit 510fd13 enabled assists during the scan phase, it failed
to also update the code in the GC controller that computed the assist
CPU utilization and adjusted the trigger based on it. Fix that code so
it uses the start of the scan phase as the wall-clock time when
assists were enabled rather than the start of the mark phase.
Change-Id: I05013734b4448c3e2c730dc7b0b5ee28c86ed8cf
Reviewed-on: https://go-review.googlesource.com/13048
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
At the start of a GC cycle, the garbage collector computes the assist
ratio based on the total scannable heap size. This was intended to be
conservative; after all, this assumes the entire heap may be reachable
and hence needs to be scanned. But it only assumes that the *current*
entire heap may be reachable. It fails to account for heap allocated
during the GC cycle. If the trigger ratio is very low (near zero), and
most of the heap is reachable when GC starts (which is likely if the
trigger ratio is near zero), then it's possible for the mutator to
create new, reachable heap fast enough that the assists won't keep up
based on the assist ratio computed at the beginning of the cycle. As a
result, the heap can grow beyond the heap goal (by hundreds of megs in
stress tests like in issue #11911).
We already have some vestigial logic for dealing with situations like
this; it just doesn't run often enough. Currently, every 10 ms during
the GC cycle, the GC revises the assist ratio. This was put in before
we switched to a conservative assist ratio (when we really were using
estimates of scannable heap), and it turns out to be exactly what we
need now. However, every 10 ms is far too infrequent for a rapidly
allocating mutator.
This commit reuses this logic, but replaces the 10 ms timer with
revising the assist ratio every time the heap is locked, which
coincides precisely with when the statistics used to compute the
assist ratio are updated.
Fixes#11911.
Change-Id: I377b231ab064946228378fa10422a46d1b50f4c5
Reviewed-on: https://go-review.googlesource.com/13047
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
This was useful in debugging the mutator assist behavior for #11911,
and it fits with the other gcpacertrace output.
Change-Id: I1e25590bb4098223a160de796578bd11086309c7
Reviewed-on: https://go-review.googlesource.com/13046
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
Proportional concurrent sweep is currently based on a ratio of spans
to be swept per bytes of object allocation. However, proportional
sweeping is performed during span allocation, not object allocation,
in order to minimize contention and overhead. Since objects are
allocated from spans after those spans are allocated, the system tends
to operate in debt, which means when the next GC cycle starts, there
is often sweep debt remaining, so GC has to finish the sweep, which
delays the start of the cycle and delays enabling mutator assists.
For example, it's quite likely that many Ps will simultaneously refill
their span caches immediately after a GC cycle (because GC flushes the
span caches), but at this point, there has been very little object
allocation since the end of GC, so very little sweeping is done. The
Ps then allocate objects from these cached spans, which drives up the
bytes of object allocation, but since these allocations are coming
from cached spans, nothing considers whether more sweeping has to
happen. If the sweep ratio is high enough (which can happen if the
next GC trigger is very close to the retained heap size), this can
easily represent a sweep debt of thousands of pages.
Fix this by making proportional sweep proportional to the number of
bytes of spans allocated, rather than the number of bytes of objects
allocated. Prior to allocating a span, both the small object path and
the large object path ensure credit for allocating that span, so the
system operates in the black, rather than in the red.
Combined with the previous commit, this should eliminate all sweeping
from GC start up. On the stress test in issue #11911, this reduces the
time spent sweeping during GC (and delaying start up) by several
orders of magnitude:
mean 99%ile max
pre fix 1 ms 11 ms 144 ms
post fix 270 ns 735 ns 916 ns
Updates #11911.
Change-Id: I89223712883954c9d6ec2a7a51ecb97172097df3
Reviewed-on: https://go-review.googlesource.com/13044
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
Currently it's possible for the next_gc heap size trigger computed for
the next GC cycle to be less than the current allocated heap size.
This means the next cycle will start immediately, which means there's
no time to perform the concurrent sweep between GC cycles. This places
responsibility for finishing the sweep on GC itself, which delays GC
start-up and hence delays mutator assist.
Fix this by ensuring that next_gc is always at least a little higher
than the allocated heap size, so we won't trigger the next cycle
instantly.
Updates #11911.
Change-Id: I74f0b887bf187518d5fedffc7989817cbcf30592
Reviewed-on: https://go-review.googlesource.com/13043
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
Currently there are two sensitive periods during which a mutator can
allocate past the heap goal but mutator assists can't be enabled: 1)
at the beginning of GC between when the heap first passes the heap
trigger and sweep termination and 2) at the end of GC between mark
termination and when the background GC goroutine parks. During these
periods there's no back-pressure or safety net, so a rapidly
allocating mutator can allocate past the heap goal. This is
exacerbated if there are many goroutines because the GC coordinator is
scheduled as any other goroutine, so if it gets preempted during one
of these periods, it may stay preempted for a long period (10s or 100s
of milliseconds).
Normally the mutator does scan work to create back-pressure against
allocation, but there is no scan work during these periods. Hence, as
a fall back, if a mutator would assist but can't yet, simply yield the
CPU. This delays the mutator somewhat, but more importantly gives more
CPU time to the GC coordinator for it to complete the transition.
This is obviously a workaround. Issue #11970 suggests a far better but
far more invasive way to fix this.
Updates #11911. (This very nearly fixes the issue, but about once
every 15 minutes I get a GC cycle where the assists are enabled but
don't do enough work.)
Change-Id: I9768b79e3778abd3e06d306596c3bd77f65bf3f1
Reviewed-on: https://go-review.googlesource.com/13026
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
Currently allocation checks the GC trigger speculatively during
allocation and then triggers the GC without rechecking. As a result,
it's possible for G 1 and G 2 to detect the trigger simultaneously,
both enter startGC, G 1 actually starts GC while G 2 gets preempted
until after the whole GC cycle, then G 2 immediately starts another GC
cycle even though the heap is now well under the trigger.
Fix this by re-checking the GC trigger non-speculatively just before
actually kicking off a new GC cycle.
This contributes to #11911 because when this happens, we definitely
don't finish the background sweep before starting the next GC cycle,
which can significantly delay the start of concurrent scan.
Change-Id: I560ab79ba5684ba435084410a9765d28f5745976
Reviewed-on: https://go-review.googlesource.com/13025
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
On most systems, a pointer is the worst case alignment, so adding
a pointer field at the end of a struct guarantees there will be no
padding added after that field (to satisfy overall struct alignment
due to some more-aligned field also present).
In the runtime, the map implementation needs a quick way to
get to the overflow pointer, which is last in the bucket struct,
so it uses size - sizeof(pointer) as the offset.
NaCl/amd64p32 is the exception, as always.
The worst case alignment is 64 bits but pointers are 32 bits.
There's a long history that is not worth going into, but when
we moved the overflow pointer to the end of the struct,
we didn't get the padding computation right.
The compiler computed the regular struct size and then
on amd64p32 added another 32-bit field.
And the runtime assumed it could step back two 32-bit fields
(one 64-bit register size) to get to the overflow pointer.
But in fact if the struct needed 64-bit alignment, the computation
of the regular struct size would have added a 32-bit pad already,
and then the code unconditionally added a second 32-bit pad.
This placed the overflow pointer three words from the end, not two.
The last two were padding, and since the runtime was consistent
about using the second-to-last word as the overflow pointer,
no harm done in the sense of overwriting useful memory.
But writing the overflow pointer to a non-pointer word of memory
means that the GC can't see the overflow blocks, so it will
collect them prematurely. Then bad things happen.
Correct all this in a few steps:
1. Add an explicit check at the end of the bucket layout in the
compiler that the overflow field is last in the struct, never
followed by padding.
2. When padding is needed on nacl (not always, just when needed),
insert it before the overflow pointer, to preserve the "last in the struct"
property.
3. Let the compiler have the final word on the width of the struct,
by inserting an explicit padding field instead of overwriting the
results of the width computation it does.
4. For the same reason (tell the truth to the compiler), set the type
of the overflow field when we're trying to pretend its not a pointer
(in this case the runtime maintains a list of the overflow blocks
elsewhere).
5. Make the runtime use "last in the struct" as its location algorithm.
This fixes TestTraceStress on nacl/amd64p32.
The 'bad map state' and 'invalid free list' failures no longer occur.
Fixes#11838.
Change-Id: If918887f8f252d988db0a35159944d2b36512f92
Reviewed-on: https://go-review.googlesource.com/12971
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Dangling pointer error. Unlikely to trigger in practice, but still.
Found by running GODEBUG=efence=1 GOGC=1 trace.test.
Change-Id: Ice474dedcf62dd33ab77526287a023ba3b166db9
Reviewed-on: https://go-review.googlesource.com/12991
Reviewed-by: Austin Clements <austin@google.com>
This only triggers on ARMv7+.
If there are important SMP ARMv6 machines we can reconsider.
Makes TestLFStress tests pass and sync/atomic tests not time out
on Apple iPad Mini 3.
Fixes#7977.
Fixes#10189.
Change-Id: Ie424dea3765176a377d39746be9aa8265d11bec4
Reviewed-on: https://go-review.googlesource.com/12950
Reviewed-by: David Crawshaw <crawshaw@golang.org>
Was not allocating space for the frame above sigpanic,
nor was it pushing the LR into the right place.
Because traceback past sigpanic only needs the
LR for faulting leaves, this was not noticed too much.
But it did break the sync/atomic nil deref tests.
Change-Id: Icba53fffa193423aab744c37f21ee893ce2ee3ac
Reviewed-on: https://go-review.googlesource.com/12926
Reviewed-by: David Crawshaw <crawshaw@golang.org>
Instead of pushing the denominator argument on the stack,
the denominator is now passed in m.
This fixes a variety of bugs related to trying to take stack traces
backwards from the middle of the software div/mod routines.
Some of those bugs have been kludged around in the past,
but others have not. Instead of trying to patch up after breaking
the stack, this CL stops breaking the stack.
This is an update of https://golang.org/cl/19810043,
which was rolled back in https://golang.org/cl/20350043.
The problem in the original CL was that there were divisions
at bad times, when m was not available. These were divisions
by constant denominators, either in C code or in assembly.
The Go compiler knows how to generate division by multiplication
for constant denominators, but the C compiler did not.
There is no longer any C code, so that's taken care of.
There was one problematic DIV in runtime.usleep (assembly)
but https://golang.org/cl/12898 took care of that one.
So now this approach is safe.
Reject DIV/MOD in NOSPLIT functions to keep them from
coming back.
Fixes#6681.
Fixes#6699.
Fixes#10486.
Change-Id: I09a13c76ad08ba75b3bd5d46a3eb78e66a84ab38
Reviewed-on: https://go-review.googlesource.com/12899
Reviewed-by: Ian Lance Taylor <iant@golang.org>
We want to adjust the DIV calling convention to use m,
and usleep can be called without an m, so switch to a
multiplication by the reciprocal (and test).
Step toward a fix for #6699 and #10486.
Change-Id: Iccf76a18432d835e48ec64a2fa34a0e4d6d4b955
Reviewed-on: https://go-review.googlesource.com/12898
Reviewed-by: Ian Lance Taylor <iant@golang.org>
For the android/arm builder.
Change-Id: Iad4881689223cd6479870da9541524a8cc458cce
Reviewed-on: https://go-review.googlesource.com/12859
Reviewed-by: Andrew Gerrand <adg@golang.org>
Run-TryBot: David Crawshaw <crawshaw@golang.org>
Fixes arm64 builder crash.
The bug is possible on all architectures; you just have to get lucky
and hit a preemption or a stack growth on entry to assertE2I2.
The test stacks the deck.
Change-Id: I8419da909b06249b1ad15830cbb64e386b6aa5f6
Reviewed-on: https://go-review.googlesource.com/12890
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Rob Pike <r@golang.org>
It says to disable until #7564 is fixed. It was fixed in April 2014.
Change-Id: I9bebfe96802bafdd2d1a0a47591df346d91b000c
Reviewed-on: https://go-review.googlesource.com/12858
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Also make invalidptr control the recently added GC pointer check,
as documented.
Change-Id: Iccfdf49480219d12be8b33b8f03d8312d8ceabed
Reviewed-on: https://go-review.googlesource.com/12857
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Rob Pike <r@golang.org>
The skips added in CL 12579, based on incorrect time stamps,
should be sufficient to identify and exclude all the time-related
flakiness on these systems.
If there is other flakiness, we want to find out.
For #10512.
Change-Id: I5b588ac1585b2e9d1d18143520d2d51686b563e3
Reviewed-on: https://go-review.googlesource.com/12746
Reviewed-by: Austin Clements <austin@google.com>
Nearly all the flaky failures we've seen in trace tests have been
due to the use of time stamps to determine relative event ordering.
This is tricky for many reasons, including:
- different cores might not have exactly synchronized clocks
- VMs are worse than real hardware
- non-x86 chips have different timer resolution than x86 chips
- on fast systems two events can end up with the same time stamp
Stop trying to make time reliable. It's clearly not going to be for Go 1.5.
Instead, record an explicit event sequence number for ordering.
Using our own counter solves all of the above problems.
The trace still contains time stamps, of course. The sequence number
is just used for ordering.
Should alleviate #10554 somewhat. Then tickDiv can be chosen to
be a useful time unit instead of having to be exact for ordering.
Separating ordering and time stamps lets the trace parser diagnose
systems where the time stamp order and actual order do not match
for one reason or another. This CL adds that check to the end of
trace.Parse, after all other sequence order-based checking.
If that error is found, we skip the test instead of failing it.
Putting the check in trace.Parse means that cmd/trace will pick
up the same check, refusing to display a trace where the time stamps
do not match actual ordering.
Using net/http's BenchmarkClientServerParallel4 on various CPU counts,
not tracing vs tracing:
name old time/op new time/op delta
ClientServerParallel4 50.4µs ± 4% 80.2µs ± 4% +59.06% (p=0.000 n=10+10)
ClientServerParallel4-2 33.1µs ± 7% 57.8µs ± 5% +74.53% (p=0.000 n=10+10)
ClientServerParallel4-4 18.5µs ± 4% 32.6µs ± 3% +75.77% (p=0.000 n=10+10)
ClientServerParallel4-6 12.9µs ± 5% 24.4µs ± 2% +89.33% (p=0.000 n=10+10)
ClientServerParallel4-8 11.4µs ± 6% 21.0µs ± 3% +83.40% (p=0.000 n=10+10)
ClientServerParallel4-12 14.4µs ± 4% 23.8µs ± 4% +65.67% (p=0.000 n=10+10)
Fixes#10512.
Change-Id: I173eecf8191e86feefd728a5aad25bf1bc094b12
Reviewed-on: https://go-review.googlesource.com/12579
Reviewed-by: Austin Clements <austin@google.com>
Otherwise the GC may see uninitialized memory there,
which might be old pointers that are retained, or it might
trigger the invalid pointer check.
Fixes#11907.
Change-Id: I67e306384a68468eef45da1a8eb5c9df216a77c0
Reviewed-on: https://go-review.googlesource.com/12852
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
The last time we tried this, linux/arm64 broke.
The series of CLs leading to this one fixes that problem.
Let's try again.
Fixes#9880.
Change-Id: I67bc1d959175ec972d4dcbe4aa6f153790f74251
Reviewed-on: https://go-review.googlesource.com/12849
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
arm64 requires either no stack frame or a frame with a size that is 8 mod 16
(adding the saved LR will make it 16-aligned).
The cmd/internal/obj/arm64 has been silently aligning frames, but it led to
a terrible bug when the compiler and obj disagreed on the frame size,
and it's just generally confusing, so we're going to make misaligned frames
an error instead of something that is silently changed.
This CL prepares by updating assembly files.
Note that the changes in this CL are already being done silently by
cmd/internal/obj/arm64, so there is no semantic effect here,
just a clarity effect.
For #9880.
Change-Id: Ibd6928dc5fdcd896c2bacd0291bf26b364591e28
Reviewed-on: https://go-review.googlesource.com/12845
Reviewed-by: Austin Clements <austin@google.com>
This adds a GCCPUFraction field to MemStats that reports the
cumulative fraction of the program's execution time spent in the
garbage collector. This is equivalent to the utilization percent shown
in the gctrace output and makes this available programmatically.
This does make one small effect on the gctrace output: we now report
the duration of mark termination up to just before the final
start-the-world, rather than up to just after. However, unlike
stop-the-world, I don't believe there's any way that start-the-world
can block, so it should take negligible time.
While there are many statistics one might want to expose via MemStats,
this is one of the few that will undoubtedly remain meaningful
regardless of future changes to the memory system.
The diff for this change is larger than the actual change. Mostly it
lifts the code for computing the GC CPU utilization out of the
debug.gctrace path.
Updates #10323.
Change-Id: I0f7dc3fdcafe95e8d1233ceb79de606b48acd989
Reviewed-on: https://go-review.googlesource.com/12844
Reviewed-by: Russ Cox <rsc@golang.org>
Currently we only capture GC phase transition times if
debug.gctrace>0, but we're about to compute GC CPU utilization
regardless of whether debug.gctrace is set, so we need these
regardless of debug.gctrace.
Change-Id: If3acf16505a43d416e9a99753206f03287180660
Reviewed-on: https://go-review.googlesource.com/12843
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
The following sequence of events can lead to the runtime attempting an
out-of-bounds access on a stack barrier slice:
1. A SIGPROF comes in on a thread while the G on that thread is in
_Gsyscall. The sigprof handler calls gentraceback, which saves a
local copy of the G's stkbar slice. Currently the G has no stack
barriers, so this slice is empty.
2. On another thread, the GC concurrently scans the stack of the
goroutine being profiled (it considers it stopped because it's in
_Gsyscall) and installs stack barriers.
3. Back on the sigprof thread, gentraceback comes across a stack
barrier in the stack and attempts to look it up in its (zero
length) copy of G's old stkbar slice, which causes an out-of-bounds
access.
This commit fixes this by adding a simple cas spin to synchronize the
SIGPROF handler with stack barrier insertion.
In general I would prefer that this synchronization be done through
the G status, since that's how stack scans are otherwise synchronized,
but adding a new lock is a much smaller change and G statuses are full
of subtlety.
Fixes#11863.
Change-Id: Ie89614a6238bb9c6a5b1190499b0b48ec759eaf7
Reviewed-on: https://go-review.googlesource.com/12748
Reviewed-by: Russ Cox <rsc@golang.org>
The scheduler, work buffer's dispose, and write barriers
can conspire to hide the a pointer from the GC's concurent
mark phase. If this pointer is the only path to a large
amount of marking the STW mark termination phase may take
a lot of time.
Consider the following:
1) dispose places a work buffer on the partial queue
2) the GC is busy so it does not immediately remove and
process the work buffer
3) the scheduler runs a mutator whose write barrier dequeues the
work buffer from the partial queue so the GC won't see it
This repeats until the GC reaches the mark termination
phase where the GC finally discovers the pointer along
with a lot of work to do.
This CL fixes the problem by having the mutator
dispose of the buffer to the full queue instead of
the partial queue. Since the write buffer never asks for full
buffers the conspiracy described above is not possible.
Updates #11694.
Change-Id: I2ce832f9657a7570f800e8ce4459cd9e304ef43b
Reviewed-on: https://go-review.googlesource.com/12840
Reviewed-by: Austin Clements <austin@google.com>
Russ Cox fixed this issue for other systems
in CL 12026, but the Plan 9 part was forgotten.
Fixes#11656.
Change-Id: I91c033687987ba43d13ad8f42e3fe4c7a78e6075
Reviewed-on: https://go-review.googlesource.com/12762
Reviewed-by: Russ Cox <rsc@golang.org>
The function is already defined between syscall_solaris.go and
syscall2_solaris.go.
Change-Id: I034baf7c8531566bebfdbc5a4061352cbcc31449
Reviewed-on: https://go-review.googlesource.com/12773
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
A further attempt to fix raiseproc on Solaris.
Change-Id: I8d8000d6ccd0cd9f029ebe1f211b76ecee230cd0
Reviewed-on: https://go-review.googlesource.com/12771
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
I forgot that the libc raise function only sends the signal to the
current thread. We need to actually use kill and getpid here, as we
do on other systems.
Change-Id: Iac34af822c93468bf68cab8879db3ee20891caaf
Reviewed-on: https://go-review.googlesource.com/12704
Reviewed-by: Russ Cox <rsc@golang.org>
Seems like the simplest solution for 1.5. All the parts of the test
suite I can run on my current device (for which my exception handler
fix no longer works, apparently) pass without this code. I'll move it
into x/mobile/app.
Fixes#11884
Change-Id: I2da40c8c7b48a4c6970c4d709dd7c148a22e8727
Reviewed-on: https://go-review.googlesource.com/12721
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Currently we enter mark 2 by first flushing all existing gcWork caches
and then setting gcBlackenPromptly, which disables further gcWork
caching. However, if a worker or assist pulls a work buffer in to its
gcWork cache after that cache has been flushed but before caching is
disabled, that work may remain in that cache until mark termination.
If that work represents a heap bottleneck (e.g., a single pointer that
is the only way to reach a large amount of the heap), this can force
mark termination to do a large amount of work, resulting in a long
STW.
Fix this by reversing the order of these steps: first disable caching,
then flush all existing caches.
Rick Hudson <rlh> did the hard work of tracking this down. This CL
combined with CL 12672 and CL 12646 distills the critical parts of his
fix from CL 12539.
Fixes#11694.
Change-Id: Ib10d0a21e3f6170a80727d0286f9990df049fed2
Reviewed-on: https://go-review.googlesource.com/12688
Reviewed-by: Rick Hudson <rlh@golang.org>
Currently the GC coordinator enables GC assists at the same time it
enables background mark workers, after the concurrent scan phase is
done. However, this means a rapidly allocating mutator has the entire
scan phase during which to allocate beyond the heap trigger and
potentially beyond the heap goal with no back-pressure from assists.
This prevents the feedback system that's supposed to keep the heap
size under the heap goal from doing its job.
Fix this by enabling mutator assists during the scan phase. This is
safe because the write barrier is already enabled and globally
acknowledged at this point.
There's still a very small window between when the heap size reaches
the heap trigger and when the GC coordinator is able to stop the world
during which the mutator can allocate unabated. This allows *very*
rapidly allocator mutators like TestTraceStress to still occasionally
exceed the heap goal by a small amount (~20 MB at most for
TestTraceStress). However, this seems like a corner case.
Fixes#11677.
Change-Id: I0f80d949ec82341cd31ca1604a626efb7295a819
Reviewed-on: https://go-review.googlesource.com/12674
Reviewed-by: Russ Cox <rsc@golang.org>
Currently we hand-code a set of phases when draining is allowed.
However, this set of phases is conservative. The critical invariant is
simply that the write barrier must be enabled if we're draining.
Shortly we're going to enable mutator assists during the scan phase,
which means we may drain during the scan phase. In preparation, this
commit generalizes these assertions to check the fundamental condition
that the write barrier is enabled, rather than checking that we're in
any particular phase.
Change-Id: I0e1bec1ca823d4a697a0831ec4c50f5dd3f2a893
Reviewed-on: https://go-review.googlesource.com/12673
Reviewed-by: Russ Cox <rsc@golang.org>
Currently we clear both the mark 1 and mark 2 signals at the beginning
of concurrent mark. If either if these is clear, it acts as a signal
to the scheduler that it should start background workers. However,
this means that in the interim *between* mark 1 and mark 2, the
scheduler basically loops starting up new workers only to have them
return with nothing to do. In addition to harming performance and
delaying mutator work, this approach has a race where workers started
for mark 1 can mistakenly signal mark 2, causing it to complete
prematurely. This approach also interferes with starting assists
earlier to fix#11677.
Fix this by initially setting both mark 1 and mark 2 to "signaled".
The scheduler will not start background mark workers, though assists
can still run. When we're ready to enter mark 1, we clear the mark 1
signal and wait for it. Then, when we're ready to enter mark 2, we
clear the mark 2 signal and wait for it.
This structure also lets us deal cleanly with the situation where all
work is drained *prior* to the mark 2 wait, meaning that there may be
no workers to signal completion. Currently we deal with this using a
racy (and possibly incorrect) check for work in the coordinator itself
to skip the mark 2 wait if there's no work. This change makes the
coordinator unconditionally wait for mark completion and makes the
scheduler itself signal completion by slightly extending the logic it
already has to determine that there's no work and hence no use in
starting a new worker.
This is a prerequisite to fixing the remaining component of #11677,
which will require enabling assists during the scan phase. However, we
don't want to enable background workers until the mark phase because
they will compete with the scan. This change lets us use bgMark1 and
bgMark2 to indicate when it's okay to start background workers
independent of assists.
This is also a prerequisite to fixing #11694. It significantly reduces
the occurrence of long mark termination pauses in #11694 (from 64 out
of 1000 to 2 out of 1000 in one experiment).
Coincidentally, this also reduces the final heap size (and hence run
time) of TestTraceStress from ~100 MB and ~1.9 seconds to ~14 MB and
~0.4 seconds because it significantly shortens concurrent mark
duration.
Rick Hudson <rlh> did the hard work of tracking this down.
Change-Id: I12ea9ee2db9a0ae9d3a90dde4944a75fcf408f4c
Reviewed-on: https://go-review.googlesource.com/12672
Reviewed-by: Russ Cox <rsc@golang.org>
Currently, there are three ways to satisfy a GC assist: 1) the mutator
steals credit from background GC, 2) the mutator actually does GC
work, and 3) there is no more work available. 3 was never really
intended as a way to satisfy an assist, and it causes problems: there
are periods when it's expected that the GC won't have any work, such
as when transitioning from mark 1 to mark 2 and from mark 2 to mark
termination. During these periods, there's no back-pressure on rapidly
allocating mutators, which lets them race ahead of the heap goal.
For example, test/init1.go and the runtime/trace test both have small
reachable heaps and contain loops that rapidly allocate large garbage
byte slices. This bug lets these tests exceed the heap goal by several
orders of magnitude.
Fix this by forcing the assist (and hence the allocation) to block
until it can satisfy its debt via either 1 or 2, or the GC cycle
terminates.
This fixes one the causes of #11677. It's still possible to overshoot
the GC heap goal, but with this change the overshoot is almost exactly
by the amount of allocation that happens during the concurrent scan
phase, between when the heap passes the GC trigger and when the GC
enables assists.
Change-Id: I5ef4edcb0d2e13a1e432e66e8245f2bd9f8995be
Reviewed-on: https://go-review.googlesource.com/12671
Reviewed-by: Russ Cox <rsc@golang.org>
Currently it's possible for the GC assist to signal completion of the
mark phase, which puts the GC coordinator goroutine on the current P's
run queue, and then return to mutator code that delays until the next
forced preemption before actually yielding control to the GC
coordinator, dragging out completion of the mark phase. This delay can
be further exacerbated if the mutator makes other goroutines runnable
before yielding control, since this will push the GC coordinator on
the back of the P's run queue.
To fix this, this adds a Gosched to the assist if it completed the
mark phase. This immediately and directly yields control to the GC
coordinator. This already happens implicitly in the background mark
workers because they park immediately after completing the mark.
This is one of the reasons completion of the mark phase is being
dragged out and allowing the mutator to allocate without assisting,
leading to the large heap goal overshoot in issue #11677. This is also
a prerequisite to making the assist block when it can't pay off its
debt.
Change-Id: I586adfbecb3ca042a37966752c1dc757f5c7fc78
Reviewed-on: https://go-review.googlesource.com/12670
Reviewed-by: Russ Cox <rsc@golang.org>
Currently it's possible to perform GC work on a system stack or when
locks are held if there's an allocation that triggers an assist. This
is generally a bad idea because of the fragility of these contexts,
and it's incompatible with two changes we're about to make: one is to
yield after signaling mark completion (which we can't do from a
non-preemptible context) and the other is to make assists block if
there's no other way for them to pay off the assist debt.
This commit simply skips the assist if it's called from a
non-preemptible context. The allocation will still count toward the
assist debt, so it will be paid off by a later assist. There should be
little allocation from non-preemptible contexts, so this shouldn't
harm the overall assist mechanism.
Change-Id: I7bf0e6c73e659fe6b52f27437abf39d76b245c79
Reviewed-on: https://go-review.googlesource.com/12649
Reviewed-by: Russ Cox <rsc@golang.org>
When notetsleep_internal is called from notetsleepg, notetsleepg has
just given up the P, so write barriers are not allowed in
notetsleep_internal.
Change-Id: I1b214fa388b1ea05b8ce2dcfe1c0074c0a3c8870
Reviewed-on: https://go-review.googlesource.com/12647
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
Currently fractional and idle mark workers dispose of their gcWork
cache during mark 2 after incrementing work.nwait and after checking
whether there are any workers or any work available. This creates a
window for two races:
1) If the only remaining work is in this worker's gcWork cache, it
will see that there are no more workers and no more work on the
global lists (since it has not yet flushed its own cache) and
prematurely signal mark 2 completion.
2) After this worker has incremented work.nwait but before it has
flushed its cache, another worker may observe that there are no
more workers and no more work and prematurely signal mark 2
completion.
We can fix both of these by simply moving the cache flush above the
increment of nwait and the test of the completion condition.
This is probably contributing to #11694, though this alone is not
enough to fix it.
Change-Id: Idcf9656e5c460c5ea0d23c19c6c51e951f7716c3
Reviewed-on: https://go-review.googlesource.com/12646
Reviewed-by: Russ Cox <rsc@golang.org>
GC assists are supposed to steal at most the amount of background GC
credit available so that background GC credit doesn't go negative.
However, they are instead stealing the *total* amount of their debt
but only claiming up to the amount of credit that was available. This
results in draining the background GC credit pool too quickly, which
results in unnecessary assist work.
The fix is trivial: steal the amount of work we meant to steal (which
is already computed).
Change-Id: I837fe60ed515ba91c6baf363248069734a7895ef
Reviewed-on: https://go-review.googlesource.com/12643
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
Currently the gctrace output reports the trigger heap size, rather
than the actual heap size at the beginning of GC. Often these are the
same, or at least very close. However, it's possible for the heap to
already have exceeded this trigger when we first check the trigger and
start GC; in this case, this output is very misleading. We've
encountered this confusion a few times when debugging and this
behavior is difficult to document succinctly.
Change the gctrace output to report the actual heap size when GC
starts, rather than the trigger.
Change-Id: I246b3ccae4c4c7ea44c012e70d24a46878d7601f
Reviewed-on: https://go-review.googlesource.com/12452
Reviewed-by: Russ Cox <rsc@golang.org>
Whenever someone pastes gctrace output into GitHub, it helpfully turns
the GC cycle number into a link to some unrelated issue. Prevent this
by removing the pound before the cycle number. The fact that this is a
cycle number is probably more obvious at a glance than most of the
other numbers.
Change-Id: Ifa5fc7fe6c715eac50e639f25bc36c81a132ffea
Reviewed-on: https://go-review.googlesource.com/12413
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
This extends https://golang.org/cl/2811, which only applied to Darwin
and GNU/Linux, to all Unix systems.
Fixes#9591.
Change-Id: Iec3fb438564ba2924b15b447c0480f87c0bfd009
Reviewed-on: https://go-review.googlesource.com/12661
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Russ Cox <rsc@golang.org>
It's not clear this really belongs anywhere at all,
but this is a better place for it than package os.
This way package os can avoid importing "C".
Fixes#10455.
Change-Id: Ibe321a93bf26f478951c3a067d75e22f3d967eb7
Reviewed-on: https://go-review.googlesource.com/12574
Reviewed-by: David Crawshaw <crawshaw@golang.org>
Reviewed-by: Dave Cheney <dave@cheney.net>
The system stack is only around 8kb on ARM so one can't put an 8kb buffer on
the stack. More than 1024 ARM cores seems sufficiently unlikely for the
foreseeable future.
Fixes#11853
Change-Id: I7cb27c1250a6153f86e269c172054e9dfc218c72
Reviewed-on: https://go-review.googlesource.com/12622
Reviewed-by: Austin Clements <austin@google.com>
Issue 11214 reports problems with older versions of gdb. It does work
with gdb 7.9 on my Ubuntu Trusty system, so take that as the minimum
required version.
Fixes#11214.
Change-Id: I61b732895506575be7af595f81fc1bcf696f58c2
Reviewed-on: https://go-review.googlesource.com/12626
Reviewed-by: Austin Clements <austin@google.com>
Foreign code can be arbitrarily aligned,
so the function before it can have
arbitrarily much padding.
We can't call pcvalue on values in the padding.
Fixes#11653.
Change-Id: I7d57f813ae5a2409d1520fcc909af3eeef2da131
Reviewed-on: https://go-review.googlesource.com/12550
Reviewed-by: Rob Pike <r@golang.org>
We use 127.0.0.1 instead of localhost in Go networking tests.
The reporter of #11774 has localhost defined to be 120.192.83.162,
for reasons unknown.
Also, if TestTraceSymbolize calls Fatalf (for example because Listen
fails) then we need to stop the trace for future tests to work.
See failure log in #11774.
Fixes#11774.
Change-Id: Iceddb03a72d31e967acd2d559ecb78051f9c14b7
Reviewed-on: https://go-review.googlesource.com/12521
Reviewed-by: Rob Pike <r@golang.org>
Some routines run without and m or g and cannot invoke the
race detector runtime. They must be opaque to the runtime.
That used to be true because they were written in C.
Now that they are written in Go, disable the race detector
annotations for those functions explicitly.
Add test.
Fixes#10874.
Change-Id: Ia8cc28d51e7051528f9f9594b75634e6bb66a785
Reviewed-on: https://go-review.googlesource.com/12534
Reviewed-by: Ian Lance Taylor <iant@golang.org>
In the past badsignal would crash the program. In
https://golang.org/cl/10757044 badsignal was changed to call sigsend,
to fix issue #3250. The effect of this was that when a non-Go thread
received a signal, and os/signal.Notify was not being used to check
for occurrences of the signal, the signal was ignored.
This changes the code so that if os/signal.Notify is not being used,
then the signal handler is reset to what it was, and the signal is
raised again. This lets non-Go threads handle the signal as they
wish. In particular, it means that a segmentation violation in a
non-Go thread will ordinarily crash the process, as it should.
Fixes#10139.
Update #11794.
Change-Id: I2109444aaada9d963ad03b1d071ec667760515e5
Reviewed-on: https://go-review.googlesource.com/12503
Reviewed-by: Russ Cox <rsc@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
It's a bad test and it's worst on uniprocessors.
Fixes#11143.
Change-Id: I0164231ada294788d7eec251a2fc33e02a26c13b
Reviewed-on: https://go-review.googlesource.com/12522
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
ae1ea2a moved trace-related functions from runtime/pprof to
runtime/trace, but missed a doc comment and a code comment. Update
these to reflect the move.
Change-Id: I6e1e8861e5ede465c08a2e3f80b976145a8b32d8
Reviewed-on: https://go-review.googlesource.com/12525
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Move tracing functions from runtime/pprof to the new runtime/trace package.
Fixes#9710
Change-Id: I718bcb2ae3e5959d9f72cab5e6708289e5c8ebd5
Reviewed-on: https://go-review.googlesource.com/12511
Reviewed-by: Russ Cox <rsc@golang.org>
This is mostly Russ's https://golang.org/cl/12145 but with some extra fixes to
account for the fact that function declarations without implementations now
break shared libraries, and including my test case.
Fixes#11480.
Change-Id: Iabdc2934a0378e5025e4e7affadb535eaef2c8f1
Reviewed-on: https://go-review.googlesource.com/12340
Reviewed-by: Ian Lance Taylor <iant@golang.org>
The runtime.GC documentation was rewritten in df2809f to make it clear
that it blocks until GC is complete, but the re-rewrite in ed9a4c9 and
e28a679 lost this property when clarifying that it may also block the
entire program and not just the caller.
Try to arrive at wording that conveys both of these properties.
Change-Id: I1e255322aa28a21a548556ecf2a44d8d8ac524ef
Reviewed-on: https://go-review.googlesource.com/12392
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Rob Pike <r@golang.org>
The findmoduledatap function will not return nil in ordinary use, but
check for nil to try to avoid crashing when we are already crashing.
Update #11783.
Change-Id: If7b1adb51efab13b4c1a37b6f3c9ad22641a0b56
Reviewed-on: https://go-review.googlesource.com/12391
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
We shouldn't guarantee this behavior, but suggest it's possible.
Change-Id: I4c2afb48b99be4d91537306d3337171a13c9990a
Reviewed-on: https://go-review.googlesource.com/12346
Reviewed-by: David Crawshaw <crawshaw@golang.org>
No code changes. Just make it clear that runtime.GC is not concurrent.
Change-Id: I00a99ebd26402817c665c9a128978cef19f037be
Reviewed-on: https://go-review.googlesource.com/12345
Reviewed-by: Dave Cheney <dave@cheney.net>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
An out-of-date comment snuck in to cc8f544. Remove it.
Change-Id: I5bc7c17e737d1cabe57b88de06d7579c60ca28ff
Reviewed-on: https://go-review.googlesource.com/12328
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
This fixes a race between 1) sweeping and freeing an unmarked large
span and 2) reusing that span and allocating from it. This race arises
because mSpan_Sweep returns spans for large objects to the heap
*before* heapBitsSweepSpan clears the mark bit on the object in the
span.
Specifically, the following sequence of events can lead to an
incorrectly zeroed bitmap byte, which causes the garbage collector to
not trace any pointers in that object (the pointer bits for the first
four words are cleared, and the scan bits are also cleared, so it
looks like a no-scan object).
1) P0 calls mSpan_Sweep on a large span S0 with an unmarked object on it.
2) mSpan_Sweep calls heapBitsSweepSpan, which invokes the callback for
the one (unmarked) object on the span.
3) The callback calls mHeap_Free, which makes span S0 available for
allocation, but this is too early.
4) P1 grabs this S0 from the heap to use for allocation.
5) P1 allocates an object on this span and writes that object's type
bits to the bitmap.
6) P0 returns from the callback to heapBitsSweepSpan.
heapBitsSweepSpan clears the byte containing the mark, even though
this span is now owned by P1 and this byte contains important
bitmap information.
This fixes this problem by simply delaying the mHeap_Free until after
the heapBitsSweepSpan. I think the overall logic of mSpan_Sweep could
be simplified now, but this seems like the minimal change.
Fixes#11617.
Change-Id: I6b1382c7e7cc35f81984467c0772fe9848b7522a
Reviewed-on: https://go-review.googlesource.com/12320
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Rob Pike <r@golang.org>
Running a safe-point function on syscall entry uses systemstack() and
hence clobbers g.sched.pc and g.sched.sp. Fix this by re-saving them
after the systemstack, just like in the other uses of systemstack in
reentersyscall.
Change-Id: I47868a53eba24d81919fda56ef6bbcf72f1f922e
Reviewed-on: https://go-review.googlesource.com/12125
Reviewed-by: Russ Cox <rsc@golang.org>
Currently, we run a P's safe-point function immediately after entering
_Psyscall state. This is unsafe, since as soon as we put the P in
_Psyscall, we no longer control the P and another M may claim it.
We'll still run the safe-point function only once (because doing so
races on an atomic), but the P may no longer be at a safe-point when
we do so.
In particular, this means that the use of forEachP to dispose all P's
gcw caches is unsafe. A P may enter a syscall, run the safe-point
function, and dispose the P's gcw cache concurrently with another M
claiming the P and attempting to use its gcw cache. If this happens,
we may empty the gcw's workbuf after putting it on
work.{full,partial}, or add pointers to it after putting it in
work.empty. This will cause an assertion failure when we later pop the
workbuf from the list and its object count is inconsistent with the
list we got it from.
Fix this by running the safe-point function just before putting the P
in _Psyscall.
Related to #11640. This probably fixes this issue, but while I'm able
to show that we can enter a bad safe-point state as a result of this,
I can't reproduce that specific failure.
Change-Id: I6989c8ca7ef2a4a941ae1931e9a0748cbbb59434
Reviewed-on: https://go-review.googlesource.com/12124
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Russ Cox <rsc@golang.org>
These memstats are currently being computed by gcMark, which was
appropriate in Go 1.4, but gcMark is now just one part of a bigger
picture. In particular, it can't account for the sweep termination
pause time, it can't account for all of the mark termination pause
time, and the reported "pause end" and "last GC" times will be
slightly earlier than they really are.
Lift computing of these statistics into func gc, which has the
appropriate visibility into the process to compute them correctly.
Fixes one of the issues in #10323. This does not add new statistics
appropriate to the concurrent collector; it simply fixes existing
statistics that are being misreported.
Change-Id: I670cb16594a8641f6b27acf4472db15b6e8e086e
Reviewed-on: https://go-review.googlesource.com/11794
Reviewed-by: Russ Cox <rsc@golang.org>
Currently we report MemStats.PauseEnd in nanoseconds, but with no
particular 0 time. On Linux, the 0 time is when the host started. On
Darwin, it's the UNIX epoch. This is also inconsistent with the other
absolute time in MemStats, LastGC, which is always reported in
nanoseconds since 1970.
Fix PauseEnd so it's always reported in nanoseconds since 1970, like
LastGC.
Fixes one of the issues raised in #10323.
Change-Id: Ie2fe3169d45113992363a03b764f4e6c47e5c6a8
Reviewed-on: https://go-review.googlesource.com/11801
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Russ Cox <rsc@golang.org>
Missed select case when adding the barrier last time.
All the more reason to refactor this code in Go 1.6.
Fixes#11643.
Change-Id: Ib0d19d6e0939296c0a3e06dda5e9b76f813bbc7e
Reviewed-on: https://go-review.googlesource.com/12086
Reviewed-by: Austin Clements <austin@google.com>
The one in misc/makerelease/makerelease.go is particularly bad and
probably warrants rotating our keys.
I didn't update old weekly notes, and reverted some changes involving
test code for now, since we're late in the Go 1.5 freeze. Otherwise,
the rest are all auto-generated changes, and all manually reviewed.
Change-Id: Ia2753576ab5d64826a167d259f48a2f50508792d
Reviewed-on: https://go-review.googlesource.com/12048
Reviewed-by: Rob Pike <r@golang.org>
The default behaviour for fatal errors and runtime panics is to dump
the goroutine stack traces and exit with code 2. However, when the process is
owned by foreign code, it is suprising and inappropriate to suddenly exit
the whole process, even on fatal errors. Instead, re-use the crash behaviour
from GOTRACEBACK=crash and abort.
The motivating use case is issue #11382, where an Android crash reporter
is confused by an exiting process, but I believe the aborting behaviour
is appropriate for all cases where Go does not own the process.
The change is simple and contained and will enable reliable crash reporting
for Android apps in Go 1.5, but I'll leave it to others to judge whether it
is too late for Go 1.5.
Fixes#11382
Change-Id: I477328e1092f483591c99da1fbb8bc4411911785
Reviewed-on: https://go-review.googlesource.com/12032
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Recent change (CL 10370) unexpectedly broke TestRaiseException on
Windows XP amd64. I still do not know why. But reverting old
CL 8165 fixes the problem.
This effectively makes Windows XP amd64 use AddVectoredContinueHandler
instead of SetUnhandledExceptionFilter for exception handling. That is
what we do for all recent Windows versions too.
Fixes#11481
Change-Id: If2e8037711f05bf97e3c69f5a8d86af67c58f6fc
Reviewed-on: https://go-review.googlesource.com/11888
Run-TryBot: Alex Brainman <alex.brainman@gmail.com>
Reviewed-by: Daniel Theophanes <kardianos@gmail.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
When GOROOT_FINAL is set when running all.bash, the tests are run
before the files are copied to GOROOT_FINAL. The tests are run with
GOROOT set, so most work fine. This fixes two cases that do not.
In cmd/go/go_test.go we were explicitly removing GOROOT from the
environment, causing tests that did not themselves explicitly set
GOROOT to fail. There was no need to explicitly remove GOROOT, so
don't do it. If people choose to run "go test cmd/go" with a bad
GOROOT, that is their own lookout.
In the runtime GDB test, the linker has told gdb to find the support
script in GOROOT_FINAL, which will fail. Check for that case, and
skip the test when we see it.
Fixes#11652.
Change-Id: I4d3a32311e3973c30fd8a79551aaeab6789d0451
Reviewed-on: https://go-review.googlesource.com/12021
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
sysmon triggers a GC if there has been no GC for two minutes.
Currently, this is a STW GC. There is no reason for this to be STW, so
make it concurrent.
Fixes#10261.
Change-Id: I92f3ac37272d5c2a31480ff1fa897ebad08775a9
Reviewed-on: https://go-review.googlesource.com/11955
Reviewed-by: Rob Pike <r@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
The expansion of structure, array, slice, and map literals
does not use the right line number in its introduced assignments
to temporaries, which leads to incorrect line number attribution
for expressions in those literals.
Inlining also incorrectly replaced the line numbers of args to
inlined functions.
This was revealed in CL 9721 because a now-avoided temporary
assignment introduced the correct line number.
I.e. before CL 9721
"tmp_wrongline := expr"
was transformed to
"tmp_rightline := expr; tmp_wrongline := tmp_rightline"
Also includes a repair to CL 10334 involving line numbers
where a spurious -1 remained (should have been 0, now is 0).
Fixes#11400.
Change-Id: I3a4687efe463977fa1e2c996606f4d91aaf22722
Reviewed-on: https://go-review.googlesource.com/11730
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Sameer Ajmani <sameer@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
Basic randomization of goroutine scheduling for -race mode.
It is probably possible to do much better (there's a paper linked
in the issue that I haven't read, for example), but this suffices
to introduce at least some unpredictability into the scheduling order.
The goal here is to have _something_ for Go 1.5, so that we don't
start hitting more of these scheduling order-dependent bugs
if we change the scheduler order again in Go 1.6.
For #11372.
Change-Id: Idf1154123fbd5b7a1ee4d339e93f97635cc2bacb
Reviewed-on: https://go-review.googlesource.com/11795
Reviewed-by: Austin Clements <austin@google.com>
The old code was recording the current table output offset,
so the table from the next function would be used instead of
the runtime realizing that there was no table at all.
Add debug constant in runtime to check this for every function
at startup. It's too expensive to do that by default, but we can
do the last five functions. The end of the table is usually where
the C symbols end up, so that's where the problems typically are.
Fixes#10747.
Fixes#11396.
Change-Id: I13592e78017969fc22979fa902e19e1b151d41b1
Reviewed-on: https://go-review.googlesource.com/11657
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
Currently we fail to reset the live heap accounting state before the
checkmark mark and before the gctrace=2 extra mark. As a result, if
either are enabled, at the end of GC it thinks there are 0 bytes of
live heap, which causes the GC controller to initiate a new GC
immediately, regardless of the true heap size.
Fix this by factoring this state reset into a function and calling it
before all three possible marks.
This function should be merged with gcResetGState, but doing so
requires some additional cleanup, so it will wait for after the
freeze. Filed #11427 for this cleanup.
Fixes#10492.
Change-Id: Ibe46348916fc8368fac6f086e142815c970a6f4d
Reviewed-on: https://go-review.googlesource.com/11561
Reviewed-by: Russ Cox <rsc@golang.org>
Memory for stacks is manually managed by the runtime and, currently
(with one exception) we free stack spans immediately when the last
stack on a span is freed. However, the garbage collector assumes that
spans can never transition from non-free to free during scan or mark.
This disagreement makes it possible for the garbage collector to mark
uninitialized objects and is blocking us from re-enabling the bad
pointer test in the garbage collector (issue #9880).
For example, the following sequence will result in marking an
uninitialized object:
1. scanobject loads a pointer slot out of the object it's scanning.
This happens to be one of the special pointers from the heap into a
stack. Call the pointer p and suppose it points into X's stack.
2. X, running on another thread, grows its stack and frees its old
stack.
3. The old stack happens to be large or was the last stack in its
span, so X frees this span, setting it to state _MSpanFree.
4. The span gets reused as a heap span.
5. scanobject calls heapBitsForObject, which loads the span containing
p, which is now in state _MSpanInUse, but doesn't necessarily have
an object at p. The not-object at p gets marked, and at this point
all sorts of things can go wrong.
We already have a partial solution to this. When shrinking a stack, we
put the old stack on a queue to be freed at the end of garbage
collection. This was done to address exactly this problem, but wasn't
a complete solution.
This commit generalizes this solution to both shrinking and growing
stacks. For stacks that fit in the stack pool, we simply don't free
the span, even if its reference count reaches zero. It's fine to reuse
the span for other stacks, and this enables that. At the end of GC, we
sweep for cached stack spans with a zero reference count and free
them. For larger stacks, we simply queue the stack span to be freed at
the end of GC. Ideally, we would reuse these large stack spans the way
we can small stack spans, but that's a more invasive change that will
have to wait until after the freeze.
Fixes#11267.
Change-Id: Ib7f2c5da4845cc0268e8dc098b08465116972a71
Reviewed-on: https://go-review.googlesource.com/11502
Reviewed-by: Russ Cox <rsc@golang.org>
We don't use this state. _GCoff means we're sweeping in the
background. This makes it clear in the next commit that _GCoff and
only _GCoff means sweeping.
Change-Id: I416324a829ba0be3794a6cf3cf1655114cb6e47c
Reviewed-on: https://go-review.googlesource.com/11501
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
Currently the runtime fails to clear a G's stack barriers in gfput if
the G's stack allocation is _FixedStack bytes. This causes the runtime
to panic if the following sequence of events happens:
1) The runtime installs stack barriers on a G.
2) The G exits by calling runtime.Goexit. Since this does not
necessarily return through the stack barriers installed on the G,
there may still be untriggered stack barriers left on the G's stack
in recorded in g.stkbar.
3) The runtime calls gfput to add the exiting G to the free pool. If
the G's stack allocation is _FixedStack bytes, we fail to clear
g.stkbar.
4) A new G starts and allocates the G that was just added to the free
pool.
5) The new G begins to execute and overwrites the stack slots that had
stack barriers in them.
6) The garbage collector enters mark termination, attempts to remove
stack barriers from the new G, and finds that they've been
overwritten.
Fix this by clearing the stack barriers in gfput in the case where it
reuses the stack.
Fixes#11256.
Change-Id: I377c44258900e6bcc2d4b3451845814a8eeb2bcf
Reviewed-on: https://go-review.googlesource.com/11461
Reviewed-by: Alex Brainman <alex.brainman@gmail.com>
Reviewed-by: Russ Cox <rsc@golang.org>
Stack can move during callback, so libcall struct cannot be stored on stack.
asmstdcall updates return values and errno in libcall struct parameter, but
these could be at different location when callback returns.
Store these in m, so they are not affected by GC.
Fixes#10406
Change-Id: Id01c9d2b4b44530494e6d9e9e1c875261ce477cd
Reviewed-on: https://go-review.googlesource.com/10370
Reviewed-by: Russ Cox <rsc@golang.org>
Currently, to write out the bitmap of a slice of a type with a GCprog,
we construct a new GCprog that executes the underlying type's GCprog
to write out the bitmap once and then repeats those bits n more times.
This results in n+1 repetitions of the bitmap, which is one more
repetition than it should be. This corrupts the bitmap of the heap
following the slice and may write past the mapped bitmap memory and
segfault.
Fix this by repeating the bitmap only n-1 more times.
Fixes#11430.
Change-Id: Ic24854363bffc5a755b66f257339f9309ada3aa5
Reviewed-on: https://go-review.googlesource.com/11570
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Removes the remains of the old C based stepflt implementation.
Also removed goto usage.
Change-Id: Ida4742c49000fae4fea4649f28afde630ce4c577
Reviewed-on: https://go-review.googlesource.com/9600
Reviewed-by: Russ Cox <rsc@golang.org>
The new inlined code for append assumed that it could pass the
desired new cap to growslice, not the number of new elements.
But growslice still interpreted the argument as the number of new elements,
making it always grow by >2x (more precisely, 2x+1 rounded up
to the next malloc block size). At the time, I had intended to change
the other callers to use the new cap as well, but it's too late for that.
Instead, introduce growslice_n for the old callers and keep growslice
for the inlined (common case) caller.
Fixes#11403.
Filed #11419 to merge them.
Change-Id: I1338b1e5b352f3be4e43641f44b652ef7195251b
Reviewed-on: https://go-review.googlesource.com/11541
Reviewed-by: Austin Clements <austin@google.com>
Instrument operands of OKEY.
Also instrument OSLICESTR. Previously it was not needed
because of preceeding bounds checks (which were instrumented).
But the preceeding bounds checks have disappeared.
Change-Id: I3b0de213e23cbcf5b8ef800abeded5eeeb3f8287
Reviewed-on: https://go-review.googlesource.com/11417
Reviewed-by: Russ Cox <rsc@golang.org>
At some point it silently stopped recognizing test output.
Meanwhile two tests degraded...
Change-Id: I90a0325fc9aaa16c3ef16b9c4c642581da2bb10c
Reviewed-on: https://go-review.googlesource.com/11416
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
For debuggers and other program inspectors.
Fixes#9914.
Change-Id: I670728cea28c045e6eaba1808c550ee2f34d16ff
Reviewed-on: https://go-review.googlesource.com/11341
Reviewed-by: Austin Clements <austin@google.com>
The test is flaky on builders lately. I don't see any issues other than
usage of very small sleeps. So increase the sleeps. Also take opportunity
to refactor the code.
On my machine this change significantly reduces failure rate with GOMAXPROCS=2.
I can't reproduce the failure with GOMAXPROCS=1.
Fixes#10726
Change-Id: Iea6f10cf3ce1be5c112a2375d51c13687a8ab4c9
Reviewed-on: https://go-review.googlesource.com/9803
Reviewed-by: Austin Clements <austin@google.com>
When heapBitsSetType repeats a source bitmap with a scalar tail
(typ.ptrdata < typ.size), it lays out the tail upon reaching the end
of the source bitmap by simply increasing the number of bits claimed
to be in the incoming bit buffer. This causes later iterations to read
the appropriate number of zeros out of the bit buffer before starting
on the next repeat of the source bitmap.
Currently, however, later iterations of the loop continue to read bits
from the source bitmap *regardless of the number of bits currently in
the bit buffer*. The bit buffer can only hold 32 or 64 bits, so if the
scalar tail is large and the padding bits exceed the size of the bit
buffer, the read from the source bitmap on the next iteration will
shift the incoming bits into oblivion when it attempts to put them in
the bit buffer. When the buffer does eventually shift down to where
these bits were supposed to be, it will contain zeros. As a result,
words that should be marked as pointers on later repetitions are
marked as scalars, so the garbage collector does not trace them. If
this is the only reference to an object, it will be incorrectly freed.
Fix this by adding logic to drain the bit buffer down if it is large
instead of reading more bits from the source bitmap.
Fixes#11286.
Change-Id: I964432c4b9f1cec334fc8c3da0ff16460203feb6
Reviewed-on: https://go-review.googlesource.com/11360
Reviewed-by: Russ Cox <rsc@golang.org>
h_spans can be accessed concurrently without synchronization from
other threads, which means it needs the appropriate memory barriers on
weakly ordered machines. It happens to already have the necessary
memory barriers because all accesses to h_spans are currently
protected by the heap lock and the unlocks happen in exactly the
places where release barriers are needed, but it's easy to imagine
that this could change in the future. Document the fact that we're
depending on the barrier implied by the unlock.
Related to issue #9984.
Change-Id: I1bc3c95cd73361b041c8c95cd4bb92daf8c1f94a
Reviewed-on: https://go-review.googlesource.com/11361
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
This CL removes the single and racy use of mheap.arena_end outside
of the bookkeeping done in mHeap_init and mHeap_Alloc.
There should be no way for heapBitsForSpan to see a pointer to
an invalid span. This CL makes the check for this more precise by
checking that the pointer is between mheap_.arena_start and
mheap_.arena_used instead of mheap_.arena_end.
Change-Id: I1200b54353ee1eda002d92645fd8d26048600ceb
Reviewed-on: https://go-review.googlesource.com/11342
Reviewed-by: Austin Clements <austin@google.com>
In order to avoid a race with a concurrent write barrier or garbage
collector thread, any update to arena_used must be preceded by mapping
the corresponding heap bitmap and spans array memory. Otherwise, the
concurrent access may observe that a pointer falls within the heap
arena, but then attempt to access unmapped memory to look up its span
or heap bits.
Commit d57c889 fixed all of the places where we updated arena_used
immediately before mapping the heap bitmap and spans, but it missed
the one place where we update arena_used and depend on later code to
update it again and map the bitmap and spans. This creates a window
where the original race can still happen. This commit fixes this by
mapping the heap bitmap and spans before this arena_used update as
well. This code path is only taken when expanding the heap reservation
on 32-bit over a hole in the address space, so these extra mmap calls
should have negligible impact.
Fixes#10212, #11324.
Change-Id: Id67795e6c7563eb551873bc401e5cc997aaa2bd8
Reviewed-on: https://go-review.googlesource.com/11340
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
The unsynchronized accesses to mheap_.arena_used in the concurrent
part of the garbage collector look like a problem waiting to happen.
In fact, they are safe, but the reason is somewhat subtle and
undocumented. This commit documents this reasoning.
Related to issue #9984.
Change-Id: Icdbf2329c1aa11dbe2396a71eb5fc2a85bd4afd5
Reviewed-on: https://go-review.googlesource.com/11254
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Currently its possible for the garbage collector to observe
uninitialized memory or stale heap bitmap bits on weakly ordered
architectures such as ARM and PPC. On such architectures, the stores
that zero newly allocated memory and initialize its heap bitmap may
move after a store in user code that makes the allocated object
observable by the garbage collector.
To fix this, add a "publication barrier" (also known as an "export
barrier") before returning from mallocgc. This is a store/store
barrier that ensures any write done by user code that makes the
returned object observable to the garbage collector will be ordered
after the initialization performed by mallocgc. No barrier is
necessary on the reading side because of the data dependency between
loading the pointer and loading the contents of the object.
Fixes one of the issues raised in #9984.
Change-Id: Ia3d96ad9c5fc7f4d342f5e05ec0ceae700cd17c8
Reviewed-on: https://go-review.googlesource.com/11083
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Minux Ma <minux@golang.org>
Reviewed-by: Martin Capitanio <capnm9@gmail.com>
Reviewed-by: Russ Cox <rsc@golang.org>
Some latency regressions have crept into our system over the past few
weeks. This CL fixes those by having the mark phase more aggressively
blacken objects so that the mark termination phase, a STW phase, has less
work to do. Three approaches were taken when the mark phase believes
it has no more work to do, ie all the work buffers are empty.
If things have gone well the mark phase is correct and there is
in fact little or no work. In that case the following items will
take very little time. If the mark phase is wrong this CL will
ferret that work out and give the mark phase a chance to deal with
it concurrently before mark termination begins.
When the mark phase first appears to be out of work, it does three things:
1) It switches from allocating white to allocating black to reduce the
number of unmarked objects reachable only from stacks.
2) It flushes and disables per-P GC work caches so all work must be in
globally visible work buffers.
3) It rescans the global roots---the BSS and data segments---so there
are fewer objects to blacken during mark termination. We do not rescan
stacks at this point, though that could be done in a later CL.
After these steps, it again drains the global work buffers.
On a lightly loaded machine the garbage benchmark has reduced the
number of GC cycles with latency > 10 ms from 83 out of 4083 cycles
down to 2 out of 3995 cycles. Maximum latency was reduced from
60+ msecs down to 20 ms.
Change-Id: I152285b48a7e56c5083a02e8e4485dd39c990492
Reviewed-on: https://go-review.googlesource.com/10590
Reviewed-by: Austin Clements <austin@google.com>
There were two issues.
1. Delayed EvGoSysExit could have been emitted during TraceStart,
while it had not yet emitted EvGoInSyscall.
2. Delayed EvGoSysExit could have been emitted during next tracing session.
Fixes#10476Fixes#11262
Change-Id: Iab68eb31cf38eb6eb6eee427f49c5ca0865a8c64
Reviewed-on: https://go-review.googlesource.com/9132
Reviewed-by: Russ Cox <rsc@golang.org>
In preparation for rename of cgocall_errno into cgocall and
asmcgocall_errno into asmcgocall in the fllowinng CL.
rsc requested CL 9387 to be split into two parts. This is first part.
Change-Id: I7434f0e4b44dd37017540695834bfcb1eebf0b2f
Reviewed-on: https://go-review.googlesource.com/11166
Reviewed-by: Ian Lance Taylor <iant@golang.org>
This fixes a hang during runtime.TestTraceStress.
It also fixes double-scan of stacks, which leads to
stack barrier installation failures.
Both of these have shown up as flaky failures on the dashboard.
Fixes#10941.
Change-Id: Ia2a5991ce2c9f43ba06ae1c7032f7c898dc990e0
Reviewed-on: https://go-review.googlesource.com/11089
Reviewed-by: Austin Clements <austin@google.com>
//go:systemstack means that the function must run on the system stack.
Add one use in runtime as a demonstration.
Fixes#9174.
Change-Id: I8d4a509cb313541426157da703f1c022e964ace4
Reviewed-on: https://go-review.googlesource.com/10840
Reviewed-by: Austin Clements <austin@google.com>
Run-TryBot: Austin Clements <austin@google.com>
Currently, when shrinkstack computes whether the halved stack
allocation will have enough room for the stack, it accounts for the
stack space that's actively in use but fails to leave extra room for
the stack guard space. As a result, *if* the minimum stack size is
small enough or the guard large enough, it may shrink the stack and
leave less than enough room to run nosplit functions. If the next
function called after the stack shrink is a nosplit function, it may
overflow the stack without noticing and overwrite non-stack memory.
We don't think this is happening under normal conditions right now.
The minimum stack allocation is 2K and the guard is 640 bytes. The
"worst case" stack shrink is from 4K (4048 bytes after stack barrier
array reservation) to 2K (2016 bytes after stack barrier array
reservation), which means the largest "used" size that will qualify
for shrinking is 4048/4 - 8 = 1004 bytes. After copying, that leaves
2016 - 1004 = 1012 bytes of available stack, which is significantly
more than the guard space.
If we were to reduce the minimum stack size to 1K or raise the guard
space above 1012 bytes, the logic in shrinkstack would no longer leave
enough space.
It's also possible to trigger this problem by setting
firstStackBarrierOffset to 0, which puts stack barriers in a debug
mode that steals away *half* of the stack for the stack barrier array
reservation. Then, the largest "used" size that qualifies for
shrinking is (4096/2)/4 - 8 = 504 bytes. After copying, that leaves
(2096/2) - 504 = 8 bytes of available stack; much less than the
required guard space. This causes failures like those in issue #11027
because func gc() shrinks its own stack and then immediately calls
casgstatus (a nosplit function), which overflows the stack and
overwrites a free list pointer in the neighboring span. However, since
this seems to require the special debug mode, we don't think it's
responsible for issue #11027.
To forestall all of these subtle issues, this commit modifies
shrinkstack to correctly account for the guard space when considering
whether to halve the stack allocation.
Change-Id: I7312584addc63b5bfe55cc384a1012f6181f1b9d
Reviewed-on: https://go-review.googlesource.com/10714
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
Issues #10240, #10541, #10941, #11023, #11027 and possibly others are
indicating memory corruption in the runtime. One of the easiest places
to both get corruption and detect it is in the allocator's free lists
since they appear throughout memory and follow strict invariants. This
commit adds a check when sweeping a span that its free list is sane
and, if not, it prints the corrupted free list and panics. Hopefully
this will help us collect more information on these failures.
Change-Id: I6d417bcaeedf654943a5e068bd76b58bb02d4a64
Reviewed-on: https://go-review.googlesource.com/10713
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
A workaround for #10460.
Change-Id: I607a556561d509db6de047892f886fb565513895
Reviewed-on: https://go-review.googlesource.com/10819
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
While we're here, update the documentation and delete variables with no effect.
Change-Id: I4df0d266dff880df61b488ed547c2870205862f0
Reviewed-on: https://go-review.googlesource.com/10790
Reviewed-by: Austin Clements <austin@google.com>
A send on an unbuffered channel to a blocked receiver is the only
case in the runtime where one goroutine writes directly to the stack
of another. The garbage collector assumes that if a goroutine is
blocked, its stack contains no new pointers since the last time it ran.
The send on an unbuffered channel violates this, so it needs an
explicit write barrier. It has an explicit write barrier, but not one that
can handle a write to another stack. Use one that can (based on type bitmap
instead of heap bitmap).
To make this work, raise the limit for type bitmaps so that they are
used for all types up to 64 kB in size (256 bytes of bitmap).
(The runtime already imposes a limit of 64 kB for a channel element size.)
I have been unable to reproduce this problem in a simple test program.
Could help #11035.
Change-Id: I06ad994032d8cff3438c9b3eaa8d853915128af5
Reviewed-on: https://go-review.googlesource.com/10815
Reviewed-by: Austin Clements <austin@google.com>
This avoids a race with gcmarkwb_m that was leading to faults.
Fixes#10212.
Change-Id: I6fcf8d09f2692227063ce29152cb57366ea22487
Reviewed-on: https://go-review.googlesource.com/10816
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
These were found by grepping the comments from the go code and feeding
the output to aspell.
Change-Id: Id734d6c8d1938ec3c36bd94a4dbbad577e3ad395
Reviewed-on: https://go-review.googlesource.com/10941
Reviewed-by: Aamir Khan <syst3m.w0rm@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Commit 1303957 was supposed to enable write barriers during the
concurrent scan phase, but it only enabled *calls* to the write
barrier during this phase. It failed to update the redundant list of
write-barrier-enabled phases in gcmarkwb_m, so it still wasn't greying
objects during the scan phase.
This commit fixes this by replacing the redundant list of phases in
gcmarkwb_m with simply checking writeBarrierEnabled. This is almost
certainly redundant with checks already done in callers, but the last
time we tried to remove these redundant checks everything got much
slower, so I'm leaving it alone for now.
Fixes#11105.
Change-Id: I00230a3cb80a008e749553a8ae901b409097e4be
Reviewed-on: https://go-review.googlesource.com/10801
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Minux Ma <minux@golang.org>
Stack barriers assume that writes through pointers to frames above the
current frame will get write barriers, and hence these frames do not
need to be re-scanned to pick up these changes. For normal writes,
this is true. However, there are places in the runtime that use
typedmemmove to potentially write through pointers to higher frames
(such as mapassign1). Currently, typedmemmove does not execute write
barriers if the destination is on the stack. If there's a stack
barrier between the current frame and the frame being modified with
typedmemmove, and the stack barrier is not otherwise hit, it's
possible that the garbage collector will never see the updated pointer
and incorrectly reclaim the object.
Fix this by making heapBitsBulkBarrier (which lies behind typedmemmove
and its variants) detect when the destination is in the stack and
unwind stack barriers up to the point, forcing mark termination to
later rescan the effected frame and collect these pointers.
Fixes#11084. Might be related to #10240, #10541, #10941, #11023,
#11027 and possibly others.
Change-Id: I323d6cd0f1d29fa01f8fc946f4b90e04ef210efd
Reviewed-on: https://go-review.googlesource.com/10791
Reviewed-by: Russ Cox <rsc@golang.org>
Currently, write barriers are only enabled after completion of the
concurrent scan phase, as we enter the concurrent mark phase. However,
stack barriers are installed during the scan phase and assume that
write barriers will track changes to frames above the stack
barriers. Since write barriers aren't enabled until after stack
barriers are installed, we may miss modifications to the stack that
happen after installing the stack barriers and before enabling write
barriers.
Fix this by enabling write barriers during the scan phase.
This commit intentionally makes the minimal change to do this (there's
only one line of code change; the rest are comment changes). At the
very least, we should consider eliminating the ragged barrier that's
intended to synchronize the enabling of write barriers, but now just
wastes time. I've included a large comment about extensions and
alternative designs.
Change-Id: Ib20fede794e4fcb91ddf36f99bd97344d7f96421
Reviewed-on: https://go-review.googlesource.com/10795
Reviewed-by: Russ Cox <rsc@golang.org>
Currently checkmarks mode fails to rescan stacks because it sees the
leftover state bits indicating that the stacks haven't changed since
the last scan. As a result, it won't detect lost marks caused by
failing to scan stacks correctly during regular garbage collection.
Fix this by marking all stacks dirty before performing the checkmark
phase.
Change-Id: I1f06882bb8b20257120a4b8e7f95bb3ffc263895
Reviewed-on: https://go-review.googlesource.com/10794
Reviewed-by: Russ Cox <rsc@golang.org>
All of the architectures except ppc64 have only "RET" for the return
mnemonic. ppc64 used to have only "RETURN", but commit cf06ea6
introduced RET as a synonym for RETURN to make ppc64 consistent with
the other architectures. However, that commit was never followed up to
make the code itself consistent by eliminating uses of RETURN.
This commit replaces all uses of RETURN in the ppc64 assembly with
RET.
This was done with
sed -i 's/\<RETURN\>/RET/' **/*_ppc64x.s
plus one manual change to syscall/asm.s.
Change-Id: I3f6c8d2be157df8841d48de988ee43f3e3087995
Reviewed-on: https://go-review.googlesource.com/10672
Reviewed-by: Rob Pike <r@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Minux Ma <minux@golang.org>
gc should ideally consider this an error too; see golang/go#8560.
Change-Id: Ieee71c4ecaff493d7f83e15ba8c8a04ee90a4cf1
Reviewed-on: https://go-review.googlesource.com/10757
Reviewed-by: Robert Griesemer <gri@golang.org>
Currently the stack barriers are installed at the next frame boundary
after gp.sched.sp + 1024*2^n for n=0,1,2,... However, when a G is in a
system call, we set gp.sched.sp to 0, which causes stack barriers to
be installed at *every* frame. This easily overflows the slice we've
reserved for storing the stack barrier information, and causes a
"slice bounds out of range" panic in gcInstallStackBarrier.
Fix this by using gp.syscallsp instead of gp.sched.sp if it's
non-zero. This is the same logic that gentraceback uses to determine
the current SP.
Fixes#11049.
Change-Id: Ie40eeee5bec59b7c1aa715a7c17aa63b1f1cf4e8
Reviewed-on: https://go-review.googlesource.com/10755
Reviewed-by: Russ Cox <rsc@golang.org>
See golang.org/s/go15gomaxprocs for details.
Change-Id: I8de5df34fa01d31d78f0194ec78a2474c281243c
Reviewed-on: https://go-review.googlesource.com/10668
Reviewed-by: Rob Pike <r@golang.org>
Otherwise subsequent tests won't see any modified GOROOT.
With this CL I can move my GOROOT, set GOROOT to the new location, and
the runtime tests pass. Previously the crash_tests would instead look
for the GOROOT baked into the binary, instead of the env var:
--- FAIL: TestGcSys (0.01s)
crash_test.go:92: building source: exit status 2
go: cannot find GOROOT directory: /home/bradfitz/go
--- FAIL: TestGCFairness (0.01s)
crash_test.go:92: building source: exit status 2
go: cannot find GOROOT directory: /home/bradfitz/go
--- FAIL: TestGdbPython (0.07s)
runtime-gdb_test.go:64: building source exit status 2
go: cannot find GOROOT directory: /home/bradfitz/go
--- FAIL: TestLargeStringConcat (0.01s)
crash_test.go:92: building source: exit status 2
go: cannot find GOROOT directory: /home/bradfitz/go
Update #10029
Change-Id: If91be0f04d3acdcf39a9e773a4e7905a446bc477
Reviewed-on: https://go-review.googlesource.com/10685
Reviewed-by: Andrew Gerrand <adg@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
Currently the GODEBUG=gctrace=1 trace line includes "@n.nnns" to
indicate the time that the GC cycle ended relative to the time the
program started. This was meant to be consistent with the utilization
as of the end of the cycle, which is printed next on the trace line,
but it winds up just being confusing and unexpected.
Change the trace line to include the time that the GC cycle started
relative to the time the program started.
Change-Id: I7d64580cd696eb17540716d3e8a74a9d6ae50650
Reviewed-on: https://go-review.googlesource.com/10634
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
This commit implements stack barriers to minimize the amount of
stack re-scanning that must be done during mark termination.
Currently the GC scans stacks of active goroutines twice during every
GC cycle: once at the beginning during root discovery and once at the
end during mark termination. The second scan happens while the world
is stopped and guarantees that we've seen all of the roots (since
there are no write barriers on writes to local stack
variables). However, this means pause time is proportional to stack
size. In particularly recursive programs, this can drive pause time up
past our 10ms goal (e.g., it takes about 150ms to scan a 50MB heap).
Re-scanning the entire stack is rarely necessary, especially for large
stacks, because usually most of the frames on the stack were not
active between the first and second scans and hence any changes to
these frames (via non-escaping pointers passed down the stack) were
tracked by write barriers.
To efficiently track how far a stack has been unwound since the first
scan (and, hence, how much needs to be re-scanned), this commit
introduces stack barriers. During the first scan, at exponentially
spaced points in each stack, the scan overwrites return PCs with the
PC of the stack barrier function. When "returned" to, the stack
barrier function records how far the stack has unwound and jumps to
the original return PC for that point in the stack. Then the second
scan only needs to proceed as far as the lowest barrier that hasn't
been hit.
For deeply recursive programs, this substantially reduces mark
termination time (and hence pause time). For the goscheme example
linked in issue #10898, prior to this change, mark termination times
were typically between 100 and 500ms; with this change, mark
termination times are typically between 10 and 20ms. As a result of
the reduced stack scanning work, this reduces overall execution time
of the goscheme example by 20%.
Fixes#10898.
The effect of this on programs that are not deeply recursive is
minimal:
name old time/op new time/op delta
BinaryTree17 3.16s ± 2% 3.26s ± 1% +3.31% (p=0.000 n=19+19)
Fannkuch11 2.42s ± 1% 2.48s ± 1% +2.24% (p=0.000 n=17+19)
FmtFprintfEmpty 50.0ns ± 3% 49.8ns ± 1% ~ (p=0.534 n=20+19)
FmtFprintfString 173ns ± 0% 175ns ± 0% +1.49% (p=0.000 n=16+19)
FmtFprintfInt 170ns ± 1% 175ns ± 1% +2.97% (p=0.000 n=20+19)
FmtFprintfIntInt 288ns ± 0% 295ns ± 0% +2.73% (p=0.000 n=16+19)
FmtFprintfPrefixedInt 242ns ± 1% 252ns ± 1% +4.13% (p=0.000 n=18+18)
FmtFprintfFloat 324ns ± 0% 323ns ± 0% -0.36% (p=0.000 n=20+19)
FmtManyArgs 1.14µs ± 0% 1.12µs ± 1% -1.01% (p=0.000 n=18+19)
GobDecode 8.88ms ± 1% 8.87ms ± 0% ~ (p=0.480 n=19+18)
GobEncode 6.80ms ± 1% 6.85ms ± 0% +0.82% (p=0.000 n=20+18)
Gzip 363ms ± 1% 363ms ± 1% ~ (p=0.077 n=18+20)
Gunzip 90.6ms ± 0% 90.0ms ± 1% -0.71% (p=0.000 n=17+18)
HTTPClientServer 51.5µs ± 1% 50.8µs ± 1% -1.32% (p=0.000 n=18+18)
JSONEncode 17.0ms ± 0% 17.1ms ± 0% +0.40% (p=0.000 n=18+17)
JSONDecode 61.8ms ± 0% 63.8ms ± 1% +3.11% (p=0.000 n=18+17)
Mandelbrot200 3.84ms ± 0% 3.84ms ± 1% ~ (p=0.583 n=19+19)
GoParse 3.71ms ± 1% 3.72ms ± 1% ~ (p=0.159 n=18+19)
RegexpMatchEasy0_32 100ns ± 0% 100ns ± 1% -0.19% (p=0.033 n=17+19)
RegexpMatchEasy0_1K 342ns ± 1% 331ns ± 0% -3.41% (p=0.000 n=19+19)
RegexpMatchEasy1_32 82.5ns ± 0% 81.7ns ± 0% -0.98% (p=0.000 n=18+18)
RegexpMatchEasy1_1K 505ns ± 0% 494ns ± 1% -2.16% (p=0.000 n=18+18)
RegexpMatchMedium_32 137ns ± 1% 137ns ± 1% -0.24% (p=0.048 n=20+18)
RegexpMatchMedium_1K 41.6µs ± 0% 41.3µs ± 1% -0.57% (p=0.004 n=18+20)
RegexpMatchHard_32 2.11µs ± 0% 2.11µs ± 1% +0.20% (p=0.037 n=17+19)
RegexpMatchHard_1K 63.9µs ± 2% 63.3µs ± 0% -0.99% (p=0.000 n=20+17)
Revcomp 560ms ± 1% 522ms ± 0% -6.87% (p=0.000 n=18+16)
Template 75.0ms ± 0% 75.1ms ± 1% +0.18% (p=0.013 n=18+19)
TimeParse 358ns ± 1% 364ns ± 0% +1.74% (p=0.000 n=20+15)
TimeFormat 360ns ± 0% 372ns ± 0% +3.55% (p=0.000 n=20+18)
Change-Id: If8a9bfae6c128d15a4f405e02bcfa50129df82a2
Reviewed-on: https://go-review.googlesource.com/10314
Reviewed-by: Russ Cox <rsc@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Currently there's a race between stopg scanning another G's stack and
the G reaching a preemption point and scanning its own stack. When
this race occurs, the G's stack is scanned twice. Currently this is
okay, so this race is benign.
However, we will shortly be adding stack barriers during the first
stack scan, so scanning will no longer be idempotent. To prepare for
this, this change ensures that each stack is scanned only once during
each GC phase by checking the flag that indicates that the stack has
been scanned in this phase before scanning the stack.
Change-Id: Id9f4d5e2e5b839bc3f200ec1723a4a12dd677ab4
Reviewed-on: https://go-review.googlesource.com/10458
Reviewed-by: Rick Hudson <rlh@golang.org>
The stack barrier code will need a bookkeeping structure to keep track
of the overwritten return PCs. This commit introduces and allocates
this structure, but does not yet use the structure.
We don't want to allocate space for this structure during garbage
collection, so this commit allocates it along with the allocation of
the corresponding stack. However, we can't do a regular allocation in
newstack because mallocgc may itself grow the stack (which would lead
to a recursive allocation). Hence, this commit makes the bookkeeping
structure part of the stack allocation itself by stealing the
necessary space from the top of the stack allocation. Since the size
of this bookkeeping structure is logarithmic in the size of the stack,
this has minimal impact on stack behavior.
Change-Id: Ia14408be06aafa9ca4867f4e70bddb3fe0e96665
Reviewed-on: https://go-review.googlesource.com/10313
Reviewed-by: Russ Cox <rsc@golang.org>
Currently the runtime assumes that the allocation for the stack is
exactly [stack.lo, stack.hi). We're about to steal a small part of
this allocation for per-stack GC metadata. To prepare for this, this
commit adds a field to the G for the allocated size of the stack.
With this change, stack.lo and stack.hi continue to act as the true
bounds on the stack, but are no longer also used as the bounds on the
stack allocation.
(I also tried this the other way around, where stack.lo and stack.hi
remained the allocation bounds and I introduced a new top of stack.
However, there are far more places that assume stack.hi is the true
top of the stack than there are places that assume it's the top of the
allocation.)
Change-Id: Ifa9d956753be53d286d09cbc73d47fb34a18c0c6
Reviewed-on: https://go-review.googlesource.com/10312
Reviewed-by: Russ Cox <rsc@golang.org>
Currently signalstack takes a lower limit and a length and all calls
hard-code the passed length. Change the API to take a *stack and
compute the lower limit and length from the passed stack.
This will make it easier for the runtime to steal some space from the
top of the stack since it eliminates the hard-coded stack sizes.
Change-Id: I7d2a9f45894b221f4e521628c2165530bbc57d53
Reviewed-on: https://go-review.googlesource.com/10311
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
Currently we truncate gctrace clock and CPU times to millisecond
precision. As a result, many phases are typically printed as 0, which
is fine for user consumption, but makes gathering statistics and
reports over GC traces difficult.
In 1.4, the gctrace line printed times in microseconds. This was
better for statistics, but not as easy for users to read or interpret,
and it generally made the trace lines longer.
This change strikes a balance between these extremes by printing
milliseconds, but including the decimal part to two significant
figures down to microsecond precision. This remains easy to read and
interpret, but includes more precision when it's useful.
For example, where the code currently prints,
gc #29 @1.629s 0%: 0+2+0+12+0 ms clock, 0+2+0+0/12/0+0 ms cpu, 4->4->2 MB, 4 MB goal, 1 P
this prints,
gc #29 @1.629s 0%: 0.005+2.1+0+12+0.29 ms clock, 0.005+2.1+0+0/12/0+0.29 ms cpu, 4->4->2 MB, 4 MB goal, 1 P
Fixes#10970.
Change-Id: I249624779433927cd8b0947b986df9060c289075
Reviewed-on: https://go-review.googlesource.com/10554
Reviewed-by: Russ Cox <rsc@golang.org>
runtime.GC() is intentionally very weakly specified. However, it is so
weakly specified that it's difficult to know that it's being used
correctly for its one intended use case: to ensure garbage collection
has run in a test that is garbage-sensitive. In particular, it is
unclear whether it is synchronous or asynchronous. In the old STW
collector this was essentially self-evident; short of queuing up a
garbage collection to run later, it had to be synchronous. However,
with the concurrent collector, there's evidence that people are
inferring that it may be asynchronous (e.g., issue #10986), as this is
both unclear in the documentation and possible in the implementation.
In fact, runtime.GC() runs a fully synchronous STW collection. We
probably don't want to commit to this exact behavior. But we can
commit to the essential property that tests rely on: that runtime.GC()
does not return until the GC has finished.
Change-Id: Ifc3045a505e1898ecdbe32c1f7e80e2e9ffacb5b
Reviewed-on: https://go-review.googlesource.com/10488
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
TestGoroutineParallelism can deadlock if the GC runs during the
test. Currently it tries to prevent this by forcing a GC before the
test, but this is best effort and fails completely if GOGC is very low
for testing.
This change replaces this best-effort fix with simply setting GOGC to
off for the duration of the test.
Change-Id: I8229310833f241b149ebcd32845870c1cb14e9f8
Reviewed-on: https://go-review.googlesource.com/10454
Reviewed-by: Russ Cox <rsc@golang.org>
Most runtime tests that invoke the compiler to build a sub-test binary
do so with a special environment constructed by testEnv that strips
out environment variables that should apply to the test but not to the
build.
Fix TestGdbPython to use this test environment when invoking go build,
like other tests do.
Change-Id: Iafdf89d4765c587cbebc427a5d61cb8a7e71b326
Reviewed-on: https://go-review.googlesource.com/10455
Reviewed-by: Russ Cox <rsc@golang.org>
Implement the changes from CL 10173 on OpenBSD.
Change-Id: I2db1cd8141fd392a34753a1b8113e2e0401173b9
Reviewed-on: https://go-review.googlesource.com/10342
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Ian proposed an improved way of handling signals masks in Go, motivated
by a problem where the Android java runtime expects certain signals to
be blocked for all JVM threads. Discussion here
https://groups.google.com/forum/#!topic/golang-dev/_TSCkQHJt6g
Ian's text is used in the following:
A Go program always needs to have the synchronous signals enabled.
These are the signals for which _SigPanic is set in sigtable, namely
SIGSEGV, SIGBUS, SIGFPE.
A Go program that uses the os/signal package, and calls signal.Notify,
needs to have at least one thread which is not blocking that signal,
but it doesn't matter much which one.
Unix programs do not change signal mask across execve. They inherit
signal masks across fork. The shell uses this fact to some extent;
for example, the job control signals (SIGTTIN, SIGTTOU, SIGTSTP) are
blocked for commands run due to backquote quoting or $().
Our current position on signal masks was not thought out. We wandered
into step by step, e.g., http://golang.org/cl/7323067 .
This CL does the following:
Introduce a new platform hook, msigsave, that saves the signal mask of
the current thread to m.sigsave.
Call msigsave from needm and newm.
In minit grab set up the signal mask from m.sigsave and unblock the
essential synchronous signals, and SIGILL, SIGTRAP, SIGPROF, SIGSTKFLT
(for systems that have it).
In unminit, restore the signal mask from m.sigsave.
The first time that os/signal.Notify is called, start a new thread whose
only purpose is to update its signal mask to make sure signals for
signal.Notify are unblocked on at least one thread.
The effect on Go programs will be that if they are invoked with some
non-synchronous signals blocked, those signals will normally be
ignored. Previously, those signals would mostly be ignored. A change
in behaviour will occur for programs started with any of these signals
blocked, if they receive the signal: SIGHUP, SIGINT, SIGQUIT, SIGABRT,
SIGTERM. Previously those signals would always cause a crash (unless
using the os/signal package); with this change, they will be ignored
if the program is started with the signal blocked (and does not use
the os/signal package).
./all.bash completes successfully on linux/amd64.
OpenBSD is missing the implementation.
Change-Id: I188098ba7eb85eae4c14861269cc466f2aa40e8c
Reviewed-on: https://go-review.googlesource.com/10173
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Given a call frame F of size N where the return values start at offset R,
callwritebarrier was instructing heapBitsBulkBarrier to scan the block
of memory [F+R, F+R+N). It should only scan [F+R, F+N). The extra N-R
bytes scanned might lead into the next allocated block in memory.
Because the scan was consulting the heap bitmap for type information,
scanning into the next block normally "just worked" in the sense of
not crashing.
Scanning the extra N-R bytes of memory is a problem mainly because
it causes the GC to consider pointers that might otherwise not be
considered, leading it to retain objects that should actually be freed.
This is very difficult to detect.
Luckily, juju turned up a case where the heap bitmap and the memory
were out of sync for the block immediately after the call frame, so that
heapBitsBulkBarrier saw an obvious non-pointer where it expected a
pointer, causing a loud crash.
Why is there a non-pointer in memory that the heap bitmap records as
a pointer? That is more difficult to answer. At least one way that it
could happen is that allocations containing no pointers at all do not
update the heap bitmap. So if heapBitsBulkBarrier walked out of the
current object and into a no-pointer object and consulted those bitmap
bits, it would be misled. This doesn't happen in general because all
the paths to heapBitsBulkBarrier first check for the no-pointer case.
This may or may not be what happened, but it's the only scenario
I've been able to construct.
I tried for quite a while to write a simple test for this and could not.
It does fix the juju crash, and it is clearly an improvement over the
old code.
Fixes#10844.
Change-Id: I53982c93ef23ef93155c4086bbd95a4c4fdaac9a
Reviewed-on: https://go-review.googlesource.com/10317
Reviewed-by: Austin Clements <austin@google.com>
Currently runtime.callers invokes gentraceback with the pc and sp of
the G it is called from, but always passes g0 even if it was called
from a regular g. Right now this has no ill effects because
runtime.callers does not use either callback argument or the
_TraceJumpStack flag, but it makes the code fragile and will break
some upcoming changes.
Fix this by lifting the getg() call outside of the systemstack in
runtime.callers.
Change-Id: I4e1e927961c0e0cd4dcf28693be47df7bae9e122
Reviewed-on: https://go-review.googlesource.com/10292
Reviewed-by: Daniel Morsing <daniel.morsing@gmail.com>
Reviewed-by: Rick Hudson <rlh@golang.org>
This is dead code. If you want to quiesce the system the
preferred way is to use forEachP(func(*p){}).
Change-Id: Ic7677a5dd55e3639b99e78ddeb2c71dd1dd091fa
Reviewed-on: https://go-review.googlesource.com/10267
Reviewed-by: Austin Clements <austin@google.com>
Prior to this CL whenever the GC marking was enabled and
a P was looking for work we supplied a G to help
the GC do its marking tasks. Once this G finished all
the marking available it would release the P to find another
available G. In the case where there was no work the P would drop
into findrunnable which would execute the mark helper G which would
immediately return and the P would drop into findrunnable again repeating
the process. Since the P was always given a G to run it never blocks.
This CL first checks if the GC mark helper G has available work and if
not the P immediately falls through to its blocking logic.
Fixes#10901
Change-Id: I94ac9646866ba64b7892af358888bc9950de23b5
Reviewed-on: https://go-review.googlesource.com/10189
Reviewed-by: Austin Clements <austin@google.com>
Currently setGCPercent sets heapminimum to heapminimum*GOGC/100. The
real intent is to set heapminimum to a scaled multiple of a fixed
default heap minimum, not to scale heapminimum based on its current
value. This turns out to be okay because setGCPercent is only called
once and heapminimum is initially set to this default heap minimum.
However, the code as written is confusing, especially since
setGCPercent is otherwise written so it could be called again to
change GOGC. Fix this by introducing a defaultHeapMinimum constant and
using this instead of the current value of heapminimum to compute the
scaled heap minimum.
As part of this, this commit improves the documentation on
heapminimum.
Change-Id: I4eb82c73dc2eb44a6e5a17c780a747a2e73d7493
Reviewed-on: https://go-review.googlesource.com/10181
Reviewed-by: Russ Cox <rsc@golang.org>
This is a duplicate of CL 9491.
That CL broke the build due to pprof shortcomings
and was reverted in CL 9565.
CL 9623 fixed pprof, so this can go in again.
Fixes#10659.
Change-Id: If470fc90b3db2ade1d161b4417abd2f5c6c330b8
Reviewed-on: https://go-review.googlesource.com/10212
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Currently, forEachP reuses the stopwait and stopnote fields from
stopTheWorld to track how many Ps have not responded to the safe-point
request and to sleep until all Ps have responded.
It was assumed this was safe because both stopTheWorld and forEachP
must occur under the worlsema and hence stopwait and stopnote cannot
be used for both purposes simultaneously and callers could always
determine the appropriate use based on sched.gcwaiting (which is only
set by stopTheWorld). However, this is not the case, since it's
possible for there to be a window between when an M observes that
gcwaiting is set and when it checks stopwait during which stopwait
could have changed meanings. When this happens, the M decrements
stopwait and may wakeup stopnote, but does not otherwise participate
in the forEachP protocol. As a result, stopwait is decremented too
many times, so it may reach zero before all Ps have run the safe-point
function, causing forEachP to wake up early. It will then either
observe that some P has not run the safe-point function and panic with
"P did not run fn", or the remaining P (or Ps) will run the safe-point
function before it wakes up and it will observe that stopwait is
negative and panic with "not stopped".
Fix this problem by giving forEachP its own safePointWait and
safePointNote fields.
One known sequence of events that can cause this race is as
follows. It involves three actors:
G1 is running on M1 on P1. P1 has an empty run queue.
G2/M2 is in a blocked syscall and has lost its P. (The details of this
don't matter, it just needs to be in a position where it needs to grab
an idle P.)
GC just started on G3/M3/P3. (These aren't very involved, they just
have to be separate from the other G's, M's, and P's.)
1. GC calls stopTheWorld(), which sets sched.gcwaiting to 1.
Now G1/M1 begins to enter a syscall:
2. G1/M1 invokes reentersyscall, which sets the P1's status to
_Psyscall.
3. G1/M1's reentersyscall observes gcwaiting != 0 and calls
entersyscall_gcwait.
4. G1/M1's entersyscall_gcwait blocks acquiring sched.lock.
Back on GC:
5. stopTheWorld cas's P1's status to _Pgcstop, does other stuff, and
returns.
6. GC does stuff and then calls startTheWorld().
7. startTheWorld() calls procresize(), which sets P1's status to
_Pidle and puts P1 on the idle list.
Now G2/M2 returns from its syscall and takes over P1:
8. G2/M2 returns from its blocked syscall and gets P1 from the idle
list.
9. G2/M2 acquires P1, which sets P1's status to _Prunning.
10. G2/M2 starts a new syscall and invokes reentersyscall, which sets
P1's status to _Psyscall.
Back on G1/M1:
11. G1/M1 finally acquires sched.lock in entersyscall_gcwait.
At this point, G1/M1 still thinks it's running on P1. P1's status is
_Psyscall, which is consistent with what G1/M1 is doing, but it's
_Psyscall because *G2/M2* put it in to _Psyscall, not G1/M1. This is
basically an ABA race on P1's status.
Because forEachP currently shares stopwait with stopTheWorld. G1/M1's
entersyscall_gcwait observes the non-zero stopwait set by forEachP,
but mistakes it for a stopTheWorld. It cas's P1's status from
_Psyscall (set by G2/M2) to _Pgcstop and proceeds to decrement
stopwait one more time than forEachP was expecting.
Fixes#10618. (See the issue for details on why the above race is safe
when forEachP is not involved.)
Prior to this commit, the command
stress ./runtime.test -test.run TestFutexsleep\|TestGoroutineProfile
would reliably fail after a few hundred runs. With this commit, it
ran for over 2 million runs and never crashed.
Change-Id: I9a91ea20035b34b6e5f07ef135b144115f281f30
Reviewed-on: https://go-review.googlesource.com/10157
Reviewed-by: Russ Cox <rsc@golang.org>