Otherwise the GC may see uninitialized memory there,
which might be old pointers that are retained, or it might
trigger the invalid pointer check.
Fixes#11907.
Change-Id: I67e306384a68468eef45da1a8eb5c9df216a77c0
Reviewed-on: https://go-review.googlesource.com/12852
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
The last time we tried this, linux/arm64 broke.
The series of CLs leading to this one fixes that problem.
Let's try again.
Fixes#9880.
Change-Id: I67bc1d959175ec972d4dcbe4aa6f153790f74251
Reviewed-on: https://go-review.googlesource.com/12849
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
The layout code has to date insisted on stack frames that are 16-aligned
including the saved LR, and it ensured this by growing the frame itself.
This breaks code that refers to values near the top of the frame by positive
offset from SP, and in general it's too magical: if you see TEXT xxx, $N,
you expect that the frame size is actually N, not sometimes N and sometimes N+8.
This led to a serious bug in the compiler where ambiguously live values
were not being zeroed correctly, which in turn triggered an assertion
in the GC about finding only valid pointers. The compiler has been
fixed to always emit aligned frames, and the hand-written assembly
has also been fixed.
Now that everything is aligned, make unaligned an error instead of
something to "fix" silently.
For #9880.
Change-Id: I05f01a9df174d64b37fa19b36a6b6c5f18d5ba2d
Reviewed-on: https://go-review.googlesource.com/12848
Reviewed-by: Austin Clements <austin@google.com>
The nosplit stack overflow checks were confused about morestack.
The comment about not having correct SP information at the call
to morestack was true, but that was a real bug, not something to
work around. I fixed that problem in CL 12144. With that fixed,
no need to special-case morestack in the way done here.
This cleanup and simplification of the code was the first step
to fixing a bug that happened when I started working on the
arm64 frame size adjustments, but the cleanup was sufficient
to make the bug go away.
For #9880.
Change-Id: I16b69a5c16b6b8cb4090295d3029c42d606e3b9b
Reviewed-on: https://go-review.googlesource.com/12846
Reviewed-by: Austin Clements <austin@google.com>
arm64 requires either no stack frame or a frame with a size that is 8 mod 16
(adding the saved LR will make it 16-aligned).
The cmd/internal/obj/arm64 has been silently aligning frames, but it led to
a terrible bug when the compiler and obj disagreed on the frame size,
and it's just generally confusing, so we're going to make misaligned frames
an error instead of something that is silently changed.
This CL prepares by updating assembly files.
Note that the changes in this CL are already being done silently by
cmd/internal/obj/arm64, so there is no semantic effect here,
just a clarity effect.
For #9880.
Change-Id: Ibd6928dc5fdcd896c2bacd0291bf26b364591e28
Reviewed-on: https://go-review.googlesource.com/12845
Reviewed-by: Austin Clements <austin@google.com>
If the compiler doesn't do it, cmd/internal/obj/arm64 will,
and that will break the zeroing of ambiguously live values
done in zerorange, which in turn produces uninitialized
pointer cells that the GC trips over.
For #9880.
Change-Id: Ice97c30bc8b36d06b7b88d778d87fab8e1827fdc
Reviewed-on: https://go-review.googlesource.com/12847
Reviewed-by: Austin Clements <austin@google.com>
This adds a GCCPUFraction field to MemStats that reports the
cumulative fraction of the program's execution time spent in the
garbage collector. This is equivalent to the utilization percent shown
in the gctrace output and makes this available programmatically.
This does make one small effect on the gctrace output: we now report
the duration of mark termination up to just before the final
start-the-world, rather than up to just after. However, unlike
stop-the-world, I don't believe there's any way that start-the-world
can block, so it should take negligible time.
While there are many statistics one might want to expose via MemStats,
this is one of the few that will undoubtedly remain meaningful
regardless of future changes to the memory system.
The diff for this change is larger than the actual change. Mostly it
lifts the code for computing the GC CPU utilization out of the
debug.gctrace path.
Updates #10323.
Change-Id: I0f7dc3fdcafe95e8d1233ceb79de606b48acd989
Reviewed-on: https://go-review.googlesource.com/12844
Reviewed-by: Russ Cox <rsc@golang.org>
Currently we only capture GC phase transition times if
debug.gctrace>0, but we're about to compute GC CPU utilization
regardless of whether debug.gctrace is set, so we need these
regardless of debug.gctrace.
Change-Id: If3acf16505a43d416e9a99753206f03287180660
Reviewed-on: https://go-review.googlesource.com/12843
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
The following sequence of events can lead to the runtime attempting an
out-of-bounds access on a stack barrier slice:
1. A SIGPROF comes in on a thread while the G on that thread is in
_Gsyscall. The sigprof handler calls gentraceback, which saves a
local copy of the G's stkbar slice. Currently the G has no stack
barriers, so this slice is empty.
2. On another thread, the GC concurrently scans the stack of the
goroutine being profiled (it considers it stopped because it's in
_Gsyscall) and installs stack barriers.
3. Back on the sigprof thread, gentraceback comes across a stack
barrier in the stack and attempts to look it up in its (zero
length) copy of G's old stkbar slice, which causes an out-of-bounds
access.
This commit fixes this by adding a simple cas spin to synchronize the
SIGPROF handler with stack barrier insertion.
In general I would prefer that this synchronization be done through
the G status, since that's how stack scans are otherwise synchronized,
but adding a new lock is a much smaller change and G statuses are full
of subtlety.
Fixes#11863.
Change-Id: Ie89614a6238bb9c6a5b1190499b0b48ec759eaf7
Reviewed-on: https://go-review.googlesource.com/12748
Reviewed-by: Russ Cox <rsc@golang.org>
The scheduler, work buffer's dispose, and write barriers
can conspire to hide the a pointer from the GC's concurent
mark phase. If this pointer is the only path to a large
amount of marking the STW mark termination phase may take
a lot of time.
Consider the following:
1) dispose places a work buffer on the partial queue
2) the GC is busy so it does not immediately remove and
process the work buffer
3) the scheduler runs a mutator whose write barrier dequeues the
work buffer from the partial queue so the GC won't see it
This repeats until the GC reaches the mark termination
phase where the GC finally discovers the pointer along
with a lot of work to do.
This CL fixes the problem by having the mutator
dispose of the buffer to the full queue instead of
the partial queue. Since the write buffer never asks for full
buffers the conspiracy described above is not possible.
Updates #11694.
Change-Id: I2ce832f9657a7570f800e8ce4459cd9e304ef43b
Reviewed-on: https://go-review.googlesource.com/12840
Reviewed-by: Austin Clements <austin@google.com>
These are the old assemblers written in C, and now they are
not needed.
Fixes#10510.
Change-Id: Id9337ffc8eccfd93c84b2e23f427fb1a576b543d
Reviewed-on: https://go-review.googlesource.com/12784
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
At this stage, dist is only building go_bootstrap as cmd/compile and
the rest of the Go toolchain has already been built.
Change-Id: I6f99fa00ff1d3585e215f4ce84d49344c4fcb8a5
Reviewed-on: https://go-review.googlesource.com/12779
Reviewed-by: Russ Cox <rsc@golang.org>
Russ Cox fixed this issue for other systems
in CL 12026, but the Plan 9 part was forgotten.
Fixes#11656.
Change-Id: I91c033687987ba43d13ad8f42e3fe4c7a78e6075
Reviewed-on: https://go-review.googlesource.com/12762
Reviewed-by: Russ Cox <rsc@golang.org>
This is a reprise of https://golang.org/cl/12623. In that a CL I made
a suggestion which forgot that the +build constraints in the test
directory are not the same as those supported by the go tool: in the
test directory, if a single +build line fails, the test is skipped.
(In my defense, the code I was commenting on was also wrong.)
Change-Id: I8f29392a80b1983027f9a33043c803578409d678
Reviewed-on: https://go-review.googlesource.com/12776
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: David Crawshaw <crawshaw@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
The function is already defined between syscall_solaris.go and
syscall2_solaris.go.
Change-Id: I034baf7c8531566bebfdbc5a4061352cbcc31449
Reviewed-on: https://go-review.googlesource.com/12773
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reformat some help messages to stay within 80 characters.
Fixes#11840.
Change-Id: Iebafcb616f202ac44405e5897097492a79a51722
Reviewed-on: https://go-review.googlesource.com/12514
Reviewed-by: Rob Pike <r@golang.org>
This change prevents DNS query results using domain search list
overtaking results not using the list unconditionally, which only
happens when using builtin DNS stub resolver.
The previous internal lookup function lookup is split into lookup and
goLookupIPOrder for iteration over a set of names: FQDN or absolute
FQDN, with domain label suffixes in search list, without domain label
suffixes, and for concurrent A and AAAA record queries.
Fixes#11081.
Change-Id: I9ff0640f69276e372d97e709b149ed5b153e8601
Reviewed-on: https://go-review.googlesource.com/10836
Reviewed-by: Russ Cox <rsc@golang.org>
Rename test name from Http to HTTP, and fix some style nits.
Change-Id: I00fe1cecd69ca2f50be86a76ec90031c2f921707
Reviewed-on: https://go-review.googlesource.com/12760
Reviewed-by: Andrew Gerrand <adg@golang.org>
A further attempt to fix raiseproc on Solaris.
Change-Id: I8d8000d6ccd0cd9f029ebe1f211b76ecee230cd0
Reviewed-on: https://go-review.googlesource.com/12771
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
I forgot that the libc raise function only sends the signal to the
current thread. We need to actually use kill and getpid here, as we
do on other systems.
Change-Id: Iac34af822c93468bf68cab8879db3ee20891caaf
Reviewed-on: https://go-review.googlesource.com/12704
Reviewed-by: Russ Cox <rsc@golang.org>
I've also changed TestDialSerialAsyncSpuriousConnection for consistency,
although it always computes a finalDeadline of zero.
Note that #11225 is the root cause of the socket leak; this just hides
it from the unit test by restoring the shorter timeout.
Fixes#11878
Change-Id: Ie0037dd3bce6cc81d196765375489f8c61be74c2
Reviewed-on: https://go-review.googlesource.com/12712
Reviewed-by: Russ Cox <rsc@golang.org>
Run-TryBot: Paul Marks <pmarks@google.com>
Fixes#11436.
Change-Id: I5c4455e9b13b478838f23ac31e6343672dfc60af
Reviewed-on: https://go-review.googlesource.com/12143
Reviewed-by: Michael Hudson-Doyle <michael.hudson@canonical.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Until cl/12721 and cl/12574, all standard library tests included
runtime/cgo on darwin/arm64 by virtue of package os including it. Now
that is no longer true, runtime/cgo needs to be added by the go tool
just as it is for darwin/arm. (This installs the Mach exception
handler used to properly handle EXC_BAD_ACCESS.)
Fixes#11901
Change-Id: I991525f46eca5b0750b93595579ebc0ff10e47eb
Reviewed-on: https://go-review.googlesource.com/12723
Reviewed-by: Russ Cox <rsc@golang.org>
The new Token API is meant to sit on the side of the Decoder,
so that you only get the new code (and any latent bugs in it)
if you are actively using the Token API.
The unconditional use of dec.peek in dec.tokenPrepareForDecode
violates that intention.
Change tokenPrepareForDecode not to call dec.peek unless needed
(because the Token API has advanced the state).
This restores the old code path behavior, no peeking allowed.
I checked by patching in the new tests from CL 12726 that
this change suffices to "fix" the error handling bug in dec.peek.
Obviously that bug should be fixed too, but the point is that
with this CL, bugs in dec.peek do not affect plain use of Decode
or Unmarshal.
I also checked by putting a panic in dec.peek that the only
tests that now invoke peek are:
TestDecodeInStream
ExampleDecoder_Token
ExampleDecoder_Decode_stream
and those tests all invoke dec.Token directly.
Change-Id: I0b242d0cb54a9c830548644670dc5ab5ccef69f2
Reviewed-on: https://go-review.googlesource.com/12740
Reviewed-by: Andrew Gerrand <adg@golang.org>
Reviewed-by: Peter Waldschmidt <peter@waldschmidt.com>
From https://github.com/golang/go/issues/11745#issuecomment-123555313
this implements option (b), having the server pause slightly after
sending the final response on a TCP connection when we're about to close
it when we know there's a request body outstanding. This biases the
client (which might not be Go) to prefer our response header over the
request body write error.
Updates #11745
Change-Id: I07cb0b74519d266c8049d9e0eb23a61304eedbf8
Reviewed-on: https://go-review.googlesource.com/12658
Reviewed-by: Russ Cox <rsc@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
Seems like the simplest solution for 1.5. All the parts of the test
suite I can run on my current device (for which my exception handler
fix no longer works, apparently) pass without this code. I'll move it
into x/mobile/app.
Fixes#11884
Change-Id: I2da40c8c7b48a4c6970c4d709dd7c148a22e8727
Reviewed-on: https://go-review.googlesource.com/12721
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Currently we enter mark 2 by first flushing all existing gcWork caches
and then setting gcBlackenPromptly, which disables further gcWork
caching. However, if a worker or assist pulls a work buffer in to its
gcWork cache after that cache has been flushed but before caching is
disabled, that work may remain in that cache until mark termination.
If that work represents a heap bottleneck (e.g., a single pointer that
is the only way to reach a large amount of the heap), this can force
mark termination to do a large amount of work, resulting in a long
STW.
Fix this by reversing the order of these steps: first disable caching,
then flush all existing caches.
Rick Hudson <rlh> did the hard work of tracking this down. This CL
combined with CL 12672 and CL 12646 distills the critical parts of his
fix from CL 12539.
Fixes#11694.
Change-Id: Ib10d0a21e3f6170a80727d0286f9990df049fed2
Reviewed-on: https://go-review.googlesource.com/12688
Reviewed-by: Rick Hudson <rlh@golang.org>
Currently the GC coordinator enables GC assists at the same time it
enables background mark workers, after the concurrent scan phase is
done. However, this means a rapidly allocating mutator has the entire
scan phase during which to allocate beyond the heap trigger and
potentially beyond the heap goal with no back-pressure from assists.
This prevents the feedback system that's supposed to keep the heap
size under the heap goal from doing its job.
Fix this by enabling mutator assists during the scan phase. This is
safe because the write barrier is already enabled and globally
acknowledged at this point.
There's still a very small window between when the heap size reaches
the heap trigger and when the GC coordinator is able to stop the world
during which the mutator can allocate unabated. This allows *very*
rapidly allocator mutators like TestTraceStress to still occasionally
exceed the heap goal by a small amount (~20 MB at most for
TestTraceStress). However, this seems like a corner case.
Fixes#11677.
Change-Id: I0f80d949ec82341cd31ca1604a626efb7295a819
Reviewed-on: https://go-review.googlesource.com/12674
Reviewed-by: Russ Cox <rsc@golang.org>
Currently we hand-code a set of phases when draining is allowed.
However, this set of phases is conservative. The critical invariant is
simply that the write barrier must be enabled if we're draining.
Shortly we're going to enable mutator assists during the scan phase,
which means we may drain during the scan phase. In preparation, this
commit generalizes these assertions to check the fundamental condition
that the write barrier is enabled, rather than checking that we're in
any particular phase.
Change-Id: I0e1bec1ca823d4a697a0831ec4c50f5dd3f2a893
Reviewed-on: https://go-review.googlesource.com/12673
Reviewed-by: Russ Cox <rsc@golang.org>
Currently we clear both the mark 1 and mark 2 signals at the beginning
of concurrent mark. If either if these is clear, it acts as a signal
to the scheduler that it should start background workers. However,
this means that in the interim *between* mark 1 and mark 2, the
scheduler basically loops starting up new workers only to have them
return with nothing to do. In addition to harming performance and
delaying mutator work, this approach has a race where workers started
for mark 1 can mistakenly signal mark 2, causing it to complete
prematurely. This approach also interferes with starting assists
earlier to fix#11677.
Fix this by initially setting both mark 1 and mark 2 to "signaled".
The scheduler will not start background mark workers, though assists
can still run. When we're ready to enter mark 1, we clear the mark 1
signal and wait for it. Then, when we're ready to enter mark 2, we
clear the mark 2 signal and wait for it.
This structure also lets us deal cleanly with the situation where all
work is drained *prior* to the mark 2 wait, meaning that there may be
no workers to signal completion. Currently we deal with this using a
racy (and possibly incorrect) check for work in the coordinator itself
to skip the mark 2 wait if there's no work. This change makes the
coordinator unconditionally wait for mark completion and makes the
scheduler itself signal completion by slightly extending the logic it
already has to determine that there's no work and hence no use in
starting a new worker.
This is a prerequisite to fixing the remaining component of #11677,
which will require enabling assists during the scan phase. However, we
don't want to enable background workers until the mark phase because
they will compete with the scan. This change lets us use bgMark1 and
bgMark2 to indicate when it's okay to start background workers
independent of assists.
This is also a prerequisite to fixing #11694. It significantly reduces
the occurrence of long mark termination pauses in #11694 (from 64 out
of 1000 to 2 out of 1000 in one experiment).
Coincidentally, this also reduces the final heap size (and hence run
time) of TestTraceStress from ~100 MB and ~1.9 seconds to ~14 MB and
~0.4 seconds because it significantly shortens concurrent mark
duration.
Rick Hudson <rlh> did the hard work of tracking this down.
Change-Id: I12ea9ee2db9a0ae9d3a90dde4944a75fcf408f4c
Reviewed-on: https://go-review.googlesource.com/12672
Reviewed-by: Russ Cox <rsc@golang.org>
Currently, there are three ways to satisfy a GC assist: 1) the mutator
steals credit from background GC, 2) the mutator actually does GC
work, and 3) there is no more work available. 3 was never really
intended as a way to satisfy an assist, and it causes problems: there
are periods when it's expected that the GC won't have any work, such
as when transitioning from mark 1 to mark 2 and from mark 2 to mark
termination. During these periods, there's no back-pressure on rapidly
allocating mutators, which lets them race ahead of the heap goal.
For example, test/init1.go and the runtime/trace test both have small
reachable heaps and contain loops that rapidly allocate large garbage
byte slices. This bug lets these tests exceed the heap goal by several
orders of magnitude.
Fix this by forcing the assist (and hence the allocation) to block
until it can satisfy its debt via either 1 or 2, or the GC cycle
terminates.
This fixes one the causes of #11677. It's still possible to overshoot
the GC heap goal, but with this change the overshoot is almost exactly
by the amount of allocation that happens during the concurrent scan
phase, between when the heap passes the GC trigger and when the GC
enables assists.
Change-Id: I5ef4edcb0d2e13a1e432e66e8245f2bd9f8995be
Reviewed-on: https://go-review.googlesource.com/12671
Reviewed-by: Russ Cox <rsc@golang.org>
Currently it's possible for the GC assist to signal completion of the
mark phase, which puts the GC coordinator goroutine on the current P's
run queue, and then return to mutator code that delays until the next
forced preemption before actually yielding control to the GC
coordinator, dragging out completion of the mark phase. This delay can
be further exacerbated if the mutator makes other goroutines runnable
before yielding control, since this will push the GC coordinator on
the back of the P's run queue.
To fix this, this adds a Gosched to the assist if it completed the
mark phase. This immediately and directly yields control to the GC
coordinator. This already happens implicitly in the background mark
workers because they park immediately after completing the mark.
This is one of the reasons completion of the mark phase is being
dragged out and allowing the mutator to allocate without assisting,
leading to the large heap goal overshoot in issue #11677. This is also
a prerequisite to making the assist block when it can't pay off its
debt.
Change-Id: I586adfbecb3ca042a37966752c1dc757f5c7fc78
Reviewed-on: https://go-review.googlesource.com/12670
Reviewed-by: Russ Cox <rsc@golang.org>
Currently it's possible to perform GC work on a system stack or when
locks are held if there's an allocation that triggers an assist. This
is generally a bad idea because of the fragility of these contexts,
and it's incompatible with two changes we're about to make: one is to
yield after signaling mark completion (which we can't do from a
non-preemptible context) and the other is to make assists block if
there's no other way for them to pay off the assist debt.
This commit simply skips the assist if it's called from a
non-preemptible context. The allocation will still count toward the
assist debt, so it will be paid off by a later assist. There should be
little allocation from non-preemptible contexts, so this shouldn't
harm the overall assist mechanism.
Change-Id: I7bf0e6c73e659fe6b52f27437abf39d76b245c79
Reviewed-on: https://go-review.googlesource.com/12649
Reviewed-by: Russ Cox <rsc@golang.org>
When notetsleep_internal is called from notetsleepg, notetsleepg has
just given up the P, so write barriers are not allowed in
notetsleep_internal.
Change-Id: I1b214fa388b1ea05b8ce2dcfe1c0074c0a3c8870
Reviewed-on: https://go-review.googlesource.com/12647
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
Currently fractional and idle mark workers dispose of their gcWork
cache during mark 2 after incrementing work.nwait and after checking
whether there are any workers or any work available. This creates a
window for two races:
1) If the only remaining work is in this worker's gcWork cache, it
will see that there are no more workers and no more work on the
global lists (since it has not yet flushed its own cache) and
prematurely signal mark 2 completion.
2) After this worker has incremented work.nwait but before it has
flushed its cache, another worker may observe that there are no
more workers and no more work and prematurely signal mark 2
completion.
We can fix both of these by simply moving the cache flush above the
increment of nwait and the test of the completion condition.
This is probably contributing to #11694, though this alone is not
enough to fix it.
Change-Id: Idcf9656e5c460c5ea0d23c19c6c51e951f7716c3
Reviewed-on: https://go-review.googlesource.com/12646
Reviewed-by: Russ Cox <rsc@golang.org>
GC assists are supposed to steal at most the amount of background GC
credit available so that background GC credit doesn't go negative.
However, they are instead stealing the *total* amount of their debt
but only claiming up to the amount of credit that was available. This
results in draining the background GC credit pool too quickly, which
results in unnecessary assist work.
The fix is trivial: steal the amount of work we meant to steal (which
is already computed).
Change-Id: I837fe60ed515ba91c6baf363248069734a7895ef
Reviewed-on: https://go-review.googlesource.com/12643
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>