We already do this for shared libraries. Do it for plugins also.
Suggestions on how to test this would be welcome.
I'd like to get this in for 1.8. It could lead to mysterious
hangs when using plugins.
Fixes#18676
Change-Id: I03209b096149090b9ba171c834c5e59087ed0f92
Reviewed-on: https://go-review.googlesource.com/35117
Reviewed-by: David Crawshaw <crawshaw@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Michael Hudson-Doyle <michael.hudson@canonical.com>
The bug subcommand opens up the browser instead of printing information.
Fixes help message to reflect that.
Fixes#18630.
Change-Id: I660c94bc65ef1994292cfd72d08a544699545701
Reviewed-on: https://go-review.googlesource.com/35150
Reviewed-by: Russ Cox <rsc@golang.org>
As is, they were fully vulnerable to the Lucky13 attack. The SHA1
variants implement limited countermeasures (see f28cf8346c) but the
SHA256 ones are apparently used rarely enough (see 8741504888) that
it's not worth the extra code.
Instead, disable them by default and update the warning.
Updates #13385
Updates #15487
Change-Id: I45b8b716001e2fa0811b17e25be76e2512e5abb2
Reviewed-on: https://go-review.googlesource.com/35290
Reviewed-by: Adam Langley <alangley@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Matt Layher <mdlayher@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
It looks like it should be there, although I couldn't find a test
case that fails without it. ZeroWB is probably never generated now:
zeroing an initialized heap object is done by making an autotmp on
stack, zeroing it, and copying (typedmemmove) to heap.
Passes "toolstash -cmp" on std.
Change-Id: I702a59759e33fb8cc2a34a3b3029e7540aca080a
Reviewed-on: https://go-review.googlesource.com/35250
Reviewed-by: Austin Clements <austin@google.com>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Change-Id: Iac713ae1f322f893c92b3fc47fe9b5719052f9eb
Reviewed-on: https://go-review.googlesource.com/35240
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: David Symonds <dsymonds@golang.org>
Reviewed-by: Minux Ma <minux@golang.org>
CL 5232 added TestServerHijackGetsBackgroundByte, which is failing
on Plan 9, because CloseWrite is not implemented on Plan 9 yet.
Updates #17906.
Updates #18657.
Change-Id: I3c2f73760b0f767f3f9ed2698c855372170e0481
Reviewed-on: https://go-review.googlesource.com/35178
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
CL 35234 added TestServerHijackGetsBackgroundByte_big, which is failing
on Plan 9, because CloseWrite is not implemented on Plan 9 yet.
Updates #17906.
Updates #18658.
Change-Id: Icaf3fe3600d586515ecd92aca874104ea81ce6b9
Reviewed-on: https://go-review.googlesource.com/35179
Run-TryBot: David du Colombier <0intro@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Follow-up test from Ian's comments in https://golang.org/cl/35232
after submit.
Change-Id: Ifa504bd8d09e555c3c7738376199dfc9b99130cf
Reviewed-on: https://go-review.googlesource.com/35234
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
The working directory is now adjusted to match the typical Go test
working directory in main, as the old trick for adjusting earlier
stopped working with the latest version of LLDB bugs.
That means the small number of places where testdata files are
read before main is called no longer work. This CL adjusts those
reads to happen after main is called. (This has the bonus effect of
not reading some benchmark testdata files in all.bash.)
Fixes compress/bzip2, go/doc, go/parser, os, and time package
tests on the iOS builder.
Change-Id: If60f026aa7848b37511c36ac5e3985469ec25209
Reviewed-on: https://go-review.googlesource.com/35255
Run-TryBot: David Crawshaw <crawshaw@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Previously, if the Hijack called stopped the background read call
which read a byte, that byte was sitting in memory, buffered, ready to
be Read by Hijack's returned bufio.Reader, but it wasn't yet in the
bufio.Reader's buffer itself, so bufio.Reader.Buffered() reported 1
byte fewer.
This matters for callers who wanted to stitch together any buffered
data (with bufio.Reader.Peek(bufio.Reader.Buffered())) with Hijack's
returned net.Conn. Otherwise there was no way for callers to know a
byte was read.
Change-Id: Id7cb0a0a33fe2f33d79250e13dbaa9c0f7abba13
Reviewed-on: https://go-review.googlesource.com/35232
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
Another change in behvaior (bug) in LLDB. Despite the fact that
LLDB can dump the symtab of our test binaries and show the function
addresses, it can no longer call the functions. This means the chdir
trick on signal is failing.
This CL uses a new trick. For iOS, the exec script passes the change
in directory as an argument, and it is processed early by the test
harness generated by cmd/go.
For the iOS builders.
Change-Id: I8f5d0f831fe18de99f097761f89c5184d5bf2afb
Reviewed-on: https://go-review.googlesource.com/35152
Reviewed-by: Elias Naur <elias.naur@gmail.com>
Run-TryBot: David Crawshaw <crawshaw@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
mkpost.go replaces all variables prefixed with 'X_' with '_' on s390x
because most of them do not need to be exposed. X__val is being used
by a third party library so it turns out we do need to expose it on
s390x (it is already exposed on all other Linux architectures).
Fixes#17298 and updates #18632.
Change-Id: Ic03463229a5f75ca41a4a4b50300da4b4d892d45
Reviewed-on: https://go-review.googlesource.com/30130
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Also fix a couple of other errors.
Fixes#6877
Change-Id: I94c81c5847cc7b0adab19418e71687bc2ee7fe94
Reviewed-on: https://go-review.googlesource.com/34960
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Make sure that the same type and itab generated in two
different plugins are actually the same thing.
See also CL 35115
Change-Id: I0c1ecb039d7e2bf5a601d58dfa162a435ae4ef76
Reviewed-on: https://go-review.googlesource.com/35116
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Crawshaw <crawshaw@golang.org>
When traceback sees reflect.makeFuncStub (or reflect.methodValueCall)
on the stack, it expects to be able to get the *reflect.makeFuncImpl
(or *reflect.methodValue) for that call from the first outgoing
argument slot of makeFuncStub/methodValueCall.
However, currently this object isn't necessarily kept live across
makeFuncStub. This means it may get garbage collected while in a
reflect call and reused for something else. If we then try to
traceback, the runtime will see a corrupted makeFuncImpl object and
panic. This was not a problem in previous releases because we always
kept arguments live across the whole function. This became a problem
when we stopped doing this.
Fix this by using reflect.KeepAlive to keep the
makeFuncImpl/methodValue live across all of callReflect/callMethod,
which in turn keeps it live as long as makeFuncStub/methodValueCall
are on the stack.
Fixes#18635.
Change-Id: I91853efcf17912390fddedfb0230648391c33936
Reviewed-on: https://go-review.googlesource.com/35151
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
C-only symbols are excluded from pclntab because of a quirk of darwin,
where functions are referred to by an exported symbol so dynamic
relocations de-duplicate to the host binary module and break unwinding.
This doesn't happen on ELF systems because the linker always refers to
unexported module-local symbols, so we don't need this condition.
And the current logic for excluding some functions breaks the module
verification code in moduledataverify1. So disable this for plugins
on linux.
(In 1.9, it will probably be necessary to introduce a module-local
symbol reference system on darwin to fix a different bug, so all of
this onlycsymbol code made be short-lived.)
With this CL, the tests in CL 35116 pass.
Change-Id: I517d7ca4427241fa0a91276c462827efb9383be9
Reviewed-on: https://go-review.googlesource.com/35190
Reviewed-by: Michael Hudson-Doyle <michael.hudson@canonical.com>
Run-TryBot: David Crawshaw <crawshaw@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Ranging over an array causes the array to be copied over to the
stack, which cause large re-growths. Instead, we should iterate
over slices of the array.
Also, assigning a large struct literal uses the stack even
though the actual fields being populated are small in comparison
to the entirety of the struct (see #18636).
Fixing the stack growth does not alter CPU-time performance much
since the stack-growth and copying was such a tiny portion of the
compression work:
name old time/op new time/op delta
Encode/Digits/Default/1e4-8 332µs ± 1% 332µs ± 1% ~ (p=0.796 n=10+10)
Encode/Digits/Default/1e5-8 5.07ms ± 2% 5.05ms ± 1% ~ (p=0.815 n=9+8)
Encode/Digits/Default/1e6-8 53.7ms ± 1% 53.9ms ± 1% ~ (p=0.075 n=10+10)
Encode/Twain/Default/1e4-8 380µs ± 1% 380µs ± 1% ~ (p=0.684 n=10+10)
Encode/Twain/Default/1e5-8 5.79ms ± 2% 5.79ms ± 1% ~ (p=0.497 n=9+10)
Encode/Twain/Default/1e6-8 61.5ms ± 1% 61.8ms ± 1% ~ (p=0.247 n=10+10)
name old speed new speed delta
Encode/Digits/Default/1e4-8 30.1MB/s ± 1% 30.1MB/s ± 1% ~ (p=0.753 n=10+10)
Encode/Digits/Default/1e5-8 19.7MB/s ± 2% 19.8MB/s ± 1% ~ (p=0.795 n=9+8)
Encode/Digits/Default/1e6-8 18.6MB/s ± 1% 18.5MB/s ± 1% ~ (p=0.072 n=10+10)
Encode/Twain/Default/1e4-8 26.3MB/s ± 1% 26.3MB/s ± 1% ~ (p=0.616 n=10+10)
Encode/Twain/Default/1e5-8 17.3MB/s ± 2% 17.3MB/s ± 1% ~ (p=0.484 n=9+10)
Encode/Twain/Default/1e6-8 16.3MB/s ± 1% 16.2MB/s ± 1% ~ (p=0.238 n=10+10)
Updates #18636Fixes#18625
Change-Id: I471b20339bf675f63dc56d38b3acdd824fe23328
Reviewed-on: https://go-review.googlesource.com/35122
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
The iOS test harness has set a breakpoint early in the life of Go
programs so that it can change the current working directory using
information only available from the host debugger. Somewhere in the
upgrade to iOS 10 / XCode 8.2, breakpoints stopped working. This
may be an LLDB bug, or a bug in the ios-deploy LLDB scripts, it's
not clear.
Work around the problem by giving up on breakpoints. Instead, early
in the life of every test binary built for iOS, send (and ignore) a
SIGUSR2 signal. The debugger will catch this, giving the script
go_darwin_arm_exec a chance to change the working directory.
For the iOS builders.
Change-Id: I7476531985217d0c76bc176904c48379210576c2
Reviewed-on: https://go-review.googlesource.com/34926
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Make sure that the same type and itab generated in two
different shared library are actually the same thing.
Change-Id: Ica45862d65ff8bc7ad04d59a41f57223f71224cd
Reviewed-on: https://go-review.googlesource.com/35115
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Use R11 (a caller-saved temp register) instead of RBX (a callee-saved
register).
I believe this only affects linux/amd64, since it is the only platform
with a non-trivial cgoSigtramp implementation.
Updates #18328.
Change-Id: I3d35c4512624184d5a8ece653fa09ddf50e079a2
Reviewed-on: https://go-review.googlesource.com/35068
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
This reduces compilation time for the program
in #18602 from 7 hours to 30 min.
Updates #14781
Updates #18602
Change-Id: I3c4af878a08920e6373d3b3b0c4453ee002e32eb
Reviewed-on: https://go-review.googlesource.com/35113
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Somehow this file didn't get gofmted after the last change, which
interferes with merges.
Change-Id: I965cfdbf27a01124a6ed300be9687ff84f68f9a1
Reviewed-on: https://go-review.googlesource.com/35064
Reviewed-by: Matt Layher <mdlayher@gmail.com>
Reviewed-by: Adam Langley <agl@chromium.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Matt Layher <mdlayher@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
The test was previously an integration test, relying on luck and many
goroutines and lots of time to hit the path to be tested.
Instead, rewrite the test to exactly hit the path to be tested, in one
try, in one goroutine.
Fixes#18205
Change-Id: I63cd513316344bfd7375dcc452c1c396dec0e49f
Reviewed-on: https://go-review.googlesource.com/35107
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
The test is inherently racy and vulnerable to starvation,
and within all.bash on some platforms that means it flakes.
Test is kept because it can be useful standalone to verify
behavior of GOEXPERIMENT=preeemptibleloops, and there is
likely to be further development of this feature in the
future.
There's also some question as to why it is flaking, because
though technically this is permitted, it's very odd in this
simple case.
Fixes#18589.
Change-Id: Ia0dd9037285c4a03122da4012c96981c9cc43b60
Reviewed-on: https://go-review.googlesource.com/35051
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
Loop breaking with a counter. Benchmarked (see comments),
eyeball checked for sanity on popular loops. This code
ought to handle loops in general, and properly inserts phi
functions in cases where the earlier version might not have.
Includes test, plus modifications to test/run.go to deal with
timeout and killing looping test. Tests broken by the addition
of extra code (branch frequency and live vars) for added
checks turn the check insertion off.
If GOEXPERIMENT=preemptibleloops, the compiler inserts reschedule
checks on every backedge of every reducible loop. Alternately,
specifying GO_GCFLAGS=-d=ssa/insert_resched_checks/on will
enable it for a single compilation, but because the core Go
libraries contain some loops that may run long, this is less
likely to have the desired effect.
This is intended as a tool to help in the study and diagnosis
of GC and other latency problems, now that goal STW GC latency
is on the order of 100 microseconds or less.
Updates #17831.
Updates #10958.
Change-Id: I6206c163a5b0248e3f21eb4fc65f73a179e1f639
Reviewed-on: https://go-review.googlesource.com/33910
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
In Go1.7, a 301, 302, or 303 redirect on a HEAD method, would still
cause the following redirects to still use a HEAD.
In CL/29852 this behavior was changed such that those codes always
caused a redirect with the GET method. Fix this such that both
GET and HEAD will preserve the method.
Fixes#18570
Change-Id: I4bfe69872a30799419e3fad9178f907fe439b449
Reviewed-on: https://go-review.googlesource.com/34981
Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com>
Reviewed-by: Emmanuel Odeke <emm.odeke@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Change-Id: I429637ca91f7db4144f17621de851a548dc1ce76
Reviewed-on: https://go-review.googlesource.com/34923
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Rule 9 arguably doesn't make sense for IPv4 addresses, and so far it
has only caused problems (#13283, #18518). Disable it until we hear
from users that actually want/need it.
Fixes#18518.
Change-Id: I7b0dd75d03819cab8e0cd4c29f0c1dc8d2e9c179
Reviewed-on: https://go-review.googlesource.com/34914
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
CALLPART of STRUCTLIT did not check for incomplete initialization
of struct; modify PTRLIT treatment to force zeroing.
Test for structlit, believe this might have also failed for
arraylit.
Fixes#18410.
Change-Id: I511abf8ef850e300996d40568944665714efe1fc
Reviewed-on: https://go-review.googlesource.com/34622
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
To implement the blocking of a select, a goroutine builds a list of
offers to communicate (pseudo-g's, aka sudog), one for each case,
queues them on the corresponding channels, and waits for another
goroutine to complete one of those cases and wake it up. Obviously it
is not OK for two other goroutines to complete multiple cases and both
wake the goroutine blocked in select. To make sure that only one
branch of the select is chosen, all the sudogs contain a pointer to a
shared (single) 'done uint32', which is atomically cas'ed by any
interested goroutines. The goroutine that wins the cas race gets to
wake up the select. A complication is that 'done uint32' is stored on
the stack of the goroutine running the select, and that stack can move
during the select due to stack growth or stack shrinking.
The relevant ordering to block and unblock in select is:
1. Lock all channels.
2. Create list of sudogs and queue sudogs on all channels.
3. Switch to system stack, mark goroutine as asleep,
unlock all channels.
4. Sleep until woken.
5. Wake up on goroutine stack.
6. Lock all channels.
7. Dequeue sudogs from all channels.
8. Free list of sudogs.
9. Unlock all channels.
There are two kinds of stack moves: stack growth and stack shrinking.
Stack growth happens while the original goroutine is running.
Stack shrinking happens asynchronously, during garbage collection.
While a channel listing a sudog is locked by select in this process,
no other goroutine can attempt to complete communication on that
channel, because that other goroutine doesn't hold the lock and can't
find the sudog. If the stack moves while all the channel locks are
held or when the sudogs are not yet or no longer queued in the
channels, no problem, because no goroutine can get to the sudogs and
therefore to selectdone. We only need to worry about the stack (and
'done uint32') moving with the sudogs queued in unlocked channels.
Stack shrinking can happen any time the goroutine is stopped.
That code already acquires all the channel locks before doing the
stack move, so it avoids this problem.
Stack growth can happen essentially any time the original goroutine is
running on its own stack (not the system stack). In the first half of
the select, all the channels are locked before any sudogs are queued,
and the channels are not unlocked until the goroutine has stopped
executing on its own stack and is asleep, so that part is OK. In the
second half of the select, the goroutine wakes up on its own goroutine
stack and immediately locks all channels. But the actual call to lock
might grow the stack, before acquiring any locks. In that case, the
stack is moving with the sudogs queued in unlocked channels. Not good.
One goroutine has already won a cas on the old stack (that goroutine
woke up the selecting goroutine, moving it out of step 4), and the
fact that done = 1 now should prevent any other goroutines from
completing any other select cases. During the stack move, however,
sudog.selectdone is moved from pointing to the old done variable on
the old stack to a new memory location on the new stack. Another
goroutine might observe the moved pointer before the new memory
location has been initialized. If the new memory word happens to be
zero, that goroutine might win a cas on the new location, thinking it
can now complete the select (again). It will then complete a second
communication (reading from or writing to the goroutine stack
incorrectly) and then attempt to wake up the selecting goroutine,
which is already awake.
The scribbling over the goroutine stack unexpectedly is already bad,
but likely to go unnoticed, at least immediately. As for the second
wakeup, there are a variety of ways it might play out.
* The goroutine might not be asleep.
That will produce a runtime crash (throw) like in #17007:
runtime: gp: gp=0xc0422dcb60, goid=2299, gp->atomicstatus=8
runtime: g: g=0xa5cfe0, goid=0, g->atomicstatus=0
fatal error: bad g->status in ready
Here, atomicstatus=8 is copystack; the second, incorrect wakeup is
observing that the selecting goroutine is in state "Gcopystack"
instead of "Gwaiting".
* The goroutine might be sleeping in a send on a nil chan.
If it wakes up, it will crash with 'fatal error: unreachable'.
* The goroutine might be sleeping in a send on a non-nil chan.
If it wakes up, it will crash with 'fatal error: chansend:
spurious wakeup'.
* The goroutine might be sleeping in a receive on a nil chan.
If it wakes up, it will crash with 'fatal error: unreachable'.
* The goroutine might be sleeping in a receive on a non-nil chan.
If it wakes up, it will silently (incorrectly!) continue as if it
received a zero value from a closed channel, leaving a sudog queued on
the channel pointing at that zero vaue on the goroutine's stack; that
space will be reused as the goroutine executes, and when some other
goroutine finally completes the receive, it will do a stray write into
the goroutine's stack memory, which may cause problems. Then it will
attempt the real wakeup of the goroutine, leading recursively to any
of the cases in this list.
* The goroutine might have been running a select in a finalizer
(I hope not!) and might now be sleeping waiting for more things to
finalize. If it wakes up, as long as it goes back to sleep quickly
(before the real GC code tries to wake it), the spurious wakeup does
no harm (but the stack was still scribbled on).
* The goroutine might be sleeping in gcParkAssist.
If it wakes up, that will let the goroutine continue executing a bit
earlier than we would have liked. Eventually the GC will attempt the
real wakeup of the goroutine, leading recursively to any of the cases
in this list.
* The goroutine cannot be sleeping in bgsweep, because the background
sweepers never use select.
* The goroutine might be sleeping in netpollblock.
If it wakes up, it will crash with 'fatal error: netpollblock:
corrupted state'.
* The goroutine might be sleeping in main as another thread crashes.
If it wakes up, it will exit(0) instead of letting the other thread
crash with a non-zero exit status.
* The goroutine cannot be sleeping in forcegchelper,
because forcegchelper never uses select.
* The goroutine might be sleeping in an empty select - select {}.
If it wakes up, it will return to the next line in the program!
* The goroutine might be sleeping in a non-empty select (again).
In this case, it will wake up spuriously, with gp.param == nil (no
reason for wakeup), but that was fortuitously overloaded for handling
wakeup due to a closing channel and the way it is handled is to rerun
the select, which (accidentally) handles the spurious wakeup
correctly:
if cas == nil {
// This can happen if we were woken up by a close().
// TODO: figure that out explicitly so we don't need this loop.
goto loop
}
Before looping, it will dequeue all the sudogs on all the channels
involved, so that no other goroutine will attempt to wake it.
Since the goroutine was blocked in select before, being blocked in
select again when the spurious wakeup arrives may be quite likely.
In this case, the spurious wakeup does no harm (but the stack was
still scribbled on).
* The goroutine might be sleeping in semacquire (mutex slow path).
If it wakes up, that is taken as a signal to try for the semaphore
again, not a signal that the semaphore is now held, but the next
iteration around the loop will queue the sudog a second time, causing
a cycle in the wakeup list for the given address. If that sudog is the
only one in the list, when it is eventually dequeued, it will
(due to the precise way the code is written) leave the sudog on the
queue inactive with the sudog broken. But the sudog will also be in
the free list, and that will eventually cause confusion.
* The goroutine might be sleeping in notifyListWait, for sync.Cond.
If it wakes up, (*Cond).Wait returns. The docs say "Unlike in other
systems, Wait cannot return unless awoken by Broadcast or Signal,"
so the spurious wakeup is incorrect behavior, but most callers do not
depend on that fact. Eventually the condition will happen, attempting
the real wakeup of the goroutine and leading recursively to any of the
cases in this list.
* The goroutine might be sleeping in timeSleep aka time.Sleep.
If it wakes up, it will continue running, leaving a timer ticking.
When that time bomb goes off, it will try to ready the goroutine
again, leading to any one of the cases in this list.
* The goroutine cannot be sleeping in timerproc,
because timerproc never uses select.
* The goroutine might be sleeping in ReadTrace.
If it wakes up, it will print 'runtime: spurious wakeup of trace
reader' and return nil. All future calls to ReadTrace will print
'runtime: ReadTrace called from multiple goroutines simultaneously'.
Eventually, when trace data is available, a true wakeup will be
attempted, leading to any one of the cases in this list.
None of these fatal errors appear in any of the trybot or dashboard
logs. The 'bad g->status in ready' that happens if the goroutine is
running (the most likely scenario anyway) has happened once on the
dashboard and eight times in trybot logs. Of the eight, five were
atomicstatus=8 during net/http tests, so almost certainly this bug.
The other three were atomicstatus=2, all near code in select,
but in a draft CL by Dmitry that was rewriting select and may or may
not have had its own bugs.
This bug has existed since Go 1.4. Until then the select code was
implemented in C, 'done uint32' was a C stack variable 'uint32 done',
and C stacks never moved. I believe it has become more common recently
because of Brad's work to run more and more tests in net/http in
parallel, which lengthens race windows.
The fix is to run step 6 on the system stack,
avoiding possibility of stack growth.
Fixes#17007 and possibly other mysterious failures.
Change-Id: I9d6575a51ac96ae9d67ec24da670426a4a45a317
Reviewed-on: https://go-review.googlesource.com/34835
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
This adds high-level descriptions of the scheduler structures, the
user and system stacks, error handling, and synchronization.
Change-Id: I1eed97c6dd4a6e3d351279e967b11c6e64898356
Reviewed-on: https://go-review.googlesource.com/34290
Reviewed-by: Rick Hudson <rlh@golang.org>