qbit/go - go - Tape:neT

qbit/go

mirror of https://github.com/golang/go synced 2024-11-19 15:44:44 -07:00

Author	SHA1	Message	Date
Keith Randall	0d7a2241cb	runtime: update a few comments noescape is now 0 instructions with the SSA backend. fast atomics are no longer a TODO (at least for amd64). Change-Id: Ib6e06f7471bef282a47ba236d8ce95404bb60a42 Reviewed-on: https://go-review.googlesource.com/28087 Run-TryBot: Keith Randall <khr@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-08-30 18:16:28 +00:00
David Crawshaw	f135c32640	runtime: initialize hash algs before typemap When compiling with -buildmode=shared, a map[int32]*_type is created for each extra module mapping duplicate types back to a canonical object. This is done in the function typelinksinit, which is called before the init function that sets up the hash functions for the map implementation. The result is typemap becomes unusable after runtime initialization. The fix in this CL is to move algorithm init before typelinksinit in the runtime setup process. (For 1.8, we may want to turn typemap into a sorted slice of types and use binary search.) Manually tested on GOOS=linux with: GOHOSTARCH=386 GOARCH=386 ./make.bash && \ go install -buildmode=shared std && \ cd ../test && \ go run run.go -linkshared Fixes #16590 Change-Id: Idc08c50cc70d20028276fbf564509d2cd5405210 Reviewed-on: https://go-review.googlesource.com/25469 Run-TryBot: David Crawshaw <crawshaw@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2016-08-04 17:39:05 +00:00
Ian Lance Taylor	50048a4e8e	runtime: add as many extra M's as needed When a non-Go thread calls into Go, the runtime needs an M to run the Go code. The runtime keeps a list of extra M's available. When the last extra M is allocated, the needextram field is set to tell it to allocate a new extra M as soon as it is running in Go. This ensures that an extra M will always be available for the next thread. However, if many threads need an extra M at the same time, this serializes them all. One thread will get an extra M with the needextram field set. All the other threads will see that there is no M available and will go to sleep. The one thread that succeeded will create a new extra M. One lucky thread will get it. All the other threads will see that there is no M available and will go to sleep. The effect is thundering herd, as all the threads looking for an extra M go through the process one by one. This seems to have a particularly bad effect on the FreeBSD scheduler for some reason. With this change, we track the number of threads waiting for an M, and create all of them as soon as one thread gets through. This still means that all the threads will fight for the lock to pick up the next M. But at least each thread that gets the lock will succeed, instead of going to sleep only to fight again. This smooths out the performance greatly on FreeBSD, reducing the average wall time of `testprogcgo CgoCallbackGC` by 74%. On GNU/Linux the average wall time goes down by 9%. Fixes #13926 Fixes #16396 Change-Id: I6dc42a4156085a7ed4e5334c60b39db8f8ef8fea Reviewed-on: https://go-review.googlesource.com/25047 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Dmitry Vyukov <dvyukov@google.com>	2016-07-20 13:31:55 +00:00
Ian Lance Taylor	25a609556a	runtime: correct printing of blocked field in scheduler trace When the blocked field was first introduced back in https://golang.org/cl/61250043 the scheduler trace code incorrectly used m->blocked instead of mp->blocked. That has carried through the conversion to Go. This CL fixes it. Change-Id: Id81907b625221895aa5c85b9853f7c185efd8f4b Reviewed-on: https://go-review.googlesource.com/24571 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2016-06-29 01:38:39 +00:00
Ian Lance Taylor	84d8aff94c	runtime: collect stack trace if SIGPROF arrives on non-Go thread Fixes #15994. Change-Id: I5aca91ab53985ac7dcb07ce094ec15eb8ec341f8 Reviewed-on: https://go-review.googlesource.com/23891 Run-TryBot: Ian Lance Taylor <iant@golang.org> Reviewed-by: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-06-13 21:43:19 +00:00
Russ Cox	7fdec6216c	build: enable framepointer mode by default This has a minor performance cost, but far less than is being gained by SSA. As an experiment, enable it during the Go 1.7 beta. Having frame pointers on by default makes Linux's perf, Intel VTune, and other profilers much more useful, because it lets them gather a stack trace efficiently on profiling events. (It doesn't help us that much, since when we walk the stack we usually need to look up PC-specific information as well.) Fixes #15840. Change-Id: I4efd38412a0de4a9c87b1b6e5d11c301e63f1a2a Reviewed-on: https://go-review.googlesource.com/23451 Run-TryBot: Russ Cox <rsc@golang.org> Reviewed-by: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-05-26 19:02:00 +00:00
Ian Lance Taylor	a5d1a72a40	cmd/cgo, runtime, runtime/cgo: TSAN support for malloc Acquire and release the TSAN synchronization point when calling malloc, just as we do when calling any other C function. If we don't do this, TSAN will report false positive errors about races calling malloc and free. We used to have a special code path for malloc and free, going through the runtime functions cmalloc and cfree. The special code path for cfree was no longer used even before this CL. This CL stops using the special code path for malloc, because there is no place along that path where we could conditionally insert the TSAN synchronization. This CL removes the support for the special code path for both functions. Instead, cgo now automatically generates the malloc function as though it were referenced as C.malloc. We need to automatically generate it even if C.malloc is not called, even if malloc and size_t are not declared, to support cgo-provided functions like C.CString. Change-Id: I829854ec0787a80f33fa0a8a0dc2ee1d617830e2 Reviewed-on: https://go-review.googlesource.com/23260 Reviewed-by: Dmitry Vyukov <dvyukov@google.com> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2016-05-25 23:22:24 +00:00
Austin Clements	3be48b4dc8	runtime: pass gcWork to scanstack Currently scanstack obtains its own gcWork from the P for the duration of the stack scan and then, if called during mark termination, disposes the gcWork. However, this means that the number of workbufs allocated will be at least the number of stacks scanned during mark termination, which may be very high (especially during a STW GC). This happens because, in steady state, each scanstack will obtain a fresh workbuf (either from the empty list or by allocating it), fill it with the scan results, and then dispose it to the full list. Nothing is consuming from the full list during this (and hence nothing is recycling them to the empty list), so the length of the full list by the time mark termination starts draining it is at least the number of stacks scanned. Fix this by pushing the gcWork acquisition up the stack to either the gcDrain that calls markroot that calls scanstack (which batches across many stack scans and is the path taken during STW GC) or to newstack (which is still a single scanstack call, but this is roughly bounded by the number of Ps). This fix reduces the workbuf allocation for the test program from issue #15319 from 213 MB (roughly 2KB * 1e5 goroutines) to 10 MB. Fixes #15319. Note that there's potentially a similar issue in write barriers during mark 2. Fixing that will be more difficult since there's no broader non-preemptible context, but it should also be less of a problem since the full list is being drained during mark 2. Some overall improvements in the go1 benchmarks, plus the usual noise. No significant change in the garbage benchmark (time/op or GC memory). name old time/op new time/op delta BinaryTree17-12 2.54s ± 1% 2.51s ± 1% -1.09% (p=0.000 n=20+19) Fannkuch11-12 2.12s ± 0% 2.17s ± 0% +2.18% (p=0.000 n=19+18) FmtFprintfEmpty-12 45.1ns ± 1% 45.2ns ± 0% ~ (p=0.078 n=19+18) FmtFprintfString-12 127ns ± 0% 128ns ± 0% +1.08% (p=0.000 n=19+16) FmtFprintfInt-12 125ns ± 0% 122ns ± 1% -2.71% (p=0.000 n=14+18) FmtFprintfIntInt-12 196ns ± 0% 190ns ± 1% -2.91% (p=0.000 n=12+20) FmtFprintfPrefixedInt-12 196ns ± 0% 194ns ± 1% -0.94% (p=0.000 n=13+18) FmtFprintfFloat-12 253ns ± 1% 251ns ± 1% -0.86% (p=0.000 n=19+20) FmtManyArgs-12 807ns ± 1% 784ns ± 1% -2.85% (p=0.000 n=20+20) GobDecode-12 7.13ms ± 1% 7.12ms ± 1% ~ (p=0.351 n=19+20) GobEncode-12 5.89ms ± 0% 5.95ms ± 0% +0.94% (p=0.000 n=19+19) Gzip-12 219ms ± 1% 221ms ± 1% +1.35% (p=0.000 n=18+20) Gunzip-12 37.5ms ± 1% 37.4ms ± 0% ~ (p=0.057 n=20+19) HTTPClientServer-12 81.4µs ± 4% 81.9µs ± 3% ~ (p=0.118 n=17+18) JSONEncode-12 15.7ms ± 1% 15.8ms ± 1% +0.73% (p=0.000 n=17+18) JSONDecode-12 57.9ms ± 1% 57.2ms ± 1% -1.34% (p=0.000 n=19+19) Mandelbrot200-12 4.12ms ± 1% 4.10ms ± 0% -0.33% (p=0.000 n=19+17) GoParse-12 3.22ms ± 2% 3.25ms ± 1% +0.72% (p=0.000 n=18+20) RegexpMatchEasy0_32-12 70.6ns ± 1% 71.1ns ± 2% +0.63% (p=0.005 n=19+20) RegexpMatchEasy0_1K-12 240ns ± 0% 239ns ± 1% -0.59% (p=0.000 n=19+20) RegexpMatchEasy1_32-12 71.3ns ± 1% 71.3ns ± 1% ~ (p=0.844 n=17+17) RegexpMatchEasy1_1K-12 384ns ± 2% 371ns ± 1% -3.45% (p=0.000 n=19+20) RegexpMatchMedium_32-12 109ns ± 1% 108ns ± 2% -0.48% (p=0.029 n=19+19) RegexpMatchMedium_1K-12 34.3µs ± 1% 34.5µs ± 2% ~ (p=0.160 n=18+20) RegexpMatchHard_32-12 1.79µs ± 9% 1.72µs ± 2% -3.83% (p=0.000 n=19+19) RegexpMatchHard_1K-12 53.3µs ± 4% 51.8µs ± 1% -2.82% (p=0.000 n=19+20) Revcomp-12 386ms ± 0% 388ms ± 0% +0.72% (p=0.000 n=17+20) Template-12 62.9ms ± 1% 62.5ms ± 1% -0.57% (p=0.010 n=18+19) TimeParse-12 325ns ± 0% 331ns ± 0% +1.84% (p=0.000 n=18+19) TimeFormat-12 338ns ± 0% 343ns ± 0% +1.34% (p=0.000 n=18+20) [Geo mean] 52.7µs 52.5µs -0.42% Change-Id: Ib2d34736c4ae2ec329605b0fbc44636038d8d018 Reviewed-on: https://go-review.googlesource.com/23391 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2016-05-25 21:11:47 +00:00
Austin Clements	91740582c3	runtime: add 'next' flag to ready Currently ready always puts the readied goroutine in runnext. We're going to have to change this for some uses, so add a flag for whether or not to use runnext. For now we always pass true so this is a no-op change. For #15706. Change-Id: Iaa66d8355ccfe4bbe347570cc1b1878c70fa25df Reviewed-on: https://go-review.googlesource.com/23171 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2016-05-19 18:17:58 +00:00
Ian Lance Taylor	1f7a0d4b5e	runtime: don't do a plain throw when throwsplit == true The test case in #15639 somehow causes an invalid syscall frame. The failure is obscured because the throw occurs when throwsplit == true, which causes a "stack split at bad time" error when trying to print the throw message. This CL fixes the "stack split at bad time" by using systemstack. No test because there shouldn't be any way to trigger this error anyhow. Update #15639. Change-Id: I4240f3fd01bdc3c112f3ffd1316b68504222d9e1 Reviewed-on: https://go-review.googlesource.com/23153 Run-TryBot: Ian Lance Taylor <iant@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2016-05-19 04:37:45 +00:00
Ian Lance Taylor	84e808043f	runtime: use cgo traceback for SIGPROF If we collected a cgo traceback when entering the SIGPROF signal handler, record it as part of the profiling stack trace. This serves as the promised test for https://golang.org/cl/21055 . Change-Id: I5f60cd6cea1d9b7c3932211483a6bfab60ed21d2 Reviewed-on: https://go-review.googlesource.com/22650 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2016-05-04 00:08:19 +00:00
Dmitry Vyukov	caa2147532	runtime: per-P contexts for race detector Race runtime also needs local malloc caches and currently uses a mix of per-OS-thread and per-goroutine caches. This leads to increased memory consumption. But more importantly cache of synchronization objects is per-goroutine and we don't always have goroutine context when feeing memory in GC. As the result synchronization object descriptors leak (more precisely, they can be reused if another synchronization object is recreated at the same address, but it does not always help). For example, the added BenchmarkSyncLeak has effectively runaway memory consumption (based on a real long running server). This change updates race runtime with support for per-P contexts. BenchmarkSyncLeak now stabilizes at ~1GB memory consumption. Long term, this will allow us to remove race runtime dependency on glibc (as malloc is the main cornerstone). I've also implemented a different scheme to pass P context to race runtime: scheduler notified race runtime about association between G and P by calling procwire(g, p)/procunwire(g, p). But it turned out to be very messy as we have lots of places where the association changes (e.g. syscalls). So I dropped it in favor of the current scheme: race runtime asks scheduler about the current P. Fixes #14533 Change-Id: Iad10d2f816a44affae1b9fed446b3580eafd8c69 Reviewed-on: https://go-review.googlesource.com/19970 Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Dmitry Vyukov <dvyukov@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-05-03 11:00:43 +00:00
Dmitry Vyukov	fcd7c02c70	runtime: fix CPU underutilization Runqempty is a critical predicate for scheduler. If runqempty spuriously returns true, then scheduler can fail to schedule arbitrary number of runnable goroutines on idle Ps for arbitrary long time. With the addition of runnext runqempty predicate become broken (can spuriously return true). Consider that runnext is not nil and the main array is empty. Runqempty observes that the array is empty, then it is descheduled for some time. Then queue owner pushes another element to the queue evicting runnext into the array. Then queue owner pops runnext. Then runqempty resumes and observes runnext is nil and returns true. But there were no point in time when the queue was empty. Fix runqempty predicate to not return true spuriously. Change-Id: Ifb7d75a699101f3ff753c4ce7c983cf08befd31e Reviewed-on: https://go-review.googlesource.com/20858 Reviewed-by: Austin Clements <austin@google.com> Run-TryBot: Dmitry Vyukov <dvyukov@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-05-03 10:06:32 +00:00
Austin Clements	2a889b9d93	runtime: make stack re-scan O(# dirty stacks) Currently the stack re-scan during mark termination is O(# stacks) because we enqueue a root marking job for every goroutine. It takes ~34ns to process this root marking job for a valid (clean) stack, so at around 300k goroutines we exceed the 10ms pause goal. A non-trivial portion of this time is spent simply taking the cache miss to check the gcscanvalid flag, so simply optimizing the path that handles clean stacks can only improve this so much. Fix this by keeping an explicit list of goroutines with dirty stacks that need to be rescanned. When a goroutine first transitions to running after a stack scan and marks its stack dirty, it adds itself to this list. We enqueue root marking jobs only for the goroutines in this list, so this improves stack re-scanning asymptotically by completely eliminating time spent on clean goroutines. This reduces mark termination time for 500k idle goroutines from 15ms to 238µs. Overall performance effect is negligible. name \ 95%ile-time/markTerm old new delta IdleGs/gs:500000/gomaxprocs:12 15000µs ± 0% 238µs ± 5% -98.41% (p=0.000 n=10+10) name old time/op new time/op delta XBenchGarbage-12 2.30ms ± 3% 2.29ms ± 1% -0.43% (p=0.049 n=17+18) name old time/op new time/op delta BinaryTree17-12 2.57s ± 3% 2.59s ± 2% ~ (p=0.141 n=19+20) Fannkuch11-12 2.09s ± 0% 2.10s ± 1% +0.53% (p=0.000 n=19+19) FmtFprintfEmpty-12 45.3ns ± 3% 45.2ns ± 2% ~ (p=0.845 n=20+20) FmtFprintfString-12 129ns ± 0% 127ns ± 0% -1.55% (p=0.000 n=16+16) FmtFprintfInt-12 123ns ± 0% 119ns ± 1% -3.24% (p=0.000 n=19+19) FmtFprintfIntInt-12 195ns ± 1% 189ns ± 1% -3.11% (p=0.000 n=17+17) FmtFprintfPrefixedInt-12 193ns ± 1% 187ns ± 1% -3.06% (p=0.000 n=19+19) FmtFprintfFloat-12 254ns ± 0% 255ns ± 1% +0.35% (p=0.001 n=14+17) FmtManyArgs-12 781ns ± 0% 770ns ± 0% -1.48% (p=0.000 n=16+19) GobDecode-12 7.00ms ± 1% 6.98ms ± 1% ~ (p=0.563 n=19+19) GobEncode-12 5.91ms ± 1% 5.92ms ± 0% ~ (p=0.118 n=19+18) Gzip-12 219ms ± 1% 215ms ± 1% -1.81% (p=0.000 n=18+18) Gunzip-12 37.2ms ± 0% 37.4ms ± 0% +0.45% (p=0.000 n=17+19) HTTPClientServer-12 76.9µs ± 3% 77.5µs ± 2% +0.81% (p=0.030 n=20+19) JSONEncode-12 15.0ms ± 0% 14.8ms ± 1% -0.88% (p=0.001 n=15+19) JSONDecode-12 50.6ms ± 0% 53.2ms ± 2% +5.07% (p=0.000 n=17+19) Mandelbrot200-12 4.05ms ± 0% 4.05ms ± 1% ~ (p=0.581 n=16+17) GoParse-12 3.34ms ± 1% 3.30ms ± 1% -1.21% (p=0.000 n=15+20) RegexpMatchEasy0_32-12 69.6ns ± 1% 69.8ns ± 2% ~ (p=0.566 n=19+19) RegexpMatchEasy0_1K-12 238ns ± 1% 236ns ± 0% -0.91% (p=0.000 n=17+13) RegexpMatchEasy1_32-12 69.8ns ± 1% 70.0ns ± 1% +0.23% (p=0.026 n=17+16) RegexpMatchEasy1_1K-12 371ns ± 1% 363ns ± 1% -2.07% (p=0.000 n=19+19) RegexpMatchMedium_32-12 107ns ± 2% 106ns ± 1% -0.51% (p=0.031 n=18+20) RegexpMatchMedium_1K-12 33.0µs ± 0% 32.9µs ± 0% -0.30% (p=0.004 n=16+16) RegexpMatchHard_32-12 1.70µs ± 0% 1.70µs ± 0% +0.45% (p=0.000 n=16+17) RegexpMatchHard_1K-12 51.1µs ± 2% 51.4µs ± 1% +0.53% (p=0.000 n=17+19) Revcomp-12 378ms ± 1% 385ms ± 1% +1.92% (p=0.000 n=19+18) Template-12 64.3ms ± 2% 65.0ms ± 2% +1.09% (p=0.001 n=19+19) TimeParse-12 315ns ± 1% 317ns ± 2% ~ (p=0.108 n=18+20) TimeFormat-12 360ns ± 1% 337ns ± 0% -6.30% (p=0.000 n=18+13) [Geo mean] 51.8µs 51.6µs -0.48% Change-Id: Icf8994671476840e3998236e15407a505d4c760c Reviewed-on: https://go-review.googlesource.com/20700 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-04-26 23:40:13 +00:00
Austin Clements	5b765ce310	runtime: don't clear gcscanvalid in casfrom_Gscanstatus Currently we clear gcscanvalid in both casgstatus and casfrom_Gscanstatus if the new status is _Grunning. This is very important to do in casgstatus. However, this is potentially wrong in casfrom_Gscanstatus because in this case the caller doesn't own gp and hence the write is racy. Unlike the other _Gscan statuses, during _Gscanrunning, the G is still running. This does not indicate that it's transitioning into a running state. The scan simply hasn't happened yet, so it's neither valid nor invalid. Conveniently, this also means clearing gcscanvalid is unnecessary in this case because the G was already in _Grunning, so we can simply remove this code. What will happen instead is that the G will be preempted to scan itself, that scan will set gcscanvalid to true, and then the G will return to _Grunning via casgstatus, clearing gcscanvalid. This fix will become necessary shortly when we start keeping track of the set of G's with dirty stacks, since it will no longer be idempotent to simply set gcscanvalid to false. Change-Id: I688c82e6fbf00d5dbbbff49efa66acb99ee86785 Reviewed-on: https://go-review.googlesource.com/20669 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-04-26 23:40:10 +00:00
Austin Clements	c707d83856	runtime: fix typos in comment about gcscanvalid Change-Id: Id4ad7ebf88a21eba2bc5714b96570ed5cfaed757 Reviewed-on: https://go-review.googlesource.com/22210 Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-04-26 23:40:07 +00:00
Austin Clements	1a2cf91f5e	runtime: split gfree list into with-stacks and without-stacks Currently all free Gs are added to one list. Split this into two lists: one for free Gs with cached stacks and one for Gs without cached stacks. This lets us preferentially allocate Gs that already have a stack, but more importantly, it sets us up to free cached G stacks concurrently. Change-Id: Idbe486f708997e1c9d166662995283f02d1eeb3c Reviewed-on: https://go-review.googlesource.com/20664 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-04-26 23:39:51 +00:00
Dmitry Vyukov	a3703618ea	runtime: use per-goroutine sequence numbers in tracer Currently tracer uses global sequencer and it introduces significant slowdown on parallel machines (up to 10x). Replace the global sequencer with per-goroutine sequencer. If we assign per-goroutine sequence numbers to only 3 types of events (start, unblock and syscall exit), it is enough to restore consistent partial ordering of all events. Even these events don't need sequence numbers all the time (if goroutine starts on the same P where it was unblocked, then start does not need sequence number). The burden of restoring the order is put on trace parser. Details of the algorithm are described in the comments. On http benchmark with GOMAXPROCS=48: no tracing: 5026 ns/op tracing: 27803 ns/op (+453%) with this change: 6369 ns/op (+26%, mostly for traceback) Also trace size is reduced by ~22%. Average event size before: 4.63 bytes/event, after: 3.62 bytes/event. Besides running trace tests, I've also tested with manually broken cputicks (random skew for each event, per-P skew and episodic random skew). In all cases broken timestamps were detected and no test failures. Change-Id: I078bde421ccc386a66f6c2051ab207bcd5613efa Reviewed-on: https://go-review.googlesource.com/21512 Run-TryBot: Dmitry Vyukov <dvyukov@google.com> Reviewed-by: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-04-23 15:57:05 +00:00
David Crawshaw	7d469179e6	cmd/compile, etc: store method tables as offsets This CL introduces the typeOff type and a lookup method of the same name that can turn a typeOff offset into an rtype. In a typical Go binary (built with buildmode=exe, pie, c-archive, or c-shared), there is one moduledata and all typeOff values are offsets relative to firstmoduledata.types. This makes computing the pointer cheap in typical programs. With buildmode=shared (and one day, buildmode=plugin) there are multiple modules whose relative offset is determined at runtime. We identify a type in the general case by the pair of the original rtype that references it and its typeOff value. We determine the module from the original pointer, and then use the typeOff from there to compute the final rtype. To ensure there is only one rtype representing each type, the runtime initializes a typemap for each module, using any identical type from an earlier module when resolving that offset. This means that types computed from an offset match the type mapped by the pointer dynamic relocations. A series of followup CLs will replace other rtype values with typeOff (and name/string with nameOff). For types created at runtime by reflect, type offsets are treated as global IDs and reference into a reflect offset map kept by the runtime. darwin/amd64: cmd/go: -57KB (0.6%) jujud: -557KB (0.8%) linux/amd64 PIE: cmd/go: -361KB (3.0%) jujud: -3.5MB (4.2%) For #6853. Change-Id: Icf096fd884a0a0cb9f280f46f7a26c70a9006c96 Reviewed-on: https://go-review.googlesource.com/21285 Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: David Crawshaw <crawshaw@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-04-13 13:03:11 +00:00
Emmanuel Odeke	e4f1d9cf2e	runtime: make execution error panic values implement the Error interface Make execution panics implement Error as mandated by https://golang.org/ref/spec#Run_time_panics, instead of panics with strings. Fixes #14965 Change-Id: I7827f898b9b9c08af541db922cc24fa0800ff18a Reviewed-on: https://go-review.googlesource.com/21214 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-04-10 01:16:30 +00:00
Michael Hudson-Doyle	31cf1c1779	runtime: clamp OS-reported number of processors to _MaxGomaxprocs So that all Go processes do not die on startup on a system with >256 CPUs. I tested this by hacking osinit to set ncpu to 1000. Updates #15131 Change-Id: I52e061a0de97be41d684dd8b748fa9087d6f1aef Reviewed-on: https://go-review.googlesource.com/21599 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2016-04-07 00:11:25 +00:00
Dmitry Vyukov	475d113b53	runtime: don't burn CPU unnecessarily Two GC-related functions, scang and casgstatus, wait in an active spin loop. Active spinning is never a good idea in user-space. Once we wait several times more than the expected wait time, something unexpected is happenning (e.g. the thread we are waiting for is descheduled or handling a page fault) and we need to yield to OS scheduler. Moreover, the expected wait time is very high for these functions: scang wait time can be tens of milliseconds, casgstatus can be hundreds of microseconds. It does not make sense to spin even for that time. go install -a std profile on a 4-core machine shows that 11% of time is spent in the active spin in scang: 6.12% compile compile [.] runtime.scang 3.27% compile compile [.] runtime.readgstatus 1.72% compile compile [.] runtime/internal/atomic.Load The active spin also increases tail latency in the case of the slightest oversubscription: GC goroutines spend whole quantum in the loop instead of executing user code. Here is scang wait time histogram during go install -a std: 13707.0000 - 1815442.7667 [ 118]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎... 1815442.7667 - 3617178.5333 [ 9]: ∎∎∎∎∎∎∎∎∎ 3617178.5333 - 5418914.3000 [ 11]: ∎∎∎∎∎∎∎∎∎∎∎ 5418914.3000 - 7220650.0667 [ 5]: ∎∎∎∎∎ 7220650.0667 - 9022385.8333 [ 12]: ∎∎∎∎∎∎∎∎∎∎∎∎ 9022385.8333 - 10824121.6000 [ 13]: ∎∎∎∎∎∎∎∎∎∎∎∎∎ 10824121.6000 - 12625857.3667 [ 15]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎ 12625857.3667 - 14427593.1333 [ 18]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎ 14427593.1333 - 16229328.9000 [ 18]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎ 16229328.9000 - 18031064.6667 [ 32]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎ 18031064.6667 - 19832800.4333 [ 28]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎ 19832800.4333 - 21634536.2000 [ 6]: ∎∎∎∎∎∎ 21634536.2000 - 23436271.9667 [ 15]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎ 23436271.9667 - 25238007.7333 [ 11]: ∎∎∎∎∎∎∎∎∎∎∎ 25238007.7333 - 27039743.5000 [ 27]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎ 27039743.5000 - 28841479.2667 [ 20]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎ 28841479.2667 - 30643215.0333 [ 10]: ∎∎∎∎∎∎∎∎∎∎ 30643215.0333 - 32444950.8000 [ 7]: ∎∎∎∎∎∎∎ 32444950.8000 - 34246686.5667 [ 4]: ∎∎∎∎ 34246686.5667 - 36048422.3333 [ 4]: ∎∎∎∎ 36048422.3333 - 37850158.1000 [ 1]: ∎ 37850158.1000 - 39651893.8667 [ 5]: ∎∎∎∎∎ 39651893.8667 - 41453629.6333 [ 2]: ∎∎ 41453629.6333 - 43255365.4000 [ 2]: ∎∎ 43255365.4000 - 45057101.1667 [ 2]: ∎∎ 45057101.1667 - 46858836.9333 [ 1]: ∎ 46858836.9333 - 48660572.7000 [ 2]: ∎∎ 48660572.7000 - 50462308.4667 [ 3]: ∎∎∎ 50462308.4667 - 52264044.2333 [ 2]: ∎∎ 52264044.2333 - 54065780.0000 [ 2]: ∎∎ and the zoomed-in first part: 13707.0000 - 19916.7667 [ 2]: ∎∎ 19916.7667 - 26126.5333 [ 2]: ∎∎ 26126.5333 - 32336.3000 [ 9]: ∎∎∎∎∎∎∎∎∎ 32336.3000 - 38546.0667 [ 8]: ∎∎∎∎∎∎∎∎ 38546.0667 - 44755.8333 [ 12]: ∎∎∎∎∎∎∎∎∎∎∎∎ 44755.8333 - 50965.6000 [ 10]: ∎∎∎∎∎∎∎∎∎∎ 50965.6000 - 57175.3667 [ 5]: ∎∎∎∎∎ 57175.3667 - 63385.1333 [ 6]: ∎∎∎∎∎∎ 63385.1333 - 69594.9000 [ 5]: ∎∎∎∎∎ 69594.9000 - 75804.6667 [ 6]: ∎∎∎∎∎∎ 75804.6667 - 82014.4333 [ 6]: ∎∎∎∎∎∎ 82014.4333 - 88224.2000 [ 4]: ∎∎∎∎ 88224.2000 - 94433.9667 [ 1]: ∎ 94433.9667 - 100643.7333 [ 1]: ∎ 100643.7333 - 106853.5000 [ 2]: ∎∎ 106853.5000 - 113063.2667 [ 0]: 113063.2667 - 119273.0333 [ 2]: ∎∎ 119273.0333 - 125482.8000 [ 2]: ∎∎ 125482.8000 - 131692.5667 [ 1]: ∎ 131692.5667 - 137902.3333 [ 1]: ∎ 137902.3333 - 144112.1000 [ 0]: 144112.1000 - 150321.8667 [ 2]: ∎∎ 150321.8667 - 156531.6333 [ 1]: ∎ 156531.6333 - 162741.4000 [ 1]: ∎ 162741.4000 - 168951.1667 [ 0]: 168951.1667 - 175160.9333 [ 0]: 175160.9333 - 181370.7000 [ 1]: ∎ 181370.7000 - 187580.4667 [ 1]: ∎ 187580.4667 - 193790.2333 [ 2]: ∎∎ 193790.2333 - 200000.0000 [ 0]: Here is casgstatus wait time histogram: 631.0000 - 5276.6333 [ 3]: ∎∎∎ 5276.6333 - 9922.2667 [ 5]: ∎∎∎∎∎ 9922.2667 - 14567.9000 [ 2]: ∎∎ 14567.9000 - 19213.5333 [ 6]: ∎∎∎∎∎∎ 19213.5333 - 23859.1667 [ 5]: ∎∎∎∎∎ 23859.1667 - 28504.8000 [ 6]: ∎∎∎∎∎∎ 28504.8000 - 33150.4333 [ 6]: ∎∎∎∎∎∎ 33150.4333 - 37796.0667 [ 2]: ∎∎ 37796.0667 - 42441.7000 [ 1]: ∎ 42441.7000 - 47087.3333 [ 3]: ∎∎∎ 47087.3333 - 51732.9667 [ 0]: 51732.9667 - 56378.6000 [ 1]: ∎ 56378.6000 - 61024.2333 [ 0]: 61024.2333 - 65669.8667 [ 0]: 65669.8667 - 70315.5000 [ 0]: 70315.5000 - 74961.1333 [ 1]: ∎ 74961.1333 - 79606.7667 [ 0]: 79606.7667 - 84252.4000 [ 0]: 84252.4000 - 88898.0333 [ 0]: 88898.0333 - 93543.6667 [ 0]: 93543.6667 - 98189.3000 [ 0]: 98189.3000 - 102834.9333 [ 0]: 102834.9333 - 107480.5667 [ 1]: ∎ 107480.5667 - 112126.2000 [ 0]: 112126.2000 - 116771.8333 [ 0]: 116771.8333 - 121417.4667 [ 0]: 121417.4667 - 126063.1000 [ 0]: 126063.1000 - 130708.7333 [ 0]: 130708.7333 - 135354.3667 [ 0]: 135354.3667 - 140000.0000 [ 1]: ∎ Ideally we eliminate the waiting by switching to async state machine for GC, but for now just yield to OS scheduler after a reasonable wait time. To choose yielding parameters I've measured golang.org/x/benchmarks/http tail latencies with different yield delays and oversubscription levels. With no oversubscription (to the degree possible): scang yield delay = 1, casgstatus yield delay = 1 Latency-50 1.41ms ±15% 1.41ms ± 5% ~ (p=0.611 n=13+12) Latency-95 5.21ms ± 2% 5.15ms ± 2% -1.15% (p=0.012 n=13+13) Latency-99 7.16ms ± 2% 7.05ms ± 2% -1.54% (p=0.002 n=13+13) Latency-999 10.7ms ± 9% 10.2ms ±10% -5.46% (p=0.004 n=12+13) scang yield delay = 5000, casgstatus yield delay = 3000 Latency-50 1.41ms ±15% 1.41ms ± 8% ~ (p=0.511 n=13+13) Latency-95 5.21ms ± 2% 5.14ms ± 2% -1.23% (p=0.006 n=13+13) Latency-99 7.16ms ± 2% 7.02ms ± 2% -1.94% (p=0.000 n=13+13) Latency-999 10.7ms ± 9% 10.1ms ± 8% -6.14% (p=0.000 n=12+13) scang yield delay = 10000, casgstatus yield delay = 5000 Latency-50 1.41ms ±15% 1.45ms ± 6% ~ (p=0.724 n=13+13) Latency-95 5.21ms ± 2% 5.18ms ± 1% ~ (p=0.287 n=13+13) Latency-99 7.16ms ± 2% 7.05ms ± 2% -1.64% (p=0.002 n=13+13) Latency-999 10.7ms ± 9% 10.0ms ± 5% -6.72% (p=0.000 n=12+13) scang yield delay = 30000, casgstatus yield delay = 10000 Latency-50 1.41ms ±15% 1.51ms ± 7% +6.57% (p=0.002 n=13+13) Latency-95 5.21ms ± 2% 5.21ms ± 2% ~ (p=0.960 n=13+13) Latency-99 7.16ms ± 2% 7.06ms ± 2% -1.50% (p=0.012 n=13+13) Latency-999 10.7ms ± 9% 10.0ms ± 6% -6.49% (p=0.000 n=12+13) scang yield delay = 100000, casgstatus yield delay = 50000 Latency-50 1.41ms ±15% 1.53ms ± 6% +8.48% (p=0.000 n=13+12) Latency-95 5.21ms ± 2% 5.23ms ± 2% ~ (p=0.287 n=13+13) Latency-99 7.16ms ± 2% 7.08ms ± 2% -1.21% (p=0.004 n=13+13) Latency-999 10.7ms ± 9% 9.9ms ± 3% -7.99% (p=0.000 n=12+12) scang yield delay = 200000, casgstatus yield delay = 100000 Latency-50 1.41ms ±15% 1.47ms ± 5% ~ (p=0.072 n=13+13) Latency-95 5.21ms ± 2% 5.17ms ± 2% ~ (p=0.091 n=13+13) Latency-99 7.16ms ± 2% 7.02ms ± 2% -1.99% (p=0.000 n=13+13) Latency-999 10.7ms ± 9% 9.9ms ± 5% -7.86% (p=0.000 n=12+13) With slight oversubscription (another instance of http benchmark was running in background with reduced GOMAXPROCS): scang yield delay = 1, casgstatus yield delay = 1 Latency-50 840µs ± 3% 804µs ± 3% -4.37% (p=0.000 n=15+18) Latency-95 6.52ms ± 4% 6.03ms ± 4% -7.51% (p=0.000 n=18+18) Latency-99 10.8ms ± 7% 10.0ms ± 4% -7.33% (p=0.000 n=18+14) Latency-999 18.0ms ± 9% 16.8ms ± 7% -6.84% (p=0.000 n=18+18) scang yield delay = 5000, casgstatus yield delay = 3000 Latency-50 840µs ± 3% 809µs ± 3% -3.71% (p=0.000 n=15+17) Latency-95 6.52ms ± 4% 6.11ms ± 4% -6.29% (p=0.000 n=18+18) Latency-99 10.8ms ± 7% 9.9ms ± 6% -7.55% (p=0.000 n=18+18) Latency-999 18.0ms ± 9% 16.5ms ±11% -8.49% (p=0.000 n=18+18) scang yield delay = 10000, casgstatus yield delay = 5000 Latency-50 840µs ± 3% 823µs ± 5% -2.06% (p=0.002 n=15+18) Latency-95 6.52ms ± 4% 6.32ms ± 3% -3.05% (p=0.000 n=18+18) Latency-99 10.8ms ± 7% 10.2ms ± 4% -5.22% (p=0.000 n=18+18) Latency-999 18.0ms ± 9% 16.7ms ±10% -7.09% (p=0.000 n=18+18) scang yield delay = 30000, casgstatus yield delay = 10000 Latency-50 840µs ± 3% 836µs ± 5% ~ (p=0.442 n=15+18) Latency-95 6.52ms ± 4% 6.39ms ± 3% -2.00% (p=0.000 n=18+18) Latency-99 10.8ms ± 7% 10.2ms ± 6% -5.15% (p=0.000 n=18+17) Latency-999 18.0ms ± 9% 16.6ms ± 8% -7.48% (p=0.000 n=18+18) scang yield delay = 100000, casgstatus yield delay = 50000 Latency-50 840µs ± 3% 836µs ± 6% ~ (p=0.401 n=15+18) Latency-95 6.52ms ± 4% 6.40ms ± 4% -1.79% (p=0.010 n=18+18) Latency-99 10.8ms ± 7% 10.2ms ± 5% -4.95% (p=0.000 n=18+18) Latency-999 18.0ms ± 9% 16.5ms ±14% -8.17% (p=0.000 n=18+18) scang yield delay = 200000, casgstatus yield delay = 100000 Latency-50 840µs ± 3% 828µs ± 2% -1.49% (p=0.001 n=15+17) Latency-95 6.52ms ± 4% 6.38ms ± 4% -2.04% (p=0.001 n=18+18) Latency-99 10.8ms ± 7% 10.2ms ± 4% -4.77% (p=0.000 n=18+18) Latency-999 18.0ms ± 9% 16.9ms ± 9% -6.23% (p=0.000 n=18+18) With significant oversubscription (background http benchmark was running with full GOMAXPROCS): scang yield delay = 1, casgstatus yield delay = 1 Latency-50 1.32ms ±12% 1.30ms ±13% ~ (p=0.454 n=14+14) Latency-95 16.3ms ±10% 15.3ms ± 7% -6.29% (p=0.001 n=14+14) Latency-99 29.4ms ±10% 27.9ms ± 5% -5.04% (p=0.001 n=14+12) Latency-999 49.9ms ±19% 45.9ms ± 5% -8.00% (p=0.008 n=14+13) scang yield delay = 5000, casgstatus yield delay = 3000 Latency-50 1.32ms ±12% 1.29ms ± 9% ~ (p=0.227 n=14+14) Latency-95 16.3ms ±10% 15.4ms ± 5% -5.27% (p=0.002 n=14+14) Latency-99 29.4ms ±10% 27.9ms ± 6% -5.16% (p=0.001 n=14+14) Latency-999 49.9ms ±19% 46.8ms ± 8% -6.21% (p=0.050 n=14+14) scang yield delay = 10000, casgstatus yield delay = 5000 Latency-50 1.32ms ±12% 1.35ms ± 9% ~ (p=0.401 n=14+14) Latency-95 16.3ms ±10% 15.0ms ± 4% -7.67% (p=0.000 n=14+14) Latency-99 29.4ms ±10% 27.4ms ± 5% -6.98% (p=0.000 n=14+14) Latency-999 49.9ms ±19% 44.7ms ± 5% -10.56% (p=0.000 n=14+11) scang yield delay = 30000, casgstatus yield delay = 10000 Latency-50 1.32ms ±12% 1.36ms ±10% ~ (p=0.246 n=14+14) Latency-95 16.3ms ±10% 14.9ms ± 5% -8.31% (p=0.000 n=14+14) Latency-99 29.4ms ±10% 27.4ms ± 7% -6.70% (p=0.000 n=14+14) Latency-999 49.9ms ±19% 44.9ms ±15% -10.13% (p=0.003 n=14+14) scang yield delay = 100000, casgstatus yield delay = 50000 Latency-50 1.32ms ±12% 1.41ms ± 9% +6.37% (p=0.008 n=14+13) Latency-95 16.3ms ±10% 15.1ms ± 8% -7.45% (p=0.000 n=14+14) Latency-99 29.4ms ±10% 27.5ms ±12% -6.67% (p=0.002 n=14+14) Latency-999 49.9ms ±19% 45.9ms ±16% -8.06% (p=0.019 n=14+14) scang yield delay = 200000, casgstatus yield delay = 100000 Latency-50 1.32ms ±12% 1.42ms ±10% +7.21% (p=0.003 n=14+14) Latency-95 16.3ms ±10% 15.0ms ± 7% -7.59% (p=0.000 n=14+14) Latency-99 29.4ms ±10% 27.3ms ± 8% -7.20% (p=0.000 n=14+14) Latency-999 49.9ms ±19% 44.8ms ± 8% -10.21% (p=0.001 n=14+13) All numbers are on 8 cores and with GOGC=10 (http benchmark has tiny heap, few goroutines and low allocation rate, so by default GC barely affects tail latency). 10us/5us yield delays seem to provide a reasonable compromise and give 5-10% tail latency reduction. That's what used in this change. go install -a std results on 4 core machine: name old time/op new time/op delta Time 8.39s ± 2% 7.94s ± 2% -5.34% (p=0.000 n=47+49) UserTime 24.6s ± 2% 22.9s ± 2% -6.76% (p=0.000 n=49+49) SysTime 1.77s ± 9% 1.89s ±11% +7.00% (p=0.000 n=49+49) CpuLoad 315ns ± 2% 313ns ± 1% -0.59% (p=0.000 n=49+48) # %CPU MaxRSS 97.1ms ± 4% 97.5ms ± 9% ~ (p=0.838 n=46+49) # bytes Update #14396 Update #14189 Change-Id: I3f4109bf8f7fd79b39c466576690a778232055a2 Reviewed-on: https://go-review.googlesource.com/21503 Run-TryBot: Dmitry Vyukov <dvyukov@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2016-04-05 15:52:03 +00:00
Dmitry Vyukov	3b246fa863	runtime: sleep less when we can do work Usleep(100) in runqgrab negatively affects latency and throughput of parallel application. We are sleeping instead of doing useful work. This is effect is particularly visible on windows where minimal sleep duration is 1-15ms. Reduce sleep from 100us to 3us and use osyield on windows. Sync chan send/recv takes ~50ns, so 3us gives us ~50x overshoot. benchmark old ns/op new ns/op delta BenchmarkChanSync-12 216 217 +0.46% BenchmarkChanSyncWork-12 27213 25816 -5.13% CPU consumption goes up from 106% to 108% in the first case, and from 107% to 125% in the second case. Test case from #14790 on windows: BenchmarkDefaultResolution-8 4583372 29720 -99.35% Benchmark1ms-8 992056 30701 -96.91% 99-th latency percentile for HTTP request serving is improved by up to 15% (see http://golang.org/cl/20835 for details). The following benchmarks are from the change that originally added this sleep (see https://golang.org/s/go15gomaxprocs): name old time/op new time/op delta Chain 22.6µs ± 2% 22.7µs ± 6% ~ (p=0.905 n=9+10) ChainBuf 22.4µs ± 3% 22.5µs ± 4% ~ (p=0.780 n=9+10) Chain-2 23.5µs ± 4% 24.9µs ± 1% +5.66% (p=0.000 n=10+9) ChainBuf-2 23.7µs ± 1% 24.4µs ± 1% +3.31% (p=0.000 n=9+10) Chain-4 24.2µs ± 2% 25.1µs ± 3% +3.70% (p=0.000 n=9+10) ChainBuf-4 24.4µs ± 5% 25.0µs ± 2% +2.37% (p=0.023 n=10+10) Powser 2.37s ± 1% 2.37s ± 1% ~ (p=0.423 n=8+9) Powser-2 2.48s ± 2% 2.57s ± 2% +3.74% (p=0.000 n=10+9) Powser-4 2.66s ± 1% 2.75s ± 1% +3.40% (p=0.000 n=10+10) Sieve 13.3s ± 2% 13.3s ± 2% ~ (p=1.000 n=10+9) Sieve-2 7.00s ± 2% 7.44s ±16% ~ (p=0.408 n=8+10) Sieve-4 4.13s ±21% 3.85s ±22% ~ (p=0.113 n=9+9) Fixes #14790 Change-Id: Ie7c6a1c4f9c8eb2f5d65ab127a3845386d6f8b5d Reviewed-on: https://go-review.googlesource.com/20835 Reviewed-by: Austin Clements <austin@google.com>	2016-04-05 15:32:06 +00:00
Ian Lance Taylor	59fc42b230	runtime: allocate mp.cgocallers earlier Fixes #15061. Change-Id: I71f69f398d1c5f3a884bbd044786f1a5600d0fae Reviewed-on: https://go-review.googlesource.com/21398 Run-TryBot: Ian Lance Taylor <iant@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-04-01 22:23:13 +00:00
Michel Lespinasse	7043d2bb5e	runtime: insert itabs into hash table during init See #14874 This change makes the runtime register all compiler generated itabs (as obtained from the moduledata) during init. Change-Id: I9969a0985b99b8bda820a631f7fe4c78f1174cdf Reviewed-on: https://go-review.googlesource.com/20900 Reviewed-by: Keith Randall <khr@golang.org> Run-TryBot: Michel Lespinasse <walken@google.com>	2016-03-29 02:14:49 +00:00
Dmitry Vyukov	ea0386f85f	runtime: improve randomized stealing logic During random stealing we steal 4*GOMAXPROCS times from random procs. One would expect that most of the time we check all procs this way, but due to low quality PRNG we actually miss procs with frightening probability. Below are modelling experiment results for 1e6 tries: GOMAXPROCS = 2 : missed 1 procs 7944 times GOMAXPROCS = 3 : missed 1 procs 101620 times GOMAXPROCS = 3 : missed 2 procs 3571 times GOMAXPROCS = 4 : missed 1 procs 63916 times GOMAXPROCS = 4 : missed 2 procs 61 times GOMAXPROCS = 4 : missed 3 procs 16 times GOMAXPROCS = 5 : missed 1 procs 133136 times GOMAXPROCS = 5 : missed 2 procs 1025 times GOMAXPROCS = 5 : missed 3 procs 101 times GOMAXPROCS = 5 : missed 4 procs 15 times GOMAXPROCS = 8 : missed 1 procs 151765 times GOMAXPROCS = 8 : missed 2 procs 5057 times GOMAXPROCS = 8 : missed 3 procs 1726 times GOMAXPROCS = 8 : missed 4 procs 68 times GOMAXPROCS = 12 : missed 1 procs 199081 times GOMAXPROCS = 12 : missed 2 procs 27489 times GOMAXPROCS = 12 : missed 3 procs 3113 times GOMAXPROCS = 12 : missed 4 procs 233 times GOMAXPROCS = 12 : missed 5 procs 9 times GOMAXPROCS = 16 : missed 1 procs 237477 times GOMAXPROCS = 16 : missed 2 procs 30037 times GOMAXPROCS = 16 : missed 3 procs 9466 times GOMAXPROCS = 16 : missed 4 procs 1334 times GOMAXPROCS = 16 : missed 5 procs 192 times GOMAXPROCS = 16 : missed 6 procs 5 times GOMAXPROCS = 16 : missed 7 procs 1 times GOMAXPROCS = 16 : missed 8 procs 1 times A missed proc won't lead to underutilization because we check all procs again after dropping P. But it can lead to an unpleasant situation when we miss a proc, drop P, check all procs, discover work, acquire P, miss the proc again, repeat. Improve stealing logic to cover all procs. Also don't enter spinning mode and try to steal when there is nobody around. Change-Id: Ibb6b122cc7fb836991bad7d0639b77c807aab4c2 Reviewed-on: https://go-review.googlesource.com/20836 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Dmitry Vyukov <dvyukov@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com> Reviewed-by: Marvin Stenger <marvin.stenger94@gmail.com>	2016-03-25 11:00:48 +00:00
Austin Clements	8fb182d020	runtime: never pass stack pointers to gopark gopark calls the unlock function after setting the G to _Gwaiting. This means it's generally unsafe to access the G's stack from the unlock function because the G may start running on another P. Once we start shrinking stacks concurrently, a stack shrink could also move the stack the moment after it enters _Gwaiting and before the unlock function is called. Document this restriction and fix the two places where we currently violate it. This is unlikely to be a problem in practice for these two places right now, but they're already skating on thin ice. For example, the following sequence could in principle cause corruption, deadlock, or a panic in the select code: On M1/P1: 1. G1 selects on channels A and B. 2. selectgoImpl calls gopark. 3. gopark puts G1 in _Gwaiting. 4. gopark calls selparkcommit. 5. selparkcommit releases the lock on channel A. On M2/P2: 6. G2 sends to channel A. 7. The send puts G1 in _Grunnable and puts it on P2's run queue. 8. The scheduler runs, selects G1, puts it in _Grunning, and resumes G1. 9. On G1, the sellock immediately following the gopark gets called. 10. sellock grows and moves the stack. On M1/P1: 11. selparkcommit continues to scan the lock order for the next channel to unlock, but it's now reading from a freed (and possibly reused) stack. This shouldn't happen in practice because step 10 isn't the first call to sellock, so the stack should already be big enough. However, once we start shrinking stacks concurrently, this reasoning won't work any more. For #12967. Change-Id: I3660c5be37e5be9f87433cb8141bdfdf37fadc4c Reviewed-on: https://go-review.googlesource.com/20038 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-03-16 20:13:10 +00:00
Austin Clements	e4a95b6343	runtime: record channel in sudog Given a G, there's currently no way to find the channel it's blocking on. We'll need this information to fix a (probably theoretical) bug in select and to implement concurrent stack shrinking, so record the channel in the sudog. For #12967. Change-Id: If8fb63a140f1d07175818824d08c0ebeec2bdf66 Reviewed-on: https://go-review.googlesource.com/20035 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-03-16 20:13:02 +00:00
Emmanuel Odeke	6dfcc336c5	runtime: move testSchedLocalQueue* to export_test Move functions testSchedLocalQueueLocal and testSchedLocalQueueSteal from proc.go to export_test.go, the only site that they are used. Fixes #14796 Change-Id: I16b6fa4a13835eab33f66a2c2e87a5f5c79b7bd3 Reviewed-on: https://go-review.googlesource.com/20640 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2016-03-13 00:34:58 +00:00
Brad Fitzpatrick	5fea2ccc77	all: single space after period. The tree's pretty inconsistent about single space vs double space after a period in documentation. Make it consistently a single space, per earlier decisions. This means contributors won't be confused by misleading precedence. This CL doesn't use go/doc to parse. It only addresses // comments. It was generated with: $ perl -i -npe 's,^(\s// .+[a-z]\.) +([A-Z]),$1 $2,' $(git grep -l -E '^\s//(.+\.) +([A-Z])') $ go test go/doc -update Change-Id: Iccdb99c37c797ef1f804a94b22ba5ee4b500c4f7 Reviewed-on: https://go-review.googlesource.com/20022 Reviewed-by: Rob Pike <r@golang.org> Reviewed-by: Dave Day <djd@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-03-02 00:13:47 +00:00
Dmitry Vyukov	bdc14698f8	runtime: unwire g/m in dropg always Currently dropg does not unwire locked g/m. This is unnecessary distiction between locked and non-locked g/m. We always restart goroutines with execute which re-wires g/m. First, this produces false sense that this distinction is necessary. Second, it can confuse some sanity and cross checks. For example, if we check that g/m are unwired before we wire them in execute, the check will fail for locked g/m. I've hit this while doing some race detector changes, When we deschedule a goroutine and run scheduler code, m.curg is generally nil, but not for locked ms. Remove the distinction. Change-Id: I3b87a28ff343baa1d564aab1f821b582a84dee07 Reviewed-on: https://go-review.googlesource.com/19950 Reviewed-by: Austin Clements <austin@google.com> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-02-26 15:45:45 +00:00
Austin Clements	cbe849fc38	runtime: eliminate unused _Genqueue state _Genqueue and _Gscanenqueue were introduced as part of the GC quiesce code. The quiesce code was removed by `197aa9e`, but these states and some associated code stuck around. Remove them. Change-Id: I69df81881602d4a431556513dac2959668d27c20 Reviewed-on: https://go-review.googlesource.com/19638 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-02-25 23:37:32 +00:00
Martin Möhrmann	fdd0179bb1	all: fix typos and spelling Change-Id: Icd06d99c42b8299fd931c7da821e1f418684d913 Reviewed-on: https://go-review.googlesource.com/19829 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-02-24 18:42:29 +00:00
Shenghou Ma	d70c04cf08	runtime: fix missing word in comment Change-Id: I6cb8ac7b59812e82111ab3b0f8303ab8194a5129 Reviewed-on: https://go-review.googlesource.com/19791 Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-02-21 22:40:25 +00:00
Austin Clements	f309bf3eef	runtime: start an M when handing off a P when there's GC work Currently it's possible for the scheduler to deadlock with the right confluence of locked Gs, assists, and scheduling of background mark workers. Broadly, this happens because handoffp is stricter than findrunnable, and if the only work for a P is GC work, handoffp will put the P into idle, rather than starting an M to execute that P. One way this can happen is as follows: 0. There is only one user G, which we'll call G 1. There is more than one P, but they're all idle except the one running G 1. 1. G 1 locks itself to an M using runtime.LockOSThread. 2. GC starts up and enters mark 1. 3. G 1 performs a GC assist, which completes mark 1 without being fully satisfied. Completing mark 1 causes all background mark workers to park. And since the assist isn't fully satisfied, it parks as well, waiting for a background mark worker to satisfy its remaining assist debt. 4. The assist park enters the scheduler. Since G 1 is locked to the M, the scheduler releases the P and calls handoffp to hand the P to another M. 5. handoffp checks the local and global run queues, which are empty, and sees that there are idle Ps, so rather than start an M, it puts the P into idle. At this point, all of the Gs are waiting and all of the Ps are idle. In particular, none of the GC workers are running, so no mark work gets done and the assist on the main G is never satisfied, so the whole process soft locks up. Fix this by making handoffp start an M if there is GC work. This reintroduces a key invariant: that in any situation where findrunnable would return a G to run on a P, handoffp for that P will start an M to run work on that P. Fixes #13645. Tested by running 2,689 iterations of `go tool dist test -no-rebuild runtime:cpu124` across 10 linux-amd64-noopt VMs with no failures. Without this change, the failure rate was somewhere around 1%. Performance change is negligible. name old time/op new time/op delta XBenchGarbage-12 2.48ms ± 2% 2.48ms ± 1% -0.24% (p=0.000 n=92+93) name old time/op new time/op delta BinaryTree17-12 2.86s ± 2% 2.87s ± 2% ~ (p=0.667 n=19+20) Fannkuch11-12 2.52s ± 1% 2.47s ± 1% -2.05% (p=0.000 n=18+20) FmtFprintfEmpty-12 51.7ns ± 1% 51.5ns ± 3% ~ (p=0.931 n=16+20) FmtFprintfString-12 170ns ± 1% 168ns ± 1% -0.65% (p=0.000 n=19+19) FmtFprintfInt-12 160ns ± 0% 160ns ± 0% +0.18% (p=0.033 n=17+19) FmtFprintfIntInt-12 265ns ± 1% 273ns ± 1% +2.98% (p=0.000 n=17+19) FmtFprintfPrefixedInt-12 235ns ± 1% 239ns ± 1% +1.99% (p=0.000 n=16+19) FmtFprintfFloat-12 315ns ± 0% 315ns ± 1% ~ (p=0.250 n=17+19) FmtManyArgs-12 1.04µs ± 1% 1.05µs ± 0% +0.87% (p=0.000 n=17+19) GobDecode-12 7.93ms ± 0% 7.85ms ± 1% -1.03% (p=0.000 n=16+18) GobEncode-12 6.62ms ± 1% 6.58ms ± 1% -0.60% (p=0.000 n=18+19) Gzip-12 322ms ± 1% 320ms ± 1% -0.46% (p=0.009 n=20+20) Gunzip-12 42.5ms ± 1% 42.5ms ± 0% ~ (p=0.751 n=19+19) HTTPClientServer-12 69.7µs ± 1% 70.0µs ± 2% ~ (p=0.056 n=19+19) JSONEncode-12 16.9ms ± 1% 16.7ms ± 1% -1.13% (p=0.000 n=19+19) JSONDecode-12 61.5ms ± 1% 61.3ms ± 1% -0.35% (p=0.001 n=20+17) Mandelbrot200-12 3.94ms ± 0% 3.91ms ± 0% -0.67% (p=0.000 n=20+18) GoParse-12 3.71ms ± 1% 3.70ms ± 1% ~ (p=0.244 n=17+19) RegexpMatchEasy0_32-12 101ns ± 1% 102ns ± 2% +0.54% (p=0.037 n=19+20) RegexpMatchEasy0_1K-12 349ns ± 0% 350ns ± 0% +0.33% (p=0.000 n=17+18) RegexpMatchEasy1_32-12 84.5ns ± 2% 84.2ns ± 1% -0.43% (p=0.048 n=19+20) RegexpMatchEasy1_1K-12 510ns ± 1% 513ns ± 2% +0.58% (p=0.002 n=18+20) RegexpMatchMedium_32-12 132ns ± 1% 134ns ± 1% +0.95% (p=0.000 n=20+20) RegexpMatchMedium_1K-12 40.1µs ± 1% 39.6µs ± 1% -1.39% (p=0.000 n=20+20) RegexpMatchHard_32-12 2.08µs ± 0% 2.06µs ± 1% -0.95% (p=0.000 n=18+18) RegexpMatchHard_1K-12 62.2µs ± 1% 61.9µs ± 1% -0.42% (p=0.001 n=19+20) Revcomp-12 537ms ± 0% 536ms ± 0% ~ (p=0.076 n=20+20) Template-12 71.3ms ± 1% 69.3ms ± 1% -2.75% (p=0.000 n=20+20) TimeParse-12 361ns ± 0% 360ns ± 1% ~ (p=0.056 n=19+19) TimeFormat-12 353ns ± 0% 352ns ± 0% -0.23% (p=0.000 n=17+18) [Geo mean] 62.6µs 62.5µs -0.17% Change-Id: I0fbbbe4d7d99653ba5600ffb4394fa03558bc4e9 Reviewed-on: https://go-review.googlesource.com/19107 Reviewed-by: Rick Hudson <rlh@golang.org> Reviewed-by: Russ Cox <rsc@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-02-02 02:11:28 +00:00
Austin Clements	09940b92a0	runtime: make p.gcBgMarkWorker a guintptr Currently p.gcBgMarkWorker is a *g. Change it to a guintptr. This eliminates a write barrier during the subtle mark worker parking dance (which isn't known to be causing problems, but may). Change-Id: Ibf12c05ac910820448059e69a68e5b882c993ed8 Reviewed-on: https://go-review.googlesource.com/18970 Run-TryBot: Austin Clements <austin@google.com> Reviewed-by: Rick Hudson <rlh@golang.org> Reviewed-by: Russ Cox <rsc@golang.org>	2016-01-27 02:23:09 +00:00
Austin Clements	eb3b1830b0	runtime: attach mark workers to P after they park Currently mark workers attach to their designated Ps before parking, either during initialization or after performing a phase transition. However, in both of these cases, it's possible that the mark worker is running on a different P than the one it attaches to. This is a problem, because as soon as the worker attaches to a P, that P's scheduler can execute the worker. If the worker hasn't yet parked on the P it's actually running on, this means the worker G will be running in two places at once. The most visible consequence of this is that once the first instance of the worker does park, it will clear g.m and the second instance will crash shortly when it tries to use g.m. Fix this by moving the attach to the gopark callback. At this point, the G is genuinely stopped and the callback is running on the system stack, so it's safe for another P's scheduler to pick up the worker G. Fixes #13363. Fixes #13978. Change-Id: If2f7c4a4174f9511f6227e14a27c56fb842d1cc8 Reviewed-on: https://go-review.googlesource.com/18761 Reviewed-by: Rick Hudson <rlh@golang.org> Reviewed-by: Russ Cox <rsc@golang.org> Run-TryBot: Austin Clements <austin@google.com>	2016-01-27 02:13:53 +00:00
Ian Lance Taylor	efd93a412e	runtime: minimize time between lockextra/unlockextra This doesn't fix a bug, but may improve performance in programs that have many concurrent calls from C to Go. The old code made several system calls between lockextra and unlockextra. That could be happening while another thread is spinning acquiring lockextra. This changes the code to not make any system calls while holding the lock. Change-Id: I50576478e478670c3d6429ad4e1b7d80f98a19d8 Reviewed-on: https://go-review.googlesource.com/18548 Reviewed-by: Russ Cox <rsc@golang.org>	2016-01-14 05:55:43 +00:00
Russ Cox	fac8202c3f	runtime: make NumGoroutine and Stack agree not to include system goroutines [Repeat of CL 18343 with build fixes.] Before, NumGoroutine counted system goroutines and Stack (usually) didn't show them, which was inconsistent and confusing. To resolve which way they should be consistent, it seems like package main import "runtime" func main() { println(runtime.NumGoroutine()) } should print 1 regardless of internal runtime details. Make it so. Fixes #11706. Change-Id: If26749fec06aa0ff84311f7941b88d140552e81d Reviewed-on: https://go-review.googlesource.com/18432 Reviewed-by: Austin Clements <austin@google.com> Run-TryBot: Russ Cox <rsc@golang.org>	2016-01-13 01:46:01 +00:00
Ian Lance Taylor	c02aa463db	runtime: fix arm/arm64/ppc64/mips64 to dropm when necessary Fixes #13881. Change-Id: Idff77db381640184ddd2b65022133bb226168800 Reviewed-on: https://go-review.googlesource.com/18449 Reviewed-by: David Crawshaw <crawshaw@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org>	2016-01-11 18:46:54 +00:00
Ian Lance Taylor	21b4f234c7	runtime: for c-archive/c-shared, install signal handlers synchronously The previous behaviour of installing the signal handlers in a separate thread meant that Go initialization raced with non-Go initialization if the non-Go initialization also wanted to install signal handlers. Make installing signal handlers synchronous so that the process-wide behavior is predictable. Update #9896. Change-Id: Ice24299877ec46f8518b072a381932d273096a32 Reviewed-on: https://go-review.googlesource.com/18150 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Crawshaw <crawshaw@golang.org>	2016-01-09 00:58:38 +00:00
Russ Cox	6da608206c	Revert "runtime: make NumGoroutine and Stack agree not to include system goroutines" This reverts commit `c5bafc8281`. Change-Id: Ie7030c978c6263b9e996d5aa0e490086796df26d Reviewed-on: https://go-review.googlesource.com/18431 Reviewed-by: Russ Cox <rsc@golang.org>	2016-01-08 15:31:15 +00:00
Russ Cox	c5bafc8281	runtime: make NumGoroutine and Stack agree not to include system goroutines Before, NumGoroutine counted system goroutines and Stack (usually) didn't show them, which was inconsistent and confusing. To resolve which way they should be consistent, it seems like package main import "runtime" func main() { println(runtime.NumGoroutine()) } should print 1 regardless of internal runtime details. Make it so. Fixes #11706. Change-Id: I6bfe26a901de517728192cfb26a5568c4ef4fe47 Reviewed-on: https://go-review.googlesource.com/18343 Reviewed-by: Austin Clements <austin@google.com>	2016-01-08 15:25:00 +00:00
Austin Clements	3f22adecc7	runtime: fix sigprof stack barrier locking `f90b48e` intended to require the stack barrier lock in all cases of sigprof that walked the user stack, but got it wrong. In particular, if sp < gp.stack.lo \|\| gp.stack.hi < sp, tracebackUser would be true, but we wouldn't acquire the stack lock. If it then turned out that we were in a cgo call, it would walk the stack without the lock. In fact, the whole structure of stack locking is sigprof is somewhat wrong because it assumes the G to lock is gp.m.curg, but all three gentraceback calls start from potentially different Gs. To fix this, we lower the gcTryLockStackBarriers calls much closer to the gentraceback calls. There are now three separate trylock calls, each clearly associated with a gentraceback and the locked G clearly matches the G from which the gentraceback starts. This actually brings the sigprof logic closer to what it originally was before stack barrier locking. This depends on "runtime: increase assumed stack size in externalthreadhandler" because it very slightly increases the stack used by sigprof; without this other commit, this is enough to blow the profiler thread's assumed stack size. Fixes #12528 (hopefully for real this time!). For the 1.5 branch, though it will require some backporting. On the 1.5 branch, this will not require the "runtime: increase assumed stack size in externalthreadhandler" commit: there's no pcvalue cache, so the used stack is smaller. Change-Id: Id2f6446ac276848f6fc158bee550cccd03186b83 Reviewed-on: https://go-review.googlesource.com/18328 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Russ Cox <rsc@golang.org>	2016-01-07 19:40:38 +00:00
Austin Clements	b50b24837d	runtime: don't ignore success of cgo profiling tracebacks If a sigprof happens during a cgo call, we traceback from the entry point of the cgo call. However, if the SP is outside of the G's stack, we'll then ignore this traceback, even if it was successful, and overwrite it with just _ExternalCode. Fix this by accepting any successful traceback, regardless of whether we got it from a cgo entry point or from regular Go code. Fixes #13466. Change-Id: I5da9684361fc5964f44985d74a8cdf02ffefd213 Reviewed-on: https://go-review.googlesource.com/18327 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Russ Cox <rsc@golang.org>	2016-01-07 19:40:26 +00:00
Ian Lance Taylor	70c9a8187a	runtime: set new m signal mask to program startup mask We were setting the signal mask of a new m to the signal mask of the m that created it. That failed when that m happened to be the one created by ensureSigM, which sets its signal mask to only include the signals being caught by os/signal.Notify. Fixes #13164. Update #9896. Change-Id: I705c196fe9d11754e10bab9e9b2e7530ecdfa367 Reviewed-on: https://go-review.googlesource.com/18064 Reviewed-by: Russ Cox <rsc@golang.org>	2016-01-06 23:14:46 +00:00
Ian Lance Taylor	934e055f41	runtime: call msanwrite on object passed to runtime/cgo Avoids an msan error when runtime/cgo is explicitly rebuilt with -fsanitize=memory. Fixes #13815. Change-Id: I70308034011fb308b63585bcd40b0d1e62ec93ef Reviewed-on: https://go-review.googlesource.com/18263 Reviewed-by: Russ Cox <rsc@golang.org>	2016-01-06 04:04:42 +00:00
Austin Clements	f90b48e0d3	runtime: require the stack barrier lock to traceback cgo and libcalls Currently, if sigprof determines that the G is in user code (not cgo or libcall code), it will only traceback the G stack if it can acquire the stack barrier lock. However, it has no such restriction if the G is in cgo or libcall code. Because cgo calls count as syscalls, stack scanning and stack barrier installation can occur during a cgo call, which means sigprof could attempt to traceback a G in a cgo call while scanstack is installing stack barriers in that G's stack. As a result, the following sequence of events can cause the sigprof traceback to panic with "missed stack barrier": 1. M1: G1 performs a Cgo call (which, on Windows, is any system call, which could explain why this is easier to reproduce on Windows). 2. M1: The Cgo call puts G1 into _Gsyscall state. 3. M2: GC starts a scan of G1's stack. It puts G1 in to _Gscansyscall and acquires the stack barrier lock. 4. M3: A profiling signal comes in. On Windows this is a global (though I don't think this matters), so the runtime stops M1 and calls sigprof for G1. 5. M3: sigprof fails to acquire the stack barrier lock (because the GC's stack scan holds it). 6. M3: sigprof observes that G1 is in a Cgo call, so it calls gentraceback on G1 with its Cgo transition point. 7. M3: gentraceback on G1 grabs the currently empty g.stkbar slice. 8. M2: GC finishes scanning G1's stack and installing stack barriers. 9. M3: gentraceback encounters one of the just-installed stack barriers and panics. This commit fixes this by only allowing cgo tracebacks if sigprof can acquire the stack barrier lock, just like in the regular user traceback case. For good measure, we put the same constraint on libcall tracebacks. This case is probably already safe because, unlike cgo calls, libcalls leave the G in _Grunning and prevent reaching a safe point, so scanstack cannot run during a libcall. However, this also means that sigprof will always acquire the stack barrier lock without contention, so there's no cost to adding this constraint to libcall tracebacks. Fixes #12528. For 1.5.3 (will require some backporting). Change-Id: Ia5a4b8e3d66b23b02ffcd54c6315c81055c0cec2 Reviewed-on: https://go-review.googlesource.com/18023 Run-TryBot: Austin Clements <austin@google.com> Reviewed-by: Russ Cox <rsc@golang.org>	2015-12-18 17:08:39 +00:00
Shenghou Ma	86f1944e86	cmd/dist, runtime: make runtime version available as runtime.buildVersion So that there is a uniformed way to retrieve Go version from a Go binary, starting from Go 1.4 (see https://golang.org/cl/117040043) Updates #13507. Change-Id: Iaa2b14fca2d8c4d883d3824e2efc82b3e6fe2624 Reviewed-on: https://go-review.googlesource.com/17459 Reviewed-by: Keith Randall <khr@golang.org>	2015-12-16 05:42:40 +00:00
Austin Clements	01baf13ba5	runtime: only trigger forced GC if GC is not running Currently, sysmon triggers a forced GC solely based on memstats.last_gc. However, memstats.last_gc isn't updated until mark termination, so once sysmon starts triggering forced GC, it will keep triggering them until GC finishes. The first of these actually starts a GC; the remainder up to the last print "GC forced", but gcStart returns immediately because gcphase != _GCoff; then the last may start another GC if the previous GC finishes (and sets last_gc) between sysmon triggering it and gcStart checking the GC phase. Fix this by expanding the condition for starting a forced GC to also require that no GC is currently running. This, combined with the way forcegchelper blocks until the GC cycle is started, ensures sysmon only starts one GC when the time exceeds the forced GC threshold. Fixes #13458. Change-Id: Ie6cf841927f6085136be3f45259956cd5cf10d23 Reviewed-on: https://go-review.googlesource.com/17819 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Russ Cox <rsc@golang.org>	2015-12-15 20:13:19 +00:00

1 2 3

112 Commits