qbit/go

mirror of https://github.com/golang/go synced 2024-11-19 14:54:43 -07:00

Author	SHA1	Message	Date
Austin Clements	22000f5407	runtime: record swept and reclaimed bytes in sweep trace This extends the GCSweepDone event with counts of swept and reclaimed bytes. These are useful for understanding the duration and effectiveness of sweep events. Change-Id: I3c97a4f0f3aad3adbd188adb264859775f54e2df Reviewed-on: https://go-review.googlesource.com/40811 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>	2017-04-19 18:31:14 +00:00
Austin Clements	79c56addb6	runtime: make sweep trace events encompass entire sweep loop Currently, each individual span sweep emits a span to the trace. But sweeps are generally done in loops until some condition is satisfied, so this tracing is lower-level than anyone really wants and hides the fact that no other work is being accomplished between adjacent sweep events. This is also high overhead: enabling tracing significantly impacts sweep latency. Replace this with instead tracing around the sweep loops used for allocation. This is slightly tricky because sweep loops don't generally know if any sweeping will happen in them. Hence, we make the tracing lazy by recording in the P that we would like to start tracing the sweep if one happens, and then only closing the sweep event if we started it. This does mean we don't get tracing on every sweep path, which are legion. However, we get much more informative tracing on the paths that block allocation, which are the paths that matter. Change-Id: I73e14fbb250acb0c9d92e3648bddaa5e7d7e271c Reviewed-on: https://go-review.googlesource.com/40810 Run-TryBot: Austin Clements <austin@google.com> Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-04-19 18:31:11 +00:00
David Lazar	3249cb0ab4	runtime/trace: iterate over frames instead of PCs Now the runtime/trace tests pass with -l=4. This also gets rid of the frames cache for multiple reasons: 1) The frames cache was used to avoid repeated calls to funcname and funcline. Now these calls happen inside the CallersFrames iterator. 2) Maintaining a frames cache is harder: map[uintptr]traceFrame doesn't work since each PC can map to multiple traceFrames. 3) It's not clear that the cache is important. Change-Id: I2914ac0b3ba08e39b60149d99a98f9f532b35bbb Reviewed-on: https://go-review.googlesource.com/40591 Run-TryBot: David Lazar <lazard@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2017-04-14 12:21:21 +00:00
Daniel Martí	77b09b8b8d	runtime: remove unused g parameter Found by github.com/mvdan/unparam. Change-Id: I20145440ff1bcd27fcf15a740354c52f313e536c Reviewed-on: https://go-review.googlesource.com/37894 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2017-03-16 14:03:45 +00:00
Austin Clements	0efc8b2188	runtime: avoid repeated findmoduledatap calls Currently almost every function that deals with a _func has to first look up the moduledata for the module containing the function's entry point. This means we almost always do at least two identical module lookups whenever we deal with a _func (one to get the _func and another to get something from its module data) and sometimes several more. Fix this by making findfunc return a new funcInfo type that embeds _func, but also includes the moduledata, and making all of the functions that currently take a _func instead take a funcInfo and use the already-found moduledata. This transformation is trivial for the most part, since the *_func type is usually inferred. The annoying part is that we can no longer use nil to indicate failure, so this introduces a funcInfo.valid() method and replaces nil checks with calls to valid. Change-Id: I9b8075ef1c31185c1943596d96dec45c7ab5100f Reviewed-on: https://go-review.googlesource.com/37331 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Michael Hudson-Doyle <michael.hudson@canonical.com>	2017-03-06 19:17:24 +00:00
Dmitry Vyukov	0556e26273	sync: make Mutex more fair Add new starvation mode for Mutex. In starvation mode ownership is directly handed off from unlocking goroutine to the next waiter. New arriving goroutines don't compete for ownership. Unfair wait time is now limited to 1ms. Also fix a long-standing bug that goroutines were requeued at the tail of the wait queue. That led to even more unfair acquisition times with multiple waiters. Performance of normal mode is not considerably affected. Fixes #13086 On the lockskew program provided in the issue: done in 1.207853ms done in 1.177451ms done in 1.184168ms done in 1.198633ms done in 1.185797ms done in 1.182502ms done in 1.316485ms done in 1.211611ms done in 1.182418ms name old time/op new time/op delta MutexUncontended-48 0.65ns ± 0% 0.65ns ± 1% ~ (p=0.087 n=10+10) Mutex-48 112ns ± 1% 114ns ± 1% +1.69% (p=0.000 n=10+10) MutexSlack-48 113ns ± 0% 87ns ± 1% -22.65% (p=0.000 n=8+10) MutexWork-48 149ns ± 0% 145ns ± 0% -2.48% (p=0.000 n=9+10) MutexWorkSlack-48 149ns ± 0% 122ns ± 3% -18.26% (p=0.000 n=6+10) MutexNoSpin-48 103ns ± 4% 105ns ± 3% ~ (p=0.089 n=10+10) MutexSpin-48 490ns ± 4% 515ns ± 6% +5.08% (p=0.006 n=10+10) Cond32-48 13.4µs ± 6% 13.1µs ± 5% -2.75% (p=0.023 n=10+10) RWMutexWrite100-48 53.2ns ± 3% 41.2ns ± 3% -22.57% (p=0.000 n=10+10) RWMutexWrite10-48 45.9ns ± 2% 43.9ns ± 2% -4.38% (p=0.000 n=10+10) RWMutexWorkWrite100-48 122ns ± 2% 134ns ± 1% +9.92% (p=0.000 n=10+10) RWMutexWorkWrite10-48 206ns ± 1% 188ns ± 1% -8.52% (p=0.000 n=8+10) Cond32-24 12.1µs ± 3% 12.4µs ± 3% +1.98% (p=0.043 n=10+9) MutexUncontended-24 0.74ns ± 1% 0.75ns ± 1% ~ (p=0.650 n=10+10) Mutex-24 122ns ± 2% 124ns ± 1% +1.31% (p=0.007 n=10+10) MutexSlack-24 96.9ns ± 2% 102.8ns ± 2% +6.11% (p=0.000 n=10+10) MutexWork-24 146ns ± 1% 135ns ± 2% -7.70% (p=0.000 n=10+9) MutexWorkSlack-24 135ns ± 1% 128ns ± 2% -5.01% (p=0.000 n=10+9) MutexNoSpin-24 114ns ± 3% 110ns ± 4% -3.84% (p=0.000 n=10+10) MutexSpin-24 482ns ± 4% 475ns ± 8% ~ (p=0.286 n=10+10) RWMutexWrite100-24 43.0ns ± 3% 43.1ns ± 2% ~ (p=0.956 n=10+10) RWMutexWrite10-24 43.4ns ± 1% 43.2ns ± 1% ~ (p=0.085 n=10+9) RWMutexWorkWrite100-24 130ns ± 3% 131ns ± 3% ~ (p=0.747 n=10+10) RWMutexWorkWrite10-24 191ns ± 1% 192ns ± 1% ~ (p=0.210 n=10+10) Cond32-12 11.5µs ± 2% 11.7µs ± 2% +1.98% (p=0.002 n=10+10) MutexUncontended-12 1.48ns ± 0% 1.50ns ± 1% +1.08% (p=0.004 n=10+10) Mutex-12 141ns ± 1% 143ns ± 1% +1.63% (p=0.000 n=10+10) MutexSlack-12 121ns ± 0% 119ns ± 0% -1.65% (p=0.001 n=8+9) MutexWork-12 141ns ± 2% 150ns ± 3% +6.36% (p=0.000 n=9+10) MutexWorkSlack-12 131ns ± 0% 138ns ± 0% +5.73% (p=0.000 n=9+10) MutexNoSpin-12 87.0ns ± 1% 83.7ns ± 1% -3.80% (p=0.000 n=10+10) MutexSpin-12 364ns ± 1% 377ns ± 1% +3.77% (p=0.000 n=10+10) RWMutexWrite100-12 42.8ns ± 1% 43.9ns ± 1% +2.41% (p=0.000 n=8+10) RWMutexWrite10-12 39.8ns ± 4% 39.3ns ± 1% ~ (p=0.433 n=10+9) RWMutexWorkWrite100-12 131ns ± 1% 131ns ± 0% ~ (p=0.591 n=10+9) RWMutexWorkWrite10-12 173ns ± 1% 174ns ± 0% ~ (p=0.059 n=10+8) Cond32-6 10.9µs ± 2% 10.9µs ± 2% ~ (p=0.739 n=10+10) MutexUncontended-6 2.97ns ± 0% 2.97ns ± 0% ~ (all samples are equal) Mutex-6 122ns ± 6% 122ns ± 2% ~ (p=0.668 n=10+10) MutexSlack-6 149ns ± 3% 142ns ± 3% -4.63% (p=0.000 n=10+10) MutexWork-6 136ns ± 3% 140ns ± 5% ~ (p=0.077 n=10+10) MutexWorkSlack-6 152ns ± 0% 138ns ± 2% -9.21% (p=0.000 n=6+10) MutexNoSpin-6 150ns ± 1% 152ns ± 0% +1.50% (p=0.000 n=8+10) MutexSpin-6 726ns ± 0% 730ns ± 1% ~ (p=0.069 n=10+10) RWMutexWrite100-6 40.6ns ± 1% 40.9ns ± 1% +0.91% (p=0.001 n=8+10) RWMutexWrite10-6 37.1ns ± 0% 37.0ns ± 1% ~ (p=0.386 n=9+10) RWMutexWorkWrite100-6 133ns ± 1% 134ns ± 1% +1.01% (p=0.005 n=9+10) RWMutexWorkWrite10-6 152ns ± 0% 152ns ± 0% ~ (all samples are equal) Cond32-2 7.86µs ± 2% 7.95µs ± 2% +1.10% (p=0.023 n=10+10) MutexUncontended-2 8.10ns ± 0% 9.11ns ± 4% +12.44% (p=0.000 n=9+10) Mutex-2 32.9ns ± 9% 38.4ns ± 6% +16.58% (p=0.000 n=10+10) MutexSlack-2 93.4ns ± 1% 98.5ns ± 2% +5.39% (p=0.000 n=10+9) MutexWork-2 40.8ns ± 3% 43.8ns ± 7% +7.38% (p=0.000 n=10+9) MutexWorkSlack-2 98.6ns ± 5% 108.2ns ± 2% +9.80% (p=0.000 n=10+8) MutexNoSpin-2 399ns ± 1% 398ns ± 2% ~ (p=0.463 n=8+9) MutexSpin-2 1.99µs ± 3% 1.97µs ± 1% -0.81% (p=0.003 n=9+8) RWMutexWrite100-2 37.6ns ± 5% 46.0ns ± 4% +22.17% (p=0.000 n=10+8) RWMutexWrite10-2 50.1ns ± 6% 36.8ns ±12% -26.46% (p=0.000 n=9+10) RWMutexWorkWrite100-2 136ns ± 0% 134ns ± 2% -1.80% (p=0.001 n=7+9) RWMutexWorkWrite10-2 140ns ± 1% 138ns ± 1% -1.50% (p=0.000 n=10+10) Cond32 5.93µs ± 1% 5.91µs ± 0% ~ (p=0.411 n=9+10) MutexUncontended 15.9ns ± 0% 15.8ns ± 0% -0.63% (p=0.000 n=8+8) Mutex 15.9ns ± 0% 15.8ns ± 0% -0.44% (p=0.003 n=10+10) MutexSlack 26.9ns ± 3% 26.7ns ± 2% ~ (p=0.084 n=10+10) MutexWork 47.8ns ± 0% 47.9ns ± 0% +0.21% (p=0.014 n=9+8) MutexWorkSlack 54.9ns ± 3% 54.5ns ± 3% ~ (p=0.254 n=10+10) MutexNoSpin 786ns ± 2% 765ns ± 1% -2.66% (p=0.000 n=10+10) MutexSpin 3.87µs ± 1% 3.83µs ± 0% -0.85% (p=0.005 n=9+8) RWMutexWrite100 21.2ns ± 2% 21.0ns ± 1% -0.88% (p=0.018 n=10+9) RWMutexWrite10 22.6ns ± 1% 22.6ns ± 0% ~ (p=0.471 n=9+9) RWMutexWorkWrite100 132ns ± 0% 132ns ± 0% ~ (all samples are equal) RWMutexWorkWrite10 124ns ± 0% 123ns ± 0% ~ (p=0.656 n=10+10) Change-Id: I66412a3a0980df1233ad7a5a0cd9723b4274528b Reviewed-on: https://go-review.googlesource.com/34310 Run-TryBot: Russ Cox <rsc@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Russ Cox <rsc@golang.org>	2017-02-17 17:24:59 +00:00
Austin Clements	d089a6c718	runtime: remove stack barriers Now that we don't rescan stacks, stack barriers are unnecessary. This removes all of the code and structures supporting them as well as tests that were specifically for stack barriers. Updates #17503. Change-Id: Ia29221730e0f2bbe7beab4fa757f31a032d9690c Reviewed-on: https://go-review.googlesource.com/36620 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2017-02-14 15:52:54 +00:00
Heschi Kreinick	2a74b9e814	cmd/trace: Record mark assists in execution traces During the mark phase of garbage collection, goroutines that allocate may be recruited to assist. This change creates trace events for mark assists and displays them similarly to sweep assists in the trace viewer. Mark assists are different than sweeps in that they can be preempted, so displaying them in the trace viewer is a little tricky -- we may need to synthesize multiple slices for one mark assist. This could have been done in the parser instead, but I thought it might be preferable to keep the parser as true to the event stream as possible. Change-Id: I381dcb1027a187a354b1858537851fa68a620ea7 Reviewed-on: https://go-review.googlesource.com/36015 Run-TryBot: Heschi Kreinick <heschi@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com> Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>	2017-02-10 18:03:42 +00:00
Austin Clements	6da83c6fc0	runtime, cmd/trace: track goroutines blocked on GC assists Currently when a goroutine blocks on a GC assist, it emits a generic EvGoBlock event. Since assist blocking events and, in particular, the length of the blocked assist queue, are important for diagnosing GC behavior, this commit adds a new EvGoBlockGC event for blocking on a GC assist. The trace viewer uses this event to report a "waiting on GC" count in the "Goroutines" row. This makes sense because, unlike other blocked goroutines, these goroutines do have work to do, so being blocked on a GC assist is quite similar to being in the "runnable" state, which we also report in the trace viewer. Change-Id: Ic21a326992606b121ea3d3d00110d8d1fdc7a5ef Reviewed-on: https://go-review.googlesource.com/30704 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Dmitry Vyukov <dvyukov@google.com>	2016-10-28 14:29:47 +00:00
Austin Clements	6834839427	runtime, cmd/trace: annotate different mark worker types Currently mark workers are shown in the trace as regular goroutines labeled "runtime.gcBgMarkWorker". That's somewhat unhelpful to an end user because of the opaque label and particularly unhelpful to runtime developers because it doesn't distinguish the different types of mark workers. Fix this by introducing a variant of the GoStart event called GoStartLabel that lets the runtime indicate a label for a goroutine execution span and using this to label mark worker executions as "GC (<mode>)" in the trace viewer. Since this bumps the trace version to 1.8, we also add test data for 1.7 traces. Change-Id: Id7b9c0536508430c661ffb9e40e436f3901ca121 Reviewed-on: https://go-review.googlesource.com/30702 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Dmitry Vyukov <dvyukov@google.com>	2016-10-28 14:29:40 +00:00
Peter Weinberger	ca922b6d36	runtime: Profile goroutines holding contended mutexes. runtime.SetMutexProfileFraction(n int) will capture 1/n-th of stack traces of goroutines holding contended mutexes if n > 0. From runtime/pprof, pprof.Lookup("mutex").WriteTo writes the accumulated stack traces to w (in essentially the same format that blocking profiling uses). Change-Id: Ie0b54fa4226853d99aa42c14cb529ae586a8335a Reviewed-on: https://go-review.googlesource.com/29650 Reviewed-by: Austin Clements <austin@google.com>	2016-10-28 11:47:16 +00:00
Austin Clements	c242517866	runtime: replace g with guintptr in trace trace's reader g is going to cause write barriers in unfortunate places, so replace it with a guintptr. Change-Id: Ie8fb13bb89a78238f9d2a77ec77da703e96df8af Reviewed-on: https://go-review.googlesource.com/31469 Run-TryBot: Austin Clements <austin@google.com> Reviewed-by: Rick Hudson <rlh@golang.org>	2016-10-21 16:00:20 +00:00
Austin Clements	1bc6be6423	runtime: mark several types go:notinheap This covers basically all sysAlloc'd, persistentalloc'd, and fixalloc'd types. Change-Id: I0487c887c2a0ade5e33d4c4c12d837e97468e66b Reviewed-on: https://go-review.googlesource.com/30941 Reviewed-by: Rick Hudson <rlh@golang.org>	2016-10-15 17:58:20 +00:00
Austin Clements	94589054d3	cmd/trace: label mark termination spans as such Currently these are labeled "MARK", which was accurate in the STW collector, but these really indicate mark termination now, since marking happens for the full duration of the concurrent GC. Re-label them as "MARK TERMINATION" to clarify this. Change-Id: Ie98bd961195acde49598b4fa3f9e7d90d757c0a6 Reviewed-on: https://go-review.googlesource.com/30018 Reviewed-by: Dmitry Vyukov <dvyukov@google.com>	2016-10-07 18:33:23 +00:00
Austin Clements	fa9b57bb1d	runtime: make next_gc ^0 when GC is disabled When GC is disabled, we set gcpercent to -1. However, we still use gcpercent to compute several values, such as next_gc and gc_trigger. These calculations are meaningless when gcpercent is -1 and result in meaningless values. This is okay in a sense because we also never use these values if gcpercent is -1, but they're confusing when exposed to the user, for example via MemStats or the execution trace. It's particularly unfortunate in the execution trace because it attempts to plot the underflowed value of next_gc, which scales all useful information in the heap row into oblivion. Fix this by making next_gc ^0 when gcpercent < 0. This has the advantage of being true in a way: next_gc is effectively infinite when gcpercent < 0. We can also detect this special value when updating the execution trace and report next_gc as 0 so it doesn't blow up the display of the heap line. Change-Id: I4f366e4451f8892a4908da7b2b6086bdc67ca9a9 Reviewed-on: https://go-review.googlesource.com/30016 Reviewed-by: Rick Hudson <rlh@golang.org>	2016-10-07 18:32:51 +00:00
Dmitry Vyukov	cd285f1c6f	runtime: fix global buffer reset in StopTrace We reset global buffer only if its pos != 0. We ought to do it always, but queue it only if pos != 0. This is a latent bug. Currently it does not fire because whenever we create a global buffer, we increment pos. Change-Id: I01e28ae88ce9a5412497c524391b8b7cb443ffd9 Reviewed-on: https://go-review.googlesource.com/25574 Run-TryBot: Dmitry Vyukov <dvyukov@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2016-09-02 19:14:11 +00:00
Dmitry Vyukov	747a158ef3	runtime: speed up StartTrace with lots of blocked goroutines In StartTrace we emit EvGoCreate for all existing goroutines. This includes stack unwind to obtain current stack. Real Go programs can contain hundreds of thousands of blocked goroutines. For such programs StartTrace can take up to a second (few ms per goroutine). Obtain current stack ID once and use it for all EvGoCreate events. This speeds up StartTrace with 10K blocked goroutines from 20ms to 4ms (win for StartTrace called from net/http/pprof handler will be bigger as stack is deeper). Change-Id: I9e5ff9468331a840f8fdcdd56c5018c2cfde61fc Reviewed-on: https://go-review.googlesource.com/25573 Run-TryBot: Dmitry Vyukov <dvyukov@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>	2016-08-22 17:40:10 +00:00
Dmitry Vyukov	a3703618ea	runtime: use per-goroutine sequence numbers in tracer Currently tracer uses global sequencer and it introduces significant slowdown on parallel machines (up to 10x). Replace the global sequencer with per-goroutine sequencer. If we assign per-goroutine sequence numbers to only 3 types of events (start, unblock and syscall exit), it is enough to restore consistent partial ordering of all events. Even these events don't need sequence numbers all the time (if goroutine starts on the same P where it was unblocked, then start does not need sequence number). The burden of restoring the order is put on trace parser. Details of the algorithm are described in the comments. On http benchmark with GOMAXPROCS=48: no tracing: 5026 ns/op tracing: 27803 ns/op (+453%) with this change: 6369 ns/op (+26%, mostly for traceback) Also trace size is reduced by ~22%. Average event size before: 4.63 bytes/event, after: 3.62 bytes/event. Besides running trace tests, I've also tested with manually broken cputicks (random skew for each event, per-P skew and episodic random skew). In all cases broken timestamps were detected and no test failures. Change-Id: I078bde421ccc386a66f6c2051ab207bcd5613efa Reviewed-on: https://go-review.googlesource.com/21512 Run-TryBot: Dmitry Vyukov <dvyukov@google.com> Reviewed-by: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-04-23 15:57:05 +00:00
Dmitry Vyukov	2d342fba78	runtime: fix description of trace events Change-Id: I037101b1921fe151695d32e9874b50dd64982298 Reviewed-on: https://go-review.googlesource.com/22314 Reviewed-by: Austin Clements <austin@google.com>	2016-04-22 21:32:37 +00:00
Dmitry Vyukov	3fafe2e888	internal/trace: support parsing of 1.5 traces 1. Parse out version from trace header. 2. Restore handling of 1.5 traces. 3. Restore optional symbolization of traces. 4. Add some canned 1.5 traces for regression testing (http benchmark trace, runtime/trace stress traces, plus one with broken timestamps). Change-Id: Idb18a001d03ded8e13c2730eeeb37c5836e31256 Reviewed-on: https://go-review.googlesource.com/21803 Run-TryBot: Dmitry Vyukov <dvyukov@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2016-04-11 17:56:44 +00:00
Dmitry Vyukov	0fb7b4cccd	runtime: emit file:line info into traces This makes traces self-contained and simplifies trace workflow in modern cloud environments where it is simpler to reach a service via HTTP than to obtain the binary. Change-Id: I6ff3ca694dc698270f1e29da37d5efaf4e843a0d Reviewed-on: https://go-review.googlesource.com/21732 Run-TryBot: Dmitry Vyukov <dvyukov@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>	2016-04-08 20:52:30 +00:00
Austin Clements	08594ac7c7	runtime: acquire stack lock in traceEvent traceEvent records system call events after a G has already entered _Gsyscall, which means the garbage collector could be installing stack barriers in the G's stack during the traceEvent. If traceEvent attempts to capture the user stack during this, it may observe inconsistent stack barriers and panic. Fix this by acquiring the stack lock around the stack walk in traceEvent. Fixes #14101. Change-Id: I15f0ab0c70c04c6e182221f65a6f761c5a896459 Reviewed-on: https://go-review.googlesource.com/18973 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Russ Cox <rsc@golang.org>	2016-01-27 02:22:09 +00:00
Austin Clements	e9aef43d87	runtime: eliminate traceAllocBlock write barriers This replaces *traceAllocBlock with traceAllocBlockPtr. Updates #10600. Change-Id: I94a20d90f04cca7c457b29062427748e315e4857 Reviewed-on: https://go-review.googlesource.com/17004 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Dmitry Vyukov <dvyukov@google.com>	2015-11-19 21:17:09 +00:00
Michael Matloob	432cb66f16	runtime: break out system-specific constants into package sys runtime/internal/sys will hold system-, architecture- and config- specific constants. Updates #11647 Change-Id: I6db29c312556087a42e8d2bdd9af40d157c56b54 Reviewed-on: https://go-review.googlesource.com/16817 Reviewed-by: Russ Cox <rsc@golang.org>	2015-11-12 17:04:45 +00:00
Austin Clements	f5c42cf88e	runtime: replace traceBuf slice with index Currently traceBuf keeps track of where it is in the trace buffer by also maintaining a slice that points in to this buffer with an initial length of 0 and a cap of the length of the array. All writes to this buffer are done by appending to the slice (as long as the bounds checks are right, it will never overflow and the append won't allocate a new slice). Each of these appends generates a write barrier. As long as we never overflow the buffer, this write barrier won't fire, but this wreaks havoc with eliminating write barriers from the tracing code. If we were to overflow the buffer, this would both allocate and invoke a write barrier, both things that are dicey at best to do in many of the contexts tracing happens. It also wastes space in the traceBuf and leads to more complex code and more complex generated code. Replace this slice trick with keeping track of a simple array position. Updates #10600. Change-Id: I0a63eecec1992e195449f414ed47653f66318d0e Reviewed-on: https://go-review.googlesource.com/16814 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Dmitry Vyukov <dvyukov@google.com>	2015-11-11 17:37:31 +00:00
Austin Clements	2be1ed80c5	runtime: eliminate traceStack write barriers This replaces *traceStack with traceStackPtr, much like the preceding commit. Updates #10600. Change-Id: Ifadc35eb37a405ae877f9740151fb31a0ca1d08f Reviewed-on: https://go-review.googlesource.com/16813 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Dmitry Vyukov <dvyukov@google.com>	2015-11-11 17:37:26 +00:00
Austin Clements	03227bb55e	runtime: eliminate traceBuf write barriers The tracing code is currently called from contexts such as sysmon and the scheduler where write barriers are not allowed. Unfortunately, while the common paths through the tracing code do not have write barriers, many of the less common paths dealing with buffer overflow and recycling do. This change replaces all *traceBufs with traceBufPtrs. In the style of guintptr, etc., the GC does not trace traceBufPtrs and write barriers do not apply when these pointers are written. Since traceBufs are allocated from non-GC'd memory and manually managed, this is always safe. Updates #10600. Change-Id: I52b992d36d1b634ebd855c8cde27947ec14f59ba Reviewed-on: https://go-review.googlesource.com/16812 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Dmitry Vyukov <dvyukov@google.com>	2015-11-11 17:37:18 +00:00
Michael Matloob	67faca7d9c	runtime: break atomics out into package runtime/internal/atomic This change breaks out most of the atomics functions in the runtime into package runtime/internal/atomic. It adds some basic support in the toolchain for runtime packages, and also modifies linux/arm atomics to remove the dependency on the runtime's mutex. The mutexes have been replaced with spinlocks. all trybots are happy! In addition to the trybots, I've tested on the darwin/arm64 builder, on the darwin/arm builder, and on a ppc64le machine. Change-Id: I6698c8e3cf3834f55ce5824059f44d00dc8e3c2f Reviewed-on: https://go-review.googlesource.com/14204 Run-TryBot: Michael Matloob <matloob@golang.org> Reviewed-by: Russ Cox <rsc@golang.org>	2015-11-10 17:38:04 +00:00
Austin Clements	a51905fa04	runtime: decentralize sweep termination and mark transition This moves all of GC initialization, sweep termination, and the transition to concurrent marking in to the off->mark transition function. This means it's now handled on the goroutine that detected the state exit condition. As a result, malloc no longer needs to Gosched() at the beginning of the GC cycle to prevent over-allocation while the GC is starting up because it will now help the GC to start up. The Gosched hack is still necessary during GC shutdown (this is easy to test by enabling gctrace and hitting Ctrl-S to block the gctrace output). At this point, the GC coordinator still handles later phases. This requires a small tweak to how we start the GC coordinator. Currently, starting the GC coordinator is best-effort and may fail if the coordinator is about to park from the previous cycle but hasn't yet. We fix this by replacing the park/ready to wake up the coordinator with a semaphore. This is temporary since the coordinator will be going away in a few commits. Updates #11970. Change-Id: I2c6a11c91e72dfbc59c2d8e7c66146dee9a444fe Reviewed-on: https://go-review.googlesource.com/16357 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2015-11-05 21:23:27 +00:00
Austin Clements	f54bcedce1	runtime: beginning of decentralized off->mark transition This begins the conversion of the centralized GC coordinator to a decentralized state machine by introducing the internal API that triggers the first state transition from _GCoff to _GCmark (or _GCmarktermination). This change introduces the transition lock, the off->mark transition condition (which is very similar to shouldtriggergc()), and the general structure of a state transition. Since we're doing this conversion in stages, it then falls back to the GC coordinator to actually execute the cycle. We'll start moving logic out of the GC coordinator and in to transition functions next. This fixes a minor bug in gcstoptheworld debug mode where passing the heap trigger once could trigger multiple STW GCs. Updates #11970. Change-Id: I964087dd190a639eb5766398f8e1bbf8b352902f Reviewed-on: https://go-review.googlesource.com/16355 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com>	2015-11-05 21:23:17 +00:00
Shawn Walker-Salas	001a75a74c	runtime/trace: fix tracing of blocking system calls The placement and invocation of traceGoSysCall when using entersyscallblock() instead of entersyscall() differs enough that the TestTraceSymbolize test can fail on some platforms. This change moves the invocation of traceGoSysCall for entersyscall() so that the same number of "frames to skip" are present in the trace as when entersyscallblock() is used ensuring system call traces remain identical regardless of internal implementation choices. Fixes golang/go#12056 Change-Id: I8361e91aa3708f5053f98263dfe9feb8c5d1d969 Reviewed-on: https://go-review.googlesource.com/13861 Run-TryBot: Dmitry Vyukov <dvyukov@google.com> Reviewed-by: Dmitry Vyukov <dvyukov@google.com>	2015-09-17 09:06:20 +00:00
Austin Clements	739f133837	runtime: fix hashing of trace stacks The call to hash the trace stack reversed the "seed" and "size" arguments to memhash and, hence, always called memhash with a 0 size, which dutifully returned a hash value that depended only on the number of PCs in the stack and not their values. As a result, all stacks were put in to a very small subset of the 8,192 buckets. Fix this by passing these arguments in the correct order. Change-Id: I67cd29312f5615c7ffa23e205008dd72c6b8af62 Reviewed-on: https://go-review.googlesource.com/13613 Reviewed-by: Dmitry Vyukov <dvyukov@google.com>	2015-09-14 18:14:14 +00:00
Russ Cox	80c98fa901	runtime/trace: record event sequence numbers explicitly Nearly all the flaky failures we've seen in trace tests have been due to the use of time stamps to determine relative event ordering. This is tricky for many reasons, including: - different cores might not have exactly synchronized clocks - VMs are worse than real hardware - non-x86 chips have different timer resolution than x86 chips - on fast systems two events can end up with the same time stamp Stop trying to make time reliable. It's clearly not going to be for Go 1.5. Instead, record an explicit event sequence number for ordering. Using our own counter solves all of the above problems. The trace still contains time stamps, of course. The sequence number is just used for ordering. Should alleviate #10554 somewhat. Then tickDiv can be chosen to be a useful time unit instead of having to be exact for ordering. Separating ordering and time stamps lets the trace parser diagnose systems where the time stamp order and actual order do not match for one reason or another. This CL adds that check to the end of trace.Parse, after all other sequence order-based checking. If that error is found, we skip the test instead of failing it. Putting the check in trace.Parse means that cmd/trace will pick up the same check, refusing to display a trace where the time stamps do not match actual ordering. Using net/http's BenchmarkClientServerParallel4 on various CPU counts, not tracing vs tracing: name old time/op new time/op delta ClientServerParallel4 50.4µs ± 4% 80.2µs ± 4% +59.06% (p=0.000 n=10+10) ClientServerParallel4-2 33.1µs ± 7% 57.8µs ± 5% +74.53% (p=0.000 n=10+10) ClientServerParallel4-4 18.5µs ± 4% 32.6µs ± 3% +75.77% (p=0.000 n=10+10) ClientServerParallel4-6 12.9µs ± 5% 24.4µs ± 2% +89.33% (p=0.000 n=10+10) ClientServerParallel4-8 11.4µs ± 6% 21.0µs ± 3% +83.40% (p=0.000 n=10+10) ClientServerParallel4-12 14.4µs ± 4% 23.8µs ± 4% +65.67% (p=0.000 n=10+10) Fixes #10512. Change-Id: I173eecf8191e86feefd728a5aad25bf1bc094b12 Reviewed-on: https://go-review.googlesource.com/12579 Reviewed-by: Austin Clements <austin@google.com>	2015-07-29 22:32:14 +00:00
Austin Clements	58f3a82950	runtime: fix comments referring to trace functions in runtime/pprof `ae1ea2a` moved trace-related functions from runtime/pprof to runtime/trace, but missed a doc comment and a code comment. Update these to reflect the move. Change-Id: I6e1e8861e5ede465c08a2e3f80b976145a8b32d8 Reviewed-on: https://go-review.googlesource.com/12525 Reviewed-by: Dmitry Vyukov <dvyukov@google.com>	2015-07-22 18:33:38 +00:00
Brad Fitzpatrick	2ae77376f7	all: link to https instead of http The one in misc/makerelease/makerelease.go is particularly bad and probably warrants rotating our keys. I didn't update old weekly notes, and reverted some changes involving test code for now, since we're late in the Go 1.5 freeze. Otherwise, the rest are all auto-generated changes, and all manually reviewed. Change-Id: Ia2753576ab5d64826a167d259f48a2f50508792d Reviewed-on: https://go-review.googlesource.com/12048 Reviewed-by: Rob Pike <r@golang.org>	2015-07-11 14:36:33 +00:00
Dmitry Vyukov	e72f5f67a1	runtime: fix tracing of syscallexit There were two issues. 1. Delayed EvGoSysExit could have been emitted during TraceStart, while it had not yet emitted EvGoInSyscall. 2. Delayed EvGoSysExit could have been emitted during next tracing session. Fixes #10476 Fixes #11262 Change-Id: Iab68eb31cf38eb6eb6eee427f49c5ca0865a8c64 Reviewed-on: https://go-review.googlesource.com/9132 Reviewed-by: Russ Cox <rsc@golang.org>	2015-06-18 13:59:55 +00:00
Austin Clements	a1da255aa0	runtime: factor stoptheworld/starttheworld pattern There are several steps to stopping and starting the world and currently they're open-coded in several places. The garbage collector is the only thing that needs to stop and start the world in a non-trivial pattern. Replace all other uses with calls to higher-level functions that implement the entire pattern necessary to stop and start the world. This is a pure refactoring and should not change any code semantics. In the following commits, we'll make changes that are easier to do with this abstraction in place. This commit renames the old starttheworld to startTheWorldWithSema. This is a slight misnomer right now because the callers release worldsema just before calling this. However, a later commit will swap these and I don't want to think of another name in the meantime. Change-Id: I5dc97f87b44fb98963c49c777d7053653974c911 Reviewed-on: https://go-review.googlesource.com/10154 Reviewed-by: Russ Cox <rsc@golang.org>	2015-05-18 14:55:25 +00:00
Shenghou Ma	5f69e739d3	runtime: adjust traceTickDiv for non-x86 architectures Fixes #10554. Fixes #10623. Change-Id: I90fbaa34e3d55c8758178f8d2e7fa41ff1194a1b Signed-off-by: Shenghou Ma <minux@golang.org> Reviewed-on: https://go-review.googlesource.com/9247 Reviewed-by: Dmitry Vyukov <dvyukov@google.com> Reviewed-by: Dave Cheney <dave@cheney.net>	2015-05-01 07:25:49 +00:00
Russ Cox	181e26b9fa	runtime: replace func-based write barrier skipping with type-based This CL revises CL 7504 to use explicitly uintptr types for the struct fields that are going to be updated sometimes without write barriers. The result is that the fields are now updated always without write barriers. This approach has two important properties: 1) Now the GC never looks at the field, so if the missing reference could cause a problem, it will do so all the time, not just when the write barrier is missed at just the right moment. 2) Now a write barrier never happens for the field, avoiding the (correct) detection of inconsistent write barriers when GODEBUG=wbshadow=1. Change-Id: Iebd3962c727c0046495cc08914a8dc0808460e0e Reviewed-on: https://go-review.googlesource.com/9019 Reviewed-by: Austin Clements <austin@google.com> Run-TryBot: Russ Cox <rsc@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2015-04-20 20:20:09 +00:00
Dmitry Vyukov	089d363a91	runtime: fix tracing of syscall exit Fix tracing of syscall exit after: https://go-review.googlesource.com/#/c/7504/ Change-Id: Idcde2aa826d2b9a05d0a90a80242b6bfa78846ab Reviewed-on: https://go-review.googlesource.com/8728 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Dmitry Vyukov <dvyukov@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2015-04-10 17:39:06 +00:00
Austin Clements	d7e0ad4b82	runtime: introduce heap_live; replace use of heap_alloc in GC Currently there are two main consumers of memstats.heap_alloc: updatememstats (aka ReadMemStats) and shouldtriggergc. updatememstats recomputes heap_alloc from the ground up, so we don't need to keep heap_alloc up to date for it. shouldtriggergc wants to know how many bytes were marked by the previous GC plus how many bytes have been allocated since then, but this isn't what heap_alloc tracks. heap_alloc also includes objects that are not marked and haven't yet been swept. Introduce a new memstat called heap_live that actually tracks what shouldtriggergc wants to know and stop keeping heap_alloc up to date. Unlike heap_alloc, heap_live follows a simple sawtooth that drops during each mark termination and increases monotonically between GCs. heap_alloc, on the other hand, has much more complicated behavior: it may drop during sweep termination, slowly decreases from background sweeping between GCs, is roughly unaffected by allocation as long as there are unswept spans (because we sweep and allocate at the same rate), and may go up after background sweeping is done depending on the GC trigger. heap_live simplifies computing next_gc and using it to figure out when to trigger garbage collection. Currently, we guess next_gc at the end of a cycle and update it as we sweep and get a better idea of how much heap was marked. Now, since we're directly tracking how much heap is marked, we can directly compute next_gc. This also corrects bugs that could cause us to trigger GC early. Currently, in any case where sweep termination actually finds spans to sweep, heap_alloc is an overestimation of live heap, so we'll trigger GC too early. heap_live, on the other hand, is unaffected by sweeping. Change-Id: I1f96807b6ed60d4156e8173a8e68745ffc742388 Reviewed-on: https://go-review.googlesource.com/8389 Reviewed-by: Russ Cox <rsc@golang.org>	2015-04-06 21:28:13 +00:00
Dmitry Vyukov	4396ea96c4	runtime: remove futile wakeups from trace Channels and sync.Mutex'es allow another goroutine to acquire a resource ahead of an unblocked goroutine. This is good for performance, but leads to futile wakeups (the unblocked goroutine needs to block again). Futile wakeups caused user confusion during the very first evaluation of tracing functionality on a real server (a goroutine appears to acquire a mutex in a loop, while there is no loop in user code). This change detects futile wakeups on channels and emits a special event to denote the fact. Later, the parser finds entire wakeup sequences (unblock->start->block) and removes them. sync.Mutex will be supported in a separate change. Change-Id: Iaaaee9d5c0921afc62b449a97447445030ac19d3 Reviewed-on: https://go-review.googlesource.com/7380 Reviewed-by: Keith Randall <khr@golang.org>	2015-03-17 14:14:55 +00:00
Dmitry Vyukov	9d332a8324	cmd/trace: dump thread id on proc start Augment ProcStart events with OS thread id. This helps in scheduler locality analysis. Change-Id: I93fea75d3072cf68de66110d0b59d07101badcb5 Reviewed-on: https://go-review.googlesource.com/7302 Reviewed-by: Keith Randall <khr@golang.org>	2015-03-11 12:52:41 +00:00
Dmitry Vyukov	919fd24884	runtime: remove runtime frames from stacks in traces Strip uninteresting bottom and top frames from trace stacks. This makes both binary and json trace files smaller, and also makes stacks shorter and more readable in the viewer. Change-Id: Ib9c80ccc280504f0e235f867f53f1d2652c41583 Reviewed-on: https://go-review.googlesource.com/5523 Reviewed-by: Keith Randall <khr@golang.org> Run-TryBot: Dmitry Vyukov <dvyukov@google.com>	2015-03-10 14:46:15 +00:00
Matthew Dempsky	3c8a89daf3	runtime: simplify CPU profiling code This makes Go's CPU profiling code somewhat more idiomatic; e.g., using := instead of forward declaring variables, using "int" for element counts instead of "uintptr", and slices instead of C-style pointer+length. This makes the code easier to read and eliminates a lot of type conversion clutter. Additionally, in sigprof we can collect just maxCPUProfStack stack frames, as cpuprof won't use more than that anyway. Change-Id: I0235b5ae552191bcbb453b14add6d8c01381bd06 Reviewed-on: https://go-review.googlesource.com/6072 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Dmitry Vyukov <dvyukov@google.com>	2015-02-26 08:59:24 +00:00
Dmitry Vyukov	edadffa2f3	cmd/trace: add new command The trace command lets you visualize and analyze traces. Run as: $ go tool trace binary trace.file The command opens a web browser with the main page, which contains links for trace visualization, blocking profiler, network IO profiler and per-goroutine traces. Also move trace parser from runtime/pprof/trace_parser_test.go to internal/trace/parser.go, so that it can be shared between tests and the command. Change-Id: Ic97ed59ad6e4c7e1dc9eca5e979701a2b4aed7cf Reviewed-on: https://go-review.googlesource.com/3601 Reviewed-by: Andrew Gerrand <adg@golang.org>	2015-02-20 18:31:25 +00:00
Dmitry Vyukov	4a45ac577f	runtime: fix false race report during tracing Currently race detector produces the following reports on pprof tests: WARNING: DATA RACE Read by goroutine 4: runtime/pprof_test.TestTraceStartStop() src/runtime/pprof/trace_test.go:38 +0x1da testing.tRunner() src/testing/testing.go:448 +0x13a Previous write by goroutine 5: bytes.(*Buffer).grow() src/bytes/buffer.go:102 +0x190 bytes.(*Buffer).Write() src/bytes/buffer.go:127 +0x75 runtime/pprof.func·002() src/runtime/pprof/pprof.go:633 +0xae Trace writer goroutine synchronizes with StopTrace using trace.shutdownSema runtime semaphore. But race detector does not see that synchronization and so produces false reports. Teach race detector about the synchronization. Change-Id: I1219817325d4e16b423f29a0cbee94c929793881 Reviewed-on: https://go-review.googlesource.com/3746 Reviewed-by: Russ Cox <rsc@golang.org>	2015-02-03 15:41:41 +00:00
Austin Clements	28b5118415	runtime: rename m.gcing to m.preemptoff and make it a string m.gcing has become overloaded to mean "don't preempt this g" in general. Once the garbage collector is preemptible, the one thing it won't mean is that we're in the garbage collector. So, rename gcing to "preemptoff" and make it a string giving a reason that preemption is disabled. gcing was never set to anything but 0 or 1, so we don't have to worry about there being a stack of reasons. Change-Id: I4337c29e8e942e7aa4f106fc29597e1b5de4ef46 Reviewed-on: https://go-review.googlesource.com/3660 Reviewed-by: Russ Cox <rsc@golang.org>	2015-02-02 19:34:51 +00:00
Dmitry Vyukov	256116ad25	runtime: fix trace ticks frequency on windows Change-Id: I8c7fcc7705070bc9979e39d08a4c9b2870087a08 Reviewed-on: https://go-review.googlesource.com/3500 Reviewed-by: Alex Brainman <alex.brainman@gmail.com>	2015-01-30 08:35:38 +00:00
Dmitry Vyukov	5288fadbdc	runtime: add tracing of runtime events Add actual tracing of interesting runtime events. Part of a larger tracing functionality: https://docs.google.com/document/u/1/d/1FP5apqzBgr7ahCCgFO-yoVhk4YZrNIDNf9RybngBc14/pub Full change: https://codereview.appspot.com/146920043 Change-Id: Icccf54aea54e09350bb698ba6bf11532f9fbe6d3 Reviewed-on: https://go-review.googlesource.com/1451 Reviewed-by: Russ Cox <rsc@golang.org>	2015-01-28 16:35:24 +00:00

51 Commits