Currently the code checks only the consistency (that the functab is
correctly sorted by PC, etc.) of the moduledata object that the
runtime itself belongs to. Change it to check all moduledata objects.
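A rough sketch of the intended shape, with simplified stand-in types
(the real moduledata has many more fields):

type moduledata struct {
	ftab []uintptr // function entry PCs, expected to be sorted
	next *moduledata
}

var firstmoduledata moduledata

// moduledataverify walks every module in the list, not just the one
// the runtime itself is linked into.
func moduledataverify() {
	for datap := &firstmoduledata; datap != nil; datap = datap.next {
		moduledataverify1(datap)
	}
}

// moduledataverify1 checks one module, e.g. that its functab is
// sorted by PC.
func moduledataverify1(datap *moduledata) {
	for i := 1; i < len(datap.ftab); i++ {
		if datap.ftab[i-1] > datap.ftab[i] {
			panic("functab not sorted by PC")
		}
	}
}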
Change-Id: I544a44c5de7445fff87d3cdb4840ff04c5e2bf75
Reviewed-on: https://go-review.googlesource.com/9773
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Makes searching in source code easier.
Change-Id: Ie2e85934d23920ac0bc01d28168bcfbbdc465580
Reviewed-on: https://go-review.googlesource.com/9774
Reviewed-by: Daniel Morsing <daniel.morsing@gmail.com>
Reviewed-by: Minux Ma <minux@golang.org>
Currently, the GC uses a moving average of recent scan work ratios to
estimate the total scan work required by this cycle. This is in turn
used to compute how much scan work should be done by mutators when
they allocate in order to perform all expected scan work by the time
the allocated heap reaches the heap goal.
However, our current scan work estimate can be arbitrarily wrong if
the heap topography changes significantly from one cycle to the
next. For example, in the go1 benchmarks, at the beginning of each
benchmark, the heap is dominated by a 256MB no-scan object, so the GC
learns that the scan density of the heap is very low. In benchmarks
that then rapidly allocate pointer-dense objects, by the time of the
next GC cycle, our estimate of the scan work can be too low by a large
factor. This in turn lets the mutator allocate faster than the GC can
collect, allowing it to get arbitrarily far ahead of the scan work
estimate, which leads to very long GC cycles with very little mutator
assist that can overshoot the heap goal by large margins. This is
particularly easy to demonstrate with BinaryTree17:
$ GODEBUG=gctrace=1 ./go1.test -test.bench BinaryTree17
gc #1 @0.017s 2%: 0+0+0+0+0 ms clock, 0+0+0+0/0/0+0 ms cpu, 4->262->262 MB, 4 MB goal, 1 P
gc #2 @0.026s 3%: 0+0+0+0+0 ms clock, 0+0+0+0/0/0+0 ms cpu, 262->262->262 MB, 524 MB goal, 1 P
testing: warning: no tests to run
PASS
BenchmarkBinaryTree17 gc #3 @1.906s 0%: 0+0+0+0+7 ms clock, 0+0+0+0/0/0+7 ms cpu, 325->325->287 MB, 325 MB goal, 1 P (forced)
gc #4 @12.203s 20%: 0+0+0+10067+10 ms clock, 0+0+0+0/2523/852+10 ms cpu, 430->2092->1950 MB, 574 MB goal, 1 P
1 9150447353 ns/op
Change this estimate to instead use the *current* scannable heap
size. This has the advantage of being based solely on the current
state of the heap, not on past densities or reachable heap sizes, so
it isn't susceptible to falling behind during these sorts of phase
changes. This is strictly an over-estimate, but it's better to
over-estimate and get more assist than necessary than it is to
under-estimate and potentially spiral out of control. Experiments with
scaling this estimate back showed no obvious benefit for mutator
utilization, heap size, or assist time.
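A sketch of the revised estimate (names are illustrative, not the
controller's actual fields):

// expectedScanWork is now simply the scannable heap at the point the
// estimate is made; it deliberately over-estimates, since objects that
// die during the cycle are still counted.
func expectedScanWork(heapScan int64) int64 {
	return heapScan
}

// Mutator assists are then sized so the expected work finishes by the
// time the allocated heap reaches the goal.
func assistRatio(scanWorkExpected, heapGoal, heapLive int64) float64 {
	heapDistance := heapGoal - heapLive
	if heapDistance <= 0 {
		heapDistance = 1 // already at (or past) the goal: assist hard
	}
	return float64(scanWorkExpected) / float64(heapDistance)
}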
This new estimate has little effect for most benchmarks, including
most go1 benchmarks, x/benchmarks, and the 6g benchmark. It has a huge
effect for benchmarks that triggered the bad pacer behavior:
name old mean new mean delta
BinaryTree17 10.0s × (1.00,1.00) 3.5s × (0.98,1.01) -64.93% (p=0.000)
Fannkuch11 2.74s × (1.00,1.01) 2.65s × (1.00,1.00) -3.52% (p=0.000)
FmtFprintfEmpty 56.4ns × (0.99,1.00) 57.8ns × (1.00,1.01) +2.43% (p=0.000)
FmtFprintfString 187ns × (0.99,1.00) 185ns × (0.99,1.01) -1.19% (p=0.010)
FmtFprintfInt 184ns × (1.00,1.00) 183ns × (1.00,1.00) (no variance)
FmtFprintfIntInt 321ns × (1.00,1.00) 315ns × (1.00,1.00) -1.80% (p=0.000)
FmtFprintfPrefixedInt 266ns × (1.00,1.00) 263ns × (1.00,1.00) -1.22% (p=0.000)
FmtFprintfFloat 353ns × (1.00,1.00) 353ns × (1.00,1.00) -0.13% (p=0.035)
FmtManyArgs 1.21µs × (1.00,1.00) 1.19µs × (1.00,1.00) -1.33% (p=0.000)
GobDecode 9.69ms × (1.00,1.00) 9.59ms × (1.00,1.00) -1.07% (p=0.000)
GobEncode 7.89ms × (0.99,1.01) 7.74ms × (1.00,1.00) -1.92% (p=0.000)
Gzip 391ms × (1.00,1.00) 392ms × (1.00,1.00) ~ (p=0.522)
Gunzip 97.1ms × (1.00,1.00) 97.0ms × (1.00,1.00) -0.10% (p=0.000)
HTTPClientServer 55.7µs × (0.99,1.01) 56.7µs × (0.99,1.01) +1.81% (p=0.001)
JSONEncode 19.1ms × (1.00,1.00) 19.0ms × (1.00,1.00) -0.85% (p=0.000)
JSONDecode 66.8ms × (1.00,1.00) 66.9ms × (1.00,1.00) ~ (p=0.288)
Mandelbrot200 4.13ms × (1.00,1.00) 4.12ms × (1.00,1.00) -0.08% (p=0.000)
GoParse 3.97ms × (1.00,1.01) 4.01ms × (1.00,1.00) +0.99% (p=0.000)
RegexpMatchEasy0_32 114ns × (1.00,1.00) 115ns × (0.99,1.00) ~ (p=0.070)
RegexpMatchEasy0_1K 376ns × (1.00,1.00) 376ns × (1.00,1.00) ~ (p=0.900)
RegexpMatchEasy1_32 94.9ns × (1.00,1.00) 96.3ns × (1.00,1.01) +1.53% (p=0.001)
RegexpMatchEasy1_1K 568ns × (1.00,1.00) 567ns × (1.00,1.00) -0.22% (p=0.001)
RegexpMatchMedium_32 159ns × (1.00,1.00) 159ns × (1.00,1.00) ~ (p=0.178)
RegexpMatchMedium_1K 46.4µs × (1.00,1.00) 46.6µs × (1.00,1.00) +0.29% (p=0.000)
RegexpMatchHard_32 2.37µs × (1.00,1.00) 2.37µs × (1.00,1.00) ~ (p=0.722)
RegexpMatchHard_1K 71.1µs × (1.00,1.00) 71.2µs × (1.00,1.00) ~ (p=0.229)
Revcomp 565ms × (1.00,1.00) 562ms × (1.00,1.00) -0.52% (p=0.000)
Template 81.0ms × (1.00,1.00) 80.2ms × (1.00,1.00) -0.97% (p=0.000)
TimeParse 380ns × (1.00,1.00) 380ns × (1.00,1.00) ~ (p=0.148)
TimeFormat 405ns × (0.99,1.00) 385ns × (0.99,1.00) -5.00% (p=0.000)
Change-Id: I11274158bf3affaf62662e02de7af12d5fb789e4
Reviewed-on: https://go-review.googlesource.com/9696
Reviewed-by: Russ Cox <rsc@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
This tracks the number of scannable bytes in the allocated heap. That
is, bytes that the garbage collector must scan before reaching the
last pointer field in each object.
This will be used to compute a more robust estimate of the GC scan
work.
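A sketch of the accounting, with an illustrative helper rather than
the runtime's actual allocation path:

type gcStats struct{ heapScan uint64 }

// noteAlloc grows the scannable-byte count by the size of the object's
// pointer-containing prefix; pointer-free objects contribute nothing.
func noteAlloc(stats *gcStats, ptrdata uintptr, noscan bool) {
	if noscan {
		return
	}
	stats.heapScan += uint64(ptrdata)
}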
Change-Id: I1eecd45ef9cdd65b69d2afb5db5da885c80086bb
Reviewed-on: https://go-review.googlesource.com/9695
Reviewed-by: Russ Cox <rsc@golang.org>
The garbage collector predicts how much "scan work" must be done in a
cycle to determine how much work should be done by mutators when they
allocate. Most code doesn't care what units the scan work is in: it
simply knows that a certain amount of scan work has to be done in the
cycle. Currently, the GC uses the number of pointer slots scanned as
the scan work on the theory that this is the bulk of the time spent in
the garbage collector and hence reflects real CPU resource usage.
However, this metric is difficult to estimate at the beginning of a
cycle.
Switch to counting the total number of bytes scanned, including both
pointer and scalar slots. This is still less than the total marked
heap since it omits no-scan objects and no-scan tails of objects. This
metric may not reflect absolute performance as well as the count of
scanned pointer slots (though it still takes time to scan scalar
fields), but it will be much easier to estimate robustly, which is
more important.
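A sketch of the new accounting (simplified types):

const slotBytes = 8 // word size on a 64-bit system

type gcWork struct{ scanWork int64 }

// scanSlots credits scan work in bytes for every slot visited, scalar
// or pointer; no-scan objects and no-scan tails never reach the
// scanner, so they still contribute nothing.
func scanSlots(slots []uintptr, isPointer func(i int) bool, gcw *gcWork) {
	for i := range slots {
		if isPointer(i) {
			// ...enqueue the pointed-to object for marking (omitted)...
		}
	}
	gcw.scanWork += int64(len(slots)) * slotBytes
}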
Change-Id: Ie3a5eeeb0384a1ca566f61b2f11e9ff3a75ca121
Reviewed-on: https://go-review.googlesource.com/9694
Reviewed-by: Russ Cox <rsc@golang.org>
Currently, we only flush the per-P gcWork caches in gcMark, at the
beginning of mark termination. This is necessary to ensure that no
work is held up in these caches.
However, this flush happens after we update the GC controller state,
which depends on statistics about marked heap size and scan work that
are only updated by this flush. Hence, the controller is missing the
bulk of heap marking and scan work. This bug was introduced in commit
1b4025f, which introduced the per-P gcWork caches.
Fix this by flushing these caches before we update the GC controller
state. We continue to flush them at the beginning of mark termination
as well to be robust in case any write barriers happened between the
previous flush and entering mark termination, but this should be a
no-op.
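A sketch of the flush, with simplified types:

type gcWorkCache struct{ scanWork, bytesMarked int64 }
type pp struct{ gcw gcWorkCache }
type controllerState struct{ scanWork, bytesMarked int64 }

// flushGCWorkCaches drains every P's cached totals into the global
// statistics so the controller's end-of-cycle computation sees all of
// the marking and scan work.
func flushGCWorkCaches(ps []*pp, global *controllerState) {
	for _, p := range ps {
		global.scanWork += p.gcw.scanWork
		global.bytesMarked += p.gcw.bytesMarked
		p.gcw = gcWorkCache{}
	}
}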
Change-Id: I8f0f91024df967ebf0c616d1c4f0c339c304ebaa
Reviewed-on: https://go-review.googlesource.com/9646
Reviewed-by: Russ Cox <rsc@golang.org>
During development some tracing routines were added that are not
needed in the release. These included GCstarttimes, GCendtimes, and
GCprinttimes.
Fixes #10462
Change-Id: I0788e6409d61038571a5ae0cbbab793102df0a65
Reviewed-on: https://go-review.googlesource.com/9689
Reviewed-by: Austin Clements <austin@google.com>
Before CL 8214 (use .plt instead of .got on Solaris) Solaris used a
dynamic linking scheme that didn't permit lazy binding. To speed program
startup, Go binaries only used it for a small number of symbols required
by the runtime. Other symbols were resolved on demand on first use, and
were cached for subsequent use. This required some moderately complex
code in the syscall package.
CL 8214 changed the way dynamic linking is implemented, and lazy
binding is now supported. Since all symbols are now resolved lazily by
the dynamic loader, there is no need for the complex code in the
syscall package that did the same. This CL makes Go programs link directly
with the necessary shared libraries and deletes the lazy-loading code
implemented in Go.
Change-Id: Ifd7275db72de61b70647242e7056dd303b1aee9e
Reviewed-on: https://go-review.googlesource.com/9184
Reviewed-by: Minux Ma <minux@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
The linker always uses .plt for externals, so libcFunc is now an actual
external symbol instead of a pointer to one.
Fixes most of the breakage introduced in the previous CL.
Change-Id: I64b8c96f93127f2d13b5289b024677fd3ea7dbea
Reviewed-on: https://go-review.googlesource.com/8215
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Minux Ma <minux@golang.org>
I forgot there is already a ptrSize constant.
Rename field to avoid some confusion.
Change-Id: I098fdcc8afc947d6c02c41c6e6de24624cc1c8ff
Reviewed-on: https://go-review.googlesource.com/9700
Reviewed-by: Austin Clements <austin@google.com>
Freezetheworld still has stuff to do when gomaxprocs=1.
In particular, signals can come in on other Ms (like the GC M, say)
and the single user M is still running.
Fixes #10546
Change-Id: I2f07f17d1c81e93cf905df2cb087112d436ca7e7
Reviewed-on: https://go-review.googlesource.com/9551
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
When emulating ARM FSQRT instruction, the sqrt function itself
should not use any floating point arithmetics, otherwise it will
clobber the user software FP registers.
Fortunately, the sqrt function only uses floating point instructions
to test for corner cases, so it's easy to make that function do all
of its work using integer arithmetic only. I've verified that after
this change, runtime.stepflt and runtime.sqrt don't contain any calls
to _sfloat. (Perhaps we should add //go:nosfloat to make
the compiler enforce this?)
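An illustrative sketch of the corner-case tests done on the raw IEEE
754 bit pattern, using integer compares only:

func sqrtSpecialCase(ix uint64) (handled bool, result uint64) {
	exp := (ix >> 52) & 0x7ff
	frac := ix & (1<<52 - 1)
	switch {
	case exp == 0x7ff && frac != 0: // NaN: sqrt(NaN) = NaN
		return true, ix
	case exp == 0x7ff && ix>>63 == 0: // +Inf: sqrt(+Inf) = +Inf
		return true, ix
	case exp == 0 && frac == 0: // ±0: sqrt(±0) = ±0
		return true, ix
	case ix>>63 != 0: // x < 0 (including -Inf): sqrt(x) = NaN
		return true, 0x7ff8000000000000
	}
	return false, 0 // ordinary positive input: do the integer sqrt
}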
Fixes #10641.
Change-Id: Ida4742c49000fae4fea4649f28afde630ce4c576
Signed-off-by: Shenghou Ma <minux@golang.org>
Reviewed-on: https://go-review.googlesource.com/9570
Reviewed-by: Dave Cheney <dave@cheney.net>
Reviewed-by: Keith Randall <khr@golang.org>
This adds a field to the runtime type structure that records the size
of the prefix of objects of that type containing pointers. Any data
after this offset is scalar data.
This is necessary for shrinking the type bitmaps to 1 bit and will
help the garbage collector efficiently estimate the amount of heap
that needs to be scanned.
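Roughly, the idea (the real field lives in the runtime's type
descriptor, alongside many other fields):

type _typeSketch struct {
	size    uintptr // total size of a value of this type
	ptrdata uintptr // bytes in the prefix that can contain pointers
}

// For a struct whose only pointer is its first field, ptrdata covers
// just that field, so the scalar tail never needs to be scanned:
type example struct {
	p   *int      // inside the ptrdata prefix
	buf [128]byte // scalar tail, skipped by the scanner
}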
Change-Id: I1318d79e6360dca0ac980245016c562e61f52ff5
Reviewed-on: https://go-review.googlesource.com/9691
Reviewed-by: Russ Cox <rsc@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
shouldtriggergc is slightly expensive due to the call overhead
and the use of an atomic. This CL reduces the number of times
one checks whether a GC should be done from once per allocation
to once per span allocation. Since shouldtriggergc is an
important abstraction, simply hand-inlining it, along with its
atomic instruction, would lose the abstraction.
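A sketch of where the check moves to (illustrative types; not the
runtime's actual span allocator):

type heapState struct {
	heapLive  uint64 // bytes allocated so far this cycle
	gcTrigger uint64 // heap size at which a GC should start
}

// The hot per-object allocation path no longer calls shouldtriggergc;
// the much colder span-allocation path does instead.
func allocSpan(h *heapState) {
	// ...carve a new span out of the heap (omitted)...
	if shouldtriggergc(h) {
		// ...kick off a GC cycle (omitted)...
	}
}

// shouldtriggergc stays a single named abstraction; in the runtime it
// is an atomic load plus a comparison.
func shouldtriggergc(h *heapState) bool {
	return h.heapLive >= h.gcTrigger
}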
Change-Id: Ia3210655b4b3d433f77064a21ecb54e4d9d435f7
Reviewed-on: https://go-review.googlesource.com/9403
Reviewed-by: Austin Clements <austin@google.com>
This adds a detailed debug dump of the state of the GC controller and
a GODEBUG flag to enable it.
Change-Id: I562fed7981691a84ddf0f9e6fcd9f089f497ac13
Reviewed-on: https://go-review.googlesource.com/9640
Reviewed-by: Russ Cox <rsc@golang.org>
(1) Count pointer-free objects found during scanning roots
as marked bytes, by not zeroing the mark total after scanning roots.
(2) Don't count the bytes for the roots themselves, by not adding
them to the mark total in scanblock (the zeroing removed in (1)
was aimed at that add but hit more than that).
Combined, (1) and (2) fix the calculation of the marked heap size.
This makes the GC trigger much less often in the Go 1 benchmarks,
which have a global []byte pointing at 256 MB of data.
That 256 MB allocation was not being included in the heap size
in the current code, but was included in Go 1.4.
This is the source of much of the relative slowdown in that directory.
(3) Count the bytes for the roots as scanned work, by not zeroing
the scan total after scanning roots. There is no strict justification
for this, and it probably doesn't matter much either way,
but it was always combined with another buggy zeroing
(removed in (1)), so guilty by association.
Austin noticed this.
name old mean new mean delta
BenchmarkBinaryTree17 13.1s × (0.97,1.03) 5.9s × (0.97,1.05) -55.19% (p=0.000)
BenchmarkFannkuch11 4.35s × (0.99,1.01) 4.37s × (1.00,1.01) +0.47% (p=0.032)
BenchmarkFmtFprintfEmpty 84.6ns × (0.95,1.14) 85.7ns × (0.94,1.05) ~ (p=0.521)
BenchmarkFmtFprintfString 320ns × (0.95,1.06) 283ns × (0.99,1.02) -11.48% (p=0.000)
BenchmarkFmtFprintfInt 311ns × (0.98,1.03) 288ns × (0.99,1.02) -7.26% (p=0.000)
BenchmarkFmtFprintfIntInt 554ns × (0.96,1.05) 478ns × (0.99,1.02) -13.70% (p=0.000)
BenchmarkFmtFprintfPrefixedInt 434ns × (0.96,1.06) 393ns × (0.98,1.04) -9.60% (p=0.000)
BenchmarkFmtFprintfFloat 620ns × (0.99,1.03) 584ns × (0.99,1.01) -5.73% (p=0.000)
BenchmarkFmtManyArgs 2.19µs × (0.98,1.03) 1.94µs × (0.99,1.01) -11.62% (p=0.000)
BenchmarkGobDecode 21.2ms × (0.97,1.06) 15.2ms × (0.99,1.01) -28.17% (p=0.000)
BenchmarkGobEncode 18.1ms × (0.94,1.06) 11.8ms × (0.99,1.01) -35.00% (p=0.000)
BenchmarkGzip 650ms × (0.98,1.01) 649ms × (0.99,1.02) ~ (p=0.802)
BenchmarkGunzip 143ms × (1.00,1.01) 143ms × (1.00,1.01) ~ (p=0.438)
BenchmarkHTTPClientServer 110µs × (0.98,1.04) 101µs × (0.98,1.02) -8.79% (p=0.000)
BenchmarkJSONEncode 40.3ms × (0.97,1.03) 31.8ms × (0.98,1.03) -20.92% (p=0.000)
BenchmarkJSONDecode 119ms × (0.97,1.02) 108ms × (0.99,1.02) -9.15% (p=0.000)
BenchmarkMandelbrot200 6.03ms × (1.00,1.01) 6.03ms × (0.99,1.01) ~ (p=0.750)
BenchmarkGoParse 8.58ms × (0.89,1.10) 6.80ms × (1.00,1.00) -20.71% (p=0.000)
BenchmarkRegexpMatchEasy0_32 162ns × (1.00,1.01) 162ns × (0.99,1.02) ~ (p=0.131)
BenchmarkRegexpMatchEasy0_1K 540ns × (0.99,1.02) 559ns × (0.99,1.02) +3.58% (p=0.000)
BenchmarkRegexpMatchEasy1_32 139ns × (0.98,1.04) 139ns × (1.00,1.00) ~ (p=0.466)
BenchmarkRegexpMatchEasy1_1K 889ns × (0.99,1.01) 885ns × (0.99,1.01) -0.50% (p=0.022)
BenchmarkRegexpMatchMedium_32 252ns × (0.99,1.02) 252ns × (0.99,1.01) ~ (p=0.469)
BenchmarkRegexpMatchMedium_1K 72.9µs × (0.99,1.01) 73.6µs × (0.99,1.03) ~ (p=0.168)
BenchmarkRegexpMatchHard_32 3.87µs × (1.00,1.01) 3.86µs × (1.00,1.00) ~ (p=0.055)
BenchmarkRegexpMatchHard_1K 118µs × (0.99,1.01) 117µs × (0.99,1.00) ~ (p=0.133)
BenchmarkRevcomp 995ms × (0.94,1.10) 949ms × (0.99,1.01) -4.64% (p=0.000)
BenchmarkTemplate 141ms × (0.97,1.02) 127ms × (0.99,1.01) -10.00% (p=0.000)
BenchmarkTimeParse 641ns × (0.99,1.01) 623ns × (0.99,1.01) -2.79% (p=0.000)
BenchmarkTimeFormat 729ns × (0.98,1.03) 679ns × (0.99,1.00) -6.93% (p=0.000)
Change-Id: I839bd7356630d18377989a0748763414e15ed057
Reviewed-on: https://go-review.googlesource.com/9602
Reviewed-by: Austin Clements <austin@google.com>
This reverts commit c26fc88d56.
This broke pprof. See the comments on CL 9491.
Change-Id: Ic99ce026e86040c050a9bf0ea3024a1a42274ad1
Reviewed-on: https://go-review.googlesource.com/9565
Reviewed-by: Daniel Morsing <daniel.morsing@gmail.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Kind of a hack, but makes the non-optimized builds pass.
Fixes #10079
Change-Id: I26f41c546867f8f3f16d953dc043e784768f2aff
Reviewed-on: https://go-review.googlesource.com/9552
Reviewed-by: Russ Cox <rsc@golang.org>
This includes the following information in the per-function summary:
outK = paramJ encoded in outK bits for paramJ
outK = *paramJ encoded in outK bits for paramJ
heap = paramJ EscHeap
heap = *paramJ EscContentEscapes
Note that (currently) if the address of a parameter is taken and
returned, a heap allocation necessarily occurred to contain that
reference; since the heap can never refer to the stack, the
parameter and everything downstream from it escapes to the heap.
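Illustrative examples of the flows these summaries distinguish (the
comments approximate the encoding, not the exact -m output):

type T struct{ x int }

var sink *T

func ident(p *T) *T      { return p }  // out0 = param0
func load(p **T) *T      { return *p } // out0 = *param0
func leak(p *T)          { sink = p }  // heap = param0 (EscHeap)
func leakContents(p **T) { sink = *p } // heap = *param0 (EscContentEscapes)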
The per-function summary information now has a tuneable number of bits
(2 is probably noticeably better than 1, 3 is likely overkill, but it
is now easy to check and the -m debugging output includes information
that allows you to figure out if more would be better.)
A new test was added to check pointer flow through struct-typed and
*struct-typed parameters and returns; some of these are sensitive to
the number of summary bits, and ought to yield better results with a
more competent escape analysis algorithm. Another new test checks
(some) correctness with array parameters, results, and operations.
The old analysis inferred a piece of plan9 runtime was non-escaping by
counteracting overconservative analysis with buggy analysis; with the
bug fixed, the result was too conservative (and it's not easy to fix
in this framework) so the source code was tweaked to get the desired
result. A test was added against the discovered bug.
The escape analysis was further improved by splitting the "level"
into 3 parts: one tracks the conventional "level", and the other two
compute the highest-level-suffix-from-copy, which is used to model
the cancelling effect of indirection applied to
address-of.
With the improved escape analysis enabled, it was necessary to
modify one of the runtime tests because it now attempts to allocate
too much on the (small, fixed-size) G0 (system) stack and this
failed the test.
Compiling src/std after touching src/runtime/*.go with -m logging
turned on shows 420 fewer heap allocation sites (10538 vs 10968).
Profiling allocations in src/html/template with
for i in {1..5} ;
do go tool 6g -memprofile=mastx.${i}.prof -memprofilerate=1 *.go;
go tool pprof -alloc_objects -text mastx.${i}.prof ;
done
showed a 15% reduction in allocations performed by the compiler.
Update #3753
Update #4720
Fixes #10466
Change-Id: I0fd97d5f5ac527b45f49e2218d158a6e89951432
Reviewed-on: https://go-review.googlesource.com/8202
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
App Store policy requires that programs not reference the exc_server
symbol. (Some public forum threads show that Unity ran into this
several years ago and it is a hard policy rule.) While some research
suggests that I could write my own version of exc_server, the
expedient course is to disable the exception handler by default.
Go programs only need it when running under lldb, which is primarily
used by tests. So enable the exception handler in cmd/dist when we
are running the tests.
Fixes #10646
Change-Id: I853905254894b5367edb8abd381d45585a78ee8b
Reviewed-on: https://go-review.googlesource.com/9549
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
gcDumpObject is used to print the source and destination objects when
checkmark finds a missing mark. However, gcDumpObject currently assumes
the given pointer will point to a heap object. This is not true of the
source object during root marking and may not even be true of the
destination object in the limited situations where the heap points
back in to the stack.
If the pointer isn't a heap object, gcDumpObject will attempt an
out-of-bounds access to h_spans. This will cause a panicslice, which
will attempt to construct a useful panic message. This will cause a
string allocation, which will lead mallocgc to panic because the GC is
in mark termination (checkmark only happens during mark termination).
Fix this by checking that the pointer points into the heap arena
before attempting to use it as an arena pointer.
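A sketch of the guard (illustrative signature):

func inHeapArena(obj, arenaStart, arenaUsed uintptr) bool {
	return obj >= arenaStart && obj < arenaUsed
}

// gcDumpObject-style use: bail out before computing an h_spans index,
// and avoid anything that could allocate during mark termination.
func dumpObject(obj, arenaStart, arenaUsed uintptr) {
	if !inHeapArena(obj, arenaStart, arenaUsed) {
		println("object is outside the heap arena")
		return
	}
	// ...compute the span index and dump the object as before...
}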
Change-Id: I09da600c380d4773f1f8f38e45b82cb229ea6382
Reviewed-on: https://go-review.googlesource.com/9498
Reviewed-by: Rick Hudson <rlh@golang.org>
Sequence of operations:
- Go code does a systemstack call
- during the systemstack call, receive a signal
- signal requests a traceback of all goroutines
The original G is still marked as _Grunning, so the traceback code
refuses to print its stack.
Fix by allowing traceback of Gs whose caller is on the same M as G is.
G can't be modifying its stack if that is the case.
Fixes #10546
Change-Id: I2bcea48c0197fbf78ab6fa080027cd80181083ad
Reviewed-on: https://go-review.googlesource.com/9435
Reviewed-by: Ian Lance Taylor <iant@golang.org>
The problem is not actually specific to android/arm. Linux/ARM's
runtime.clone sets the stack pointer to child_stk-4 before calling
the fn. Then, when fn returns, it tries to write to 4(R13) to
provide the argument for runtime.exit, which is just beyond the allocated
child stack, and thus it will corrupt the heap randomly or trigger
segfault if that memory happens to be unmapped.
While we're here, shorten the test polling interval to 0.1s to
speed up the test (it was only checking at 1s interval, which means
the test takes at least 1s).
Fixes #10548.
Change-Id: I57cd63232022b113b6cd61e987b0684ebcce930a
Reviewed-on: https://go-review.googlesource.com/9457
Reviewed-by: David Crawshaw <crawshaw@golang.org>
The heap statistics were only written if asked for a profile with debug > 0,
but that also prints a stack trace for each profile line, which is comparatively
much noisier. The statistics are short enough and separate enough
(they only appear at the end) and useful enough that we can print them
always.
This means that people using -test.memprofile in tests will get a memory
profile with statistics included now. Pprof won't care, but if people care to
look, the numbers will be there.
This avoids the need for hacks like using -memprofilerate=1 to find
the number of allocations.
Change-Id: I10a4f593403d0315aad11b37c6e554b734caa73f
Reviewed-on: https://go-review.googlesource.com/9491
Reviewed-by: David Chase <drchase@google.com>
There's no need to call/ret to the body implementation.
It can write the result to the right place. Just jump to
it and have it return to our caller.
Old:
call body implementation
compute result
put result in a register
return
write register to result location
return
New:
load address of result location into a register
jump to body implementation
compute result
write result to passed-in address
return
It's a bit tricky on 386 because there is no free register
with which to pass the result location. Free up a register
by keeping around blen-alen instead of both alen and blen.
Change-Id: If2cf0682a5bf1cc592bdda7c126ed4eee8944fba
Reviewed-on: https://go-review.googlesource.com/9202
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Gdb is not able to backtrace our non-standard stack frames on RISC
architectures without a frame pointer.
Change-Id: Id62a566ce2d743602ded2da22ff77b9ae34bc5ae
Signed-off-by: Shenghou Ma <minux@golang.org>
Reviewed-on: https://go-review.googlesource.com/9456
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
With a 128KB stack reservation on 32-bit Windows, the maximum number
of threads is ~9000.
The original 65535-byte stack commit is causing problems on Windows
XP, where it makes the stack reservation 1MB despite the fact that
the runtime specified 128KB.
While we're here, also fix the extra spaces in the "unable to create
more OS thread" error message: println inserts a space between each
argument.
See #9457 for more information.
Change-Id: I3a82f7d9717d3d55211b6eb1c34b00b0eaad83ed
Reviewed-on: https://go-review.googlesource.com/2237
Reviewed-by: Alex Brainman <alex.brainman@gmail.com>
Run-TryBot: Minux Ma <minux@golang.org>
To avoid confusion with the runtime concept of copying stack.
Change-Id: I33442377b71012c2482c2d0ddd561492c71e70d0
Reviewed-on: https://go-review.googlesource.com/8639
Reviewed-by: Dave Cheney <dave@cheney.net>
Reviewed-by: Russ Cox <rsc@golang.org>
Technically you must initialize static pthread_mutex_t and
pthread_cond_t variables with the appropriate INITIALIZER macro. In
practice the default initializers are zero anyhow, but it's still good
code hygiene.
Change-Id: I517304b16c2c7943b3880855c1b47a9a506b4bdf
Reviewed-on: https://go-review.googlesource.com/9433
Reviewed-by: David Crawshaw <crawshaw@golang.org>
Some race tests were sensitive to the goroutine scheduling order.
When this changed in commit e870f06, these tests started to fail.
Fix TestRaceHeapParam by ensuring that the racing goroutine has
run before the test exits. Fix TestRaceRWMutexMultipleReaders by
adding a third reader to ensure that two readers wind up on the
same side of the writer (and race with each other) regardless of
the schedule. Fix TestRaceRange by ensuring that the racing
goroutine runs before the main goroutine exits the loop it races
with.
Change-Id: Iaf002f8730ea42227feaf2f3c51b9a1e57ccffdd
Reviewed-on: https://go-review.googlesource.com/9402
Reviewed-by: Russ Cox <rsc@golang.org>
This makes the OS X firewall box pop up.
It is not run during all.bash, so this hasn't been noticed before.
Change-Id: I78feb4fd3e1d3c983ae3419085048831c04de3da
Reviewed-on: https://go-review.googlesource.com/9401
Reviewed-by: Austin Clements <austin@google.com>
ReadMemStats accounts for stacks slightly differently than the runtime
does internally. Internally, only stacks allocated by newosproc0 are
accounted in memstats.stacks_sys and other stacks are accounted in
heap_sys. readmemstats_m shuffles the statistics so all stacks are
accounted in StackSys rather than HeapSys.
However, currently, readmemstats_m assumes StackSys will be zero when
it does this shuffle. This was true until commit 6ad33be. If it isn't
(e.g., if something called newosproc0), StackSys+HeapSys will be
different before and after this shuffle, and the Sys sum that was
computed earlier will no longer agree with the sum of its components.
Fix this by making the shuffle in readmemstats_m not assume that
StackSys is zero.
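A sketch of the fix, on a struct mirroring the relevant MemStats
fields (the real code works on the runtime's internal memstats):

type memStatsSketch struct {
	StackSys   uint64
	HeapSys    uint64
	stackInuse uint64 // stack memory internally accounted in heap_sys
}

// Fold the heap-accounted stack memory into StackSys instead of
// overwriting it, so Sys still equals the sum of its components even
// when StackSys starts out non-zero (e.g. after newosproc0).
func shuffleStacks(stats *memStatsSketch) {
	stats.StackSys += stats.stackInuse
	stats.HeapSys -= stats.stackInuse
}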
Fixes #10585.
Change-Id: If13991c8de68bd7b85e1b613d3f12b4fd6fd5813
Reviewed-on: https://go-review.googlesource.com/9366
Reviewed-by: Russ Cox <rsc@golang.org>
I introduced this build failure in golang.org/cl/9302 but failed to
notice due to the other failures on the dashboard.
Change-Id: I84bf00f664ba572c1ca722e0136d8a2cf21613ca
Reviewed-on: https://go-review.googlesource.com/9363
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Minux Ma <minux@golang.org>
Currently TestRaceCrawl fails to call wg.Done for every wg.Add if the
depth ever reaches 0. This causes the test to deadlock. Under the race
detector, this deadlock is not detected, so the test eventually times
out.
This only recently became a problem. Prior to commit e870f06 the depth
would never reach 0 because the strict round-robin goroutine schedule
ensured that all of the URLs were already "seen" by depth 2. Now that
the runtime prefers scheduling the most recently started goroutine,
the test is able to reach depth 0 and trigger this deadlock.
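One way to make the wg.Add/wg.Done pairing unconditional is to defer
the Done (an illustrative pattern, not necessarily the test's actual
fix):

package race_test

import "sync"

func crawl(url string, depth int, wg *sync.WaitGroup) {
	defer wg.Done() // runs even on the depth <= 0 early return
	if depth <= 0 {
		return
	}
	for _, link := range links(url) {
		wg.Add(1)
		go crawl(link, depth-1, wg)
	}
}

func links(url string) []string { return nil } // stub for the sketch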
Change-Id: I5176302a89614a344c84d587073b364833af6590
Reviewed-on: https://go-review.googlesource.com/9344
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Russ Cox <rsc@golang.org>
The master goroutine was returning before
the child goroutine had done its final i < b.N
(the one that fails and causes it to exit the loop)
and then the benchmark harness was updating
b.N, causing a read+write race on b.N.
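An illustrative shape of the fix: the benchmark waits for the child
goroutine's final i < b.N check before returning, so the harness can't
update b.N concurrently with that read.

package race_test

import "testing"

func BenchmarkPingPong(b *testing.B) {
	ch := make(chan int)
	done := make(chan struct{})
	go func() {
		for i := 0; i < b.N; i++ {
			ch <- i
		}
		close(done) // only after the last i < b.N check
	}()
	for i := 0; i < b.N; i++ {
		<-ch
	}
	<-done // don't return (and let b.N change) until the child is done
}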
Change-Id: I2504270a0de30544736f6c32161337a25b505c3e
Reviewed-on: https://go-review.googlesource.com/9368
Reviewed-by: Austin Clements <austin@google.com>
This is a follow-up to CL 9269, as suggested
by dvyukov.
There is probably even more that can be done
to speed up this shuffle. It will matter more
once CL 7570 (fine-grained locking in select)
is in and can be revisited then, with benchmarks.
Change-Id: Ic13a27d11cedd1e1f007951214b3bb56b1644f02
Reviewed-on: https://go-review.googlesource.com/9393
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
This avoids confusion with the main findrunnable in the scheduler.
Change-Id: I8cf40657557a8610a2fe5a2f74598518256ca7f0
Reviewed-on: https://go-review.googlesource.com/9305
Reviewed-by: Rick Hudson <rlh@golang.org>
Currently, we use a full stop-the-world around enabling write
barriers. This is to ensure that all Gs have enabled write barriers
before any blackening occurs (either in gcBgMarkWorker() or in
gcAssistAlloc()).
However, there's no need to bring the whole world to a synchronous
stop to ensure this. This change replaces the STW with a ragged
barrier that ensures each P has individually observed that write
barriers should be enabled before GC performs any blackening.
Change-Id: If2f129a6a55bd8bdd4308067af2b739f3fb41955
Reviewed-on: https://go-review.googlesource.com/8207
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
This adds forEachP, which performs a general-purpose ragged global
barrier. forEachP takes a callback and invokes it for every P at a GC
safe point.
Ps that are idle or in a syscall are considered to be at a continuous
safe point. forEachP ensures that these Ps do not change state by
forcing all syscall Ps into idle and holding the sched.lock.
To ensure that Ps do not enter syscall or idle without running the
safe-point function, this adds checks for a pending callback every
place there is currently a gcwaiting check.
We'll use forEachP to replace the STW around enabling the write
barrier and to replace the current asynchronous per-M wbuf cache with
a cooperatively managed per-P gcWork cache.
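A sketch of the contract with simplified types (forEachP is internal
to the runtime, not an exported API):

type pSketch struct{ wbEnabled bool }

// forEachP runs fn for every P at a GC safe point and returns only
// once all Ps have done so; idle and syscall Ps are handled on their
// behalf under sched.lock. Implementation (the ragged barrier) omitted.
func forEachP(fn func(*pSketch)) {}

// The intended use described above: replace the STW around enabling
// the write barrier, so each P observes the change at its own safe
// point.
func enableWriteBarriers() {
	forEachP(func(p *pSketch) { p.wbEnabled = true })
}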
Change-Id: Ie944f8ce1fead7c79bf271d2f42fcd61a41bb3cc
Reviewed-on: https://go-review.googlesource.com/8206
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
This fixes a bug where the runtime ready()s a goroutine while setting
up a new M that's initially marked as spinning, causing the scheduler
to later panic when it finds work in the run queue of a P associated
with a spinning M. Specifically, the sequence of events that can lead
to this is:
1) sysmon calls handoffp to hand off a P stolen from a syscall.
2) handoffp sees no pending work on the P, so it calls startm with
spinning set.
3) startm calls newm, which in turn calls allocm to allocate a new M.
4) allocm "borrows" the P we're handing off in order to do allocation
and performs this allocation.
5) This allocation may assist the garbage collector, and this assist
may detect the end of concurrent mark and ready() the main GC
goroutine to signal this.
6) This ready()ing puts the GC goroutine on the run queue of the
borrowed P.
7) newm starts the OS thread, which runs mstart and subsequently
mstart1, which marks the M spinning because startm was called with
spinning set.
8) mstart1 enters the scheduler, which panics because there's work on
the run queue, but the M is marked spinning.
To fix this, before marking the M spinning in step 7, add a check to
see if work has been added to the P's run queue. If this is the case,
undo the spinning instead.
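A sketch of the added check (illustrative; in the runtime this sits
where the new M finishes starting up):

type m struct{ spinning bool }
type p struct{ runq []uintptr }

func runqempty(pp *p) bool { return len(pp.runq) == 0 }

// If work appeared on the P while the M was being created (step 6),
// give up the spinning claim before entering the scheduler, instead of
// panicking on a spinning M with local work.
func undoSpinningIfWork(mp *m, pp *p, nmspinning *int32) {
	if mp.spinning && !runqempty(pp) {
		mp.spinning = false
		*nmspinning -= 1
	}
}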
Fixes #10573.
Change-Id: I4670495ae00582144a55ce88c45ae71de597cfa5
Reviewed-on: https://go-review.googlesource.com/9332
Reviewed-by: Russ Cox <rsc@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
This adds a check that we never put a P on the idle list when it has
work on its local run queue.
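A sketch of the check with simplified types:

type idleP struct{ runq []uintptr }

// pidleput refuses to idle a P that still has local work.
func pidleput(pp *idleP, idle *[]*idleP) {
	if len(pp.runq) != 0 {
		panic("pidleput: P has non-empty run queue")
	}
	*idle = append(*idle, pp)
}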
Change-Id: Ifcfab750de60c335148a7f513d4eef17be03b6a7
Reviewed-on: https://go-review.googlesource.com/9324
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>