When emulating the ARM FSQRT instruction, the sqrt function itself
should not use any floating point arithmetic, otherwise it will
clobber the user software FP registers.
Fortunately, the sqrt function only uses floating point instructions
to test for corner cases, so it's easy to make that function do
all its work using pure integer arithmetic only. I've verified that
after this change, runtime.stepflt and runtime.sqrt don't contain
any calls to _sfloat. (Perhaps we should add //go:nosfloat to make
the compiler enforce this?)
Fixes #10641.
Change-Id: Ida4742c49000fae4fea4649f28afde630ce4c576
Signed-off-by: Shenghou Ma <minux@golang.org>
Reviewed-on: https://go-review.googlesource.com/9570
Reviewed-by: Dave Cheney <dave@cheney.net>
Reviewed-by: Keith Randall <khr@golang.org>
This adds a field to the runtime type structure that records the size
of the prefix of objects of that type containing pointers. Any data
after this offset is scalar data.
This is necessary for shrinking the type bitmaps to 1 bit and will
help the garbage collector efficiently estimate the amount of heap
that needs to be scanned.
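As a hedged illustration (the type below is made up for this message and the
field name in the type structure is not quoted here), consider a type whose
pointer words all come first:

	type T struct {
		p *int     // pointer word
		q *string  // pointer word
		n int64    // scalar data beyond the pointer prefix
		b [16]byte // more scalar data
	}

For T, the recorded prefix size would be two pointer words; everything past
that offset is scalar and never needs to be scanned.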
Change-Id: I1318d79e6360dca0ac980245016c562e61f52ff5
Reviewed-on: https://go-review.googlesource.com/9691
Reviewed-by: Russ Cox <rsc@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
shouldtriggergc is slightly expensive due to the call overhead
and the use of an atomic. This CL reduces the number of times
we check whether a GC should be done from once per allocation
to once per span allocation. Since shouldtriggergc is an
important abstraction, simply hand-inlining it (along with its
atomic instruction) would lose the abstraction.
Change-Id: Ia3210655b4b3d433f77064a21ecb54e4d9d435f7
Reviewed-on: https://go-review.googlesource.com/9403
Reviewed-by: Austin Clements <austin@google.com>
This adds a detailed debug dump of the state of the GC controller and
a GODEBUG flag to enable it.
Change-Id: I562fed7981691a84ddf0f9e6fcd9f089f497ac13
Reviewed-on: https://go-review.googlesource.com/9640
Reviewed-by: Russ Cox <rsc@golang.org>
(1) Count pointer-free objects found during scanning roots
as marked bytes, by not zeroing the mark total after scanning roots.
(2) Don't count the bytes for the roots themselves, by not adding
them to the mark total in scanblock (the zeroing removed by (1)
was aimed at that add but hit more).
Combined, (1) and (2) fix the calculation of the marked heap size.
This makes the GC trigger much less often in the Go 1 benchmarks,
which have a global []byte pointing at 256 MB of data.
That 256 MB allocation was not being included in the heap size
in the current code, but was included in Go 1.4.
This is the source of much of the relative slowdown in that directory.
(3) Count the bytes for the roots as scanned work, by not zeroing
the scan total after scanning roots. There is no strict justification
for this, and it probably doesn't matter much either way,
but it was always combined with another buggy zeroing
(removed in (1)), so guilty by association.
Austin noticed this.
name old mean new mean delta
BenchmarkBinaryTree17 13.1s × (0.97,1.03) 5.9s × (0.97,1.05) -55.19% (p=0.000)
BenchmarkFannkuch11 4.35s × (0.99,1.01) 4.37s × (1.00,1.01) +0.47% (p=0.032)
BenchmarkFmtFprintfEmpty 84.6ns × (0.95,1.14) 85.7ns × (0.94,1.05) ~ (p=0.521)
BenchmarkFmtFprintfString 320ns × (0.95,1.06) 283ns × (0.99,1.02) -11.48% (p=0.000)
BenchmarkFmtFprintfInt 311ns × (0.98,1.03) 288ns × (0.99,1.02) -7.26% (p=0.000)
BenchmarkFmtFprintfIntInt 554ns × (0.96,1.05) 478ns × (0.99,1.02) -13.70% (p=0.000)
BenchmarkFmtFprintfPrefixedInt 434ns × (0.96,1.06) 393ns × (0.98,1.04) -9.60% (p=0.000)
BenchmarkFmtFprintfFloat 620ns × (0.99,1.03) 584ns × (0.99,1.01) -5.73% (p=0.000)
BenchmarkFmtManyArgs 2.19µs × (0.98,1.03) 1.94µs × (0.99,1.01) -11.62% (p=0.000)
BenchmarkGobDecode 21.2ms × (0.97,1.06) 15.2ms × (0.99,1.01) -28.17% (p=0.000)
BenchmarkGobEncode 18.1ms × (0.94,1.06) 11.8ms × (0.99,1.01) -35.00% (p=0.000)
BenchmarkGzip 650ms × (0.98,1.01) 649ms × (0.99,1.02) ~ (p=0.802)
BenchmarkGunzip 143ms × (1.00,1.01) 143ms × (1.00,1.01) ~ (p=0.438)
BenchmarkHTTPClientServer 110µs × (0.98,1.04) 101µs × (0.98,1.02) -8.79% (p=0.000)
BenchmarkJSONEncode 40.3ms × (0.97,1.03) 31.8ms × (0.98,1.03) -20.92% (p=0.000)
BenchmarkJSONDecode 119ms × (0.97,1.02) 108ms × (0.99,1.02) -9.15% (p=0.000)
BenchmarkMandelbrot200 6.03ms × (1.00,1.01) 6.03ms × (0.99,1.01) ~ (p=0.750)
BenchmarkGoParse 8.58ms × (0.89,1.10) 6.80ms × (1.00,1.00) -20.71% (p=0.000)
BenchmarkRegexpMatchEasy0_32 162ns × (1.00,1.01) 162ns × (0.99,1.02) ~ (p=0.131)
BenchmarkRegexpMatchEasy0_1K 540ns × (0.99,1.02) 559ns × (0.99,1.02) +3.58% (p=0.000)
BenchmarkRegexpMatchEasy1_32 139ns × (0.98,1.04) 139ns × (1.00,1.00) ~ (p=0.466)
BenchmarkRegexpMatchEasy1_1K 889ns × (0.99,1.01) 885ns × (0.99,1.01) -0.50% (p=0.022)
BenchmarkRegexpMatchMedium_32 252ns × (0.99,1.02) 252ns × (0.99,1.01) ~ (p=0.469)
BenchmarkRegexpMatchMedium_1K 72.9µs × (0.99,1.01) 73.6µs × (0.99,1.03) ~ (p=0.168)
BenchmarkRegexpMatchHard_32 3.87µs × (1.00,1.01) 3.86µs × (1.00,1.00) ~ (p=0.055)
BenchmarkRegexpMatchHard_1K 118µs × (0.99,1.01) 117µs × (0.99,1.00) ~ (p=0.133)
BenchmarkRevcomp 995ms × (0.94,1.10) 949ms × (0.99,1.01) -4.64% (p=0.000)
BenchmarkTemplate 141ms × (0.97,1.02) 127ms × (0.99,1.01) -10.00% (p=0.000)
BenchmarkTimeParse 641ns × (0.99,1.01) 623ns × (0.99,1.01) -2.79% (p=0.000)
BenchmarkTimeFormat 729ns × (0.98,1.03) 679ns × (0.99,1.00) -6.93% (p=0.000)
Change-Id: I839bd7356630d18377989a0748763414e15ed057
Reviewed-on: https://go-review.googlesource.com/9602
Reviewed-by: Austin Clements <austin@google.com>
This reverts commit c26fc88d56.
This broke pprof. See the comments on CL 9491.
Change-Id: Ic99ce026e86040c050a9bf0ea3024a1a42274ad1
Reviewed-on: https://go-review.googlesource.com/9565
Reviewed-by: Daniel Morsing <daniel.morsing@gmail.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Kind of a hack, but makes the non-optimized builds pass.
Fixes #10079
Change-Id: I26f41c546867f8f3f16d953dc043e784768f2aff
Reviewed-on: https://go-review.googlesource.com/9552
Reviewed-by: Russ Cox <rsc@golang.org>
This includes the following information in the per-function summary:
outK = paramJ encoded in outK bits for paramJ
outK = *paramJ encoded in outK bits for paramJ
heap = paramJ EscHeap
heap = *paramJ EscContentEscapes
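For example (illustrative code only; the exact -m wording may differ, and
sink is a hypothetical global), a result that aliases its parameter and a
parameter stored to a global would be summarized roughly as:

	var sink *int // hypothetical global

	func id(p *int) *int {
		return p // summary: out0 = param0 (result flows from p)
	}

	func leak(p *int) {
		sink = p // summary: heap = param0 (p escapes to the heap)
	}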
Note that (currently) if the address of a parameter is taken and
returned, necessarily a heap allocation occurred to contain that
reference, and the heap can never refer to stack, therefore the
parameter and everything downstream from it escapes to the heap.
The per-function summary information now has a tuneable number of bits
(2 is probably noticeably better than 1, 3 is likely overkill, but it
is now easy to check and the -m debugging output includes information
that allows you to figure out if more would be better.)
A new test was added to check pointer flow through struct-typed and
*struct-typed parameters and returns; some of these are sensitive to
the number of summary bits, and ought to yield better results with a
more competent escape analysis algorithm. Another new test checks
(some) correctness with array parameters, results, and operations.
The old analysis inferred a piece of the plan9 runtime was non-escaping by
counteracting overconservative analysis with buggy analysis; with the
bug fixed, the result was too conservative (and it's not easy to fix
in this framework) so the source code was tweaked to get the desired
result. A test was added against the discovered bug.
The escape analysis was further improved by splitting the "level" into
3 parts, one tracking the conventional "level" and the other two
computing the highest-level-suffix-from-copy, which is used to
generally model the cancelling effect of indirection applied to
address-of.
With the improved escape analysis enabled, it was necessary to
modify one of the runtime tests because it now attempts to allocate
too much on the (small, fixed-size) G0 (system) stack and this
failed the test.
Compiling src/std after touching src/runtime/*.go with -m logging
turned on shows 420 fewer heap allocation sites (10538 vs 10968).
Profiling allocations in src/html/template with
for i in {1..5} ;
do go tool 6g -memprofile=mastx.${i}.prof -memprofilerate=1 *.go;
go tool pprof -alloc_objects -text mastx.${i}.prof ;
done
showed a 15% reduction in allocations performed by the compiler.
Update #3753
Update #4720
Fixes #10466
Change-Id: I0fd97d5f5ac527b45f49e2218d158a6e89951432
Reviewed-on: https://go-review.googlesource.com/8202
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
App Store policy requires that programs not reference the exc_server
symbol. (Some public forum threads show that Unity ran into this
several years ago and it is a hard policy rule.) While some research
suggests that I could write my own version of exc_server, the
expedient course is to disable the exception handler by default.
Go programs only need it when running under lldb, which is primarily
used by tests. So enable the exception handler in cmd/dist when we
are running the tests.
Fixes #10646
Change-Id: I853905254894b5367edb8abd381d45585a78ee8b
Reviewed-on: https://go-review.googlesource.com/9549
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
gcDumpObject is used to print the source and destination objects when
checkmark finds a missing mark. However, gcDumpObject currently assumes
the given pointer will point to a heap object. This is not true of the
source object during root marking and may not even be true of the
destination object in the limited situations where the heap points
back into the stack.
If the pointer isn't a heap object, gcDumpObject will attempt an
out-of-bounds access to h_spans. This will cause a panicslice, which
will attempt to construct a useful panic message. This will cause a
string allocation, which will lead mallocgc to panic because the GC is
in mark termination (checkmark only happens during mark termination).
Fix this by checking that the pointer points into the heap arena
before attempting to use it as an arena pointer.
Change-Id: I09da600c380d4773f1f8f38e45b82cb229ea6382
Reviewed-on: https://go-review.googlesource.com/9498
Reviewed-by: Rick Hudson <rlh@golang.org>
Sequence of operations:
- Go code does a systemstack call
- during the systemstack call, receive a signal
- signal requests a traceback of all goroutines
The original G is still marked as _Grunning, so the traceback code
refuses to print its stack.
Fix this by allowing traceback of a G whose caller is on the same M as
that G; the G can't be modifying its stack if that is the case.
Fixes #10546
Change-Id: I2bcea48c0197fbf78ab6fa080027cd80181083ad
Reviewed-on: https://go-review.googlesource.com/9435
Reviewed-by: Ian Lance Taylor <iant@golang.org>
The problem is not actually specific to android/arm. Linux/ARM's
runtime.clone sets the stack pointer to child_stk-4 before calling
fn. Then when fn returns, it tries to write to 4(R13) to
provide the argument for runtime.exit, which is just beyond the
allocated child stack, and thus it will corrupt the heap randomly or
trigger a segfault if that memory happens to be unmapped.
While we're here, shorten the test polling interval to 0.1s to
speed up the test (it was only checking at a 1s interval, which means
the test took at least 1s).
Fixes #10548.
Change-Id: I57cd63232022b113b6cd61e987b0684ebcce930a
Reviewed-on: https://go-review.googlesource.com/9457
Reviewed-by: David Crawshaw <crawshaw@golang.org>
The heap statistics were only written if asked for a profile with debug > 0,
but that also prints a stack trace for each profile line, which is comparatively
much noisier. The statistics are short enough and separate enough
(they only appear at the end) and useful enough that we can print them
always.
This means that people using -test.memprofile in tests will get a memory
profile with statistics included now. Pprof won't care, but if people care to
look, the numbers will be there.
This avoids the need for hacks like using -memprofilerate=1 to find
the number of allocations.
Change-Id: I10a4f593403d0315aad11b37c6e554b734caa73f
Reviewed-on: https://go-review.googlesource.com/9491
Reviewed-by: David Chase <drchase@google.com>
There's no need to call/ret to the body implementation.
It can write the result to the right place. Just jump to
it and have it return to our caller.
Old:
call body implementation
compute result
put result in a register
return
write register to result location
return
New:
load address of result location into a register
jump to body implementation
compute result
write result to passed-in address
return
It's a bit tricky on 386 because there is no free register
with which to pass the result location. Free up a register
by keeping around blen-alen instead of both alen and blen.
Change-Id: If2cf0682a5bf1cc592bdda7c126ed4eee8944fba
Reviewed-on: https://go-review.googlesource.com/9202
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Gdb is not able to backtrace our non-standard stack frames on RISC
architectures without a frame pointer.
Change-Id: Id62a566ce2d743602ded2da22ff77b9ae34bc5ae
Signed-off-by: Shenghou Ma <minux@golang.org>
Reviewed-on: https://go-review.googlesource.com/9456
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
With a 128KB stack reservation, on 32-bit Windows, the maximum number
of threads is ~9000.
The original 65535-byte stack commit is causing problems on Windows
XP, where it makes the stack reservation 1MB despite the fact
that the runtime specified 128KB.
While we're here, also fix the extra spaces in the "unable to
create more OS thread" error message: println will insert a space
between each argument.
See #9457 for more information.
Change-Id: I3a82f7d9717d3d55211b6eb1c34b00b0eaad83ed
Reviewed-on: https://go-review.googlesource.com/2237
Reviewed-by: Alex Brainman <alex.brainman@gmail.com>
Run-TryBot: Minux Ma <minux@golang.org>
To avoid confusion with the runtime concept of copying stack.
Change-Id: I33442377b71012c2482c2d0ddd561492c71e70d0
Reviewed-on: https://go-review.googlesource.com/8639
Reviewed-by: Dave Cheney <dave@cheney.net>
Reviewed-by: Russ Cox <rsc@golang.org>
Technically you must initialize static pthread_mutex_t and
pthread_cond_t variables with the appropriate INITIALIZER macro. In
practice the default initializers are zero anyhow, but it's still good
code hygiene.
Change-Id: I517304b16c2c7943b3880855c1b47a9a506b4bdf
Reviewed-on: https://go-review.googlesource.com/9433
Reviewed-by: David Crawshaw <crawshaw@golang.org>
Some race tests were sensitive to the goroutine scheduling order.
When this changed in commit e870f06, these tests started to fail.
Fix TestRaceHeapParam by ensuring that the racing goroutine has
run before the test exits. Fix TestRaceRWMutexMultipleReaders by
adding a third reader to ensure that two readers wind up on the
same side of the writer (and race with each other) regardless of
the schedule. Fix TestRaceRange by ensuring that the racing
goroutine runs before the main goroutine exits the loop it races
with.
Change-Id: Iaf002f8730ea42227feaf2f3c51b9a1e57ccffdd
Reviewed-on: https://go-review.googlesource.com/9402
Reviewed-by: Russ Cox <rsc@golang.org>
This makes the OS X firewall box pop up.
It is not run during all.bash, so it hasn't been noticed before.
Change-Id: I78feb4fd3e1d3c983ae3419085048831c04de3da
Reviewed-on: https://go-review.googlesource.com/9401
Reviewed-by: Austin Clements <austin@google.com>
ReadMemStats accounts for stacks slightly differently than the runtime
does internally. Internally, only stacks allocated by newosproc0 are
accounted in memstats.stacks_sys and other stacks are accounted in
heap_sys. readmemstats_m shuffles the statistics so all stacks are
accounted in StackSys rather than HeapSys.
However, currently, readmemstats_m assumes StackSys will be zero when
it does this shuffle. This was true until commit 6ad33be. If it isn't
(e.g., if something called newosproc0), StackSys+HeapSys will be
different before and after this shuffle, and the Sys sum that was
computed earlier will no longer agree with the sum of its components.
Fix this by making the shuffle in readmemstats_m not assume that
StackSys is zero.
Fixes #10585.
Change-Id: If13991c8de68bd7b85e1b613d3f12b4fd6fd5813
Reviewed-on: https://go-review.googlesource.com/9366
Reviewed-by: Russ Cox <rsc@golang.org>
I introduced this build failure in golang.org/cl/9302 but failed to
notice due to the other failures on the dashboard.
Change-Id: I84bf00f664ba572c1ca722e0136d8a2cf21613ca
Reviewed-on: https://go-review.googlesource.com/9363
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Minux Ma <minux@golang.org>
Currently TestRaceCrawl fails to call wg.Done for every wg.Add if the
depth ever reaches 0. This causes the test to deadlock. Under the race
detector, this deadlock is not detected, so the test eventually times
out.
This only recently became a problem. Prior to commit e870f06 the depth
would never reach 0 because the strict round-robin goroutine schedule
ensured that all of the URLs were already "seen" by depth 2. Now that
the runtime prefers scheduling the most recently started goroutine,
the test is able to reach depth 0 and trigger this deadlock.
Change-Id: I5176302a89614a344c84d587073b364833af6590
Reviewed-on: https://go-review.googlesource.com/9344
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Russ Cox <rsc@golang.org>
The master goroutine was returning before
the child goroutine had done its final i < b.N
(the one that fails and causes it to exit the loop)
and then the benchmark harness was updating
b.N, causing a read+write race on b.N.
Change-Id: I2504270a0de30544736f6c32161337a25b505c3e
Reviewed-on: https://go-review.googlesource.com/9368
Reviewed-by: Austin Clements <austin@google.com>
This is a follow-up to CL 9269, as suggested
by dvyukov.
There is probably even more that can be done
to speed up this shuffle. It will matter more
once CL 7570 (fine-grained locking in select)
is in and can be revisited then, with benchmarks.
Change-Id: Ic13a27d11cedd1e1f007951214b3bb56b1644f02
Reviewed-on: https://go-review.googlesource.com/9393
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
This avoids confusion with the main findrunnable in the scheduler.
Change-Id: I8cf40657557a8610a2fe5a2f74598518256ca7f0
Reviewed-on: https://go-review.googlesource.com/9305
Reviewed-by: Rick Hudson <rlh@golang.org>
Currently, we use a full stop-the-world around enabling write
barriers. This is to ensure that all Gs have enabled write barriers
before any blackening occurs (either in gcBgMarkWorker() or in
gcAssistAlloc()).
However, there's no need to bring the whole world to a synchronous
stop to ensure this. This change replaces the STW with a ragged
barrier that ensures each P has individually observed that write
barriers should be enabled before GC performs any blackening.
Change-Id: If2f129a6a55bd8bdd4308067af2b739f3fb41955
Reviewed-on: https://go-review.googlesource.com/8207
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
This adds forEachP, which performs a general-purpose ragged global
barrier. forEachP takes a callback and invokes it for every P at a GC
safe point.
Ps that are idle or in a syscall are considered to be at a continuous
safe point. forEachP ensures that these Ps do not change state by
forcing all syscall Ps into idle and holding the sched.lock.
To ensure that Ps do not enter syscall or idle without running the
safe-point function, this adds checks for a pending callback every
place there is currently a gcwaiting check.
We'll use forEachP to replace the STW around enabling the write
barrier and to replace the current asynchronous per-M wbuf cache with
a cooperatively managed per-P gcWork cache.
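A hedged sketch of how the callback form is meant to be used (the callback
body here is hypothetical; only the general shape follows from the
description above):

	forEachP(func(_p_ *p) {
		// invoked once for every P, with that P at a GC safe point;
		// e.g., flush some per-P cache (illustrative only)
		_p_.flushLocalCache()
	})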
Change-Id: Ie944f8ce1fead7c79bf271d2f42fcd61a41bb3cc
Reviewed-on: https://go-review.googlesource.com/8206
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
This fixes a bug where the runtime ready()s a goroutine while setting
up a new M that's initially marked as spinning, causing the scheduler
to later panic when it finds work in the run queue of a P associated
with a spinning M. Specifically, the sequence of events that can lead
to this is:
1) sysmon calls handoffp to hand off a P stolen from a syscall.
2) handoffp sees no pending work on the P, so it calls startm with
spinning set.
3) startm calls newm, which in turn calls allocm to allocate a new M.
4) allocm "borrows" the P we're handing off in order to do allocation
and performs this allocation.
5) This allocation may assist the garbage collector, and this assist
may detect the end of concurrent mark and ready() the main GC
goroutine to signal this.
6) This ready()ing puts the GC goroutine on the run queue of the
borrowed P.
7) newm starts the OS thread, which runs mstart and subsequently
mstart1, which marks the M spinning because startm was called with
spinning set.
8) mstart1 enters the scheduler, which panics because there's work on
the run queue, but the M is marked spinning.
To fix this, before marking the M spinning in step 7, add a check to
see if work has been added to the P's run queue. If this is the case,
undo the spinning instead.
Fixes #10573.
Change-Id: I4670495ae00582144a55ce88c45ae71de597cfa5
Reviewed-on: https://go-review.googlesource.com/9332
Reviewed-by: Russ Cox <rsc@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
This adds a check that we never put a P on the idle list when it has
work on its local run queue.
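A minimal sketch of the check, assuming it guards the function that moves a
P onto the idle list (function name and throw message are illustrative):

	if !runqempty(_p_) {
		throw("pidleput: P has non-empty run queue")
	}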
Change-Id: Ifcfab750de60c335148a7f513d4eef17be03b6a7
Reviewed-on: https://go-review.googlesource.com/9324
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
This is the optimization made to math/rand in CL 21030043.
Change-Id: I231b24fa77cac1fe74ba887db76313b5efaab3e8
Reviewed-on: https://go-review.googlesource.com/9269
Reviewed-by: Minux Ma <minux@golang.org>
Follows the linux signal forwarding semantics from
http://golang.org/cl/8712, sharing the implementation of sigfwdgo.
Forwarding for 386, arm, and arm64 will follow.
Change-Id: I6bf30d563d19da39b6aec6900c7fe12d82ed4f62
Reviewed-on: https://go-review.googlesource.com/9302
Reviewed-by: Ian Lance Taylor <iant@golang.org>
A previous change to mbitmap.go dropped a return on a
path that seems not to be exercised. This was a mistake that
this CL fixes.
Change-Id: I715ee4ef08f5bf8d9f53cee84e8fb31a237e2d43
Reviewed-on: https://go-review.googlesource.com/9295
Reviewed-by: Austin Clements <austin@google.com>
Currently, each M has a cache of the most recently used *workbuf. This
is used primarily by the write barrier so it doesn't have to access
the global workbuf lists on every write barrier. It's also used by
stack scanning because it's convenient.
This cache is important for write barrier performance, but this
particular approach has several downsides. It's faster than no cache,
but far from optimal (as the benchmarks below show). It's complex:
access to the cache is sprinkled through most of the workbuf list
operations and it requires special care to transform into and back out
of the gcWork cache that's actually used for scanning and marking. It
requires atomic exchanges to take ownership of the cached workbuf and
to return it to the M's cache even though it's almost always used by
only the current M. Since it's per-M, flushing these caches is O(# of
Ms), which may be high. And it has some significant subtleties: for
example, in general the cache shouldn't be used after the
harvestwbufs() in mark termination because it could hide work from
mark termination, but stack scanning can happen after this and *will*
use the cache (but it turns out this is okay because it will always be
followed by a getfull(), which drains the cache).
This change replaces this cache with a per-P gcWork object. This
gcWork cache can be used directly by scanning and marking (as long as
preemption is disabled, which is a general requirement of gcWork).
Since it's per-P, it doesn't require synchronization, which simplifies
things and means the only atomic operations in the write barrier are
occasionally fetching new work buffers and setting a mark bit if the
object isn't already marked. This cache can be flushed in O(# of Ps),
which is generally small. It follows a simple flushing rule: the cache
can be used during any phase, but during mark termination it must be
flushed before allowing preemption. This also makes the dispose during
mutator assist no longer necessary, which eliminates the vast majority
of gcWork dispose calls and reduces contention on the global workbuf
lists. And it's a lot faster on some benchmarks:
benchmark old ns/op new ns/op delta
BenchmarkBinaryTree17 11963668673 11206112763 -6.33%
BenchmarkFannkuch11 2643217136 2649182499 +0.23%
BenchmarkFmtFprintfEmpty 70.4 70.2 -0.28%
BenchmarkFmtFprintfString 364 307 -15.66%
BenchmarkFmtFprintfInt 317 282 -11.04%
BenchmarkFmtFprintfIntInt 512 483 -5.66%
BenchmarkFmtFprintfPrefixedInt 404 380 -5.94%
BenchmarkFmtFprintfFloat 521 479 -8.06%
BenchmarkFmtManyArgs 2164 1894 -12.48%
BenchmarkGobDecode 30366146 22429593 -26.14%
BenchmarkGobEncode 29867472 26663152 -10.73%
BenchmarkGzip 391236616 396779490 +1.42%
BenchmarkGunzip 96639491 96297024 -0.35%
BenchmarkHTTPClientServer 100110 70763 -29.31%
BenchmarkJSONEncode 51866051 52511382 +1.24%
BenchmarkJSONDecode 103813138 86094963 -17.07%
BenchmarkMandelbrot200 4121834 4120886 -0.02%
BenchmarkGoParse 16472789 5879949 -64.31%
BenchmarkRegexpMatchEasy0_32 140 140 +0.00%
BenchmarkRegexpMatchEasy0_1K 394 394 +0.00%
BenchmarkRegexpMatchEasy1_32 120 120 +0.00%
BenchmarkRegexpMatchEasy1_1K 621 614 -1.13%
BenchmarkRegexpMatchMedium_32 209 202 -3.35%
BenchmarkRegexpMatchMedium_1K 54889 55175 +0.52%
BenchmarkRegexpMatchHard_32 2682 2675 -0.26%
BenchmarkRegexpMatchHard_1K 79383 79524 +0.18%
BenchmarkRevcomp 584116718 584595320 +0.08%
BenchmarkTemplate 125400565 109620196 -12.58%
BenchmarkTimeParse 386 387 +0.26%
BenchmarkTimeFormat 580 447 -22.93%
(Best out of 10 runs. The delta of averages is similar.)
This also puts us in a good position to flush these caches when
nearing the end of concurrent marking, which will let us increase the
size of the work buffers while still controlling mark termination
pause time.
Change-Id: I2dd94c8517a19297a98ec280203cccaa58792522
Reviewed-on: https://go-review.googlesource.com/9178
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
When findRunnable considers running a fractional mark worker, it first
checks if there's any work to be done; if there isn't, there's no point
in running the worker because it will just reschedule immediately.
However, currently findRunnable just checks work.full and
work.partial, whereas getfull can *also* draw work from m.currentwbuf.
As a result, findRunnable may not start a worker even though there
actually is work.
This problem manifests itself in occasional failures of the
test/init1.go test. This test is unusual because it performs a large
amount of allocation without executing any write barriers, which means
there's nothing to force the pointers in currentwbuf out to the
work.partial/full lists where findRunnable can see them.
This change fixes this problem by making findRunnable also check for a
currentwbuf. This aligns findRunnable with trygetfull's notion of
whether or not there's work.
Change-Id: Ic76d22b7b5d040bc4f58a6b5975e9217650e66c4
Reviewed-on: https://go-review.googlesource.com/9299
Reviewed-by: Russ Cox <rsc@golang.org>
Currently, findRunnable only considers running a mark worker if
there's work in the work queue. In principle, this can delay the start
of the desired number of dedicated mark workers if there's no work
pending. This is unlikely to occur in practice, since there should be
work queued from the scan phase, but if it were to come up, a CPU hog
mutator could slow down or delay garbage collection.
This check makes sense for fractional mark workers, since they'll just
return to the scheduler immediately if there's no work, but we want
the scheduler to start all of the dedicated mark workers promptly,
even if there's currently no queued work. Hence, this change moves the
pending work check after the check for starting a dedicated worker.
Change-Id: I52b851cc9e41f508a0955b3f905ca80f109ea101
Reviewed-on: https://go-review.googlesource.com/9298
Reviewed-by: Rick Hudson <rlh@golang.org>
The motivation is that sysAlloc/Free() currently aren't safe to be
called without a valid G, because arm's xadd64() uses locks that require
a valid G.
The solution here was proposed by Dmitry Vyukov: use xadduintptr()
instead of xadd64(), until arm can support xadd64 on all of its
architectures (not a trivial task for arm).
Change-Id: I250252079357ea2e4360e1235958b1c22051498f
Reviewed-on: https://go-review.googlesource.com/9002
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Currently, when allocation reaches the GC trigger, the runtime uses
readyExecute to start the GC goroutine immediately rather than wait
for the scheduler to get around to the GC goroutine while the mutator
continues to grow the heap.
Now that the scheduler runs the most recently readied goroutine when a
goroutine yields its time slice, this rigmarole is no longer
necessary. The runtime can simply ready the GC goroutine and yield
from the readying goroutine.
Change-Id: I3b4ebadd2a72a923b1389f7598f82973dd5c8710
Reviewed-on: https://go-review.googlesource.com/9292
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
Currently, the main GC goroutine sleeps on a note during concurrent
mark and the first background mark worker or assist to finish marking
wakes up that note to let the main goroutine proceed into mark
termination. Unfortunately, the latency of this wakeup can be quite
high, since the GC goroutine will typically have lost its P while in
the futex sleep, meaning it will be placed on the global run queue and
will wait there until some P is kind enough to pick it up. This delay
gives the mutator more time to allocate and create floating garbage,
growing the heap unnecessarily. Worse, it's likely that background
marking has stopped at this point (unless GOMAXPROCS>4), so anything
that's allocated and published to the heap during this window will
have to be scanned during mark termination while the world is stopped.
This change replaces the note sleep/wakeup with a gopark/ready
scheme. This keeps the wakeup inside the Go scheduler and lets the
garbage collector take advantage of the new scheduler semantics that
run the ready()d goroutine immediately when the ready()ing goroutine
sleeps.
For the json benchmark from x/benchmarks with GOMAXPROCS=4, this
reduces the delay in waking up the GC goroutine and entering mark
termination once concurrent marking is done from ~100ms to typically
<100µs.
Change-Id: Ib11f8b581b8914f2d68e0094f121e49bac3bb384
Reviewed-on: https://go-review.googlesource.com/9291
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
Currently, we use a note sleep with a timeout in a loop in func gc to
periodically revise the GC control variables. Replace this with a
fully blocking note sleep and use a periodic timer to trigger the
revise instead. This is a step toward replacing the note sleep in func
gc.
Change-Id: I2d562f6b9b2e5f0c28e9a54227e2c0f8a2603f63
Reviewed-on: https://go-review.googlesource.com/9290
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
Currently, when the runtime ready()s a G, it adds it to the end of the
current P's run queue and continues running. If there are many other
things in the run queue, this can result in a significant delay before
the ready()d G actually runs and can hurt fairness when other Gs in
the run queue are CPU hogs. For example, if there are three Gs sharing
a P, one of which is a CPU hog that never voluntarily gives up the P
and the other two of which are doing small amounts of work and
communicating back and forth on an unbuffered channel, the two
communicating Gs will get very little CPU time.
Change this so that when G1 ready()s G2 and then blocks, the scheduler
immediately hands off the remainder of G1's time slice to G2. In the
above example, the two communicating Gs will now act as a unit and
together get half of the CPU time, while the CPU hog gets the other
half of the CPU time.
This fixes the problem demonstrated by the ping-pong benchmark added
in the previous commit:
benchmark old ns/op new ns/op delta
BenchmarkPingPongHog 684287 825 -99.88%
On the x/benchmarks suite, this change improves the performance of
garbage by ~6% (for GOMAXPROCS=1 and 4), and json by 28% and 36% for
GOMAXPROCS=1 and 4. It has negligible effect on heap size.
This has no effect on the go1 benchmark suite since those benchmarks
are mostly single-threaded.
Change-Id: I858a08eaa78f702ea98a5fac99d28a4ac91d339f
Reviewed-on: https://go-review.googlesource.com/9289
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
This benchmark demonstrates a current problem with the scheduler where
a set of frequently communicating goroutines get very little CPU time
in the presence of another goroutine that hogs that CPU, even if one
of those communicating goroutines is always runnable.
Currently it takes about 0.5 milliseconds to switch between
ping-ponging goroutines in the presence of a CPU hog:
BenchmarkPingPongHog 2000 684287 ns/op
Change-Id: I278848c84f778de32344921ae8a4a8056e4898b0
Reviewed-on: https://go-review.googlesource.com/9288
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
There are a variety of places where we check if a P's run queue is
empty. This test is about to get slightly more complicated, so factor
it out into a new function, runqempty. This function is inlinable, so
this has no effect on performance.
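At this point the factored-out helper is essentially (sketch; the real
function may gain more conditions later):

	func runqempty(_p_ *p) bool {
		return _p_.runqhead == _p_.runqtail
	}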
Change-Id: If4a0b01ffbd004937de90d8d686f6ded4aad2c6b
Reviewed-on: https://go-review.googlesource.com/9287
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
Forward signals to signal handlers installed before Go installs its own,
under certain circumstances. In particular, as iant@ suggests, signals are
forwarded iff:
(1) a non-SIG_DFL signal handler existed before Go, and
(2) signal is synchronous (i.e., one of SIGSEGV, SIGBUS, SIGFPE), and
(3a) signal occurred on a non-Go thread, or
(3b) signal occurred on a Go thread but in CGo code.
Supported only on Linux, for now.
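Translating those conditions into a sketch (the helper and table names here
are illustrative, not the actual runtime identifiers):

	func shouldForward(sig uint32, onGoThread, inCgo bool) bool {
		if preGoHandler[sig] == _SIG_DFL { // (1) no handler existed before Go
			return false
		}
		if sig != _SIGSEGV && sig != _SIGBUS && sig != _SIGFPE { // (2) synchronous only
			return false
		}
		return !onGoThread || inCgo // (3a) non-Go thread, or (3b) Go thread in cgo code
	}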
Change-Id: I403219ee47b26cf65da819fb86cf1ec04d3e25f5
Reviewed-on: https://go-review.googlesource.com/8712
Reviewed-by: Ian Lance Taylor <iant@golang.org>
This addresses iant's comments from CL 9164.
Change-Id: I7b5b282f61b11aab587402c2d302697e76666376
Reviewed-on: https://go-review.googlesource.com/9222
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Currently, it's possible for the next_gc calculation to underflow.
Since next_gc is unsigned, this wraps around and effectively disables
GC for the rest of the program's execution. Besides being obviously
wrong, this is causing test failures on 32-bit because some tests are
running out of heap.
This underflow happens for two reasons, both having to do with how we
estimate the reachable heap size at the end of the GC cycle.
One reason is that this calculation depends on the value of heap_live
at the beginning of the GC cycle, but we currently only record that
value during a concurrent GC and not during a forced STW GC. Fix this
by moving the recorded value from gcController to work and recording
it on a common code path.
The other reason is that we use the amount of allocation during the GC
cycle as an approximation of the amount of floating garbage and
subtract it from the marked heap to estimate the reachable heap.
However, since this is only an approximation, it's possible for the
amount of allocation during the cycle to be *larger* than the marked
heap size (since the runtime allocates white and it's possible for
these allocations to never be made reachable from the heap). Currently
this causes wrap-around in our estimate of the reachable heap size,
which in turn causes wrap-around in next_gc. Fix this by bottoming out
the reachable heap estimate at 0, in which case we just fall back to
triggering GC at heapminimum (which is okay since this only happens on
small heaps).
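A hedged sketch of the bottoming-out (variable names illustrative):

	allocatedDuringCycle := heapLive - heapLiveAtCycleStart
	if markedBytes >= allocatedDuringCycle {
		reachableEstimate = markedBytes - allocatedDuringCycle
	} else {
		// the floating-garbage approximation broke down; clamp to 0 and
		// fall back to triggering at heapminimum
		reachableEstimate = 0
	}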
Fixes #10555, fixes #10556, and fixes #10559.
Change-Id: Iad07b529c03772356fede2ae557732f13ebfdb63
Reviewed-on: https://go-review.googlesource.com/9286
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Rick Hudson <rlh@golang.org>
To achieve a 2% improvement in the garbage benchmark this CL removes
an unneeded assert and avoids one hbits.next() call per object
being scanned.
Change-Id: Ibd542d01e9c23eace42228886f9edc488354df0d
Reviewed-on: https://go-review.googlesource.com/9244
Reviewed-by: Austin Clements <austin@google.com>
newosproc0 does not work on android/arm.
See issue #10548.
Change-Id: Ieaf6f5d0b77cddf5bf0b6c89fd12b1c1b8723f9b
Reviewed-on: https://go-review.googlesource.com/9293
Reviewed-by: David Crawshaw <crawshaw@golang.org>
There is an assumption that the function executed in the child thread
created by runtime.clone should not return, and different systems
enforce that differently: some exit that thread, some exit the
whole process.
The test TestNewOSProc0 introduced in CL 9161 breaks that assumption,
so we need to adjust the code to exit only the thread, not the
process, if the called function returns.
Change-Id: Id631cb2f02ec6fbd765508377a79f3f96c6a2ed6
Reviewed-on: https://go-review.googlesource.com/9246
Reviewed-by: Dave Cheney <dave@cheney.net>
Currently, we set the heap goal for the next GC cycle using the size
of the marked heap at the end of the current cycle. This can lead to a
bad feedback loop if the mutator is rapidly allocating and releasing
pointers that can significantly bloat heap size.
If the GC were STW, the marked heap size would be exactly the
reachable heap size (call it stwLive). However, in concurrent GC,
marked=stwLive+floatLive, where floatLive is the amount of "floating
garbage": objects that were reachable at some point during the cycle
and were marked, but which are no longer reachable by the end of the
cycle. If the GC cycle is short, then the mutator doesn't have much
time to create floating garbage, so marked≈stwLive. However, if the GC
cycle is long and the mutator is allocating and creating floating
garbage very rapidly, then it's possible that marked≫stwLive. Since
the runtime currently sets the heap goal based on marked, this will
cause it to set a high heap goal. This means that 1) the next GC cycle
will take longer because of the larger heap and 2) the assist ratio
will be low because of the large distance between the trigger and the
goal. The combination of these lets the mutator produce even more
floating garbage in the next cycle, which further exacerbates the
problem.
For example, on the garbage benchmark with GOMAXPROCS=1, this causes
the heap to grow to ~500MB and the garbage collector to retain upwards
of ~300MB of heap, while the true reachable heap size is ~32MB. This,
in turn, causes the GC cycle to take upwards of ~3 seconds.
Fix this bad feedback loop by estimating the true reachable heap size
(stwLive) and using this rather than the marked heap size
(stwLive+floatLive) as the basis for the GC trigger and heap goal.
This breaks the bad feedback loop and causes the mutator to assist
more, which decreases the rate at which it can create floating
garbage. On the same garbage benchmark, this reduces the maximum heap
size to ~73MB, the retained heap to ~40MB, and the duration of the GC
cycle to ~200ms.
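In terms of the pacing formula this amounts to, roughly (names illustrative,
GOGC handling simplified):

	// before: goal derived from marked = stwLive + floatLive
	nextGC = markedBytes + markedBytes*gogc/100
	// after: goal derived from the reachable-heap estimate (stwLive)
	nextGC = reachableEstimate + reachableEstimate*gogc/100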
Change-Id: I7712244c94240743b266f9eb720c03802799cdd1
Reviewed-on: https://go-review.googlesource.com/9177
Reviewed-by: Rick Hudson <rlh@golang.org>
This may or may not be useful to the end user, but it's incredibly
useful for us to understand the behavior of the pacer. Currently this
is fairly easy (though not trivial) to derive from the other heap
stats we print, but we're about to change how we compute the goal,
which will make it much harder to derive.
Change-Id: I796ef233d470c01f606bd9929820c01ece1f585a
Reviewed-on: https://go-review.googlesource.com/9176
Reviewed-by: Rick Hudson <rlh@golang.org>
The trigger controller computes GC CPU utilization by dividing by the
wall-clock time that's passed since concurrent mark began. Since this
delta is nanoseconds it's borderline impossible for it to be zero, but
if it is zero we'll currently divide by zero. Be robust to this
possibility by ignoring the utilization in the error term if no time
has elapsed.
Change-Id: I93dfc9e84735682af3e637f6538d1e7602634f09
Reviewed-on: https://go-review.googlesource.com/9175
Reviewed-by: Rick Hudson <rlh@golang.org>
We initially added clone0 to handle the case when G or M don't exist, but
it turns out that we could have just modified clone. (It also helps that
the function we're invoking in clone0 no longer needs arguments.)
As a side-effect, newosproc0 is now supported on all linux archs.
Change-Id: Ie603af75d8f164310fc16446052d83743961f3ca
Reviewed-on: https://go-review.googlesource.com/9164
Reviewed-by: David Crawshaw <crawshaw@golang.org>
Unlike linux arm32, linux arm64 does not set the condition codes to indicate
whether a system call failed or not. We must check if the return value
is in the error code range (the same as amd64 does).
Fixes the runtime.TestBadOpen test.
Change-Id: I97a8b0a17b5f002a3215c535efa91d199cee3309
Reviewed-on: https://go-review.googlesource.com/9220
Reviewed-by: Russ Cox <rsc@golang.org>
Several naming changes and a real issue in asmcgocall_errno.
Change-Id: Ieb0a328a168819fe233d74e0397358384d7e71b3
Reviewed-on: https://go-review.googlesource.com/9212
Reviewed-by: Minux Ma <minux@golang.org>
Currently, the GC controller computes the mutator assist ratio at the
beginning of the cycle by estimating that the marked heap size this
cycle will be the same as it was the previous cycle. It then uses that
assist ratio for the rest of the cycle. However, this means that if
the mutator is quickly growing its reachable heap, the heap size is
likely to exceed the heap goal and currently there's no additional
pressure on mutator assists when this happens. For example, 6g (with
GOMAXPROCS=1) frequently exceeds the goal heap size by ~25% because of
this.
This change makes GC revise its work estimate and the resulting assist
ratio every 10ms during the concurrent mark. Instead of
unconditionally using the marked heap size from the last cycle as an
estimate for this cycle, it takes the minimum of the previously marked
heap and the currently marked heap. As a result, as the cycle
approaches or exceeds its heap goal, this will increase the assist
ratio to put more pressure on the mutator assist to bring the cycle to
an end. For 6g, this causes the GC to always finish within 5% and
often within 1% of its heap goal.
Change-Id: I4333b92ad0878c704964be42c655c38a862b4224
Reviewed-on: https://go-review.googlesource.com/9070
Reviewed-by: Rick Hudson <rlh@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
Currently, in accordance with the GC pacing proposal, we schedule
background marking with a goal of achieving 25% utilization *total*
between mutator assists and background marking. This is stricter than
was set out in the Go 1.5 proposal, which suggests that the garbage
collector can use 25% just for itself and anything the mutator does to
help out is on top of that. It also has several technical
drawbacks. Because mutator assist time is constantly changing and we
can't have instantaneous information on background marking time, it
effectively requires hitting a moving target based on out-of-date
information. This works out in the long run, but works poorly for
short GC cycles and on short time scales. Also, this requires
time-multiplexing all Ps between the mutator and background GC since
the goal utilization of background GC constantly fluctuates. This
results in a complicated scheduling algorithm, poor affinity, and
extra overheads from context switching.
This change modifies the way we schedule and run background marking so
that background marking always consumes 25% of GOMAXPROCS and mutator
assist is in addition to this. This enables a much more robust
scheduling algorithm where we pre-determine the number of Ps we should
dedicate to background marking as well as the utilization goal for a
single floating "remainder" mark worker.
Change-Id: I187fa4c03ab6fe78012a84d95975167299eb9168
Reviewed-on: https://go-review.googlesource.com/9013
Reviewed-by: Rick Hudson <rlh@golang.org>
Currently, the concurrent sweep follows a 1:1 rule: when allocation
needs a span, it sweeps a span (likewise, when a large allocation
needs N pages, it sweeps until it frees N pages). This rule worked
well for the STW collector (especially when GOGC==100) because it did
no more sweeping than necessary to keep the heap from growing, would
generally finish sweeping just before GC, and ensured good temporal
locality between sweeping a page and allocating from it.
It doesn't work well with concurrent GC. Since concurrent GC requires
starting GC earlier (sometimes much earlier), the sweep often won't be
done when GC starts. Unfortunately, the first thing GC has to do is
finish the sweep. In the meantime, the mutator can continue
allocating, pushing the heap size even closer to the goal size. This
worked okay with the 7/8ths trigger, but it gets into a vicious cycle
with the GC trigger controller: if the mutator is allocating quickly
and driving the trigger lower, more and more sweep work will be left
to GC; this both causes GC to take longer (allowing the mutator to
allocate more during GC) and delays the start of the concurrent mark
phase, which throws off the GC controller's statistics and generally
causes it to push the trigger even lower.
As an example of a particularly bad case, the garbage benchmark with
GOMAXPROCS=4 and -benchmem 512 (MB) spends the first 0.4-0.8 seconds
of each GC cycle sweeping, during which the heap grows by between
109MB and 252MB.
To fix this, this change replaces the 1:1 sweep rule with a
proportional sweep rule. At the end of GC, GC knows exactly how much
heap allocation will occur before the next concurrent GC as well as
how many span pages must be swept. This change computes this "sweep
ratio" and when the mallocgc asks for a span, the mcentral sweeps
enough spans to bring the swept span count into ratio with the
allocated byte count.
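A hedged sketch of the proportional rule (names illustrative; the real
accounting is in span pages):

	// at the end of a GC cycle:
	sweepPagesPerByte = float64(pagesToSweep) / float64(expectedHeapGrowth)

	// later, when mallocgc asks mcentral for a span:
	for pagesSweptThisCycle < sweepPagesPerByte*float64(bytesAllocatedThisCycle) {
		pagesSweptThisCycle += sweepOneSpan() // hypothetical helper
	}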
On the benchmark from above, this entirely eliminates sweeping at the
beginning of GC, which reduces the time between startGC readying the
GC goroutine and GC stopping the world for sweep termination to ~100µs
during which the heap grows at most 134KB.
Change-Id: I35422d6bba0c2310d48bb1f8f30a72d29e98c1af
Reviewed-on: https://go-review.googlesource.com/8921
Reviewed-by: Rick Hudson <rlh@golang.org>
This field used to decrease with sweeps (and potentially go
negative). Now it is always zero or positive, so change it to a
uintptr so it meshes better with other memory stats.
Change-Id: I6a50a956ddc6077eeaf92011c51743cb69540a3c
Reviewed-on: https://go-review.googlesource.com/8899
Reviewed-by: Rick Hudson <rlh@golang.org>
Currently, concurrent GC triggers at a fixed 7/8*GOGC heap growth. For
mutators that allocate slowly, this means GC will trigger too early
and run too often, wasting CPU time on GC. For mutators that allocate
quickly, this means GC will trigger too late, causing the program to
exceed the GOGC heap growth goal and/or to exceed CPU goals because of
a high mutator assist ratio.
This change adds a feedback control loop to dynamically adjust the GC
trigger from cycle to cycle. By monitoring the heap growth and GC CPU
utilization from cycle to cycle, this adjusts the Go garbage collector
to target the GOGC heap growth goal and the 25% CPU utilization goal.
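A greatly simplified sketch of such a feedback loop (not the actual
controller; the real error term also folds in the measured CPU utilization):

	// at the end of each cycle, nudge the trigger toward the point that
	// would have produced the GOGC growth goal
	goalGrowth := float64(gogc) / 100
	actualGrowth := float64(heapLive)/float64(heapMarked) - 1
	triggerRatio += triggerGain * (goalGrowth - actualGrowth)

	// the next cycle triggers once the heap has grown by triggerRatio
	nextGCTrigger = uint64(float64(heapMarked) * (1 + triggerRatio))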
Change-Id: Ic82eef288c1fa122f73b69fe604d32cbb219e293
Reviewed-on: https://go-review.googlesource.com/8851
Reviewed-by: Rick Hudson <rlh@golang.org>
Currently, the concurrent mark phase is performed by the main GC
goroutine. Prior to the previous commit enabling preemption, this
caused marking to always consume 1/GOMAXPROCS of the available CPU
time. If GOMAXPROCS=1, this meant background GC would consume 100% of
the CPU (effectively a STW). If GOMAXPROCS>4, background GC would use
less than the goal of 25%. If GOMAXPROCS=4, background GC would use
the goal 25%, but if the mutator wasn't using the remaining 75%,
background marking wouldn't take advantage of the idle time. Enabling
preemption in the previous commit made GC miss CPU targets in
completely different ways, but set us up to bring everything back in
line.
This change replaces the fixed GC goroutine with per-P background mark
goroutines. Once started, these goroutines don't go in the standard
run queues; instead, they are scheduled specially such that the time
spent in mutator assists and the background mark goroutines totals 25%
of the CPU time available to the program. Furthermore, this lets
background marking take advantage of idle Ps, which significantly
boosts GC performance for applications that under-utilize the CPU.
This requires also changing how time is reported for gctrace, so this
change splits the concurrent mark CPU time into assist/background/idle
scanning.
This also requires increasing the size of the StackRecord slice used
in a GoroutineProfile test.
Change-Id: I0936ff907d2cee6cb687a208f2df47e8988e3157
Reviewed-on: https://go-review.googlesource.com/8850
Reviewed-by: Rick Hudson <rlh@golang.org>
Currently, the entire GC process runs with g.m.preemptoff set. In the
concurrent phases, the parts that actually need preemption disabled
are run on a system stack and there's no overall need to stay on the
same M or P during the concurrent phases. Hence, move the setting of
g.m.preemptoff to when we start mark termination, at which point we
really do need preemption disabled.
This dramatically changes the scheduling behavior of the concurrent
mark phase. Currently, since this is non-preemptible, concurrent mark
gets one dedicated P (so 1/GOMAXPROCS utilization). With this change,
the GC goroutine is scheduled like any other goroutine during
concurrent mark, so it gets 1/<runnable goroutines> utilization.
You might think it's not even necessary to set g.m.preemptoff at that
point since the world is stopped, but stackalloc/stackfree use this as
a signal that the per-P pools are not safe to access without
synchronization.
Change-Id: I08aebe8179a7d304650fb8449ff36262b3771099
Reviewed-on: https://go-review.googlesource.com/8839
Reviewed-by: Rick Hudson <rlh@golang.org>
This time is tracked per P and periodically flushed to the global
controller state. This will be used to compute mutator assist
utilization in order to schedule background GC work.
Change-Id: Ib94f90903d426a02cf488bf0e2ef67a068eb3eec
Reviewed-on: https://go-review.googlesource.com/8837
Reviewed-by: Rick Hudson <rlh@golang.org>
Currently, mutator allocation periodically assists the garbage
collector by performing a small, fixed amount of scanning work.
However, to control heap growth, mutators need to perform scanning
work *proportional* to their allocation rate.
This change implements proportional mutator assists. This uses the
scan work estimate computed by the garbage collector at the beginning
of each cycle to compute how much scan work must be performed per
allocation byte to complete the estimated scan work by the time the
heap reaches the goal size. When allocation triggers an assist, it
uses this ratio and the amount allocated since the last assist to
compute the assist work, then attempts to steal as much of this work
as possible from the background collector's credit, and then performs
any remaining scan work itself.
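A hedged sketch of the per-assist computation (names illustrative):

	// once, at the start of the cycle:
	assistRatio := float64(estimatedScanWork) / float64(heapGoal-heapTrigger)

	// on an allocation that triggers an assist:
	debt := int64(assistRatio * float64(bytesSinceLastAssist))
	debt -= stealFromBackgroundCredit(debt) // hypothetical helper
	if debt > 0 {
		gcDrainN(gcw, debt) // perform the remaining scan work ourselves
	}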
Change-Id: I98b2078147a60d01d6228b99afd414ef857e4fba
Reviewed-on: https://go-review.googlesource.com/8836
Reviewed-by: Rick Hudson <rlh@golang.org>
Currently, the "n" in gcDrainN is in terms of objects to scan. This is
used by gchelpwork to perform a limited amount of work on allocation,
but is a pretty arbitrary way to bound this amount of work since the
number of objects has little relation to how long they take to scan.
Modify gcDrainN to perform a fixed amount of scan work instead. For
now, gchelpwork still performs a fairly arbitrary amount of scan work,
but at least this is much more closely related to how long the work
will take. Shortly, we'll use this to precisely control the scan work
performed by mutator assists during allocation to achieve the heap
size goal.
Change-Id: I3cd07fe0516304298a0af188d0ccdf621d4651cc
Reviewed-on: https://go-review.googlesource.com/8835
Reviewed-by: Rick Hudson <rlh@golang.org>
This tracks scan work done by background GC in a global pool. Mutator
assists will draw on this credit to avoid doing work when background
GC is staying ahead.
Unlike the other GC controller tracking variables, this will be both
written and read throughout the cycle. Hence, we can't arbitrarily
delay updates like we can for scan work and bytes marked. However, we
still want to minimize contention, so this global credit pool is
allowed some error from the "true" amount of credit. Background GC
accumulates credit locally up to a limit and only then flushes to the
global pool. Similarly, mutator assists will draw from the credit pool
in batches.
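A hedged sketch of the batched flush on the background side (identifiers
are my reading of the surrounding commits and should be treated as
illustrative):

	// background worker: accumulate credit locally, flush in batches
	gcw.scanWork += n
	if gcw.scanWork > gcCreditSlack {
		xaddint64(&gcController.bgScanCredit, gcw.scanWork)
		gcw.scanWork = 0
	}
	// mutator assists decrement the same counter, also in batches,
	// before falling back to doing scan work themselves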
Change-Id: I1aa4fc604b63bf53d1ee2a967694dffdfc3e255e
Reviewed-on: https://go-review.googlesource.com/8834
Reviewed-by: Rick Hudson <rlh@golang.org>
This implements tracking the scan work ratio of a GC cycle and using
this to estimate the scan work that will be required by the next GC
cycle. Currently this estimate is unused; it will be used to drive
mutator assists.
Change-Id: I8685b59d89cf1d83eddfc9b30d84da4e3a7f4b72
Reviewed-on: https://go-review.googlesource.com/8833
Reviewed-by: Rick Hudson <rlh@golang.org>
This tracks the amount of scan work in terms of scanned pointers
during the concurrent mark phase. We'll use this information to
estimate scan work for the next cycle.
Currently this accumulates the work counter in gcWork, and dispose
atomically aggregates it into a global work counter. dispose happens
relatively infrequently, so the contention on the global counter
should be low. If this turns out to be an issue, we can reduce the
number of disposes, and if it's still a problem, we can switch to
per-P counters.
Change-Id: Iac0364c466ee35fab781dbbbe7970a5f3c4e1fc1
Reviewed-on: https://go-review.googlesource.com/8832
Reviewed-by: Rick Hudson <rlh@golang.org>
These currently use portable implementations in terms of their uint64
counterparts.
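For example, the int64 add can be a thin wrapper over the uint64 one
(sketch; the function names here are assumptions):

	//go:nosplit
	func xaddint64(ptr *int64, delta int64) int64 {
		return int64(xadd64((*uint64)(unsafe.Pointer(ptr)), delta))
	}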
Change-Id: Icba5f7134cfcf9d0429edabcdd73091d97e5e905
Reviewed-on: https://go-review.googlesource.com/8831
Reviewed-by: Rick Hudson <rlh@golang.org>
This change exposes reflect.ArrayOf to create new reflect.Type array
types at runtime, when given a reflect.Type element.
- reflect: implement ArrayOf
- reflect: tests for ArrayOf
- runtime: document that typeAlg is used by reflect and must be kept
in sync
Fixes #5996.
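For instance, the new API can be used as (usage sketch):

	t := reflect.ArrayOf(4, reflect.TypeOf(int(0)))
	fmt.Println(t)             // [4]int
	v := reflect.New(t).Elem() // a settable [4]int value
	v.Index(0).SetInt(42)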
Change-Id: I5d07213364ca915c25612deea390507c19461758
Reviewed-on: https://go-review.googlesource.com/4111
Reviewed-by: Keith Randall <khr@golang.org>
Optimized heapBitsForObject by special casing
objects whose size is a power of two. When a
span holding such objects is initialized, I
added a mask that, when &ed with an interior pointer,
results in the base of the object. For the garbage
benchmark this resulted in CPU_CLK_UNHALTED in
heapBitsForObject going from 7.7% down to 5.9%
of the total, INST_RETIRED went from 12.2 -> 8.7.
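The fast path amounts to something like (field names illustrative):

	// for spans whose elemsize is a power of two, find the object base
	// with the mask computed at span initialization instead of a division
	base = span.base() + (p-span.base())&span.baseMask // baseMask ≈ ^(elemsize-1)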
Here are the benchmarks that were at least plus or minus 1%.
benchmark old ns/op new ns/op delta
BenchmarkFmtFprintfString 249 221 -11.24%
BenchmarkFmtFprintfInt 247 223 -9.72%
BenchmarkFmtFprintfEmpty 76.5 69.6 -9.02%
BenchmarkBinaryTree17 4106631412 3744550160 -8.82%
BenchmarkFmtFprintfFloat 424 399 -5.90%
BenchmarkGoParse 4484421 4242115 -5.40%
BenchmarkGobEncode 8803668 8449107 -4.03%
BenchmarkFmtManyArgs 1494 1436 -3.88%
BenchmarkGobDecode 10431051 10032606 -3.82%
BenchmarkFannkuch11 2591306713 2517400464 -2.85%
BenchmarkTimeParse 361 371 +2.77%
BenchmarkJSONDecode 70620492 68830357 -2.53%
BenchmarkRegexpMatchMedium_1K 54693 53343 -2.47%
BenchmarkTemplate 90008879 91929940 +2.13%
BenchmarkTimeFormat 380 387 +1.84%
BenchmarkRegexpMatchEasy1_32 111 113 +1.80%
BenchmarkJSONEncode 21359159 21007583 -1.65%
BenchmarkRegexpMatchEasy1_1K 603 613 +1.66%
BenchmarkRegexpMatchEasy0_32 127 129 +1.57%
BenchmarkFmtFprintfIntInt 399 393 -1.50%
BenchmarkRegexpMatchEasy0_1K 373 378 +1.34%
Change-Id: I78e297161026f8b5cc7507c965fd3e486f81ed29
Reviewed-on: https://go-review.googlesource.com/8980
Reviewed-by: Austin Clements <austin@google.com>
This CL revises CL 7504 to use explicit uintptr types for the
struct fields that are going to be updated sometimes without
write barriers. The result is that the fields are now updated *always*
without write barriers.
This approach has two important properties:
1) Now the GC never looks at the field, so if the missing reference
could cause a problem, it will do so all the time, not just when the
write barrier is missed at just the right moment.
2) Now a write barrier never happens for the field, avoiding the
(correct) detection of inconsistent write barriers when GODEBUG=wbshadow=1.
Change-Id: Iebd3962c727c0046495cc08914a8dc0808460e0e
Reviewed-on: https://go-review.googlesource.com/9019
Reviewed-by: Austin Clements <austin@google.com>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
The callee-saved registers must be saved because for the c-shared case
this code is invoked from C code in the system library, and that code
expects the registers to be saved. The tests were passing because in
the normal case the code calls a cgo function that naturally saves
callee-saved registers anyhow. However, it fails when the code takes
the non-cgo path.
Change-Id: I9c1f5e884f5a72db9614478049b1863641c8b2b9
Reviewed-on: https://go-review.googlesource.com/9114
Reviewed-by: David Crawshaw <crawshaw@golang.org>
Change-Id: I87147ca6bb53e3121cc4245449c519509f107638
Reviewed-on: https://go-review.googlesource.com/9009
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: David Crawshaw <crawshaw@golang.org>
It's not helping anymore, and it's fooling people who try to
understand performance (like me).
Change-Id: I133a644acae0ddf1bfa17c654cdc01e2089da963
Reviewed-on: https://go-review.googlesource.com/9018
Reviewed-by: Austin Clements <austin@google.com>
readyExecute passes a closure to mcall that captures an argument to
readyExecute. Since mcall is marked noescape, this closure lives on
the stack of the calling goroutine. However, the closure puts the
calling goroutine on the run queue (and switches to a new
goroutine). If the calling goroutine gets scheduled before the mcall
returns, this stack-allocated closure will become invalid while it's
still executing. One consequence of this that we've observed is that the
captured gp variable can get overwritten before the call to
execute(gp), causing execute(gp) to segfault.
Fix this by passing the currently captured gp variable through a field
in the calling goroutine's g struct so that the func is no longer a
closure.
To prevent problems like this in the future, this change also removes
the go:noescape annotation from mcall. Due to a compiler bug, this
will currently cause a func closure passed to mcall to be implicitly
allocated rather than refusing the implicit allocation. However, this
is okay because there are no other closures passed to mcall right now
and the compiler bug will be fixed shortly.
Fixes#10428.
Change-Id: I49b48b85de5643323b89e9eaa4df63854e968c32
Reviewed-on: https://go-review.googlesource.com/8866
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Russ Cox <rsc@golang.org>
Previously we started the Go runtime from a JNI function call, which
eventually called the program's main function. Now the runtime is
initialized by an ELF initialization function as a c-shared library,
and the program's main function is not called. So now we export main
so it can be called from JNI.
This is necessary for all-Go apps because unlike a normal shared
library, the program loading the library is not written by or known
to the programmer. As far as they are concerned, the .so is
everything. In fact the same code is compiled for iOS as a normal Go
program.
Change-Id: I61c6a92243240ed229342362231b1bfc7ca526ba
Reviewed-on: https://go-review.googlesource.com/9015
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
Change-Id: Ie7f85873978adf3fd5c739176f501ca219592824
Reviewed-on: https://go-review.googlesource.com/9011
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
This memory is untyped and can't be used anymore.
The next version of SWIG won't need it.
Change-Id: I592b287c5f5186975ee09a9b28d8efe3b57134e7
Reviewed-on: https://go-review.googlesource.com/8956
Reviewed-by: Ian Lance Taylor <iant@golang.org>
For some reason the absence of an implementation does not stop arm64
binaries from being built. However, it comes up with -buildmode=c-archive.
Change-Id: Ic0db5fd8fb4fe8252b5aa320818df0c7aec3db8f
Reviewed-on: https://go-review.googlesource.com/8989
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Change-Id: I8e912ff9327a4163b63b8c628aa3546e86ddcc02
Reviewed-on: https://go-review.googlesource.com/8983
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
Change-Id: I09e84161d106960a69972f5fc845a1e40c28e58f
Reviewed-on: https://go-review.googlesource.com/8331
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
It is faster to execute
MOVQ AX,(DI)
MOVQ AX,8(DI)
MOVQ AX,16(DI)
MOVQ AX,24(DI)
ADDQ $32,DI
than
STOSQ
STOSQ
STOSQ
STOSQ
However, in order to be able to jump into
the middle of a block of MOVQs, the call
site needs to pre-adjust DI.
If we're clearing a small area, the cost
of that DI pre-adjustment isn't repaid.
This CL switches the DUFFZERO implementation
to use a hybrid strategy, in which small
clears use STOSQ as before, but large clears
use mostly MOVQ/ADDQ blocks.
benchmark old ns/op new ns/op delta
BenchmarkClearFat8 0.55 0.55 +0.00%
BenchmarkClearFat12 0.82 0.83 +1.22%
BenchmarkClearFat16 0.55 0.55 +0.00%
BenchmarkClearFat24 0.82 0.82 +0.00%
BenchmarkClearFat32 2.20 1.94 -11.82%
BenchmarkClearFat40 1.92 1.66 -13.54%
BenchmarkClearFat48 2.21 1.93 -12.67%
BenchmarkClearFat56 3.03 2.20 -27.39%
BenchmarkClearFat64 3.26 2.48 -23.93%
BenchmarkClearFat72 3.57 2.76 -22.69%
BenchmarkClearFat80 3.83 3.05 -20.37%
BenchmarkClearFat88 4.14 3.30 -20.29%
BenchmarkClearFat128 5.54 4.69 -15.34%
BenchmarkClearFat256 9.95 9.09 -8.64%
BenchmarkClearFat512 18.7 17.9 -4.28%
BenchmarkClearFat1024 36.2 35.4 -2.21%
Change-Id: Ic786406d9b3cab68d5a231688f9e66fcd1bd7103
Reviewed-on: https://go-review.googlesource.com/2585
Reviewed-by: Keith Randall <khr@golang.org>
By removing type slice, renaming type sliceStruct to type slice and
whacking until it compiles.
Has a pleasing net reduction of conversions.
Fixes#10188
Change-Id: I77202b8df637185b632fd7875a1fdd8d52c7a83c
Reviewed-on: https://go-review.googlesource.com/8770
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Fixes#10450
runtime.cputicks is called from runtime.exitsyscall and must not
split the stack. cputicks is implemented in several ways and the
NOSPLIT annotation was missing from a few of these.
Change-Id: I5cbbb4e5888c5d298fe2fef240782d0e49f59af8
Reviewed-on: https://go-review.googlesource.com/8939
Reviewed-by: Aram Hăvărneanu <aram@mgk.ro>
When Windows calls externalthreadhandler it expects to receive a
return value in AX. We don't set AX anywhere. Change that.
Store ctrlhandler1 and profileloop1 return values into AX before
returning from externalthreadhandler.
Fixes#10215.
Change-Id: Ied04542cc3ebe7d4a26660e970f9f78098143591
Reviewed-on: https://go-review.googlesource.com/8901
Reviewed-by: Minux Ma <minux@golang.org>
Run-TryBot: Alex Brainman <alex.brainman@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
A G will be preempted if it runs for 10ms without blocking. Currently
this constant is hard-coded in retake. Move it to a global const.
We'll use the time slice length in scheduling background GC.
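In code this amounts to giving the literal a name that both retake and
the GC scheduling code can refer to; a sketch (the constant name is an
assumption, the value comes from the text above):
package gcsketch
// Time slice, in nanoseconds, after which a running G may be preempted.
const forcePreemptNS = 10 * 1000 * 1000 // 10ms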
Change-Id: I79a979948af2fad3afe5df9d4af4062f166554b7
Reviewed-on: https://go-review.googlesource.com/8838
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
mHeap_ReclaimList is asked to reclaim at least npages pages, but it
counts the number of spans reclaimed, not the number of pages
reclaimed. The number of pages reclaimed is at least the number of
spans reclaimed, so this is not strictly wrong, but it forces more
reclamation than was intended by the caller, which delays large
allocations.
Fix this by increasing the count by the number of pages in the swept
span, rather than just increasing it by 1.
Fixes#9048.
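A minimal sketch of the fix (loop shape and names are illustrative):
count reclaimed pages rather than reclaimed spans, so the loop stops
as soon as the caller's request is met.
package gcsketch
type mspanSketch struct{ npages uintptr }
func sweepSpan(s *mspanSketch) {} // stand-in for the real sweeper
// reclaimList sweeps spans from list until at least npages pages are reclaimed.
func reclaimList(list []*mspanSketch, npages uintptr) uintptr {
	var reclaimed uintptr
	for _, s := range list {
		sweepSpan(s)
		reclaimed += s.npages // previously this was effectively reclaimed++
		if reclaimed >= npages {
			break
		}
	}
	return reclaimed
}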
Change-Id: I5ae364a9837a6012e68fcd431bba000340cfd50c
Reviewed-on: https://go-review.googlesource.com/8920
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Rick Hudson <rlh@golang.org>
Commit d7e0ad4 removed the next_gc manipulation from mSpan_Sweep, but
left in the traceNextGC() for recording the updated next_gc
value. Remove this now unnecessary call.
Change-Id: I28e0de071661199be9810d7bdcc81ce50b5a58ae
Reviewed-on: https://go-review.googlesource.com/8894
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Rick Hudson <rlh@golang.org>
With the new buildmodes c-archive and c-shared, it is possible for a
cgo call to come in early in the lifecycle of a Go program. Calls
before the runtime has been initialized are caught by
_cgo_wait_runtime_init_done. However a call can come in after the
runtime has initialized, but before the program's package init
functions have finished running.
To avoid this cgocallback checks m.ncgo to see if we are on a thread
running Go. If not, we may be a foreign thread and it blocks until
main_init is complete.
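A rough sketch of the gating pattern (the channel name and function
shape are illustrative of the approach, not the exact runtime code):
package gcsketch
// Closed by the runtime once the program's package init functions finish.
var mainInitDone = make(chan struct{})
func cgocallbackSketch(onGoThread bool, fn func()) {
	if !onGoThread {
		// A callback arriving on a foreign thread before package
		// initialization has finished must wait here.
		<-mainInitDone
	}
	fn()
}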
Change-Id: I7a9f137fa2a40c322a0b93764261f9aa17fcf5b8
Reviewed-on: https://go-review.googlesource.com/8897
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: David Crawshaw <crawshaw@golang.org>
Avoids shadowing the builtin channel close function.
Change-Id: I7a729b0937c8248fe27222be61318a88db995eee
Reviewed-on: https://go-review.googlesource.com/8898
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: David Crawshaw <crawshaw@golang.org>
Follows http://golang.org/cl/8454, a similar CL for arm architectures.
This CL involves android-specific changes, namely, synthesizing
argv/auxv, as android doesn't provide those to the init functions.
This code is based on crawshaw@'s android code in golang.org/x/mobile.
Change-Id: I32364efbb2662e80270a99bd7dfb1d0421b5417d
Reviewed-on: https://go-review.googlesource.com/8457
Reviewed-by: David Crawshaw <crawshaw@golang.org>
Adds the runtime initialization flow for arm akin to amd64.
In particular, we use the library initialization entry point to:
- create a new OS thread and run the "regular" runtime init stack on
that thread
- return immediately from the main (i.e., loader) thread
- at the first CGO invocation, we wait for the runtime initialization
to complete.
Verified to work on a Raspberry Pi and an Android phone.
Change-Id: I32f39228ae30a03ce9569287f234b305790fecf6
Reviewed-on: https://go-review.googlesource.com/8455
Reviewed-by: David Crawshaw <crawshaw@golang.org>
Run-TryBot: Srdjan Petrovic <spetrovic@google.com>
Related to issue #10410
For some reason, any non-trivial code in _cgo_wait_runtime_init_done
(even fprintf()) will crash that call.
If anybody has any guess why this is happening, please let me know!
For now, I'm clearing the functions for ppc64, as it's currently not used.
Change-Id: I1b11383aaf4f9f9a16f1fd6606842cfeedc9f0b3
Reviewed-on: https://go-review.googlesource.com/8766
Reviewed-by: David Crawshaw <crawshaw@golang.org>
Run-TryBot: Srdjan Petrovic <spetrovic@google.com>
Just like darwin/arm.
Change-Id: Ic75927bd6457d37cda7dd8279fd9b4cd52edc1d1
Reviewed-on: https://go-review.googlesource.com/8813
Reviewed-by: Minux Ma <minux@golang.org>
Like other arm64 platforms, darwin/arm64 has a different physical
page size to logical page size so it is running into issue 9993. I
hope it can be fixed for Go 1.5, but for now it is demonstrating the
same bug as the other skipped os+arch combinations.
Change-Id: Iedaf9afe56d6954bb4391b6e843d81742a75a00c
Reviewed-on: https://go-review.googlesource.com/8814
Reviewed-by: Minux Ma <minux@golang.org>
Just like darwin/arm.
Change-Id: Ie4998d24b2d891a9f6c8047ec40cd3fdf80622cd
Reviewed-on: https://go-review.googlesource.com/8812
Reviewed-by: Minux Ma <minux@golang.org>
Tested by using -buildmode=c-archive to generate an archive, adding it
to an Xcode project, and calling a Go function from an iOS app. (I'm
still investigating proper buildmode tests for all.bash.)
Change-Id: I7890df15246df8e90ad27837b8d64ba2cde409fe
Reviewed-on: https://go-review.googlesource.com/8719
Reviewed-by: Ian Lance Taylor <iant@golang.org>
A similar fix was applied in 545686857b
but another instance of 'pc' was missed.
Also adds a test for the goroutine gdb command.
It currently uses goroutine 2 for the test, since goroutine 1 has
its stack pointer set to 0 for some reason.
Change-Id: I53ca22be6952f03a862edbdebd9b5c292e0853ae
Reviewed-on: https://go-review.googlesource.com/8729
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Currently, when allocation reaches the concurrent GC trigger size, we
start the concurrent collector by ready'ing its G. This simply puts it
on the end of the P's run queue, which means we may not actually start
GC for some time as the current G continues to run and then the P
drains other Gs already on its run queue. Since the mutator can
continue to allocate, the heap can potentially be much larger than we
intended by the time GC actually starts. Furthermore, how much larger
is difficult to predict since it depends on the scheduler.
Fix this by preempting the current G and switching directly to the
concurrent GC G as soon as we reach the trigger heap size.
On the garbage benchmark from the benchmarks subrepo with
GOMAXPROCS=4, this reduces the time from triggering the GC to the
beginning of sweep termination by 10 to 30 milliseconds, which reduces
allocation after the trigger by up to 10MB (a large fraction of the
64MB live heap the benchmark tries to maintain).
One other known source of delay before we "really" start GC is the
sweep finalization performed before sweep termination. This has
similar negative effects on heap size and predictability, but is an
orthogonal problem. This change adds a TODO for this.
Change-Id: I8bae98cb43685c1bf353ff55868e4647e3743c47
Reviewed-on: https://go-review.googlesource.com/8513
Reviewed-by: Rick Hudson <rlh@golang.org>
These were appropriate for STW GC, since it interrupted the allocating
Goroutine, but don't apply to concurrent GC, which runs on its own
Goroutine. Forced GC is still STW, but it makes sense to attribute the
GC to the goroutine that called runtime.GC().
Change-Id: If12418ca66dc7e53b8b16025af4e03adb5d9577e
Reviewed-on: https://go-review.googlesource.com/8715
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Rick Hudson <rlh@golang.org>
exitsyscallfast checks for freezetheworld, but does so only by
checking if stopwait is positive. This can also happen during
stoptheworld, which is harmless, but confusing. Shortly, it will be
important that we get to the p.status cas even if stopwait is set.
Hence, make this test more specific so it only triggers with
freezetheworld and not other uses of stopwait.
Change-Id: Ibb722cd8360c3ed5a9654482519e3ceb87a8274d
Reviewed-on: https://go-review.googlesource.com/8205
Reviewed-by: Russ Cox <rsc@golang.org>
'themoduledata' doesn't really make sense now that we support multiple moduledata
objects.
Change-Id: I8263045d8f62a42cb523502b37289b0fba054f62
Reviewed-on: https://go-review.googlesource.com/8521
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
This changes all the places that consult themoduledata to consult a
linked list of moduledata objects, as will be necessary for
-linkshared to work.
Obviously, as there is as yet no way of adding moduledata objects to
this list, all this change achieves right now is wasting a few
instructions here and there.
Change-Id: I397af7f60d0849b76aaccedf72238fe664867051
Reviewed-on: https://go-review.googlesource.com/8231
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Currently, the initial heap size reported in the gctrace line is the
heap_live right before sweep termination. However, we triggered GC
when heap_live reached next_gc, and there may have been significant
allocation between that point and the beginning of sweep
termination. Ideally these would be essentially the same, but
currently there's scheduler delay when readying the GC goroutine as
well as delay from background sweep finalization.
We should fix this delay, but in the mean time, to give the user a
better idea of how much the heap grew during the whole of garbage
collection, report the trigger rather than what the heap size happened
to be after the garbage collector finished rolling out of bed. This
will also be more useful for heap growth plots.
Change-Id: I08476b9fbcfb2de90592405e9c9f434dfb9eb1f8
Reviewed-on: https://go-review.googlesource.com/8512
Reviewed-by: Rick Hudson <rlh@golang.org>
According to Go execution modes, a Go program compiled with
-buildmode=c-archive has a main function, but it is not run.
This gives the runtime the information it needs to avoid running main.
I have this working with pending linker changes on darwin/amd64.
Change-Id: I49bd7d65aa619ec847c464a872afa5deea7d4d30
Reviewed-on: https://go-review.googlesource.com/8701
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: David Crawshaw <crawshaw@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Add arm64 assembly implementation of runtime.cmpstring and bytes.Compare.
benchmark old ns/op new ns/op delta
BenchmarkCompareBytesEqual 98.0 27.5 -71.94%
BenchmarkCompareBytesToNil 9.38 10.0 +6.61%
BenchmarkCompareBytesEmpty 13.3 10.0 -24.81%
BenchmarkCompareBytesIdentical 98.0 27.5 -71.94%
BenchmarkCompareBytesSameLength 43.3 16.3 -62.36%
BenchmarkCompareBytesDifferentLength 43.4 16.3 -62.44%
BenchmarkCompareBytesBigUnaligned 6979680 1360979 -80.50%
BenchmarkCompareBytesBig 6915995 1381979 -80.02%
BenchmarkCompareBytesBigIdentical 6781440 1327304 -80.43%
benchmark old MB/s new MB/s speedup
BenchmarkCompareBytesBigUnaligned 150.23 770.46 5.13x
BenchmarkCompareBytesBig 151.62 758.76 5.00x
BenchmarkCompareBytesBigIdentical 154.63 790.01 5.11x
* Note: the machine we are benchmarking on has some issues. What is clear
is that, compared to a few days ago, the old MB/s value has increased from
~115 to 150. I'm less certain about the new MB/s number, which used to be
close to 1 GB/s.
Change-Id: I4f31b2c7a06296e13912aacc958525632cb0450d
Reviewed-on: https://go-review.googlesource.com/8541
Reviewed-by: Aram Hăvărneanu <aram@mgk.ro>
Reviewed-by: David Crawshaw <crawshaw@golang.org>
Mainly it is a simple copy. But I had to change the amd64
lastcontinuehandler return value from uint32 to int32.
I don't remember how it happened to be uint32, but the new
int32 matches the Windows documentation (LONG) better.
I don't think it matters one way or the other.
Change-Id: I6935224a2470ad6301e27590f2baa86c13bbe8d5
Reviewed-on: https://go-review.googlesource.com/8686
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
In cl/8652 I broke darwin/arm and darwin/386 because I removed the *g
parameter, which they both expect and use. This CL adjusts both ports
to look for g0 in m, just as darwin/amd64 does.
Tested on darwin{386,arm,amd64}.
Change-Id: Ia56f3d97e126b40d8bbd2e8f677b008e4a1badad
Reviewed-on: https://go-review.googlesource.com/8666
Reviewed-by: Ian Lance Taylor <iant@golang.org>
This is a practice run for darwin/arm.
Similar to the linux/amd64 shared library entry point. With several
pending linker changes I am successfully using this to implement
-buildmode=c-archive on darwin/amd64 with external linking.
The same entry point can be reused to implement -buildmode=c-shared
on darwin/amd64, however that will require further ld changes to
remove all text relocations.
One extra runtime change will follow this. According to the Go
execution modes document, -buildmode=c-archive should ignore the Go
main function. Right now it is being executed (and the process exits
if it doesn't block). I'm still searching for the right way to do
this.
Change-Id: Id97901ddd4d46970996f222bd79731dabff66a3d
Reviewed-on: https://go-review.googlesource.com/8652
Reviewed-by: Ian Lance Taylor <iant@golang.org>
This CL is quite conservative in some ways. It continues to define
symbols that have no real purpose (e.g. epclntab). These could be
deleted if there is no concern that external tools might look for them.
It would also now be possible to make some changes to the pcln data but
I get the impression that would definitely require some thought and
discussion.
Change-Id: Ib33cde07e4ec38ecc1d6c319a10138c9347933a3
Reviewed-on: https://go-review.googlesource.com/7616
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
The return type for bsdthread_register is int32. See
runtime/os_darwin.go.
This change also rewrites declaration comments for go functions to
use go syntax and fixes vet errors in sys_darwin_amd64.s.
Change-Id: I7482105f7562929e0ede30099efac9e76babd8a3
Reviewed-on: https://go-review.googlesource.com/3260
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
At the moment this function does nothing, runtime initialization is
still done in android.c:init_go_runtime.
Fixes#10358
Change-Id: I1d762383ba61efcbcf0bbc7c77895f5c1dbf8968
Reviewed-on: https://go-review.googlesource.com/8510
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
When the gctrace GODEBUG option is enabled, it will now report three
heap sizes: the heap size at the beginning of the GC cycle, the heap
size at the end of the GC cycle before sweeping, and the marked heap size,
which is the amount of heap that will be retained until the next GC
cycle.
Change-Id: Ie13f8a6d5c609bc9cc47c7555960ab55b37b5f1c
Reviewed-on: https://go-review.googlesource.com/8430
Reviewed-by: Rick Hudson <rlh@golang.org>
In the STW collector, next_gc was both the heap size to trigger GC at
as well as the goal heap size.
Early in the concurrent collector's development, next_gc was the goal
heap size, but was also used as the heap size to trigger GC at. This
meant we always overshot the goal because of allocation during
concurrent GC.
Currently, next_gc is still the goal heap size, but we trigger
concurrent GC at 7/8*GOGC heap growth. This complicates
shouldtriggergc, but was necessary because of the incremental
maintenance of next_gc.
Now we simply compute next_gc for the next cycle during mark
termination. Hence, it's now easy to take the simpler route and
redefine next_gc as the heap size at which the next GC triggers. We
can directly compute this with the 7/8 backoff during mark termination
and shouldtriggergc can simply test if the live heap size has grown
over the next_gc trigger.
This will also simplify later changes once we start setting next_gc in
more sophisticated ways.
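A sketch of the mark-termination computation described above (names are
illustrative): grow the marked heap by GOGC percent, but set the trigger
at 7/8 of that growth.
package gcsketch
// nextGCTrigger returns the heap size at which the next GC should start.
func nextGCTrigger(heapMarked, gcpercent uint64) uint64 {
	growth := heapMarked * gcpercent / 100
	return heapMarked + growth*7/8
}
// shouldTriggerGC then reduces to a single comparison against the live heap.
func shouldTriggerGC(heapLive, nextGC uint64) bool {
	return heapLive >= nextGC
}
For example, with GOGC=100 and a 100 MB marked heap, the goal is 200 MB
and the trigger falls about 7/8 of the way there, roughly 188 MB.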
Change-Id: I872be4ae06b4f7a0d7f7967360a054bd36b90eea
Reviewed-on: https://go-review.googlesource.com/8420
Reviewed-by: Russ Cox <rsc@golang.org>
Currently there are two main consumers of memstats.heap_alloc:
updatememstats (aka ReadMemStats) and shouldtriggergc.
updatememstats recomputes heap_alloc from the ground up, so we don't
need to keep heap_alloc up to date for it. shouldtriggergc wants to
know how many bytes were marked by the previous GC plus how many bytes
have been allocated since then, but this *isn't* what heap_alloc
tracks. heap_alloc also includes objects that are not marked and
haven't yet been swept.
Introduce a new memstat called heap_live that actually tracks what
shouldtriggergc wants to know and stop keeping heap_alloc up to date.
Unlike heap_alloc, heap_live follows a simple sawtooth that drops
during each mark termination and increases monotonically between GCs.
heap_alloc, on the other hand, has much more complicated behavior: it
may drop during sweep termination, slowly decreases from background
sweeping between GCs, is roughly unaffected by allocation as long as
there are unswept spans (because we sweep and allocate at the same
rate), and may go up after background sweeping is done depending on
the GC trigger.
heap_live simplifies computing next_gc and using it to figure out when
to trigger garbage collection. Currently, we guess next_gc at the end
of a cycle and update it as we sweep and get a better idea of how much
heap was marked. Now, since we're directly tracking how much heap is
marked, we can directly compute next_gc.
This also corrects bugs that could cause us to trigger GC early.
Currently, in any case where sweep termination actually finds spans to
sweep, heap_alloc is an overestimation of live heap, so we'll trigger
GC too early. heap_live, on the other hand, is unaffected by sweeping.
Change-Id: I1f96807b6ed60d4156e8173a8e68745ffc742388
Reviewed-on: https://go-review.googlesource.com/8389
Reviewed-by: Russ Cox <rsc@golang.org>
This tracks the number of heap bytes marked by a GC cycle. We'll use
this information to precisely trigger the next GC cycle.
Currently this aggregates the work counter in gcWork, and dispose
atomically adds it to a global work counter. dispose happens
relatively infrequently, so the contention on the global counter
should be low. If this turns out to be an issue, we can reduce the
number of disposes, and if it's still a problem, we can switch to
per-P counters.
Change-Id: I1bc377cb2e802ef61c2968602b63146d52e7f5db
Reviewed-on: https://go-review.googlesource.com/8388
Reviewed-by: Russ Cox <rsc@golang.org>
I guess we need more builders.
Change-Id: I309e3df7608b9eef9339196fdc50dedf5f9422e4
Reviewed-on: https://go-review.googlesource.com/8434
Reviewed-by: Michael Hudson-Doyle <michael.hudson@canonical.com>
Reviewed-by: David Crawshaw <crawshaw@golang.org>
Reviewed-by: Minux Ma <minux@golang.org>
This is Part 2 of the change; see Part 1 here: https://go-review.googlesource.com/#/c/7692/
Suggested by iant@, we use the library initialization entry point to:
- create a new OS thread and run the "regular" runtime init stack on
that thread
- return immediately from the main (i.e., loader) thread
- at the first CGO invocation, we wait for the runtime initialization
to complete.
The above mechanism is implemented only on linux_amd64. Next step is to
support it on linux_arm. Other platforms don't yet support shared library
compiling/linking, but we intend to use the same strategy there as well.
Change-Id: Ib2c81b1b83bee837134084b75a3beecfb8de6bf4
Reviewed-on: https://go-review.googlesource.com/8094
Run-TryBot: Srdjan Petrovic <spetrovic@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
This tracks both total CPU time used by GC and the total time
available to all Ps since the beginning of the program and uses this
to derive a cumulative CPU usage percent for the gctrace line.
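The reported figure is, roughly, the following ratio (an illustrative
formula, not the exact code):
package gcsketch
import "time"
// gcCPUPercent: total GC CPU time over the CPU time available to all Ps
// since the program started.
func gcCPUPercent(gcCPU, elapsed time.Duration, gomaxprocs int) float64 {
	return 100 * float64(gcCPU) / (float64(elapsed) * float64(gomaxprocs))
}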
Change-Id: Ica85372b8dd45f7621909b325d5ac713a9b0d015
Reviewed-on: https://go-review.googlesource.com/8350
Reviewed-by: Russ Cox <rsc@golang.org>
GODEBUG=gctrace=1 turns on a per-GC cycle trace line. The current line
is left over from the STW garbage collector and includes a lot of
information that is no longer meaningful for the concurrent GC and
doesn't include a lot of information that is important.
Replace this line with a new line designed for the new garbage
collector.
This new line is focused more on helping the user understand the
impact of the garbage collector on their program and less on telling
us, the runtime developers, everything that's happening inside
GC. It's designed to fit in 80 columns and intentionally omit some
potentially useful things that were in the old line. We might want a
"verbose" mode that adds information for us.
We'll be able to further simplify the line once we eliminate the STW
around enabling the write barrier. Then we'll have just one STW phase,
one concurrent phase, and one more STW phase, so we'll be able to
reduce the number of times from five to three.
Change-Id: Icc30939fe4576fb4491b4eac811649395727aa2a
Reviewed-on: https://go-review.googlesource.com/8208
Reviewed-by: Russ Cox <rsc@golang.org>
Currently hashmap is riddled with code that attempts to force a GC on
the next allocation if checkgc is set. This no longer works as
originally intended with the concurrent collector, and is apparently
no longer used anyway.
Remove checkgc.
Change-Id: Ia6c17c405fa8821dc2e6af28d506c1133ab1ca0c
Reviewed-on: https://go-review.googlesource.com/8355
Reviewed-by: Keith Randall <khr@golang.org>
This tries to clarify that Alloc and HeapAlloc are tied to how much
freeing has been done by the sweeper.
Change-Id: Id8320074bd75de791f39ec01bac99afe28052d02
Reviewed-on: https://go-review.googlesource.com/8354
Reviewed-by: Rick Hudson <rlh@golang.org>
This makes it easier to experiment with alternative implementations.
While we're here, update the comments.
No functional changes. Passes toolstash -cmp.
Change-Id: I428535754908f0fdd7cc36c214ddb6e1e60f376e
Reviewed-on: https://go-review.googlesource.com/8310
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
In preparation for being able to run a go program that has code
in several objects, this changes from having several linker
symbols used by the runtime into having one linker symbol that
points at a structure containing the needed data. Multiple
object support will construct a linked list of such structures.
A follow up will initialize the slices in the themoduledata
structure directly from the linker but I was aiming for a minimal
diff for now.
Change-Id: I613cce35309801cf265a1d5ae5aaca8d689c5cbf
Reviewed-on: https://go-review.googlesource.com/7441
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Currently, gcDrainN is documented saying that it must be run on the
system stack. In fact, the problem and solution here are somewhat
subtler. First, it doesn't have to happen on the system stack, it just
has to be non-stoppable (that is, non-preemptible). Second, this isn't
specific to gcDrainN (though gcDrainN is perhaps the most surprising
instance); it's general to anything that uses the gcWork structure.
Move the comment to gcWork and generalize it.
Change-Id: I5277b5abb070e47f8d783bc15a310b379c6adc22
Reviewed-on: https://go-review.googlesource.com/8247
Reviewed-by: Rick Hudson <rlh@golang.org>
gcDrain used to be passed a *workbuf to start draining from, but now
it takes a gcWork, which hides whether or not there's an initial
workbuf. Update the comment to match this.
Change-Id: I976b58e5bfebc451cfd4fa75e770113067b5cc07
Reviewed-on: https://go-review.googlesource.com/8246
Reviewed-by: Rick Hudson <rlh@golang.org>
Being able to print pointers to strings means one will be able to output
the result of things like the flag library and other components that use
string pointers.
While here, adjusted the tests for gdb to test original string pretty
printing as well as pointers to them. It was doing it via the map before
but for completeness this ensures it's tested as a unit.
Change-Id: I4926547ae4fa6c85ef74301e7d96d49ba4a7b0c6
Reviewed-on: https://go-review.googlesource.com/8217
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
runtime·main·f is normalized by the linker to runtime.main.f, as is
the compiler-generated symbol runtime.main·f. Change the former to
runtime·mainPC instead.
Fixes issue #9934
Change-Id: I656a6fa6422d45385fa2cc55bd036c6affa1abfe
Reviewed-on: https://go-review.googlesource.com/8234
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Extend escape analysis to convT2E and convT2I. If the interface value
does not escape, supply the runtime with a stack buffer for the object copy.
This is a straight port from .c to .go of Dmitry's patch.
Change-Id: Ic315dd50d144d94dd3324227099c116be5ca70b6
Reviewed-on: https://go-review.googlesource.com/8201
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Currently, various functions are marked with the comment
// May run without a P, so write barriers are not allowed.
However, "running without a P" is ambiguous. We intended these to mean
that m.p may be nil (which is the condition checked by the write
barrier). The comment could also be taken to mean that a
stop-the-world may happen, which is not the case for these functions
because they run in situations where there is in fact a function on
the stack holding a P locally; it just isn't in m.p.
Change these comments to state precisely what we mean, that m.p may be
nil.
Change-Id: I4a4a1d26aebd455e5067540e13b9f96a7482146c
Reviewed-on: https://go-review.googlesource.com/8209
Reviewed-by: Minux Ma <minux@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
When the Windows Error Reporting dialog is disabled on amd64
Windows XP or 2003, the continue handler does not fire. Newer
versions work correctly regardless of WER.
Fixes#10162
Change-Id: I84ea36ee188b34d1421a8db6231223cf61b4111b
Reviewed-on: https://go-review.googlesource.com/8165
Reviewed-by: Alex Brainman <alex.brainman@gmail.com>
Racy tests do not currently fail; they just call os.Exit(0).
So if you run go test without -v, you won't even notice.
This was probably introduced with testing.TestMain.
Racy programs do not have the right to finish successfully.
Change-Id: Id133d7424f03d90d438bc3478528683dd02b8846
Reviewed-on: https://go-review.googlesource.com/4371
Reviewed-by: Russ Cox <rsc@golang.org>
Suggested by iant@, this change:
- looks for a symbol _rt0_<GOARCH>_<GOOS>_lib,
- if the symbol is present, adds a new entry into the .init_array ELF
section that points to the symbol.
The end-effect is that the symbol _rt0_<GOARCH>_<GOOS>_lib will be
invoked as soon as the (ELF) shared library is loaded, which will in turn
initialize the runtime. (To be implemented.)
Change-Id: I99911a180215a6df18f8a18483d12b9b497b48f4
Reviewed-on: https://go-review.googlesource.com/7692
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
1) A large allocation in this test caused a crash. This was not
detected by the builder because the builder runs tests with -test.short.
2) The command "go" used for forking doesn't exist on some platforms,
including android. This change uses the test binary itself, which
is guaranteed to exist.
This change also adds logging of the total samples collected in
TestCPUProfileMultithreaded test that is flaky in android-arm
builder.
Change-Id: I225c6b7877d811edef8b25e7eb00559450640c42
Reviewed-on: https://go-review.googlesource.com/8131
Reviewed-by: David Crawshaw <crawshaw@golang.org>
Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
handoffp by definition runs without a P, so it's not allowed to have
write barriers. It doesn't have any right now, but mark it
nowritebarrier to keep any from creeping in in the future. handoffp in
turn calls startm, newm, and newosproc, all of which are "below Go"
and make sense to run without a P, so disallow write barriers in these
as well.
For most functions, we've done this because they may race with
stoptheworld() and hence must not have write barriers. For these
functions, it's a little different: the world can't stop while we're
in handoffp, so this race isn't present. But we implement this
restriction with a somewhat broader rule that you can't have a write
barrier without a P. We like this rule because it's simple and means
that our write barriers can depend on there being a P, even though
this rule is actually a little broader than necessary. Hence, even
though there's no danger of the race in these functions, we want to
adhere to the broader rule.
Change-Id: Ie22319c30eea37d703eb52f5c7ca5da872030b88
Reviewed-on: https://go-review.googlesource.com/8130
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Minux Ma <minux@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
This should fix the intermittent "calling write barrier with mp.p == nil"
failures on the nacl/386 builder.
Change-Id: I34aef5ca75ccd2939e6a6ad3f5dacec64903074e
Signed-off-by: Shenghou Ma <minux@golang.org>
Reviewed-on: https://go-review.googlesource.com/7973
Reviewed-by: Austin Clements <austin@google.com>
Currently, Darwin's siginfo type uses *byte for the si_addr
field. This results in unwanted write barriers in set_sigaddr. It's
also pointless since it never points to anything real and the get/set
methods return/take uintXX and cast it from/to the pointer.
All other arches use a uint type for this field. Change Darwin to
match. This simplifies the get/set methods and eliminates the unwanted
write barriers.
Change-Id: Ifdb5646d35e1f2f6808b87a3d59745ec9718add1
Reviewed-on: https://go-review.googlesource.com/8086
Reviewed-by: Austin Clements <austin@google.com>
sighandler may run during a stop-the-world without a P, so it's not
allowed to have write barriers. Fix the G write to disable the write
barrier (this is safe because the G is reachable from allgs) and mark
the function nowritebarrier.
Change-Id: I907f05d3829e24eeb15fa4d020598af36710e87e
Reviewed-on: https://go-review.googlesource.com/8020
Reviewed-by: Rick Hudson <rlh@golang.org>
Also invert it, which means it no longer needs to cross the cgo
package boundary.
Change-Id: I393cd073bda02b591a55d6bc6b8bb94970ea71cd
Reviewed-on: https://go-review.googlesource.com/8082
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: David Crawshaw <crawshaw@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
We do not use SEH to handle Windows exceptions anymore.
Change-Id: I0ac807a0fed7a5b4c745454246764c524460472b
Reviewed-on: https://go-review.googlesource.com/8071
Reviewed-by: Minux Ma <minux@golang.org>
The SecureRandom named service was removed in
https://codereview.chromium.org/550523002. And the new syscall
was introduced in https://codereview.chromium.org/537543003.
Accepting this will remove support for older versions of
sel_ldr. I've confirmed that both pepper_40 and the current
pepper_canary have this syscall.
After this change, we need sel_ldr from pepper_39 or above to
work.
Fixes#9261
Change-Id: I096973593aa302ade61f259a3a71ebc7c1a57913
Signed-off-by: Shenghou Ma <minux@golang.org>
Reviewed-on: https://go-review.googlesource.com/1755
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
Also fixes a long-existing problem in the fork/exec path.
Change-Id: Idec40b1cee0cfb1625fe107db3eafdc0d71798f2
Reviewed-on: https://go-review.googlesource.com/8030
Reviewed-by: Minux Ma <minux@golang.org>
Reviewed-by: Rob Pike <r@golang.org>
Previously the extra m needed for cgo callbacks was created on the
first callback. This works for cgo; however, the cgocallback mechanism
is also borrowed by badsignal, which can run before any cgo calls are
made.
Now we initialize the extra M at runtime startup before any signal
handlers are registered, so badsignal cannot be called until the
extra M is ready.
Updates #10207.
Change-Id: Iddda2c80db6dc52d8b60e2b269670fbaa704c7b3
Reviewed-on: https://go-review.googlesource.com/7978
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: David Crawshaw <crawshaw@golang.org>
There are calls to stdcall when the GC thinks the world is stopped
and stdcall writes a *g for the CPU profiler. This produces a write
barrier but the GC is not prepared to deal with write barriers when
it thinks the world is stopped. Since the g is on allg it does not
need a write barrier to keep it alive so eliminate the write barrier.
Change-Id: I937633409a66553d7d292d87d7d58caba1fad0b6
Reviewed-on: https://go-review.googlesource.com/7979
Reviewed-by: Austin Clements <austin@google.com>
Run-TryBot: Rick Hudson <rlh@golang.org>
The test is a simple reproduction of issue 9356.
Update #8948.
Update #9356.
Change-Id: Ia77bc36d12ed0c3c4a8b1214cade8be181c9ad55
Reviewed-on: https://go-review.googlesource.com/7618
Reviewed-by: Minux Ma <minux@golang.org>
windows/386 also wants an underscore prefix for external names.
This CL is in preparation of external linking support.
Change-Id: I2d2ea233f976aab3f356f9b508cdd246d5013e2d
Signed-off-by: Shenghou Ma <minux@golang.org>
Reviewed-on: https://go-review.googlesource.com/7282
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Alex Brainman <alex.brainman@gmail.com>
When external linking, we must link against the implib provided by mingw, so we must use
properly decorated names for stdcalls.
Because the feature is only used in the runtime, I've designed a new decoration
scheme so that we can use the same decorated name for both 386 and amd64.
A stdcall function named FooEx from bar16.dll which takes 3 parameters will be
imported like this:
//go:cgo_import_dynamic runtime._FooEx FooEx%3 "bar16.dll"
Depending on the size of uintptr, the linker will later transform it to _FooEx@12
or _FooEx@24.
This is in preparation for the next CL that adds external linking support for
windows/386.
Change-Id: I2d2ea233f976aab3f356f9b508cdd246d5013e2c
Signed-off-by: Shenghou Ma <minux@golang.org>
Reviewed-on: https://go-review.googlesource.com/7163
Reviewed-by: Alex Brainman <alex.brainman@gmail.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
On Unix, when placing a child in a new process group, allow that group
to become the foreground process group. Also, allow a child process to
join a specific process group.
When setting the foreground process group, Ctty is used as the file
descriptor of the controlling terminal. Ctty has been added to the BSD
and Solaris SysProcAttr structures and the handling of Setctty changed
to match Linux.
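A hedged usage sketch on Unix (field names follow the description above;
the command, tty path, and error handling are illustrative only):
package main
import (
	"os"
	"os/exec"
	"syscall"
)
func main() {
	tty, err := os.OpenFile("/dev/tty", os.O_RDWR, 0)
	if err != nil {
		panic(err)
	}
	defer tty.Close()
	cmd := exec.Command("vi")
	cmd.Stdin, cmd.Stdout, cmd.Stderr = tty, tty, tty
	cmd.SysProcAttr = &syscall.SysProcAttr{
		Setpgid:    true,          // place the child in a new process group
		Foreground: true,          // and make that group the foreground group
		Ctty:       int(tty.Fd()), // fd of the controlling terminal
	}
	if err := cmd.Run(); err != nil {
		panic(err)
	}
}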
Change-Id: I18d169a6c5ab8a6a90708c4ff52eb4aded50bc8c
Reviewed-on: https://go-review.googlesource.com/5130
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Implement runtime.atomicand8 for amd64p32 which was overlooked
in CL 7861.
Change-Id: Ic7eccddc6fd6c4682cac1761294893928f5428a2
Reviewed-on: https://go-review.googlesource.com/7920
Reviewed-by: Minux Ma <minux@golang.org>
These can be implemented with just a compare and a move instruction.
Do so, avoiding the overhead of a call into the runtime.
These assertions are a significant cost in Go code that uses interface{}
as a safe alternative to C's void* (or unsafe.Pointer), such as the
current version of the Go compiler.
*T here includes pointer to T but also any Go type represented as
a single pointer (chan, func, map). It does not include [1]*T or struct{*int}.
That requires more work in other parts of the compiler; there is a TODO.
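For reference, the shape of assertion this speeds up (example only):
interface{} used as a void*-style container, narrowed back to a
single-pointer type.
package main
import (
	"bytes"
	"fmt"
)
func asBuffer(v interface{}) *bytes.Buffer {
	if b, ok := v.(*bytes.Buffer); ok { // now a compare plus a move
		return b
	}
	return nil
}
func main() {
	fmt.Println(asBuffer(new(bytes.Buffer)) != nil, asBuffer(42) != nil)
}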
Change-Id: I7ff681c20d2c3eb6ad11dd7b3a37b1f3dda23965
Reviewed-on: https://go-review.googlesource.com/7862
Reviewed-by: Rob Pike <r@golang.org>
Currently, we only exit the getfull barrier if there is work on the
full list, even though the exit path will take work from either the
full or partial list. Change this to exit the barrier if there is work
on either the full or partial lists.
I believe it's currently safe to check only the full list, since
during mark termination there is no reason to put a workbuf on a
partial list. However, checking both is more robust.
Change-Id: Icf095b0945c7cad326a87ff2f1dc49b7699df373
Reviewed-on: https://go-review.googlesource.com/7840
Reviewed-by: Rick Hudson <rlh@golang.org>
The barrier in gcDrain does not account for concurrent gcDrainNs
happening in gchelpwork, so it can actually return while there is
still work being done. It turns out this is okay, but for subtle
reasons involving gcDrainN always being run on the system
stack. Document these reasons.
Change-Id: Ib07b3753cc4e2b54533ab3081a359cbd1c3c08fb
Reviewed-on: https://go-review.googlesource.com/7736
Reviewed-by: Rick Hudson <rlh@golang.org>
Make mask uint32, and move down one line to match atomic_arm64.go.
Change-Id: I4867de494bc4076b7c2b3bf4fd74aa984e3ea0c8
Reviewed-on: https://go-review.googlesource.com/7854
Reviewed-by: Russ Cox <rsc@golang.org>
We're skating on thin ice, and things are finally starting to melt around here.
(I want to avoid the debugging session that will happen when someone
uses atomicand8 expecting it to be atomic with respect to other operations.)
Change-Id: I254f1582be4eb1f2d7fbba05335a91c6bf0c7f02
Reviewed-on: https://go-review.googlesource.com/7861
Reviewed-by: Minux Ma <minux@golang.org>
To reduce lock contention in this mode, this change makes persistent allocation state per-P,
which means at most 64 kB overhead x $GOMAXPROCS, which should be
completely tolerable.
Change-Id: I34ca95e77d7e67130e30822e5a4aff6772b1a1c5
Reviewed-on: https://go-review.googlesource.com/7740
Reviewed-by: Rick Hudson <rlh@golang.org>
Some type assertions of the form _, ok := i.(T) allow efficient inlining.
Such type assertions commonly show up in type switches.
For example, with this optimization, using 6g, the length of
encoding/binary's intDataSize function shrinks from 2224 to 1728 bytes (-22%).
benchmark old ns/op new ns/op delta
BenchmarkAssertI2E2Blank 4.67 0.82 -82.44%
BenchmarkAssertE2T2Blank 4.38 0.83 -81.05%
BenchmarkAssertE2E2Blank 3.88 0.83 -78.61%
BenchmarkAssertE2E2 14.2 14.4 +1.41%
BenchmarkAssertE2T2 10.3 10.4 +0.97%
BenchmarkAssertI2E2 13.4 13.3 -0.75%
Change-Id: Ie9798c3e85432bb8e0f2c723afc376e233639df7
Reviewed-on: https://go-review.googlesource.com/7697
Reviewed-by: Keith Randall <khr@golang.org>
The distinction between gcWorkProducer and gcWork (producer and
consumer) is not serving us as originally intended, so merge these
into just gcWork.
The original intent was to replace the currentwbuf cache with a
gcWorkProducer. However, with gchelpwork (aka mutator assists),
mutators can both produce and consume work, so it will make more sense
to cache a whole gcWork.
Change-Id: I6e633e96db7cb23a64fbadbfc4607e3ad32bcfb3
Reviewed-on: https://go-review.googlesource.com/7733
Reviewed-by: Rick Hudson <rlh@golang.org>
Currently markroot fetches the wbuf to fill from the per-M wbuf
cache. The wbuf cache is primarily meant for the write barrier because
it produces very little work on each call. There's little point to
using the cache in mark root, since each call to markroot is likely to
produce a large amount of work (so the slight win on getting it from
the cache instead of from the central wbuf lists doesn't matter), and
markroot does not dispose the wbuf back to the cache (so most markroot
calls won't get anything from the wbuf cache anyway).
Instead, just get the wbuf from the central wbuf lists like other work
producers. This will simplify later changes.
Change-Id: I07a18a4335a41e266a6d70aa3a0911a40babce23
Reviewed-on: https://go-review.googlesource.com/7732
Reviewed-by: Rick Hudson <rlh@golang.org>
Currently, the GC's concurrent mark phase runs on the system
stack. There's no need to do this, and running it this way ties up the
entire M and P running the GC by preventing the scheduler from
preempting the GC even during concurrent mark.
Fix this by running concurrent mark on the regular G stack. It's still
non-preemptible because we also set preemptoff around the whole GC
process, but this moves us closer to making it preemptible.
Change-Id: Ia9f1245e299b8c5c513a4b1e3ef13eaa35ac5e73
Reviewed-on: https://go-review.googlesource.com/7730
Reviewed-by: Rick Hudson <rlh@golang.org>
"Sync" is not very informative. What's being synchronized and with
whom? Update this comment to explain what we're really doing: enabling
write barriers.
Change-Id: I4f0cbb8771988c7ba4606d566b77c26c64165f0f
Reviewed-on: https://go-review.googlesource.com/7700
Reviewed-by: Rick Hudson <rlh@golang.org>
Currently we harvestwbufs the moment we enter the mark phase, even
before starting the world again. Since cached wbufs are only filled
when we're in mark or mark termination, they should all be empty at
this point, making the harvest pointless. Remove the harvest.
We should, but currently do not, harvest at the end of the mark phase
when we're running out of work to do.
Change-Id: I5f4ba874f14dd915b8dfbc4ee5bb526eecc2c0b4
Reviewed-on: https://go-review.googlesource.com/7669
Reviewed-by: Rick Hudson <rlh@golang.org>
One of my earlier versions of finer-grained select locking
failed on this test. If you just naively lock and check channels
one-by-one, it is possible that you skip over ready channels.
Consider that initially c1 is ready and c2 is not. Select checks c2.
Then another goroutine makes c1 not ready and c2 ready (in that order).
Then select checks c1, concludes that no channels are ready and
executes the default case. But there was no point in time when
no channel is ready and so default case must not be executed.
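The scenario in miniature (an illustrative reconstruction, not the test
as committed): with a default case, select may only fall through when
there was an instant at which neither channel was ready.
package main
func main() {
	for i := 0; i < 100000; i++ {
		c1 := make(chan int, 1)
		c2 := make(chan int, 1)
		done := make(chan bool, 1)
		c1 <- 1 // c1 starts ready, c2 does not
		go func() {
			select {
			case <-c1:
			case <-c2:
			default:
				// Must never happen: at every instant while this select
				// runs, at least one of c1, c2 holds a value.
				done <- false
				return
			}
			done <- true
		}()
		c2 <- 1 // make c2 ready ...
		select { // ... then, perhaps, make c1 not ready (without blocking)
		case <-c1:
		default:
		}
		if !<-done {
			panic("select hit default while a channel was ready")
		}
	}
}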
Change-Id: I3594bf1f36cfb120be65e2474794f0562aebcbbd
Reviewed-on: https://go-review.googlesource.com/7550
Reviewed-by: Russ Cox <rsc@golang.org>
The value in question is really a bit pattern
(a pointer with extra bits thrown in),
so treat it as a uintptr instead, avoiding the
generation of a write barrier when there
might not be a p.
Also add the obligatory //go:nowritebarrier.
Change-Id: I4ea097945dd7093a140f4740bcadca3ce7191971
Reviewed-on: https://go-review.googlesource.com/7667
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
The GC assumes that there will be no asynchronous write barriers when
the world is stopped. This keeps the synchronization between write
barriers and the GC simple. However, currently, there are a few places
in runtime code where this assumption does not hold.
The GC stops the world by collecting all Ps, which stops all user Go
code, but small parts of the runtime can run without a P. For example,
the code that releases a P must still deschedule its G onto a runnable
queue before stopping. Similarly, when a G returns from a long-running
syscall, it must run code to reacquire a P.
Currently, this code can contain write barriers. This can lead to the
GC collecting reachable objects if something like the following
sequence of events happens:
1. GC stops the world by collecting all Ps.
2. G #1 returns from a syscall (for example), tries to install a
pointer to object X, and calls greyobject on X.
3. greyobject on G #1 marks X, but does not yet add it to a work
buffer. At this point, X is effectively black, not grey, even though
it may point to white objects.
4. GC reaches X through some other path and calls greyobject on X, but
greyobject does nothing because X is already marked.
5. GC completes.
6. greyobject on G #1 adds X to a work buffer, but it's too late.
7. Objects that were reachable only through X are incorrectly collected.
To fix this, we check the invariant that no asynchronous write
barriers happen when the world is stopped by checking that write
barriers always have a P, and modify all currently known sources of
these writes to disable the write barrier. In all modified cases this
is safe because the object in question will always be reachable via
some other path.
Some of the trace code was turned off, in particular the
code that traces returning from a syscall. The GC assumes
that as far as the heap is concerned the thread is stopped
when it is in a syscall. Upon returning the trace code
must not do any heap writes for the same reasons discussed
above.
Fixes#10098
Fixes#9953
Fixes#9951
Fixes#9884
May relate to #9610, #9771
Change-Id: Ic2e70b7caffa053e56156838eb8d89503e3c0c8a
Reviewed-on: https://go-review.googlesource.com/7504
Reviewed-by: Austin Clements <austin@google.com>
Some versions of libc, in this case Android's bionic, point environ
directly at the envp memory.
https://android.googlesource.com/platform/bionic/+/master/libc/bionic/libc_init_common.cpp#104
The Go runtime does something surprisingly similar, building the
runtime's envs []string using gostringnocopy. Both libc and the Go
runtime reusing memory interacts badly. When syscall.Setenv uses cgo
to call setenv(3), C modifies the underlying memory of a Go string.
This manifests on android/arm. With GOROOT=/data/local/tmp, a
runtime test calls syscall.Setenv("/os"), resulting in
runtime.GOROOT()=="/os\x00a/local/tmp/goroot".
Avoid this by copying environment string memory into Go.
Covered by runtime.TestFixedGOROOT on android/arm.
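The idea of the fix, in miniature (illustrative; an ordinary Go string
conversion already copies): back the Go string with Go-owned memory so
later writes to the C block cannot change it.
package main
import "fmt"
func main() {
	cMem := []byte("GOROOT=/data/local/tmp/goroot") // stand-in for the envp block
	env := string(cMem)                             // copying conversion: env no longer aliases cMem
	cMem[7] = 'X'                                   // C scribbling on its memory...
	fmt.Println(env)                                // ...leaves the Go string intact
}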
Change-Id: Id0cf9553969f587addd462f2239dafca1cf371fa
Reviewed-on: https://go-review.googlesource.com/7663
Reviewed-by: Keith Randall <khr@golang.org>