Currently freeOSMemory calls gcStart directly, but we really just want
it to behave like runtime.GC() and then perform a scavenge, so make it
call runtime.GC() rather than gcStart.
For #18216.
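From the user's perspective nothing changes; a quick usage sketch of
the affected entry point:

    package main

    import "runtime/debug"

    func main() {
        // After this change, FreeOSMemory is equivalent to a full
        // runtime.GC() cycle followed by a scavenge that returns
        // unused memory to the OS.
        debug.FreeOSMemory()
    }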
Change-Id: I548ec007afc788e87d383532a443a10d92105937
Reviewed-on: https://go-review.googlesource.com/37518
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
Now that the gcMode is no longer involved in the GC trigger condition,
we can simplify the triggering of forced GCs. By making the trigger
condition for forced GCs true even if gcphase is not _GCoff, we don't
need any special case path in gcStart to ensure that forced GCs don't
get consolidated.
Change-Id: I6067a13d76e40ff2eef8fade6fc14adb0cb58ee5
Reviewed-on: https://go-review.googlesource.com/37517
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
Currently the GC triggering condition is an awkward combination of the
gcMode (whether or not it's gcBackgroundMode) and a boolean
"forceTrigger" flag.
Replace this with a new gcTrigger type that represents the range of
transition predicates we need. This has several advantages:
1. We can remove the awkward logic that affects the trigger behavior
based on the gcMode. Now gcMode purely controls whether to run a
STW GC or not and the gcTrigger controls whether this is a forced
GC that cannot be consolidated with other GC cycles.
2. We can lift the time-based triggering logic in sysmon to just
another type of GC trigger and move the logic to the trigger test.
3. This sets us up to have a cycle count-based trigger, which we'll
use to make runtime.GC trigger concurrent GC with the desired
consolidation properties.
For #18216.
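A minimal, self-contained sketch of the idea (types, names, and
constants here are illustrative, not the runtime's actual
definitions):

    package main

    import (
        "fmt"
        "time"
    )

    type gcTriggerKind int

    const (
        gcTriggerAlways gcTriggerKind = iota // forced GC, cannot be consolidated
        gcTriggerHeap                        // heap size crossed the trigger
        gcTriggerTime                        // too long since the last GC
    )

    type gcTrigger struct {
        kind gcTriggerKind
        now  int64 // only used by gcTriggerTime
    }

    // test reports whether the transition predicate holds; each caller
    // (allocator, sysmon, runtime.GC) supplies its own trigger kind.
    func (t gcTrigger) test(heapLive, trigger uint64, lastGC int64) bool {
        switch t.kind {
        case gcTriggerAlways:
            return true
        case gcTriggerHeap:
            return heapLive >= trigger
        case gcTriggerTime:
            const forcegcperiod = 2 * 60 * 1e9 // nanoseconds
            return lastGC != 0 && t.now-lastGC > forcegcperiod
        }
        return false
    }

    func main() {
        t := gcTrigger{kind: gcTriggerHeap}
        fmt.Println(t.test(1<<20, 1<<19, time.Now().UnixNano())) // true
    }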
Change-Id: If9cd49349579a548800f5022ae47b8128004bbfc
Reviewed-on: https://go-review.googlesource.com/37516
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
Currently sysmon triggers periodic GC if GC is not currently running
and it's been long enough since the last GC. This misses some
important conditions; for example, whether GC is enabled at all by
GOGC. As a result, if GOGC is off, once we pass the timeout for
periodic GC, sysmon will attempt to trigger a GC every 10ms. This GC
will be a no-op because gcStart will check all of the appropriate
conditions and do nothing, but it still goes through the motions of
waking the forcegc goroutine and printing a gctrace line.
Fix this by making sysmon call gcShouldStart to check *all* of the
appropriate transition conditions before attempting to trigger a
periodic GC.
Fixes #19247.
Change-Id: Icee5521ce175e8419f934723849853d53773af31
Reviewed-on: https://go-review.googlesource.com/37515
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
Currently the heap profile is flushed by *either* gcSweep in STW mode
or by gcMarkTermination in concurrent mode. Simplify this by making
gcMarkTermination always flush the heap profile and by making gcSweep
do one extra flush (instead of two) in STW mode.
Change-Id: I62147afb2a128e1f3d92ef4bb8144c8a345f53c4
Reviewed-on: https://go-review.googlesource.com/37715
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
Currently we snapshot the heap profile just *after* mark termination
starts the world because it's a relatively expensive operation.
However, this means any alloc or free events that happen between
starting the world and snapshotting the heap profile can be accounted
to the wrong cycle. In the worst case, a free can be accounted to the
cycle before the alloc; if the heap is small, this can result
temporarily in a negative "in use" count in the profile.
Fix this without making STW more expensive by using a global heap
profile cycle counter. This lets us split up the operation into two
parts: 1) a super-cheap snapshot operation that simply increments the
global cycle counter during STW, and 2) a more expensive cleanup
operation we can do after starting the world that frees up a slot in
all buckets for use by the next heap profile cycle.
Fixes #19311.
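A sketch of the two-phase scheme with illustrative names (the bucket
internals are stubbed out):

    package profdemo

    import "sync/atomic"

    // profCycle is the global heap profile cycle counter.
    var profCycle uint32

    type bucket struct{ /* per-cycle counter slots elided */ }

    // publish frees up the slot the next cycle will accumulate into.
    func (b *bucket) publish(cycle uint32) {}

    // mProfNextCycle runs with the world stopped: O(1), it only
    // advances the global cycle counter.
    func mProfNextCycle() { atomic.AddUint32(&profCycle, 1) }

    // mProfFlush runs after the world restarts and does the expensive
    // per-bucket cleanup for the new cycle.
    func mProfFlush(buckets []*bucket) {
        c := atomic.LoadUint32(&profCycle)
        for _, b := range buckets {
            b.publish(c)
        }
    }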
Change-Id: I6bdafabf111c48b3d26fe2d91267f7bef0bd4270
Reviewed-on: https://go-review.googlesource.com/37714
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
Currently memRecord has the same set of four fields repeated three
times. Pull these into a type and use this type three times. This
cleans up and simplifies the code a bit and will make it easier to
switch to a globally tracked heap profile cycle for #19311.
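The shape of the refactor, sketched with illustrative field names:

    package profdemo

    // memRecordCycle groups the four counters that memRecord
    // previously repeated three times.
    type memRecordCycle struct {
        allocs, frees           uintptr
        alloc_bytes, free_bytes uintptr
    }

    // memRecord now uses the type three times: the cycle published to
    // the user plus the in-flight cycles still being accumulated.
    type memRecord struct {
        active memRecordCycle
        prev   memRecordCycle
        recent memRecordCycle
    }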
Change-Id: I414d15673feaa406a8366b48784437c642997cf2
Reviewed-on: https://go-review.googlesource.com/37713
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
Every time I modify heap profiling, I find myself redrawing this
diagram, so add it to the comments. This shows how allocations and
frees are accounted, how we arrive at consistent profile snapshots,
and when those snapshots are published to the user.
Change-Id: I106aba1200af3c773b46e24e5f50205e808e2c69
Reviewed-on: https://go-review.googlesource.com/37514
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
Now that we have a nice predicate system, improve the tests performed
by TestMemStats. We add some more non-zero checks (now that we force a
GC, things like NumGC must be non-zero), checks for trivial boolean
fields, and a few more range checks.
Change-Id: I6da46d33fa0ce5738407ee57d587825479413171
Reviewed-on: https://go-review.googlesource.com/37513
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
Currently most TestMemStats failures dump the whole MemStats object if
anything is amiss without telling you what is amiss, or even which
field is wrong. This makes it hard to figure out what the actual
problem is.
Replace this with a reflection walk over MemStats and a map of
predicates to check. If one fails, we can construct a detailed and
descriptive error message. The predicates are a direct translation of
the current tests.
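A self-contained sketch of the approach (the predicate set here is a
toy; the real test translates all of the existing checks):

    package main

    import (
        "fmt"
        "reflect"
        "runtime"
    )

    func nonZero(x uint64) error {
        if x == 0 {
            return fmt.Errorf("want non-zero, got 0")
        }
        return nil
    }

    func main() {
        var st runtime.MemStats
        runtime.ReadMemStats(&st)

        // One predicate per field; a failure names the exact field.
        checks := map[string]func(uint64) error{
            "Alloc":   nonZero,
            "HeapSys": nonZero,
            "Sys":     nonZero,
        }

        v := reflect.ValueOf(st)
        for name, pred := range checks {
            if err := pred(v.FieldByName(name).Uint()); err != nil {
                fmt.Printf("MemStats.%s: %v\n", name, err)
            }
        }
    }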
Change-Id: I5a7cafb8e6a1eeab653d2e18bb74e2245eaa5444
Reviewed-on: https://go-review.googlesource.com/37512
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
The `skip` argument passed to runtime.Caller and runtime.Callers should
be interpreted as the number of logical calls to skip (rather than the
number of physical stack frames to skip). This changes runtime.Callers
to skip inlined calls in addition to physical stack frames.
The result value of runtime.Callers is a slice of program counters
([]uintptr) representing physical stack frames. If the `skip` parameter
to runtime.Callers skips part-way into a physical frame, there is no
convenient way to encode that in the resulting slice. To avoid changing
the API in an incompatible way, our solution is to store the number of
skipped logical calls of the first frame in the _second_ uintptr
returned by runtime.Callers. Since this number is a small integer, we
encode it as a valid PC value into a small symbol called:
runtime.skipPleaseUseCallersFrames
For example, if f() calls g(), g() calls `runtime.Callers(2, pcs)`, and
g() is inlined into f, then the frame for f will be partially skipped,
resulting in the following slice:
pcs = []uintptr{pc_in_f, runtime.skipPleaseUseCallersFrames+1, ...}
We store the skip PC in pcs[1] instead of pcs[0] so that `pcs[i:]` will
truncate the captured stack trace rather than grow it for all i.
Updates #19348.
Change-Id: I1c56f89ac48c29e6f52a5d085567c6d77d499cf1
Reviewed-on: https://go-review.googlesource.com/37854
Run-TryBot: David Lazar <lazard@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
The result from CFBundleCopyResourceURL is owned by the caller. This
CL adds the necessary CFRelease to release it after use.
Fixes #19722
Change-Id: I7afe22ef241d21922a7f5cef6498017e6269a5c3
Reviewed-on: https://go-review.googlesource.com/38639
Run-TryBot: Elias Naur <elias.naur@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
We reused the old C stack check mechanism for the implementation of
//go:systemstack, so when we execute a //go:systemstack function on a
user stack, the system fails by calling morestackc. However,
morestackc's message still talks about "executing C code".
Fix morestackc's message to reflect its modern usage.
Change-Id: I7e70e7980eab761c0520f675d3ce89486496030f
Reviewed-on: https://go-review.googlesource.com/38572
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
On Android, the thread local offset is found by looping through memory
starting at the TLS base address. The search is limited to
PTHREAD_KEYS_MAX, but issue 19472 made it clear that in some cases, the
slot is located further from the TLS base.
The limit is merely a sanity check in case our assumptions about the
thread-local storage layout are wrong, so this CL raises it to 384, which
is enough for the test case in issue 19472.
Fixes #19472
Change-Id: I89d1db3e9739d3a7fff5548ae487a7483c0a278a
Reviewed-on: https://go-review.googlesource.com/38636
Run-TryBot: Elias Naur <elias.naur@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Clean up code that does interface equality. Avoid doing checks
in efaceeq/ifaceeq that we already did before calling those routines.
No noticeable performance changes for existing benchmarks.
name old time/op new time/op delta
EfaceCmpDiff-8 604ns ± 1% 553ns ± 1% -8.41% (p=0.000 n=9+10)
Fixes #18618
Change-Id: I3bd46db82b96494873045bc3300c56400bc582eb
Reviewed-on: https://go-review.googlesource.com/38606
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: David Chase <drchase@google.com>
The darwin linker for ARM does not allow PC-relative relocation
of external symbol in text section. Work around it by accessing
it indirectly: putting its address in a global variable (which is
not external), and accessing through that variable.
Fixes #19684.
Change-Id: I41361bbb281b5dbdda0d100ae49d32c69ed85a81
Reviewed-on: https://go-review.googlesource.com/38596
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Elias Naur <elias.naur@gmail.com>
Starting in go1.9, the minimum processor requirement for ppc64 is POWER8. This
means the checks for GOARCH_ppc64 in asm_ppc64x.s can be removed, since we can
assume LBAR and STBCCC instructions (both from ISA 2.06) will always be
available.
Updates #19074
Change-Id: Ib4418169cd9fc6f871a5ab126b28ee58a2f349e2
Reviewed-on: https://go-review.googlesource.com/38406
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
The chanrecv funcs don't use it at all. The chansend ones do, but the
element type is now part of the hchan struct, which is already a
parameter.
hchan can be nil in chansend when sending to a nil channel, so when
instrumenting we must copy to the stack to be able to read the channel
type.
name old time/op new time/op delta
ChanUncontended 6.42µs ± 1% 6.22µs ± 0% -3.06% (p=0.000 n=19+18)
Initially found by github.com/mvdan/unparam.
Fixes #19591.
Change-Id: I3a5e8a0082e8445cc3f0074695e3593fd9c88412
Reviewed-on: https://go-review.googlesource.com/38351
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Remove a stray xori that came from a big-endian copy/paste.
Add an atomicand8 check to runtime.check() that would have revealed
this error.
Might fix #19396.
Change-Id: If8d6f25d3e205496163541eb112548aa66df9c2a
Reviewed-on: https://go-review.googlesource.com/38257
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
I would like to use BenchmarkRunningGoProgram to measure
changes for issue #15588, so the program in the benchmark
should import the "os" package.
It is also reasonable for a basic Go program to include
the "os" package.
For #15588.
Change-Id: Ida6712eab22c2e79fbe91b6fdd492eaf31756852
Reviewed-on: https://go-review.googlesource.com/37914
Run-TryBot: Alex Brainman <alex.brainman@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
This is a workaround for a FreeBSD kernel bug. It can be removed when
we are confident that all people are using the fixed kernel. See #15658.
Updates #15658.
Change-Id: I0ecdccb77ddd0c270bdeac4d3a5c8abaf0449075
Reviewed-on: https://go-review.googlesource.com/38325
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Mallocs and panics in the scavenge path are particularly nasty because
they're likely to silently self-deadlock on the mheap.lock. Avoid
sinking lots of time into debugging these issues in the future by
turning these into immediate throws.
Change-Id: Ib36fdda33bc90b21c32432b03561630c1f3c69bc
Reviewed-on: https://go-review.googlesource.com/38293
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
The lfstack API is still a C-style API: lfstacks all have unhelpful
type uint64 and the APIs are package-level functions. Make the code
more readable and Go-style by creating an lfstack type with methods
for push, pop, and empty.
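A simplified, self-contained sketch of the resulting shape (the real
implementation additionally packs an ABA counter into the head word,
and its nodes must live outside the GC'd heap):

    package lfdemo

    import (
        "sync/atomic"
        "unsafe"
    )

    // lfstack is the head of a lock-free stack; 0 means empty.
    type lfstack uint64

    type lfnode struct {
        next uint64
    }

    func (head *lfstack) push(node *lfnode) {
        for {
            old := atomic.LoadUint64((*uint64)(head))
            node.next = old
            new := uint64(uintptr(unsafe.Pointer(node)))
            if atomic.CompareAndSwapUint64((*uint64)(head), old, new) {
                return
            }
        }
    }

    func (head *lfstack) pop() *lfnode {
        for {
            old := atomic.LoadUint64((*uint64)(head))
            if old == 0 {
                return nil
            }
            node := (*lfnode)(unsafe.Pointer(uintptr(old)))
            if atomic.CompareAndSwapUint64((*uint64)(head), old, node.next) {
                return node
            }
        }
    }

    func (head *lfstack) empty() bool {
        return atomic.LoadUint64((*uint64)(head)) == 0
    }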
Change-Id: I64685fa3be0e82ae2d1a782a452a50974440a827
Reviewed-on: https://go-review.googlesource.com/38290
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
Implement math/bits.TrailingZerosX using intrinsics.
Generally reorganize the intrinsic spec a bit.
The intrinsics data structure is now built at init time.
This will make doing the other functions in math/bits easier.
Update sys.CtzX to return int instead of uint{64,32} so it
matches math/bits.TrailingZerosX.
Improve the intrinsics a bit for amd64. We don't need the CMOV
for <64 bit versions.
Update #18616
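Usage on the math/bits side, which these intrinsics accelerate:

    package main

    import (
        "fmt"
        "math/bits"
    )

    func main() {
        // With the intrinsics these compile down to single
        // count-trailing-zeros instructions rather than calls.
        fmt.Println(bits.TrailingZeros32(8)) // 3
        fmt.Println(bits.TrailingZeros64(1)) // 0
        fmt.Println(bits.TrailingZeros(0))   // bits.UintSize (32 or 64)
    }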
Change-Id: Ic1c5339c943f961d830ae56f12674d7b29d4ff39
Reviewed-on: https://go-review.googlesource.com/38155
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
Currently, when printing tracebacks of other threads during
GOTRACEBACK=crash, if the thread is on the system stack we print only
the header for the user goroutine and fail to print its stack. This
happens because we passed the g0 to traceback instead of curg. The g0
never has anything set in its gobuf, so traceback doesn't print
anything.
Fix this by passing _g_.m.curg to traceback instead of the g0.
Fixes #19494.
Change-Id: Idfabf94d6a725e9cdf94a3923dead6455ef3b217
Reviewed-on: https://go-review.googlesource.com/38012
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
GOTRACEBACK=crash works by bouncing a SIGQUIT around the process
sched.mcount times. However, sched.mcount includes the extra Ms
allocated by oneNewExtraM for cgo callbacks. Hence, if there are any
extra Ms that don't have real OS threads, we'll try to send SIGQUIT
more times than there are threads to catch it. Since nothing will
catch these extra signals, we'll fall back to blocking for five
seconds before aborting the process.
Avoid this five second delay by subtracting out the number of extra Ms
when sending SIGQUITs.
Of course, in a cgo binary, it's still possible for the SIGQUIT to go
to a cgo thread and cause some other failure mode. This does not fix
that.
Change-Id: I4fbf3c52dd721812796c4c1dcb2ab4cb7026d965
Reviewed-on: https://go-review.googlesource.com/38182
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
CL 32219 added precomputed sizeclass tables.
Remove the unused sizeToClass method which was previously only
called from initSizes.
Change-Id: I907bf9ed78430ecfaabbec7fca77ef2375010081
Reviewed-on: https://go-review.googlesource.com/38113
Run-TryBot: Dave Cheney <dave@cheney.net>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
On FreeBSD, when a Go process runs under a restricted list of
processors (e.g. 'cpuset -l 0 ./a.out' on a multi-core system),
runtime.NumCPU() still returns the total number of physical CPUs
from sysctl hw.ncpu instead of counting only the processors in the
list.
Fix this by using the cpuset_getaffinity syscall to count the
processors in the list.
Fixes #15206
Change-Id: If87c4b620e870486efa100685db5debbf1210a5b
Reviewed-on: https://go-review.googlesource.com/29341
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Found by github.com/mvdan/unparam.
Change-Id: Iabcdfec2ae42c735aa23210b7183080d750682ca
Reviewed-on: https://go-review.googlesource.com/38030
Reviewed-by: Peter Weinberger <pjw@google.com>
Run-TryBot: Peter Weinberger <pjw@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
A typo in the previous revision ("act" instead of "oldact") caused us
to return the sa_flags from the new (or zeroed) sigaction rather than
the old one.
In the presence of a signal handler registered before
runtime.libpreinit, this caused setsigstack to erroneously zero out
important sa_flags (such as SA_SIGINFO) in its attempt to re-register
the existing handler with SA_ONSTACK.
Change-Id: I3cd5152a38ec0d44ae611f183bc1651d65b8a115
Reviewed-on: https://go-review.googlesource.com/37852
Run-TryBot: Bryan Mills <bcmills@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
There are a few problems from change 35494, discovered during testing
of change 37852.
1. I was confused about the usage of n.key in the sema variant, so we
were looping on the wrong condition. The error was not caught by
the TryBots (presumably due to missing TSAN coverage in the BSD and
darwin builders?).
2. The sysmon goroutine sometimes skips notetsleep entirely, using
direct usleep syscalls instead. In that case, we were not calling
_cgo_yield, leading to missed signals under TSAN.
3. Some notetsleep calls have long finite timeouts. They should be
broken up into smaller chunks with a yield at the end of each
chunk.
Updates #18717.
Change-Id: I91175af5dea3857deebc686f51a8a40f9d690bcc
Reviewed-on: https://go-review.googlesource.com/37867
Run-TryBot: Bryan Mills <bcmills@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
After benchmarking with a compiler modified to have better
spill location, it became clear that this method of checking
was actually faster on (at least) two different architectures
(ppc64 and amd64) and it also provides more timely interruption
of loops.
This change adds a modified FOR loop node "FORUNTIL" that
checks after executing the loop body instead of before (i.e.,
always at least once). This ensures that a pointer past the
end of a slice or array is not made visible to the garbage
collector.
Without the rescheduling checks inserted, the restructured
loop from this change apparently provides a 1% geomean
improvement on PPC64 running the go1 benchmarks; the
improvement on AMD64 is only 0.12%.
Inserting the rescheduling check exposed a peculiar bug
with the ssa test code for s390x; this was updated based on
initial code actually generated for GOARCH=s390x to use
appropriate OpArg, OpAddr, and OpVarDef.
NaCl is disabled in testing.
Change-Id: Ieafaa9a61d2a583ad00968110ef3e7a441abca50
Reviewed-on: https://go-review.googlesource.com/36206
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
This helps systems that maintain an external database mapping
build ID to symbol information for the given binary, especially
in the case where /proc/self/maps lists many different files
(for example, many shared libraries).
Avoid importing debug/elf to avoid dragging in that whole
package (and its dependencies like debug/dwarf) into the
build of every program that generates a profile.
Fixes #19431.
Change-Id: I6d4362a79fe23e4f1726dffb0661d20bb57f766f
Reviewed-on: https://go-review.googlesource.com/37855
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Currently selectgo is just a wrapper around selectgoImpl. This keeps
the hard-coded frame skip counts for tracing the same between the
channel implementation and the select implementation.
However, this is fragile and confusing, so pass a skip parameter to
send and recv, join selectgo and selectgoImpl into one function, and
decrease all of the skips in selectgo by one.
Change-Id: I11b8cbb7d805b55f5dc6ab4875ac7dde79412ff2
Reviewed-on: https://go-review.googlesource.com/37860
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
If the bad pointer is on a stack, this makes it possible to find the
frame containing the bad pointer.
Change-Id: Ieda44e054aa9ebf22d15d184457c7610b056dded
Reviewed-on: https://go-review.googlesource.com/37858
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
This commit reworks multiway select statements to use normal control
flow primitives instead of the previous setjmp/longjmp-like behavior.
This simplifies liveness analysis and should prevent issues around
"returns twice" function calls within SSA passes.
test/live.go is updated because liveness analysis's CFG is more
representative of actual control flow. The case bodies are the only
real successors of the selectgo call, but previously the selectsend,
selectrecv, etc. calls were included in the successors list too.
Updates #19331.
Change-Id: I7f879b103a4b85e62fc36a270d812f54c0aa3e83
Reviewed-on: https://go-review.googlesource.com/37661
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
It's only ever called with the value it was using, but the code was
counterintuitive. Use the parameter instead, like the other funcs near
it.
Found by github.com/mvdan/unparam.
Change-Id: I45855e11d749380b9b2a28e6dd1d5dedf119a19b
Reviewed-on: https://go-review.googlesource.com/37893
Reviewed-by: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
For historical reasons, it's still commonplace to iterate over the
slice returned by runtime.Callers and call FuncForPC on each PC. This
is broken in gccgo and somewhat broken in gc and will become more
broken in gc with mid-stack inlining.
In Go 1.7, we introduced runtime.CallersFrames to deal with these
problems, but didn't strongly direct people toward using it. Reword
the documentation on runtime.Callers to more strongly encourage people
to use CallersFrames and explicitly discourage them from iterating
over the PCs or using FuncForPC on the results.
Fixes #19426.
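The encouraged pattern looks like this:

    package main

    import (
        "fmt"
        "runtime"
    )

    func main() {
        pc := make([]uintptr, 16)
        n := runtime.Callers(0, pc)

        // Iterate with CallersFrames rather than calling FuncForPC on
        // each PC; this stays correct in the presence of inlining.
        frames := runtime.CallersFrames(pc[:n])
        for {
            frame, more := frames.Next()
            fmt.Printf("%s\n\t%s:%d\n", frame.Function, frame.File, frame.Line)
            if !more {
                break
            }
        }
    }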
Change-Id: Id0d14cb51a0e9521c8fdde9612610f2c2b9383c4
Reviewed-on: https://go-review.googlesource.com/37726
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Currently almost every function that deals with a *_func has to first
look up the *moduledata for the module containing the function's entry
point. This means we almost always do at least two identical module
lookups whenever we deal with a *_func (one to get the *_func and
another to get something from its module data) and sometimes several
more.
Fix this by making findfunc return a new funcInfo type that embeds
*_func, but also includes the *moduledata, and making all of the
functions that currently take a *_func instead take a funcInfo and use
the already-found *moduledata.
This transformation is trivial for the most part, since the *_func
type is usually inferred. The annoying part is that we can no longer
use nil to indicate failure, so this introduces a funcInfo.valid()
method and replaces nil checks with calls to valid.
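A sketch of the new shape, with stub types standing in for the
runtime's internal ones:

    package funcdemo

    // Stubs standing in for runtime internals.
    type _func struct{ nameoff int32 }
    type moduledata struct{ funcnames map[int32]string }

    // funcInfo pairs a *_func with the module containing it, so
    // callers stop repeating the module lookup.
    type funcInfo struct {
        *_func
        datap *moduledata
    }

    // valid replaces the old nil checks on *_func.
    func (f funcInfo) valid() bool { return f._func != nil }

    func funcname(f funcInfo) string {
        if !f.valid() {
            return ""
        }
        return f.datap.funcnames[f.nameoff] // no second module lookup
    }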
Change-Id: I9b8075ef1c31185c1943596d96dec45c7ab5100f
Reviewed-on: https://go-review.googlesource.com/37331
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Hudson-Doyle <michael.hudson@canonical.com>
Currently we acquire a global lock for every newMarkBits call. This is
unfortunate since every span sweep operation calls newMarkBits.
However, most allocations are simply linear allocations from the
current arena. Take advantage of this to add a lock-free fast path for
allocating from the current arena. With this change, the global lock
only protects the lists of arenas, not the free offset in the current
arena.
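A self-contained sketch of the fast-path pattern (sizes and names are
illustrative):

    package bitsdemo

    import "sync/atomic"

    // gcBitsArena stands in for the runtime's arena: a fixed block
    // with an atomically advanced free offset.
    type gcBitsArena struct {
        free uint64
        bits [64 << 10]byte
    }

    // tryAlloc is the lock-free fast path. It returns nil when the
    // arena is exhausted, and only then does the caller take the
    // global lock to install a fresh arena.
    func (a *gcBitsArena) tryAlloc(bytes uint64) []byte {
        for {
            off := atomic.LoadUint64(&a.free)
            if off+bytes > uint64(len(a.bits)) {
                return nil
            }
            if atomic.CompareAndSwapUint64(&a.free, off, off+bytes) {
                return a.bits[off : off+bytes]
            }
        }
    }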
Change-Id: I6cf6182af8492c8bfc21276114c77275fe3d7826
Reviewed-on: https://go-review.googlesource.com/34595
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
Currently, newArena holds the gcBitsArenas lock across allocating
memory from the OS for a new gcBits arena. This is a global lock and
allocating physical memory can be expensive, so this has the potential
to cause high lock contention, especially since every single span
sweep operation calls newArena (via newMarkBits).
Improve the situation by temporarily dropping the lock across
allocation. This means the caller now has to revalidate its
assumptions after the lock is dropped, so this also factors out that
code path and reinvokes it after the lock is acquired.
Change-Id: I1113200a954ab4aad16b5071512583cfac744bdc
Reviewed-on: https://go-review.googlesource.com/34594
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Rick Hudson <rlh@golang.org>
Change-Id: I6343c162e27e2e492547c96f1fc504909b1c03c0
Reviewed-on: https://go-review.googlesource.com/37793
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Currently ReadMemStats stops the world for ~1.7 ms/GB of heap because
it collects statistics from every single span. For large heaps, this
can be quite costly. This is particularly unfortunate because many
production infrastructures call this function regularly to collect and
report statistics.
Fix this by tracking the necessary cumulative statistics in the
mcaches. ReadMemStats still has to stop the world to stabilize these
statistics, but there are only O(GOMAXPROCS) mcaches to collect
statistics from, so this pause is only 25µs even at GOMAXPROCS=100.
Fixes #13613.
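The affected call is unchanged for users:

    package main

    import (
        "fmt"
        "runtime"
    )

    func main() {
        var m runtime.MemStats
        // Still stops the world, but the pause is now proportional to
        // GOMAXPROCS rather than to the number of heap spans.
        runtime.ReadMemStats(&m)
        fmt.Printf("HeapAlloc=%d NumGC=%d\n", m.HeapAlloc, m.NumGC)
    }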
Change-Id: I3c0a4e14833f4760dab675efc1916e73b4c0032a
Reviewed-on: https://go-review.googlesource.com/34937
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
The gcstats structure is no longer consumed by anything and no longer
tracks statistics that are particularly relevant to the concurrent
garbage collector. Remove it. (Having statistics is probably a good
idea, but these aren't the stats we need these days and we don't have
a way to get them out of the runtime.)
In preparation for #13613.
Change-Id: Ib63e2f9067850668f9dcbfd4ed89aab4a6622c3f
Reviewed-on: https://go-review.googlesource.com/34936
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Rick Hudson <rlh@golang.org>
The iOS test harness was recently changed in response to lldb bugs
to replace breakpoints with the SIGUSR2 signal (CL 34926), and to
pass the current directory in the test binary arguments (CL 35152).
Both the signal sending and the working directory setup are done
from the go test driver.
However, the new method doesn't work with tests where a C program is
the test driver instead of go test: the current working directory
will not be changed and SIGUSR2 is not raised.
Instead of copying that logic into any C test program, rework the
test harness (again) to move the setup logic to the early runtime
cgo setup code. That way, the harness will run even in the library
build modes.
Then, use the app Info.plist file to pass the working
directory, removing the need to alter the arguments after running.
Finally, use the SIGINT signal instead of SIGUSR2 to avoid
manipulating the signal masks or handlers.
Fixes the testcarchive tests on iOS.
With this CL, both darwin/arm and darwin/arm64 passes all.bash.
This CL replaces CL 34926, CL 35152 as well as the fixup CL
35123 and CL 35255. They are reverted in CLs earlier in the
relation chain.
Change-Id: I8485c7db1404fbd8daa261efd1ea89e905121a3e
Reviewed-on: https://go-review.googlesource.com/36090
Run-TryBot: Elias Naur <elias.naur@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Crawshaw <crawshaw@golang.org>
In order to generate accurate tracebacks, the runtime needs to know the
inlined call stack for a given PC. This creates two tables per function
for this purpose. The first table is the inlining tree (stored in the
function's funcdata), which has a node containing the file, line, and
function name for every inlined call. The second table is a PC-value
table that maps each PC to a node in the inlining tree (or -1 if the PC
is not the result of inlining).
To give the appearance that inlining hasn't happened, the runtime also
needs the original source position information of inlined AST nodes.
Previously the compiler plastered over the line numbers of inlined AST
nodes with the line number of the call. This meant that the PC-line
table mapped each PC to the line number of the outermost call in its
inlined
call stack, with no way to access the innermost line number.
Now the compiler retains line numbers of inlined AST nodes and writes
the innermost source position information to the PC-line and PC-file
tables. Some tools and tests expect to see outermost line numbers, so we
provide the OutermostLine function for displaying line info.
To keep track of the inlined call stack for an AST node, we extend the
src.PosBase type with an index into a global inlining tree. Every time
the compiler inlines a call, it creates a node in the global inlining
tree for the call, and writes its index to the PosBase of every inlined
AST node. The parent of this node is the inlining tree index of the
call. -1 signifies no parent.
For each function, the compiler creates a local inlining tree and a
PC-value table mapping each PC to an index in the local tree. These are
written to an object file, which is read by the linker. The linker
re-encodes these tables compactly by deduplicating function names and
file names.
This change increases the size of binaries by 4-5%. For example, this is
how the go1 benchmark binary is impacted by this change:
section old bytes new bytes delta
.text 3.49M ± 0% 3.49M ± 0% +0.06%
.rodata 1.12M ± 0% 1.21M ± 0% +8.21%
.gopclntab 1.50M ± 0% 1.68M ± 0% +11.89%
.debug_line 338k ± 0% 435k ± 0% +28.78%
Total 9.21M ± 0% 9.58M ± 0% +4.01%
Updates #19348.
Change-Id: Ic4f180c3b516018138236b0c35e0218270d957d3
Reviewed-on: https://go-review.googlesource.com/37231
Run-TryBot: David Lazar <lazard@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
There are two accesses to mheap_.busy that are guarded by checks
against len(mheap_.free). This works because both lists are (and must
be) the same length, but it makes the code less clear. Change these to
use len(mheap_.busy) so the access more clearly parallels the check.
Fixes #18944.
Change-Id: I9bacbd3663988df351ed4396ae9018bc71018311
Reviewed-on: https://go-review.googlesource.com/36354
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
Currently sweep counts the number of allocated objects, computes the
number of free objects from that, then re-computes the number of
allocated objects from that. Simplify and clean this up by skipping
these intermediate steps.
Change-Id: I3ed98e371eb54bbcab7c8530466c4ab5fde35f0a
Reviewed-on: https://go-review.googlesource.com/34935
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Marvin Stenger <marvin.stenger94@gmail.com>
Reviewed-by: Rick Hudson <rlh@golang.org>
Currently we scan the finalizers queue both during concurrent mark and
during mark termination. This costs roughly 20ns per queued finalizer
and about 1ns per unused finalizer queue slot (allocated queue length
never decreases), which can drive up STW time if there are many
finalizers.
However, we only add finalizers to this queue during sweeping, which
means that the second scan will never find anything new. Hence, we can
fix this by simply not scanning the finalizers queue during mark
termination. This brings the STW time under the 100µs goal even with
1,000,000 queued finalizers.
Fixes #18869.
Change-Id: I4ce5620c66fb7f13ebeb39ca313ce57047d1d0fb
Reviewed-on: https://go-review.googlesource.com/36013
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
Since workbuf is now marked go:notinheap, the write barrier-preventing
wrapper type wbufptr is no longer necessary. Remove it.
Change-Id: I3e5b5803a1547d65de1c1a9c22458a38e08549b7
Reviewed-on: https://go-review.googlesource.com/35971
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
Some debugging code was recently added to:
1) provide more detail for the stale reason when it is
determined that a package is stale
2) provide file and package time and date information when
it is determined that runtime.a is stale
This backs out those debugging messages.
Fixes #19116
Change-Id: I8dd0cbe29324820275b481d8bbb78ff2c5fbc362
Reviewed-on: https://go-review.googlesource.com/37382
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
The comments in cmd/internal/obj/funcdata.go are identical to the
comments in runtime/funcdata.h, but the majority of the definitions
they refer to don't apply to Go sources and have been stripped out of
funcdata.go.
Remove these stale comments from funcdata.go and clean up the
references to other copies of the PCDATA and FUNCDATA indexes.
Change-Id: I5d6e49a6e586cc9aecd7c3ce1567679f2a605884
Reviewed-on: https://go-review.googlesource.com/37330
Reviewed-by: Keith Randall <khr@golang.org>
These functions are not defined and are not used.
Fixes #19290
Change-Id: I2978147220af83cf319f7439f076c131870fb9ee
Reviewed-on: https://go-review.googlesource.com/37448
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Dmitry Vyukov <dvyukov@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
If the caller passes a large number to Profile.Add,
the list of pcs is empty, which results in junk
(a nil pc) being recorded. Check for that explicitly,
and replace such stack traces with a lostProfileEvent.
Fixes #18836.
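For reference, the API involved (the skip argument counts call frames
above Add):

    package main

    import (
        "os"
        "runtime/pprof"
    )

    var res = pprof.NewProfile("example.resource")

    type handle struct{ name string }

    func open(name string) *handle {
        h := &handle{name}
        // skip=1 records open's caller. A skip larger than the stack
        // used to record a junk (nil) PC; it now becomes a
        // lostProfileEvent entry instead.
        res.Add(h, 1)
        return h
    }

    func main() {
        h := open("demo")
        defer res.Remove(h)
        res.WriteTo(os.Stdout, 1)
    }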
Change-Id: I99c96aa67dd5525cd239ea96452e6e8fcb25ce02
Reviewed-on: https://go-review.googlesource.com/36891
Reviewed-by: Russ Cox <rsc@golang.org>
The profiles are self-contained now.
Check that they work by themselves in the tests that invoke pprof,
but also keep checking that the old command lines work.
Change-Id: I24c74b5456f0b50473883c3640625c6612f72309
Reviewed-on: https://go-review.googlesource.com/37166
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Matloob <matloob@golang.org>
The existing code builds a full profile in memory.
Then it translates that profile into a data structure (in memory).
Then it marshals that data structure into a protocol buffer (in memory).
Then it gzips that marshaled form into the underlying writer.
So there are three copies of the full profile data in memory
at the same time before we're done. This is obviously dumb.
This CL implements a fully streaming conversion from
the original in-memory profile to the underlying writer.
There is now only one copy of the profile in memory.
For the non-CPU profiles, this is optimal, since we have to
have a full copy in memory to start with.
For the CPU profiles, we could still try to bound the profile
size stored in memory and stream fragments out during
the actual profiling, as Go 1.7 did (with a simpler format),
but so far that hasn't been necessary.
Change-Id: Ic36141021857791bf0cd1fce84178fb5e744b989
Reviewed-on: https://go-review.googlesource.com/37164
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Michael Matloob <matloob@golang.org>
The old hash table was a placeholder that allocated memory
during every lookup for key generation, even for keys that hit
in the table.
Change-Id: I4f601bbfd349f0be76d6259a8989c9c17ccfac21
Reviewed-on: https://go-review.googlesource.com/37163
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Matloob <matloob@golang.org>
This doesn't change the functionality of the current code,
but it sets us up for exporting the profiling labels into the profile.
The old code had a hash table of profile samples maintained
during the signal handler, with evictions going into a log.
The new code just logs every sample directly, leaving the
hash-based deduplication to an ordinary goroutine.
The new code also avoids storing the entire profile in two
forms in memory, an unfortunate regression introduced
when binary profile support was added. After this CL the
entire profile is only stored once in memory. We'd still like
to get back down to storing it zero times (streaming it to
the underlying io.Writer).
Change-Id: I0893a1788267c564aa1af17970d47377b2a43457
Reviewed-on: https://go-review.googlesource.com/36712
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Matloob <matloob@golang.org>
It's common for some goroutines to loop calling time.Sleep.
Allocate once per goroutine, not every time.
This comes up in runtime/pprof's background reader.
Change-Id: I89d17dc7379dca266d2c9cd3aefc2382f5bdbade
Reviewed-on: https://go-review.googlesource.com/37162
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
The existing CPU profiling buffer is a slice of uintptr, but we want to
start including profiling label data in the profiles, and those labels need
to be pointers in order to let them describe rich information.
This CL implements a new profBuf type that holds both a slice of uint64
for data and a slice of unsafe.Pointer for profiling labels (aka tags).
Making the runtime use these buffers will happen in followup CLs.
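The parallel-slice shape, sketched (the real profBuf is a lock-free
ring buffer written from the signal handler):

    package profbufdemo

    import "unsafe"

    // profBuf keeps fixed-size record words and their labels side by
    // side: the tags slice holds one pointer per record, kept where
    // the garbage collector can see it.
    type profBuf struct {
        data []uint64         // record headers, timestamps, stack PCs
        tags []unsafe.Pointer // profiling label for each record
    }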
Change-Id: I9ff16b532d8edaf4ce0cbba1098229a561834efc
Reviewed-on: https://go-review.googlesource.com/36713
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
This updates the test case to display the timestamps for
runtime.a, its dependent packages atomic.a and sys.a, and
source files.
Change-Id: Id2901b4e8aa8eb9775c4f404ac01cc07b394ba91
Reviewed-on: https://go-review.googlesource.com/37332
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Suggested by Dmitry in CL 36792 review.
Clearly safe since there are many different semaRoots
that could all have profiled sudogs calling mutexevent.
Change-Id: I45eed47a5be3e513b2dad63b60afcd94800e16d1
Reviewed-on: https://go-review.googlesource.com/37104
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
We have seen one instance of a production job suddenly spinning to
100% CPU and becoming unresponsive. In that one instance, a SIGQUIT
was sent after 328 minutes of spinning, and the stacks showed a single
goroutine in "IO wait (scan)" state.
Looking for things that might get stuck if a goroutine got stuck in
scanning a stack, we found that injectglist does:
    lock(&sched.lock)
    var n int
    for n = 0; glist != nil; n++ {
        gp := glist
        glist = gp.schedlink.ptr()
        casgstatus(gp, _Gwaiting, _Grunnable)
        globrunqput(gp)
    }
    unlock(&sched.lock)
and that casgstatus spins on gp.atomicstatus until the _Gscan bit goes
away. Essentially, this code locks sched.lock and then while holding
sched.lock, waits to lock gp.atomicstatus.
The code that is doing the scan is:
    if castogscanstatus(gp, s, s|_Gscan) {
        if !gp.gcscandone {
            scanstack(gp, gcw)
            gp.gcscandone = true
        }
        restartg(gp)
        break loop
    }
More analysis showed that scanstack can, in a rare case, end up
calling back into code that acquires sched.lock. For example:
    runtime.scanstack at proc.go:866
    calls runtime.gentraceback at mgcmark.go:842
    calls runtime.scanstack$1 at traceback.go:378
    calls runtime.scanframeworker at mgcmark.go:819
    calls runtime.scanblock at mgcmark.go:904
    calls runtime.greyobject at mgcmark.go:1221
    calls (*runtime.gcWork).put at mgcmark.go:1412
    calls (*runtime.gcControllerState).enlistWorker at mgcwork.go:127
    calls runtime.wakep at mgc.go:632
    calls runtime.startm at proc.go:1779
    acquires runtime.sched.lock at proc.go:1675
This path was found with an automated deadlock-detecting tool.
There are many such paths but they all go through enlistWorker -> wakep.
The evidence strongly suggests that one of these paths is what caused
the deadlock we observed. We're running those jobs with
GOTRACEBACK=crash now to try to get more information if it happens
again.
Further refinement and analysis shows that if we drop the wakep call
from enlistWorker, the remaining few deadlock cycles found by the tool
are all false positives caused by not understanding the effect of calls
to func variables.
The enlistWorker -> wakep call was intended only as a performance
optimization, it rarely executes, and if it does execute at just the
wrong time it can (and plausibly did) cause the deadlock we saw.
Comment it out, to avoid the potential deadlock.
Fixes #19112.
Unfixes #14179.
Change-Id: I6f7e10b890b991c11e79fab7aeefaf70b5d5a07b
Reviewed-on: https://go-review.googlesource.com/37093
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
This changes the os package to use the runtime poller for file I/O
where possible. When a system call blocks on a pollable descriptor,
the goroutine will be blocked on the poller but the thread will be
released to run other goroutines. When using a non-pollable
descriptor, the os package will continue to use thread-blocking system
calls as before.
For example, on GNU/Linux, the runtime poller uses epoll. epoll does
not support ordinary disk files, so they will continue to use blocking
I/O as before. The poller will be used for pipes.
Since this means that the poller is used for many more programs, this
modifies the runtime to only block waiting for the poller if there is
some goroutine that is waiting on the poller. Otherwise, there is no
point, as the poller will never make any goroutine ready. This
preserves the runtime's current simple deadlock detection.
This seems to crash FreeBSD systems, so it is disabled on FreeBSD.
This is issue 19093.
Using the poller on Windows requires opening the file with
FILE_FLAG_OVERLAPPED. We should only do that if we can remove that
flag if the program calls the Fd method. This is issue 19098.
Update #6817.
Update #7903.
Update #15021.
Update #18507.
Update #19093.
Update #19098.
Change-Id: Ia5197dcefa7c6fbcca97d19a6f8621b2abcbb1fe
Reviewed-on: https://go-review.googlesource.com/36800
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
Since we're no longer stealing space for the stack barrier array from
the stack allocation, the stack allocation is simply
g.stack.hi-g.stack.lo.
Updates #17503.
Change-Id: Id9b450ae12c3df9ec59cfc4365481a0a16b7c601
Reviewed-on: https://go-review.googlesource.com/36621
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
Now that we don't rescan stacks, stack barriers are unnecessary. This
removes all of the code and structures supporting them as well as
tests that were specifically for stack barriers.
Updates #17503.
Change-Id: Ia29221730e0f2bbe7beab4fa757f31a032d9690c
Reviewed-on: https://go-review.googlesource.com/36620
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
With the hybrid barrier, rescanning stacks is no longer necessary so
the rescan list is no longer necessary. Remove it.
This leaves the gcrescanstacks GODEBUG variable, since it's useful for
debugging, but changes it to simply walk all of the Gs to rescan
stacks rather than using the rescan list.
We could also remove g.gcscanvalid, which is effectively a distributed
rescan list. However, it's still useful for gcrescanstacks mode and it
adds little complexity, so we'll leave it in.
Fixes #17099.
Updates #17503.
Change-Id: I776d43f0729567335ef1bfd145b75c74de2cc7a9
Reviewed-on: https://go-review.googlesource.com/36619
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
The wbshadow implementation was removed a year and a half ago in
1635ab7dfe, but the GODEBUG setting remained. Remove the GODEBUG
setting since it doesn't do anything.
Change-Id: I19cde324a79472aff60acb5cc9f7d4aa86c0c0ed
Reviewed-on: https://go-review.googlesource.com/36618
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
For vet. There are more. This is a start.
Change-Id: Ibbbb2b20b5db60ee3fac4a1b5913d18fab01f6b9
Reviewed-on: https://go-review.googlesource.com/36939
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Extend the period of fastrand from (1<<31)-1 to (1<<32)-1 by
choosing a different polynomial and acting on the high bit before
the shift. The polynomial is taken from 32.dat.gz at
https://users.ece.cmu.edu/~koopman/lfsr/index.html . It is referred
to as F7711115 because that list of polynomials is for an LFSR that
shifts to the right (and fastrand shifts to the left). (The old
polynomial is referred to in 31.dat.gz as 7BB88888.)
There were a couple of places that converted fastrand to int, which
led to negative values on 32-bit platforms. They are fixed.
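A self-contained sketch of the branch-free Galois LFSR step described
above (poly is a placeholder constant; the CL picks a specific
primitive polynomial so that the period over non-zero states is
(1<<32)-1):

    package main

    import "fmt"

    // poly is a placeholder feedback polynomial for illustration only.
    const poly uint32 = 0xa0000001

    // step shifts left and, when the high bit would be lost, XORs in
    // the feedback polynomial. uint32(int32(x)>>31) is all ones when
    // the high bit is set and zero otherwise, so no branch is needed.
    func step(x uint32) uint32 {
        mask := uint32(int32(x)>>31) & poly
        return x<<1 ^ mask
    }

    func main() {
        x := uint32(1)
        for i := 0; i < 5; i++ {
            x = step(x)
            fmt.Printf("%#08x\n", x)
        }
    }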
Change-Id: Ibee518a3f9103e0aea220ada494b3aec77babb72
Reviewed-on: https://go-review.googlesource.com/36875
Run-TryBot: Minux Ma <minux@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Minux Ma <minux@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
This will make it possible to use the poller with the os package.
This is a lot of code movement but the behavior is intended to be
unchanged.
Update #6817.
Update #7903.
Update #15021.
Update #18507.
Change-Id: I1413685928017c32df5654ded73a2643820977ae
Reviewed-on: https://go-review.googlesource.com/36799
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Crawshaw <crawshaw@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
When doing i.(T) for non-empty-interface i and concrete type T,
there's no need to read the type out of the itab. Just compare the
itab to the itab we expect for that interface/type pair.
Also optimize type switches by putting the type hash of the
concrete type in the itab. That way we don't need to load the
type pointer out of the itab.
Update #18492
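At the source level nothing changes; the affected code paths are the
ones the compiler generates for assertions and switches like these:

    package main

    import "fmt"

    type Stringer interface{ String() string }

    type T struct{}

    func (T) String() string { return "T" }

    func main() {
        var i Stringer = T{}
        // For non-empty interfaces, i.(T) now compares i's itab word
        // against the static itab for (Stringer, T) directly.
        if v, ok := i.(T); ok {
            fmt.Println(v.String())
        }
        // Type switches use the type hash now stored in the itab.
        switch i.(type) {
        case T:
            fmt.Println("is T")
        }
    }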
Change-Id: I49e280a21e5687e771db5b8a56b685291ac168ce
Reviewed-on: https://go-review.googlesource.com/34810
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: David Chase <drchase@google.com>
Based on sample code from iant.
Fixes #18788.
Change-Id: I6bb33ed05af2538fbde42ddcac629280ef7c00a6
Reviewed-on: https://go-review.googlesource.com/36892
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
If there are many goroutines contending for two different locks
and both locks hash to the same semaRoot, the scans to find the
goroutines for a particular lock can end up being O(n), making
n lock acquisitions quadratic.
As long as only one actively-used lock hashes to each semaRoot
there's no problem, since the list operations in that case are O(1).
But when the second actively-used lock hits the same semaRoot,
then scans for entries for a given lock have to scan over the
entries for the other lock.
Fix this problem by changing the semaRoot to hold only one sudog
per unique address. In the running example, this drops the length of
that list from O(n) to 2. Then attach other goroutines waiting on the
same address to a separate list headed by the sudog in the semaRoot list.
Those "same address list" operations are still O(1), so now the
example from above works much better.
There is still an assumption here that in real programs you don't have
many many goroutines queueing up on many many distinct addresses.
If we end up with that problem, we can replace the top-level list with
a treap.
Fixes #17953.
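A sketch of the resulting structure (field names are illustrative;
the runtime reuses its existing sudog type for both roles):

    package semademo

    import "unsafe"

    // semaRoot's list now has one entry per distinct semaphore
    // address, so scans for a given lock stay short even when two
    // actively-used locks hash to the same root.
    type semaRoot struct {
        head *sudog
    }

    type sudog struct {
        elem     unsafe.Pointer // semaphore address being waited on
        next     *sudog         // next distinct address in the root list
        waitlink *sudog         // other waiters on the same address
    }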
Change-Id: I78c5b1a5053845275ab31686038aa4f6db5720b2
Reviewed-on: https://go-review.googlesource.com/36792
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
So it can be inlined.
Using bit tricks, it can be implemented without a branch
(improved trick version by Minux Ma).
A simple benchmark shows it is faster on i386 and x86_64, though
I don't know whether it will be faster on other architectures.
benchmark old ns/op new ns/op delta
BenchmarkFastrand-3 2.79 1.48 -46.95%
BenchmarkFastrandHashiter-3 25.9 24.9 -3.86%
Change-Id: Ie2eb6d0f598c0bb5fac7f6ad0f8b5e3eddaa361b
Reviewed-on: https://go-review.googlesource.com/34782
Reviewed-by: Minux Ma <minux@golang.org>
Run-TryBot: Minux Ma <minux@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
If the user is calling SetGCPercent(-1), they intend to disable GC.
They probably don't intend to run one. If they do, they can call
runtime.GC themselves.
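Typical usage after this change:

    package main

    import (
        "runtime"
        "runtime/debug"
    )

    func main() {
        old := debug.SetGCPercent(-1) // disable GC; no collection is run
        defer debug.SetGCPercent(old) // restore the previous setting

        // ... allocation-sensitive section ...

        runtime.GC() // run one explicitly if still wanted
    }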
Change-Id: I40ef40dfc7e15193df9ff26159cd30e56b666f73
Reviewed-on: https://go-review.googlesource.com/34013
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
During the mark phase of garbage collection, goroutines that allocate
may be recruited to assist. This change creates trace events for mark
assists and displays them similarly to sweep assists in the trace
viewer.
Mark assists are different than sweeps in that they can be preempted, so
displaying them in the trace viewer is a little tricky -- we may need to
synthesize multiple slices for one mark assist. This could have been
done in the parser instead, but I thought it might be preferable to keep
the parser as true to the event stream as possible.
Change-Id: I381dcb1027a187a354b1858537851fa68a620ea7
Reviewed-on: https://go-review.googlesource.com/36015
Run-TryBot: Heschi Kreinick <heschi@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
These are very tightly coupled, and internal/protopprof is small.
There's no point to having a separate package.
Change-Id: I2c8aa49c9e18a7128657bf2b05323860151b5606
Reviewed-on: https://go-review.googlesource.com/36711
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
The gcCompat mode was introduced to match the new parser's node position
setup exactly with the positions used by the original parser. Some of the
gcCompat adjustments were required to satisfy syntax error test cases,
and the rest were required to make toolstash cmp pass.
This change removes the former gcCompat adjustments and instead adjusts
the respective test cases as necessary. In some cases this makes the error
lines consistent with the ones reported by gccgo.
Where it has changed, the position associated with a given syntactic construct
is the position (line/col number) of the left-most token belonging to the
construct.
Change-Id: I5b60c00c5999a895c4d6d6e9b383c6405ccf725c
Reviewed-on: https://go-review.googlesource.com/36695
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
This ensures that SIGPROF is handled correctly when using
runtime/pprof in a c-archive or c-shared library.
Separate profiler handling into pre-process changes and per-thread
changes. Simplify the Windows code slightly accordingly.
Fixes #18220.
Change-Id: I5060f7084c91ef0bbe797848978bdc527c312777
Reviewed-on: https://go-review.googlesource.com/34018
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Run-TryBot: Austin Clements <austin@google.com>
Fetch both monotonic and wall time together when possible.
Avoids skew and is cheaper.
Also shave a few ns off in conversion in package time.
Compared to current implementation (after monotonic changes):
name old time/op new time/op delta
Now 19.6ns ± 1% 9.7ns ± 1% -50.63% (p=0.000 n=41+49) darwin/amd64
Now 23.5ns ± 4% 10.6ns ± 5% -54.61% (p=0.000 n=30+28) windows/amd64
Now 54.5ns ± 5% 29.8ns ± 9% -45.40% (p=0.000 n=27+29) windows/386
More importantly, compared to Go 1.8:
name old time/op new time/op delta
Now 9.5ns ± 1% 9.7ns ± 1% +1.94% (p=0.000 n=41+49) darwin/amd64
Now 12.9ns ± 5% 10.6ns ± 5% -17.73% (p=0.000 n=30+28) windows/amd64
Now 15.3ns ± 5% 29.8ns ± 9% +94.36% (p=0.000 n=30+29) windows/386
This brings time.Now back in line with Go 1.8 on darwin/amd64 and windows/amd64.
It's not obvious why windows/386 is still noticeably worse than Go 1.8,
but it's better than before this CL. The windows/386 speed is not too
important; the changes just keep the two architectures similar.
Change-Id: If69b94970c8a1a57910a371ee91e0d4e82e46c5d
Reviewed-on: https://go-review.googlesource.com/36428
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
The fwdSig array is accessed by the signal handler, which may run in
parallel with other threads manipulating it via the os/signal package.
Use atomic accesses to ensure that there are no problems.
Move the _SigHandling flag out of the sigtable array. This makes sigtable
immutable and safe to read from the signal handler.
Change-Id: Icfa407518c4ebe1da38580920ced764898dfc9ad
Reviewed-on: https://go-review.googlesource.com/36321
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Currently both _MaxMem and _MaxArena32 represent the maximum arena
size on 32-bit hosts (except on MIPS32 where _MaxMem is confusingly
smaller than _MaxArena32).
Clean up sysAlloc so that it always uses _MaxMem, which is the maximum
arena size on both 32- and 64-bit architectures and is the arena size
we allocate auxiliary structures for. This lets us simplify and unify
some code paths and eliminate _MaxArena32.
Fixes #18651. mheap.sysAlloc currently assumes that if the arena is
small, we must be on a 32-bit machine and can therefore grow the arena
to _MaxArena32. This breaks down on darwin/arm64, where _MaxMem is
only 2 GB. As a result, on darwin/arm64, we only reserve spans and
bitmap space for a 2 GB heap, and if the application tries to allocate
beyond that, sysAlloc takes the 32-bit path, tries to grow the arena
beyond 2 GB, and panics when it tries to grow the spans array
allocation past its reserved size. This has probably been a problem
for several releases now, but was only noticed recently because
mapSpans didn't check the bounds on the span reservation until
recently. Most likely it corrupted the bitmap before. By using _MaxMem
consistently, we avoid thinking that we can grow the arena larger than
we have auxiliary structures for.
Change-Id: Ifef28cb746a3ead4b31c1d7348495c2242fef520
Reviewed-on: https://go-review.googlesource.com/35253
Reviewed-by: David Crawshaw <crawshaw@golang.org>
Reviewed-by: Elias Naur <elias.naur@gmail.com>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
mallocinit has evolved organically. Make a pass to clean it up in
various ways:
1. Merge the computation of spansSize and bitmapSize. These were
computed on every loop iteration of two different loops, but always
have the same value, which can be derived directly from _MaxMem.
This also avoids over-reserving these on MIPS, where _MaxArena32 is
larger than _MaxMem.
2. Remove the ulimit -v logic. It's been disabled for many releases
and the dead code paths to support it are even more wrong now than
they were when it was first disabled, since now we *must* reserve
spans and bitmaps for the full address space.
3. Make it clear that we're using a simple linear allocation to lay
out the spans, bitmap, and arena spaces. Previously there were a
lot of redundant pointer computations. Now we just bump p1 up as we
reserve the spaces.
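A minimal sketch of the bump-pointer layout in item 3 (names and parameters
are illustrative, not the runtime's):

package layoutsketch

// layout carves the spans, bitmap, and arena regions out of a single
// reservation by advancing one pointer.
func layout(reserved, spansSize, bitmapSize uintptr) (spans, bitmap, arena uintptr) {
	p1 := reserved
	spans = p1
	p1 += spansSize // bump past the spans array
	bitmap = p1
	p1 += bitmapSize // bump past the bitmap
	arena = p1 // the arena takes the rest of the reservation
	return
}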
In preparation for #18651.
Updates #5049 (respect ulimit).
Change-Id: Icbe66570d3a7a17bea227dc54fb3c4978b52a3af
Reviewed-on: https://go-review.googlesource.com/35252
Reviewed-by: Russ Cox <rsc@golang.org>
Currently _MaxMem is a uintptr, which is going to complicate some
further changes. Make it untyped so we'll be able to do untyped math
on it before truncating it to a uintptr.
The runtime assembly is identical before and after this change on
{linux,windows}/{amd64,386}.
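A sketch of the distinction, with an illustrative value and assuming a 64-bit
target:

package constsketch

// As an untyped constant, _MaxMem participates in arbitrary-precision
// constant arithmetic; truncation to uintptr happens only where it is
// used, and each use is checked to fit.
const _MaxMem = 1<<39 - 1 // illustrative value only

const pages = (_MaxMem + 1) / 8192 // untyped math, no uintptr truncation

var arenaMax uintptr = _MaxMem // converted to uintptr only here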
Updates #18651.
Change-Id: I0f64511faa9e0aa25179a556ab9f185ebf8c9cf8
Reviewed-on: https://go-review.googlesource.com/35251
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: David Crawshaw <crawshaw@golang.org>
This change defines runtime/pprof.SetGoroutineLabels and runtime/pprof.Do, which
are used to set profiler labels on goroutines. The change defines functions
in the runtime for setting and getting profile labels, and sets and unsets
profile labels when goroutines are created and deleted. The change also adds
the package runtime/internal/proflabel, which defines the structure the runtime
uses to store profile labels.
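For reference, a minimal usage sketch of the API described here, as it later
shipped in Go 1.9:

package main

import (
	"context"
	"runtime/pprof"
)

func main() {
	// CPU profile samples taken while the function runs carry these labels.
	pprof.Do(context.Background(), pprof.Labels("worker", "purge"), func(ctx context.Context) {
		// ... labeled work ...
	})
}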
Change-Id: I747a4400141f89b6e8160dab6aa94ca9f0d4c94d
Reviewed-on: https://go-review.googlesource.com/34198
Run-TryBot: Michael Matloob <matloob@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-on: https://go-review.googlesource.com/35010
This change defines WithLabels, Labels, Label, and ForLabels.
This is the first step of the profile labels implementation for Go 1.9.
Updates #17280
Change-Id: I2dfc9aae90f7a4aa1ff7080d5747f0a1f0728e75
Reviewed-on: https://go-review.googlesource.com/34198
Run-TryBot: Michael Matloob <matloob@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
It's not used, it's never been used, and it doesn't do what its doc
comment says it does.
Fixes#18941.
Change-Id: Ia89d97fb87525f5b861d7701f919e0d6b7cbd376
Reviewed-on: https://go-review.googlesource.com/36322
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Before this CL, Go programs in c-archive or c-shared buildmodes
would not handle SIGPIPE. That leads to surprising behaviour where
writes on a closed pipe or socket would raise SIGPIPE and terminate
the program. This CL changes the Go runtime to handle
SIGPIPE regardless of buildmode. In addition, SIGPIPE from non-Go
code is forwarded.
This is a refinement of CL 32796 that fixes the case where a non-default
handler for SIGPIPE is installed by the host C program.
Fixes#17393
Change-Id: Ia41186e52c1ac209d0a594bae9904166ae7df7de
Reviewed-on: https://go-review.googlesource.com/35960
Run-TryBot: Elias Naur <elias.naur@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
sigtramp was calling sigtrampgo and depending on the fact that
the 3rd argument slot will not be modified on return. Our calling
convention doesn't guarantee that. Avoid that assumption.
There's no actual bug here, as sigtrampgo does not in fact modify its
argument slots. But I found this while working on the dead stack slot
clobbering tool. https://go-review.googlesource.com/c/23924/
Change-Id: Ia7e791a2b4c1c74fff24cba8169e7840b4b06ffc
Reviewed-on: https://go-review.googlesource.com/36216
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
It seems the problem is with gdb and the dynamic linker. Skip the
test for now until we figure out what's going on with the system.
Updates #18784.
Change-Id: Ic9320ffd463f6c231b2c4192652263b1cf7f4231
Reviewed-on: https://go-review.googlesource.com/36250
Run-TryBot: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
The existing darwin/amd64 implementation of runtime.nanotime returns the
wallclock time, which results in timers not functioning properly when
system time runs backwards. By implementing the algorithm used by the
darwin syscall mach_absolute_time, timers will function as expected.
The algorithm is described at
https://opensource.apple.com/source/xnu/xnu-3248.60.10/libsyscall/wrappers/mach_absolute_time.s
Fixes#17610
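A sketch of that algorithm's core; the field names follow Apple's commpage
source, and the exact details here should be treated as an assumption:

package timesketch

import "math/bits"

// nanotime scales a TSC delta by a 32.32 fixed-point factor and adds a
// nanosecond base: ns = ns_base + ((tsc - tsc_base) * scale) >> 32.
// The full 64x64->128 multiply avoids overflowing the product.
func nanotime(tsc, tscBase, nsBase, scale uint64) uint64 {
	delta := tsc - tscBase
	hi, lo := bits.Mul64(delta, scale)
	return nsBase + hi<<32 + lo>>32 // bits 32..95 of the 128-bit product
}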
Change-Id: I9c8d35240d48249a6837dca1111b1406e2686f67
Reviewed-on: https://go-review.googlesource.com/35292
Reviewed-by: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
For #18130.
f8b4123613 [dev.typealias] spec: use term 'embedded field' rather than 'anonymous field'
9ecc3ee252 [dev.typealias] cmd/compile: avoid false positive cycles from type aliases
49b7af8a30 [dev.typealias] reflect: add test for type aliases
9bbb07ddec [dev.typealias] cmd/compile, reflect: fix struct field names for embedded byte, rune
43c7094386 [dev.typealias] reflect: fix StructOf use of StructField to match StructField docs
9657e0b077 [dev.typealias] cmd/doc: update for type alias
de2e5459ae [dev.typealias] cmd/compile: declare methods after resolving receiver type
9259f3073a [dev.typealias] test: match gccgo error messages on alias2.go
5d92916770 [dev.typealias] cmd/compile: change Func.Shortname to *Sym
a7c884efc1 [dev.typealias] go/internal/gccgoimporter: support for type aliases
5802cfd900 [dev.typealias] cmd/compile: export/import test cases for type aliases
d7cabd40dd [dev.typealias] go/types: clarified doc string
cc2dcce3d7 [dev.typealias] cmd/compile: a few better comments related to alias types
5c160b28ba [dev.typealias] cmd/compile: improved error message for cyles involving type aliases
b2386dffa1 [dev.typealias] cmd/compile: type-check type alias declarations
ac8421f9a5 [dev.typealias] cmd/compile: various minor cleanups
f011e0c6c3 [dev.typealias] cmd/compile, go/types, go/importer: various alias related fixes
49de5f0351 [dev.typealias] cmd/compile, go/importer: define export format and implement importing of type aliases
5ceec42dc0 [dev.typealias] go/types: export TypeName.IsAlias so clients can use it
aa1f0681bc [dev.typealias] go/types: improved Object printing
c80748e389 [dev.typealias] go/types: remove some more vestiges of prior alias implementation
80d8b69e95 [dev.typealias] go/types: implement type aliases
a917097b5e [dev.typealias] go/build: add go1.9 build tag
3e11940437 [dev.typealias] cmd/compile: recognize type aliases but complain for now (not yet supported)
e0a05c274a [dev.typealias] cmd/gofmt: added test cases for alias type declarations
2e5116bd99 [dev.typealias] go/ast, go/parser, go/printer, go/types: initial type alias support
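For reference, the alias form these commits implement, in a minimal example:

package aliassketch

import "container/list"

// An alias declaration: Queue and list.List denote the same type, so
// values and methods are interchangeable; no new type is defined.
type Queue = list.List

// Compare a regular type definition, which creates a distinct type:
type Stack list.List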
Change-Id: Ia65f2e011fd7195f18e1dce67d4d49b80a261203
This avoids errors like
./traceback.go:80:2: call of non-function C.f1
I filed https://gcc.gnu.org/PR79289 for the GCC problem. I think this
is a bug in GCC, and it may be fixed before the final GCC 7 release.
This CL is correct either way.
Fixes#18855.
Change-Id: I0785a7b7c5b1d0ca87b454b5eca9079f390fcbd4
Reviewed-on: https://go-review.googlesource.com/35919
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Crawshaw <crawshaw@golang.org>
Modules appear in the moduledata linked list in the order they are
loaded by the dynamic loader, with one exception: the
firstmoduledata itself is the module that contains the runtime.
This is not always the first module (when using -buildmode=shared,
it is typically libstd.so, the second module).
The order matters for typelinksinit, so we swap the first module
with whatever module contains the main function.
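A sketch of that swap on an illustrative list type (not the runtime's
moduledata):

package modsketch

type module struct {
	name    string
	hasMain bool
	next    *module
}

// moveMainFirst swaps the payload of the head node with that of the
// node containing main, leaving the links (the load order) intact.
func moveMainFirst(head *module) {
	for m := head; m != nil; m = m.next {
		if m.hasMain {
			head.name, m.name = m.name, head.name
			head.hasMain, m.hasMain = m.hasMain, head.hasMain
			return
		}
	}
}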
Updates #18729
This fixes the test case extracted with -linkshared, and now
go test -linkshared encoding/...
passes. However the original issue about a plugin failure is not
yet fixed.
Change-Id: I9f399ecc3518e22e6b0a350358e90b0baa44ac96
Reviewed-on: https://go-review.googlesource.com/35644
Run-TryBot: David Crawshaw <crawshaw@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Michael Hudson-Doyle <michael.hudson@canonical.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Will also fix type aliases.
Fixes#17766.
For #18130.
Change-Id: I9e1584d47128782152e06abd0a30ef423d5c30d2
Reviewed-on: https://go-review.googlesource.com/35732
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
Otherwise we don't emit any required ELF relocations when doing an
external link, because elfrelocsect skips unreachable symbols.
Fixes#18745.
Change-Id: Ia3583c41bb6c5ebb7579abd26ed8689370311cd6
Reviewed-on: https://go-review.googlesource.com/35590
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Crawshaw <crawshaw@golang.org>
memmove used to use 2 2-byte load/store pairs to move 4 bytes.
When the result is loaded with a single 4-byte load, it caused
a store to load forwarding stall. To avoid the stall,
special case memmove to use 4 byte ops for the 4 byte copy case.
We already have a special case for 8-byte copies.
386 already specializes 4-byte copies.
I'll do 2-byte copies also, but not for 1.8.
benchmark old ns/op new ns/op delta
BenchmarkIssue18740-8 7567 4799 -36.58%
3-byte copies get a bit slower. Other copies are unchanged.
name old time/op new time/op delta
Memmove/3-8 4.76ns ± 5% 5.26ns ± 3% +10.50% (p=0.000 n=10+10)
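A Go-level sketch of the idea (the actual change is in the memmove assembly):

package movesketch

import "encoding/binary"

// move4 copies 4 bytes with a single 32-bit load/store pair, so a
// subsequent 32-bit load of dst can forward directly from the store
// instead of stalling on two 16-bit stores.
func move4(dst, src []byte) {
	binary.LittleEndian.PutUint32(dst, binary.LittleEndian.Uint32(src))
}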
Fixes#18740
Change-Id: Iec82cbac0ecfee80fa3c8fc83828f9a1819c3c74
Reviewed-on: https://go-review.googlesource.com/35567
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Currently we check that all roots are marked as soon as gcMarkDone
decides to transition from mark 1 to mark 2. However, issue #16083
indicates that there may be a race where we try to complete mark 1
while a worker is still scanning a stack, causing the root mark check
to fail.
We don't yet understand this race, but as a simple mitigation, move
the root check to after gcMarkDone performs a ragged barrier, which
will force any remaining workers to finish their current job.
Updates #16083. This may "fix" it, but it would be better to
understand and fix the underlying race.
Change-Id: I1af9ce67bd87ade7bc2a067295d79c28cd11abd2
Reviewed-on: https://go-review.googlesource.com/35353
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
We already do this for shared libraries. Do it for plugins also.
Suggestions on how to test this would be welcome.
I'd like to get this in for 1.8, since the current behavior could lead
to mysterious hangs when using plugins.
Fixes#18676
Change-Id: I03209b096149090b9ba171c834c5e59087ed0f92
Reviewed-on: https://go-review.googlesource.com/35117
Reviewed-by: David Crawshaw <crawshaw@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Michael Hudson-Doyle <michael.hudson@canonical.com>
Use R11 (a caller-saved temp register) instead of RBX (a callee-saved
register).
I believe this only affects linux/amd64, since it is the only platform
with a non-trivial cgoSigtramp implementation.
Updates #18328.
Change-Id: I3d35c4512624184d5a8ece653fa09ddf50e079a2
Reviewed-on: https://go-review.googlesource.com/35068
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Loop breaking with a counter. Benchmarked (see comments),
eyeball checked for sanity on popular loops. This code
ought to handle loops in general, and properly inserts phi
functions in cases where the earlier version might not have.
Includes a test, plus modifications to test/run.go to deal with
timing out and killing a looping test. Tests that were broken by the
extra code (branch frequency and live vars) added for the checks
have check insertion turned off.
If GOEXPERIMENT=preemptibleloops, the compiler inserts reschedule
checks on every backedge of every reducible loop. Alternately,
specifying GO_GCFLAGS=-d=ssa/insert_resched_checks/on will
enable it for a single compilation, but because the core Go
libraries contain some loops that may run long, this is less
likely to have the desired effect.
This is intended as a tool to help in the study and diagnosis
of GC and other latency problems, now that goal STW GC latency
is on the order of 100 microseconds or less.
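Roughly what an inserted check amounts to at the Go level; this is a sketch
(the compiler works on SSA and calls into the runtime), and the counter name
and interval below are illustrative:

package loopsketch

import "runtime"

const resetCount = 1024 // assumed check interval, for illustration only

func sum(xs []int) (s int) {
	counter := resetCount
	for _, x := range xs {
		s += x
		if counter--; counter <= 0 { // back-edge reschedule check
			counter = resetCount
			runtime.Gosched() // yield so a pending STW can proceed
		}
	}
	return
}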
Updates #17831.
Updates #10958.
Change-Id: I6206c163a5b0248e3f21eb4fc65f73a179e1f639
Reviewed-on: https://go-review.googlesource.com/33910
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Change-Id: I429637ca91f7db4144f17621de851a548dc1ce76
Reviewed-on: https://go-review.googlesource.com/34923
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
To implement the blocking of a select, a goroutine builds a list of
offers to communicate (pseudo-g's, aka sudog), one for each case,
queues them on the corresponding channels, and waits for another
goroutine to complete one of those cases and wake it up. Obviously it
is not OK for two other goroutines to complete multiple cases and both
wake the goroutine blocked in select. To make sure that only one
branch of the select is chosen, all the sudogs contain a pointer to a
shared (single) 'done uint32', which is atomically cas'ed by any
interested goroutines. The goroutine that wins the cas race gets to
wake up the select. A complication is that 'done uint32' is stored on
the stack of the goroutine running the select, and that stack can move
during the select due to stack growth or stack shrinking.
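The one-winner protocol in miniature, with illustrative types rather than the
runtime's:

package selectsketch

import "sync/atomic"

type sudog struct {
	selectdone *uint32 // points at the shared word on the selecting goroutine's stack
}

// tryComplete is what each would-be completer effectively does: only
// the goroutine that wins the CAS may complete a case and wake the
// selecting goroutine; losers must back off.
func tryComplete(sg *sudog) bool {
	return atomic.CompareAndSwapUint32(sg.selectdone, 0, 1)
}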
The relevant ordering to block and unblock in select is:
1. Lock all channels.
2. Create list of sudogs and queue sudogs on all channels.
3. Switch to system stack, mark goroutine as asleep,
unlock all channels.
4. Sleep until woken.
5. Wake up on goroutine stack.
6. Lock all channels.
7. Dequeue sudogs from all channels.
8. Free list of sudogs.
9. Unlock all channels.
There are two kinds of stack moves: stack growth and stack shrinking.
Stack growth happens while the original goroutine is running.
Stack shrinking happens asynchronously, during garbage collection.
While a channel listing a sudog is locked by select in this process,
no other goroutine can attempt to complete communication on that
channel, because that other goroutine doesn't hold the lock and can't
find the sudog. If the stack moves while all the channel locks are
held or when the sudogs are not yet or no longer queued in the
channels, no problem, because no goroutine can get to the sudogs and
therefore to selectdone. We only need to worry about the stack (and
'done uint32') moving with the sudogs queued in unlocked channels.
Stack shrinking can happen any time the goroutine is stopped.
That code already acquires all the channel locks before doing the
stack move, so it avoids this problem.
Stack growth can happen essentially any time the original goroutine is
running on its own stack (not the system stack). In the first half of
the select, all the channels are locked before any sudogs are queued,
and the channels are not unlocked until the goroutine has stopped
executing on its own stack and is asleep, so that part is OK. In the
second half of the select, the goroutine wakes up on its own goroutine
stack and immediately locks all channels. But the actual call to lock
might grow the stack, before acquiring any locks. In that case, the
stack is moving with the sudogs queued in unlocked channels. Not good.
One goroutine has already won a cas on the old stack (that goroutine
woke up the selecting goroutine, moving it out of step 4), and the
fact that done = 1 now should prevent any other goroutines from
completing any other select cases. During the stack move, however,
sudog.selectdone is moved from pointing to the old done variable on
the old stack to a new memory location on the new stack. Another
goroutine might observe the moved pointer before the new memory
location has been initialized. If the new memory word happens to be
zero, that goroutine might win a cas on the new location, thinking it
can now complete the select (again). It will then complete a second
communication (reading from or writing to the goroutine stack
incorrectly) and then attempt to wake up the selecting goroutine,
which is already awake.
The scribbling over the goroutine stack unexpectedly is already bad,
but likely to go unnoticed, at least immediately. As for the second
wakeup, there are a variety of ways it might play out.
* The goroutine might not be asleep.
That will produce a runtime crash (throw) like in #17007:
runtime: gp: gp=0xc0422dcb60, goid=2299, gp->atomicstatus=8
runtime: g: g=0xa5cfe0, goid=0, g->atomicstatus=0
fatal error: bad g->status in ready
Here, atomicstatus=8 is copystack; the second, incorrect wakeup is
observing that the selecting goroutine is in state "Gcopystack"
instead of "Gwaiting".
* The goroutine might be sleeping in a send on a nil chan.
If it wakes up, it will crash with 'fatal error: unreachable'.
* The goroutine might be sleeping in a send on a non-nil chan.
If it wakes up, it will crash with 'fatal error: chansend:
spurious wakeup'.
* The goroutine might be sleeping in a receive on a nil chan.
If it wakes up, it will crash with 'fatal error: unreachable'.
* The goroutine might be sleeping in a receive on a non-nil chan.
If it wakes up, it will silently (incorrectly!) continue as if it
received a zero value from a closed channel, leaving a sudog queued on
the channel pointing at that zero value on the goroutine's stack; that
space will be reused as the goroutine executes, and when some other
goroutine finally completes the receive, it will do a stray write into
the goroutine's stack memory, which may cause problems. Then it will
attempt the real wakeup of the goroutine, leading recursively to any
of the cases in this list.
* The goroutine might have been running a select in a finalizer
(I hope not!) and might now be sleeping waiting for more things to
finalize. If it wakes up, as long as it goes back to sleep quickly
(before the real GC code tries to wake it), the spurious wakeup does
no harm (but the stack was still scribbled on).
* The goroutine might be sleeping in gcParkAssist.
If it wakes up, that will let the goroutine continue executing a bit
earlier than we would have liked. Eventually the GC will attempt the
real wakeup of the goroutine, leading recursively to any of the cases
in this list.
* The goroutine cannot be sleeping in bgsweep, because the background
sweepers never use select.
* The goroutine might be sleeping in netpollblock.
If it wakes up, it will crash with 'fatal error: netpollblock:
corrupted state'.
* The goroutine might be sleeping in main as another thread crashes.
If it wakes up, it will exit(0) instead of letting the other thread
crash with a non-zero exit status.
* The goroutine cannot be sleeping in forcegchelper,
because forcegchelper never uses select.
* The goroutine might be sleeping in an empty select - select {}.
If it wakes up, it will return to the next line in the program!
* The goroutine might be sleeping in a non-empty select (again).
In this case, it will wake up spuriously, with gp.param == nil (no
reason for wakeup), but that was fortuitously overloaded for handling
wakeup due to a closing channel and the way it is handled is to rerun
the select, which (accidentally) handles the spurious wakeup
correctly:
	if cas == nil {
		// This can happen if we were woken up by a close().
		// TODO: figure that out explicitly so we don't need this loop.
		goto loop
	}
Before looping, it will dequeue all the sudogs on all the channels
involved, so that no other goroutine will attempt to wake it.
Since the goroutine was blocked in select before, being blocked in
select again when the spurious wakeup arrives may be quite likely.
In this case, the spurious wakeup does no harm (but the stack was
still scribbled on).
* The goroutine might be sleeping in semacquire (mutex slow path).
If it wakes up, that is taken as a signal to try for the semaphore
again, not a signal that the semaphore is now held, but the next
iteration around the loop will queue the sudog a second time, causing
a cycle in the wakeup list for the given address. If that sudog is the
only one in the list, when it is eventually dequeued, it will
(due to the precise way the code is written) leave the sudog on the
queue inactive with the sudog broken. But the sudog will also be in
the free list, and that will eventually cause confusion.
* The goroutine might be sleeping in notifyListWait, for sync.Cond.
If it wakes up, (*Cond).Wait returns. The docs say "Unlike in other
systems, Wait cannot return unless awoken by Broadcast or Signal,"
so the spurious wakeup is incorrect behavior, but most callers do not
depend on that fact. Eventually the condition will happen, attempting
the real wakeup of the goroutine and leading recursively to any of the
cases in this list.
* The goroutine might be sleeping in timeSleep aka time.Sleep.
If it wakes up, it will continue running, leaving a timer ticking.
When that time bomb goes off, it will try to ready the goroutine
again, leading to any one of the cases in this list.
* The goroutine cannot be sleeping in timerproc,
because timerproc never uses select.
* The goroutine might be sleeping in ReadTrace.
If it wakes up, it will print 'runtime: spurious wakeup of trace
reader' and return nil. All future calls to ReadTrace will print
'runtime: ReadTrace called from multiple goroutines simultaneously'.
Eventually, when trace data is available, a true wakeup will be
attempted, leading to any one of the cases in this list.
None of these fatal errors appear in any of the trybot or dashboard
logs. The 'bad g->status in ready' that happens if the goroutine is
running (the most likely scenario anyway) has happened once on the
dashboard and eight times in trybot logs. Of the eight, five were
atomicstatus=8 during net/http tests, so almost certainly this bug.
The other three were atomicstatus=2, all near code in select,
but in a draft CL by Dmitry that was rewriting select and may or may
not have had its own bugs.
This bug has existed since Go 1.4. Until then the select code was
implemented in C, 'done uint32' was a C stack variable 'uint32 done',
and C stacks never moved. I believe it has become more common recently
because of Brad's work to run more and more tests in net/http in
parallel, which lengthens race windows.
The fix is to run step 6 on the system stack,
avoiding possibility of stack growth.
Fixes#17007 and possibly other mysterious failures.
Change-Id: I9d6575a51ac96ae9d67ec24da670426a4a45a317
Reviewed-on: https://go-review.googlesource.com/34835
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
This adds high-level descriptions of the scheduler structures, the
user and system stacks, error handling, and synchronization.
Change-Id: I1eed97c6dd4a6e3d351279e967b11c6e64898356
Reviewed-on: https://go-review.googlesource.com/34290
Reviewed-by: Rick Hudson <rlh@golang.org>
The comment describing the overall GC algorithm at the top of mgc.go
has gotten woefully out-of-date (and was possibly never
correct/complete). Update it to reflect the current workings of the
GC and the set of phases that we now divide it into.
Change-Id: I02143c0ebefe9d4cd7753349dab8045f0973bf95
Reviewed-on: https://go-review.googlesource.com/34711
Reviewed-by: Rick Hudson <rlh@golang.org>
Currently, the check for legal pointers in stack copying uses
_PageSize (8K) as the minimum legal pointer. By default, Linux won't
let you map under 64K, but
1) it's less clear what other OSes allow or will allow in the future;
2) while mapping the first page is a terrible idea, mapping anywhere
above that is arguably more justifiable;
3) the compiler only assumes the first physical page (4K) is never
mapped.
Make the runtime consistent with the compiler and more robust by
changing the bad pointer check to use 4K as the minimum legal pointer.
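In sketch form, assuming the constant is named minLegalPointer per this CL:

package stacksketch

// The compiler assumes only the first physical page (4K) is never
// mapped; the runtime check now matches that assumption.
const minLegalPointer uintptr = 4096

// badPointer reports whether p can never point at a Go object.
func badPointer(p uintptr) bool {
	return p < minLegalPointer
}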
This came out of discussions on CLs 34663 and 34719.
Change-Id: Idf721a788bd9699fb348f47bdd083cf8fa8bd3e5
Reviewed-on: https://go-review.googlesource.com/34890
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
The existing implementations on AMD64 only detect AVX2 usability,
but they also contain BMI (bit-manipulation) instructions.
These instructions crash the running program as 'unknown instructions'
on processors that support AVX2 but not BMI, e.g. the i3-4000M.
This change adds detection of BMI1 and BMI2 to the AMD64 runtime, with
two flags, `support_bmi1` and `support_bmi2`, as the result,
in runtime/runtime2.go. It also completes the condition for running the
AVX2 version in packages crypto/sha1 and crypto/sha256.
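The combined check the AVX2 paths need, sketched with the (later)
golang.org/x/sys/cpu package rather than the runtime-internal flags:

package main

import (
	"fmt"

	"golang.org/x/sys/cpu"
)

func main() {
	// The AVX2 assembly also uses BMI1/BMI2 instructions, so all
	// three features must be present before taking that path.
	ok := cpu.X86.HasAVX2 && cpu.X86.HasBMI1 && cpu.X86.HasBMI2
	fmt.Println("AVX2 path usable:", ok)
}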
Fixes#18512
Change-Id: I917bf0de365237740999de3e049d2e8f2a4385ad
Reviewed-on: https://go-review.googlesource.com/34850
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>