qbit/go

mirror of https://github.com/golang/go synced 2024-10-03 03:21:22 -06:00

Author	SHA1	Message	Date
Yao Zhang	e0053f8b1c	runtime: restructured os1_linux.go, added mips64 support Linux/mips64 uses a different type of sigset. To deal with it, related functions in os1_linux.go are refactored into os1_linux_generic.go (used for non-mips64 architectures), and os1_linux_mips64x.go (only used in mips64{,le}), to avoid code copying. Change-Id: I5cadfccd86bfc4b30bf97e12607c3c614903ea4c Reviewed-on: https://go-review.googlesource.com/14991 Reviewed-by: Ian Lance Taylor <iant@golang.org>	2015-11-12 04:48:23 +00:00
Yao Zhang	c1037aad4d	runtime: added mips64{,le} build tags and GOARCH cases Change-Id: I381c03d957a0dccae5f655f02e92760e5c0e9629 Reviewed-on: https://go-review.googlesource.com/14929 Reviewed-by: Ian Lance Taylor <iant@golang.org> Reviewed-by: Minux Ma <minux@golang.org>	2015-11-12 04:47:42 +00:00
Yao Zhang	15b51d6ae6	runtime: updated automatically generated zgoarch_*.go files for unsupported architectures are deleted, as it would require changing cmd/dist to recognize their names as build tags (probably need a separated CL). Change-Id: Ifd164b014867d39b4924d1b859fb84317dce4ab0 Reviewed-on: https://go-review.googlesource.com/14928 Reviewed-by: Ian Lance Taylor <iant@golang.org> Reviewed-by: Minux Ma <minux@golang.org>	2015-11-12 04:47:29 +00:00
Yao Zhang	a36dda7880	runtime: added go files for linux/mips64{,le} support Change-Id: I14b537922b97d4bce9e0523d98a822da906348f1 Reviewed-on: https://go-review.googlesource.com/14447 Reviewed-by: Minux Ma <minux@golang.org>	2015-11-12 04:47:15 +00:00
Yao Zhang	980b00f55b	runtime: added go files for mips64 architecture support Change-Id: Ia496470e48b3c5d39fb9fef99fac356dfb73a949 Reviewed-on: https://go-review.googlesource.com/14927 Reviewed-by: Minux Ma <minux@golang.org>	2015-11-12 04:46:50 +00:00
Yao Zhang	b2b8559987	runtime/internal/atomic: added mips64 support. Change-Id: I2eaf0658771a0ff788429e2f503d116531166315 Reviewed-on: https://go-review.googlesource.com/16834 Reviewed-by: Minux Ma <minux@golang.org>	2015-11-12 04:46:35 +00:00
Yao Zhang	424738e43e	runtime: added assembly part of linux/mips64{,le} support Change-Id: I9e94027ef66c88007107de2b2b75c3d7cf1352af Reviewed-on: https://go-review.googlesource.com/14467 Reviewed-by: Minux Ma <minux@golang.org>	2015-11-12 04:46:17 +00:00
Matthew Dempsky	a9bebd91c9	runtime: update comment that was missed in CL 6584 Change-Id: Ie5f70af7e673bb2c691a45c28db2c017e6cddd4f Reviewed-on: https://go-review.googlesource.com/16833 Reviewed-by: Minux Ma <minux@golang.org>	2015-11-12 03:38:04 +00:00
Matthew Dempsky	c17c42e8a5	runtime: rewrite lots of foo_Bar(f, ...) into f.bar(...) Applies to types fixAlloc, mCache, mCentral, mHeap, mSpan, and mSpanList. Two special cases: 1. mHeap_Scavenge() previously didn't take an mheap parameter, so it was specially handled in this CL. 2. mHeap_Free() would have collided with mheap's "free" field, so it's been renamed to (mheap).freeSpan to parallel its underlying (*mheap).freeSpanLocked method. Change-Id: I325938554cca432c166fe9d9d689af2bbd68de4b Reviewed-on: https://go-review.googlesource.com/16221 Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2015-11-12 00:34:58 +00:00
Michael Hudson-Doyle	58db5fc94d	runtime: run TestCgoExternalThreadSIGPROF on ppc64le It was disabled because of the lack of external linking. Change-Id: Iccb4a4ef8c57d048d53deabe4e0f4e6b9dccce33 Reviewed-on: https://go-review.googlesource.com/16797 Reviewed-by: Ian Lance Taylor <iant@golang.org>	2015-11-12 00:30:04 +00:00
Hyang-Ah Hana Kim	b2259dcef0	runtime: add syscalls needed for android/386 logging Update golang/go#9327. Change-Id: I27ef973190d9ae652411caf3739414b5d46ca7d2 Reviewed-on: https://go-review.googlesource.com/16679 Reviewed-by: David Crawshaw <crawshaw@golang.org>	2015-11-11 21:59:53 +00:00
Hyang-Ah Hana Kim	05c4c6e2f4	cmd,runtime: TLS setup for android/386 Same ugly hack as https://go-review.googlesource.com/15991. Update golang/go#9327. Change-Id: I58284e83268a15de95eabc833c3e01bf1e3faa2e Reviewed-on: https://go-review.googlesource.com/16678 Reviewed-by: David Crawshaw <crawshaw@golang.org>	2015-11-11 21:59:24 +00:00
Austin Clements	d727312cbf	runtime: remove unused marking parfor The GC now handles the root marking jobs as part of general marking, so work.markfor is no longer used. Change-Id: I6c3b23fed27e4e7ea6430d6ca7ba25ae4d04ed14 Reviewed-on: https://go-review.googlesource.com/16811 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2015-11-11 18:31:33 +00:00
Austin Clements	f32f2954fb	runtime: never allocate new M when jumping time forward When we're jumping time forward, it means everyone is asleep, so there should always be an M available. Furthermore, this causes both allocation and write barriers in contexts that may be running without a P (such as in sysmon). Hence, replace this allocation with a throw. Updates #10600. Change-Id: I2cee70d5db828d0044082878995949edb25dda5f Reviewed-on: https://go-review.googlesource.com/16815 Reviewed-by: Russ Cox <rsc@golang.org>	2015-11-11 17:37:42 +00:00
Austin Clements	f5c42cf88e	runtime: replace traceBuf slice with index Currently traceBuf keeps track of where it is in the trace buffer by also maintaining a slice that points in to this buffer with an initial length of 0 and a cap of the length of the array. All writes to this buffer are done by appending to the slice (as long as the bounds checks are right, it will never overflow and the append won't allocate a new slice). Each of these appends generates a write barrier. As long as we never overflow the buffer, this write barrier won't fire, but this wreaks havoc with eliminating write barriers from the tracing code. If we were to overflow the buffer, this would both allocate and invoke a write barrier, both things that are dicey at best to do in many of the contexts tracing happens. It also wastes space in the traceBuf and leads to more complex code and more complex generated code. Replace this slice trick with keeping track of a simple array position. Updates #10600. Change-Id: I0a63eecec1992e195449f414ed47653f66318d0e Reviewed-on: https://go-review.googlesource.com/16814 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Dmitry Vyukov <dvyukov@google.com>	2015-11-11 17:37:31 +00:00
Austin Clements	2be1ed80c5	runtime: eliminate traceStack write barriers This replaces *traceStack with traceStackPtr, much like the preceding commit. Updates #10600. Change-Id: Ifadc35eb37a405ae877f9740151fb31a0ca1d08f Reviewed-on: https://go-review.googlesource.com/16813 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Dmitry Vyukov <dvyukov@google.com>	2015-11-11 17:37:26 +00:00
Austin Clements	03227bb55e	runtime: eliminate traceBuf write barriers The tracing code is currently called from contexts such as sysmon and the scheduler where write barriers are not allowed. Unfortunately, while the common paths through the tracing code do not have write barriers, many of the less common paths dealing with buffer overflow and recycling do. This change replaces all *traceBufs with traceBufPtrs. In the style of guintptr, etc., the GC does not trace traceBufPtrs and write barriers do not apply when these pointers are written. Since traceBufs are allocated from non-GC'd memory and manually managed, this is always safe. Updates #10600. Change-Id: I52b992d36d1b634ebd855c8cde27947ec14f59ba Reviewed-on: https://go-review.googlesource.com/16812 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Dmitry Vyukov <dvyukov@google.com>	2015-11-11 17:37:18 +00:00
Austin Clements	7d1d642956	runtime: fix use of xadd64 Commit `7407d8e` was rebased over the switch to runtime/internal/atomic and introduced a call to xadd64, which no longer exists. Fix that call. Change-Id: I99c93469794c16504ae4a8ffe3066ac382c66a3a Reviewed-on: https://go-review.googlesource.com/16816 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2015-11-11 15:26:24 +00:00
Austin Clements	7407d8e582	runtime: fix over-aggressive proportional sweep Currently, sweeping is performed before allocating a span by charging for the entire size of the span requested, rather than the number of bytes actually available for allocation from the returned span. That is, if the returned span is 8K, but already has 6K in use, the mutator is charged for 8K of heap allocation even though it can only allocate 2K more from the span. As a result, proportional sweep is over-aggressive and tends to finish much earlier than it needs to. This effect is more amplified by fragmented heaps. Fix this by reimbursing the mutator for the used space in a span once it has allocated that span. We still have to charge up-front for the worst-case because we don't know which span the mutator will get, but at least we can correct the over-charge once it has a span, which will go toward later span allocations. This has negligible effect on the throughput of the go1 benchmarks and the garbage benchmark. Fixes #12040. Change-Id: I0e23e7a4ccf126cca000fed5067b20017028dd6b Reviewed-on: https://go-review.googlesource.com/16515 Reviewed-by: Rick Hudson <rlh@golang.org>	2015-11-11 15:21:32 +00:00
Ian Lance Taylor	880a689124	runtime: don't call msanread when running on the system stack The runtime is not instrumented, but the calls to msanread in the runtime can sometimes refer to the system stack. An example is the call to copy in stkbucket in mprof.go. Depending on what C code has done, the system stack may appear uninitialized to msan. Change-Id: Ic21705b9ac504ae5cf7601a59189302f072e7db1 Reviewed-on: https://go-review.googlesource.com/16660 Reviewed-by: David Crawshaw <crawshaw@golang.org>	2015-11-11 06:04:04 +00:00
Ian Lance Taylor	8f3f2ccac0	runtime: mark cgo callback results as written for msan This is a fix for the -msan option when using cgo callbacks. A cgo callback works by writing out C code that puts a struct on the stack and passes the address of that struct into Go. The result parameters are fields of the struct. The Go code will write to the result parameters, but the Go code thinks it is just writing into the Go stack, and therefore won't call msanwrite. This CL adds a call to msanwrite in the cgo callback code so that the C knows that results were written. Change-Id: I80438dbd4561502bdee97fad3f02893a06880ee1 Reviewed-on: https://go-review.googlesource.com/16611 Reviewed-by: David Crawshaw <crawshaw@golang.org>	2015-11-11 05:58:19 +00:00
Austin Clements	f84420c20d	runtime: clean up park messages This changes "mark worker (idle)" to "GC worker (idle)" so it's more clear to users that these goroutines are GC-related. It changes "GC assist" to "GC assist wait" to make it clear that the assist is blocked. Change-Id: Iafbc0903c84f9250ff6bee14baac6fcd4ed5ef76 Reviewed-on: https://go-review.googlesource.com/16511 Reviewed-by: Rick Hudson <rlh@golang.org>	2015-11-11 01:04:39 +00:00
Austin Clements	56ad88b1ff	runtime: free stack spans outside STW We couldn't do this before this point because it must be done before the next GC cycle starts. Hence, if it delayed the start of the next cycle, that would widen the window between reaching the heap trigger of the next cycle and starting the next GC cycle, during which the mutator could over-allocate. With the decentralized GC, any mutators that reach the heap trigger will block on the GC starting, so it's safe to widen the time between starting the world and being able to start the next GC cycle. Fixes #11465. Change-Id: Ic7ea7e9eba5b66fc050299f843a9c9001ad814aa Reviewed-on: https://go-review.googlesource.com/16394 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2015-11-11 01:04:33 +00:00
Ian Lance Taylor	9dcc58c3d1	cmd/cgo, runtime: add checks for passing pointers from Go to C This implements part of the proposal in issue 12416 by adding dynamic checks for passing pointers from Go to C. This code is intended to be on at all times. It does not try to catch every case. It does not implement checks on calling Go functions from C. The new cgo checks may be disabled using GODEBUG=cgocheck=0. Update #12416. Change-Id: I48de130e7e2e83fb99a1e176b2c856be38a4d3c8 Reviewed-on: https://go-review.googlesource.com/16003 Reviewed-by: Russ Cox <rsc@golang.org>	2015-11-10 22:22:10 +00:00
Michael Matloob	67faca7d9c	runtime: break atomics out into package runtime/internal/atomic This change breaks out most of the atomics functions in the runtime into package runtime/internal/atomic. It adds some basic support in the toolchain for runtime packages, and also modifies linux/arm atomics to remove the dependency on the runtime's mutex. The mutexes have been replaced with spinlocks. all trybots are happy! In addition to the trybots, I've tested on the darwin/arm64 builder, on the darwin/arm builder, and on a ppc64le machine. Change-Id: I6698c8e3cf3834f55ce5824059f44d00dc8e3c2f Reviewed-on: https://go-review.googlesource.com/14204 Run-TryBot: Michael Matloob <matloob@golang.org> Reviewed-by: Russ Cox <rsc@golang.org>	2015-11-10 17:38:04 +00:00
Michael Hudson-Doyle	4e3deae96d	cmd/link, runtime: arm64 implementation of addmoduledata Change-Id: I62fb5b20d7caa51b77560a4bfb74a39f17089805 Reviewed-on: https://go-review.googlesource.com/13999 Reviewed-by: Russ Cox <rsc@golang.org>	2015-11-10 01:24:25 +00:00
Keith Randall	e410a527b2	runtime: simplify chan ops, take 2 This change is the same as CL #9345 which was reverted, except for a small bug fix. The only change is to the body of sendDirect and its callsite. Also added a test. The problem was during a channel send operation. The target of the send was a sleeping goroutine waiting to receive. We basically do: 1) Read the destination pointer out of the sudog structure 2) Copy the value we're sending to that destination pointer Unfortunately, the previous change had a goroutine suspend point between 1 & 2 (the call to sendDirect). At that point the destination goroutine's stack could be copied (shrunk). The pointer we read in step 1 is no longer valid for step 2. Fixed by not allowing any suspension points between 1 & 2. I suspect the old code worked correctly basically by accident. Fixes #13169 The original 9345: This change removes the retry mechanism we use for buffered channels. Instead, any sender waking up a receiver or vice versa completes the full protocol with its counterpart. This means the counterpart does not need to relock the channel when it wakes up. (Currently buffered channels need to relock on wakeup.) For sends on a channel with waiting receivers, this change replaces two copies (sender->queue, queue->receiver) with one (sender->receiver). For receives on channels with a waiting sender, two copies are still required. This change unifies to a large degree the algorithm for buffered and unbuffered channels, simplifying the overall implementation. Fixes #11506 Change-Id: I57dfa3fc219cffa4d48301ee15fe5479299efa09 Reviewed-on: https://go-review.googlesource.com/16740 Reviewed-by: Ian Lance Taylor <iant@golang.org>	2015-11-08 23:20:25 +00:00
Michael Hudson-Doyle	1b4d28f8cf	cmd/link, runtime: arm implementation of addmoduledata Change-Id: I3975e10c2445e23c2798a7203a877ff2de3427c7 Reviewed-on: https://go-review.googlesource.com/14189 Reviewed-by: Ian Lance Taylor <iant@golang.org>	2015-11-08 21:46:17 +00:00
Ian Lance Taylor	e884334b55	runtime: use pthread_sigmask, not sigprocmask, on Darwin ARM/ARM64 Other systems use pthread_sigmask. It was a mistake to use sigprocmask here. Change-Id: Ie045aa3f09cf035fcf807b7543b96fa5b847958a Reviewed-on: https://go-review.googlesource.com/16720 Reviewed-by: Dave Cheney <dave@cheney.net> Reviewed-by: David Crawshaw <crawshaw@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2015-11-07 15:48:58 +00:00
Keith Randall	4b7d5f0b94	runtime: memmove/memclr pointers atomically Make sure that we're moving or zeroing pointers atomically. Anything that is a multiple of pointer size and at least pointer aligned might have pointers in it. All the code looks ok except for the 1-pointer-sized moves. Fixes #13160 Update #12552 Change-Id: Ib97d9b918fa9f4cc5c56c67ed90255b7fdfb7b45 Reviewed-on: https://go-review.googlesource.com/16668 Reviewed-by: Dmitry Vyukov <dvyukov@google.com> Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2015-11-07 02:42:12 +00:00
Ilya Tocar	321a40721b	runtime: optimize indexbytebody on amd64 Use avx2 to compare 32 bytes per iteration. Results (haswell): name old time/op new time/op delta IndexByte32-6 15.5ns ± 0% 14.7ns ± 5% -4.87% (p=0.000 n=16+20) IndexByte4K-6 360ns ± 0% 183ns ± 0% -49.17% (p=0.000 n=19+20) IndexByte4M-6 384µs ± 0% 256µs ± 1% -33.41% (p=0.000 n=20+20) IndexByte64M-6 6.20ms ± 0% 4.18ms ± 1% -32.52% (p=0.000 n=19+20) IndexBytePortable32-6 73.4ns ± 5% 75.8ns ± 3% +3.35% (p=0.000 n=20+19) IndexBytePortable4K-6 5.15µs ± 0% 5.15µs ± 0% ~ (all samples are equal) IndexBytePortable4M-6 5.26ms ± 0% 5.25ms ± 0% -0.12% (p=0.000 n=20+18) IndexBytePortable64M-6 84.1ms ± 0% 84.1ms ± 0% -0.08% (p=0.012 n=18+20) Index32-6 352ns ± 0% 352ns ± 0% ~ (all samples are equal) Index4K-6 53.8µs ± 0% 53.8µs ± 0% -0.03% (p=0.000 n=16+18) Index4M-6 55.4ms ± 0% 55.4ms ± 0% ~ (p=0.149 n=20+19) Index64M-6 886ms ± 0% 886ms ± 0% ~ (p=0.108 n=20+20) IndexEasy32-6 80.3ns ± 0% 80.1ns ± 0% -0.21% (p=0.000 n=20+20) IndexEasy4K-6 426ns ± 0% 215ns ± 0% -49.53% (p=0.000 n=20+20) IndexEasy4M-6 388µs ± 0% 262µs ± 1% -32.42% (p=0.000 n=18+20) IndexEasy64M-6 6.20ms ± 0% 4.19ms ± 1% -32.47% (p=0.000 n=18+20) name old speed new speed delta IndexByte32-6 2.06GB/s ± 1% 2.17GB/s ± 5% +5.19% (p=0.000 n=18+20) IndexByte4K-6 11.4GB/s ± 0% 22.3GB/s ± 0% +96.45% (p=0.000 n=17+20) IndexByte4M-6 10.9GB/s ± 0% 16.4GB/s ± 1% +50.17% (p=0.000 n=20+20) IndexByte64M-6 10.8GB/s ± 0% 16.0GB/s ± 1% +48.19% (p=0.000 n=19+20) IndexBytePortable32-6 436MB/s ± 5% 422MB/s ± 3% -3.27% (p=0.000 n=20+19) IndexBytePortable4K-6 795MB/s ± 0% 795MB/s ± 0% ~ (p=0.940 n=17+18) IndexBytePortable4M-6 798MB/s ± 0% 799MB/s ± 0% +0.12% (p=0.000 n=20+18) IndexBytePortable64M-6 798MB/s ± 0% 798MB/s ± 0% +0.08% (p=0.011 n=18+20) Index32-6 90.9MB/s ± 0% 90.9MB/s ± 0% -0.00% (p=0.025 n=20+20) Index4K-6 76.1MB/s ± 0% 76.1MB/s ± 0% +0.03% (p=0.000 n=14+15) Index4M-6 75.7MB/s ± 0% 75.7MB/s ± 0% ~ (p=0.076 n=20+19) Index64M-6 75.7MB/s ± 0% 75.7MB/s ± 0% ~ (p=0.456 n=20+17) IndexEasy32-6 399MB/s ± 0% 399MB/s ± 0% +0.20% (p=0.000 n=20+19) IndexEasy4K-6 9.60GB/s ± 0% 19.02GB/s ± 0% +98.19% (p=0.000 n=20+20) IndexEasy4M-6 10.8GB/s ± 0% 16.0GB/s ± 1% +47.98% (p=0.000 n=18+20) IndexEasy64M-6 10.8GB/s ± 0% 16.0GB/s ± 1% +48.08% (p=0.000 n=18+20) Change-Id: I46075921dde9f3580a89544c0b3a2d8c9181ebc4 Reviewed-on: https://go-review.googlesource.com/16484 Reviewed-by: Keith Randall <khr@golang.org> Run-TryBot: Ilya Tocar <ilya.tocar@intel.com> Reviewed-by: Klaus Post <klauspost@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2015-11-06 15:16:28 +00:00
Keith Randall	e9f90ba246	Revert "runtime: simplify buffered channels." Revert for now until #13169 is understood. This reverts commit `8e496f1d69`. Change-Id: Ib3eb2588824ef47a2b6eb9e377a24e5c817fcc81 Reviewed-on: https://go-review.googlesource.com/16716 Reviewed-by: Keith Randall <khr@golang.org>	2015-11-06 08:30:35 +00:00
Austin Clements	d5ba582166	runtime: remove background GC goroutine and mark barriers These are now unused. Updates #11970. Change-Id: I43e5c4e5bcda9581bacc63364f96bb4855ab779f Reviewed-on: https://go-review.googlesource.com/16393 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2015-11-05 21:24:05 +00:00
Austin Clements	bbf2da00fc	runtime: remove GC start up/shutdown workaround in mallocgc Currently mallocgc detects if the GC is in a state where it can't assist, but also can't allocate uncontrolled and yields to help out the GC. This was a workaround for periods when we were trying to schedule the GC coordinator. It is no longer necessary because there is no GC coordinator and malloc can always assist with any GC transitions that are necessary. Updates #11970. Change-Id: I4f7beb7013e85e50ae99a3a8b0bb708ba49cbcd4 Reviewed-on: https://go-review.googlesource.com/16392 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2015-11-05 21:24:01 +00:00
Austin Clements	c99d7f7f85	runtime: decentralize mark done and mark termination This moves all of the mark 1 to mark 2 transition and mark termination to the mark done transition function. This means these transitions are now handled on the goroutine that detected mark completion. This also means that the GC coordinator and the background completion barriers are no longer used and various workarounds to yield to the coordinator are no longer necessary. These will be removed in follow-up commits. One consequence of this is that mark workers now need to be preemptible when performing the mark done transition. This allows them to stop the world and to perform the final clean-up steps of GC after restarting the world. They are only made preemptible while performing this transition, so if the worker findRunnableGCWorker would schedule isn't available, we didn't want to schedule it anyway. Fixes #11970. Change-Id: I9203a2d6287eeff62d589ec02ad9cb1e29ddb837 Reviewed-on: https://go-review.googlesource.com/16391 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2015-11-05 21:23:54 +00:00
Austin Clements	d986bf2741	runtime: account mark worker time before gcMarkDone Currently gcMarkDone takes basically no time, so it's okay to account the worker time after calling it. However, gcMarkDone is about to take potentially much longer because it may perform all of mark termination. Prepare for this by swapping the order so we account the time before calling gcMarkDone. Change-Id: I90c7df68192acfc4fd02a7254dae739dda4e2fcb Reviewed-on: https://go-review.googlesource.com/16390 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2015-11-05 21:23:49 +00:00
Austin Clements	171204b561	runtime: factor mark done transition Currently the code for completion of mark 1/mark 2 is duplicated in background workers and assists. Factor this in to a single function that will serve as the transition function for concurrent mark. Change-Id: I4d9f697a15da0d349db3b34d56f3a220dd41d41b Reviewed-on: https://go-review.googlesource.com/16359 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2015-11-05 21:23:42 +00:00
Austin Clements	12e23f05ff	runtime: eliminate mark completion in scheduler Currently, findRunnableGCWorker will perform mark completion if there is no remaining work and no running workers. This used to be necessary to resolve a race in the transition from mark 1 to mark 2 where we would enter mark 2 with no mark work (and no dedicated workers), so no workers would run, so no worker would signal mark completion. However, we're about to make mark completion also perform the entire follow-on process, which includes mark termination. We really don't want to do that in the scheduler if it happens to detect completion. Conveniently, this hack is no longer necessary because we always enqueue root scanning work at the beginning of both mark 1 and mark 2, so a mark worker will always run. Hence, we can simply eliminate it. Change-Id: I3fc8f27c8da632f0fb732c9f6425e1f457f5652e Reviewed-on: https://go-review.googlesource.com/16358 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2015-11-05 21:23:38 +00:00
Austin Clements	20f276e237	runtime: don't start idle mark workers when barriers are cleared Currently, we don't start dedicated or fractional mark workers unless the mark 1 or mark 2 barriers have been cleared. One intended consequence of this is that no background workers run between the forEachP that disposes all gcWork caches and the beginning of mark 2. However, we (unintentionally) did not apply this restriction to idle mark workers. As a result, these can start in the interim between mark 1 completion and mark 2 starting. This explains why it was necessary to reset the root marking jobs using carefully ordered atomic writes when setting up mark 2. It also means that, even though we definitely enqueue work before starting mark 2, it may be drained by the time we reset the mark 2 barrier. If this happens, currently the only thing preventing the runtime from deadlocking is that the scheduler itself also checks for mark completion and will signal mark 2 completion. Were it not for the odd behavior of idle workers, this check in the scheduler would not be necessary. Clean all of this up and prepare to remove this check in the scheduler by applying the same restriction to starting idle mark workers. Change-Id: Ic1b479e1591bd7773dc27b320ca399a215603b5a Reviewed-on: https://go-review.googlesource.com/16631 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2015-11-05 21:23:33 +00:00
Austin Clements	a51905fa04	runtime: decentralize sweep termination and mark transition This moves all of GC initialization, sweep termination, and the transition to concurrent marking in to the off->mark transition function. This means it's now handled on the goroutine that detected the state exit condition. As a result, malloc no longer needs to Gosched() at the beginning of the GC cycle to prevent over-allocation while the GC is starting up because it will now help the GC to start up. The Gosched hack is still necessary during GC shutdown (this is easy to test by enabling gctrace and hitting Ctrl-S to block the gctrace output). At this point, the GC coordinator still handles later phases. This requires a small tweak to how we start the GC coordinator. Currently, starting the GC coordinator is best-effort and may fail if the coordinator is about to park from the previous cycle but hasn't yet. We fix this by replacing the park/ready to wake up the coordinator with a semaphore. This is temporary since the coordinator will be going away in a few commits. Updates #11970. Change-Id: I2c6a11c91e72dfbc59c2d8e7c66146dee9a444fe Reviewed-on: https://go-review.googlesource.com/16357 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2015-11-05 21:23:27 +00:00
Austin Clements	9630c47e8c	runtime: decentralize concurrent sweep termination This moves concurrent sweep termination from the coordinator to the off->mark transition. This allows it to be performed by all Gs attempting to start the GC. Updates #11970. Change-Id: I24428e8599a759398c2ef7ec996ba755a448f947 Reviewed-on: https://go-review.googlesource.com/16356 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2015-11-05 21:23:22 +00:00
Austin Clements	f54bcedce1	runtime: beginning of decentralized off->mark transition This begins the conversion of the centralized GC coordinator to a decentralized state machine by introducing the internal API that triggers the first state transition from _GCoff to _GCmark (or _GCmarktermination). This change introduces the transition lock, the off->mark transition condition (which is very similar to shouldtriggergc()), and the general structure of a state transition. Since we're doing this conversion in stages, it then falls back to the GC coordinator to actually execute the cycle. We'll start moving logic out of the GC coordinator and in to transition functions next. This fixes a minor bug in gcstoptheworld debug mode where passing the heap trigger once could trigger multiple STW GCs. Updates #11970. Change-Id: I964087dd190a639eb5766398f8e1bbf8b352902f Reviewed-on: https://go-review.googlesource.com/16355 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com>	2015-11-05 21:23:17 +00:00
Austin Clements	3842596284	runtime: move concurrent mark setup off system stack For historical reasons we currently do a lot of the concurrent mark setup on the system stack. In fact, at this point the one and only thing that needs to happen on the system stack is the start-the-world. Clean up this code by lifting everything other than the start-the-world off the system stack. The diff for this change looks large, but the only code change is to narrow the systemstack call. Everything else is re-indentation. Change-Id: I1e03b8afc759fad726f2397b05a17d183c2713ce Reviewed-on: https://go-review.googlesource.com/16354 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2015-11-05 21:23:11 +00:00
Austin Clements	1959621584	runtime: lift state variables from func gc to var work We're about to split func gc across several functions, so lift the local variables it uses for tracking statistics and state across the cycle into the global "work" variable. Change-Id: Ie955f2f1758c7f5a5543ea1f3f33b222bc4b1d37 Reviewed-on: https://go-review.googlesource.com/16353 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2015-11-05 21:23:06 +00:00
Austin Clements	1698018955	runtime: note a minor issue with GODEBUG=gcstoptheworld Change-Id: I91cda8d88b0852cd0f868d33c594206bcca0c386 Reviewed-on: https://go-review.googlesource.com/16352 Reviewed-by: Rick Hudson <rlh@golang.org>	2015-11-05 21:22:59 +00:00
Ilya Tocar	967564be7e	runtime: optimize string comparison on amd64 Use AVX2 if possible. Results below (haswell): name old time/op new time/op delta CompareStringEqual-6 8.77ns ± 0% 8.63ns ± 1% -1.58% (p=0.000 n=20+19) CompareStringIdentical-6 5.02ns ± 0% 5.02ns ± 0% ~ (all samples are equal) CompareStringSameLength-6 7.51ns ± 0% 7.51ns ± 0% ~ (all samples are equal) CompareStringDifferentLength-6 1.56ns ± 0% 1.56ns ± 0% ~ (all samples are equal) CompareStringBigUnaligned-6 124µs ± 1% 105µs ± 5% -14.99% (p=0.000 n=20+18) CompareStringBig-6 112µs ± 1% 103µs ± 0% -7.87% (p=0.000 n=20+17) name old speed new speed delta CompareStringBigUnaligned-6 8.48GB/s ± 1% 9.98GB/s ± 5% +17.67% (p=0.000 n=20+18) CompareStringBig-6 9.37GB/s ± 1% 10.17GB/s ± 0% +8.54% (p=0.000 n=20+17) Change-Id: I1c949626dd2aaf9f633e3c888a9df71c82eed7e1 Reviewed-on: https://go-review.googlesource.com/16481 Reviewed-by: Keith Randall <khr@golang.org> Run-TryBot: Ilya Tocar <ilya.tocar@intel.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Klaus Post <klauspost@gmail.com>	2015-11-05 15:42:33 +00:00
Keith Randall	8e496f1d69	runtime: simplify buffered channels. This change removes the retry mechanism we use for buffered channels. Instead, any sender waking up a receiver or vice versa completes the full protocol with its counterpart. This means the counterpart does not need to relock the channel when it wakes up. (Currently buffered channels need to relock on wakeup.) For sends on a channel with waiting receivers, this change replaces two copies (sender->queue, queue->receiver) with one (sender->receiver). For receives on channels with a waiting sender, two copies are still required. This change unifies to a large degree the algorithm for buffered and unbuffered channels, simplifying the overall implementation. Fixes #11506 benchmark old ns/op new ns/op delta BenchmarkChanProdCons10 125 110 -12.00% BenchmarkChanProdCons0 303 284 -6.27% BenchmarkChanProdCons100 75.5 71.3 -5.56% BenchmarkChanContended 6452 6125 -5.07% BenchmarkChanNonblocking 11.5 11.0 -4.35% BenchmarkChanCreation 149 143 -4.03% BenchmarkChanSem 63.6 61.6 -3.14% BenchmarkChanUncontended 6390 6212 -2.79% BenchmarkChanSync 282 276 -2.13% BenchmarkChanProdConsWork10 516 506 -1.94% BenchmarkChanProdConsWork0 696 685 -1.58% BenchmarkChanProdConsWork100 470 469 -0.21% BenchmarkChanPopular 660427 660012 -0.06% Change-Id: I164113a56432fbc7cace0786e49c5a6e6a708ea4 Reviewed-on: https://go-review.googlesource.com/9345 Run-TryBot: Keith Randall <khr@golang.org> Reviewed-by: Austin Clements <austin@google.com> Reviewed-by: Dmitry Vyukov <dvyukov@google.com>	2015-11-05 15:41:05 +00:00
Austin Clements	dcd9e5bc0f	runtime: make putfull start mark workers Currently we depend on the good graces and timing of the scheduler to get opportunities to start dedicated mark workers. In the worst case, it may take 10ms to get dedicated mark workers going at the beginning of mark 1 and mark 2 or after the amount of available work has dropped and gone back up. Instead of waiting for the regular preemption logic to get around to us, make putfull enlist a random P if we're not already running enough dedicated workers. This should improve performance stability of the garbage collector and is likely to improve the overall performance somewhat. No overall effect on the go1 benchmarks. It speeds up the garbage benchmark by 12%, which more than counters the performance loss from the previous commit. name old time/op new time/op delta XBenchGarbage-12 6.32ms ± 4% 5.58ms ± 2% -11.68% (p=0.000 n=20+16) name old time/op new time/op delta BinaryTree17-12 3.18s ± 5% 3.12s ± 4% -1.83% (p=0.021 n=20+20) Fannkuch11-12 2.50s ± 2% 2.46s ± 2% -1.57% (p=0.000 n=18+19) FmtFprintfEmpty-12 50.8ns ± 3% 50.4ns ± 3% ~ (p=0.184 n=20+20) FmtFprintfString-12 167ns ± 2% 171ns ± 1% +2.46% (p=0.000 n=20+19) FmtFprintfInt-12 161ns ± 2% 163ns ± 2% +1.81% (p=0.000 n=20+20) FmtFprintfIntInt-12 269ns ± 1% 266ns ± 1% -0.81% (p=0.002 n=19+20) FmtFprintfPrefixedInt-12 237ns ± 2% 231ns ± 2% -2.86% (p=0.000 n=20+20) FmtFprintfFloat-12 313ns ± 2% 313ns ± 1% ~ (p=0.681 n=20+20) FmtManyArgs-12 1.05µs ± 2% 1.03µs ± 1% -2.26% (p=0.000 n=20+20) GobDecode-12 8.66ms ± 1% 8.67ms ± 1% ~ (p=0.380 n=19+20) GobEncode-12 6.56ms ± 1% 6.56ms ± 2% ~ (p=0.607 n=19+20) Gzip-12 317ms ± 1% 314ms ± 2% -1.10% (p=0.000 n=20+19) Gunzip-12 42.1ms ± 1% 42.2ms ± 1% +0.27% (p=0.044 n=20+19) HTTPClientServer-12 62.7µs ± 1% 62.0µs ± 1% -1.04% (p=0.000 n=19+18) JSONEncode-12 16.7ms ± 1% 16.8ms ± 2% +0.59% (p=0.021 n=20+20) JSONDecode-12 58.2ms ± 1% 61.4ms ± 2% +5.43% (p=0.000 n=18+19) Mandelbrot200-12 3.84ms ± 1% 3.87ms ± 2% +0.79% 
(p=0.008 n=18+20) GoParse-12 3.86ms ± 2% 3.76ms ± 2% -2.60% (p=0.000 n=20+20) RegexpMatchEasy0_32-12 100ns ± 2% 100ns ± 1% -0.68% (p=0.005 n=18+15) RegexpMatchEasy0_1K-12 332ns ± 1% 342ns ± 1% +3.16% (p=0.000 n=19+19) RegexpMatchEasy1_32-12 82.9ns ± 3% 83.0ns ± 2% ~ (p=0.906 n=19+20) RegexpMatchEasy1_1K-12 487ns ± 1% 494ns ± 1% +1.50% (p=0.000 n=17+20) RegexpMatchMedium_32-12 131ns ± 2% 130ns ± 1% ~ (p=0.686 n=19+20) RegexpMatchMedium_1K-12 39.6µs ± 1% 39.2µs ± 1% -1.09% (p=0.000 n=18+19) RegexpMatchHard_32-12 2.04µs ± 1% 2.04µs ± 2% ~ (p=0.804 n=20+20) RegexpMatchHard_1K-12 61.7µs ± 2% 61.3µs ± 2% ~ (p=0.052 n=18+20) Revcomp-12 529ms ± 2% 533ms ± 1% +0.83% (p=0.003 n=20+19) Template-12 70.7ms ± 2% 71.0ms ± 2% ~ (p=0.065 n=20+19) TimeParse-12 351ns ± 2% 355ns ± 1% +1.25% (p=0.000 n=19+20) TimeFormat-12 362ns ± 2% 373ns ± 1% +2.83% (p=0.000 n=18+20) [Geo mean] 62.2µs 62.3µs +0.13% name old speed new speed delta GobDecode-12 88.6MB/s ± 1% 88.5MB/s ± 1% ~ (p=0.392 n=19+20) GobEncode-12 117MB/s ± 1% 117MB/s ± 1% ~ (p=0.622 n=19+20) Gzip-12 61.1MB/s ± 1% 61.8MB/s ± 2% +1.11% (p=0.000 n=20+19) Gunzip-12 461MB/s ± 1% 460MB/s ± 1% -0.27% (p=0.044 n=20+19) JSONEncode-12 116MB/s ± 1% 115MB/s ± 2% -0.58% (p=0.022 n=20+20) JSONDecode-12 33.3MB/s ± 1% 31.6MB/s ± 2% -5.15% (p=0.000 n=18+19) GoParse-12 15.0MB/s ± 2% 15.4MB/s ± 2% +2.66% (p=0.000 n=20+20) RegexpMatchEasy0_32-12 317MB/s ± 2% 319MB/s ± 2% ~ (p=0.052 n=20+20) RegexpMatchEasy0_1K-12 3.08GB/s ± 1% 2.99GB/s ± 1% -3.07% (p=0.000 n=19+19) RegexpMatchEasy1_32-12 386MB/s ± 3% 386MB/s ± 2% ~ (p=0.939 n=19+20) RegexpMatchEasy1_1K-12 2.10GB/s ± 1% 2.07GB/s ± 1% -1.46% (p=0.000 n=17+20) RegexpMatchMedium_32-12 7.62MB/s ± 2% 7.64MB/s ± 1% ~ (p=0.702 n=19+20) RegexpMatchMedium_1K-12 25.9MB/s ± 1% 26.1MB/s ± 2% +0.99% (p=0.000 n=18+20) RegexpMatchHard_32-12 15.7MB/s ± 1% 15.7MB/s ± 2% ~ (p=0.723 n=20+20) RegexpMatchHard_1K-12 16.6MB/s ± 2% 16.7MB/s ± 2% ~ (p=0.052 n=18+20) Revcomp-12 481MB/s ± 2% 477MB/s ± 1% -0.83% (p=0.003 
n=20+19) Template-12 27.5MB/s ± 2% 27.3MB/s ± 2% ~ (p=0.062 n=20+19) [Geo mean] 99.4MB/s 99.1MB/s -0.35% Change-Id: I914d8cadded5a230509d118164a4c201601afc06 Reviewed-on: https://go-review.googlesource.com/16298 Reviewed-by: Rick Hudson <rlh@golang.org>	2015-11-04 20:15:51 +00:00
Austin Clements	62ba520b23	runtime: eliminate getfull barrier from concurrent mark Currently dedicated mark workers participate in the getfull barrier during concurrent mark. However, the getfull barrier wasn't designed for concurrent work and this causes no end of headaches. In the concurrent setting, participants come and go. This makes mark completion susceptible to live-lock: since dedicated workers are only periodically polling for completion, it's possible for the program to be in some transient worker each time one of the dedicated workers wakes up to check if it can exit the getfull barrier. It also complicates reasoning about the system because dedicated workers participate directly in the getfull barrier, but transient workers must instead use trygetfull because they have exit conditions that aren't captured by getfull (e.g., fractional workers exit when preempted). The complexity of implementing these exit conditions contributed to #11677. Furthermore, the getfull barrier is inefficient because we could be running user code instead of spinning on a P. In effect, we're dedicating 25% of the CPU to marking even if that means we have to spin to make that 25%. It also causes issues on Windows because we can't actually sleep for 100µs (#8687). Fix this by making dedicated workers no longer participate in the getfull barrier. Instead, dedicated workers simply return to the scheduler when they fail to get more work, regardless of what other workers are doing, and the scheduler only starts new dedicated workers if there's work available. Everything that needs to be handled by this barrier is already handled by detection of mark completion. This makes the system much more symmetric because all workers and assists now use trygetfull during concurrent mark. It also loosens the 25% CPU target so that we can give some of that 25% back to user code if there isn't enough work to keep the mark worker busy. 
And it eliminates the problematic 100µs sleep on Windows during concurrent mark (though not during mark termination). The downside of this is that if we hit a bottleneck in the heap graph that then expands back out, the system may shut down dedicated workers and take a while to start them back up. We'll address this in the next commit. Updates #12041 and #8687. No effect on the go1 benchmarks. This slows down the garbage benchmark by 9%, but we'll more than make it up in the next commit. name old time/op new time/op delta XBenchGarbage-12 5.80ms ± 2% 6.32ms ± 4% +9.03% (p=0.000 n=20+20) Change-Id: I65100a9ba005a8b5cf97940798918672ea9dd09b Reviewed-on: https://go-review.googlesource.com/16297 Reviewed-by: Rick Hudson <rlh@golang.org>	2015-11-04 20:15:39 +00:00
Austin Clements	3a765430c1	cmd/compile: add go:nowritebarrierrec annotation This introduces a recursive variant of the go:nowritebarrier annotation that prohibits write barriers not only in the annotated function, but in all functions it calls, recursively. The error message gives the shortest call stack from the annotated function to the function containing the prohibited write barrier, including the names of the functions and the line numbers of the calls. To demonstrate the annotation, we apply it to gcmarkwb_m, the write barrier itself. This is a new annotation rather than a modification of the existing go:nowritebarrier annotation because, for better or worse, there are many go:nowritebarrier functions that do call functions with write barriers. In most of these cases this is benign because the annotation was conservative, but it prohibits simply coopting the existing annotation. Change-Id: I225ca483c8f699e8436373ed96349e80ca2c2479 Reviewed-on: https://go-review.googlesource.com/16554 Reviewed-by: Keith Randall <khr@golang.org>	2015-11-04 14:42:04 +00:00


1487 Commits